feat: add firecrawl and vane applications, fix lxserver form config

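Adds two new applications, Firecrawl and Vane, including the full app configuration, docker-compose orchestration, documentation, and logo assets; also fixes the redundant rule parameter in the lxserver time zone setting.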
arch3rPro
2026-05-17 17:52:54 +08:00
parent 87bc4e7f86
commit e98811cd04
28 changed files with 705 additions and 2 deletions
+65
@@ -0,0 +1,65 @@
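# Firecrawl
Turns any website into structured data suitable for large language models (LLMs). A powerful platform for web scraping, crawling, search, and data extraction.
## Features
- **Single-page scraping**: Convert any URL to Markdown, HTML, a screenshot, or structured JSON
- **Full-site crawling**: Recursively scrape an entire website with intelligent link filtering
- **URL discovery**: Quickly discover all URLs of a website through sitemaps, index queries, or search
- **Web search**: Search the web and get the full page content of the results in one call
- **AI extraction**: LLM-based structured data extraction with schema validation
- **Autonomous agent**: A research agent that navigates and extracts data on its own
- **Remote browser**: Remote browser sessions with CDP access and code execution
- **Batch operations**: Asynchronous batch scraping of multiple URLs
- **Self-hosting**: Fully open source and deployable locally, so your data stays under your control
## Usage
### Default Ports
- API service: 3002
- Queue admin UI: http://your-ip:3002/admin/YOUR_BULL_AUTH_KEY/queues
### API Access
After deployment, the API service is available at `http://your-ip:3002`.
Test the crawl endpoint: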
```bash
curl -X POST http://localhost:3002/v1/crawl \
-H 'Content-Type: application/json' \
-d '{
"url": "https://firecrawl.dev"
}'
```
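### Data Directories
Application data is stored in the following directories:
- `./data/api` - API service data
- `./data/postgres` - PostgreSQL database data
- `./data/redis` - Redis cache data
- `./data/playwright` - Playwright browser cache
### Environment Variables
- `POSTGRES_USER` / `POSTGRES_PASSWORD`: PostgreSQL database credentials
- `BULL_AUTH_KEY`: Access key for the queue admin UI
- `OPENAI_API_KEY`: OpenAI API key (used for AI features, optional)
### Architecture
The self-hosted version of Firecrawl consists of the following service components:
- **API service**: Main API server that handles all requests (limited to 4 CPU cores and 8 GB of memory)
- **Playwright service**: Browser automation service (limited to 2 CPU cores and 4 GB of memory)
- **Redis**: Task queue and cache backend
- **RabbitMQ**: NuQ message broker
- **PostgreSQL**: Database for task state management
## Links
- Website: https://www.firecrawl.dev
- GitHub: https://github.com/firecrawl/firecrawl
- Documentation: https://docs.firecrawl.dev
- Discord community: https://discord.gg/firecrawl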
+65
@@ -0,0 +1,65 @@
# Firecrawl
Turn any website into LLM-ready structured data. A powerful web scraping, crawling, search and data extraction platform.
## Features
- **Single Page Scraping**: Convert any URL to Markdown, HTML, screenshots, or structured JSON
- **Multi-Page Crawling**: Recursively scrape entire websites with intelligent link filtering
- **URL Discovery**: Discover all URLs on a website instantly via sitemaps, index queries, or search
- **Web Search**: Search the web and get full page content from results in a single call
- **AI Extraction**: LLM-powered structured data extraction with schema validation
- **Autonomous Agent**: AI research agent that automatically navigates and extracts data
- **Remote Browser**: Remote browser sessions with CDP access and code execution
- **Batch Operations**: Asynchronous bulk scraping of multiple URLs
- **Self-Hosted**: Fully open source, supports local deployment with complete data control
## Usage
### Default Port
- API Service: 3002
- Queue Admin UI: http://your-ip:3002/admin/YOUR_BULL_AUTH_KEY/queues
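The queue admin address embeds the value of `BULL_AUTH_KEY` as a path segment. A minimal sketch of composing that address follows; the helper name `queue_admin_url` is hypothetical, and percent-encoding the key is an assumption (the source does not say whether the server accepts an encoded key, so a key made of URL-safe characters avoids the question).

```python
from urllib.parse import quote


def queue_admin_url(host: str, port: int, bull_auth_key: str) -> str:
    """Build the queue admin UI address shown in the list above."""
    # Percent-encode the key so characters such as "/" stay inside one path segment.
    encoded_key = quote(bull_auth_key, safe="")
    return f"http://{host}:{port}/admin/{encoded_key}/queues"


print(queue_admin_url("your-ip", 3002, "YOUR_BULL_AUTH_KEY"))
```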
### API Access
After deployment, access the API at `http://your-ip:3002`.
Test the crawl endpoint:
```bash
curl -X POST http://localhost:3002/v1/crawl \
-H 'Content-Type: application/json' \
-d '{
"url": "https://firecrawl.dev"
}'
```
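The same request can be sketched in Python with only the standard library. The helper names `build_crawl_request` and `send` are hypothetical, and the sketch assumes the instance accepts unauthenticated requests, which is consistent with `USE_DB_AUTHENTICATION=false` in the compose file of this commit. The response format is not described here, so `send` returns the raw body.

```python
import json
import urllib.request


def build_crawl_request(base_url: str, target_url: str) -> urllib.request.Request:
    """Prepare the POST /v1/crawl request from the curl example."""
    body = json.dumps({"url": target_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/v1/crawl",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0) -> str:
    """Send the request; needs a running Firecrawl instance."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


request = build_crawl_request("http://localhost:3002", "https://firecrawl.dev")
print(request.get_method(), request.full_url)
```

Calling `send(request)` performs the actual network call and is left out of the top-level code so the sketch can be read and run without a deployed instance.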
### Data Directories
Application data is stored in the following directories:
- `./data/api` - API service data
- `./data/postgres` - PostgreSQL database data
- `./data/redis` - Redis cache data
- `./data/playwright` - Playwright browser cache
### Environment Variables
- `POSTGRES_USER` / `POSTGRES_PASSWORD`: PostgreSQL database credentials
- `BULL_AUTH_KEY`: Access key for the queue admin UI
- `OPENAI_API_KEY`: OpenAI API key for AI-powered features (optional)
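The form in this commit defaults `BULL_AUTH_KEY` to `CHANGEME` and leaves `POSTGRES_PASSWORD` empty, so both values have to be chosen at install time. One possible way to generate them is sketched below; the function name `generate_env_values` and the 32-character length are illustrative choices, not requirements stated by the source.

```python
import secrets


def generate_env_values() -> dict:
    """Create random values for the two secrets listed above."""
    return {
        # token_hex yields only [0-9a-f], so the key is safe to embed in a URL path.
        "BULL_AUTH_KEY": secrets.token_hex(16),
        "POSTGRES_PASSWORD": secrets.token_hex(16),
    }


for name, value in generate_env_values().items():
    print(f"{name}={value}")
```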
### Architecture
The self-hosted version includes the following service components:
- **API Service**: Main API server handling all requests (4 CPU cores, 8GB RAM limit)
- **Playwright Service**: Browser automation service (2 CPU cores, 4GB RAM limit)
- **Redis**: Job queue and cache backend
- **RabbitMQ**: NuQ message broker
- **PostgreSQL**: Job state management database
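As a quick sizing aid, the two limits listed above can be added up. This is only a sketch: Redis, RabbitMQ, and PostgreSQL have no limits listed, so the result is a ceiling for the two limited services, not a full requirement for the host.

```python
# CPU and memory limits copied from the component list above.
LIMITS = {
    "api": {"cpus": 4.0, "memory_gb": 8},
    "playwright": {"cpus": 2.0, "memory_gb": 4},
}


def total_limits(limits):
    """Sum the CPU and memory ceilings of all limited services."""
    cpus = sum(item["cpus"] for item in limits.values())
    memory_gb = sum(item["memory_gb"] for item in limits.values())
    return cpus, memory_gb


cpus, memory_gb = total_limits(LIMITS)
print(f"Limited services can use up to {cpus:g} CPU cores and {memory_gb} GB of memory")
```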
## Links
- Website: https://www.firecrawl.dev
- GitHub: https://github.com/firecrawl/firecrawl
- Documentation: https://docs.firecrawl.dev
- Discord: https://discord.gg/firecrawl
+25
@@ -0,0 +1,25 @@
name: Firecrawl
tags:
  - 开发工具
  - AI
  - 爬虫
title: 将任意网站转换为适合大语言模型的结构化数据
description:
  en: Turn any website into LLM-ready structured data. Scrape, crawl, search and extract clean markdown, structured JSON or screenshots from websites
  zh: 将任意网站转换为适合大语言模型的结构化数据。支持抓取、爬取、搜索和提取干净的 Markdown、结构化 JSON 或截图
additionalProperties:
  key: firecrawl
  name: Firecrawl
  tags:
    - DevTool
    - AI
    - Crawler
  shortDescZh: 将任意网站转换为适合大语言模型的结构化数据
  shortDescEn: Turn any website into LLM-ready structured data
  type: website
  crossVersionUpdate: true
  limit: 0
  recommend: 0
  website: https://www.firecrawl.dev
  github: https://github.com/firecrawl/firecrawl
  document: https://docs.firecrawl.dev
+47
@@ -0,0 +1,47 @@
additionalProperties:
  formFields:
    - default: "3002"
      envKey: PANEL_APP_PORT_HTTP
      label:
        en: API Port
        zh: API端口
      required: true
      type: number
      edit: true
      rule: paramPort
    - default: "CHANGEME"
      envKey: BULL_AUTH_KEY
      label:
        en: Bull Queue Admin Key
        zh: 队列管理密钥
      required: true
      type: text
      edit: true
      rule: paramCommon
    - default: "firecrawl"
      envKey: POSTGRES_USER
      label:
        en: PostgreSQL Username
        zh: 数据库用户名
      required: true
      type: text
      edit: true
      rule: paramCommon
    - default: ""
      envKey: POSTGRES_PASSWORD
      label:
        en: PostgreSQL Password
        zh: 数据库密码
      required: true
      type: password
      edit: true
      rule: paramCommon
    - default: ""
      envKey: OPENAI_API_KEY
      label:
        en: OpenAI API Key (Optional)
        zh: OpenAI API密钥(可选)
      required: false
      type: text
      edit: true
      rule: ""
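A rough sketch of how these form fields relate to the compose file: each `envKey` names a variable the compose file reads, and `default` is its starting value. Rendering them as `KEY=value` lines is an assumption about how the panel passes values on, not something this commit states, and the helper names are hypothetical. The check shows that `POSTGRES_PASSWORD` is the one required field with no usable default.

```python
# envKey, default and required flags copied from the form fields above.
FORM_FIELDS = [
    {"envKey": "PANEL_APP_PORT_HTTP", "default": "3002", "required": True},
    {"envKey": "BULL_AUTH_KEY", "default": "CHANGEME", "required": True},
    {"envKey": "POSTGRES_USER", "default": "firecrawl", "required": True},
    {"envKey": "POSTGRES_PASSWORD", "default": "", "required": True},
    {"envKey": "OPENAI_API_KEY", "default": "", "required": False},
]


def missing_required(fields, user_values):
    """Return the envKeys that are required but would end up empty."""
    missing = []
    for field in fields:
        value = user_values.get(field["envKey"], field["default"])
        if field["required"] and value == "":
            missing.append(field["envKey"])
    return missing


def render_env(fields, user_values):
    """Render KEY=value lines, preferring user input over defaults."""
    return [
        f"{field['envKey']}={user_values.get(field['envKey'], field['default'])}"
        for field in fields
    ]


print(missing_required(FORM_FIELDS, {}))
print("\n".join(render_env(FORM_FIELDS, {"POSTGRES_PASSWORD": "example-only"})))
```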
+121
@@ -0,0 +1,121 @@
services:
  firecrawl-api:
    image: ghcr.io/firecrawl/firecrawl:latest
    container_name: ${CONTAINER_NAME}
    restart: always
    environment:
      - HOST=0.0.0.0
      # Listen on the container-side port published under "ports" below,
      # so changing the host port in the form does not break the mapping.
      - PORT=3002
      - REDIS_URL=redis://firecrawl-redis:6379
      - REDIS_RATE_LIMIT_URL=redis://firecrawl-redis:6379
      - PLAYWRIGHT_MICROSERVICE_URL=http://firecrawl-playwright:3000/scrape
      - POSTGRES_USER=${POSTGRES_USER:-firecrawl}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD:-firecrawl}
      - POSTGRES_DB=firecrawl
      - POSTGRES_HOST=firecrawl-postgres
      - POSTGRES_PORT=5432
      - USE_DB_AUTHENTICATION=false
      - NUM_WORKERS_PER_QUEUE=8
      - CRAWL_CONCURRENT_REQUESTS=10
      - MAX_CONCURRENT_JOBS=5
      - BROWSER_POOL_SIZE=5
      - OPENAI_API_KEY=${OPENAI_API_KEY:-}
      - BULL_AUTH_KEY=${BULL_AUTH_KEY:-CHANGEME}
      - NUQ_RABBITMQ_URL=amqp://firecrawl-rabbitmq:5672
      - ENV=local
      - EXTRACT_WORKER_PORT=3004
      - WORKER_PORT=3005
      - HARNESS_STARTUP_TIMEOUT_MS=60000
      - TZ=Asia/Shanghai
    depends_on:
      firecrawl-redis:
        condition: service_started
      firecrawl-playwright:
        condition: service_started
      firecrawl-rabbitmq:
        condition: service_healthy
      firecrawl-postgres:
        condition: service_started
    ports:
      - "${PANEL_APP_PORT_HTTP}:3002"
    command: node dist/src/harness.js --start-docker
    ulimits:
      nofile:
        soft: 65535
        hard: 65535
    volumes:
      - ./data/api:/app/data
    networks:
      - 1panel-network
    labels:
      createdBy: "Apps"
    cpus: 4.0
    mem_limit: 8G
    memswap_limit: 8G
  firecrawl-playwright:
    image: ghcr.io/firecrawl/playwright-service:latest
    container_name: ${CONTAINER_NAME}-playwright
    restart: always
    environment:
      - PORT=3000
      - PROXY_SERVER=
      - PROXY_USERNAME=
      - PROXY_PASSWORD=
      - ALLOW_LOCAL_WEBHOOKS=false
      - BLOCK_MEDIA=false
      - MAX_CONCURRENT_PAGES=10
      - TZ=Asia/Shanghai
    volumes:
      # Note: the tmpfs entry below targets the same path and may shadow this bind mount.
      - ./data/playwright:/tmp/.cache
    networks:
      - 1panel-network
    tmpfs:
      - /tmp/.cache:noexec,nosuid,size=1g
    labels:
      createdBy: "Apps"
    cpus: 2.0
    mem_limit: 4G
    memswap_limit: 4G
  firecrawl-redis:
    image: redis:alpine
    container_name: ${CONTAINER_NAME}-redis
    restart: always
    command: redis-server --bind 0.0.0.0
    networks:
      - 1panel-network
    volumes:
      - ./data/redis:/data
  firecrawl-rabbitmq:
    image: rabbitmq:3-management
    container_name: ${CONTAINER_NAME}-rabbitmq
    restart: always
    command: rabbitmq-server
    healthcheck:
      test: ["CMD", "rabbitmq-diagnostics", "-q", "check_running"]
      interval: 5s
      timeout: 5s
      retries: 3
      start_period: 5s
    networks:
      - 1panel-network
  firecrawl-postgres:
    image: postgres:16-alpine
    container_name: ${CONTAINER_NAME}-postgres
    restart: always
    environment:
      - POSTGRES_USER=${POSTGRES_USER:-firecrawl}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD:-firecrawl}
      - POSTGRES_DB=firecrawl
      - TZ=Asia/Shanghai
    networks:
      - 1panel-network
    volumes:
      - ./data/postgres:/var/lib/postgresql/data
networks:
  1panel-network:
    external: true
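The `depends_on` entries of `firecrawl-api` define a start order: the four backing services come up first and the API last, with RabbitMQ additionally required to pass its health check. A small sketch of that ordering using the standard library's `graphlib` follows. The dictionary is copied by hand from the compose file, and the order among the four backing services is not fixed by it.

```python
from graphlib import TopologicalSorter

# Service -> services it depends on, copied from the depends_on block.
DEPENDS_ON = {
    "firecrawl-api": {
        "firecrawl-redis",
        "firecrawl-playwright",
        "firecrawl-rabbitmq",
        "firecrawl-postgres",
    },
    "firecrawl-redis": set(),
    "firecrawl-playwright": set(),
    "firecrawl-rabbitmq": set(),
    "firecrawl-postgres": set(),
}


def startup_order(graph):
    """Return one valid start order in which dependencies precede dependents."""
    return list(TopologicalSorter(graph).static_order())


order = startup_order(DEPENDS_ON)
print(" -> ".join(order))
```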
Binary file not shown (image, 8.7 KiB).
+47
@@ -0,0 +1,47 @@
additionalProperties:
  formFields:
    - default: "3002"
      envKey: PANEL_APP_PORT_HTTP
      label:
        en: API Port
        zh: API端口
      required: true
      type: number
      edit: true
      rule: paramPort
    - default: "CHANGEME"
      envKey: BULL_AUTH_KEY
      label:
        en: Bull Queue Admin Key
        zh: 队列管理密钥
      required: true
      type: text
      edit: true
      rule: paramCommon
    - default: "firecrawl"
      envKey: POSTGRES_USER
      label:
        en: PostgreSQL Username
        zh: 数据库用户名
      required: true
      type: text
      edit: true
      rule: paramCommon
    - default: ""
      envKey: POSTGRES_PASSWORD
      label:
        en: PostgreSQL Password
        zh: 数据库密码
      required: true
      type: password
      edit: true
      rule: paramCommon
    - default: ""
      envKey: OPENAI_API_KEY
      label:
        en: OpenAI API Key (Optional)
        zh: OpenAI API密钥(可选)
      required: false
      type: text
      edit: true
      rule: ""
+121
@@ -0,0 +1,121 @@
services:
  firecrawl-api:
    image: ghcr.io/firecrawl/firecrawl:v2.10.0
    container_name: ${CONTAINER_NAME}
    restart: always
    environment:
      - HOST=0.0.0.0
      # Listen on the container-side port published under "ports" below,
      # so changing the host port in the form does not break the mapping.
      - PORT=3002
      - REDIS_URL=redis://firecrawl-redis:6379
      - REDIS_RATE_LIMIT_URL=redis://firecrawl-redis:6379
      - PLAYWRIGHT_MICROSERVICE_URL=http://firecrawl-playwright:3000/scrape
      - POSTGRES_USER=${POSTGRES_USER:-firecrawl}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD:-firecrawl}
      - POSTGRES_DB=firecrawl
      - POSTGRES_HOST=firecrawl-postgres
      - POSTGRES_PORT=5432
      - USE_DB_AUTHENTICATION=false
      - NUM_WORKERS_PER_QUEUE=8
      - CRAWL_CONCURRENT_REQUESTS=10
      - MAX_CONCURRENT_JOBS=5
      - BROWSER_POOL_SIZE=5
      - OPENAI_API_KEY=${OPENAI_API_KEY:-}
      - BULL_AUTH_KEY=${BULL_AUTH_KEY:-CHANGEME}
      - NUQ_RABBITMQ_URL=amqp://firecrawl-rabbitmq:5672
      - ENV=local
      - EXTRACT_WORKER_PORT=3004
      - WORKER_PORT=3005
      - HARNESS_STARTUP_TIMEOUT_MS=60000
      - TZ=Asia/Shanghai
    depends_on:
      firecrawl-redis:
        condition: service_started
      firecrawl-playwright:
        condition: service_started
      firecrawl-rabbitmq:
        condition: service_healthy
      firecrawl-postgres:
        condition: service_started
    ports:
      - "${PANEL_APP_PORT_HTTP}:3002"
    command: node dist/src/harness.js --start-docker
    ulimits:
      nofile:
        soft: 65535
        hard: 65535
    volumes:
      - ./data/api:/app/data
    networks:
      - 1panel-network
    labels:
      createdBy: "Apps"
    cpus: 4.0
    mem_limit: 8G
    memswap_limit: 8G
  firecrawl-playwright:
    image: ghcr.io/firecrawl/playwright-service:v2.10.0
    container_name: ${CONTAINER_NAME}-playwright
    restart: always
    environment:
      - PORT=3000
      - PROXY_SERVER=
      - PROXY_USERNAME=
      - PROXY_PASSWORD=
      - ALLOW_LOCAL_WEBHOOKS=false
      - BLOCK_MEDIA=false
      - MAX_CONCURRENT_PAGES=10
      - TZ=Asia/Shanghai
    volumes:
      # Note: the tmpfs entry below targets the same path and may shadow this bind mount.
      - ./data/playwright:/tmp/.cache
    networks:
      - 1panel-network
    tmpfs:
      - /tmp/.cache:noexec,nosuid,size=1g
    labels:
      createdBy: "Apps"
    cpus: 2.0
    mem_limit: 4G
    memswap_limit: 4G
  firecrawl-redis:
    image: redis:alpine
    container_name: ${CONTAINER_NAME}-redis
    restart: always
    command: redis-server --bind 0.0.0.0
    networks:
      - 1panel-network
    volumes:
      - ./data/redis:/data
  firecrawl-rabbitmq:
    image: rabbitmq:3-management
    container_name: ${CONTAINER_NAME}-rabbitmq
    restart: always
    command: rabbitmq-server
    healthcheck:
      test: ["CMD", "rabbitmq-diagnostics", "-q", "check_running"]
      interval: 5s
      timeout: 5s
      retries: 3
      start_period: 5s
    networks:
      - 1panel-network
  firecrawl-postgres:
    image: postgres:16-alpine
    container_name: ${CONTAINER_NAME}-postgres
    restart: always
    environment:
      - POSTGRES_USER=${POSTGRES_USER:-firecrawl}
      - POSTGRES_PASSWORD=${POSTGRES_PASSWORD:-firecrawl}
      - POSTGRES_DB=firecrawl
      - TZ=Asia/Shanghai
    networks:
      - 1panel-network
    volumes:
      - ./data/postgres:/var/lib/postgresql/data
networks:
  1panel-network:
    external: true
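The compose file relies on `${NAME:-default}` interpolation, where the default applies when the variable is unset or empty. One consequence is that an empty `POSTGRES_PASSWORD` from the form falls back to `firecrawl`. The sketch below imitates only the two forms used in this file (`${NAME}` and `${NAME:-default}`); the function name `interpolate` is hypothetical and this is not Compose's own implementation.

```python
import re

# Matches ${NAME} and ${NAME:-default}; other interpolation forms are not handled.
_PATTERN = re.compile(r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}")


def interpolate(text: str, env: dict) -> str:
    """Expand variables; the default is used when NAME is unset or empty."""

    def replace(match):
        value = env.get(match.group("name"), "")
        default = match.group("default")
        if value == "" and default is not None:
            return default
        return value

    return _PATTERN.sub(replace, text)


line = "POSTGRES_PASSWORD=${POSTGRES_PASSWORD:-firecrawl}"
print(interpolate(line, {"POSTGRES_PASSWORD": ""}))
print(interpolate("${PANEL_APP_PORT_HTTP}:3002", {"PANEL_APP_PORT_HTTP": "8080"}))
```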
-1
@@ -66,7 +66,6 @@ additionalProperties:
       labelEn: Time Zone
       labelZh: 时区
       required: true
-      rule: paramCommon
       type: text
       label:
         en: Time Zone
-1
@@ -66,7 +66,6 @@ additionalProperties:
       labelEn: Time Zone
       labelZh: 时区
       required: true
-      rule: paramCommon
       type: text
       label:
         en: Time Zone
+57
@@ -0,0 +1,57 @@
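# Vane 🔍
Vane is a **privacy-focused AI answering engine** that runs entirely on your own hardware. It combines knowledge from across the internet with support for **local LLMs** (Ollama) and cloud providers (OpenAI, Claude, Groq) to deliver accurate answers with **cited sources** while keeping your searches completely private.
## Features
🤖 **Support for all major AI providers** - Use local LLMs (Ollama) or connect to OpenAI, Anthropic Claude, Google Gemini, Groq, and others. Mix and match models as needed.
**Smart search modes** - Choose Speed mode for quick answers, Balanced mode for everyday searches, and Quality mode for deep research.
🧭 **Choose your sources** - Search web pages, discussions, or academic papers. More sources and integrations are under development.
🧩 **Widgets** - Useful UI cards that appear when relevant, for quick lookups such as weather, calculations, and stock prices.
🔍 **Web search powered by SearxNG** - Access multiple search engines while keeping your identity private. Support for Tavily and Exa is coming soon for better results.
📷 **Image and video search** - Find visual content beyond text results. Search is no longer limited to articles.
📄 **File uploads** - Upload documents and ask questions about them. PDFs, text files, images: Vane understands them all.
🌐 **Search specific domains** - Restrict a search to specific websites. Well suited to technical documentation or research papers.
💡 **Smart suggestions** - Get intelligent search suggestions as you type to help you phrase better queries.
📚 **Discover** - Browse interesting trending articles and content throughout the day. Stay informed without searching.
🕒 **Search history** - Every search is saved locally so you can revisit your findings at any time. Your research is never lost.
## Usage
### Default Port
- Web UI: 3000
### Data Directory
Application data is stored in the `./data` directory.
### Configuration
After deployment, open `http://localhost:3000` in a browser and configure your API keys, models, and other options on the settings screen.
If you already have a SearxNG instance, set the `SEARXNG_API_URL` environment variable to its address.
### Local LLM Support
If you use a local LLM such as Ollama, make sure that:
- The server listens on `0.0.0.0` (not `127.0.0.1`)
- The API URL and model name are configured correctly in the settings
- On Linux, `OLLAMA_HOST=0.0.0.0:11434` is set
## Links
- Website: https://github.com/ItzCrazyKns/Vane
- GitHub: https://github.com/ItzCrazyKns/Vane
- Documentation: https://github.com/ItzCrazyKns/Vane/tree/master/docs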
+56
@@ -0,0 +1,56 @@
# Vane 🔍
Vane is a **privacy-focused AI answering engine** that runs entirely on your own hardware. It combines knowledge from the vast internet with support for **local LLMs** (Ollama) and cloud providers (OpenAI, Claude, Groq), delivering accurate answers with **cited sources** while keeping your searches completely private.
## Features
🤖 **Support for all major AI providers** - Use local LLMs through Ollama or connect to OpenAI, Anthropic Claude, Google Gemini, Groq, and more. Mix and match models based on your needs.
**Smart search modes** - Choose Speed Mode when you need quick answers, Balanced Mode for everyday searches, or Quality Mode for deep research.
🧭 **Pick your sources** - Search the web, discussions, or academic papers. More sources and integrations are in progress.
🧩 **Widgets** - Helpful UI cards that show up when relevant, like weather, calculations, stock prices, and other quick lookups.
🔍 **Web search powered by SearxNG** - Access multiple search engines while keeping your identity private. Support for Tavily and Exa coming soon.
📷 **Image and video search** - Find visual content alongside text results. Search isn't limited to just articles anymore.
📄 **File uploads** - Upload documents and ask questions about them. PDFs, text files, images - Vane understands them all.
🌐 **Search specific domains** - Limit your search to specific websites when you know where to look. Perfect for technical documentation or research papers.
💡 **Smart suggestions** - Get intelligent search suggestions as you type, helping you formulate better queries.
📚 **Discover** - Browse interesting articles and trending content throughout the day. Stay informed without even searching.
🕒 **Search history** - Every search is saved locally so you can revisit your discoveries anytime. Your research is never lost.
## Usage
### Default Port
- Web UI: 3000
### Data Directory
Application data is stored in the `./data` directory.
### Configuration
After deployment, open your browser and navigate to `http://localhost:3000` to configure your API keys, models, and other settings in the setup screen.
If you already have a SearxNG instance, you can set the `SEARXNG_API_URL` environment variable to point to your SearxNG URL.
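Before saving a value for `SEARXNG_API_URL`, it can help to check that it is a well-formed HTTP(S) address. The sketch below is one way to do that; the function name `normalize_searxng_url` is hypothetical, and treating an empty value as "use the built-in instance" follows the form label in this commit.

```python
from urllib.parse import urlparse


def normalize_searxng_url(raw: str) -> str:
    """Return a cleaned base URL, or "" when the field is left empty."""
    value = raw.strip()
    if not value:
        return ""  # empty means the built-in SearxNG is used
    parsed = urlparse(value)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"expected an http(s) URL, got {raw!r}")
    # Drop a trailing slash so paths can be appended consistently.
    return value.rstrip("/")


print(normalize_searxng_url(" http://searxng:8080/ "))
```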
### Local LLM Support
If using local LLMs like Ollama, ensure that:
- Your server is running on `0.0.0.0` (not `127.0.0.1`)
- You have correctly configured the API URL and model name in settings
- Linux users need to configure `OLLAMA_HOST=0.0.0.0:11434`
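The reason for the `0.0.0.0` requirement is that a container has its own loopback interface, so a server bound to `127.0.0.1` on the host is not reachable from the Vane container. A minimal sketch that flags loopback-only bind addresses follows; the function name `is_loopback_only` is hypothetical, and only the IPv4 `host:port` form used by `OLLAMA_HOST` above is handled.

```python
import ipaddress


def is_loopback_only(bind: str) -> bool:
    """Return True when the bind address accepts local connections only."""
    host = bind.split(":", 1)[0]  # IPv4 "host" or "host:port" form only
    if host == "localhost":
        return True
    return ipaddress.ip_address(host).is_loopback


for candidate in ("127.0.0.1:11434", "0.0.0.0:11434"):
    status = "loopback only" if is_loopback_only(candidate) else "reachable from containers"
    print(f"{candidate}: {status}")
```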
## Links
- GitHub: https://github.com/ItzCrazyKns/Vane
- Documentation: https://github.com/ItzCrazyKns/Vane/tree/master/docs
+27
@@ -0,0 +1,27 @@
name: Vane
tags:
  - AI / 大模型
  - 搜索
title: 专注于隐私的 AI 问答搜索引擎
description: 专注于隐私的 AI 问答搜索引擎,完全在您自己的硬件上运行
additionalProperties:
  key: vane
  name: Vane
  tags:
    - AI
    - Search
  shortDescZh: 专注于隐私的 AI 问答搜索引擎
  shortDescEn: Privacy-focused AI answering engine
  description:
    en: Vane is a privacy-focused AI answering engine that runs entirely on your own hardware. It combines knowledge from the vast internet with support for local LLMs (Ollama) and cloud providers (OpenAI, Claude, Groq), delivering accurate answers with cited sources while keeping your searches completely private.
    zh: Vane 是一个专注于隐私的 AI 问答搜索引擎,完全在您自己的硬件上运行。它将来自互联网的知识与本地 LLM(Ollama)和云提供商(OpenAI、Claude、Groq)的支持相结合,提供带有引用来源的准确答案,同时保持您的搜索完全私密。
  type: website
  crossVersionUpdate: true
  limit: 0
  recommend: 0
  website: https://github.com/ItzCrazyKns/Vane
  github: https://github.com/ItzCrazyKns/Vane
  document: https://github.com/ItzCrazyKns/Vane/tree/master/docs
  architectures:
    - amd64
    - arm64
+18
@@ -0,0 +1,18 @@
additionalProperties:
  formFields:
    - default: 3000
      edit: true
      envKey: PANEL_APP_PORT_HTTP
      labelEn: Web Port
      labelZh: Web 端口
      required: true
      rule: paramPort
      type: number
    - default: ""
      edit: true
      envKey: SEARXNG_API_URL
      labelEn: SearXNG API URL (Optional, leave empty to use built-in)
      labelZh: SearXNG API 地址(可选,留空则使用内置)
      required: false
      rule: paramExtUrl
      type: text
+19
@@ -0,0 +1,19 @@
services:
  vane:
    image: itzcrazykns1337/vane:latest
    container_name: ${CONTAINER_NAME}
    restart: always
    ports:
      - "${PANEL_APP_PORT_HTTP}:3000"
    volumes:
      - ./data:/home/vane/data
    environment:
      - TZ=Asia/Shanghai
      - SEARXNG_API_URL=${SEARXNG_API_URL}
    networks:
      - 1panel-network
    labels:
      createdBy: "Apps"
networks:
  1panel-network:
    external: true
Binary file not shown (image, 30 KiB).
+18
@@ -0,0 +1,18 @@
additionalProperties:
  formFields:
    - default: 3000
      edit: true
      envKey: PANEL_APP_PORT_HTTP
      labelEn: Web Port
      labelZh: Web 端口
      required: true
      rule: paramPort
      type: number
    - default: ""
      edit: true
      envKey: SEARXNG_API_URL
      labelEn: SearXNG API URL (Optional, leave empty to use built-in)
      labelZh: SearXNG API 地址(可选,留空则使用内置)
      required: false
      rule: paramExtUrl
      type: text
+19
@@ -0,0 +1,19 @@
services:
  vane:
    image: itzcrazykns1337/vane:v1.12.2
    container_name: ${CONTAINER_NAME}
    restart: always
    ports:
      - "${PANEL_APP_PORT_HTTP}:3000"
    volumes:
      - ./data:/home/vane/data
    environment:
      - TZ=Asia/Shanghai
      - SEARXNG_API_URL=${SEARXNG_API_URL}
    networks:
      - 1panel-network
    labels:
      createdBy: "Apps"
networks:
  1panel-network:
    external: true