mirror of
https://github.com/arch3rPro/1Panel-Appstore.git
synced 2026-06-11 00:59:40 +08:00
e98811cd04
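Add two new apps, Firecrawl and Vane, with complete app configuration, docker-compose orchestration, documentation, and logo assets; also fix the redundant rule parameter in the lxserver timezone configuration item.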
Firecrawl
Turn any website into LLM-ready structured data. A powerful web scraping, crawling, search and data extraction platform.
Features
- Single Page Scraping: Convert any URL to Markdown, HTML, screenshots, or structured JSON
- Multi-Page Crawling: Recursively scrape entire websites with intelligent link filtering
- URL Discovery: Discover all URLs on a website instantly via sitemaps, index queries, or search
- Web Search: Search the web and get full page content from results in a single call
- AI Extraction: LLM-powered structured data extraction with schema validation
- Autonomous Agent: AI research agent that automatically navigates and extracts data
- Remote Browser: Remote browser sessions with CDP access and code execution
- Batch Operations: Asynchronous bulk scraping of multiple URLs
- Self-Hosted: Fully open source, supports local deployment with complete data control
Usage
Default Port
- API Service: 3002
- Queue Admin UI: http://your-ip:3002/admin/YOUR_BULL_AUTH_KEY/queues
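The queue admin URL embeds the value of BULL_AUTH_KEY in its path. A minimal sketch of assembling it in shell follows; the helper name `queue_admin_url` is hypothetical, and the host and key shown are placeholders rather than real values.

```shell
#!/bin/sh
# Hypothetical helper: build the queue admin UI URL from host, key, and port.
# The port defaults to 3002, the API service port listed above.
queue_admin_url() {
    host="$1"
    key="$2"
    port="${3:-3002}"
    printf 'http://%s:%s/admin/%s/queues\n' "$host" "$port" "$key"
}

# Example with placeholder values; substitute your server address and key.
queue_admin_url "your-ip" "YOUR_BULL_AUTH_KEY"
```

Because the key is part of the URL, it may end up in browser history and proxy logs, so a long random value is a reasonable choice.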
API Access
After deployment, access the API at http://your-ip:3002.
Test the crawl endpoint:
curl -X POST http://localhost:3002/v1/crawl \
-H 'Content-Type: application/json' \
-d '{
"url": "https://firecrawl.dev"
}'
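A crawl request like the one above starts an asynchronous job, so the results are typically fetched in a second step. The sketch below assumes that the response is JSON with an `id` field and that job status is served at `/v1/crawl/<id>`; both are assumptions about the upstream v1 API and should be checked against the Firecrawl documentation. The helper names `extract_job_id` and `crawl_status_url` are hypothetical.

```shell
#!/bin/sh
# Assumption: the crawl response is JSON containing an "id" field.
# extract_job_id reads a JSON response on stdin and prints the "id" value.
extract_job_id() {
    sed -n 's/.*"id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Assumption: job status is available at <base>/v1/crawl/<id>.
# A trailing slash on the base URL is stripped before joining.
crawl_status_url() {
    base="$1"
    job_id="$2"
    printf '%s/v1/crawl/%s\n' "${base%/}" "$job_id"
}

# Usage against a running instance (not executed here):
#   response=$(curl -s -X POST http://localhost:3002/v1/crawl \
#       -H 'Content-Type: application/json' \
#       -d '{"url": "https://firecrawl.dev"}')
#   job_id=$(printf '%s\n' "$response" | extract_job_id)
#   curl -s "$(crawl_status_url http://localhost:3002 "$job_id")"
```

The `sed` extraction keeps the sketch free of extra dependencies; a JSON-aware tool such as `jq` would be more robust if it is available.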
Data Directories
Application data is stored in the following directories:
- ./data/api: API service data
- ./data/postgres: PostgreSQL database data
- ./data/redis: Redis cache data
- ./data/playwright: Playwright browser cache
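If the directories need to exist before the first start, for example when restoring a backup, they can be created ahead of time. This is a minimal sketch with a hypothetical helper name, `init_data_dirs`; the app's own install step may already create these directories, in which case it is unnecessary.

```shell
#!/bin/sh
# Hypothetical helper: create the four data directories under a base path.
# The base path defaults to the current directory, matching ./data/... above.
init_data_dirs() {
    base="${1:-.}"
    for name in api postgres redis playwright; do
        mkdir -p "$base/data/$name" || return 1
    done
}

# Usage from the application directory (not executed here):
#   init_data_dirs .
#   du -sh ./data/*    # check how much space each directory uses
```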
Environment Variables
- POSTGRES_USER / POSTGRES_PASSWORD: PostgreSQL database credentials
- BULL_AUTH_KEY: Access key for the queue admin UI
- OPENAI_API_KEY: OpenAI API key for AI-powered features (optional)
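A small pre-flight check can catch an empty credential before the stack starts. The sketch below assumes the variables are set in the current shell, which may not match how your deployment supplies them (1Panel normally fills them in from the install form); the helper name `missing_env_vars` is hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-flight check: print the name of each required variable
# that is unset or empty. OPENAI_API_KEY is optional, so it is not checked.
missing_env_vars() {
    for name in POSTGRES_USER POSTGRES_PASSWORD BULL_AUTH_KEY; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        [ -n "$value" ] || printf '%s\n' "$name"
    done
}

# Usage (not executed here):
#   missing=$(missing_env_vars)
#   [ -z "$missing" ] || { echo "Missing: $missing" >&2; exit 1; }
```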
Architecture
The self-hosted version includes the following service components:
- API Service: Main API server handling all requests (4 CPU cores, 8GB RAM limit)
- Playwright Service: Browser automation service (2 CPU cores, 4GB RAM limit)
- Redis: Job queue and cache backend
- RabbitMQ: NuQ message broker
- PostgreSQL: Job state management database
Links
- Website: https://www.firecrawl.dev
- GitHub: https://github.com/firecrawl/firecrawl
- Documentation: https://docs.firecrawl.dev
- Discord: https://discord.gg/firecrawl