Compare commits

...

4 Commits

Author SHA1 Message Date
congsh
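762cc41c5a feat: add converting image OCR results into todos
- Backend: add the `/images/:id/convert-to-todo` route and controller method, which creates a todo from an image's OCR result
- Frontend: add the `useConvertImageToTodo` hook and the `ImageService.convertToTodo` method
- Add a "Convert to todo" action button for recognized images on the image list page
- Add the document detail page route and basic UI
- Fix the configuration and availability-check logic of the PaddleOCR and RapidOCR providers
- Improve `document_id` handling on image upload (an empty string becomes null)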
2026-02-27 23:22:29 +08:00
congsh
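ecf3999d2a fix: fix Docker build and path resolution issues
- Use the npm China mirror in the Dockerfile to speed up dependency installation
- Correct the backend health check endpoint path
- Fix image path resolution in the Docker environment
- Correct the default frontend API URL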
2026-02-27 22:42:53 +08:00
congsh
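9a301cc434 feat(ocr): integrate the PaddleOCR service and improve the OCR system
- Add support for PaddleOCR as a local high-accuracy OCR service, including a Dockerfile, API service, and provider implementation
- Integrate the RapidOCR and PaddleOCR services into docker-compose and configure health checks
- Remove the `/api` prefix from the backend API routes to simplify proxy configuration
- Update the Nginx configuration to pass request headers correctly and proxy WebSocket connections
- Add test and configuration options for PaddleOCR and RapidOCR to the frontend settings page
- Fix the backend Dockerfile so native modules that need Python build tools can compile
- Update the OCR setup guide to reflect the current service status and deployment method
- Add upload debug logging and permission settings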
2026-02-27 18:43:07 +08:00
congsh
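764c6a8c0c fix: fix images being inaccessible in the Docker environment
Add a proxy rule for the /uploads path to the Nginx configuration
so that image requests are forwarded to the backend container's static file service.

- Add a location /uploads proxy configuration
- Set a 7-day cache policy to improve performance

Before the fix: uploaded images returned 404
After the fix: images display correctly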

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-27 09:55:57 +08:00
26 changed files with 1098 additions and 136 deletions

View File

@@ -5,9 +5,9 @@
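| Provider | Type | Status | Configuration |
|----------|------|------|----------|
| **Tesseract.js** | Local | ✅ Installed | Used by default, no configuration needed |
| **RapidOCR** | Local | ⚠️ Needs configuration | Requires separate deployment |
| **RapidOCR** | Local | ✅ Deployed | Port 13058, fast and accurate |
| **Baidu OCR** | Cloud | ⚠️ Needs configuration | Requires an API key |
| **PaddleOCR** | Local | ❌ Not supported yet | Requires a Python environment |
| **PaddleOCR** | Local | ❌ Image issue | protobuf compatibility issue |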
---
@@ -33,51 +33,55 @@
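- ✅ Fast
- ✅ High accuracy
- ✅ Deployed locally, keeps data private
- ⚠️ Requires deploying a separate service
- **Integrated into Docker Compose**
### Docker deployment
### Current deployment status
#### Option A: Docker Compose (recommended)
RapidOCR is integrated into the project's Docker Compose configuration:
- **Container name**: `picanalysis-rapidocr`
- **Internal port**: 9004
- **External port**: 13058
- **Health check**: starts and restarts automatically
Add the RapidOCR service to `docker-compose.yml`:
### Docker Compose deployment (already configured)
`docker-compose.yml` already includes: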
```yaml
services:
# ... other services ...
rapidocr:
image: xiaoshaizaiai/rapidocr:latest
container_name: picanalysis-rapidocr
restart: unless-stopped
ports:
- "8080:8080"
networks:
- picanalysis-network
rapidocr:
image: volador/rapidocr:latest
container_name: picanalysis-rapidocr
restart: unless-stopped
ports:
- "13058:9004"
networks:
- picanalysis-network
```
Then update the `.env` file:
The backend environment variable is configured automatically:
```bash
RAPIDOCR_API_URL="http://rapidocr:9004"
```
### Start the service
```bash
RAPIDOCR_API_URL="http://rapidocr:8080"
OCR_PROVIDER="rapidocr"
# Start the RapidOCR service
docker compose up -d rapidocr
# View the logs
docker compose logs -f rapidocr
# Test the service
curl http://localhost:13058
# Expected response: {"message":"Welcome to RapidOCR Server!"}
```
#### Option B: Use an external RapidOCR service
### Usage
If you already have a running RapidOCR service, you only need to configure its URL:
```bash
# .env file
RAPIDOCR_API_URL="http://your-rapidocr-host:8080"
OCR_PROVIDER="rapidocr"
```
#### Verify RapidOCR
```bash
# Check whether the RapidOCR service is reachable
curl http://localhost:8080
```
1. **Automatic selection**: choose "RapidOCR" on the settings page
2. **Environment variable**: set `OCR_PROVIDER=rapidocr`
3. **API calls**: the backend automatically uses `http://rapidocr:9004`
---
@@ -110,16 +114,25 @@ OCR_PROVIDER="baidu"
---
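## 4. PaddleOCR (not supported yet)
## 4. PaddleOCR (temporarily unavailable)
### Limitations
PaddleOCR is a Python library and is relatively complex to integrate into a Node.js environment
### Current status
❌ The Docker image has a protobuf compatibility issue and cannot be used for now
### Problem description
Common PaddleOCR Docker images (such as 987846/paddleocr) use an old version of protobuf that is incompatible with the current environment, which prevents the service from starting.
### Alternatives
The following alternatives are recommended:
- **RapidOCR** - also uses the PaddleOCR engine, but provides an HTTP API
- **Baidu OCR** - cloud-based, high accuracy
- **Tesseract.js** - lightweight local option
- **RapidOCR** - also based on the PaddleOCR engine, but provides a stable HTTP API (integrated)
- **Baidu OCR** - cloud-based, high accuracy, with a free quota
- **Tesseract.js** - lightweight local option, no extra deployment needed
### If you need PaddleOCR
You can:
1. Look for another well-maintained PaddleOCR Docker image
2. Build the PaddleOCR service manually (requires a Python environment)
3. Use one of the other deployment methods from the official PaddleOCR project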
---
@@ -181,7 +194,8 @@ docker compose ps rapidocr
docker compose logs rapidocr
# 3. Test the RapidOCR connection
curl http://localhost:8080
curl http://localhost:13058
# Expected response: {"message":"Welcome to RapidOCR Server!"}
# 4. If the service is not running, start it
docker compose up -d rapidocr

View File

@@ -3,7 +3,8 @@
# ========================================
FROM node:20-alpine AS deps
RUN apk add --no-cache libc6-compat
# Add Python and build tools for native modules (bcrypt, etc.)
RUN apk add --no-cache libc6-compat python3 make g++
WORKDIR /app
@@ -11,8 +12,9 @@ WORKDIR /app
COPY package*.json ./
COPY prisma ./prisma/
# Install dependencies
RUN npm ci
# Install dependencies with Chinese mirror
RUN npm config set registry https://registry.npmmirror.com && \
npm ci
# ========================================
# Stage 2: Builder

View File

@@ -4,6 +4,11 @@
mkdir -p /app/data
chown -R nodejs:nodejs /app/data
# Ensure uploads directory exists with proper permissions (after volume mount)
mkdir -p /app/uploads
chown -R nodejs:nodejs /app/uploads
chmod 755 /app/uploads
# Set database path to data directory
export DATABASE_URL="file:/app/data/prod.db"
@@ -13,10 +18,14 @@ npx prisma db push --skip-generate || echo "Database push failed, will try on st
# Fix database file permissions after creation
if [ -f /app/data/prod.db ]; then
chown nodejs:nodejs /app/data/prod.db
chmod 664 /app/data/prod.db
chown nodejs:nodejs /app/data/prod.db
chmod 664 /app/data/prod.db
fi
# Log uploads directory status
echo "Uploads directory status:"
ls -la /app/uploads || echo "Uploads directory does not exist"
# Start the application as nodejs user
echo "Starting application..."
exec su-exec nodejs npx tsx src/index.ts

View File

@@ -20,6 +20,15 @@ export class ImageController {
const file = req.file;
const { document_id } = req.body;
console.log('[UPLOAD] File received:', {
originalname: file?.originalname,
filename: file?.filename,
path: file?.path,
size: file?.size,
mimetype: file?.mimetype,
document_id,
});
if (!file) {
res.status(400).json({
success: false,
@@ -28,12 +37,15 @@ export class ImageController {
return;
}
// Normalize document_id: convert an empty string to null
const processedDocumentId = document_id && document_id.trim() !== '' ? document_id : null;
const image = await ImageService.create({
user_id: userId,
file_path: `/uploads/${file.filename}`,
file_size: file.size,
mime_type: file.mimetype,
document_id,
document_id: processedDocumentId,
});
// Trigger asynchronous OCR processing (do not wait for completion)
@@ -182,6 +194,76 @@ export class ImageController {
}
}
/**
* Convert image to todo
* POST /api/images/:id/convert-to-todo
*/
static async convertToTodo(req: Request, res: Response): Promise<void> {
try {
const userId = req.user!.user_id;
const { id } = req.params as { id: string };
const { title, description, priority } = req.body;
// Fetch the image record
const image = await ImageService.findById(id, userId);
if (!image) {
res.status(404).json({
success: false,
error: 'Image not found',
});
return;
}
// Check that an OCR result is available
if (!image.ocr_result || image.processing_status !== 'completed') {
res.status(400).json({
success: false,
error: 'OCR has not finished for this image, so it cannot be converted to a todo',
});
return;
}
// Import the services dynamically to avoid circular dependencies
const { TodoService } = await import('../services/todo.service');
const { DocumentService } = await import('../services/document.service');
// If the image is not linked to a document yet, create one first
let documentId = image.document_id;
if (!documentId) {
const doc = await DocumentService.create({
user_id: userId,
title: title || `Extracted from image: ${image.file_path.split('/').pop()}`,
content: image.ocr_result,
});
documentId = doc.id;
// Link the image to the document
await ImageService.linkToDocument(id, userId, documentId);
}
// Create the todo from the document
const todo = await TodoService.create({
user_id: userId,
title: title || `Todo: ${image.ocr_result.slice(0, 50)}${image.ocr_result.length > 50 ? '...' : ''}`,
description: description || image.ocr_result,
priority: priority || 'medium',
document_id: documentId,
});
res.status(201).json({
success: true,
data: todo,
message: 'OCR result converted to a todo',
});
} catch (error) {
const message = error instanceof Error ? error.message : 'Conversion failed';
res.status(400).json({
success: false,
error: message,
});
}
}
/**
* Delete image
* DELETE /api/images/:id
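
The two small rules this controller applies inline, treating an empty `document_id` as "no document" and deriving a default todo title from the first 50 characters of the OCR text, can be isolated as pure functions. The following is only a sketch: `normalizeDocumentId` and `defaultTodoTitle` are hypothetical helper names that do not appear in the PR, and the title prefix text is illustrative.

```typescript
// Hypothetical helpers mirroring the inline logic in ImageController.

/** Treat undefined, null, and whitespace-only IDs as "no document". */
function normalizeDocumentId(raw: string | null | undefined): string | null {
  return raw && raw.trim() !== '' ? raw : null;
}

/** Use the caller's title if given; otherwise the first maxLen chars of the OCR text. */
function defaultTodoTitle(ocrText: string, title?: string, maxLen = 50): string {
  if (title) return title;
  const head = ocrText.slice(0, maxLen);
  // The "Todo: " prefix is illustrative; the ellipsis marks truncated text.
  return `Todo: ${head}${ocrText.length > maxLen ? '...' : ''}`;
}
```

Keeping these as standalone functions would make the rules unit-testable without an HTTP request or database.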

View File

@@ -29,16 +29,16 @@ app.use(express.urlencoded({ extended: true }));
app.use('/uploads', express.static(path.join(process.cwd(), 'uploads')));
// Health check
app.get('/api/health', (_req, res) => {
app.get('/health', (_req, res) => {
res.json({ success: true, message: 'API is running' });
});
// Routes
app.use('/api/auth', authRoutes);
app.use('/api/documents', documentRoutes);
app.use('/api/todos', todoRoutes);
app.use('/api/images', imageRoutes);
app.use('/api/user', userRoutes);
app.use('/auth', authRoutes);
app.use('/documents', documentRoutes);
app.use('/todos', todoRoutes);
app.use('/images', imageRoutes);
app.use('/user', userRoutes);
// 404 handler
app.use((_req, res) => {

View File

@@ -69,7 +69,15 @@ export function resolveImagePath(imagePath: string): string {
// Handle relative paths that start with /uploads/
if (imagePath.startsWith('/uploads/')) {
return path.join(getUploadsDir(), imagePath.replace('/uploads/', ''));
const resolved = path.join(getUploadsDir(), imagePath.replace('/uploads/', ''));
// In the Docker environment, make sure an absolute path is used
if (process.env.NODE_ENV === 'production' || fs.existsSync('/app/uploads')) {
// Docker environment: use /app/uploads/ directly
return `/app/uploads/${imagePath.replace('/uploads/', '')}`;
}
return resolved;
}
// Other relative paths are resolved against the project root
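
The hunk above hard-codes `/app/uploads` when `NODE_ENV` is `production` or that directory exists. As an alternative sketch (not the PR's implementation), the uploads directory can be passed in as a parameter so the same function works inside and outside Docker. `resolveUploadPath` is a hypothetical name, and the directory-escape check is an addition that the original code does not perform.

```typescript
import * as path from 'node:path';

/**
 * Map a stored "/uploads/<name>" path to a file under uploadsDir.
 * Returns null for paths outside "/uploads/" or that try to escape the directory.
 */
function resolveUploadPath(imagePath: string, uploadsDir: string): string | null {
  const prefix = '/uploads/';
  if (!imagePath.startsWith(prefix)) return null;
  const relative = imagePath.slice(prefix.length);
  // posix.join also normalizes ".." segments, so escapes become visible here.
  const resolved = path.posix.join(uploadsDir, relative);
  const root = uploadsDir.endsWith('/') ? uploadsDir : `${uploadsDir}/`;
  return resolved.startsWith(root) ? resolved : null;
}
```

A caller in the container would pass `/app/uploads`, while local development could pass a directory under the project root.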

View File

@@ -54,6 +54,40 @@ router.get('/ocr/providers', authenticate, async (_req, res) => {
}
});
/**
* @route POST /api/images/ocr/test
* @desc Test OCR provider with uploaded image
* @access Private
* @body { provider: 'tesseract' | 'baidu' | 'rapidocr' | 'paddleocr' }
*/
router.post('/ocr/test', authenticate, upload.single('file'), async (req, res) => {
try {
const { provider } = req.body;
if (!req.file) {
res.status(400).json({
success: false,
error: 'Please upload a test image',
});
return;
}
// Run the test through OCRProcessorService
const result = await OCRProcessorService.testProvider(
provider as OCRProviderType,
req.file.path
);
res.json(result);
} catch (error) {
const message = error instanceof Error ? error.message : 'OCR test failed';
res.status(500).json({
success: false,
error: message,
});
}
});
/**
* @route GET /api/images/:id
* @desc Get image by ID
@@ -103,6 +137,13 @@ router.put('/:id/ocr', authenticate, ImageController.updateOCR);
*/
router.put('/:id/link', authenticate, ImageController.linkToDocument);
/**
* @route POST /api/images/:id/convert-to-todo
* @desc Convert image OCR result to todo
* @access Private
*/
router.post('/:id/convert-to-todo', authenticate, ImageController.convertToTodo);
/**
* @route DELETE /api/images/:id
* @desc Delete image
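
The `/ocr/test` handler in this file casts `req.body.provider` with `as OCRProviderType`, which is a compile-time assertion only and does not reject unknown values at runtime. A minimal sketch of a runtime guard follows; `TESTABLE_PROVIDERS` and `parseProvider` are hypothetical names, and the accepted values are taken from the route's `@body` comment.

```typescript
// Provider names accepted by the test endpoint, per its @body annotation.
const TESTABLE_PROVIDERS = ['tesseract', 'baidu', 'rapidocr', 'paddleocr'] as const;
type TestableProvider = (typeof TESTABLE_PROVIDERS)[number];

/** Narrow an untrusted request value to a known provider name, or null. */
function parseProvider(value: unknown): TestableProvider | null {
  return typeof value === 'string' &&
    (TESTABLE_PROVIDERS as readonly string[]).includes(value)
    ? (value as TestableProvider)
    : null;
}
```

A handler could respond with status 400 when `parseProvider` returns null instead of passing the raw value on to the OCR service.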

View File

@@ -8,15 +8,17 @@ export type { IImageSource, OCRRecognitionResult, OCRProviderConfig } from './ba
export { TesseractProvider, tesseractProvider } from './tesseract.provider';
export { BaiduProvider, baiduProvider } from './baidu.provider';
export { RapidOCRProvider, rapidocrProvider } from './rapidocr.provider';
export { PaddleOCRProvider, paddleocrProvider } from './paddleocr.provider';
import { TesseractProvider } from './tesseract.provider';
import { BaiduProvider } from './baidu.provider';
import { RapidOCRProvider } from './rapidocr.provider';
import { PaddleOCRProvider } from './paddleocr.provider';
/**
* OCR provider type
*/
export type OCRProviderType = 'tesseract' | 'baidu' | 'rapidocr' | 'auto';
export type OCRProviderType = 'tesseract' | 'baidu' | 'rapidocr' | 'paddleocr' | 'auto';
/**
* OCR provider factory
@@ -27,6 +29,7 @@ export class OCRProviderFactory {
tesseract: TesseractProvider,
baidu: BaiduProvider,
rapidocr: RapidOCRProvider,
paddleocr: PaddleOCRProvider,
};
/**
@@ -35,7 +38,7 @@ export class OCRProviderFactory {
static create(
type: OCRProviderType,
config?: any
): TesseractProvider | BaiduProvider | RapidOCRProvider {
): TesseractProvider | BaiduProvider | RapidOCRProvider | PaddleOCRProvider {
if (type === 'auto') {
// Automatically select an available provider
return this.autoSelect();
@@ -51,9 +54,9 @@ export class OCRProviderFactory {
/**
* Automatically select an available provider
* Priority: RapidOCR > Tesseract > Baidu
* Priority: PaddleOCR > RapidOCR > Tesseract > Baidu
*/
private static autoSelect(): TesseractProvider | BaiduProvider | RapidOCRProvider {
private static autoSelect(): TesseractProvider | BaiduProvider | RapidOCRProvider | PaddleOCRProvider {
const envProvider = process.env.OCR_PROVIDER as OCRProviderType;
// If a provider is specified and it is not auto, use it
@@ -63,6 +66,11 @@ export class OCRProviderFactory {
}
// Check availability and select
// PaddleOCR (local, high accuracy)
if (process.env.PADDLEOCR_API_URL) {
return new PaddleOCRProvider();
}
// RapidOCR (local, fast)
if (process.env.RAPIDOCR_API_URL) {
return new RapidOCRProvider();
@@ -84,6 +92,7 @@ export class OCRProviderFactory {
Array<{ type: string; name: string; available: boolean; typeDesc: string }>
> {
const providers = [
{ type: 'paddleocr', name: 'PaddleOCR', instance: new PaddleOCRProvider(), typeDesc: 'Local, high accuracy' },
{ type: 'rapidocr', name: 'RapidOCR', instance: new RapidOCRProvider(), typeDesc: 'Local, fast and accurate' },
{ type: 'baidu', name: 'Baidu OCR', instance: new BaiduProvider(), typeDesc: 'Cloud, accurate' },
{ type: 'tesseract', name: 'Tesseract.js', instance: new TesseractProvider(), typeDesc: 'Local, lightweight' },
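
The selection order this factory uses (an explicit `OCR_PROVIDER`, then `PADDLEOCR_API_URL`, then `RAPIDOCR_API_URL`) can be written as a pure function over the environment. This is a sketch under assumptions: `selectProvider` is a hypothetical name, the Baidu branch is omitted because its condition is not visible in this hunk, and Tesseract is assumed to be the final fallback.

```typescript
type ProviderName = 'paddleocr' | 'rapidocr' | 'tesseract' | 'baidu';
type Env = Record<string, string | undefined>;

const KNOWN_PROVIDERS: readonly ProviderName[] = ['paddleocr', 'rapidocr', 'tesseract', 'baidu'];

/** Pick a provider name from environment variables, following the PR's priority order. */
function selectProvider(env: Env): ProviderName {
  // An explicit, recognized OCR_PROVIDER wins; "auto" or unknown values fall through.
  const explicit = KNOWN_PROVIDERS.find((name) => name === env.OCR_PROVIDER);
  if (explicit) return explicit;
  if (env.PADDLEOCR_API_URL) return 'paddleocr';
  if (env.RAPIDOCR_API_URL) return 'rapidocr';
  return 'tesseract'; // assumed fallback; needs no external service
}
```

Because `docker-compose.yml` in this PR gives `PADDLEOCR_API_URL` a default value, a rule of this shape selects PaddleOCR in `auto` mode whenever the backend runs under Compose, even though the setup guide marks PaddleOCR as unavailable. Checking `isAvailable()` before committing to a provider may be worth considering.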

View File

@@ -0,0 +1,157 @@
/**
* PaddleOCR Provider
* Features: high accuracy, multi-language support, runs locally
* Built on the PaddlePaddle deep learning framework
*
* Deployment:
* 1. With Docker: docker run -p 8866:8866 987846/paddleocr:latest
*
* GitHub: https://github.com/PaddlePaddle/PaddleOCR
* Docker Hub: https://hub.docker.com/r/paddlepaddle/paddleocr
*/
import { BaseOCRProvider, IImageSource, OCRRecognitionResult, OCRProviderConfig } from './base.provider';
import fs from 'fs';
interface PaddleOCRResponse {
msg: string;
results: Array<Array<{
boxes: number[][];
rec_text: string;
rec_score: number;
}>>;
status: string;
}
interface PaddleOCRRequest {
images: string[];
}
export class PaddleOCRProvider extends BaseOCRProvider {
private apiUrl: string;
constructor(config: OCRProviderConfig & { apiUrl?: string } = {}) {
super(config);
this.apiUrl = config.apiUrl || process.env.PADDLEOCR_API_URL || 'http://localhost:13059';
}
getName(): string {
return 'PaddleOCR';
}
getType(): 'local' | 'cloud' {
return 'local';
}
/**
* Check whether the PaddleOCR service is available
*/
async isAvailable(): Promise<boolean> {
try {
const response = await fetch(`${this.apiUrl}/`, { signal: AbortSignal.timeout(3000) });
// A healthy PaddleOCR service responds with a 2xx status
return response.ok;
} catch {
// The service is unreachable
return false;
}
}
/**
* Run OCR recognition
*/
async recognize(
source: IImageSource,
options?: OCRProviderConfig
): Promise<OCRRecognitionResult> {
const startTime = Date.now();
// Get the image as Base64
const imageBase64 = await this.getImageBase64(source);
// Call the PaddleOCR API
const response = await this.withTimeout(
fetch(`${this.apiUrl}/predict/ocr_system`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
images: [imageBase64],
} as PaddleOCRRequest),
}),
options?.timeout || this.config.timeout || 30000
);
const data = (await response.json()) as PaddleOCRResponse;
const duration = Date.now() - startTime;
// Check for errors
if (data.status !== '000' && data.status !== '200') {
throw new Error(`PaddleOCR error: ${data.msg || data.status}`);
}
// Extract the text and confidence
const ocrResults = data.results[0] || [];
const text = ocrResults.map((r) => r.rec_text).join('\n');
// Compute the average confidence
const confidence = ocrResults.length > 0
? ocrResults.reduce((acc, r) => acc + (r.rec_score || 0), 0) / ocrResults.length
: 0;
return {
text: text.trim(),
confidence,
duration,
extra: {
provider: 'paddleocr',
textCount: ocrResults.length,
},
};
}
getRecommendations() {
return {
maxImageSize: 10 * 1024 * 1024,
supportedFormats: ['jpg', 'jpeg', 'png', 'webp', 'bmp'],
notes: 'PaddleOCR is an open-source OCR toolkit from Baidu with multi-language support and high accuracy. The PaddleOCR service must be started first.',
};
}
/**
* Get the image as Base64
*/
private async getImageBase64(source: IImageSource): Promise<string> {
if (source.base64) {
// Strip the data URL prefix
return source.base64.replace(/^data:image\/\w+;base64,/, '');
}
if (source.buffer) {
return source.buffer.toString('base64');
}
if (source.path) {
// Use the base class's path resolution method
const fullPath = this.resolveImagePath(source.path);
const buffer = fs.readFileSync(fullPath);
return buffer.toString('base64');
}
throw new Error('Invalid image source');
}
/**
* Timeout wrapper
*/
private async withTimeout<T>(promise: Promise<T>, timeout: number): Promise<T> {
return Promise.race([
promise,
new Promise<never>((_, reject) =>
setTimeout(() => reject(new Error('timeout')), timeout)
),
]);
}
}
// Export a singleton instance
export const paddleocrProvider = new PaddleOCRProvider();
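
The response handling inside `recognize` (status check, text join, average confidence) can be pulled out into a pure function, which also makes it easy to guard against a response that has no `results` array. The sketch below assumes the response shape declared by `PaddleOCRResponse` in this file; `summarizePaddleResponse` and the simplified interfaces are hypothetical and not part of the PR.

```typescript
interface PaddleLine {
  rec_text: string;
  rec_score: number;
}

interface PaddleResponse {
  status: string;
  msg?: string;
  results?: PaddleLine[][];
}

/** Turn a PaddleOCR-style response into joined text and mean confidence. */
function summarizePaddleResponse(data: PaddleResponse): { text: string; confidence: number } {
  // '000' and '200' are the success codes the provider in this PR accepts.
  if (data.status !== '000' && data.status !== '200') {
    throw new Error(`PaddleOCR error: ${data.msg || data.status}`);
  }
  // Tolerate a missing or empty results array instead of throwing a TypeError.
  const lines = data.results?.[0] ?? [];
  const text = lines.map((l) => l.rec_text).join('\n').trim();
  const confidence =
    lines.length > 0
      ? lines.reduce((sum, l) => sum + (l.rec_score || 0), 0) / lines.length
      : 0;
  return { text, confidence };
}
```

Separating parsing from the network call lets the error and empty-result paths be tested without a running OCR container.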

View File

@@ -24,21 +24,12 @@ interface RapidOCRResponse {
}>;
}
interface RapidOCRRequest {
images: string[];
options?: {
use_dilation?: boolean;
use_cls?: boolean;
use_tensorrt?: boolean;
};
}
export class RapidOCRProvider extends BaseOCRProvider {
private apiUrl: string;
constructor(config: OCRProviderConfig & { apiUrl?: string } = {}) {
super(config);
this.apiUrl = config.apiUrl || process.env.RAPIDOCR_API_URL || 'http://localhost:8080';
this.apiUrl = config.apiUrl || process.env.RAPIDOCR_API_URL || 'http://localhost:13058';
}
getName(): string {
@@ -54,9 +45,10 @@ export class RapidOCRProvider extends BaseOCRProvider {
*/
async isAvailable(): Promise<boolean> {
try {
const response = await fetch(`${this.apiUrl}/health`, {
const response = await fetch(`${this.apiUrl}/`, {
signal: AbortSignal.timeout(2000),
});
// RapidOCR responds with {"message":"Welcome to RapidOCR Server!"}
return response.ok;
} catch {
return false;
@@ -72,21 +64,18 @@ export class RapidOCRProvider extends BaseOCRProvider {
): Promise<OCRRecognitionResult> {
const startTime = Date.now();
// Get the image as Base64
const imageBase64 = await this.getImageBase64(source);
// Get the image as a Buffer
const imageBuffer = await this.getImageBuffer(source);
// Send the image using FormData
const formData = new FormData();
formData.append('file', new Blob([imageBuffer]), 'image.png');
// Call the RapidOCR API
const response = await this.withTimeout(
fetch(`${this.apiUrl}/ocr`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
images: [imageBase64],
options: {
use_dilation: true, // use dilation to improve recognition
use_cls: true, // use text orientation classification
},
} as RapidOCRRequest),
body: formData,
}),
options?.timeout || this.config.timeout || 15000
);
@@ -94,12 +83,12 @@ export class RapidOCRProvider extends BaseOCRProvider {
const data = (await response.json()) as RapidOCRResponse;
const duration = Date.now() - startTime;
// Check for errors (two error formats are supported)
if (data.code !== 200 && 'error' in data) {
throw new Error(`RapidOCR error: ${(data as any).error || data.msg} (${data.code})`);
// Check for errors
if (data.code !== 200) {
throw new Error(`RapidOCR error: ${data.msg} (${data.code})`);
}
// Extract the text and confidence (make sure data.data exists)
// Extract the text and confidence
const ocrResults = Array.isArray(data.data) ? data.data : [];
const text = ocrResults.map((r) => r.text).join('\n');
@@ -128,23 +117,23 @@ export class RapidOCRProvider extends BaseOCRProvider {
}
/**
* Get the image as Base64
* Get the image as a Buffer
*/
private async getImageBase64(source: IImageSource): Promise<string> {
if (source.base64) {
// Strip the data URL prefix
return source.base64.replace(/^data:image\/\w+;base64,/, '');
}
private async getImageBuffer(source: IImageSource): Promise<Buffer> {
if (source.buffer) {
return source.buffer.toString('base64');
return source.buffer;
}
if (source.path) {
// Use the base class's path resolution method
const fullPath = this.resolveImagePath(source.path);
const buffer = fs.readFileSync(fullPath);
return buffer.toString('base64');
return fs.readFileSync(fullPath);
}
if (source.base64) {
// Strip the data URL prefix and convert to a Buffer
const base64 = source.base64.replace(/^data:image\/\w+;base64,/, '');
return Buffer.from(base64, 'base64');
}
throw new Error('Invalid image source');
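
The prefix-stripping pattern used here, `/^data:image\/\w+;base64,/`, does not match MIME subtypes that contain `+` or `-` (for example `image/svg+xml`), because `\w` only covers letters, digits, and underscore. A sketch of a more tolerant decoder splits on the first comma instead; `decodeBase64Image` is a hypothetical helper, not code from the PR.

```typescript
import { Buffer } from 'node:buffer';

/** Strip an optional "data:<mime>;base64," prefix and decode the payload. */
function decodeBase64Image(input: string): Buffer {
  // A data URL keeps its payload after the first comma; raw Base64 has no prefix.
  const comma = input.startsWith('data:') ? input.indexOf(',') : -1;
  const payload = comma >= 0 ? input.slice(comma + 1) : input;
  return Buffer.from(payload, 'base64');
}
```

The function accepts both raw Base64 and data URLs, so callers do not need to know which form the image source carries.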

View File

@@ -26,19 +26,64 @@ services:
DEEPSEEK_API_KEY: ${DEEPSEEK_API_KEY:-}
UPLOAD_MAX_SIZE: ${UPLOAD_MAX_SIZE:-10485760}
UPLOAD_ALLOWED_TYPES: ${UPLOAD_ALLOWED_TYPES:-image/jpeg,image/png,image/webp}
# OCR Services URLs
RAPIDOCR_API_URL: ${RAPIDOCR_API_URL:-http://rapidocr:9004}
PADDLEOCR_API_URL: ${PADDLEOCR_API_URL:-http://paddleocr:8866}
volumes:
# Persist database and uploads
- backend-data:/app/data
- backend-uploads:/app/uploads
networks:
- picanalysis-network
depends_on:
- rapidocr
- paddleocr
healthcheck:
test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:13057/api/health"]
test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:13057/health"]
interval: 30s
timeout: 10s
retries: 3
start_period: 40s
# ========================
# RapidOCR Service (fast local OCR)
# ========================
rapidocr:
image: volador/rapidocr:latest
container_name: picanalysis-rapidocr
restart: unless-stopped
ports:
- "13058:9004"
networks:
- picanalysis-network
healthcheck:
test: ["CMD", "wget", "--no-verbose", "--tries=1", "--spider", "http://localhost:9004"]
interval: 30s
timeout: 10s
retries: 3
start_period: 20s
# ========================
# PaddleOCR Service (high-accuracy local OCR, using the prebuilt 987846/paddleocr image)
# ========================
paddleocr:
image: 987846/paddleocr:latest
container_name: picanalysis-paddleocr
restart: unless-stopped
ports:
- "13059:8866"
environment:
# Work around the protobuf compatibility issue
PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION: python
networks:
- picanalysis-network
healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8866/"]
interval: 30s
timeout: 10s
retries: 3
start_period: 60s
# ========================
# Frontend Service
# ========================

View File

@@ -8,8 +8,9 @@ WORKDIR /app
# Copy package files
COPY package*.json ./
# Install dependencies
RUN npm ci
# Install dependencies with Chinese mirror
RUN npm config set registry https://registry.npmmirror.com && \
npm ci
# ========================================
# Stage 2: Builder

View File

@@ -16,16 +16,43 @@ server {
add_header X-XSS-Protection "1; mode=block" always;
# API proxy to backend
location /api {
proxy_pass http://backend:13057;
location /api/ {
proxy_pass http://backend:13057/;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
# Pass ALL request headers to backend
proxy_pass_request_headers on;
# Set standard proxy headers
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# Explicitly pass authentication headers
proxy_set_header Authorization $http_authorization;
# WebSocket support
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
# Bypass cache
proxy_cache_bypass $http_upgrade;
proxy_no_cache $http_upgrade;
}
# Upload files proxy to backend
location /uploads/ {
# Keep the /uploads/ prefix: the backend serves static files under /uploads
proxy_pass http://backend:13057/uploads/;
proxy_http_version 1.1;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# Cache uploaded images
expires 7d;
add_header Cache-Control "public, immutable";
}
# Static files with caching

View File

@@ -22,7 +22,7 @@
},
"devDependencies": {
"@eslint/js": "^9.39.1",
"@playwright/test": "^1.58.2",
"@playwright/test": "^1.40.0",
"@testing-library/jest-dom": "^6.9.1",
"@testing-library/react": "^16.3.2",
"@testing-library/user-event": "^14.6.1",
@@ -1328,19 +1328,19 @@
}
},
"node_modules/@playwright/test": {
"version": "1.58.2",
"resolved": "https://registry.npmjs.org/@playwright/test/-/test-1.58.2.tgz",
"integrity": "sha512-akea+6bHYBBfA9uQqSYmlJXn61cTa+jbO87xVLCWbTqbWadRVmhxlXATaOjOgcBaWU4ePo0wB41KMFv3o35IXA==",
"version": "1.40.0",
"resolved": "https://registry.npmjs.org/@playwright/test/-/test-1.40.0.tgz",
"integrity": "sha512-PdW+kn4eV99iP5gxWNSDQCbhMaDVej+RXL5xr6t04nbKLCBwYtA046t7ofoczHOm8u6c+45hpDKQVZqtqwkeQg==",
"deprecated": "Please update to the latest version of Playwright to test up-to-date browsers.",
"dev": true,
"license": "Apache-2.0",
"dependencies": {
"playwright": "1.58.2"
"playwright": "1.40.0"
},
"bin": {
"playwright": "cli.js"
},
"engines": {
"node": ">=18"
"node": ">=16"
}
},
"node_modules/@polka/url": {
@@ -5200,35 +5200,33 @@
}
},
"node_modules/playwright": {
"version": "1.58.2",
"resolved": "https://registry.npmjs.org/playwright/-/playwright-1.58.2.tgz",
"integrity": "sha512-vA30H8Nvkq/cPBnNw4Q8TWz1EJyqgpuinBcHET0YVJVFldr8JDNiU9LaWAE1KqSkRYazuaBhTpB5ZzShOezQ6A==",
"version": "1.40.0",
"resolved": "https://registry.npmjs.org/playwright/-/playwright-1.40.0.tgz",
"integrity": "sha512-gyHAgQjiDf1m34Xpwzaqb76KgfzYrhK7iih+2IzcOCoZWr/8ZqmdBw+t0RU85ZmfJMgtgAiNtBQ/KS2325INXw==",
"dev": true,
"license": "Apache-2.0",
"dependencies": {
"playwright-core": "1.58.2"
"playwright-core": "1.40.0"
},
"bin": {
"playwright": "cli.js"
},
"engines": {
"node": ">=18"
"node": ">=16"
},
"optionalDependencies": {
"fsevents": "2.3.2"
}
},
"node_modules/playwright-core": {
"version": "1.58.2",
"resolved": "https://registry.npmjs.org/playwright-core/-/playwright-core-1.58.2.tgz",
"integrity": "sha512-yZkEtftgwS8CsfYo7nm0KE8jsvm6i/PTgVtB8DL726wNf6H2IMsDuxCpJj59KDaxCtSnrWan2AeDqM7JBaultg==",
"version": "1.40.0",
"resolved": "https://registry.npmjs.org/playwright-core/-/playwright-core-1.40.0.tgz",
"integrity": "sha512-fvKewVJpGeca8t0ipM56jkVSU6Eo0RmFvQ/MaCQNDYm+sdvKkMBBWTE1FdeMqIdumRaXXjZChWHvIzCGM/tA/Q==",
"dev": true,
"license": "Apache-2.0",
"bin": {
"playwright-core": "cli.js"
},
"engines": {
"node": ">=18"
"node": ">=16"
}
},
"node_modules/playwright/node_modules/fsevents": {
@@ -5237,7 +5235,6 @@
"integrity": "sha512-xiqMQR4xAeHTuB9uWm+fFRcIOgKBMiOBP+eXiyT7jsgVCq1bkVygt00oASowB7EdtpOHaaPgKt812P9ab+DDKA==",
"dev": true,
"hasInstallScript": true,
"license": "MIT",
"optional": true,
"os": [
"darwin"

View File

@@ -29,7 +29,7 @@
},
"devDependencies": {
"@eslint/js": "^9.39.1",
"@playwright/test": "^1.58.2",
"@playwright/test": "^1.40.0",
"@testing-library/jest-dom": "^6.9.1",
"@testing-library/react": "^16.3.2",
"@testing-library/user-event": "^14.6.1",

View File

@@ -5,6 +5,7 @@ import LoginPage from './pages/LoginPage';
import RegisterPage from './pages/RegisterPage';
import DashboardPage from './pages/DashboardPage';
import DocumentsPage from './pages/DocumentsPage';
import DocumentDetailPage from './pages/DocumentDetailPage';
import TodosPage from './pages/TodosPage';
import ImagesPage from './pages/ImagesPage';
import SettingsPage from './pages/SettingsPage';
@@ -47,6 +48,7 @@ function App() {
<Route index element={<Navigate to="/dashboard" replace />} />
<Route path="dashboard" element={<DashboardPage />} />
<Route path="documents" element={<DocumentsPage />} />
<Route path="documents/:id" element={<DocumentDetailPage />} />
<Route path="todos" element={<TodosPage />} />
<Route path="images" element={<ImagesPage />} />
<Route path="settings" element={<SettingsPage />} />

View File

@@ -101,3 +101,16 @@ export function useOCRProviders() {
staleTime: 5 * 60 * 1000, // do not refetch for 5 minutes
});
}
export function useConvertImageToTodo() {
const queryClient = useQueryClient();
return useMutation({
mutationFn: ({ id, options }: { id: string; options?: { title?: string; description?: string; priority?: string } }) =>
ImageService.convertToTodo(id, options),
onSuccess: () => {
queryClient.invalidateQueries({ queryKey: ['images'] });
queryClient.invalidateQueries({ queryKey: ['todos'] });
},
});
}
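
The hook above delegates to `ImageService.convertToTodo`, whose body is not shown in this part of the diff. The request it needs to produce can be sketched as follows, assuming a JSON body and the `/api` base path that `ImagesPage` uses as its default. `buildConvertToTodoRequest` is a hypothetical helper, and authentication headers are deliberately left out.

```typescript
interface ConvertOptions {
  title?: string;
  description?: string;
  priority?: string;
}

/** Build the URL and fetch options for POST /images/:id/convert-to-todo. */
function buildConvertToTodoRequest(id: string, options: ConvertOptions = {}, baseUrl = '/api') {
  return {
    // Encode the ID so unexpected characters cannot alter the path.
    url: `${baseUrl}/images/${encodeURIComponent(id)}/convert-to-todo`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify(options),
    },
  };
}
```

A service method could pass `url` and `init` straight to `fetch`, adding whatever authorization header the rest of the client already uses.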

View File

@@ -0,0 +1,226 @@
import { useParams, useNavigate, Link } from 'react-router-dom';
import { useDocument, useDocumentAnalysis, useAnalyzeDocument } from '@/hooks/useDocuments';
import { Button } from '@/components/Button';
import { Card } from '@/components/Card';
import { ArrowLeft, FileText, Sparkles, Tag, FolderOpen, Trash2, Edit2, CheckSquare } from 'lucide-react';
import { useState, useEffect } from 'react';
export default function DocumentDetailPage() {
const { id } = useParams<{ id: string }>();
const navigate = useNavigate();
const { data: document, isLoading: docLoading, error: docError } = useDocument(id || '');
const { data: analysis, isLoading: analysisLoading } = useDocumentAnalysis(id || '');
const analyzeMutation = useAnalyzeDocument();
const [isEditing, setIsEditing] = useState(false);
const [editedTitle, setEditedTitle] = useState('');
const [editedContent, setEditedContent] = useState('');
useEffect(() => {
if (document) {
setEditedTitle(document.title || '');
setEditedContent(document.content);
}
}, [document]);
const handleAnalyze = async () => {
if (!id) return;
try {
await analyzeMutation.mutateAsync({ id });
} catch (err: any) {
alert(err.message || 'AI analysis failed');
}
};
const handleSave = () => {
// TODO: implement saving
setIsEditing(false);
};
if (docLoading) {
return (
<div className="flex items-center justify-center min-h-screen">
<div className="text-gray-600">Loading...</div>
</div>
);
}
if (docError || !document) {
return (
<div className="flex items-center justify-center min-h-screen">
<Card variant="bordered" className="max-w-md">
<div className="text-center py-8">
<p className="text-red-600 mb-4">Unable to access this document</p>
<Link to="/documents">
<Button></Button>
</Link>
</div>
</Card>
</div>
);
}
return (
<div className="max-w-4xl mx-auto space-y-6">
{/* Header navigation */}
<div className="flex items-center gap-4">
<Link to="/documents">
<button className="p-2 hover:bg-gray-100 rounded-lg">
<ArrowLeft className="h-5 w-5 text-gray-600" />
</button>
</Link>
<div className="flex-1">
<h1 className="text-2xl font-bold text-gray-900"></h1>
</div>
</div>
{/* Document content card */}
<Card variant="bordered">
<div className="space-y-4">
{/* Title and action bar */}
<div className="flex items-start justify-between">
{isEditing ? (
<input
type="text"
value={editedTitle}
onChange={(e) => setEditedTitle(e.target.value)}
className="flex-1 mr-4 px-3 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-blue-500"
placeholder="Document title"
/>
) : (
<h2 className="flex-1 text-xl font-semibold text-gray-900">
{document.title || 'Untitled'}
</h2>
)}
<div className="flex gap-2">
{isEditing ? (
<>
<Button size="sm" onClick={handleSave}>
Save
</Button>
<Button size="sm" variant="secondary" onClick={() => setIsEditing(false)}>
Cancel
</Button>
</>
) : (
<>
<button
onClick={() => setIsEditing(true)}
className="p-2 hover:bg-gray-100 rounded-lg"
title="Edit"
>
<Edit2 className="h-4 w-4 text-gray-600" />
</button>
<button
onClick={handleAnalyze}
disabled={analyzeMutation.isPending || analysisLoading}
className="p-2 hover:bg-purple-100 rounded-lg disabled:opacity-50"
title="AI analysis"
>
<Sparkles className={`h-4 w-4 ${analyzeMutation.isPending ? 'animate-pulse' : ''} text-purple-600`} />
</button>
</>
)}
</div>
</div>
{/* Document content */}
{isEditing ? (
<textarea
value={editedContent}
onChange={(e) => setEditedContent(e.target.value)}
className="w-full min-h-[300px] px-3 py-2 border border-gray-300 rounded-lg focus:ring-2 focus:ring-blue-500 text-sm"
placeholder="Document content"
/>
) : (
<div className="p-4 bg-gray-50 rounded-lg">
<p className="whitespace-pre-wrap text-sm text-gray-700">{document.content}</p>
</div>
)}
{/* Metadata */}
<div className="flex items-center gap-4 text-xs text-gray-500">
<span>Created: {new Date(document.created_at).toLocaleString()}</span>
<span>Updated: {new Date(document.updated_at).toLocaleString()}</span>
</div>
</div>
</Card>
{/* AI analysis result */}
{(analysis || analyzeMutation.isPending || analysisLoading) && (
<Card title="AI analysis result" variant="bordered">
{analyzeMutation.isPending || analysisLoading ? (
<div className="flex items-center justify-center py-8">
<Sparkles className="h-6 w-6 text-purple-600 animate-pulse mr-2" />
<span className="text-gray-600">AI analysis in progress...</span>
</div>
) : analysis ? (
<div className="space-y-4">
{/* Tags */}
{analysis.suggested_tags && analysis.suggested_tags.length > 0 && (
<div className="flex items-start gap-2">
<Tag className="h-4 w-4 text-gray-500 mt-0.5" />
<div className="flex flex-wrap gap-2">
{analysis.suggested_tags.map((tag, idx) => (
<span
key={idx}
className="inline-flex items-center rounded-full bg-blue-100 px-3 py-1 text-sm font-medium text-blue-800"
>
{tag}
</span>
))}
</div>
</div>
)}
{/* Suggested category */}
{analysis.suggested_category && (
<div className="flex items-start gap-2">
<FolderOpen className="h-4 w-4 text-gray-500 mt-0.5" />
<span className="text-gray-700">
Suggested category:{' '}
<span className="font-medium text-gray-900">{analysis.suggested_category}</span>
</span>
</div>
)}
{/* Summary */}
{analysis.summary && (
<div className="bg-purple-50 rounded-lg p-4">
<div className="flex items-start gap-2 mb-2">
<Sparkles className="h-4 w-4 text-purple-600 mt-0.5" />
<span className="font-medium text-gray-900">Summary</span>
</div>
<p className="text-sm text-gray-700 ml-6">{analysis.summary}</p>
</div>
)}
{/* Provider info */}
<p className="text-xs text-gray-500">
{analysis.provider} ({analysis.model})
</p>
</div>
) : null}
</Card>
)}
{/* Convert to todo */}
<Card variant="bordered">
<div className="flex items-center justify-between">
<div className="flex items-center gap-2">
<CheckSquare className="h-5 w-5 text-green-600" />
<div>
<h3 className="font-medium text-gray-900">Convert to todo</h3>
<p className="text-xs text-gray-500">Create a todo item from this document</p>
</div>
</div>
<Button
onClick={() => navigate(`/todos?createFromDocument=${id}`)}
variant="secondary"
>
Convert to todo
</Button>
</div>
</Card>
</div>
);
}


@@ -5,12 +5,15 @@ import {
useUploadImageFile,
useReprocessImage,
useOCRProviders,
useConvertImageToTodo,
} from '@/hooks/useImages';
import { useCreateDocument } from '@/hooks/useDocuments';
import { Button } from '@/components/Button';
import { Card } from '@/components/Card';
import { Upload, Camera, FileText, CheckSquare, X, RefreshCw, ChevronDown, Settings } from 'lucide-react';
import { Upload, Camera, FileText, CheckSquare, X, RefreshCw, ChevronDown, Settings, Sparkles } from 'lucide-react';
import type { Image } from '@/types';
import { useDeleteImage } from '@/hooks/useImages';
import { useNavigate } from 'react-router';
const API_BASE_URL = import.meta.env.VITE_API_URL || '/api';
@@ -39,12 +42,15 @@ const PROVIDER_DESCRIPTIONS: Record<string, string> = {
};
export default function ImagesPage() {
const navigate = useNavigate();
const { data: images, refetch } = useImages();
const { data: pendingImages } = usePendingImages();
const { data: providers } = useOCRProviders();
const uploadMutation = useUploadImageFile();
const deleteMutation = useDeleteImage();
const reprocessMutation = useReprocessImage();
const convertMutation = useConvertImageToTodo();
const createDocumentMutation = useCreateDocument();
const fileInputRef = useRef<HTMLInputElement>(null);
// Provider selection state
@@ -167,6 +173,45 @@ export default function ImagesPage() {
}
};
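const handleCreateDocument = async (image: Image, e?: React.MouseEvent) => {
e?.stopPropagation();
if (!image.ocr_result) {
alert('OCR has not finished yet; cannot create a document');
return;
}
try {
const doc = await createDocumentMutation.mutateAsync({
title: `Extracted from image: ${image.file_path.split('/').pop()}`,
content: image.ocr_result,
});
// Link the image to the new document (requires a backend API call)
const linkResponse = await fetch(`${API_BASE_URL}/images/${image.id}/link`, {
method: 'PUT',
headers: {
'Content-Type': 'application/json',
Authorization: `Bearer ${localStorage.getItem('auth_token')}`,
},
body: JSON.stringify({ document_id: doc.id }),
});
// fetch() only rejects on network errors, so check the HTTP status explicitly
if (!linkResponse.ok) {
throw new Error(`Failed to link image to document (HTTP ${linkResponse.status})`);
}
refetch();
alert('Document created!');
} catch (err: any) {
alert(err.message || 'Failed to create document');
}
};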
const handleConvertToTodo = async (imageId: string, e?: React.MouseEvent) => {
e?.stopPropagation();
try {
await convertMutation.mutateAsync({ id: imageId });
alert('Converted to a todo item!');
// Optional: navigate to the todos page
// navigate('/todos');
} catch (err: any) {
alert(err.message || 'Failed to convert to todo');
}
};
const getStatusLabel = (status: string) => {
switch (status) {
case 'completed':
@@ -447,17 +492,38 @@ export default function ImagesPage() {
<div className="flex flex-wrap gap-2">
<ReprocessButton image={image} />
{image.document_id ? (
<button className="flex items-center rounded bg-blue-50 px-2 py-1 text-xs text-blue-600 hover:bg-blue-100">
<FileText className="mr-1 h-3 w-3" />
</button>
) : (
<button className="flex items-center rounded border border-gray-300 px-2 py-1 text-xs text-gray-600 hover:bg-gray-50">
<CheckSquare className="mr-1 h-3 w-3" />
</button>
)}
{image.processing_status === 'completed' && image.ocr_result ? (
<>
{image.document_id ? (
<>
<button
onClick={() => navigate(`/documents/${image.document_id}`)}
className="flex items-center rounded bg-blue-50 px-2 py-1 text-xs text-blue-600 hover:bg-blue-100"
>
<FileText className="mr-1 h-3 w-3" />
View document
</button>
<button
onClick={(e) => handleConvertToTodo(image.id, e)}
disabled={convertMutation.isPending}
className="flex items-center rounded border border-gray-300 px-2 py-1 text-xs text-gray-600 hover:bg-gray-50 disabled:opacity-50"
>
<CheckSquare className="mr-1 h-3 w-3" />
{convertMutation.isPending ? 'Converting...' : 'Convert to todo'}
</button>
</>
) : (
<button
onClick={(e) => handleCreateDocument(image, e)}
disabled={createDocumentMutation.isPending}
className="flex items-center rounded bg-green-50 px-2 py-1 text-xs text-green-600 hover:bg-green-100 disabled:opacity-50"
>
<FileText className="mr-1 h-3 w-3" />
{createDocumentMutation.isPending ? 'Creating...' : 'Create document'}
</button>
)}
</>
) : null}
</div>
</div>
</div>


@@ -6,7 +6,7 @@ import { Settings, Save, CheckCircle, XCircle, Eye, EyeOff, Server, Globe, Datab
// Read the API base URL from the environment or localStorage
const getDefaultApiUrl = () => {
return import.meta.env.VITE_API_URL || localStorage.getItem('api_base_url') || '/api';
return import.meta.env.VITE_API_URL || localStorage.getItem('api_base_url') || '';
};
type ApiConfig = {
@@ -14,13 +14,14 @@ type ApiConfig = {
};
type OCRConfig = {
provider: 'auto' | 'tesseract' | 'baidu' | 'tencent' | 'rapidocr';
provider: 'auto' | 'tesseract' | 'baidu' | 'tencent' | 'rapidocr' | 'paddleocr';
confidenceThreshold: number;
baiduApiKey: string;
baiduSecretKey: string;
tencentSecretId: string;
tencentSecretKey: string;
rapidocrUrl: string;
paddleocrUrl: string;
};
type AIConfig = {
@@ -62,7 +63,8 @@ const defaultOCRConfig: OCRConfig = {
baiduSecretKey: '',
tencentSecretId: '',
tencentSecretKey: '',
rapidocrUrl: 'http://localhost:8080',
rapidocrUrl: 'http://localhost:13058',
paddleocrUrl: 'http://localhost:13059',
};
const defaultAIConfig: AIConfig = {
@@ -504,6 +506,7 @@ export default function SettingsPage() {
<option value="auto">Auto</option>
<option value="tesseract">Tesseract.js (local)</option>
<option value="rapidocr">RapidOCR (local)</option>
<option value="paddleocr">PaddleOCR (local)</option>
<option value="baidu">Baidu OCR (cloud)</option>
<option value="tencent">Tencent OCR (cloud)</option>
</select>
@@ -640,6 +643,45 @@ export default function SettingsPage() {
)}
</div>
</Card>
<Card title="PaddleOCR" variant="bordered">
<div className="space-y-4">
<div>
<label className="block text-sm font-medium text-gray-700 mb-2">
Service URL
</label>
<Input
value={ocrConfig.paddleocrUrl}
onChange={(e) => updateOcrConfig('paddleocrUrl', e.target.value)}
placeholder="http://localhost:8866"
/>
</div>
<div className="flex items-center justify-between">
<div>
<p className="text-xs text-gray-600">
Features: free, offline
</p>
<p className="text-xs text-gray-500 mt-1">
Deploy: <code className="bg-gray-100 px-1 rounded">docker run -p 8866:8866 987846/paddleocr:latest</code>
</p>
</div>
<Button
variant="secondary"
size="sm"
onClick={() => handleTest('paddleocr')}
loading={testing === 'paddleocr'}
>
Test
</Button>
</div>
{testResults.paddleocr && (
<div className={`flex items-center gap-2 text-sm ${testResults.paddleocr.success ? 'text-green-600' : 'text-red-600'}`}>
{testResults.paddleocr.success ? <CheckCircle className="h-4 w-4" /> : <XCircle className="h-4 w-4" />}
{testResults.paddleocr.message}
</div>
)}
</div>
</Card>
</>
)}


@@ -132,6 +132,23 @@ class ImageServiceClass {
return [];
}
}
/**
* Convert an image's OCR result into a todo item
* @param id Image ID
* @param options Conversion options
*/
async convertToTodo(id: string, options?: { title?: string; description?: string; priority?: string }): Promise<any> {
try {
const response = await apiClient.post<{ success: boolean; data: any }>(`/images/${id}/convert-to-todo`, options || {});
if (response.data.success && response.data.data) {
return response.data.data;
}
throw new Error('Conversion failed');
} catch (error: any) {
// Prefer the server-supplied error, then the local error message, then a generic fallback
throw new Error(error.response?.data?.error || error.message || 'Failed to convert to todo');
}
}
}
export const ImageService = new ImageServiceClass();

42
paddleocr/Dockerfile Normal file

@@ -0,0 +1,42 @@
# PaddleOCR Service Dockerfile
# Build from the Python base image to avoid CPU instruction-set compatibility problems
FROM python:3.10-slim
WORKDIR /app
# Install system dependencies (package names updated for Debian Trixie)
RUN apt-get update && apt-get install -y \
libgomp1 \
libglib2.0-0 \
libsm6 \
libxext6 \
libxrender-dev \
libgl1 \
git \
wget \
&& rm -rf /var/lib/apt/lists/*
# Copy requirements
COPY requirements.txt .
# Install Python dependencies
# PaddlePaddle installed via pip adapts to the CPU instruction set automatically
RUN pip install --no-cache-dir paddlepaddle==2.6.0 \
&& pip install --no-cache-dir -r requirements.txt
# Clone the PaddleOCR repository
RUN git clone https://github.com/PaddlePaddle/PaddleOCR.git /PaddleOCR
# Set up the environment
ENV PYTHONPATH=/PaddleOCR:$PYTHONPATH
ENV HOME=/root
# Copy the API service code
COPY paddleocr_api.py /app/paddleocr_api.py
# Expose the service port
EXPOSE 8866
# Start the API service
CMD ["python", "/app/paddleocr_api.py"]
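
The container above serves a health check on `GET /` (see `paddleocr_api.py`). As a rough sketch, a readiness probe for it could look like the following. The helper names `fetch_json` and `wait_until_healthy` are hypothetical, and the host port `13059` is an assumption taken from the frontend's default `paddleocrUrl`; the image itself only exposes `8866`.

```python
import json
import time
import urllib.request


def fetch_json(url: str, timeout: float = 5.0) -> dict:
    """GET a URL and decode the JSON body (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_until_healthy(base_url, fetch=fetch_json, attempts=10, delay=1.0, sleep=time.sleep):
    """Poll GET / until the service answers with its status JSON.

    Returns True once a JSON object containing "message" comes back,
    or False after `attempts` failed tries.
    """
    for attempt in range(attempts):
        try:
            body = fetch(f"{base_url}/")
            if isinstance(body, dict) and "message" in body:
                return True
        except (OSError, ValueError):
            # OSError covers refused connections and urllib errors;
            # ValueError covers a body that is not valid JSON.
            pass
        if attempt < attempts - 1:
            sleep(delay)
    return False


if __name__ == "__main__":
    # Assumed host port mapping; adjust to the actual docker-compose configuration.
    print(wait_until_healthy("http://localhost:13059"))
```

The `fetch` and `sleep` parameters are injectable so the retry logic can be exercised without a running container.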

166
paddleocr/paddleocr_api.py Normal file

@@ -0,0 +1,166 @@
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
PaddleOCR HTTP API Service
OCR HTTP service based on the official PaddlePaddle image
"""
from flask import Flask, request, jsonify
from paddleocr import PaddleOCR
import base64
import io
from PIL import Image
import logging

# Configure logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# Initialize PaddleOCR
ocr = PaddleOCR(
    use_angle_cls=True,
    lang='ch',
    use_gpu=False,
    show_log=False
)

app = Flask(__name__)


@app.route('/', methods=['GET'])
def index():
    """Health check"""
    return jsonify({
        "message": "PaddleOCR Server is running!",
        "version": "2.7.0",
        "endpoints": {
            "/": "GET - health check",
            "/ocr/scan": "POST - OCR recognition"
        }
    })


@app.route('/ocr/scan', methods=['POST'])
def ocr_scan():
    """OCR recognition endpoint"""
    try:
        # Read the request body
        data = request.get_json()
        if not data or 'image' not in data:
            return jsonify({
                "success": False,
                "error": "Missing image data"
            }), 400

        # Decode the image
        image_data = data['image']
        if isinstance(image_data, str):
            # Base64-encoded, optionally wrapped in a data URL
            if image_data.startswith('data:image'):
                image_data = image_data.split(',')[1]
            image_bytes = base64.b64decode(image_data)
        else:
            return jsonify({
                "success": False,
                "error": "Invalid image format"
            }), 400

        # Validate with PIL that the payload is a decodable image
        Image.open(io.BytesIO(image_bytes)).verify()

        # Run OCR on the raw bytes: PaddleOCR.ocr() accepts a path, bytes,
        # or an ndarray, not a PIL Image
        result = ocr.ocr(image_bytes, cls=True)

        # Parse the result
        if result and result[0]:
            texts = []
            for line in result[0]:
                box = line[0]
                text_info = line[1]
                texts.append({
                    "text": text_info[0],
                    "confidence": float(text_info[1]),
                    "box": box
                })
            all_text = "\n".join([t["text"] for t in texts])
            return jsonify({
                "success": True,
                "data": {
                    "texts": texts,
                    "fullText": all_text
                }
            })
        else:
            return jsonify({
                "success": False,
                "error": "No text detected"
            }), 200
    except Exception as e:
        logger.error(f"OCR Error: {str(e)}")
        return jsonify({
            "success": False,
            "error": str(e)
        }), 500


@app.route('/ocr/text', methods=['POST'])
def ocr_text():
    """Simplified OCR endpoint that returns text only"""
    try:
        data = request.get_json()
        if not data or 'image' not in data:
            return jsonify({
                "success": False,
                "error": "Missing image data"
            }), 400

        # Decode the image
        image_data = data['image']
        if isinstance(image_data, str):
            if image_data.startswith('data:image'):
                image_data = image_data.split(',')[1]
            image_bytes = base64.b64decode(image_data)
        else:
            return jsonify({
                "success": False,
                "error": "Invalid image format"
            }), 400

        # Validate with PIL that the payload is a decodable image
        Image.open(io.BytesIO(image_bytes)).verify()

        # Run OCR on the raw bytes (PaddleOCR.ocr() does not accept a PIL Image)
        result = ocr.ocr(image_bytes, cls=True)

        # Extract the text
        if result and result[0]:
            texts = [line[1][0] for line in result[0]]
            all_text = "\n".join(texts)
            return jsonify({
                "success": True,
                "data": {
                    "text": all_text,
                    "lines": texts
                }
            })
        else:
            return jsonify({
                "success": True,
                "data": {
                    "text": "",
                    "lines": []
                }
            })
    except Exception as e:
        logger.error(f"OCR Error: {str(e)}")
        return jsonify({
            "success": False,
            "error": str(e)
        }), 500


if __name__ == '__main__':
    logger.info("Starting PaddleOCR API server on port 8866...")
    app.run(host='0.0.0.0', port=8866, debug=False)
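
For reference, a minimal client sketch for the `/ocr/scan` endpoint defined above, using only the standard library. The function names (`build_scan_payload`, `extract_text`, `scan`) and the `min_confidence` filter are illustrative assumptions, not part of this change; the request and response shapes follow the handler code in this file.

```python
import base64
import json
import urllib.request


def build_scan_payload(image_bytes: bytes, mime: str = "image/png") -> dict:
    """Wrap raw image bytes as the JSON body that /ocr/scan reads."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    # The handler accepts plain Base64 or a data URL; a data URL is used here.
    return {"image": f"data:{mime};base64,{encoded}"}


def extract_text(response: dict, min_confidence: float = 0.0) -> str:
    """Join recognized lines from a /ocr/scan response body.

    Lines below `min_confidence` are dropped; a response with
    success=False (for example "No text detected") yields "".
    """
    if not response.get("success"):
        return ""
    texts = response.get("data", {}).get("texts", [])
    return "\n".join(t["text"] for t in texts if t["confidence"] >= min_confidence)


def scan(base_url: str, image_bytes: bytes, timeout: float = 30.0) -> str:
    """POST an image to the service and return the recognized text."""
    req = urllib.request.Request(
        f"{base_url}/ocr/scan",
        data=json.dumps(build_scan_payload(image_bytes)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # Note: urlopen raises urllib.error.HTTPError for the handler's 400/500 responses.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return extract_text(json.loads(resp.read().decode("utf-8")))
```

Keeping payload construction and response parsing separate from the network call makes both easy to check without a running OCR container.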


@@ -0,0 +1,5 @@
paddleocr==2.7.0
protobuf>=3.20.2
flask==2.3.0
pillow==10.0.0
numpy<2.0.0

BIN
test_ocr.png Normal file

Binary file not shown (230 KiB).

2
当前问题 Normal file

@@ -0,0 +1,2 @@
Cannot convert to a todo after recognition
PaddleOCR and RapidOCR