# CutThenThink OCR Plugin

A local OCR plugin built on the Tesseract OCR engine.

## Features

- Multi-language recognition (Chinese, English, Japanese, Korean, and others)
- Returns the recognized text together with bounding-box information
- Provides confidence scores
- Cross-platform support (Windows, macOS, Linux)

## System Dependencies

### Linux (Ubuntu/Debian)
```bash
sudo apt-get install tesseract-ocr
sudo apt-get install tesseract-ocr-chi-sim  # Simplified Chinese
sudo apt-get install tesseract-ocr-chi-tra  # Traditional Chinese
sudo apt-get install libtesseract-dev
```

### macOS
```bash
brew install tesseract
brew install tesseract-lang  # includes the Chinese language packs
```

### Windows
1. Download and install Tesseract from [UB Mannheim](https://github.com/UB-Mannheim/tesseract/wiki)
2. Add the Tesseract installation directory to the `PATH` environment variable
3. Install the Chinese language packs
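
Once the packages are installed, it can help to confirm that the `tesseract` binary is on `PATH` and that the expected language packs are present. The Go sketch below is not part of the plugin. It relies on the standard `tesseract --list-langs` command, whose output is a header line followed by one language code per line; the helper names `parseLangs` and `missingLangs` are hypothetical.

```go
package main

import (
	"fmt"
	"os/exec"
	"strings"
)

// parseLangs extracts language codes from `tesseract --list-langs` output.
// The header line ("List of available languages ...") and blank lines are skipped.
func parseLangs(output string) []string {
	var langs []string
	for _, line := range strings.Split(output, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "List of available languages") {
			continue
		}
		langs = append(langs, line)
	}
	return langs
}

// missingLangs returns the wanted codes that are not present in installed.
func missingLangs(installed, wanted []string) []string {
	have := make(map[string]bool, len(installed))
	for _, l := range installed {
		have[l] = true
	}
	var missing []string
	for _, w := range wanted {
		if !have[w] {
			missing = append(missing, w)
		}
	}
	return missing
}

func main() {
	path, err := exec.LookPath("tesseract")
	if err != nil {
		fmt.Println("tesseract not found on PATH:", err)
		return
	}
	// Some Tesseract versions print the list to stderr, so capture both streams.
	out, err := exec.Command(path, "--list-langs").CombinedOutput()
	if err != nil {
		fmt.Println("could not list languages:", err)
		return
	}
	missing := missingLangs(parseLangs(string(out)), []string{"eng", "chi_sim"})
	if len(missing) > 0 {
		fmt.Println("missing language packs:", strings.Join(missing, ", "))
		return
	}
	fmt.Println("tesseract is installed with eng and chi_sim")
}
```

The check uses `eng` and `chi_sim` because they make up the plugin's documented default language string; adjust the list to the languages you actually need.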

## Build

```bash
# Download dependencies
go mod download

# Build for the current platform
go build -o ocr-plugin main.go

# Cross-compile
GOOS=windows GOARCH=amd64 go build -o ocr-plugin.exe main.go
GOOS=darwin GOARCH=amd64 go build -o ocr-plugin-mac main.go
GOOS=linux GOARCH=amd64 go build -o ocr-plugin-linux main.go
```

## Usage

### Show the version
```bash
./ocr-plugin version
```

### Recognize text
```bash
./ocr-plugin recognize -image /path/to/image.png -lang eng+chi_sim
```
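
To call the plugin from another Go program, one option is to run the binary with `os/exec` and capture its standard output. This is a minimal sketch under two assumptions: the plugin prints its JSON result to stdout, and the binary sits at `./ocr-plugin`. The helpers `buildArgs` and `runOCR` are hypothetical names, not part of the plugin.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// buildArgs assembles the CLI arguments for the recognize subcommand.
// An empty lang omits the -lang flag so the plugin's own default applies.
func buildArgs(imagePath, lang string) []string {
	args := []string{"recognize", "-image", imagePath}
	if lang != "" {
		args = append(args, "-lang", lang)
	}
	return args
}

// runOCR runs the plugin and returns whatever it wrote to stdout.
// Assumption: the JSON response is printed to stdout.
func runOCR(binary, imagePath, lang string) ([]byte, error) {
	cmd := exec.Command(binary, buildArgs(imagePath, lang)...)
	cmd.Stderr = os.Stderr // pass diagnostics through to the caller's stderr
	return cmd.Output()
}

func main() {
	out, err := runOCR("./ocr-plugin", "/path/to/image.png", "eng+chi_sim")
	if err != nil {
		fmt.Fprintln(os.Stderr, "ocr-plugin failed:", err)
		return
	}
	fmt.Println(string(out))
}
```

Keeping argument construction in its own function makes it possible to unit-test the command line without the binary being present.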

### Parameters
- `-image`: path to the image file (required)
- `-lang`: OCR language codes, joined with `+` (default: `eng+chi_sim`)

#### Supported language codes
- `eng` - English
- `chi_sim` - Simplified Chinese
- `chi_tra` - Traditional Chinese
- `jpn` - Japanese
- `kor` - Korean
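
Because `-lang` takes several codes joined with `+`, a caller may want to check the value before invoking the plugin. The sketch below validates a language string against the codes listed above. `validateLang` is a hypothetical helper, and the plugin itself may well accept other codes for which a Tesseract language pack is installed.

```go
package main

import (
	"fmt"
	"strings"
)

// supported mirrors the language codes listed in this README.
var supported = map[string]bool{
	"eng": true, "chi_sim": true, "chi_tra": true, "jpn": true, "kor": true,
}

// validateLang checks a "+"-joined language string such as "eng+chi_sim".
// It rejects empty strings, empty segments, and codes not in supported.
func validateLang(lang string) error {
	if lang == "" {
		return fmt.Errorf("language string is empty")
	}
	for _, code := range strings.Split(lang, "+") {
		if !supported[code] {
			return fmt.Errorf("unsupported language code %q", code)
		}
	}
	return nil
}

func main() {
	for _, lang := range []string{"eng+chi_sim", "eng+fra", "eng+"} {
		if err := validateLang(lang); err != nil {
			fmt.Printf("%s: %v\n", lang, err)
			continue
		}
		fmt.Printf("%s: ok\n", lang)
	}
}
```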

## Output Format

### Success response
```json
{
  "success": true,
  "engine": "tesseract",
  "language": "eng+chi_sim",
  "blocks": [
    {
      "text": "recognized text",
      "confidence": 95.5,
      "bbox_x": 100,
      "bbox_y": 200,
      "bbox_width": 150,
      "bbox_height": 30,
      "block_type": "text"
    }
  ]
}
```

### Error response
```json
{
  "success": false,
  "error": "error message"
}
```
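
Since both response shapes carry a `success` flag, a caller can first decode just that flag and the `error` field, then turn a failure into a Go error. A minimal sketch, in which `checkResponse` is a hypothetical helper and the error string in `main` is purely illustrative:

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// status holds only the fields needed to tell success from failure.
type status struct {
	Success bool   `json:"success"`
	Error   string `json:"error"`
}

// checkResponse returns nil for a success response and an error otherwise.
func checkResponse(raw []byte) error {
	var s status
	if err := json.Unmarshal(raw, &s); err != nil {
		return fmt.Errorf("invalid JSON from plugin: %w", err)
	}
	if !s.Success {
		if s.Error == "" {
			return errors.New("plugin reported failure without a message")
		}
		return errors.New(s.Error)
	}
	return nil
}

func main() {
	// Illustrative payloads, not captured plugin output.
	fmt.Println(checkResponse([]byte(`{"success": false, "error": "example failure"}`)))
	fmt.Println(checkResponse([]byte(`{"success": true, "blocks": []}`)))
}
```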
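
## Performance Optimization

1. **Image preprocessing**: Denoising and binarizing the image before recognition can improve accuracy
2. **Language selection**: Loading only the language packs you need can improve speed
3. **Image size**: Oversized images slow recognition down; scale them to a reasonable size first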

## License

MIT License