feat: real-time task progress display, connection testing, dark theme rework, and multiple bug fixes

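Backend
- Add app/task_progress.py, a thread-safe progress registry
- Run tasks asynchronously in a background thread (_run_task_background); manual triggers return task_key immediately
- Report progress inside the loops of 6 task functions (summarizer/tagger/scorer/deduplicator/brief/taxonomy)
- Scheduled jobs in scheduler report progress the same way (trigger=scheduled)
- Add GET /api/tasks/progress and POST /api/tasks/progress/reset endpoints
- Add POST /api/test-connection connectivity test (dedicated short-timeout clients)
- Fix ai_client/rss_client freezing their config at import time, which made real tasks call the
  LLM with the placeholder key from .env and get 401 (config is now read from settings at runtime via properties)
- Fix JSON parsing failures in ai_client when reasoning models (MiniMax-M3 etc.) emit <think> blocks
- Fix taxonomy bootstrap: LLM timeouts (now uses a dedicated 300s client), MiniMax output moderation
  (samples trimmed to titles only, plus a constraint to generate neutral category names), and failures misreported as success (now raises so the failure is recorded)
- Fix the startup crash in models.py caused by the dual foreign-key relationship mapping (explicit foreign_keys)
- Fix the SPA route 404 in main.py and the ArticleOut.published_at serialization 500
- Remove the synchronous bootstrap in lifespan that blocked startup; scheduler now runs it asynchronously in the background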

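Frontend
- Rework the UI into the high-contrast Deep Ink dark theme, fixing contrast problems in Element Plus dark mode
- Tasks page: real-time task progress display (progress bar/stage/counts/status/trigger source) with 1.5s polling
- Connection test panel (rssKeeper / LLM connectivity + latency)
- Fix the nextJobs jobId mapping bug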

Deployment and docs
- Optimize the Dockerfile (BuildKit cache mounts, prebuilt wheels, gcc removed, Aliyun mirror)
- Add API.md endpoint documentation

Co-Authored-By: Claude <noreply@anthropic.com>
congsh
2026-06-14 15:14:40 +08:00
parent bae47a2411
commit 778ccefb22
24 changed files with 1853 additions and 312 deletions
+7
@@ -51,3 +51,10 @@ data/
# System files
.DS_Store
# Screenshots and Playwright debug artifacts
*.png
.playwright-mcp/
# Claude plan files
.claude/
+434
@@ -0,0 +1,434 @@
# dataClean API Reference
> Service address: `http://<host>:7331`
> When `API_TOKEN` is configured, every `/api/*` endpoint requires an `Authorization: Bearer <token>` request header; `/health` is exempt.
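## Contents
- [Authentication (鉴权)](#鉴权)
- [Health check (健康检查)](#健康检查)
- [Article endpoints (文章接口)](#文章接口)
- [Brief endpoints (简报接口)](#简报接口)
- [Taxonomy endpoints (分类体系接口)](#分类体系接口)
- [Task endpoints (任务接口)](#任务接口)
- [Task progress (任务进度)](#任务进度)
- [Connection test (接口连通性测试)](#接口连通性测试)
- [Settings endpoints (配置管理接口)](#配置管理接口)
- [Dashboard statistics (仪表盘统计)](#仪表盘统计)
- [Error codes (错误码)](#错误码)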
---
## 鉴权
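If the server has `API_TOKEN` configured, every endpoint except `/health` requires this request header:
```
Authorization: Bearer <API_TOKEN>
```
A missing header returns `401`; an invalid token returns `403`. When `API_TOKEN` is not configured, authentication is disabled (recommended only on an internal network).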
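A minimal client-side sketch of the header rule above, using only the standard library. `BASE_URL`, `build_request`, and the token value are illustrative assumptions, not part of the service:

```python
import urllib.request
from typing import Optional

BASE_URL = "http://localhost:7331"  # assumption: replace with your <host>


def build_request(path: str, token: Optional[str] = None, method: str = "GET") -> urllib.request.Request:
    """Build a request for `path`; the Bearer header is added only when a token is given."""
    req = urllib.request.Request(f"{BASE_URL}{path}", method=method)
    if token:
        req.add_header("Authorization", f"Bearer {token}")
    return req


req = build_request("/api/stats", token="my-secret-token")  # hypothetical token
print(req.full_url, req.get_header("Authorization"))
```

Passing the request to `urllib.request.urlopen` would perform the call; `/health` can be requested without a token.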
---
## 健康检查
### `GET /health`
Liveness probe; no authentication required.
**Response**
```json
{ "status": "ok", "service": "dataClean" }
```
---
## 文章接口
### `GET /api/articles`
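Paginated query over processed articles, with filtering by date, category, and tag.
**Query parameters**
| Parameter | Type | Required | Default | Description |
|------|------|------|------|------|
| `date` | string | No | - | Date in `YYYY-MM-DD` format; matches articles whose `fetched_at` falls on that day |
| `category` | string | No | - | Exact category name |
| `tag` | string | No | - | Tag name (exact match against the JSON array) |
| `representative_only` | bool | No | `false` | Return only the representative article of each duplicate group |
| `limit` | int | No | `50` | Page size, 1 to 200 |
| `offset` | int | No | `0` | Pagination offset |
**Response** `ArticleListOut`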
```json
{
"total": 200,
"items": [
{
"id": 1,
"rk_article_id": 28124,
"title": "文章标题",
"link": "https://...",
"feed_title": "来源",
"category": "科技",
"tags": ["AI", "芯片"],
"heat_score": 45.2,
"importance_score": 60.0,
"duplication_score": 25.0,
"composite_score": 52.56,
"ai_summary": "AI 生成的摘要",
"is_representative": false,
"published_at": "2026-06-13T13:48:42"
}
]
}
```
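Paging through this endpoint amounts to advancing `offset` until `total` is reached. The sketch below assumes the response shape shown above; `articles_query`, `iter_articles`, and the in-memory `fake_fetch` stand-in are hypothetical helpers written for illustration, so the example runs without a server:

```python
from typing import Callable, Iterator, Optional
from urllib.parse import parse_qs, urlencode, urlparse


def articles_query(date: Optional[str] = None, category: Optional[str] = None,
                   tag: Optional[str] = None, representative_only: bool = False,
                   limit: int = 50, offset: int = 0) -> str:
    """Build the path and query string for GET /api/articles."""
    params = {"limit": limit, "offset": offset}
    if date:
        params["date"] = date
    if category:
        params["category"] = category
    if tag:
        params["tag"] = tag
    if representative_only:
        params["representative_only"] = "true"
    return "/api/articles?" + urlencode(params)


def iter_articles(fetch_page: Callable[[str], dict], page_size: int = 50, **filters) -> Iterator[dict]:
    """Yield every article by advancing `offset` until `total` is reached."""
    offset = 0
    while True:
        page = fetch_page(articles_query(limit=page_size, offset=offset, **filters))
        items = page["items"]
        yield from items
        offset += len(items)
        if not items or offset >= page["total"]:
            break


# In-memory stand-in for the HTTP call (assumption: real code would issue the request).
_FAKE_DB = [{"id": i} for i in range(1, 6)]


def fake_fetch(path: str) -> dict:
    query = parse_qs(urlparse(path).query)
    offset, limit = int(query["offset"][0]), int(query["limit"][0])
    return {"total": len(_FAKE_DB), "items": _FAKE_DB[offset:offset + limit]}


print([a["id"] for a in iter_articles(fake_fetch, page_size=2)])
```

In real use, `fetch_page` would send the path to the service and decode the JSON body.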
### `GET /api/articles/{article_id}`
Returns the details of a single article.
**Path parameters** `article_id` (int)
**Response** `ArticleOut` (same shape as an element of `items` above)
**Error** `404` article not found
---
## 简报接口
### `GET /api/briefs`
Lists daily briefs (newest date first).
**Query parameters** `limit` (int, default 30, range 1 to 100)
**Response** `List[BriefOut]`
```json
[
{
"id": 1,
"brief_date": "2026-06-13",
"total_articles": 200,
"unique_articles": 150,
"by_category": { "科技": [{...}], "财经": [{...}] },
"markdown_path": "/app/data/briefs/2026-06-13/daily-brief.md"
}
]
```
### `GET /api/briefs/{date}`
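Returns the brief for the given date.
**Path parameters** `date` (string, `YYYY-MM-DD`)
**Response** `BriefOut` **Error** `404` brief not found
### `POST /api/briefs/{date}/regenerate`
Forces regeneration of the brief for the given date. Runs synchronously (requires holding the task lock).
**Response**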
```json
{ "message": "简报已重新生成", "data": { ... } }
```
**Error** `409` another task is already running
---
## 分类体系接口
### `GET /api/taxonomy`
Lists categories, tags, and scoring rules.
**Query parameters** `kind` (string, optional) filters by type: `category` / `tag` / `heat_rule` / `importance_rule` / `duplication_rule`
**Response** `List[TaxonomyOut]`
```json
[
{
"id": 1,
"name": "科技",
"kind": "category",
"description": "人工智能、芯片、互联网等",
"keywords": ["AI", "芯片", "大模型"],
"weight": 1.0,
"created_by_ai": true
}
]
```
### `POST /api/taxonomy/bootstrap`
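Initializes the taxonomy, or forces a rebuild (runs asynchronously in the background).
**Query parameters** `force` (bool, default `false`); when `true`, the existing taxonomy is cleared before rebuilding.
**Response** (returned immediately)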
```json
{ "message": "taxonomy 初始化已开始", "task_key": "bootstrap_taxonomy" }
```
**Error** `409` another task is already running. Progress of `bootstrap_taxonomy` is available through [`GET /api/tasks/progress`](#任务进度).
---
## 任务接口
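All task endpoints run **asynchronously in the background**: a request returns `task_key` immediately, the task runs in a thread pool, and clients poll the [progress endpoint](#任务进度).
Tasks are globally mutually exclusive (they share `_task_lock`): only one task runs at any moment.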
### `POST /api/tasks/summarize`
Fetches articles from the last 24 hours from rssKeeper and generates AI summaries for those with no summary or only a short one.
**Response**
```json
{ "message": "摘要任务已开始", "task_key": "summarize" }
```
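**Error** `409` another task is already running
### `POST /api/tasks/tag-score-dedup`
Runs three stages on the current day's articles: categorize and tag → deduplicate → score (progress is reported as one combined task).
**Response**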
```json
{ "message": "分类/去重/打分任务已开始", "task_key": "tag_score_dedup" }
```
### `POST /api/tasks/brief`
Generates the current day's daily brief (with `force`, so an existing brief is regenerated).
**Response**
```json
{ "message": "简报生成任务已开始", "task_key": "generate_daily_brief" }
```
---
## 任务进度
### `GET /api/tasks/progress`
Returns a real-time progress snapshot of every task (the frontend polls roughly every 1.5 seconds).
**Response**
```json
{
"summarize": {
"status": "running",
"stage": "生成摘要",
"current": 75,
"total": 200,
"message": null,
"started_at": "2026-06-13T14:30:00+00:00",
"updated_at": "2026-06-13T14:32:15+00:00",
"finished_at": null,
"trigger": "manual"
},
"tag_score_dedup": { "status": "idle", "stage": "", "current": 0, "total": 0, "message": null, "started_at": null, "updated_at": null, "finished_at": null, "trigger": null },
"generate_daily_brief": { "..." : "同上结构" },
"bootstrap_taxonomy": { "..." : "同上结构" }
}
```
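**Fields**
| Field | Description |
|------|------|
| `status` | `idle` / `running` / `success` / `error` |
| `stage` | Label of the current stage (e.g. `生成摘要`, `LLM 生成分类体系`) |
| `current` / `total` | Progress counters; `total=0` marks a stage-type task (rendered as an indeterminate progress bar) |
| `message` | Extra information or error details (the error message when `status=error`) |
| `started_at` / `finished_at` | ISO 8601 timestamps |
| `trigger` | `manual` (triggered by hand) / `scheduled` (triggered by the scheduler) |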
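A client can turn these fields into a percentage and a simple wait loop. The sketch below assumes the snapshot shape documented above; `percent`, `wait_for_task`, and the injected `fetch_progress` callable are hypothetical names, and the canned snapshots replace real HTTP calls so the example runs offline:

```python
import time
from typing import Callable, Optional

TERMINAL_STATES = {"success", "error"}


def percent(entry: dict) -> Optional[float]:
    """Completion in percent, or None for stage-type tasks (total == 0)."""
    total = entry.get("total") or 0
    if total <= 0:
        return None
    return round(100.0 * entry.get("current", 0) / total, 1)


def wait_for_task(fetch_progress: Callable[[], dict], task_key: str,
                  interval: float = 1.5, timeout: float = 600.0,
                  sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until `task_key` reaches success or error, then return its entry."""
    deadline = time.monotonic() + timeout
    while True:
        entry = fetch_progress()[task_key]
        if entry["status"] in TERMINAL_STATES:
            return entry
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{task_key} still {entry['status']} after {timeout}s")
        sleep(interval)


# Canned snapshots standing in for GET /api/tasks/progress responses.
snapshots = iter([
    {"summarize": {"status": "running", "current": 75, "total": 200}},
    {"summarize": {"status": "success", "current": 200, "total": 200}},
])
final = wait_for_task(lambda: next(snapshots), "summarize", sleep=lambda _s: None)
print(final["status"], percent(final))
```

Note that a task polled before its background thread has reported anything still reads `idle`, so this loop keeps waiting in that state until the timeout.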
### `POST /api/tasks/progress/reset`
Resets the given task's progress to idle (clears the terminal-state display).
**Query parameters** `task_key` (string, required)
**Response** `{ "message": "已重置" }`
---
## 接口连通性测试
### `POST /api/test-connection`
Tests connectivity to rssKeeper and the LLM API, returning status and latency. Each test uses its own short-timeout client (10 seconds, 0 retries).
**Response** `ConnectionTestResponse`
```json
{
"rss_keeper": {
"name": "rssKeeper",
"status": "ok",
"latency_ms": 26.0,
"error": null
},
"llm": {
"name": "LLM",
"status": "ok",
"latency_ms": 6871.4,
"error": null
}
}
```
When `status` is `error`, the `error` field contains the failure reason and `latency_ms` is `null`.
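The "0 retries" setting deserves a second look if a client resolves its overrides with `or`, the form the `AIClient` properties in this commit use (`self._max_retries or settings.OPENAI_MAX_RETRIES`). In Python, `0 or default` evaluates to `default`, so an explicit zero would be silently ignored. Whether the connection test builds its client this way is not shown here; the sketch below only illustrates the difference, with hypothetical function names and a hypothetical default of 3:

```python
from typing import Optional

DEFAULT_MAX_RETRIES = 3  # hypothetical default, for illustration only


def resolve_with_or(override: Optional[int], default: int) -> int:
    """Falls back whenever the override is falsy, which includes an explicit 0."""
    return override or default


def resolve_explicit(override: Optional[int], default: int) -> int:
    """Falls back only when no override was given at all."""
    return default if override is None else override


print(resolve_with_or(0, DEFAULT_MAX_RETRIES))    # 3: the explicit 0 is lost
print(resolve_explicit(0, DEFAULT_MAX_RETRIES))   # 0: the explicit 0 is kept
```

The `is None` form is the one `RSSKeeperClient.timeout` already uses in this commit.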
---
## 配置管理接口
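> Setting changes are saved to the database; some settings (such as scheduling intervals) take effect only after a service restart.
### `GET /api/settings`
Lists all editable settings. Sensitive items (`OPENAI_API_KEY` / `API_TOKEN`) are returned masked.
**Response** `List[SettingOut]`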
```json
[
{
"key": "OPENAI_API_KEY",
"value": "sk-c...2R_8",
"description": "LLM API Key",
"is_sensitive": true,
"is_masked": true,
"updated_at": "2026-06-13T..."
}
]
```
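The masked value in the example (`sk-c...2R_8`) looks like the first and last four characters joined by an ellipsis. The server's actual masking rule is not shown in this document, so the following is only a sketch of one rule that would produce that shape; `mask_secret` and its `keep` parameter are hypothetical:

```python
def mask_secret(value: str, keep: int = 4) -> str:
    """Keep `keep` characters at each end; fully mask values too short to hide anything."""
    if len(value) <= keep * 2:
        return "*" * len(value)
    return f"{value[:keep]}...{value[-keep:]}"


print(mask_secret("sk-c0123456789abcd2R_8"))  # made-up key, not a real credential
```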
### `PUT /api/settings/{key}`
Updates a single setting.
**Request body**
```json
{ "value": "新值" }
```
**Response** `{ "message": "配置已保存,重启服务后生效" }` **Error** `400` invalid setting key
### `PUT /api/settings`
Updates settings in bulk.
**Request body**
```json
{ "settings": { "OPENAI_MODEL": "gpt-4o-mini", "OPENAI_TIMEOUT": "60" } }
```
**Response** `{ "message": "配置已保存,重启服务后生效" }` **Error** `400`, listing the invalid setting keys
### `POST /api/settings/reset`
Resets all settings to their environment-variable defaults.
**Response** `{ "message": "配置已重置为环境变量默认值,重启服务后生效" }`
### Editable settings
| key | Description | Sensitive |
|-----|------|------|
| `RSSKEEPER_BASE_URL` | rssKeeper service address | No |
| `OPENAI_API_KEY` | LLM API key | Yes |
| `OPENAI_BASE_URL` | LLM API base URL | No |
| `OPENAI_MODEL` | LLM model name | No |
| `OPENAI_TIMEOUT` | LLM call timeout (seconds) | No |
| `OPENAI_MAX_RETRIES` | Maximum LLM retries | No |
| `SUMMARIZE_INTERVAL_MINUTES` | Summarize job interval (minutes) | No |
| `TAG_SCORE_INTERVAL_MINUTES` | Tag/score/dedup job interval (minutes) | No |
| `DAILY_BRIEF_HOUR` | Hour at which the daily brief is generated | No |
| `DAILY_BRIEF_MINUTE` | Minute at which the daily brief is generated | No |
| `TITLE_SIMILARITY_THRESHOLD` | Title similarity threshold | No |
| `CONTENT_SIMILARITY_THRESHOLD` | Content similarity threshold | No |
| `MAX_AI_SUMMARY_LENGTH` | Maximum AI summary length | No |
| `MIN_ORIGINAL_SUMMARY_LENGTH` | Minimum original summary length | No |
| `BRIEF_TOP_N_PER_CATEGORY` | Articles shown per category in the brief | No |
| `LOG_LEVEL` | Log level | No |
| `API_TOKEN` | API authentication token (empty disables authentication) | Yes |
| `CORS_ALLOWED_ORIGINS` | Allowed CORS origins (comma-separated) | No |
---
## 仪表盘统计
### `GET /api/stats`
Returns dashboard statistics and the next run time of each scheduled job.
**Response** `StatsOut`
```json
{
"total_articles": 200,
"today_articles": 50,
"ai_summarized": 180,
"categories": 12,
"tags": 43,
"duplicate_groups": 5,
"briefs": 1,
"next_jobs": {
"fetch_and_summarize": "2026-06-13T17:39:13+08:00",
"tag_score_deduplicate": "2026-06-14T16:39:13+08:00",
"generate_daily_brief": "2026-06-14T08:00:00+08:00"
}
}
```
---
## 错误码
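| Status | Meaning | Trigger |
|--------|------|----------|
| `200` | Success | Normal request |
| `400` | Bad request | Invalid setting key / malformed request body |
| `401` | Unauthenticated | `Authorization` header missing (when authentication is enabled) |
| `403` | Authentication failed | Invalid token |
| `404` | Not found | Article or brief does not exist |
| `409` | Conflict | Another task is already running (tasks are globally mutually exclusive) |
| `422` | Validation failed | Request validation failed (e.g. a parameter of the wrong type or out of range) |
| `500` | Server error | Internal exception, including response model serialization failures |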
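## Scheduled jobs
After startup, APScheduler registers the following jobs automatically (timezone `Asia/Shanghai`):
| Job ID | Trigger | Default |
|--------|----------|------|
| `bootstrap_taxonomy` | Once at startup (DateTrigger) | Generates the taxonomy when it is empty |
| `fetch_and_summarize` | Interval | Every 60 minutes |
| `tag_score_deduplicate` | Interval | Every 1440 minutes (24 hours) |
| `generate_daily_brief` | Cron | Daily at 08:00 |
Interval parameters can be changed through the [settings endpoints](#配置管理接口); changes take effect after a service restart.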
+18 -9
@@ -5,25 +5,34 @@ ARG NPM_REGISTRY=https://registry.npmmirror.com
WORKDIR /app/frontend WORKDIR /app/frontend
COPY frontend/package*.json ./ COPY frontend/package*.json ./
RUN npm install --registry=${NPM_REGISTRY} RUN --mount=type=cache,target=/root/.npm \
npm install --registry=${NPM_REGISTRY}
COPY frontend/ . COPY frontend/ .
RUN npm run build RUN npm run build
# Stage 2: Python 后端 # Stage 2: Python 后端
FROM python:3.12-slim FROM python:3.12-slim
ARG PIP_INDEX=https://pypi.tuna.tsinghua.edu.cn/simple ARG PIP_INDEX=https://mirrors.aliyun.com/pypi/simple/
WORKDIR /app WORKDIR /app
# 安装构建依赖(部分 Python 包可能需要),并创建非 root 用户 # 先只 COPY requirements.txt,利用 Docker 层缓存——只要依赖不变就命中缓存
RUN apt-get update && apt-get install -y --no-install-recommends \
gcc \
&& rm -rf /var/lib/apt/lists/* \
&& useradd --create-home --uid 1000 app
COPY requirements.txt . COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt -i ${PIP_INDEX}
# Use --only-binary=:all: to force prebuilt wheels only and avoid compiling scikit-learn
# pip errors out if the platform has no wheel, but scikit-learn/numpy/scipy all ship wheels for x86_64
RUN --mount=type=cache,target=/root/.cache/pip \
pip install --no-cache-dir -r requirements.txt \
-i ${PIP_INDEX} \
--trusted-host mirrors.aliyun.com \
--only-binary=:all: \
|| pip install --no-cache-dir -r requirements.txt \
-i ${PIP_INDEX} \
--trusted-host mirrors.aliyun.com
# Create the non-root user (gcc is no longer needed; dropping apt-get saves ~40s)
RUN useradd --create-home --uid 1000 app
COPY . . COPY . .
COPY --from=frontend-builder /app/frontend/dist ./static COPY --from=frontend-builder /app/frontend/dist ./static
+79 -16
@@ -1,6 +1,7 @@
"""LLM API 客户端,兼容 OpenAI API 格式""" """LLM API 客户端,兼容 OpenAI API 格式"""
import json import json
import logging import logging
import re
from typing import Optional from typing import Optional
from openai import OpenAI, APIError from openai import OpenAI, APIError
@@ -9,9 +10,57 @@ from config import settings
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
# Matches the <think>...</think> reasoning block emitted by reasoning models (MiniMax-M3 / DeepSeek-R1 / GLM-Z1, etc.)
_THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)


def _parse_llm_json(content: str) -> dict:
    """Extract JSON from LLM output.

    Tolerates reasoning models that still emit a <think>...</think> block in
    json_object mode, as well as stray text before or after the JSON.
    """
    if not content or not content.strip():
        raise ValueError("LLM 返回空内容,无法解析 JSON")
    text = content.strip()
    # 1) Strip closed <think>...</think> blocks
    text = _THINK_RE.sub("", text).strip()
    # 2) Leading <think> that was never closed (truncated content): keep whatever
    #    follows a stray </think>; with no closing tag the text is left as-is for
    #    the fallbacks below
    if text.startswith("<think>"):
        text = text.split("</think>", 1)[-1].strip()
    # 3) Try parsing directly
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # 4) Extract the substring from the first { to the last }
    start = text.find("{")
    end = text.rfind("}")
    if start != -1 and end > start:
        try:
            return json.loads(text[start : end + 1])
        except json.JSONDecodeError:
            pass
    # 5) Last resort: try a JSON array
    start = text.find("[")
    end = text.rfind("]")
    if start != -1 and end > start:
        return json.loads(text[start : end + 1])
    logger.error("无法从 LLM 输出提取 JSON: %s", content[:500])
    raise ValueError("LLM 输出无法解析为 JSON")
class AIClient: class AIClient:
"""封装 LLM 调用,支持重试和 JSON 输出""" """封装 LLM 调用,支持重试和 JSON 输出
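Configuration is read from settings at runtime via properties, so values are not
frozen at module import time (settings is only overridden by the database
configuration after the FastAPI lifespan has started).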
"""
def __init__( def __init__(
self, self,
@@ -21,24 +70,42 @@ class AIClient:
timeout: Optional[int] = None, timeout: Optional[int] = None,
max_retries: Optional[int] = None, max_retries: Optional[int] = None,
): ):
self.api_key = api_key or settings.OPENAI_API_KEY # 仅保存显式传入的覆盖值;为 None 时运行时回退到 settings
self.base_url = base_url or settings.OPENAI_BASE_URL self._api_key = api_key
self.model = model or settings.OPENAI_MODEL self._base_url = base_url
self.timeout = timeout or settings.OPENAI_TIMEOUT self._model = model
self.max_retries = max_retries or settings.OPENAI_MAX_RETRIES self._timeout = timeout
self._max_retries = max_retries
self._client: Optional[OpenAI] = None @property
def api_key(self) -> str:
return self._api_key or settings.OPENAI_API_KEY
@property
def base_url(self) -> str:
return self._base_url or settings.OPENAI_BASE_URL
@property
def model(self) -> str:
return self._model or settings.OPENAI_MODEL
@property
def timeout(self) -> int:
return self._timeout or settings.OPENAI_TIMEOUT
@property
def max_retries(self) -> int:
return self._max_retries or settings.OPENAI_MAX_RETRIES
@property @property
def client(self) -> OpenAI: def client(self) -> OpenAI:
if self._client is None: # 每次按最新配置创建,确保用到启动后覆盖的真实配置
self._client = OpenAI( return OpenAI(
api_key=self.api_key, api_key=self.api_key,
base_url=self.base_url, base_url=self.base_url,
timeout=self.timeout, timeout=self.timeout,
max_retries=self.max_retries, max_retries=self.max_retries,
) )
return self._client
def chat_completion( def chat_completion(
self, self,
@@ -75,18 +142,14 @@ class AIClient:
user_prompt: str, user_prompt: str,
temperature: float = 0.3, temperature: float = 0.3,
) -> dict: ) -> dict:
"""调用 LLM 并解析返回的 JSON""" """调用 LLM 并解析返回的 JSON(兼容 reasoning 模型的 <think> 块)"""
content = self.chat_completion( content = self.chat_completion(
system_prompt=system_prompt, system_prompt=system_prompt,
user_prompt=user_prompt, user_prompt=user_prompt,
temperature=temperature, temperature=temperature,
json_mode=True, json_mode=True,
) )
try: return _parse_llm_json(content)
return json.loads(content)
except json.JSONDecodeError as exc:
logger.error("LLM 返回不是合法 JSON: %s - content=%s", exc, content[:500])
raise
ai_client = AIClient() ai_client = AIClient()
+7
@@ -9,6 +9,7 @@ from sqlalchemy.orm import Session
from config import settings from config import settings
from models import EnrichedArticle, DailyBrief from models import EnrichedArticle, DailyBrief
from app.task_progress import update_progress
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -76,6 +77,7 @@ def generate_daily_brief(db: Session, date_str: str = None, force: bool = False)
existing = db.query(DailyBrief).filter(DailyBrief.brief_date == date_str).first() existing = db.query(DailyBrief).filter(DailyBrief.brief_date == date_str).first()
if existing and not force: if existing and not force:
logger.info("日期 %s 简报已存在,跳过生成", date_str) logger.info("日期 %s 简报已存在,跳过生成", date_str)
update_progress("generate_daily_brief", status="running", stage="简报已存在", current=0, total=0, message="简报已存在,跳过生成")
return { return {
"date": date_str, "date": date_str,
"total_articles": existing.total_articles, "total_articles": existing.total_articles,
@@ -86,6 +88,8 @@ def generate_daily_brief(db: Session, date_str: str = None, force: bool = False)
day_start = datetime.strptime(date_str, "%Y-%m-%d") day_start = datetime.strptime(date_str, "%Y-%m-%d")
day_end = day_start + timedelta(days=1) day_end = day_start + timedelta(days=1)
update_progress("generate_daily_brief", status="running", stage="加载文章", current=0, total=0)
# 取当天去重后的代表文章 # 取当天去重后的代表文章
query = ( query = (
db.query(EnrichedArticle) db.query(EnrichedArticle)
@@ -106,6 +110,7 @@ def generate_daily_brief(db: Session, date_str: str = None, force: bool = False)
) )
# 按分类分组并排序 # 按分类分组并排序
update_progress("generate_daily_brief", status="running", stage="按分类整理", current=0, total=0)
by_category: Dict[str, List[Dict[str, Any]]] = {} by_category: Dict[str, List[Dict[str, Any]]] = {}
for art in representative_articles: for art in representative_articles:
cat = art.category or "未分类" cat = art.category or "未分类"
@@ -127,6 +132,7 @@ def generate_daily_brief(db: Session, date_str: str = None, force: bool = False)
} }
# 生成 Markdown 文件 # 生成 Markdown 文件
update_progress("generate_daily_brief", status="running", stage="生成 Markdown", current=0, total=0)
output_dir = settings.brief_output_dir_path / date_str output_dir = settings.brief_output_dir_path / date_str
output_dir.mkdir(parents=True, exist_ok=True) output_dir.mkdir(parents=True, exist_ok=True)
markdown_path = output_dir / "daily-brief.md" markdown_path = output_dir / "daily-brief.md"
@@ -134,6 +140,7 @@ def generate_daily_brief(db: Session, date_str: str = None, force: bool = False)
markdown_path.write_text(markdown_content, encoding="utf-8") markdown_path.write_text(markdown_content, encoding="utf-8")
# 更新文章 brief_date # 更新文章 brief_date
update_progress("generate_daily_brief", status="running", stage="保存简报", current=0, total=0)
for art in representative_articles: for art in representative_articles:
art.brief_date = date_str art.brief_date = date_str
+7 -1
@@ -12,6 +12,7 @@ import numpy as np
from config import settings from config import settings
from models import EnrichedArticle, DuplicateGroup from models import EnrichedArticle, DuplicateGroup
from app.task_progress import update_progress, report_loop_progress
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -172,8 +173,11 @@ def deduplicate_articles(
if not articles: if not articles:
logger.info("日期 %s 无文章可去重", date_str) logger.info("日期 %s 无文章可去重", date_str)
update_progress("tag_score_dedup", status="running", stage="去重", current=0, total=0, message="无文章可去重")
return {"total": 0, "duplicate_groups": 0, "representatives": 0} return {"total": 0, "duplicate_groups": 0, "representatives": 0}
update_progress("tag_score_dedup", status="running", stage="计算相似度并去重", current=0, total=0)
# 先 URL 去重:相同 link 只保留一篇 # 先 URL 去重:相同 link 只保留一篇
unique_articles: List[EnrichedArticle] = [] unique_articles: List[EnrichedArticle] = []
seen_links: set = set() seen_links: set = set()
@@ -194,8 +198,9 @@ def deduplicate_articles(
) )
stats = {"total": len(articles), "duplicate_groups": len(clusters), "representatives": 0} stats = {"total": len(articles), "duplicate_groups": len(clusters), "representatives": 0}
update_progress("tag_score_dedup", status="running", stage="写入重复组", current=0, total=len(clusters))
for cluster in clusters: for ci, cluster in enumerate(clusters):
representative = _pick_representative(unique_articles, cluster) representative = _pick_representative(unique_articles, cluster)
member_ids = [unique_articles[i].id for i in cluster] member_ids = [unique_articles[i].id for i in cluster]
@@ -214,6 +219,7 @@ def deduplicate_articles(
art.is_representative = (art.id == representative.id) art.is_representative = (art.id == representative.id)
stats["representatives"] += 1 stats["representatives"] += 1
report_loop_progress("tag_score_dedup", ci + 1, len(clusters), "写入重复组")
db.commit() db.commit()
logger.info( logger.info(
+16 -4
@@ -11,11 +11,23 @@ logger = logging.getLogger(__name__)
class RSSKeeperClient: class RSSKeeperClient:
"""rssKeeper 外部 API 客户端""" """rssKeeper 外部 API 客户端
def __init__(self, base_url: Optional[str] = None, timeout: int = 30): 配置以 property 形式运行时从 settings 读取,避免模块 import 时
self.base_url = (base_url or settings.RSSKEEPER_BASE_URL).rstrip("/") 固化旧值(settings 在 FastAPI lifespan 启动后才会被数据库配置覆盖)。
self.timeout = timeout """
def __init__(self, base_url: Optional[str] = None, timeout: Optional[int] = None):
self._base_url = base_url
self._timeout = timeout
@property
def base_url(self) -> str:
return (self._base_url or settings.RSSKEEPER_BASE_URL).rstrip("/")
@property
def timeout(self) -> int:
return self._timeout if self._timeout is not None else 30
def _get(self, path: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]: def _get(self, path: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
url = f"{self.base_url}{path}" url = f"{self.base_url}{path}"
+3
@@ -8,6 +8,7 @@ from sqlalchemy.orm import Session
from config import settings from config import settings
from models import EnrichedArticle, Taxonomy from models import EnrichedArticle, Taxonomy
from app.task_progress import update_progress, report_loop_progress
from app.tagger import _count_matches, _normalize from app.tagger import _count_matches, _normalize
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -119,6 +120,7 @@ def score_articles(
query = query.filter(EnrichedArticle.id.in_(article_ids)) query = query.filter(EnrichedArticle.id.in_(article_ids))
articles = query.all() articles = query.all()
update_progress("tag_score_dedup", status="running", stage="计算分数", current=0, total=len(articles))
count = 0 count = 0
for article in articles: for article in articles:
article.heat_score = compute_heat_score(article, heat_rules) article.heat_score = compute_heat_score(article, heat_rules)
@@ -141,6 +143,7 @@ def score_articles(
count += 1 count += 1
if count % 50 == 0: if count % 50 == 0:
db.commit() db.commit()
report_loop_progress("tag_score_dedup", count, len(articles), "计算分数")
db.commit() db.commit()
logger.info("打分完成: %d 篇文章", count) logger.info("打分完成: %d 篇文章", count)
+6 -1
@@ -7,6 +7,7 @@ from sqlalchemy.orm import Session
from app.ai_client import ai_client from app.ai_client import ai_client
from app.rss_client import rss_client from app.rss_client import rss_client
from app.task_progress import update_progress, report_loop_progress
from config import settings from config import settings
from models import EnrichedArticle from models import EnrichedArticle
@@ -109,11 +110,13 @@ def fetch_and_summarize(db: Session, hours: int = 24, limit: int = 200) -> Dict[
articles = rss_client.fetch_recent(hours=hours, limit=limit) articles = rss_client.fetch_recent(hours=hours, limit=limit)
if not articles: if not articles:
logger.info("未拉取到新文章") logger.info("未拉取到新文章")
update_progress("summarize", status="running", stage="无新文章", current=0, total=0, message="未拉取到新文章")
return {"fetched": 0, "created": 0, "summarized": 0} return {"fetched": 0, "created": 0, "summarized": 0}
stats = {"fetched": len(articles), "created": 0, "summarized": 0} stats = {"fetched": len(articles), "created": 0, "summarized": 0}
update_progress("summarize", status="running", stage="拉取文章并生成摘要", current=0, total=len(articles))
for raw in articles: for i, raw in enumerate(articles):
data = _article_from_rss(raw) data = _article_from_rss(raw)
article = db.query(EnrichedArticle).filter( article = db.query(EnrichedArticle).filter(
EnrichedArticle.rk_article_id == data["rk_article_id"] EnrichedArticle.rk_article_id == data["rk_article_id"]
@@ -146,6 +149,8 @@ def fetch_and_summarize(db: Session, hours: int = 24, limit: int = 200) -> Dict[
if stats["summarized"] % 10 == 0: if stats["summarized"] % 10 == 0:
db.commit() db.commit()
report_loop_progress("summarize", i + 1, len(articles), "生成摘要")
db.commit() db.commit()
logger.info( logger.info(
"摘要任务完成: fetched=%d, created=%d, summarized=%d", "摘要任务完成: fetched=%d, created=%d, summarized=%d",
+3
@@ -5,6 +5,7 @@ from typing import List, Dict, Any, Tuple
from sqlalchemy.orm import Session from sqlalchemy.orm import Session
from app.task_progress import update_progress, report_loop_progress
from models import EnrichedArticle, Taxonomy from models import EnrichedArticle, Taxonomy
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -103,6 +104,7 @@ def tag_articles(db: Session, article_ids: List[int] = None) -> int:
) )
articles = query.all() articles = query.all()
update_progress("tag_score_dedup", status="running", stage="分类打标", current=0, total=len(articles))
count = 0 count = 0
for article in articles: for article in articles:
article.category = classify_article(article, categories) article.category = classify_article(article, categories)
@@ -110,6 +112,7 @@ def tag_articles(db: Session, article_ids: List[int] = None) -> int:
count += 1 count += 1
if count % 50 == 0: if count % 50 == 0:
db.commit() db.commit()
report_loop_progress("tag_score_dedup", count, len(articles), "分类打标")
db.commit() db.commit()
logger.info("分类/打标签完成: %d 篇文章", count) logger.info("分类/打标签完成: %d 篇文章", count)
+117
@@ -0,0 +1,117 @@
"""Task progress registry (in-process memory, thread-safe).

Manual and scheduled tasks report progress here while they run; the frontend
polls GET /api/tasks/progress to read and display it.
With a single worker (uvicorn --workers 1), all request and task threads share the same memory.
"""
import copy
import threading
from datetime import datetime, timezone
from typing import Optional

# The 4 stable task keys
TASK_KEYS = ("summarize", "tag_score_dedup", "generate_daily_brief", "bootstrap_taxonomy")
_progress: dict = {}
_lock = threading.Lock()


def _now_iso() -> str:
    return datetime.now(timezone.utc).isoformat()


def _init() -> None:
    """Initialize every task key to idle."""
    for key in TASK_KEYS:
        _progress[key] = {
            "status": "idle",
            "stage": "",
            "current": 0,
            "total": 0,
            "message": None,
            "started_at": None,
            "updated_at": None,
            "finished_at": None,
            "trigger": None,
        }


_init()


def update_progress(
    task_key: str,
    *,
    status: Optional[str] = None,
    stage: Optional[str] = None,
    current: Optional[int] = None,
    total: Optional[int] = None,
    message: Optional[str] = None,
    trigger: Optional[str] = None,
) -> None:
    """Merge the non-None fields into the entry and stamp the timestamps."""
    with _lock:
        entry = _progress.get(task_key)
        if entry is None:
            entry = {
                "status": "idle", "stage": "", "current": 0, "total": 0,
                "message": None, "started_at": None, "updated_at": None,
                "finished_at": None, "trigger": None,
            }
            _progress[task_key] = entry
        now = _now_iso()
        if status == "running" and entry.get("started_at") is None:
            entry["started_at"] = now
        if status in ("success", "error"):
            entry["finished_at"] = now
        # Re-entering running clears the terminal timestamp
        if status == "running":
            entry["finished_at"] = None
        if status is not None:
            entry["status"] = status
        if stage is not None:
            entry["stage"] = stage
        if current is not None:
            entry["current"] = current
        if total is not None:
            entry["total"] = total
        if message is not None:
            entry["message"] = message
        if trigger is not None:
            entry["trigger"] = trigger
        entry["updated_at"] = now


def report_loop_progress(
    task_key: str,
    index: int,
    total: int,
    stage: str,
    message: Optional[str] = None,
    every: int = 5,
) -> None:
    """Throttled loop reporting: report only every `every` iterations or on the last one (index >= total), to reduce locking."""
    if index % every == 0 or index >= total:
        update_progress(task_key, status="running", stage=stage, current=index, total=total, message=message)


def get_progress(task_key: Optional[str] = None) -> dict:
    """Return a deep copy (one entry or all) so it cannot be mutated concurrently during serialization."""
    with _lock:
        if task_key is not None:
            return copy.deepcopy(_progress.get(task_key))
        return copy.deepcopy(_progress)


def reset_progress(task_key: str) -> None:
    """Reset a single task to idle (used by the frontend to clear a terminal state)."""
    with _lock:
        if task_key in _progress:
            _progress[task_key] = {
                "status": "idle", "stage": "", "current": 0, "total": 0,
                "message": None, "started_at": None, "updated_at": None,
                "finished_at": None, "trigger": None,
            }
+17 -14
@@ -5,8 +5,9 @@ from typing import List, Dict, Any
from sqlalchemy.orm import Session from sqlalchemy.orm import Session
from app.ai_client import ai_client from app.ai_client import AIClient
from app.rss_client import rss_client from app.rss_client import rss_client
from app.task_progress import update_progress
from models import Taxonomy from models import Taxonomy
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -40,19 +41,19 @@ TAXONOMY_SYSTEM_PROMPT = """你是一位专业的信息分类与内容分析专
3. heat_rules 和 importance_rules 各 10-20 条,weight 范围 0.5-2.0。 3. heat_rules 和 importance_rules 各 10-20 条,weight 范围 0.5-2.0。
4. 所有 keywords 用中文或中英双语,便于后续关键词匹配。 4. 所有 keywords 用中文或中英双语,便于后续关键词匹配。
5. 不要输出任何解释文字,只输出 JSON。 5. 不要输出任何解释文字,只输出 JSON。
6. **分类与标签名称必须使用中性的主题领域词**(如科技、财经、文化、体育、生活、健康、设计、商业等),
禁止使用具体事件、人名、地名、国家名、机构名或任何政治/军事/冲突相关的敏感词作为名称或关键词,
以保证内容中立、避免触发内容审查。
""" """
def _build_sample_prompt(articles: List[Dict[str, Any]]) -> str: def _build_sample_prompt(articles: List[Dict[str, Any]]) -> str:
lines = [f"共有 {len(articles)} 篇文章样本:"] # 只用标题和来源,不带正文摘要——降低输入中的敏感内容,避免触发内容审查
for idx, art in enumerate(articles[:50], 1): lines = [f"共有 {len(articles)} 篇文章样本(仅展示标题用于归纳主题):"]
for idx, art in enumerate(articles[:40], 1):
title = art.get("title", "") title = art.get("title", "")
summary = art.get("summary", "") or art.get("content", "")[:300]
feed = art.get("feed_title", "") feed = art.get("feed_title", "")
cat = art.get("category", "") lines.append(f"[{idx}] {title} (来源:{feed}")
lines.append(f"\n[{idx}] 标题:{title}")
lines.append(f" 来源:{feed} | 源分类:{cat}")
lines.append(f" 摘要:{summary[:400]}")
return "\n".join(lines) return "\n".join(lines)
@@ -72,22 +73,24 @@ def bootstrap_taxonomy(db: Session, force: bool = False) -> bool:
logger.info("强制重新初始化 taxonomy") logger.info("强制重新初始化 taxonomy")
logger.info("开始从 rssKeeper 拉取样本文章并生成分类体系...") logger.info("开始从 rssKeeper 拉取样本文章并生成分类体系...")
update_progress("bootstrap_taxonomy", status="running", stage="拉取样本文章", current=0, total=0)
articles = rss_client.fetch_recent(hours=24 * 7, limit=200) articles = rss_client.fetch_recent(hours=24 * 7, limit=200)
if not articles: if not articles:
logger.warning("未获取到样本文章,无法生成分类体系") logger.warning("未获取到样本文章,无法生成分类体系")
return False raise RuntimeError("未获取到样本文章,无法生成分类体系")
user_prompt = _build_sample_prompt(articles) user_prompt = _build_sample_prompt(articles)
try: update_progress("bootstrap_taxonomy", status="running", stage="LLM 生成分类体系", current=0, total=0, message="正在调用 LLM 生成分类规则,可能需要 2-4 分钟")
result = ai_client.chat_completion_json( # bootstrap 是一次性大任务(生成 categories+tags+rules),MiniMax-M3 reasoning 模式较慢,
# 用专用大 timeout client(默认 60s 不够),失败抛异常由调用方捕获并如实标记进度
bootstrap_ai = AIClient(timeout=300, max_retries=2)
result = bootstrap_ai.chat_completion_json(
system_prompt=TAXONOMY_SYSTEM_PROMPT, system_prompt=TAXONOMY_SYSTEM_PROMPT,
user_prompt=user_prompt, user_prompt=user_prompt,
temperature=0.5, temperature=0.5,
) )
except Exception as exc:
logger.error("生成分类体系失败: %s", exc)
return False
update_progress("bootstrap_taxonomy", status="running", stage="保存规则", current=0, total=0)
_save_taxonomy(db, result) _save_taxonomy(db, result)
logger.info("taxonomy 初始化完成,共写入 %d 条规则", db.query(Taxonomy).count()) logger.info("taxonomy 初始化完成,共写入 %d 条规则", db.query(Taxonomy).count())
return True return True
+4 -1
@@ -1,10 +1,13 @@
<!DOCTYPE html> <!DOCTYPE html>
<html lang="zh-CN"> <html lang="zh-CN" class="dark">
<head> <head>
<meta charset="UTF-8" /> <meta charset="UTF-8" />
<link rel="icon" type="image/svg+xml" href="/vite.svg" /> <link rel="icon" type="image/svg+xml" href="/vite.svg" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>dataClean - RSS 数据清洗</title> <title>dataClean - RSS 数据清洗</title>
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Plus+Jakarta+Sans:ital,wght@0,400;0,500;0,600;0,700;0,800&display=swap" rel="stylesheet">
</head> </head>
<body> <body>
<div id="app"></div> <div id="app"></div>
+36 -18
@@ -1,16 +1,18 @@
<template> <template>
<el-container class="layout-container"> <el-container class="layout-container">
<el-aside width="220px"> <el-aside width="230px">
<div class="logo"> <div class="logo">
<el-icon size="28"><DataLine /></el-icon> <div class="logo-icon">
<span>dataClean</span> <el-icon size="22"><DataLine /></el-icon>
</div>
<span class="logo-text">dataClean</span>
</div> </div>
<el-menu <el-menu
:default-active="$route.path" :default-active="$route.path"
router router
background-color="transparent" background-color="transparent"
text-color="#a0a0a0" text-color="#9ba4b8"
active-text-color="#409eff" active-text-color="#2dd4bf"
> >
<el-menu-item index="/dashboard"> <el-menu-item index="/dashboard">
<el-icon><Odometer /></el-icon> <el-icon><Odometer /></el-icon>
@@ -40,17 +42,17 @@
</el-aside> </el-aside>
<el-container> <el-container>
<el-header class="top-header" height="60px"> <el-header class="top-header" height="56px">
<div class="header-right"> <div class="header-right">
<el-input <el-input
v-model="apiTokenInput" v-model="apiTokenInput"
placeholder="API Token(未设置可留空)" placeholder="API Token(未设置可留空)"
size="small" size="default"
show-password show-password
style="width: 260px;" style="width: 280px;"
@keyup.enter="saveToken" @keyup.enter="saveToken"
/> />
<el-button size="small" type="primary" @click="saveToken"> <el-button size="default" type="primary" @click="saveToken">
{{ hasToken ? '更新 Token' : '设置 Token' }} {{ hasToken ? '更新 Token' : '设置 Token' }}
</el-button> </el-button>
</div> </div>
@@ -89,15 +91,30 @@ const saveToken = () => {
} }
.logo { .logo {
height: 60px; height: 56px;
display: flex; display: flex;
align-items: center; align-items: center;
justify-content: center; justify-content: center;
gap: 10px; gap: 10px;
font-size: 20px;
font-weight: 600;
color: #409eff;
border-bottom: 1px solid var(--dc-border); border-bottom: 1px solid var(--dc-border);
padding: 0 16px;
}
.logo-icon {
width: 36px;
height: 36px;
border-radius: 10px;
background: var(--dc-primary-dim);
display: flex;
align-items: center;
justify-content: center;
color: var(--dc-primary);
}
.logo-text {
font-size: 18px;
font-weight: 700;
color: var(--dc-text);
letter-spacing: -0.3px;
font-family: var(--dc-font);
} }
.top-header { .top-header {
@@ -105,20 +122,21 @@ const saveToken = () => {
align-items: center; align-items: center;
justify-content: flex-end; justify-content: flex-end;
border-bottom: 1px solid var(--dc-border); border-bottom: 1px solid var(--dc-border);
background-color: var(--dc-card-bg); padding: 0 24px;
} }
.header-right { .header-right {
display: flex; display: flex;
align-items: center; align-items: center;
gap: 10px; gap: 12px;
} }
.el-menu-item { .el-menu-item {
height: 50px; height: 46px;
line-height: 50px; line-height: 46px;
margin: 2px 8px;
border-radius: 8px;
} }
.el-menu-item .el-icon { .el-menu-item .el-icon {
margin-right: 8px; margin-right: 8px;
} }
+5
@@ -65,10 +65,15 @@ export const datacleanApi = {
 summarize: () => api.post('/tasks/summarize'),
 tagScoreDedup: () => api.post('/tasks/tag-score-dedup'),
 generateBrief: () => api.post('/tasks/brief'),
+getTaskProgress: () => api.get('/tasks/progress'),
+resetTaskProgress: (taskKey) => api.post('/tasks/progress/reset', null, { params: { task_key: taskKey } }),
 // 配置
 getSettings: () => api.get('/settings'),
 updateSetting: (key, value) => api.put(`/settings/${key}`, { value }),
 updateSettingsBatch: (settings) => api.put('/settings', { settings }),
 resetSettings: () => api.post('/settings/reset'),
+// 连通性测试
+testConnection: () => api.post('/test-connection'),
 }
+394 -68
@@ -1,100 +1,159 @@
/* ============================================
dataClean — Deep Ink Theme
高对比度深色主题,阅读优先
============================================ */
:root { :root {
--dc-bg: #0f0f23; /* —— 背景层次 —— */
--dc-card-bg: #1a1a2e; --dc-bg: #0b0d12;
--dc-border: #2d2d44; --dc-surface: #12151e;
--dc-text: #e0e0e0; --dc-surface-raised: #191d2a;
--dc-text-secondary: #a0a0a0; --dc-surface-hover: #1f2435;
--dc-primary: #409eff; --dc-overlay: #252b3d;
--dc-success: #67c23a;
--dc-warning: #e6a23c; /* —— 边框 —— */
--dc-danger: #f56c6c; --dc-border: #272e3f;
--dc-border-light: #333c52;
/* —— 文字(严格高对比度) —— */
--dc-text: #e8ecf4;
--dc-text-secondary: #9ba4b8;
--dc-text-muted: #5f6a80;
/* —— 主色调:Teal Mint —— */
--dc-primary: #2dd4bf;
--dc-primary-hover: #5eead4;
--dc-primary-dim: rgba(45, 212, 191, 0.12);
--dc-primary-bg: rgba(45, 212, 191, 0.08);
/* —— 辅助强调 —— */
--dc-accent-warm: #f59e42;
--dc-accent-rose: #fb7185;
/* —— 语义色 —— */
--dc-success: #4ade80;
--dc-warning: #facc15;
--dc-danger: #f87171;
--dc-info: #60a5fa;
/* —— 阴影 —— */
--dc-shadow: 0 1px 4px rgba(0,0,0,0.35);
--dc-shadow-lg: 0 8px 24px rgba(0,0,0,0.4);
/* —— 圆角 —— */
--dc-radius: 10px;
--dc-radius-sm: 6px;
/* —— 字体 —— */
--dc-font: 'Plus Jakarta Sans', -apple-system, BlinkMacSystemFont, 'PingFang SC', 'Noto Sans SC', 'Microsoft YaHei', sans-serif;
} }
/* ===== 全局重置 ===== */
* { * {
margin: 0; margin: 0;
padding: 0; padding: 0;
box-sizing: border-box; box-sizing: border-box;
} }
html.dark {
color-scheme: dark;
}
body { body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, 'Helvetica Neue', Arial, sans-serif; font-family: var(--dc-font);
background-color: var(--dc-bg); background-color: var(--dc-bg);
color: var(--dc-text); color: var(--dc-text);
}
.page-title {
font-size: 24px;
font-weight: 600;
margin-bottom: 20px;
color: var(--dc-text);
}
.stat-card {
background: var(--dc-card-bg);
border: 1px solid var(--dc-border);
border-radius: 8px;
padding: 20px;
transition: transform 0.2s;
}
.stat-card:hover {
transform: translateY(-2px);
}
.stat-value {
font-size: 28px;
font-weight: 700;
color: var(--dc-primary);
}
.stat-label {
font-size: 14px; font-size: 14px;
color: var(--dc-text-secondary); line-height: 1.6;
margin-top: 8px; -webkit-font-smoothing: antialiased;
-moz-osx-font-smoothing: grayscale;
} }
a {
color: var(--dc-primary);
text-decoration: none;
transition: color 0.2s;
}
a:hover {
color: var(--dc-primary-hover);
}
/* ===== 页面标题 ===== */
.page-title {
font-size: 22px;
font-weight: 700;
margin-bottom: 24px;
color: var(--dc-text);
letter-spacing: -0.3px;
}
/* ===== 统计卡片 ===== */
.stat-card {
background: var(--dc-surface);
border: 1px solid var(--dc-border);
border-radius: var(--dc-radius);
padding: 20px 22px;
transition: border-color 0.2s, box-shadow 0.2s;
}
.stat-card:hover {
border-color: var(--dc-border-light);
box-shadow: var(--dc-shadow);
}
.stat-value {
font-size: 30px;
font-weight: 800;
color: var(--dc-primary);
letter-spacing: -0.5px;
font-family: var(--dc-font);
}
.stat-label {
font-size: 13px;
color: var(--dc-text-secondary);
margin-top: 6px;
font-weight: 500;
}
/* ===== 通用卡片(替代 dark-card) ===== */
.dark-card { .dark-card {
background: var(--dc-card-bg) !important; background: var(--dc-surface) !important;
border: 1px solid var(--dc-border) !important; border: 1px solid var(--dc-border) !important;
border-radius: var(--dc-radius) !important;
color: var(--dc-text) !important; color: var(--dc-text) !important;
} }
.dark-card .el-card__header { .dark-card .el-card__header {
border-bottom: 1px solid var(--dc-border) !important; border-bottom: 1px solid var(--dc-border) !important;
color: var(--dc-text) !important; color: var(--dc-text) !important;
font-weight: 600;
} }
/* ===== 柱状图 ===== */
.daily-bar-wrap { .daily-bar-wrap {
display: flex; display: flex;
align-items: flex-end; align-items: flex-end;
gap: 8px; gap: 8px;
height: 120px; height: 140px;
padding: 10px 0; padding: 10px 0;
} }
.daily-bar { .daily-bar {
flex: 1; flex: 1;
background: linear-gradient(to top, var(--dc-primary), #66b1ff); background: linear-gradient(to top, var(--dc-primary), #5eead4);
border-radius: 4px 4px 0 0; border-radius: 4px 4px 0 0;
min-width: 20px; min-width: 20px;
position: relative; position: relative;
transition: opacity 0.2s; transition: opacity 0.2s;
} }
.daily-bar:hover { .daily-bar:hover {
opacity: 0.8; opacity: 0.8;
} }
.daily-bar-label { .daily-bar-label {
position: absolute; position: absolute;
bottom: -20px; bottom: -22px;
left: 50%; left: 50%;
transform: translateX(-50%); transform: translateX(-50%);
font-size: 12px; font-size: 11px;
color: var(--dc-text-secondary); color: var(--dc-text-secondary);
white-space: nowrap; white-space: nowrap;
} }
.daily-bar-value { .daily-bar-value {
position: absolute; position: absolute;
top: -20px; top: -20px;
@@ -102,63 +161,330 @@ body {
transform: translateX(-50%); transform: translateX(-50%);
font-size: 12px; font-size: 12px;
color: var(--dc-text); color: var(--dc-text);
font-weight: 600;
} }
/* ===== 评分进度条 ===== */
.score-progress { .score-progress {
margin-top: 8px; margin-top: 8px;
} }
.score-progress .el-progress-bar__outer { .score-progress .el-progress-bar__outer {
background-color: rgba(255, 255, 255, 0.1) !important; background-color: rgba(255, 255, 255, 0.06) !important;
} }
/* ===== 链接 & 标签 ===== */
.article-link { .article-link {
color: var(--dc-primary); color: var(--dc-primary);
text-decoration: none; text-decoration: none;
transition: color 0.15s;
} }
.article-link:hover { .article-link:hover {
color: var(--dc-primary-hover);
text-decoration: underline; text-decoration: underline;
} }
.tag-item { .tag-item {
margin-right: 6px; margin-right: 6px;
margin-bottom: 4px; margin-bottom: 4px;
} }
/* Element Plus 暗色覆盖 */ /* ============================================
Element Plus 暗色深度覆盖
确保 EP 组件在深色背景上完全可读
============================================ */
/* —— 侧栏 —— */
.el-aside {
background-color: var(--dc-surface) !important;
border-right: 1px solid var(--dc-border) !important;
}
.el-menu { .el-menu {
border-right: none !important; border-right: none !important;
background-color: transparent !important; background-color: transparent !important;
} }
.el-menu-item {
.el-aside { color: var(--dc-text-secondary) !important;
background-color: var(--dc-card-bg) !important; border-radius: var(--dc-radius-sm);
border-right: 1px solid var(--dc-border) !important; margin: 2px 8px;
transition: all 0.2s;
}
.el-menu-item:hover {
background-color: var(--dc-surface-hover) !important;
color: var(--dc-text) !important;
}
.el-menu-item.is-active {
background-color: var(--dc-primary-dim) !important;
color: var(--dc-primary) !important;
} }
/* —— 主容器 —— */
.el-container { .el-container {
background-color: var(--dc-bg) !important; background-color: var(--dc-bg) !important;
} }
.el-main { .el-main {
background-color: var(--dc-bg) !important; background-color: var(--dc-bg) !important;
} }
.el-header {
.el-table { background-color: var(--dc-surface) !important;
background-color: transparent !important;
} }
/* —— 卡片 —— */
.el-card {
background-color: var(--dc-surface) !important;
border-color: var(--dc-border) !important;
border-radius: var(--dc-radius) !important;
color: var(--dc-text) !important;
}
.el-card__header {
border-bottom-color: var(--dc-border) !important;
color: var(--dc-text) !important;
}
/* —— 表格 —— */
.el-table {
--el-table-bg-color: transparent;
--el-table-tr-bg-color: transparent;
--el-table-header-bg-color: var(--dc-surface-raised);
--el-table-row-hover-bg-color: var(--dc-primary-dim);
--el-table-border-color: var(--dc-border);
--el-table-text-color: var(--dc-text);
--el-table-header-text-color: var(--dc-text-secondary);
background-color: transparent !important;
color: var(--dc-text) !important;
}
.el-table th, .el-table th,
.el-table tr { .el-table tr {
background-color: transparent !important; background-color: transparent !important;
color: var(--dc-text) !important;
}
.el-table th.el-table__cell {
background-color: var(--dc-surface-raised) !important;
color: var(--dc-text-secondary) !important;
font-weight: 600;
font-size: 13px;
} }
.el-table--enable-row-hover .el-table__body tr:hover > td { .el-table--enable-row-hover .el-table__body tr:hover > td {
background-color: rgba(64, 158, 255, 0.1) !important; background-color: var(--dc-primary-dim) !important;
}
.el-table td.el-table__cell,
.el-table th.el-table__cell {
border-bottom-color: var(--dc-border) !important;
}
.el-table__empty-text {
color: var(--dc-text-muted) !important;
} }
.el-input__wrapper, /* —— 输入框 —— */
.el-textarea__inner { .el-input__wrapper {
background-color: rgba(255, 255, 255, 0.05) !important; background-color: var(--dc-surface-raised) !important;
box-shadow: 0 0 0 1px var(--dc-border) inset !important;
color: var(--dc-text) !important;
}
.el-input__wrapper:hover {
box-shadow: 0 0 0 1px var(--dc-border-light) inset !important;
}
.el-input__wrapper.is-focus {
box-shadow: 0 0 0 1px var(--dc-primary) inset !important;
}
.el-input__inner {
color: var(--dc-text) !important;
}
.el-input__inner::placeholder {
color: var(--dc-text-muted) !important;
}
.el-textarea__inner {
background-color: var(--dc-surface-raised) !important;
border-color: var(--dc-border) !important;
color: var(--dc-text) !important;
}
.el-textarea__inner::placeholder {
color: var(--dc-text-muted) !important;
}
/* —— 按钮 —— */
.el-button--primary {
--el-button-bg-color: var(--dc-primary);
--el-button-border-color: var(--dc-primary);
--el-button-hover-bg-color: var(--dc-primary-hover);
--el-button-hover-border-color: var(--dc-primary-hover);
--el-button-text-color: #0b0d12;
--el-button-hover-text-color: #0b0d12;
font-weight: 600;
}
.el-button--danger {
--el-button-bg-color: var(--dc-danger);
--el-button-border-color: var(--dc-danger);
--el-button-text-color: #0b0d12;
--el-button-hover-text-color: #0b0d12;
}
/* —— 标签 —— */
.el-tag {
border-radius: 6px;
font-weight: 500;
}
.el-tag--info {
--el-tag-bg-color: rgba(96, 165, 250, 0.12);
--el-tag-border-color: rgba(96, 165, 250, 0.2);
--el-tag-text-color: #93bbfd;
}
.el-tag--success {
--el-tag-bg-color: rgba(74, 222, 128, 0.12);
--el-tag-border-color: rgba(74, 222, 128, 0.2);
--el-tag-text-color: #86efac;
}
.el-tag--warning {
--el-tag-bg-color: rgba(250, 204, 21, 0.12);
--el-tag-border-color: rgba(250, 204, 21, 0.2);
--el-tag-text-color: #fde047;
}
.el-tag--danger {
--el-tag-bg-color: rgba(248, 113, 113, 0.12);
--el-tag-border-color: rgba(248, 113, 113, 0.2);
--el-tag-text-color: #fca5a5;
}
.el-tag--primary {
--el-tag-bg-color: var(--dc-primary-dim);
--el-tag-border-color: rgba(45, 212, 191, 0.25);
--el-tag-text-color: var(--dc-primary);
}
/* —— 时间线 —— */
.el-timeline-item__tail {
border-left-color: var(--dc-border-light) !important;
}
.el-timeline-item__node {
background-color: var(--dc-primary) !important;
}
.el-timeline-item__timestamp {
color: var(--dc-text-muted) !important;
}
.el-timeline-item__content {
color: var(--dc-text) !important;
}
/* —— 折叠面板 —— */
.el-collapse {
--el-collapse-header-bg-color: transparent;
--el-collapse-content-bg-color: transparent;
--el-collapse-border-color: var(--dc-border);
border-color: var(--dc-border) !important;
}
.el-collapse-item__header {
color: var(--dc-text) !important;
background-color: transparent !important;
border-bottom-color: var(--dc-border) !important;
font-weight: 600;
}
.el-collapse-item__wrap {
background-color: transparent !important;
border-bottom-color: var(--dc-border) !important;
}
.el-collapse-item__content {
color: var(--dc-text-secondary) !important;
}
/* —— Tabs —— */
.el-tabs__item {
color: var(--dc-text-secondary) !important;
font-weight: 500;
}
.el-tabs__item.is-active {
color: var(--dc-primary) !important;
}
.el-tabs__item:hover {
color: var(--dc-primary-hover) !important;
}
.el-tabs__active-bar {
background-color: var(--dc-primary) !important;
}
.el-tabs__nav-wrap::after {
background-color: var(--dc-border) !important;
}
/* —— 分页 —— */
.el-pagination {
--el-pagination-bg-color: transparent;
--el-pagination-text-color: var(--dc-text-secondary);
--el-pagination-button-bg-color: var(--dc-surface-raised);
--el-pagination-hover-color: var(--dc-primary);
}
.el-pager li {
background: var(--dc-surface-raised) !important;
color: var(--dc-text-secondary) !important;
}
.el-pager li.is-active {
background: var(--dc-primary) !important;
color: #0b0d12 !important;
font-weight: 700;
}
/* —— Alert —— */
.el-alert--info {
--el-alert-bg-color: rgba(96, 165, 250, 0.08);
border-color: rgba(96, 165, 250, 0.2);
}
.el-alert--warning {
--el-alert-bg-color: rgba(250, 204, 21, 0.08);
border-color: rgba(250, 204, 21, 0.2);
}
.el-alert__title {
color: var(--dc-text) !important;
font-weight: 500;
}
/* —— Empty —— */
.el-empty__description p {
color: var(--dc-text-muted) !important;
}
/* —— Page Header —— */
.el-page-header__left {
color: var(--dc-text-secondary) !important;
}
.el-page-header__content {
color: var(--dc-text) !important;
}
/* —— 日期选择器弹出层 —— */
.el-date-editor {
--el-fill-color-blank: var(--dc-surface-raised);
}
/* —— Checkbox —— */
.el-checkbox__label {
color: var(--dc-text-secondary) !important;
}
/* —— 进度条文本 —— */
.el-progress__text {
color: var(--dc-text) !important;
}
/* —— Loading —— */
.el-loading-mask {
background-color: rgba(11, 13, 18, 0.7) !important;
}
/* —— Message Box —— */
.el-message-box {
--el-messagebox-title-color: var(--dc-text);
--el-messagebox-content-color: var(--dc-text-secondary);
background-color: var(--dc-surface) !important;
border-color: var(--dc-border) !important;
}
/* —— 滚动条美化 —— */
::-webkit-scrollbar {
width: 6px;
height: 6px;
}
::-webkit-scrollbar-track {
background: transparent;
}
::-webkit-scrollbar-thumb {
background: var(--dc-border-light);
border-radius: 3px;
}
::-webkit-scrollbar-thumb:hover {
background: var(--dc-text-muted);
} }
+50 -35
@@ -1,6 +1,10 @@
<template> <template>
<div v-loading="loading"> <div v-loading="loading">
<el-page-header @back="$router.push('/articles')" title="文章详情" /> <el-page-header @back="$router.push('/articles')" title="返回列表">
<template #content>
<span style="color: var(--dc-text); font-weight: 600;">文章详情</span>
</template>
</el-page-header>
<el-card v-if="article" class="dark-card" style="margin-top: 20px;"> <el-card v-if="article" class="dark-card" style="margin-top: 20px;">
<template #header> <template #header>
@@ -10,7 +14,7 @@
<span><el-icon><OfficeBuilding /></el-icon> {{ article.feed_title }}</span> <span><el-icon><OfficeBuilding /></el-icon> {{ article.feed_title }}</span>
<span v-if="article.author"><el-icon><User /></el-icon> {{ article.author }}</span> <span v-if="article.author"><el-icon><User /></el-icon> {{ article.author }}</span>
<span><el-icon><Timer /></el-icon> {{ article.published_at }}</span> <span><el-icon><Timer /></el-icon> {{ article.published_at }}</span>
<el-tag v-if="article.is_representative" type="success">重复组代表</el-tag> <el-tag v-if="article.is_representative" type="success" size="small">重复组代表</el-tag>
</div> </div>
</div> </div>
</template> </template>
@@ -35,15 +39,13 @@
<div class="article-section"> <div class="article-section">
<h3>评分</h3> <h3>评分</h3>
<el-row :gutter="20"> <div class="scores-grid">
<el-col :span="6" v-for="score in scoreList" :key="score.label"> <div class="score-item" v-for="score in scoreList" :key="score.label">
<div class="score-item">
<div class="score-label">{{ score.label }}</div> <div class="score-label">{{ score.label }}</div>
<div class="score-value">{{ score.value.toFixed(1) }}</div> <div class="score-value" :style="{ color: score.color }">{{ score.value.toFixed(1) }}</div>
<el-progress :percentage="Math.round(score.value)" :color="score.color" class="score-progress" /> <el-progress :percentage="Math.round(score.value)" :color="score.color" class="score-progress" />
</div> </div>
</el-col> </div>
</el-row>
</div> </div>
<div class="article-section" v-if="article.link"> <div class="article-section" v-if="article.link">
@@ -73,10 +75,10 @@ const article = ref(null)
const scoreList = computed(() => { const scoreList = computed(() => {
if (!article.value) return [] if (!article.value) return []
return [ return [
{ label: '热度', value: article.value.heat_score, color: '#f56c6c' }, { label: '热度', value: article.value.heat_score, color: '#f87171' },
{ label: '重要性', value: article.value.importance_score, color: '#e6a23c' }, { label: '重要性', value: article.value.importance_score, color: '#facc15' },
{ label: '重复度', value: article.value.duplication_score, color: '#67c23a' }, { label: '重复度', value: article.value.duplication_score, color: '#4ade80' },
{ label: '综合分', value: article.value.composite_score, color: '#409eff' }, { label: '综合分', value: article.value.composite_score, color: '#2dd4bf' },
] ]
}) })
@@ -98,66 +100,79 @@ onMounted(loadArticle)
.article-header h2 { .article-header h2 {
margin-bottom: 12px; margin-bottom: 12px;
color: var(--dc-text); color: var(--dc-text);
font-size: 20px;
font-weight: 700;
line-height: 1.4;
} }
.article-meta { .article-meta {
display: flex; display: flex;
gap: 20px; gap: 20px;
align-items: center; align-items: center;
color: var(--dc-text-secondary); color: var(--dc-text-secondary);
font-size: 14px; font-size: 13px;
flex-wrap: wrap;
} }
.article-meta .el-icon { .article-meta .el-icon {
margin-right: 4px; margin-right: 4px;
vertical-align: middle; vertical-align: middle;
} }
.article-section { .article-section {
margin-bottom: 24px; margin-bottom: 28px;
} }
.article-section h3 { .article-section h3 {
font-size: 16px; font-size: 15px;
margin-bottom: 12px; margin-bottom: 14px;
color: var(--dc-text); color: var(--dc-text);
border-left: 4px solid var(--dc-primary); border-left: 3px solid var(--dc-primary);
padding-left: 10px; padding-left: 12px;
font-weight: 600;
} }
.ai-summary { .ai-summary {
line-height: 1.8; line-height: 1.8;
color: var(--dc-text); color: var(--dc-text);
background: rgba(64, 158, 255, 0.1); background: var(--dc-primary-bg);
padding: 16px; padding: 18px;
border-radius: 8px; border-radius: var(--dc-radius);
border: 1px solid rgba(45, 212, 191, 0.15);
font-size: 15px;
} }
.no-data { .no-data {
color: var(--dc-text-secondary); color: var(--dc-text-muted);
font-style: italic;
} }
.section-label { .section-label {
color: var(--dc-text-secondary); color: var(--dc-text-secondary);
margin-right: 8px; margin-right: 8px;
font-weight: 500;
} }
.scores-grid {
display: grid;
grid-template-columns: repeat(4, 1fr);
gap: 16px;
}
.score-item { .score-item {
background: rgba(255, 255, 255, 0.03); background: var(--dc-surface-raised);
padding: 16px; padding: 18px;
border-radius: 8px; border-radius: var(--dc-radius);
text-align: center; text-align: center;
border: 1px solid var(--dc-border);
} }
.score-label { .score-label {
color: var(--dc-text-secondary); color: var(--dc-text-secondary);
font-size: 14px; font-size: 13px;
font-weight: 500;
margin-bottom: 4px;
} }
.score-value { .score-value {
font-size: 24px; font-size: 28px;
font-weight: 700; font-weight: 800;
margin: 8px 0; margin: 6px 0 10px;
color: var(--dc-text); letter-spacing: -0.5px;
font-family: var(--dc-font);
} }
</style> </style>
+27 -19
@@ -1,14 +1,17 @@
<template> <template>
<div v-loading="loading"> <div v-loading="loading">
<el-page-header @back="$router.push('/briefs')" title="简报详情" /> <el-page-header @back="$router.push('/briefs')" title="返回列表">
<template #content>
<span style="color: var(--dc-text); font-weight: 600;">{{ brief?.brief_date }} 每日简报</span>
</template>
</el-page-header>
<el-card v-if="brief" class="dark-card" style="margin-top: 20px;"> <el-card v-if="brief" class="dark-card" style="margin-top: 20px;">
<template #header> <template #header>
<div class="brief-header"> <div class="brief-header">
<h2>{{ brief.brief_date }} 每日简报</h2>
<div class="brief-meta"> <div class="brief-meta">
<el-tag type="info">原始文章{{ brief.total_articles }}</el-tag> <el-tag type="info" effect="plain">原始文章{{ brief.total_articles }}</el-tag>
<el-tag type="success">去重后{{ brief.unique_articles }}</el-tag> <el-tag type="success" effect="plain">去重后{{ brief.unique_articles }}</el-tag>
</div> </div>
</div> </div>
</template> </template>
@@ -17,9 +20,11 @@
<el-collapse-item <el-collapse-item
v-for="(articles, category) in brief.by_category" v-for="(articles, category) in brief.by_category"
:key="category" :key="category"
:title="`${category} (${articles.length})`"
:name="category" :name="category"
> >
<template #title>
<span class="collapse-title">{{ category }}{{ articles.length }} </span>
</template>
<div <div
v-for="article in articles" v-for="article in articles"
:key="article.id" :key="article.id"
@@ -30,7 +35,7 @@
<span class="brief-article-feed">{{ article.feed_title }}</span> <span class="brief-article-feed">{{ article.feed_title }}</span>
</div> </div>
<div class="brief-article-tags"> <div class="brief-article-tags">
<el-tag v-for="tag in article.tags" :key="tag" size="small" class="tag-item">{{ tag }}</el-tag> <el-tag v-for="tag in article.tags" :key="tag" size="small" class="tag-item" type="info">{{ tag }}</el-tag>
<el-tag size="small" type="warning">综合 {{ article.composite_score.toFixed(1) }}</el-tag> <el-tag size="small" type="warning">综合 {{ article.composite_score.toFixed(1) }}</el-tag>
</div> </div>
<p v-if="article.summary" class="brief-article-summary">{{ article.summary }}</p> <p v-if="article.summary" class="brief-article-summary">{{ article.summary }}</p>
@@ -73,49 +78,52 @@ onMounted(loadBrief)
</script> </script>
<style scoped> <style scoped>
.brief-header h2 { .brief-header {
margin-bottom: 10px; display: flex;
color: var(--dc-text); align-items: center;
} }
.brief-meta { .brief-meta {
display: flex; display: flex;
gap: 10px; gap: 10px;
} }
.collapse-title {
font-weight: 600;
font-size: 15px;
color: var(--dc-text);
}
.brief-article { .brief-article {
padding: 16px 0; padding: 16px 0;
border-bottom: 1px solid var(--dc-border); border-bottom: 1px solid var(--dc-border);
} }
.brief-article:last-child { .brief-article:last-child {
border-bottom: none; border-bottom: none;
} }
.brief-article-title { .brief-article-title {
display: flex; display: flex;
justify-content: space-between; justify-content: space-between;
align-items: center; align-items: center;
margin-bottom: 8px; margin-bottom: 8px;
gap: 16px;
} }
.brief-article-title a { .brief-article-title a {
font-size: 16px; font-size: 15px;
font-weight: 500; font-weight: 500;
color: var(--dc-primary);
} }
.brief-article-feed { .brief-article-feed {
color: var(--dc-text-secondary); color: var(--dc-text-muted);
font-size: 13px; font-size: 13px;
white-space: nowrap;
} }
.brief-article-tags { .brief-article-tags {
margin-bottom: 8px; margin-bottom: 8px;
} }
.brief-article-summary { .brief-article-summary {
color: var(--dc-text-secondary); color: var(--dc-text-secondary);
font-size: 14px; font-size: 14px;
line-height: 1.6; line-height: 1.7;
margin-top: 8px;
} }
</style> </style>
+38 -14
@@ -3,21 +3,19 @@
<h1 class="page-title">仪表盘</h1> <h1 class="page-title">仪表盘</h1>
<!-- 统计卡片 --> <!-- 统计卡片 -->
<el-row :gutter="20"> <div class="stats-grid">
<el-col :span="6" v-for="stat in stats" :key="stat.label"> <div class="stat-card" v-for="stat in stats" :key="stat.label">
<div class="stat-card">
<div class="stat-value">{{ stat.value }}</div> <div class="stat-value">{{ stat.value }}</div>
<div class="stat-label">{{ stat.label }}</div> <div class="stat-label">{{ stat.label }}</div>
</div> </div>
</el-col> </div>
</el-row>
<!-- 分类分布 + 最近简报 --> <!-- 分类分布 + 最近简报 -->
<el-row :gutter="20" style="margin-top: 20px;"> <el-row :gutter="20" style="margin-top: 20px;">
<el-col :span="16"> <el-col :span="16">
<el-card class="dark-card"> <el-card class="dark-card">
<template #header> <template #header>
<span>分类分布</span> <span class="card-header-title">分类分布</span>
</template> </template>
<div v-if="categoryDistribution.length" class="daily-bar-wrap"> <div v-if="categoryDistribution.length" class="daily-bar-wrap">
<div <div
@@ -38,7 +36,7 @@
<el-col :span="8"> <el-col :span="8">
<el-card class="dark-card"> <el-card class="dark-card">
<template #header> <template #header>
<span>最近简报</span> <span class="card-header-title">最近简报</span>
</template> </template>
<el-timeline v-if="recentBriefs.length"> <el-timeline v-if="recentBriefs.length">
<el-timeline-item <el-timeline-item
@@ -46,8 +44,8 @@
:key="brief.brief_date" :key="brief.brief_date"
:timestamp="brief.brief_date" :timestamp="brief.brief_date"
> >
<el-link @click="$router.push(`/briefs/${brief.brief_date}`)"> <el-link @click="$router.push(`/briefs/${brief.brief_date}`)" style="color: var(--dc-primary);">
{{ brief.unique_articles }} 篇去重后文章 / {{ brief.total_articles }} 篇原始文章 {{ brief.unique_articles }} 篇去重后 / {{ brief.total_articles }} 篇原始
</el-link> </el-link>
</el-timeline-item> </el-timeline-item>
</el-timeline> </el-timeline>
@@ -61,11 +59,19 @@
<el-col :span="24"> <el-col :span="24">
<el-card class="dark-card"> <el-card class="dark-card">
<template #header> <template #header>
<span>定时任务状态</span> <span class="card-header-title">定时任务状态</span>
</template> </template>
<el-table :data="jobList" style="width: 100%"> <el-table :data="jobList" style="width: 100%">
<el-table-column prop="id" label="任务" /> <el-table-column prop="id" label="任务">
<el-table-column prop="next_run" label="下次执行时间" /> <template #default="{ row }">
<span style="font-weight: 500; color: var(--dc-text);">{{ jobNameMap[row.id] || row.id }}</span>
</template>
</el-table-column>
<el-table-column prop="next_run" label="下次执行时间">
<template #default="{ row }">
<span style="color: var(--dc-text-secondary);">{{ row.next_run }}</span>
</template>
</el-table-column>
</el-table> </el-table>
</el-card> </el-card>
</el-col> </el-col>
@@ -91,6 +97,13 @@ const statsData = ref({
const recentBriefs = ref([]) const recentBriefs = ref([])
const categoryDistribution = ref([]) const categoryDistribution = ref([])
const jobNameMap = {
fetch_and_summarize: 'AI 摘要生成',
tag_score_deduplicate: '分类 / 打分 / 去重',
generate_daily_brief: '每日简报生成',
bootstrap_taxonomy: '分类体系初始化',
}
const stats = computed(() => [ const stats = computed(() => [
{ label: '总加工文章', value: statsData.value.total_articles }, { label: '总加工文章', value: statsData.value.total_articles },
{ label: '今日文章', value: statsData.value.today_articles }, { label: '今日文章', value: statsData.value.today_articles },
@@ -119,14 +132,12 @@ const loadData = async () => {
statsData.value = statsRes statsData.value = statsRes
recentBriefs.value = briefsRes recentBriefs.value = briefsRes
// 计算分类分布
const categories = taxonomyRes.filter((t) => t.kind === 'category') const categories = taxonomyRes.filter((t) => t.kind === 'category')
const catMap = {} const catMap = {}
categories.forEach((c) => { categories.forEach((c) => {
catMap[c.name] = 0 catMap[c.name] = 0
}) })
// 从简报中聚合各分类文章数(取最近一份简报)
if (briefsRes.length > 0) { if (briefsRes.length > 0) {
const latestBrief = await datacleanApi.getBrief(briefsRes[0].brief_date) const latestBrief = await datacleanApi.getBrief(briefsRes[0].brief_date)
const byCategory = latestBrief.by_category || {} const byCategory = latestBrief.by_category || {}
@@ -150,3 +161,16 @@ const loadData = async () => {
onMounted(loadData) onMounted(loadData)
</script> </script>
<style scoped>
.stats-grid {
display: grid;
grid-template-columns: repeat(auto-fill, minmax(150px, 1fr));
gap: 16px;
}
.card-header-title {
font-weight: 600;
font-size: 15px;
color: var(--dc-text);
}
</style>
+359 -30
@@ -3,86 +3,233 @@
<h1 class="page-title">任务管理</h1> <h1 class="page-title">任务管理</h1>
<el-row :gutter="20"> <el-row :gutter="20">
<el-col :span="8" v-for="task in tasks" :key="task.id"> <el-col :xs="24" :sm="12" :md="8" :lg="6" v-for="task in tasks" :key="task.id">
<el-card class="dark-card" style="margin-bottom: 20px;"> <el-card class="dark-card task-card" style="margin-bottom: 20px;">
<template #header> <template #header>
<div class="task-header"> <div class="task-header">
<el-icon size="24"><component :is="task.icon" /></el-icon> <div class="task-icon-wrap">
<span>{{ task.title }}</span> <el-icon size="20"><component :is="task.icon" /></el-icon>
</div>
<span class="task-title">{{ task.title }}</span>
</div> </div>
</template> </template>
<p class="task-desc">{{ task.description }}</p> <p class="task-desc">{{ task.description }}</p>
<div v-if="task.nextRun" class="task-next-run"> <div v-if="task.nextRun" class="task-next-run">
下次执行{{ task.nextRun }} <el-icon size="14"><Timer /></el-icon>
<span>下次执行{{ task.nextRun }}</span>
</div> </div>
<!-- 进度块 -->
<div v-if="task.progress && task.progress.status !== 'idle'" class="task-progress">
<div class="progress-top">
<el-tag :type="statusTagType(task.progress.status)" size="small" effect="dark">
{{ statusLabel(task.progress.status) }}
</el-tag>
<span v-if="task.progress.trigger === 'scheduled'" class="trigger-tag">定时</span>
<span v-else-if="task.progress.trigger === 'manual'" class="trigger-tag">手动</span>
</div>
<div class="task-stage">
{{ task.progress.stage || '处理中' }}
<span v-if="task.progress.total > 0" class="task-counts">
{{ task.progress.current }}/{{ task.progress.total }}
</span>
</div>
<el-progress
v-if="task.progress.total > 0"
:percentage="progressPercent(task.progress)"
:status="progressStatus(task.progress.status)"
:stroke-width="8"
/>
<el-progress
v-else
:indeterminate="task.progress.status === 'running'"
:show-text="false"
:stroke-width="8"
:percentage="task.progress.status === 'running' ? 100 : 0"
:status="progressStatus(task.progress.status)"
/>
<div v-if="task.progress.status === 'error' && task.progress.message" class="task-error-msg">
{{ task.progress.message }}
</div>
<div v-else-if="task.progress.message" class="task-message">
{{ task.progress.message }}
</div>
</div>
<el-button <el-button
type="primary" type="primary"
style="margin-top: 16px;" style="margin-top: 16px; width: 100%;"
:loading="task.loading" :loading="task.loading"
:disabled="anyRunning"
@click="runTask(task)" @click="runTask(task)"
> >
立即执行 {{ task.progress && task.progress.status === 'running' ? '执行中…' : '立即执行' }}
</el-button> </el-button>
</el-card> </el-card>
</el-col> </el-col>
</el-row> </el-row>
<!-- External service connectivity test -->
<h1 class="page-title" style="margin-top: 32px;">接口测试</h1>
<el-card class="dark-card">
<template #header>
<div style="display: flex; justify-content: space-between; align-items: center;">
<span style="font-weight: 600;">外部服务连通性</span>
<el-button type="primary" :loading="testing" @click="runConnectionTest">
开始测试
</el-button>
</div>
</template>
<el-row :gutter="20">
<el-col :span="12">
<div class="test-result-card" :class="resultClass('rss_keeper')">
<div class="test-result-header">
<div class="task-icon-wrap">
<el-icon size="20"><Link /></el-icon>
</div>
<span class="test-result-title">rssKeeper API</span>
</div>
<div v-if="!connResults.rss_keeper" class="test-result-pending">
点击开始测试检测连通性
</div>
<div v-else-if="connResults.rss_keeper.status === 'ok'" class="test-result-success">
<el-icon color="var(--dc-success)" size="18"><CircleCheckFilled /></el-icon>
<span>连通正常</span>
<span class="test-latency">{{ connResults.rss_keeper.latency_ms }} ms</span>
</div>
<div v-else class="test-result-fail">
<div style="display: flex; align-items: center; gap: 6px;">
<el-icon color="var(--dc-danger)" size="18"><CircleCloseFilled /></el-icon>
<span>连接失败</span>
</div>
<div class="test-error-msg">{{ connResults.rss_keeper.error }}</div>
</div>
</div>
</el-col>
<el-col :span="12">
<div class="test-result-card" :class="resultClass('llm')">
<div class="test-result-header">
<div class="task-icon-wrap">
<el-icon size="20"><MagicStick /></el-icon>
</div>
<span class="test-result-title">LLM API</span>
</div>
<div v-if="!connResults.llm" class="test-result-pending">
点击开始测试检测连通性
</div>
<div v-else-if="connResults.llm.status === 'ok'" class="test-result-success">
<el-icon color="var(--dc-success)" size="18"><CircleCheckFilled /></el-icon>
<span>连通正常</span>
<span class="test-latency">{{ connResults.llm.latency_ms }} ms</span>
</div>
<div v-else class="test-result-fail">
<div style="display: flex; align-items: center; gap: 6px;">
<el-icon color="var(--dc-danger)" size="18"><CircleCloseFilled /></el-icon>
<span>连接失败</span>
</div>
<div class="test-error-msg">{{ connResults.llm.error }}</div>
</div>
</div>
</el-col>
</el-row>
</el-card>
</div> </div>
</template> </template>
<script setup> <script setup>
import { ref, onMounted } from 'vue' import { ref, computed, onMounted, onUnmounted } from 'vue'
import { ElMessage } from 'element-plus' import { ElMessage } from 'element-plus'
import { Document, CollectionTag, Collection } from '@element-plus/icons-vue' import { Document, CollectionTag, Collection, Timer, Link, MagicStick, CircleCheckFilled, CircleCloseFilled } from '@element-plus/icons-vue'
import { datacleanApi } from '@/api' import { datacleanApi } from '@/api'
const tasks = ref([ const tasks = ref([
{ {
id: 'summarize', id: 'summarize',
jobId: 'fetch_and_summarize',
title: '生成 AI 摘要', title: '生成 AI 摘要',
description: '拉取 rssKeeper 最近文章,为无摘要或短摘要文章生成 AI 摘要。', description: '拉取 rssKeeper 最近文章,为无摘要或短摘要文章生成 AI 摘要。',
icon: 'Document', icon: 'Document',
nextRun: '', nextRun: '',
loading: false, loading: false,
progress: null,
action: datacleanApi.summarize, action: datacleanApi.summarize,
}, },
{ {
id: 'tag_score_deduplicate', id: 'tag_score_dedup',
jobId: 'tag_score_deduplicate',
title: '分类 / 打分 / 去重', title: '分类 / 打分 / 去重',
description: '对当天文章进行分类、打标签、计算分数并生成重复组。', description: '对当天文章进行分类、打标签、计算分数并生成重复组。',
icon: 'CollectionTag', icon: 'CollectionTag',
nextRun: '', nextRun: '',
loading: false, loading: false,
progress: null,
action: datacleanApi.tagScoreDedup, action: datacleanApi.tagScoreDedup,
}, },
{ {
id: 'generate_daily_brief', id: 'generate_daily_brief',
jobId: 'generate_daily_brief',
title: '生成每日简报', title: '生成每日简报',
description: '基于当天去重后的代表文章生成每日简报。', description: '基于当天去重后的代表文章生成每日简报。',
icon: 'Collection', icon: 'Collection',
nextRun: '', nextRun: '',
loading: false, loading: false,
progress: null,
action: datacleanApi.generateBrief, action: datacleanApi.generateBrief,
}, },
{
id: 'bootstrap_taxonomy',
jobId: 'bootstrap_taxonomy',
title: '初始化分类体系',
description: 'AI 根据样本文章生成分类、标签和打分规则(服务首次启动时自动执行)。',
icon: 'CollectionTag',
nextRun: '',
loading: false,
progress: null,
action: datacleanApi.bootstrapTaxonomy,
},
]) ])
const loadStats = async () => { const anyRunning = computed(() =>
tasks.value.some((t) => t.progress?.status === 'running')
)
// Connectivity test state
const testing = ref(false)
const connResults = ref({ rss_keeper: null, llm: null })
// Progress polling
let progressTimer = null
const fetchProgress = async () => {
try { try {
const stats = await datacleanApi.getStats() const data = await datacleanApi.getTaskProgress()
const nextJobs = stats.next_jobs || {} tasks.value.forEach((t) => {
tasks.value.forEach((task) => { t.progress = data[t.id] || null
task.nextRun = nextJobs[task.id] || '未调度'
}) })
} catch (err) { // 无任何任务运行时停止轮询
ElMessage.error(err.message) if (!anyRunning.value && progressTimer) {
clearInterval(progressTimer)
progressTimer = null
}
} catch (e) {
// Fail silently; the next polling tick retries
}
}
const startPolling = () => {
if (!progressTimer) {
fetchProgress()
progressTimer = setInterval(fetchProgress, 1500)
} }
} }
const runTask = async (task) => { const runTask = async (task) => {
task.loading = true task.loading = true
try { try {
const res = await task.action() const res = await task.action() // 后台执行,立即返回
ElMessage.success(res.message) ElMessage.success(res.message)
loadStats() startPolling()
} catch (err) { } catch (err) {
ElMessage.error(err.message) ElMessage.error(err.message)
} finally { } finally {
@@ -90,27 +237,209 @@ const runTask = async (task) => {
} }
} }
onMounted(loadStats) const statusTagType = (status) => ({
running: 'warning', success: 'success', error: 'danger', idle: 'info',
}[status] || 'info')
const statusLabel = (status) => ({
running: '执行中', success: '已完成', error: '失败', idle: '空闲',
}[status] || status)
const progressPercent = (p) => {
if (!p || !p.total) return 0
return Math.min(100, Math.round((p.current / p.total) * 100))
}
const progressStatus = (status) => {
if (status === 'success') return 'success'
if (status === 'error') return 'exception'
return null // running/idle 默认
}
const resultClass = (key) => {
if (!connResults.value[key]) return ''
return connResults.value[key].status === 'ok' ? 'test-ok' : 'test-err'
}
const runConnectionTest = async () => {
testing.value = true
connResults.value = { rss_keeper: null, llm: null }
try {
connResults.value = await datacleanApi.testConnection()
} catch (err) {
ElMessage.error(err.message)
} finally {
testing.value = false
}
}
const loadStats = async () => {
try {
const stats = await datacleanApi.getStats()
const nextJobs = stats.next_jobs || {}
tasks.value.forEach((task) => {
task.nextRun = nextJobs[task.jobId] || nextJobs[task.id] || '未调度'
})
} catch (err) {
ElMessage.error(err.message)
}
}
onMounted(async () => {
await loadStats()
await fetchProgress()
if (anyRunning.value) startPolling()
})
onUnmounted(() => {
if (progressTimer) {
clearInterval(progressTimer)
progressTimer = null
}
})
</script> </script>
<style scoped> <style scoped>
.task-header { .task-header {
display: flex; display: flex;
align-items: center; align-items: center;
gap: 10px; gap: 12px;
font-size: 16px; }
font-weight: 600; .task-icon-wrap {
width: 36px;
height: 36px;
border-radius: 8px;
background: var(--dc-primary-dim);
display: flex;
align-items: center;
justify-content: center;
color: var(--dc-primary);
}
.task-title {
font-size: 15px;
font-weight: 600;
color: var(--dc-text);
} }
.task-desc { .task-desc {
color: var(--dc-text-secondary); color: var(--dc-text-secondary);
line-height: 1.6; line-height: 1.6;
min-height: 60px; min-height: 48px;
}
.task-next-run {
margin-top: 12px;
color: var(--dc-text-secondary);
font-size: 13px; font-size: 13px;
} }
.task-next-run {
margin-top: 12px;
color: var(--dc-text-muted);
font-size: 12px;
display: flex;
align-items: center;
gap: 4px;
}
/* Progress block */
.task-progress {
margin-top: 14px;
padding: 12px;
background: var(--dc-surface-raised);
border-radius: var(--dc-radius-sm);
border: 1px solid var(--dc-border);
}
.progress-top {
display: flex;
align-items: center;
gap: 8px;
margin-bottom: 8px;
}
.trigger-tag {
font-size: 11px;
color: var(--dc-text-muted);
padding: 1px 6px;
border: 1px solid var(--dc-border-light);
border-radius: 4px;
}
.task-stage {
font-size: 13px;
color: var(--dc-text);
font-weight: 500;
margin-bottom: 8px;
}
.task-counts {
color: var(--dc-text-secondary);
font-weight: 400;
}
.task-message {
margin-top: 8px;
color: var(--dc-text-secondary);
font-size: 12px;
line-height: 1.5;
}
.task-error-msg {
margin-top: 8px;
color: var(--dc-text-secondary);
font-size: 12px;
word-break: break-all;
line-height: 1.5;
padding: 8px;
background: rgba(248, 113, 113, 0.06);
border-radius: var(--dc-radius-sm);
}
/* Connectivity test cards */
.test-result-card {
background: var(--dc-surface-raised);
border: 1px solid var(--dc-border);
border-radius: var(--dc-radius);
padding: 20px;
min-height: 130px;
transition: border-color 0.3s;
}
.test-result-card.test-ok {
border-color: var(--dc-success);
}
.test-result-card.test-err {
border-color: var(--dc-danger);
}
.test-result-header {
display: flex;
align-items: center;
gap: 12px;
margin-bottom: 16px;
}
.test-result-title {
font-size: 15px;
font-weight: 600;
color: var(--dc-text);
}
.test-result-pending {
color: var(--dc-text-muted);
font-size: 13px;
}
.test-result-success {
display: flex;
align-items: center;
gap: 8px;
color: var(--dc-success);
font-size: 14px;
font-weight: 500;
}
.test-result-fail {
color: var(--dc-danger);
font-size: 14px;
font-weight: 500;
}
.test-error-msg {
margin-top: 10px;
color: var(--dc-text-secondary);
font-size: 12px;
word-break: break-all;
line-height: 1.5;
padding: 10px;
background: rgba(248, 113, 113, 0.06);
border-radius: var(--dc-radius-sm);
}
.test-latency {
margin-left: auto;
color: var(--dc-text-secondary);
font-size: 13px;
font-weight: 400;
}
</style> </style>
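Review note: the Tasks page polls `GET /api/tasks/progress` every 1.5 s and stops once no task reports `running`. The same loop can be sketched for a script client as below. This is only an illustration: `fetch` is a hypothetical callable that returns the parsed JSON of the progress endpoint, and `poll_until_idle`, `any_running` and `progress_percent` are names made up for this note, not part of the commit.

```python
import math
import time
from typing import Callable, Dict


def any_running(progress: Dict[str, dict]) -> bool:
    """True if at least one task entry reports status 'running'."""
    return any((entry or {}).get("status") == "running" for entry in progress.values())


def progress_percent(entry: dict) -> int:
    """Clamp current/total to 0..100, rounding half up like JavaScript's Math.round."""
    total = entry.get("total") or 0
    if total <= 0:
        return 0
    current = entry.get("current") or 0
    return min(100, math.floor(current / total * 100 + 0.5))


def poll_until_idle(
    fetch: Callable[[], Dict[str, dict]],
    interval: float = 1.5,
    max_polls: int = 1000,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, dict]:
    """Call fetch() repeatedly until no task is running, then return the last snapshot."""
    snapshot = fetch()
    polls = 1
    while any_running(snapshot) and polls < max_polls:
        sleep(interval)
        snapshot = fetch()
        polls += 1
    return snapshot
```

The `max_polls` cap is an added safety bound for scripts; the page itself has no such limit and relies on `onUnmounted` to clear its timer.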
+8 -17
@@ -7,8 +7,9 @@
title="分类体系在首次启动时由 AI 根据样本文章生成,后续可通过编辑数据库调整。" title="分类体系在首次启动时由 AI 根据样本文章生成,后续可通过编辑数据库调整。"
type="info" type="info"
:closable="false" :closable="false"
show-icon
/> />
<div style="margin-top: 16px;"> <div style="margin-top: 16px; display: flex; gap: 10px;">
<el-button type="primary" @click="bootstrap(false)" :loading="bootstrapping"> <el-button type="primary" @click="bootstrap(false)" :loading="bootstrapping">
检查/初始化分类体系 检查/初始化分类体系
</el-button> </el-button>
@@ -18,7 +19,8 @@
</div> </div>
</el-card> </el-card>
<el-tabs v-model="activeTab" class="dark-tabs"> <div class="taxonomy-tabs-wrap">
<el-tabs v-model="activeTab">
<el-tab-pane label="分类" name="category"> <el-tab-pane label="分类" name="category">
<TaxonomyTable :data="taxonomyByKind.category" /> <TaxonomyTable :data="taxonomyByKind.category" />
</el-tab-pane> </el-tab-pane>
@@ -36,6 +38,7 @@
</el-tab-pane> </el-tab-pane>
</el-tabs> </el-tabs>
</div> </div>
</div>
</template> </template>
<script setup> <script setup>
@@ -89,22 +92,10 @@ onMounted(loadTaxonomy)
</script> </script>
<style scoped> <style scoped>
.dark-tabs { .taxonomy-tabs-wrap {
background: var(--dc-card-bg); background: var(--dc-surface);
border: 1px solid var(--dc-border); border: 1px solid var(--dc-border);
border-radius: 8px; border-radius: var(--dc-radius);
padding: 20px; padding: 20px;
} }
.dark-tabs :deep(.el-tabs__item) {
color: var(--dc-text-secondary);
}
.dark-tabs :deep(.el-tabs__item.is-active) {
color: var(--dc-primary);
}
.dark-tabs :deep(.el-tabs__active-bar) {
background-color: var(--dc-primary);
}
</style> </style>
+157 -27
@@ -1,6 +1,7 @@
"""dataClean FastAPI 入口""" """dataClean FastAPI 入口"""
import logging import logging
import os import os
import threading
from contextlib import asynccontextmanager from contextlib import asynccontextmanager
from datetime import datetime, timedelta, timezone from datetime import datetime, timedelta, timezone
from typing import Optional, List from typing import Optional, List
@@ -8,6 +9,7 @@ from typing import Optional, List
from fastapi import FastAPI, Depends, HTTPException, Query, Body, Security, status from fastapi import FastAPI, Depends, HTTPException, Query, Body, Security, status
from fastapi.middleware.cors import CORSMiddleware from fastapi.middleware.cors import CORSMiddleware
from fastapi.staticfiles import StaticFiles from fastapi.staticfiles import StaticFiles
from fastapi.responses import FileResponse, Response
from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials from fastapi.security import HTTPBearer, HTTPAuthorizationCredentials
from pydantic import BaseModel, ConfigDict from pydantic import BaseModel, ConfigDict
from sqlalchemy.orm import Session from sqlalchemy.orm import Session
@@ -15,7 +17,10 @@ from sqlalchemy.orm import Session
from config import settings from config import settings
from database import init_db, get_db, SessionLocal from database import init_db, get_db, SessionLocal
from scheduler import init_scheduler, stop_scheduler, get_scheduler, get_task_lock from scheduler import init_scheduler, stop_scheduler, get_scheduler, get_task_lock
from app.taxonomy import bootstrap_taxonomy, list_taxonomy, ensure_taxonomy from app.taxonomy import bootstrap_taxonomy, list_taxonomy
from app.rss_client import rss_client, RSSKeeperClient
from app.ai_client import ai_client, AIClient
from app import task_progress
from app.summarizer import fetch_and_summarize from app.summarizer import fetch_and_summarize
from app.tagger import tag_articles from app.tagger import tag_articles
from app.deduplicator import deduplicate_articles from app.deduplicator import deduplicate_articles
@@ -69,16 +74,39 @@ def verify_token(credentials: Optional[HTTPAuthorizationCredentials] = Security(
return credentials.credentials return credentials.credentials
def _run_task_locked(func, db: Session): def _run_task_background(task_key: str, trigger: str, fn) -> bool:
"""带互斥锁执行任务""" """
acquired = get_task_lock().acquire(blocking=False) 将任务提交到后台线程执行,立即返回。
if not acquired: 请求线程非阻塞获取 _task_lock(失败返回 False → 调用方抛 409),
raise HTTPException(status_code=409, detail="已有任务正在执行,请稍后再试") 并把锁所有权交给后台 worker。worker 内创建独立 SessionLocal
上报进度,执行 fn(db),最终释放锁。无 TOCTOU 窗口。
"""
if not get_task_lock().acquire(blocking=False):
return False # 锁被占用,调用方抛 409
def _worker():
db = SessionLocal()
task_progress.update_progress(
task_key, status="running", trigger=trigger,
stage="初始化", current=0, total=0, message=None,
)
try: try:
return func(db) fn(db)
task_progress.update_progress(
task_key, status="success", stage="完成", message="任务执行成功"
)
except Exception as exc:
logger.error("后台任务 %s 失败: %s", task_key, exc, exc_info=True)
task_progress.update_progress(
task_key, status="error", stage="失败", message=str(exc)[:500]
)
finally: finally:
db.close()
get_task_lock().release() get_task_lock().release()
threading.Thread(target=_worker, name=f"task-{task_key}", daemon=True).start()
return True
@asynccontextmanager @asynccontextmanager
async def lifespan(app: FastAPI): async def lifespan(app: FastAPI):
@@ -92,8 +120,8 @@ async def lifespan(app: FastAPI):
init_default_settings(db) init_default_settings(db)
# 用数据库配置覆盖全局 settings # 用数据库配置覆盖全局 settings
apply_db_settings_to_config(db) apply_db_settings_to_config(db)
# 首次启动时确保 taxonomy 表存在 # 注意:taxonomy 初始化交由 scheduler 的 bootstrap job 后台异步执行,
ensure_taxonomy(db) # 避免在启动时同步调用 LLM 阻塞服务就绪(进度可在前端实时查看)。
except Exception as exc: except Exception as exc:
logger.error("启动初始化失败: %s", exc) logger.error("启动初始化失败: %s", exc)
finally: finally:
@@ -138,7 +166,7 @@ class ArticleOut(BaseModel):
composite_score: float composite_score: float
ai_summary: str ai_summary: str
is_representative: bool is_representative: bool
published_at: Optional[str] published_at: Optional[datetime]
model_config = ConfigDict(from_attributes=True) model_config = ConfigDict(from_attributes=True)
@@ -199,6 +227,18 @@ class StatsOut(BaseModel):
next_jobs: dict next_jobs: dict
class ConnectionTestResult(BaseModel):
name: str
status: str
latency_ms: Optional[float] = None
error: Optional[str] = None
class ConnectionTestResponse(BaseModel):
rss_keeper: ConnectionTestResult
llm: ConnectionTestResult
# ---------- 健康检查 ---------- # ---------- 健康检查 ----------
@app.get("/health") @app.get("/health")
@@ -292,42 +332,108 @@ def get_taxonomy(kind: Optional[str] = Query(None), db: Session = Depends(get_db
@app.post("/api/taxonomy/bootstrap") @app.post("/api/taxonomy/bootstrap")
def trigger_taxonomy_bootstrap( def trigger_taxonomy_bootstrap(
force: bool = False, force: bool = False,
db: Session = Depends(get_db),
_=Depends(verify_token), _=Depends(verify_token),
): ):
ok = bootstrap_taxonomy(db, force=force) def _run(session):
ok = bootstrap_taxonomy(session, force=force)
if not ok: if not ok:
return {"message": "taxonomy 已存在或初始化失败,请检查日志"} raise RuntimeError("taxonomy 已存在或初始化失败,请检查日志")
return {"message": "taxonomy 初始化成功"} if not _run_task_background("bootstrap_taxonomy", "manual", _run):
raise HTTPException(status_code=409, detail="已有任务正在执行,请稍后再试")
return {"message": "taxonomy 初始化已开始", "task_key": "bootstrap_taxonomy"}
# ---------- 手动触发任务接口 ---------- # ---------- 手动触发任务接口 ----------
# ---------- 手动触发任务接口(后台执行,立即返回,前端轮询进度) ----------
@app.post("/api/tasks/summarize") @app.post("/api/tasks/summarize")
def task_summarize(db: Session = Depends(get_db), _=Depends(verify_token)): def task_summarize(_=Depends(verify_token)):
stats = _run_task_locked(lambda session: fetch_and_summarize(session, hours=24, limit=200), db) def _run(session):
return {"message": "摘要任务完成", "stats": stats} fetch_and_summarize(session, hours=24, limit=200)
if not _run_task_background("summarize", "manual", _run):
raise HTTPException(status_code=409, detail="已有任务正在执行,请稍后再试")
return {"message": "摘要任务已开始", "task_key": "summarize"}
@app.post("/api/tasks/tag-score-dedup") @app.post("/api/tasks/tag-score-dedup")
def task_tag_score_dedup(db: Session = Depends(get_db), _=Depends(verify_token)): def task_tag_score_dedup(_=Depends(verify_token)):
def _run(session): def _run(session):
tag_articles(session) tag_articles(session)
today = datetime.now(timezone.utc).strftime("%Y-%m-%d") today = datetime.now(timezone.utc).strftime("%Y-%m-%d")
deduplicate_articles(session, date_str=today) deduplicate_articles(session, date_str=today)
score_articles(session, update_duplication=True) score_articles(session, update_duplication=True)
return None if not _run_task_background("tag_score_dedup", "manual", _run):
_run_task_locked(_run, db) raise HTTPException(status_code=409, detail="已有任务正在执行,请稍后再试")
return {"message": "分类/去重/打分任务完成"} return {"message": "分类/去重/打分任务已开始", "task_key": "tag_score_dedup"}
@app.post("/api/tasks/brief") @app.post("/api/tasks/brief")
def task_brief(db: Session = Depends(get_db), _=Depends(verify_token)): def task_brief(_=Depends(verify_token)):
def _run(session): def _run(session):
today = datetime.now(timezone.utc).strftime("%Y-%m-%d") today = datetime.now(timezone.utc).strftime("%Y-%m-%d")
return generate_daily_brief(session, date_str=today, force=True) generate_daily_brief(session, date_str=today, force=True)
data = _run_task_locked(_run, db) if not _run_task_background("generate_daily_brief", "manual", _run):
return {"message": "简报生成任务完成", "data": data} raise HTTPException(status_code=409, detail="已有任务正在执行,请稍后再试")
return {"message": "简报生成任务已开始", "task_key": "generate_daily_brief"}
@app.get("/api/tasks/progress")
def get_task_progress(_=Depends(verify_token)):
"""Return live progress for all tasks (polled by the frontend)."""
return task_progress.get_progress()
@app.post("/api/tasks/progress/reset")
def reset_task_progress(task_key: str = Query(...), _=Depends(verify_token)):
"""Reset the displayed progress of the given task to idle."""
task_progress.reset_progress(task_key)
return {"message": "已重置"}
# ---------- 接口连通性测试 ----------
@app.post("/api/test-connection", response_model=ConnectionTestResponse)
def test_connection(_=Depends(verify_token)):
"""Test connectivity to the rssKeeper and LLM APIs; return status and latency."""
import time
# rssKeeper connectivity test (short timeout to avoid long waits)
rss_result = {"name": "rssKeeper", "status": "error", "latency_ms": None, "error": None}
try:
t0 = time.monotonic()
# Use a temporary client with a short timeout for the test
test_client = RSSKeeperClient(base_url=settings.RSSKEEPER_BASE_URL, timeout=10)
test_client._get("/api/v1/external/feeds", params={"limit": 1})
rss_result = {
"name": "rssKeeper",
"status": "ok",
"latency_ms": round((time.monotonic() - t0) * 1000, 1),
"error": None,
}
except Exception as exc:
rss_result["error"] = str(exc)[:200]
# LLM connectivity test (short timeout, no retries)
llm_result = {"name": "LLM", "status": "error", "latency_ms": None, "error": None}
try:
t0 = time.monotonic()
test_ai = AIClient(timeout=10, max_retries=0)
test_ai.chat_completion(
system_prompt="You are a connectivity test.",
user_prompt="Reply with exactly: ok",
temperature=0.0,
)
llm_result = {
"name": "LLM",
"status": "ok",
"latency_ms": round((time.monotonic() - t0) * 1000, 1),
"error": None,
}
except Exception as exc:
llm_result["error"] = str(exc)[:200]
return {"rss_keeper": rss_result, "llm": llm_result}
# ---------- 配置管理接口 ---------- # ---------- 配置管理接口 ----------
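Review note: the two probes in `test_connection` repeat the same pattern (start a monotonic timer, call once, report `ok` with latency or `error` with a truncated message). One way to factor that out is sketched below. `probe` and `ProbeResult` are hypothetical names for illustration; the dictionary keys follow the `ConnectionTestResult` model in this diff.

```python
import time
from typing import Callable, Optional, TypedDict


class ProbeResult(TypedDict):
    name: str
    status: str
    latency_ms: Optional[float]
    error: Optional[str]


def probe(name: str, call: Callable[[], object], max_error_len: int = 200) -> ProbeResult:
    """Run call() once and report ok/error plus latency in milliseconds."""
    t0 = time.monotonic()  # monotonic clock: unaffected by wall-clock adjustments
    try:
        call()
    except Exception as exc:
        # Truncate the message so a long upstream error body cannot bloat the response.
        return {"name": name, "status": "error", "latency_ms": None,
                "error": str(exc)[:max_error_len]}
    elapsed_ms = round((time.monotonic() - t0) * 1000, 1)
    return {"name": name, "status": "ok", "latency_ms": elapsed_ms, "error": None}
```

With such a helper, each probe becomes a single call that passes a lambda wrapping the client request, and both results keep an identical shape.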
@@ -408,7 +514,7 @@ def get_stats(db: Session = Depends(get_db)):
} }
# ---------- 静态文件托管(生产环境) ---------- # ---------- 静态文件托管(生产环境 SPA ----------
static_dir = os.path.join(os.path.dirname(__file__), "static") static_dir = os.path.join(os.path.dirname(__file__), "static")
if not os.path.isdir(static_dir): if not os.path.isdir(static_dir):
@@ -418,7 +524,31 @@ if not os.path.isdir(static_dir):
static_dir = frontend_dist static_dir = frontend_dist
if os.path.isdir(static_dir): if os.path.isdir(static_dir):
app.mount("/", StaticFiles(directory=static_dir, html=True), name="static") # 静态资源(JS/CSS/图片等)走 /assets 子路径挂载
assets_dir = os.path.join(static_dir, "assets")
if os.path.isdir(assets_dir):
app.mount("/assets", StaticFiles(directory=assets_dir), name="assets")
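# Root-level static files of the SPA such as favicon.ico and vite.svg
from starlette.requests import Request  # imported before the def so FastAPI can resolve the annotation
@app.get("/favicon.ico")
@app.get("/vite.svg")
async def serve_static_root(request: Request):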
filename = os.path.basename(str(request.url.path))
file_path = os.path.join(static_dir, filename)
if os.path.isfile(file_path):
return FileResponse(file_path)
return Response(status_code=404)
# Any route not matched above returns index.html (SPA client-side routing)
@app.get("/{full_path:path}")
async def serve_spa(full_path: str):
# Try to serve a matching static file first
file_path = os.path.join(static_dir, full_path)
if full_path and os.path.isfile(file_path):
return FileResponse(file_path)
# Otherwise return index.html and let Vue Router handle the path
return FileResponse(os.path.join(static_dir, "index.html"))
if __name__ == "__main__": if __name__ == "__main__":
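Review note: `serve_spa` joins the raw request path onto `static_dir`, so a path containing `..` segments could in principle resolve outside the static directory. A hedged sketch of a containment check is shown below; `resolve_static_file` is a hypothetical helper, not part of this commit, and whether such paths can actually reach the handler depends on the server and any proxy in front of it.

```python
import os
from typing import Optional


def resolve_static_file(static_dir: str, requested: str) -> Optional[str]:
    """Return the real path of a file inside static_dir, or None if absent or outside it."""
    root = os.path.realpath(static_dir)
    candidate = os.path.realpath(os.path.join(root, requested))
    try:
        inside = os.path.commonpath([root, candidate]) == root
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        inside = False
    if not inside:
        return None  # e.g. "../secret.txt" or an absolute path
    return candidate if os.path.isfile(candidate) else None
```

A handler could return `FileResponse(path)` when the helper yields a path and fall back to `index.html` otherwise, which keeps the existing SPA behaviour for unknown routes.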
+2 -2
@@ -48,7 +48,7 @@ class EnrichedArticle(Base):
created_at = Column(DateTime, default=_utc_now) created_at = Column(DateTime, default=_utc_now)
updated_at = Column(DateTime, default=_utc_now, onupdate=_utc_now) updated_at = Column(DateTime, default=_utc_now, onupdate=_utc_now)
duplicate_group = relationship("DuplicateGroup", back_populates="articles") duplicate_group = relationship("DuplicateGroup", back_populates="articles", foreign_keys=[duplicate_group_id])
class Taxonomy(Base): class Taxonomy(Base):
@@ -78,7 +78,7 @@ class DuplicateGroup(Base):
brief_date = Column(String(10), default="", index=True) brief_date = Column(String(10), default="", index=True)
created_at = Column(DateTime, default=_utc_now) created_at = Column(DateTime, default=_utc_now)
articles = relationship("EnrichedArticle", back_populates="duplicate_group") articles = relationship("EnrichedArticle", back_populates="duplicate_group", foreign_keys="EnrichedArticle.duplicate_group_id")
class DailyBrief(Base): class DailyBrief(Base):
+24 -1
@@ -19,6 +19,7 @@ from app.deduplicator import deduplicate_articles
from app.scorer import score_articles from app.scorer import score_articles
from app.brief import generate_daily_brief from app.brief import generate_daily_brief
from app.settings_manager import get_setting_value from app.settings_manager import get_setting_value
from app import task_progress
logger = logging.getLogger(__name__) logger = logging.getLogger(__name__)
@@ -27,6 +28,14 @@ _scheduler: BackgroundScheduler | None = None
# 任务互斥锁:防止手动任务与定时任务并发执行 # 任务互斥锁:防止手动任务与定时任务并发执行
_task_lock = threading.Lock() _task_lock = threading.Lock()
# Scheduled job function name -> progress key mapping
_JOB_TASK_KEYS = {
"job_fetch_and_summarize": "summarize",
"job_tag_score_deduplicate": "tag_score_dedup",
"job_generate_daily_brief": "generate_daily_brief",
"job_bootstrap_taxonomy": "bootstrap_taxonomy",
}
def get_scheduler() -> BackgroundScheduler: def get_scheduler() -> BackgroundScheduler:
global _scheduler global _scheduler
@@ -48,18 +57,32 @@ def get_task_lock():
def _with_db(func): def _with_db(func):
"""装饰器:为任务函数提供数据库会话,并记录运行日志""" """装饰器:为任务函数提供数据库会话,并记录运行日志,同时上报进度"""
@functools.wraps(func) @functools.wraps(func)
def wrapper(): def wrapper():
acquired = _task_lock.acquire(blocking=False) acquired = _task_lock.acquire(blocking=False)
if not acquired: if not acquired:
logger.warning("定时任务 %s 跳过:已有其他任务正在执行", func.__name__) logger.warning("定时任务 %s 跳过:已有其他任务正在执行", func.__name__)
return return
task_key = _JOB_TASK_KEYS.get(func.__name__)
db = SessionLocal() db = SessionLocal()
if task_key:
task_progress.update_progress(
task_key, status="running", trigger="scheduled",
stage="初始化", current=0, total=0, message=None,
)
try: try:
func(db) func(db)
if task_key:
task_progress.update_progress(
task_key, status="success", stage="完成", message="定时任务执行成功"
)
except Exception as exc: except Exception as exc:
logger.error("定时任务 %s 执行失败: %s", func.__name__, exc, exc_info=True) logger.error("定时任务 %s 执行失败: %s", func.__name__, exc, exc_info=True)
if task_key:
task_progress.update_progress(
task_key, status="error", stage="失败", message=str(exc)[:500]
)
finally: finally:
db.close() db.close()
_task_lock.release() _task_lock.release()
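Review note: both the scheduler wrapper and the background worker report through `task_progress.update_progress(...)`, and the API reads via `get_progress()` and `reset_progress()`. The module itself is not shown in this part of the diff, so the following is only a guess at what a thread-safe registry with those three functions could look like. The field set and the idle defaults are assumptions taken from the calls visible here.

```python
import copy
import threading
from typing import Any, Dict

# Assumed default shape of one task entry; the real module may differ.
_IDLE: Dict[str, Any] = {
    "status": "idle", "trigger": None, "stage": "",
    "current": 0, "total": 0, "message": None,
}
_lock = threading.Lock()
_registry: Dict[str, Dict[str, Any]] = {}


def update_progress(task_key: str, **fields: Any) -> None:
    """Merge the given fields into the task's entry, creating it if needed."""
    with _lock:
        entry = _registry.setdefault(task_key, dict(_IDLE))
        entry.update(fields)


def get_progress() -> Dict[str, Dict[str, Any]]:
    """Return a deep copy so callers cannot mutate the shared state."""
    with _lock:
        return copy.deepcopy(_registry)


def reset_progress(task_key: str) -> None:
    """Put the task's entry back to the idle defaults."""
    with _lock:
        _registry[task_key] = dict(_IDLE)
```

Merging partial updates means a final `status="success"` call keeps the last `current` and `total` values, which matches how the wrappers above only pass `status`, `stage` and `message` on completion.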