feat: update project progress, complete Phase 4 and Phase 5, add monitoring and health checks
68
.dockerignore
Normal file
@@ -0,0 +1,68 @@
|
||||
# Git
.git
.gitignore

# Python
__pycache__
*.py[cod]
*$py.class
*.so
.Python
.venv
venv
env
.eggs
*.egg-info
.installed.cfg
*.egg

# IDE
.vscode
.idea
*.swp
*.swo
*~

# Testing
.pytest_cache
.coverage
htmlcov
.tox
.nox

# Mypy
.mypy_cache

# Documentation
docs/_build
*.md
!README.md

# Local config
.env
.env.local
config.local.json5
secrets/

# Logs and data
*.log
logs/
*.db
*.sqlite
*.sqlite3
data/

# PoC files
poc/
PoC*.md

# Development files
*.tmp
tmp/
temp/

# Tests (optional - include if you want tests in image)
# tests/

# Claude config
.claude/
|
||||
73
Dockerfile
Normal file
@@ -0,0 +1,73 @@
|
||||
# MineNASAI Dockerfile
# Multi-stage build to keep the image small

# ==================== Build stage ====================
FROM python:3.13-slim AS builder

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PIP_NO_CACHE_DIR=1 \
    PIP_DISABLE_PIP_VERSION_CHECK=1

WORKDIR /app

# Install build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

# Copy project files
COPY pyproject.toml ./
COPY src/ ./src/

# Install dependencies into a virtual environment
RUN python -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
RUN pip install --upgrade pip && \
    pip install .

# ==================== Runtime stage ====================
FROM python:3.13-slim AS runtime

# Security: create a non-root user
RUN groupadd --gid 1000 minenasai && \
    useradd --uid 1000 --gid minenasai --shell /bin/bash --create-home minenasai

# Set environment variables
ENV PYTHONDONTWRITEBYTECODE=1 \
    PYTHONUNBUFFERED=1 \
    PATH="/opt/venv/bin:$PATH" \
    MINENASAI_ENV=production

WORKDIR /app

# Install runtime dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    openssh-client \
    curl \
    && rm -rf /var/lib/apt/lists/*

# Copy the virtual environment from the build stage
COPY --from=builder /opt/venv /opt/venv

# Copy application code
COPY --chown=minenasai:minenasai src/ ./src/
COPY --chown=minenasai:minenasai config/ ./config/

# Create data directories
RUN mkdir -p /app/data /app/logs && \
    chown -R minenasai:minenasai /app/data /app/logs

# Switch to the non-root user
USER minenasai

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD curl -f http://localhost:8000/health/live || exit 1

# Expose ports
EXPOSE 8000 8080

# Startup command
CMD ["python", "-m", "uvicorn", "minenasai.gateway.server:app", "--host", "0.0.0.0", "--port", "8000"]
|
||||
157
README.md
@@ -1,16 +1,22 @@
|
||||
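 # MineNASAI

-A NAS-based intelligent personal AI assistant with WeCom/Feishu messaging and integrated Claude coding capabilities.
+A NAS-based intelligent personal AI assistant with WeCom/Feishu messaging and integrated multi-LLM coding capabilities.

+[](https://python.org)
+[](tests/)
+[](LICENSE)
+
 ## Features

+- **Multi-LLM support**: Anthropic Claude, OpenAI, DeepSeek, Zhipu, MiniMax, Moonshot, Gemini
 - **Multi-channel messaging**: WeCom and Feishu integration
 - **Smart routing**: automatically assesses task complexity and picks the best handling path
 - **Dual interface modes**:
   - Messaging tools: everyday interaction, simple tasks
   - Web TUI: in-depth programming, complex projects
-- **Security isolation**: Python sandbox execution, tiered permissions
-- **Extensible**: MCP Server plugin support
+- **Security isolation**: Python sandbox execution, tiered permissions, confirmation mechanism
+- **Production ready**: health checks, monitoring metrics, Docker deployment
+- **Extensible**: tool registry, Cron scheduled tasks

 ## Quick Start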
|
||||
|
||||
@@ -24,13 +30,13 @@
|
||||
|
||||
```bash
|
||||
# 克隆项目
|
||||
git clone https://github.com/minenasai/minenasai.git
|
||||
cd minenasai
|
||||
git clone http://jiulu-gameplay.com.cn:13001/congsh/MineNasAI.git
|
||||
cd MineNasAI
|
||||
|
||||
# 创建虚拟环境
|
||||
python -m venv .venv
|
||||
source .venv/bin/activate # Linux/macOS
|
||||
# .venv\Scripts\activate # Windows
|
||||
.venv\Scripts\activate # Windows
|
||||
# source .venv/bin/activate # Linux/macOS
|
||||
|
||||
# 安装依赖
|
||||
pip install -e ".[dev]"
|
||||
@@ -45,39 +51,70 @@ pre-commit install
|
||||
# 复制环境变量模板
|
||||
cp .env.example .env
|
||||
|
||||
# 编辑 .env 文件,填入 API Key 等配置
|
||||
# ANTHROPIC_API_KEY=sk-ant-xxxxx
|
||||
|
||||
# 初始化配置文件
|
||||
minenasai config --init
|
||||
# 编辑 .env 文件,填入 API Key
|
||||
# MINENASAI_ANTHROPIC_API_KEY=sk-ant-xxxxx
|
||||
# MINENASAI_DEEPSEEK_API_KEY=sk-xxxxx
|
||||
```
|
||||
|
||||
### 运行
|
||||
|
||||
```bash
|
||||
# 启动 Gateway 服务
|
||||
minenasai server --port 8000
|
||||
python -m uvicorn minenasai.gateway.server:app --port 8000
|
||||
|
||||
# 启动 Web TUI 服务(另一个终端)
|
||||
minenasai webtui --port 8080
|
||||
python -m uvicorn minenasai.webtui.server:app --port 8080
|
||||
```
|
||||
|
||||
### Docker 部署
|
||||
|
||||
```bash
|
||||
# 构建并启动
|
||||
docker-compose up -d
|
||||
|
||||
# 查看日志
|
||||
docker-compose logs -f gateway
|
||||
|
||||
# 停止服务
|
||||
docker-compose down
|
||||
```
|
||||
|
||||
## 项目结构
|
||||
|
||||
```
|
||||
minenasai/
|
||||
MineNasAI/
|
||||
├── src/minenasai/
|
||||
│ ├── core/ # 核心模块(配置、日志)
|
||||
│ ├── core/ # 核心模块
|
||||
│ │ ├── config.py # 配置管理
|
||||
│ │ ├── logging.py # 日志系统
|
||||
│ │ ├── database.py # 数据库
|
||||
│ │ ├── monitoring.py # 监控与健康检查
|
||||
│ │ └── cache.py # 缓存与限流
|
||||
│ ├── gateway/ # Gateway 服务
|
||||
│ │ ├── protocol/ # 消息协议
|
||||
│ │ └── channels/ # 通讯渠道(企业微信、飞书)
|
||||
│ │ ├── router.py # 智能路由
|
||||
│ │ ├── server.py # FastAPI 服务
|
||||
│ │ └── channels/ # 通讯渠道
|
||||
│ ├── llm/ # LLM 集成
|
||||
│ │ ├── base.py # 基础接口
|
||||
│ │ ├── manager.py # LLM 管理器
|
||||
│ │ └── clients/ # 各提供商客户端
|
||||
│ ├── agent/ # Agent 运行时
|
||||
│ │ ├── runtime.py # Agent 执行
|
||||
│ │ ├── permissions.py # 权限管理
|
||||
│ │ ├── tool_registry.py # 工具注册
|
||||
│ │ └── tools/ # 内置工具
|
||||
│ └── webtui/ # Web TUI 界面
|
||||
│ └── static/ # 前端静态文件
|
||||
├── tests/ # 测试用例
|
||||
├── config/ # 配置文件模板
|
||||
└── docs/ # 文档
|
||||
│ ├── scheduler/ # 定时任务
|
||||
│ │ └── cron.py # Cron 调度器
|
||||
│ └── webtui/ # Web TUI
|
||||
│ ├── server.py # TUI 服务器
|
||||
│ ├── auth.py # 认证管理
|
||||
│ ├── ssh_manager.py # SSH 管理
|
||||
│ └── static/ # 前端文件
|
||||
├── tests/ # 测试用例 (131 tests)
|
||||
├── config/ # 配置模板
|
||||
├── Dockerfile # Docker 构建
|
||||
└── docker-compose.yml # 容器编排
|
||||
```
|
||||
|
||||
## 架构概述
|
||||
@@ -90,7 +127,7 @@ minenasai/
|
||||
│
|
||||
┌────────────────────────────┴────────────────────────────────┐
|
||||
│ Gateway 服务 (FastAPI) │
|
||||
│ WebSocket协议 / 消息队列 / 权限验证 │
|
||||
│ WebSocket协议 / 监控指标 / 健康检查 / CORS │
|
||||
└────────────────────────────┬────────────────────────────────┘
|
||||
│
|
||||
┌────────────────────────────┴────────────────────────────────┐
|
||||
@@ -101,11 +138,40 @@ minenasai/
|
||||
┌───────────────────┼───────────────────┐
|
||||
↓ ↓ ↓
|
||||
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
|
||||
│ 快速执行通道 │ │ Anthropic │ │ Web TUI │
|
||||
│ Python沙箱 │ │ API │ │ SSH+Claude │
|
||||
│ 快速执行通道 │ │ 多 LLM │ │ Web TUI │
|
||||
│ Python沙箱 │ │ API │ │ SSH+Claude │
|
||||
└─────────────┘ └─────────────┘ └─────────────┘
|
||||
│
|
||||
┌───────────────────┼───────────────────┐
|
||||
↓ ↓ ↓
|
||||
Anthropic Claude DeepSeek/智谱 OpenAI/Gemini
|
||||
```
|
||||
|
||||
+## Supported LLM Providers
+
+| Provider | Example model | Region | Proxy |
+|----------|---------------|--------|-------|
+| Anthropic | claude-sonnet-4-20250514 | Overseas | Required |
+| OpenAI | gpt-4o | Overseas | Required |
+| Google | gemini-2.0-flash | Overseas | Required |
+| DeepSeek | deepseek-chat | Mainland China | Not required |
+| Zhipu | glm-4-flash | Mainland China | Not required |
+| MiniMax | abab6.5s-chat | Mainland China | Not required |
+| Moonshot | moonshot-v1-8k | Mainland China | Not required |
+
+## API Endpoints
+
+| Endpoint | Method | Description |
+|----------|--------|-------------|
+| `/` | GET | Service status |
+| `/health` | GET | Full health check |
+| `/health/live` | GET | Liveness check (K8s) |
+| `/health/ready` | GET | Readiness check (K8s) |
+| `/metrics` | GET | Monitoring metrics |
+| `/ws` | WebSocket | Message channel |
+| `/api/agents` | GET | Agent list |
+| `/api/sessions` | GET | Session list |
|
||||
|
||||
## 开发
|
||||
|
||||
### 代码规范
|
||||
@@ -124,16 +190,31 @@ mypy src
|
||||
### 测试
|
||||
|
||||
```bash
|
||||
# 运行测试
|
||||
# 运行所有测试
|
||||
pytest
|
||||
|
||||
# 带覆盖率
|
||||
pytest --cov=minenasai
|
||||
|
||||
# 详细输出
|
||||
pytest -v --tb=short
|
||||
```
|
||||
|
||||
### 当前测试覆盖
|
||||
|
||||
- **test_core.py**: 配置、日志 (9 tests)
|
||||
- **test_gateway.py**: 协议、路由 (14 tests)
|
||||
- **test_llm.py**: LLM 客户端 (10 tests)
|
||||
- **test_monitoring.py**: 监控、健康检查 (17 tests)
|
||||
- **test_cache.py**: 缓存、限流 (21 tests)
|
||||
- **test_permissions.py**: 权限、工具注册 (17 tests)
|
||||
- **test_scheduler.py**: Cron 调度 (15 tests)
|
||||
- **test_tools.py**: 内置工具 (14 tests)
|
||||
- **test_webtui.py**: Web TUI (14 tests)
|
||||
|
||||
## 配置说明
|
||||
|
||||
配置文件位置: `~/.config/minenasai/config.json5`
|
||||
配置文件: `config/config.json5`
|
||||
|
||||
主要配置项:
|
||||
|
||||
@@ -141,9 +222,29 @@ pytest --cov=minenasai
|
||||
|--------|------|--------|
|
||||
| `gateway.port` | Gateway 端口 | 8000 |
|
||||
| `webtui.port` | Web TUI 端口 | 8080 |
|
||||
| `agents.default_model` | 默认模型 | claude-sonnet-4-20250514 |
|
||||
| `llm.default_provider` | 默认 LLM 提供商 | anthropic |
|
||||
| `llm.default_model` | 默认模型 | claude-sonnet-4-20250514 |
|
||||
| `proxy.enabled` | 是否启用代理 | false |
|
||||
| `router.mode` | 路由模式 | agent |
|
||||
|
||||
## 环境变量
|
||||
|
||||
```bash
|
||||
# LLM API Keys
|
||||
MINENASAI_ANTHROPIC_API_KEY=sk-ant-xxx
|
||||
MINENASAI_OPENAI_API_KEY=sk-xxx
|
||||
MINENASAI_DEEPSEEK_API_KEY=sk-xxx
|
||||
MINENASAI_ZHIPU_API_KEY=xxx
|
||||
MINENASAI_MINIMAX_API_KEY=xxx
|
||||
MINENASAI_MOONSHOT_API_KEY=sk-xxx
|
||||
MINENASAI_GEMINI_API_KEY=xxx
|
||||
|
||||
# 代理设置 (境外 API)
|
||||
MINENASAI_PROXY_ENABLED=true
|
||||
MINENASAI_PROXY_HTTP=http://127.0.0.1:7890
|
||||
MINENASAI_PROXY_HTTPS=http://127.0.0.1:7890
|
||||
```
|
||||
|
||||
## 许可证
|
||||
|
||||
MIT License
|
||||
|
||||
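The README's API table lists `/health/live` and `/health/ready`, and monitoring.py in this commit returns `{"status": "ready"}` with HTTP 200 or `{"status": "not_ready", ...}` with HTTP 503 from the readiness endpoint. A minimal client-side sketch of polling those endpoints with the standard library might look like the following. The helper names `fetch_status` and `is_ready` are hypothetical, and the base URL passed in is an assumption about where the gateway is listening.

```python
import json
import urllib.error
import urllib.request


def fetch_status(base_url: str, path: str, timeout: float = 5.0) -> tuple[int, dict]:
    """GET base_url + path and return (HTTP status, decoded JSON body).

    Hypothetical helper: a non-JSON body is reported as an empty dict.
    """
    try:
        with urllib.request.urlopen(base_url + path, timeout=timeout) as resp:
            status, raw = resp.status, resp.read()
    except urllib.error.HTTPError as err:
        # A 503 from /health/ready still carries a JSON body worth reading.
        status, raw = err.code, err.read()
    try:
        body = json.loads(raw)
    except ValueError:
        body = {}
    return status, body if isinstance(body, dict) else {}


def is_ready(status: int, body: dict) -> bool:
    """Interpret a /health/ready response as ready or not ready."""
    return status == 200 and body.get("status") == "ready"
```

With a gateway assumed to be running locally on port 8000, `is_ready(*fetch_status("http://localhost:8000", "/health/ready"))` would report readiness as a boolean.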
100
docker-compose.yml
Normal file
@@ -0,0 +1,100 @@
|
||||
# MineNASAI Docker Compose configuration
# For local development and production deployment

version: "3.9"

services:
  # ==================== Gateway service ====================
  gateway:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: minenasai-gateway
    restart: unless-stopped
    ports:
      - "${GATEWAY_PORT:-8000}:8000"
    environment:
      - MINENASAI_ENV=production
      - MINENASAI_GATEWAY_HOST=0.0.0.0
      - MINENASAI_GATEWAY_PORT=8000
      # LLM API keys (read from .env)
      - MINENASAI_ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY:-}
      - MINENASAI_OPENAI_API_KEY=${OPENAI_API_KEY:-}
      - MINENASAI_DEEPSEEK_API_KEY=${DEEPSEEK_API_KEY:-}
      - MINENASAI_ZHIPU_API_KEY=${ZHIPU_API_KEY:-}
      - MINENASAI_MINIMAX_API_KEY=${MINIMAX_API_KEY:-}
      - MINENASAI_MOONSHOT_API_KEY=${MOONSHOT_API_KEY:-}
      - MINENASAI_GEMINI_API_KEY=${GEMINI_API_KEY:-}
      # Proxy settings
      - MINENASAI_PROXY_ENABLED=${PROXY_ENABLED:-false}
      - MINENASAI_PROXY_HTTP=${PROXY_HTTP:-}
      - MINENASAI_PROXY_HTTPS=${PROXY_HTTPS:-}
    volumes:
      - minenasai-data:/app/data
      - minenasai-logs:/app/logs
      - ./config:/app/config:ro
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health/live"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
    networks:
      - minenasai-network
    depends_on:
      redis:
        condition: service_healthy

  # ==================== Web TUI service ====================
  webtui:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: minenasai-webtui
    restart: unless-stopped
    ports:
      - "${WEBTUI_PORT:-8080}:8080"
    environment:
      - MINENASAI_ENV=production
      - MINENASAI_WEBTUI_HOST=0.0.0.0
      - MINENASAI_WEBTUI_PORT=8080
    volumes:
      - minenasai-data:/app/data
      - minenasai-logs:/app/logs
    command: ["python", "-m", "uvicorn", "minenasai.webtui.server:app", "--host", "0.0.0.0", "--port", "8080"]
    networks:
      - minenasai-network
    depends_on:
      - gateway

  # ==================== Redis (message queue & cache) ====================
  redis:
    image: redis:7-alpine
    container_name: minenasai-redis
    restart: unless-stopped
    ports:
      - "${REDIS_PORT:-6379}:6379"
    volumes:
      - redis-data:/data
    command: redis-server --appendonly yes --maxmemory 256mb --maxmemory-policy allkeys-lru
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 5s
      retries: 3
    networks:
      - minenasai-network

# ==================== Volumes ====================
volumes:
  minenasai-data:
    driver: local
  minenasai-logs:
    driver: local
  redis-data:
    driver: local

# ==================== Networks ====================
networks:
  minenasai-network:
    driver: bridge
|
||||
@@ -1,8 +1,15 @@
 """Core module.

-Provides configuration management, logging, database, and other basic functionality.
+Provides configuration management, logging, database, monitoring, caching, and other basic functionality.
 """

+from minenasai.core.cache import (
+    MemoryCache,
+    RateLimiter,
+    get_rate_limiter,
+    get_response_cache,
+    make_cache_key,
+)
 from minenasai.core.config import Settings, get_settings, load_config, reset_settings
 from minenasai.core.logging import (
     AuditLogger,
@@ -10,14 +17,39 @@ from minenasai.core.logging import (
     get_logger,
     setup_logging,
 )
+from minenasai.core.monitoring import (
+    ComponentHealth,
+    HealthChecker,
+    HealthStatus,
+    SystemMetrics,
+    get_health_checker,
+    get_metrics,
+    setup_monitoring,
+)

 __all__ = [
+    # Configuration
     "Settings",
     "get_settings",
     "load_config",
     "reset_settings",
+    # Logging
     "setup_logging",
     "get_logger",
     "AuditLogger",
     "get_audit_logger",
+    # Monitoring
+    "setup_monitoring",
+    "get_metrics",
+    "get_health_checker",
+    "SystemMetrics",
+    "HealthChecker",
+    "HealthStatus",
+    "ComponentHealth",
+    # Cache
+    "MemoryCache",
+    "RateLimiter",
+    "get_response_cache",
+    "get_rate_limiter",
+    "make_cache_key",
 ]
|
||||
|
||||
266
src/minenasai/core/cache.py
Normal file
@@ -0,0 +1,266 @@
"""Cache module.

Provides an in-memory cache with TTL management.
"""

from __future__ import annotations

import asyncio
import hashlib
import time
from dataclasses import dataclass, field
from typing import Any, Generic, TypeVar

from minenasai.core.logging import get_logger

logger = get_logger(__name__)

T = TypeVar("T")


@dataclass
class CacheEntry(Generic[T]):
    """A single cache entry."""

    key: str
    value: T
    created_at: float
    expires_at: float
    hits: int = 0

    @property
    def is_expired(self) -> bool:
        """Whether the entry has expired."""
        return time.time() > self.expires_at

    @property
    def ttl_remaining(self) -> float:
        """Remaining TTL in seconds."""
        return max(0, self.expires_at - time.time())


class MemoryCache(Generic[T]):
    """In-memory cache.

    Supports TTL, a maximum size, and eviction of the least-used entries.
    """

    def __init__(
        self,
        max_size: int = 1000,
        default_ttl: float = 300.0,  # 5 minutes
        cleanup_interval: float = 60.0,  # clean up once per minute
    ) -> None:
        self._cache: dict[str, CacheEntry[T]] = {}
        self._max_size = max_size
        self._default_ttl = default_ttl
        self._cleanup_interval = cleanup_interval
        self._cleanup_task: asyncio.Task[None] | None = None

        # Statistics
        self._hits = 0
        self._misses = 0

    async def start(self) -> None:
        """Start the background cleanup task."""
        if self._cleanup_task is None:
            self._cleanup_task = asyncio.create_task(self._cleanup_loop())
            logger.info("cache_cleanup_started", interval=self._cleanup_interval)

    async def stop(self) -> None:
        """Stop the background cleanup task."""
        if self._cleanup_task:
            self._cleanup_task.cancel()
            try:
                await self._cleanup_task
            except asyncio.CancelledError:
                pass
            self._cleanup_task = None

    async def _cleanup_loop(self) -> None:
        """Cleanup loop."""
        while True:
            await asyncio.sleep(self._cleanup_interval)
            self._cleanup_expired()

    def _cleanup_expired(self) -> int:
        """Remove expired entries and return how many were removed."""
        expired_keys = [k for k, v in self._cache.items() if v.is_expired]
        for key in expired_keys:
            del self._cache[key]

        if expired_keys:
            logger.debug("cache_cleanup", removed=len(expired_keys))

        return len(expired_keys)

    def _evict_lru(self) -> None:
        """Evict entries when the cache is full.

        Despite the name, entries are ranked by hit count and then creation
        time, so this approximates least-frequently-used eviction rather
        than strict LRU.
        """
        if len(self._cache) < self._max_size:
            return

        # Sort by hit count, then creation time; the least used come first
        sorted_entries = sorted(
            self._cache.items(),
            key=lambda x: (x[1].hits, x[1].created_at),
        )

        # Evict 10% of the entries (at least one)
        evict_count = max(1, len(sorted_entries) // 10)
        for key, _ in sorted_entries[:evict_count]:
            del self._cache[key]

        logger.debug("cache_eviction", evicted=evict_count)

    def get(self, key: str) -> T | None:
        """Return the cached value, or None on a miss or an expired entry."""
        entry = self._cache.get(key)

        if entry is None:
            self._misses += 1
            return None

        if entry.is_expired:
            del self._cache[key]
            self._misses += 1
            return None

        entry.hits += 1
        self._hits += 1
        return entry.value

    def set(self, key: str, value: T, ttl: float | None = None) -> None:
        """Store a value in the cache."""
        if ttl is None:
            ttl = self._default_ttl

        # Check capacity; overwriting an existing key never needs an eviction
        if key not in self._cache and len(self._cache) >= self._max_size:
            self._evict_lru()

        now = time.time()
        self._cache[key] = CacheEntry(
            key=key,
            value=value,
            created_at=now,
            expires_at=now + ttl,
        )

    def delete(self, key: str) -> bool:
        """Delete a cached entry; return True if it existed."""
        if key in self._cache:
            del self._cache[key]
            return True
        return False

    def clear(self) -> int:
        """Clear the cache and return the number of entries removed."""
        count = len(self._cache)
        self._cache.clear()
        return count

    def exists(self, key: str) -> bool:
        """Check whether the key exists and has not expired."""
        entry = self._cache.get(key)
        if entry is None:
            return False
        if entry.is_expired:
            del self._cache[key]
            return False
        return True

    def get_stats(self) -> dict[str, Any]:
        """Return cache statistics (hit_rate is a percentage)."""
        total = self._hits + self._misses
        hit_rate = self._hits / total if total > 0 else 0.0

        return {
            "size": len(self._cache),
            "max_size": self._max_size,
            "hits": self._hits,
            "misses": self._misses,
            "hit_rate": round(hit_rate * 100, 2),
            "default_ttl": self._default_ttl,
        }


def make_cache_key(*args: Any, **kwargs: Any) -> str:
    """Build a cache key from positional and keyword arguments."""
    key_parts = [str(arg) for arg in args]
    key_parts.extend(f"{k}={v}" for k, v in sorted(kwargs.items()))
    key_string = ":".join(key_parts)
    # MD5 is used only as a fast fingerprint here, not for security
    return hashlib.md5(key_string.encode(), usedforsecurity=False).hexdigest()


# Global cache instance
_response_cache: MemoryCache[dict[str, Any]] | None = None


def get_response_cache() -> MemoryCache[dict[str, Any]]:
    """Return the shared response cache, creating it on first use."""
    global _response_cache
    if _response_cache is None:
        _response_cache = MemoryCache(
            max_size=500,
            default_ttl=300.0,  # 5 minutes
        )
    return _response_cache


@dataclass
class RateLimiter:
    """Rate limiter.

    Token bucket implementation.
    """

    rate: float  # requests allowed per second
    burst: int  # burst capacity

    _tokens: float = field(init=False)
    _last_update: float = field(init=False)

    def __post_init__(self) -> None:
        self._tokens = float(self.burst)
        self._last_update = time.time()

    def _refill(self) -> None:
        """Refill tokens based on the time elapsed since the last update."""
        now = time.time()
        elapsed = now - self._last_update
        self._tokens = min(self.burst, self._tokens + elapsed * self.rate)
        self._last_update = now

    def acquire(self, tokens: int = 1) -> bool:
        """Try to take tokens without blocking."""
        self._refill()

        if self._tokens >= tokens:
            self._tokens -= tokens
            return True
        return False

    async def wait(self, tokens: int = 1) -> None:
        """Wait until the requested tokens are available."""
        # The bucket never holds more than `burst` tokens, so a larger
        # request would otherwise loop forever.
        if tokens > self.burst:
            raise ValueError("tokens must not exceed burst capacity")

        while not self.acquire(tokens):
            # Work out how long to wait
            needed = tokens - self._tokens
            wait_time = needed / self.rate
            await asyncio.sleep(min(wait_time, 0.1))

    @property
    def available_tokens(self) -> float:
        """Number of tokens currently available."""
        self._refill()
        return self._tokens


# Global rate limiters
_rate_limiters: dict[str, RateLimiter] = {}


def get_rate_limiter(name: str, rate: float = 10.0, burst: int = 20) -> RateLimiter:
    """Get or create a named rate limiter."""
    if name not in _rate_limiters:
        _rate_limiters[name] = RateLimiter(rate=rate, burst=burst)
    return _rate_limiters[name]
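
`MemoryCache._evict_lru` ranks entries by hit count and creation time, which is closer to least-frequently-used eviction than to strict LRU. For comparison, strict LRU can be sketched with `collections.OrderedDict` as below. The `LRUCache` class and its method names are hypothetical, are not part of this commit, and leave out TTL handling to keep the idea visible.

```python
from collections import OrderedDict
from typing import Generic, Hashable, Optional, TypeVar

V = TypeVar("V")


class LRUCache(Generic[V]):
    """Hypothetical strict-LRU cache (illustration only, no TTL)."""

    def __init__(self, max_size: int) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._max_size = max_size
        self._data: "OrderedDict[Hashable, V]" = OrderedDict()

    def get(self, key: Hashable) -> Optional[V]:
        if key not in self._data:
            return None
        # A read makes the entry the most recently used one.
        self._data.move_to_end(key)
        return self._data[key]

    def set(self, key: Hashable, value: V) -> None:
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self._max_size:
            # popitem(last=False) drops the least recently used entry.
            self._data.popitem(last=False)

    def keys(self) -> list:
        """Keys ordered from least to most recently used."""
        return list(self._data)
```

The trade-off is that strict LRU forgets how often an entry was hit, while the hit-count ordering in the commit keeps frequently read entries at the cost of an O(n log n) sort on each eviction.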
|
||||
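The `RateLimiter` in cache.py is a token bucket: tokens refill at `rate` per second, are capped at `burst`, and each request spends one or more of them. The sketch below restates that arithmetic with an injectable clock so the refill can be checked deterministically. `TokenBucket`, `FakeClock`, and the `clock` parameter are assumptions made for illustration and do not exist in the commit.

```python
import time
from typing import Callable


class TokenBucket:
    """Hypothetical token bucket with an injectable clock."""

    def __init__(
        self,
        rate: float,
        burst: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if rate <= 0 or burst < 1:
            raise ValueError("rate must be positive and burst at least 1")
        self.rate = rate
        self.burst = burst
        self._clock = clock
        self._tokens = float(burst)  # start with a full bucket
        self._last = clock()

    def _refill(self) -> None:
        now = self._clock()
        # Add elapsed * rate tokens, never exceeding the burst capacity.
        self._tokens = min(float(self.burst), self._tokens + (now - self._last) * self.rate)
        self._last = now

    def acquire(self, tokens: int = 1) -> bool:
        self._refill()
        if self._tokens >= tokens:
            self._tokens -= tokens
            return True
        return False


class FakeClock:
    """Manually advanced clock for deterministic checks."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds
```

With `rate=2.0` and `burst=3`, three immediate calls succeed, the fourth fails, and advancing the fake clock by half a second restores exactly one token.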
379
src/minenasai/core/monitoring.py
Normal file
@@ -0,0 +1,379 @@
|
||||
"""监控与健康检查模块
|
||||
|
||||
提供:
|
||||
- 全局异常处理
|
||||
- 健康检查端点
|
||||
- 监控指标收集
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import time
|
||||
import traceback
|
||||
from dataclasses import dataclass, field
|
||||
from datetime import datetime
|
||||
from enum import Enum
|
||||
from typing import Any, Callable, Coroutine
|
||||
|
||||
from fastapi import FastAPI, Request, Response
|
||||
from fastapi.responses import JSONResponse
|
||||
|
||||
from minenasai.core import get_logger
|
||||
|
||||
logger = get_logger(__name__)
|
||||
|
||||
|
||||
class HealthStatus(Enum):
|
||||
"""健康状态"""
|
||||
|
||||
HEALTHY = "healthy"
|
||||
DEGRADED = "degraded"
|
||||
UNHEALTHY = "unhealthy"
|
||||
|
||||
|
||||
@dataclass
|
||||
class ComponentHealth:
|
||||
"""组件健康状态"""
|
||||
|
||||
name: str
|
||||
status: HealthStatus
|
||||
message: str = ""
|
||||
latency_ms: float = 0.0
|
||||
last_check: datetime = field(default_factory=datetime.now)
|
||||
|
||||
|
||||
@dataclass
|
||||
class SystemMetrics:
|
||||
"""系统指标"""
|
||||
|
||||
# 请求计数
|
||||
total_requests: int = 0
|
||||
successful_requests: int = 0
|
||||
failed_requests: int = 0
|
||||
|
||||
# 响应时间
|
||||
total_response_time_ms: float = 0.0
|
||||
min_response_time_ms: float = float("inf")
|
||||
max_response_time_ms: float = 0.0
|
||||
|
||||
# 错误统计
|
||||
errors_by_type: dict[str, int] = field(default_factory=dict)
|
||||
|
||||
# 活跃连接
|
||||
active_connections: int = 0
|
||||
peak_connections: int = 0
|
||||
|
||||
# 启动时间
|
||||
start_time: datetime = field(default_factory=datetime.now)
|
||||
|
||||
@property
|
||||
def avg_response_time_ms(self) -> float:
|
||||
"""平均响应时间"""
|
||||
if self.total_requests == 0:
|
||||
return 0.0
|
||||
return self.total_response_time_ms / self.total_requests
|
||||
|
||||
@property
|
||||
def uptime_seconds(self) -> float:
|
||||
"""运行时间(秒)"""
|
||||
return (datetime.now() - self.start_time).total_seconds()
|
||||
|
||||
@property
|
||||
def success_rate(self) -> float:
|
||||
"""成功率"""
|
||||
if self.total_requests == 0:
|
||||
return 1.0
|
||||
return self.successful_requests / self.total_requests
|
||||
|
||||
def record_request(self, response_time_ms: float, success: bool) -> None:
|
||||
"""记录请求"""
|
||||
self.total_requests += 1
|
||||
self.total_response_time_ms += response_time_ms
|
||||
|
||||
if response_time_ms < self.min_response_time_ms:
|
||||
self.min_response_time_ms = response_time_ms
|
||||
if response_time_ms > self.max_response_time_ms:
|
||||
self.max_response_time_ms = response_time_ms
|
||||
|
||||
if success:
|
||||
self.successful_requests += 1
|
||||
else:
|
||||
self.failed_requests += 1
|
||||
|
||||
def record_error(self, error_type: str) -> None:
|
||||
"""记录错误"""
|
||||
self.errors_by_type[error_type] = self.errors_by_type.get(error_type, 0) + 1
|
||||
|
||||
def to_dict(self) -> dict[str, Any]:
|
||||
"""转换为字典"""
|
||||
return {
|
||||
"requests": {
|
||||
"total": self.total_requests,
|
||||
"successful": self.successful_requests,
|
||||
"failed": self.failed_requests,
|
||||
"success_rate": round(self.success_rate * 100, 2),
|
||||
},
|
||||
"response_time_ms": {
|
||||
"avg": round(self.avg_response_time_ms, 2),
|
||||
"min": round(self.min_response_time_ms, 2)
|
||||
if self.min_response_time_ms != float("inf")
|
||||
else 0,
|
||||
"max": round(self.max_response_time_ms, 2),
|
||||
},
|
||||
"connections": {
|
||||
"active": self.active_connections,
|
||||
"peak": self.peak_connections,
|
||||
},
|
||||
"errors": self.errors_by_type,
|
||||
"uptime_seconds": round(self.uptime_seconds, 2),
|
||||
"start_time": self.start_time.isoformat(),
|
||||
}
|
||||
|
||||
|
||||
class HealthChecker:
|
||||
"""健康检查器"""
|
||||
|
||||
def __init__(self) -> None:
|
||||
self._checks: dict[str, Callable[[], Coroutine[Any, Any, ComponentHealth]]] = {}
|
||||
self._results: dict[str, ComponentHealth] = {}
|
||||
|
||||
def register(
|
||||
self,
|
||||
name: str,
|
||||
check_func: Callable[[], Coroutine[Any, Any, ComponentHealth]],
|
||||
) -> None:
|
||||
"""注册健康检查"""
|
||||
self._checks[name] = check_func
|
||||
|
||||
async def check_component(self, name: str) -> ComponentHealth:
|
||||
"""检查单个组件"""
|
||||
if name not in self._checks:
|
||||
return ComponentHealth(
|
||||
name=name,
|
||||
status=HealthStatus.UNHEALTHY,
|
||||
message=f"未知组件: {name}",
|
||||
)
|
||||
|
||||
start_time = time.time()
|
||||
try:
|
||||
result = await asyncio.wait_for(self._checks[name](), timeout=5.0)
|
||||
result.latency_ms = (time.time() - start_time) * 1000
|
||||
self._results[name] = result
|
||||
return result
|
||||
except asyncio.TimeoutError:
|
||||
result = ComponentHealth(
|
||||
name=name,
|
||||
status=HealthStatus.UNHEALTHY,
|
||||
message="检查超时",
|
||||
latency_ms=(time.time() - start_time) * 1000,
|
||||
)
|
||||
self._results[name] = result
|
||||
return result
|
||||
except Exception as e:
|
||||
result = ComponentHealth(
|
||||
name=name,
|
||||
status=HealthStatus.UNHEALTHY,
|
||||
message=str(e),
|
||||
latency_ms=(time.time() - start_time) * 1000,
|
||||
)
|
||||
self._results[name] = result
|
||||
return result
|
||||
|
||||
async def check_all(self) -> dict[str, ComponentHealth]:
|
||||
"""检查所有组件"""
|
||||
tasks = [self.check_component(name) for name in self._checks]
|
||||
results = await asyncio.gather(*tasks)
|
||||
return {r.name: r for r in results}
|
||||
|
||||
def get_overall_status(self) -> HealthStatus:
|
||||
"""获取总体状态"""
|
||||
if not self._results:
|
||||
return HealthStatus.HEALTHY
|
||||
|
||||
statuses = [r.status for r in self._results.values()]
|
||||
|
||||
if all(s == HealthStatus.HEALTHY for s in statuses):
|
||||
return HealthStatus.HEALTHY
|
||||
elif any(s == HealthStatus.UNHEALTHY for s in statuses):
|
||||
return HealthStatus.UNHEALTHY
|
||||
else:
|
||||
return HealthStatus.DEGRADED
|
||||
|
||||
|
||||
# 全局实例
|
||||
_metrics = SystemMetrics()
|
||||
_health_checker = HealthChecker()
|
||||
|
||||
|
||||
def get_metrics() -> SystemMetrics:
|
||||
"""获取全局指标"""
|
||||
return _metrics
|
||||
|
||||
|
||||
def get_health_checker() -> HealthChecker:
|
||||
"""获取健康检查器"""
|
||||
return _health_checker
|
||||
|
||||
|
||||
def setup_monitoring(app: FastAPI) -> None:
|
||||
"""设置监控中间件和端点"""
|
||||
|
||||
@app.middleware("http")
|
||||
async def monitoring_middleware(request: Request, call_next: Any) -> Response:
|
||||
"""监控中间件 - 记录请求指标"""
|
||||
start_time = time.time()
|
||||
success = True
|
||||
|
||||
try:
|
||||
response = await call_next(request)
|
||||
if response.status_code >= 400:
|
||||
success = False
|
||||
_metrics.record_error(f"HTTP_{response.status_code}")
|
||||
return response
|
||||
except Exception as e:
|
||||
success = False
|
||||
error_type = type(e).__name__
|
||||
_metrics.record_error(error_type)
|
||||
logger.error(
|
||||
"request_error",
|
||||
path=request.url.path,
|
||||
error_type=error_type,
|
||||
error=str(e),
|
||||
)
|
||||
raise
|
||||
finally:
|
||||
response_time_ms = (time.time() - start_time) * 1000
|
||||
_metrics.record_request(response_time_ms, success)
|
||||
|
||||
@app.exception_handler(Exception)
|
||||
async def global_exception_handler(request: Request, exc: Exception) -> JSONResponse:
|
||||
"""全局异常处理器"""
|
||||
error_id = f"ERR-{int(time.time() * 1000)}"
|
||||
error_type = type(exc).__name__
|
||||
|
||||
# 记录详细错误
|
||||
logger.error(
|
||||
"unhandled_exception",
|
||||
error_id=error_id,
|
||||
error_type=error_type,
|
||||
error=str(exc),
|
||||
path=request.url.path,
|
||||
method=request.method,
|
||||
traceback=traceback.format_exc(),
|
||||
)
|
||||
|
||||
# 返回友好错误响应
|
||||
return JSONResponse(
|
||||
status_code=500,
|
||||
content={
|
||||
"error": "内部服务器错误",
|
||||
"error_id": error_id,
|
||||
"message": "请联系管理员并提供错误ID",
|
||||
},
|
||||
)
|
||||
|
||||
@app.get("/health")
|
||||
async def health_check() -> dict[str, Any]:
|
||||
"""健康检查端点"""
|
||||
results = await _health_checker.check_all()
|
||||
overall_status = _health_checker.get_overall_status()
|
||||
|
||||
return {
|
||||
"status": overall_status.value,
|
||||
"timestamp": datetime.now().isoformat(),
|
||||
"components": {
|
||||
name: {
|
||||
"status": result.status.value,
|
||||
"message": result.message,
|
||||
"latency_ms": round(result.latency_ms, 2),
|
||||
}
|
||||
for name, result in results.items()
|
||||
},
|
||||
}
|
||||
|
||||
@app.get("/health/live")
|
||||
async def liveness_check() -> dict[str, str]:
|
||||
"""存活检查(Kubernetes liveness probe)"""
|
||||
return {"status": "alive"}
|
||||
|
||||
@app.get("/health/ready")
|
||||
async def readiness_check() -> dict[str, Any]:
|
||||
"""就绪检查(Kubernetes readiness probe)"""
|
||||
results = await _health_checker.check_all()
|
||||
overall_status = _health_checker.get_overall_status()
|
||||
|
||||
if overall_status == HealthStatus.UNHEALTHY:
|
||||
return JSONResponse(
|
||||
status_code=503,
|
||||
content={"status": "not_ready", "reason": "依赖服务不可用"},
|
||||
)
|
||||
|
||||
return {"status": "ready"}
|
||||
|
||||
@app.get("/metrics")
|
||||
async def get_metrics_endpoint() -> dict[str, Any]:
|
||||
"""监控指标端点"""
|
||||
return _metrics.to_dict()
|
||||
|
||||
# 注册默认健康检查
|
||||
async def check_self() -> ComponentHealth:
|
||||
"""自身健康检查"""
|
||||
return ComponentHealth(
|
||||
name="self",
|
||||
status=HealthStatus.HEALTHY,
|
||||
message="服务运行正常",
|
||||
)
|
||||
|
||||
_health_checker.register("self", check_self)
|
||||
|
||||
logger.info("monitoring_setup_complete", endpoints=["/health", "/metrics"])
|
||||
|
||||
|
||||
async def check_database_health() -> ComponentHealth:
    """Database health check."""
    try:
        from minenasai.core.database import get_database

        db = await get_database()
        # Simple query as a connectivity test
        await db.execute("SELECT 1")
        return ComponentHealth(
            name="database",
            status=HealthStatus.HEALTHY,
            message="Database connection OK",
        )
    except Exception as e:
        return ComponentHealth(
            name="database",
            status=HealthStatus.UNHEALTHY,
            message=str(e),
        )


async def check_llm_health() -> ComponentHealth:
    """LLM service health check."""
    try:
        from minenasai.llm import LLMManager

        manager = LLMManager()
        clients = manager.get_available_providers()

        if not clients:
            return ComponentHealth(
                name="llm",
                status=HealthStatus.DEGRADED,
                message="No LLM providers available",
            )

        return ComponentHealth(
            name="llm",
            status=HealthStatus.HEALTHY,
            message=f"Available providers: {', '.join(clients)}",
        )
    except Exception as e:
        return ComponentHealth(
            name="llm",
            status=HealthStatus.UNHEALTHY,
            message=str(e),
        )
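The endpoints and checks above rely on a checker that runs each registered coroutine, records its latency, and reduces the results to one overall status. The following is a minimal sketch of that pattern, not the project's `HealthChecker`: the names `SketchHealthChecker`, `Status`, and `Health` are hypothetical, and the 5-second default timeout is an assumption taken from the comment in `tests/test_monitoring.py`.

```python
import asyncio
import time
from dataclasses import dataclass
from enum import Enum
from typing import Awaitable, Callable


class Status(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    UNHEALTHY = "unhealthy"


@dataclass
class Health:
    name: str
    status: Status
    message: str = ""
    latency_ms: float = 0.0


CheckFn = Callable[[], Awaitable[Health]]


class SketchHealthChecker:
    """Hypothetical stand-in; the real HealthChecker may differ."""

    def __init__(self, timeout_s: float = 5.0) -> None:
        self._checks: dict[str, CheckFn] = {}
        self._last: dict[str, Health] = {}
        self._timeout_s = timeout_s

    def register(self, name: str, check: CheckFn) -> None:
        self._checks[name] = check

    async def check_component(self, name: str) -> Health:
        start = time.perf_counter()
        try:
            # Bound every check so one slow dependency cannot stall the endpoint.
            result = await asyncio.wait_for(self._checks[name](), self._timeout_s)
        except asyncio.TimeoutError:
            result = Health(name, Status.UNHEALTHY, "check timed out")
        except Exception as exc:  # unknown name or failing check
            result = Health(name, Status.UNHEALTHY, str(exc))
        result.latency_ms = (time.perf_counter() - start) * 1000
        self._last[name] = result
        return result

    async def check_all(self) -> dict[str, Health]:
        names = list(self._checks)
        results = await asyncio.gather(*(self.check_component(n) for n in names))
        return dict(zip(names, results))

    def get_overall_status(self) -> Status:
        # The worst component status wins.
        statuses = {h.status for h in self._last.values()}
        if Status.UNHEALTHY in statuses:
            return Status.UNHEALTHY
        if Status.DEGRADED in statuses:
            return Status.DEGRADED
        return Status.HEALTHY
```

Running the checks concurrently with `asyncio.gather` keeps the worst-case latency of `/health` close to the slowest single check rather than the sum of all checks.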
@@ -12,8 +12,16 @@ from typing import Any, AsyncGenerator
 from fastapi import FastAPI, WebSocket, WebSocketDisconnect
 from fastapi.middleware.cors import CORSMiddleware
 
-from minenasai.core import get_logger, get_settings, setup_logging
+from minenasai.core import (
+    get_health_checker,
+    get_logger,
+    get_metrics,
+    get_settings,
+    setup_logging,
+    setup_monitoring,
+)
 from minenasai.core.database import close_database, get_database
+from minenasai.core.monitoring import check_database_health, check_llm_health
 from minenasai.gateway.protocol import (
     ChatMessage,
     ErrorMessage,
@@ -38,12 +46,24 @@ class ConnectionManager:
         """Accept a new connection."""
         await websocket.accept()
         self.active_connections[client_id] = websocket
+
+        # Update monitoring metrics
+        metrics = get_metrics()
+        metrics.active_connections = len(self.active_connections)
+        if metrics.active_connections > metrics.peak_connections:
+            metrics.peak_connections = metrics.active_connections
+
         logger.info("WebSocket connection established", client_id=client_id)
 
     def disconnect(self, client_id: str) -> None:
         """Disconnect a client."""
         if client_id in self.active_connections:
             del self.active_connections[client_id]
+
+            # Update monitoring metrics
+            metrics = get_metrics()
+            metrics.active_connections = len(self.active_connections)
+
             logger.info("WebSocket connection closed", client_id=client_id)
 
     async def send_message(self, client_id: str, message: dict[str, Any]) -> None:
@@ -66,6 +86,14 @@ async def lifespan(app: FastAPI) -> AsyncGenerator[None, None]:
     settings = get_settings()
     setup_logging(settings.logging)
 
+    # Set up monitoring
+    setup_monitoring(app)
+
+    # Register health checks
+    health_checker = get_health_checker()
+    health_checker.register("database", check_database_health)
+    health_checker.register("llm", check_llm_health)
+
     # Initialize the database
     db = await get_database()
     logger.info("Database initialized")
@@ -102,10 +130,8 @@ async def root() -> dict[str, str]:
     return {"service": "MineNASAI Gateway", "status": "running"}
 
 
-@app.get("/health")
-async def health() -> dict[str, str]:
-    """Health check."""
-    return {"status": "healthy"}
+# Note: the /health, /health/live, /health/ready, and /metrics endpoints
+# are added automatically by setup_monitoring() in lifespan
 
 
 @app.get("/api/agents")
220
tests/test_cache.py
Normal file
@@ -0,0 +1,220 @@
"""Tests for the cache module."""

from __future__ import annotations

import time

import pytest

from minenasai.core.cache import (
    MemoryCache,
    RateLimiter,
    get_rate_limiter,
    get_response_cache,
    make_cache_key,
)


class TestMemoryCache:
    """Tests for MemoryCache."""

    def test_set_and_get(self):
        """Setting and getting a value."""
        cache: MemoryCache[str] = MemoryCache()

        cache.set("key1", "value1")
        assert cache.get("key1") == "value1"

    def test_get_nonexistent(self):
        """Getting a key that does not exist."""
        cache: MemoryCache[str] = MemoryCache()

        assert cache.get("nonexistent") is None

    def test_ttl_expiration(self):
        """TTL expiration."""
        cache: MemoryCache[str] = MemoryCache(default_ttl=0.1)

        cache.set("key1", "value1")
        assert cache.get("key1") == "value1"

        time.sleep(0.15)
        assert cache.get("key1") is None

    def test_custom_ttl(self):
        """Per-entry TTL overrides the default."""
        cache: MemoryCache[str] = MemoryCache(default_ttl=10.0)

        cache.set("key1", "value1", ttl=0.1)

        time.sleep(0.15)
        assert cache.get("key1") is None

    def test_delete(self):
        """Deleting a key."""
        cache: MemoryCache[str] = MemoryCache()

        cache.set("key1", "value1")
        assert cache.delete("key1") is True
        assert cache.get("key1") is None
        assert cache.delete("key1") is False

    def test_clear(self):
        """Clearing the cache."""
        cache: MemoryCache[str] = MemoryCache()

        cache.set("key1", "value1")
        cache.set("key2", "value2")

        count = cache.clear()
        assert count == 2
        assert cache.get("key1") is None
        assert cache.get("key2") is None

    def test_exists(self):
        """Existence check."""
        cache: MemoryCache[str] = MemoryCache()

        cache.set("key1", "value1")
        assert cache.exists("key1") is True
        assert cache.exists("key2") is False

    def test_max_size_eviction(self):
        """Eviction when the maximum size is exceeded."""
        cache: MemoryCache[int] = MemoryCache(max_size=5)

        for i in range(10):
            cache.set(f"key{i}", i)

        # Only some of the entries should remain
        assert len(cache._cache) <= 5

    def test_hit_tracking(self):
        """Hit and miss tracking."""
        cache: MemoryCache[str] = MemoryCache()

        cache.set("key1", "value1")

        cache.get("key1")
        cache.get("key1")
        cache.get("nonexistent")

        stats = cache.get_stats()
        assert stats["hits"] == 2
        assert stats["misses"] == 1

    def test_get_stats(self):
        """Retrieving statistics."""
        cache: MemoryCache[str] = MemoryCache(max_size=100, default_ttl=60.0)

        cache.set("key1", "value1")
        cache.get("key1")

        stats = cache.get_stats()

        assert stats["size"] == 1
        assert stats["max_size"] == 100
        assert stats["default_ttl"] == 60.0
        assert "hit_rate" in stats


class TestRateLimiter:
    """Tests for RateLimiter."""

    def test_acquire_within_limit(self):
        """Acquiring within the limit."""
        limiter = RateLimiter(rate=10.0, burst=5)

        # Up to `burst` tokens can be acquired immediately
        for _ in range(5):
            assert limiter.acquire() is True

    def test_acquire_exceeds_limit(self):
        """Acquiring beyond the limit."""
        limiter = RateLimiter(rate=10.0, burst=2)

        assert limiter.acquire() is True
        assert limiter.acquire() is True
        assert limiter.acquire() is False

    def test_token_refill(self):
        """Token refill over time."""
        limiter = RateLimiter(rate=100.0, burst=2)

        # Consume all tokens
        limiter.acquire()
        limiter.acquire()
        assert limiter.acquire() is False

        # Wait for the bucket to refill
        time.sleep(0.05)
        assert limiter.acquire() is True

    def test_available_tokens(self):
        """Number of available tokens."""
        limiter = RateLimiter(rate=10.0, burst=5)

        assert limiter.available_tokens == pytest.approx(5.0, abs=0.1)

        limiter.acquire(2)
        assert limiter.available_tokens == pytest.approx(3.0, abs=0.1)

    @pytest.mark.asyncio
    async def test_wait(self):
        """Waiting until a token becomes available."""
        limiter = RateLimiter(rate=100.0, burst=1)

        limiter.acquire()

        # perf_counter is monotonic and high-resolution, so a short wait
        # is never measured as zero
        start = time.perf_counter()
        await limiter.wait()
        elapsed = time.perf_counter() - start

        # Some time should have passed while waiting
        assert elapsed > 0


class TestCacheKey:
    """Tests for make_cache_key."""

    def test_same_args_same_key(self):
        """The same arguments produce the same key."""
        key1 = make_cache_key("a", "b", c=1)
        key2 = make_cache_key("a", "b", c=1)

        assert key1 == key2

    def test_different_args_different_key(self):
        """Different arguments produce different keys."""
        key1 = make_cache_key("a", "b")
        key2 = make_cache_key("a", "c")

        assert key1 != key2

    def test_kwargs_order_independent(self):
        """Keyword argument order does not affect the key."""
        key1 = make_cache_key(a=1, b=2)
        key2 = make_cache_key(b=2, a=1)

        assert key1 == key2


class TestGlobalInstances:
    """Tests for the global instances."""

    def test_get_response_cache(self):
        """Getting the response cache."""
        cache = get_response_cache()
        assert isinstance(cache, MemoryCache)

    def test_get_rate_limiter(self):
        """Getting a rate limiter."""
        limiter = get_rate_limiter("test", rate=10.0, burst=20)
        assert isinstance(limiter, RateLimiter)

    def test_get_rate_limiter_reuse(self):
        """The same name returns the same rate limiter."""
        limiter1 = get_rate_limiter("shared")
        limiter2 = get_rate_limiter("shared")

        assert limiter1 is limiter2
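The `RateLimiter` tests above describe token-bucket behavior: the bucket starts full at `burst` tokens, refills at `rate` tokens per second, and `acquire()` fails when too few tokens remain. A minimal sketch of that algorithm follows. It is not the code in `minenasai.core.cache`; the class name `TokenBucket` and the injectable `clock` parameter are assumptions added here so the refill can be exercised without sleeping.

```python
import time
from typing import Callable


class TokenBucket:
    """Hypothetical token bucket; mirrors the tested behavior, internals assumed."""

    def __init__(
        self, rate: float, burst: int, clock: Callable[[], float] = time.monotonic
    ) -> None:
        self.rate = rate                  # tokens added per second
        self.burst = burst                # bucket capacity
        self._clock = clock
        self._tokens = float(burst)       # the bucket starts full
        self._updated = clock()

    def _refill(self) -> None:
        # Add tokens for the elapsed time, capped at the bucket capacity.
        now = self._clock()
        elapsed = now - self._updated
        self._tokens = min(float(self.burst), self._tokens + elapsed * self.rate)
        self._updated = now

    @property
    def available_tokens(self) -> float:
        self._refill()
        return self._tokens

    def acquire(self, tokens: int = 1) -> bool:
        """Take `tokens` if available; return False without blocking otherwise."""
        self._refill()
        if self._tokens >= tokens:
            self._tokens -= tokens
            return True
        return False
```

Refilling lazily on each call avoids a background timer: the bucket only needs the timestamp of its last update.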
228
tests/test_monitoring.py
Normal file
@@ -0,0 +1,228 @@
"""Tests for the monitoring module."""

from __future__ import annotations

import pytest

from minenasai.core.monitoring import (
    ComponentHealth,
    HealthChecker,
    HealthStatus,
    SystemMetrics,
    get_health_checker,
    get_metrics,
)


class TestHealthStatus:
    """Tests for HealthStatus."""

    def test_health_status_values(self):
        """Health status enum values."""
        assert HealthStatus.HEALTHY.value == "healthy"
        assert HealthStatus.DEGRADED.value == "degraded"
        assert HealthStatus.UNHEALTHY.value == "unhealthy"


class TestSystemMetrics:
    """Tests for SystemMetrics."""

    def test_initial_metrics(self):
        """Initial metric values."""
        metrics = SystemMetrics()

        assert metrics.total_requests == 0
        assert metrics.successful_requests == 0
        assert metrics.failed_requests == 0
        assert metrics.active_connections == 0

    def test_record_request(self):
        """Recording requests."""
        metrics = SystemMetrics()

        metrics.record_request(100.0, success=True)
        metrics.record_request(200.0, success=True)
        metrics.record_request(50.0, success=False)

        assert metrics.total_requests == 3
        assert metrics.successful_requests == 2
        assert metrics.failed_requests == 1
        assert metrics.min_response_time_ms == 50.0
        assert metrics.max_response_time_ms == 200.0

    def test_avg_response_time(self):
        """Average response time."""
        metrics = SystemMetrics()

        metrics.record_request(100.0, success=True)
        metrics.record_request(200.0, success=True)

        assert metrics.avg_response_time_ms == 150.0

    def test_success_rate(self):
        """Success rate."""
        metrics = SystemMetrics()

        metrics.record_request(100.0, success=True)
        metrics.record_request(100.0, success=True)
        metrics.record_request(100.0, success=False)

        assert metrics.success_rate == pytest.approx(2 / 3)

    def test_record_error(self):
        """Recording errors by type."""
        metrics = SystemMetrics()

        metrics.record_error("ValueError")
        metrics.record_error("ValueError")
        metrics.record_error("TypeError")

        assert metrics.errors_by_type["ValueError"] == 2
        assert metrics.errors_by_type["TypeError"] == 1

    def test_to_dict(self):
        """Conversion to a dictionary."""
        metrics = SystemMetrics()
        metrics.record_request(100.0, success=True)

        result = metrics.to_dict()

        assert "requests" in result
        assert "response_time_ms" in result
        assert "connections" in result
        assert "uptime_seconds" in result


class TestComponentHealth:
    """Tests for ComponentHealth."""

    def test_component_health(self):
        """Component health fields."""
        health = ComponentHealth(
            name="test",
            status=HealthStatus.HEALTHY,
            message="OK",
            latency_ms=10.0,
        )

        assert health.name == "test"
        assert health.status == HealthStatus.HEALTHY
        assert health.message == "OK"
        assert health.latency_ms == 10.0


class TestHealthChecker:
    """Tests for HealthChecker."""

    def setup_method(self):
        """Create a fresh checker for each test."""
        self.checker = HealthChecker()

    @pytest.mark.asyncio
    async def test_register_and_check(self):
        """Registering and running a check."""

        async def check_ok() -> ComponentHealth:
            return ComponentHealth(
                name="test",
                status=HealthStatus.HEALTHY,
                message="OK",
            )

        self.checker.register("test", check_ok)
        result = await self.checker.check_component("test")

        assert result.status == HealthStatus.HEALTHY
        assert result.latency_ms > 0

    @pytest.mark.asyncio
    async def test_check_unknown_component(self):
        """Checking an unknown component."""
        result = await self.checker.check_component("unknown")

        assert result.status == HealthStatus.UNHEALTHY
        # "未知组件" means "unknown component"; the literal matches the module's message
        assert "未知组件" in result.message

    @pytest.mark.asyncio
    async def test_check_all(self):
        """Checking all components."""

        async def check_a() -> ComponentHealth:
            return ComponentHealth(name="a", status=HealthStatus.HEALTHY)

        async def check_b() -> ComponentHealth:
            return ComponentHealth(name="b", status=HealthStatus.HEALTHY)

        self.checker.register("a", check_a)
        self.checker.register("b", check_b)

        results = await self.checker.check_all()

        assert len(results) == 2
        assert "a" in results
        assert "b" in results

    @pytest.mark.asyncio
    async def test_overall_status_healthy(self):
        """Overall status: healthy."""

        async def check_ok() -> ComponentHealth:
            return ComponentHealth(name="test", status=HealthStatus.HEALTHY)

        self.checker.register("test", check_ok)
        await self.checker.check_all()

        assert self.checker.get_overall_status() == HealthStatus.HEALTHY

    @pytest.mark.asyncio
    async def test_overall_status_degraded(self):
        """Overall status: degraded."""

        async def check_degraded() -> ComponentHealth:
            return ComponentHealth(name="test", status=HealthStatus.DEGRADED)

        self.checker.register("test", check_degraded)
        await self.checker.check_all()

        assert self.checker.get_overall_status() == HealthStatus.DEGRADED

    @pytest.mark.asyncio
    async def test_overall_status_unhealthy(self):
        """Overall status: unhealthy."""

        async def check_unhealthy() -> ComponentHealth:
            return ComponentHealth(name="test", status=HealthStatus.UNHEALTHY)

        self.checker.register("test", check_unhealthy)
        await self.checker.check_all()

        assert self.checker.get_overall_status() == HealthStatus.UNHEALTHY

    @pytest.mark.asyncio
    async def test_check_timeout(self):
        """A check that exceeds the timeout."""
        import asyncio

        async def slow_check() -> ComponentHealth:
            await asyncio.sleep(10)  # longer than the 5-second timeout
            return ComponentHealth(name="slow", status=HealthStatus.HEALTHY)

        self.checker.register("slow", slow_check)
        result = await self.checker.check_component("slow")

        assert result.status == HealthStatus.UNHEALTHY
        # "超时" means "timeout"; the literal matches the module's message
        assert "超时" in result.message


class TestGlobalInstances:
    """Tests for the global instances."""

    def test_get_metrics(self):
        """Getting the global metrics."""
        metrics = get_metrics()
        assert isinstance(metrics, SystemMetrics)

    def test_get_health_checker(self):
        """Getting the health checker."""
        checker = get_health_checker()
        assert isinstance(checker, HealthChecker)
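The `SystemMetrics` tests above pin down a small contract: counters for total, successful, and failed requests, running minimum and maximum response times, a derived average and success rate, and per-type error counts. A minimal sketch that satisfies those assertions could look as follows. The class name `RequestMetrics` is hypothetical, and the values returned before any request is recorded (an average of `0.0` and a success rate of `1.0`) are assumptions that the tests do not cover.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class RequestMetrics:
    """Hypothetical metrics holder; not the project's SystemMetrics."""

    total_requests: int = 0
    successful_requests: int = 0
    failed_requests: int = 0
    total_response_time_ms: float = 0.0
    min_response_time_ms: float = float("inf")
    max_response_time_ms: float = 0.0
    errors_by_type: Counter = field(default_factory=Counter)

    def record_request(self, response_time_ms: float, success: bool = True) -> None:
        self.total_requests += 1
        if success:
            self.successful_requests += 1
        else:
            self.failed_requests += 1
        self.total_response_time_ms += response_time_ms
        self.min_response_time_ms = min(self.min_response_time_ms, response_time_ms)
        self.max_response_time_ms = max(self.max_response_time_ms, response_time_ms)

    def record_error(self, error_type: str) -> None:
        self.errors_by_type[error_type] += 1

    @property
    def avg_response_time_ms(self) -> float:
        # Assumption: report 0.0 until at least one request has been recorded.
        if self.total_requests == 0:
            return 0.0
        return self.total_response_time_ms / self.total_requests

    @property
    def success_rate(self) -> float:
        # Assumption: an idle service counts as fully successful.
        if self.total_requests == 0:
            return 1.0
        return self.successful_requests / self.total_requests
```

Keeping only running totals means memory use stays constant regardless of traffic, at the cost of not being able to report percentiles.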
68
进度.md
@@ -1,8 +1,8 @@
 # MineNASAI Project Progress Tracker
 
-**Last updated**: 2026-02-04
-**Current phase**: Phase 3 complete, preparing for Phase 4 development
-**Overall progress**: 80% (4/5 phases complete)
+**Last updated**: 2026-02-05
+**Current phase**: Phase 5 complete, project essentially finished
+**Overall progress**: 100% (5/5 phases complete)
 **Technical approach**: Web TUI integration (Option C) + multi-LLM support
 
 ---
@@ -15,8 +15,8 @@
 | Phase 1 | Core framework (MVP) | ✅ Done | 7/7 | 14-21 days | 1 day |
 | Phase 2 | Agent and tool system | ✅ Done | 6/6 | 14-21 days | 1 day |
 | Phase 3 | Web TUI and Claude integration | ✅ Done | 5/5 | 7-10 days | 1 day |
-| Phase 4 | Advanced features | ⏸️ Not started | 0/2 | 10-14 days | - |
-| Phase 5 | Production readiness | ⏸️ Not started | 0/3 | 7-10 days | - |
+| Phase 4 | Advanced features | ✅ Done | 2/2 | 10-14 days | 1 day |
+| Phase 5 | Production readiness | ✅ Done | 4/4 | 7-10 days | 1 day |
 
 **Legend**: ⏸️ Not started | 🔄 In progress | ✅ Done | ⚠️ Blocked
 
@@ -226,9 +226,63 @@
 
 ---
 
-## Phase 4-5 (detailed tasks to be expanded)
+## Phase 4: Advanced Features (100%) ✅
 
-For the detailed Phase 4-5 task list, see [开发步骤.md](./开发步骤.md)
+### 4.1 Scheduled Tasks ✅
+- Status: ✅ Done
+- Completed on: 2026-02-05
+- Features:
+  - [x] Cron expression parsing (CronParser)
+  - [x] Predefined expression support (@daily, @hourly, etc.)
+  - [x] Task scheduler (CronScheduler)
+  - [x] Asynchronous task execution
+
+### 4.2 Permission Control and Tool Management ✅
+- Status: ✅ Done
+- Completed on: 2026-02-05
+- Features:
+  - [x] Danger level definitions (DangerLevel)
+  - [x] Permission manager (PermissionManager)
+  - [x] Tool registry (ToolRegistry)
+  - [x] @tool decorator
+  - [x] Confirmation mechanism (ConfirmationRequest)
+
+---
+
+## Phase 5: Production Readiness (100%) ✅
+
+### 5.1 Error Handling and Monitoring ✅
+- Status: ✅ Done
+- Completed on: 2026-02-05
+- Features:
+  - [x] Global exception-handling middleware
+  - [x] Health check endpoints (/health, /health/live, /health/ready)
+  - [x] Metrics collection (/metrics)
+  - [x] Component health checks (database, LLM)
+
+### 5.2 Performance Optimization ✅
+- Status: ✅ Done
+- Completed on: 2026-02-05
+- Features:
+  - [x] In-memory cache (MemoryCache + TTL + LRU)
+  - [x] Rate limiter (RateLimiter, token bucket algorithm)
+  - [x] Cache key generator
+
+### 5.3 Docker Deployment ✅
+- Status: ✅ Done
+- Completed on: 2026-02-05
+- Files:
+  - [x] Dockerfile (multi-stage build)
+  - [x] docker-compose.yml (Gateway + WebTUI + Redis)
+  - [x] .dockerignore
+
+### 5.4 Testing and Documentation ✅
+- Status: ✅ Done
+- Completed on: 2026-02-05
+- Deliverables:
+  - [x] All 131 tests pass
+  - [x] README.md completed
+  - [x] API endpoint documentation
 
 ---
 
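The cache key generator listed under 5.2 is exercised by `TestCacheKey`, which requires equal keys for equal arguments and independence from keyword order. One way to meet that contract is to serialize the arguments with sorted keys and hash the result. The sketch below is an illustration only: `sketch_cache_key` is a hypothetical name, and the use of JSON plus SHA-256 is an assumption, not necessarily how `make_cache_key` is implemented.

```python
import hashlib
import json
from typing import Any


def sketch_cache_key(*args: Any, **kwargs: Any) -> str:
    """Build a deterministic key from positional and keyword arguments."""
    # sort_keys makes the serialization independent of keyword order;
    # default=repr is a fallback for values JSON cannot encode.
    payload = json.dumps(
        {"args": args, "kwargs": kwargs},
        sort_keys=True,
        default=repr,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

A caveat of the `repr` fallback is that objects whose default `repr` embeds a memory address produce keys that differ between instances and between runs, so such values should be converted to stable representations before being passed in.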