
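feat: implement the NER entity recognition service and remove Docker-related configuration

New features:
- Python FastAPI NER service (rule mode, extensible to spaCy/Transformers/API)
- Java NER client (PythonNerClient) and service interface
- NER DTO classes (NerRequest, NerResponse, EntityInfo, RelationInfo, etc.)
- NER API endpoints (/api/ner/extract, /api/ner/document/{id})
- Relation extraction service (rule-based positional proximity and semantic pattern matching)
- GraphNodeService (node/relation CRUD, batch operations)
- Document-parsed event (DocumentParsedEvent) with automatic NER triggering

Removed:
- Delete all Docker-related files (Dockerfile, docker-compose.yml, deploy.sh)
- Delete the README_DEPLOY.md Docker deployment guide
- Update the docs to remove Docker-related content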
何文松 1 month ago
parent
commit
926e4a13ee
48 changed files with 3577 additions and 848 deletions
  1. 229 0
      NER_README.md
  2. 0 441
      README_DEPLOY.md
  3. 4 6
      backend/CONFIG_GUIDE.md
  4. 0 24
      backend/Dockerfile
  5. 217 0
      backend/ai-service/src/main/java/com/lingyue/ai/client/PythonNerClient.java
  6. 42 0
      backend/ai-service/src/main/java/com/lingyue/ai/config/NerClientConfig.java
  7. 136 0
      backend/ai-service/src/main/java/com/lingyue/ai/controller/NerController.java
  8. 42 0
      backend/ai-service/src/main/java/com/lingyue/ai/dto/ner/EntityInfo.java
  9. 40 0
      backend/ai-service/src/main/java/com/lingyue/ai/dto/ner/NerRequest.java
  10. 75 0
      backend/ai-service/src/main/java/com/lingyue/ai/dto/ner/NerResponse.java
  11. 36 0
      backend/ai-service/src/main/java/com/lingyue/ai/dto/ner/PositionInfo.java
  12. 39 0
      backend/ai-service/src/main/java/com/lingyue/ai/dto/ner/RelationInfo.java
  13. 32 0
      backend/ai-service/src/main/java/com/lingyue/ai/dto/ner/RelationRequest.java
  14. 66 0
      backend/ai-service/src/main/java/com/lingyue/ai/dto/ner/RelationResponse.java
  15. 62 0
      backend/ai-service/src/main/java/com/lingyue/ai/enums/EntityType.java
  16. 67 0
      backend/ai-service/src/main/java/com/lingyue/ai/listener/DocumentParsedEventListener.java
  17. 65 0
      backend/ai-service/src/main/java/com/lingyue/ai/service/NerService.java
  18. 342 0
      backend/ai-service/src/main/java/com/lingyue/ai/service/impl/NerServiceImpl.java
  19. 44 0
      backend/common/src/main/java/com/lingyue/common/event/DocumentParsedEvent.java
  20. 230 9
      backend/graph-service/src/main/java/com/lingyue/graph/controller/GraphController.java
  21. 36 0
      backend/graph-service/src/main/java/com/lingyue/graph/dto/BatchCreateNodesRequest.java
  22. 30 0
      backend/graph-service/src/main/java/com/lingyue/graph/dto/BatchCreateRelationsRequest.java
  23. 50 0
      backend/graph-service/src/main/java/com/lingyue/graph/dto/CreateNodeRequest.java
  24. 47 0
      backend/graph-service/src/main/java/com/lingyue/graph/dto/CreateRelationRequest.java
  25. 365 0
      backend/graph-service/src/main/java/com/lingyue/graph/service/GraphNodeService.java
  26. 30 0
      backend/graph-service/src/main/java/com/lingyue/graph/service/TextStorageService.java
  27. 17 0
      backend/lingyue-starter/src/main/resources/application.properties
  28. 0 12
      backend/sql/README_SUPPLEMENT.md
  29. 0 184
      deploy.sh
  30. 0 139
      docker-compose.yml
  31. 103 0
      python-services/ner-service/README.md
  32. 3 0
      python-services/ner-service/app/__init__.py
  33. 43 0
      python-services/ner-service/app/config.py
  34. 85 0
      python-services/ner-service/app/main.py
  35. 16 0
      python-services/ner-service/app/models/__init__.py
  36. 53 0
      python-services/ner-service/app/models/request.py
  37. 93 0
      python-services/ner-service/app/models/response.py
  38. 6 0
      python-services/ner-service/app/routers/__init__.py
  39. 63 0
      python-services/ner-service/app/routers/ner.py
  40. 54 0
      python-services/ner-service/app/routers/relation.py
  41. 7 0
      python-services/ner-service/app/services/__init__.py
  42. 197 0
      python-services/ner-service/app/services/ner_service.py
  43. 228 0
      python-services/ner-service/app/services/relation_service.py
  44. 28 0
      python-services/ner-service/requirements.txt
  45. 3 0
      python-services/ner-service/tests/__init__.py
  46. 104 0
      python-services/ner-service/tests/test_ner.py
  47. 113 0
      test/test_ner_e2e.sh
  48. 35 33
      进度报告.md

+ 229 - 0
NER_README.md

@@ -0,0 +1,229 @@
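+# NER Service User Guide
+
+This document explains how to use the NER (named entity recognition) service.
+
+## Overview
+
+The NER service provides the following features:
+
+1. **Entity extraction**: identifies person names, organizations, locations, dates, numeric values, devices, and other entities in text
+2. **Relation extraction**: extracts semantic relations between entities
+3. **Automatic integration**: NER is triggered automatically once document parsing completes, and the results are saved to the graph database
+
+## Architecture
+
+```
+File upload → Document parsing → Text storage → RAG indexing
+                                      ↓
+                                Publish event
+                                      ↓
+                                NER listener
+                                      ↓
+                         Call the Python NER service
+                                      ↓
+                         Save to the graph database (graph_nodes, graph_relations)
+```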
+
+## Quick Start
+
+### 1. Start the Python NER service
+
+```bash
+cd python-services/ner-service
+
+# Install dependencies
+pip install -r requirements.txt
+
+# Start the service
+uvicorn app.main:app --host 0.0.0.0 --port 8001 --reload
+```
+
+### 2. Verify the service
+
+```bash
+# Health check
+curl http://localhost:8001/health
+
+# Run the end-to-end test (from the repository root)
+./test/test_ner_e2e.sh
+```
+
+## API Endpoints
+
+### Python NER service
+
+| Method | Path | Description |
+|------|------|------|
+| GET | `/health` | Health check |
+| POST | `/ner/extract` | Extract entities and relations |
+| POST | `/ner/relations` | Extract relations from entities |
+
+### Java NER endpoints
+
+| Method | Path | Description |
+|------|------|------|
+| POST | `/api/ner/extract` | Extract entities (synchronous) |
+| POST | `/api/ner/extract/async` | Extract entities (asynchronous) |
+| POST | `/api/ner/relations` | Extract relations |
+| POST | `/api/ner/document/{documentId}` | Run NER on a document and save the results |
+| POST | `/api/ner/document/{documentId}/async` | Run NER on a document asynchronously |
+
+### Graph database endpoints
+
+| Method | Path | Description |
+|------|------|------|
+| GET | `/api/graph/documents/{documentId}/nodes` | Get the nodes of a document |
+| GET | `/api/graph/documents/{documentId}/stats` | Get graph statistics |
+| POST | `/api/graph/nodes` | Create a node |
+| POST | `/api/graph/relations` | Create a relation |
+## Usage Examples
+
+### Entity extraction
+
+```bash
+curl -X POST http://localhost:8001/ner/extract \
+  -H "Content-Type: application/json" \
+  -d '{
+    "documentId": "doc-001",
+    "text": "2024年5月15日,成都检测公司完成了环境监测项目。",
+    "extractRelations": true
+  }'
+```
+
+Response:
+```json
+{
+  "documentId": "doc-001",
+  "entities": [
+    {
+      "name": "2024年5月15日",
+      "type": "DATE",
+      "value": "2024年5月15日",
+      "position": {"charStart": 0, "charEnd": 10, "line": 1},
+      "confidence": 0.8
+    },
+    {
+      "name": "成都检测公司",
+      "type": "ORG",
+      "value": "成都检测公司",
+      "position": {"charStart": 11, "charEnd": 17, "line": 1},
+      "confidence": 0.8
+    }
+  ],
+  "relations": [...],
+  "entityCount": 2,
+  "relationCount": 1,
+  "processingTime": 150,
+  "success": true
+}
+```
+
+### Run NER on a document
+
+```bash
+curl -X POST "http://localhost:5232/api/ner/document/doc-001?userId=user-001"
+```
+
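+## Entity Types
+
+| Type | Description | Examples |
+|------|------|------|
+| PERSON | Person name | 张经理、李总 |
+| ORG | Organization | 成都检测公司 |
+| LOC | Location | 成都市高新区 |
+| DATE | Date | 2024年5月15日 |
+| NUMBER | Numeric value | 50分贝、100万元 |
+| DEVICE | Device | 噪音检测设备 |
+| PROJECT | Project | 环境监测项目 |
+| TERM | Technical term | - |
+
+## Relation Types
+
+| Type | Description |
+|------|------|
+| 负责 | The subject is responsible for something |
+| 属于 | Membership (belongs to) |
+| 位于 | Location (located in) |
+| 包含 | Containment (contains) |
+| 使用 | Usage (uses) |
+| 检测 | Inspection (inspects) |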
+## Configuration
+
+### application.properties
+
+```properties
+# Automatic NER extraction
+ner.auto-extract.enabled=true
+
+# Python NER service
+ner.python-service.url=http://localhost:8001
+ner.python-service.timeout=60000
+ner.python-service.max-retries=3
+```
+
+### Environment variables
+
+| Variable | Description | Default |
+|--------|------|--------|
+| NER_SERVICE_URL | NER service URL | http://localhost:8001 |
+| NER_MODEL | NER model type | rule |
+| USE_GPU | Whether to use a GPU | false |
+
+## NER Model Modes
+
+### 1. rule mode (default)
+
+A simple NER based on regular-expression rules, suitable for development and testing:
+
+- Pros: no extra dependencies, fast startup
+- Cons: limited accuracy, supports only common entity types
+
+### 2. spacy mode
+
+Uses the spaCy Chinese model:
+
+```bash
+pip install spacy
+python -m spacy download zh_core_web_sm
+```
+
+### 3. api mode
+
+Calls an external API (such as DeepSeek/Qwen) to perform NER:
+
+```properties
+NER_MODEL=api
+API_BASE_URL=https://dashscope.aliyuncs.com/compatible-mode/v1
+API_KEY=your-api-key
+API_MODEL=qwen-plus
+```
+
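+## Troubleshooting
+
+### Cannot connect to the NER service
+
+1. Confirm that the NER service is running: `curl http://localhost:8001/health`
+2. Check the firewall settings
+3. Check the service URL in the configuration file
+
+### Entity extraction returns no results
+
+1. Confirm that the text is not empty
+2. Check whether the text contains recognizable entities
+3. Check the NER service logs
+
+### Relation extraction fails
+
+1. Confirm that there are at least 2 entities
+2. Check that the entity position information is correct
+
+## Roadmap
+
+- [ ] Integrate the spaCy Chinese model
+- [ ] Integrate a Transformers NER model (such as Qwen-NER)
+- [ ] Implement api mode (DeepSeek/Qwen)
+- [ ] Support custom entity types
+- [ ] Improve relation extraction accuracy
+- [ ] Add entity deduplication and merging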
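
The rule mode described in NER_README.md can be pictured with a small regex sketch. The patterns and the `extract_entities` helper below are illustrative assumptions, not the contents of `app/services/ner_service.py` (that file is not shown in this part of the diff). The offsets are Python's half-open match offsets, which is also an assumed convention for `charStart`/`charEnd`.

```python
import re

# Hypothetical rule set: two patterns chosen only to match the README sample sentence.
PATTERNS = {
    "DATE": re.compile(r"\d{4}年\d{1,2}月\d{1,2}日"),
    "ORG": re.compile(r"[\u4e00-\u9fa5]{2,10}(?:公司|集团|研究院)"),
}


def extract_entities(text):
    """Return rule-matched entities sorted by their start offset."""
    entities = []
    for entity_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            entities.append({
                "name": match.group(),
                "type": entity_type,
                "value": match.group(),
                "position": {
                    "charStart": match.start(),  # inclusive
                    "charEnd": match.end(),      # exclusive
                    "line": text.count("\n", 0, match.start()) + 1,
                },
                "confidence": 0.8,  # fixed score, mirroring the README sample
            })
    return sorted(entities, key=lambda e: e["position"]["charStart"])


if __name__ == "__main__":
    for entity in extract_entities("2024年5月15日,成都检测公司完成了环境监测项目。"):
        print(entity["type"], entity["name"], entity["position"])
```

A fixed confidence is the simplest choice for a rule matcher; a real implementation would probably vary it per rule.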

+ 0 - 441
README_DEPLOY.md

@@ -1,441 +0,0 @@
-# 灵越智报 2.0 部署指南
-
-## 目录
-
-- [环境要求](#环境要求)
-- [快速开始](#快速开始)
-- [Docker 部署(推荐)](#docker-部署推荐)
-- [传统部署](#传统部署)
-- [配置说明](#配置说明)
-- [常见问题](#常见问题)
-
-## 环境要求
-
-### 基础环境
-- **操作系统**: Linux / macOS / Windows
-- **Java**: JDK 17 或更高版本
-- **Maven**: 3.8.0 或更高版本
-
-### 运行环境(如果使用 Docker 则自动提供)
-- **PostgreSQL**: 16.0 或更高版本
-- **Redis**: 7.0 或更高版本
-- **RabbitMQ**: 3.13 或更高版本(可选)
-
-### Docker 环境(推荐)
-- **Docker**: 20.10 或更高版本
-- **Docker Compose**: 2.0 或更高版本
-
-## 快速开始
-
-### 方式一: 使用部署脚本(最简单)
-
-```bash
-# 1. 进入项目根目录
-cd lingyue-zhibao
-
-# 2. 赋予执行权限
-chmod +x deploy.sh
-
-# 3. 启动服务(不包含 OCR)
-./deploy.sh start
-
-# 或者启动完整服务(包含 PaddleOCR)
-./deploy.sh start-with-ocr
-```
-
-### 方式二: 手动 Docker 部署
-
-```bash
-# 1. 编译项目
-cd backend
-mvn clean package -DskipTests
-cd ..
-
-# 2. 复制环境变量配置
-cp .env.example .env
-
-# 3. 修改 .env 文件中的配置(特别是 JWT_SECRET 和 API 密钥)
-vim .env
-
-# 4. 启动服务
-docker-compose up -d
-
-# 5. 查看日志
-docker-compose logs -f lingyue-app
-```
-
-## Docker 部署(推荐)
-
-### 1. 准备工作
-
-```bash
-# 克隆或进入项目目录
-cd lingyue-zhibao
-
-# 复制环境配置文件
-cp .env.example .env
-```
-
-### 2. 修改配置
-
-编辑 `.env` 文件,修改以下重要配置:
-
-```env
-# 数据库密码(生产环境务必修改!)
-DB_PASSWORD=your-strong-password
-
-# JWT 密钥(生产环境务必修改!)
-JWT_SECRET=your-very-long-random-secret-key
-
-# DeepSeek API Key(如需 AI 功能)
-DEEPSEEK_API_KEY=your-deepseek-api-key
-```
-
-### 3. 启动服务
-
-```bash
-# 完整启动(包含所有服务)
-docker-compose --profile with-ocr up -d
-
-# 或者不启动 OCR 服务
-docker-compose up -d
-```
-
-### 4. 验证部署
-
-```bash
-# 查看服务状态
-docker-compose ps
-
-# 查看应用日志
-docker-compose logs -f lingyue-app
-
-# 健康检查
-curl http://localhost:8000/actuator/health
-```
-
-### 5. 访问应用
-
-- **应用主页**: http://localhost:8000
-- **API 文档**: http://localhost:8000/swagger-ui.html
-- **Druid 监控**: http://localhost:8000/druid/ (admin/admin123)
-- **RabbitMQ 管理**: http://localhost:15672 (admin/admin123)
-
-## 传统部署
-
-### 1. 环境准备
-
-#### 安装 PostgreSQL
-
-```bash
-# Ubuntu/Debian
-sudo apt update
-sudo apt install postgresql-16
-
-# 创建数据库
-sudo -u postgres psql
-CREATE DATABASE lingyue_zhibao;
-CREATE USER lingyue WITH PASSWORD '123123';
-GRANT ALL PRIVILEGES ON DATABASE lingyue_zhibao TO lingyue;
-```
-
-#### 安装 Redis
-
-```bash
-# Ubuntu/Debian
-sudo apt install redis-server
-
-# 启动 Redis
-sudo systemctl start redis-server
-sudo systemctl enable redis-server
-```
-
-#### 安装 RabbitMQ(可选)
-
-```bash
-# Ubuntu/Debian
-sudo apt install rabbitmq-server
-
-# 启动 RabbitMQ
-sudo systemctl start rabbitmq-server
-sudo systemctl enable rabbitmq-server
-
-# 启用管理插件
-sudo rabbitmq-plugins enable rabbitmq_management
-
-# 创建用户
-sudo rabbitmqctl add_user admin admin123
-sudo rabbitmqctl set_user_tags admin administrator
-sudo rabbitmqctl set_permissions -p / admin ".*" ".*" ".*"
-```
-
-### 2. 编译项目
-
-```bash
-cd backend
-mvn clean package -DskipTests
-```
-
-### 3. 配置应用
-
-编辑 `backend/lingyue-starter/src/main/resources/application.properties`:
-
-```properties
-# 数据库配置
-spring.datasource.druid.url=jdbc:postgresql://localhost:5432/lingyue_zhibao
-spring.datasource.druid.username=lingyue
-spring.datasource.druid.password=123123
-
-# Redis配置
-spring.data.redis.host=localhost
-spring.data.redis.port=6379
-
-# RabbitMQ配置
-spring.rabbitmq.host=localhost
-spring.rabbitmq.port=5672
-spring.rabbitmq.username=admin
-spring.rabbitmq.password=admin123
-
-# JWT配置
-jwt.secret=your-jwt-secret-key
-```
-
-或者通过环境变量配置(推荐用于生产环境):
-
-```bash
-export DB_USERNAME=lingyue
-export DB_PASSWORD=123123
-export JWT_SECRET=your-very-long-random-secret-key
-```
-
-### 4. 运行应用
-
-```bash
-# 方式一: 使用 java -jar
-cd backend/lingyue-starter/target
-java -jar lingyue-starter.jar
-
-# 方式二: 使用 Maven
-cd backend
-mvn spring-boot:run -pl lingyue-starter
-
-# 方式三: 后台运行
-nohup java -jar lingyue-starter.jar > app.log 2>&1 &
-```
-
-### 5. 使用 systemd 管理服务
-
-创建服务文件 `/etc/systemd/system/lingyue-zhibao.service`:
-
-```ini
-[Unit]
-Description=Lingyue Zhibao Application
-After=network.target postgresql.service redis.service
-
-[Service]
-Type=simple
-User=lingyue
-WorkingDirectory=/opt/lingyue-zhibao
-ExecStart=/usr/bin/java -Xms512m -Xmx1024m -jar /opt/lingyue-zhibao/lingyue-starter.jar
-Restart=on-failure
-RestartSec=10
-
-[Install]
-WantedBy=multi-user.target
-```
-
-启动服务:
-
-```bash
-sudo systemctl daemon-reload
-sudo systemctl start lingyue-zhibao
-sudo systemctl enable lingyue-zhibao
-sudo systemctl status lingyue-zhibao
-```
-
-## 配置说明
-
-### 环境变量
-
-| 变量名 | 说明 | 默认值 |
-|--------|------|--------|
-| DB_USERNAME | 数据库用户名 | postgres |
-| DB_PASSWORD | 数据库密码 | postgres |
-| REDIS_HOST | Redis 主机 | localhost |
-| REDIS_PORT | Redis 端口 | 6379 |
-| REDIS_PASSWORD | Redis 密码 | (空) |
-| RABBITMQ_HOST | RabbitMQ 主机 | localhost |
-| RABBITMQ_PORT | RabbitMQ 端口 | 5672 |
-| RABBITMQ_USERNAME | RabbitMQ 用户名 | guest |
-| RABBITMQ_PASSWORD | RabbitMQ 密码 | guest |
-| JWT_SECRET | JWT 密钥 | (需修改) |
-| PADDLEOCR_SERVER_URL | PaddleOCR 服务地址 | http://localhost:8866 |
-| DEEPSEEK_API_KEY | DeepSeek API 密钥 | (需配置) |
-
-### 端口说明
-
-| 服务 | 端口 | 说明 |
-|------|------|------|
-| 应用主服务 | 8000 | 主应用端口 |
-| PostgreSQL | 5432 | 数据库端口 |
-| Redis | 6379 | 缓存服务端口 |
-| RabbitMQ | 5672 | 消息队列端口 |
-| RabbitMQ 管理 | 15672 | RabbitMQ 管理界面 |
-| PaddleOCR | 8866 | OCR 服务端口 |
-
-## 常见问题
-
-### 1. 数据库连接失败
-
-**问题**: 应用启动时提示数据库连接失败
-
-**解决方案**:
-```bash
-# 检查 PostgreSQL 是否运行
-docker-compose ps postgres
-# 或
-sudo systemctl status postgresql
-
-# 检查数据库配置
-docker-compose exec postgres psql -U postgres -c "SELECT 1"
-
-# 查看详细日志
-docker-compose logs postgres
-```
-
-### 2. Redis 连接失败
-
-**问题**: Redis 连接超时或拒绝连接
-
-**解决方案**:
-```bash
-# 检查 Redis 是否运行
-docker-compose ps redis
-# 或
-sudo systemctl status redis
-
-# 测试 Redis 连接
-redis-cli ping
-```
-
-### 3. 端口被占用
-
-**问题**: 服务启动失败,提示端口已被占用
-
-**解决方案**:
-```bash
-# 查看占用端口的进程
-sudo lsof -i :8000
-sudo netstat -tulpn | grep 8000
-
-# 修改 docker-compose.yml 中的端口映射
-# 例如将 8000:8000 改为 8080:8000
-```
-
-### 4. 内存不足
-
-**问题**: 应用运行缓慢或频繁重启
-
-**解决方案**:
-
-修改 `Dockerfile` 中的 JVM 参数:
-```dockerfile
-ENV JAVA_OPTS="-Xms1024m -Xmx2048m -XX:+UseG1GC"
-```
-
-或在启动时指定:
-```bash
-docker-compose up -d --build
-```
-
-### 5. 查看应用日志
-
-```bash
-# Docker 部署
-docker-compose logs -f lingyue-app
-
-# 传统部署
-tail -f /opt/lingyue-zhibao/logs/lingyue.log
-
-# systemd 服务
-sudo journalctl -u lingyue-zhibao -f
-```
-
-### 6. 重置数据库
-
-```bash
-# Docker 环境
-docker-compose down -v  # 删除所有卷
-docker-compose up -d
-
-# 传统部署
-sudo -u postgres psql
-DROP DATABASE lingyue_zhibao;
-CREATE DATABASE lingyue_zhibao;
-```
-
-## 性能优化建议
-
-### 1. JVM 调优
-
-```bash
-# 生产环境建议
-JAVA_OPTS="-Xms2g -Xmx4g \
-  -XX:+UseG1GC \
-  -XX:MaxGCPauseMillis=200 \
-  -XX:+HeapDumpOnOutOfMemoryError \
-  -XX:HeapDumpPath=/app/logs"
-```
-
-### 2. 数据库优化
-
-```sql
--- 增加连接池大小
-ALTER SYSTEM SET max_connections = 200;
-
--- 优化缓存
-ALTER SYSTEM SET shared_buffers = '256MB';
-ALTER SYSTEM SET effective_cache_size = '1GB';
-```
-
-### 3. Redis 优化
-
-```bash
-# 增加最大内存
-redis-cli CONFIG SET maxmemory 2gb
-redis-cli CONFIG SET maxmemory-policy allkeys-lru
-```
-
-## 安全建议
-
-1. **修改默认密码**: 生产环境务必修改所有默认密码
-2. **配置防火墙**: 只开放必要的端口
-3. **启用 HTTPS**: 使用 Nginx 反向代理配置 SSL
-4. **定期备份**: 配置数据库自动备份
-5. **监控告警**: 配置应用和基础设施监控
-
-## 备份与恢复
-
-### 数据库备份
-
-```bash
-# 备份
-docker-compose exec postgres pg_dump -U postgres lingyue_zhibao > backup.sql
-
-# 恢复
-docker-compose exec -T postgres psql -U postgres lingyue_zhibao < backup.sql
-```
-
-### 文件备份
-
-```bash
-# 备份上传文件
-docker run --rm -v lingyue_app_data:/data -v $(pwd):/backup ubuntu tar czf /backup/app_data_backup.tar.gz /data
-```
-
-## 技术支持
-
-如有问题,请联系:
-- 邮箱: support@lingyue.com
-- 文档: https://docs.lingyue.com
-- Issues: https://github.com/lingyue/lingyue-zhibao/issues

+ 4 - 6
backend/CONFIG_GUIDE.md

@@ -185,9 +185,8 @@ export DEEPSEEK_API_KEY=your-deepseek-api-key
 
 java -jar lingyue-starter.jar
 
-# Option 2: use a .env file with Docker
-# See the .env.example file
-docker-compose up -d
+# Option 2: run in the background
+nohup java -jar lingyue-starter.jar > app.log 2>&1 &
 ```
 
 ### Common environment variables
@@ -315,9 +314,8 @@ jwt.secret=your-generated-secret-key
 
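 1. **Development**: use `application-dev.properties`, with settings written in the file
 2. **Production**: use `application-prod.properties`, with sensitive settings supplied through environment variables
-3. **Docker deployment**: manage environment variables with a `.env` file
-4. **Sensitive information**: never commit passwords or keys to the Git repository
-5. **Configuration separation**: put shared settings in `application-common.properties` and service-specific settings in each service's configuration file
+3. **Sensitive information**: never commit passwords or keys to the Git repository
+4. **Configuration separation**: put shared settings in `application-common.properties` and service-specific settings in each service's configuration file
 
 ## Configuration file change history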
 

+ 0 - 24
backend/Dockerfile

@@ -1,24 +0,0 @@
-# Use OpenJDK 17 as the base image
-FROM openjdk:17-jdk-slim
-
-# Set the working directory
-WORKDIR /app
-
-# Set the time zone to Asia/Shanghai
-ENV TZ=Asia/Shanghai
-RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
-
-# Copy the built jar
-COPY lingyue-starter/target/lingyue-starter.jar /app/app.jar
-
-# Create the data directory
-RUN mkdir -p /tmp/lingyue-zhibao
-
-# Expose the port
-EXPOSE 8000
-
-# JVM options
-ENV JAVA_OPTS="-Xms512m -Xmx1024m -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/app/logs"
-
-# Start the application
-ENTRYPOINT ["sh", "-c", "java $JAVA_OPTS -jar /app/app.jar"]

+ 217 - 0
backend/ai-service/src/main/java/com/lingyue/ai/client/PythonNerClient.java

@@ -0,0 +1,217 @@
+package com.lingyue.ai.client;
+
+import com.lingyue.ai.config.NerClientConfig;
+import com.lingyue.ai.dto.ner.*;
+import com.lingyue.common.exception.ServiceException;
+import lombok.extern.slf4j.Slf4j;
+import org.springframework.http.HttpHeaders;
+import org.springframework.http.MediaType;
+import org.springframework.stereotype.Component;
+import org.springframework.web.reactive.function.client.WebClient;
+import org.springframework.web.reactive.function.client.WebClientResponseException;
+import reactor.util.retry.Retry;
+
+import java.time.Duration;
+import java.util.List;
+
+/**
+ * HTTP client for the Python NER service
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+@Slf4j
+@Component
+public class PythonNerClient {
+
+    private final WebClient webClient;
+    private final NerClientConfig config;
+
+    public PythonNerClient(NerClientConfig config) {
+        this.config = config;
+        this.webClient = WebClient.builder()
+                .baseUrl(config.getUrl())
+                .defaultHeader(HttpHeaders.CONTENT_TYPE, MediaType.APPLICATION_JSON_VALUE)
+                .defaultHeader(HttpHeaders.ACCEPT, MediaType.APPLICATION_JSON_VALUE)
+                .build();
+        log.info("PythonNerClient 初始化完成: url={}, timeout={}ms", config.getUrl(), config.getTimeout());
+    }
+
+    /**
+     * Extract entities
+     *
+     * @param request NER request
+     * @return NER response
+     */
+    public NerResponse extractEntities(NerRequest request) {
+        log.debug("调用 Python NER 服务提取实体: documentId={}", request.getDocumentId());
+        
+        try {
+            NerResponse response = webClient
+                    .post()
+                    .uri("/ner/extract")
+                    .bodyValue(request)
+                    .retrieve()
+                    .bodyToMono(NerResponse.class)
+                    .timeout(Duration.ofMillis(config.getTimeout()))
+                    .retryWhen(Retry.backoff(config.getMaxRetries(), Duration.ofMillis(config.getRetryInterval()))
+                            .filter(this::isRetryable)
+                            .doBeforeRetry(signal -> log.warn("NER 服务调用失败,正在重试: attempt={}", 
+                                    signal.totalRetries() + 1)))
+                    .block();
+
+            if (response == null) {
+                throw new ServiceException("Python NER 服务返回空响应");
+            }
+
+            if (!Boolean.TRUE.equals(response.getSuccess())) {
+                throw new ServiceException("NER 提取失败: " + response.getErrorMessage());
+            }
+
+            log.info("NER 提取成功: documentId={}, entityCount={}, relationCount={}", 
+                    request.getDocumentId(), response.getEntityCount(), response.getRelationCount());
+
+            return response;
+
+        } catch (WebClientResponseException e) {
+            log.error("Python NER 服务调用失败: status={}, body={}", e.getStatusCode(), e.getResponseBodyAsString());
+            throw new ServiceException("NER 服务调用失败: " + e.getMessage(), e);
+        } catch (Exception e) {
+            if (e instanceof ServiceException) {
+                throw e;
+            }
+            log.error("Python NER 服务调用异常: {}", e.getMessage(), e);
+            throw new ServiceException("NER 服务调用异常: " + e.getMessage(), e);
+        }
+    }
+
+    /**
+     * Extract relations
+     *
+     * @param request relation extraction request
+     * @return relation extraction response
+     */
+    public RelationResponse extractRelations(RelationRequest request) {
+        log.debug("调用 Python NER 服务提取关系: documentId={}, entityCount={}", 
+                request.getDocumentId(), 
+                request.getEntities() != null ? request.getEntities().size() : 0);
+        
+        try {
+            RelationResponse response = webClient
+                    .post()
+                    .uri("/ner/relations")
+                    .bodyValue(request)
+                    .retrieve()
+                    .bodyToMono(RelationResponse.class)
+                    .timeout(Duration.ofMillis(config.getTimeout()))
+                    .retryWhen(Retry.backoff(config.getMaxRetries(), Duration.ofMillis(config.getRetryInterval()))
+                            .filter(this::isRetryable)
+                            .doBeforeRetry(signal -> log.warn("关系抽取服务调用失败,正在重试: attempt={}", 
+                                    signal.totalRetries() + 1)))
+                    .block();
+
+            if (response == null) {
+                throw new ServiceException("Python NER 服务返回空响应");
+            }
+
+            if (!Boolean.TRUE.equals(response.getSuccess())) {
+                throw new ServiceException("关系抽取失败: " + response.getErrorMessage());
+            }
+
+            log.info("关系抽取成功: documentId={}, relationCount={}", 
+                    request.getDocumentId(), response.getRelationCount());
+
+            return response;
+
+        } catch (WebClientResponseException e) {
+            log.error("Python NER 关系抽取服务调用失败: status={}, body={}", 
+                    e.getStatusCode(), e.getResponseBodyAsString());
+            throw new ServiceException("关系抽取服务调用失败: " + e.getMessage(), e);
+        } catch (Exception e) {
+            if (e instanceof ServiceException) {
+                throw e;
+            }
+            log.error("Python NER 关系抽取服务调用异常: {}", e.getMessage(), e);
+            throw new ServiceException("关系抽取服务调用异常: " + e.getMessage(), e);
+        }
+    }
+
+    /**
+     * Health check
+     *
+     * @return whether the service is healthy
+     */
+    public boolean healthCheck() {
+        try {
+            String response = webClient
+                    .get()
+                    .uri("/health")
+                    .retrieve()
+                    .bodyToMono(String.class)
+                    .timeout(Duration.ofMillis(config.getConnectTimeout()))
+                    .block();
+            
+            return response != null && response.contains("ok");
+        } catch (Exception e) {
+            log.warn("Python NER 服务健康检查失败: {}", e.getMessage());
+            return false;
+        }
+    }
+
+    /**
+     * Decide whether a failure is worth retrying
+     */
+    private boolean isRetryable(Throwable throwable) {
+        if (throwable instanceof WebClientResponseException) {
+            // 5xx responses are retryable
+            return ((WebClientResponseException) throwable).getStatusCode().is5xxServerError();
+        }
+        // Recent Spring versions wrap I/O failures in WebClientRequestException, so inspect the cause too
+        Throwable cause = throwable.getCause() != null ? throwable.getCause() : throwable;
+        return throwable instanceof java.util.concurrent.TimeoutException ||
+               cause instanceof java.net.ConnectException ||
+               cause instanceof java.net.SocketTimeoutException;
+    }
+
+    /**
+     * Convenience method: extract entities only (no relations)
+     */
+    public List<EntityInfo> extractEntitiesOnly(String documentId, String text) {
+        NerRequest request = NerRequest.builder()
+                .documentId(documentId)
+                .text(text)
+                .extractRelations(false)
+                .build();
+        
+        NerResponse response = extractEntities(request);
+        return response.getEntities();
+    }
+
+    /**
+     * Convenience method: extract entities and relations
+     */
+    public NerResponse extractEntitiesAndRelations(String documentId, String text) {
+        NerRequest request = NerRequest.builder()
+                .documentId(documentId)
+                .text(text)
+                .extractRelations(true)
+                .build();
+        
+        return extractEntities(request);
+    }
+
+    /**
+     * Convenience method: extract entities of the given types
+     */
+    public List<EntityInfo> extractEntitiesByTypes(String documentId, String text, List<String> entityTypes) {
+        NerRequest request = NerRequest.builder()
+                .documentId(documentId)
+                .text(text)
+                .entityTypes(entityTypes)
+                .extractRelations(false)
+                .build();
+        
+        NerResponse response = extractEntities(request);
+        return response.getEntities();
+    }
+}
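
The retry policy in `PythonNerClient` (retry up to `maxRetries` times with a growing delay, but only for 5xx responses, connection failures, and timeouts) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: `HttpStatusError`, `is_retryable`, and `call_with_retry` are hypothetical names, the delay doubles on each attempt without the jitter Reactor applies, and the last error is re-raised as-is instead of being wrapped in a retry-exhausted exception.

```python
import time


class HttpStatusError(Exception):
    """Hypothetical stand-in for an HTTP error response."""

    def __init__(self, status):
        super().__init__(f"HTTP {status}")
        self.status = status


def is_retryable(exc):
    # Server errors are retryable; client errors (4xx) are not.
    if isinstance(exc, HttpStatusError):
        return 500 <= exc.status < 600
    # Connection failures and timeouts are treated as transient.
    return isinstance(exc, (ConnectionError, TimeoutError))


def call_with_retry(func, max_retries=3, retry_interval=1.0, sleep=time.sleep):
    """Call func, retrying transient failures with exponential backoff."""
    attempt = 0
    while True:
        try:
            return func()
        except Exception as exc:
            if attempt >= max_retries or not is_retryable(exc):
                raise
            sleep(retry_interval * (2 ** attempt))  # 1x, 2x, 4x, ...
            attempt += 1
```

Passing `sleep` as a parameter keeps the sketch testable without real delays.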

+ 42 - 0
backend/ai-service/src/main/java/com/lingyue/ai/config/NerClientConfig.java

@@ -0,0 +1,42 @@
+package com.lingyue.ai.config;
+
+import lombok.Data;
+import org.springframework.boot.context.properties.ConfigurationProperties;
+import org.springframework.context.annotation.Configuration;
+
+/**
+ * NER client configuration
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+@Data
+@Configuration
+@ConfigurationProperties(prefix = "ner.python-service")
+public class NerClientConfig {
+
+    /**
+     * Python NER service URL
+     */
+    private String url = "http://localhost:8001";
+
+    /**
+     * Request timeout (milliseconds)
+     */
+    private int timeout = 60000;
+
+    /**
+     * Connection timeout (milliseconds)
+     */
+    private int connectTimeout = 5000;
+
+    /**
+     * Maximum number of retries
+     */
+    private int maxRetries = 3;
+
+    /**
+     * Retry interval (milliseconds)
+     */
+    private int retryInterval = 1000;
+}
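
`NerClientConfig` binds properties under the `ner.python-service` prefix, so a kebab-case key such as `ner.python-service.max-retries` ends up in the `maxRetries` field, and any key that is not set keeps the default declared in the class. The Python sketch below mimics that behaviour in a very reduced form; `NerClientSettings` and `bind` are hypothetical names, and Spring's actual relaxed binding handles many more cases than this.

```python
from dataclasses import dataclass

PREFIX = "ner.python-service."


@dataclass
class NerClientSettings:
    # Defaults copied from NerClientConfig (times in milliseconds).
    url: str = "http://localhost:8001"
    timeout: int = 60000
    connect_timeout: int = 5000
    max_retries: int = 3
    retry_interval: int = 1000


def bind(properties):
    """Overlay matching properties onto the defaults, converting types."""
    settings = NerClientSettings()
    for key, raw in properties.items():
        if not key.startswith(PREFIX):
            continue  # other prefixes belong to other configuration classes
        name = key[len(PREFIX):].replace("-", "_")
        if hasattr(settings, name):
            current = getattr(settings, name)
            setattr(settings, name, type(current)(raw))  # "5" -> 5 for int fields
    return settings


if __name__ == "__main__":
    print(bind({"ner.python-service.max-retries": "5"}))
```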

+ 136 - 0
backend/ai-service/src/main/java/com/lingyue/ai/controller/NerController.java

@@ -0,0 +1,136 @@
+package com.lingyue.ai.controller;
+
+import com.lingyue.ai.dto.ner.*;
+import com.lingyue.ai.service.NerService;
+import com.lingyue.common.domain.AjaxResult;
+import io.swagger.v3.oas.annotations.Operation;
+import io.swagger.v3.oas.annotations.Parameter;
+import io.swagger.v3.oas.annotations.tags.Tag;
+import lombok.RequiredArgsConstructor;
+import lombok.extern.slf4j.Slf4j;
+import org.springframework.web.bind.annotation.*;
+
+import java.util.concurrent.CompletableFuture;
+
+/**
+ * NER (named entity recognition) controller
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+@Slf4j
+@RestController
+@RequestMapping("/api/ner")
+@RequiredArgsConstructor
+@Tag(name = "NER 服务", description = "命名实体识别相关接口")
+public class NerController {
+
+    private final NerService nerService;
+
+    /**
+     * Extract entities from text (synchronous)
+     */
+    @PostMapping("/extract")
+    @Operation(summary = "提取实体", description = "从给定文本中提取命名实体")
+    public AjaxResult extractEntities(@RequestBody NerRequest request) {
+        log.info("开始提取实体: documentId={}, textLength={}", 
+                request.getDocumentId(), 
+                request.getText() != null ? request.getText().length() : 0);
+        
+        try {
+            NerResponse response = nerService.extractEntities(request);
+            log.info("实体提取完成: documentId={}, entityCount={}, relationCount={}", 
+                    request.getDocumentId(), 
+                    response.getEntityCount(), 
+                    response.getRelationCount());
+            return AjaxResult.success(response);
+        } catch (Exception e) {
+            log.error("实体提取失败: documentId={}", request.getDocumentId(), e);
+            return AjaxResult.error("实体提取失败: " + e.getMessage());
+        }
+    }
+
+    /**
+     * Extract entities from text (asynchronous)
+     */
+    @PostMapping("/extract/async")
+    @Operation(summary = "异步提取实体", description = "异步方式从给定文本中提取命名实体")
+    public AjaxResult extractEntitiesAsync(@RequestBody NerRequest request) {
+        log.info("开始异步提取实体: documentId={}", request.getDocumentId());
+        
+        try {
+            CompletableFuture<NerResponse> future = nerService.extractEntitiesAsync(request);
+            // Return an acknowledgement that the task was submitted
+            return AjaxResult.success("NER 任务已提交", "documentId: " + request.getDocumentId());
+        } catch (Exception e) {
+            log.error("异步实体提取提交失败: documentId={}", request.getDocumentId(), e);
+            return AjaxResult.error("异步实体提取提交失败: " + e.getMessage());
+        }
+    }
+
+    /**
+     * Extract relations from already-extracted entities
+     */
+    @PostMapping("/relations")
+    @Operation(summary = "提取关系", description = "从已提取的实体中抽取实体间的关系")
+    public AjaxResult extractRelations(@RequestBody RelationRequest request) {
+        log.info("开始提取关系: documentId={}, entityCount={}", 
+                request.getDocumentId(), 
+                request.getEntities() != null ? request.getEntities().size() : 0);
+        
+        try {
+            RelationResponse response = nerService.extractRelations(request);
+            log.info("关系提取完成: documentId={}, relationCount={}", 
+                    request.getDocumentId(), 
+                    response.getRelationCount());
+            return AjaxResult.success(response);
+        } catch (Exception e) {
+            log.error("关系提取失败: documentId={}", request.getDocumentId(), e);
+            return AjaxResult.error("关系提取失败: " + e.getMessage());
+        }
+    }
+
+    /**
+     * Run NER on a parsed document and save the results
+     */
+    @PostMapping("/document/{documentId}")
+    @Operation(summary = "文档 NER", description = "对已解析的文档执行 NER 并保存结果到图数据库")
+    public AjaxResult extractForDocument(
+            @Parameter(description = "文档ID") @PathVariable String documentId,
+            @Parameter(description = "用户ID") @RequestParam(required = false) String userId) {
+        log.info("开始对文档执行 NER: documentId={}, userId={}", documentId, userId);
+        
+        try {
+            NerResponse response = nerService.extractAndSaveForDocument(documentId, userId);
+            if (Boolean.TRUE.equals(response.getSuccess())) {
+                log.info("文档 NER 完成: documentId={}, entityCount={}, relationCount={}", 
+                        documentId, response.getEntityCount(), response.getRelationCount());
+                return AjaxResult.success(response);
+            } else {
+                return AjaxResult.error(response.getErrorMessage());
+            }
+        } catch (Exception e) {
+            log.error("文档 NER 失败: documentId={}", documentId, e);
+            return AjaxResult.error("文档 NER 失败: " + e.getMessage());
+        }
+    }
+
+    /**
+     * Run NER on a parsed document and save the results (asynchronous)
+     */
+    @PostMapping("/document/{documentId}/async")
+    @Operation(summary = "异步文档 NER", description = "异步对已解析的文档执行 NER 并保存结果")
+    public AjaxResult extractForDocumentAsync(
+            @Parameter(description = "文档ID") @PathVariable String documentId,
+            @Parameter(description = "用户ID") @RequestParam(required = false) String userId) {
+        log.info("开始异步对文档执行 NER: documentId={}, userId={}", documentId, userId);
+        
+        try {
+            CompletableFuture<NerResponse> future = nerService.extractAndSaveForDocumentAsync(documentId, userId);
+            return AjaxResult.success("文档 NER 任务已提交", "documentId: " + documentId);
+        } catch (Exception e) {
+            log.error("异步文档 NER 提交失败: documentId={}", documentId, e);
+            return AjaxResult.error("异步文档 NER 提交失败: " + e.getMessage());
+        }
+    }
+}
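
The two `/async` endpoints in `NerController` start the work and immediately return an acknowledgement; the response carries no extraction result, and the `CompletableFuture` is not handed back to the caller. A minimal Python sketch of that submit-and-acknowledge pattern follows. The acknowledgement fields, `run_ner`, and `submit_ner` are hypothetical; the actual `AjaxResult` shape is not shown in this part of the diff.

```python
from concurrent.futures import ThreadPoolExecutor

# Small worker pool standing in for the service's async executor (assumption).
executor = ThreadPoolExecutor(max_workers=2)


def run_ner(document_id, text):
    """Placeholder for the real extraction; returns a response-like dict."""
    return {"documentId": document_id, "textLength": len(text), "success": True}


def submit_ner(document_id, text):
    """Schedule the work and return an acknowledgement right away."""
    future = executor.submit(run_ner, document_id, text)
    ack = {"msg": "NER task submitted", "data": f"documentId: {document_id}"}
    # The HTTP layer would return only `ack`; the future is exposed here
    # so that the sketch can show the result arriving later.
    return ack, future


if __name__ == "__main__":
    ack, future = submit_ner("doc-001", "sample text")
    print(ack)
    print(future.result(timeout=5))
```

Because the acknowledgement carries only the document ID, a client presumably has to fetch the outcome some other way, for example through the graph endpoints for that document.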

+ 42 - 0
backend/ai-service/src/main/java/com/lingyue/ai/dto/ner/EntityInfo.java

@@ -0,0 +1,42 @@
+package com.lingyue.ai.dto.ner;
+
+import io.swagger.v3.oas.annotations.media.Schema;
+import lombok.AllArgsConstructor;
+import lombok.Builder;
+import lombok.Data;
+import lombok.NoArgsConstructor;
+
+/**
+ * Entity information
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+@Data
+@Builder
+@NoArgsConstructor
+@AllArgsConstructor
+@Schema(description = "Entity information")
+public class EntityInfo {
+
+    @Schema(description = "Entity name")
+    private String name;
+
+    @Schema(description = "Entity type", example = "PERSON/ORG/LOC/DATE/NUMBER/DEVICE/TERM")
+    private String type;
+
+    @Schema(description = "Entity value")
+    private String value;
+
+    @Schema(description = "Position information")
+    private PositionInfo position;
+
+    @Schema(description = "Context snippet")
+    private String context;
+
+    @Schema(description = "Confidence score", example = "0.95")
+    private Double confidence;
+
+    @Schema(description = "Temporary ID, referenced during relation extraction")
+    private String tempId;
+}

+ 40 - 0
backend/ai-service/src/main/java/com/lingyue/ai/dto/ner/NerRequest.java

@@ -0,0 +1,40 @@
+package com.lingyue.ai.dto.ner;
+
+import io.swagger.v3.oas.annotations.media.Schema;
+import lombok.AllArgsConstructor;
+import lombok.Builder;
+import lombok.Data;
+import lombok.NoArgsConstructor;
+
+import java.util.List;
+
+/**
+ * NER request DTO
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+@Data
+@Builder
+@NoArgsConstructor
+@AllArgsConstructor
+@Schema(description = "NER request")
+public class NerRequest {
+
+    @Schema(description = "Document ID")
+    private String documentId;
+
+    @Schema(description = "Text content to extract from", required = true)
+    private String text;
+
+    @Schema(description = "Entity types to extract (optional); when empty, all types are extracted", 
+            example = "[\"PERSON\", \"ORG\", \"LOC\", \"DATE\", \"NUMBER\", \"DEVICE\"]")
+    private List<String> entityTypes;
+
+    @Schema(description = "Whether to extract relations", example = "true")
+    @Builder.Default
+    private Boolean extractRelations = true;
+
+    @Schema(description = "User ID")
+    private String userId;
+}

+ 75 - 0
backend/ai-service/src/main/java/com/lingyue/ai/dto/ner/NerResponse.java

@@ -0,0 +1,75 @@
+package com.lingyue.ai.dto.ner;
+
+import io.swagger.v3.oas.annotations.media.Schema;
+import lombok.AllArgsConstructor;
+import lombok.Builder;
+import lombok.Data;
+import lombok.NoArgsConstructor;
+
+import java.util.List;
+
+/**
+ * NER response DTO
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+@Data
+@Builder
+@NoArgsConstructor
+@AllArgsConstructor
+@Schema(description = "NER response")
+public class NerResponse {
+
+    @Schema(description = "Document ID")
+    private String documentId;
+
+    @Schema(description = "Extracted entities")
+    private List<EntityInfo> entities;
+
+    @Schema(description = "Extracted relations")
+    private List<RelationInfo> relations;
+
+    @Schema(description = "Processing time (milliseconds)")
+    private Long processingTime;
+
+    @Schema(description = "Number of entities")
+    private Integer entityCount;
+
+    @Schema(description = "Number of relations")
+    private Integer relationCount;
+
+    @Schema(description = "Whether the request succeeded")
+    @Builder.Default
+    private Boolean success = true;
+
+    @Schema(description = "Error message")
+    private String errorMessage;
+
+    /**
+     * Create a success response; null lists are counted as 0
+     */
+    public static NerResponse success(String documentId, List<EntityInfo> entities, 
+                                       List<RelationInfo> relations, long processingTime) {
+        return NerResponse.builder()
+                .documentId(documentId)
+                .entities(entities)
+                .relations(relations)
+                .entityCount(entities != null ? entities.size() : 0)
+                .relationCount(relations != null ? relations.size() : 0)
+                .processingTime(processingTime)
+                .success(true)
+                .build();
+    }
+
+    /**
+     * Create a failure response
+     */
+    public static NerResponse error(String documentId, String errorMessage) {
+        return NerResponse.builder()
+                .documentId(documentId)
+                .success(false)
+                .errorMessage(errorMessage)
+                .build();
+    }
+    }
+}
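
Review note: `success` is a boxed `Boolean`, so `if (response.getSuccess())` throws a `NullPointerException` whenever the flag is null. Whether a null can actually reach callers depends on how the response is deserialized, which this diff does not show; the sketch below simply assumes it can. `ResultEnvelopeSketch` and its members are hypothetical names that mirror the shape of `NerResponse` without Lombok.

```java
import java.util.List;

class ResultEnvelopeSketch {
    final Boolean success;      // boxed on purpose, mirroring the DTO's Boolean field
    final String errorMessage;
    final int itemCount;

    ResultEnvelopeSketch(Boolean success, String errorMessage, int itemCount) {
        this.success = success;
        this.errorMessage = errorMessage;
        this.itemCount = itemCount;
    }

    // Success factory: a null list is counted as zero items, as in NerResponse.success.
    static ResultEnvelopeSketch success(List<String> items) {
        return new ResultEnvelopeSketch(Boolean.TRUE, null, items != null ? items.size() : 0);
    }

    // Failure factory: carries only the error message.
    static ResultEnvelopeSketch error(String errorMessage) {
        return new ResultEnvelopeSketch(Boolean.FALSE, errorMessage, 0);
    }

    // Null-safe check: a missing envelope or a null flag is treated as failure, never as an NPE.
    static boolean isSuccess(ResultEnvelopeSketch r) {
        return r != null && Boolean.TRUE.equals(r.success);
    }
}
```

Treating a null flag as failure is a design choice; a caller that prefers to fail loudly could throw a descriptive exception instead.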

+ 36 - 0
backend/ai-service/src/main/java/com/lingyue/ai/dto/ner/PositionInfo.java

@@ -0,0 +1,36 @@
+package com.lingyue.ai.dto.ner;
+
+import io.swagger.v3.oas.annotations.media.Schema;
+import lombok.AllArgsConstructor;
+import lombok.Builder;
+import lombok.Data;
+import lombok.NoArgsConstructor;
+
+/**
+ * Entity position information
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+@Data
+@Builder
+@NoArgsConstructor
+@AllArgsConstructor
+@Schema(description = "Entity position information")
+public class PositionInfo {
+
+    @Schema(description = "Start character offset")
+    private Integer charStart;
+
+    @Schema(description = "End character offset")
+    private Integer charEnd;
+
+    @Schema(description = "Line number")
+    private Integer line;
+
+    @Schema(description = "Page number (if available)")
+    private Integer page;
+
+    @Schema(description = "File ID (if available)")
+    private String fileId;
+}

+ 39 - 0
backend/ai-service/src/main/java/com/lingyue/ai/dto/ner/RelationInfo.java

@@ -0,0 +1,39 @@
+package com.lingyue.ai.dto.ner;
+
+import io.swagger.v3.oas.annotations.media.Schema;
+import lombok.AllArgsConstructor;
+import lombok.Builder;
+import lombok.Data;
+import lombok.NoArgsConstructor;
+
+/**
+ * Relation information
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+@Data
+@Builder
+@NoArgsConstructor
+@AllArgsConstructor
+@Schema(description = "Relation information")
+public class RelationInfo {
+
+    @Schema(description = "Source entity name")
+    private String fromEntity;
+
+    @Schema(description = "Source entity temporary ID")
+    private String fromEntityId;
+
+    @Schema(description = "Target entity name")
+    private String toEntity;
+
+    @Schema(description = "Target entity temporary ID")
+    private String toEntityId;
+
+    // The example lists the raw relation labels as they appear in the data, so it is left untranslated
+    @Schema(description = "Relation type", example = "负责/属于/包含/位于/关联")
+    private String relationType;
+
+    @Schema(description = "Confidence score", example = "0.92")
+    private Double confidence;
+}

+ 32 - 0
backend/ai-service/src/main/java/com/lingyue/ai/dto/ner/RelationRequest.java

@@ -0,0 +1,32 @@
+package com.lingyue.ai.dto.ner;
+
+import io.swagger.v3.oas.annotations.media.Schema;
+import lombok.AllArgsConstructor;
+import lombok.Builder;
+import lombok.Data;
+import lombok.NoArgsConstructor;
+
+import java.util.List;
+
+/**
+ * Relation extraction request DTO
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+@Data
+@Builder
+@NoArgsConstructor
+@AllArgsConstructor
+@Schema(description = "Relation extraction request")
+public class RelationRequest {
+
+    @Schema(description = "Document ID")
+    private String documentId;
+
+    @Schema(description = "Original text content", required = true)
+    private String text;
+
+    @Schema(description = "Entities already extracted from the text", required = true)
+    private List<EntityInfo> entities;
+}

+ 66 - 0
backend/ai-service/src/main/java/com/lingyue/ai/dto/ner/RelationResponse.java

@@ -0,0 +1,66 @@
+package com.lingyue.ai.dto.ner;
+
+import io.swagger.v3.oas.annotations.media.Schema;
+import lombok.AllArgsConstructor;
+import lombok.Builder;
+import lombok.Data;
+import lombok.NoArgsConstructor;
+
+import java.util.List;
+
+/**
+ * Relation extraction response DTO
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+@Data
+@Builder
+@NoArgsConstructor
+@AllArgsConstructor
+@Schema(description = "Relation extraction response")
+public class RelationResponse {
+
+    @Schema(description = "Document ID")
+    private String documentId;
+
+    @Schema(description = "Extracted relations")
+    private List<RelationInfo> relations;
+
+    @Schema(description = "Processing time (milliseconds)")
+    private Long processingTime;
+
+    @Schema(description = "Number of relations")
+    private Integer relationCount;
+
+    @Schema(description = "Whether the request succeeded")
+    @Builder.Default
+    private Boolean success = true;
+
+    @Schema(description = "Error message")
+    private String errorMessage;
+
+    /**
+     * Create a success response; a null list is counted as 0
+     */
+    public static RelationResponse success(String documentId, List<RelationInfo> relations, long processingTime) {
+        return RelationResponse.builder()
+                .documentId(documentId)
+                .relations(relations)
+                .relationCount(relations != null ? relations.size() : 0)
+                .processingTime(processingTime)
+                .success(true)
+                .build();
+    }
+
+    /**
+     * Create a failure response
+     */
+    public static RelationResponse error(String documentId, String errorMessage) {
+        return RelationResponse.builder()
+                .documentId(documentId)
+                .success(false)
+                .errorMessage(errorMessage)
+                .build();
+    }
+    }
+}

+ 62 - 0
backend/ai-service/src/main/java/com/lingyue/ai/enums/EntityType.java

@@ -0,0 +1,62 @@
+package com.lingyue.ai.enums;
+
+import lombok.Getter;
+
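+/**
+ * Entity type enum
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+@Getter
+public enum EntityType {
+
+    // The description strings are display labels and are kept as-is; English glosses follow each one
+    PERSON("person", "人名"),       // person name
+    ORG("org", "机构"),             // organization
+    LOC("loc", "地点"),             // location
+    DATE("date", "日期"),           // date
+    NUMBER("number", "数值"),       // numeric value
+    DEVICE("device", "设备"),       // device
+    TERM("term", "专业术语"),       // technical term
+    PROJECT("project", "项目"),     // project
+    COMPANY("company", "公司"),     // company
+    OTHER("other", "其他");         // other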
+
+    private final String code;
+    private final String description;
+
+    EntityType(String code, String description) {
+        this.code = code;
+        this.description = description;
+    }
+
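+    /**
+     * Look up an enum constant by code or constant name (case-insensitive);
+     * returns OTHER for null or unknown input
+     */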
+    public static EntityType fromCode(String code) {
+        if (code == null) {
+            return OTHER;
+        }
+        for (EntityType type : values()) {
+            if (type.code.equalsIgnoreCase(code) || type.name().equalsIgnoreCase(code)) {
+                return type;
+            }
+        }
+        return OTHER;
+    }
+
+    /**
+     * Check whether the given code or constant name is a known entity type
+     */
+    public static boolean isValid(String code) {
+        if (code == null) {
+            return false;
+        }
+        for (EntityType type : values()) {
+            if (type.code.equalsIgnoreCase(code) || type.name().equalsIgnoreCase(code)) {
+                return true;
+            }
+        }
+        return false;
+    }
+}
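
Review note: `fromCode` and `isValid` each scan `values()` on every call and duplicate the same matching rule. One alternative, sketched here under stated assumptions, is to build a lookup map once. `EntityTypeSketch` is a hypothetical, trimmed-down copy of the enum; it lowercases with `Locale.ROOT` instead of calling `equalsIgnoreCase`, which behaves the same for the ASCII codes used here.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

enum EntityTypeSketch {
    PERSON("person"), ORG("org"), LOC("loc"), OTHER("other");

    // Built once: both the code and the constant name map to the constant.
    private static final Map<String, EntityTypeSketch> BY_KEY = new HashMap<>();

    static {
        for (EntityTypeSketch type : values()) {
            BY_KEY.put(type.code.toLowerCase(Locale.ROOT), type);
            BY_KEY.put(type.name().toLowerCase(Locale.ROOT), type);
        }
    }

    private final String code;

    EntityTypeSketch(String code) {
        this.code = code;
    }

    // Null or unknown input falls back to OTHER, matching the original fromCode.
    static EntityTypeSketch fromCode(String code) {
        if (code == null) {
            return OTHER;
        }
        return BY_KEY.getOrDefault(code.toLowerCase(Locale.ROOT), OTHER);
    }

    // Only known codes or names are valid; "other" itself counts as valid.
    static boolean isValid(String code) {
        return code != null && BY_KEY.containsKey(code.toLowerCase(Locale.ROOT));
    }
}
```

With roughly ten constants the linear scan is not a performance problem; the map mainly keeps the matching rule in one place.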

+ 67 - 0
backend/ai-service/src/main/java/com/lingyue/ai/listener/DocumentParsedEventListener.java

@@ -0,0 +1,67 @@
+package com.lingyue.ai.listener;
+
+import com.lingyue.ai.dto.ner.NerResponse;
+import com.lingyue.ai.service.NerService;
+import com.lingyue.common.event.DocumentParsedEvent;
+import lombok.RequiredArgsConstructor;
+import lombok.extern.slf4j.Slf4j;
+import org.springframework.beans.factory.annotation.Value;
+import org.springframework.context.event.EventListener;
+import org.springframework.scheduling.annotation.Async;
+import org.springframework.stereotype.Component;
+
+/**
+ * Listener for the document-parsed event
+ * Listens for completed document parsing and automatically triggers NER extraction
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+@Slf4j
+@Component
+@RequiredArgsConstructor
+public class DocumentParsedEventListener {
+
+    private final NerService nerService;
+
+    @Value("${ner.auto-extract.enabled:true}")
+    private boolean nerAutoExtractEnabled;
+
+    /**
+     * Handle the document-parsed event
+     * Runs NER extraction asynchronously so the main flow is not blocked
+     */
+    @Async
+    @EventListener
+    public void handleDocumentParsedEvent(DocumentParsedEvent event) {
+        if (!nerAutoExtractEnabled) {
+            log.debug("NER auto-extraction is disabled, skipping: documentId={}", event.getDocumentId());
+            return;
+        }
+
+        String documentId = event.getDocumentId();
+        String userId = event.getUserId();
+
+        log.info("Received document-parsed event, starting NER extraction: documentId={}, userId={}", documentId, userId);
+
+        try {
+            // Call the NER service to extract entities and relations and save them
+            NerResponse response = nerService.extractAndSaveForDocument(documentId, userId);
+
+            // getSuccess() returns a boxed Boolean; compare null-safely to avoid an unboxing NPE
+            if (Boolean.TRUE.equals(response.getSuccess())) {
+                log.info("NER auto-extraction finished: documentId={}, entityCount={}, relationCount={}, time={}ms",
+                        documentId,
+                        response.getEntityCount(),
+                        response.getRelationCount(),
+                        response.getProcessingTime());
+            } else {
+                log.warn("NER auto-extraction failed: documentId={}, error={}",
+                        documentId, response.getErrorMessage());
+            }
+
+        } catch (Exception e) {
+            log.error("NER auto-extraction threw an exception: documentId={}", documentId, e);
+            // Do not rethrow, so other processing is not affected
+        }
+    }
+}

+ 65 - 0
backend/ai-service/src/main/java/com/lingyue/ai/service/NerService.java

@@ -0,0 +1,65 @@
+package com.lingyue.ai.service;
+
+import com.lingyue.ai.dto.ner.*;
+
+import java.util.List;
+import java.util.concurrent.CompletableFuture;
+
+/**
+ * NER (named entity recognition) service interface
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+public interface NerService {
+
+    /**
+     * Extract entities from text (synchronous)
+     *
+     * @param request NER request
+     * @return NER response (entities and relations)
+     */
+    NerResponse extractEntities(NerRequest request);
+
+    /**
+     * Extract entities from text (asynchronous)
+     *
+     * @param request NER request
+     * @return future holding the NER response
+     */
+    CompletableFuture<NerResponse> extractEntitiesAsync(NerRequest request);
+
+    /**
+     * Extract relations among already-extracted entities
+     *
+     * @param request relation extraction request
+     * @return relation extraction response
+     */
+    RelationResponse extractRelations(RelationRequest request);
+
+    /**
+     * Run NER on an already-parsed document and save the results to the graph database
+     *
+     * @param documentId document ID
+     * @param userId     user ID
+     * @return NER response
+     */
+    NerResponse extractAndSaveForDocument(String documentId, String userId);
+
+    /**
+     * Run NER on an already-parsed document and save the results to the graph database (asynchronous)
+     *
+     * @param documentId document ID
+     * @param userId     user ID
+     * @return future holding the NER response
+     */
+    CompletableFuture<NerResponse> extractAndSaveForDocumentAsync(String documentId, String userId);
+
+    /**
+     * Extract entities from a batch of texts
+     *
+     * @param texts list of texts
+     * @return one entity list per input text, in input order
+     */
+    List<List<EntityInfo>> batchExtractEntities(List<String> texts);
+}

+ 342 - 0
backend/ai-service/src/main/java/com/lingyue/ai/service/impl/NerServiceImpl.java

@@ -0,0 +1,342 @@
+package com.lingyue.ai.service.impl;
+
+import com.lingyue.ai.client.PythonNerClient;
+import com.lingyue.ai.dto.ner.*;
+import com.lingyue.ai.service.NerService;
+import com.lingyue.common.exception.ServiceException;
+import com.lingyue.graph.entity.GraphNode;
+import com.lingyue.graph.entity.GraphRelation;
+import com.lingyue.graph.entity.TextStorage;
+import com.lingyue.graph.repository.GraphNodeRepository;
+import com.lingyue.graph.repository.GraphRelationRepository;
+import com.lingyue.graph.repository.TextStorageRepository;
+import lombok.RequiredArgsConstructor;
+import lombok.extern.slf4j.Slf4j;
+import org.springframework.scheduling.annotation.Async;
+import org.springframework.stereotype.Service;
+import org.springframework.transaction.annotation.Transactional;
+
+import java.nio.charset.StandardCharsets;
+import java.nio.file.Files;
+import java.nio.file.Path;
+import java.util.*;
+import java.util.concurrent.CompletableFuture;
+import java.util.stream.Collectors;
+
+/**
+ * NER service implementation
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+@Slf4j
+@Service
+@RequiredArgsConstructor
+public class NerServiceImpl implements NerService {
+
+    private final PythonNerClient pythonNerClient;
+    private final TextStorageRepository textStorageRepository;
+    private final GraphNodeRepository graphNodeRepository;
+    private final GraphRelationRepository graphRelationRepository;
+
+    @Override
+    public NerResponse extractEntities(NerRequest request) {
+        log.info("Starting entity extraction: documentId={}, textLength={}", 
+                request.getDocumentId(), 
+                request.getText() != null ? request.getText().length() : 0);
+        
+        long startTime = System.currentTimeMillis();
+        
+        try {
+            // Validate the request
+            if (request.getText() == null || request.getText().isEmpty()) {
+                return NerResponse.error(request.getDocumentId(), "Text content must not be empty");
+            }
+            
+            // Call the Python NER service
+            NerResponse response = pythonNerClient.extractEntities(request);
+            
+            long processingTime = System.currentTimeMillis() - startTime;
+            log.info("Entity extraction finished: documentId={}, entityCount={}, relationCount={}, time={}ms",
+                    request.getDocumentId(), 
+                    response.getEntityCount(), 
+                    response.getRelationCount(),
+                    processingTime);
+            
+            return response;
+            
+        } catch (Exception e) {
+            log.error("Entity extraction failed: documentId={}", request.getDocumentId(), e);
+            return NerResponse.error(request.getDocumentId(), e.getMessage());
+        }
+    }
+
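+    @Override
+    @Async
+    public CompletableFuture<NerResponse> extractEntitiesAsync(NerRequest request) {
+        // Assuming @EnableAsync is configured, @Async already runs this method on Spring's
+        // task executor; wrapping the call in supplyAsync would hop to the common
+        // ForkJoinPool a second time.
+        return CompletableFuture.completedFuture(extractEntities(request));
+    }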
+
+    @Override
+    public RelationResponse extractRelations(RelationRequest request) {
+        log.info("Starting relation extraction: documentId={}, entityCount={}", 
+                request.getDocumentId(), 
+                request.getEntities() != null ? request.getEntities().size() : 0);
+        
+        try {
+            // Validate the request: a relation needs at least two entities
+            if (request.getEntities() == null || request.getEntities().size() < 2) {
+                return RelationResponse.success(request.getDocumentId(), Collections.emptyList(), 0);
+            }
+            
+            // Call the Python NER service
+            RelationResponse response = pythonNerClient.extractRelations(request);
+            
+            log.info("Relation extraction finished: documentId={}, relationCount={}", 
+                    request.getDocumentId(), response.getRelationCount());
+            
+            return response;
+            
+        } catch (Exception e) {
+            log.error("Relation extraction failed: documentId={}", request.getDocumentId(), e);
+            return RelationResponse.error(request.getDocumentId(), e.getMessage());
+        }
+    }
+
+    @Override
+    @Transactional
+    public NerResponse extractAndSaveForDocument(String documentId, String userId) {
+        log.info("Starting NER and save for document: documentId={}, userId={}", documentId, userId);
+        
+        long startTime = System.currentTimeMillis();
+        
+        try {
+            // 1. Load the document's text content
+            String text = getDocumentText(documentId);
+            if (text == null || text.isEmpty()) {
+                return NerResponse.error(documentId, "Document text content is empty");
+            }
+            
+            // 2. Run NER extraction
+            NerRequest request = NerRequest.builder()
+                    .documentId(documentId)
+                    .text(text)
+                    .userId(userId)
+                    .extractRelations(true)
+                    .build();
+            
+            NerResponse nerResponse = pythonNerClient.extractEntities(request);
+            
+            // getSuccess() returns a boxed Boolean; compare null-safely to avoid an unboxing NPE
+            if (!Boolean.TRUE.equals(nerResponse.getSuccess())) {
+                return nerResponse;
+            }
+            
+            // 3. Save entities to the graph database
+            Map<String, String> tempIdToNodeId = saveEntitiesToGraph(
+                    documentId, userId, nerResponse.getEntities());
+            
+            // 4. Save relations to the graph database
+            if (nerResponse.getRelations() != null && !nerResponse.getRelations().isEmpty()) {
+                saveRelationsToGraph(nerResponse.getRelations(), tempIdToNodeId);
+            }
+            
+            long processingTime = System.currentTimeMillis() - startTime;
+            
+            log.info("Document NER finished and saved: documentId={}, entityCount={}, relationCount={}, time={}ms",
+                    documentId, nerResponse.getEntityCount(), nerResponse.getRelationCount(), processingTime);
+            
+            // Update the response with the end-to-end processing time
+            nerResponse.setProcessingTime(processingTime);
+            return nerResponse;
+            
+        } catch (Exception e) {
+            // NOTE: the exception is swallowed here, so @Transactional does not roll back on its
+            // own; nodes inserted before the failure may still be committed.
+            log.error("Document NER failed: documentId={}", documentId, e);
+            return NerResponse.error(documentId, e.getMessage());
+        }
+    }
+
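+    @Override
+    @Async
+    public CompletableFuture<NerResponse> extractAndSaveForDocumentAsync(String documentId, String userId) {
+        // Assuming @EnableAsync is configured, @Async already moves this call off the caller's
+        // thread, so no extra supplyAsync hop is needed. NOTE: this is a self-invocation, which
+        // bypasses the Spring proxy, so @Transactional on extractAndSaveForDocument is not applied.
+        return CompletableFuture.completedFuture(extractAndSaveForDocument(documentId, userId));
+    }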
+
+    @Override
+    public List<List<EntityInfo>> batchExtractEntities(List<String> texts) {
+        log.info("Batch entity extraction: count={}", texts.size());
+        
+        return texts.stream()
+                .map(text -> {
+                    try {
+                        NerRequest request = NerRequest.builder()
+                                .text(text)
+                                .extractRelations(false)
+                                .build();
+                        NerResponse response = extractEntities(request);
+                        return response.getEntities() != null ? response.getEntities() : Collections.<EntityInfo>emptyList();
+                    } catch (Exception e) {
+                        log.warn("Batch entity extraction failed for one text: {}", e.getMessage());
+                        return Collections.<EntityInfo>emptyList();
+                    }
+                })
+                .collect(Collectors.toList());
+    }
+
+    /**
+     * Load the document's text content from its stored text file
+     */
+    private String getDocumentText(String documentId) {
+        TextStorage textStorage = textStorageRepository.findByDocumentId(documentId);
+        if (textStorage == null) {
+            throw new ServiceException("No text storage record found for document: documentId=" + documentId);
+        }
+        
+        String filePath = textStorage.getFilePath();
+        try {
+            return Files.readString(Path.of(filePath), StandardCharsets.UTF_8);
+        } catch (Exception e) {
+            throw new ServiceException("Failed to read document text: " + e.getMessage(), e);
+        }
+    }
+
+    /**
+     * Save entities to the graph database
+     *
+     * @return mapping from tempId to the actual nodeId
+     */
+    private Map<String, String> saveEntitiesToGraph(String documentId, String userId, 
+                                                     List<EntityInfo> entities) {
+        Map<String, String> tempIdToNodeId = new HashMap<>();
+        
+        if (entities == null || entities.isEmpty()) {
+            return tempIdToNodeId;
+        }
+        
+        for (EntityInfo entity : entities) {
+            GraphNode node = new GraphNode();
+            node.setId(UUID.randomUUID().toString().replace("-", ""));
+            node.setDocumentId(documentId);
+            node.setUserId(userId != null ? userId : "system");
+            node.setName(entity.getName());
+            node.setType(entity.getType() != null ? entity.getType().toLowerCase() : "other");
+            node.setValue(entity.getValue());
+            node.setLevel(0);
+            node.setCreateTime(new Date());
+            node.setUpdateTime(new Date());
+            
+            // Convert position information
+            if (entity.getPosition() != null) {
+                Map<String, Object> position = new HashMap<>();
+                position.put("charStart", entity.getPosition().getCharStart());
+                position.put("charEnd", entity.getPosition().getCharEnd());
+                if (entity.getPosition().getLine() != null) {
+                    position.put("line", entity.getPosition().getLine());
+                }
+                if (entity.getPosition().getPage() != null) {
+                    position.put("page", entity.getPosition().getPage());
+                }
+                node.setPosition(position);
+            }
+            
+            // Store metadata
+            Map<String, Object> metadata = new HashMap<>();
+            if (entity.getContext() != null) {
+                metadata.put("context", entity.getContext());
+            }
+            if (entity.getConfidence() != null) {
+                metadata.put("confidence", entity.getConfidence());
+            }
+            metadata.put("source", "ner");
+            node.setMetadata(metadata);
+            
+            graphNodeRepository.insert(node);
+            
+            // Record the tempId to nodeId mapping
+            if (entity.getTempId() != null) {
+                tempIdToNodeId.put(entity.getTempId(), node.getId());
+            }
+        }
+        
+        log.debug("Finished saving entities to the graph database: count={}", entities.size());
+        return tempIdToNodeId;
+    }
+
+    /**
+     * Save relations to the graph database
+     */
+    private void saveRelationsToGraph(List<RelationInfo> relations, Map<String, String> tempIdToNodeId) {
+        if (relations == null || relations.isEmpty()) {
+            return;
+        }
+        
+        int savedCount = 0;
+        for (RelationInfo relation : relations) {
+            // Resolve the actual nodeId from the tempId
+            String fromNodeId = relation.getFromEntityId() != null ? 
+                    tempIdToNodeId.get(relation.getFromEntityId()) : null;
+            String toNodeId = relation.getToEntityId() != null ? 
+                    tempIdToNodeId.get(relation.getToEntityId()) : null;
+            
+            // Skip the relation if either endpoint node cannot be resolved
+            if (fromNodeId == null || toNodeId == null) {
+                log.debug("Skipping relation (node not found): from={}, to={}", 
+                        relation.getFromEntity(), relation.getToEntity());
+                continue;
+            }
+            
+            GraphRelation graphRelation = new GraphRelation();
+            graphRelation.setId(UUID.randomUUID().toString().replace("-", ""));
+            graphRelation.setFromNodeId(fromNodeId);
+            graphRelation.setToNodeId(toNodeId);
+            graphRelation.setRelationType(mapRelationType(relation.getRelationType()));
+            graphRelation.setOrderIndex(0);
+            graphRelation.setCreateTime(new Date());
+            graphRelation.setUpdateTime(new Date());
+            
+            // Store metadata
+            Map<String, Object> metadata = new HashMap<>();
+            if (relation.getConfidence() != null) {
+                metadata.put("confidence", relation.getConfidence());
+            }
+            metadata.put("originalType", relation.getRelationType());
+            metadata.put("source", "ner");
+            graphRelation.setMetadata(metadata);
+            
+            graphRelationRepository.insert(graphRelation);
+            savedCount++;
+        }
+        
+        log.debug("Finished saving relations to the graph database: count={}", savedCount);
+    }
+
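+    /**
+     * Map an extracted relation type to a system-defined type.
+     * Every branch currently returns "DEP"; the original label is kept in metadata as originalType.
+     */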
+    private String mapRelationType(String originalType) {
+        if (originalType == null) {
+            return "DEP";
+        }
+        
+        // Map common relation types (case labels are the raw Chinese labels in the data)
+        switch (originalType) {
+            case "负责":
+            case "管理":
+            case "承担":
+                return "DEP";  // dependency relation
+            case "属于":
+            case "隶属":
+                return "DEP";
+            case "包含":
+            case "包括":
+                return "DEP";
+            case "位于":
+            case "在":
+                return "DEP";
+            case "使用":
+            case "采用":
+                return "DEP";
+            default:
+                return "DEP";  // default to dependency relation
+        }
+    }
+}
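
Review note: the relation-saving step depends on every `fromEntityId` / `toEntityId` resolving through the `tempId` to `nodeId` map built while saving entities; relations with an unresolved endpoint are skipped. A minimal plain-Java sketch of that resolution rule, under stated assumptions: `RelationResolveSketch` and `resolve` are hypothetical names, and a relation is reduced to a two-element array `{fromTempId, toTempId}` purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class RelationResolveSketch {

    // Returns {fromNodeId, toNodeId} pairs for relations whose endpoints both resolve.
    static List<String[]> resolve(List<String[]> relations, Map<String, String> tempIdToNodeId) {
        List<String[]> resolved = new ArrayList<>();
        for (String[] relation : relations) {
            String fromNodeId = relation[0] != null ? tempIdToNodeId.get(relation[0]) : null;
            String toNodeId = relation[1] != null ? tempIdToNodeId.get(relation[1]) : null;

            // Same rule as saveRelationsToGraph: drop the relation if either node is unknown.
            if (fromNodeId == null || toNodeId == null) {
                continue;
            }
            resolved.add(new String[] {fromNodeId, toNodeId});
        }
        return resolved;
    }
}
```

One consequence worth noting: an entity that arrives without a `tempId` is still saved as a node, but no relation can ever point at it.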

+ 44 - 0
backend/common/src/main/java/com/lingyue/common/event/DocumentParsedEvent.java

@@ -0,0 +1,44 @@
+package com.lingyue.common.event;
+
+import lombok.Getter;
+import org.springframework.context.ApplicationEvent;
+
+/**
+ * Document-parsed event
+ * Notifies other modules that a document has been parsed and is ready for follow-up processing (e.g. NER, vectorization)
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+@Getter
+public class DocumentParsedEvent extends ApplicationEvent {
+
+    /**
+     * Document ID
+     */
+    private final String documentId;
+
+    /**
+     * User ID
+     */
+    private final String userId;
+
+    /**
+     * Path of the text file
+     */
+    private final String textFilePath;
+
+    /**
+     * Text storage ID
+     */
+    private final String textStorageId;
+
+    public DocumentParsedEvent(Object source, String documentId, String userId, 
+                                String textFilePath, String textStorageId) {
+        super(source);
+        this.documentId = documentId;
+        this.userId = userId;
+        this.textFilePath = textFilePath;
+        this.textStorageId = textStorageId;
+    }
+}

+ 230 - 9
backend/graph-service/src/main/java/com/lingyue/graph/controller/GraphController.java

@@ -1,44 +1,265 @@
 package com.lingyue.graph.controller;
 
+import com.lingyue.graph.dto.*;
+import com.lingyue.graph.entity.GraphNode;
+import com.lingyue.graph.entity.GraphRelation;
+import com.lingyue.graph.service.GraphNodeService;
 import com.lingyue.graph.service.GraphService;
 import com.lingyue.common.domain.AjaxResult;
+import io.swagger.v3.oas.annotations.Operation;
+import io.swagger.v3.oas.annotations.Parameter;
+import io.swagger.v3.oas.annotations.tags.Tag;
 import lombok.RequiredArgsConstructor;
+import lombok.extern.slf4j.Slf4j;
 import org.springframework.web.bind.annotation.*;
 
+import java.util.List;
+import java.util.Map;
+
 /**
- * Relation network controller (basic skeleton)
+ * Graph database controller
+ * Provides CRUD operations for graph nodes and relations
+ *
+ * @author lingyue
+ * @since 2026-01-19
  */
+@Slf4j
 @RestController
-@RequestMapping("/graphs")
+@RequestMapping("/api/graph")
 @RequiredArgsConstructor
+@Tag(name = "Graph database", description = "Graph node and relation management API")
 public class GraphController {
     
     private final GraphService graphService;
-    
+    private final GraphNodeService graphNodeService;
+
+    // ==================== Node operations ====================
+
+    /**
+     * Create a node
+     */
+    @PostMapping("/nodes")
+    @Operation(summary = "Create node", description = "Create a single graph node")
+    public AjaxResult<GraphNode> createNode(@RequestBody CreateNodeRequest request) {
+        try {
+            GraphNode node = graphNodeService.createNode(request);
+            return AjaxResult.success(node);
+        } catch (Exception e) {
+            log.error("Failed to create node: {}", e.getMessage(), e);
+            return AjaxResult.error("Failed to create node: " + e.getMessage());
+        }
+    }
+
+    /**
+     * Create nodes in batch
+     */
+    @PostMapping("/nodes/batch")
+    @Operation(summary = "Batch create nodes", description = "Create graph nodes in batch")
+    public AjaxResult<List<GraphNode>> batchCreateNodes(@RequestBody BatchCreateNodesRequest request) {
+        try {
+            List<GraphNode> nodes = graphNodeService.batchCreateNodes(request);
+            return AjaxResult.success(nodes);
+        } catch (Exception e) {
+            log.error("Failed to batch create nodes: {}", e.getMessage(), e);
+            return AjaxResult.error("Failed to batch create nodes: " + e.getMessage());
+        }
+    }
+
+    /**
+     * Get a node
+     */
+    @GetMapping("/nodes/{nodeId}")
+    @Operation(summary = "Get node", description = "Get node details by ID")
+    public AjaxResult<GraphNode> getNode(
+            @Parameter(description = "Node ID") @PathVariable String nodeId) {
+        GraphNode node = graphNodeService.getNodeById(nodeId);
+        if (node == null) {
+            return AjaxResult.error("Node not found: " + nodeId);
+        }
+        return AjaxResult.success(node);
+    }
+
+    /**
+     * Get all nodes of a document
+     */
+    @GetMapping("/documents/{documentId}/nodes")
+    @Operation(summary = "Get document nodes", description = "Get all nodes of the given document")
+    public AjaxResult<List<GraphNode>> getNodesByDocument(
+            @Parameter(description = "Document ID") @PathVariable String documentId,
+            @Parameter(description = "Node type (optional)") @RequestParam(required = false) String type) {
+        List<GraphNode> nodes;
+        if (type != null && !type.isEmpty()) {
+            nodes = graphNodeService.getNodesByDocumentIdAndType(documentId, type);
+        } else {
+            nodes = graphNodeService.getNodesByDocumentId(documentId);
+        }
+        return AjaxResult.success(nodes);
+    }
+
+    /**
+     * Update a node
+     */
+    @PutMapping("/nodes/{nodeId}")
+    @Operation(summary = "Update node", description = "Update node information")
+    public AjaxResult<GraphNode> updateNode(
+            @Parameter(description = "Node ID") @PathVariable String nodeId,
+            @RequestBody CreateNodeRequest request) {
+        try {
+            GraphNode node = graphNodeService.updateNode(nodeId, request);
+            return AjaxResult.success(node);
+        } catch (Exception e) {
+            log.error("Failed to update node: nodeId={}, error={}", nodeId, e.getMessage(), e);
+            return AjaxResult.error("Failed to update node: " + e.getMessage());
+        }
+    }
+
+    /**
+     * Delete a node
+     */
+    @DeleteMapping("/nodes/{nodeId}")
+    @Operation(summary = "Delete node", description = "Delete a node and its related relations")
+    public AjaxResult<Void> deleteNode(
+            @Parameter(description = "Node ID") @PathVariable String nodeId) {
+        try {
+            graphNodeService.deleteNode(nodeId);
+            return AjaxResult.success();
+        } catch (Exception e) {
+            log.error("Failed to delete node: nodeId={}, error={}", nodeId, e.getMessage(), e);
+            return AjaxResult.error("Failed to delete node: " + e.getMessage());
+        }
+    }
+
+    /**
+     * Delete all nodes of a document
+     */
+    @DeleteMapping("/documents/{documentId}/nodes")
+    @Operation(summary = "Delete document nodes", description = "Delete all nodes and relations of the given document")
+    public AjaxResult<Integer> deleteNodesByDocument(
+            @Parameter(description = "Document ID") @PathVariable String documentId) {
+        try {
+            int count = graphNodeService.deleteNodesByDocumentId(documentId);
+            return AjaxResult.success("Deleted successfully", count);
+        } catch (Exception e) {
+            log.error("Failed to delete document nodes: documentId={}, error={}", documentId, e.getMessage(), e);
+            return AjaxResult.error("Failed to delete document nodes: " + e.getMessage());
+        }
+    }
+
+    // ==================== 关系操作 ====================
+
+    /**
+     * 创建关系
+     */
+    @PostMapping("/relations")
+    @Operation(summary = "创建关系", description = "创建节点间的关系")
+    public AjaxResult<GraphRelation> createRelation(@RequestBody CreateRelationRequest request) {
+        try {
+            GraphRelation relation = graphNodeService.createRelation(request);
+            return AjaxResult.success(relation);
+        } catch (Exception e) {
+            log.error("创建关系失败: {}", e.getMessage(), e);
+            return AjaxResult.error("创建关系失败: " + e.getMessage());
+        }
+    }
+
+    /**
+     * 批量创建关系
+     */
+    @PostMapping("/relations/batch")
+    @Operation(summary = "批量创建关系", description = "批量创建节点间的关系")
+    public AjaxResult<List<GraphRelation>> batchCreateRelations(@RequestBody BatchCreateRelationsRequest request) {
+        try {
+            List<GraphRelation> relations = graphNodeService.batchCreateRelations(request);
+            return AjaxResult.success(relations);
+        } catch (Exception e) {
+            log.error("批量创建关系失败: {}", e.getMessage(), e);
+            return AjaxResult.error("批量创建关系失败: " + e.getMessage());
+        }
+    }
+
+    /**
+     * 获取关系
+     */
+    @GetMapping("/relations/{relationId}")
+    @Operation(summary = "获取关系", description = "根据ID获取关系详情")
+    public AjaxResult<GraphRelation> getRelation(
+            @Parameter(description = "关系ID") @PathVariable String relationId) {
+        GraphRelation relation = graphNodeService.getRelationById(relationId);
+        if (relation == null) {
+            return AjaxResult.error("关系不存在: " + relationId);
+        }
+        return AjaxResult.success(relation);
+    }
+
+    /**
+     * 获取节点的所有关系
+     */
+    @GetMapping("/nodes/{nodeId}/relations")
+    @Operation(summary = "获取节点关系", description = "获取指定节点的所有关系")
+    public AjaxResult<List<GraphRelation>> getRelationsByNode(
+            @Parameter(description = "节点ID") @PathVariable String nodeId) {
+        List<GraphRelation> relations = graphNodeService.getRelationsByNodeId(nodeId);
+        return AjaxResult.success(relations);
+    }
+
+    /**
+     * 删除关系
+     */
+    @DeleteMapping("/relations/{relationId}")
+    @Operation(summary = "删除关系", description = "删除指定关系")
+    public AjaxResult<Void> deleteRelation(
+            @Parameter(description = "关系ID") @PathVariable String relationId) {
+        try {
+            graphNodeService.deleteRelation(relationId);
+            return AjaxResult.success();
+        } catch (Exception e) {
+            log.error("删除关系失败: relationId={}, error={}", relationId, e.getMessage(), e);
+            return AjaxResult.error("删除关系失败: " + e.getMessage());
+        }
+    }
+
+    // ==================== 统计操作 ====================
+
+    /**
+     * 获取文档的图统计信息
+     */
+    @GetMapping("/documents/{documentId}/stats")
+    @Operation(summary = "获取图统计", description = "获取指定文档的节点和关系统计信息")
+    public AjaxResult<Map<String, Object>> getGraphStats(
+            @Parameter(description = "文档ID") @PathVariable String documentId) {
+        Map<String, Object> nodeStats = graphNodeService.getNodeStatsByDocumentId(documentId);
+        Map<String, Object> relationStats = graphNodeService.getRelationStatsByDocumentId(documentId);
+        
+        nodeStats.put("relations", relationStats);
+        return AjaxResult.success(nodeStats);
+    }
+
+    // ==================== 旧接口兼容 ====================
+
     /**
      * 创建关系网络(待实现)
      */
-    @PostMapping
+    @PostMapping("/legacy")
+    @Operation(summary = "创建关系网络(旧接口)", description = "待实现")
     public AjaxResult<?> createGraph() {
-        // TODO: 实现关系网络创建
         return AjaxResult.success("关系网络创建接口待实现");
     }
     
     /**
      * 获取关系网络(待实现)
      */
-    @GetMapping("/{graphId}")
+    @GetMapping("/legacy/{graphId}")
+    @Operation(summary = "获取关系网络(旧接口)", description = "待实现")
     public AjaxResult<?> getGraph(@PathVariable String graphId) {
-        // TODO: 实现关系网络查询
         return AjaxResult.success("关系网络查询接口待实现");
     }
     
     /**
      * 计算关系网络(待实现)
      */
-    @PostMapping("/{graphId}/calculate")
+    @PostMapping("/legacy/{graphId}/calculate")
+    @Operation(summary = "计算关系网络(旧接口)", description = "待实现")
     public AjaxResult<?> calculateGraph(@PathVariable String graphId) {
-        // TODO: 实现关系网络计算
         return AjaxResult.success("关系网络计算接口待实现");
     }
 }

+ 36 - 0
backend/graph-service/src/main/java/com/lingyue/graph/dto/BatchCreateNodesRequest.java

@@ -0,0 +1,36 @@
+package com.lingyue.graph.dto;
+
+import io.swagger.v3.oas.annotations.media.Schema;
+import lombok.AllArgsConstructor;
+import lombok.Builder;
+import lombok.Data;
+import lombok.NoArgsConstructor;
+
+import java.util.List;
+
+/**
+ * 批量创建图节点请求 DTO
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+@Data
+@Builder
+@NoArgsConstructor
+@AllArgsConstructor
+@Schema(description = "批量创建图节点请求")
+public class BatchCreateNodesRequest {
+
+    @Schema(description = "文档ID", required = true)
+    private String documentId;
+
+    @Schema(description = "用户ID")
+    private String userId;
+
+    @Schema(description = "节点列表", required = true)
+    private List<CreateNodeRequest> nodes;
+
+    @Schema(description = "是否删除已有节点后再创建", example = "false")
+    @Builder.Default
+    private Boolean replaceExisting = false;
+}

+ 30 - 0
backend/graph-service/src/main/java/com/lingyue/graph/dto/BatchCreateRelationsRequest.java

@@ -0,0 +1,30 @@
+package com.lingyue.graph.dto;
+
+import io.swagger.v3.oas.annotations.media.Schema;
+import lombok.AllArgsConstructor;
+import lombok.Builder;
+import lombok.Data;
+import lombok.NoArgsConstructor;
+
+import java.util.List;
+
+/**
+ * 批量创建图关系请求 DTO
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+@Data
+@Builder
+@NoArgsConstructor
+@AllArgsConstructor
+@Schema(description = "批量创建图关系请求")
+public class BatchCreateRelationsRequest {
+
+    @Schema(description = "关系列表", required = true)
+    private List<CreateRelationRequest> relations;
+
+    @Schema(description = "是否跳过不存在的节点", example = "true")
+    @Builder.Default
+    private Boolean skipInvalidNodes = true;
+}
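
The `skipInvalidNodes` flag above defaults to `true`. The service code in this commit checks it with `Boolean.TRUE.equals(...)`, so an explicit `null` behaves like `false`. The following is a minimal sketch of the two behaviours, not the service implementation: `SkipInvalidSketch`, `createAll`, and the two-element edge arrays are hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: each edge is a two-element array {fromNodeId, toNodeId}.
class SkipInvalidSketch {

    // Keeps edges whose endpoints both exist. An invalid edge is either
    // skipped (skipInvalidNodes == true) or aborts the whole batch.
    static List<String[]> createAll(List<String[]> edges,
                                    Set<String> existingNodeIds,
                                    boolean skipInvalidNodes) {
        List<String[]> created = new ArrayList<>();
        for (String[] edge : edges) {
            boolean valid = existingNodeIds.contains(edge[0])
                    && existingNodeIds.contains(edge[1]);
            if (valid) {
                created.add(edge);
            } else if (!skipInvalidNodes) {
                throw new IllegalArgumentException(
                        "node does not exist for edge " + edge[0] + " -> " + edge[1]);
            }
            // Otherwise the invalid edge is dropped and the loop continues.
        }
        return created;
    }
}
```

With `skipInvalidNodes` set to `false`, the first invalid edge fails the batch, which in a transactional method would normally roll back the relations created so far.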

+ 50 - 0
backend/graph-service/src/main/java/com/lingyue/graph/dto/CreateNodeRequest.java

@@ -0,0 +1,50 @@
+package com.lingyue.graph.dto;
+
+import io.swagger.v3.oas.annotations.media.Schema;
+import lombok.AllArgsConstructor;
+import lombok.Builder;
+import lombok.Data;
+import lombok.NoArgsConstructor;
+
+import java.util.Map;
+
+/**
+ * 创建图节点请求 DTO
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+@Data
+@Builder
+@NoArgsConstructor
+@AllArgsConstructor
+@Schema(description = "创建图节点请求")
+public class CreateNodeRequest {
+
+    @Schema(description = "文档ID", required = true)
+    private String documentId;
+
+    @Schema(description = "用户ID")
+    private String userId;
+
+    @Schema(description = "节点名称", required = true)
+    private String name;
+
+    @Schema(description = "节点类型", example = "person/org/loc/date/number/device", required = true)
+    private String type;
+
+    @Schema(description = "节点值")
+    private String value;
+
+    @Schema(description = "位置信息")
+    private Map<String, Object> position;
+
+    @Schema(description = "父节点ID")
+    private String parentId;
+
+    @Schema(description = "层级")
+    private Integer level;
+
+    @Schema(description = "元数据")
+    private Map<String, Object> metadata;
+}

+ 47 - 0
backend/graph-service/src/main/java/com/lingyue/graph/dto/CreateRelationRequest.java

@@ -0,0 +1,47 @@
+package com.lingyue.graph.dto;
+
+import io.swagger.v3.oas.annotations.media.Schema;
+import lombok.AllArgsConstructor;
+import lombok.Builder;
+import lombok.Data;
+import lombok.NoArgsConstructor;
+
+import java.util.Map;
+
+/**
+ * 创建图关系请求 DTO
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+@Data
+@Builder
+@NoArgsConstructor
+@AllArgsConstructor
+@Schema(description = "创建图关系请求")
+public class CreateRelationRequest {
+
+    @Schema(description = "源节点ID", required = true)
+    private String fromNodeId;
+
+    @Schema(description = "目标节点ID", required = true)
+    private String toNodeId;
+
+    @Schema(description = "关系类型", example = "DEP/ADD/SUB/MUL/DIV/UNION/INTERSECT/AI", required = true)
+    private String relationType;
+
+    @Schema(description = "动作类型")
+    private String actionType;
+
+    @Schema(description = "动作配置")
+    private Map<String, Object> actionConfig;
+
+    @Schema(description = "顺序索引")
+    private Integer orderIndex;
+
+    @Schema(description = "条件表达式")
+    private String conditionExpr;
+
+    @Schema(description = "元数据")
+    private Map<String, Object> metadata;
+}

+ 365 - 0
backend/graph-service/src/main/java/com/lingyue/graph/service/GraphNodeService.java

@@ -0,0 +1,365 @@
+package com.lingyue.graph.service;
+
+import com.baomidou.mybatisplus.core.conditions.query.LambdaQueryWrapper;
+import com.lingyue.common.exception.ServiceException;
+import com.lingyue.graph.dto.BatchCreateNodesRequest;
+import com.lingyue.graph.dto.BatchCreateRelationsRequest;
+import com.lingyue.graph.dto.CreateNodeRequest;
+import com.lingyue.graph.dto.CreateRelationRequest;
+import com.lingyue.graph.entity.GraphNode;
+import com.lingyue.graph.entity.GraphRelation;
+import com.lingyue.graph.repository.GraphNodeRepository;
+import com.lingyue.graph.repository.GraphRelationRepository;
+import lombok.RequiredArgsConstructor;
+import lombok.extern.slf4j.Slf4j;
+import org.springframework.stereotype.Service;
+import org.springframework.transaction.annotation.Transactional;
+
+import java.util.*;
+
+/**
+ * 图节点服务
+ * 提供图节点和关系的 CRUD 及批量操作
+ *
+ * @author lingyue
+ * @since 2026-01-19
+ */
+@Slf4j
+@Service
+@RequiredArgsConstructor
+public class GraphNodeService {
+
+    private final GraphNodeRepository graphNodeRepository;
+    private final GraphRelationRepository graphRelationRepository;
+
+    // ==================== 节点操作 ====================
+
+    /**
+     * 创建单个节点
+     */
+    public GraphNode createNode(CreateNodeRequest request) {
+        GraphNode node = new GraphNode();
+        node.setId(UUID.randomUUID().toString().replace("-", ""));
+        node.setDocumentId(request.getDocumentId());
+        node.setUserId(request.getUserId() != null ? request.getUserId() : "system");
+        node.setName(request.getName());
+        node.setType(request.getType());
+        node.setValue(request.getValue());
+        node.setPosition(request.getPosition());
+        node.setParentId(request.getParentId());
+        node.setLevel(request.getLevel() != null ? request.getLevel() : 0);
+        node.setMetadata(request.getMetadata());
+        node.setCreateTime(new Date());
+        node.setUpdateTime(new Date());
+
+        graphNodeRepository.insert(node);
+        log.debug("创建节点: id={}, name={}, type={}", node.getId(), node.getName(), node.getType());
+
+        return node;
+    }
+
+    /**
+     * 批量创建节点
+     *
+     * @return 创建的节点列表
+     */
+    @Transactional
+    public List<GraphNode> batchCreateNodes(BatchCreateNodesRequest request) {
+        if (request.getNodes() == null || request.getNodes().isEmpty()) {
+            return Collections.emptyList();
+        }
+
+        String documentId = request.getDocumentId();
+        String userId = request.getUserId();
+
+        // 如果需要替换已有节点,先删除
+        if (Boolean.TRUE.equals(request.getReplaceExisting()) && documentId != null) {
+            deleteNodesByDocumentId(documentId);
+        }
+
+        List<GraphNode> createdNodes = new ArrayList<>();
+        for (CreateNodeRequest nodeRequest : request.getNodes()) {
+            // 设置文档ID和用户ID
+            if (nodeRequest.getDocumentId() == null) {
+                nodeRequest.setDocumentId(documentId);
+            }
+            if (nodeRequest.getUserId() == null) {
+                nodeRequest.setUserId(userId);
+            }
+
+            GraphNode node = createNode(nodeRequest);
+            createdNodes.add(node);
+        }
+
+        log.info("批量创建节点完成: documentId={}, count={}", documentId, createdNodes.size());
+        return createdNodes;
+    }
+
+    /**
+     * 根据ID获取节点
+     */
+    public GraphNode getNodeById(String nodeId) {
+        return graphNodeRepository.selectById(nodeId);
+    }
+
+    /**
+     * 根据文档ID获取所有节点
+     */
+    public List<GraphNode> getNodesByDocumentId(String documentId) {
+        return graphNodeRepository.findByDocumentId(documentId);
+    }
+
+    /**
+     * 根据文档ID和类型获取节点
+     */
+    public List<GraphNode> getNodesByDocumentIdAndType(String documentId, String type) {
+        return graphNodeRepository.findByDocumentIdAndType(documentId, type);
+    }
+
+    /**
+     * 根据用户ID获取节点
+     */
+    public List<GraphNode> getNodesByUserId(String userId) {
+        return graphNodeRepository.findByUserId(userId);
+    }
+
+    /**
+     * 更新节点
+     */
+    public GraphNode updateNode(String nodeId, CreateNodeRequest request) {
+        GraphNode node = graphNodeRepository.selectById(nodeId);
+        if (node == null) {
+            throw new ServiceException("节点不存在: " + nodeId);
+        }
+
+        if (request.getName() != null) {
+            node.setName(request.getName());
+        }
+        if (request.getType() != null) {
+            node.setType(request.getType());
+        }
+        if (request.getValue() != null) {
+            node.setValue(request.getValue());
+        }
+        if (request.getPosition() != null) {
+            node.setPosition(request.getPosition());
+        }
+        if (request.getParentId() != null) {
+            node.setParentId(request.getParentId());
+        }
+        if (request.getLevel() != null) {
+            node.setLevel(request.getLevel());
+        }
+        if (request.getMetadata() != null) {
+            node.setMetadata(request.getMetadata());
+        }
+        node.setUpdateTime(new Date());
+
+        graphNodeRepository.updateById(node);
+        log.debug("更新节点: id={}", nodeId);
+
+        return node;
+    }
+
+    /**
+     * 删除节点(同时删除相关关系)
+     */
+    @Transactional
+    public void deleteNode(String nodeId) {
+        // 先删除相关关系
+        deleteRelationsByNodeId(nodeId);
+        // 再删除节点
+        graphNodeRepository.deleteById(nodeId);
+        log.debug("删除节点: id={}", nodeId);
+    }
+
+    /**
+     * 根据文档ID删除所有节点(同时删除相关关系)
+     */
+    @Transactional
+    public int deleteNodesByDocumentId(String documentId) {
+        // 获取所有节点ID
+        List<GraphNode> nodes = graphNodeRepository.findByDocumentId(documentId);
+        if (nodes.isEmpty()) {
+            return 0;
+        }
+
+        // 删除所有相关关系
+        for (GraphNode node : nodes) {
+            deleteRelationsByNodeId(node.getId());
+        }
+
+        // 删除节点
+        LambdaQueryWrapper<GraphNode> wrapper = new LambdaQueryWrapper<>();
+        wrapper.eq(GraphNode::getDocumentId, documentId);
+        int count = graphNodeRepository.delete(wrapper);
+
+        log.info("删除文档节点: documentId={}, count={}", documentId, count);
+        return count;
+    }
+
+    // ==================== 关系操作 ====================
+
+    /**
+     * 创建单个关系
+     */
+    public GraphRelation createRelation(CreateRelationRequest request) {
+        // 验证节点存在
+        if (graphNodeRepository.selectById(request.getFromNodeId()) == null) {
+            throw new ServiceException("源节点不存在: " + request.getFromNodeId());
+        }
+        if (graphNodeRepository.selectById(request.getToNodeId()) == null) {
+            throw new ServiceException("目标节点不存在: " + request.getToNodeId());
+        }
+
+        GraphRelation relation = new GraphRelation();
+        relation.setId(UUID.randomUUID().toString().replace("-", ""));
+        relation.setFromNodeId(request.getFromNodeId());
+        relation.setToNodeId(request.getToNodeId());
+        relation.setRelationType(request.getRelationType());
+        relation.setActionType(request.getActionType());
+        relation.setActionConfig(request.getActionConfig());
+        relation.setOrderIndex(request.getOrderIndex() != null ? request.getOrderIndex() : 0);
+        relation.setConditionExpr(request.getConditionExpr());
+        relation.setMetadata(request.getMetadata());
+        relation.setCreateTime(new Date());
+        relation.setUpdateTime(new Date());
+
+        graphRelationRepository.insert(relation);
+        log.debug("创建关系: id={}, from={}, to={}, type={}",
+                relation.getId(), relation.getFromNodeId(), relation.getToNodeId(), relation.getRelationType());
+
+        return relation;
+    }
+
+    /**
+     * 批量创建关系
+     */
+    @Transactional
+    public List<GraphRelation> batchCreateRelations(BatchCreateRelationsRequest request) {
+        if (request.getRelations() == null || request.getRelations().isEmpty()) {
+            return Collections.emptyList();
+        }
+
+        List<GraphRelation> createdRelations = new ArrayList<>();
+        boolean skipInvalid = Boolean.TRUE.equals(request.getSkipInvalidNodes());
+
+        for (CreateRelationRequest relationRequest : request.getRelations()) {
+            try {
+                GraphRelation relation = createRelation(relationRequest);
+                createdRelations.add(relation);
+            } catch (ServiceException e) {
+                if (skipInvalid) {
+                    log.warn("跳过无效关系: from={}, to={}, error={}",
+                            relationRequest.getFromNodeId(), relationRequest.getToNodeId(), e.getMessage());
+                } else {
+                    throw e;
+                }
+            }
+        }
+
+        log.info("批量创建关系完成: count={}", createdRelations.size());
+        return createdRelations;
+    }
+
+    /**
+     * 根据ID获取关系
+     */
+    public GraphRelation getRelationById(String relationId) {
+        return graphRelationRepository.selectById(relationId);
+    }
+
+    /**
+     * 根据源节点ID获取关系
+     */
+    public List<GraphRelation> getRelationsByFromNodeId(String fromNodeId) {
+        return graphRelationRepository.findByFromNodeId(fromNodeId);
+    }
+
+    /**
+     * 根据目标节点ID获取关系
+     */
+    public List<GraphRelation> getRelationsByToNodeId(String toNodeId) {
+        return graphRelationRepository.findByToNodeId(toNodeId);
+    }
+
+    /**
+     * 根据节点ID获取所有相关关系
+     */
+    public List<GraphRelation> getRelationsByNodeId(String nodeId) {
+        return graphRelationRepository.findByNodeId(nodeId);
+    }
+
+    /**
+     * 删除关系
+     */
+    public void deleteRelation(String relationId) {
+        graphRelationRepository.deleteById(relationId);
+        log.debug("删除关系: id={}", relationId);
+    }
+
+    /**
+     * 删除节点相关的所有关系
+     */
+    public int deleteRelationsByNodeId(String nodeId) {
+        LambdaQueryWrapper<GraphRelation> wrapper = new LambdaQueryWrapper<>();
+        wrapper.eq(GraphRelation::getFromNodeId, nodeId)
+                .or()
+                .eq(GraphRelation::getToNodeId, nodeId);
+        int count = graphRelationRepository.delete(wrapper);
+        log.debug("删除节点相关关系: nodeId={}, count={}", nodeId, count);
+        return count;
+    }
+
+    // ==================== 统计操作 ====================
+
+    /**
+     * 获取文档的节点统计
+     */
+    public Map<String, Object> getNodeStatsByDocumentId(String documentId) {
+        List<GraphNode> nodes = graphNodeRepository.findByDocumentId(documentId);
+
+        Map<String, Long> typeCount = new HashMap<>();
+        for (GraphNode node : nodes) {
+            String type = node.getType() != null ? node.getType() : "other";
+            typeCount.put(type, typeCount.getOrDefault(type, 0L) + 1);
+        }
+
+        Map<String, Object> stats = new HashMap<>();
+        stats.put("totalNodes", nodes.size());
+        stats.put("typeDistribution", typeCount);
+
+        return stats;
+    }
+
+    /**
+     * 获取文档的关系统计
+     */
+    public Map<String, Object> getRelationStatsByDocumentId(String documentId) {
+        List<GraphNode> nodes = graphNodeRepository.findByDocumentId(documentId);
+        Set<String> nodeIds = new HashSet<>();
+        for (GraphNode node : nodes) {
+            nodeIds.add(node.getId());
+        }
+
+        int relationCount = 0;
+        Map<String, Long> typeCount = new HashMap<>();
+
+        for (String nodeId : nodeIds) {
+            List<GraphRelation> relations = graphRelationRepository.findByFromNodeId(nodeId);
+            for (GraphRelation relation : relations) {
+                if (nodeIds.contains(relation.getToNodeId())) {
+                    relationCount++;
+                    String type = relation.getRelationType() != null ? relation.getRelationType() : "OTHER";
+                    typeCount.put(type, typeCount.getOrDefault(type, 0L) + 1);
+                }
+            }
+        }
+
+        Map<String, Object> stats = new HashMap<>();
+        stats.put("totalRelations", relationCount);
+        stats.put("typeDistribution", typeCount);
+
+        return stats;
+    }
+}
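
`getRelationStatsByDocumentId` counts a relation only when both its source and target nodes belong to the document, and it issues one `findByFromNodeId` query per node. The sketch below shows the same counting rule in memory, assuming the relations have already been loaded in a single pass. `Relation` and `RelationStatsSketch` are hypothetical stand-ins, not classes from this commit.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for GraphRelation with only the fields the statistics need.
final class Relation {
    final String fromNodeId;
    final String toNodeId;
    final String relationType;

    Relation(String fromNodeId, String toNodeId, String relationType) {
        this.fromNodeId = fromNodeId;
        this.toNodeId = toNodeId;
        this.relationType = relationType;
    }
}

class RelationStatsSketch {

    // Counts relations whose two endpoints are both in nodeIds, grouped by type.
    // A null relation type is counted under "OTHER", as in the service code.
    static Map<String, Long> countByType(Set<String> nodeIds, List<Relation> relations) {
        Map<String, Long> typeCount = new HashMap<>();
        for (Relation r : relations) {
            if (nodeIds.contains(r.fromNodeId) && nodeIds.contains(r.toNodeId)) {
                String type = r.relationType != null ? r.relationType : "OTHER";
                typeCount.merge(type, 1L, Long::sum);
            }
        }
        return typeCount;
    }
}
```

Summing the map values gives the total that the service reports as `totalRelations`. Relations that point to a node in another document are left out of both figures.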

+ 30 - 0
backend/graph-service/src/main/java/com/lingyue/graph/service/TextStorageService.java

@@ -1,11 +1,13 @@
 package com.lingyue.graph.service;
 
+import com.lingyue.common.event.DocumentParsedEvent;
 import com.lingyue.common.exception.ServiceException;
 import com.lingyue.graph.entity.TextStorage;
 import com.lingyue.graph.repository.TextStorageRepository;
 import lombok.RequiredArgsConstructor;
 import lombok.extern.slf4j.Slf4j;
 import org.springframework.beans.factory.annotation.Value;
+import org.springframework.context.ApplicationEventPublisher;
 import org.springframework.stereotype.Service;
 
 import java.io.File;
@@ -28,10 +30,14 @@ public class TextStorageService {
     
     private final TextStorageRepository textStorageRepository;
     private final RAGService ragService;
+    private final ApplicationEventPublisher eventPublisher;
     
     @Value("${rag.auto-index.enabled:true}")
     private boolean autoIndexEnabled;
     
+    @Value("${ner.auto-extract.enabled:true}")
+    private boolean nerAutoExtractEnabled;
+    
     /**
      * 保存文本存储记录
      * 
@@ -93,6 +99,18 @@ public class TextStorageService {
      * @return 文本存储记录
      */
     public TextStorage saveAndIndex(String documentId, String filePath) {
+        return saveAndIndex(documentId, filePath, null);
+    }
+    
+    /**
+     * Save the text and automatically build the RAG index (with user ID)
+     * 
+     * @param documentId document ID
+     * @param filePath path to the TXT file
+     * @param userId user ID
+     * @return the text storage record
+     */
+    public TextStorage saveAndIndex(String documentId, String filePath, String userId) {
         // 1. 保存文本存储记录
         TextStorage textStorage = saveTextStorage(documentId, filePath);
         
@@ -108,6 +126,18 @@ public class TextStorageService {
             }
         }
         
+        // 3. Publish the document-parsed event to trigger NER and other follow-up processing
+        if (nerAutoExtractEnabled) {
+            try {
+                DocumentParsedEvent event = new DocumentParsedEvent(
+                        this, documentId, userId, filePath, textStorage.getId());
+                eventPublisher.publishEvent(event);
+                log.info("发布文档解析完成事件: documentId={}", documentId);
+            } catch (Exception e) {
+                log.warn("发布文档解析完成事件失败,不影响主流程: documentId={}", documentId, e);
+            }
+        }
+        
         return textStorage;
     }
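
The new step 3 in `saveAndIndex` wraps event publishing in a `try`/`catch` so that a listener failure does not break the save. This matters because Spring delivers application events synchronously by default, so a listener exception would otherwise reach the publisher. Below is a framework-free sketch of that pattern. `ParsedEvent` and `SafePublisherSketch` are hypothetical and only mirror the shape of the code above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical, simplified event with two of the fields passed to DocumentParsedEvent.
final class ParsedEvent {
    final String documentId;
    final String filePath;

    ParsedEvent(String documentId, String filePath) {
        this.documentId = documentId;
        this.filePath = filePath;
    }
}

class SafePublisherSketch {
    private final List<Consumer<ParsedEvent>> listeners = new ArrayList<>();
    private final boolean autoExtractEnabled; // plays the role of ner.auto-extract.enabled

    SafePublisherSketch(boolean autoExtractEnabled) {
        this.autoExtractEnabled = autoExtractEnabled;
    }

    void register(Consumer<ParsedEvent> listener) {
        listeners.add(listener);
    }

    // Returns true only if every listener ran without error.
    // A listener failure is logged and swallowed so the caller's main flow continues.
    boolean publishAfterSave(String documentId, String filePath) {
        if (!autoExtractEnabled) {
            return false;
        }
        try {
            ParsedEvent event = new ParsedEvent(documentId, filePath);
            for (Consumer<ParsedEvent> listener : listeners) {
                listener.accept(event);
            }
            return true;
        } catch (RuntimeException e) {
            System.err.println("event publishing failed, main flow continues: " + e.getMessage());
            return false;
        }
    }
}
```

In this sketch a failing listener also stops delivery to the listeners registered after it. The sketch does not address whether that is acceptable for NER.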
     

+ 17 - 0
backend/lingyue-starter/src/main/resources/application.properties

@@ -115,6 +115,23 @@ rag.search.top-k=3
 # RAG 自动索引配置
 rag.auto-index.enabled=true
 
+# ============================================
+# NER service configuration
+# ============================================
+
+# NER automatic extraction
+ner.auto-extract.enabled=true
+
+# Python NER service settings
+ner.python-service.url=${NER_SERVICE_URL:http://localhost:8001}
+ner.python-service.timeout=60000
+ner.python-service.connect-timeout=5000
+ner.python-service.max-retries=3
+ner.python-service.retry-interval=1000
+
+# NER entity types
+ner.entity-types=PERSON,ORG,LOC,DATE,NUMBER,DEVICE,PROJECT,TERM
+
 # XSS防护配置
 xss.enabled=true
 xss.excludes=/auth/register,/auth/login
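
The `ner.python-service.max-retries` and `ner.python-service.retry-interval` settings suggest a simple fixed-interval retry around calls to the Python service. The sketch below shows one way such a loop can work. It assumes the interval is in milliseconds and that `max-retries` counts total attempts, neither of which is confirmed by this section. `RetrySketch` and `callWithRetry` are hypothetical names, not the `PythonNerClient` implementation.

```java
import java.util.concurrent.Callable;

class RetrySketch {

    // Calls the action up to maxAttempts times and sleeps intervalMillis
    // after each failed attempt except the last. Rethrows the last failure.
    static <T> T callWithRetry(Callable<T> action, int maxAttempts, long intervalMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(intervalMillis);
                }
            }
        }
        throw last;
    }
}
```

With the values above, a call would look like `RetrySketch.callWithRetry(() -> client.extract(request), 3, 1000L)`, where `client.extract` stands in for whatever method performs the HTTP request.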

+ 0 - 12
backend/sql/README_SUPPLEMENT.md

@@ -19,18 +19,6 @@ psql -U <username> -d lingyue_zhibao -f backend/sql/supplement_tables.sql
 2. 连接到 `lingyue_zhibao` 数据库
 3. 打开并执行 `backend/sql/supplement_tables.sql` 文件
 
-### 方法 3:使用 Docker
-
-如果您使用 Docker Compose:
-
-```bash
-# 进入 PostgreSQL 容器
-docker exec -it <postgres_container_name> bash
-
-# 在容器内执行
-psql -U postgres -d lingyue_zhibao -f /path/to/supplement_tables.sql
-```
-
 ## 补充表说明
 
 `supplement_tables.sql` 包含以下表:

+ 0 - 184
deploy.sh

@@ -1,184 +0,0 @@
-#!/bin/bash
-
-# 灵越智报部署脚本
-# 作者: lingyue
-# 版本: 2.0.0
-
-set -e
-
-# 颜色定义
-RED='\033[0;31m'
-GREEN='\033[0;32m'
-YELLOW='\033[1;33m'
-NC='\033[0m' # No Color
-
-# 日志函数
-log_info() {
-    echo -e "${GREEN}[INFO]${NC} $1"
-}
-
-log_warn() {
-    echo -e "${YELLOW}[WARN]${NC} $1"
-}
-
-log_error() {
-    echo -e "${RED}[ERROR]${NC} $1"
-}
-
-# 检查依赖
-check_dependencies() {
-    log_info "检查依赖..."
-    
-    if ! command -v docker &> /dev/null; then
-        log_error "Docker 未安装,请先安装 Docker"
-        exit 1
-    fi
-    
-    if ! command -v docker-compose &> /dev/null && ! docker compose version &> /dev/null; then
-        log_error "Docker Compose 未安装,请先安装 Docker Compose"
-        exit 1
-    fi
-    
-    log_info "依赖检查通过"
-}
-
-# 编译项目
-build_project() {
-    log_info "开始编译项目..."
-    cd backend
-    mvn clean package -DskipTests
-    if [ $? -ne 0 ]; then
-        log_error "项目编译失败"
-        exit 1
-    fi
-    cd ..
-    log_info "项目编译成功"
-}
-
-# 停止现有服务
-stop_services() {
-    log_info "停止现有服务..."
-    docker-compose down || docker compose down
-    log_info "服务已停止"
-}
-
-# 启动服务
-start_services() {
-    local with_ocr=$1
-    
-    log_info "启动服务..."
-    
-    if [ "$with_ocr" = "true" ]; then
-        log_info "启动包含 PaddleOCR 服务..."
-        docker-compose --profile with-ocr up -d || docker compose --profile with-ocr up -d
-    else
-        log_info "启动基础服务(不包含 OCR)..."
-        docker-compose up -d || docker compose up -d
-    fi
-    
-    if [ $? -ne 0 ]; then
-        log_error "服务启动失败"
-        exit 1
-    fi
-    
-    log_info "服务启动成功"
-}
-
-# 查看日志
-show_logs() {
-    log_info "查看应用日志..."
-    docker-compose logs -f lingyue-app || docker compose logs -f lingyue-app
-}
-
-# 检查服务状态
-check_status() {
-    log_info "检查服务状态..."
-    docker-compose ps || docker compose ps
-}
-
-# 显示帮助信息
-show_help() {
-    cat << EOF
-灵越智报部署脚本
-
-使用方法:
-    ./deploy.sh [选项]
-
-选项:
-    start           编译并启动所有服务(不包含 OCR)
-    start-with-ocr  编译并启动所有服务(包含 PaddleOCR)
-    stop            停止所有服务
-    restart         重启所有服务
-    logs            查看应用日志
-    status          查看服务状态
-    build           仅编译项目
-    help            显示此帮助信息
-
-示例:
-    ./deploy.sh start           # 启动基础服务
-    ./deploy.sh start-with-ocr  # 启动包含 OCR 的完整服务
-    ./deploy.sh logs            # 查看日志
-
-EOF
-}
-
-# 主函数
-main() {
-    local action=${1:-help}
-    
-    case $action in
-        start)
-            check_dependencies
-            build_project
-            stop_services
-            start_services false
-            log_info "部署完成!"
-            log_info "访问地址: http://localhost:8000"
-            log_info "Swagger文档: http://localhost:8000/swagger-ui.html"
-            log_info "Druid监控: http://localhost:8000/druid/ (用户名: admin, 密码: admin123)"
-            log_info "RabbitMQ管理界面: http://localhost:15672 (用户名: admin, 密码: admin123)"
-            log_info ""
-            log_warn "提示: 使用 './deploy.sh logs' 查看日志"
-            ;;
-        start-with-ocr)
-            check_dependencies
-            build_project
-            stop_services
-            start_services true
-            log_info "部署完成(包含 OCR 服务)!"
-            log_info "访问地址: http://localhost:8000"
-            log_info "PaddleOCR服务: http://localhost:8866"
-            ;;
-        stop)
-            stop_services
-            log_info "服务已全部停止"
-            ;;
-        restart)
-            check_dependencies
-            build_project
-            stop_services
-            start_services false
-            log_info "服务重启完成"
-            ;;
-        logs)
-            show_logs
-            ;;
-        status)
-            check_status
-            ;;
-        build)
-            build_project
-            ;;
-        help)
-            show_help
-            ;;
-        *)
-            log_error "未知选项: $action"
-            show_help
-            exit 1
-            ;;
-    esac
-}
-
-# 执行主函数
-main "$@"

+ 0 - 139
docker-compose.yml

@@ -1,139 +0,0 @@
-version: '3.8'
-
-services:
-  # PostgreSQL 数据库
-  postgres:
-    image: postgres:16-alpine
-    container_name: lingyue-postgres
-    restart: unless-stopped
-    environment:
-      POSTGRES_DB: lingyue_zhibao
-      POSTGRES_USER: postgres
-      POSTGRES_PASSWORD: postgres123
-      POSTGRES_INITDB_ARGS: "--encoding=UTF-8"
-    ports:
-      - "5432:5432"
-    volumes:
-      - postgres_data:/var/lib/postgresql/data
-      - ./backend/sql:/docker-entrypoint-initdb.d
-    networks:
-      - lingyue-network
-    healthcheck:
-      test: ["CMD-SHELL", "pg_isready -U postgres"]
-      interval: 10s
-      timeout: 5s
-      retries: 5
-
-  # Redis 缓存
-  redis:
-    image: redis:7-alpine
-    container_name: lingyue-redis
-    restart: unless-stopped
-    command: redis-server --appendonly yes
-    ports:
-      - "6379:6379"
-    volumes:
-      - redis_data:/data
-    networks:
-      - lingyue-network
-    healthcheck:
-      test: ["CMD", "redis-cli", "ping"]
-      interval: 10s
-      timeout: 3s
-      retries: 5
-
-  # RabbitMQ 消息队列
-  rabbitmq:
-    image: rabbitmq:3.13-management-alpine
-    container_name: lingyue-rabbitmq
-    restart: unless-stopped
-    environment:
-      RABBITMQ_DEFAULT_USER: admin
-      RABBITMQ_DEFAULT_PASS: admin123
-    ports:
-      - "5672:5672"   # AMQP 端口
-      - "15672:15672" # 管理界面端口
-    volumes:
-      - rabbitmq_data:/var/lib/rabbitmq
-    networks:
-      - lingyue-network
-    healthcheck:
-      test: ["CMD", "rabbitmq-diagnostics", "ping"]
-      interval: 10s
-      timeout: 5s
-      retries: 5
-
-  # 灵越智报应用
-  lingyue-app:
-    build:
-      context: ./backend
-      dockerfile: Dockerfile
-    container_name: lingyue-app
-    restart: unless-stopped
-    ports:
-      - "8000:8000"
-    environment:
-      # 数据库配置
-      DB_USERNAME: postgres
-      DB_PASSWORD: postgres123
-      # Redis 配置
-      REDIS_HOST: redis
-      REDIS_PORT: 6379
-      REDIS_PASSWORD: ""
-      # RabbitMQ 配置
-      RABBITMQ_HOST: rabbitmq
-      RABBITMQ_PORT: 5672
-      RABBITMQ_USERNAME: admin
-      RABBITMQ_PASSWORD: admin123
-      # JWT 配置
-      JWT_SECRET: lingyue-zhibao-production-secret-key-please-change-this-in-production-environment
-      # PaddleOCR 配置
-      PADDLEOCR_SERVER_URL: http://paddleocr:8866
-      # DeepSeek API 配置 (需要自行配置)
-      DEEPSEEK_API_KEY: ${DEEPSEEK_API_KEY:-}
-    volumes:
-      - app_data:/tmp/lingyue-zhibao
-      - app_logs:/app/logs
-    networks:
-      - lingyue-network
-    depends_on:
-      postgres:
-        condition: service_healthy
-      redis:
-        condition: service_healthy
-      rabbitmq:
-        condition: service_healthy
-    healthcheck:
-      test: ["CMD", "curl", "-f", "http://localhost:8000/actuator/health"]
-      interval: 30s
-      timeout: 10s
-      retries: 3
-      start_period: 60s
-
-  # PaddleOCR 服务(可选)
-  paddleocr:
-    image: paddlepaddle/paddleocr:latest-cpu
-    container_name: lingyue-paddleocr
-    restart: unless-stopped
-    ports:
-      - "8866:8866"
-    networks:
-      - lingyue-network
-    profiles:
-      - with-ocr
-
-volumes:
-  postgres_data:
-    driver: local
-  redis_data:
-    driver: local
-  rabbitmq_data:
-    driver: local
-  app_data:
-    driver: local
-  app_logs:
-    driver: local
-
-networks:
-  lingyue-network:
-    driver: bridge

+ 103 - 0
python-services/ner-service/README.md

@@ -0,0 +1,103 @@
+# NER Service
+
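+Named entity recognition (NER) service that provides entity extraction and relation extraction.
+
+## Features
+
+- Entity extraction: identifies person names, organizations, locations, dates, numeric values, devices, and other entities in text
+- Relation extraction: extracts semantic relations between entities
+- Multiple modes: rule (for development; currently the only implemented mode), spaCy, Transformers, API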
+
+## Quick Start
+
+### Run Locally
+
+```bash
+# Install dependencies
+pip install -r requirements.txt
+
+# Start the service
+uvicorn app.main:app --host 0.0.0.0 --port 8001 --reload
+```
+
+## API Endpoints
+
+### Health Check
+
+```
+GET /health
+```
+
+### Entity Extraction
+
+```
+POST /ner/extract
+Content-Type: application/json
+
+{
+  "documentId": "doc-001",
+  "text": "Text to extract entities from...",
+  "entityTypes": ["PERSON", "ORG", "LOC"],  // optional
+  "extractRelations": true
+}
+```
+
+### Relation Extraction
+
+```
+POST /ner/relations
+Content-Type: application/json
+
+{
+  "documentId": "doc-001",
+  "text": "Original text",
+  "entities": [...]  // list of previously extracted entities
+}
+```
+
+## Entity Types
+
+| Type | Description | Example |
+|------|------|------|
+| PERSON | Person name | 张三、李经理 |
+| ORG | Organization | 成都检测公司 |
+| LOC | Location | 成都市高新区 |
+| DATE | Date | 2024年5月15日 |
+| NUMBER | Numeric value | 100万元、50分贝 |
+| DEVICE | Device | 噪音检测设备 |
+| PROJECT | Project | 环境监测项目 |
+| TERM | Technical term | - |
+
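+## Relation Types
+
+| Type | Description |
+|------|------|
+| 负责 | Responsibility (subject is responsible for something) |
+| 属于 | Affiliation (belongs to) |
+| 位于 | Location (located in) |
+| 包含 | Containment (contains) |
+| 使用 | Usage (uses) |
+| 检测 | Inspection (tests or measures) |
+| 合作 | Cooperation |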
+
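+## Configuration
+
+| Setting | Description | Default |
+|--------|------|--------|
+| NER_MODEL | NER model type (`rule`, `spacy`, `transformers`, `api`) | rule |
+| USE_GPU | Whether to use the GPU | false |
+| MAX_TEXT_LENGTH | Maximum text length, in characters | 50000 |
+| LOG_LEVEL | Log level | INFO |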
+
+## Testing
+
+```bash
+pytest tests/ -v
+```
+
+## Roadmap
+
+- [ ] Integrate a spaCy Chinese model
+- [ ] Integrate a Transformers NER model
+- [ ] Implement API mode (DeepSeek/Qwen)
+- [ ] Improve relation extraction accuracy
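
The README documents the request body for `POST /ner/extract` but not how a caller might assemble it. The following is a minimal standard-library sketch, assuming the service runs locally on port 8001 as in the README's start command. The names `NER_BASE_URL`, `build_extract_payload`, and `post_extract` are hypothetical and are not part of this commit.

```python
import json
import urllib.request
from typing import List, Optional

# Assumption: the service is reachable here (port 8001 is the README default).
NER_BASE_URL = "http://localhost:8001"


def build_extract_payload(
    text: str,
    document_id: Optional[str] = None,
    entity_types: Optional[List[str]] = None,
    extract_relations: bool = True,
) -> dict:
    """Build the camelCase JSON body documented for POST /ner/extract."""
    payload = {"text": text, "extractRelations": extract_relations}
    if document_id is not None:
        payload["documentId"] = document_id
    if entity_types:
        payload["entityTypes"] = entity_types
    return payload


def post_extract(payload: dict, base_url: str = NER_BASE_URL, timeout: float = 10.0) -> dict:
    """Send the payload and decode the JSON response (needs a running service)."""
    request = urllib.request.Request(
        url=f"{base_url}/ner/extract",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Keeping payload construction separate from the HTTP call lets the body be checked without a running service. Optional fields are omitted rather than sent as `null`, which matches the request model's defaults.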

+ 3 - 0
python-services/ner-service/app/__init__.py

@@ -0,0 +1,3 @@
+"""
+NER Service Application
+"""

+ 43 - 0
python-services/ner-service/app/config.py

@@ -0,0 +1,43 @@
+"""
+NER service configuration
+"""
+import os
+from pydantic_settings import BaseSettings
+from typing import Optional, List
+
+
+class Settings(BaseSettings):
+    """Application settings."""
+    
+    # Service settings
+    app_name: str = "NER Service"
+    app_version: str = "1.0.0"
+    debug: bool = False
+    host: str = "0.0.0.0"
+    port: int = 8001
+    
+    # NER model settings
+    ner_model: str = "rule"  # rule / spacy / transformers / api
+    ner_model_name: Optional[str] = None  # specific model name
+    use_gpu: bool = False
+    max_text_length: int = 50000
+    
+    # API settings (fallback used by API mode)
+    api_base_url: Optional[str] = None
+    api_key: Optional[str] = None
+    api_model: str = "qwen-plus"
+    
+    # Entity type settings
+    entity_types: List[str] = [
+        "PERSON", "ORG", "LOC", "DATE", "NUMBER", "DEVICE", "TERM", "PROJECT", "COMPANY"
+    ]
+    
+    # Logging settings
+    log_level: str = "INFO"
+    
+    class Config:
+        env_file = ".env"
+        env_file_encoding = "utf-8"
+
+
+settings = Settings()

+ 85 - 0
python-services/ner-service/app/main.py

@@ -0,0 +1,85 @@
+"""
+NER service entry point
+"""
+from fastapi import FastAPI
+from fastapi.middleware.cors import CORSMiddleware
+from loguru import logger
+import sys
+
+from .config import settings
+from .routers import ner, relation
+from .models import HealthResponse
+
+# Configure logging
+logger.remove()
+logger.add(
+    sys.stdout,
+    format="<green>{time:YYYY-MM-DD HH:mm:ss}</green> | <level>{level: <8}</level> | <cyan>{name}</cyan>:<cyan>{function}</cyan>:<cyan>{line}</cyan> - <level>{message}</level>",
+    level=settings.log_level
+)
+
+# Create the FastAPI application
+app = FastAPI(
+    title=settings.app_name,
+    version=settings.app_version,
+    description="NER (named entity recognition) service providing entity extraction and relation extraction",
+    docs_url="/docs",
+    redoc_url="/redoc"
+)
+
+# CORS middleware
+app.add_middleware(
+    CORSMiddleware,
+    allow_origins=["*"],
+    allow_credentials=True,
+    allow_methods=["*"],
+    allow_headers=["*"],
+)
+
+# Register routers
+app.include_router(ner.router, prefix="/ner", tags=["NER"])
+app.include_router(relation.router, prefix="/ner", tags=["Relation"])
+
+
+@app.get("/health", response_model=HealthResponse, tags=["Health"])
+async def health_check():
+    """Health check endpoint."""
+    return HealthResponse(
+        status="ok",
+        version=settings.app_version,
+        ner_model=settings.ner_model
+    )
+
+
+@app.get("/", tags=["Root"])
+async def root():
+    """Root path."""
+    return {
+        "service": settings.app_name,
+        "version": settings.app_version,
+        "status": "running"
+    }
+
+
+@app.on_event("startup")
+async def startup_event():
+    """Startup event."""
+    logger.info(f"Starting {settings.app_name} v{settings.app_version}")
+    logger.info(f"NER Model: {settings.ner_model}")
+    logger.info(f"GPU Enabled: {settings.use_gpu}")
+
+
+@app.on_event("shutdown")
+async def shutdown_event():
+    """Shutdown event."""
+    logger.info(f"Shutting down {settings.app_name}")
+
+
+if __name__ == "__main__":
+    import uvicorn
+    uvicorn.run(
+        "app.main:app",
+        host=settings.host,
+        port=settings.port,
+        reload=settings.debug
+    )

+ 16 - 0
python-services/ner-service/app/models/__init__.py

@@ -0,0 +1,16 @@
+"""
+模型模块
+"""
+from .request import NerRequest, RelationRequest, EntityInfo, PositionInfo
+from .response import NerResponse, RelationResponse, RelationInfo, HealthResponse
+
+__all__ = [
+    "NerRequest",
+    "RelationRequest",
+    "EntityInfo",
+    "PositionInfo",
+    "NerResponse",
+    "RelationResponse",
+    "RelationInfo",
+    "HealthResponse",
+]

+ 53 - 0
python-services/ner-service/app/models/request.py

@@ -0,0 +1,53 @@
+"""
+请求模型定义
+"""
+from pydantic import BaseModel, Field
+from typing import Optional, List
+
+
+class PositionInfo(BaseModel):
+    """位置信息"""
+    char_start: int = Field(..., alias="charStart", description="字符起始位置")
+    char_end: int = Field(..., alias="charEnd", description="字符结束位置")
+    line: Optional[int] = Field(None, description="所在行号")
+    page: Optional[int] = Field(None, description="所在页码")
+    file_id: Optional[str] = Field(None, alias="fileId", description="文件ID")
+    
+    class Config:
+        populate_by_name = True
+
+
+class EntityInfo(BaseModel):
+    """实体信息"""
+    name: str = Field(..., description="实体名称")
+    type: str = Field(..., description="实体类型")
+    value: Optional[str] = Field(None, description="实体值")
+    position: Optional[PositionInfo] = Field(None, description="位置信息")
+    context: Optional[str] = Field(None, description="上下文片段")
+    confidence: Optional[float] = Field(None, description="置信度")
+    temp_id: Optional[str] = Field(None, alias="tempId", description="临时ID")
+    
+    class Config:
+        populate_by_name = True
+
+
+class NerRequest(BaseModel):
+    """NER 请求"""
+    document_id: Optional[str] = Field(None, alias="documentId", description="文档ID")
+    text: str = Field(..., description="待提取的文本内容")
+    entity_types: Optional[List[str]] = Field(None, alias="entityTypes", description="指定提取的实体类型")
+    extract_relations: bool = Field(True, alias="extractRelations", description="是否提取关系")
+    user_id: Optional[str] = Field(None, alias="userId", description="用户ID")
+    
+    class Config:
+        populate_by_name = True
+
+
+class RelationRequest(BaseModel):
+    """关系抽取请求"""
+    document_id: Optional[str] = Field(None, alias="documentId", description="文档ID")
+    text: str = Field(..., description="原始文本内容")
+    entities: List[EntityInfo] = Field(..., description="已提取的实体列表")
+    
+    class Config:
+        populate_by_name = True

+ 93 - 0
python-services/ner-service/app/models/response.py

@@ -0,0 +1,93 @@
+"""
+响应模型定义
+"""
+from pydantic import BaseModel, Field
+from typing import Optional, List
+from .request import EntityInfo, PositionInfo
+
+
+class RelationInfo(BaseModel):
+    """关系信息"""
+    from_entity: str = Field(..., alias="fromEntity", description="源实体名称")
+    from_entity_id: Optional[str] = Field(None, alias="fromEntityId", description="源实体临时ID")
+    to_entity: str = Field(..., alias="toEntity", description="目标实体名称")
+    to_entity_id: Optional[str] = Field(None, alias="toEntityId", description="目标实体临时ID")
+    relation_type: str = Field(..., alias="relationType", description="关系类型")
+    confidence: Optional[float] = Field(None, description="置信度")
+    
+    class Config:
+        populate_by_name = True
+
+
+class NerResponse(BaseModel):
+    """NER 响应"""
+    document_id: Optional[str] = Field(None, alias="documentId", description="文档ID")
+    entities: List[EntityInfo] = Field(default_factory=list, description="提取的实体列表")
+    relations: List[RelationInfo] = Field(default_factory=list, description="提取的关系列表")
+    processing_time: Optional[int] = Field(None, alias="processingTime", description="处理耗时(毫秒)")
+    entity_count: int = Field(0, alias="entityCount", description="实体数量")
+    relation_count: int = Field(0, alias="relationCount", description="关系数量")
+    success: bool = Field(True, description="是否成功")
+    error_message: Optional[str] = Field(None, alias="errorMessage", description="错误信息")
+    
+    class Config:
+        populate_by_name = True
+
+    @classmethod
+    def success_response(cls, document_id: Optional[str], entities: List[EntityInfo], 
+                         relations: List[RelationInfo], processing_time: int):
+        return cls(
+            document_id=document_id,
+            entities=entities,
+            relations=relations,
+            processing_time=processing_time,
+            entity_count=len(entities),
+            relation_count=len(relations),
+            success=True
+        )
+    
+    @classmethod
+    def error_response(cls, document_id: Optional[str], error_message: str):
+        return cls(
+            document_id=document_id,
+            success=False,
+            error_message=error_message
+        )
+
+
+class RelationResponse(BaseModel):
+    """关系抽取响应"""
+    document_id: Optional[str] = Field(None, alias="documentId", description="文档ID")
+    relations: List[RelationInfo] = Field(default_factory=list, description="提取的关系列表")
+    processing_time: Optional[int] = Field(None, alias="processingTime", description="处理耗时(毫秒)")
+    relation_count: int = Field(0, alias="relationCount", description="关系数量")
+    success: bool = Field(True, description="是否成功")
+    error_message: Optional[str] = Field(None, alias="errorMessage", description="错误信息")
+    
+    class Config:
+        populate_by_name = True
+
+    @classmethod
+    def success_response(cls, document_id: Optional[str], relations: List[RelationInfo], processing_time: int):
+        return cls(
+            document_id=document_id,
+            relations=relations,
+            processing_time=processing_time,
+            relation_count=len(relations),
+            success=True
+        )
+    
+    @classmethod
+    def error_response(cls, document_id: Optional[str], error_message: str):
+        return cls(
+            document_id=document_id,
+            success=False,
+            error_message=error_message
+        )
+
+
+class HealthResponse(BaseModel):
+    """健康检查响应"""
+    status: str = "ok"
+    version: str
+    ner_model: str

+ 6 - 0
python-services/ner-service/app/routers/__init__.py

@@ -0,0 +1,6 @@
+"""
+路由模块
+"""
+from . import ner, relation
+
+__all__ = ["ner", "relation"]

+ 63 - 0
python-services/ner-service/app/routers/ner.py

@@ -0,0 +1,63 @@
+"""
+NER routes
+"""
+from fastapi import APIRouter, HTTPException
+from loguru import logger
+import time
+
+from ..config import settings
+from ..models import NerRequest, NerResponse, EntityInfo
+from ..services.ner_service import ner_service
+
+router = APIRouter()
+
+
+@router.post("/extract", response_model=NerResponse)
+async def extract_entities(request: NerRequest):
+    """
+    Extract named entities from text.
+    """
+    start_time = time.time()
+    
+    try:
+        logger.info(f"Starting entity extraction: document_id={request.document_id}, text_length={len(request.text)}")
+        
+        if len(request.text) > settings.max_text_length:  # limit comes from MAX_TEXT_LENGTH, not a hardcoded value
+            raise HTTPException(status_code=400, detail=f"Text length exceeds the limit (maximum {settings.max_text_length} characters)")
+        
+        # Call the NER service
+        entities = await ner_service.extract_entities(
+            text=request.text,
+            entity_types=request.entity_types
+        )
+        
+        # Extract relations if requested
+        relations = []
+        if request.extract_relations and len(entities) > 1:
+            from ..services.relation_service import relation_service
+            relations = await relation_service.extract_relations(
+                text=request.text,
+                entities=entities
+            )
+        
+        processing_time = int((time.time() - start_time) * 1000)
+        
+        logger.info(f"Entity extraction finished: document_id={request.document_id}, "
+                   f"entity_count={len(entities)}, relation_count={len(relations)}, "
+                   f"processing_time={processing_time}ms")
+        
+        return NerResponse.success_response(
+            document_id=request.document_id,
+            entities=entities,
+            relations=relations,
+            processing_time=processing_time
+        )
+        
+    except HTTPException:
+        raise
+    except Exception as e:
+        logger.error(f"Entity extraction failed: document_id={request.document_id}, error={str(e)}")
+        return NerResponse.error_response(
+            document_id=request.document_id,
+            error_message=str(e)
+        )
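
Both routers compute `processing_time` from `time.time()`, which follows the wall clock and can jump if the system time is adjusted. A possible alternative, shown here only as a sketch, is a small context manager built on the monotonic `time.perf_counter()`. The helper name `elapsed_ms` and the result-dictionary convention are assumptions, not code from this commit.

```python
import time
from contextlib import contextmanager
from typing import Dict, Iterator


@contextmanager
def elapsed_ms(result: Dict[str, int], key: str = "processing_time") -> Iterator[None]:
    """Store the elapsed time of the with-block, in whole milliseconds, in result[key]."""
    start = time.perf_counter()  # monotonic, unaffected by system clock changes
    try:
        yield
    finally:
        # Recorded even if the block raises, so error paths can still be timed.
        result[key] = int((time.perf_counter() - start) * 1000)


# Usage: time an arbitrary block of work.
timings: Dict[str, int] = {}
with elapsed_ms(timings):
    sum(range(10_000))
```

The value is truncated to an integer to match the `processingTime` field, which the response models declare as milliseconds.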

+ 54 - 0
python-services/ner-service/app/routers/relation.py

@@ -0,0 +1,54 @@
+"""
+关系抽取路由
+"""
+from fastapi import APIRouter, HTTPException
+from loguru import logger
+import time
+
+from ..models import RelationRequest, RelationResponse
+from ..services.relation_service import relation_service
+
+router = APIRouter()
+
+
+@router.post("/relations", response_model=RelationResponse)
+async def extract_relations(request: RelationRequest):
+    """
+    Extract relations between previously extracted entities.
+    """
+    start_time = time.time()
+    
+    try:
+        logger.info(f"Starting relation extraction: document_id={request.document_id}, "
+                   f"entity_count={len(request.entities)}")
+        
+        if not request.entities or len(request.entities) < 2:
+            return RelationResponse.success_response(
+                document_id=request.document_id,
+                relations=[],
+                processing_time=0
+            )
+        
+        # Call the relation extraction service
+        relations = await relation_service.extract_relations(
+            text=request.text,
+            entities=request.entities
+        )
+        
+        processing_time = int((time.time() - start_time) * 1000)
+        
+        logger.info(f"Relation extraction finished: document_id={request.document_id}, "
+                   f"relation_count={len(relations)}, processing_time={processing_time}ms")
+        
+        return RelationResponse.success_response(
+            document_id=request.document_id,
+            relations=relations,
+            processing_time=processing_time
+        )
+        
+    except Exception as e:
+        logger.error(f"Relation extraction failed: document_id={request.document_id}, error={str(e)}")
+        return RelationResponse.error_response(
+            document_id=request.document_id,
+            error_message=str(e)
+        )

+ 7 - 0
python-services/ner-service/app/services/__init__.py

@@ -0,0 +1,7 @@
+"""
+服务模块
+"""
+from .ner_service import ner_service
+from .relation_service import relation_service
+
+__all__ = ["ner_service", "relation_service"]

+ 197 - 0
python-services/ner-service/app/services/ner_service.py

@@ -0,0 +1,197 @@
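+"""
+NER service implementation.
+
+Supported modes:
+1. rule - simple rule-based NER (default; intended for development and testing)
+2. spacy - spaCy model (not implemented yet; falls back to rule mode)
+3. transformers - Transformers model (not implemented yet; falls back to rule mode)
+4. api - external API such as DeepSeek/Qwen (not implemented yet; falls back to rule mode)
+"""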
+import re
+import uuid
+from typing import List, Optional
+from loguru import logger
+
+from ..config import settings
+from ..models import EntityInfo, PositionInfo
+
+
+class NerService:
+    """NER 服务"""
+    
+    def __init__(self):
+        self.model_type = settings.ner_model
+        logger.info(f"初始化 NER 服务: model_type={self.model_type}")
+    
+    async def extract_entities(
+        self, 
+        text: str, 
+        entity_types: Optional[List[str]] = None
+    ) -> List[EntityInfo]:
+        """
+        Extract entities from text.
+        
+        Args:
+            text: Text to extract entities from
+            entity_types: Entity types to extract; all types when empty
+            
+        Returns:
+            List of extracted entities
+        """
+        if not text or not text.strip():
+            return []
+        
+        if self.model_type == "rule":
+            return await self._extract_by_rules(text, entity_types)
+        elif self.model_type == "spacy":
+            return await self._extract_by_spacy(text, entity_types)
+        elif self.model_type == "transformers":
+            return await self._extract_by_transformers(text, entity_types)
+        elif self.model_type == "api":
+            return await self._extract_by_api(text, entity_types)
+        else:
+            logger.warning(f"未知的模型类型: {self.model_type},使用规则模式")
+            return await self._extract_by_rules(text, entity_types)
+    
+    async def _extract_by_rules(
+        self, 
+        text: str, 
+        entity_types: Optional[List[str]] = None
+    ) -> List[EntityInfo]:
+        """
+        基于规则的 NER 提取
+        用于开发测试阶段,后续可替换为更高级的模型
+        """
+        entities = []
+        
+        # 规则定义
+        rules = {
+            "DATE": [
+                # 中文日期格式
+                r'(\d{4}年\d{1,2}月\d{1,2}日)',
+                r'(\d{4}年\d{1,2}月)',
+                r'(\d{4}-\d{1,2}-\d{1,2})',
+                r'(\d{4}/\d{1,2}/\d{1,2})',
+            ],
+            "NUMBER": [
+                # 带单位的数值
+                r'(\d+\.?\d*\s*(?:万元|元|米|公里|千米|平方米|㎡|吨|kg|g|个|台|套|件|次|人|天|小时|分钟|秒|%|百分比))',
+                # 百分比
+                r'(\d+\.?\d*%)',
+                # 纯数值(较大的数)
+                r'(?<![a-zA-Z])(\d{4,}(?:\.\d+)?)(?![a-zA-Z])',
+            ],
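+            "ORG": [
+                # Organization / company names
+                r'([\u4e00-\u9fa5]{2,10}(?:公司|集团|院|所|局|部|厅|委|会|中心|协会|学会|银行|医院|学校|大学|学院))',
+                # Government bodies: place name + 政府/委员会 (suffix is required; otherwise plain place names are tagged ORG)
+                r'([\u4e00-\u9fa5]{2,6}(?:省|市|县|区|镇|乡|村)(?:人民)?(?:政府|委员会))',
+            ],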
+            "LOC": [
+                # 地点
+                r'([\u4e00-\u9fa5]{2,6}(?:省|市|县|区|镇|乡|村|路|街|巷|号|楼|栋|单元|室))',
+                # 常见地名后缀
+                r'([\u4e00-\u9fa5]{2,8}(?:工业园|开发区|高新区|科技园|产业园))',
+            ],
+            "PERSON": [
+                # 人名(简单规则:姓+名)
+                r'(?:(?:张|王|李|赵|刘|陈|杨|黄|周|吴|徐|孙|马|朱|胡|郭|何|林|罗|高|郑|梁|谢|唐|许|邓|冯|韩|曹|曾|彭|萧|蔡|潘|田|董|袁|于|余|叶|蒋|杜|苏|魏|程|吕|丁|沈|任|姚|卢|傅|钟|姜|崔|谭|廖|范|汪|陆|金|石|戴|贾|韦|夏|邱|方|侯|邹|熊|孟|秦|白|江|阎|薛|尹|段|雷|黎|史|龙|陶|贺|顾|毛|郝|龚|邵|万|钱|严|赖|覃|洪|武|莫|孔)[\u4e00-\u9fa5]{1,2})(?:总|经理|主任|工程师|教授|博士|先生|女士|同志)?',
+            ],
+            "DEVICE": [
+                # 设备名称
+                r'([\u4e00-\u9fa5]{2,10}(?:设备|仪器|仪表|机器|装置|系统|探测器|传感器|检测仪|分析仪|监测仪))',
+            ],
+            "PROJECT": [
+                # 项目名称
+                r'([\u4e00-\u9fa5]{2,20}(?:项目|工程|计划|方案|课题))',
+            ],
+        }
+        
+        # Filter by requested entity types
+        if entity_types:
+            rules = {k: v for k, v in rules.items() if k in entity_types}
+        
+        # Run the rule matching
+        seen_entities = set()  # used for de-duplication
+        
+        for entity_type, patterns in rules.items():
+            for pattern in patterns:
+                for match in re.finditer(pattern, text):
+                    entity_text = match.group(1) if match.groups() else match.group(0)
+                    entity_text = entity_text.strip()
+                    
+                    # De-duplicate: only the first occurrence of each (type, text) pair is kept
+                    entity_key = f"{entity_type}:{entity_text}"
+                    if entity_key in seen_entities:
+                        continue
+                    seen_entities.add(entity_key)
+                    
+                    # Compute the 1-based line number
+                    line_num = text[:match.start()].count('\n') + 1
+                    
+                    # Build a context snippet (up to 20 characters on each side)
+                    context_start = max(0, match.start() - 20)
+                    context_end = min(len(text), match.end() + 20)
+                    context = text[context_start:context_end]
+                    if context_start > 0:
+                        context = "..." + context
+                    if context_end < len(text):
+                        context = context + "..."
+                    
+                    entity = EntityInfo(
+                        name=entity_text,
+                        type=entity_type,
+                        value=entity_text,
+                        position=PositionInfo(
+                            char_start=match.start(),
+                            char_end=match.end(),
+                            line=line_num
+                        ),
+                        context=context,
+                        confidence=0.8,  # 规则匹配默认置信度
+                        temp_id=str(uuid.uuid4())[:8]
+                    )
+                    entities.append(entity)
+        
+        logger.info(f"规则 NER 提取完成: entity_count={len(entities)}")
+        return entities
+    
+    async def _extract_by_spacy(
+        self, 
+        text: str, 
+        entity_types: Optional[List[str]] = None
+    ) -> List[EntityInfo]:
+        """
+        使用 spaCy 进行 NER 提取
+        """
+        # TODO: 实现 spaCy NER
+        logger.warning("spaCy NER 尚未实现,回退到规则模式")
+        return await self._extract_by_rules(text, entity_types)
+    
+    async def _extract_by_transformers(
+        self, 
+        text: str, 
+        entity_types: Optional[List[str]] = None
+    ) -> List[EntityInfo]:
+        """
+        使用 Transformers 模型进行 NER 提取
+        """
+        # TODO: 实现 Transformers NER
+        logger.warning("Transformers NER 尚未实现,回退到规则模式")
+        return await self._extract_by_rules(text, entity_types)
+    
+    async def _extract_by_api(
+        self, 
+        text: str, 
+        entity_types: Optional[List[str]] = None
+    ) -> List[EntityInfo]:
+        """
+        调用外部 API 进行 NER 提取
+        """
+        # TODO: 实现 API NER(调用 DeepSeek/Qwen)
+        logger.warning("API NER 尚未实现,回退到规则模式")
+        return await self._extract_by_rules(text, entity_types)
+
+
+# 创建单例
+ner_service = NerService()
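
Because each pattern is applied independently and de-duplication is keyed on the matched text, a date such as `2024年5月15日` matches both the full-date pattern and the year-month pattern, so the rule mode reports two DATE entities for it. One way this could be handled is sketched below: keep the longest span and drop anything that overlaps it. `Span`, `find_spans`, and `drop_nested_spans` are hypothetical names, and only two of the DATE patterns are reproduced.

```python
import re
from typing import List, NamedTuple


class Span(NamedTuple):
    start: int
    end: int
    text: str
    type: str


# Two of the DATE patterns from the rule table, most specific first.
DATE_PATTERNS = [
    r"\d{4}年\d{1,2}月\d{1,2}日",
    r"\d{4}年\d{1,2}月",
]


def find_spans(text: str, entity_type: str, patterns: List[str]) -> List[Span]:
    """Collect every match of every pattern, including overlapping ones."""
    spans = []
    for pattern in patterns:
        for m in re.finditer(pattern, text):
            spans.append(Span(m.start(), m.end(), m.group(0), entity_type))
    return spans


def drop_nested_spans(spans: List[Span]) -> List[Span]:
    """Visit spans longest first and keep a span only if it overlaps none already kept."""
    kept: List[Span] = []
    for span in sorted(spans, key=lambda s: (-(s.end - s.start), s.start)):
        if all(span.end <= k.start or span.start >= k.end for k in kept):
            kept.append(span)
    return sorted(kept, key=lambda s: s.start)


sample = "2024年5月15日完成检测,2024年6月提交报告。"
raw = find_spans(sample, "DATE", DATE_PATTERNS)
resolved = drop_nested_spans(raw)
```

With the sample above, `raw` holds three matches and `resolved` keeps `2024年5月15日` and `2024年6月`. The same filter could be applied across entity types, though whether an ORG should win over an overlapping LOC is a separate policy decision.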

+ 228 - 0
python-services/ner-service/app/services/relation_service.py

@@ -0,0 +1,228 @@
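+"""
+Relation extraction service implementation.
+
+Supported modes:
+1. rule - simple rule-based relation extraction (default)
+2. api - external API (not implemented yet; falls back to rule mode)
+"""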
+import re
+from typing import List
+from loguru import logger
+
+from ..config import settings
+from ..models import EntityInfo, RelationInfo
+
+
+class RelationService:
+    """关系抽取服务"""
+    
+    def __init__(self):
+        self.model_type = settings.ner_model
+        logger.info(f"初始化关系抽取服务: model_type={self.model_type}")
+    
+    async def extract_relations(
+        self, 
+        text: str, 
+        entities: List[EntityInfo]
+    ) -> List[RelationInfo]:
+        """
+        从文本和实体中抽取关系
+        
+        Args:
+            text: 原始文本
+            entities: 已提取的实体列表
+            
+        Returns:
+            关系列表
+        """
+        if not text or not entities or len(entities) < 2:
+            return []
+        
+        if self.model_type == "api":
+            return await self._extract_by_api(text, entities)
+        else:
+            return await self._extract_by_rules(text, entities)
+    
+    async def _extract_by_rules(
+        self, 
+        text: str, 
+        entities: List[EntityInfo]
+    ) -> List[RelationInfo]:
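+        """
+        Rule-based relation extraction.
+        
+        Strategies:
+        1. Positional proximity: adjacent entities at most 100 characters apart
+        2. Semantic pattern matching: trigger words in the text between them
+        """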
+        relations = []
+        
+        # 关系模式定义
+        relation_patterns = {
+            "负责": [r"负责", r"承担", r"主管", r"管理"],
+            "属于": [r"属于", r"隶属", r"归属"],
+            "位于": [r"位于", r"在", r"坐落于", r"地处"],
+            "包含": [r"包含", r"包括", r"涵盖", r"含有"],
+            "关联": [r"关联", r"相关", r"涉及", r"关于"],
+            "生产": [r"生产", r"制造", r"制作"],
+            "使用": [r"使用", r"采用", r"利用", r"应用"],
+            "检测": [r"检测", r"检验", r"测试", r"测量"],
+            "所有": [r"所有", r"拥有", r"持有"],
+            "合作": [r"合作", r"协作", r"联合"],
+        }
+        
+        # 按位置排序实体
+        sorted_entities = sorted(
+            [e for e in entities if e.position], 
+            key=lambda e: e.position.char_start if e.position else 0
+        )
+        
+        seen_relations = set()
+        
+        # 检查相邻实体间的关系
+        for i in range(len(sorted_entities) - 1):
+            entity1 = sorted_entities[i]
+            entity2 = sorted_entities[i + 1]
+            
+            if not entity1.position or not entity2.position:
+                continue
+            
+            # 获取两个实体之间的文本
+            start = entity1.position.char_end
+            end = entity2.position.char_start
+            
+            if end <= start or end - start > 100:  # 距离太远则跳过
+                continue
+            
+            between_text = text[start:end]
+            
+            # 匹配关系模式
+            for relation_type, patterns in relation_patterns.items():
+                for pattern in patterns:
+                    if re.search(pattern, between_text):
+                        relation_key = f"{entity1.name}-{relation_type}-{entity2.name}"
+                        if relation_key not in seen_relations:
+                            seen_relations.add(relation_key)
+                            relations.append(RelationInfo(
+                                from_entity=entity1.name,
+                                from_entity_id=entity1.temp_id,
+                                to_entity=entity2.name,
+                                to_entity_id=entity2.temp_id,
+                                relation_type=relation_type,
+                                confidence=0.75
+                            ))
+                        break
+        
+        # 基于实体类型的隐含关系
+        org_entities = [e for e in entities if e.type == "ORG"]
+        person_entities = [e for e in entities if e.type == "PERSON"]
+        loc_entities = [e for e in entities if e.type == "LOC"]
+        project_entities = [e for e in entities if e.type == "PROJECT"]
+        device_entities = [e for e in entities if e.type == "DEVICE"]
+        
+        # 人员-机构 关系
+        for person in person_entities:
+            for org in org_entities:
+                relation_key = f"{person.name}-属于-{org.name}"
+                if relation_key not in seen_relations:
+                    # 检查是否在同一句中
+                    if self._in_same_sentence(text, person, org):
+                        seen_relations.add(relation_key)
+                        relations.append(RelationInfo(
+                            from_entity=person.name,
+                            from_entity_id=person.temp_id,
+                            to_entity=org.name,
+                            to_entity_id=org.temp_id,
+                            relation_type="属于",
+                            confidence=0.6
+                        ))
+        
+        # 机构-地点 关系
+        for org in org_entities:
+            for loc in loc_entities:
+                relation_key = f"{org.name}-位于-{loc.name}"
+                if relation_key not in seen_relations:
+                    if self._in_same_sentence(text, org, loc):
+                        seen_relations.add(relation_key)
+                        relations.append(RelationInfo(
+                            from_entity=org.name,
+                            from_entity_id=org.temp_id,
+                            to_entity=loc.name,
+                            to_entity_id=loc.temp_id,
+                            relation_type="位于",
+                            confidence=0.6
+                        ))
+        
+        # 机构-项目 关系
+        for org in org_entities:
+            for project in project_entities:
+                relation_key = f"{org.name}-负责-{project.name}"
+                if relation_key not in seen_relations:
+                    if self._in_same_sentence(text, org, project):
+                        seen_relations.add(relation_key)
+                        relations.append(RelationInfo(
+                            from_entity=org.name,
+                            from_entity_id=org.temp_id,
+                            to_entity=project.name,
+                            to_entity_id=project.temp_id,
+                            relation_type="负责",
+                            confidence=0.6
+                        ))
+        
+        # 项目-设备 关系
+        for project in project_entities:
+            for device in device_entities:
+                relation_key = f"{project.name}-使用-{device.name}"
+                if relation_key not in seen_relations:
+                    if self._in_same_sentence(text, project, device):
+                        seen_relations.add(relation_key)
+                        relations.append(RelationInfo(
+                            from_entity=project.name,
+                            from_entity_id=project.temp_id,
+                            to_entity=device.name,
+                            to_entity_id=device.temp_id,
+                            relation_type="使用",
+                            confidence=0.6
+                        ))
+        
+        logger.info(f"规则关系抽取完成: relation_count={len(relations)}")
+        return relations
+    
+    def _in_same_sentence(self, text: str, entity1: EntityInfo, entity2: EntityInfo) -> bool:
+        """Return True if both entities appear within the same sentence."""
+        if not entity1.position or not entity2.position:
+            return False
+        
+        # Span that covers both entities
+        start = min(entity1.position.char_start, entity2.position.char_start)
+        end = max(entity1.position.char_end, entity2.position.char_end)
+        
+        # A sentence-ending mark inside the span means different sentences
+        between_text = text[start:end]
+        sentence_enders = ["。", "!", "?", ".", "!", "?", "\n\n"]
+        
+        for ender in sentence_enders:
+            if ender in between_text:
+                return False
+        
+        # Distance limit
+        if end - start > 200:
+            return False
+        
+        return True
+    
+    async def _extract_by_api(
+        self, 
+        text: str, 
+        entities: List[EntityInfo]
+    ) -> List[RelationInfo]:
+        """
+        调用外部 API 进行关系抽取
+        """
+        # TODO: 实现 API 关系抽取
+        logger.warning("API 关系抽取尚未实现,回退到规则模式")
+        return await self._extract_by_rules(text, entities)
+
+
+# 创建单例
+relation_service = RelationService()
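
The adjacency rule in `_extract_by_rules` comes down to slicing the text between two neighbouring entities and looking for trigger words. A standalone sketch of that step follows, assuming a simplified `Entity` tuple in place of `EntityInfo` and only two of the relation types. `Entity`, `MAX_GAP`, and `relations_between` are hypothetical names.

```python
from typing import Dict, List, NamedTuple


class Entity(NamedTuple):
    name: str
    char_start: int
    char_end: int


# Subset of the trigger words defined in relation_service.py.
RELATION_PATTERNS: Dict[str, List[str]] = {
    "负责": ["负责", "承担", "主管", "管理"],
    "使用": ["使用", "采用", "利用", "应用"],
}

MAX_GAP = 100  # same distance limit as the service, in characters


def relations_between(text: str, left: Entity, right: Entity) -> List[str]:
    """Return every relation type whose trigger word occurs between the two entities."""
    gap = text[left.char_end:right.char_start]
    # An empty gap means the entities touch or are out of order; a long gap is skipped.
    if not gap or len(gap) > MAX_GAP:
        return []
    return [
        relation_type
        for relation_type, triggers in RELATION_PATTERNS.items()
        if any(trigger in gap for trigger in triggers)
    ]


text = "成都检测公司负责环境监测项目"
org = Entity("成都检测公司", 0, 6)
project = Entity("环境监测项目", 8, 14)
found = relations_between(text, org, project)
```

Here the gap is `负责`, so `found` is `["负责"]`. The trigger words are literal strings, so a plain substring test is used instead of `re.search`. The result always points from the earlier entity to the later one, which may not match the true direction in passive constructions.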

+ 28 - 0
python-services/ner-service/requirements.txt

@@ -0,0 +1,28 @@
+# FastAPI and server
+fastapi==0.109.0
+uvicorn[standard]==0.27.0
+python-multipart==0.0.6
+
+# Pydantic for data validation
+pydantic==2.5.3
+pydantic-settings==2.1.0
+
+# HTTP client
+httpx==0.26.0
+aiohttp==3.9.1
+
+# NER and NLP
+# spacy==3.7.2  # Uncomment if using spaCy
+# transformers==4.36.2  # Uncomment if using Transformers
+# torch==2.1.2  # Uncomment if using PyTorch
+
+# For DeepSeek/Qwen API fallback
+openai==1.6.1
+
+# Utilities
+python-dotenv==1.0.0
+loguru==0.7.2
+
+# Testing
+pytest==7.4.4
+pytest-asyncio==0.23.3

+ 3 - 0
python-services/ner-service/tests/__init__.py

@@ -0,0 +1,3 @@
+"""
+测试模块
+"""

+ 104 - 0
python-services/ner-service/tests/test_ner.py

@@ -0,0 +1,104 @@
+"""
+NER 服务测试
+"""
+import pytest
+from fastapi.testclient import TestClient
+
+from app.main import app
+
+
+client = TestClient(app)
+
+
+def test_health_check():
+    """测试健康检查接口"""
+    response = client.get("/health")
+    assert response.status_code == 200
+    data = response.json()
+    assert data["status"] == "ok"
+    assert "version" in data
+
+
+def test_extract_entities():
+    """Test the entity extraction endpoint."""
+    request_data = {
+        "documentId": "test-doc-001",
+        "text": "2024年5月15日,成都检测公司在成都市高新区完成了环境监测项目的检测工作,使用了噪音检测设备。",
+        "extractRelations": True
+    }
+    
+    response = client.post("/ner/extract", json=request_data)
+    assert response.status_code == 200
+    
+    data = response.json()
+    assert data["success"] is True
+    assert data["documentId"] == "test-doc-001"
+    assert "entities" in data
+    assert len(data["entities"]) > 0
+    
+    # Verify the extracted entity types
+    entity_types = {e["type"] for e in data["entities"]}
+    assert "DATE" in entity_types or "ORG" in entity_types
+
+
+def test_extract_relations():
+    """Test the relation extraction endpoint."""
+    request_data = {
+        "documentId": "test-doc-002",
+        "text": "成都检测公司负责环境监测项目",
+        "entities": [
+            {
+                "name": "成都检测公司",
+                "type": "ORG",
+                "value": "成都检测公司",
+                "position": {"charStart": 0, "charEnd": 6, "line": 1},
+                "tempId": "e1"
+            },
+            {
+                "name": "环境监测项目",
+                "type": "PROJECT",
+                "value": "环境监测项目",
+                "position": {"charStart": 8, "charEnd": 14, "line": 1},
+                "tempId": "e2"
+            }
+        ]
+    }
+    
+    response = client.post("/ner/relations", json=request_data)
+    assert response.status_code == 200
+    
+    data = response.json()
+    assert data["success"] is True
+    assert "relations" in data
+
+
+def test_empty_text():
+    """Test an empty text input."""
+    request_data = {
+        "documentId": "test-doc-003",
+        "text": "",
+        "extractRelations": False
+    }
+    
+    response = client.post("/ner/extract", json=request_data)
+    assert response.status_code == 200
+    
+    data = response.json()
+    assert data["success"] is True
+    assert len(data["entities"]) == 0
+
+
+def test_text_too_long():
+    """Test a text that exceeds the length limit."""
+    request_data = {
+        "documentId": "test-doc-004",
+        "text": "a" * 60000,  # exceeds the limit
+        "extractRelations": False
+    }
+    
+    response = client.post("/ner/extract", json=request_data)
+    assert response.status_code == 400  # should return an error
+
+
+if __name__ == "__main__":
+    pytest.main([__file__, "-v"])
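The relation test above hardcodes `charStart`/`charEnd` offsets, which are easy to get wrong when the sample text changes. A minimal sketch of a sanity check follows; `find_offset_mismatches` is a hypothetical helper, and it assumes `charEnd` is exclusive, which matches the payload in the test (six characters at 0 to 6).

```python
def find_offset_mismatches(text, entities):
    """Return the tempIds whose [charStart, charEnd) slice does not equal value."""
    mismatches = []
    for entity in entities:
        pos = entity["position"]
        if text[pos["charStart"]:pos["charEnd"]] != entity["value"]:
            mismatches.append(entity["tempId"])
    return mismatches


# Payload copied from test_extract_relations above.
text = "成都检测公司负责环境监测项目"
entities = [
    {"name": "成都检测公司", "type": "ORG", "value": "成都检测公司",
     "position": {"charStart": 0, "charEnd": 6, "line": 1}, "tempId": "e1"},
    {"name": "环境监测项目", "type": "PROJECT", "value": "环境监测项目",
     "position": {"charStart": 8, "charEnd": 14, "line": 1}, "tempId": "e2"},
]
mismatches = find_offset_mismatches(text, entities)
```

A check like this could run inside the test before the request is sent, so that a bad fixture fails with a clear message instead of an unexpected relation result.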

+ 113 - 0
test/test_ner_e2e.sh

@@ -0,0 +1,113 @@
+#!/bin/bash
+
+# End-to-end test script for the NER service
+# Flow: NER service health check -> entity extraction -> relation extraction -> document NER
+
+set -e
+
+# Configuration
+NER_SERVICE_URL="${NER_SERVICE_URL:-http://localhost:8001}"
+JAVA_SERVICE_URL="${JAVA_SERVICE_URL:-http://localhost:5232}"
+
+echo "=========================================="
+echo "NER service end-to-end test"
+echo "=========================================="
+echo "NER service URL: $NER_SERVICE_URL"
+echo "Java service URL: $JAVA_SERVICE_URL"
+echo ""
+
+# Test 1: Python NER service health check
+echo ">> Test 1: Python NER service health check"
+response=$(curl -s -X GET "$NER_SERVICE_URL/health")
+echo "Response: $response"
+if echo "$response" | grep -q '"status":"ok"'; then
+    echo "✅ Python NER service health check passed"
+else
+    echo "❌ Python NER service health check failed"
+    exit 1
+fi
+echo ""
+
+# Test 2: entity extraction
+echo ">> Test 2: entity extraction"
+response=$(curl -s -X POST "$NER_SERVICE_URL/ner/extract" \
+    -H "Content-Type: application/json" \
+    -d '{
+        "documentId": "test-doc-001",
+        "text": "2024年5月15日,成都检测公司在成都市高新区完成了环境监测项目的检测工作,项目负责人张经理使用了噪音检测设备进行测量,测量结果显示噪音值为50分贝。",
+        "extractRelations": true
+    }')
+echo "Response: $response"
+if echo "$response" | grep -q '"success":true'; then
+    entity_count=$(echo "$response" | grep -o '"entityCount":[0-9]*' | grep -o '[0-9]*' || true)
+    relation_count=$(echo "$response" | grep -o '"relationCount":[0-9]*' | grep -o '[0-9]*' || true)
+    echo "✅ Entity extraction succeeded: entityCount=$entity_count, relationCount=$relation_count"
+else
+    echo "❌ Entity extraction failed"
+    exit 1
+fi
+echo ""
+
+# Test 3: relation extraction
+echo ">> Test 3: relation extraction"
+response=$(curl -s -X POST "$NER_SERVICE_URL/ner/relations" \
+    -H "Content-Type: application/json" \
+    -d '{
+        "documentId": "test-doc-002",
+        "text": "成都检测公司负责环境监测项目",
+        "entities": [
+            {
+                "name": "成都检测公司",
+                "type": "ORG",
+                "value": "成都检测公司",
+                "position": {"charStart": 0, "charEnd": 6, "line": 1},
+                "tempId": "e1"
+            },
+            {
+                "name": "环境监测项目",
+                "type": "PROJECT",
+                "value": "环境监测项目",
+                "position": {"charStart": 8, "charEnd": 14, "line": 1},
+                "tempId": "e2"
+            }
+        ]
+    }')
+echo "Response: $response"
+if echo "$response" | grep -q '"success":true'; then
+    echo "✅ Relation extraction succeeded"
+else
+    echo "❌ Relation extraction failed"
+    exit 1
+fi
+echo ""
+
+# Test 4: Java service health check (only if the Java service is running)
+echo ">> Test 4: Java service health check"
+if curl -s -f "$JAVA_SERVICE_URL/actuator/health" > /dev/null 2>&1; then
+    response=$(curl -s "$JAVA_SERVICE_URL/actuator/health")
+    echo "Response: $response"
+    echo "✅ Java service health check passed"
+    
+    # Test 5: call NER through the Java service
+    echo ""
+    echo ">> Test 5: call NER through the Java service"
+    response=$(curl -s -X POST "$JAVA_SERVICE_URL/api/ner/extract" \
+        -H "Content-Type: application/json" \
+        -d '{
+            "documentId": "test-doc-003",
+            "text": "2024年6月1日,湖北环保局在武汉市武昌区开展了水质检测工作。"
+        }')
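+    echo "Response: $response"
+    if echo "$response" | grep -q '"code":200'; then
+        echo "✅ NER call through the Java service succeeded"
+    else
+        echo "⚠️  NER call through the Java service may have failed (check the logs)"
+    fi
+else
+    echo "⚠️  Java service is not running; skipping the Java service tests"
+fi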
+echo ""
+
+echo "=========================================="
+echo "Tests complete!"
+echo "=========================================="
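The script above checks responses with `grep` patterns such as `"success":true`, which only match compact JSON with no space after the colon. A minimal sketch of a whitespace-tolerant alternative is shown here; `summarize_ner_response` and the sample body are hypothetical, and only the field names `success`, `entityCount`, and `relationCount` come from the script.

```python
import json


def summarize_ner_response(raw: str) -> dict:
    """Parse a response body and pull out the fields the script greps for."""
    data = json.loads(raw)
    return {
        "success": data.get("success") is True,
        "entityCount": int(data.get("entityCount", 0)),
        "relationCount": int(data.get("relationCount", 0)),
    }


# Hypothetical response body with spaces after the colons,
# which the grep patterns in the script would not match.
sample = '{"success": true, "documentId": "test-doc-001", "entityCount": 7, "relationCount": 3}'
summary = summarize_ner_response(sample)
```

Missing count fields fall back to 0 here rather than aborting, which is one possible policy; a stricter check could raise instead.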

+ 35 - 33
进度报告.md

@@ -1,29 +1,45 @@
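 # 📊 灵越智报 2.0 - Current Progress Summary
 
-**Overall progress: 50%**  |  **Report date: 2026-01-17**
+**Overall progress: 55%**  |  **Report date: 2026-01-19**
 
 ---
 
-## ✅ Completed (50%)
+## ✅ Completed (55%)
 
 ### Infrastructure
 - Spring Boot 3.1.5 monolithic application architecture (lingyue-starter)
 - Database (PostgreSQL + pgvector), cache (Redis), and message queue (RabbitMQ) configured
 - Scaffolding for the 6 major service modules in place
-- **Docker containerized deployment configured**
 
 ### Core module status
 - 📁 Document management, parsing, authentication → scaffolding complete
 - 🤖 AI service, graph service → data layer complete, RAG implemented
 - 🔍 **RAG vector storage** → ✅ Done
-- 🐳 **Containerized deployment** → ✅ Done
+- 🏷️ **NER entity recognition service** → ✅ Done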
+
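+### New features (2026-01-19)
+- ✅ **Full NER service implementation**
+  - Python FastAPI NER service (rule mode, extensible to spaCy/Transformers/API)
+  - Java NER client (PythonNerClient)
+  - NER DTO classes (NerRequest, NerResponse, EntityInfo, RelationInfo, etc.)
+  - NER API endpoints (/api/ner/extract, /api/ner/document/{id})
+- ✅ **Relation extraction service**
+  - Rule-based relation extraction (positional proximity, semantic pattern matching)
+  - Relation extraction API (/api/ner/relations)
+- ✅ **Graph database service extensions**
+  - GraphNodeService (node/relation CRUD, batch operations)
+  - Graph database API (/api/graph/nodes, /api/graph/relations)
+  - Document node statistics endpoint
+- ✅ **Parsing pipeline integration**
+  - Document-parsed event (DocumentParsedEvent)
+  - Listener that triggers NER automatically
+  - Full parse → RAG → NER → graph database chain
+  - NER service configuration keys (ner.python-service.url, etc.)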
 
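 ### New features (2026-01-17)
 - ✅ **Monolith refactoring** - consolidated into the lingyue-starter module
 - ✅ **Unified configuration files** - all in .properties format, .yml files removed
 - ✅ **Unified MyBatisPlusConfig** - per-module configs removed, managed centrally
-- ✅ **Docker deployment configuration** - Dockerfile + docker-compose.yml
-- ✅ **Deployment script** - deploy.sh (supports start/stop/restart/logs)
 - ✅ **Configuration guide** - CONFIG_GUIDE.md (environment variables, configuration precedence)
 - ✅ **API testing complete** - verified that the core endpoints are available
   - /actuator/health (health check)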
@@ -46,32 +62,12 @@
 
 ---
 
-## 🐳 Deployment architecture
-
-```
-docker-compose.yml
-├── postgres (PostgreSQL 16 + pgvector)
-├── redis (Redis 7)
-├── rabbitmq (RabbitMQ 3.13 + Management)
-├── lingyue-app (Spring Boot application)
-└── paddleocr (optional, --profile with-ocr)
-```
-
-**Quick deployment commands:**
-```bash
-./deploy.sh start           # start the base services
-./deploy.sh start-with-ocr  # start the full stack including OCR
-./deploy.sh logs            # view logs
-```
-
----
-
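 ## ⚠️ Key gaps (checked against the technical pre-research table)
 
 | Pre-research item | Progress | Table requirement | Completed ✅ | To do ❌ |
 |--------|------|----------|-----------|-----------|
 | **1️⃣ Rule "agent" design** | 35% | Report-generation rules are diverse (character logic, semantic understanding, multi-layer entity-relation computation) | Graph Service architecture (9 Repositories)<br>Rule and template data models<br>Data access layer<br>**RAG Q&A service** | Rule DSL definition and parsing<br>Rule execution engine<br>Multi-layer computation algorithms |
-| **2️⃣ Product positioning and functional logic** | 40% | Product UI, agent cluster, rule logic validation | 6 backend service frameworks<br>Flutter project structure<br>Routing, theme, base components<br>**Docker deployment configuration** | All frontend page UIs<br>Agent cluster architecture<br>Rule validation<br>Frontend-backend API integration |
+| **2️⃣ Product positioning and functional logic** | 40% | Product UI, agent cluster, rule logic validation | 6 backend service frameworks<br>Flutter project structure<br>Routing, theme, base components | All frontend page UIs<br>Agent cluster architecture<br>Rule validation<br>Frontend-backend API integration |
 | **3️⃣ Rule agent simulation** | 40% | Single-rule logic tree construction, rule testing, API memoization (knowledge graph) | TextStorage (text storage)<br>GraphNode, GraphRelation<br>ParseTask (task management)<br>**Text chunking, vector storage**<br>**Vector similarity search** | Rule logic tree algorithm<br>Single-rule validation engine<br>Knowledge graph construction algorithm<br>Graph query and reasoning |
 | **4️⃣ AI modality models/OCR** | 60% | AI modality models, OCR, text analysis code, NSDK integration | PaddleOCR Client interface<br>PDF/Word/Excel text extraction<br>AI Service framework<br>Element, Annotation entities<br>**DeepSeek API client**<br>**Ollama Embedding service** | AI modality model integration<br>NSDK integration<br>NLP text analysis algorithms<br>OCR post-processing optimization |
 | **5️⃣ Frontend interaction design** | 15% | "Interaction should be simple" experience for the AI product, core interaction feature planning | Flutter project structure<br>Routing, theme configuration<br>Base and business components<br>7 page skeletons | UI for all 7 core pages<br>Page interaction logic<br>Backend API integration<br>WebSocket real-time communication |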
@@ -91,7 +87,6 @@ backend/
 ├── notification-service/   # Notification service (WebSocket)
 ├── gateway-service/        # Gateway service (JWT filter, CORS)
 ├── lingyue-starter/        # Monolith launcher (unified configuration)
-├── Dockerfile              # Docker image build
 └── sql/                    # Database initialization scripts
 
 frontend_flutter/
@@ -113,11 +108,18 @@ frontend_flutter/
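 **Goal: get the full "parse text → vectorize → NER → relation building → graph database" pipeline working** (design doc 2.2.2 / 2.3 / 10.2)
 
 **Daily task plan:**
-- **01-19 (Mon)**: NER service interface definition and data structures; Python NER service integration prep (input/output formats)
-- **01-20 (Tue)**: Wire up the NER interface (ai-service calls Python); persist the entity list to `graph_nodes`
-- **01-21 (Wed)**: Relation extraction interface (entity-based); persist relations to `graph_relations`
-- **01-22 (Thu)**: Complete the graph node/relation query endpoints; per-document queries and basic validation
-- **01-23 (Fri)**: End-to-end verification (upload → parse → NER → relations → graph); issue triage and documentation updates
+- **01-19 (Mon) ✅ Done**: NER service interface definition and data structures; Python NER service integration prep (input/output formats)
+  - ✅ NER DTO classes (NerRequest, NerResponse, EntityInfo, RelationInfo, etc.)
+  - ✅ Python FastAPI NER service skeleton
+  - ✅ Entity extraction and relation extraction (rule mode)
+  - ✅ Java PythonNerClient HTTP client
+  - ✅ NerServiceImpl business logic
+  - ✅ GraphNodeService extensions
+  - ✅ TextStorageService integration (triggers NER automatically)
+- **01-20 (Tue)**: Integration testing and tuning; improve error handling
+- **01-21 (Wed)**: Relation extraction improvements; support for more entity types
+- **01-22 (Thu)**: Graph node/relation query endpoint tuning; performance testing
+- **01-23 (Fri)**: End-to-end verification; issue triage and documentation updates
 
 1. **NER entity extraction (highest priority)**
    - Design requirement: two-pass processing (entities → relations), offline approach preferred (design doc 2.3.1 / 10.2)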