---
name: RAG Vectorized Storage Implementation
overview: Implement a RAG system based on pgvector + Ollama (bge-large-zh-v1.5) + DeepSeek API, covering text chunking, embedding, vector storage, similarity search, and answer generation.
todos:
  - id: sql-schema
    content: Create the RAG tables in SQL (text_chunks, vector_embeddings)
    status: completed
  - id: entities
    content: Implement the TextChunk and VectorEmbedding entity classes
    status: completed
  - id: repositories
    content: Implement the Repository layer (including native SQL for vector search)
    status: completed
  - id: chunk-service
    content: Implement the text chunking service TextChunkService
    status: completed
  - id: ollama-service
    content: Implement the Ollama embedding service OllamaEmbeddingService
    status: completed
  - id: vector-search
    content: Implement the vector search service VectorSearchService
    status: completed
  - id: deepseek-client
    content: Implement the DeepSeek API client
    status: completed
  - id: rag-service
    content: Implement the core RAG service
    status: completed
  - id: rag-controller
    content: Implement the RAG API Controller
    status: completed
  - id: config
    content: Add configuration properties and configuration classes
    status: completed
  - id: integration
    content: Integrate with the existing TextStorageService and ParseService
    status: completed
---

# RAG Vectorized Storage Implementation Plan

## Technical Architecture

```mermaid
flowchart LR
    subgraph input [Document Input]
        A[Upload document] --> B[Text extraction]
    end
    subgraph vectorization [Vectorization]
        B --> C[Text chunking]
        C --> D[Ollama Embedding]
        D --> E[pgvector storage]
    end
    subgraph query [RAG Query]
        F[User question] --> G[Question embedding]
        G --> H[Vector search]
        H --> I[Context building]
        I --> J[DeepSeek generation]
        J --> K[Return answer]
    end
    E --> H
```

## 1. Database Layer

### 1.1 Enable the pgvector extension

Add the pgvector extension and the new tables to [backend/sql/init.sql](backend/sql/init.sql).

**New SQL**:

```sql
-- Enable the vector extension
CREATE EXTENSION IF NOT EXISTS vector;

-- Text chunk table
CREATE TABLE text_chunks (
    id VARCHAR(32) PRIMARY KEY,
    document_id VARCHAR(32) NOT NULL,
    text_storage_id VARCHAR(32),
    chunk_index INTEGER NOT NULL,
    content TEXT NOT NULL,
    token_count INTEGER,
    metadata JSONB,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Vector embedding table
CREATE TABLE vector_embeddings (
    id VARCHAR(32) PRIMARY KEY,
    chunk_id VARCHAR(32) NOT NULL REFERENCES text_chunks(id) ON DELETE CASCADE,
    embedding vector(1024), -- bge-large-zh-v1.5 dimension
    model_name VARCHAR(100) DEFAULT 'bge-large-zh-v1.5',
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- HNSW vector index (cosine distance)
CREATE INDEX idx_vector_embeddings_hnsw
    ON vector_embeddings USING hnsw (embedding vector_cosine_ops);
```

## 2. Entity Layer (graph-service)

### 2.1 TextChunk entity

**New file**: `backend/graph-service/src/main/java/com/lingyue/graph/entity/TextChunk.java`

```java
import com.baomidou.mybatisplus.annotation.TableName;
import lombok.Data;

import java.util.Map;

// SimpleModel is the project's existing base model class.
@Data
@TableName(value = "text_chunks", autoResultMap = true)
public class TextChunk extends SimpleModel {
    private String documentId;
    private String textStorageId;
    private Integer chunkIndex;
    private String content;
    private Integer tokenCount;
    // JSONB column; mapping it to a Map usually needs a type handler declared on this field
    private Map<String, Object> metadata;
}
```

### 2.2 VectorEmbedding entity

**New file**: `backend/graph-service/src/main/java/com/lingyue/graph/entity/VectorEmbedding.java`

```java
import com.baomidou.mybatisplus.annotation.TableName;
import lombok.Data;

@Data
@TableName("vector_embeddings")
public class VectorEmbedding extends SimpleModel {
    private String chunkId;
    private String embedding; // string in pgvector text format, e.g. "[0.1,0.2,...]"
    private String modelName;
}
```

## 3. Repository Layer

### 3.1 TextChunkRepository

**New file**: `backend/graph-service/src/main/java/com/lingyue/graph/repository/TextChunkRepository.java`

### 3.2 VectorEmbeddingRepository

**New file**: `backend/graph-service/src/main/java/com/lingyue/graph/repository/VectorEmbeddingRepository.java`

- Requires custom native SQL queries to implement vector similarity search

## 4. Service Layer

### 4.1 Text chunking service

**New file**: `backend/graph-service/src/main/java/com/lingyue/graph/service/TextChunkService.java`

Core functionality:

- `chunkText(documentId, textStorageId, text)`: split text into chunks
- Chunking strategy: 500 characters per chunk with a 50-character overlap
- Sentence-boundary-aware splitting (full stops, question marks, and similar punctuation)
- Automatic token count calculation

### 4.2 Ollama Embedding service

**New file**: `backend/graph-service/src/main/java/com/lingyue/graph/service/OllamaEmbeddingService.java`

Core functionality:

- `embed(text)`: embed a single text
- `embedBatch(texts)`: embed a batch of texts
- Calls the Ollama API: `POST http://localhost:11434/api/embeddings`
- Model: `bge-large-zh-v1.5` or `nomic-embed-text`

### 4.3 Vector search service

**New file**: `backend/graph-service/src/main/java/com/lingyue/graph/service/VectorSearchService.java`

Core functionality:

- `search(query, topK, documentId)`: vector similarity search
- Uses pgvector's `<=>` cosine distance operator
- Supports filtering by document ID

### 4.4 RAG service

**New file**: `backend/graph-service/src/main/java/com/lingyue/graph/service/RAGService.java`

Core functionality:

- `query(question, documentId, topK)`: RAG question answering
- Context building (concatenating the relevant text chunks)
- Prompt template management

## 5. AI Service Layer (ai-service)

### 5.1 DeepSeek client

**New file**: `backend/ai-service/src/main/java/com/lingyue/ai/client/DeepSeekClient.java`

Core functionality:

- `complete(prompt)`: text generation
- `chat(messages)`: chat generation
- API configuration: URL, API key, model selection

### 5.2 DeepSeek DTOs

**New files**:

- `backend/ai-service/src/main/java/com/lingyue/ai/dto/ChatRequest.java`
- `backend/ai-service/src/main/java/com/lingyue/ai/dto/ChatResponse.java`

## 6. Configuration Layer

### 6.1 Configuration class

**New file**: `backend/graph-service/src/main/java/com/lingyue/graph/config/VectorConfig.java`

Configures the Ollama WebClient bean.

### 6.2 Configuration properties

Add the following to [backend/lingyue-starter/src/main/resources/application.properties](backend/lingyue-starter/src/main/resources/application.properties):

```properties
# Ollama Embedding configuration
ollama.url=http://localhost:11434
ollama.embedding.model=nomic-embed-text

# DeepSeek API configuration
deepseek.api.url=https://api.deepseek.com
deepseek.api.key=${DEEPSEEK_API_KEY:}
deepseek.api.model=deepseek-chat

# RAG configuration
rag.chunk.size=500
rag.chunk.overlap=50
rag.search.top-k=3
```

## 7. Controller Layer

### 7.1 RAG Controller

**New file**: `backend/graph-service/src/main/java/com/lingyue/graph/controller/RAGController.java`

API endpoints:

- `POST /api/rag/index`: index a document (chunking + embedding)
- `POST /api/rag/query`: RAG question answering
- `GET /api/rag/chunks/{documentId}`: get the chunks of a document
- `DELETE /api/rag/index/{documentId}`: delete a document's index

## 8. Integration with Existing Code

### 8.1 Extend TextStorageService

Modify [backend/graph-service/src/main/java/com/lingyue/graph/service/TextStorageService.java](backend/graph-service/src/main/java/com/lingyue/graph/service/TextStorageService.java)

- Automatically trigger chunking and embedding after the text is saved
- New method: `saveAndIndex(documentId, filePath)`

### 8.2 Extend ParseService

Modify [backend/parse-service/src/main/java/com/lingyue/parse/service/ParseService.java](backend/parse-service/src/main/java/com/lingyue/parse/service/ParseService.java)

- Automatically call RAG indexing once parsing completes

## 9. Ollama Deployment

### 9.1 Install Ollama

```bash
curl -fsSL https://ollama.com/install.sh | sh
```

### 9.2 Pull the embedding model

```bash
# Recommended: nomic-embed-text (natively supported by Ollama)
ollama pull nomic-embed-text

# Or use a bge-series model (confirm that Ollama supports it)
ollama pull bge-large
```

Whichever model is used, its output dimension must match the `embedding` column: `vector(1024)` fits bge-large-zh-v1.5, whereas nomic-embed-text produces 768-dimensional vectors and would require `vector(768)`.

## 10. File List

| Type | File path | Description |
|-----|---------|------|
| SQL | `backend/sql/rag_tables.sql` | RAG tables |
| Entity | `graph-service/.../entity/TextChunk.java` | Text chunk entity |
| Entity | `graph-service/.../entity/VectorEmbedding.java` | Vector embedding entity |
| Repository | `graph-service/.../repository/TextChunkRepository.java` | Chunk data access |
| Repository | `graph-service/.../repository/VectorEmbeddingRepository.java` | Vector data access |
| Service | `graph-service/.../service/TextChunkService.java` | Text chunking service |
| Service | `graph-service/.../service/OllamaEmbeddingService.java` | Ollama embedding |
| Service | `graph-service/.../service/VectorSearchService.java` | Vector search |
| Service | `graph-service/.../service/RAGService.java` | Core RAG service |
| Config | `graph-service/.../config/VectorConfig.java` | Vector configuration |
| Client | `ai-service/.../client/DeepSeekClient.java` | DeepSeek client |
| DTO | `ai-service/.../dto/ChatRequest.java` | Request DTO |
| DTO | `ai-service/.../dto/ChatResponse.java` | Response DTO |
| Controller | `graph-service/.../controller/RAGController.java` | RAG API |
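
The chunking strategy listed for `TextChunkService` (500 characters per chunk, 50 characters of overlap, cuts placed on sentence boundaries where possible) could look roughly like the sketch below. This is an illustration under stated assumptions, not the project's actual code: the class name `ChunkSketch`, the method `chunk`, and the set of sentence-ending characters are all hypothetical, and token counting and persistence are left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of overlap chunking with sentence-boundary cuts.
final class ChunkSketch {

    // Assumed sentence terminators (Chinese and ASCII punctuation, newline).
    private static final String SENTENCE_END = "。！？.!?\n";

    static List<String> chunk(String text, int chunkSize, int overlap) {
        if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
            throw new IllegalArgumentException("require 0 <= overlap < chunkSize");
        }
        List<String> chunks = new ArrayList<>();
        int start = 0;
        while (start < text.length()) {
            int end = Math.min(start + chunkSize, text.length());
            if (end < text.length()) {
                // Walk back to the nearest sentence terminator. The search stops
                // before entering the overlap zone so the next start always advances.
                for (int i = end; i > start + overlap; i--) {
                    if (SENTENCE_END.indexOf(text.charAt(i - 1)) >= 0) {
                        end = i;
                        break;
                    }
                }
            }
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break;
            }
            // The next chunk re-reads the last `overlap` characters of this one.
            start = end - overlap;
        }
        return chunks;
    }

    public static void main(String[] args) {
        String text = "甲".repeat(8) + "。" + "乙".repeat(8);
        for (String c : chunk(text, 10, 2)) {
            System.out.println(c.length() + " chars: " + c);
        }
    }
}
```

With `rag.chunk.size=500` and `rag.chunk.overlap=50`, a call would be `ChunkSketch.chunk(text, 500, 50)`. When no terminator is found outside the overlap zone, the sketch falls back to a hard cut at the chunk size.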
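
For the native SQL mentioned under `VectorEmbeddingRepository` and `VectorSearchService`, the following sketch shows one way to format an embedding as a pgvector text literal and to build a cosine-distance query with an optional document filter. The class `PgVectorSketch` and its methods are hypothetical names for illustration. Only the `<=>` operator, the `[x,y,z]` literal format, and the table and column names come from pgvector and the schema in this plan. Since `<=>` returns cosine distance, the sketch reports similarity as `1 - distance`.

```java
// Hypothetical helper for the vector search SQL; not the project's actual repository code.
final class PgVectorSketch {

    /** Formats a float array as a pgvector text literal such as [0.5,-1.0,2.0]. */
    static String toVectorLiteral(float[] values) {
        StringBuilder sb = new StringBuilder("[");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(values[i]);
        }
        return sb.append(']').toString();
    }

    /**
     * Builds the search statement. Positional parameters, in order:
     * query vector literal, (optional) document id, query vector literal, topK.
     */
    static String searchSql(boolean filterByDocument) {
        return "SELECT c.id, c.chunk_index, c.content, "
             + "1 - (e.embedding <=> CAST(? AS vector)) AS similarity "
             + "FROM vector_embeddings e "
             + "JOIN text_chunks c ON c.id = e.chunk_id "
             + (filterByDocument ? "WHERE c.document_id = ? " : "")
             + "ORDER BY e.embedding <=> CAST(? AS vector) "
             + "LIMIT ?";
    }

    public static void main(String[] args) {
        System.out.println(toVectorLiteral(new float[] {0.5f, -1f, 2f})); // prints [0.5,-1.0,2.0]
        System.out.println(searchSql(true));
    }
}
```

Ordering by the raw `<=>` expression in ascending order is what lets the HNSW index created with `vector_cosine_ops` be used. The query vector is bound as a string and cast to `vector`, which matches the `String embedding` field on the entity.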