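name: RAG Vectorized Storage Implementation
overview: A RAG system built on pgvector + Ollama (bge-large-zh-v1.5) + the DeepSeek API, covering text chunking, embedding, vector storage, similarity search, and answer generation.
todos: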
flowchart LR
subgraph input [Document input]
A[Upload document] --> B[Extract text]
end
subgraph vectorization [Vectorization]
B --> C[Chunk text]
C --> D[Ollama Embedding]
D --> E[Store in pgvector]
end
subgraph query [RAG query]
F[User question] --> G[Embed question]
G --> H[Vector search]
H --> I[Build context]
I --> J[DeepSeek generation]
J --> K[Return answer]
end
E --> H
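The context-building step in the flowchart (node `I`) is not specified further in this plan. A minimal sketch, assuming the retrieved chunks arrive as plain strings and that a numbered-context prompt is acceptable; the class name, method name, and prompt wording are all hypothetical:

```java
import java.util.List;

/** Hypothetical helper: joins retrieved chunks and the user question into one prompt. */
class RagPromptBuilder {

    /** Numbers each chunk so the model can refer to it, then appends the question. */
    static String buildPrompt(List<String> chunks, String question) {
        StringBuilder sb = new StringBuilder();
        sb.append("Answer the question using only the context below.\n\n");
        for (int i = 0; i < chunks.size(); i++) {
            sb.append("[").append(i + 1).append("] ").append(chunks.get(i)).append("\n");
        }
        sb.append("\nQuestion: ").append(question).append("\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Example input only; real chunks would come from the vector search step.
        List<String> chunks = List.of("pgvector stores embeddings.", "HNSW speeds up search.");
        System.out.println(buildPrompt(chunks, "Where are embeddings stored?"));
    }
}
```

The resulting string would be what the plan's `complete(prompt)` call receives, or the content of a single user message for `chat(messages)`.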
Add the pgvector extension and the new tables to backend/sql/init.sql.
New SQL:
-- Enable the vector extension
CREATE EXTENSION IF NOT EXISTS vector;
-- Text chunk table
CREATE TABLE text_chunks (
id VARCHAR(32) PRIMARY KEY,
document_id VARCHAR(32) NOT NULL,
text_storage_id VARCHAR(32),
chunk_index INTEGER NOT NULL,
content TEXT NOT NULL,
token_count INTEGER,
metadata JSONB,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Vector embedding table
CREATE TABLE vector_embeddings (
id VARCHAR(32) PRIMARY KEY,
chunk_id VARCHAR(32) NOT NULL REFERENCES text_chunks(id) ON DELETE CASCADE,
embedding vector(1024), -- dimension of bge-large-zh-v1.5
model_name VARCHAR(100) DEFAULT 'bge-large-zh-v1.5',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- HNSW vector index
CREATE INDEX idx_vector_embeddings_hnsw ON vector_embeddings
USING hnsw (embedding vector_cosine_ops);
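The `vector_cosine_ops` operator class indexes cosine distance, which pgvector exposes as the `<=>` operator: 1 minus the cosine similarity, so 0 means same direction and smaller values mean closer matches. The sketch below computes the same quantity in plain Java for illustration, and shows one possible query shape against the tables above. The class name, the constant, and the exact SQL are assumptions, not part of the plan:

```java
/** Plain-Java illustration of cosine distance; all names here are hypothetical. */
class CosineDistanceSketch {

    // Hypothetical query shape. Parameters in order: query vector literal,
    // document id, query vector literal again, topK. Ascending order puts
    // the most similar chunks first.
    static final String SEARCH_SQL =
        "SELECT c.id, c.content, e.embedding <=> CAST(? AS vector) AS distance "
        + "FROM vector_embeddings e JOIN text_chunks c ON c.id = e.chunk_id "
        + "WHERE c.document_id = ? "
        + "ORDER BY e.embedding <=> CAST(? AS vector) LIMIT ?";

    /** Returns 1 - cos(a, b). The result is NaN if either vector is all zeros. */
    static double cosineDistance(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        return 1.0 - dot / Math.sqrt(normA * normB);
    }

    public static void main(String[] args) {
        float[] x = {1f, 0f};
        float[] y = {0f, 1f};
        System.out.println(cosineDistance(x, y)); // orthogonal vectors
    }
}
```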
New file: backend/graph-service/src/main/java/com/lingyue/graph/entity/TextChunk.java
@Data
@TableName(value = "text_chunks", autoResultMap = true)
public class TextChunk extends SimpleModel {
private String documentId;
private String textStorageId;
private Integer chunkIndex;
private String content;
private Integer tokenCount;
private Map<String, Object> metadata;
}
New file: backend/graph-service/src/main/java/com/lingyue/graph/entity/VectorEmbedding.java
@Data
@TableName("vector_embeddings")
public class VectorEmbedding extends SimpleModel {
private String chunkId;
private String embedding; // string in pgvector text format
private String modelName;
}
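Because `embedding` is held as a `String`, the float array returned by the embedding model has to be converted to pgvector's text form, a bracketed comma-separated list such as `[0.5,-1.0,2.25]`. A minimal sketch of that conversion, assuming the plan does it by hand rather than through a MyBatis type handler; the class and method names are hypothetical:

```java
import java.util.StringJoiner;

/** Hypothetical converter between float[] and pgvector's text representation. */
class PgVectorLiteral {

    /** Formats a vector as "[v1,v2,...]"; rejects values pgvector cannot store. */
    static String toLiteral(float[] values) {
        StringJoiner joiner = new StringJoiner(",", "[", "]");
        for (float v : values) {
            if (Float.isNaN(v) || Float.isInfinite(v)) {
                throw new IllegalArgumentException("vector elements must be finite");
            }
            joiner.add(Float.toString(v));
        }
        return joiner.toString();
    }

    /** Parses "[v1,v2,...]" back into a float array. */
    static float[] fromLiteral(String literal) {
        String body = literal.trim();
        if (!body.startsWith("[") || !body.endsWith("]")) {
            throw new IllegalArgumentException("expected a bracketed list: " + literal);
        }
        body = body.substring(1, body.length() - 1).trim();
        if (body.isEmpty()) {
            return new float[0];
        }
        String[] parts = body.split(",");
        float[] out = new float[parts.length];
        for (int i = 0; i < parts.length; i++) {
            out[i] = Float.parseFloat(parts[i].trim());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(toLiteral(new float[] {0.5f, -1f, 2.25f}));
    }
}
```

When the string is written to the `vector(1024)` column it still needs a cast on the SQL side (for example `CAST(? AS vector)`), since the JDBC driver sends it as text.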
New file: backend/graph-service/src/main/java/com/lingyue/graph/repository/TextChunkRepository.java
New file: backend/graph-service/src/main/java/com/lingyue/graph/repository/VectorEmbeddingRepository.java
New file: backend/graph-service/src/main/java/com/lingyue/graph/service/TextChunkService.java
Core functions:
chunkText(documentId, textStorageId, text) - text chunking
New file: backend/graph-service/src/main/java/com/lingyue/graph/service/OllamaEmbeddingService.java
Core functions:
embed(text) - embed a single text
embedBatch(texts) - embed a batch of texts
Endpoint: POST http://localhost:11434/api/embeddings
Model: bge-large-zh-v1.5 or nomic-embed-text
New file: backend/graph-service/src/main/java/com/lingyue/graph/service/VectorSearchService.java
Core functions:
search(query, topK, documentId) - vector similarity search
Uses the <=> cosine distance operator
New file: backend/graph-service/src/main/java/com/lingyue/graph/service/RAGService.java
Core functions:
query(question, documentId, topK) - RAG question answering
New file: backend/ai-service/src/main/java/com/lingyue/ai/client/DeepSeekClient.java
Core functions:
complete(prompt) - text generation
chat(messages) - chat generation
New files:
backend/ai-service/src/main/java/com/lingyue/ai/dto/ChatRequest.java
backend/ai-service/src/main/java/com/lingyue/ai/dto/ChatResponse.java
New file: backend/graph-service/src/main/java/com/lingyue/graph/config/VectorConfig.java
Configures the Ollama WebClient bean.
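For reference, a request to Ollama's `/api/embeddings` endpoint carries a JSON body with `model` and `prompt` fields, and the response holds the vector under `embedding`. The plan uses a Spring `WebClient`; the sketch below swaps in the JDK's `java.net.http` classes so it stays dependency-free, and only builds the request without sending it. The class name and the hand-rolled JSON escaping are illustrative assumptions; a real service would more likely serialize with Jackson:

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical builder for an Ollama embeddings request (JDK classes only). */
class OllamaEmbeddingRequest {

    /** Escapes the characters that must not appear raw inside a JSON string. */
    static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Builds the JSON body: {"model": ..., "prompt": ...}. */
    static String body(String model, String text) {
        return "{\"model\":\"" + jsonEscape(model) + "\",\"prompt\":\"" + jsonEscape(text) + "\"}";
    }

    /** Builds (but does not send) the POST request against baseUrl + /api/embeddings. */
    static HttpRequest build(String baseUrl, String model, String text) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/api/embeddings"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body(model, text)))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest req = build("http://localhost:11434", "nomic-embed-text", "hello");
        System.out.println(req.method() + " " + req.uri());
    }
}
```

Sending it would be a matter of `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and reading the `embedding` array out of the response JSON.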
Add the following to backend/lingyue-starter/src/main/resources/application.properties:
# Ollama embedding configuration
ollama.url=http://localhost:11434
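# Note: nomic-embed-text returns 768-dimensional vectors, while the vector(1024) column assumes bge-large-zh-v1.5; the model and column dimension must match
ollama.embedding.model=nomic-embed-text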
# DeepSeek API configuration
deepseek.api.url=https://api.deepseek.com
deepseek.api.key=${DEEPSEEK_API_KEY:}
deepseek.api.model=deepseek-chat
# RAG configuration
rag.chunk.size=500
rag.chunk.overlap=50
rag.search.top-k=3
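The plan does not say whether `rag.chunk.size` and `rag.chunk.overlap` count characters or tokens. A minimal sketch of fixed-size chunking with overlap, assuming character counts (Java `char` units); the class and method names are hypothetical, and a production chunker would probably also respect sentence boundaries:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sliding-window chunker; sizes are measured in Java chars. */
class OverlapChunker {

    /**
     * Splits text into windows of at most `size` chars, where each window
     * starts `size - overlap` chars after the previous one.
     */
    static List<String> chunk(String text, int size, int overlap) {
        if (size <= 0 || overlap < 0 || overlap >= size) {
            throw new IllegalArgumentException("require size > 0 and 0 <= overlap < size");
        }
        List<String> chunks = new ArrayList<>();
        int step = size - overlap;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + size, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window already reached the end of the text
            }
        }
        return chunks;
    }

    public static void main(String[] args) {
        // With size=500 and overlap=50, windows start at 0, 450, 900, ...
        String text = "x".repeat(1200);
        System.out.println(chunk(text, 500, 50).size());
    }
}
```

Each returned string would map to one `text_chunks` row, with its list position stored as `chunk_index`.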
New file: backend/graph-service/src/main/java/com/lingyue/graph/controller/RAGController.java
API endpoints:
POST /api/rag/index - index a document (chunking + embedding)
POST /api/rag/query - RAG question answering
GET /api/rag/chunks/{documentId} - get a document's chunks
DELETE /api/rag/index/{documentId} - delete a document's index
Modify backend/graph-service/src/main/java/com/lingyue/graph/service/TextStorageService.java
saveAndIndex(documentId, filePath)
Modify backend/parse-service/src/main/java/com/lingyue/parse/service/ParseService.java
curl -fsSL https://ollama.com/install.sh | sh
# Recommended: nomic-embed-text (natively supported by Ollama)
ollama pull nomic-embed-text
# Or use a bge-series model (confirm Ollama support first)
ollama pull bge-large
| Type | File path | Description |
|-----|---------|------|
| SQL | backend/sql/rag_tables.sql | RAG-related tables |
| Entity | graph-service/.../entity/TextChunk.java | Text chunk entity |
| Entity | graph-service/.../entity/VectorEmbedding.java | Vector embedding entity |
| Repository | graph-service/.../repository/TextChunkRepository.java | Chunk data access |
| Repository | graph-service/.../repository/VectorEmbeddingRepository.java | Vector data access |
| Service | graph-service/.../service/TextChunkService.java | Text chunking service |
| Service | graph-service/.../service/OllamaEmbeddingService.java | Ollama embedding |
| Service | graph-service/.../service/VectorSearchService.java | Vector search |
| Service | graph-service/.../service/RAGService.java | Core RAG service |
| Config | graph-service/.../config/VectorConfig.java | Vector configuration |
| Client | ai-service/.../client/DeepSeekClient.java | DeepSeek client |
| DTO | ai-service/.../dto/ChatRequest.java | Request DTO |
| DTO | ai-service/.../dto/ChatResponse.java | Response DTO |
| Controller | graph-service/.../controller/RAGController.java | RAG API |