# RAG Pipeline Service

A pluggable RAG orchestration service that supports knowledge-graph-enhanced multimodal retrieval.

## Architecture

```
                    ┌─────────────────┐
                    │  RAG Pipeline   │
                    │   (9093 CPU)    │
                    └────────┬────────┘
                             │
          ┌──────────────────┼──────────────────┐
          │                  │                  │
    ┌─────▼─────┐      ┌─────▼─────┐      ┌─────▼─────┐
    │   CLIP    │      │    VDB    │      │   Graph   │
    │   9086    │      │   8886    │      │   9092    │
    │   GPU 2   │      │    CPU    │      │    CPU    │
    └───────────┘      └───────────┘      └───────────┘
          │                  │                  │
    ┌─────▼─────┐      ┌─────▼─────┐
    │BGE Rerank │      │    LLM    │
    │   9090    │      │ harnessed │
    │   GPU 2   │      │           │
    └───────────┘      └───────────┘
```

## Plugin Slots

| Slot | Current implementation | Replaceable with |
|------|------------------------|------------------|
| embedding | CLIP-ViT-H/14 (9086) | BGE-M3, OpenAI embedding |
| vdb | Milvus VDB (8886) | Qdrant, Weaviate |
| graph | NetworkX (9092) | FalkorDB, Neo4j |
| reranker | BGE-Reranker (9090) | Cohere, LLM rerank |
| llm | harnessed_agent | Any LLM |
| chunker | recursive/sentence | Custom chunking strategy |
| extractor | LLM-structured | spaCy, GraphRAG |
| retriever | hybrid/vector_only | Custom retrieval strategy |
| face | InsightFace (not yet deployed) | FaceNet, DeepFace |

## Pipeline Configuration

### kg-rag-standard (standard)
- CLIP embedding + Milvus + NetworkX graph + BGE Rerank
- Vector recall + graph expansion + RRF fusion + reranking

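The graph-expansion step listed for kg-rag-standard can be pictured as a bounded breadth-first walk outward from the entities matched by the query. The sketch below is only an illustration under assumptions: the function name `expand_entities`, the `max_hops` parameter, and the plain-dict adjacency list are hypothetical, and the actual graph service (NetworkX on port 9092) may expand differently.

```python
from collections import deque


def expand_entities(adjacency, seeds, max_hops=1):
    """Return {entity: hop_distance} for every entity within max_hops of a seed.

    `adjacency` is a hypothetical in-memory stand-in for the graph store:
    a dict mapping each entity to a list of its neighbours.
    """
    hops = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if hops[node] >= max_hops:
            continue  # do not walk past the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


# Toy graph built from the sentence used in the end-to-end test.
toy_graph = {
    "Zhang San": ["ABC Company", "Li Si"],
    "ABC Company": ["Zhang San"],
    "Li Si": ["Zhang San"],
}

print(expand_entities(toy_graph, ["Li Si"], max_hops=1))
print(expand_entities(toy_graph, ["Li Si"], max_hops=2))
```

With one hop, starting from `Li Si` reaches only `Zhang San`; with two hops it also reaches `ABC Company`. How expanded entities are mapped back to text chunks is not described in this README.
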
### kg-rag-lite (lightweight)
- CLIP embedding + Milvus + BGE Rerank
- Pure vector retrieval, no knowledge graph

## API

### GET /api/status
Returns the service status and the list of available plugins.

### POST /api/ingest
Ingests a document.

```json
{
  "document": "Text content...",
  "pipeline": "kg-rag-standard",
  "collection": "knowledge",
  "graph_name": "knowledge"
}
```

Flow: chunking → CLIP embedding → VDB storage → entity extraction → graph storage

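The ingest flow above can be read as a chain of pluggable steps. The following is a minimal sketch with toy stand-ins, not the service's code: `ingest`, `sentence_chunks`, `toy_embed`, and `toy_extract` are hypothetical names, the in-memory lists stand in for the VDB and graph stores, and skipping extraction when no graph is configured is an assumption about how a vector-only pipeline would behave.

```python
import re


def sentence_chunks(text):
    """Toy sentence chunker: split after ., !, ? and their full-width forms."""
    parts = re.split(r"(?<=[.!?。!?])\s*", text)
    return [p for p in parts if p]


def toy_embed(text):
    """Placeholder embedding; the real service calls CLIP instead."""
    return [float(len(text)), float(text.count(" "))]


def toy_extract(text):
    """Placeholder extractor: link the first and last word of a chunk."""
    words = text.rstrip(".").split()
    return [(words[0], "related_to", words[-1])] if len(words) >= 2 else []


def ingest(document, chunk, embed, vector_store, extract=None, graph_store=None):
    """Chunk, embed, and store; extract triples only when a graph is given."""
    chunks = chunk(document)
    for index, text in enumerate(chunks):
        vector_store.append({"id": index, "text": text, "vector": embed(text)})
        if extract is not None and graph_store is not None:
            graph_store.extend(extract(text))
    return len(chunks)


vectors, triples = [], []
count = ingest(
    "Zhang San works at ABC. Zhang San knows Li Si.",
    sentence_chunks, toy_embed, vectors, toy_extract, triples,
)
print(count, len(vectors), triples)
```

Passing the steps as callables mirrors the plugin-slot idea: swapping the chunker or extractor does not change the orchestration function.
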
### POST /api/search
Hybrid retrieval.

```json
{
  "query": "Search question",
  "pipeline": "kg-rag-standard",
  "collection": "knowledge",
  "graph_name": "knowledge",
  "top_k": 5
}
```

Flow: CLIP embed → VDB vector recall → graph expansion → RRF fusion → BGE Rerank

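The RRF fusion step in this flow merges the vector-recall ranking and the graph-expansion ranking into a single list. A minimal sketch of reciprocal rank fusion follows; the function name `rrf_fuse` and the chunk IDs are illustrative, and `k=60` is a commonly used constant rather than a value confirmed for this service.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_k=5):
    """Fuse ranked ID lists: score(d) = sum over lists of 1 / (k + rank(d))."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by ID so the output is deterministic.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:top_k]


vector_hits = ["c1", "c2", "c3"]  # hypothetical IDs from vector recall
graph_hits = ["c3", "c1", "c4"]   # hypothetical IDs from graph expansion

fused = rrf_fuse([vector_hits, graph_hits])
print([doc_id for doc_id, _ in fused])  # ['c1', 'c3', 'c2', 'c4']
```

Chunks found by both retrievers (`c1`, `c3`) rise above chunks found by only one, which is the point of fusing before the reranker sees the candidates.
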
### GET/POST /api/pipelines
Manages pipeline configurations.

```
POST: create a custom pipeline
GET: list all pipelines
GET ?name=xxx: get the specified pipeline
```

### GET /api/plugins
Lists all available plugins and their status.

## Deployment

```bash
cd /data/ymq/rag-pipeline
bash build.sh deploy
bash build.sh stop
bash build.sh status
```

## End-to-End Test

```bash
# 1. Ingest
curl -X POST http://localhost:9093/api/ingest \
  -H "Content-Type: application/json" \
  -d '{"document":"Zhang San is the technical director at ABC Company, and he and Li Si are colleagues.","pipeline":"kg-rag-standard"}'

# 2. Search
curl -X POST http://localhost:9093/api/search \
  -H "Content-Type: application/json" \
  -d '{"query":"Which company does Zhang San work for?","pipeline":"kg-rag-standard"}'
```

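The same two calls can be issued from Python with only the standard library. This is a sketch under assumptions: the helper names `build_request`, `post_json`, and `run_e2e` are illustrative, and it assumes the service answers with a JSON body, which this README does not state.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9093"  # port taken from this README


def build_request(path, payload):
    """Build a JSON POST request for the service without sending it."""
    data = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path, payload, timeout=30.0):
    """Send the request and decode the reply (assumed to be JSON)."""
    with urllib.request.urlopen(build_request(path, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def run_e2e():
    """Ingest one document, then search it; requires the service to be running."""
    post_json("/api/ingest", {
        "document": "Zhang San is the technical director at ABC Company, "
                    "and he and Li Si are colleagues.",
        "pipeline": "kg-rag-standard",
    })
    return post_json("/api/search", {
        "query": "Which company does Zhang San work for?",
        "pipeline": "kg-rag-standard",
        "top_k": 5,
    })
```

Calling `run_e2e()` against a deployed instance returns whatever the search endpoint replies with; the response schema is not documented here, so the sketch does not inspect it.
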
## Port

9093 (CPU only)

## Git

git@git.opencomputing.cn:yumoqing/rag-pipeline.git