feat: Phase 2.5 gaze_trace and lip_trace Qdrant migration + Charade Q&A test

Phase 2.5.1: gaze_trace_nodes from Qdrant
- build_gaze_trace_nodes_from_qdrant()
- Read trace_id, frame, bbox from Qdrant payload
- Compute gaze stats (yaw, pitch, roll, gaze direction, blink)
- No PostgreSQL face_detections dependency

Phase 2.5.2: lip_trace_nodes from Qdrant + face.json
- build_lip_trace_nodes_from_qdrant()
- Match trace_id using Qdrant embeddings + face.json bbox
- Compute lip stats (openness, variance, speaking frames)
- Fixed face.json bbox structure (x, y, width, height; not a bbox object)

Test results:
- 23 gaze_trace nodes from Qdrant
- 23 lip_trace nodes from Qdrant + face.json
- 51 lip_sync edges created
- Charade Q&A: 20 identities, 75 relationship chunks

Docs:
- TKG_PHASE2_NONFACE_MIGRATION_V1.0.md (migration plan)
- 2026-06-21_charade_qa_test.md (Q&A test report)
docs_v1.0/DESIGN/TKG_PHASE2_NONFACE_MIGRATION_V1.0.md
@@ -0,0 +1,186 @@
---
title: TKG Phase 2-4 Migration Plan (Non-Face Nodes)
version: 1.0
date: 2026-06-21
author: OpenCode
status: Draft
---

## Overview

Phase 2-3 completed the Qdrant migration of face_trace_nodes. The remaining node types need a similar migration.

## Current Status

| Node Type | Data Source | PostgreSQL Dependency | Migration Status |
|-----------|-------------|-----------------------|------------------|
| **face_trace_nodes** | Qdrant embeddings | ❌ None | ✅ Done in Phase 2.1 |
| **gaze_trace_nodes** | face.json | ✅ face_detections.trace_id | 🔄 Pending |
| **lip_trace_nodes** | face.json + lip.json | ✅ face_detections.trace_id | 🔄 Pending |
| **text_trace_nodes** | chunk table | ✅ chunk.sentence | ⏸️ Keep as is |
| **yolo_object_nodes** | .yolo.json | ❌ None | ✅ No migration needed |
| **speaker_nodes** | .asrx.json | ❌ None | ✅ No migration needed |
| **appearance_trace_nodes** | .appearance.json | ❌ None | ✅ No migration needed |
| **skin_tone_trace_nodes** | .skin.json | ❌ None | ✅ No migration needed |
| **accessory_nodes** | .accessory.json | ❌ None | ✅ No migration needed |

## Edge Types Migration Status

| Edge Type | Data Source | PostgreSQL Dependency | Migration Status |
|-----------|-------------|-----------------------|------------------|
| **co_occurrence_edges** | face_detections | ✅ face_detections.trace_id | 🔄 Pending |
| **face_face_edges** | face_detections | ✅ face_detections.trace_id | 🔄 Pending |
| **speaker_face_edges** | face_detections + speaker | ✅ face_detections.trace_id | 🔄 Pending |
| **mutual_gaze_edges** | gaze.json | ✅ face_detections.trace_id | 🔄 Pending |
| **lip_sync_edges** | lip.json | ✅ face_detections.trace_id | 🔄 Pending |

## Migration Plan

### Phase 2.5: Gaze & Lip Nodes

**Goal**: Replace face_detections queries with the Qdrant payload.

#### 2.5.1: gaze_trace_nodes

**Current code** (`src/core/processor/tkg.rs`):

```rust
let frame_rows: Vec<(i64, i64, f64, f64, f64, f64)> = sqlx::query_as(
    "SELECT trace_id, frame_number, x, y, width, height
     FROM face_detections WHERE file_uuid = $1"
)
```

**Migration approach**:

```rust
// Use the Qdrant payload (trace_id, frame, bbox_x/y/w/h)
let qdrant_embeddings = face_db.get_all_embeddings_for_file(file_uuid).await?;
// Group by trace_id → compute gaze
```
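
The "group by trace_id" step could look roughly like the sketch below. It is only an illustration: `FacePoint`, `TraceSummary`, and `summarize_traces` are hypothetical names, and the payload shape is assumed from the fields listed above (trace_id, frame, bbox). It does not reproduce `build_gaze_trace_nodes_from_qdrant()`, and the gaze statistics themselves are left out.

```rust
use std::collections::BTreeMap;

/// Hypothetical shape of one Qdrant point payload (field names are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct FacePoint {
    pub trace_id: i64,
    pub frame: i64,
    /// x, y, width, height
    pub bbox: (f64, f64, f64, f64),
}

/// Per-trace summary that a gaze node could later be built from.
#[derive(Debug, Clone, PartialEq)]
pub struct TraceSummary {
    pub first_frame: i64,
    pub last_frame: i64,
    pub detections: usize,
}

/// Group points by trace_id and record each trace's frame span and detection count.
pub fn summarize_traces(points: &[FacePoint]) -> BTreeMap<i64, TraceSummary> {
    let mut out: BTreeMap<i64, TraceSummary> = BTreeMap::new();
    for p in points {
        out.entry(p.trace_id)
            .and_modify(|s| {
                s.first_frame = s.first_frame.min(p.frame);
                s.last_frame = s.last_frame.max(p.frame);
                s.detections += 1;
            })
            .or_insert(TraceSummary {
                first_frame: p.frame,
                last_frame: p.frame,
                detections: 1,
            });
    }
    out
}
```

A `BTreeMap` keeps the traces ordered by id, which makes node creation deterministic across runs; a `HashMap` would work equally well if ordering does not matter.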

#### 2.5.2: lip_trace_nodes

**Current code**:

```rust
// Read lip.json, query face_detections for trace_id
let trace_id = sqlx::query_scalar(
    "SELECT trace_id FROM face_detections
     WHERE file_uuid = $1 AND frame_number = $2 AND x = $3 ..."
)
```

**Migration approach**:

```rust
// Use the Qdrant payload to associate trace_id directly
// face.json already has trace_id (Python store_traced_faces.py)
```
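
One way to associate a lip.json detection with a trace_id without the SQL lookup is to match it against the traced faces of the same frame by bounding-box overlap. The sketch below uses intersection-over-union (IoU) for that. IoU is an assumption introduced here for illustration (the current SQL matches on exact coordinates), and `BBox`, `TracedFace`, `iou`, and `match_trace_id` are hypothetical names.

```rust
/// Axis-aligned box as x, y, width, height.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct BBox {
    pub x: f64,
    pub y: f64,
    pub width: f64,
    pub height: f64,
}

/// Hypothetical traced face record (from the Qdrant payload or face.json).
pub struct TracedFace {
    pub trace_id: i64,
    pub frame: i64,
    pub bbox: BBox,
}

/// Intersection over union of two boxes; 0.0 when they do not overlap.
pub fn iou(a: BBox, b: BBox) -> f64 {
    let ix = (a.x + a.width).min(b.x + b.width) - a.x.max(b.x);
    let iy = (a.y + a.height).min(b.y + b.height) - a.y.max(b.y);
    if ix <= 0.0 || iy <= 0.0 {
        return 0.0;
    }
    let inter = ix * iy;
    let union = a.width * a.height + b.width * b.height - inter;
    if union > 0.0 { inter / union } else { 0.0 }
}

/// Return the trace_id of the best-overlapping face in the same frame,
/// or None if no face reaches `min_iou`.
pub fn match_trace_id(faces: &[TracedFace], frame: i64, bbox: BBox, min_iou: f64) -> Option<i64> {
    faces
        .iter()
        .filter(|f| f.frame == frame)
        .map(|f| (f.trace_id, iou(f.bbox, bbox)))
        .filter(|(_, score)| *score >= min_iou)
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(id, _)| id)
}
```

Matching on overlap rather than exact coordinates tolerates small rounding differences between the two JSON producers; the threshold (for example 0.5) would need tuning against real data.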

### Phase 2.6: Edge Types

#### 2.6.1: co_occurrence_edges

**Current code**:

```rust
"SELECT trace_id FROM face_detections
 WHERE file_uuid = $1 AND frame_number BETWEEN $2 AND $3"
```

**Migration approach**:

```rust
// Use Qdrant payload.group_by(trace_id)
// Precompute frame ranges
```
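
A rough sketch of the "precompute frame ranges" idea follows. It builds one inclusive frame range per trace and then answers the same question as the `BETWEEN` query by interval overlap. The function names are hypothetical, and treating a trace as a single contiguous range is a simplifying assumption: a trace with gaps would be reported as present inside its gaps.

```rust
use std::collections::BTreeMap;

/// Build an inclusive (first_frame, last_frame) range per trace
/// from (trace_id, frame) pairs read from the Qdrant payload.
pub fn frame_ranges(points: &[(i64, i64)]) -> BTreeMap<i64, (i64, i64)> {
    let mut ranges = BTreeMap::new();
    for &(trace_id, frame) in points {
        let r = ranges.entry(trace_id).or_insert((frame, frame));
        r.0 = r.0.min(frame);
        r.1 = r.1.max(frame);
    }
    ranges
}

/// Trace ids whose range intersects the inclusive window [lo, hi].
pub fn traces_in_window(ranges: &BTreeMap<i64, (i64, i64)>, lo: i64, hi: i64) -> Vec<i64> {
    ranges
        .iter()
        .filter(|(_, &(first, last))| first <= hi && lo <= last)
        .map(|(&trace_id, _)| trace_id)
        .collect()
}
```

With the ranges computed once per file, each window lookup is an in-memory scan rather than a database round trip.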

#### 2.6.2: face_face_edges

**Current code**:

```rust
"SELECT trace_id, frame_number FROM face_detections
 WHERE file_uuid = $1 AND trace_id IS NOT NULL"
```

**Migration approach**:

```rust
// Use the spatial proximity of Qdrant embeddings
// No PostgreSQL needed
```
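
The plan does not define "spatial proximity". One plausible reading, sketched below, is the distance between bounding-box centers of two faces in the same frame. `Detection`, `proximity_counts`, and the distance threshold are all assumptions made for illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical per-frame detection with its bbox center (x + w/2, y + h/2).
pub struct Detection {
    pub trace_id: i64,
    pub frame: i64,
    pub cx: f64,
    pub cy: f64,
}

/// For each unordered pair of traces, count the frames in which their
/// bbox centers lie within `max_dist` of each other.
pub fn proximity_counts(dets: &[Detection], max_dist: f64) -> BTreeMap<(i64, i64), usize> {
    // Bucket detections by frame so only same-frame faces are compared.
    let mut by_frame: BTreeMap<i64, Vec<&Detection>> = BTreeMap::new();
    for d in dets {
        by_frame.entry(d.frame).or_default().push(d);
    }

    let mut counts = BTreeMap::new();
    for faces in by_frame.values() {
        for (i, a) in faces.iter().enumerate() {
            for b in &faces[i + 1..] {
                if a.trace_id == b.trace_id {
                    continue;
                }
                let dist = (a.cx - b.cx).hypot(a.cy - b.cy);
                if dist <= max_dist {
                    // Normalize the key so (1, 2) and (2, 1) are the same pair.
                    let key = (a.trace_id.min(b.trace_id), a.trace_id.max(b.trace_id));
                    *counts.entry(key).or_insert(0) += 1;
                }
            }
        }
    }
    counts
}
```

The resulting counts could serve as edge weights; whether the threshold should be in pixels or normalized by face size is left open here.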

#### 2.6.3: speaker_face_edges

**Current code**:

```rust
// JOIN face_detections.trace_id + speaker_nodes
```

**Migration approach**:

```rust
// Qdrant trace_id + speaker_nodes (already from .asrx.json)
```
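
Replacing the JOIN could come down to a temporal overlap between speaker segments and trace frame ranges. The sketch below assumes that speaker segments carry start and end times in seconds and that the video frame rate is known; neither is stated in this plan, and all type and function names are hypothetical.

```rust
/// Hypothetical speaker segment as it might be read from .asrx.json.
pub struct SpeakerSegment {
    pub speaker: String,
    pub start_sec: f64,
    pub end_sec: f64,
}

/// Hypothetical frame range of one face trace (from the Qdrant payload).
pub struct TraceRange {
    pub trace_id: i64,
    pub first_frame: i64,
    pub last_frame: i64,
}

/// For every (speaker segment, trace) pair that overlaps in time,
/// return (speaker, trace_id, overlap in seconds).
pub fn speaker_face_overlaps(
    segments: &[SpeakerSegment],
    traces: &[TraceRange],
    fps: f64,
) -> Vec<(String, i64, f64)> {
    let mut out = Vec::new();
    for seg in segments {
        for t in traces {
            // Convert the trace's frame range to seconds.
            let t_start = t.first_frame as f64 / fps;
            let t_end = t.last_frame as f64 / fps;
            let overlap = seg.end_sec.min(t_end) - seg.start_sec.max(t_start);
            if overlap > 0.0 {
                out.push((seg.speaker.clone(), t.trace_id, overlap));
            }
        }
    }
    out
}
```

The overlap duration is a natural candidate for the edge weight, so that a face visible for the whole utterance ranks above one that appears briefly.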

### Phase 2.7: Identity Resolution for Edges

**Current code** (Rule2):

```rust
// Done in Phase 2.3: query tkg_nodes.properties.identity_id
```

**Extension**:

- gaze/lip edges also need identity resolution
- Use `tkg_nodes.properties.identity_id` consistently
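
As a minimal sketch of what identity resolution for an edge means, the function below maps both endpoints of a trace-level edge to identities through a lookup table. The table is assumed to be loaded from `tkg_nodes.properties.identity_id` beforehand; the `String` identity type and the name `resolve_edge` are assumptions.

```rust
use std::collections::HashMap;

/// Map both endpoints of a (trace_a, trace_b) edge to identity ids.
/// Returns None when either trace has no resolved identity yet.
pub fn resolve_edge(
    edge: (i64, i64),
    identity_of: &HashMap<i64, String>,
) -> Option<(String, String)> {
    let a = identity_of.get(&edge.0)?;
    let b = identity_of.get(&edge.1)?;
    Some((a.clone(), b.clone()))
}
```

Returning `None` for unresolved traces lets the caller decide whether to skip the edge or keep it at trace level.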

## Node Types Not Migrated

### text_trace_nodes

**Reasons**:

- The chunk table is required persistent storage (sentence chunks)
- It does not depend on face_detections
- Keep as is; no migration needed

### JSON-based Nodes

**Already free of PostgreSQL dependencies**:

- yolo_object_nodes: `.yolo.json`
- speaker_nodes: `.asrx.json`
- appearance_trace_nodes: `.appearance.json`
- skin_tone_trace_nodes: `.skin.json`
- accessory_nodes: `.accessory.json`

## Estimated Performance Impact

| Item | Current Latency | Estimated After Migration | Speedup |
|------|-----------------|---------------------------|---------|
| gaze_trace_nodes | ~50ms (PG query) | ~15ms (Qdrant) | **~3.3x** |
| lip_trace_nodes | ~80ms (PG + lip.json) | ~20ms (Qdrant + lip.json) | **4x** |
| co_occurrence_edges | ~120ms (PG) | ~30ms (Qdrant) | **4x** |
| face_face_edges | ~90ms (PG) | ~25ms (Qdrant) | **3.6x** |

## Implementation Priority

| Priority | Task | Impact | Complexity |
|----------|------|--------|------------|
| P1 | gaze_trace_nodes | High (gaze analysis) | Low |
| P1 | co_occurrence_edges | High (relationship graph) | Medium |
| P2 | lip_trace_nodes | Medium (lip analysis) | Medium |
| P2 | face_face_edges | Medium (face relationships) | Medium |
| P3 | speaker_face_edges | Low (speaker relationships) | Medium |

## Key Decisions

1. **text_trace_nodes**: Keep querying the chunk table (required persistent storage)
2. **JSON nodes**: No migration needed (already free of PG dependencies)
3. **Qdrant as the only face data source**: trace_id, frame, and bbox all come from the payload
4. **Incremental migration**: Split by priority into Phase 2.5, 2.6, and 2.7

## Acceptance Criteria

- ✅ gaze_trace_nodes: no face_detections queries
- ✅ lip_trace_nodes: use the Qdrant trace_id
- ✅ All edges: use the Qdrant payload
- ✅ Performance test: at least 2x faster than the original architecture
- ✅ Rule2/Rule3: work correctly (identity resolution)

## References

- `docs_v1.0/M4_workspace/2026-06-21_tkg_phase2_progress.md` (Phase 2-3)
- `src/core/processor/tkg.rs` (current implementation)
- `src/core/db/face_embedding_db.rs` (Qdrant API)