fix: face_detections INSERT in pipeline, add dependency graph doc
@@ -4,6 +4,52 @@
## Pipeline

### Dependency Graph

```mermaid
flowchart TB
    subgraph Processors["10 Processors"]
        Cut[Cut] --> ASR[ASR]
        ASR --> ASRX[ASRX]
        ASRX --> Story[Story]
        Cut --> Story
        YOLO[YOLO] --> VisualChunk[VisualChunk]
        VisualChunk --> Story
        Face[Face] --> Story
        Story --> FiveW1H[5W1H]
        OCR[OCR]
        Pose[Pose]
    end

    subgraph Ingestion["Ingestion (Post-Processing)"]
        ASR --> Rule1[Rule 1 Sentence]
        ASRX --> Rule1
        Rule1 --> Vectorize[Auto-Vectorize]
        Rule1 --> Phase1[Phase 1 Pack]

        Cut --> Rule3[Rule 3 Scene]
        ASR --> Rule3

        Face --> Trace[Face Trace]
        Trace --> Qdrant[Qdrant Sync]
        Trace --> TraceChunks[Trace Chunks]
        Trace --> TKG[TKG Builder]

        Face --> TMDbMatch[TMDb Match]
        Face --> SceneMeta[Scene Metadata]
        YOLO --> SceneMeta
        Face --> IdentityAgent[Identity Agent]
        ASRX --> IdentityAgent

        Cut --> Agent5W1H[5W1H Agent]
        ASR --> Agent5W1H
        Agent5W1H --> Phase2[Phase 2 Pack]
    end

    style Processors fill:#1a1a2e,stroke:#e94560
    style Ingestion fill:#16213e,stroke:#0f3460
```
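
The processor half of the graph can be read as a scheduling constraint. The following is a minimal sketch, assuming each arrow `A --> B` means that `B` may start only once `A` has finished. It uses Python's standard `graphlib.TopologicalSorter` to group the ten processors into waves whose members do not depend on each other. `PROCESSOR_DEPS` and `execution_waves` are illustrative names and are not taken from the pipeline's code.

```python
from graphlib import TopologicalSorter

# Processor -> processors it waits for, transcribed from the Processors
# subgraph of the diagram (assumption: "A --> B" means B depends on A).
PROCESSOR_DEPS = {
    "Cut": set(),
    "ASR": {"Cut"},
    "ASRX": {"ASR"},
    "YOLO": set(),
    "VisualChunk": {"YOLO"},
    "Face": set(),
    "OCR": set(),
    "Pose": set(),
    "Story": {"ASRX", "Cut", "VisualChunk", "Face"},
    "5W1H": {"Story"},
}


def execution_waves(deps):
    """Group nodes into waves; each node's dependencies sit in earlier waves."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(PROCESSOR_DEPS), start=1):
        print(f"wave {number}: {', '.join(wave)}")
```

Under these assumptions the five independent processors (`Cut`, `Face`, `OCR`, `Pose`, `YOLO`) form the first wave, and `Story` cannot start before the fourth wave because it sits behind the `Cut` → `ASR` → `ASRX` chain.
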
### 10 Processor Stages

| # | Processor | Depends On | Description |
@@ -16,7 +62,7 @@
| 6 | `Face` | — | Face detection + recognition (InsightFace + CoreML) |
| 7 | `Pose` | — | Pose estimation |
| 8 | `VisualChunk` | YOLO | Visual object chunking |
-| 9 | `Story` | ASRX + Cut | Narrative scene summarization (LLM, with embedding) |
+| 9 | `Story` | ASRX + Cut + YOLO + Face | Narrative scene summarization (LLM, with embedding) |
| 10 | `5W1H` | Story | Who/What/When/Where/Why extraction (LLM, with embedding) |

### Post-Processing (Ingestion)
@@ -27,16 +73,17 @@ After all 10 processors complete, the pipeline runs the following storage & enri
|---|------|----------|----------|
| 1 | **Rule 1 Sentence Chunking** | ASR + ASRX | `chunk` table, `chunk_type = 'sentence'` |
| 2 | **Auto-Vectorize** | Rule 1 | `chunk.embedding` IS NOT NULL (pgvector) |
-| 3 | **Rule 3 Scene Chunking** | Cut + ASR | `chunk` table, `chunk_type = 'cut'` |
-| 4 | **Face Trace + DB Store** | Face | `face_detections.trace_id` IS NOT NULL |
-| 5 | **Qdrant Face Sync** | Face Trace | Qdrant collection (face embeddings) |
-| 6 | **Trace Chunks** | Face Trace | `chunk` table, `chunk_type = 'trace'` |
-| 7 | **TKG Builder** | Face Trace | `tkg_nodes` + `tkg_edges` tables |
-| 8 | **TMDb Face Matching** | Face + TMDb enabled | `face_detections.identity_id` IS NOT NULL |
-| 9 | **Heuristic Scene Metadata** | Face + YOLO | `{file_uuid}.scene_meta.json` on disk |
-| 10 | **Identity Agent** | Face + ASRX | `identities` with `source = 'identity_agent'` |
-| 11 | **5W1H Agent** | Cut + ASR | `chunk.summary_text` IS NOT NULL (chunk_type = 'cut') |
-| 12 | **Release Pack** | 5W1H Agent | `release_pack.py --phase 2` output |
+| 3 | **Phase 1 Pack** | Rule 1 | `release_pack.py --phase 1` |
+| 4 | **Rule 3 Scene Chunking** | Cut + ASR | `chunk` table, `chunk_type = 'cut'` |
+| 5 | **Face Trace** | Face | `face_detections.trace_id` IS NOT NULL |
+| 6 | **Qdrant Face Sync** | Face Trace | Qdrant face_embedding collection |
+| 7 | **Trace Chunks** | Face Trace | `chunk` table, `chunk_type = 'trace'` |
+| 8 | **TKG Builder** | Face Trace | `tkg_nodes` + `tkg_edges` tables |
+| 9 | **TMDb Face Matching** | Face + TMDb enabled | `face_detections.identity_id` IS NOT NULL |
+| 10 | **Heuristic Scene Metadata** | Face + YOLO | `{file_uuid}.scene_meta.json` on disk |
+| 11 | **Identity Agent** | Face + ASRX | `identities` with `source = 'identity_agent'` |
+| 12 | **5W1H Agent** | Cut + ASR | `chunk.summary_text` IS NOT NULL (chunk_type = 'cut') |
+| 13 | **Release Pack** | 5W1H Agent | `release_pack.py --phase 2` output |
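
The verification column for the SQL-backed steps could be turned into a small check script. The sketch below is hypothetical: it uses an in-memory `sqlite3` database as a stand-in for the real store (the pgvector reference suggests PostgreSQL), and the column types in `demo_connection` are assumptions. Only the table names, column names, and predicates come from the table above. Steps verified through Qdrant, files on disk, or `release_pack.py` are left out.

```python
import sqlite3

# (step, query counting the rows that prove the step ran), per the table above.
CHECKS = [
    ("Rule 1 Sentence Chunking",
     "SELECT COUNT(*) FROM chunk WHERE chunk_type = 'sentence'"),
    ("Auto-Vectorize",
     "SELECT COUNT(*) FROM chunk WHERE embedding IS NOT NULL"),
    ("Rule 3 Scene Chunking",
     "SELECT COUNT(*) FROM chunk WHERE chunk_type = 'cut'"),
    ("Face Trace",
     "SELECT COUNT(*) FROM face_detections WHERE trace_id IS NOT NULL"),
    ("Trace Chunks",
     "SELECT COUNT(*) FROM chunk WHERE chunk_type = 'trace'"),
    ("TMDb Face Matching",
     "SELECT COUNT(*) FROM face_detections WHERE identity_id IS NOT NULL"),
    ("Identity Agent",
     "SELECT COUNT(*) FROM identities WHERE source = 'identity_agent'"),
    ("5W1H Agent",
     "SELECT COUNT(*) FROM chunk "
     "WHERE chunk_type = 'cut' AND summary_text IS NOT NULL"),
]


def run_checks(conn):
    """Return {step: True/False}; True when at least one matching row exists."""
    results = {}
    for step, sql in CHECKS:
        (count,) = conn.execute(sql).fetchone()
        results[step] = count > 0
    return results


def demo_connection():
    """Build a tiny in-memory database with an assumed, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE chunk (chunk_type TEXT, embedding BLOB, summary_text TEXT);
        CREATE TABLE face_detections (trace_id INTEGER, identity_id INTEGER);
        CREATE TABLE identities (source TEXT);
        INSERT INTO chunk VALUES ('sentence', x'00', NULL);
        INSERT INTO chunk VALUES ('cut', NULL, NULL);
        INSERT INTO face_detections VALUES (1, NULL);
        """
    )
    return conn


if __name__ == "__main__":
    for step, ok in run_checks(demo_connection()).items():
        print(f"{'PASS' if ok else 'MISSING':8} {step}")
```

With the demo rows, the first four checks pass and the remaining four report as missing, which is the kind of signal that would reveal a step such as the `face_detections` INSERT silently not running.
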
### Ingestion Status