feat: Rule2 TKG relationship chunks + Phase0-1 Qdrant integration

Phase 0: TKG builder populate face_detections from face.json
- Fix face.json parser for pose_angle format
- Call store_traced_faces.py to set trace_id
- Skip if trace_id already populated

Phase 1: Qdrant face embeddings integration
- Add FaceEmbeddingDb module (src/core/db/face_embedding_db.rs)
- Create dev_face_embeddings collection (dim=512)
- Store 1122 face embeddings with pose metadata
- API: init_collection, batch_upsert, search_similar

Rule2: TKG edges → relationship chunks
- Design: RULE2_TKG_RELATIONSHIP_V1.0.md
- Implementation: rule2_ingest.rs
- ChunkType::Relationship added
- Edge types: SPEAKS_AS, MUTUAL_GAZE, CO_OCCURS_WITH, HAS_APPEARANCE, WEARS
- Auto-trigger on TKG rebuild

API:
- POST /api/v1/file/:file_uuid/rule2 (vectorization)
- POST /api/v1/file/:file_uuid/tkg/rebuild (auto Rule2)

Test: 75 relationship chunks created + vectorized
Accusys
2026-06-21 00:22:41 +08:00
parent 17e4e15860
commit 3ad6f8740a
10 changed files with 3811 additions and 30 deletions

@@ -0,0 +1,378 @@
<!-- module: tkg -->
<!-- description: Temporal Knowledge Graph — rebuild, nodes, edges, processor counts -->
<!-- depends: 05_process, 07_identity -->
## Temporal Knowledge Graph (TKG)
TKG is a time-aligned knowledge graph built from multi-processor outputs (face, yolo, ocr, pose, asrx, gaze, lip, appearance). It produces 9 node types and 14 edge types, stored in `dev.tkg_nodes` and `dev.tkg_edges`. This page documents 7 of the edge types.
### Node Types
| Node Type | Description | Key Properties |
|-----------|-------------|----------------|
| `face_trace` | A tracked face identity over time | `trace_id`, `face_count`, `avg_confidence` |
| `gaze_trace` | Gaze direction over time | `direction` (frontal/left/right/up/down + diagonals) |
| `lip_trace` | Lip movement synced with speech | `speaker_id`, `lip_area_range` |
| `text_trace` | Spoken text aligned to time | `speaker_id`, `text`, `start_time`, `end_time` |
| `appearance_trace` | Human appearance (clothing) over time | `clothing_color`, `upper_cloth`, `lower_cloth` |
| `skin_tone_trace` | Fitzpatrick skin tone classification | `fitzpatrick_type` (I–VI) |
| `accessory` | Detected accessories | `type` (glasses/hat/etc.), `confidence` |
| `object` | YOLO-detected object | `class`, `confidence`, `frame_count` |
| `speaker` | ASRX speaker segment | `speaker_id`, `segment_count`, `total_duration` |
### Edge Types
| Edge Type | Source → Target | Description |
|-----------|-----------------|-------------|
| `co_occurs` | object ↔ object | Two objects appear together in same frame |
| `speaker_face` | speaker ↔ face_trace | Speaker matched to face trace via lip sync |
| `face_face` | face_trace ↔ face_trace | Two face traces interact (mutual gaze) |
| `mutual_gaze` | gaze_trace ↔ gaze_trace | Two people looking at each other |
| `lip_sync` | lip_trace ↔ text_trace | Lip movement aligned with spoken text |
| `has_appearance` | face_trace ↔ appearance_trace | Face has specific appearance |
| `wears` | face_trace ↔ accessory | Face wears an accessory |
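
The edge types in the table above map naturally onto a closed enum. The following is a minimal sketch, assuming the string names and endpoint node types exactly as listed in the table; `EdgeType` and its methods are illustrative names, not the project's actual types.

```rust
/// Edge types as listed in the table above (illustrative enum, not the project's own).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EdgeType {
    CoOccurs,
    SpeakerFace,
    FaceFace,
    MutualGaze,
    LipSync,
    HasAppearance,
    Wears,
}

impl EdgeType {
    pub const ALL: [EdgeType; 7] = [
        EdgeType::CoOccurs,
        EdgeType::SpeakerFace,
        EdgeType::FaceFace,
        EdgeType::MutualGaze,
        EdgeType::LipSync,
        EdgeType::HasAppearance,
        EdgeType::Wears,
    ];

    /// The `edge_type` string used in the table and in the edges query filter.
    pub fn as_str(self) -> &'static str {
        match self {
            EdgeType::CoOccurs => "co_occurs",
            EdgeType::SpeakerFace => "speaker_face",
            EdgeType::FaceFace => "face_face",
            EdgeType::MutualGaze => "mutual_gaze",
            EdgeType::LipSync => "lip_sync",
            EdgeType::HasAppearance => "has_appearance",
            EdgeType::Wears => "wears",
        }
    }

    /// Returns `None` for names that are not in the table.
    pub fn parse(s: &str) -> Option<EdgeType> {
        Self::ALL.iter().copied().find(|e| e.as_str() == s)
    }

    /// (source node type, target node type) as given in the table.
    pub fn endpoints(self) -> (&'static str, &'static str) {
        match self {
            EdgeType::CoOccurs => ("object", "object"),
            EdgeType::SpeakerFace => ("speaker", "face_trace"),
            EdgeType::FaceFace => ("face_trace", "face_trace"),
            EdgeType::MutualGaze => ("gaze_trace", "gaze_trace"),
            EdgeType::LipSync => ("lip_trace", "text_trace"),
            EdgeType::HasAppearance => ("face_trace", "appearance_trace"),
            EdgeType::Wears => ("face_trace", "accessory"),
        }
    }
}
```

For example, `EdgeType::parse("wears")` yields `Some(EdgeType::Wears)`, and an unknown name yields `None`.
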
---
### `POST /api/v1/file/:file_uuid/tkg/rebuild`
**Auth**: Required
**Scope**: file-level
Rebuild the Temporal Knowledge Graph for a file. Reads processor JSON outputs (face, yolo, ocr, pose, asrx, gaze, lip, appearance) and generates TKG nodes and edges. Clears existing nodes/edges for the file first, then rebuilds from scratch.
#### Example
```bash
curl -s -X POST "$API/api/v1/file/$FILE_UUID/tkg/rebuild" \
-H "X-API-Key: $KEY"
```
#### Response (200)
```json
{
"success": true,
"file_uuid": "d3f9ae8e471a1fc4d47022c66091b920",
"result": {
"face_trace_nodes": 16,
"gaze_trace_nodes": 16,
"lip_trace_nodes": 12,
"text_trace_nodes": 24,
"appearance_trace_nodes": 8,
"skin_tone_trace_nodes": 5,
"accessory_nodes": 3,
"object_nodes": 26,
"speaker_nodes": 4,
"co_occurrence_edges": 94,
"speaker_face_edges": 12,
"face_face_edges": 8,
"mutual_gaze_edges": 2,
"lip_sync_edges": 10,
"has_appearance_edges": 16,
"wears_edges": 3
},
"error": null
}
```
| Field | Type | Description |
|-------|------|-------------|
| `success` | boolean | True if rebuild completed |
| `file_uuid` | string | 32-char hex UUID |
| `result` | object | Node and edge counts by type |
| `error` | string/null | Error message if failed |
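
A client may want overall totals from the `result` object. A minimal sketch, assuming the counts have already been decoded into key/count pairs (JSON decoding is out of scope here) and that keys follow the `*_nodes` / `*_edges` naming seen in the sample response; `totals` is a hypothetical helper.

```rust
/// Hypothetical helper: sums node and edge counts from decoded `result` entries.
/// Assumption: node counters end in `_nodes`, edge counters end in `_edges`.
pub fn totals(result: &[(&str, u64)]) -> (u64, u64) {
    let mut nodes: u64 = 0;
    let mut edges: u64 = 0;
    for (key, count) in result {
        if key.ends_with("_nodes") {
            nodes += *count;
        } else if key.ends_with("_edges") {
            edges += *count;
        }
    }
    (nodes, edges)
}
```

With the sample response above, this gives 114 nodes and 145 edges.
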
---
### `POST /api/v1/file/:file_uuid/tkg/nodes`
**Auth**: Required
**Scope**: file-level
Query TKG nodes with pagination and optional type filter.
#### Request Parameters
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `node_type` | string | No | all | Filter by node type: `face_trace`, `gaze_trace`, `lip_trace`, `text_trace`, `appearance_trace`, `skin_tone_trace`, `accessory`, `object`, `speaker` |
| `page` | integer | No | 1 | Page number |
| `page_size` | integer | No | 100 | Items per page (max 500) |
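
The paging parameters can be read as a limit/offset calculation. A minimal sketch, assuming pages are 1-based and that out-of-range values are clamped rather than rejected (the table states only the defaults and the cap of 500); `resolve_paging` is a hypothetical helper, not the server's code.

```rust
/// Defaults and cap taken from the parameter table above.
const DEFAULT_PAGE_SIZE: u32 = 100;
const MAX_PAGE_SIZE: u32 = 500;

/// Hypothetical helper: turns optional request fields into (limit, offset).
/// Assumption: pages are 1-based; invalid values are clamped, not rejected.
pub fn resolve_paging(page: Option<u32>, page_size: Option<u32>) -> (u32, u64) {
    let page = page.unwrap_or(1).max(1);
    let limit = page_size.unwrap_or(DEFAULT_PAGE_SIZE).clamp(1, MAX_PAGE_SIZE);
    // Widen before multiplying so large page numbers cannot overflow.
    let offset = u64::from(page - 1) * u64::from(limit);
    (limit, offset)
}
```

An empty request body resolves to a limit of 100 and an offset of 0 under these assumptions.
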
#### Example
```bash
# Get all face_trace nodes
curl -s -X POST "$API/api/v1/file/$FILE_UUID/tkg/nodes" \
-H "X-API-Key: $KEY" \
-H "Content-Type: application/json" \
-d '{"node_type": "face_trace", "page": 1, "page_size": 50}'
# Get all nodes
curl -s -X POST "$API/api/v1/file/$FILE_UUID/tkg/nodes" \
-H "X-API-Key: $KEY" \
-H "Content-Type: application/json" \
-d '{}'
```
#### Response (200)
```json
{
"success": true,
"file_uuid": "d3f9ae8e471a1fc4d47022c66091b920",
"total": 16,
"page": 1,
"page_size": 50,
"nodes": [
{
"id": 1,
"node_type": "face_trace",
"external_id": "trace_0",
"label": "Face Trace 0",
"properties": {
"trace_id": 0,
"face_count": 142,
"avg_confidence": 0.87
}
}
]
}
```
| Field | Type | Description |
|-------|------|-------------|
| `success` | boolean | Always true on 200 |
| `file_uuid` | string | 32-char hex UUID |
| `total` | integer | Total matching node count |
| `page` | integer | Current page |
| `page_size` | integer | Items per page |
| `nodes` | array | Array of node objects |
| `nodes[].id` | integer | Database primary key |
| `nodes[].node_type` | string | Node type (see table above) |
| `nodes[].external_id` | string | External identifier (e.g., `trace_0`, `gaze_1`) |
| `nodes[].label` | string | Human-readable label |
| `nodes[].properties` | object | Type-specific properties as JSON |
---
### `POST /api/v1/file/:file_uuid/tkg/edges`
**Auth**: Required
**Scope**: file-level
Query TKG edges with pagination and optional filters.
#### Request Parameters
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `edge_type` | string | No | all | Filter by edge type: `co_occurs`, `speaker_face`, `face_face`, `mutual_gaze`, `lip_sync`, `has_appearance`, `wears` |
| `source_type` | string | No | — | Filter by source node type |
| `target_type` | string | No | — | Filter by target node type |
| `page` | integer | No | 1 | Page number |
| `page_size` | integer | No | 100 | Items per page (max 500) |
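
The three optional filters can be pictured as a predicate over edges. A minimal sketch, assuming that omitted filters match everything and that supplied filters combine with AND (the table does not state how filters combine); `EdgeRow` and `EdgeFilter` are hypothetical types.

```rust
/// Hypothetical flattened view of one edge with its endpoint node types.
pub struct EdgeRow<'a> {
    pub edge_type: &'a str,
    pub source_type: &'a str,
    pub target_type: &'a str,
}

/// Hypothetical request filter; `None` means "no constraint".
#[derive(Default)]
pub struct EdgeFilter<'a> {
    pub edge_type: Option<&'a str>,
    pub source_type: Option<&'a str>,
    pub target_type: Option<&'a str>,
}

impl<'a> EdgeFilter<'a> {
    /// Assumption: all supplied filters must match (logical AND).
    pub fn matches(&self, row: &EdgeRow<'_>) -> bool {
        self.edge_type.map_or(true, |t| t == row.edge_type)
            && self.source_type.map_or(true, |t| t == row.source_type)
            && self.target_type.map_or(true, |t| t == row.target_type)
    }
}
```
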
#### Example
```bash
# Get all co_occurs edges
curl -s -X POST "$API/api/v1/file/$FILE_UUID/tkg/edges" \
-H "X-API-Key: $KEY" \
-H "Content-Type: application/json" \
-d '{"edge_type": "co_occurs"}'
# Get edges between face_trace and speaker nodes
curl -s -X POST "$API/api/v1/file/$FILE_UUID/tkg/edges" \
-H "X-API-Key: $KEY" \
-H "Content-Type: application/json" \
-d '{"source_type": "speaker", "target_type": "face_trace"}'
```
#### Response (200)
```json
{
"success": true,
"file_uuid": "d3f9ae8e471a1fc4d47022c66091b920",
"total": 94,
"page": 1,
"page_size": 100,
"edges": [
{
"id": 1,
"edge_type": "co_occurs",
"source_node_id": 10,
"target_node_id": 15,
"properties": {
"frame_count": 45,
"confidence": 0.92
}
}
]
}
```
| Field | Type | Description |
|-------|------|-------------|
| `success` | boolean | Always true on 200 |
| `file_uuid` | string | 32-char hex UUID |
| `total` | integer | Total matching edge count |
| `page` | integer | Current page |
| `page_size` | integer | Items per page |
| `edges` | array | Array of edge objects |
| `edges[].id` | integer | Database primary key |
| `edges[].edge_type` | string | Edge type |
| `edges[].source_node_id` | integer | Source node ID (FK to tkg_nodes) |
| `edges[].target_node_id` | integer | Target node ID (FK to tkg_nodes) |
| `edges[].properties` | object | Edge-specific properties as JSON |
---
### `GET /api/v1/file/:file_uuid/tkg/node/:node_id`
**Auth**: Required
**Scope**: file-level
Get detail for a specific TKG node including its connected edges.
#### Example
```bash
curl -s "$API/api/v1/file/$FILE_UUID/tkg/node/1" \
-H "X-API-Key: $KEY"
```
#### Response (200)
```json
{
"success": true,
"node": {
"id": 1,
"node_type": "face_trace",
"external_id": "trace_0",
"label": "Face Trace 0",
"properties": {
"trace_id": 0,
"face_count": 142,
"avg_confidence": 0.87
}
},
"connected_edges": [
{
"id": 5,
"edge_type": "co_occurs",
"source_node_id": 1,
"target_node_id": 10,
"properties": {"frame_count": 45}
}
],
"edge_count": 3
}
```
| Field | Type | Description |
|-------|------|-------------|
| `success` | boolean | Always true on 200 |
| `node` | object | Node detail (same format as nodes query) |
| `connected_edges` | array | Edges connected to this node |
| `edge_count` | integer | Total connected edge count |
#### Error Codes
| HTTP | When |
|------|------|
| `404` | Node not found |
---
### `GET /api/v1/file/:file_uuid/processor-counts`
**Auth**: Required
**Scope**: file-level
Get counts of processor JSON output files for a file. Scans the output directory for `{file_uuid}.{processor}.json` files and extracts frame counts, segment counts, and chunk counts from each file.
Supports short UUID prefix matching (e.g., `d3f9ae8e` → resolves to full `d3f9ae8e471a1fc4d47022c66091b920`).
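
Prefix matching can be sketched as a lookup over known UUIDs. This is an illustration only: it assumes a prefix resolves when exactly one known UUID starts with it and that ambiguous or non-hex prefixes are rejected, which the text above does not specify. `resolve_uuid_prefix` is a hypothetical name.

```rust
/// Hypothetical resolver for short UUID prefixes.
/// Assumption: ambiguous prefixes (more than one match) do not resolve.
pub fn resolve_uuid_prefix<'a>(prefix: &str, known: &[&'a str]) -> Option<&'a str> {
    // Reject empty or non-hex input before scanning.
    if prefix.is_empty() || !prefix.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    let mut matches = known.iter().copied().filter(|uuid| uuid.starts_with(prefix));
    match (matches.next(), matches.next()) {
        (Some(uuid), None) => Some(uuid), // exactly one match
        _ => None,                        // zero or several matches
    }
}
```
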
#### Example
```bash
curl -s "$API/api/v1/file/$FILE_UUID/processor-counts" \
-H "X-API-Key: $KEY"
```
#### Response (200)
```json
{
"file_uuid": "d3f9ae8e471a1fc4d47022c66091b920",
"output_dir": "/Users/accusys/momentry/output_dev",
"processors": [
{
"processor": "cut",
"has_json": true,
"frame_count": 5391,
"segment_count": null,
"chunk_count": null,
"last_modified": "2026-06-16T18:48:01.987241061+00:00"
},
{
"processor": "face",
"has_json": true,
"frame_count": 1112,
"segment_count": null,
"chunk_count": null,
"last_modified": "2026-06-18T17:21:37.408383765+00:00"
},
{
"processor": "asrx",
"has_json": true,
"frame_count": null,
"segment_count": 6,
"chunk_count": null,
"last_modified": "2026-06-18T17:21:40.872063642+00:00"
},
{
"processor": "story",
"has_json": true,
"frame_count": null,
"segment_count": null,
"chunk_count": 12,
"last_modified": "2026-06-18T17:22:00.000000000+00:00"
},
{
"processor": "mediapipe",
"has_json": false,
"frame_count": null,
"segment_count": null,
"chunk_count": null,
"last_modified": null
}
]
}
```
| Field | Type | Description |
|-------|------|-------------|
| `file_uuid` | string | Full 32-char hex UUID (resolved from prefix) |
| `output_dir` | string | Output directory scanned |
| `processors` | array | Per-processor output info |
| `processors[].processor` | string | Processor name |
| `processors[].has_json` | boolean | Whether JSON file exists |
| `processors[].frame_count` | integer/null | Total frames processed (frame-based processors) |
| `processors[].segment_count` | integer/null | Segment count (ASRX segments, etc.) |
| `processors[].chunk_count` | integer/null | Chunk count (Story chunks, etc.) |
| `processors[].last_modified` | string/null | ISO 8601 timestamp of last modification |
#### Error Codes
| HTTP | When |
|------|------|
| `404` | File UUID not found in database |
---
*Updated: 2026-06-20 12:00:00*

@@ -0,0 +1,235 @@
---
title: Rule 2 TKG Relationship Chunks V1.0
version: 1.0
date: 2026-06-20
author: OpenCode
status: approved
---
# Rule 2 TKG Relationship Chunks V1.0
| Scope | Status | Applicable to | Binary |
|-------|--------|---------------|--------|
| TKG relationship vectorization | Approved | `momentry_playground`, `momentry` | Both |
## Overview
Rule 2 creates **relationship chunks** by converting TKG edges into searchable, vectorized units. Each TKG edge becomes one chunk with an LLM-generated natural-language description, which enables semantic search over relationship queries.
**Key Change:** The original Rule 2 (YOLO frame objects) is deprecated because COCO classes are too generic. The new Rule 2 focuses on TKG relationships.
## Data Flow
```
┌─────────────────────────────────────────────────────────────┐
│  UPSTREAM: TKG Builder                                      │
│                                                             │
│  tkg_nodes: face_trace, speaker, object, etc.               │
│  tkg_edges: speaker_face, mutual_gaze, co_occurs, etc.      │
│                                                             │
└─────────────────────────────────────────────────────────────┘
                              ▼ after TKG complete
┌─────────────────────────────────────────────────────────────┐
│  RULE 2 PROCESSING                                          │
│                                                             │
│  Triggered by:                                              │
│    1. Worker auto: job_worker.rs after TKG completes        │
│    2. HTTP API: POST /api/v1/file/:file_uuid/rule2          │
│                                                             │
│  ingest_rule2(file_uuid):                                   │
│    ├─ Query tkg_edges by type (priority order)              │
│    └─ For each edge:                                        │
│       ├─ Resolve source_node / target_node                  │
│       ├─ Resolve identity names (if face_trace)             │
│       ├─ Build context JSON                                 │
│       ├─ call_llm(context) → text_content                   │
│       └─ INSERT INTO chunk (chunk_type='relationship')      │
│                                                             │
└─────────────────────────────────────────────────────────────┘
                              ▼
┌─────────────────────────────────────────────────────────────┐
│  DOWNSTREAM: vectorize_chunks()                             │
│                                                             │
│  SELECT ... WHERE chunk_type='relationship'                 │
│    AND embedding IS NULL                                    │
│                                                             │
│  1. embedder.embed_document(text_content) → vector          │
│  2. db.store_vector() → PG chunk.embedding                  │
│  3. qdrant.upsert_vector() → momentry_rule2 collection      │
│                                                             │
└─────────────────────────────────────────────────────────────┘
```
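
The downstream selection step in the diagram (`chunk_type='relationship' AND embedding IS NULL`) can be expressed as an in-memory filter. A minimal sketch, assuming a simplified `Chunk` struct with only the fields named in the diagram; the struct and function names are hypothetical, and the real step runs as a SQL query.

```rust
/// Hypothetical, simplified view of a row in the `chunk` table.
pub struct Chunk {
    pub id: i64,
    pub chunk_type: String,
    pub text_content: String,
    pub embedding: Option<Vec<f32>>,
}

/// Mirrors `WHERE chunk_type='relationship' AND embedding IS NULL`.
pub fn pending_relationship_chunks(chunks: &[Chunk]) -> Vec<&Chunk> {
    chunks
        .iter()
        .filter(|c| c.chunk_type == "relationship" && c.embedding.is_none())
        .collect()
}
```

Because already-embedded chunks are skipped, re-running the step only picks up chunks that still lack a vector.
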
## Edge Type Priority
| Priority | Edge Type | Description | Example Output |
|----------|-----------|-------------|----------------|
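| P0 | `speaker_face` | Speaker ↔ Face trace | "SPEAKER_01 以 Cary Grant 的身份說話,從 frame 100 到 350" (English: "SPEAKER_01 speaks as Cary Grant, from frame 100 to 350") |
| P0 | `mutual_gaze` | Two face traces looking at each other | "Cary Grant 和 Grace Kelly 互相看對方 24 幀,起始於 frame 450" (English: "Cary Grant and Grace Kelly look at each other for 24 frames, starting at frame 450") |
| P1 | `face_face` | Two face traces co-occurring | "Cary Grant 和 Grace Kelly 同框 180 幀" (English: "Cary Grant and Grace Kelly share the frame for 180 frames") |
| P1 | `co_occurs` | Object ↔ Object co-occurrence | "物件 'car' 和 'person' 在同一畫面出現 60 幀" (English: "Objects 'car' and 'person' appear in the same frame for 60 frames") |
| P2 | `has_appearance` | Face trace ↔ Appearance trace | "Cary Grant 穿著藍色上衣,戴眼鏡" (English: "Cary Grant wears a blue top and glasses") |
| P2 | `wears` | Face trace ↔ Accessory | "Cary Grant 戴帽子,信心值 0.82" (English: "Cary Grant wears a hat, confidence 0.82") |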
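
The priority table implies an ingest order. A minimal sketch, assuming lower priority numbers are processed first and that edge types without a listed priority (for example `lip_sync`) are skipped; both function names are hypothetical.

```rust
/// Priority from the table above (P0 = 0, P1 = 1, P2 = 2).
/// Returns `None` for edge types the table does not list.
pub fn priority(edge_type: &str) -> Option<u8> {
    match edge_type {
        "speaker_face" | "mutual_gaze" => Some(0),
        "face_face" | "co_occurs" => Some(1),
        "has_appearance" | "wears" => Some(2),
        _ => None,
    }
}

/// Orders edge types for ingest. Assumption: unlisted types are dropped.
/// The sort is stable, so types with equal priority keep their input order.
pub fn order_for_ingest<'a>(edge_types: &[&'a str]) -> Vec<&'a str> {
    let mut ranked: Vec<(u8, &'a str)> = edge_types
        .iter()
        .filter_map(|t| priority(t).map(|p| (p, *t)))
        .collect();
    ranked.sort_by_key(|(p, _)| *p);
    ranked.into_iter().map(|(_, t)| t).collect()
}
```
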
## Chunk Data Structure
### Content JSON (`content` column)
```json
{
"edge_type": "speaker_face",
"edge_id": 123,
"source_node": {
"id": 45,
"node_type": "speaker",
"external_id": "SPEAKER_01",
"label": "SPEAKER_01"
},
"target_node": {
"id": 67,
"node_type": "face_trace",
"external_id": "trace_5",
"label": "Face Trace 5",
"identity_name": "Cary Grant"
},
"properties": {
"first_frame": 100,
"last_frame": 350,
"frame_count": 250,
"lip_sync_confidence": 0.85
}
}
```
### Text Content (`text_content` column)
LLM-generated natural language description in Traditional Chinese:
```
"SPEAKER_01 以 Cary Grant 的身份說話,從 frame 100 到 frame 350,唇語同步信心值 0.85"
```
### Metadata JSON (`metadata` column)
```json
{
"source_type": "speaker",
"target_type": "face_trace",
"has_identity": true,
"identity_source": "tmdb"
}
```
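
The metadata object can be derived from the two resolved nodes. A minimal sketch, assuming `has_identity` is true when either endpoint carries an identity name (the document does not state the exact rule). `identity_source` is left out because its derivation is not described. All type and function names are hypothetical.

```rust
/// Hypothetical resolved node, mirroring `source_node` / `target_node` in the content JSON.
#[derive(Debug, Clone, PartialEq)]
pub struct NodeRef {
    pub node_type: String,
    pub external_id: String,
    pub identity_name: Option<String>,
}

/// Subset of the metadata JSON shown above.
#[derive(Debug, Clone, PartialEq)]
pub struct ChunkMetadata {
    pub source_type: String,
    pub target_type: String,
    pub has_identity: bool,
}

/// Assumption: one identified endpoint is enough to set `has_identity`.
pub fn build_metadata(source: &NodeRef, target: &NodeRef) -> ChunkMetadata {
    ChunkMetadata {
        source_type: source.node_type.clone(),
        target_type: target.node_type.clone(),
        has_identity: source.identity_name.is_some() || target.identity_name.is_some(),
    }
}
```
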
## LLM Prompt Template
```text
你是影片關係描述專家。請用繁體中文描述以下人物/物件關係:
關係類型: {edge_type}
來源節點: {source_node.node_type} - {source_node.external_id}
身份名稱: {identity_name} (如果有)
目標節點: {target_node.node_type} - {target_node.external_id}
身份名稱: {identity_name} (如果有)
關係屬性:
- 起始幀: {first_frame}
- 結束幀: {last_frame}
- 幀數: {frame_count}
- 信心值: {confidence}
要求:
1. 使用自然語言,不要輸出 JSON
2. 包含時間範圍(幀號)
3. 包含人物名字(如有 identity)
4. 簡潔,20-50 字
5. 用繁體中文
範例輸出:
"SPEAKER_01 以 Cary Grant 的身份說話,從 frame 100 到 frame 350"
"Cary Grant 和 Grace Kelly 互相看對方 24 幀,起始於 frame 450"
```
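
Filling the `{placeholder}` slots of the template can be done with plain string replacement. A minimal sketch, assuming placeholder names are treated as opaque strings (so dotted names such as `source_node.node_type` work unchanged) and that unknown placeholders are left in place; `fill_template` is a hypothetical helper, not the code in `rule2_ingest.rs`.

```rust
/// Hypothetical helper: replaces each `{name}` in `template` with its value.
/// Placeholders without a matching entry are left untouched.
pub fn fill_template(template: &str, values: &[(&str, &str)]) -> String {
    let mut out = template.to_string();
    for (name, value) in values {
        // `{{` and `}}` are escaped braces, so this builds the literal "{name}".
        let placeholder = format!("{{{}}}", name);
        out = out.replace(placeholder.as_str(), *value);
    }
    out
}
```
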
## Edge → Chunk Conversion Rules
### speaker_face Edge
```rust
// Source: speaker node
// Target: face_trace node
// Properties: first_frame, last_frame, lip_sync_confidence
let text_content = call_llm(format!(
"SPEAKER {} 對應 face trace {},身份 {},frame {}-{}",
speaker_id, trace_id, identity_name, first_frame, last_frame
));
```
### mutual_gaze Edge
```rust
// Source: face_trace node A
// Target: face_trace node B
// Properties: first_frame, gaze_frame_count, yaw_a_avg, yaw_b_avg
let text_content = call_llm(format!(
"人物 {} 和 {} 互相看對方 {} 幀,起始於 frame {}",
identity_a, identity_b, gaze_frame_count, first_frame
));
```
### has_appearance Edge
```rust
// Source: face_trace node
// Target: appearance_trace node
// Properties: clothing colors, accessories
let text_content = call_llm(format!(
"人物 {} 穿著 {} 上衣,{} 下衣",
identity_name, upper_color, lower_color
));
```
## Search Contribution
| Search Path | Mechanism | Rule 2 Contribution |
|-------------|-----------|-------------------|
| **Semantic search** (Qdrant) | `chunk_type='relationship'` → embedding query | LLM descriptions enable natural language queries |
| **Keyword search** (BM25 ILIKE) | `text_content ILIKE '%互相看%'` | Relationship keywords searchable |
| **Agent tkg_query** | Direct edge queries | Rule 2 complements with vectorized search |
| **identity_text** | Reverse lookup | "誰戴眼鏡" ("who wears glasses") → has_appearance chunks |
## Trigger Points
| Trigger | Location | Condition |
|---------|----------|-----------|
| Worker auto | `job_worker.rs` | After TKG builder completes |
| HTTP API | `POST /api/v1/file/:file_uuid/rule2` | Manual trigger |
| Pipeline | `pipeline_core::execute_rule2` | Called by other modules |
## Edge Cases
| Scenario | Behavior |
|----------|----------|
| No tkg_edges | Returns 0 immediately with info log |
| Edge without identity | Use node external_id (e.g., "trace_5") in description |
| LLM call fails | Fallback to template-based description |
| Multiple edges same type | Each edge becomes separate chunk |
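
Two of the edge cases above, missing identity and LLM failure, can be sketched together. This is an illustration under stated assumptions: the wording of the real template-based fallback is not given in this document, so the `source --edge_type--> target` format below is a placeholder, and an empty LLM reply is treated as a failure. Function names are hypothetical.

```rust
/// Uses the identity name when present, otherwise the node's external_id (e.g. "trace_5").
pub fn display_name<'a>(identity_name: Option<&'a str>, external_id: &'a str) -> &'a str {
    identity_name.unwrap_or(external_id)
}

/// Returns the LLM text, or a template-based description if the call failed.
/// Assumption: the fallback format is a placeholder, and blank replies count as failures.
pub fn describe_edge(
    llm_result: Result<String, String>,
    edge_type: &str,
    source: &str,
    target: &str,
) -> String {
    match llm_result {
        Ok(text) if !text.trim().is_empty() => text,
        _ => format!("{} --{}--> {}", source, edge_type, target),
    }
}
```
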
## Qdrant Collection
| Property | Value |
|----------|-------|
| Collection name | `momentry_rule2` |
| Vector size | 768 (nomic-embed-text-v2-moe) |
| Distance | Cosine |
| Payload | `{chunk_id, file_uuid, edge_type, source_type, target_type}` |
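
For reference, the Cosine distance setting ranks vectors by the cosine of the angle between them. A minimal std-only sketch of that score with a guard for the 768-dimension size from the table; in practice Qdrant computes it server-side, and `cosine_similarity` here is a hypothetical helper for local sanity checks.

```rust
/// Vector size from the collection table above.
pub const VECTOR_SIZE: usize = 768;

/// Cosine similarity of two embeddings.
/// Returns `None` on a dimension mismatch or a zero-length vector.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != VECTOR_SIZE || b.len() != VECTOR_SIZE {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}
```
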
## Version History
| Version | Date | Author | Change |
|---------|------|--------|--------|
| 1.0 | 2026-06-20 | OpenCode | Initial design: TKG edges → relationship chunks |
