# Phase 1 Completion Report: v1 (base model)

**File**: Charade (1963) Cary Grant & Audrey Hepburn

**UUID**: `aeed71342a899fe4b4c57b7d41bcb692`

**Date**: 2026-05-09

**System**: M5 (MacBook Pro, 48GB, Apple Silicon)

---

## 1. Processor Outputs

| File | Size | Description |
|------|------|-------------|
| `asr.json` | 413KB | 3,417 segments, full movie coverage |
| `asrx.json` | 307KB | 1,815 segments, 10 speakers |
| `cut.json` | 329KB | 2,260 scenes |
| `yolo.json` | 181MB | 169,625 frames with object detections |
| `face.json` | **106MB** | 4,550 frames, 5,910 faces @ 8Hz (CoreML 512D) |
| `face_traced.json` | 110MB | Traced faces with identity |
| `lip.json` | 492KB | Lip openness analysis |
| `ocr.json` | 277KB | 606 OCR frames |
| `pose.json` | 26MB | 4,211 pose frames |
| `scene.json` | 403B | Scene classification |

## 2. Pipeline 8-Stage Checklist

| Stage | Status | Detail |
|-------|--------|--------|
| ASR | ✅ | 3,417 segments, last segment ends at 6,773s (100%) |
| ASRX | ✅ | 1,815 segments, 10 speakers |
| Sentence Chunks | ✅ | 3,417 sentence chunks with text |
| Vectorization | ✅ | 3,417 PG + Qdrant (768D) |
| Face Trace | ✅ | 423 traces, 11,820 detections @ 8Hz |
| TKG Graph | ✅ | 498 nodes, 1,617 edges |
| Trace Chunks | ✅ | 423 trace chunks with ASR text |
| Phase 1 Release | ✅ | 483MB package |

## 3. Identity & Knowledge Graph

### TMDb Character Matching (9 characters)

| Actor | Traces | Character |
|-------|--------|-----------|
| Audrey Hepburn | 843 | Regina Lampert |
| Cary Grant | 482 | Peter Joshua |
| Jacques Marin | 348 | Inspector Grandpierre |
| James Coburn | 188 | Tex Panthollow |
| Ned Glass | 176 | Leopold W. Gideon |
| George Kennedy | 104 | Herman Scobie |
| Walter Matthau | 104 | Hamilton Bartholomew |
| Dominique Minot | 45 | Sylvie Gaudel |
| Raoul Delfosse | 32 | - |

### Speaker Bindings (via Lip Verification)

| Speaker | Identity | Confidence |
|---------|----------|------------|
| SPEAKER_2 | Audrey Hepburn | 61% |
| SPEAKER_4 | Cary Grant | 56% |
| SPEAKER_5 | Audrey Hepburn | 100% |
| SPEAKER_6 | Audrey Hepburn | 43% |
| SPEAKER_7 | Cary Grant | 100% |
| SPEAKER_8 | Audrey Hepburn | 54% |

### TKG Graph

| Node Type | Count |
|-----------|-------|
| Face traces | 423 |
| Objects | 75 |
| Total nodes | 498 |
| Total edges | 1,617 |

### Qdrant Vector Collections

| Collection | Dims | Points | Content |
|-----------|------|--------|---------|
| `momentry_dev_v1` | 768 | 3,417 | Sentence chunk embeddings |
| `momentry_dev_faces` | 512 | **5,910** | Face embeddings (8Hz CoreML) |

## 4. Release Package

| Component | Size |
|-----------|------|
| `output_json/` | 11 processor files |
| `chunks.csv` | 2.2MB |
| `vectors.csv` | 56MB |
| `identities.csv` | 973KB |
| `schema.sql` | 29KB |
| `RELEASE_INFO.txt` | Metadata |
| **Total** | **483MB** |

Location: `release/phase1/v1.0.0_20260509_101337/`

## 5. Key Technical Decisions

| Decision | Rationale |
|----------|-----------|
| Face 8Hz (interval=3) | 5-15Hz human lip motion needs ≥8Hz sampling |
| Two-stage face processor | Apple Vision ANE (fast) + CoreML FaceNet (512D) |
| VNFaceprint not used | KVC returns nil in video pipeline |
| Face Qdrant separate collection | Face embeddings are 512D and chunk embeddings are 768D, so they cannot share a collection |
| LLM reasoning off | `--reasoning off` is needed to get non-empty content |

## 6. Phase 2 Preparation

Pending for Phase 2:

- Rule 3 scene chunking (cut-based parent chunks)
- 5W1H Agent (LLM-generated scene summaries)
- Full pipeline + 5W1H release packaging
- Lip analysis extended to full movie speaker binding
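
Two of the decisions in Section 5 (the 8Hz face sampling rate and the separate face collection) can be illustrated with a small sketch. This is not the pipeline's code: the function names `effective_sample_rate` and `collection_for` are hypothetical, and the 24 fps source frame rate is an assumption, since the report does not state it. Only the collection names and dimensions are taken from the Qdrant table in Section 3.

```python
# Illustrative sketch only; function names are hypothetical, not the pipeline's API.

# Collection names and vector dimensions as listed in Section 3.
COLLECTION_DIMS = {
    "momentry_dev_v1": 768,     # sentence chunk embeddings
    "momentry_dev_faces": 512,  # face embeddings
}


def effective_sample_rate(source_fps: float, interval: int) -> float:
    """Return the sampling rate in Hz when every `interval`-th frame is kept."""
    if interval < 1:
        raise ValueError("interval must be >= 1")
    return source_fps / interval


def collection_for(vector: list[float]) -> str:
    """Pick the collection whose declared dimension matches the vector length."""
    for name, dims in COLLECTION_DIMS.items():
        if len(vector) == dims:
            return name
    raise ValueError(f"no collection accepts {len(vector)}-dimensional vectors")


if __name__ == "__main__":
    # Assumption: a 24 fps source; the report does not state the frame rate.
    print(effective_sample_rate(24.0, 3))  # 8.0
    print(collection_for([0.0] * 512))     # momentry_dev_faces
```

Routing by dimension makes the reason for the split concrete: a 512D face vector has no valid home in a 768D collection, so mixing the two would fail at insert time rather than at query time.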