From ef894a44ad49b695876f0abc4983cea1b61981be Mon Sep 17 00:00:00 2001
From: Accusys <accusys@Accusyss-MacBook-Pro.local>
Date: Sun, 10 May 2026 01:11:42 +0800
Subject: [PATCH] docs: update Phase 1 report with all Qdrant collections +
 voice embeddings fix

- Fixed asrx_processor_custom.py: embeddings now passed to asrx.json
- Voice embeddings (192D ECAPA-TDNN) extracted for all 1815 ASRX segments
- momentry_dev_voice Qdrant collection created (1815 vectors)
- Updated Phase 1 report with 6 collections, key decisions
---
 docs/PHASE1_COMPLETION_REPORT.md | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/docs/PHASE1_COMPLETION_REPORT.md b/docs/PHASE1_COMPLETION_REPORT.md
index 35c2455..aad6538 100644
--- a/docs/PHASE1_COMPLETION_REPORT.md
+++ b/docs/PHASE1_COMPLETION_REPORT.md
@@ -73,10 +73,14 @@
 
 ### Qdrant Vector Collections
 
-| Collection | Dims | Points | Content |
-|-----------|------|--------|---------|
-| `momentry_dev_v1` | 768 | 3,417 | Sentence chunk embeddings |
-| `momentry_dev_faces` | 512 | **5,910** | Face embeddings (8Hz CoreML) |
+| Collection | Dims | Points | Content | Status |
+|-----------|------|--------|---------|--------|
+| `momentry_dev_v1` | 768 | 3,417 | Sentence chunk embeddings (pending re-embed with speaker info) | ⏳ |
+| `momentry_dev_stories` | 768 | 456 | Story dialogue + LLM summary | ✅ |
+| `momentry_dev_faces` | 512 | 5,910 | Face embeddings (8Hz CoreML) | ✅ |
+| `momentry_dev_voice` | 192 | **1,815** | Voice embeddings (ECAPA-TDNN) | ✅ |
+| `story_sentence` | 768 | 0 | Story processor template (not yet created) | ⏳ |
+| `sentence_summary` | 768 | 0 | LLM 50-character summary (not yet created) | ⏳ |
 
 ## 4. Release Package
 
@@ -101,6 +105,10 @@ Location: `release/phase1/v1.0.0_20260509_101337/`
 | VNFaceprint not used | KVC returns nil in video pipeline |
 | Face Qdrant separate collection | Face 512D vs chunk 768D — different dimensions |
 | LLM reasoning off | `--reasoning off` needed for non-empty content |
+| Voice embedding (ECAPA-TDNN) | SFSpeechAnalyzer does not expose speaker embeddings (Apple has not opened an API for this) |
+| ASRX embeddings bug | `asrx_processor_custom.py` did not pass embeddings through → fixed |
+| Speaker matching method | ASR × ASRX time overlap (any overlap), 99% match rate |
+| Story chunk grouping | Fixed groups of 15 ASR segments, 228 parent chunks |
 
 ## 6. Phase 2 Preparation