From 4d75b2e2519c3837ed9f25f0c9384e2440e42ed4 Mon Sep 17 00:00:00 2001 From: Warren Date: Thu, 30 Apr 2026 15:10:41 +0800 Subject: [PATCH] docs: update docs_v1.0/ documentation - Fix markdown lint issues (MD030, MD047, MD051, MD028, MD005) - Update AI agents, architecture, implementation docs - Add new identity, face recognition, and API documentation - Remove deprecated face/person API guides --- .../AI_AGENTS/CONTEXT/METADATA_PROCESSORS.md | 20 +- docs_v1.0/AI_AGENTS/CORE/AGENT_SPEC.md | 112 ++- .../IDENTITY/FACE_SPEAKER_PERSON_API_GUIDE.md | 248 ------ .../FACE_SPEAKER_PERSON_IDENTITY_TUTORIAL.md | 28 +- .../IDENTITY/FACE_SPEAKER_PERSON_PROGRESS.md | 12 +- .../FACE_SPEAKER_PERSON_QUICK_START.md | 14 +- .../IDENTITY/FACE_SPEAKER_PERSON_WORKFLOW.md | 407 ++++++--- .../IDENTITY/FACE_TO_IDENTITY_FLOW.md | 768 +++++++++++++++++ .../IDENTITY/FILE_IDENTITIES_TABLE_SPEC.md | 434 ++++++++++ .../AI_AGENTS/IDENTITY/IDENTITY_AGENT_SPEC.md | 549 ++++++++++++ .../IDENTITY/IDENTITY_MANAGEMENT_API.md | 550 ++++++++---- .../IDENTITY/PHASE1_MIGRATION_PLAN.md | 282 ++++++ .../IDENTITY/PHASE2_MIGRATION_SUMMARY.md | 113 +++ .../IDENTITY/V4_MIGRATION_COMPLETE.md | 119 +++ .../AI_AGENTS/IDENTITY/V4_MIGRATION_STATUS.md | 121 +++ .../SUMMARIZATION/CHUNK_RULE_4_SUMMARY.md | 18 +- .../AI_AGENTS/TRANSLATION/TEXT_TRANSLATION.md | 6 +- docs_v1.0/API/PEOPLE_API_MARCOM_MAPPING.md | 442 ++++++++++ docs_v1.0/API_DOCUMENTATION.md | 699 +++++++++++++++ .../API_WORKFLOW_WORDPRESS_N8N.md | 6 +- .../ARCHITECTURE_DECISION_CARDS.md | 2 +- .../ARCHITECTURE_DECISION_EXECUTION_PLAN.md | 2 +- .../ARCHITECTURE_DOCUMENTATION_MAP.md | 2 +- .../ARCHITECTURE/ARCHITECTURE_OVERVIEW.md | 2 +- .../ARCHITECTURE_REVIEW_PROCESS.md | 2 +- .../ARCHITECTURE/ARCHITECTURE_ROADMAP.md | 53 +- .../CLIP_EMBEDDING_BENCHMARK_PLAN.md | 535 ++++++++++++ .../ARCHITECTURE/DESIGN_IMPLEMENTATION_GAP.md | 2 +- .../EVENT_RECOGNITION_TECHNICAL_ANALYSIS.md | 2 +- docs_v1.0/ARCHITECTURE/FAQ.md | 2 +- .../IDENTITY_REFERENCE_VECTOR_DESIGN.md 
| 573 +++++++++++++ .../JOB_WORKER_IMPLEMENTATION_PLAN.md | 120 ++- .../ARCHITECTURE/MCP_LAZY_LOADING_STRATEGY.md | 2 +- ...ULE_STANDARDIZATION_IMPLEMENTATION_PLAN.md | 2 +- .../MOMENTRY_CORE_ARCHITECTURE_V2.md | 158 +++- .../ARCHITECTURE/MONITORING_ARCHITECTURE.md | 2 +- .../ARCHITECTURE/MONITORING_SETUP_GUIDE.md | 2 +- .../MULTIMODAL_SEARCH_DESIGN_V5.md | 120 +-- .../ON_THE_FLY_PROCESSING_DESIGN.md | 36 +- .../PARENT_CHUNK_COVERAGE_ANALYSIS.md | 2 +- .../PERFORMANCE_AND_SCALABILITY.md | 2 +- .../PERSON_IDENTITY_INTEGRATION.md | 40 +- .../PERSON_IDENTITY_USAGE_GUIDE.md | 24 +- .../PIPELINE_AND_RESOURCE_ARCHITECTURE.md | 18 +- .../POSE_BASED_MATCHING_OPTIMIZATION_PLAN.md | 392 +++++++++ docs_v1.0/ARCHITECTURE/PROCESSING_PIPELINE.md | 58 +- docs_v1.0/ARCHITECTURE/QUICK_START_GUIDE.md | 2 +- .../PROCESSOR_LIFECYCLE.md | 34 +- .../PROCESSOR_REGISTRY_ARCHITECTURE.md | 8 +- .../RESOURCE_MONITORING_SPEC.md | 22 +- .../UNIFIED_RESOURCE_REGISTRY.md | 54 +- .../ROOT_API_WORKFLOW_WORDPRESS_N8N.md | 6 +- .../ARCHITECTURE/SECURITY_ARCHITECTURE.md | 2 +- .../ARCHITECTURE/SEMANTIC_SEARCH_DESIGN.md | 76 +- .../SOUND_RECOGNITION_EXTENSION.md | 408 +++++++++ .../TECHNICAL_DECISION_RECORDS.md | 179 +--- docs_v1.0/ARCHITECTURE/TERMINOLOGY_MAPPING.md | 2 +- .../ARCHITECTURE/USER_MANAGEMENT_PLAN.md | 6 +- .../_deprecated/SPEAKER_INTEGRATION.md | 24 +- .../_deprecated/TMDB_CHARACTER_INTEGRATION.md | 52 +- .../BODY_ACTION_DECODER_CLASSIFICATION.md | 362 ++++++++ .../CHUNKING/CORE/CHUNKING_ARCHITECTURE.md | 8 +- .../CHUNKING/CORE/CHUNKING_SCHEMA_SPEC.md | 6 +- .../RULES/SCENE_BASED/CHUNK_RULE_3_SCENE.md | 20 +- .../RULES/TEXT_BASED/CHUNK_RULE_1_SENTENCE.md | 24 +- .../RULES/VISUAL_BASED/CHUNK_RULE_2_VISUAL.md | 30 +- .../FACE_PROCESSOR_PERFORMANCE_2026-04-28.md | 196 +++++ ...E_TRACKER_INTEGRATION_REPORT_2026-04-28.md | 206 +++++ .../IDENTITY_SYSTEM_EXPERIMENT_2026-04-28.md | 204 +++++ .../LANDMARKS_SOURCE_ANALYSIS_2026-04-28.md | 309 +++++++ ...O_MANY_MATCHING_OPTIMIZATION_2026-04-28.md 
| 184 ++++ ..._BASED_MATCHING_FINAL_REPORT_2026-04-28.md | 231 +++++ docs_v1.0/FACE_THUMBNAIL_IMPLEMENTATION.md | 351 ++++++++ docs_v1.0/FACE_TRACKER_DATA_STRUCTURE.md | 620 +++++++++++++ docs_v1.0/FACE_TRACKER_GUIDE.md | 261 ++++++ docs_v1.0/FILE_UUID_SPEC.md | 208 +++++ docs_v1.0/IDENTITY_API_SPEC.md | 811 ++++++++++++++++++ .../AI_AGENT_DOCUMENTATION_GUIDE.md | 2 +- docs_v1.0/IMPLEMENTATION/API_CURL_EXAMPLES.md | 2 +- .../IMPLEMENTATION/API_TRAINING_MARCOM.md | 2 +- .../IMPLEMENTATION/CONTINUOUS_DEMO_GUIDE.md | 2 +- docs_v1.0/IMPLEMENTATION/DEMO_GUIDE.md | 2 +- docs_v1.0/IMPLEMENTATION/DEMO_MODES.md | 2 +- docs_v1.0/IMPLEMENTATION/DEMO_VIDEO_MODE.md | 2 +- docs_v1.0/IMPLEMENTATION/DEV_3003_REFACTOR.md | 199 +++++ .../FILE_IDENTITY_API_DESIGN.md | 198 ++++- .../IMPLEMENTATION/INSTALL_SYNONYM_FOREST.md | 2 +- .../N8N_SEARCH_API_COMPARISON.md | 7 +- .../N8N_SEARCH_API_TECHNICAL_SPEC.md | 56 +- .../PORTAL_BIRTH_UUID_ADAPTATION.md | 267 ++++++ .../SEARCH_ACCEPTANCE_CRITERIA.md | 12 +- .../IMPLEMENTATION/STAMP_SEARCH_PROGRESS.md | 12 +- .../IMPLEMENTATION/SYNONYM_CONFIGURATION.md | 2 +- .../IMPLEMENTATION/SYNONYM_FOREST_README.md | 2 +- .../IMPLEMENTATION/USER_MANAGEMENT_PLAN.md | 6 +- .../MEDIAPIPE_HOLISTIC_INTEGRATION_REPORT.md | 370 ++++++++ .../OPERATIONS/ARCHITECTURE_REVIEW_REPORT.md | 96 +-- docs_v1.0/OPERATIONS/DOCUMENT_AUDIT_REPORT.md | 14 +- .../IMPLEMENTATION_COMPATIBILITY_ANALYSIS.md | 8 +- .../OPERATIONS/INCIDENT_RESPONSE_PROCEDURE.md | 12 +- .../OPERATIONS/INTEGRATED_PLAYER_GUIDE.md | 2 +- .../OPERATIONS/PRODUCTION_DEPLOYMENT_GUIDE.md | 2 +- .../OPERATIONS/RELEASE_v0.4.0_2026-04-30.md | 196 +++++ .../TRAINING_MAINTENANCE_RECORDS.md | 2 +- .../maintenance_records/MIGRATION_PLAN.md | 2 +- .../OPERATIONS/maintenance_records/README.md | 2 +- .../maintenance_records/archive/README.md | 2 +- ...OCS_STANDARD_PHASE2_APPROVAL_2026_03_27.md | 2 +- ...NGE_N8N_MCP_INTEGRATION_TEST_2026_03_23.md | 2 +- ...DENT_TEST_SYSTEM_INTEGRATION_2026_03_27.md | 18 +- 
...AN_BIRTH_UUID_IMPLEMENTATION_2026_04_27.md | 486 +++++++++++ ...E_MOMENTRY_CORE_DATA_CLEANUP_2026_03_28.md | 34 +- ...MENTRY_CORE_DATA_CLEANUP_FIX_2026_03_28.md | 2 +- .../RCA_TEST_SYSTEM_INTEGRATION_2026_03_27.md | 18 +- .../RCA_N8N_API_PORT_CONFLICT_2026_03_26.md | 2 +- ...RESS_TIMEOUT_EXTERNAL_ACCESS_2026_03_27.md | 4 +- .../ASR_PROCESSOR_COMPARISON_REPORT.md | 480 +++++++++++ ...EW_DOCS_STANDARD_IMPROVEMENT_2026_03_27.md | 16 +- .../templates/TEMPLATE_CHANGE.md | 18 +- .../templates/TEMPLATE_CHANGE_AI_OPTIMIZED.md | 18 +- .../templates/TEMPLATE_CHANGE_LEGACY.md | 2 +- .../templates/TEMPLATE_INCIDENT.md | 18 +- .../TEMPLATE_INCIDENT_AI_OPTIMIZED.md | 18 +- .../templates/TEMPLATE_INCIDENT_LEGACY.md | 2 +- .../templates/TEMPLATE_MAINTENANCE.md | 24 +- .../templates/TEMPLATE_RCA.md | 18 +- .../templates/TEMPLATE_RCA_AI_OPTIMIZED.md | 18 +- .../TEMPLATE_RCA_AI_OPTIMIZED_SIMPLE.md | 18 +- .../templates/TEMPLATE_RCA_LEGACY.md | 2 +- docs_v1.0/PORTAL_FACE_API_IMPLEMENTATION.md | 294 +++++++ docs_v1.0/PORTAL_FACE_DEMO_PLAN.md | 436 ++++++++++ .../PORTAL_FACE_FRONTEND_IMPLEMENTATION.md | 235 +++++ docs_v1.0/PORTAL_FACE_VERIFICATION.md | 214 +++++ docs_v1.0/PORTAL_UI_INTEGRATION_PROPOSAL.md | 721 ++++++++++++++++ docs_v1.0/POSE_ACTION_DECODER_GUIDE.md | 378 ++++++++ .../AI_PROCESSOR_MODULE_REVISION_RECORDS.md | 2 +- .../CORE/PROCESSOR_IMPLEMENTATION_STATUS.md | 4 +- ...ESSOR_PERFORMANCE_EVALUATION_2026_04_01.md | 2 +- .../CORE/PROCESSOR_QUICK_REFERENCE.md | 28 +- .../CORE/YOLO_PROCESSOR_TECHNICAL_REVIEW.md | 430 ++++++++++ .../AI_DRIVEN_PROCESSOR_CONTRACT.md | 2 +- .../AI_PROCESSOR_COMPLIANCE_CHECKLIST.md | 2 +- .../SPECIFICATION/PROCESSOR_OUTPUT_SPEC.md | 48 +- .../PROCESSOR_STANDARDIZATION_TEMPLATE.md | 2 +- .../ASRX_REPLACEMENT_MAC_STUDIO_ANALYSIS.md | 20 +- .../ASRX_SELF_VS_PYANNOTE_COMPARISON.md | 4 +- .../SPEECH/ASR_ASRX_SPEAKER_MODEL_ANALYSIS.md | 2 +- .../SPEECH/ASR_CONFIGURATION_UNIFICATION.md | 2 +- .../PROCESSORS/SPEECH/ASR_IMPROVEMENT_PLAN.md | 2 +- 
.../PROCESSORS/SPEECH/ASR_vs_ASRX_ANALYSIS.md | 2 +- .../SPEECH/ASR_vs_ASRX_EDGE_AI_ANALYSIS.md | 6 +- .../ASR_vs_ASRX_REPLACEMENT_ANALYSIS.md | 10 +- .../PROCESSORS/VISUAL/FACE_MODEL_ANALYSIS.md | 2 +- ...FACE_RECOGNITION_IMPLEMENTATION_SUMMARY.md | 6 +- .../VISUAL/IMAGE_PROCESSING_ARCHITECTURE.md | 2 +- ...E_CLASSIFICATION_TEST_REPORT_2026_04_01.md | 4 +- ..._CLASSIFICATION_TEST_RESULTS_2026_04_01.md | 2 +- .../PROCESSORS/VISUAL/VISUAL_CHUNK_DESIGN.md | 2 +- .../_CORE/PROCESSOR_RESUME_STRATEGY.md | 36 +- .../_CORE/PROCESSOR_UPGRADE_ANALYSIS.md | 321 +++++++ .../PROCESSORS/_CORE/RULE_SPECIFICATION.md | 74 +- docs_v1.0/PROCESSOR_STATUS_ANALYSIS.md | 328 +++++++ docs_v1.0/PROJECT_DOCS_V1_INTEGRATION_PLAN.md | 2 +- docs_v1.0/REFERENCE/API_QUICK_REFERENCE.md | 2 +- docs_v1.0/REFERENCE/API_REFERENCE.md | 2 +- docs_v1.0/REFERENCE/API_TRAINING_MARCOM.md | 34 +- .../MODULE_STANDARDIZATION_SPECIFICATION.md | 2 +- docs_v1.0/REFERENCE/PORTAL_API_DEMO_GUIDE.md | 416 +++++++++ .../REFERENCE/PROCESSING_STATUS_JSONB_SPEC.md | 682 +++++++++++++++ docs_v1.0/REFERENCE/VIDEO_PROCESSING_SPEC.md | 94 +- docs_v1.0/RULE1_CHUNK_INGESTION_CHECK.md | 204 +++++ docs_v1.0/RULE1_FACE_DATA_SOURCE_FIX.md | 239 ++++++ docs_v1.0/RULE1_TRIGGER_MECHANISM.md | 344 ++++++++ docs_v1.0/STANDARDS/DOCS_STANDARD.md | 328 ++++++- ...TANDARD_IMPROVEMENT_PROPOSAL_2026_03_27.md | 2 +- docs_v1.0/SYNONYM_CONFIGURATION.md | 2 +- docs_v1.0/TESTING/TEST_AND_BENCHMARK_PLAN.md | 2 +- docs_v1.0/TIME_FORMAT_UNIFICATION_PLAN.md | 2 +- docs_v1.0/UUID_CLEANUP_PLAN.md | 256 ++++++ docs_v1.0/UUID_LENGTH_ISSUE.md | 284 ++++++ docs_v1.0/V4_ISSUES_TRACKING.md | 249 ++++++ .../V4_MIGRATION_PHASE3_DISABLE_OLD_API.md | 187 ++++ docs_v1.0/VIDEOS_TABLE_NAMING_ISSUE.md | 285 ++++++ .../design/OBJECT_SNAPSHOT_SYSTEM_DESIGN.md | 710 +++++++++++++++ docs_v1.0/session-ses_2f27.md | 11 +- 185 files changed, 21071 insertions(+), 1605 deletions(-) delete mode 100644 docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_API_GUIDE.md create mode 
100644 docs_v1.0/AI_AGENTS/IDENTITY/FACE_TO_IDENTITY_FLOW.md create mode 100644 docs_v1.0/AI_AGENTS/IDENTITY/FILE_IDENTITIES_TABLE_SPEC.md create mode 100644 docs_v1.0/AI_AGENTS/IDENTITY/IDENTITY_AGENT_SPEC.md create mode 100644 docs_v1.0/AI_AGENTS/IDENTITY/PHASE1_MIGRATION_PLAN.md create mode 100644 docs_v1.0/AI_AGENTS/IDENTITY/PHASE2_MIGRATION_SUMMARY.md create mode 100644 docs_v1.0/AI_AGENTS/IDENTITY/V4_MIGRATION_COMPLETE.md create mode 100644 docs_v1.0/AI_AGENTS/IDENTITY/V4_MIGRATION_STATUS.md create mode 100644 docs_v1.0/API/PEOPLE_API_MARCOM_MAPPING.md create mode 100644 docs_v1.0/API_DOCUMENTATION.md create mode 100644 docs_v1.0/ARCHITECTURE/CLIP_EMBEDDING_BENCHMARK_PLAN.md create mode 100644 docs_v1.0/ARCHITECTURE/IDENTITY_REFERENCE_VECTOR_DESIGN.md create mode 100644 docs_v1.0/ARCHITECTURE/POSE_BASED_MATCHING_OPTIMIZATION_PLAN.md create mode 100644 docs_v1.0/ARCHITECTURE/SOUND_RECOGNITION_EXTENSION.md create mode 100644 docs_v1.0/BODY_ACTION_DECODER_CLASSIFICATION.md create mode 100644 docs_v1.0/EXPERIMENT_REPORTS/FACE_PROCESSOR_PERFORMANCE_2026-04-28.md create mode 100644 docs_v1.0/EXPERIMENT_REPORTS/FACE_TRACKER_INTEGRATION_REPORT_2026-04-28.md create mode 100644 docs_v1.0/EXPERIMENT_REPORTS/IDENTITY_SYSTEM_EXPERIMENT_2026-04-28.md create mode 100644 docs_v1.0/EXPERIMENT_REPORTS/LANDMARKS_SOURCE_ANALYSIS_2026-04-28.md create mode 100644 docs_v1.0/EXPERIMENT_REPORTS/ONE_TO_MANY_MATCHING_OPTIMIZATION_2026-04-28.md create mode 100644 docs_v1.0/EXPERIMENT_REPORTS/POSE_BASED_MATCHING_FINAL_REPORT_2026-04-28.md create mode 100644 docs_v1.0/FACE_THUMBNAIL_IMPLEMENTATION.md create mode 100644 docs_v1.0/FACE_TRACKER_DATA_STRUCTURE.md create mode 100644 docs_v1.0/FACE_TRACKER_GUIDE.md create mode 100644 docs_v1.0/FILE_UUID_SPEC.md create mode 100644 docs_v1.0/IDENTITY_API_SPEC.md create mode 100644 docs_v1.0/IMPLEMENTATION/DEV_3003_REFACTOR.md create mode 100644 docs_v1.0/IMPLEMENTATION/PORTAL_BIRTH_UUID_ADAPTATION.md create mode 100644 
docs_v1.0/MEDIAPIPE_HOLISTIC_INTEGRATION_REPORT.md create mode 100644 docs_v1.0/OPERATIONS/RELEASE_v0.4.0_2026-04-30.md create mode 100644 docs_v1.0/OPERATIONS/maintenance_records/plans/_active/PLAN_BIRTH_UUID_IMPLEMENTATION_2026_04_27.md create mode 100644 docs_v1.0/OPERATIONS/maintenance_records/reviews/ASR_PROCESSOR_COMPARISON_REPORT.md create mode 100644 docs_v1.0/PORTAL_FACE_API_IMPLEMENTATION.md create mode 100644 docs_v1.0/PORTAL_FACE_DEMO_PLAN.md create mode 100644 docs_v1.0/PORTAL_FACE_FRONTEND_IMPLEMENTATION.md create mode 100644 docs_v1.0/PORTAL_FACE_VERIFICATION.md create mode 100644 docs_v1.0/PORTAL_UI_INTEGRATION_PROPOSAL.md create mode 100644 docs_v1.0/POSE_ACTION_DECODER_GUIDE.md create mode 100644 docs_v1.0/PROCESSORS/CORE/YOLO_PROCESSOR_TECHNICAL_REVIEW.md create mode 100644 docs_v1.0/PROCESSORS/_CORE/PROCESSOR_UPGRADE_ANALYSIS.md create mode 100644 docs_v1.0/PROCESSOR_STATUS_ANALYSIS.md create mode 100644 docs_v1.0/REFERENCE/PORTAL_API_DEMO_GUIDE.md create mode 100644 docs_v1.0/REFERENCE/PROCESSING_STATUS_JSONB_SPEC.md create mode 100644 docs_v1.0/RULE1_CHUNK_INGESTION_CHECK.md create mode 100644 docs_v1.0/RULE1_FACE_DATA_SOURCE_FIX.md create mode 100644 docs_v1.0/RULE1_TRIGGER_MECHANISM.md create mode 100644 docs_v1.0/UUID_CLEANUP_PLAN.md create mode 100644 docs_v1.0/UUID_LENGTH_ISSUE.md create mode 100644 docs_v1.0/V4_ISSUES_TRACKING.md create mode 100644 docs_v1.0/V4_MIGRATION_PHASE3_DISABLE_OLD_API.md create mode 100644 docs_v1.0/VIDEOS_TABLE_NAMING_ISSUE.md create mode 100644 docs_v1.0/design/OBJECT_SNAPSHOT_SYSTEM_DESIGN.md diff --git a/docs_v1.0/AI_AGENTS/CONTEXT/METADATA_PROCESSORS.md b/docs_v1.0/AI_AGENTS/CONTEXT/METADATA_PROCESSORS.md index f80737f..7d808f8 100644 --- a/docs_v1.0/AI_AGENTS/CONTEXT/METADATA_PROCESSORS.md +++ b/docs_v1.0/AI_AGENTS/CONTEXT/METADATA_PROCESSORS.md @@ -193,7 +193,7 @@ GROUP BY metadata_version; | `person_id` | varchar(255) | 人物唯一 ID (如 person_001) | | `name` | varchar(255) | 人物名稱 (可確認) | | `speaker_id` | 
varchar(255) | 對應的說話者 ID | -| `video_uuid` | varchar(255) | 影片 UUID | +| `file_uuid` | varchar(255) | 影片 UUID | | `face_identity_id` | integer | 對應的 global identity | | `appearance_count` | integer | 出現次數 | | `first_appearance_time` | double | 首次出現時間 | @@ -264,13 +264,13 @@ Step 4: Global Matching -- 取得影片中的人物列表 SELECT person_id, name, speaker_id, appearance_count FROM dev.person_identities -WHERE video_uuid = '384b0ff44aaaa1f1' +WHERE file_uuid = '384b0ff44aaaa1f14cb2cd63b3fea966' ORDER BY appearance_count DESC; -- 取得 chunk 的人物 SELECT c.chunk_id, pi.name, pi.speaker_id FROM dev.chunks c -JOIN dev.person_identities pi ON c.uuid = pi.video_uuid +JOIN dev.person_identities pi ON c.uuid = pi.file_uuid WHERE c.chunk_id = 'sentence_0001'; ``` @@ -280,7 +280,7 @@ WHERE c.chunk_id = 'sentence_0001'; -- 取得某 chunk 的人物 SELECT pi.name, pi.speaker_id, pi.appearance_count FROM dev.person_identities pi -JOIN dev.chunks c ON c.uuid = pi.video_uuid +JOIN dev.chunks c ON c.uuid = pi.file_uuid WHERE c.chunk_id = 'sentence_0001'; ``` @@ -484,19 +484,19 @@ SELECT COUNT(*) FROM dev.chunks WHERE visual_stats IS NOT NULL;" ```bash # Step 1: ASRX 執行說話者分離 -python scripts/asrx_processor.py --uuid 384b0ff44aaaa1f1 +python scripts/asrx_processor.py --uuid 384b0ff44aaaa1f14cb2cd63b3fea966 # Step 2: Face 執行臉部偵測 -python scripts/analyze_video_faces.py --uuid 384b0ff44aaaa1f1 +python scripts/analyze_video_faces.py --uuid 384b0ff44aaaa1f14cb2cd63b3fea966 # Step 3: Auto-identify 建立影片級人物 -python scripts/auto_identify_persons.py --uuid 384b0ff44aaaa1f1 +python scripts/auto_identify_persons.py --uuid 384b0ff44aaaa1f14cb2cd63b3fea966 # Step 4: 全局 Identity 比對 (需累積一定數量的 face_identities) python scripts/match_faces_to_identities.py # Step 5: 重新生成 chunk 5W1H (包含新的 identity 資訊) -python scripts/generate_chunk_summaries.py --uuid 384b0ff44aaaa1f1 +python scripts/generate_chunk_summaries.py --uuid 384b0ff44aaaa1f14cb2cd63b3fea966 ``` ### 檢查待處理狀態 @@ -515,7 +515,7 @@ WHERE face_ids IS NOT NULL AND 
array_length(face_ids, 1) > 0;" # 檢查 person_identities psql -h localhost -U accusys -d momentry -c " SELECT COUNT(*) FROM dev.person_identities -WHERE video_uuid = '384b0ff44aaaa1f1';" +WHERE file_uuid = '384b0ff44aaaa1f14cb2cd63b3fea966';" # 檢查 face_identities (全局) psql -h localhost -U accusys -d momentry -c " @@ -560,4 +560,4 @@ SELECT COUNT(*) FROM dev.face_identities;" 2. **face_ids** → 進入 `chunk_identity.faces` 3. **person_identities** → 進入 `chunk_identity.person_name` -確保 LLM 產生的 5W1H 包含最新的角色資訊。 \ No newline at end of file +確保 LLM 產生的 5W1H 包含最新的角色資訊。 diff --git a/docs_v1.0/AI_AGENTS/CORE/AGENT_SPEC.md b/docs_v1.0/AI_AGENTS/CORE/AGENT_SPEC.md index afeffc2..71685cb 100644 --- a/docs_v1.0/AI_AGENTS/CORE/AGENT_SPEC.md +++ b/docs_v1.0/AI_AGENTS/CORE/AGENT_SPEC.md @@ -1,10 +1,33 @@ +--- +document_type: "standard_doc" +service: "MOMENTRY_CORE" +title: "AI Agent 設計規範" +date: "2026-04-27" +version: "V1.1" +status: "active" +owner: "Warren" +created_by: "OpenCode" +tags: + - "AI Agent" + - "設計規範" + - "三層架構" + - "processing_status" +ai_query_hints: + - "查詢 AI Agent 設計規範的內容" + - "AI Agent 的三層架構定義" + - "Agent 類型列表" + - "Agent 進度追蹤方式" + - "processing_status JSONB agents 字段" + - "如何設計 AI Agent" +--- + # AI Agent 設計規範 (Agent Design Specification) | 項目 | 內容 | |------|------| | 建立者 | OpenCode | | 建立時間 | 2026-04-25 | -| 文件版本 | V1.0 | +| 文件版本 | V1.1 | --- @@ -13,6 +36,7 @@ | 版本 | 日期 | 目的 | 操作人 | 工具/模型 | |------|------|------|--------|-----------| | V1.0 | 2026-04-25 | 定義 Momentry Core 中 AI Agent 的標準設計與職責 | OpenCode | OpenCode | +| V1.1 | 2026-04-27 | 添加 Agent 類型列表和進度追蹤(processing_status JSONB) | OpenCode | GLM-5 | --- @@ -33,10 +57,10 @@ AI Agent 負責處理那些傳統程式難以精確定義規則的任務。 **注意**: 在系統架構中,Agent 被視為一種 **資源 (Resource)**,與 Processor 和 Service 統一由 **資源註冊中心 (Resource Registry)** 管理。 -1. **語義理解 (Semantic Understanding)**: 將非結構化數據(如 OCR 文字、雜訊 ASR 文本)轉化為結構化標籤 (5W1H)。 -2. **跨模態匹配 (Cross-Modal Matching)**: 綜合視覺、聽覺和文本證據,判斷「畫面中的臉」是否為「資料庫中的人」。 -3. 
**內容生成 (Content Generation)**: 為影片片段生成自然的摘要或標題。 -4. **查詢解析 (Query Parsing)**: 將用戶的自然語言請求轉譯為系統可執行的 API 調用序列。 +1. **語義理解 (Semantic Understanding)**: 將非結構化數據(如 OCR 文字、雜訊 ASR 文本)轉化為結構化標籤 (5W1H)。 +2. **跨模態匹配 (Cross-Modal Matching)**: 綜合視覺、聽覺和文本證據,判斷「畫面中的臉」是否為「資料庫中的人」。 +3. **內容生成 (Content Generation)**: 為影片片段生成自然的摘要或標題。 +4. **查詢解析 (Query Parsing)**: 將用戶的自然語言請求轉譯為系統可執行的 API 調用序列。 --- @@ -45,8 +69,8 @@ AI Agent 負責處理那些傳統程式難以精確定義規則的任務。 所有 AI Agent 的設計文件必須遵循以下結構: ### 3.1 檔案命名 -* **格式**: `[AGENT_TYPE]_[PURPOSE].md` -* **範例**: `CONTEXT_5W1H_INFERENCE.md` +* **格式**: `[AGENT_TYPE]_[PURPOSE].md` +* **範例**: `CONTEXT_5W1H_INFERENCE.md` ### 3.2 文件內容 @@ -56,17 +80,17 @@ AI Agent 負責處理那些傳統程式難以精確定義規則的任務。 #### 3.2.2 輸入數據 (Input) 定義 Agent 接收的數據格式。通常來自 Processor 輸出或 Rule 產物。 -* **來源**: `PROCESSORS/` 或 `CHUNKING/` -* **格式**: JSON, Text, List of Frames. +* **來源**: `PROCESSORS/` 或 `CHUNKING/` +* **格式**: JSON, Text, List of Frames. #### 3.2.3 核心邏輯 (Core Logic: Prompt / Workflow) 這是 Agent 的靈魂。 -* **單一 Prompt Agent**: 提供完整的 System Prompt。 +* **單一 Prompt Agent**: 提供完整的 System Prompt。 ```markdown ## System Prompt You are a scene analysis assistant... ``` -* **多步 Workflow Agent**: 提供步驟圖或偽代碼。 +* **多步 Workflow Agent**: 提供步驟圖或偽代碼。 ```mermaid graph TD A[Start] --> B[Extract Entities] @@ -86,31 +110,71 @@ AI Agent 負責處理那些傳統程式難以精確定義規則的任務。 #### 3.2.5 模型配置 (Model Config) 建議使用的模型類型及其原因。 -* **推理模型 (Reasoning)**: `o1`, `R1` (用於複雜邏輯判斷) -* **生成模型 (Generation)**: `GPT-4o`, `Sonnet` (用於摘要) -* **本地模型 (Local)**: `Llama-3`, `Qwen` (用於隱私數據) +* **推理模型 (Reasoning)**: `o1`, `R1` (用於複雜邏輯判斷) +* **生成模型 (Generation)**: `GPT-4o`, `Sonnet` (用於摘要) +* **本地模型 (Local)**: `Llama-3`, `Qwen` (用於隱私數據) --- ## 4. 開發工作流 (Development Workflow) -1. **定義需求**: 確定是否需要 AI 介入 (若規則可解,優先使用 Rule)。 -2. **撰寫 Prompt**: 在文檔中迭代 Prompt,直到達到穩定輸出。 -3. **工具串接**: 若需要外部數據 (如 TMDB),定義 Tool 定義。 -4. **實作封裝**: 將 Prompt/Workflow 封裝為 Rust/Python 模組,透過 API 調用。 +1. **定義需求**: 確定是否需要 AI 介入 (若規則可解,優先使用 Rule)。 +2. **撰寫 Prompt**: 在文檔中迭代 Prompt,直到達到穩定輸出。 +3. 
**工具串接**: 若需要外部數據 (如 TMDB),定義 Tool 定義。 +4. **實作封裝**: 將 Prompt/Workflow 封裝為 Rust/Python 模組,透過 API 調用。 --- ## 5. 相關文件 -* `UNIFIED_RESOURCE_REGISTRY.md` - 系統統一資源管理架構 (Agents 作為資源註冊)。 -* `AI_DRIVEN_PROCESSOR_CONTRACT.md` - Processor 層級的整合合約。 -* `CHUNKING_ARCHITECTURE.md` - Rule 層級的架構。 -* `FILE_IDENTITY_API_DESIGN.md` - 全局架構。 +* `UNIFIED_RESOURCE_REGISTRY.md` - 系統統一資源管理架構 (Agents 作為資源註冊)。 +* `AI_DRIVEN_PROCESSOR_CONTRACT.md` - Processor 層級的整合合約。 +* `CHUNKING_ARCHITECTURE.md` - Rule 層級的架構。 +* `FILE_IDENTITY_API_DESIGN.md` - 全局架構。 + +--- + +## 6. Agent 類型列表 + +| Agent | 目的 | 觸發條件 | 文檔 | +|-------|------|----------|------| +| **Translation Agent** | 多語言翻譯 | 用戶手動觸發 | `AI_AGENTS/TRANSLATION/TEXT_TRANSLATION.md` | +| **5W1H Agent** | 場景分析(Who/What/When/Where/Why/How) | Rule 3 完成 | `AI_AGENTS/SUMMARIZATION/CHUNK_RULE_4_SUMMARY.md` | +| **Identity Agent** | 身份解析(Face/Speaker → Person) | Face/Speaker 完成 | `AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_WORKFLOW.md` | + +--- + +## 7. Agent 進度追蹤 + +從 V1.2 起,所有 Agent 任務透過 `processing_status` JSONB 的 `agents` 字段追蹤。 + +### JSONB 範例 + +```json +{ + "agents": { + "5w1h": { + "status": "running", + "scenes_processed": 5, + "scenes_total": 1332, + "progress_pct": 0.4 + } + } +} +``` + +### 查詢 Agent 進度 + +```sql +SELECT processing_status->'agents'->'5w1h'->>'status' FROM videos WHERE uuid = 'xxx'; +``` + +詳細規範請參考: `REFERENCE/PROCESSING_STATUS_JSONB_SPEC.md` --- ## 版本資訊 -- 版本: V1.0 -- 建立日期: 2026-04-25 +* 版本: V1.1 +* 建立日期: 2026-04-25 +* 文件更新: 2026-04-27 diff --git a/docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_API_GUIDE.md b/docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_API_GUIDE.md deleted file mode 100644 index 9d4435f..0000000 --- a/docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_API_GUIDE.md +++ /dev/null @@ -1,248 +0,0 @@ -# Momentry Face / Speaker / Person API 開發指南 - -> **版本**: 3.5 | **更新日期**: 2026-04-17 -> **適用對象**: n8n 自動化流程開發者、Portal 前端開發者 - ---- - -## 快速開始 - -### 環境 - -| 環境 | URL | 說明 | -|------|-----|------| -| **正式版** | 
`https://api.momentry.ddns.net` | 外部存取 (HTTPS/TLSv1.3) | -| **本機版** | `http://localhost:3002` | 同一台機器使用 (延遲更低) | - -### 認證 - -所有 API 請求需在 Header 加入 API Key: - -```bash -curl https://api.momentry.ddns.net/api/v1/person/list \ - -H "X-API-Key: YOUR_API_KEY" -``` - -**API Key**(marcom 團隊使用): -``` -muser_68600856036340bcafc01930eb4bd839 -``` - ---- - -## ⚠️ 鐵律:所有 Face/Speaker/Person API 都必須提供 video_uuid - -**沒有例外。** 所有端點都需要 `video_uuid`。 - -``` -錯誤: GET /api/v1/person/list → 400 missing field `video_uuid` -錯誤: GET /api/v1/person/Person_0 → 400 missing field `video_uuid` -正確: GET /api/v1/person/list?video_uuid=xxx → 200 OK -``` - -| 識別碼 | 全域唯一 | 說明 | -|--------|:---:|------| -| `chunk_id` | ❌ | 每部影片重新編號 | -| `person_id` | ❌ | 每部影片有自己的 Person_0, Person_1... | -| `speaker_id` | ❌ | 每部影片有自己的 SPEAKER_0, SPEAKER_1... | -| **`video_uuid + person_id`** | ✅ | 唯一組合 | -| **`video_uuid + chunk_id`** | ✅ | 唯一組合 | -| `face_id` | ✅ | UUID 格式,全域唯一 | -| `merge_id` | ✅ | UUID 格式,全域唯一 | - ---- - -## API 端點總覽(全部需要 video_uuid) - -| 端點 | 方法 | video_uuid 位置 | 說明 | -|------|:---:|:---:|------| -| `/api/v1/person/list` | GET | query | 列出人物 | -| `/api/v1/person/auto-identify` | POST | body | 自動識別人 | -| `/api/v1/person/suggest` | POST | body | AI 建議 | -| `/api/v1/person/:id` | GET | query | 人物詳情 | -| `/api/v1/person/:id` | PATCH | query | 更新人物 | -| `/api/v1/person/:id/thumbnail` | GET | query | 臉部截圖 | -| `/api/v1/person/:id/timeline` | GET | query | 出場時間軸 | -| `/api/v1/person/:id/similar` | GET | query | 相似人物 | -| `/api/v1/person/:id/appearances` | GET | query | 出場紀錄 | -| `/api/v1/person/:id/unbind-speaker` | POST | body | 解除 Speaker | -| `/api/v1/person/:id/reassign-speaker` | POST | body | 重新綁定 Speaker | -| `/api/v1/person/:id/remove-appearance` | POST | body | 刪除出場紀錄 | -| `/api/v1/person/:id/reassign-appearance` | POST | body | 轉移出場紀錄 | -| `/api/v1/person/:id/split` | POST | body | 分割人物 | -| `/api/v1/person/merge` | POST | body | 合併人物 | -| `/api/v1/person/merge/undo` | POST | body | 撤銷合併 | -| 
`/api/v1/person/merge/history` | GET | query | 合併歷史 | -| `/api/v1/search/universal` | POST | body | 統一搜尋 | -| `/api/v1/search/persons` | GET | query | 搜尋人物 | -| `/api/v1/chunks/:id/persons` | GET | query | chunk 內人物 | -| `/api/v1/face/register` | POST | body | 註冊臉孔 | -| `/api/v1/face/list` | GET | query | 已註冊臉孔列表 | - ---- - -## 詳細 API 說明 - -### 1. GET /api/v1/person/list - -列出指定影片的人物。 - -**Query Parameters:** - -| 參數 | 類型 | 必填 | 說明 | -|------|:---:|:---:|------| -| `video_uuid` | string | **是** | 影片 UUID | -| `limit` | int | 否 | 每頁筆數 (預設 50) | -| `offset` | int | 否 | 偏移量 (預設 0) | -| `min_appearances` | int | 否 | 最低出場次數 | -| `has_speaker` | bool | 否 | 僅顯示有 Speaker 的人物 | - -**Request:** -``` -GET /api/v1/person/list?video_uuid=384b0ff44aaaa1f1&limit=10&min_appearances=100 -``` - -**Response:** -```json -{ - "success": true, - "persons": [ - { - "person_id": "Person_0", - "name": null, - "speaker_id": "SPEAKER_0", - "appearance_count": 17832, - "total_appearance_duration": 3600.5, - "first_appearance_time": 79.56, - "last_appearance_time": 6863.34, - "is_confirmed": false, - "speaker_confidence": 0.504 - } - ], - "total": 303 -} -``` - -### 2. GET /api/v1/person/:id - -取得人物詳情。 - -**Query Parameters:** - -| 參數 | 類型 | 必填 | -|------|:---:|:---:| -| `video_uuid` | string | **是** | - -### 3. POST /api/v1/person/merge - -合併多個人物為一人。 - -**Request:** -```json -{ - "video_uuid": "384b0ff44aaaa1f1", - "target_person_id": "Person_0", - "source_person_ids": ["Person_4", "Person_25"] -} -``` - -**Response:** -```json -{ - "success": true, - "message": "Merged 2 persons into Person_0", - "target_person_id": "Person_0", - "merge_id": "5b12e3ac-12fa-45c0-88e1-5cff67604a7d" -} -``` - -> ⚠️ **請儲存 `merge_id`**,以便日後撤銷合併。 - -### 4. 
POST /api/v1/search/universal - -統一搜尋。 - -**Request:** -```json -{ - "query": "stamp", - "uuid": "384b0ff44aaaa1f1", - "types": ["chunk", "person"], - "limit": 20 -} -``` - ---- - -## 影片定位:Frame 為主 - -**重要**: 所有影片位置都以 **frame (幀號)** 為唯一準確單位,time 僅供參考。 - -```json -{ - "start_frame": 29795, - "end_frame": 29963, - "fps": 59.94, - "start_time": 497.08, - "end_time": 499.88 -} -``` - -**轉換公式**: `time = frame / fps` - -> ⚠️ **注意**: 所有搜尋 API (`/api/v1/search`, `/api/v1/n8n/search`, `/api/v1/search/universal`) 現在都統一回傳 `start_frame`, `end_frame`, `fps` 欄位,確保前端可以精確定位影片幀號。 - ---- - -## n8n 工作流範例 - -``` -[Webhook: video_processed] - body: { "uuid": "384b0ff44aaaa1f1" } - ↓ -[HTTP: POST /api/v1/person/auto-identify] - body: { "video_uuid": "{{ $json.uuid }}" } - ↓ -[HTTP: POST /api/v1/person/suggest] - body: { "video_uuid": "{{ $json.uuid }}" } - ↓ -[IF: confidence >= 0.7] - ├─ YES → [HTTP: PATCH /api/v1/person/{{person_id}}?video_uuid={{uuid}}] - └─ NO → [等待人工確認] -``` - ---- - -## 錯誤碼 - -| HTTP | 說明 | -|:---:|------| -| 200 | 成功 | -| 400 | 缺少 video_uuid 或參數錯誤 | -| 401 | API Key 無效 | -| 404 | 資源不存在 | -| 422 | 請求體缺少 video_uuid | -| 500 | 伺服器錯誤 | - ---- - -## 資料庫結構 - -### person_identities - -| 欄位 | 類型 | 說明 | -|------|------|------| -| `person_id` | VARCHAR | 識別碼 (每部影片獨立) | -| `video_uuid` | VARCHAR | **所屬影片 (必填)** | -| `name` | VARCHAR | 人物名稱 | -| `speaker_id` | VARCHAR | 對應說話者 ID (每部影片獨立) | -| `appearance_count` | INT | 出場次數 | -| `is_confirmed` | BOOLEAN | 是否已確認 | - -### 唯一性約束 - -```sql -UNIQUE (video_uuid, person_id) -``` - -每部影片可以有自己的 `Person_0`,但同一部影片內 `person_id` 必須唯一。 diff --git a/docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_IDENTITY_TUTORIAL.md b/docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_IDENTITY_TUTORIAL.md index 74a0a3b..d358ff5 100644 --- a/docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_IDENTITY_TUTORIAL.md +++ b/docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_IDENTITY_TUTORIAL.md @@ -6,27 +6,27 @@ 在開始之前,請區分以下名詞: -1. 
**Face (臉孔)**: 影像中偵測到的具體臉部特徵數據(向量)。 -2. **Person (角色實體)**: 在特定影片中出現的角色。他是 Face + Speaker (說話者) 的集合體。 - * *例如:影片 `384b0ff44aaaa1f1` 中的 `Person_17`。* -3. **Identity (真實身份)**: 跨越所有影片的全域實體(如真實演員或新聞人物)。 - * *例如:Cary Grant, Audrey Hepburn。* +1. **Face (臉孔)**: 影像中偵測到的具體臉部特徵數據(向量)。 +2. **Person (角色實體)**: 在特定影片中出現的角色。他是 Face + Speaker (說話者) 的集合體。 + * *例如:影片 `384b0ff44aaaa1f14cb2cd63b3fea966` 中的 `Person_17`。* +3. **Identity (真實身份)**: 跨越所有影片的全域實體(如真實演員或新聞人物)。 + * *例如:Cary Grant, Audrey Hepburn。* --- ## 前置準備 -* **API URL**: `http://localhost:3003` -* **API Key**: `/` -* **目標影片 (Video UUID)**: `384b0ff44aaaa1f1` (Charade) +* **API URL**: `http://localhost:3003` +* **API Key**: `/` +* **目標影片 (Video UUID)**: `384b0ff44aaaa1f14cb2cd63b3fea966` (Charade) --- ## 情境設定 我們要在影片中識別兩位主角: -1. **Audrey Hepburn** (飾演 Reggie Lampert) -2. **Cary Grant** (飾演 Peter Joshua) +1. **Audrey Hepburn** (飾演 Reggie Lampert) +2. **Cary Grant** (飾演 Peter Joshua) --- @@ -35,7 +35,7 @@ 首先,我們查詢系統在影片中偵測到了哪些人物 (Person)。 ```bash -curl -s "http://localhost:3003/api/v1/person/list?video_uuid=384b0ff44aaaa1f1&limit=5" \ +curl -s "http://localhost:3003/api/v1/person/list?file_uuid=384b0ff44aaaa1f14cb2cd63b3fea966&limit=5" \ -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \ | python3 -m json.tool ``` @@ -77,7 +77,7 @@ curl -s -X POST "http://localhost:3003/api/v1/identities/from-person" \ -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \ -H "Content-Type: application/json" \ -d '{ - "video_uuid": "384b0ff44aaaa1f1", + "file_uuid": "384b0ff44aaaa1f14cb2cd63b3fea966", "person_id": "Person_17", "identity_name": "Audrey Hepburn", "metadata": { "role": "Reggie Lampert" } @@ -107,7 +107,7 @@ curl -s -X POST "http://localhost:3003/api/v1/identities/from-person" \ -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \ -H "Content-Type: application/json" \ -d '{ - "video_uuid": "384b0ff44aaaa1f1", + "file_uuid": "384b0ff44aaaa1f14cb2cd63b3fea966", 
"person_id": "Person_4", "identity_name": "Cary Grant", "metadata": { "role": "Peter Joshua" } @@ -163,7 +163,7 @@ curl -s "http://localhost:3003/api/v1/identities?limit=10" \ 再次查詢影片中的 `Person` 列表,確認名稱是否已自動更新。 ```bash -curl -s "http://localhost:3003/api/v1/person/list?video_uuid=384b0ff44aaaa1f1&limit=5" \ +curl -s "http://localhost:3003/api/v1/person/list?file_uuid=384b0ff44aaaa1f14cb2cd63b3fea966&limit=5" \ -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \ | python3 -m json.tool ``` diff --git a/docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_PROGRESS.md b/docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_PROGRESS.md index 2bab1a6..7d81205 100644 --- a/docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_PROGRESS.md +++ b/docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_PROGRESS.md @@ -1,6 +1,6 @@ # Face/Speaker/Person 分析完成度 -**UUID**: `384b0ff44aaaa1f1` +**UUID**: `384b0ff44aaaa1f14cb2cd63b3fea966` **视频**: Charade (1963) - ~115 min, 412,343 frames, 59.94 fps **更新日期**: 2026-04-14 @@ -10,11 +10,11 @@ | 模块 | 状态 | 文件 | 数据量 | |------|------|------|--------| -| **Face Detection** | ✅ 完成 | `384b0ff44aaaa1f1.face.json` | 10,691 frames, 25,174 faces | -| **Face Clustering** | ✅ 完成 | `384b0ff44aaaa1f1.face_clustered.json` | 302 unique Person IDs | -| **ASR (语音识别)** | ✅ 完成 | `384b0ff44aaaa1f1.asr.json` | 1,011 segments | -| **ASRX (增强语音)** | ✅ 完成 | `384b0ff44aaaa1f1.asrx.json` | - | -| **Pose (姿态)** | ✅ 完成 | `384b0ff44aaaa1f1.pose.json` | - | +| **Face Detection** | ✅ 完成 | `384b0ff44aaaa1f14cb2cd63b3fea966.face.json` | 10,691 frames, 25,174 faces | +| **Face Clustering** | ✅ 完成 | `384b0ff44aaaa1f14cb2cd63b3fea966.face_clustered.json` | 302 unique Person IDs | +| **ASR (语音识别)** | ✅ 完成 | `384b0ff44aaaa1f14cb2cd63b3fea966.asr.json` | 1,011 segments | +| **ASRX (增强语音)** | ✅ 完成 | `384b0ff44aaaa1f14cb2cd63b3fea966.asrx.json` | - | +| **Pose (姿态)** | ✅ 完成 | `384b0ff44aaaa1f14cb2cd63b3fea966.pose.json` | - | | **Speaker Diarization** | ⚠️ 未集成 | - | ASR 
segments 无 speaker 信息 | --- diff --git a/docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_QUICK_START.md b/docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_QUICK_START.md index 6aca459..956c32e 100644 --- a/docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_QUICK_START.md +++ b/docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_QUICK_START.md @@ -12,7 +12,7 @@ ```bash export BASE="http://localhost:3002" export KEY="muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" -export UUID="384b0ff44aaaa1f1" +export UUID="384b0ff44aaaa1f14cb2cd63b3fea966" ``` --- @@ -145,11 +145,11 @@ curl "$BASE/api/v1/person/list?min_appearances=100&has_speaker=true&limit=20" \ curl "$BASE/api/v1/person/Person_0" -H "X-API-Key: $KEY" # 取得臉部截圖 -curl "$BASE/api/v1/person/Person_0/thumbnail?video_uuid=$UUID" \ +curl "$BASE/api/v1/person/Person_0/thumbnail?file_uuid=$UUID" \ -H "X-API-Key: $KEY" -o person0_face.jpg # 取得第 5 次出現的臉部截圖 -curl "$BASE/api/v1/person/Person_0/thumbnail?video_uuid=$UUID&index=4" \ +curl "$BASE/api/v1/person/Person_0/thumbnail?file_uuid=$UUID&index=4" \ -H "X-API-Key: $KEY" -o person0_face_5.jpg ``` @@ -188,11 +188,11 @@ curl -X POST "$BASE/api/v1/face/register" \ ```bash # 預設:第一次出現的臉部 -curl "$BASE/api/v1/person/Person_0/thumbnail?video_uuid=$UUID" \ +curl "$BASE/api/v1/person/Person_0/thumbnail?file_uuid=$UUID" \ -H "X-API-Key: $KEY" -o face.jpg # 指定第 N 次出現 -curl "$BASE/api/v1/person/Person_0/thumbnail?video_uuid=$UUID&index=10" \ +curl "$BASE/api/v1/person/Person_0/thumbnail?file_uuid=$UUID&index=10" \ -H "X-API-Key: $KEY" -o face_10.jpg ``` @@ -229,7 +229,7 @@ curl "$BASE/api/v1/person/Person_0/similar?threshold=0.5&limit=10" \ curl -X POST "$BASE/api/v1/person/suggest" \ -H "X-API-Key: $KEY" \ -H "Content-Type: application/json" \ - -d '{"video_uuid": "'$UUID'"}' + -d '{"file_uuid": "'$UUID'"}' ``` ```json @@ -373,7 +373,7 @@ curl "$BASE/api/v1/person/merge/history" -H "X-API-Key: $KEY" | **搜尋人物** | GET | `/api/v1/search/persons?query=Person` | | **列出人物** | GET | 
`/api/v1/person/list?limit=20` | | **人物詳情** | GET | `/api/v1/person/:id` | -| **人物截圖** | GET | `/api/v1/person/:id/thumbnail?video_uuid=...` | +| **人物截圖** | GET | `/api/v1/person/:id/thumbnail?file_uuid=...` | | **相似人物** | GET | `/api/v1/person/:id/similar` | | **AI 建議** | POST | `/api/v1/person/suggest` | | **綁定名稱** | PATCH | `/api/v1/person/:id` | diff --git a/docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_WORKFLOW.md b/docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_WORKFLOW.md index 50f924b..9f84811 100644 --- a/docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_WORKFLOW.md +++ b/docs_v1.0/AI_AGENTS/IDENTITY/FACE_SPEAKER_PERSON_WORKFLOW.md @@ -1,22 +1,43 @@ -# Face / Speaker / Person / Identity Workflow Guide +# Face to Identity Workflow Guide -This document describes the end-to-end workflow for managing characters in Momentry Core, from raw detection to a clean, aggregated identity database. +> Version: V4.0 | Date: 2026-04-28 +> Architecture: Two-layer (Face → Identity) +> Related: [FACE_TO_IDENTITY_FLOW.md](./FACE_TO_IDENTITY_FLOW.md) -## 📊 1. 
Workflow Visualization
+---
+
+## Overview
+
+The V4.0 architecture binds Face → Identity directly, removing the person_id intermediate layer and simplifying the workflow.
+
+### Key Changes (V3.x → V4.0)
+
+| Change | V3.x | V4.0 |
+|--------|------|------|
+| **Architecture** | Three-layer (Face → Person → Identity) | Two-layer (Face → Identity) |
+| **Person ID** | Video-local person_id | ❌ Removed |
+| **Registration** | POST /identities/from-person | POST /identities/register |
+| **Merge** | POST /person/merge | POST /agents/suggest/merge |
+| **Candidates** | GET /person/list | GET /faces/candidates |
+| **video_uuid** | Used everywhere | Renamed to **file_uuid** |
+
+---
+
+## Workflow Visualization
 
 ```mermaid
 graph TD
     %% Nodes
     Start((Start Analysis))
-    ListPersons[List Persons]
+    ListCandidates[List Face Candidates]
 
     subgraph "Phase 1: Registration"
         CheckIdentity{Identity Exists?}
         Register[Register Identity]
-        Link[Link Person to Identity]
+        Bind[Bind Faces]
     end
 
-    subgraph "Phase 2: Aggregation"
+    subgraph "Phase 2: AI Analysis"
         Suggest[Get AI Suggestions]
         Review[Review Suggestions]
         Merge[Execute Merge]
@@ -26,19 +47,19 @@ graph TD
     End((Database Clean))
 
     %% Flow
-    Start --> ListPersons
-    ListPersons --> CheckIdentity
+    Start --> ListCandidates
+    ListCandidates --> CheckIdentity
 
     CheckIdentity -- No --> Register
-    Register --> Link
-    Link --> Suggest
+    Register --> Bind
+    Bind --> Suggest
 
-    CheckIdentity -- Yes --> Suggest
+    CheckIdentity -- Yes --> Bind
+    Bind --> Suggest
 
     Suggest --> Review
     Review -- Merge Recommended --> Merge
-    Review -- Naming Recommended --> Rename[Update Name]
-    Rename --> Confirm
+    Review -- Bind Recommended --> Bind
 
     Merge --> Confirm
     Confirm --> End
@@ -46,122 +67,306 @@ graph TD
     style Start fill:#f9f,stroke:#333
     style End fill:#bbf,stroke:#333
     style Register fill:#dfd,stroke:#333
-    style Merge fill:#dfd,stroke:#333
+    style Bind fill:#dfd,stroke:#333
 ```
 
 ---
 
-## 🛠️ 2. 
Step-by-Step API Operations +## Phase 1: Registration -### Phase 1: Registration (Creating Identities) -**Scenario**: You see `Person_17` is Audrey Hepburn. You want to create a global record for her. +**Scenario**: You found unregistered faces and want to create a new identity. -1. **Find the Person**: - ```bash - curl -s "http://localhost:3003/api/v1/person/list?video_uuid=...&limit=5" ... - # Output: Person_17 (1636 frames, null name) - ``` +### Step 1: List Face Candidates -2. **Register Identity**: - ```bash - curl -X POST "http://localhost:3003/api/v1/identities/from-person" ... \ - -d '{ - "video_uuid": "...", - "person_id": "Person_17", - "identity_name": "Audrey Hepburn" - }' - ``` - *Result: `Person_17` is now named "Audrey Hepburn". A global `identity_id` is created.* +```bash +curl -s "http://localhost:3003/api/v1/faces/candidates?min_confidence=0.8&pose_angle=frontal&limit=5" \ + -H "X-API-Key: YOUR_KEY" +``` ---- +**Response**: -### Phase 2: Suggestion (AI Analysis) -**Scenario**: You suspect `Person_25` might also be Audrey Hepburn, or you just want to clean up the data. - -1. **Ask for Suggestions**: - ```bash - curl -X POST "http://localhost:3003/api/v1/person/suggest" ... 
\ - -d '{"video_uuid": "..."}' - ``` - *Response*: - ```json - { - "merge_suggestions": [ - { - "person_id": "Person_17", - "merge_with": ["Person_25"], - "reasons": ["All share speaker_id: SPEAKER_1", "Person_17 has 88% of frames"], - "action": "auto_apply" - } - ] +```json +{ + "success": true, + "data": { + "candidates": [ + { + "face_id": "face_100", + "file_uuid": "384b0ff44aaaa1f14cb2cd63b3fea966", + "frame": 100, + "timestamp": 5.2, + "pose_angle": "frontal", + "confidence": 0.92, + "trace_id": 2 + } + ], + "statistics": { + "total_candidates": 78, + "avg_confidence": 0.85 } - ``` + } +} +``` + +### Step 2: Register Identity + +```bash +curl -X POST "http://localhost:3003/api/v1/identities/register" \ + -H "X-API-Key: YOUR_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "face_ids": ["face_100", "face_150", "face_200"], + "name": "Audrey Hepburn", + "source": "manual", + "auto_bind_chunks": true + }' +``` + +**Response**: + +```json +{ + "success": true, + "data": { + "identity_uuid": "a9a90105-6d6b-46ff-92da-0c3c1a57dff4", + "name": "Audrey Hepburn", + "faces_bound": 3, + "chunks_bound": 10, + "speaker_ids": ["SPEAKER_0"], + "reference_vectors": { + "total": 3, + "angles": ["frontal"] + } + } +} +``` --- -### Phase 3: Review & Execution -**Scenario**: You verify the suggestion. The AI logic (Shared Speaker + Frame dominance) seems correct. +## Phase 2: AI Analysis -1. **Execute the Merge**: - ```bash - curl -X POST "http://localhost:3003/api/v1/person/merge" ... \ - -d '{ - "video_uuid": "...", - "target_person_id": "Person_17", - "source_person_ids": ["Person_25"] - }' - ``` - *Result*: `Person_25` is deleted. All 217 frames of `Person_25` are added to `Person_17`. +**Scenario**: You want AI to suggest potential merges or additional bindings. 
+ +### Step 1: Get AI Suggestions + +```bash +curl -X POST "http://localhost:3003/api/v1/agents/suggest/clustering" \ + -H "X-API-Key: YOUR_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "min_confidence": 0.8, + "pose_angles": ["frontal"], + "max_suggestions": 5 + }' +``` + +**Response**: + +```json +{ + "success": true, + "data": { + "suggestions": [ + { + "suggestion_id": "suggest_1", + "cluster_type": "high_confidence", + "confidence": 0.92, + "recommended_faces": [ + { + "face_id": "face_100", + "pose_angle": "frontal", + "confidence": 0.95, + "is_primary": true + } + ], + "cluster_stats": { + "total_faces": 50, + "avg_similarity": 0.89 + }, + "reason": "High confidence frontal faces from same trace", + "action": "register" + }, + { + "suggestion_id": "suggest_2", + "cluster_type": "existing_identity", + "confidence": 0.88, + "identity_uuid": "a9a90105...", + "recommended_faces": [ + { + "face_id": "face_300", + "confidence": 0.87 + } + ], + "reason": "Similar to Audrey Hepburn (0.88)", + "action": "bind" + } + ] + } +} +``` + +### Step 2: Review & Execute + +**Option A: Bind to Existing Identity** + +```bash +curl -X POST "http://localhost:3003/api/v1/identities/a9a90105.../bind" \ + -H "X-API-Key: YOUR_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "face_ids": ["face_300", "face_400"], + "auto_bind_chunks": true + }' +``` + +**Option B: Register New Identity** + +```bash +curl -X POST "http://localhost:3003/api/v1/identities/register" \ + -H "X-API-Key: YOUR_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "face_ids": ["face_500", "face_550"], + "name": "Cary Grant", + "source": "manual" + }' +``` + +### Step 3: Merge Identities + +**Scenario**: Two identities are the same person. 
+ +```bash +curl -X POST "http://localhost:3003/api/v1/agents/suggest/merge" \ + -H "X-API-Key: YOUR_KEY" \ + -H "Content-Type: application/json" \ + -d '{ + "identity_uuids": ["a9a90105...", "b8b80206..."], + "threshold": 0.85 + }' +``` + +**Response**: + +```json +{ + "success": true, + "data": { + "suggestions": [ + { + "suggestion_type": "merge", + "confidence": 0.88, + "identities": [ + {"identity_uuid": "a9a90105...", "name": "Person A", "face_count": 500}, + {"identity_uuid": "b8b80206...", "name": "Person B", "face_count": 300} + ], + "reason": "High embedding similarity (0.88)", + "recommended_action": { + "merge_target": "a9a90105...", + "merge_sources": ["b8b80206..."] + } + } + ] + } +} +``` --- -## 🚀 3. Automated Demo Script +## Query Operations -Run the following script to see the entire process in action automatically. +### List Identities in a File + +```bash +curl "http://localhost:3003/api/v1/files/384b0ff44aaaa1f14cb2cd63b3fea966/identities" \ + -H "X-API-Key: YOUR_KEY" +``` + +### List Files for an Identity + +```bash +curl "http://localhost:3003/api/v1/identities/a9a90105.../files" \ + -H "X-API-Key: YOUR_KEY" +``` + +### List Faces for an Identity + +```bash +curl "http://localhost:3003/api/v1/identities/a9a90105.../faces?limit=100" \ + -H "X-API-Key: YOUR_KEY" +``` + +### List Chunks for an Identity + +```bash +curl "http://localhost:3003/api/v1/identities/a9a90105.../chunks" \ + -H "X-API-Key: YOUR_KEY" +``` + +--- + +## Demo Script ```bash #!/bin/bash -# scripts/demo_identity_workflow.sh -# Usage: chmod +x scripts/demo_identity_workflow.sh && ./scripts/demo_identity_workflow.sh +# scripts/demo_identity_workflow_v4.sh -API_URL="http://localhost:3002" -API_KEY="muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" -UUID="384b0ff44aaaa1f1" +API_URL="http://localhost:3003" +API_KEY="YOUR_API_KEY" -echo "🎬 === MOMENTRY IDENTITY WORKFLOW DEMO ===" +echo "=== MOMENTRY IDENTITY WORKFLOW V4.0 ===" -# 1. 
Registration -echo "👉 STEP 1: Registering Person_17 as Audrey Hepburn..." -curl -s -X POST "$API_URL/api/v1/identities/from-person" \ - -H "X-API-Key: $API_KEY" -H "Content-Type: application/json" \ - -d "{\"video_uuid\":\"$UUID\", \"person_id\":\"Person_17\", \"identity_name\":\"Audrey Hepburn\"}" \ +# 1. List candidates +echo "STEP 1: Listing unregistered faces..." +curl -s "$API_URL/api/v1/faces/candidates?min_confidence=0.8&limit=5" \ + -H "X-API-Key: $API_KEY" \ | python3 -m json.tool -# 2. Suggestion +# 2. Register identity echo "" -echo "👉 STEP 2: Asking AI for cleaning suggestions..." -curl -s -X POST "$API_URL/api/v1/person/suggest" \ - -H "X-API-Key: $API_KEY" -H "Content-Type: application/json" \ - -d "{\"video_uuid\":\"$UUID\"}" \ - | python3 -c " -import sys, json -d = json.load(sys.stdin) -sugs = d.get('naming_suggestions', []) + d.get('merge_suggestions', []) -if sugs: - print(f' Found {len(sugs)} suggestions.') - for s in sugs: - print(f' - {s}') -else: - print(' No suggestions (Data is already clean!).') -" +echo "STEP 2: Registering Audrey Hepburn..." +curl -s -X POST "$API_URL/api/v1/identities/register" \ + -H "X-API-Key: $API_KEY" \ + -H "Content-Type: application/json" \ + -d '{"face_ids": ["face_100"], "name": "Audrey Hepburn", "source": "manual"}' \ + | python3 -m json.tool -# 3. Execution (Example Merge if Person_25 existed) +# 3. Get AI suggestions echo "" -echo "👉 STEP 3: Simulating a merge (Merging hypothetical Person_25 -> Person_17)..." -# Note: In a real scenario, Person_25 would exist. -# Here we just show the command structure. -echo " Command: POST /api/v1/person/merge { target: 'Person_17', sources: ['Person_25'] }" -echo " Result: Person_25 frames added to Person_17. Person_25 deleted." +echo "STEP 3: Getting AI suggestions..." 
+curl -s -X POST "$API_URL/api/v1/agents/suggest/clustering" \
+  -H "X-API-Key: $API_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{"min_confidence": 0.8, "max_suggestions": 3}' \
+  | python3 -m json.tool
+
+# 4. Bind faces to identity
+echo ""
+echo "STEP 4: Binding additional faces..."
+curl -s -X POST "$API_URL/api/v1/identities/a9a90105.../bind" \
+  -H "X-API-Key: $API_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{"face_ids": ["face_200"]}' \
+  | python3 -m json.tool
 echo ""
-echo "✅ Demo Complete."
+echo "Demo Complete."
+```
+
+---
+
+## Version History
+
+| Version | Date | Changes |
+|---------|------|---------|
+| V4.0 | 2026-04-28 | Two-layer architecture, 15 endpoints |
+| V3.x | 2026-04-10 | Three-layer architecture, 33 endpoints |
+
+---
+
+## Related Documents
+
+- [IDENTITY_MANAGEMENT_API.md](./IDENTITY_MANAGEMENT_API.md): API design
+- [FACE_TO_IDENTITY_FLOW.md](./FACE_TO_IDENTITY_FLOW.md): Binding flow
+- [FILE_IDENTITIES_TABLE_SPEC.md](./FILE_IDENTITIES_TABLE_SPEC.md): Table schema
+- [IDENTITY_API_SPEC.md](../IDENTITY_API_SPEC.md): Complete API spec
diff --git a/docs_v1.0/AI_AGENTS/IDENTITY/FACE_TO_IDENTITY_FLOW.md b/docs_v1.0/AI_AGENTS/IDENTITY/FACE_TO_IDENTITY_FLOW.md
new file mode 100644
index 0000000..357e3be
--- /dev/null
+++ b/docs_v1.0/AI_AGENTS/IDENTITY/FACE_TO_IDENTITY_FLOW.md
@@ -0,0 +1,768 @@
+# Face to Identity Binding Flow
+
+> Version: V4.0 | Date: 2026-04-28
+> Architecture: Two-layer (Face → Identity)
+> Related: [FILE_IDENTITIES_TABLE_SPEC.md](./FILE_IDENTITIES_TABLE_SPEC.md)
+
+---
+
+## Overview
+
+The V4.0 architecture binds Face → Identity directly and removes the person_id intermediate layer.
+
+### Key Principles
+
+| Principle | Description |
+|-----------|-------------|
+| **Direct Binding** | A Face binds directly to an Identity, with no intermediate layer |
+| **One-to-Many Reference** | An Identity owns multiple Reference Vectors |
+| **N:N File-Identity** | An Identity can span multiple Files |
+| **Auto Chunk Binding** | Chunks are bound automatically through time alignment |
+
+---
+
+## Data Model
+
+```
+┌─────────────────┐
+│  face_detections│ 
+├─────────────────┤ +│ id │ +│ file_uuid ─────┼───┐ +│ frame │ │ +│ timestamp │ │ +│ trace_id │ │ +│ pose_angle │ │ +│ confidence │ │ +│ embedding (512) │ │ +│ identity_id ────┼───┼──┐ +└─────────────────┘ │ │ + │ │ +┌─────────────────┐ │ │ +│ files │ │ │ +├─────────────────┤ │ │ +│ uuid ◄──────────┼───┘ │ +│ file_name │ │ +│ duration │ │ +└─────────────────┘ │ + │ +┌─────────────────┐ │ +│ identities │ │ +├─────────────────┤ │ +│ id ◄────────────┼──────┘ +│ uuid │ +│ name │ +│ source │ +│ face_embedding │ (reference vector) +│ reference_data │ (JSONB, multiple vectors) +└─────────────────┘ + │ + │ N:N + ▼ +┌─────────────────┐ +│ file_identities │ +├─────────────────┤ +│ file_uuid │ +│ identity_id │ +│ face_count │ +│ speaker_count │ +│ confidence │ +└─────────────────┘ +``` + +--- + +## Binding Workflows + +### 1. Manual Registration (New Identity) + +**Trigger**: User selects face(s) and assigns name + +``` +User Selection + │ + ▼ +┌─────────────────────────┐ +│ POST /identities/register │ +├─────────────────────────┤ +│ face_ids: ["face_100"] │ +│ name: "Audrey Hepburn" │ +│ source: "manual" │ +│ auto_bind_chunks: true │ +└─────────────────────────┘ + │ + ▼ +┌─────────────────────────┐ +│ 1. Create Identity │ +│ - identity_uuid │ +│ - name, source │ +│ - face_embedding │ (from first face) +│ - reference_data │ (selected vectors) +└─────────────────────────┘ + │ + ▼ +┌─────────────────────────┐ +│ 2. Bind Faces │ +│ - Update face_detections │ +│ - Set identity_id │ +│ - Update file_identities │ +└─────────────────────────┘ + │ + ▼ +┌─────────────────────────┐ +│ 3. Auto Bind Chunks │ +│ - Time alignment │ +│ - Update chunk.metadata │ +│ - Update file_identities.speaker_count │ +└─────────────────────────┘ + │ + ▼ +┌─────────────────────────┐ +│ 4. 
Select Reference Vectors │
+│ - Trace-based selection │
+│ - Pose diversity        │
+│ - Quality threshold     │
+└─────────────────────────┘
+```
+
+**Implementation**:
+
+```rust
+pub async fn register_identity(
+    db: &PgPool,
+    req: RegisterIdentityRequest,
+) -> Result<Identity> {
+    let mut tx = db.begin().await?;
+
+    // 1. Get faces (`faces[0]` below assumes at least one row matched; validate `face_ids` first)
+    let faces = sqlx::query_as!(
+        FaceDetection,
+        "SELECT * FROM face_detections WHERE id = ANY($1)",
+        &req.face_ids
+    )
+    .fetch_all(&mut *tx)
+    .await?;
+
+    // 2. Create identity
+    let identity = sqlx::query_as!(
+        Identity,
+        r#"
+        INSERT INTO identities (uuid, name, source, face_embedding, reference_data)
+        VALUES ($1, $2, $3, $4, $5)
+        RETURNING *
+        "#,
+        Uuid::new_v4().to_string(),
+        req.name,
+        req.source,
+        faces[0].embedding.clone(),
+        json!({
+            "vectors": vec![ReferenceVector {
+                embedding: faces[0].embedding.clone(),
+                pose_angle: faces[0].pose_angle.clone(),
+                quality: faces[0].confidence,
+                file_uuid: faces[0].file_uuid.clone(),
+                face_id: faces[0].id,
+            }],
+            "selection_strategy": "manual"
+        }),
+    )
+    .fetch_one(&mut *tx)
+    .await?;
+
+    // 3. Bind faces
+    for face in &faces {
+        sqlx::query!(
+            "UPDATE face_detections SET identity_id = $1 WHERE id = $2",
+            identity.id,
+            face.id
+        )
+        .execute(&mut *tx)
+        .await?;
+
+        // Update file_identities
+        update_file_identity_stats(
+            &mut tx,
+            &face.file_uuid,
+            identity.id,
+            1, // face_count +1
+            0, // speaker_count
+            Some(face.confidence),
+            Some(face.timestamp),
+        ).await?;
+    }
+
+    // 4. Auto bind chunks
+    if req.auto_bind_chunks {
+        auto_bind_chunks_for_identity(&mut tx, &identity.id, &faces).await?;
+    }
+
+    tx.commit().await?;
+    Ok(identity)
+}
+```
+
+---
+
+### 2. 
Bind Faces to Existing Identity + +**Trigger**: User selects face(s) and assigns to existing identity + +``` +User Selection + │ + ▼ +┌────────────────────────────┐ +│ POST /identities/:uuid/bind │ +├────────────────────────────┤ +│ face_ids: ["face_200"] │ +│ auto_bind_chunks: true │ +└────────────────────────────┘ + │ + ▼ +┌─────────────────────────┐ +│ 1. Validate Identity │ +│ - Check existence │ +│ - Get reference_data │ +└─────────────────────────┘ + │ + ▼ +┌─────────────────────────┐ +│ 2. Bind Faces │ +│ - Update face_detections │ +│ - Set identity_id │ +│ - Update file_identities │ +└─────────────────────────┘ + │ + ▼ +┌─────────────────────────┐ +│ 3. Update Reference Vectors │ +│ - Add new vector if quality > threshold │ +│ - Maintain diversity │ +└─────────────────────────┘ + │ + ▼ +┌─────────────────────────┐ +│ 4. Auto Bind Chunks │ +│ - Time alignment │ +└─────────────────────────┘ +``` + +**Implementation**: + +```rust +pub async fn bind_faces_to_identity( + db: &PgPool, + identity_uuid: &str, + req: BindFacesRequest, +) -> Result<()> { + let mut tx = db.begin().await?; + + // 1. Get identity + let identity = sqlx::query_as!( + Identity, + "SELECT * FROM identities WHERE uuid = $1", + identity_uuid + ) + .fetch_one(&mut *tx) + .await?; + + // 2. Get faces + let faces = sqlx::query_as!( + FaceDetection, + "SELECT * FROM face_detections WHERE id = ANY($1)", + &req.face_ids + ) + .fetch_all(&mut *tx) + .await?; + + // 3. Bind faces + for face in &faces { + sqlx::query!( + "UPDATE face_detections SET identity_id = $1 WHERE id = $2", + identity.id, + face.id + ) + .execute(&mut *tx) + .await?; + + update_file_identity_stats( + &mut tx, + &face.file_uuid, + identity.id, + 1, + 0, + Some(face.confidence), + Some(face.timestamp), + ).await?; + } + + // 4. Update reference vectors + update_reference_vectors(&mut tx, &identity.id, &faces).await?; + + // 5. 
Auto bind chunks
+    if req.auto_bind_chunks {
+        auto_bind_chunks_for_identity(&mut tx, &identity.id, &faces).await?;
+    }
+
+    tx.commit().await?;
+    Ok(())
+}
+```
+
+---
+
+### 3. Unbind Faces from Identity
+
+**Trigger**: User removes face from identity
+
+```
+User Selection
+            │
+            ▼
+┌──────────────────────────────┐
+│ POST /identities/:uuid/unbind │
+├──────────────────────────────┤
+│ face_ids: ["face_400"]       │
+└──────────────────────────────┘
+            │
+            ▼
+┌─────────────────────────┐
+│ 1. Unbind Faces         │
+│ - Set identity_id = NULL │
+│ - Update file_identities │
+└─────────────────────────┘
+            │
+            ▼
+┌─────────────────────────┐
+│ 2. Auto Unbind Chunks   │
+│ - Remove if no overlapping faces │
+└─────────────────────────┘
+            │
+            ▼
+┌─────────────────────────┐
+│ 3. Update Reference Vectors │
+│ - Remove if vector source │
+│ - Re-select if needed   │
+└─────────────────────────┘
+            │
+            ▼
+┌─────────────────────────┐
+│ 4. Check Identity Deletion │
+│ - If face_count = 0, delete identity │
+└─────────────────────────┘
+```
+
+---
+
+### 4. Auto Chunk Binding
+
+**Trigger**: Face binding/unbinding
+
+**Principle**: Chunks are bound automatically; no Candidates/Suggest API is needed.
+
+```
+Face Timestamps
+            │
+            ▼
+┌─────────────────────────┐
+│ Query Chunks by Time    │
+│ - chunk.start_time <= face.timestamp │
+│ - chunk.end_time >= face.timestamp │
+│ - Same file_uuid        │
+└─────────────────────────┘
+            │
+            ▼
+┌─────────────────────────┐
+│ Check Overlap           │
+│ - Count overlapping faces │
+│ - Calculate confidence  │
+└─────────────────────────┘
+            │
+            ▼
+┌─────────────────────────┐
+│ Update Chunk Metadata   │
+│ - identity_id: ... 
│ +│ - confidence: 0.85 │ +│ - binding_source: "auto"│ +│ - faces: ["face_100"] │ +└─────────────────────────┘ + │ + ▼ +┌─────────────────────────┐ +│ Update file_identities │ +│ - speaker_count += 1 │ +└─────────────────────────┘ +``` + +**Implementation**: + +```rust +pub async fn auto_bind_chunks_for_identity( + tx: &mut sqlx::Transaction<'_, sqlx::Postgres>, + identity_id: &i64, + faces: &[FaceDetection], +) -> Result<()> { + for face in faces { + // Find overlapping chunks + let chunks = sqlx::query!( + r#" + SELECT id, metadata + FROM chunks + WHERE file_uuid = $1 + AND start_time <= $2 + AND end_time >= $2 + "#, + face.file_uuid, + face.timestamp + ) + .fetch_all(&mut **tx) + .await?; + + for chunk in chunks { + let mut metadata: ChunkMetadata = + serde_json::from_value(chunk.metadata.clone()).unwrap_or_default(); + + // Update metadata + if !metadata.faces.contains(&face.id) { + metadata.faces.push(face.id); + } + metadata.identity_id = Some(*identity_id); + metadata.confidence = Some(face.confidence); + metadata.binding_source = "auto".to_string(); + + sqlx::query!( + r#" + UPDATE chunks + SET metadata = $1 + WHERE id = $2 + "#, + serde_json::to_value(metadata)?, + chunk.id + ) + .execute(&mut **tx) + .await?; + + // Update file_identities speaker_count + sqlx::query!( + r#" + UPDATE file_identities + SET speaker_count = speaker_count + 1 + WHERE file_uuid = $1 AND identity_id = $2 + "#, + face.file_uuid, + identity_id + ) + .execute(&mut **tx) + .await?; + } + } + + Ok(()) +} +``` + +--- + +### 5. 
Reference Vector Selection
+
+**Strategy**: Trace-based + Pose diversity
+
+```
+Face Detections (identity_id = X)
+            │
+            ▼
+┌─────────────────────────┐
+│ Group by trace_id       │
+│ - Each trace = one person track │
+└─────────────────────────┘
+            │
+            ▼
+┌─────────────────────────┐
+│ For each trace:         │
+│ - Find best frontal face │
+│ - Find best profile faces │
+│ - Quality > 0.85        │
+└─────────────────────────┘
+            │
+            ▼
+┌─────────────────────────┐
+│ Select Top N Vectors    │
+│ - Max 5 per trace       │
+│ - Max 20 total          │
+│ - Prioritize quality    │
+└─────────────────────────┘
+            │
+            ▼
+┌─────────────────────────┐
+│ Store in reference_data │
+│ {
+│   "vectors": [...],
+│   "selection_strategy": "trace_based",
+│   "total_traces": 4,
+│   "total_faces": 500
+│ }
+└─────────────────────────┘
+```
+
+**Implementation**:
+
+```rust
+pub async fn update_reference_vectors(
+    tx: &mut sqlx::Transaction<'_, sqlx::Postgres>,
+    identity_id: &i64,
+    new_faces: &[FaceDetection],
+) -> Result<()> {
+    // Get all faces for this identity
+    let all_faces = sqlx::query_as!(
+        FaceDetection,
+        "SELECT * FROM face_detections WHERE identity_id = $1",
+        identity_id
+    )
+    .fetch_all(&mut **tx)
+    .await?;
+
+    // Group by trace_id (key type is inferred from `FaceDetection::trace_id`)
+    let mut trace_groups: HashMap<_, Vec<&FaceDetection>> = HashMap::new();
+    for face in &all_faces {
+        trace_groups.entry(face.trace_id).or_default().push(face);
+    }
+
+    // Select vectors per trace
+    let mut selected_vectors = Vec::new();
+
+    for (_trace_id, faces) in trace_groups.iter() {
+        // Group by pose_angle (key type is inferred from `FaceDetection::pose_angle`)
+        let mut pose_groups: HashMap<_, Vec<&&FaceDetection>> = HashMap::new();
+        for face in faces {
+            pose_groups
+                .entry(face.pose_angle.clone())
+                .or_default()
+                .push(face);
+        }
+
+        // Select best from each pose (max 5 per trace)
+        for (_, pose_faces) in pose_groups.iter() {
+            let best = pose_faces
+                .iter()
+                .filter(|f| f.confidence > 0.85)
+                .max_by(|a, b| a.confidence.partial_cmp(&b.confidence).unwrap());
+
+            if let Some(face) = best {
+                selected_vectors.push(ReferenceVector {
+                    embedding: face.embedding.clone(),
+ pose_angle: face.pose_angle.clone(), + quality: face.confidence, + file_uuid: face.file_uuid.clone(), + face_id: face.id, + }); + } + } + } + + // Sort by quality and take top 20 + selected_vectors.sort_by(|a, b| b.quality.partial_cmp(&a.quality).unwrap()); + selected_vectors.truncate(20); + + // Update identity + sqlx::query!( + r#" + UPDATE identities + SET reference_data = $1 + WHERE id = $2 + "#, + json!({ + "vectors": selected_vectors, + "selection_strategy": "trace_based", + "total_traces": trace_groups.len(), + "total_faces": all_faces.len(), + }), + identity_id + ) + .execute(&mut **tx) + .await?; + + Ok(()) +} +``` + +--- + +## Query Workflows + +### 1. List Identities in File + +```bash +GET /api/v1/files/384b0ff44aaaa1f14cb2cd63b3fea966/identities +``` + +**SQL**: + +```sql +SELECT + i.uuid AS identity_uuid, + i.name, + i.source, + fi.face_count, + fi.speaker_count, + fi.confidence +FROM file_identities fi +JOIN identities i ON i.id = fi.identity_id +WHERE fi.file_uuid = '384b0ff44aaaa1f14cb2cd63b3fea966' +ORDER BY fi.face_count DESC; +``` + +--- + +### 2. List Files for Identity + +```bash +GET /api/v1/identities/a9a90105.../files +``` + +**SQL**: + +```sql +SELECT + f.uuid AS file_uuid, + f.file_name, + f.duration, + fi.face_count, + fi.speaker_count, + fi.first_appearance, + fi.last_appearance, + fi.confidence +FROM file_identities fi +JOIN files f ON f.uuid = fi.file_uuid +WHERE fi.identity_id = 1 +ORDER BY fi.face_count DESC; +``` + +--- + +### 3. List Faces for Identity + +```bash +GET /api/v1/identities/a9a90105.../faces?limit=100 +``` + +**SQL**: + +```sql +SELECT + fd.id AS face_id, + fd.file_uuid, + fd.frame, + fd.timestamp, + fd.pose_angle, + fd.confidence, + fd.trace_id +FROM face_detections fd +WHERE fd.identity_id = 1 +ORDER BY fd.timestamp +LIMIT 100; +``` + +--- + +### 4. 
List Unregistered Faces (Candidates) + +```bash +GET /api/v1/faces/candidates?min_confidence=0.8&pose_angle=frontal +``` + +**SQL**: + +```sql +SELECT + fd.id AS face_id, + fd.file_uuid, + fd.frame, + fd.timestamp, + fd.pose_angle, + fd.confidence, + fd.trace_id +FROM face_detections fd +WHERE fd.identity_id IS NULL +AND fd.confidence >= 0.8 +AND fd.pose_angle = 'frontal' +ORDER BY fd.confidence DESC +LIMIT 100; +``` + +--- + +## Performance Considerations + +### Indexing Strategy + +```sql +-- Face queries +CREATE INDEX idx_face_detections_identity ON face_detections(identity_id) + WHERE identity_id IS NOT NULL; +CREATE INDEX idx_face_detections_candidates ON face_detections(confidence DESC) + WHERE identity_id IS NULL; + +-- File identity queries +CREATE INDEX idx_file_identities_file_uuid ON file_identities(file_uuid); +CREATE INDEX idx_file_identities_identity_id ON file_identities(identity_id); + +-- Chunk queries +CREATE INDEX idx_chunks_file_time ON chunks(file_uuid, start_time, end_time); +``` + +### Batch Operations + +```rust +// Batch bind faces (recommended for >10 faces) +pub async fn batch_bind_faces( + db: &PgPool, + identity_id: i64, + face_ids: &[i64], +) -> Result<()> { + let mut tx = db.begin().await?; + + // Single UPDATE statement + sqlx::query!( + "UPDATE face_detections SET identity_id = $1 WHERE id = ANY($2)", + identity_id, + face_ids + ) + .execute(&mut *tx) + .await?; + + // Batch update file_identities + // ... 
(use CTE or temp table)
+
+    tx.commit().await?;
+    Ok(())
+}
+```
+
+---
+
+## Error Handling
+
+### Common Errors
+
+| Error | Cause | Solution |
+|-------|-------|----------|
+| `Identity not found` | Invalid identity_uuid | Check UUID format |
+| `Face already bound` | Face has identity_id | Unbind first |
+| `Invalid face_ids` | Empty array or invalid IDs | Validate input |
+| `Chunk overlap conflict` | Multiple identities in same chunk | Use latest binding |
+
+---
+
+## Version History
+
+| Version | Date | Changes |
+|---------|------|---------|
+| V4.0 | 2026-04-28 | Two-layer architecture, direct binding |
+
+---
+
+## Related Documents
+
+- [IDENTITY_MANAGEMENT_API.md](./IDENTITY_MANAGEMENT_API.md): API design
+- [FILE_IDENTITIES_TABLE_SPEC.md](./FILE_IDENTITIES_TABLE_SPEC.md): Table schema
+- [IDENTITY_AGENT_SPEC.md](./IDENTITY_AGENT_SPEC.md): Agent specification
diff --git a/docs_v1.0/AI_AGENTS/IDENTITY/FILE_IDENTITIES_TABLE_SPEC.md b/docs_v1.0/AI_AGENTS/IDENTITY/FILE_IDENTITIES_TABLE_SPEC.md
new file mode 100644
index 0000000..377d7d0
--- /dev/null
+++ b/docs_v1.0/AI_AGENTS/IDENTITY/FILE_IDENTITIES_TABLE_SPEC.md
@@ -0,0 +1,434 @@
+# File Identities Table Specification
+
+> Version: V4.0 | Date: 2026-04-28
+> Architecture: Two-layer (Face → Identity)
+> Relationship: N:N (Identity ↔ File)
+
+---
+
+## Overview
+
+The `file_identities` table implements the many-to-many relationship between Identity and File, supporting cross-file identity tracking.
+
+### Key Features
+
+| Feature | Description |
+|---------|-------------|
+| **N:N Relationship** | An Identity can span multiple Files, and a File can contain multiple Identities |
+| **Aggregate Stats** | Counts how often each Identity appears in each File |
+| **Time Range** | Records the first and last appearance times |
+| **Confidence** | Average confidence score |
+
+---
+
+## Table Schema
+
+```sql
+CREATE TABLE file_identities (
+    id BIGSERIAL PRIMARY KEY,
+    file_uuid VARCHAR(64) NOT NULL,
+    identity_id BIGINT NOT NULL,
+    face_count INTEGER DEFAULT 0,
+    speaker_count INTEGER DEFAULT 0,
+    first_appearance DOUBLE PRECISION,
+    last_appearance DOUBLE PRECISION,
+    confidence DOUBLE PRECISION 
DEFAULT 0.0, + created_at TIMESTAMPTZ DEFAULT NOW(), + updated_at TIMESTAMPTZ DEFAULT NOW(), + + CONSTRAINT fk_file_identities_file + FOREIGN KEY (file_uuid) + REFERENCES files(uuid) + ON DELETE CASCADE, + + CONSTRAINT fk_file_identities_identity + FOREIGN KEY (identity_id) + REFERENCES identities(id) + ON DELETE CASCADE, + + CONSTRAINT uq_file_identities + UNIQUE (file_uuid, identity_id) +); + +CREATE INDEX idx_file_identities_file_uuid ON file_identities(file_uuid); +CREATE INDEX idx_file_identities_identity_id ON file_identities(identity_id); +CREATE INDEX idx_file_identities_confidence ON file_identities(confidence DESC); +``` + +--- + +## Column Descriptions + +| Column | Type | Description | Example | +|--------|------|-------------|---------| +| `id` | BIGSERIAL | Primary key | `1` | +| `file_uuid` | VARCHAR(64) | File identifier (FK to files.uuid) | `384b0ff44aaaa1f14cb2cd63b3fea966` | +| `identity_id` | BIGINT | Identity ID (FK to identities.id) | `1` | +| `face_count` | INTEGER | Number of faces bound to identity in this file | `500` | +| `speaker_count` | INTEGER | Number of speaker segments bound | `10` | +| `first_appearance` | DOUBLE PRECISION | First appearance time in seconds | `5.2` | +| `last_appearance` | DOUBLE PRECISION | Last appearance time in seconds | `180.5` | +| `confidence` | DOUBLE PRECISION | Average confidence score | `0.86` | +| `created_at` | TIMESTAMPTZ | Record creation time | `2026-04-28T10:00:00Z` | +| `updated_at` | TIMESTAMPTZ | Record update time | `2026-04-28T12:00:00Z` | + +--- + +## Relationships + +### Identity → Files (One-to-Many) + +``` +identities (1) ──→ file_identities (N) ──→ files (N) +``` + +**Query**: List all files where an identity appears + +```sql +SELECT + f.uuid AS file_uuid, + f.file_name, + fi.face_count, + fi.speaker_count, + fi.first_appearance, + fi.last_appearance, + fi.confidence +FROM file_identities fi +JOIN files f ON f.uuid = fi.file_uuid +WHERE fi.identity_id = ? 
+ORDER BY fi.face_count DESC; +``` + +### File → Identities (One-to-Many) + +``` +files (1) ──→ file_identities (N) ──→ identities (N) +``` + +**Query**: List all identities in a file + +```sql +SELECT + i.uuid AS identity_uuid, + i.name, + i.source, + fi.face_count, + fi.speaker_count, + fi.confidence +FROM file_identities fi +JOIN identities i ON i.id = fi.identity_id +WHERE fi.file_uuid = ? +ORDER BY fi.face_count DESC; +``` + +--- + +## Data Flow + +### 1. Face Binding + +When a face is bound to an identity: + +```sql +-- Step 1: Create file_identities record if not exists +INSERT INTO file_identities (file_uuid, identity_id, face_count, confidence) +VALUES (?, ?, 1, ?) +ON CONFLICT (file_uuid, identity_id) +DO UPDATE SET + face_count = file_identities.face_count + 1, + confidence = (file_identities.confidence * file_identities.face_count + EXCLUDED.confidence) / (file_identities.face_count + 1), + updated_at = NOW(); + +-- Step 2: Update first/last appearance +UPDATE file_identities +SET + first_appearance = LEAST(first_appearance, ?), + last_appearance = GREATEST(last_appearance, ?) +WHERE file_uuid = ? AND identity_id = ?; +``` + +### 2. Face Unbinding + +When a face is unbound from an identity: + +```sql +-- Step 1: Get face info before unbinding +SELECT file_uuid, confidence FROM face_detections WHERE id = ?; + +-- Step 2: Update file_identities +UPDATE file_identities +SET + face_count = face_count - 1, + updated_at = NOW() +WHERE file_uuid = ? AND identity_id = ?; + +-- Step 3: Delete if face_count = 0 +DELETE FROM file_identities +WHERE file_uuid = ? AND identity_id = ? AND face_count = 0; +``` + +### 3. Chunk Binding (Auto) + +When a chunk is auto-bound to an identity via time alignment: + +```sql +-- Update speaker_count +UPDATE file_identities +SET + speaker_count = speaker_count + 1, + updated_at = NOW() +WHERE file_uuid = ? 
AND identity_id = ?; +``` + +--- + +## Indexes + +| Index | Purpose | +|-------|---------| +| `idx_file_identities_file_uuid` | Query identities by file | +| `idx_file_identities_identity_id` | Query files by identity | +| `idx_file_identities_confidence` | Sort by confidence | + +--- + +## Constraints + +### Foreign Keys + +| Constraint | On Delete | Description | +|------------|-----------|-------------| +| `fk_file_identities_file` | CASCADE | Delete file_identities rows when the file is deleted | +| `fk_file_identities_identity` | CASCADE | Delete file_identities rows when the identity is deleted | + +### Unique Constraint + +```sql +CONSTRAINT uq_file_identities UNIQUE (file_uuid, identity_id) +``` + +Ensures one record per file-identity pair. + +--- + +## Query Patterns + +### 1. Get Identity Files + +```rust +pub async fn get_identity_files( + db: &PgPool, + identity_uuid: &str, + page: i64, + page_size: i64, +) -> Result<IdentityFilesResponse> { + let rows = sqlx::query_as!( + FileIdentityRow, + r#" + SELECT + f.uuid AS file_uuid, + f.file_name, + f.duration, + fi.face_count, + fi.speaker_count, + fi.first_appearance, + fi.last_appearance, + fi.confidence + FROM file_identities fi + JOIN files f ON f.uuid = fi.file_uuid + JOIN identities i ON i.id = fi.identity_id + WHERE i.uuid = $1 + ORDER BY fi.face_count DESC + LIMIT $2 OFFSET $3 + "#, + identity_uuid, + page_size, + (page - 1) * page_size + ) + .fetch_all(db) + .await?; + + Ok(IdentityFilesResponse { files: rows }) +} +``` + +### 2. 
Get File Identities + +```rust +pub async fn get_file_identities( + db: &PgPool, + file_uuid: &str, + page: i64, + page_size: i64, +) -> Result<FileIdentitiesResponse> { + let rows = sqlx::query_as!( + IdentityRow, + r#" + SELECT + i.uuid AS identity_uuid, + i.name, + i.source, + fi.face_count, + fi.speaker_count, + fi.confidence + FROM file_identities fi + JOIN identities i ON i.id = fi.identity_id + WHERE fi.file_uuid = $1 + ORDER BY fi.face_count DESC + LIMIT $2 OFFSET $3 + "#, + file_uuid, + page_size, + (page - 1) * page_size + ) + .fetch_all(db) + .await?; + + Ok(FileIdentitiesResponse { identities: rows }) +} +``` + +### 3. Update Stats + +```rust +pub async fn update_file_identity_stats( + db: &PgPool, + file_uuid: &str, + identity_id: i64, + face_count_delta: i32, + speaker_count_delta: i32, + confidence: Option<f64>, + timestamp: Option<f64>, +) -> Result<()> { + sqlx::query!( + r#" + INSERT INTO file_identities (file_uuid, identity_id, face_count, speaker_count, confidence, first_appearance, last_appearance) + VALUES ($1, $2, $3, $4, $5, $6, $6) + ON CONFLICT (file_uuid, identity_id) + DO UPDATE SET + face_count = file_identities.face_count + $3, + speaker_count = file_identities.speaker_count + $4, + confidence = CASE + WHEN $5 IS NOT NULL AND file_identities.face_count + $3 > 0 + THEN (file_identities.confidence * file_identities.face_count + $5) / (file_identities.face_count + $3) + ELSE file_identities.confidence + END, + first_appearance = CASE + WHEN $6 IS NOT NULL + THEN LEAST(file_identities.first_appearance, $6) + ELSE file_identities.first_appearance + END, + last_appearance = CASE + WHEN $6 IS NOT NULL + THEN GREATEST(file_identities.last_appearance, $6) + ELSE file_identities.last_appearance + END, + updated_at = NOW() + "#, + file_uuid, + identity_id, + face_count_delta, + speaker_count_delta, + confidence, + timestamp + ) + .execute(db) + .await?; + + Ok(()) +} +``` + +--- + +## Migration + +### V3.x → V4.0 + +**Before (V3.x)**: +- `person_identities` table (303 records, 0 
registered identities) +- One-to-many relationship (person → identities) +- Video-local person IDs + +**After (V4.0)**: +- `file_identities` table (new) +- Many-to-many relationship (identity ↔ file) +- Global identity UUIDs +- Direct face → identity binding + +### Migration Script + +```sql +-- Step 1: Create file_identities table +CREATE TABLE file_identities ( ... ); + +-- Step 2: Populate from face_detections +INSERT INTO file_identities (file_uuid, identity_id, face_count, confidence, first_appearance, last_appearance) +SELECT + fd.file_uuid, + fd.identity_id, + COUNT(*) AS face_count, + AVG(fd.confidence) AS confidence, + MIN(fd.timestamp) AS first_appearance, + MAX(fd.timestamp) AS last_appearance +FROM face_detections fd +WHERE fd.identity_id IS NOT NULL +GROUP BY fd.file_uuid, fd.identity_id; + +-- Step 3: Update speaker_count from chunks +UPDATE file_identities fi +SET speaker_count = ( + SELECT COUNT(DISTINCT c.id) + FROM chunks c + WHERE c.file_uuid = fi.file_uuid + AND c.metadata->>'identity_id' = fi.identity_id::text +); + +-- Step 4: Drop person_identities table +DROP TABLE IF EXISTS person_identities; +``` + +--- + +## Performance Considerations + +### Index Strategy + +| Query Pattern | Index | +|---------------|-------| +| Get identities by file | `idx_file_identities_file_uuid` | +| Get files by identity | `idx_file_identities_identity_id` | +| Sort by confidence | `idx_file_identities_confidence` | + +### Query Optimization + +1. **Use JOINs sparingly**: Fetch identity/file data separately when possible +2. **Pagination**: Always use `LIMIT` and `OFFSET` +3. 
**Batch updates**: Use transactions for bulk face binding + +### Caching Strategy + +```rust +// Redis cache key patterns +const CACHE_KEY_FILE_IDENTITIES: &str = "momentry:file_identities:{}"; +const CACHE_KEY_IDENTITY_FILES: &str = "momentry:identity_files:{}"; + +// Cache TTL (5 minutes) +const CACHE_TTL: i64 = 300; +``` + +--- + +## Version History + +| Version | Date | Changes | +|---------|------|---------| +| V4.0 | 2026-04-28 | Initial design (N:N relationship) | + +--- + +## Related Documents + +- [IDENTITY_MANAGEMENT_API.md](./IDENTITY_MANAGEMENT_API.md): Identity API design +- [IDENTITY_AGENT_SPEC.md](./IDENTITY_AGENT_SPEC.md): Identity Agent specification +- [FACE_TO_IDENTITY_FLOW.md](./FACE_TO_IDENTITY_FLOW.md): Face binding workflow diff --git a/docs_v1.0/AI_AGENTS/IDENTITY/IDENTITY_AGENT_SPEC.md b/docs_v1.0/AI_AGENTS/IDENTITY/IDENTITY_AGENT_SPEC.md new file mode 100644 index 0000000..f79ac6f --- /dev/null +++ b/docs_v1.0/AI_AGENTS/IDENTITY/IDENTITY_AGENT_SPEC.md @@ -0,0 +1,549 @@ +--- +document_type: "architecture_design" +service: "MOMENTRY_CORE" +title: "Identity Agent Design Specification" +date: "2026-04-28" +version: "V2.0" +status: "active" +owner: "Warren" +created_by: "OpenCode" +tags: + - "identity-agent" + - "agent" + - "face-clustering" + - "embedding-matching" + - "multi-file-aggregation" +ai_query_hints: + - "Identity Agent design specification" + - "Face to Identity inference flow" + - "Multi-file identity aggregation" + - "Embedding matching with pose adaptation" +related_documents: + - "AI_AGENTS/CORE/AGENT_SPEC.md" + - "AI_AGENTS/IDENTITY/IDENTITY_MANAGEMENT_API.md" + - "FILE_IDENTITIES_TABLE_SPEC.md" +--- + +# Identity Agent Design Specification + +| Item | Content | +|------|---------| +| Creator | OpenCode | +| Date | 2026-04-28 | +| Version | V2.0 (Two-layer Architecture) | + +--- + +## Version History + +| Version | Date | Changes | Author | +|---------|------|---------|--------| +| V2.0 | 2026-04-28 | Two-layer architecture 
(Face → Identity) | OpenCode | +| V1.0 | 2026-04-27 | Initial design (three-layer) | OpenCode | + +--- + +## Overview + +Identity Agent is an L3 Agent in Momentry Core, responsible for inferring "Who is Who" from Face Processor outputs and aggregating identities across multiple files. + +--- + +## Architecture Change (V1.0 → V2.0) + +| Aspect | V1.0 (Deprecated) | V2.0 (Current) | +|--------|-------------------|----------------| +| **Layers** | Face → Person → Identity | Face → Identity (2 layers) | +| **person_identities** | Required table | Removed (deprecated) | +| **Binding** | Person → Identity | Face → Identity (direct) | +| **Chunks** | Person → Chunk | Face → Chunk (auto-bind by time) | + +--- + +## Current Status + +| Component | Status | +|-----------|--------| +| Face Processor | ✅ Implemented (InsightFace) | +| Face Tracker | ✅ Implemented (trace_id) | +| ASRX Processor | ✅ Implemented (WhisperX) | +| Identity Agent | 🔧 Pending implementation | + +--- + +## 1. Agent Goals + +### 1.1 Core Problem + +**Question**: How to infer global Identity from Face embeddings across multiple files? + +**Challenges**: +1. **Same person in different files**: Need cross-file matching +2. **Different poses**: frontal vs profile have different thresholds +3. **Temporal alignment**: Chunks need time-based binding +4. **Quality variance**: Low-quality faces need filtering + +--- + +### 1.2 Agent Goals + +Aggregate evidence across files to create/maintain global Identities: + +| Evidence Source | Input | Output | +|-----------------|-------|--------| +| **Face Processor** | Face embedding + pose_angle | Face → identity_id | +| **Face Tracker** | trace_id (face tracking) | Trace statistics | +| **ASRX Processor** | Speaker segments | Chunk → identity_id (auto-bind) | +| **Identity Agent** | Face + trace + time | **Identity** (global) | + +--- + +## 2. 
Data Flow (Two-layer) + +``` +File → InsightFace → face_full_traced.json + ↓ + face_id + embedding + pose_angle + trace_id + ↓ + Identity Agent + ↓ + ┌─────────────────────────────────────┐ + │ Step 1: Select unregistered face │ + │ Step 2: Register identity │ + │ Step 3: Embedding matching │ + │ Step 4: Bind faces → identity_id │ + │ Step 5: Auto-bind chunks │ + └─────────────────────────────────────┘ + ↓ + identities + file_identities tables +``` + +--- + +## 3. Input Data + +### 3.1 Face Data Structure + +```json +{ + "file_uuid": "384b0ff44aaaa1f14cb2cd63b3fea966", + "fps": 59.94, + "metadata": { + "trace_stats": { + "total_traces": 4, + "long_traces": 3 + } + }, + "frames": { + "100": { + "faces": [ + { + "face_id": "face_100", + "confidence": 0.92, + "embedding": [512-dim vector], + "pose_angle": { + "angle": "frontal", + "yaw": -5.2, + "pitch": 2.1, + "confidence": 0.95 + }, + "trace_id": 2, + "identity_id": null + } + ] + } + }, + "traces": { + "2": { + "trace_id": 2, + "total_appearances": 143, + "avg_confidence": 0.86, + "pose_distribution": { + "frontal": 20, + "profile_right": 125 + } + } + } +} +``` + +--- + +### 3.2 Data Sources + +| Data | Source File | Description | +|------|--------------|-------------| +| **Face frames** | `{uuid}.face_full_traced_v2.json` | Face detection + embedding + trace | +| **Speaker segments** | `{uuid}.asrx.json` | Speaker time segments | +| **Chunks** | `chunks` table | Sentence chunks (from pre_chunks) | + +--- + +## 4. 
Core Logic + +### 4.1 Inference Flow + +``` +┌─────────────────────────────────────────────────────────────────┐ +│ Identity Agent Workflow │ +├─────────────────────────────────────────────────────────────────┤ +│ │ +│ Step 1: Candidates Query │ +│ ───────────────────────────── │ +│ Query: GET /api/v1/faces/candidates │ +│ Filter: identity_id = NULL, confidence >= 0.8 │ +│ Result: Unregistered faces list │ +│ │ +│ Step 2: AI Suggestion │ +│ ───────────────── │ +│ Query: POST /api/v1/agents/suggest/clustering │ +│ Input: Unregistered faces │ +│ Output: Cluster suggestions + recommended primary face │ +│ │ +│ Step 3: Identity Registration │ +│ ───────────────────────────── │ +│ Query: POST /api/v1/identities/register │ +│ Input: face_ids + name │ +│ Output: identity_uuid │ +│ │ +│ Step 4: Face Binding │ +│ ───────────────── │ +│ For each face in same trace: │ +│ Calculate: embedding_similarity(face, identity.embedding) │ +│ Apply: adaptive_threshold(pose_angle) │ +│ If similarity > threshold: │ +│ UPDATE face_detections SET identity_id = identity.id │ +│ │ +│ Step 5: Chunk Auto-Binding │ +│ ───────────────────────────── │ +│ For each face with identity_id: │ +│ Query: chunks WHERE time overlaps face timestamp │ +│ Update: chunk.metadata.identity_id = identity.uuid │ +│ Update: chunk.metadata.chunk_identity.faces.push(face_id) │ +│ │ +│ Step 6: Statistics Aggregation │ +│ ─────────────────────────────── │ +│ Update: file_identities (face_count, speaker_count) │ +│ Update: identities.metadata (global stats) │ +│ │ +└─────────────────────────────────────────────────────────────────┘ +``` + +--- + +### 4.2 Adaptive Threshold + +**Pose-based threshold strategy**: + +```python +def get_adaptive_threshold(pose_angle: str) -> float: + """Get matching threshold based on pose angle""" + thresholds = { + "frontal": 0.90, # Strict for frontal + "three_quarter": 0.85, # Moderate + "profile_left": 0.80, # Relaxed for profile + "profile_right": 0.80, + } + return 
thresholds.get(pose_angle, 0.75) +``` + +**Reasoning**: +- Frontal faces yield the best embedding quality → strict threshold +- Profile faces yield distorted embeddings → relaxed threshold +- Three-quarter poses fall in between + +--- + +### 4.3 Embedding Matching + +```python +def match_face_to_identity( + face_embedding: list[float], + identity_embedding: list[float], + pose_angle: str +) -> tuple[bool, float]: + """Match face to identity with pose-adaptive threshold""" + + similarity = cosine_similarity(face_embedding, identity_embedding) + threshold = get_adaptive_threshold(pose_angle) + + is_match = similarity > threshold + return is_match, similarity +``` + +--- + +### 4.4 Chunk Auto-Binding + +```rust +pub async fn bind_chunks_to_identity( + identity_id: i64, + file_uuid: &str, + pool: &PgPool, +) -> Result<u64, sqlx::Error> { + // Auto-bind chunks by time alignment + + // Get face timestamps (assumes timestamp is DOUBLE PRECISION seconds) + let faces: Vec<(f64,)> = sqlx::query_as( + "SELECT timestamp + FROM face_detections + WHERE identity_id = $1 AND file_uuid = $2" + ).bind(identity_id).bind(file_uuid).fetch_all(pool).await?; + + // Find overlapping chunks + let mut chunks_updated = 0u64; + for (timestamp,) in faces { + let result = sqlx::query( + "UPDATE chunks + SET metadata = jsonb_set( + metadata, '{chunk_identity}', + jsonb_build_object( + 'identity_id', $1::text, + 'binding_source', 'auto' + ) + ) + WHERE file_uuid = $2 + AND ABS(start_time - $3) < 2.0" + ).bind(identity_id).bind(file_uuid).bind(timestamp) + .execute(pool).await?; + chunks_updated += result.rows_affected(); + } + + Ok(chunks_updated) +} +``` + +--- + +## 5. 
Database Schema + +### 5.1 identities Table + +| Field | Type | Description | +|-------|------|-------------| +| `uuid` | UUID | identity_uuid (global) | +| `name` | VARCHAR | Identity name | +| `face_embedding` | VECTOR(512) | Reference embedding | +| `reference_data` | JSONB | Multi-angle reference vectors | +| `metadata` | JSONB | Global statistics | + +--- + +### 5.2 file_identities Table (N:N) + +| Field | Type | Description | +|-------|------|-------------| +| `file_uuid` | UUID | File UUID | +| `identity_id` | BIGINT | Identity ID | +| `face_count` | INT | Faces in this file | +| `speaker_count` | INT | Speaker segments | +| `first_appearance` | FLOAT | First appearance time | +| `last_appearance` | FLOAT | Last appearance time | +| `confidence` | FLOAT | Avg confidence | + +--- + +### 5.3 face_detections Table + +| Field | Type | Description | +|-------|------|-------------| +| `identity_id` | BIGINT | Bound identity (direct) | +| `file_uuid` | UUID | File UUID | +| `pose_angle` | VARCHAR | Pose angle | +| `embedding` | VECTOR(512) | Face embedding | +| `trace_id` | INT | Trace ID (from Face Tracker) | + +--- + +### 5.4 chunks.metadata Structure + +```json +{ + "chunk_identity": { + "faces": [100, 150], + "speakers": ["SPEAKER_0"], + "identity_id": "a9a90105-...", + "confidence": 0.88, + "binding_source": "auto" + } +} +``` + +--- + +## 6. 
API Design + +### 6.1 Candidates API + +```http +GET /api/v1/faces/candidates + ?min_confidence=0.8 + &pose_angle=frontal + &page=1 + &page_size=15 + &limit=100 +``` + +**Response**: +```json +{ + "candidates": [ + { + "face_id": "face_100", + "pose_angle": "frontal", + "confidence": 0.92, + "trace_id": 2 + } + ] +} +``` + +--- + +### 6.2 Suggest API + +```http +POST /api/v1/agents/suggest/clustering +{ + "min_confidence": 0.8, + "max_suggestions": 5 +} +``` + +**Response**: +```json +{ + "suggestions": [ + { + "cluster_type": "high_confidence", + "recommended_faces": ["face_100"], + "action": "register" + } + ] +} +``` + +--- + +### 6.3 Register API + +```http +POST /api/v1/identities/register +{ + "face_ids": ["face_100"], + "name": "Person A", + "auto_bind_chunks": true +} +``` + +--- + +## 7. Multi-File Aggregation + +### 7.1 Cross-File Matching + +When a new file is processed: + +1. **Query existing identities**: `SELECT * FROM identities` +2. **For each unregistered face**: + - Calculate similarity with all identity.face_embedding + - Apply adaptive threshold + - If match: bind to existing identity +3. **If no match**: create new identity + +--- + +### 7.2 Statistics Update + +```sql +-- Update file_identities after binding +INSERT INTO file_identities ( + file_uuid, identity_id, face_count, confidence +) +SELECT + file_uuid, + identity_id, + COUNT(*), + AVG(confidence) +FROM face_detections +WHERE identity_id IS NOT NULL +GROUP BY file_uuid, identity_id +ON CONFLICT (file_uuid, identity_id) +DO UPDATE SET + face_count = EXCLUDED.face_count, + confidence = EXCLUDED.confidence; +``` + +--- + +## 8. 
Implementation Plan + +### 8.1 Phase 1: Core Matching + +| Task | Status | +|------|--------| +| Adaptive threshold function | Pending | +| Embedding matching logic | Pending | +| Face → Identity binding | Pending | +| Chunk auto-binding | Pending | + +--- + +### 8.2 Phase 2: Candidates API + +| Task | Status | +|------|--------| +| Candidates query endpoint | Pending | +| Pose distribution statistics | Pending | +| Trace-based filtering | Pending | + +--- + +### 8.3 Phase 3: Suggest API + +| Task | Status | +|------|--------| +| Clustering suggestion logic | Pending | +| Primary face recommendation | Pending | +| Merge suggestion | Pending | + +--- + +### 8.4 Phase 4: Statistics + +| Task | Status | +|------|--------| +| file_identities aggregation | Pending | +| identities.metadata update | Pending | +| Cross-file identity stats | Pending | + +--- + +## 9. Key Decisions + +| Decision | Reason | +|----------|--------| +| **Remove person_identities** | Middle layer adds complexity, unused (303 records, 0 registered) | +| **Face → Identity direct** | Simpler, embedding comparison is sufficient | +| **Adaptive threshold** | Pose affects embedding quality | +| **Chunk auto-bind** | Chunks follow faces by time alignment | +| **file_identities table** | Needed for N:N relationship tracking | + +--- + +## 10. 
Metrics + +| Metric | Target | +|--------|--------| +| **Matching accuracy** | > 90% for frontal | +| **False positive rate** | < 5% | +| **Processing speed** | 1000 faces/second | +| **Cross-file recall** | > 85% | + +--- + +## Version Information + +- Version: V2.0 +- Architecture: Two-layer (Face → Identity) +- Date: 2026-04-28 +- Status: Specification complete, implementation pending diff --git a/docs_v1.0/AI_AGENTS/IDENTITY/IDENTITY_MANAGEMENT_API.md b/docs_v1.0/AI_AGENTS/IDENTITY/IDENTITY_MANAGEMENT_API.md index b65ef2e..8feedeb 100644 --- a/docs_v1.0/AI_AGENTS/IDENTITY/IDENTITY_MANAGEMENT_API.md +++ b/docs_v1.0/AI_AGENTS/IDENTITY/IDENTITY_MANAGEMENT_API.md @@ -1,214 +1,434 @@ -# 📘 Momentry 身份管理 (Identity Management) API 實作指南 +# Momentry Identity Management API Guide -本文件示範如何透過 API 完成「從影片選擇 → 臉部分析 → 全域身份註冊」的完整流程。 - -## 1. 選擇目標影片 - -**目標**: 獲取系統中已註冊的影片列表,選擇要進行管理的影片。 - -**API**: `GET /api/v1/videos` - -```bash -curl -s "http://127.0.0.1:3002/api/v1/videos" \ - -H "x-api-key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" | jq . -``` - -**回應範例**: -```json -{ - "videos": [ - { - "uuid": "384b0ff44aaaa1f1", - "file_name": "Old_Time_Movie_Show_-_Charade_1963.HD.mov", - "duration": 6879.33 - }, - { - "uuid": "9760d0820f0cf9a7", - "file_name": "ExaSAN PCIe series - Director Ou.mp4", - "duration": 159.64 - } - ] -} -``` -> **決策**: 我們選擇 `Charade 1963` (UUID: `384b0ff44aaaa1f1`) 進行管理。 +> Version: 4.0 | Updated: 2026-04-28 +> Architecture: Two-layer (Face → Identity) +> Terminology: file_uuid, identity_uuid --- -## 2. 
分析影片內的所有人物 (Faces / Persons / Speakers) +## Overview -**目標**: 查看該影片內所有偵測到的「臉群 (Clusters)」。區分**已命名 (Named)**、**待命名 (Unregistered)** 與 **AI 建議**。 +This guide demonstrates the complete workflow for: +- Choosing a video file +- Analyzing faces (unregistered candidates) +- Registering global identities +- Managing identity ↔ file relationships -**API**: `GET /api/v1/videos/{uuid}/faces` +--- + +## Terminology + +| Term | Scope | Example | +|------|-------|---------| +| **file_uuid** | Video file identifier | `384b0ff44aaaa1f14cb2cd63b3fea966` | +| **identity_uuid** | Global identity identifier | `a9a90105-6d6b-...` | +| **face_id** | Single face detection | `face_100` | +| **trace_id** | Face tracking ID | `2` | + +**Note**: `person_id` (video-local identifier) is deprecated. Use direct Face → Identity binding. + +--- + +## 1. List Files + +**Endpoint**: `GET /api/v1/files` ```bash -curl -s "http://127.0.0.1:3002/api/v1/videos/384b0ff44aaaa1f1/faces" \ - -H "x-api-key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" | jq . +curl -s "http://127.0.0.1:3003/api/v1/files" \ + -H "X-API-Key: YOUR_API_KEY" | jq . ``` -**回應範例**: +**Response**: ```json { "success": true, - "video_uuid": "384b0ff44aaaa1f1", - "total_faces": 6, - "registered_count": 0, - "unregistered_count": 6, - "clusters": [ - { - "cluster_id": "Person_4", - "face_count": 45, - "status": "unregistered", - "identity": { - "name": "Cary Grant", - "is_confirmed": true + "data": { + "files": [ + { + "file_uuid": "384b0ff44aaaa1f14cb2cd63b3fea966", + "file_name": "Charade_1963.mp4", + "duration": 6879.33, + "status": "completed" + } + ] + } +} +``` + +--- + +## 2. List Unregistered Faces (Candidates) + +**Endpoint**: `GET /api/v1/faces/candidates` + +Query faces that have not been bound to any identity. 
+ +| Parameter | Type | Required | Default | Description | +|-----------|------|----------|---------|-------------| +| `file_uuid` | UUID | No | - | Filter by file | +| `min_confidence` | float | No | 0.5 | Minimum confidence | +| `pose_angle` | string | No | - | Filter by pose (frontal/profile) | +| `page` | int | No | 1 | Page number | +| `page_size` | int | No | 15 | Items per page | +| `limit` | int | No | 100 | Total limit | + +```bash +curl -s "http://127.0.0.1:3003/api/v1/faces/candidates?min_confidence=0.8" \ + -H "X-API-Key: YOUR_API_KEY" | jq . +``` + +**Response**: +```json +{ + "success": true, + "data": { + "candidates": [ + { + "face_id": "face_100", + "file_uuid": "384b0ff44aaaa1f14cb2cd63b3fea966", + "frame": 100, + "timestamp": 5.2, + "pose_angle": "frontal", + "confidence": 0.92, + "trace_id": 2, + "embedding_quality": 0.88 + } + ], + "statistics": { + "total_candidates": 78, + "pose_distribution": { + "frontal": 20, + "profile_right": 30, + "three_quarter": 18 } }, - { - "cluster_id": "Person_17", - "face_count": 32, - "status": "unregistered", - "identity": { - "name": "Audrey Hepburn", - "is_confirmed": true - } - }, - { - "cluster_id": "Person_12", - "face_count": 10, - "status": "unregistered", - "identity": { "name": "Person_12" } - }, - { - "cluster_id": "Person_124", - "face_count": 5, - "status": "unregistered", - "identity": null + "pagination": { + "page": 1, + "page_size": 15, + "total": 78, + "total_pages": 6 } - ] -} -``` - -### 如何解讀結果? - -| 欄位 | 說明 | 狀態 | -| :--- | :--- | :--- | -| **`identity.name`** | 若顯示具體人名 (如 "Audrey Hepburn"),代表 **已命名**。 | ✅ 待註冊 | -| **`identity.name`** | 若顯示 `Person_XX` (系統預設名),代表 **待命名**。 | 🔄 等待 AI 或人工命名 | -| **`identity: null`** | 代表完全 **未識別**,通常數量較少。 | ❓ 待處理 | - ---- - -## 3. 
註冊全域身份 (Register Identity) - -**目標**: 將已命名的人物升級為 **全域身份 (Global Identity)**。這能讓系統在其他影片中自動認出他們。 - -**API**: `POST /api/v1/person/{person_id}/register?video_uuid={uuid}` - -### 3.1 註冊 Audrey Hepburn - -```bash -curl -s -X POST "http://127.0.0.1:3002/api/v1/person/Person_17/register?video_uuid=384b0ff44aaaa1f1" \ - -H "x-api-key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" | jq . -``` - -**回應**: -```json -{ - "success": true, - "message": "Successfully registered as global identity", - "person_id": "Person_17", - "name": "Audrey Hepburn", - "face_identity_id": 12 -} -``` - -### 3.2 註冊 Cary Grant - -```bash -curl -s -X POST "http://127.0.0.1:3002/api/v1/person/Person_4/register?video_uuid=384b0ff44aaaa1f1" \ - -H "x-api-key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" | jq . -``` - -**回應**: -```json -{ - "success": true, - "face_identity_id": 13, - "name": "Cary Grant" + } } ``` --- -## ✅ 驗證成果 +## 3. AI Suggest Clustering -現在可以使用全域搜尋 API 確認身份是否註冊成功: +**Endpoint**: `POST /api/v1/agents/suggest/clustering` + +AI Agent analyzes unregistered faces and suggests clustering. ```bash -curl -s -X POST "http://127.0.0.1:3002/api/v1/identities/search" \ +curl -s -X POST "http://127.0.0.1:3003/api/v1/agents/suggest/clustering" \ -H "Content-Type: application/json" \ - -H "x-api-key: muser_..." \ - -d '{"query": "Audrey"}' | jq '.identities[] | {name: .profile.name, identity_id: .face_identity_id}' + -H "X-API-Key: YOUR_API_KEY" \ + -d '{ + "min_confidence": 0.8, + "pose_angles": ["frontal"], + "max_suggestions": 5 + }' | jq . 
``` -**結果**: +**Response**: ```json { - "name": "Audrey Hepburn", - "identity_id": 12 + "success": true, + "data": { + "suggestions": [ + { + "suggestion_id": "suggest_1", + "cluster_type": "high_confidence", + "confidence": 0.92, + "recommended_faces": [ + { + "face_id": "face_100", + "pose_angle": "frontal", + "confidence": 0.95, + "is_primary": true + }, + { + "face_id": "face_150", + "pose_angle": "frontal", + "confidence": 0.91 + } + ], + "cluster_stats": { + "total_faces": 50, + "avg_similarity": 0.89, + "trace_ids": [2, 3] + }, + "reason": "High confidence frontal faces from same trace", + "action": "register" + } + ] + } } ``` --- -## 4. 擷取身份 / 人物 / 臉部 截圖 +## 4. Register Identity from Faces -**目標**: 取得特定人物的臉部特寫截圖。 -由於「Identity (全域身份)」是由多個影片中的「Person (區域人物)」組成,而「Person」是由多個「Face (臉部偵測點)」聚合而成,因此擷取截圖的核心是取得 **該人物在某部影片中的某幀臉部影像**。 +**Endpoint**: `POST /api/v1/identities/register` -**API**: `GET /api/v1/person/{person_id}/thumbnail` - -### 參數說明 - -| 參數 | 類型 | 必填 | 說明 | -| :--- | :--- | :--- | :--- | -| `person_id` | Path | ✅ | 人物 ID (例如: `Person_17`) | -| `video_uuid` | Query | ✅ | 影片 UUID (用來定位影像源) | -| `index` | Query | ❌ | 指定第幾張臉 (預設 `0`) | - -### 4.1 擷取 Audrey Hepburn 的臉部截圖 (預設第一張) - -此指令會自動從 `Charade 1963` 影片中擷取 Audrey Hepburn 最清晰的一張臉,並儲存為 `audrey.jpg`。 +Register a new global identity from face candidates. ```bash -curl -s -o audrey.jpg \ - "http://127.0.0.1:3002/api/v1/person/Person_17/thumbnail?video_uuid=384b0ff44aaaa1f1" \ - -H "x-api-key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" +curl -s -X POST "http://127.0.0.1:3003/api/v1/identities/register" \ + -H "Content-Type: application/json" \ + -H "X-API-Key: YOUR_API_KEY" \ + -d '{ + "face_ids": ["face_100", "face_150", "face_200"], + "name": "Audrey Hepburn", + "source": "manual", + "auto_bind_chunks": true + }' | jq . 
``` -> **注意**: 回應是 **圖片二進位資料 (JPG)**,請使用 `-o filename.jpg` 儲存,**不要**使用 `| jq`。 +**Response**: +```json +{ + "success": true, + "data": { + "identity_uuid": "a9a90105-6d6b-46ff-92da-0c3c1a57dff4", + "name": "Audrey Hepburn", + "faces_bound": 3, + "chunks_bound": 10, + "speaker_ids": ["SPEAKER_0"], + "reference_vectors": { + "total": 3, + "angles": ["frontal", "three_quarter"] + } + } +} +``` -### 4.2 擷取 Cary Grant 的其他臉部截圖 (指定 Index) +--- -若你想看同一人物的其他角度,可以調整 `index` 參數。 -假設 Cary Grant (`Person_4`) 在影片中出現了 45 次: +## 5. Query Identity → Files + +**Endpoint**: `GET /api/v1/identities/:identity_uuid/files` + +List all files where this identity appears. ```bash -# 擷取第 5 次出現的臉部截圖 (index 從 0 開始) -curl -s -o cary_face_5.jpg \ - "http://127.0.0.1:3002/api/v1/person/Person_4/thumbnail?video_uuid=384b0ff44aaaa1f1&index=4" \ - -H "x-api-key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" +curl -s "http://127.0.0.1:3003/api/v1/identities/a9a90105.../files" \ + -H "X-API-Key: YOUR_API_KEY" | jq . ``` -### 4.3 Identity (全域身份) 的截圖策略 +**Response**: +```json +{ + "success": true, + "data": { + "identity_uuid": "a9a90105...", + "name": "Audrey Hepburn", + "files": [ + { + "file_uuid": "384b0ff44aaaa1f14cb2cd63b3fea966", + "file_name": "Charade_1963.mp4", + "face_count": 500, + "speaker_count": 10, + "first_appearance": 5.2, + "last_appearance": 180.5, + "confidence": 0.86 + }, + { + "file_uuid": "9760d0820f0cf9a7", + "file_name": "Breakfast_at_Tiffanys.mp4", + "face_count": 300, + "speaker_count": 5 + } + ], + "total_files": 2 + } +} +``` -由於全域 Identity (`face_identity_id: 12`) 跨越多部影片,要取得它的截圖,請先查詢它所屬的影片: +--- -1. **查詢 Identity 所在的影片**: - ```bash - curl -s "http://127.0.0.1:3002/api/v1/identities/12/videos" \ - -H "x-api-key: muser_..." | jq '.videos[0].video_uuid' - ``` -2. **取得該影片中的對應 Person ID**: 從上一步結果中找到 `person_id` (例如 `Person_17`)。 -3. **呼叫截圖 API**: 使用該 `video_uuid` 和 `person_id` 呼叫上述截圖 API。 +## 6. 
Query File → Identities +**Endpoint**: `GET /api/v1/files/:file_uuid/identities` + +List all identities appearing in a file. + +```bash +curl -s "http://127.0.0.1:3003/api/v1/files/384b0ff44aaaa1f14cb2cd63b3fea966/identities" \ + -H "X-API-Key: YOUR_API_KEY" | jq . +``` + +**Response**: +```json +{ + "success": true, + "data": { + "file_uuid": "384b0ff44aaaa1f14cb2cd63b3fea966", + "file_name": "Charade_1963.mp4", + "identities": [ + { + "identity_uuid": "a9a90105...", + "name": "Audrey Hepburn", + "face_count": 500, + "speaker_count": 10, + "confidence": 0.86 + }, + { + "identity_uuid": "b8b80206...", + "name": "Cary Grant", + "face_count": 450, + "speaker_count": 8 + } + ], + "total_identities": 2 + } +} +``` + +--- + +## 7. Get Identity Detail + +**Endpoint**: `GET /api/v1/identities/:identity_uuid` + +```bash +curl -s "http://127.0.0.1:3003/api/v1/identities/a9a90105..." \ + -H "X-API-Key: YOUR_API_KEY" | jq . +``` + +**Response**: +```json +{ + "success": true, + "data": { + "identity_uuid": "a9a90105...", + "name": "Audrey Hepburn", + "source": "manual", + "identity_type": "person", + "global_stats": { + "total_files": 3, + "total_faces": 1500, + "total_speaker_segments": 30 + }, + "reference_vectors": { + "total": 4, + "angles": ["frontal", "profile_right", "three_quarter"], + "quality_avg": 0.875 + } + } +} +``` + +--- + +## 8. Bind Additional Faces to Identity + +**Endpoint**: `POST /api/v1/identities/:identity_uuid/bind` + +Add more faces to an existing identity. + +```bash +curl -s -X POST "http://127.0.0.1:3003/api/v1/identities/a9a90105.../bind" \ + -H "Content-Type: application/json" \ + -H "X-API-Key: YOUR_API_KEY" \ + -d '{ + "face_ids": ["face_300", "face_400"], + "auto_bind_chunks": true + }' | jq . +``` + +**Response**: +```json +{ + "success": true, + "data": { + "identity_uuid": "a9a90105...", + "faces_bound": 2, + "chunks_bound": 5, + "updated_stats": { + "total_faces": 1502, + "total_files": 3 + } + } +} +``` + +--- + +## 9. 
Unbind Faces from Identity + +**Endpoint**: `POST /api/v1/identities/:identity_uuid/unbind` + +```bash +curl -s -X POST "http://127.0.0.1:3003/api/v1/identities/a9a90105.../unbind" \ + -H "Content-Type: application/json" \ + -H "X-API-Key: YOUR_API_KEY" \ + -d '{ + "face_ids": ["face_400"] + }' | jq . +``` + +--- + +## 10. Get Identity Thumbnail + +**Endpoint**: `GET /api/v1/identities/:identity_uuid/thumbnail` + +```bash +curl -s -o identity_thumbnail.jpg \ + "http://127.0.0.1:3003/api/v1/identities/a9a90105.../thumbnail" \ + -H "X-API-Key: YOUR_API_KEY" +``` + +--- + +## Complete Workflow Example + +``` +Step 1: List files → Choose Charade_1963.mp4 +Step 2: List face candidates → Find high-confidence frontal faces +Step 3: AI suggest clustering → Get clustering recommendations +Step 4: Register identity → Create "Audrey Hepburn" with 3 faces +Step 5: Auto-bind chunks → 10 sentence chunks bound automatically +Step 6: Verify → Query identity → files (appears in 3 files) +``` + +--- + +## API Endpoints Summary + +| Category | Endpoint | Description | +|----------|----------|-------------| +| **List** | `GET /api/v1/files` | List files | +| **List** | `GET /api/v1/identities` | List identities | +| **Candidates** | `GET /api/v1/faces/candidates` | Unregistered faces | +| **Suggest** | `POST /api/v1/agents/suggest/clustering` | AI clustering suggestions | +| **Register** | `POST /api/v1/identities/register` | Register new identity | +| **Bind** | `POST /api/v1/identities/:uuid/bind` | Bind faces to identity | +| **Detail** | `GET /api/v1/identities/:uuid` | Identity detail | +| **Relation** | `GET /api/v1/identities/:uuid/files` | Identity → Files (N:N) | +| **Relation** | `GET /api/v1/files/:uuid/identities` | File → Identities (N:N) | + +--- + +## Changes from V3.x + +| Change | V3.x | V4.0 | +|--------|------|------| +| **Architecture** | Face → Person → Identity | Face → Identity (2-layer) | +| **file_uuid** | video_uuid | file_uuid | +| **person_id** | 28 person 
API endpoints | Removed (deprecated) | +| **file_identities** | Not mentioned | Added (N:N relationship table) | +| **chunk candidates** | chunk candidates API | Removed (chunks auto-bind) | + +--- + +## Version History + +| Version | Date | Changes | +|---------|------|---------| +| V4.0 | 2026-04-28 | Two-layer architecture, file_uuid terminology | +| V3.5 | 2026-04-17 | Person-based workflow | +| V3.0 | 2026-04-10 | Initial identity management | diff --git a/docs_v1.0/AI_AGENTS/IDENTITY/PHASE1_MIGRATION_PLAN.md b/docs_v1.0/AI_AGENTS/IDENTITY/PHASE1_MIGRATION_PLAN.md new file mode 100644 index 0000000..209712c --- /dev/null +++ b/docs_v1.0/AI_AGENTS/IDENTITY/PHASE1_MIGRATION_PLAN.md @@ -0,0 +1,282 @@ +# Phase 1 Migration Plan: video_uuid → file_uuid + +> Version: V4.0 | Date: 2026-04-28 +> Status: Planning + +--- + +## Overview + +Rename every `video_uuid` to `file_uuid` so that the terminology is consistent across the codebase. + +### Impact Summary + +| Category | Count | Priority | +|----------|-------|----------| +| **Migration SQL** | 6 files | High | +| **Rust API** | ~20 files | High | +| **Portal Vue** | 3 files | Medium | +| **Documents** | 121 refs | Low | + +--- + +## Phase 1.1: Database Migration + +### Tables Affected + +| Table | Column | New Name | +|-------|--------|----------| +| `face_detections` | `video_uuid` | `file_uuid` | +| `face_clusters` | `video_uuid` | `file_uuid` | +| `person_identities` | `video_uuid` | `file_uuid` | +| `person_appearances` | `video_uuid` | `file_uuid` | +| `chunks` | `video_uuid` | `file_uuid` | +| `files` | - | (already has `uuid`) | + +### Indexes Affected + +| Old Index | New Index | +|-----------|-----------| +| `idx_face_detections_video_uuid` | `idx_face_detections_file_uuid` | +| `idx_face_clusters_video_uuid` | `idx_face_clusters_file_uuid` | +| `idx_person_identities_video_uuid` | `idx_person_identities_file_uuid` | + +### Migration Script + +```sql +-- Migration: 011_rename_video_uuid_to_file_uuid.sql +-- Date: 2026-04-28 + +BEGIN; + +-- 1. 
face_detections +ALTER TABLE face_detections +RENAME COLUMN video_uuid TO file_uuid; + +DROP INDEX IF EXISTS idx_face_detections_video_uuid; +CREATE INDEX idx_face_detections_file_uuid ON face_detections(file_uuid); +DROP INDEX IF EXISTS idx_face_detections_frame; +CREATE INDEX idx_face_detections_frame ON face_detections(file_uuid, frame_number); + +-- 2. face_clusters +ALTER TABLE face_clusters +RENAME COLUMN video_uuid TO file_uuid; + +DROP INDEX IF EXISTS idx_face_clusters_video_uuid; +CREATE INDEX idx_face_clusters_file_uuid ON face_clusters(file_uuid); + +-- 3. person_identities (will be removed in Phase 2, but rename for consistency) +ALTER TABLE person_identities +RENAME COLUMN video_uuid TO file_uuid; + +DROP INDEX IF EXISTS idx_person_identities_video_uuid; +CREATE INDEX idx_person_identities_file_uuid ON person_identities(file_uuid); + +-- 4. person_appearances +ALTER TABLE person_appearances +RENAME COLUMN video_uuid TO file_uuid; + +DROP INDEX IF EXISTS idx_person_appearances_video_uuid; +CREATE INDEX idx_person_appearances_file_uuid ON person_appearances(file_uuid); +DROP INDEX IF EXISTS idx_person_appearances_time; +CREATE INDEX idx_person_appearances_time ON person_appearances(file_uuid, start_time, end_time); + +-- 5. chunks (if exists) +ALTER TABLE chunks +RENAME COLUMN video_uuid TO file_uuid; + +-- 6. 
Update constraint names +ALTER TABLE face_detections +DROP CONSTRAINT IF EXISTS unique_detection_per_frame, +ADD CONSTRAINT unique_detection_per_frame UNIQUE (file_uuid, frame_number, x, y, width, height); + +ALTER TABLE face_clusters +DROP CONSTRAINT IF EXISTS face_recognition_results_video_uuid_key, +ADD CONSTRAINT face_clusters_file_uuid_key UNIQUE (file_uuid); + +ALTER TABLE person_identities +DROP CONSTRAINT IF EXISTS unique_person_identity, +ADD CONSTRAINT unique_person_identity UNIQUE (file_uuid, face_identity_id, speaker_id); + +COMMIT; +``` + +--- + +## Phase 1.2: Rust API Migration + +### Files Affected + +| File | Changes | +|------|---------| +| `src/api/face_recognition.rs` | Rename struct fields | +| `src/api/videos.rs` | Rename endpoints | +| `src/api/identities.rs` | Update query params | +| `src/api/person_identity.rs` | (will be removed in Phase 2) | +| `src/core/db/*.rs` | Rename column bindings | + +### Migration Steps + +1. Rename struct fields: +```rust +// Before +pub struct FaceResult { + pub video_uuid: String, +} + +// After +pub struct FaceResult { + pub file_uuid: String, +} +``` + +1. Rename route parameters: +```rust +// Before +"/api/v1/face/results/:video_uuid" + +// After +"/api/v1/face/results/:file_uuid" +``` + +1. Update SQLx bindings: +```rust +// Before +sqlx::query!("WHERE video_uuid = $1", video_uuid) + +// After +sqlx::query!("WHERE file_uuid = $1", file_uuid) +``` + +--- + +## Phase 1.3: Portal Migration + +### Files Affected + +| File | Changes | +|------|---------| +| `portal/src/views/IdentitiesView.vue` | Rename field references | +| `portal/src/views/PersonsView.vue` | Rename field references | +| `portal/src/views/IdentityDetailView.vue` | Rename field references | +| `portal/src-tauri/src/api/*.rs` | Rename struct fields | + +### Migration Steps + +1. Rename TypeScript interfaces: +```typescript +// Before +interface Identity { + video_uuid: string; +} + +// After +interface Identity { + file_uuid: string; +} +``` + +1. 
Update Vue templates: +```vue + +
影片: {{ identity.video_uuid }}
+ + +
影片: {{ identity.file_uuid }}
+``` + +--- + +## Phase 1.4: Document Migration + +### Files Affected + +- `docs_v1.0/**/*.md` (121 refs) +- `AGENTS.md` (already updated) + +### Migration Steps + +```bash +# Batch replacement (macOS/BSD sed; with GNU sed on Linux, use `sed -i` without the empty '' argument) +find docs_v1.0 -name "*.md" -type f \ + -exec sed -i '' 's/video_uuid/file_uuid/g' {} \; + +# Verify changes +grep -r "file_uuid" docs_v1.0/*.md | wc -l +``` + +--- + +## Execution Order + +| Step | Description | Est. Time | +|------|-------------|-----------| +| 1 | Create DB migration script | 5 min | +| 2 | Run DB migration (dev schema) | 2 min | +| 3 | Update Rust API | 30 min | +| 4 | Update Portal | 20 min | +| 5 | Run tests | 10 min | +| 6 | Batch update docs | 5 min | +| **Total** | | **~1 hour** | + +--- + +## Rollback Plan + +```sql +-- Rollback migration +BEGIN; + +ALTER TABLE face_detections RENAME COLUMN file_uuid TO video_uuid; +ALTER TABLE face_clusters RENAME COLUMN file_uuid TO video_uuid; +ALTER TABLE person_identities RENAME COLUMN file_uuid TO video_uuid; +ALTER TABLE person_appearances RENAME COLUMN file_uuid TO video_uuid; +ALTER TABLE chunks RENAME COLUMN file_uuid TO video_uuid; + +-- Restore indexes +DROP INDEX idx_face_detections_file_uuid; +CREATE INDEX idx_face_detections_video_uuid ON face_detections(video_uuid); + +-- ... 
(repeat for other tables) + +COMMIT; +``` + +--- + +## Test Commands + +```bash +# After migration, verify API still works +cargo run --bin momentry_playground -- server + +# Test endpoints +curl "http://localhost:3003/api/v1/files/384b0ff44aaaa1f14cb2cd63b3fea966" +curl "http://localhost:3003/api/v1/files/384b0ff44aaaa1f14cb2cd63b3fea966/identities" + +# Run tests +cargo test --lib +cargo clippy --lib +``` + +--- + +## Status Checklist + +- [ ] Create migration script (011_rename_file_uuid.sql) +- [ ] Test migration on dev schema +- [ ] Update Rust API +- [ ] Update Portal +- [ ] Run cargo test +- [ ] Run cargo clippy +- [ ] Batch update docs +- [ ] Verify all endpoints work + +--- + +## Next Phase + +After Phase 1 completion: +- **Phase 2**: Architecture simplification (remove person_identities table) +- **Phase 3**: Implement new binding logic +- **Phase 4**: Portal UI update diff --git a/docs_v1.0/AI_AGENTS/IDENTITY/PHASE2_MIGRATION_SUMMARY.md b/docs_v1.0/AI_AGENTS/IDENTITY/PHASE2_MIGRATION_SUMMARY.md new file mode 100644 index 0000000..abc0d44 --- /dev/null +++ b/docs_v1.0/AI_AGENTS/IDENTITY/PHASE2_MIGRATION_SUMMARY.md @@ -0,0 +1,113 @@ +# Phase 2 Migration Summary + +> Version: V4.0 | Date: 2026-04-28 +> Status: Completed (Code Ready, Migration Pending) + +--- + +## Completed Tasks + +| Task | Status | Details | +|------|--------|---------| +| **DB Migration Scripts** | ✅ | 026, 027, 028 created | +| **New Binding API** | ✅ | identity_binding_v4.rs (473 lines) | +| **Routes Registration** | ✅ | 5 new endpoints | +| **Module Export** | ✅ | mod.rs updated | + +--- + +## New API Endpoints + +| Endpoint | Method | Description | +|----------|--------|-------------| +| `/api/v1/identities/register` | POST | Register identity from face_ids | +| `/api/v1/identities/:uuid/bind` | POST | Bind faces to identity | +| `/api/v1/identities/:uuid/unbind` | POST | Unbind faces from identity | +| `/api/v1/faces/candidates` | GET | List unregistered faces | +| 
`/api/v1/files/:uuid/identity-stats` | GET | Get file identity stats | + +--- + +## Migration Files Created + +| File | Purpose | +|------|---------| +| `migrations/025_rename_video_uuid_to_file_uuid.sql` | Rename columns | +| `migrations/026_create_file_identities_table.sql` | N:N relationship table | +| `migrations/027_add_identity_id_to_face_detections.sql` | Add foreign key | +| `migrations/028_drop_person_identities_table.sql` | Remove old architecture | + +--- + +## Files Modified + +| File | Changes | +|------|--------| +| `src/api/mod.rs` | Add identity_binding_v4 module | +| `src/api/server.rs` | Register new routes | +| `src/api/identity_binding_v4.rs` | New binding logic | + +--- + +## Next Steps + +### 1. Run DB Migrations + +```bash +# Target the dev schema for every psql call below +# (a SET issued through a separate `psql -c` session does not persist) +export PGOPTIONS="-c search_path=dev" + +# Run migrations +psql -U accusys -d momentry -f migrations/025_rename_video_uuid_to_file_uuid.sql +psql -U accusys -d momentry -f migrations/026_create_file_identities_table.sql +psql -U accusys -d momentry -f migrations/027_add_identity_id_to_face_detections.sql +psql -U accusys -d momentry -f migrations/028_drop_person_identities_table.sql +``` + +### 2. Update SQLx Cache + +```bash +cargo sqlx prepare +``` + +### 3. 
Test New Endpoints + +```bash +cargo run --bin momentry_playground -- server + +# Test candidates API +curl "http://localhost:3003/api/v1/faces/candidates?min_confidence=0.8" + +# Test register API +curl -X POST "http://localhost:3003/api/v1/identities/register" \ + -H "Content-Type: application/json" \ + -d '{"face_ids": [100], "name": "Test Person"}' +``` + +--- + +## Compilation Status + +- **Code Structure**: ✅ Correct +- **Type Safety**: ⏸ Pending DB migration +- **SQLx Cache**: ⏸ Need `cargo sqlx prepare` after migration + +--- + +## Architecture Comparison + +| Aspect | V3.x | V4.0 | +|--------|------|------| +| **Binding Layer** | 3 (Face → Person → Identity) | 2 (Face → Identity) | +| **Tables** | person_identities + person_appearances | file_identities | +| **API Endpoints** | 33 | 15 | +| **Person ID** | Video-local | ❌ Removed | +| **Chunk Binding** | Manual | Auto (time alignment) | + +--- + +## Version History + +| Version | Date | Changes | +|---------|------|---------| +| V4.0 | 2026-04-28 | Two-layer architecture complete | diff --git a/docs_v1.0/AI_AGENTS/IDENTITY/V4_MIGRATION_COMPLETE.md b/docs_v1.0/AI_AGENTS/IDENTITY/V4_MIGRATION_COMPLETE.md new file mode 100644 index 0000000..b173826 --- /dev/null +++ b/docs_v1.0/AI_AGENTS/IDENTITY/V4_MIGRATION_COMPLETE.md @@ -0,0 +1,119 @@ +# V4.0 Migration Complete + +> Date: 2026-04-28 19:50 +> Status: ✅ Successfully Completed + +--- + +## Summary + +### Phase 1: Terminology Migration (video_uuid → file_uuid) + +| Task | Status | Files Modified | +|------|--------|----------------| +| **DB Migration 025** | ✅ | 4 tables renamed | +| **Rust API** | ✅ | 11 files | +| **Portal Vue/Tauri** | ✅ | 6 files | +| **Documents** | ✅ | 117 MD files | + +### Phase 2: Architecture Simplification + +| Task | Status | Details | +|------|--------|---------| +| **DB Migration 026** | ✅ | file_identities table created | +| **DB Migration 027** | ✅ | identity_id FK added | +| **DB Migration 028** | ✅ | person_identities dropped 
| +| **SQLx Fix** | ✅ | 5 JSONB bindings fixed | +| **Compilation** | ✅ | cargo check --lib passed | +| **Tests** | ✅ | 178 tests passed | +| **Clippy** | ✅ | 119 warnings (minor) | + +--- + +## Files Fixed (JSONB Issues) + +| File | Line | Fix | +|------|------|-----| +| src/api/identities.rs | 274 | .bind(serde_json::to_string(...)) | +| src/api/face_recognition.rs | 337 | .bind(serde_json::to_string(...)) | +| src/api/person_identity.rs | 1508 | .bind(serde_json::to_string(...)) | +| src/api/person_identity.rs | 2287 | .bind(serde_json::to_string(...)) | +| src/core/worker/job_runner.rs | 105 | serde_json::json!({"status": "COMPLETED"}) | + +--- + +## Database State (dev schema) + +```sql +-- Tables Created +file_identities ✅ + - file_uuid, identity_id, face_count, confidence + +-- Tables Renamed +face_detections.video_uuid → file_uuid ✅ +face_clusters.video_uuid → file_uuid ✅ + +-- Tables Deleted +person_identities ✅ +person_appearances ✅ +``` + +--- + +## Build Status + +```bash +# Compilation +cargo check --lib ✅ +cargo build --lib ✅ + +# Tests +cargo test --lib ✅ (178 passed) + +# Linting +cargo clippy --lib ✅ (119 warnings, minor) + +# SQLx Cache +cargo sqlx prepare ✅ (.sqlx updated) +``` + +--- + +## Remaining Tasks (Optional) + +| Task | Priority | Status | +|------|----------|--------| +| Create identity_binding_v4.rs | Medium | Pending | +| Remove person_identity.rs | Low | Pending | +| Update Portal UI for new endpoints | Low | Pending | + +--- + +## Migration Summary + +| Aspect | V3.x | V4.0 | +|--------|------|------| +| **video_uuid** | Used everywhere | **file_uuid** | +| **person_identities** | 303 records | **Removed** | +| **file_identities** | N/A | **Created** | +| **Architecture** | 3-layer | **2-layer** | +| **Compilation** | Broken | **Fixed** | +| **Tests** | - | **178 passed** | + +--- + +## Next Steps + +1. Test API endpoints manually +2. Create identity_binding_v4.rs with proper JSONB handling +3. 
Update Portal UI to use new endpoints +4. Document API changes in AGENTS.md + +--- + +## Key Lessons + +1. **SQLx JSONB**: Must use `serde_json::json!()` for compile-time checks +2. **Batch replacements**: Use sed -i for large-scale renaming +3. **DB Migration**: Test on dev schema first, fix errors incrementally +4. **Compilation**: Fix one error at a time, run cargo check frequently diff --git a/docs_v1.0/AI_AGENTS/IDENTITY/V4_MIGRATION_STATUS.md b/docs_v1.0/AI_AGENTS/IDENTITY/V4_MIGRATION_STATUS.md new file mode 100644 index 0000000..69f1c8c --- /dev/null +++ b/docs_v1.0/AI_AGENTS/IDENTITY/V4_MIGRATION_STATUS.md @@ -0,0 +1,121 @@ +# V4.0 Migration Status + +> Date: 2026-04-28 + +--- + +## Completed Tasks + +### Phase 1: Terminology Migration (video_uuid → file_uuid) + +| Task | Status | Details | +|------|--------|---------| +| **DB Migration 025** | ✅ | face_detections, face_clusters, person_identities renamed | +| **Rust API** | ✅ | 11 files batch replaced | +| **Portal** | ✅ | 6 Vue/Tauri files | +| **Documents** | ✅ | 117 MD files | + +### Phase 2: Architecture Simplification + +| Task | Status | Details | +|------|--------|---------| +| **DB Migration 026** | ✅ | file_identities table created | +| **DB Migration 027** | ✅ | identity_id FK added to face_detections | +| **DB Migration 028** | ✅ | person_identities + person_appearances dropped | +| **New Binding API** | ⏸ | identity_binding_v4.rs (SQLx compile error) | + +--- + +## Current Issue + +**SQLx Compile Error**: "invalid input syntax for type json" + +Cause: identities.metadata column is JSONB, but SQLx requires exact type matching during compile-time checks. 
+ +--- + +## Database State + +```sql +-- Tables Created +file_identities (N:N relationship) + - file_uuid, identity_id, face_count, confidence + +-- Tables Renamed +face_detections.video_uuid → file_uuid +face_clusters.video_uuid → file_uuid + +-- Tables Deleted +person_identities ✅ +person_appearances ✅ +``` + +--- + +## Next Steps + +### Option A: Fix SQLx (Recommended) + +1. Remove identity_binding_v4.rs temporarily +2. Run `cargo sqlx prepare` to update cache +3. Fix SQL queries with proper JSONB binding +4. Re-add identity_binding_v4.rs + +### Option B: Use SQLX_OFFLINE + +```bash +SQLX_OFFLINE=true cargo build --lib +cargo sqlx prepare +``` + +### Option C: Skip for Now + +Keep existing person_identity.rs API, migrate later when database is stable. + +--- + +## Test Commands + +```bash +# Verify tables +psql -U accusys -d momentry -c "\dt dev.*" + +# Check columns +psql -U accusys -d momentry -c " +SELECT table_name, column_name +FROM information_schema.columns +WHERE table_schema = 'dev' +AND column_name = 'file_uuid' +ORDER BY table_name; +" + +# Build (if SQLx fixed) +cargo build --lib +cargo test --lib +``` + +--- + +## Files Modified + +| File | Lines | +|------|-------| +| migrations/025_rename_video_uuid_to_file_uuid.sql | 42 | +| migrations/026_create_file_identities_table.sql | 39 | +| migrations/027_add_identity_id_to_face_detections.sql | 30 | +| migrations/028_drop_person_identities_table.sql | 29 | +| src/api/identity_binding_v4.rs | 310 | +| src/api/mod.rs | +1 line | +| src/api/server.rs | +1 line | + +--- + +## Migration Summary + +| Aspect | V3.x | V4.0 | +|--------|------|------| +| **video_uuid** | Used everywhere | **file_uuid** | +| **person_identities** | 303 records | **Removed** | +| **file_identities** | N/A | **Created** | +| **API Endpoints** | 33 | 15 (pending) | +| **Binding Logic** | 3-layer | 2-layer (pending) | diff --git a/docs_v1.0/AI_AGENTS/SUMMARIZATION/CHUNK_RULE_4_SUMMARY.md 
b/docs_v1.0/AI_AGENTS/SUMMARIZATION/CHUNK_RULE_4_SUMMARY.md index 80ea2fb..5c3be3e 100644 --- a/docs_v1.0/AI_AGENTS/SUMMARIZATION/CHUNK_RULE_4_SUMMARY.md +++ b/docs_v1.0/AI_AGENTS/SUMMARIZATION/CHUNK_RULE_4_SUMMARY.md @@ -51,14 +51,14 @@ ai_query_hints: Rule 4 is the final stage of the processing pipeline and depends on the output of **Rule 3** and on the **LLM service**. -1. **Rule 3 Chunks (Primary)**: Provide scene-level text summaries and metadata. +1. **Rule 3 Chunks (Primary)**: Provide scene-level text summaries and metadata. - *Aggregation strategy*: Treat 5-10 consecutive Rule 3 chunks as one "narrative block". -2. **LLM Processor (Gemma4)**: +2. **LLM Processor (Gemma4)**: - *Task*: Read every Rule 3 summary and all ASR text within the block. - *Output*: - **Summary**: A fluent description of the plot. - **5W1H**: Structured extraction of the key elements. -3. **Visual/Audio Retention**: +3. **Visual/Audio Retention**: - Retain every `face_ids` (Who) and `objects` (What/Where) entry that appears within the block. --- @@ -139,21 +139,21 @@ ALTER TABLE parent_chunks ADD COLUMN rule4_parent_id UUID REFERENCES chunks_rule Rule 4 is the core data source for **RAG (Retrieval-Augmented Generation)**. ### 3.1 Plot Search -* **Scenario**: "What is this film about?", "Did they find the stamps?" -* **Logic**: +- **Scenario**: "What is this film about?", "Did they find the stamps?" +- **Logic**: 1. Search the `summary` vectors. 2. Return the complete summary block that contains the plot point. ### 3.2 5W1H Structured Query -* **Scenario**: "Find every clip where **Cary Grant (Who)** is **in a car (Where)**". -* **Logic**: +- **Scenario**: "Find every clip where **Cary Grant (Who)** is **in a car (Where)**". +- **Logic**: 1. Filter on the `analysis_5w1h` JSONB column. 2. `who` contains "Cary Grant" **AND** `where` contains "car". 3. This kind of query is more precise than traditional keyword search because it runs on structured data that the LLM has already interpreted. ### 3.3 Motive and Cause Search (Why/How) -* **Scenario**: "Why did he steal?" -* **Logic**: +- **Scenario**: "Why did he steal?" +- **Logic**: 1. Run a semantic match against `analysis_5w1h.why`. --- diff --git a/docs_v1.0/AI_AGENTS/TRANSLATION/TEXT_TRANSLATION.md b/docs_v1.0/AI_AGENTS/TRANSLATION/TEXT_TRANSLATION.md index 4fe80d2..e87a2c4 100644 --- a/docs_v1.0/AI_AGENTS/TRANSLATION/TEXT_TRANSLATION.md +++ b/docs_v1.0/AI_AGENTS/TRANSLATION/TEXT_TRANSLATION.md @@ -142,9 +142,9 @@ Content-Type: application/json In the Portal's `ChunkDetailView.vue`, the translation feature is invoked as follows: -1. The user clicks the "Translate to Traditional Chinese" button. -2. The Portal sends a POST request to `/api/v1/agents/translate`. -3. Once the result arrives, the UI is updated without reloading the page (showing `translated_text`). +1. 
The user clicks the "Translate to Traditional Chinese" button. +2. The Portal sends a POST request to `/api/v1/agents/translate`. +3. Once the result arrives, the UI is updated without reloading the page (showing `translated_text`). ```typescript // Portal frontend call example diff --git a/docs_v1.0/API/PEOPLE_API_MARCOM_MAPPING.md b/docs_v1.0/API/PEOPLE_API_MARCOM_MAPPING.md new file mode 100644 index 0000000..2625833 --- /dev/null +++ b/docs_v1.0/API/PEOPLE_API_MARCOM_MAPPING.md @@ -0,0 +1,442 @@ +# People API Design Proposal (Equivalent Mapping for marcom Requirements) + +**Date**: 2026-04-28 +**Status**: Design phase +**Purpose**: Provide equivalent APIs for the marcom team's requirements while staying within the existing architecture + +--- + +## Design Principles + +1. **Follow RESTful conventions**: Use standard HTTP methods (GET, POST, PATCH, DELETE) +2. **Unified path prefix**: `/api/v1/people` +3. **Uniform response format**: `{ success: bool, message: string, data: any }` +4. **Backward compatible**: Existing APIs stay unchanged; new APIs extend functionality +5. **Aligned with the Identity system**: Integrates with the `identities` and `identity_bindings` tables + +--- + +## API Mapping + +### 1. GET /people/candidates (Person Candidates) + +**marcom requirement**: Fetch the list of person candidates awaiting confirmation + +**Equivalent API**: +``` +GET /api/v1/people/candidates?file_uuid={uuid}&limit={n} +``` + +**Functionality**: +- Returns person identity candidates awaiting confirmation +- Includes match suggestions for face clusters and speaker clusters +- Status: `pending`, `suggested`, `unmatched` + +**Response example**: +```json +{ + "success": true, + "message": "Found 15 candidates", + "data": { + "candidates": [ + { + "candidate_id": "face_cluster_1", + "type": "face", + "suggested_identity": { + "id": 123, + "name": "张曼玉", + "confidence": 0.92 + }, + "appearance_count": 45, + "status": "pending" + } + ], + "total": 15 + } +} +``` + +**Implementation**: Extend the existing `/api/v1/people/suggest` + +--- + +### 2. 
GET /people (Person List) + +**marcom requirement**: Fetch the list of all persons + +**Equivalent API**: +``` +GET /api/v1/people?file_uuid={uuid}&limit={n}&offset={n}&status={status} +``` + +**Functionality**: +- Returns the list of person identities +- Supports filtering by file_uuid +- Supports pagination +- Supports filtering by status (confirmed, pending, all) + +**Response example**: +```json +{ + "success": true, + "message": "Found 8 persons", + "data": { + "persons": [ + { + "identity_id": "Person_17", + "name": "张曼玉", + "appearance_count": 45, + "total_duration": 350.2, + "is_confirmed": true + } + ], + "total": 8 + } +} +``` + +**Implementation**: Already supported by the existing `/api/v1/people/list` + +--- + +### 3. GET /people/{identity_id} (Person Detail) + +**marcom requirement**: Fetch the details of a person + +**Equivalent API**: +``` +GET /api/v1/people/{identity_id}?file_uuid={uuid} +``` + +**Functionality**: +- Returns detailed information about the person +- Includes the appearance timeline +- Includes the associated face/speaker +- Includes thumbnails + +**Response example**: +```json +{ + "success": true, + "data": { + "identity_id": "Person_17", + "name": "张曼玉", + "face_identity_id": 123, + "speaker_id": "SPEAKER_00", + "appearance_count": 45, + "total_duration": 350.2, + "first_appearance_time": 10.5, + "last_appearance_time": 360.2, + "timeline": [...], + "thumbnails": [...] + } +} +``` + +**Implementation**: Already supported by the existing `/api/v1/people/:person_id` + +--- + +### 4. POST /people (Create Person) + +**marcom requirement**: Manually create a new person + +**Equivalent API**: +``` +POST /api/v1/people +Body: { "name": "张曼玉", "file_uuid": "xxx", "metadata": {...} } +``` + +**Functionality**: +- Creates a new person identity +- Associates it with the specified video +- Supports attaching metadata (character name, actor name, etc.) + +**Response example**: +```json +{ + "success": true, + "message": "Person created", + "data": { + "identity_id": "Person_99", + "name": "张曼玉", + "file_uuid": "xxx" + } +} +``` + +**Implementation**: To be added; see `CreatePersonIdentityRequest` + +--- + +### 5. PATCH /people/{identity_id} (Update Person) + +**marcom requirement**: Update a person's information + +**Equivalent API**: +``` +PATCH /api/v1/people/{identity_id} +Body: { "name": "New Name", "is_confirmed": true, "metadata": {...} } +``` + +**Functionality**: +- Updates the person's name +- Confirms the person's identity +- Updates metadata + +**Implementation**: Already supported by the existing `/api/v1/people/:person_id` (PATCH) + +--- + +### 6. 
POST /people/merge (Merge Persons) + +**marcom requirement**: Merge several persons into one + +**Equivalent API**: +``` +POST /api/v1/people/merge +Body: { + "target_identity_id": "Person_17", + "source_identity_ids": ["Person_18", "Person_19"] +} +``` + +**Functionality**: +- Merges multiple person identities +- Transfers all appearance records +- Updates statistics + +**Implementation**: Already supported by the existing `/api/v1/people/merge` + +--- + +### 7. POST /people/skip (Skip Person) + +**marcom requirement**: Skip a person candidate (leave it unprocessed) + +**Equivalent API**: +``` +POST /api/v1/people/skip +Body: { "candidate_id": "face_cluster_2", "reason": "not a person" } +``` + +**Functionality**: +- Marks the candidate as "skipped" +- Records the reason for skipping +- Does not create a person identity + +**Response example**: +```json +{ + "success": true, + "message": "Candidate skipped", + "data": { + "candidate_id": "face_cluster_2", + "status": "skipped", + "reason": "not a person" + } +} +``` + +**Implementation**: To be added, extending the candidate management feature + +--- + +### 8. POST /people/{identity_id}/remove-face (Remove Face) + +**marcom requirement**: Remove a specific face binding from a person identity + +**Equivalent API**: +``` +POST /api/v1/people/{identity_id}/unbind +Body: { "binding_type": "face", "binding_value": "face_123" } +``` + +**Functionality**: +- Unbinds the face from the person identity +- Returns the face to candidate status +- Updates the person's appearance statistics + +**Response example**: +```json +{ + "success": true, + "message": "Face unbound", + "data": { + "identity_id": "Person_17", + "unbound_face": "face_123", + "updated_appearance_count": 42 + } +} +``` + +**Implementation**: To be added; see the existing `UnbindIdentityRequest` + +--- + +### 9. POST /people/split-face (Split Face) + +**marcom requirement**: Split faces off an existing person into a new person + +**Equivalent API**: +``` +POST /api/v1/people/split +Body: { + "source_identity_id": "Person_17", + "face_ids": ["face_123", "face_124"], + "new_identity_name": "New Person" +} +``` + +**Functionality**: +- Detaches the specified faces from the existing person +- Creates a new person identity +- Transfers the appearance records + +**Implementation**: Partially supported by the existing `/api/v1/people/:person_id/split` + +--- + +### 10. 
GET /people/{identity_id}/resolve (Resolve Conflicts) + +**marcom requirement**: Fetch conflict/ambiguity information for a person + +**Equivalent API**: +``` +GET /api/v1/people/{identity_id}/conflicts +``` + +**Functionality**: +- Returns potential conflicts for the person identity +- Shows matches with similar faces/voices +- Suggests possible resolutions + +**Response example**: +```json +{ + "success": true, + "data": { + "identity_id": "Person_17", + "conflicts": [ + { + "type": "similar_face", + "conflicting_identity": "Person_18", + "similarity": 0.85, + "suggestion": "merge" + } + ], + "resolution_options": ["merge", "keep_separate", "skip"] + } +} +``` + +**Implementation**: To be added + +--- + +### 11. POST /search (Search) + +**marcom requirement**: Search for persons + +**Equivalent API**: +``` +POST /api/v1/people/search +Body: { + "query": "张", + "filters": { "type": "people", "file_uuid": "xxx" }, + "limit": 20 +} +``` + +**Functionality**: +- Searches person identities +- Supports filtering by name, type, and video +- Returns the matching results + +**Implementation**: Already supported by the existing `/api/v1/identities/search`; extension recommended + +--- + +### 12. GET /people/status (Person Status) + +**marcom requirement**: Fetch statistics on person processing status + +**Equivalent API**: +``` +GET /api/v1/people/status?file_uuid={uuid} +``` + +**Functionality**: +- Returns person processing statistics +- Counts of pending, confirmed, and skipped candidates +- Merge history + +**Response example**: +```json +{ + "success": true, + "data": { + "file_uuid": "xxx", + "total_candidates": 15, + "confirmed": 8, + "pending": 5, + "skipped": 2, + "merge_count": 3, + "split_count": 1 + } +} +``` + +**Implementation**: To be added + +--- + +## Implementation Priority + +| Priority | API | Status | Estimated Effort | +|--------|-----|------|----------| +| **P0** | GET /people | ✅ Exists | 0h | +| **P0** | GET /people/{identity_id} | ✅ Exists | 0h | +| **P0** | PATCH /people/{identity_id} | ✅ Exists | 0h | +| **P0** | POST /people/merge | ✅ Exists | 0h | +| **P1** | GET /people/candidates | ⚠️ Extend | 2h | +| **P1** | POST /people | ❌ New | 2h | +| **P1** | POST /people/search | ⚠️ Extend | 1h | +| **P2** | POST /people/skip | ❌ New | 2h | +| **P2** | POST /people/{identity_id}/unbind | ❌ New | 2h | +| **P2** | POST /people/split | ⚠️ Extend | 1h | +| **P2** | GET /people/{identity_id}/conflicts | ❌ New | 3h | +| **P2** | GET /people/status | ❌ New | 2h | + +**Total estimate**: ~15h (P1: 5h + P2: 10h) + +--- + +## Database Table Requirements + +The existing schema covers most of the requirements; the following extension may be needed: + +```sql +-- Suggested addition: candidates table (candidate management) +CREATE TABLE person_candidates ( + id BIGSERIAL PRIMARY KEY, + file_uuid VARCHAR(36) NOT NULL, + candidate_type VARCHAR(20), -- 'face', 'speaker' + candidate_id VARCHAR(50), -- 'face_cluster_1', 'speaker_2' + suggested_identity_id BIGINT, + confidence FLOAT, + status VARCHAR(20), -- 'pending', 'confirmed', 'skipped' + skip_reason TEXT, + created_at TIMESTAMP, + updated_at TIMESTAMP +); +``` + +--- + +## References + +- `docs_v1.0/ARCHITECTURE/MOMENTRY_CORE_ARCHITECTURE_V2.md` - Identity system design +- `docs_v1.0/ARCHITECTURE/PERSON_IDENTITY_INTEGRATION.md` - Person Identity integration +- `src/api/person_identity.rs` - Existing API implementation +- `src/api/identity_binding.rs` - Identity binding API diff --git a/docs_v1.0/API_DOCUMENTATION.md b/docs_v1.0/API_DOCUMENTATION.md new file mode 100644 index 0000000..7306170 --- /dev/null +++ b/docs_v1.0/API_DOCUMENTATION.md @@ -0,0 +1,699 @@ +# Momentry Core API Documentation v1.0.0 + +## Overview +Momentry Core is a digital asset management system with video analysis, RAG, and face recognition capabilities. This document covers all API endpoints available in v1.0.0. + +**Base URL**: `http://<host>:<port>` +- Production: Port 3002 +- Development (Playground): Port 3003 + +**Authentication**: All protected routes require API key validation via the `X-API-Key` header. 
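The base URL and `X-API-Key` convention described here can be sketched as a small client helper. This is only an illustration and not part of the Momentry codebase: the helper name `build_request`, the default host `127.0.0.1`, and the placeholder key are assumptions, while the two ports, the `/health` route, and the header name are taken from this document.

```python
from urllib.request import Request

# Ports as documented: 3002 for production, 3003 for the development playground.
PORTS = {"production": 3002, "development": 3003}


def build_request(path: str, api_key: str, host: str = "127.0.0.1",
                  env: str = "development") -> Request:
    """Build (but do not send) a GET request that carries the X-API-Key header.

    The helper and its default `host` are hypothetical; adjust them to the deployment.
    """
    url = f"http://{host}:{PORTS[env]}{path}"
    return Request(url, headers={"X-API-Key": api_key})


if __name__ == "__main__":
    req = build_request("/health", "YOUR_API_KEY")
    # urllib normalizes header names, so the key is stored as "X-api-key".
    print(req.get_method(), req.full_url, req.get_header("X-api-key"))
```

Passing the returned object to `urllib.request.urlopen` would send it. HTTP header names are case-insensitive, so urllib's normalization of the header name should not matter to the server.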
+ +--- + +## API Classification + +The API is organized into 7 categories: + +| Category | Prefix | Description | +|----------|--------|-------------| +| **Health & Auth** | `/health`, `/api/v1/auth` | System health, authentication | +| **Asset Management** | `/api/v1/register`, `/api/v1/files`, `/api/v1/assets` | File registration, probing, processing | +| **Search** | `/api/v1/search`, `/api/v1/n8n` | Text, hybrid, visual, and n8n search | +| **Video Details** | `/api/v1/videos`, `/api/v1/progress` | Video listing, details, chunks | +| **Identity & Binding** | `/api/v1/identities`, `/api/v1/signals` | Face/speaker identity management | +| **Jobs & Rules** | `/api/v1/jobs`, `/api/v1/rules` | Processing job monitoring | +| **Stats & Config** | `/api/v1/stats`, `/api/v1/config` | System statistics, configuration | + +--- + +## 1. Health & Authentication + +### `GET /health` +Basic health check. + +**Response**: +```json +{ + "status": "ok", + "version": "v1.0.0", + "uptime_ms": 12345 +} +``` + +### `GET /health/detailed` +Detailed health check with service status (PostgreSQL, Redis, Qdrant, MongoDB). + +**Response**: +```json +{ + "status": "ok", + "version": "v1.0.0", + "uptime_ms": 12345, + "services": { + "postgres": { "status": "ok", "latency_ms": 5 }, + "redis": { "status": "ok", "latency_ms": 2 }, + "qdrant": { "status": "ok", "latency_ms": 10 }, + "mongodb": { "status": "ok", "latency_ms": 8 } + } +} +``` + +### `POST /api/v1/auth/login` +Authenticate and obtain API key. + +**Request**: +```json +{ + "username": "demo", + "password": "demo" +} +``` + +**Response**: +```json +{ + "success": true, + "message": "Login successful", + "api_key": "muser_test_001", + "user": { "username": "demo" } +} +``` + +### `POST /api/v1/auth/logout` +Logout session. + +**Response**: +```json +{ "success": true } +``` + +--- + +## 2. Asset Management + +### `POST /api/v1/register` +Register a video file (legacy path-based). 
+ +**Request**: +```json +{ "path": "./demo/video.mp4" } +``` + +**Response**: +```json +{ + "file_uuid": "384b0ff44aaaa1f1", + "file_id": 1, + "job_id": 1, + "file_name": "video.mp4", + "duration": 120.5, + "width": 1920, + "height": 1080, + "already_exists": false +} +``` + +### `POST /api/v1/files/register` +Register a file with full metadata (recommended). Supports move detection. + +**Request**: +```json +{ + "file_path": "/Users/accusys/momentry/var/sftpgo/data/demo/video.mp4", + "user_id": null +} +``` + +**Response**: +```json +{ + "success": true, + "file_uuid": "384b0ff44aaaa1f1", + "file_name": "video.mp4", + "file_path": "/Users/accusys/momentry/var/sftpgo/data/demo/video.mp4", + "file_type": "video", + "duration": 120.5, + "width": 1920, + "height": 1080, + "fps": 30.0, + "total_frames": 3615, + "registration_time": null, + "already_exists": false, + "message": "File registered successfully" +} +``` + +### `GET /api/v1/files/scan` +Scan filesystem for unregistered files. + +### `POST /api/v1/unregister` +Unregister a video file. + +**Request**: +```json +{ "uuid": "384b0ff44aaaa1f1" } +``` + +### `POST /api/v1/probe` +Probe a video file for metadata. + +**Request**: +```json +{ "path": "./demo/video.mp4" } +``` + +**Response**: +```json +{ + "uuid": "384b0ff44aaaa1f1", + "file_name": "video.mp4", + "duration": 120.5, + "width": 1920, + "height": 1080, + "fps": 30.0, + "cached": true, + "format": { ... }, + "streams": [ ... ] +} +``` + +### `GET /api/v1/assets/:uuid/probe` +Probe a video by UUID. + +### `POST /api/v1/assets/:uuid/process` +Trigger processing pipeline for an asset. + +**Request**: +```json +{ + "processors": ["asr", "cut", "yolo", "ocr", "face", "pose", "asrx", "visual_chunk"] +} +``` + +**Response**: +```json +{ + "job_id": 1, + "asset_uuid": "384b0ff44aaaa1f1", + "status": "PENDING", + "message": "Processing triggered for video.mp4" +} +``` + +### `GET /api/v1/assets/:uuid/status` +Get asset processing status with frame progress. 
+ +**Response**: +```json +{ + "uuid": "384b0ff44aaaa1f1", + "file_name": "video.mp4", + "registration_time": "2026-04-30T10:00:00Z", + "processing_status": "processing", + "current_job_id": "abc-123", + "frame_progress": { + "total_frames": 3615, + "processed_frames": 1200, + "progress_percent": 33.2 + } +} +``` + +--- + +## 3. Search + +### `POST /api/v1/search` +Vector/smart search across chunks. + +**Request**: +```json +{ + "query": "person talking about AI", + "mode": "smart", + "uuid": "384b0ff44aaaa1f1", + "limit": 10 +} +``` + +**Response**: +```json +{ + "results": [ + { + "uuid": "384b0ff44aaaa1f1", + "chunk_id": "chunk_1", + "chunk_type": "sentence", + "start_time": 10.5, + "end_time": 15.2, + "text": "AI is transforming...", + "score": 0.85 + } + ], + "query": "person talking about AI" +} +``` + +### `POST /api/v1/search/hybrid` +Hybrid search (vector + BM25). + +**Request**: +```json +{ + "query": "search term", + "limit": 10, + "uuid": "384b0ff44aaaa1f1", + "vector_weight": 0.7, + "bm25_weight": 0.3 +} +``` + +### `POST /api/v1/search/bm25` +BM25 full-text search. + +### `POST /api/v1/search/visual` +Search visual chunks by criteria. + +**Request**: +```json +{ + "uuid": "384b0ff44aaaa1f1", + "criteria": { + "object_class": "person", + "min_count": 1 + } +} +``` + +### `POST /api/v1/search/visual/class` +Search by object class. + +**Request**: +```json +{ + "uuid": "384b0ff44aaaa1f1", + "object_class": "person", + "min_count": 1, + "max_count": null +} +``` + +### `POST /api/v1/search/visual/density` +Search by object density. + +**Request**: +```json +{ + "uuid": "384b0ff44aaaa1f1", + "min_density": 0.5, + "max_density": null +} +``` + +### `POST /api/v1/search/visual/combination` +Search by object combination. + +**Request**: +```json +{ + "uuid": "384b0ff44aaaa1f1", + "combination": [["person", 2], ["car", 1]] +} +``` + +### `POST /api/v1/search/visual/stats` +Get visual chunk statistics. 
+
+**Request**:
+```json
+{ "uuid": "384b0ff44aaaa1f1" }
+```
+
+### `POST /api/v1/n8n/search`
+Search via n8n integration.
+
+### `POST /api/v1/n8n/search/bm25`
+BM25 search via n8n.
+
+### `POST /api/v1/n8n/search/hybrid`
+Hybrid search via n8n.
+
+### `POST /api/v1/n8n/search/smart`
+Smart search via n8n.
+
+---
+
+## 4. Video Details
+
+### `GET /api/v1/videos`
+List all registered videos with pagination.
+
+**Query Parameters**:
+- `page`: Page number (default: 1)
+- `page_size`: Items per page (default: 20)
+- `status`: Filter by status
+- `q`: Search query
+- `uuid`: Filter by UUID
+
+**Response**:
+```json
+{
+  "files": [
+    {
+      "file_uuid": "384b0ff44aaaa1f1",
+      "file_path": "/path/to/video.mp4",
+      "file_name": "video.mp4",
+      "file_type": "video",
+      "duration": 120.5,
+      "width": 1920,
+      "height": 1080,
+      "status": "completed",
+      "created_at": "2026-04-30T10:00:00Z",
+      "file_size": 52428800,
+      "total_frames": 3615
+    }
+  ],
+  "count": 1,
+  "page": 1,
+  "page_size": 20
+}
+```
+
+### `DELETE /api/v1/videos/:uuid`
+Delete a video and all associated data (faces, chunks, processor results).
+
+**Response**:
+```json
+{
+  "success": true,
+  "message": "File 384b0ff44aaaa1f1 unregistered successfully...",
+  "file_uuid": "384b0ff44aaaa1f1",
+  "deleted_face_detections": 150,
+  "deleted_processor_results": 8,
+  "deleted_chunks": 45
+}
+```
+
+### `GET /api/v1/videos/:uuid/details`
+Get detailed chunk information.
+
+**Query Parameters**:
+- `chunk_id`: Specific chunk ID (required)
+- `parent_id`: Parent chunk ID
+
+**Response**:
+```json
+{
+  "uuid": "384b0ff44aaaa1f1",
+  "chunk_id": "chunk_1",
+  "chunk_type": "sentence",
+  "frame_range": {
+    "start_frame": 315,
+    "end_frame": 456,
+    "duration_frames": 141,
+    "fps": 30.0
+  },
+  "reference_time": {
+    "start": 10.5,
+    "end": 15.2
+  },
+  "text_content": "AI is transforming...",
+  "summary_text": "Discussion about AI impact",
+  "speaker_ids": ["SPEAKER_0"],
+  "person_ids": ["face_100"]
+}
+```
+
+### `GET /api/v1/videos/:uuid/pre_chunks`
+List pre-processor chunks.
+
+**Query Parameters**:
+- `processor_type`: Filter by processor (asr, yolo, face, etc.)
+- `page`: Page number
+- `page_size`: Items per page
+
+### `GET /api/v1/progress/:uuid`
+Get processing progress for a video.
+
+---
+
+## 5. Identity & Binding
+
+### `POST /api/v1/identities/from-face`
+Register a global identity from face.json with multi-angle reference vectors.
+
+**Request**:
+```json
+{
+  "face_json_path": "/path/to/face.json",
+  "identity_name": "John Doe",
+  "schema": "dev"
+}
+```
+
+### `POST /api/v1/identities/from-person`
+Register identity from a person in a video.
+
+**Request**:
+```json
+{
+  "file_uuid": "384b0ff44aaaa1f1",
+  "person_id": "person_1",
+  "identity_name": "John Doe"
+}
+```
+
+### `GET /api/v1/identities`
+List all global identities.
+
+**Query Parameters**:
+- `page`: Page number
+- `page_size`: Items per page
+
+### `GET /api/v1/faces/candidates`
+List unbound face candidates.
+
+**Query Parameters**:
+- `file_uuid`: Filter by file
+- `min_confidence`: Minimum confidence (default: 0.5)
+- `page`, `page_size`: Pagination
+
+### `GET /api/v1/identities/:identity_id/faces`
+Get all faces for an identity.
+
+### `GET /api/v1/faces/:face_id/thumbnail`
+Get face thumbnail image (JPEG).
+
+### `POST /api/v1/identities/bind`
+Bind a face/speaker to an identity.
+
+**Request**:
+```json
+{
+  "identity_id": 1,
+  "binding_type": "face",
+  "binding_value": "face_100",
+  "source": "manual"
+}
+```
+
+### `POST /api/v1/identities/unbind`
+Unbind an identity.
+
+**Request**:
+```json
+{
+  "binding_type": "face",
+  "binding_value": "face_100"
+}
+```
+
+### `GET /api/v1/identity/:binding_type/:binding_value`
+Get identity info by binding.
+
+### `GET /api/v1/signals/unbound`
+List unbound signals.
+
+**Query Parameters**:
+- `uuid`: File UUID
+- `binding_type`: "face" or "speaker"
+
+### `GET /api/v1/signals/:uuid/:binding_type/:binding_value/timeline`
+Get signal timeline (all chunks for a face/speaker).
+
+### `POST /api/v1/identities/suggest-av`
+Suggest audio-visual bindings based on temporal overlap.
+
+**Request**:
+```json
+{
+  "file_uuid": "384b0ff44aaaa1f1",
+  "overlap_threshold": 0.6
+}
+```
+
+---
+
+## 6. Jobs & Rules
+
+### `GET /api/v1/jobs`
+List all monitor jobs.
+
+**Query Parameters**:
+- `page`, `page_size`: Pagination
+- `status`: Filter by status
+
+### `GET /api/v1/jobs/:job_id`
+Get job details with processor information.
+
+**Response**:
+```json
+{
+  "job_id": "1",
+  "asset_uuid": "384b0ff44aaaa1f1",
+  "rule": "default",
+  "status": "RUNNING",
+  "current_processor_id": "asr",
+  "frame_progress": {
+    "total_frames": 3615,
+    "processed_frames": 1200,
+    "progress_percent": 33.2
+  }
+}
+```
+
+### `GET /api/v1/rules/:rule/status`
+Get rule status with active jobs.
+
+---
+
+## 7. Stats & Configuration
+
+### `GET /api/v1/stats/ingest`
+Get ingestion statistics.
+
+**Response**:
+```json
+{
+  "total_videos": 50,
+  "total_chunks": 1200,
+  "sentence_chunks": 800,
+  "cut_chunks": 300,
+  "time_chunks": 100,
+  "searchable_chunks": 1150,
+  "chunks_with_visual": 450,
+  "chunks_with_summary": 200,
+  "pending_videos": 5
+}
+```
+
+### `GET /api/v1/stats/sftpgo`
+Get SFTPGo status and registered videos.
+
+### `GET /api/v1/stats/inference`
+Check inference engine health (Ollama, llama-server).
+
+**Response**:
+```json
+{
+  "ollama": {
+    "engine": "Ollama",
+    "model": "nomic-embed-text",
+    "status": "ok",
+    "latency_ms": 15
+  },
+  "llama_server": {
+    "engine": "llama-server",
+    "model": "gemma4_e4b_q5",
+    "status": "ok",
+    "latency_ms": 25
+  }
+}
+```
+
+### `POST /api/v1/config/cache`
+Toggle MongoDB cache.
+
+**Request**:
+```json
+{ "enabled": false }
+```
+
+**Response**:
+```json
+{
+  "success": true,
+  "cache_enabled": false,
+  "message": "Cache disabled"
+}
+```
+
+---
+
+## API Usage Patterns
+
+### 1. List Pattern
+```
+GET /api/v1/videos?page=1&page_size=20
+```
+- Supports pagination
+- Optional filters via query parameters
+- Returns a paginated envelope; for `/videos` this is `{ files: [...], count, page, page_size }`
+
+### 2. Detail Pattern
+```
+GET /api/v1/videos/:uuid/details?chunk_id=chunk_1
+```
+- Path parameter for resource identifier
+- Query parameters for sub-resource selection
+- Returns detailed object with nested structures
+
+### 3. Operation Pattern
+```
+POST /api/v1/assets/:uuid/process
+```
+- Action-oriented endpoint
+- Request body contains operation parameters
+- Returns operation status and job ID
+
+### 4. Application Pattern
+```
+POST /api/v1/identities/bind
+POST /api/v1/identities/suggest-av
+```
+- Complex workflows with multiple steps
+- Often involve external services (Python scripts, FFmpeg)
+- Return comprehensive results with metadata
+
+---
+
+## Error Responses
+
+| Status Code | Description |
+|-------------|-------------|
+| `400` | Bad Request - Invalid parameters |
+| `404` | Not Found - Resource doesn't exist |
+| `500` | Internal Server Error - Database/service failure |
+
+---
+
+## V4.0 Architecture Notes
+
+### Key Changes from V3.x
+- `video_uuid` → `file_uuid` (terminology update)
+- `person_identities` table **removed**
+- Face → Identity direct binding (no intermediate `person_id`)
+- 28 `person_id` APIs removed (except register/bind)
+- Chunk binding is automatic, via time alignment
+
+### Identity Model
+```
+Face Detection → Identity (direct binding)
+Speaker Detection → Identity (direct binding)
+```
+
+### Processing Pipeline
+```
+Register → Probe → ASR → CUT → YOLO → OCR → Face → Pose → ASRX → Visual Chunk
+```
diff --git a/docs_v1.0/ARCHITECTURE/API_WORKFLOW_WORDPRESS_N8N.md b/docs_v1.0/ARCHITECTURE/API_WORKFLOW_WORDPRESS_N8N.md
index c2fabbc..3b108e2 100644
--- a/docs_v1.0/ARCHITECTURE/API_WORKFLOW_WORDPRESS_N8N.md
+++ b/docs_v1.0/ARCHITECTURE/API_WORKFLOW_WORDPRESS_N8N.md
@@ -152,7 +152,7 @@ const job = await response.json(); // 狀態檢查 if (job.status === 'completed') { - return [{ json: { done: true, video_uuid: job.video_uuid } }]; + return [{ json: { done: true, file_uuid: job.file_uuid } }]; } else { return [{ json: { done: false, status: job.status } }]; } @@ -403,13 +403,13 @@ add_shortcode('momentry_search', function($atts) { $html .= '