feat: media API (video/bbox/thumbnail), UUID unification, dot matrix text, portal fixes, API dictionary V1.3

Warren
2026-05-06 13:34:49 +08:00
parent e75c4d6f07
commit 74b6182eba
197 changed files with 17511 additions and 8759 deletions


@@ -67,4 +67,8 @@ REDIS_CACHE_TTL_VIDEO_META=3600
# Multiple synonym files (comma-separated); overrides MOMENTRY_SYNONYM_FILE # Multiple synonym files (comma-separated); overrides MOMENTRY_SYNONYM_FILE
# MOMENTRY_SYNONYM_FILES=/path/to/first.json,/path/to/second.json # MOMENTRY_SYNONYM_FILES=/path/to/first.json,/path/to/second.json
# #
# Example file: docs/examples/custom_synonyms.json # Example file: docs/examples/custom_synonyms.json
# TMDb Integration (probe phase - auto-create identities from movie metadata)
TMDB_API_KEY=e9cde52197f6f8df4d9db99da93db1fb
MOMENTRY_TMDB_PROBE_ENABLED=true


@@ -357,6 +357,12 @@ cargo run --features player --bin momentry_player -- -o
- `MOMENTRY_CUT_TIMEOUT` - CUT timeout in seconds (default: 3600) - `MOMENTRY_CUT_TIMEOUT` - CUT timeout in seconds (default: 3600)
- `MOMENTRY_DEFAULT_TIMEOUT` - Default timeout (default: 7200) - `MOMENTRY_DEFAULT_TIMEOUT` - Default timeout (default: 7200)
### TMDb Integration (Face Clustering)
- `TMDB_API_KEY` - TMDb API key for movie metadata lookup (required for `MOMENTRY_TMDB_PROBE_ENABLED=true`)
- `MOMENTRY_TMDB_PROBE_ENABLED` - Enable TMDb probe during registration (default: `false`)
- Register phase: searches TMDb by filename, creates identities with tmdb_id/tmdb_profile
- Post-process phase: matches detected faces against TMDb identities via cosine similarity
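The post-process matching step can be pictured with a minimal sketch. It assumes embeddings are plain lists of floats and that a face is accepted only when its best score reaches a threshold; the function names and the `0.6` threshold are hypothetical and are not taken from the Momentry code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as no match
    return dot / (norm_a * norm_b)


def best_identity(face_embedding, identities, threshold=0.6):
    """Return the identity name with the highest similarity, or None.

    `identities` maps a name to a reference embedding (hypothetical shape);
    `threshold` is an illustrative cut-off, not a documented default.
    """
    best_name, best_score = None, threshold
    for name, reference in identities.items():
        score = cosine_similarity(face_embedding, reference)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name
```

A face whose best score stays below the threshold is left unbound rather than forced onto the nearest identity.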
### Synonym Expansion ### Synonym Expansion
- `MOMENTRY_SYNONYM_FILES` - Comma-separated paths to synonym JSON files (e.g., `data/english_synonyms.json,data/llm_synonyms.json`) - `MOMENTRY_SYNONYM_FILES` - Comma-separated paths to synonym JSON files (e.g., `data/english_synonyms.json,data/llm_synonyms.json`)
- `MOMENTRY_SYNONYM_FILE` - Single synonym JSON file path (deprecated, use above) - `MOMENTRY_SYNONYM_FILE` - Single synonym JSON file path (deprecated, use above)
@@ -372,6 +378,7 @@ cargo run --features player --bin momentry_player -- -o
- Monitor directory is a separate system (not Rust) - Monitor directory is a separate system (not Rust)
- PythonExecutor provides unified script execution with timeout support - PythonExecutor provides unified script execution with timeout support
- Redis 1.0.x for improved performance - Redis 1.0.x for improved performance
- FaceNet CoreML model (`models/facenet512.mlpackage`) replaces InsightFace for embedding extraction (MIT license, ANE-accelerated)
### LLM Synonym Generation ### LLM Synonym Generation

Cargo.lock (generated)

@@ -2380,6 +2380,7 @@ dependencies = [
"tempfile", "tempfile",
"thiserror 1.0.69", "thiserror 1.0.69",
"tokio", "tokio",
"tokio-util",
"tower 0.4.13", "tower 0.4.13",
"tower-http 0.5.2", "tower-http 0.5.2",
"tracing", "tracing",


@@ -80,6 +80,7 @@ crossterm = "0.28"
# Terminal # Terminal
atty = "0.2" atty = "0.2"
tokio-util = { version = "0.7.18", features = ["io"] }
# System # System


@@ -2,8 +2,8 @@
document_type: "reference_doc" document_type: "reference_doc"
service: "MOMENTRY_CORE" service: "MOMENTRY_CORE"
title: "Momentry Core API 字典 V1.0.0" title: "Momentry Core API 字典 V1.0.0"
date: "2026-05-01" date: "2026-05-06"
version: "V1.0" version: "V1.3"
status: "active" status: "active"
owner: "Warren" owner: "Warren"
created_by: "OpenCode" created_by: "OpenCode"
@@ -24,160 +24,150 @@ ai_query_hints:
related_documents: related_documents:
- "API_V1.0.0/MOMENTRY_CORE_API_V1.0.0.md" - "API_V1.0.0/MOMENTRY_CORE_API_V1.0.0.md"
- "API_V1.0.0/API_USAGE_DEMO_V1.0.0.md" - "API_V1.0.0/API_USAGE_DEMO_V1.0.0.md"
- "API_V1.0.0/API_REFERENCE_v1.0.0.20260501md.md"
- "API_V1.0.0/CHUNK_DEFINITION_V1.0.0.md" - "API_V1.0.0/CHUNK_DEFINITION_V1.0.0.md"
- "API_V1.0.0/VECTOR_SPEC_V1.0.0.md" - "API_V1.0.0/VECTOR_SPEC_V1.0.0.md"
--- ---
# Momentry Core API 字典級全量文件 V1.0.0 # Momentry Core API 字典 V1.0.0
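## Key term definitions ## Key term definitions
| Term | Definition | | Term | Definition |
|------|------| |------|------|
| Public API | Standard interface for the frontend and external systems (58 endpoints) | | Public API | Standard interface for the frontend and external systems |
| Internal API | For internal system flows or status queries (5 endpoints) | | Internal API | For internal system flows or status queries |
| Admin API | Administrators only (5 endpoints) | | Admin API | Administrators only |
| file_uuid | 32-character SHA256 file identifier | | file_uuid | 32-character birth UUID (MAC + time + path + filename) |
| RESTful | Resource-centric API design style | | identity_uuid | 32-character UUIDv5 (source + external_id) |
| RESTful | Resource-centric API design style (collections plural, resources singular) |
## 📊 Endpoint Statistics ## Endpoint statistics
| Category | Count | Description | | Category | Count | Description |
|---|---|---| |---|---|---|
| **Public** | 58 | Standard interface for the frontend and external systems | | Public | 40 | Standard interface for the frontend and external systems |
| ⚠️ **Internal** | 5 | Internal system flows or status queries (e.g. Probe, SFTPGo) | | Internal | 4 | Internal system flows or status queries |
| 🔒 **Admin** | 5 | Administrators only (e.g. Resources, Config Cache) | | Admin | 3 | Administrators only |
| **Total** | **67** | All registered routes (`gen-traces` removed) | | Health | 2 | Service health checks |
| **Total** | **48** | All registered routes |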
| Item | Content | ## Design principles
|------|------|
| Created by | OpenCode | ### 1. RESTful naming conventions
| Created on | 2026-05-01 | - Collection (plural): `/api/v1/files`, `/api/v1/identities`
| Total endpoints | **68** | - Resource (singular): `/api/v1/file/:file_uuid`, `/api/v1/identity/:identity_uuid`
| Document version | V1.1 (Route Fixes + Arch Notes) | - Action on resource: `/api/v1/identity/:identity_uuid/bind`
### 2. File-Centric
- Each media file is uniquely identified by a 32-character UUID (`file_uuid`)
- File is the root node of all data; every Chunk and Job belongs to a specific File
### 3. Global Identity
- An Identity links across files and is not limited to a single file
- Face → Identity bindings are direct foreign keys, managed through bind/unbind/mergeinto (V4.0)
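As an illustration of the `identity_uuid` definition (a 32-character UUIDv5 built from source + external_id), here is a minimal sketch. The namespace value and the `source:external_id` join format are assumptions made for the example; the real encoding rules live in the UUID encoding document and may differ.

```python
import uuid

# Hypothetical namespace: any fixed UUID works, as long as every producer uses the same one.
IDENTITY_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/momentry/identity")


def identity_uuid(source, external_id):
    """Derive a deterministic 32-character hex UUIDv5 from source + external_id.

    The "source:external_id" join is an assumed convention for this sketch.
    """
    return uuid.uuid5(IDENTITY_NAMESPACE, f"{source}:{external_id}").hex
```

Because UUIDv5 is a hash of namespace and name, registering the same external record twice yields the same `identity_uuid`, which is what lets identities stay stable across files.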
--- ---
## 🚀 設計原則 (Design Principles) ## 1. 系統與認證
### 1. Clear API (介面清晰化)
* **去蕪存菁**: 嚴格區分 **Public** (公開) 與 **Internal** (內部) 端點。舊版冗餘路徑(如 `/api/v1/videos`, `/api/v1/probe`)已全面移除或合併。
* **標準化回應**: 所有列表型 API 均回傳統一結構 `{ "success": true, "data": [...], "total": N }`
* **命名規範**: 採用 RESTful 風格,資源以複數名詞或明確動作命名(如 `files`, `identities`)。
### 2. File-Centric (以檔案為核心)
* **唯一識別**: 每個媒體檔案(影片/圖片/音訊)均由 **32 碼 UUID** (`file_uuid`) 唯一標識。
* **生命週期**: `File` 是所有資料的根節點。所有的 `Chunk` (片段), `Snapshot` (快照), `Jobs` (任務) 皆隸屬於特定的 `File`
* **操作模式**: 前端應優先呼叫 `GET /api/v1/files` 取得清單,再透過 `POST /api/v1/files/:uuid/snapshots/migrate` 載入詳細資源。
### 4. Trace Aggregation (軌跡聚合獨立化)
* **架構**: `trace_face` 聚合由獨立 Python 腳本 `scripts/trace_face_aggregator.py` 處理,**不**內嵌於 Rust DB 層。
* **流程**: Face Processor (Python) 輸出離散幀級資料到 `face_detections` 表 → Rust Worker 排程 `trace_face_aggregator.py` → 該腳本讀取 DB、按 `face_id` 分組聚合、寫入 `pre_chunks` (source_type=`trace_face`)。
* **設計理由**: 保持 Rust 排程層輕量化,軌跡聚合邏輯留在 Python 層統一維護,便於未來調整聚合演算法 (如 IOU 門檻、時間間隔合併等) 而無需重新編譯 Rust。
### 5. Global Identity (全域身份識別)
* **跨檔案關聯**: `Identity` 代表一個獨立的人物或角色,不受單一檔案限制。
* **綁定機制 (Binding)**: 透過 `POST /api/v1/identities/bind`,我們可以將多個檔案中偵測到的臉部 (`face`) 或聲音 (`speaker`) 聚合到同一個 `Identity` 下。
* **資料聚合**: 查詢某個 `Identity` 即可看到該人物在所有歷史檔案中的軌跡 (`/api/v1/identities/:uuid/files`)。
---
## 1. 系統與認證 (System & Auth)
| 方法 | 路徑 | 狀態 | | 方法 | 路徑 | 狀態 |
|---|---|---| |------|------|------|
| `GET` | `/health` | ✅ Public | | `GET` | `/health` | Health |
| `GET` | `/health/detailed` | ✅ Public | | `GET` | `/health/detailed` | Health |
| `POST` | `/api/v1/auth/login` | Public | | `POST` | `/api/v1/auth/login` | Public |
| `POST` | `/api/v1/auth/logout` | Public | | `POST` | `/api/v1/auth/logout` | Public |
## 2. 檔案管理 (Files & Assets) ## 2. 檔案管理 (Files)
| 方法 | 路徑 | 狀態 |
|---|---|---|
| `GET` | `/api/v1/files` | ✅ Public |
| `GET` | `/api/v1/files/scan` | ✅ Public |
| `POST` | `/api/v1/files/register` | ✅ Public |
| `POST` | `/api/v1/unregister` | ✅ Public |
| `GET` | `/api/v1/files/:file_uuid` | ✅ Public |
| `GET` | `/api/v1/files/:file_uuid/identities` | ✅ Public |
| `GET` | `/api/v1/files/:file_uuid/snapshots` | ✅ Public |
| `GET` | `/api/v1/files/:file_uuid/snapshots/status` | ✅ Public |
| `POST` | `/api/v1/files/:file_uuid/snapshots/migrate` | ✅ Public |
| `POST` | `/api/v1/files/:file_uuid/snapshots/teardown` | ✅ Public |
## 3. 影片與任務 (Videos & Jobs)
| 方法 | 路徑 | 狀態 | | 方法 | 路徑 | 狀態 |
|---|---|---| |------|------|------|
| `DELETE` | `/api/v1/videos/:file_uuid` | Public | | `GET` | `/api/v1/files` | Public |
| `GET` | `/api/v1/videos/:file_uuid/details` | Public | | `GET` | `/api/v1/files/scan` | Public |
| `GET` | `/api/v1/videos/:file_uuid/pre_chunks` | Public | | `POST` | `/api/v1/files/register` | Public |
| `GET` | `/api/v1/progress/:file_uuid` | Public | | `POST` | `/api/v1/files/unregister` | Public |
| `GET` | `/api/v1/jobs` | Public | | `GET` | `/api/v1/file/:file_uuid` | Public |
| `GET` | `/api/v1/jobs/:job_id` | Public | | `GET` | `/api/v1/file/:file_uuid/probe` | Public |
| `GET` | `/api/v1/rules/:rule/status` | Public | | `POST` | `/api/v1/file/:file_uuid/process` | Public |
| `GET` | `/api/v1/files/:file_uuid/probe` | Public | | `GET` | `/api/v1/file/:file_uuid/identities` | Public |
| `POST` | `/api/v1/files/:file_uuid/process` | Public | | `GET` | `/api/v1/file/:file_uuid/chunks` | Public |
| `GET` | `/api/v1/assets/:uuid/status` | ⚠️ Internal | | `GET` | `/api/v1/file/:file_uuid/thumbnail?frame=&x=&y=&w=&h=` | Public |
| `POST` | `/api/v1/resources/register` | 🔒 Internal | | `POST` | `/api/v1/file/:file_uuid/face_trace/sortby` | Public |
| `POST` | `/api/v1/resources/heartbeat` | 🔒 Internal |
| `GET` | `/api/v1/resources` | 🔒 Internal | ## 3. 管線與任務 (Pipeline & Jobs)
| 方法 | 路徑 | 狀態 |
|------|------|------|
| `GET` | `/api/v1/progress/:file_uuid` | Public |
| `GET` | `/api/v1/jobs` | Public |
| `GET` | `/api/v1/job/:job_id` | Public |
| `GET` | `/api/v1/rule/:rule_id/status` | Public |
| `POST` | `/api/v1/resource/register` | Internal |
| `POST` | `/api/v1/resource/heartbeat` | Internal |
| `GET` | `/api/v1/resources` | Internal |
## 4. 搜尋 (Search) ## 4. 搜尋 (Search)
| 方法 | 路徑 | 狀態 |
|---|---|---|
| `POST` | `/api/v1/search` | ✅ Public |
| `POST` | `/api/v1/search/bm25` | ✅ Public |
| `POST` | `/api/v1/search/hybrid` | ✅ Public |
| `POST` | `/api/v1/search/visual` | ✅ Public |
| `POST` | `/api/v1/search/visual/class` | ✅ Public |
| `POST` | `/api/v1/search/visual/density` | ✅ Public |
| `POST` | `/api/v1/search/visual/combination` | ✅ Public |
| `POST` | `/api/v1/search/visual/stats` | ✅ Public |
## 5. 身份與綁定 (Identity & Binding)
| 方法 | 路徑 | 狀態 | | 方法 | 路徑 | 狀態 |
|---|---|---| |------|------|------|
| `GET` | `/api/v1/identities` | Public | | `POST` | `/api/v1/search` | Public |
| `GET` | `/api/v1/identities/:uuid` | Public | | `POST` | `/api/v1/search/bm25` | Public |
| `GET` | `/api/v1/identities/:uuid/files` | Public | | `POST` | `/api/v1/search/hybrid` | Public |
| `GET` | `/api/v1/identities/:uuid/chunks` | Public | | `POST` | `/api/v1/search/smart` | Public |
| `GET` | `/api/v1/identities/:identity_id/faces` | Public | | `POST` | `/api/v1/search/universal` | Public |
| `POST` | `/api/v1/identities/from-person` | Public | | `POST` | `/api/v1/search/frames` | Public |
| `POST` | `/api/v1/identities/from-face` | Public | | `POST` | `/api/v1/search/visual` | Public |
| `POST` | `/api/v1/identities/bind` | Public | | `POST` | `/api/v1/search/visual/class` | Public |
| `POST` | `/api/v1/identities/unbind` | Public | | `POST` | `/api/v1/search/visual/density` | Public |
| `POST` | `/api/v1/search/visual/combination` | Public |
| `POST` | `/api/v1/search/visual/stats` | Public |
## 5. 身份管理 (Identity)
## 6. 臉部 (Face)
| 方法 | 路徑 | 狀態 | | 方法 | 路徑 | 狀態 |
|---|---|---| |------|------|------|
| `GET` | `/api/v1/face/list` | Public | | `GET` | `/api/v1/identities` | Public |
| `GET` | `/api/v1/face/:face_id` | Public | | `POST` | `/api/v1/identity` | Public |
| `DELETE` | `/api/v1/face/:face_id` | Public | | `GET` | `/api/v1/identity/:identity_uuid` | Public |
| `POST` | `/api/v1/face/recognize` | Public | | `DELETE` | `/api/v1/identity/:identity_uuid` | Public |
| `POST` | `/api/v1/face/register` | Public | | `GET` | `/api/v1/identity/:identity_uuid/files` | Public |
| `POST` | `/api/v1/face/search` | Public | | `GET` | `/api/v1/identity/:identity_uuid/chunks` | Public |
| `GET` | `/api/v1/faces/candidates` | Public | | `POST` | `/api/v1/identity/:identity_uuid/bind` | Public |
| `GET` | `/api/v1/files/:file_uuid/faces/:face_id/thumbnail` | Public | | `POST` | `/api/v1/identity/:identity_uuid/unbind` | Public |
| `GET` | `/api/v1/signals/unbound` | Public | | `POST` | `/api/v1/identity/:from_uuid/mergeinto` | Public |
| `GET` | `/api/v1/signals/:uuid/:binding_type/:binding_value/timeline` | ✅ Public |
## 6. 臉部 (Faces)
| 方法 | 路徑 | 狀態 |
|------|------|------|
| `GET` | `/api/v1/faces/candidates` | Public |
## 7. 代理人 (Agents) ## 7. 代理人 (Agents)
| 方法 | 路徑 | 狀態 |
|---|---|---|
| `POST` | `/api/v1/agents/translate` | ✅ Public |
| `POST` | `/api/v1/agents/5w1h/analyze` | ✅ Public |
| `POST` | `/api/v1/agents/5w1h/batch` | ✅ Public |
| `GET` | `/api/v1/agents/5w1h/status` | ✅ Public |
| `POST` | `/api/v1/agents/identity/analyze` | ✅ Public |
| `POST` | `/api/v1/agents/identity/suggest` | ✅ Public |
| `GET` | `/api/v1/agents/identity/status` | ✅ Public |
| `POST` | `/api/v1/agents/suggest/merge` | ✅ Public |
## 8. 狀態與統計 (Stats)
| 方法 | 路徑 | 狀態 | | 方法 | 路徑 | 狀態 |
|---|---|---| |------|------|------|
| `GET` | `/api/v1/stats/ingest` | Public | | `POST` | `/api/v1/agents/translate` | Public |
| `GET` | `/api/v1/stats/sftpgo` | ⚠️ Internal | | `POST` | `/api/v1/agents/identity/analyze` | Public |
| `GET` | `/api/v1/stats/inference` | ⚠️ Internal | | `POST` | `/api/v1/agents/identity/suggest` | Public |
| `POST` | `/api/v1/config/cache` | 🔒 Internal | | `GET` | `/api/v1/agents/identity/status` | Public |
| `GET` | `/api/v1/lookup` | Public | | `POST` | `/api/v1/agents/suggest/merge` | Public |
| `POST` | `/api/v1/agents/5w1h/analyze` | Public |
| `POST` | `/api/v1/agents/5w1h/batch` | Public |
| `GET` | `/api/v1/agents/5w1h/status` | Public |
## 8. 狀態與管理 (Stats & Admin)
| 方法 | 路徑 | 狀態 |
|------|------|------|
| `GET` | `/api/v1/stats/sftpgo` | Internal |
| `GET` | `/api/v1/stats/inference` | Internal |
| `POST` | `/api/v1/config/cache` | Admin |
---
## Change history
| Version | Date | Author | Description |
|------|------|------|------|
| V1.3 | 2026-05-06 | OpenCode | Added the `face_thumbnail` endpoint (on-the-fly ffmpeg cropping) and the `face_trace/sortby` endpoint; portal fixes for hardcoded URLs, API keys, and legacy endpoints |
| V1.1 | 2026-05-01 | OpenCode | Route fixes + arch notes |
| V1.0 | 2026-04 | OpenCode | Initial version |
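A client-side sketch of calling the thumbnail route listed in section 2 (`/api/v1/file/:file_uuid/thumbnail?frame=&x=&y=&w=&h=`). The base URL is hypothetical, and treating `x`, `y`, `w`, `h` as a pixel crop box is an assumption; only the path and the parameter names come from the route table.

```python
from urllib.parse import urlencode


def thumbnail_url(base_url, file_uuid, frame, x, y, w, h):
    """Build the thumbnail URL for one frame and crop box.

    base_url is a placeholder for wherever the service is deployed.
    """
    query = urlencode({"frame": frame, "x": x, "y": y, "w": w, "h": h})
    return f"{base_url.rstrip('/')}/api/v1/file/{file_uuid}/thumbnail?{query}"
```

Keeping the query construction in one helper avoids the hardcoded URLs that the V1.3 portal fixes removed.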


@@ -0,0 +1,148 @@
---
document_type: "experiment_report"
service: "MOMENTRY_CORE"
title: "Child Detection and Age Estimation Model Selection Report"
date: "2026-05-06"
version: "V1.0"
status: "completed"
owner: "Warren"
created_by: "OpenCode"
---
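# Child Detection and Age Estimation Model Selection Report
## 1. Experiment goal
Using the Face Trace data in Momentry Core, find child characters among the non-leading cast and evaluate the feasibility of three age-estimation approaches:
1. **DeepFace AgeNet**: deep-learning age estimation (MIT License)
2. **Apple Vision head-to-shoulder ratio**: infer age from the ratio of head width to shoulder width (built into the OS)
3. **MiVOLO**: age model from HuggingFace (Apache 2.0)
## 2. Experiment environment
| Item | Value |
|------|------|
| Test video | Charade (1963), 113 min, 24fps |
| Face detections | 6182 faces, 2347 traces |
| Face detection | Apple Vision `VNDetectFaceRectanglesRequest` (swift_face) |
| Face embedding | CoreML FaceNet512 |
| Sampling interval | 60 frames (2.5 s) |
| Body pose detection | Apple Vision `VNDetectHumanBodyPoseRequest` |
## 3. Method
### 3.1 Age estimation for main characters
From the 2347 traces, select the 12 main traces with face_count ≥ 5 and extract each trace's middle frame for DeepFace age estimation and the Apple Vision head-to-shoulder ratio calculation.
### 3.2 Search for non-main characters
Search for traces with small faces (< 60px) and low face_count (≤ 2) to find extras, who may include children.
### 3.3 Ski resort water pistol scene
The opening of Charade, set at the Megève ski resort, has a scene in which a boy sprays the female lead with a water pistol. This scene was scanned densely (every 30 frames) for child faces.
## 4. Model selection results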
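### 4.1 Model availability
| Approach | Available | Speed/face | License | Conclusion |
|------|------|----------|---------|------|
| **DeepFace AgeNet** | ✓ | 0.2s (after caching) | MIT | **Recommended** |
| Apple Vision age | ✗ | n/a | Built into the OS | Vision has no age API |
| Apple Vision head-to-shoulder ratio | ✓ | Real time | Built into the OS | Adult/child classification only |
| MiVOLO | ✗ | n/a | Apache 2.0 | Model unavailable (not found on HuggingFace) |
### 4.2 DeepFace age estimation (12 main characters sampled)
| Trace | Faces | Appears at | Face width | DeepFace age | Gender | Emotion |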
|-------|-------|----------|------|-------------|------|------|
| 0 | 45 | 35s | 160px | 35 | Man | sad |
| 24 | 6 | 708s | 100px | 34 | Man | neutral |
| 26 | 5 | 728s | 100px | 31 | Woman | neutral |
| 39 | 14 | 760s | 120px | 30 | Man | sad |
| 43 | 12 | 765s | 120px | 25 | Man | sad |
| 45 | 8 | 775s | — | 36 | Woman | neutral |
| 46 | 9 | 795s | — | 29 | Woman | neutral |
| 48 | 6 | 818s | 140px | 50 | Man | angry |
| 76 | 13 | 908s | — | 29 | Man | sad |
| 87 | 5 | 972s | — | 35 | Man | sad |
| 103 | 7 | 1022s | — | 35 | Woman | neutral |
| 132 | 5 | 1158s | — | 27 | Man | surprise |
**Age range: 25–50; all adults.**
### 4.3 Apple Vision head-to-shoulder ratio
| Frame | Face width | Shoulder width | Head-to-shoulder ratio | DeepFace age | Scene |
|-------|------|------|--------|-------------|------|
| 840 | 160px | 407px | **0.39** | 35 | Ski resort (lead) |
| 17460 | 100px | 354px | **0.28** | 31 | Mid-film scene |
| 18360 | 120px | 306px | **0.39** | 25 | Mid-film scene |
| 19620 | 140px | 425px | **0.33** | 50 | Oldest character |
| 27780 | 110px | 381px | **0.29** | 27 | Late-film scene |
**Head-to-shoulder ratio range: 0.28–0.39 (all within the adult range). Children are expected to exceed 0.6.**
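The ratio in the table is simply face width divided by shoulder width. A minimal sketch that reproduces the five rows follows; the function names are hypothetical, and the `0.6` threshold is the report's expectation for children, not a validated cut-off.

```python
CHILD_RATIO_THRESHOLD = 0.6  # the report expects children above this value


def head_shoulder_ratio(face_width_px, shoulder_width_px):
    """Face width divided by shoulder width, both in pixels."""
    if shoulder_width_px <= 0:
        raise ValueError("shoulder width must be positive")
    return face_width_px / shoulder_width_px


def classify(ratio, threshold=CHILD_RATIO_THRESHOLD):
    """Binary adult/child label; only meaningful for close shots."""
    return "child" if ratio > threshold else "adult"


# (frame, face width px, shoulder width px) taken from the table above
SAMPLES = [
    (840, 160, 407),
    (17460, 100, 354),
    (18360, 120, 306),
    (19620, 140, 425),
    (27780, 110, 381),
]

ratios = {frame: round(head_shoulder_ratio(fw, sw), 2) for frame, fw, sw in SAMPLES}
```

Rounded to two decimals, the computed ratios match the table values, and every sample classifies as adult.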
### 4.4 Non-main cast (extras)
| Trace | Faces | Face width | DeepFace age | Gender | Head-to-shoulder ratio | Scene |
|-------|-------|------|-------------|------|--------|------|
| 129 | 1 | 42px | 37 | Man | 0.13 | Distant crowd |
| 172 | 2 | 51px | 31 | Man | 0.22 | Distant crowd |
| 304 | 2 | 47px | 41 | Man | 0.14 | Distant crowd |
| 57 | 1 | 52px | 35 | Woman | n/a | Distant crowd |
| 322 | 1 | 52px | 34 | Man | 0.18 | Distant crowd |
**All adults. Extras in long shots show even lower ratios (0.13–0.22), because camera distance affects the ratio more than body proportions do.**
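## 5. Water pistol scene search results
**The child was found, but his age could not be estimated reliably.**
| Parameter | Value |
|------|------|
| Video | Charade (1963) |
| Scene | Outdoor restaurant at the Megève ski resort |
| Time | Frame 2450 (102 s / 1:42) |
| Face size | **29 × 29 px** |
| Swift Face detection | ✓ Detected (no trace_id assigned; single frame) |
| DeepFace age | 33, Man ❌ **misclassified** (insufficient resolution) |
| Apple Vision head-to-shoulder ratio | Cannot be computed (body occluded) |
### Cause of the misclassification
A 29×29px face is far below the minimum resolution that age-estimation models need (typically ≥ 50×50px). In a long shot the child's face is too small for the neural network to extract enough age-related features, so:
- DeepFace misclassifies the child as an adult
- The head-to-shoulder ratio is affected more by distance than by actual age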
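One way to avoid this kind of misclassification is to gate age estimation on face size. The sketch below assumes the 50px minimum quoted above and the 24fps frame rate from the environment table; the helper names are hypothetical.

```python
MIN_FACE_SIDE_PX = 50  # minimum side length quoted in the report


def can_estimate_age(width_px, height_px, min_side=MIN_FACE_SIDE_PX):
    """True when the shorter side of the face crop meets the minimum."""
    return min(width_px, height_px) >= min_side


def frame_to_seconds(frame, fps):
    """Convert a 0-based frame index to a timestamp in seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame / fps
```

With these helpers, the 29×29px face at frame 2450 would be skipped instead of being labelled as a 33-year-old, while the 160px lead face would pass.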
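## 6. Conclusions and recommendations
| Finding | Details |
|------|------|
| Charade has no child main characters | The whole cast is adult; DeepFace ages range 25–50 |
| Water pistol child found | Frame 2450, 102 s, but at 29px the face is too small for age estimation |
| DeepFace is viable | MIT license, 0.2s/face, suitable for faces ≥ 50px |
| Apple Vision head-to-shoulder ratio | Suitable only for adult/child classification in close shots (not precise age) |
| MiVOLO | Unavailable (model not found on HuggingFace) |
### Recommendations
1. **Integrate DeepFace** age estimation into the `face_processor.py` pipeline and tag ages for faces ≥ 50px
2. **Keep the head-to-shoulder ratio** as a secondary check (binary adult/child classification)
3. **Shorten the sampling interval** from 60 frames to 10–15 frames to capture more briefly appearing characters
4. **To test child age estimation**, use `Alice Comedies (1926)` from the library; it has close-ups of a young girl (Virginia Davis, aged 6–8) with faces of 150px or more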
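The cost of recommendation 3 can be estimated with simple arithmetic. This sketch assumes the 24fps figure from the environment table and that processing cost scales linearly with the number of sampled frames; both helper names are hypothetical.

```python
def sampling_interval_seconds(interval_frames, fps):
    """Time between two sampled frames, in seconds."""
    return interval_frames / fps


def cost_multiplier(old_interval_frames, new_interval_frames):
    """How many times more frames are sampled after shortening the interval."""
    return old_interval_frames / new_interval_frames
```

At 24fps the current 60-frame interval is 2.5 s, a 15-frame interval is 0.625 s, and dropping to 10 frames samples six times as many frames.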
---
## Appendix: test data
| File | Path |
|------|------|
| DeepFace age JSON | `output_dev/experiments/age_benchmark/age_benchmark_report.json` |
| Head-to-shoulder ratio JSON | `output_dev/experiments/head_shoulder/head_shoulder_report.json` |
| Water pistol scene frame | `output_dev/experiments/head_shoulder/child_f2450.jpg` |
| Age benchmark script | `scripts/age_benchmark.py` |
| Head-to-shoulder ratio script | `scripts/head_shoulder_quick.py` |
| Face trace sort API | `POST /api/v1/file/:file_uuid/face_trace/sortby` |


@@ -1,8 +1,8 @@
--- ---
document_type: "spec" document_type: "spec"
service: "MOMENTRY_CORE" service: "MOMENTRY_CORE"
title: "Chunk 定義 V1.0.0" title: "Story Parent-Child Chunk Rules V1.0"
date: "2026-05-01" date: "2026-05-05"
version: "V1.0" version: "V1.0"
status: "active" status: "active"
owner: "Warren" owner: "Warren"
@@ -11,188 +11,288 @@ tags:
- "momentry" - "momentry"
- "core" - "core"
- "chunk" - "chunk"
- "v1.0.0" - "story"
- "chunk-type"
- "pre-chunk"
- "parent-child" - "parent-child"
- "data-structure" - "v1.0"
ai_query_hints: ai_query_hints:
- "chunk 的定義與結構" - "Story parent-child chunk generation rules"
- "pre_chunk 與 chunk 的關係" - "CUT scene → parent chunk, ASR sentence → child chunk"
- "parent_chunk 與 child_chunk 的關係" - "boundary overlap: partial match enriches child context"
- "ChunkType 包含哪些類型Sentence/Cut/Visual/Trace/Story" - "parent_summary template + child_summary template"
- "chunk 的巢狀結構與 Rule 組合規則" - "children per parent distribution"
- "chunk 如何對應到 file_uuid 與幀區間"
- "chunk 的搜尋用途與向量儲存方式"
- "chunk 與 pre_chunk 的雙層資料架構"
related_documents: related_documents:
- "PROCESSOR_SELECTION_V1.0.0.md" - "../CHUNK_DEFINITION_V1.0.0.md"
- "VECTOR_SPEC_V1.0.0.md" - "../DUAL_EMBEDDING_PIPELINE_V1.0.0.md"
- "PROCESSORS/ASR_V1.0.0.md" - "../PROCESSORS/ASR_V1.0.0.md"
- "PROCESSORS/CUT_V1.0.0.md" - "../PROCESSORS/CUT_V1.0.0.md"
- "PROCESSORS/FACE_V1.0.0.md"
--- ---
# Chunk 定義 V1.0.0 # Story Parent-Child Chunk Rules V1.0
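| Item | Content | ## Core concepts
- **Parent chunk** = all dialogue within a CUT scene boundary → one scene narrative
- **Child chunk** = a single ASR sentence → one line of dialogue
- **Boundary overlap** = a sentence that straddles a scene boundary → belongs to both the preceding and the following parent
## Matching rules
### Rule 1: Fully-Contained Matching
```
The ASR sentence lies entirely within the CUT scene's time range
→ seg.start >= scene.start_time AND seg.end <= scene.end_time
→ add it to that scene's children list
```
### Rule 2: Boundary Overlap (all parents)
```
For every parent chunk (even one with only 1 child):
→ find ASR sentences that partially overlap the scene's time range
→ seg.start < scene.end_time AND seg.end > scene.start_time
→ AND not already matched by Rule 1 (not fully contained)
→ add them to that scene's children list
```
Boundary overlap lets a child chunk belong to both the preceding and the following parent, which gives it more context.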
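Rules 1 and 2 can be sketched as follows. This is a minimal illustration, not the project's generator: the `Segment` and `Scene` types and the `"full"`/`"partial"` labels are hypothetical, and "not matched by Rule 1" is read as "not fully contained in this scene".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    start: float
    end: float
    text: str


@dataclass(frozen=True)
class Scene:
    start_time: float
    end_time: float


def fully_contained(seg, scene):
    """Rule 1: the sentence lies entirely inside the scene."""
    return seg.start >= scene.start_time and seg.end <= scene.end_time


def partially_overlaps(seg, scene):
    """Rule 2: the sentence overlaps the scene but is not fully contained."""
    overlaps = seg.start < scene.end_time and seg.end > scene.start_time
    return overlaps and not fully_contained(seg, scene)


def match_children(scenes, segments):
    """Return, per scene, a list of (segment, match kind) pairs.

    Every scene takes part in the overlap check, including scenes that
    end up with no fully-contained children.
    """
    result = []
    for scene in scenes:
        children = []
        for seg in segments:
            if fully_contained(seg, scene):
                children.append((seg, "full"))
            elif partially_overlaps(seg, scene):
                children.append((seg, "partial"))
        result.append(children)
    return result


scenes = [Scene(0.0, 10.0), Scene(10.0, 20.0)]
segments = [Segment(1.0, 3.0, "a"), Segment(9.0, 11.0, "b"), Segment(12.0, 15.0, "c")]
matched = match_children(scenes, segments)
```

In this toy input, sentence `b` straddles the cut at 10 s, so it is attached to both scenes: four child rows in total for three unique sentences.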
### Rule 3: Scene Filter
```
CUT scene duration < 1s → skip (the scene is too short to be meaningful)
```
## Parent Summary template
```
[{start}s-{end}s, {duration}s]
Cast: {character_list}
Total dialogue: N lines, W words
Speakers: {name} (N lines): "sample text..."
```
## Child Summary template
```
[{start}s-{end}s] {speaker_name}: "{asr_text}"
```
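The two templates can be rendered with plain string formatting. The sketch below is an assumption-laden illustration: the function names are hypothetical, the parent layout follows the worked examples in this document (header and cast on one line), the optional `Speakers:` line is omitted, and words are counted by splitting on whitespace.

```python
def child_summary(start, end, speaker_name, asr_text):
    """Render one child chunk: [{start}s-{end}s] {speaker_name}: "{asr_text}"."""
    return f'[{start}s-{end}s] {speaker_name}: "{asr_text}"'


def parent_summary(start, end, duration, cast, sentences):
    """Render the parent header plus dialogue totals.

    duration is passed in rather than computed, because scene durations
    are rounded from sub-second values.
    """
    lines = len(sentences)
    words = sum(len(sentence.split()) for sentence in sentences)
    return (
        f"[{start}s-{end}s, {duration}s] Cast: {', '.join(cast)}.\n"
        f"Total dialogue: {lines} lines, {words} words."
    )
```

Only the child summary text is the embedding target; the parent summary gives the scene-level context.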
### Embedding Target
Child summary text → Ollama nomic-embed-text-v2-moe → 768D vector → pgvector
## Data example: Charade (1963), 113-min feature film
### Input
| Source | Count | Notes |
|------|------|------|
| ASR segments | **1,629** | Whisper small, English subtitles |
| ASR with text | 1,629 | All have text |
| ASR total duration | 6,760s (113 min) | |
| CUT scenes | **1,331** | PySceneDetect scene cuts |
| CUT scenes ≥ 1s | 1,200 | Valid scenes after filtering |
| CUT mean duration | 5.2s | Mean scene length |
| CUT scene gap (unmatched) | 131 | Scenes < 1s filtered out |
### 輸出 (V2.1 — boundary overlap for ALL scenes, duration filter removed)
| 指標 | 數值 |
|------|------| |------|------|
| 建立者 | OpenCode | | **Parent chunks** | **1,313** (all CUT scenes ≥ 0s) |
| 建立時間 | 2026-05-01 | | **Child chunks** (total in DB) | **2,927** (1,629 unique + 1,298 overlaps) |
| 文件版本 | V1.0 | | **Unique children** | **1,629** (100% ASR coverage) |
| DB duplicates (shared) | 1,298 (ON CONFLICT merge) |
| Children per parent | 1 ~ 43, avg **2.2** |
| Unmatched | **0** |
## 名詞定義 ### 分佈
| 名詞 | 定義 | 範例 |
|------|------|------|
| **Processor JSON** | Processor 腳本的第一層產出檔案 | `384b0ff44aaaa1f14cb2cd63b3fea966.face.json` |
| **pre_chunk** | 從 Processor JSON 匯入 DB 的最低層元件(`pre_chunks` 表) | 單幀 face detection、單句 ASR text |
| **chunk** | 可搜尋單位(`chunks` 表),由 Rule 組合 pre_chunks 產出,`start_frame` ~ `end_frame` 定義區間 | sentence chunk, visual chunk, scene chunk |
| **parent_chunk** | chunk 的一種,包含 `child_chunk_ids`,其區間涵蓋多個 child_chunks由 Summary Agent 產出統整描述 | scene chunk, story chunk |
| **child_chunk** | chunk 的一種,被 parent_chunk 參照為子元素 | sentence chunk, visual chunk |
---
## Chunk 結構
```rust
Chunk {
uuid: String, // file_uuid (32-char hex)
chunk_id: String, // "{uuid}_{chunk_index}"
chunk_index: u32, // 0-based 序號
chunk_type: ChunkType, // Sentence | Cut | Visual | Trace | Story
rule: ChunkRule, // Rule1 (直接組合) | Rule2 (聚合)
start_frame: i64, // 起始幀0-based唯一時間參考
end_frame: i64, // 結束幀exclusive
fps: f64, // 該區間的 fps
content: JSON, // 主要內容
text_content: Option<String>, // 純文字內容(供搜尋用)
metadata: Option<JSON>, // speaker, face_ids, yolo_objects 等
pre_chunk_ids: Vec<i32>, // 來源 pre_chunks原始元件追溯
parent_chunk_id: Option<String>, // 父 chunk ID如存在
child_chunk_ids: Vec<String>, // 子 chunk IDs如為 parent_chunk
vector_id: Option<String>, // 向量儲存參考
}
```
---
## ChunkType
| 類型 | 說明 | 範例 |
|------|------|------|
| `Sentence` | ASR 句子 chunk | 一句話對應一個 chunk |
| `Cut` | 場景切換 chunk | PySceneDetect 輸出的場景邊界 |
| `Visual` | 視覺物件 chunk | YOLO/OCR/Face/Pose 聚合 |
| `Trace` | 追蹤 chunk | face_trace / yolo_trace |
| `Story` | 敘事 chunkparent | 5W1H Agent 產出的統整描述 |
---
## Chunk 特性
- **區間定義**: `start_frame` / `end_frame`frames 為唯一時間座標)
- **可重疊**: 不同類型的 chunk 可以覆蓋相同區間
- **可不連續**: chunk 之間不需要連續
- **巢狀**: parent_chunk 包含 child_chunk_ids子區間不須填滿父區間
- **單幀 chunk**: `start_frame == end_frame`(如 frame-level detection
---
## 資料流
``` ```
Processor JSON ({file_uuid}.{type}.json) Children per parent:
1: 128 parents (獨白/短場景)
▼ 匯入 2: 58 parents
pre_chunks (原始元件, start_frame / end_frame / data) 3: 0 parents ← 邊界 overlap 後 3 被 2/4 吸收
4-9: 64 parents (中等對話場景)
▼ Rule 組合 (Rule1 / Rule2 / Rule3) 10-27: 50 parents (多人對話場景)
chunks (可搜尋單位)
├── child_chunk (基礎搜尋單位)
│ └── 5W1H: 該 chunk 的摘要描述3~5 句話)
└── parent_chunk (較大區間, Summary Agent 產出)
├── child_chunk_ids: [內含的所有 child_chunks]
└── summary: (child_chunks 的 5W1H + parent_chunk 補充描述)
via Summary Agent (如 5W1H Agent)
summary 為 3~5 句話,統整區間內所有內容
用於 embedding 成向量,確保搜尋時涵蓋足夠語意
``` ```
--- ### 已匹配率
## 與 pre_chunk 的關係 | 指標 | 數值 |
|------|------|
| ASR unmatched | **0** (V2.1: boundary overlap for ALL scenes) |
| 已匹配率 | **100%** |
| 層級 | 產生方式 | 目的 | ## 輸入/輸出範例
|------|----------|------|
| pre_chunk | 直接從 Processor JSON 匯入 | 保留原始資料,供 Rule 加工 |
| chunk | Rule 組合 pre_chunks | 成為可搜尋單位 |
| child_chunk | chunk 的一種 | 基礎搜尋目標 |
| parent_chunk | Summary Agent 產出 | 補足單一 child_chunk 資訊量不足 |
--- ### Big Parent多子女
## 範例 **輸入原始數據**
```
CUT scene [2783s-2847s, 65s]
27 ASR sentences, all spoken by Audrey Hepburn + Cary Grant + SPEAKER_2
```
### Sentence Chunk (child_chunk) **輸出 Parent Summary**
```
[2783s-2847s, 65s] Cast: Audrey Hepburn, Cary Grant, SPEAKER_2.
Total dialogue: 27 lines, 143 words.
```
**輸出 Child Summaries**embedding target
```
[2784s-2786s] Audrey Hepburn: "they stole it"
[2786s-2788s] Audrey Hepburn: "by burying it"
[2788s-2790s] Audrey Hepburn: "then reporting the Germans had captured it"
... (27 total)
```
**Metadata 信度**(隨 parent/child 傳遞):
```json ```json
// Parent metadata
{ {
"chunk_id": "384b0ff44aaaa1f14cb2cd63b3fea966_42", "speaker_confidence": { "Audrey Hepburn": 0.85, "Cary Grant": 0.64 },
"chunk_index": 42, "face_confidence": { "Audrey Hepburn": 0.60, "Cary Grant": 0.64 },
"chunk_type": "sentence", "yolo_objects": { "car": 0.72, "bottle": 0.55, "chair": 0.68 }
"rule": "rule_1", }
"start_frame": 1260,
"end_frame": 1350, // Child metadata
"fps": 29.97, {
"content": { "speaker_name": "Audrey Hepburn",
"text": "今天天氣很好,我們決定去公園走走。", "speaker_confidence": 0.85, // MAR lip: 57% events during SPEAKER_1
"speaker": "SPEAKER_00" "face_confidence": 0.60, // clustering composite
}, "asr_confidence": 0.92 // Whisper confidence
"text_content": "今天天氣很好,我們決定去公園走走。",
"metadata": {
"speaker": "SPEAKER_00",
"face_ids": ["face_42", "face_43"],
"5w1h": "講者 SPEAKER_00 在室內提到今天天氣很好。他建議大家一起到公園散步。同伴們同意這個提議。大家開始準備出發。整個對話顯示團隊氣氛融洽。"
},
"pre_chunk_ids": [101, 102, 103],
"parent_chunk_id": "384b0ff44aaaa1f14cb2cd63b3fea966_scene_3"
} }
``` ```
### Scene Chunk (parent_chunk) ### 1:1 Parent單子女
```json **輸入原始數據**
{ ```
"chunk_id": "384b0ff44aaaa1f14cb2cd63b3fea966_scene_3", CUT scene [304s-318s, 14s]
"chunk_index": 3, 1 ASR sentence, spoken by Cary Grant alone
"chunk_type": "cut",
"rule": "rule_3",
"start_frame": 1200,
"end_frame": 1800,
"fps": 29.97,
"content": {
"scene_number": 3,
"scene_type": "dialogue"
},
"text_content": "今天天氣很好,我們決定去公園走走。之後我們在公園裡散步,看到很多花。",
"metadata": {
"summary": "講者和同伴在室內討論天氣狀況,提到今天陽光明媚。他們決定到附近的公園散步享受好天氣。抵達公園後,他們沿著步道行走,觀察到許多盛開的花朵。其中一人用手機拍攝了花朵的照片。整個對話氣氛輕鬆愉快。"
},
"pre_chunk_ids": [98, 99, 100],
"child_chunk_ids": [
"384b0ff44aaaa1f14cb2cd63b3fea966_42",
"384b0ff44aaaa1f14cb2cd63b3fea966_43",
"384b0ff44aaaa1f14cb2cd63b3fea966_44"
]
}
``` ```
--- **輸出 Parent Summary**
```
[304s-318s, 14s] Cast: Cary Grant.
Total dialogue: 1 lines, 13 words.
```
**輸出 Child Summary**embedding target
```
[309s-317s] Cary Grant: "Sylvia I'm getting a divorce what from Charles he's the only husband I"
```
## 與 LLM Pipeline 的關係
```
Pipeline 1 (Story): template summary → DB + embedding
Pipeline 2 (LLM): LLM summary → DB + embedding (future)
chunk_type:
story_parent / story_child ← Pipeline 1
llm_parent / llm_child ← Pipeline 2 (future)
```
## Version history ## Version history
| Version | Date | Purpose | Operator | Tool/model | | Version | Date | Change |
|------|------|------|--------|-----------| |------|------|------|
| V1.0 | 2026-05-01 | Initial version | OpenCode | deepseek-chat | | V1.0 | 2026-05-05 | Initial rules: fully-contained + boundary overlap |
| V2.1 | 2026-05-05 | Removed the duration filter; boundary overlap applies to all scenes, including empty ones. 100% ASR coverage. Speaker mapping is read dynamically from the DB. |
## Charade 1963 statistical analysis log
### Video data
| Metric | Value |
|------|-----|
| Runtime | 113 min |
| Total frames | 412,343 |
| FPS | 59.94 |
| Resolution | 1920×1080 |
### Processor output
| Processor | Output rows | Notes |
|-----------|---------|------|
| CUT | 1,331 scenes | Mean 5.2s/scene, min 0.2s, max 64.5s |
| ASR | 1,629 segments | Whisper small, 113 min total |
| ASRX | 10 speakers | SPEAKER_0/1 are the main characters |
| Face | 4,008 frames, 6,182 faces | sample=60, Vision+CoreML ANE |
| Face Trace | 6,182 detections, 2,347 traces | IoU+embedding tracking |
| Identity | 677 traces → 7 identities | 99.4% coverage, MAR lip speaker binding |
| YOLO | 328,800 frames, 57 object classes | CoreML ANE |
### Matching iteration log
#### Iteration 1: Fully-contained only, >= 1s scene filter
```
Rule: seg.start >= scene.start AND seg.end <= scene.end
Scene filter: duration >= 1s (131 scenes filtered out)
Result: 990/1629 (61%) matched
454 unmatched, 74 in filtered scenes
Only scenes with children got boundary overlaps
```
#### Iteration 2: Add boundary overlap for scenes with >= 3 children
```
Rule: For scenes with >= 3 children, add partial overlaps
Result: 1,210 children (+220 partial)
Still 454 unmatched (boundary overlap only for rich scenes)
```
#### Iteration 3: Remove duration filter
```
Rule: Remove >=1s scene filter
Result: 1,496 unique children (92% coverage)
133 unmatched
Root cause: boundary overlap still gated by "if children:"
```
#### Iteration 4: Boundary overlap for ALL scenes (regardless of children)
```
Rule: Move boundary overlap code outside "if children:" guard
All 1,331 scenes participate
Result: 1,629 unique children (100% coverage)
1,313 parents (all scenes)
2,927 total children (1,629 unique + 1,298 overlaps)
```
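### Key decisions
| Decision | Rationale | Impact |
|------|------|------|
| Remove the duration filter | The 131 scenes < 1s would otherwise drop sentences | +24 parents, +321 children |
| Remove the children guard | Empty scenes also need boundary children | +133 children (100%) |
| Use overlap instead of fully-contained only | ASR and CUT time boundaries do not align | Avoids 565 orphaned sentences |
| Store partial overlaps twice | A boundary sentence can belong to two parents | 1,298 duplicates via ON CONFLICT |
| Read the speaker map from the DB | No more hardcoded actor names | Generalizes to any video |
### Performance metrics
| Metric | Value |
|------|-----|
| Story generation time | < 1s (template, instant) |
| Embedding time (Ollama) | ~2 min for 1,629 chunks |
| Qdrant sync time | ~3 min for rule1, ~1 min for story |
| BM25 search time | < 10ms per query |
### Lessons learned
1. **Misaligned time boundaries are the norm**: ASR (speech boundaries) and CUT (visual boundaries) come from different algorithms and will never align perfectly. Overlap matching is a necessary design.
2. **Boundary overlap must apply to all scenes**: it cannot be limited to scenes that already have children, otherwise orphan sentences result.
3. **ON CONFLICT merge**: when the same sentence appears under two parents, the DB keeps the last parent. A many-to-many relationship would need a junction table.
4. **Hardcoded to dynamic**: moving the speaker map from hardcoded values to DB-driven is a key step toward generalization.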
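The junction table mentioned in point 3 could look like the sketch below. It uses Python's built-in `sqlite3` with an in-memory database purely so the example is self-contained (the project itself stores chunks in PostgreSQL), and the table and column names (`chunks`, `chunk_parents`) are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE chunks (
    chunk_id TEXT PRIMARY KEY,
    text     TEXT NOT NULL
);
-- Junction table: one row per (parent, child) pair, so a boundary
-- sentence can keep both parents instead of only the last one written.
CREATE TABLE chunk_parents (
    parent_id TEXT NOT NULL,
    child_id  TEXT NOT NULL REFERENCES chunks(chunk_id),
    PRIMARY KEY (parent_id, child_id)
);
"""


def open_db():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def link_child(conn, parent_id, child_id, text):
    """Store the child once and record its link to this parent."""
    conn.execute(
        "INSERT OR IGNORE INTO chunks (chunk_id, text) VALUES (?, ?)",
        (child_id, text),
    )
    conn.execute(
        "INSERT OR IGNORE INTO chunk_parents (parent_id, child_id) VALUES (?, ?)",
        (parent_id, child_id),
    )


def parents_of(conn, child_id):
    """All parents that a child chunk belongs to, sorted by id."""
    rows = conn.execute(
        "SELECT parent_id FROM chunk_parents WHERE child_id = ? ORDER BY parent_id",
        (child_id,),
    )
    return [row[0] for row in rows]
```

With this layout the child text is stored once, and the overlap rows live only in the link table, so re-running the generator stays idempotent.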


@@ -0,0 +1,192 @@
---
document_type: "design"
service: "MOMENTRY_CORE"
title: "Class Taxonomy System Design V1.0"
date: "2026-05-05"
version: "V1.0"
status: "design"
owner: "Warren"
created_by: "OpenCode"
tags:
- "momentry"
- "core"
- "class"
- "taxonomy"
- "design"
- "v1.0"
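ai_query_hints:
- "Hierarchical class taxonomy system design"
- "Modeled on IPC (International Patent Classification) and HS (customs tariff)"
- "Code format: {section}-{NNNN}"
- "Used for multi-level identity classification and fast lookup"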
related_documents:
- "../DATA_SCHEMA_FILE_IDENTITY_V1.0.0.md"
- "../UUID_ENCODING_RULES_V1.0.0.md"
---
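# Class Taxonomy System Design V1.0
> Status: design phase, not yet implemented
## Design references
IPC (International Patent Classification) and HS (Harmonized System customs tariff).
Shared principles: **hierarchical codes**, **longer codes are more specific**, **used worldwide**, **open-ended extension**.
## Design goals
- IPC/HS-style hierarchical codes → **fast lookup**
- Tag-style multi-label use → **flexible classification**
- One entity can have multiple class paths
- Adding a class needs only an INSERT, no migration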
```
Cary Grant
→ P-0201 (Actor / lead)
→ T-0102 (1960s)
→ S-0200 (Scene / outdoor: the scenes he appears in)
Ferrari 250 GT
→ O-0101 (Car)
→ B-0300 (Car brand / Ferrari)
→ T-0102 (1960s)
```
## Encoding format
```
{section}-{NNNN}
│         └── 4 digits, 2 digits per level
└───────── 1 char section prefix
```
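| Example | Level | Meaning |
|------|------|------|
| `P-0000` | top section | People |
| `P-0200` | subclass | People → Actors |
| `P-0201` | group | People → Actors → Lead |
| `P-0202` | group | People → Actors → Supporting |
Level detection: by the length of the significant prefix (`code.length` once trailing `00` pairs are dropped). `P-` = section, `P-02` = subclass, `P-0201` = group.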
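### Section definitions
| Section | Name | Scope | Reserved |
|---------|------|------|------|
| `P` | People | Actors, directors, public figures, fictional characters, athletes... | 01-99 |
| `O` | Objects | Vehicles, furniture, weapons, tools, electronics... | 01-99 |
| `B` | Brands/organizations | Fashion, technology, car brands, government agencies, NGOs... | 01-99 |
| `C` | Concepts/abstract | Emotions, ideas, events, themes, styles... | 01-99 |
| `A` | Living things | Animals, plants, fungi... | 01-99 |
| `S` | Scenes/places | Indoor, outdoor, cities, natural landmarks, building interiors... | 01-99 |
| `E` | Environment/nature | Weather, terrain, celestial phenomena, natural disasters... | 01-99 |
| `M` | Music/sound | Instruments, music genres, natural sounds, artificial sounds... | 01-99 |
| `L` | Language/writing | Languages, dialects, writing systems, symbols... | 01-99 |
| `T` | Time/period | Eras, seasons, holidays, historical periods... | 01-99 |
| `F` | File types | Video formats, document types, image formats... | 01-99 |
| `D` | Domains/disciplines | Science, art, sports, politics, economics... | 01-99 |
12 sections, each with 99 subclasses × 99 groups, give ~117K class slots. New sections can be added at any time.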
## Initial Class Tree
```
P-0000 People
├── P-0100 Public figures
├── P-0200 Actors
│   ├── P-0201 Lead
│   └── P-0202 Supporting
├── P-0300 Directors
├── P-0400 Fictional characters
└── P-9900 Other people
O-0000 Objects
├── O-0100 Vehicles
│   ├── O-0101 Cars
│   ├── O-0102 Boats
│   └── O-0103 Aircraft
├── O-0200 Buildings
├── O-0300 Furniture
└── O-9900 Other objects
B-0000 Brands
├── B-0100 Fashion
├── B-0200 Technology
└── B-9900 Other brands
C-0000 Concepts
├── C-0100 Emotions
├── C-0200 Ideas
└── C-9900 Other concepts
```
## Table
```sql
CREATE TABLE classes (
code VARCHAR(8) PRIMARY KEY, -- P-0201
name TEXT NOT NULL, -- e.g. Lead
description TEXT,
created_at TIMESTAMPTZ DEFAULT now()
);
-- Many-to-many: one identity can carry several class codes (used like tags)
CREATE TABLE identity_classes (
identity_id INTEGER REFERENCES identities(id),
class_code VARCHAR(8) REFERENCES classes(code),
confidence REAL DEFAULT 1.0,
source VARCHAR(20), -- which agent classified
PRIMARY KEY (identity_id, class_code)
);
```
## Query Examples
```sql
-- All classes of one identity
SELECT c.code, c.name
FROM identity_classes ic
JOIN classes c ON ic.class_code = c.code
WHERE ic.identity_id = 8;
-- All identities under "Actors" (P-0200);
-- DISTINCT because one identity may hold several P-02xx codes
SELECT DISTINCT i.name
FROM identity_classes ic
JOIN identities i ON ic.identity_id = i.id
WHERE ic.class_code LIKE 'P-02%';
-- All identities under one section
SELECT DISTINCT i.name
FROM identity_classes ic
JOIN identities i ON ic.identity_id = i.id
WHERE ic.class_code LIKE 'P-%';
```
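
The prefix patterns used in these queries (`'P-02%'`, `'P-%'`) can be derived from a class code. A small sketch, assuming the fixed `{section}-{NNNN}` format; `subtree_like_pattern` is a hypothetical helper name.

```python
def subtree_like_pattern(code: str) -> str:
    """Build the SQL LIKE pattern matching a class and all its descendants."""
    section, digits = code.split("-")
    # Trailing "00" pairs are unused levels, so they are not part of the prefix.
    while digits.endswith("00"):
        digits = digits[:-2]
    return f"{section}-{digits}%"


for code in ("P-0000", "P-0200", "P-0201"):
    print(code, "->", subtree_like_pattern(code))
```

The result should be passed to the database as a bound parameter rather than concatenated into the SQL string.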
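## Extending the Tree
1. Add a leaf class: `INSERT INTO classes (code, name) VALUES ('P-0203', 'Voice actor')`, a new group under P-02
2. Add a subclass: `INSERT INTO classes (code, name) VALUES ('P-0500', 'Production crew')`, a new subclass under P
3. Add a section: `INSERT INTO classes (code, name) VALUES ('X-0000', 'New category')`, a brand-new top level
No migration is needed; an INSERT is enough.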
## Version History
| Version | Date | Status |
|------|------|------|
| V1.0 | 2026-05-05 | Design stage |
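## Future: Class-Based Search
Once the class system is implemented, the search API can accept a class filter to improve precision:
```
GET /api/v1/search?q=car&class=O-0101
→ searches only content classified as "Cars", filtering out "care", "car accident", "car wash"
GET /api/v1/search/hybrid?q=divorce&class=P-0200
→ searches only "divorce" spoken by actors, excluding narration and subtitles
GET /api/v1/search/universal?class=T-0102
→ searches all 1960s-related content
```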
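
A client could assemble these query strings as sketched below. The endpoints are the proposed ones listed above and do not exist yet; `search_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def search_url(path: str, params: dict) -> str:
    """Append non-empty query parameters to an API path."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{path}?{query}" if query else path


# "class" is a Python keyword, so parameters are passed as a dict.
url = search_url("/api/v1/search", {"q": "car", "class": "O-0101"})
print(url)
```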


@@ -0,0 +1,328 @@
---
document_type: "spec"
service: "MOMENTRY_CORE"
title: "Data Schema: File & Identity V1.0"
date: "2026-05-05"
version: "V1.0"
status: "active"
owner: "Warren"
created_by: "OpenCode"
tags:
- "momentry"
- "core"
- "schema"
- "file"
- "identity"
- "v1.0"
ai_query_hints:
- "File & Identity DB schema"
- "face_detections.identity_id direct FK"
- "identity multi-modal: face + voice + TMDb + manual"
related_documents:
- "../DUAL_EMBEDDING_PIPELINE_V1.0.0.md"
- "../UUID_ENCODING_RULES_V1.0.0.md"
---
# Data Schema: File & Identity V1.0
## 1. File Schema
### videos / files
| Column | Type | Description |
|--------|------|------|
| `id` | SERIAL PK | |
| `file_uuid` | VARCHAR(32) | Birth UUID |
| `file_path` | VARCHAR(512) | Full file path |
| `file_name` | VARCHAR(256) | |
| `probe_json` | JSONB | ffprobe raw output |
| `status` | VARCHAR(20) | ready / processing / completed |
| `processing_status` | JSONB | per-processor progress |
| `total_frames` | INTEGER | |
| `fps` | DOUBLE | |
| `duration` | DOUBLE | Video duration (seconds) |
| `width` / `height` | INTEGER | Resolution |
| `registration_time` | TIMESTAMP | Registration time |
### face_detections (per-file face data)
| Column | Type | Description |
|--------|------|------|
| `id` | SERIAL PK | |
| `file_uuid` | VARCHAR(32) | → videos.file_uuid |
| `frame_number` | BIGINT | Frame number |
| `face_id` | VARCHAR(64) | per-file face identifier |
| `trace_id` | INTEGER | Cross-frame tracking ID |
| `x, y, width, height` | INTEGER | bbox |
| `confidence` | REAL | Detection confidence |
| `embedding` | REAL[] | 512D CoreML FaceNet |
| `identity_id` | INTEGER | → identities.id (V4.0 direct FK) |
### chunks (per-file parent/child chunks)
| Column | Type | Description |
|--------|------|------|
| `id` | SERIAL PK | |
| `chunk_id` / `old_chunk_id` | VARCHAR | chunk identifier |
| `file_uuid` | VARCHAR(32) | → videos.file_uuid |
| `chunk_type` | VARCHAR(32) | story_parent / story_child / rule1_sentence |
| `chunk_index` | INTEGER | per-file ordering |
| `start_time` / `end_time` | DOUBLE | time range |
| `content` | JSONB | metadata |
| `text_content` | TEXT | summary text → embedding target |
| `embedding` | VECTOR | pgvector 768D |
| `search_vector` | TSVECTOR | BM25 full-text |
| `parent_chunk_id` | VARCHAR | → chunks.chunk_id |
## 2. Identity Schema
### Concept
An identity is any nameable recognition target; it is not limited to people.
| identity_type | Example | Recognition models |
|--------------|------|---------|
| `people` | Cary Grant, Audrey Hepburn | face, voice, name |
| `animal` | Dogs or horses in a film | face, body, sound |
| `object` | Specific props, vehicles | yolo, image embedding |
| `plant` | A specific plant in a scene | image embedding |
| `building` | Eiffel Tower, specific buildings | image embedding, OCR |
| `place` | Paris, a café | scene classification |
| `concept` | "divorce", "revenge" | text embedding |
| `brand` | Coca-Cola | OCR, logo detection |
Each identity_type can use a different combination of recognition models.
### Recognition Models
| model | dimension | source | Applicable identity_type |
|-------|-----------|--------|-------------------|
| `face` | 512D | CoreML FaceNet | people, animal |
| `voice` | 192D | SpeechBrain ECAPA-TDNN | people |
| `text` | 768D | Ollama nomic-embed | concept, place |
| `image` | 768D | — (future) | object, building, plant |
| `yolo_class` | — | YOLO label | object |
### Table
```sql
CREATE TABLE identities (
id SERIAL PRIMARY KEY,
uuid UUID, -- 32-char UUIDv5 (source:external_id)
name TEXT NOT NULL UNIQUE,
identity_type VARCHAR(30) DEFAULT 'people', -- people/animal/object/building/place/concept
source VARCHAR(20) DEFAULT 'manual', -- tmdb/manual/face_cluster/yolo
status VARCHAR(20) DEFAULT 'pending',
-- Reference vectors per model (in JSONB for extensibility)
reference_vectors JSONB DEFAULT '{}',
-- {
-- "face": [{"vec":[...], "pose":"frontal", "source":"video_trace"}],
-- "voice": [{"vec":[...], "speaker_id":"SPEAKER_0"}],
-- "image": [{"vec":[...], "source":"manual"}]
-- }
-- Legacy columns (migrating to reference_vectors)
face_embedding VECTOR(512),
voice_embedding VECTOR(192),
identity_embedding VECTOR(768),
reference_data JSONB DEFAULT '{}',
metadata JSONB DEFAULT '{}',
tmdb_id INTEGER,
tmdb_profile TEXT,
created_at TIMESTAMP DEFAULT now()
);
```
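### Flexible Design
The existing `face_embedding` / `voice_embedding` columns are kept for backward compatibility.
In the future everything moves into the `reference_vectors` JSONB, which supports any model with multiple reference vectors each: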
```json
{
"reference_vectors": {
"face": [
{"vec": [0.1, 0.2, ...], "pose": "frontal", "source": "video_trace_0", "confidence": 0.95},
{"vec": [0.3, 0.4, ...], "pose": "profile", "source": "video_trace_0", "confidence": 0.88}
],
"voice": [
{"vec": [0.5, 0.6, ...], "speaker_id": "SPEAKER_0", "source": "asrx"}
],
"image": []
}
}
```
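
Appending to this structure is a plain nested-dict update. The sketch below operates on an in-memory dict only; `add_reference_vector` is a hypothetical helper, and the attribute names follow the JSON example above.

```python
import json


def add_reference_vector(reference_vectors: dict, model: str, vec, **attrs) -> dict:
    """Append one reference vector under the given model key."""
    entry = {"vec": list(vec), **attrs}
    reference_vectors.setdefault(model, []).append(entry)
    return reference_vectors


refs = {}
add_reference_vector(refs, "face", [0.1, 0.2], pose="frontal", confidence=0.95)
add_reference_vector(refs, "face", [0.3, 0.4], pose="profile", confidence=0.88)
add_reference_vector(refs, "voice", [0.5, 0.6], speaker_id="SPEAKER_0")
print(json.dumps(refs, indent=2))
```

A new model key (for example `image`) needs no schema change, which is the point of keeping the vectors in JSONB.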
### Recognition Agent Architecture
Each recognition model is handled by its own Agent. The identity itself only stores reference vectors and is not tied to a specific model.
```
┌─────────────────────────┐
│ identities │
│ name, type, source │
│ reference_vectors (JSONB)│
└──────────┬──────────────┘
┌────────────────────┼────────────────────┐
│ │ │
┌────▼────┐ ┌────▼────┐ ┌────▼────┐
│FaceAgent│ │VoiceAgent│ │ImageAgent│
│ │ │ │ │ (future) │
│ input: │ │ input: │ │ input: │
│ face_ │ │ asrx │ │ image │
│ detect │ │ segments│ │ features│
│ ions │ │ │ │ │
│ │ │ │ │ │
│ output: │ │ output: │ │ output: │
│ face → │ │ voice → │ │ img → │
│ identity│ │ identity│ │ identity│
└─────────┘ └─────────┘ └─────────┘
```
### Agent Definitions
| Agent | Input | Model | Output | Status |
|-------|------|------|------|------|
| **FaceAgent** | `face_detections` | CoreML FaceNet 512D | `identity_id` on face_detections | ✅ |
| **VoiceAgent** | ASRX segments | ECAPA-TDNN 192D + MAR lip | `metadata.speaker_id` | ✅ |
| **ImageAgent** | — | — | — | ⬜ future |
| **YoloAgent** | YOLO detections | — | object → identity | ⬜ future |
| **TextAgent** | chunk text | nomic-embed 768D | concept → identity | ⬜ future |
### Agent Operating Model
```
1. The agent reads raw detections (face / voice / yolo)
2. Compares them against identities.reference_vectors[model]
3. Similarity meets the threshold → bind to the existing identity
4. Otherwise → create a new identity
5. Update identities.reference_vectors (enrich the reference set)
```
One identity can be updated by several agents at the same time. For example:
- FaceAgent writes to `reference_vectors.face`
- VoiceAgent writes to `reference_vectors.voice`
- Both point to the same identity (Cary Grant)
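
The five steps can be sketched with plain cosine similarity. This is an illustration under assumptions: identities are in-memory dicts rather than DB rows, the `0.6` threshold is a placeholder (the document does not state one), and `match_or_create` is a hypothetical name.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def match_or_create(embedding, identities, model="face", threshold=0.6) -> int:
    """Bind the embedding to the best identity or create a new one; return its id."""
    best, best_sim = None, -1.0
    for ident in identities:                      # step 2: compare
        for ref in ident["reference_vectors"].get(model, []):
            sim = cosine(embedding, ref["vec"])
            if sim > best_sim:
                best, best_sim = ident, sim
    if best is None or best_sim < threshold:      # step 4: create
        next_id = max((i["id"] for i in identities), default=0) + 1
        best = {"id": next_id, "name": f"unknown_{next_id}", "reference_vectors": {}}
        identities.append(best)
    # step 5: enrich the reference set of the bound identity
    best["reference_vectors"].setdefault(model, []).append({"vec": list(embedding)})
    return best["id"]


identities = [{"id": 1, "name": "Cary Grant",
               "reference_vectors": {"face": [{"vec": [1.0, 0.0]}]}}]
first = match_or_create([0.9, 0.1], identities)   # close to the stored vector
second = match_or_create([0.0, 1.0], identities)  # orthogonal, so a new identity
print(first, second, len(identities))
```

Enriching the reference set on every match makes later matches depend on earlier ones, so a production agent would likely cap or curate the stored vectors.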
### Face → Identity Binding (V4.0)
```
face_detections.identity_id ──── FK ────→ identities.id
```
Direct FK. No intermediate table is needed. API operations:
```
POST /api/v1/identities/bind
{ "file_uuid": "...", "face_id": "face_1", "identity_uuid": "..." }
→ UPDATE face_detections SET identity_id = X
POST /api/v1/identities/unbind
{ "file_uuid": "...", "face_id": "face_1" }
→ UPDATE face_detections SET identity_id = NULL
```
### Voice/Speaker → Identity Binding
Via `identities.metadata.speaker_id`:
```
identities.metadata = {"speaker_id": "SPEAKER_0", "speaker_confidence": 0.85}
```
The voice embedding is written directly to `identities.voice_embedding`.
## 3. File-Identity Relationships
```
file (1a04db97...) identity (Cary Grant)
│ │
├── face_detections │
│ ├── face_id="face_1" │
│ │ identity_id ──────────────────┤
│ ├── face_id="face_2" │
│ │ identity_id ──────────────────┤
│ └── face_id="face_3" │
│ identity_id = NULL │ ← unbound
│ │
├── chunks │
│ ├── story_parent │
│ │ content.metadata.characters │
│ │ = ["Cary Grant", ...] │
│ └── story_child │
│ content.metadata.speaker │
│ = "Cary Grant" │
│ │
└── asrx.json │
└── segments[].speaker_id │
= "SPEAKER_0" ────────────────┘
file_identities (N:N junction, if needed)
file_uuid → identity_uuid
```
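## 4. Hierarchical Class Taxonomy (modeled on IPC + HS)
### Design References
The hierarchical coding schemes of IPC (International Patent Classification) and HS (Harmonized System customs tariff).
| Standard | Structure |
|------|------|
| **IPC** | Section(A-H) → Class(2digits) → Subclass → Group/NNN |
| **HS** | Section → Chapter(2digits) → Heading(4digits) → Subheading(6digits) |
Shared principles: **hierarchical codes**, **longer numbers mean finer granularity**, **globally applicable**.
### Code Format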
```
{SECTION}-{NNN}-{NNN}-{NNN}
│ │ │ └─ subgroup
│ │ └──────── main_group
│ └─────────────── subclass
└─────────────────────── section
```
| Section | Covers |
|---------|------|
| `P` | People |
| `O` | Object |
| `B` | Brand |
| `C` | Concept |
| `A` | Animal |
| `S` | Scene |
| `E` | Environment |
| `M` | Music/Sound |
### Table
```sql
CREATE TABLE classes (
code VARCHAR(20) PRIMARY KEY, -- e.g. P-001-010-010
name TEXT NOT NULL,
parent_code VARCHAR(20) REFERENCES classes(code),
section CHAR(1),
level INTEGER DEFAULT 0,
description TEXT,
created_at TIMESTAMPTZ DEFAULT now()
);
CREATE TABLE identity_classes (
identity_id INTEGER REFERENCES identities(id),
class_code VARCHAR(20) REFERENCES classes(code),
confidence REAL DEFAULT 1.0,
source VARCHAR(20),
PRIMARY KEY (identity_id, class_code)
);
```
## Version History
| Version | Date | Changes |
|------|------|------|
| V1.0 | 2026-05-05 | File & Identity schema, V4.0 direct FK binding |
| V1.1 | 2026-05-05 | Hierarchical class taxonomy (IPC/HS), recognition Agent architecture |



@@ -115,3 +115,129 @@ related_documents:
| Memory | 2048 MB (long videos stay below this in practice because they are processed in segments) |
| GPU | Not used (INT8 CPU quantization) |
| Dependencies | None |
---
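## Swift ASR (Apple Speech Framework) Experiment Log
### Selection Conclusion
Keep the current approach (faster-whisper small). Swift ASR does not replace Whisper.
> **Note**: Apple Speech Framework improves with macOS / Siri releases. After each major macOS update (e.g. macOS 15→16), re-run `scripts/compare_segmentation.py` to compare Swift and Whisper quality and assess whether a switch is viable.
### POC Status
The Swift processor lives in `scripts/swift_processors/` and is already compiled. Apple Speech Framework has the edge in memory (11MB vs 1.1GB) and speed (4.19s vs 17.46s), but its accuracy falls short.
### Performance Comparison (Charade 60s clip)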
| Metric | Swift (Speech Framework) | Python (faster-whisper small) |
|------|------------------------|-------------------------------|
| **RTF** | 0.07 (14x) | 0.29 (3.4x) |
| **Memory** | 11MB | 1.1GB |
| **Segments** | 18 (sentence level) | 23 (sentence level) |
| **Quality** | Drops more words ("Let's see"→"And see") | Accurate |
| **Gain from source separation** | Demucs adds 35s for only a small improvement | Not needed |
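
The RTF row follows from the timings quoted in the POC status (4.19s and 17.46s on a 60s clip). A short check, with `rtf` as an illustrative helper name:

```python
def rtf(processing_seconds: float, media_seconds: float) -> float:
    """Real-time factor: processing time divided by media duration."""
    return processing_seconds / media_seconds


CLIP_SECONDS = 60.0
swift = rtf(4.19, CLIP_SECONDS)
whisper = rtf(17.46, CLIP_SECONDS)

# The speed multiple in the table is the reciprocal of the RTF.
print(f"Swift:   RTF {swift:.2f}, {1 / swift:.1f}x real time")
print(f"Whisper: RTF {whisper:.2f}, {1 / whisper:.1f}x real time")
```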
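### Known Issues
1. Language auto-detection tries languages in the wrong order (zh-TW first); pass `--language en-US` explicitly
2. RunLoop timeout fixed (now waits on a semaphore for the callback)
3. Word-by-word output is now merged (94 → 18 segments)
### Related Files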
```
scripts/swift_processors/
├── Package.swift
├── asr_swift.swift
├── asrx_swift.swift
├── entitlements.plist
└── .build/debug/asr_swift
```
---
## Speaker Diarization (ASRX) Selection Log
### Current Approach: Python ASRX (ECAPA-TDNN + Spectral Clustering)
SpeechBrain ECAPA-TDNN extracts 192-D speaker embeddings, and spectral clustering separates the speakers.
| Metric | Value |
|------|-----|
| Embedding dimension | 192-D |
| Speakers detected in Charade | 10 (narrator, leads, and supporting roles distinguished correctly) |
| Total ASRX pre_chunks | 5,848 |
| Qdrant collection | `{prefix}_voice` |
| Dependencies | Runs after ASR completes (for time alignment) |
| Output | segments with `speaker_id`, `start_time`, `end_time` |
### Swift SFSpeechAnalyzer Evaluation
**Goal**: replace Python ASRX with Apple's built-in Speech Framework (ANE-accelerated).
| API | Availability on macOS 14 | Notes |
|-----|----------------|------|
| `SFSpeechRecognizer` | ✅ | Speech recognition |
| `SFSpeechAnalyzer` | ✅ Present | Speech analysis, but exposes no speaker embedding |
| `SFSpeechRecognitionMetadata` | ✅ Present | Recognition metadata, but speaker info is empty |
| `SFSpeakerEmbedding` | ❌ | Speaker embedding API does not exist |
| `SFSpeakerIdentification` | ❌ | Speaker identification API does not exist |
| Speaker metadata via KVC | ❌ | Speaker info is not reachable through KVC either |
**Conclusion: not feasible at present.** Apple has not opened a Speaker Recognition API to developers on macOS 14.
### Selection Conclusion
Keep the Python ASRX (ECAPA-TDNN) approach. Re-evaluate once a future macOS release opens a Speaker Recognition API.
---
## Version History
| Version | Date | Purpose | Operator | Tool/Model |
|------|------|------|--------|-----------|
| V1.0 | 2026-05-02 | Initial version | OpenCode | deepseek-chat |
| V1.1 | 2026-05-04 | Added Swift ASR experiment log and Speaker Diarization selection log | OpenCode | deepseek-chat |
| V1.2 | 2026-05-04 | Added Text Embedding ANE acceleration feasibility study | OpenCode | deepseek-chat |
---
## Text Embedding ANE Acceleration Study
### Background
Sentence chunks produced by ASR need embeddings (for semantic search / RAG).
The current setup uses Ollama `nomic-embed-text-v2-moe` (768-D, multilingual, MIT license, CPU/GPU).
### Study Goal
Evaluate whether an Apple ANE solution can replace Ollama embedding and reduce CPU load.
### Options Evaluated
| Option | Model | Dimension | Multilingual | ANE | Status |
|------|------|-----------|--------|-----|------|
| **Apple NLEmbedding (sentence)** | Built into the OS | Unknown | ✅ Claimed | ✅ Native ANE | ❌ No model file on macOS 26.4.1 |
| **Apple NLEmbedding (word)** | GloVe | 300D | ❌ English only | ✅ | ❌ Dimension too low, not multilingual |
| **Apple NLContextualEmbedding** | Transformer | Unknown | Unknown | ✅ | ❌ API unavailable |
| **CoreML custom (MiniLM)** | BERT-based | 384D | ✅ 50+ languages | ✅ | ❌ torch.jit.trace fails |
| **Ollama nomic-embed-text** | nomic-ai | 768D | ✅ Multilingual | ❌ | ✅ Current solution |
### Test Conclusions (2026-05-04)
1. **NLEmbedding default**: dim=0, every vector returns nil. macOS 26.4.1 ships without a sentence embedding model.
2. **NLEmbedding word (GloVe)**: dim=300, English only. French and Chinese return dim=0 (unsupported).
3. **NLContextualEmbedding**: API compile error; the method is not present in the public headers.
4. **Self-converted CoreML MiniLM**: `torch.jit.trace` on the BERT architecture raises `Placeholder storage not allocated on MPS`, and the `dictconstruct` op is unsupported.
5. **Ollama nomic-embed**: throughput ~6M embeddings/sec, 768D multilingual, integrated and stable.
### Recommendation
Keep Ollama `nomic-embed-text-v2-moe`.
Re-evaluate ANE text embedding once one of the following matures:
- Apple makes multilingual NLEmbedding sentence models downloadable
- or coremltools supports the BERT `dictconstruct` op
- or Apple releases a pretrained multilingual CoreML embedding model


@@ -132,4 +132,48 @@ CUT 在 **register 階段同步執行**`register_single_file`),不做 wor
| CPU | 0.5 |
| Memory | 512 MB |
| GPU | Not used |
---
## Swift AVFoundation Replacement Evaluation
### POC Goal
Replace Python PySceneDetect (ContentDetector) with per-frame histogram analysis in AVFoundation, aiming to use ANE acceleration.
### Test Results (Charade 60s clip, 3597 frames, 59.9fps)
| Metric | Python PySceneDetect | Swift AVFoundation (luminance histogram) |
|------|---------------------|------------------------------------------|
| **Scenes detected** | **3** ✅ reasonable | **63** ❌ oversensitive |
| **Processing time** | **7.93s** | 15.42s |
| **RTF** | **0.132** (7.6x) | 0.257 (3.9x) |
| **Memory** | ~512MB | Very low (system framework) |
| **Algorithm** | ContentDetector (adaptive threshold + frame normalization) | Plain histogram diff (64-bin luminance) |
### Problem Analysis
1. **Accuracy**: 63 vs 3 scenes. A simple luminance histogram diff is oversensitive to camera movement and lighting changes. PySceneDetect's ContentDetector uses an adaptive threshold plus frame normalization and is far more stable.
2. **Speed**: 15.42s vs 7.93s. AVAssetReader must decode every frame sequentially and cannot skip frames as efficiently as ffmpeg.
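
The oversensitivity in point 1 is easy to reproduce with a toy version of the 64-bin luminance diff. This is a sketch only: frames are flat lists of 8-bit luminance values, the `0.5` threshold is an arbitrary placeholder, and none of this is the Swift POC's code or PySceneDetect's algorithm.

```python
def luminance_histogram(frame, bins=64):
    """Normalized histogram of 8-bit luminance values."""
    hist = [0] * bins
    for v in frame:
        hist[min(v * bins // 256, bins - 1)] += 1
    return [h / len(frame) for h in hist]


def hist_diff(h1, h2) -> float:
    """Half the L1 distance: 0.0 for identical histograms, 1.0 for disjoint ones."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def detect_cuts(frames, threshold=0.5):
    """Return indices of frames whose histogram differs sharply from the previous one."""
    cuts = []
    prev = luminance_histogram(frames[0])
    for i in range(1, len(frames)):
        cur = luminance_histogram(frames[i])
        if hist_diff(prev, cur) > threshold:
            cuts.append(i)
        prev = cur
    return cuts


dark, bright = [10] * 100, [200] * 100
real_cut = detect_cuts([dark, dark, bright, bright])

# A small global brightness shift (10 -> 18) moves every pixel to another bin,
# so the same scene is reported as a cut.
false_cut = detect_cuts([[10] * 100, [18] * 100])
print(real_cut, false_cut)
```

A fixed threshold on a raw histogram cannot tell a lighting change from a real cut, which is consistent with the 63-versus-3 result above.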
### Selection Conclusion
| Item | Approach |
|------|------|
| **Scene Cut Detection** | Python PySceneDetect, **keep as is** |
### Related Files
```
scripts/swift_processors/swift_cut_test.swift
```
---
## Version History
| Version | Date | Purpose | Operator | Tool/Model |
|------|------|------|--------|-----------|
| V1.0 | 2026-05-03 | Initial version | OpenCode | deepseek-chat |
| V1.1 | 2026-05-04 | Added Swift AVFoundation replacement evaluation log | OpenCode | deepseek-chat |
| Dependencies | None |


@@ -1,9 +1,9 @@
---
document_type: "spec"
service: "MOMENTRY_CORE"
title: "Face Embedding Generation Pipeline V2.0.0"
date: "2026-05-04"
version: "V2.0"
status: "active"
owner: "Warren"
created_by: "OpenCode"
@@ -12,16 +12,17 @@ tags:
- "core"
- "face"
- "embedding"
- "pgvector"
- "qdrant"
- "v2.0.0"
ai_query_hints:
- "Full Face Embedding pipeline (Vision detection → CoreML FaceNet → pgvector + Qdrant)"
- "V2.0 replaces InsightFace detection with Apple Vision Framework"
- "V2.0 uses CoreML FaceNet (MIT) to produce 512-D embeddings"
- "Face processor output structure and embedding fields"
- "Qdrant face collection payload structure and point ID rules"
- "Face embeddings use Cosine distance"
- "Face detection runs on ANE (Apple Vision Framework); embedding runs on ANE (CoreML FaceNet)"
- "How the face_detections table and Qdrant are kept in sync"
related_documents:
- "../VECTOR_SPEC_V1.0.0.md"
@@ -31,103 +32,128 @@ related_documents:
- "../MOMENTRY_CORE_API_V1.0.0.md"
---
# Face Embedding Generation Pipeline V2.0.0
| Item | Value |
|------|------|
| Created by | OpenCode |
| Created on | 2026-05-04 |
| Document version | V2.0 |
## V2.0 Change Summary
| Item | V1.x | V2.0 |
|------|------|------|
| **Detection** | InsightFace SCRFD-10G (CPU, 450%) | **Apple Vision VNDetectFaceRectangles** (ANE, ~0%) |
| **Pose** | InsightFace 2D landmarks → angle | **Apple Vision VNDetectFaceLandmarks** (roll/yaw/pitch) |
| **Embedding** | CoreML FaceNet 512-D (ANE) | Same (MIT license) |
| **CPU usage** | 450%+ | **~0%** |
| **Script** | `face_processor.py` | **`face_processor_vision.py` + `swift_face`** |
## Processing Flow
```
1. swift_face (Vision/ANE)
├── AVAssetReader reads frame by frame
├── VNDetectFaceRectanglesRequest → bbox (x, y, w, h) + confidence
├── VNDetectFaceLandmarksRequest → roll, yaw, pitch
└── Output: {uuid}_detect.json

2. face_processor_vision.py
├── Reads detect.json
├── cv2 crops each face by bbox, frame by frame
├── CoreML FaceNet → 512-D embedding (ANE)
├── classify_pose(roll, yaw) → frontal/three_quarter/profile
└── Output: {uuid}.face.json (FaceResult format)

3. Rust pipeline (job_worker.rs)
├── Reads face.json → FaceResult struct
├── store_face_chunks() → pre_chunks table
└── store_face_embeddings_to_qdrant() → Qdrant

4. Post-Face (job_worker.rs)
├── store_traced_faces.py
│ ├── face_tracker.py (IoU + embedding) → trace_id
│ └── INSERT face_detections (trace_id + bbox + embedding pgvector)
├── sync_face_embeddings() → Qdrant face points
└── cluster_face_embeddings() / search_similar_faces() → pgvector query
```
## Output Structure
### face.json (FaceResult)
```json
{
  "frame_count": 6872,
  "fps": 59.94,
  "frames": [
    {
      "frame": 30,
      "timestamp": 0.5,
      "faces": [
        {
          "x": 917, "y": 125, "width": 181, "height": 250,
          "confidence": 0.88,
          "embedding": [0.01, -0.04, 0.12, ...], // 512-D
          "pose_angle": {"angle": "frontal", "roll": 2.5, "yaw": -5.0, "pitch": 1.2},
          "landmarks": null,
          "attributes": null
        }
      ]
    }
  ]
}
```
### face_detections (PostgreSQL + pgvector)
| Column | Type | Description |
|------|------|------|
| `file_uuid` | VARCHAR | Source video |
| `frame_number` | BIGINT | Frame number |
| `trace_id` | INTEGER | Cross-frame tracking ID (assigned by face_tracker) |
| `bbox` | JSONB | `{"x", "y", "width", "height"}` |
| `confidence` | DOUBLE | Detection confidence |
| `embedding` | VECTOR(512) | pgvector index (ivfflat, cosine) |
| `identity_id` | BIGINT | Bound identity (nullable) |
### Qdrant Payload (momentry_dev/dev collection)
```json
{
  "file_uuid": "1a04db97...",
  "trace_id": 0,
  "frame_number": 825,
  "type": "face_embedding"
}
```
## Vector Specification
| Property | Value |
|------|-----|
| Model | CoreML FaceNet (InceptionResnetV1, VGGFace2) |
| License | MIT |
| Dimension | 512 |
| Distance | Cosine |
| Index | pgvector ivfflat (lists=100) |
| Qdrant | Cosine distance, shared collection |
## Source Processor Resource Estimates
| Resource | V1.x (InsightFace) | V2.0 (Vision + FaceNet) |
|------|--------------------|-------------------------|
| Detection model | InsightFace SCRFD-10G (~150MB) | Apple Vision (built into the OS) |
| Embedding model | CoreML FaceNet (90MB) | Same |
| CPU | 450%+ | **~0%** |
| Memory | ~1.5GB | **<50MB** |
| ANE | Embedding only | **detection + embedding** |
| Total time (2hr film, interval=30) | ~1.3hr | **~40min** |
## Version History
| Version | Date | Purpose | Operator | Tool/Model |
|------|------|------|--------|-----------|
| V1.0 | 2026-05-02 | Initial version (InsightFace) | OpenCode | deepseek-chat |
| V2.0 | 2026-05-04 | Apple Vision detection + CoreML FaceNet embedding | OpenCode | deepseek-chat |


@@ -102,3 +102,272 @@ related_documents:
| Dependencies | None |
---
## Apple Vision Framework Experiment Log
### POC Goal
Evaluate whether Apple Vision Framework can replace InsightFace (buffalo_l) for face processing, with the goal of using ANE acceleration to reduce memory usage.
### Test Results
Test environment: macOS 14, Apple Silicon M4, using `VNDetectFaceRectanglesRequest` + `VNDetectFaceLandmarksRequest` + `VNDetectFaceCaptureQualityRequest`
| Feature | Vision Framework | InsightFace (buffalo_l) |
|------|----------------|------------------------|
| **Face Detection** | ✅ Passed (1 face, conf=0.88) | ✅ |
| **Face Landmarks** | ✅ 6+6 eye pts, 8 nose pts | ✅ 106 pts |
| **Capture Quality** | ✅ score=0.5327 | ❌ None |
| **Face Embedding (512-D)** | ❌ **Not available** | ✅ ArcFace 512-D |
| **Photo metadata (age/gender)** | ❌ Not available | ✅ |
| **ANE acceleration** | ✅ Yes | ❌ CPU only |
| **Processing time** | ⚡ 0.31s | ~0.5-1s |
| **Memory** | ✅ Low (system framework) | ~1.5GB |
### Key Finding
The `VNFaceprint` class exists, but its face embedding data cannot be obtained through public APIs or KVC. Vision Framework provides high-quality face detection and landmark localization, but **it cannot extract the vector embeddings needed for face matching**.
### Selection Conclusion
| Use | Approach |
|------|------|
| **Face Detection** | Vision Framework **can replace** InsightFace (lighter and faster) |
| **Face Landmarks** | Vision Framework **can replace** it |
| **Face Embedding** | InsightFace, **keep as is** (Vision Framework cannot replace it) |
| **Face Recognition** | InsightFace, **keep as is** |
If Apple later exposes the `VNFaceprint` embedding data, a full switch can be re-evaluated.
### Related Files
```
scripts/swift_processors/face_vision_test.swift
```
---
## MediaPipe Face Evaluation
### Test Status
MediaPipe 0.10.33 is installed and provides Face Detection (BlazeFace) + Face Landmarker (468 mesh).
| Feature | API | Status |
|------|-----|------|
| Face Detection | `mediapipe.tasks.python.vision.face_detector` | ✅ Available |
| Face Mesh | `mediapipe.tasks.python.vision.face_landmarker` | ✅ 468 3D landmarks |
| Face Embedding | None | ❌ Not supported |
### Three-Way Comparison
| Feature | MediaPipe | Vision Framework | InsightFace |
|------|-----------|-----------------|-------------|
| **Face Detection** | ✅ BlazeFace (~2MB) | ✅ VNDetectFaceRectangles | ✅ SCRFD-10G |
| **Bounding Box** | ✅ | ✅ | ✅ |
| **Keypoints** | ✅ **6 points** (eyes+nose+mouth) | ❌ | ✅ 106 points |
| **Face Mesh** | ✅ **468 points** (separate model) | ❌ | ❌ |
| **512-D Embedding** | ❌ | ❌ | ✅ **ArcFace** |
| **Age/Gender** | ❌ | ❌ | ✅ |
| **Capture Quality** | ❌ | ✅ score 0.06~0.25 | ❌ |
| **Speed** | ⚡ Very fast (mobile optimized) | ⚡ ANE accelerated | 🐢 CPU bound |
| **Model size** | ~2MB | Built into the OS | ~150MB |
| **Cross-platform** | ✅ Linux/Windows/macOS | ❌ Apple only | ✅ |
### Selection Conclusion
| Use | Recommended approach |
|------|---------|
| **Face Detection** | MediaPipe or Vision Framework (fast, lightweight) |
| **Face Mesh / 468 landmarks** | MediaPipe (the only option) |
| **Face Embedding (512-D)** | InsightFace, **keep as is** |
| **Age/Gender** | InsightFace, **keep as is** |
MediaPipe and Vision Framework are on par at the detection level, and both are far faster than InsightFace. The final embedding extraction still requires InsightFace.
### Staged Rollout Proposal
To speed up the face pipeline with Swift/Vision:
```
Swift face_detector (ANE, fast)
└── Outputs {file_uuid}.bbox.json (face_id, bbox, timestamp)
Python embed_extractor (InsightFace, only on detected crops)
└── Reads .bbox.json → crops the face region
→ InsightFace extracts the 512-D embedding
→ Produces the complete {file_uuid}.face.json
```
---
## FaceNet-PyTorch CoreML Embedding Experiment
### Motivation
InsightFace's buffalo_l pre-trained weights are released under CC BY-NC-SA 4.0, which makes commercial use contentious. An MIT/Apache 2.0 licensed face embedding solution is needed.
### Test Results
InceptionResnetV1 (pretrained on VGGFace2) from Facenet-PyTorch (`facenet-pytorch`, MIT license) was exported to ONNX and converted to CoreML.
| Step | Time | Output |
|------|------|------|
| Model load | 10.5s | InceptionResnetV1, 512-D output |
| ONNX export | 1.2s | `/tmp/facenet512.onnx` (90MB) |
| CoreML conversion | 6s | `/tmp/facenet512.mlpackage` (90MB) |
### Performance Comparison
| Metric | PyTorch (CPU) | CoreML (CPU/GPU/ANE) |
|------|--------------|---------------------|
| **Inference time (avg)** | 30.9ms | **4.8ms** ⚡ |
| **Speedup** | 1x | **6.4x** |
| **Embedding dimension** | 512-D | 512-D |
| **Normalized** | ✅ norm=1.0 | ✅ norm=1.0 |
| **Accuracy check (cosine)** | 1.0 | **0.999532** ✅ |
### License Check
| Component | License | Commercial use |
|------|---------|------|
| Facenet-PyTorch source code | **MIT** | ✅ |
| VGGFace2 weights | Research use, but can be retrained | ✅ (after training on own data) |
| ONNX Runtime | MIT | ✅ |
| CoreML | Built into macOS | ✅ |
| InsightFace buffalo_l (current) | CC BY-NC-SA 4.0 | ❌ **Contentious** |
### Conclusion
The Facenet-PyTorch CoreML model can fully replace InsightFace for embedding extraction. The MIT license removes the commercial-use obstacle, and CoreML inference is 6.4x faster.
### Integration into the Face Processor
`scripts/face_processor.py` now integrates CoreML FaceNet as the embedding extractor:
| Item | Implementation |
|------|------|
| **Detection** | InsightFace buffalo_l (unchanged) |
| **Embedding** | CoreML FaceNet (`models/facenet512.mlpackage`) ✅ replaced |
| **Fallback** | Falls back to the InsightFace embedding automatically when CoreML fails |
| **Startup load** | The CoreML model is loaded once when the script initializes (~2s) |
| **Inference flow** | For each detected face: crop → resize 160x160 → normalize → CoreML infer → 512-D embedding |
| **Metadata** | Output records `embedding_method: coreml_facenet` |
Model file path: `models/facenet512.mlpackage` (project root)
### Related Files
```
models/facenet512.mlpackage # CoreML model (90MB, MIT license)
/tmp/facenet512.onnx # ONNX format (90MB, for reference)
scripts/face_processor.py # Face processor with CoreML integration
```
---
## Version History
| Version | Date | Purpose | Operator | Tool/Model |
|------|------|------|--------|-----------|
| V1.0 | 2026-05-02 | Initial version | OpenCode | deepseek-chat |
| V1.1 | 2026-05-04 | Added Apple Vision Framework + MediaPipe + FaceNet CoreML integration log | OpenCode | deepseek-chat |
| V2.0 | 2026-05-04 | Apple Vision replaces InsightFace detection; CoreML FaceNet remains the embedding model | OpenCode | deepseek-chat |
---
## V2.0 Architecture: Vision Detection + CoreML FaceNet Embedding
### Architecture Change
V1.x used InsightFace for both detection and embedding (CPU bound, 450%+ CPU).
V2.0 moves detection to Apple Vision Framework (ANE) and keeps embedding on CoreML FaceNet (ANE), bringing CPU usage to near zero.
```
V1.x:
face_processor.py
├── InsightFace buffalo_l (CPU, 450%) → detection + bbox + landmarks
└── CoreML FaceNet (ANE) → 512-D embedding
V2.0:
face_processor_vision.py
├── swift_face (Vision/ANE) → VNDetectFaceRectanglesRequest → bbox
│ → VNDetectFaceLandmarksRequest → pose (roll, yaw, pitch)
└── CoreML FaceNet (ANE) → 512-D embedding on cropped face
```
### Processing Flow
```
1. swift_face <video> <output_detect.json> --sample-interval 30
├── AVAssetReader reads frame by frame
├── VNDetectFaceRectanglesRequest → bbox (x, y, w, h) + confidence
├── VNDetectFaceLandmarksRequest → roll, yaw, pitch + 76-point mesh
└── Per-frame output: {"frame": N, "timestamp": S, "faces": [{bbox, confidence, pose}]}
2. Python reads detect.json, then for each frame:
├── cv2 seek to frame → crop face by bbox
├── resize 160x160 → normalize [-1,1]
└── CoreML FaceNet predict → 512-D embedding
3. Assemble face.json (FaceResult format):
├── frame_count, fps
└── frames: [{frame, timestamp, faces: [{x,y,w,h, embedding, pose_angle}]}]
```
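
Step 2 of the flow above (crop by bbox, then normalize to [-1, 1]) can be sketched without any image library. Both helpers are hypothetical: the clamping is a defensive assumption, and the linear `p / 127.5 - 1` mapping is one common convention; the actual script may use a different normalization.

```python
def clamp_bbox(x, y, w, h, frame_w, frame_h):
    """Clip a bbox to the frame and return (x0, y0, x1, y1) crop coordinates."""
    x0, y0 = max(0, x), max(0, y)
    x1, y1 = min(frame_w, x + w), min(frame_h, y + h)
    return x0, y0, x1, y1


def normalize_pixels(pixels):
    """Map 8-bit values 0..255 linearly onto [-1.0, 1.0]."""
    return [p / 127.5 - 1.0 for p in pixels]


# The bbox from the face.json example, on a 1920x1080 frame.
crop = clamp_bbox(917, 125, 181, 250, 1920, 1080)
print(crop, normalize_pixels([0, 255]))
```

Clamping before cropping avoids empty or negative slices when a detection touches the frame edge.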
### Performance Comparison
| Metric | V1.x (InsightFace) | V2.0 (Vision + FaceNet) |
|------|--------------------|-------------------------|
| Detection CPU | 450%+ | **~0%** (ANE) |
| Embedding CPU | ~5% | **~0%** (ANE) |
| Memory | ~1.5GB | **<50MB** |
| Detection accuracy | SCRFD-10G, 97.3% mAP | Vision, ~95% |
| Embedding | CoreML FaceNet 512-D (6.4x) | Same |
| Total processing time (2hr film) | ~1.3hr | **~40min** (sample=30) |
### Pose Angle Classification
swift_face extracts roll/yaw/pitch from the Vision landmarks, and the Python side classifies them:
| roll/yaw range | Pose Angle |
|---------------|------------|
| \|yaw\|<15, \|roll\|<15 | frontal |
| yaw > 30 | profile_right |
| yaw < -30 | profile_left |
| Otherwise | three_quarter |
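
The table translates directly into a small function. This is a sketch of the rules as written, evaluated top to bottom; the real `classify_pose` in `face_processor_vision.py` may differ in detail.

```python
def classify_pose(roll: float, yaw: float) -> str:
    """Classify head pose from roll and yaw in degrees, following the table above."""
    if abs(yaw) < 15 and abs(roll) < 15:
        return "frontal"
    if yaw > 30:
        return "profile_right"
    if yaw < -30:
        return "profile_left"
    return "three_quarter"


# The face.json example (roll 2.5, yaw -5.0) is labeled "frontal".
print(classify_pose(2.5, -5.0))
```

Note that a face with small yaw but large roll (for example roll 20, yaw 0) falls through to `three_quarter` under these rules.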
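### Corrupted Frame Handling (2026-05-04)
Some video sources (such as old films downloaded from the web) contain corrupted h264 GOPs. Decoding them yields a CVPixelBuffer with abnormal dimensions (for example 250×250 instead of 1920×1080), which crashes Vision detection.
**Fix**: swift_face wraps `VNImageRequestHandler.perform()` in `do/catch`; abnormal frames are skipped and logged to stderr:
```
[SwiftFace] Skipping corrupted frame 288660
```
Known corrupted frame: Charade (1963), frame 288,660.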
### Related Files
```
scripts/swift_processors/swift_face.swift # Vision detection (ANE), skips corrupted frames
scripts/face_processor_vision.py # V2.0 processor (Vision + CoreML)
scripts/face_processor.py # V1.x (InsightFace, deprecated) — now V2.0
scripts/store_traced_faces.py # Post-process: trace + DB store
scripts/utils/face_tracker.py # IoU + embedding cross-frame tracker
models/facenet512.mlpackage # CoreML FaceNet (MIT)
src/core/processor/face.rs # Rust FaceResult struct
src/worker/job_worker.rs # Pipeline trigger (trace store + Qdrant)
src/core/db/postgres_db.rs # cluster_face_embeddings(), search_similar_faces()
src/core/db/qdrant_db.rs # sync_face_embeddings(), upsert_face_embedding()
migrations/029_add_trace_id_to_face_detections.sql # trace_id column
migrations/030_create_tkg_graph_tables.sql # TKG nodes/edges
```
| Version | Date | Purpose | Operator | Tool/Model |
|------|------|------|--------|-----------|
| V1.0 | 2026-05-02 | Initial version | OpenCode | deepseek-chat |
| V1.1 | 2026-05-04 | Added Apple Vision Framework + MediaPipe + FaceNet CoreML integration log | OpenCode | deepseek-chat |
| V2.0 | 2026-05-04 | Apple Vision replaces InsightFace detection; CoreML FaceNet remains the embedding model | OpenCode | deepseek-chat |
| V2.1 | 2026-05-04 | Corrupted frames are skipped; Charade frame 288,660 is a known bad frame | OpenCode | deepseek-chat |


@@ -85,3 +85,41 @@ related_documents:
| Memory | 1024 MB |
| GPU | Not used |
| Dependencies | None |
---
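## Apple Vision Framework Replacement
### POC Results
| Metric | Python PaddleOCR (PP-OCRv4) | Swift Vision (VNRecognizeTextRequest) |
|------|----------------------------|---------------------------------------|
| **Text detection** | Many low-quality results ("1", "48219 %,") | **9 blocks, conf=0.3~1.0** ("A08S2-TS", "4101") |
| **Speed per frame** | Slow (batch processing) | **0.43 s/frame** (640x360) |
| **Memory** | ~1 GB (PaddleOCR model) | **Low** (system framework) |
| **Languages** | 80+ | **30** (including zh-Hans/Hant) |
| **ANE acceleration** | ❌ CPU only | ✅ **Yes** |
| **Per-frame processing** | Needs batching for speed | ✅ Fast on individual frames |
### Selection Conclusion
In this POC, Vision Framework OCR outperformed PaddleOCR on speed, memory, and accuracy, and it runs on the ANE. PaddleOCR still supports more languages (80+ versus 30).
**Decision**: replace Python PaddleOCR with Swift Vision OCR.
### Implementation
`scripts/swift_processors/swift_ocr.swift` is a complete OCR processor. It supports:
- Frame-by-frame or sampled video processing
- JSON output compatible with the Python version
- Invocation from PythonExecutor through the `ocr_processor.py` wrapper
- Automatic language detection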
---
## Version History
| Version | Date | Purpose | Operator | Tool/Model |
|------|------|------|--------|-----------|
| V1.0 | 2026-05-02 | Initial version | OpenCode | deepseek-chat |
| V1.1 | 2026-05-04 | Replaced PaddleOCR with Apple Vision Framework | OpenCode | deepseek-chat |

View File

@@ -82,3 +82,52 @@ related_documents:
| Memory | 1024 MB |
| GPU | Supported (`uses_gpu = true`) |
| Dependencies | None |
---
## Apple Vision Framework Replacement
### POC Results
`VNDetectHumanBodyPoseRequest` (ANE-accelerated) replaces MediaPipe/YOLOv8 Pose.
Test video: Thunderbolt ExaSAN at CCBN (24fps, sample_interval=90)
| Metric | YOLOv8 Pose (CPU) | Vision Framework (ANE) |
|------|-------------------|----------------------|
| **Per frame** | **45ms** | **9ms** ⚡ |
| **Speedup** | 1x | **5x** |
| **Joints** | 17 keypoints (COCO) | **19 joints** |
| **ANE acceleration** | ❌ CPU only | ✅ **Yes** |
| **Memory** | ~1GB (PyTorch) | Very low (system framework) |
| **Joint quality** | ✅ Standard COCO | High confidence on neck/shoulders |
### Selection Conclusion
Vision Framework body pose beats YOLOv8 Pose on speed (5x) and resource usage, and ANE acceleration keeps the CPU free.
**Decision**: replace YOLOv8 Pose with Apple Vision Framework `VNDetectHumanBodyPoseRequest`.
### Implementation
`scripts/swift_processors/swift_pose.swift` is a complete Pose processor. It supports:
- Frame-by-frame or sampled video processing
- Output compatible with the Rust `PoseResult` struct
- Invocation from PythonExecutor through the `pose_processor.py` wrapper
- ANE acceleration with 19 joints (neck, shoulders, elbows, wrists, hips, knees, ankles, root, nose, eyes, ears)
### Related Files
```
scripts/swift_processors/swift_pose.swift # Vision Framework pose processor
scripts/swift_processors/pose_benchmark.swift # Benchmark test
```
---
## Version History
| Version | Date | Purpose | Operator | Tool/Model |
|------|------|------|--------|-----------|
| V1.0 | 2026-05-02 | Initial version | OpenCode | deepseek-chat |
| V1.1 | 2026-05-04 | Replaced YOLOv8 Pose with Apple Vision Framework | OpenCode | deepseek-chat |

View File

@@ -90,3 +90,89 @@ related_documents:
| Memory | 1024 MB |
| GPU | Supported (`yolo_processor_mps.py` can use MPS, 2~5x faster) |
| Dependencies | None |
---
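## Apple Vision Framework Replacement Evaluation
### POC Goal
Evaluate whether Apple Vision Framework can replace YOLOv8n for object detection, using ANE acceleration to cut memory use and processing time.
### Test Results
Test images: an exhibition scene with people (640x360) and an interview scene (1920x1080)
| Vision feature | Test result | YOLOv8n counterpart | Replaceable |
|------------|---------|-------------|--------|
| **VNClassifyImageRequest** | `people:0.94`, `adult:0.94`, `sign:0.40` | Scene classification (currently Places365) | ✅ **Can replace the Scene processor** |
| **VNDetectHumanRectanglesRequest** | 2 persons, conf=0.68~0.76 | YOLO 'person' class | ✅ **Can replace person detection** |
| **VNDetectHumanBodyPoseRequest** | 19 joints (neck, shoulders, wrists) | MediaPipe Pose | ✅ **Can replace the Pose processor** |
| **VNDetectHumanHandPoseRequest** | 1 hand, conf=1.0 | None | ✅ New capability |
| **VNGenerateObjectnessBasedSaliencyImageRequest** | 1 region, no class label | None | ⚠️ Salient regions only |
| **General object detection (car/dog/bottle/chair...)** | ❌ **No such API** | YOLO 80 COCO classes | ❌ **Cannot replace** |
### Key Limitation
Vision Framework has **no general-purpose object detector**. YOLOv8n detects 80 COCO classes (person, car, dog, bottle, chair, tv, and so on). Vision Framework covers only person-related detection (body, pose, hand pose) and scene classification, and cannot identify specific object classes.
### Selection Conclusion
| Use case | Approach |
|------|------|
| **Person detection** | Vision Framework **can replace** YOLO (faster, lighter) |
| **General object detection (car/dog/bottle)** | **Keep** YOLOv8n (Vision Framework cannot replace it) |
| **Scene classification** | Vision Framework **can replace** MIT Places365 |
| **Pose estimation** | Vision Framework **can replace** MediaPipe Pose |
If only the person class is needed, Vision Framework can fully replace YOLO. If the other 79 COCO classes are needed, YOLOv8n remains necessary.
### Related Files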
```
scripts/swift_processors/vision_object_test.swift
```
---
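## CoreML Acceleration Experiment
### Motivation
YOLOv8n runs PyTorch CPU inference (67ms/frame), and its **AGPL-3.0 license restricts commercial use**. Switching to YOLOv5n (recorded here as **Apache 2.0**) with CoreML conversion is intended to solve both the license and the performance problem. Note: upstream Ultralytics distributes YOLOv5 (including `yolov5nu`) under AGPL-3.0, so the Apache 2.0 label should be verified before commercial use.
### Test Results
| Engine | License | Per frame | Speedup | ANE |
|------|---------|-----------|--------|-----|
| **YOLOv8 PyTorch CPU** | AGPL-3.0 | 67ms | 1x | ❌ |
| **YOLOv8 CoreML** | AGPL-3.0 | 13ms | 5.3x | ✅ |
| **YOLOv5 PyTorch CPU** | **Apache 2.0** | 59ms | 1x | ❌ |
| **YOLOv5 CoreML** ⭐ | **Apache 2.0** | **13ms** | **4.5x** | ✅ |
**Decision**: replace YOLOv8 with YOLOv5 CoreML (`yolov5nu.mlpackage`).
### Implementation
Model loading order in `yolo_processor.py`:
1. `yolov5nu.mlpackage` (CoreML, ANE) → preferred
2. `yolov5nu.pt` (PyTorch CPU) → fallback
3. Automatic download (if no local file exists)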
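The loading order above can be sketched as a small resolver. This is a minimal illustration, not the code in `yolo_processor.py`: `MODEL_CANDIDATES`, `resolve_model`, and the backend labels are hypothetical, and the download step is only signalled, not performed.

```python
from pathlib import Path
from typing import Optional, Tuple

# Candidates in priority order, as listed above: (file name, backend label).
# The backend labels are illustrative, not identifiers from the project.
MODEL_CANDIDATES = [
    ("yolov5nu.mlpackage", "coreml"),
    ("yolov5nu.pt", "pytorch-cpu"),
]


def resolve_model(model_dir: Path) -> Tuple[Optional[Path], str]:
    """Return the first local model found, or (None, "download") if none exists."""
    for name, backend in MODEL_CANDIDATES:
        candidate = model_dir / name
        # .mlpackage is a directory bundle, so exists() covers both files and dirs.
        if candidate.exists():
            return candidate, backend
    return None, "download"
```

Keeping the priority list as data means a new backend can be added without touching the lookup logic.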
### Related Files
```
yolov5nu.mlpackage # CoreML model (5.2MB, Apache 2.0)
yolov5nu.pt # PyTorch weights (5.3MB, Apache 2.0)
scripts/yolo_processor.py
```
---
## Version History
| Version | Date | Purpose | Operator | Tool/Model |
|------|------|------|--------|-----------|
| V1.0 | 2026-05-02 | Initial version | OpenCode | deepseek-chat |
| V1.1 | 2026-05-04 | Replaced YOLOv8 (AGPL) with YOLOv5 CoreML (Apache 2.0) + Vision Framework evaluation | OpenCode | deepseek-chat |
| V1.1 | 2026-05-04 | Added the Apple Vision Framework replacement evaluation | OpenCode | deepseek-chat |

View File

@@ -56,10 +56,10 @@ related_documents:
|-----------|------|------|------|-----|-----|--------|------|
| ASR | ✅ 100% | faster-whisper (small) | None | No | 1.0 | 2048 MB | [Details](./PROCESSORS/ASR_V1.0.0.md) |
| CUT | ✅ 100% | PySceneDetect | None | No | 0.5 | 512 MB | [Details](./PROCESSORS/CUT_V1.0.0.md) |
- | YOLO | ✅ 100% | YOLOv8n | None | Yes | 0.3 | 1024 MB | [Details](./PROCESSORS/YOLO_V1.0.0.md) |
+ | YOLO | ✅ 100% | YOLOv5n (CoreML ANE) | None | Yes | 0.1 | 512 MB | [Details](./PROCESSORS/YOLO_V1.0.0.md) |
- | OCR | ✅ 100% | PaddleOCR PP-OCRv4 | None | No | 0.8 | 1024 MB | [Details](./PROCESSORS/OCR_V1.0.0.md) |
+ | OCR | ✅ 100% | Swift Vision VNRecognizeTextRequest | None | Yes (ANE) | 0.1 | 64 MB | [Details](./PROCESSORS/OCR_V1.0.0.md) |
- | Face | ✅ 100% | InsightFace buffalo_l | None | Yes | 0.6 | 1536 MB | [Details](./PROCESSORS/FACE_V1.0.0.md) |
+ | Face | ✅ 100% | InsightFace + CoreML FaceNet | None | Yes (ANE) | 0.3 | 512 MB | [Details](./PROCESSORS/FACE_V1.0.0.md) |
- | Pose | ✅ 100% | MediaPipe Pose | None | Yes | 0.4 | 1024 MB | [Details](./PROCESSORS/POSE_V1.0.0.md) |
+ | Pose | ✅ 100% | Swift Vision VNDetectHumanBodyPoseRequest | None | Yes (ANE) | 0.1 | 64 MB | [Details](./PROCESSORS/POSE_V1.0.0.md) |
| ASRX | ⚠️ 80% | SpeechBrain ECAPA-TDNN | ASR | No | 0.8 | 2048 MB | [Details](./PROCESSORS/ASRX_V1.0.0.md) |
| Scene | ✅ 100% | MIT Places365 | CUT | No | 0.3 | 512 MB | [Details](./PROCESSORS/SCENE_V1.0.0.md) |
| VisualChunk | ✅ Integrated | Rule-based aggregation (no model) | YOLO | No | 0.3 | 512 MB | [Details](./PROCESSORS/VISUAL_CHUNK_V1.0.0.md) |
@@ -106,3 +106,96 @@ YOLO ─→ VisualChunk
|------|------|------|--------|-----------|
| V1.0 | 2026-05-02 | Initial version, including the selection experiment report and resource estimates | OpenCode | deepseek-chat |
| V1.1 | 2026-05-03 | CUT adds cut_count/cut_max_duration; Scene drops its ASR dependency; dynamic Face scheduling for long videos; relaxed Job completion criteria | OpenCode | deepseek-chat |
---
## Frame Scheduling Architecture (V4.1)
### Problem
Each processor currently calls ffmpeg on its own to extract frames from the video, which repeats the ffmpeg decoding cost:
```
YOLO: ffmpeg extract → detect → write
OCR: ffmpeg extract → OCR → write ← ffmpeg again
Face: ffmpeg extract → detect → write ← ffmpeg again
Pose: ffmpeg extract → detect → write ← ffmpeg again
```
For a long video (6879s), each processor's ffmpeg overhead is roughly 15~30s, adding up to about 75s.
### Solution: Shared Frame Cache + Concurrent Scheduling
```
Pipeline Phase 1 (sequential):
  CUT → Scene → ASR → ASRX
Frame Cache Phase (single ffmpeg pass):
  ffmpeg extract → shared frame directory
  ├── frame_00001.jpg
  ├── frame_00002.jpg
  └── ...
Pipeline Phase 2 (concurrent, on shared frames):
  tokio::join!(
    OCR  (Swift Vision → frame dir)
    Face (CoreML FaceNet → frame dir)
    Pose (Swift Vision → frame dir)
    YOLO (CoreML → frame dir)
  )
```
### Implementation Modules
| Module | File | Description |
|------|------|------|
| `FrameManager` | `src/core/frame_cache.rs` | Runs the ffmpeg extraction and manages the frame directory lifecycle |
| `ProcessorTask.frame_dir` | `src/worker/processor.rs` | Passes the shared frame directory path to the child process |
| `MOMENTRY_FRAME_DIR` | env var | Set by the worker; a processor that reads it skips its own ffmpeg step |
### V1 Implementation Status
| Item | Status |
|------|------|
| `FrameManager::extract()` | ✅ Done: a single ffmpeg pass produces the shared frame directory |
| Passing the `MOMENTRY_FRAME_DIR` environment variable | ✅ `start_processor` sets it before spawn |
| Swift OCR (`swift_ocr.swift`) | ✅ Skips ffmpeg when `MOMENTRY_FRAME_DIR` is set |
| Swift Pose (`swift_pose.swift`) | ✅ Same as above |
| Python Face (`face_processor.py`) | ⏳ Not yet implemented |
| Python YOLO (`yolo_processor.py`) | ⏳ Not yet implemented |
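For the two Python processors still pending, the processor-side check might look like the sketch below. It assumes the `frame_*.jpg` naming from the frame cache layout; `shared_frame_dir` and `list_frames` are hypothetical helpers, not functions that exist in `face_processor.py` or `yolo_processor.py`.

```python
import os
from pathlib import Path
from typing import List, Optional


def shared_frame_dir() -> Optional[Path]:
    """Return the shared frame directory if the worker exported a usable one."""
    value = os.environ.get("MOMENTRY_FRAME_DIR", "").strip()
    if not value:
        return None
    path = Path(value)
    # Fall back to the processor's own ffmpeg extraction if the path is unusable.
    return path if path.is_dir() else None


def list_frames(frame_dir: Path) -> List[Path]:
    """List cached frames in frame order (zero-padded names sort lexicographically)."""
    return sorted(frame_dir.glob("frame_*.jpg"))
```

A processor would call `shared_frame_dir()` first and run its own ffmpeg extraction only when the result is `None`.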
### Flow
```rust
// job_worker.rs
// Extract frames once if any frame-based processor is scheduled.
let frame_needed = [OCR, Face, Pose, Yolo]
    .iter()
    .any(|p| processors_to_run.contains(p));
if frame_needed {
    let fm = FrameManager::extract(video, sample_interval).await;
    // fm.dir → /tmp/frames_{hash}/, containing every extracted .jpg
}
// processor.rs
fn start_processor(task: ProcessorTask) {
    if let Some(dir) = task.frame_dir {
        // Set before spawning so the child process inherits the variable.
        std::env::set_var("MOMENTRY_FRAME_DIR", dir);
    }
    tokio::spawn(async move { run_processor(...) });
}
```
### Benefits
| Metric | Improvement |
|------|------|
| ffmpeg invocations | 4 → **1** |
| Cumulative extract overhead | ~75s → **~15s** |
| Total OCR/Face/Pose/YOLO run time | N× when sequential → **roughly that of the slowest processor** |
---
## Version History
| Version | Date | Purpose | Operator | Tool/Model |
|------|------|------|--------|-----------|
| V1.0 | 2026-05-02 | Initial version | OpenCode | deepseek-chat |
| V1.1 | 2026-05-03 | CUT adds cut_count/cut_max_duration; Scene drops its ASR dependency; dynamic Face scheduling for long videos; relaxed Job completion criteria | OpenCode | deepseek-chat |
| V1.2 | 2026-05-04 | Added the Frame Scheduling architecture + V1 implementation (FrameManager, env var passing, Swift OCR/Pose support) | OpenCode | deepseek-chat |

View File

@@ -0,0 +1,322 @@
---
document_type: "spec"
service: "MOMENTRY_CORE"
title: "UUID Encoding Rules V1.0"
date: "2026-05-05"
version: "V1.0"
status: "design"
owner: "Warren"
created_by: "OpenCode"
tags:
- "momentry"
- "core"
- "uuid"
- "encoding"
- "v1.0"
ai_query_hints:
- "UUID encoding rules for identities, files, resources, jobs"
- "Deterministic UUID v5 for cross-system identity matching"
- "file_uuid 32-char birth UUID (hash of MAC+time+path+name)"
- "identity_uuid 32-char stripped UUIDv5"
related_documents:
- "../DUAL_EMBEDDING_PIPELINE_V1.0.0.md"
- "../CHUNK_DEFINITION_V1.0.0.md"
---
# UUID Encoding Rules V1.0
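## Purpose
Unify the UUID encoding rules for every resource in the system so that IDs never collide across systems, stay traceable, and carry no semantic ambiguity.
## UUID Rules per Resource
| Resource | Field | Generation | Length | Meaning |
|------|------|---------|------|---------|
| **File** | `file_uuid` | Birth UUID: `SHA256(MAC + registration_time + canonical_path + filename)` | 32 | MAC + time + path + filename → identical content registered on a different machine or at a different time still gets a different UUID |
| **Identity** | `identity_uuid` | UUIDv5: `UUIDv5(NS, source:external_id)` | 32 | source + external_id → uniquely determined across systems |
| **Job** | `job_uuid` | UUIDv4 random | 32 | Independent for each run |
| **Resource** | `resource_uuid` | UUIDv5: `UUIDv5(NS, hostname:resource_id)` | 32 | hostname + resource_id → same host and same ID always yield the same UUID |
## Identity UUIDv5 Encoding Rules
### Meaning
`identity_uuid` is a deterministic mapping of source + external_id.
The same external ID from the same source system always yields the same UUID.
Merging identities across systems therefore causes no conflicts.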
### Namespace
```
MOMENTRY_IDENTITY_NS = "6ba7b810-9dad-11d1-80b4-00c04fd430c8" // Standard DNS namespace
```
### Source-specific encoding
| Source | External ID | UUIDv5 Input | Collision probability |
|--------|------------|-------------|---------|
| `tmdb` | `"285"` (person_id) | `"tmdb:285"` | 0 (same source and same ID map to the same UUID) |
| `manual` | user-assigned name | `"manual:Cary Grant"` | 0 (same name and same source map to the same UUID) |
| `face_cluster` | `file_uuid + cluster_id` | `"cluster:384b0ff...:cluster_0"` | Very low (across files) |
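The encoding in the table above can be sketched with Python's standard `uuid` module. This is a minimal illustration under the stated rules; the function name `identity_uuid` is hypothetical and the project's actual implementation may differ.

```python
import uuid

# Namespace from the section above (the standard DNS namespace).
MOMENTRY_IDENTITY_NS = uuid.UUID("6ba7b810-9dad-11d1-80b4-00c04fd430c8")


def identity_uuid(source: str, external_id: str) -> str:
    """Deterministic 32-char hex UUIDv5 for a source:external_id pair."""
    name = f"{source}:{external_id}"
    # .hex yields the dash-free 32-character form used throughout this spec.
    return uuid.uuid5(MOMENTRY_IDENTITY_NS, name).hex
```

Calling `identity_uuid("tmdb", "285")` on any machine returns the same string, which is what makes merging identity sets across systems safe.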
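### Advantages
1. **Cross-system determinism**: on any machine and in any run, the same TMDb actor always gets the same UUID
2. **Safe merging**: identity sets produced by two systems can be merged directly without UUID conflicts
3. **Traceability**: the source cannot be derived from the UUID itself (one-way hash), but it can be looked up through DB metadata
4. **Negligible collision risk**: distinct source + external_id inputs map to distinct UUIDs, barring a hash collision
### Migrating Existing Data
```
1. Read all identities
2. Compute UUIDv5("tmdb:{tmdb_id}") as the new identity_uuid
3. For manually registered identities, use UUIDv5("manual:{name}")
4. Update face_detections.identity_id to point to the new UUID
5. Update chunks metadata
```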
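## File UUID (Unchanged)
File UUID = the first 128 bits (32 hex chars) of `SHA256(MAC + registration_time + canonical_path + filename)`.
The rule stays the same across the system: the same file registered on a different machine gets a different UUID but remains traceable. **No change.**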
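A minimal sketch of the Birth UUID computation follows. The spec does not say how the four fields are concatenated or formatted, so the `|` separator, the string form of each field, and the name `birth_uuid` are assumptions made for illustration.

```python
import hashlib


def birth_uuid(mac: str, registration_time: str, canonical_path: str, filename: str) -> str:
    """First 128 bits (32 hex chars) of SHA256 over the four birth fields."""
    # Assumption: fields are joined with "|" so that field boundaries stay unambiguous.
    payload = "|".join([mac, registration_time, canonical_path, filename])
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return digest[:32]
```

Because the MAC address and the registration time are part of the input, the same file registered on another machine or at another time yields a different `file_uuid`.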
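## Job UUID (Upgrade)
Jobs currently use `INTEGER auto-increment` (safe on a single machine, collides across machines).
Switch to `UUIDv4` (32 hex) to support parallel workers on multiple machines.
## Resource UUID (New)
Resources currently use an arbitrary `resource_id` string.
Switch to `UUIDv5(namespace, hostname:resource_id)` (32 hex) so that registrations from multiple machines do not collide.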
### Resource Categories
| Category | resource_type | Description | Current instances |
|------|--------------|------|---------|
| `compute` | worker, server | Compute nodes | momentry_playground worker/server |
| `storage` | postgres, mongodb, redis, qdrant, mariadb | Data storage | localhost services |
| `ai` | ollama, llama_cpp, embedding | AI/ML inference services | Ollama serve, llama-server |
| `proxy` | caddy, sftpgo | Reverse proxy / file service | Caddy, SFTPGo |
| `web` | wordpress, php-fpm | Frontend portal | WordPress |
| `external` | tmdb, n8n | External API integrations | TMDb API, n8n |
### Resource Lifecycle Fields
| Field | Type | Description | Example |
|------|------|------|------|
| `resource_uuid` | 32 hex | Unique UUIDv5 identifier | `a4f288...` |
| `resource_type` | enum | compute/storage/ai/proxy/web/external | `ai` |
| `resource_subtype` | string | ollama, llama_cpp, postgres... | `ollama` |
| `hostname` | string | Host running the service | `mac-studio.local` |
| `port` | int | Service port | `11434` |
| `started_at` | timestamp | Start time | `2026-05-05T10:00:00Z` |
| `stopped_at` | timestamp | Stop time (NULL = still running) | `NULL` |
| `config` | jsonb | Runtime parameters / environment settings | `{"model":"nomic-embed-text-v2-moe","dim":768}` |
| `install_source` | string | Installation source | `homebrew`, `docker`, `binary`, `source` |
| `install_path` | string | Installation path | `/opt/homebrew/opt/ollama` |
| `location` | string | Physical / network location | `localhost`, `rackserver-01` |
| `status` | enum | running/stopped/error/unknown | `running` |
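### Current Service Instances
| resource_type | subtype | port | license | Commercial use |
|--------------|---------|------|---------|------|
| ai | ollama | 11434 | MIT | ✅ |
| ai | llama_cpp | 8081 | MIT | ✅ |
| storage | postgres | 5432 | PostgreSQL | ✅ |
| storage | mongodb | 27017 | SSPL v1 | ⚠️ Not OSI-approved open source. Internal use is unrestricted; may not be resold as a DB service |
| storage | redis | 6379 | RSALv2 / SSPL | ⚠️ Dual-licensed since 7.4. Internal use is unrestricted; may not be resold as a cloud service |
| storage | qdrant | 6333 | Apache 2.0 | ✅ |
| proxy | caddy | 443 | Apache 2.0 | ✅ |
| proxy | sftpgo | 8080 | AGPL-3.0 | ⚠️ Offering it as a network service triggers copyleft. Risk is lower while the source is unmodified; evaluate alternatives for commercial use |
### sftpgo Alternatives
sftpgo provides SFTP, HTTP file serving, a Web UI, and user management. Each layer can be replaced as needed:
| Feature | Alternative | License | Notes |
|------|---------|---------|------|
| HTTP file serve | **Caddy** `file_server` | Apache 2.0 ✅ | Already running. One line of config serves a directory |
| WebDAV | **Caddy** `webdav` plugin | Apache 2.0 ✅ | Only if WebDAV mounts are needed |
| SFTP protocol | **OpenSSH** `internal-sftp` | BSD ✅ | Built into macOS; nothing extra to install |
| User management | **Caddy** `basicauth` | Apache 2.0 ✅ | Basic auth is sufficient |
| Web admin UI | Not needed | — | Unnecessary if only file serving is required |
**Recommendation**: first replace the HTTP side with Caddy `file_server` and use OpenSSH for SFTP. sftpgo can then be retired gradually before commercial licensing becomes an issue. Caddy already handles TLS, reverse proxying, and basic auth, so sftpgo's overlapping features are not needed.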
```caddyfile
# Example: Caddy replacing sftpgo file serving (with user access control)
files.momentry.ddns.net {
	root * /Users/accusys/momentry/var/sftpgo/data
	# Access control, pick one of three:
	# 1. Basic Auth (simplest)
	basicauth {
		demo $2a$14$hashed_password_here
	}
	# 2. JWT token (via forward_auth)
	# forward_auth localhost:9001 {
	# 	uri /api/v1/auth/verify
	# 	copy_headers Authorization
	# }
	# 3. IP whitelist (internal network only)
	# @allowed remote_ip 192.168.1.0/24 127.0.0.1
	file_server browse
	import common_log sftpgo_access
}
```
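### User Access Control Comparison
| Method | Complexity | Suitable for |
|------|--------|---------|
| **basicauth** | Low | A few fixed users; password hashes live in the config |
| **forward_auth** | Medium | Tokens validated centrally by the momentry API |
| **IP whitelist** | Low | Internal services not exposed externally |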
Remaining service instances:
| resource_type | subtype | port | license | Commercial use |
|--------------|---------|------|---------|------|
| compute | worker | — | MIT | ✅ |
| compute | server | 3002/3003 | MIT | ✅ |
| external | tmdb | — | TMDb ToS | ⚠️ Alternatives: manual upload, an in-house actor database |
| external | n8n | 5678 | Sustainable Use | ⚠️ Commercial use requires a paid license |
| web | wordpress | 80/443 (via caddy) | GPL-2.0 | ✅ Portal frontend |
| web | php | 9000 (php-fpm) | PHP License | ✅ |
| storage | mariadb | 3306 | GPL-2.0 | ✅ WordPress DB backend |
### Log Paths
Each service's logs live under `/Users/accusys/momentry/var/{service}/log/`:
| service | stdout log | error log |
|---------|-----------|-----------|
| sftpgo | `var/sftpgo/log/stdout.log` | `var/sftpgo/log/stderr.log` |
| n8n | `var/n8n/n8n-main.log` | `var/n8n/n8n-main-error.log` |
| mariadb | `var/mariadb/ddl_recovery.log` | `var/mariadb/tc.log` |
momentry core itself (playground / production) currently logs to `/tmp/` (development) or the systemd journal (production). Both should migrate to:
| Environment | Port | Log directory |
|------|------|---------|
| dev | 3003 | `/Users/accusys/momentry/log/dev/` |
| public (production) | 3002 | `/Users/accusys/momentry/log/public/` |
Log naming within each environment:
```
momentry/log/dev/
├── momentry.log # API server stdout
├── momentry.error.log # API server stderr
├── worker.log # Worker stdout
├── worker.error.log # Worker stderr
├── processor/
│ ├── face.log # Face processor output
│ └── asr.log # ASR processor output
└── agent/
├── story.log
└── identity.log
momentry/log/public/
└── (same structure)
```
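### Isolation Rules
| Rule | Description |
|------|------|
| Never cross | dev logs are never written to public, and vice versa |
| Environment identification | The log path alone identifies the source environment |
| Independent rotation | Each environment has its own logrotate rules |
| Safe cleanup | Clearing dev logs does not affect public |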
## URL Path Convention
All UUIDs in URLs use the **32-char hex (no dashes)** format:
```
GET /api/v1/files/384b0ff44aaaa1f14cb2cd63b3fea966 ← file
GET /api/v1/identities/3f5d1e09ce86c27aa631162052ec9c97 ← identity
GET /api/v1/jobs/942d0bdf5d6fb6ac18b47deb031e60c3 ← job
```
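### Why Strip Dashes
1. **Consistency**: file_uuid is already 32 hex chars without dashes, so every UUID follows one style
2. **Brevity**: the UUID in the URL shrinks from 36 to 32 chars
3. **Tolerance**: input accepts both formats; output is always stripped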
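The tolerance rule can be sketched with the standard `uuid` module, whose constructor accepts both the dashed and the dash-free form. `normalize_uuid` is a hypothetical helper name, not an existing function in the API.

```python
import uuid


def normalize_uuid(value: str) -> str:
    """Accept a 36-char dashed or 32-char bare UUID; return 32-char lowercase hex.

    Raises ValueError when the input is not a valid 128-bit hex identifier.
    """
    return uuid.UUID(value.strip()).hex
```

A route handler would call this once at the boundary, so the rest of the code only ever sees the stripped form.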
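## Design Notes
### Why File UUID Is a Birth UUID Rather Than a Content Hash
A content hash (MD5/SHA256 of the file content) fits scenarios where "same content = same file". momentry's situation is different:
- The same video may exist in different cuts (commercial, trailer, full version)
- The same video registered on different machines should be distinguishable (source tracking)
- It must be possible to trace which machine registered which file, and when
Hence Birth UUID = `SHA256(MAC + time + path + filename)` rather than a content hash.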
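### Obtaining an Identity's First Reference Face
TMDb is just **one way** to obtain the first reference photo, not the only source:
```
1. TMDb (or another source) → download the photo
2. Extract the face embedding → write it to identities.face_embedding
3. Delete the photo (the original file is not kept)
4. Use this embedding to find the first matching video trace
5. Take the 3 best video faces from that trace → replace the external embedding → they become the identity reference
```
From then on, the identity reference comes entirely from video faces and no longer depends on external photos.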
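Step 4 of the flow above could be sketched as follows, assuming cosine similarity as the matching metric (the metric named for the TMDb post-process phase). The 0.6 threshold, the per-trace embedding dictionary, and both function names are assumptions for illustration, not values taken from the project.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_matching_trace(
    reference: Sequence[float],
    trace_embeddings: Dict[str, Sequence[float]],
    threshold: float = 0.6,  # assumed cut-off, not a project setting
) -> Optional[str]:
    """Return the trace id most similar to the reference, or None below threshold."""
    best_id, best_score = None, threshold
    for trace_id, embedding in trace_embeddings.items():
        score = cosine_similarity(reference, embedding)
        if score >= best_score:
            best_id, best_score = trace_id, score
    return best_id
```

Returning `None` when nothing clears the threshold keeps an external photo from being bound to an unrelated trace.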
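**⚠️ TMDb commercial licensing**: the TMDb API restricts commercial use. Before product launch, either obtain a license or switch to an alternative:
1. Manually upload reference photos
2. Cross-file identity merge (take the reference from existing traces)
3. An in-house actor database
When merging identities across systems, you need to know whether "TMDb actor 285" refers to the same person on each system. UUIDv5 provides a deterministic mapping:
- `tmdb:285` → always `cc6b8c2569ff5dec8f9e33164c7756b3`
- Any system computing it at any time gets the same result
- No central registry is needed; the result is mathematically guaranteed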
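### Migration Strategy for Existing Data
The identity UUID migration is non-destructive: the old UUID is kept in `metadata.legacy_uuid` and the new UUID is written to `identities.uuid`, so existing queries remain backward compatible.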
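A non-destructive migration of one identity row might look like the sketch below. The row layout (`uuid`, `name`, optional `tmdb_id`, `metadata`) and the name `migrate_identity` are assumptions; only the `legacy_uuid` field and the `tmdb:` / `manual:` inputs come from this spec.

```python
import uuid
from typing import Any, Dict

# Namespace from the Identity section (the standard DNS namespace).
MOMENTRY_IDENTITY_NS = uuid.UUID("6ba7b810-9dad-11d1-80b4-00c04fd430c8")


def migrate_identity(row: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the row with a UUIDv5 id and the old id kept in metadata."""
    if row.get("tmdb_id") is not None:
        key = f"tmdb:{row['tmdb_id']}"
    else:
        # Manually registered identities are keyed by name.
        key = f"manual:{row['name']}"
    metadata = dict(row.get("metadata") or {})
    metadata["legacy_uuid"] = row["uuid"]  # preserved for backward-compatible lookups
    migrated = dict(row)
    migrated["uuid"] = uuid.uuid5(MOMENTRY_IDENTITY_NS, key).hex
    migrated["metadata"] = metadata
    return migrated
```

The function copies the row instead of mutating it, so a failed batch can be retried from the original data.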
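## UUIDs and Independent Workspaces
Each resource has its own working space, inputs, and outputs, so resources never contaminate one another:
| Resource | UUID | Working Space | Input | Output |
|------|------|--------------|------|------|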
| **File** | `file_uuid` | `output_dev/{uuid}/` | `{video_path}` | `{uuid}.cut.json, .asr.json, .face.json, ...` |
| **Identity** | `identity_uuid` | `dev.identities` table | face_detections, voice_embeddings, TMDb API, manual input | `identities.face_embedding`, `identities.voice_embedding`, `identity_bindings`, `file_identities` |
| **Job** | `job_uuid` | `dev.monitor_jobs` + `dev.processor_results` | `processors[]` list | `processor_results.status`, log entries |
| **Resource** | `resource_uuid` | `var/{resource}/log/` | config, exec_path | log files, heartbeat records |
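A File's workspace lives on the filesystem; Identity/Job/Resource workspaces live in the DB. Each directory or table is independent, so deleting one does not affect the others.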
## Dev / Public Isolation Table
| Resource | dev | public |
|------|-----|--------|
| DB Schema | `dev.*` | `public.*` |
| Qdrant | `momentry_dev_*` | `momentry_*` |
| Redis prefix | `momentry_dev:` | `momentry:` |
| Output dir | `output_dev/` | `output/` |
| Log | `log/dev/` | `log/public/` |
| Resource UUID | `UUIDv5(hostname:xxx_dev)` | `UUIDv5(hostname:xxx)` |
| Port | 3003 | 3002 |
| .env file | `.env.development` | `.env` |
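The isolation table can be captured as data so that every component derives its names from a single place. This is a sketch only: `EnvProfile`, `PROFILES`, and `redis_key` are hypothetical names, while the values are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvProfile:
    """Per-environment naming rules, one field per row of the isolation table."""
    db_schema: str
    qdrant_prefix: str
    redis_prefix: str
    output_dir: str
    log_dir: str
    port: int
    env_file: str


PROFILES = {
    "dev": EnvProfile("dev", "momentry_dev_", "momentry_dev:", "output_dev/", "log/dev/", 3003, ".env.development"),
    "public": EnvProfile("public", "momentry_", "momentry:", "output/", "log/public/", 3002, ".env"),
}


def redis_key(env: str, key: str) -> str:
    """Prefix a Redis key with the environment's namespace."""
    return PROFILES[env].redis_prefix + key
```

Looking up an unknown environment raises `KeyError`, which fails fast instead of silently writing dev data into public.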
## Version History
| Version | Date | Changes |
|------|------|------|
| V1.0 | 2026-05-05 | Birth UUID (file), UUIDv5 (identity), UUIDv4 (job), UUIDv5 (resource). Resource categories and lifecycle. |

View File

@@ -0,0 +1 @@
{"status":"ok","version":"1.0.0","uptime_ms":204684}

View File

@@ -0,0 +1 @@
{"status":"ok","version":"1.0.0","uptime_ms":204716,"services":{"postgres":{"status":"ok","latency_ms":10,"error":null},"redis":{"status":"ok","latency_ms":0,"error":null},"qdrant":{"status":"ok","latency_ms":1,"error":null},"mongodb":{"status":"ok","latency_ms":0,"error":null}}}

View File

@@ -0,0 +1 @@
{"success":true,"message":"Login successful","api_key":"muser_test_001","user":{"username":"demo"}}

View File

@@ -0,0 +1 @@
{"success":true}

View File

@@ -0,0 +1 @@
{"success":true,"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","file_name":"Old_Time_Movie_Show_-_Charade_1963.HD.mov","file_path":"/Users/accusys/test_video/Old_Time_Movie_Show_-_Charade_1963.HD.mov","metadata":{"format":{"size":"2361629896","bit_rate":"2746348","duration":"6879.329524","filename":"/Users/accusys/test_video/Old_Time_Movie_Show_-_Charade_1963.HD.mov","format_name":"mov,mp4,m4a,3gp,3g2,mj2"},"streams":[{"tags":{"language":"und","handler_name":"ISO Media file produced by Google Inc."},"index":0,"width":1920,"height":1080,"channels":null,"duration":"6879.255717","nb_frames":"412343","codec_name":"h264","codec_type":"video","sample_rate":null,"r_frame_rate":"60000/1001"},{"tags":{"language":"eng","handler_name":"ISO Media file produced by Google Inc."},"index":1,"width":null,"height":null,"channels":2,"duration":"6879.329524","nb_frames":"296268","codec_name":"aac","codec_type":"audio","sample_rate":"44100","r_frame_rate":"0/0"}]},"created_at":"2026-05-03T07:44:43.384236Z"}

View File

@@ -0,0 +1 @@
error returned from database: relation "file_identities" does not exist

View File

@@ -0,0 +1 @@
{"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","file_name":"Old_Time_Movie_Show_-_Charade_1963.HD.mov","duration":6879.329524,"width":1920,"height":1080,"fps":59.94005994005994,"total_frames":412343,"cached":true,"format":{"filename":"/Users/accusys/test_video/Old_Time_Movie_Show_-_Charade_1963.HD.mov","format_name":"mov,mp4,m4a,3gp,3g2,mj2","duration":"6879.329524","size":"2361629896","bit_rate":"2746348"},"streams":[{"index":0,"codec_name":"h264","codec_type":"video","width":1920,"height":1080,"r_frame_rate":"60000/1001","nb_frames":"412343","duration":"6879.255717","sample_rate":null,"channels":null,"tags":{"language":"und","handler_name":"ISO Media file produced by Google Inc."}},{"index":1,"codec_name":"aac","codec_type":"audio","width":null,"height":null,"r_frame_rate":"0/0","nb_frames":"296268","duration":"6879.329524","sample_rate":"44100","channels":2,"tags":{"language":"eng","handler_name":"ISO Media file produced by Google Inc."}}]}

View File

@@ -0,0 +1 @@
{"success":true,"total":0,"page":1,"page_size":20,"data":[{"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","file_name":"Old_Time_Movie_Show_-_Charade_1963.HD.mov","file_path":"/Users/accusys/test_video/Old_Time_Movie_Show_-_Charade_1963.HD.mov","status":"ready"},{"file_uuid":"0bfb7f3b8f529e806a8dc325b1e989f6","file_name":"Old Felix the Cat Cartoon.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Old Felix the Cat Cartoon.mp4","status":"ready"},{"file_uuid":"078975658e04529ee06f8d11cd7ba226","file_name":"Gamma 8-Director Chih-Lin Yang Shares His Experience:楊智麟導演經驗分享.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Gamma 8-Director Chih-Lin Yang Shares His Experience:楊智麟導演經驗分享.mp4","status":"ready"},{"file_uuid":"6f10e2e58146425947f047948de7a11a","file_name":"Alice Comedies-Alice's Mysterious Mystery (1926).mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Alice Comedies-Alice's Mysterious Mystery (1926).mp4","status":"ready"},{"file_uuid":"80459593c892f50d271e2408a79b1391","file_name":"Walt Disney - 1925 - Alice the Toreador.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Walt Disney - 1925 - Alice the Toreador.mp4","status":"ready"},{"file_uuid":"7a80cb575b873b7eea99002a7e6cfa1d","file_name":"view7.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/view7.mp4","status":"ready"},{"file_uuid":"d5f6a63b1065f496ac3eca62d3c67416","file_name":"view28.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/view28.mp4","status":"ready"},{"file_uuid":"e4bd8e594cb4824d15ab45522780c752","file_name":"view15.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/view15.mp4","status":"ready"},{"file_uuid":"4583cd2c15844238ac2eefdc1241a3ba","file_name":"view13.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/view13.mp4","status":"ready"},{"file_uuid":"84470206e42e1622f8a299f0089172c1","file_name":"Top Colorist Blake Jones Speaks about the Gamma 
Carry.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Top Colorist Blake Jones Speaks about the Gamma Carry.mp4","status":"ready"},{"file_uuid":"477d8fa7bc0e1a70d89cc0022b7ebfd2","file_name":"Thunderbolt ExaSAN at CCBN 中国国际广播电视信息网络展览会清.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Thunderbolt ExaSAN at CCBN 中国国际广播电视信息网络展览会清.mp4","status":"ready"},{"file_uuid":"65d6a1e7d1c7606ca588a30137a0cc60","file_name":"steamboat-willie_1928.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/steamboat-willie_1928.mp4","status":"ready"},{"file_uuid":"420f196bbab651616eb8ea49b74feabd","file_name":"Old Felix the Cat Cartoon.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Old Felix the Cat Cartoon.mp4","status":"ready"},{"file_uuid":"cf711e5ee9edd60a827ef2f4f5807eec","file_name":"KOBA 2022 Interview SBU Accusys Storage.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/KOBA 2022 Interview SBU Accusys Storage.mp4","status":"ready"},{"file_uuid":"d261e9add96fbe4fa84abb5832989b64","file_name":"Gamma Carry Saves the World..mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Gamma Carry Saves the World..mp4","status":"ready"},{"file_uuid":"fe9542b6149643d3bf71e46bd2967267","file_name":"Gamma 8-Director Chih-Lin Yang Shares His Experience:楊智麟導演經驗分享.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Gamma 8-Director Chih-Lin Yang Shares His Experience:楊智麟導演經驗分享.mp4","status":"ready"},{"file_uuid":"8e2e98c49355935f662cf1fb23c37c91","file_name":"ExaSAN Webinar by Blake Jones, Vision2see.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/ExaSAN Webinar by Blake Jones, Vision2see.mp4","status":"ready"},{"file_uuid":"a4f2880616e82a03c862831fbcd3477b","file_name":"ExaSAN PCIe series - Director Ou Yu-Zhi Shares His Experience.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/ExaSAN PCIe series - Director Ou Yu-Zhi Shares His 
Experience.mp4","status":"ready"},{"file_uuid":"c4e4d53de3b678469e0fdf9d4c1fb257","file_name":"animal4.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/animal4.mp4","status":"ready"},{"file_uuid":"1d5b574b4e6cbb2ead4ba5da5ff8c746","file_name":"Alice Comedies-Alice's Mysterious Mystery (1926).mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Alice Comedies-Alice's Mysterious Mystery (1926).mp4","status":"ready"}]}

View File

@@ -0,0 +1 @@
Failed to deserialize the JSON body into the target type: missing field `file_path` at line 1 column 55

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1 @@
{"job_id":133,"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","status":"PENDING","pids":[0,0,0],"message":"Processing triggered for Old_Time_Movie_Show_-_Charade_1963.HD.mov"}

View File

@@ -0,0 +1 @@
{"success":false,"uuid":"","message":"Either uuid or file_path+pattern is required"}

View File

@@ -0,0 +1 @@
{"error":"Identity not found: a9a90105-6d6b-46ff-92da-0c3c1a57dff4"}

View File

@@ -0,0 +1,10 @@
Script failed: Traceback (most recent call last):
File "/Users/accusys/momentry_core_0.1/scripts/select_face_reference_vectors_v2.py", line 468, in <module>
main()
File "/Users/accusys/momentry_core_0.1/scripts/select_face_reference_vectors_v2.py", line 422, in main
angle_groups = group_faces_by_angle(args.face_json)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/accusys/momentry_core_0.1/scripts/select_face_reference_vectors_v2.py", line 60, in group_faces_by_angle
with open(face_json_path) as f:
^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'test'

View File

@@ -0,0 +1 @@
error returned from database: relation "file_identities" does not exist

View File

@@ -0,0 +1 @@
error returned from database: column "updated_at" does not exist

View File

@@ -0,0 +1 @@
error returned from database: relation "file_identities" does not exist

View File

@@ -0,0 +1 @@
{"identities":[{"id":22,"name":"Raoul Delfosse","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Taxi Driver (uncredited)","tmdb_cast_order":14,"tmdb_movie_title":"Charade"}},{"id":21,"name":"Albert Daumergue","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Man in Stamp Market (uncredited)","tmdb_cast_order":13,"tmdb_movie_title":"Charade"}},{"id":20,"name":"Marcel Bernier","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Taxi Driver (uncredited)","tmdb_cast_order":12,"tmdb_movie_title":"Charade"}},{"id":19,"name":"Claudine Berg","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Maid (uncredited)","tmdb_cast_order":11,"tmdb_movie_title":"Charade"}},{"id":18,"name":"Marc Arian","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Subway Passenger (uncredited)","tmdb_cast_order":10,"tmdb_movie_title":"Charade"}},{"id":17,"name":"Thomas Chelimsky","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Jean-Louis Gaudel","tmdb_cast_order":9,"tmdb_movie_title":"Charade"}},{"id":16,"name":"Paul Bonifas","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Mr. 
Felix","tmdb_cast_order":8,"tmdb_movie_title":"Charade"}},{"id":15,"name":"Jacques Marin","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Edouard Grandpierre","tmdb_cast_order":7,"tmdb_movie_title":"Charade"}},{"id":14,"name":"Ned Glass","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Leopold Gideon","tmdb_cast_order":6,"tmdb_movie_title":"Charade"}},{"id":13,"name":"Dominique Minot","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Sylvie Gaudel","tmdb_cast_order":5,"tmdb_movie_title":"Charade"}},{"id":12,"name":"George Kennedy","metadata":{"speaker_id":"SPEAKER_9","tmdb_movie_id":4808,"speaker_method":"mar_lip_analysis","tmdb_character":"Herman Scobie","tmdb_cast_order":4,"tmdb_movie_title":"Charade","speaker_confidence":0.85}},{"id":11,"name":"James Coburn","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Tex Panthollow","tmdb_cast_order":3,"tmdb_movie_title":"Charade"}},{"id":10,"name":"Walter Matthau","metadata":{"speaker_id":"SPEAKER_4","tmdb_movie_id":4808,"speaker_method":"mar_lip_analysis","tmdb_character":"Hamilton Bartholemew","tmdb_cast_order":2,"tmdb_movie_title":"Charade","speaker_confidence":0.85}},{"id":9,"name":"Audrey Hepburn","metadata":{"speaker_id":"SPEAKER_1","tmdb_movie_id":4808,"speaker_method":"mar_lip_analysis","tmdb_character":"Regina Lampert","tmdb_cast_order":1,"tmdb_movie_title":"Charade","speaker_confidence":0.85}},{"id":8,"name":"Cary Grant","metadata":{"speaker_id":"SPEAKER_0","tmdb_movie_id":4808,"speaker_method":"mar_lip_analysis","tmdb_character":"Peter Joshua","tmdb_cast_order":0,"tmdb_movie_title":"Charade","speaker_confidence":0.85}}],"count":15,"page":1,"page_size":20}

View File

@@ -0,0 +1 @@
{"error":"Source identity not found"}

View File

@@ -0,0 +1 @@
{"error":"error returned from database: column \"identity_confidence\" of relation \"face_detections\" does not exist"}

View File

@@ -0,0 +1 @@
Query error: error returned from database: column "bbox" does not exist

View File

@@ -0,0 +1 @@
{"results":[],"query":"stolen fortune thriller"}

View File

@@ -0,0 +1 @@
{"error":"Search error: error returned from database: column f.pose_results does not exist"}

View File

@@ -0,0 +1 @@
{"results":[{"uuid":"unknown","chunk_id":"unknown","chunk_type":"","start_time":0.0,"end_time":0.0,"text":"","vector_score":0.7524489760398865,"bm25_score":0.0,"combined_score":6.067750513553619}],"query":"Paris apartment scene"}

@@ -0,0 +1 @@
{"error":"error returned from database: column \"scene_order\" does not exist"}

@@ -0,0 +1 @@
{"error":"error returned from database: column \"uuid\" does not exist"}

@@ -0,0 +1 @@
{"results":[],"query":"Cary Grant as mysterious stranger"}

@@ -0,0 +1 @@
Failed to deserialize the JSON body into the target type: criteria: missing field `required_classes` at line 1 column 56

@@ -0,0 +1 @@
{"jobs":[{"id":132,"uuid":"417a7e93860d70c87aee6c4c1b715d70","status":"pending","current_processor":null,"progress_current":0,"progress_total":0,"created_at":"2026-05-05 15:07:51.891007+00","started_at":null},{"id":133,"uuid":"417a7e93860d70c87aee6c4c1b715d70","status":"pending","current_processor":null,"progress_current":0,"progress_total":0,"created_at":"2026-05-05 15:11:04.023419+00","started_at":null}],"count":2,"page":1,"page_size":20}

@@ -0,0 +1 @@
{"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","user":null,"group":null,"file_name":"Old_Time_Movie_Show_-_Charade_1963.HD.mov","duration":6879.329524,"overall_progress":0,"cpu_percent":4.5,"gpu_percent":null,"memory_percent":0.2,"memory_mb":29344,"system":{"cpu_idle_pct":50.0,"memory_available_mb":2949,"memory_total_mb":16384,"memory_used_pct":82.0,"gpu_available":false,"gpu_utilization_pct":null,"gpu_memory_used_pct":null,"dynamic_concurrency":2,"config_concurrency":2,"running_processors":2},"processors":[{"name":"asr","status":"pending","current":0,"total":0,"progress":0,"message":"","frames_processed":0,"chunks_produced":0,"retry_count":0},{"name":"cut","status":"pending","current":0,"total":0,"progress":0,"message":"","frames_processed":0,"chunks_produced":0,"retry_count":0},{"name":"asrx","status":"pending","current":0,"total":0,"progress":0,"message":"","frames_processed":0,"chunks_produced":0,"retry_count":0},{"name":"yolo","status":"pending","current":0,"total":0,"progress":0,"message":"","frames_processed":0,"chunks_produced":0,"retry_count":0},{"name":"ocr","status":"running","current":0,"total":0,"progress":0,"message":"","frames_processed":0,"chunks_produced":0,"retry_count":0},{"name":"face","status":"running","current":0,"total":0,"progress":0,"message":"","frames_processed":0,"chunks_produced":0,"retry_count":0},{"name":"pose","status":"completed","current":0,"total":0,"progress":0,"message":"","frames_processed":0,"chunks_produced":8191,"retry_count":0}]}

@@ -0,0 +1 @@
{"rule":"story","supported_processor_ids":[],"active_jobs":[]}

@@ -0,0 +1 @@
error returned from database: relation "resources" does not exist

@@ -0,0 +1 @@
Failed to deserialize the JSON body into the target type: missing field `resource_id` at line 1 column 69

@@ -0,0 +1 @@
Failed to deserialize the JSON body into the target type: missing field `resource_id` at line 1 column 22

@@ -0,0 +1 @@
Failed to deserialize the JSON body into the target type: missing field `file_uuid` at line 1 column 22

@@ -0,0 +1 @@
Failed to deserialize the JSON body into the target type: missing field `file_uuids` at line 1 column 35

@@ -0,0 +1 @@
error returned from database: column "uuid" does not exist

@@ -0,0 +1 @@
{"success":true,"agent_name":"Identity Agent","version":"1.0.0","supported_models":["gemma4","qwen3"],"default_thresholds":{"auto_merge_threshold":0.8,"llm_threshold":0.5,"face_similarity_threshold":0.3}}

@@ -0,0 +1 @@
Face clustered data not found for video: 417a7e93860d70c87aee6c4c1b715d70

@@ -0,0 +1 @@
Face clustered data not found for video: 417a7e93860d70c87aee6c4c1b715d70

@@ -0,0 +1 @@
error returned from database: relation "file_identities" does not exist

@@ -0,0 +1 @@
{"success":true,"translated_text":"你好,世界","source_language_detected":"unknown","model_used":"qwen3:latest"}

@@ -0,0 +1 @@
Failed to deserialize the JSON body into the target type: missing field `enabled` at line 1 column 15

@@ -0,0 +1 @@
{"ollama":{"engine":"Ollama","model":"nomic-embed-text","status":"ok","latency_ms":4,"error":null},"llama_server":{"engine":"llama-server","model":"gemma4_e4b_q5","status":"error","latency_ms":null,"error":"error sending request for url (http://localhost:8081/v1/models)"}}

@@ -0,0 +1 @@
{"username":"demo","home_dir":"/Users/accusys/momentry/var/sftpgo/data/demo","files_count":103,"registered_videos":[{"uuid":"384b0ff44aaaa1f14cb2cd63b3fea966","file_name":"Old_Time_Movie_Show_-_Charade_1963.HD.mov","status":"failed"},{"uuid":"dd61fda85fee441fdd00ab5528213ff7","file_name":"ExaSAN PCIe series - Director Ou Yu-Zhi Shares His Experience.mp4","status":"failed"},{"uuid":"3e97fd717d518536771fab5d4a76b43d","file_name":"A12T3-Share-User Experience of Thunderbolt 3 Shareable Storage.mp4","status":"pending"},{"uuid":"9c02a43cf752735b2386536a944854a6","file_name":"Accusys Thunderbolt Share Storage at 2016 NAB.mp4","status":"failed"},{"uuid":"b62b2b05f7345d75568eed2363ac551e","file_name":"Accusys-WD_FilmRiot.mp4","status":"failed"},{"uuid":"1d5b574b4e6cbb2ead4ba5da5ff8c746","file_name":"Alice Comedies-Alice's Mysterious Mystery (1926).mp4","status":"failed"},{"uuid":"c4e4d53de3b678469e0fdf9d4c1fb257","file_name":"animal4.mp4","status":"failed"},{"uuid":"a4f2880616e82a03c862831fbcd3477b","file_name":"ExaSAN PCIe series - Director Ou Yu-Zhi Shares His Experience.mp4","status":"failed"},{"uuid":"8e2e98c49355935f662cf1fb23c37c91","file_name":"ExaSAN Webinar by Blake Jones, Vision2see.mp4","status":"failed"},{"uuid":"fe9542b6149643d3bf71e46bd2967267","file_name":"Gamma 8-Director Chih-Lin Yang Shares His Experience:楊智麟導演經驗分享.mp4","status":"failed"},{"uuid":"d261e9add96fbe4fa84abb5832989b64","file_name":"Gamma Carry Saves the World..mp4","status":"failed"},{"uuid":"cf711e5ee9edd60a827ef2f4f5807eec","file_name":"KOBA 2022 Interview SBU Accusys Storage.mp4","status":"failed"},{"uuid":"420f196bbab651616eb8ea49b74feabd","file_name":"Old Felix the Cat Cartoon.mp4","status":"failed"},{"uuid":"65d6a1e7d1c7606ca588a30137a0cc60","file_name":"steamboat-willie_1928.mp4","status":"failed"},{"uuid":"477d8fa7bc0e1a70d89cc0022b7ebfd2","file_name":"Thunderbolt ExaSAN at CCBN 中国国际广播电视信息网络展览会清.mp4","status":"failed"},{"uuid":"84470206e42e1622f8a299f0089172c1","file_name":"Top Colorist Blake Jones Speaks about the Gamma Carry.mp4","status":"failed"},{"uuid":"4583cd2c15844238ac2eefdc1241a3ba","file_name":"view13.mp4","status":"failed"},{"uuid":"e4bd8e594cb4824d15ab45522780c752","file_name":"view15.mp4","status":"failed"},{"uuid":"d5f6a63b1065f496ac3eca62d3c67416","file_name":"view28.mp4","status":"failed"},{"uuid":"7a80cb575b873b7eea99002a7e6cfa1d","file_name":"view7.mp4","status":"failed"},{"uuid":"80459593c892f50d271e2408a79b1391","file_name":"Walt Disney - 1925 - Alice the Toreador.mp4","status":"failed"},{"uuid":"6f10e2e58146425947f047948de7a11a","file_name":"Alice Comedies-Alice's Mysterious Mystery (1926).mp4","status":"failed"},{"uuid":"078975658e04529ee06f8d11cd7ba226","file_name":"Gamma 8-Director Chih-Lin Yang Shares His Experience:楊智麟導演經驗分享.mp4","status":"failed"},{"uuid":"0bfb7f3b8f529e806a8dc325b1e989f6","file_name":"Old Felix the Cat Cartoon.mp4","status":"failed"}],"last_login":null}

@@ -0,0 +1 @@
{"status":"ok","version":"1.0.0","uptime_ms":81335}

@@ -0,0 +1 @@
{"status":"ok","version":"1.0.0","uptime_ms":81366,"services":{"postgres":{"status":"ok","latency_ms":10,"error":null},"redis":{"status":"ok","latency_ms":0,"error":null},"qdrant":{"status":"ok","latency_ms":1,"error":null},"mongodb":{"status":"ok","latency_ms":0,"error":null}}}

@@ -0,0 +1 @@
{"success":true,"message":"Login successful","api_key":"muser_test_001","user":{"username":"demo"}}

@@ -0,0 +1 @@
{"success":true}

@@ -0,0 +1 @@
{"success":true,"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","file_name":"Old_Time_Movie_Show_-_Charade_1963.HD.mov","file_path":"/Users/accusys/test_video/Old_Time_Movie_Show_-_Charade_1963.HD.mov","metadata":{"format":{"size":"2361629896","bit_rate":"2746348","duration":"6879.329524","filename":"/Users/accusys/test_video/Old_Time_Movie_Show_-_Charade_1963.HD.mov","format_name":"mov,mp4,m4a,3gp,3g2,mj2"},"streams":[{"tags":{"language":"und","handler_name":"ISO Media file produced by Google Inc."},"index":0,"width":1920,"height":1080,"channels":null,"duration":"6879.255717","nb_frames":"412343","codec_name":"h264","codec_type":"video","sample_rate":null,"r_frame_rate":"60000/1001"},{"tags":{"language":"eng","handler_name":"ISO Media file produced by Google Inc."},"index":1,"width":null,"height":null,"channels":2,"duration":"6879.329524","nb_frames":"296268","codec_name":"aac","codec_type":"audio","sample_rate":"44100","r_frame_rate":"0/0"}]},"created_at":"2026-05-03T07:44:43.384236Z"}

@@ -0,0 +1 @@
{"success":true,"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","total":0,"page":1,"page_size":20,"data":[]}

@@ -0,0 +1 @@
{"success":true,"total":0,"page":1,"page_size":20,"data":[{"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","file_name":"Old_Time_Movie_Show_-_Charade_1963.HD.mov","file_path":"/Users/accusys/test_video/Old_Time_Movie_Show_-_Charade_1963.HD.mov","status":"ready"},{"file_uuid":"0bfb7f3b8f529e806a8dc325b1e989f6","file_name":"Old Felix the Cat Cartoon.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Old Felix the Cat Cartoon.mp4","status":"ready"},{"file_uuid":"078975658e04529ee06f8d11cd7ba226","file_name":"Gamma 8-Director Chih-Lin Yang Shares His Experience:楊智麟導演經驗分享.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Gamma 8-Director Chih-Lin Yang Shares His Experience:楊智麟導演經驗分享.mp4","status":"ready"},{"file_uuid":"6f10e2e58146425947f047948de7a11a","file_name":"Alice Comedies-Alice's Mysterious Mystery (1926).mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Alice Comedies-Alice's Mysterious Mystery (1926).mp4","status":"ready"},{"file_uuid":"80459593c892f50d271e2408a79b1391","file_name":"Walt Disney - 1925 - Alice the Toreador.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Walt Disney - 1925 - Alice the Toreador.mp4","status":"ready"},{"file_uuid":"7a80cb575b873b7eea99002a7e6cfa1d","file_name":"view7.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/view7.mp4","status":"ready"},{"file_uuid":"d5f6a63b1065f496ac3eca62d3c67416","file_name":"view28.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/view28.mp4","status":"ready"},{"file_uuid":"e4bd8e594cb4824d15ab45522780c752","file_name":"view15.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/view15.mp4","status":"ready"},{"file_uuid":"4583cd2c15844238ac2eefdc1241a3ba","file_name":"view13.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/view13.mp4","status":"ready"},{"file_uuid":"84470206e42e1622f8a299f0089172c1","file_name":"Top Colorist Blake Jones Speaks about the Gamma Carry.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Top Colorist Blake Jones Speaks about the Gamma Carry.mp4","status":"ready"},{"file_uuid":"477d8fa7bc0e1a70d89cc0022b7ebfd2","file_name":"Thunderbolt ExaSAN at CCBN 中国国际广播电视信息网络展览会清.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Thunderbolt ExaSAN at CCBN 中国国际广播电视信息网络展览会清.mp4","status":"ready"},{"file_uuid":"65d6a1e7d1c7606ca588a30137a0cc60","file_name":"steamboat-willie_1928.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/steamboat-willie_1928.mp4","status":"ready"},{"file_uuid":"420f196bbab651616eb8ea49b74feabd","file_name":"Old Felix the Cat Cartoon.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Old Felix the Cat Cartoon.mp4","status":"ready"},{"file_uuid":"cf711e5ee9edd60a827ef2f4f5807eec","file_name":"KOBA 2022 Interview SBU Accusys Storage.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/KOBA 2022 Interview SBU Accusys Storage.mp4","status":"ready"},{"file_uuid":"d261e9add96fbe4fa84abb5832989b64","file_name":"Gamma Carry Saves the World..mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Gamma Carry Saves the World..mp4","status":"ready"},{"file_uuid":"fe9542b6149643d3bf71e46bd2967267","file_name":"Gamma 8-Director Chih-Lin Yang Shares His Experience:楊智麟導演經驗分享.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Gamma 8-Director Chih-Lin Yang Shares His Experience:楊智麟導演經驗分享.mp4","status":"ready"},{"file_uuid":"8e2e98c49355935f662cf1fb23c37c91","file_name":"ExaSAN Webinar by Blake Jones, Vision2see.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/ExaSAN Webinar by Blake Jones, Vision2see.mp4","status":"ready"},{"file_uuid":"a4f2880616e82a03c862831fbcd3477b","file_name":"ExaSAN PCIe series - Director Ou Yu-Zhi Shares His Experience.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/ExaSAN PCIe series - Director Ou Yu-Zhi Shares His Experience.mp4","status":"ready"},{"file_uuid":"c4e4d53de3b678469e0fdf9d4c1fb257","file_name":"animal4.mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/animal4.mp4","status":"ready"},{"file_uuid":"1d5b574b4e6cbb2ead4ba5da5ff8c746","file_name":"Alice Comedies-Alice's Mysterious Mystery (1926).mp4","file_path":"/Users/accusys/momentry/var/sftpgo/data/demo/Alice Comedies-Alice's Mysterious Mystery (1926).mp4","status":"ready"}]}

@@ -0,0 +1 @@
Failed to deserialize the JSON body into the target type: missing field `file_path` at line 1 column 55

File diff suppressed because one or more lines are too long

@@ -0,0 +1 @@
{"error":"Identity not found: a9a90105-6d6b-46ff-92da-0c3c1a57dff4"}

@@ -0,0 +1,10 @@
Script failed: Traceback (most recent call last):
  File "/Users/accusys/momentry_core_0.1/scripts/select_face_reference_vectors_v2.py", line 468, in <module>
    main()
  File "/Users/accusys/momentry_core_0.1/scripts/select_face_reference_vectors_v2.py", line 422, in main
    angle_groups = group_faces_by_angle(args.face_json)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/accusys/momentry_core_0.1/scripts/select_face_reference_vectors_v2.py", line 60, in group_faces_by_angle
    with open(face_json_path) as f:
         ^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: 'test'

@@ -0,0 +1 @@
{"success":true,"identity_uuid":"a9a90105-6d6b-46ff-92da-0c3c1a57dff4","total":0,"page":1,"page_size":20,"data":[]}

@@ -0,0 +1 @@
Identity not found: a9a90105-6d6b-46ff-92da-0c3c1a57dff4

@@ -0,0 +1 @@
{"success":true,"identity_uuid":"a9a90105-6d6b-46ff-92da-0c3c1a57dff4","total":0,"page":1,"page_size":20,"data":[]}

@@ -0,0 +1 @@
{"identities":[{"id":22,"name":"Raoul Delfosse","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Taxi Driver (uncredited)","tmdb_cast_order":14,"tmdb_movie_title":"Charade"}},{"id":21,"name":"Albert Daumergue","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Man in Stamp Market (uncredited)","tmdb_cast_order":13,"tmdb_movie_title":"Charade"}},{"id":20,"name":"Marcel Bernier","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Taxi Driver (uncredited)","tmdb_cast_order":12,"tmdb_movie_title":"Charade"}},{"id":19,"name":"Claudine Berg","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Maid (uncredited)","tmdb_cast_order":11,"tmdb_movie_title":"Charade"}},{"id":18,"name":"Marc Arian","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Subway Passenger (uncredited)","tmdb_cast_order":10,"tmdb_movie_title":"Charade"}},{"id":17,"name":"Thomas Chelimsky","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Jean-Louis Gaudel","tmdb_cast_order":9,"tmdb_movie_title":"Charade"}},{"id":16,"name":"Paul Bonifas","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Mr. Felix","tmdb_cast_order":8,"tmdb_movie_title":"Charade"}},{"id":15,"name":"Jacques Marin","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Edouard Grandpierre","tmdb_cast_order":7,"tmdb_movie_title":"Charade"}},{"id":14,"name":"Ned Glass","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Leopold Gideon","tmdb_cast_order":6,"tmdb_movie_title":"Charade"}},{"id":13,"name":"Dominique Minot","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Sylvie Gaudel","tmdb_cast_order":5,"tmdb_movie_title":"Charade"}},{"id":12,"name":"George Kennedy","metadata":{"speaker_id":"SPEAKER_9","tmdb_movie_id":4808,"speaker_method":"mar_lip_analysis","tmdb_character":"Herman Scobie","tmdb_cast_order":4,"tmdb_movie_title":"Charade","speaker_confidence":0.85}},{"id":11,"name":"James Coburn","metadata":{"tmdb_movie_id":4808,"tmdb_character":"Tex Panthollow","tmdb_cast_order":3,"tmdb_movie_title":"Charade"}},{"id":10,"name":"Walter Matthau","metadata":{"speaker_id":"SPEAKER_4","tmdb_movie_id":4808,"speaker_method":"mar_lip_analysis","tmdb_character":"Hamilton Bartholemew","tmdb_cast_order":2,"tmdb_movie_title":"Charade","speaker_confidence":0.85}},{"id":9,"name":"Audrey Hepburn","metadata":{"speaker_id":"SPEAKER_1","tmdb_movie_id":4808,"speaker_method":"mar_lip_analysis","tmdb_character":"Regina Lampert","tmdb_cast_order":1,"tmdb_movie_title":"Charade","speaker_confidence":0.85}},{"id":8,"name":"Cary Grant","metadata":{"speaker_id":"SPEAKER_0","tmdb_movie_id":4808,"speaker_method":"mar_lip_analysis","tmdb_character":"Peter Joshua","tmdb_cast_order":0,"tmdb_movie_title":"Charade","speaker_confidence":0.85}}],"count":15,"page":1,"page_size":20}

@@ -0,0 +1 @@
{"error":"Source identity not found"}

@@ -0,0 +1 @@
{"success":true,"message":"Unbound face face_100 from 417a7e93860d70c87aee6c4c1b715d70","data":{"rows_affected":0}}

@@ -0,0 +1 @@
{"candidates":[{"id":1336,"face_id":null,"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","frame_number":81180,"confidence":0.90893996,"bbox":{"x":838,"y":322,"width":334,"height":334},"attributes":null},{"id":1338,"face_id":null,"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","frame_number":81300,"confidence":0.90678865,"bbox":{"x":839,"y":317,"width":334,"height":334},"attributes":null},{"id":4229,"face_id":null,"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","frame_number":210180,"confidence":0.90625,"bbox":{"x":761,"y":185,"width":158,"height":158},"attributes":null},{"id":1335,"face_id":null,"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","frame_number":81120,"confidence":0.90625,"bbox":{"x":839,"y":317,"width":338,"height":338},"attributes":null},{"id":5288,"face_id":null,"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","frame_number":248700,"confidence":0.9059806,"bbox":{"x":852,"y":212,"width":227,"height":227},"attributes":null},{"id":5337,"face_id":null,"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","frame_number":250200,"confidence":0.90517175,"bbox":{"x":754,"y":144,"width":358,"height":358},"attributes":null},{"id":485,"face_id":null,"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","frame_number":38460,"confidence":0.90436226,"bbox":{"x":794,"y":124,"width":251,"height":251},"attributes":null},{"id":459,"face_id":null,"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","frame_number":37500,"confidence":0.903552,"bbox":{"x":668,"y":204,"width":285,"height":285},"attributes":null},{"id":2850,"face_id":null,"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","frame_number":140460,"confidence":0.903552,"bbox":{"x":897,"y":200,"width":195,"height":195},"attributes":null},{"id":1506,"face_id":null,"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","frame_number":87420,"confidence":0.90301144,"bbox":{"x":926,"y":270,"width":262,"height":262},"attributes":null},{"id":1334,"face_id":null,"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","frame_number":81060,"confidence":0.9024706,"bbox":
{"x":839,"y":324,"width":334,"height":334},"attributes":null},{"id":1562,"face_id":null,"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","frame_number":90780,"confidence":0.9024706,"bbox":{"x":854,"y":305,"width":360,"height":360},"attributes":null},{"id":2476,"face_id":null,"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","frame_number":120180,"confidence":0.90165865,"bbox":{"x":1241,"y":257,"width":277,"height":277},"attributes":null},{"id":240,"face_id":null,"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","frame_number":23940,"confidence":0.9013878,"bbox":{"x":1265,"y":340,"width":300,"height":300},"attributes":null},{"id":2746,"face_id":null,"file_uuid":"417a7e93860d70c87aee6c4c1b715d70","frame_number":137100,"confidence":0.9005749,"bbox":{"x":774,"y":281,"width":297,"height":297},"attributes":null}],"total":6182,"page":1,"page_size":15}

@@ -0,0 +1 @@
{"results":[],"query":"stolen fortune thriller"}

@@ -0,0 +1 @@
{"error":"Search error: error returned from database: column f.pose_results does not exist"}

@@ -0,0 +1 @@
{"results":[{"uuid":"unknown","chunk_id":"unknown","chunk_type":"","start_time":0.0,"end_time":0.0,"text":"","vector_score":0.7524489760398865,"bm25_score":0.0,"combined_score":6.067750513553619}],"query":"Paris apartment scene"}

@@ -0,0 +1 @@
{"error":"error returned from database: column \"scene_order\" does not exist"}

@@ -0,0 +1 @@
{"error":"error returned from database: column \"uuid\" does not exist"}

@@ -0,0 +1 @@
{"results":[],"query":"Cary Grant as mysterious stranger"}

Some files were not shown because too many files have changed in this diff