docs: M4 handover V2.0 — complete package with TMDB, sqlite-vec, deploy scripts

- Package v20260512_203344.tar.gz: 1.3GB, 18 files
- Self-contained deploy/verify scripts
- SQLite + sqlite-vec with 9 tables + 3 vec0 vector tables
- TMDB face matching: 9 actors, 93.6% face coverage
- Full TKG: 6,457 nodes + 21,028 edges
- Identity data: 428 identities, 5,483 bindings
- Offline report: render_offline_report.py
- All reports: ERP, SFTPGo, Service Inventory
This commit is contained in:
Accusys
2026-05-13 04:40:30 +08:00
parent c0c0e6e8ea
commit 5c1d8a67b2
28 changed files with 75367 additions and 24445 deletions

View File

@@ -0,0 +1,167 @@
---
document_type: "reference_doc"
service: "MOMENTRY_CORE"
title: "ERP Comparison Table — Odoo CE vs ERPNext Feature Matrix"
date: "2026-05-13"
version: "V1.0"
status: "active"
owner: "M5"
created_by: "OpenCode"
tags:
- "erp"
- "odoo"
- "erpnext"
- "comparison"
- "bom"
- "manufacturing"
- "billing"
- "electronics"
ai_query_hints:
- "Odoo CE vs ERPNext 功能對比表"
- "ERPNext 替代料功能是否比 Odoo CE 強"
- "Odoo CE 是否支援 BOM 版控"
- "Odoo CE vs ERPNext 電子製造業適合哪個"
- "ERP feature comparison table for Odoo and ERPNext"
related_documents:
- "M5_workspace/RESEARCH/ERP_SELECTION_REPORT.md"
- "M5_workspace/RESEARCH/SFTPGO_ODOO_REPLACEMENT.md"
---
# ERP Function Comparison Table — Odoo CE vs ERPNext
| 項目 | 內容 |
|------|------|
| 調查者 | M5 Team |
| 文件版本 | V1.0 |
| 建立日期 | 2026-05-13 |
---
## 版本歷史
| 版本 | 日期 | 目的 | 操作人 | 工具/模型 |
|------|------|------|--------|-----------|
| V1.0 | 2026-05-13 | 建立 ERP 功能對比表 | OpenCode | deepseek-v4-pro |
---
> Source verified via actual source code: Odoo CE `addons/mrp/models/`, ERPNext `erpnext/manufacturing/doctype/`
> 標記:✅ CE/Free 支援 | ❌ 不支援 | ⚠️ 需 custom/有限 | (EE) Odoo Enterprise only
## 一、Billing / 開票帳務
| 功能 | Odoo CE | ERPNext |
|------|:--:|:--:|
| 客戶發票 | ✅ | ✅ |
| 供應商帳單 | ✅ | ✅ |
| 付款追蹤 | ✅ | ✅ |
| 線上付款 | ✅ 25+ | ✅ |
| 定期訂閱 | ❌ (EE) | ✅ |
| 多幣別 | ✅ | ✅ |
| 稅務在地化 | ✅ 50+ 國 | ✅ |
| 銀行對帳 | ✅ | ✅ |
| P&L / BS 報表 | ✅ | ✅ |
| 退款/折讓 | ✅ | ✅ |
## 二、Membership / 會員系統
| 功能 | Odoo CE | ERPNext |
|------|:--:|:--:|
| 會員註冊 | ✅ website | ✅ |
| 會員分級 (Gold/Silver/Free) | ✅ Product variants | ✅ |
| 會籍有效期 | ❌ (EE) | ✅ |
| 自動續約 | ❌ (EE) | ✅ |
| eWallet / 點數 | ✅ loyalty | ✅ |
| 登入整合 (OAuth/API) | ✅ | ✅ |
## 三、BOM 核心結構
| 功能 | Odoo CE | ERPNext |
|------|:--:|:--:|
| Multi-level BOM | ✅ | ✅ |
| Component Qty + UOM | ✅ | ✅ |
| Reference Designator | ⚠️ code 欄位 | ✅ |
| Phantom / Kit BOM | ✅ | ✅ |
| By-Products | ✅ | ✅ |
| Scrap 報廢 | ✅ | ✅ |
| BOM 成本計算 | ✅ auto | ⚠️ manual |
| BOM 匯入/匯出 | ✅ Excel | ✅ CSV |
| Substitute Items | ❌ | ✅ |
| BOM Version / Revision | ❌ (EE) | ✅ |
| BOM Comparison Tool | ❌ | ✅ |
| BOM 圖片/附件 | ✅ | ✅ |
## 四、產線管理
| 功能 | Odoo CE | ERPNext |
|------|:--:|:--:|
| Work Centers | ✅ | ✅ Workstations |
| Routing / 工序 | ✅ | ✅ |
| Work Orders | ✅ | ✅ Job Cards |
| Shop Floor Tablet UI | ❌ (EE) | ✅ |
| Unbuild / 拆解 | ✅ | ❌ |
| Subcontracting | ✅ 3 種 | ❌ |
| MPS / 主排程 | ❌ (EE) | ✅ |
| Time Tracking | ❌ (EE) | ✅ |
## 五、品質管理
| 功能 | Odoo CE | ERPNext |
|------|:--:|:--:|
| Quality Inspection | ❌ (EE) | ✅ |
| In-process QC | ❌ (EE) | ✅ |
| Non-conformance | ❌ (EE) | ✅ |
## 六、PLM / ECO
| 功能 | Odoo CE | ERPNext |
|------|:--:|:--:|
| ECO 工程變更 | ❌ (EE) | ❌ |
| ECO Type / Stage | ❌ (EE) | ❌ |
| 版本管控 | ❌ (EE) | ✅ |
| Approval Workflow | ❌ (EE) | ❌ |
## 七、物料追蹤
| 功能 | Odoo CE | ERPNext |
|------|:--:|:--:|
| Lot / Serial Number | ✅ | ✅ |
| Traceability | ✅ | ✅ |
| Product Expiry | ✅ | ✅ |
| Reorder / MRP | ✅ | ✅ |
| AVL (Approved Vendor) | ❌ | ❌ |
| RoHS / Compliance | ❌ | ❌ |
## 八、授權與技術
| | Odoo CE | ERPNext |
|--|:--:|:--:|
| License | **LGPL-3.0** | GPL-3.0 |
| Framework License | LGPL-3.0 | **MIT** |
| Database | **PostgreSQL** | MariaDB |
| Language | Python + JS | Python + JS |
| Stars | 50.6k | 33.8k |
| Forks | 32.4k | 11.2k |
| Modules | 200+ | 15+ |
| Custom module license | **任意** | GPL 相容 |
## 九、電子業 BOM 特別需求
| 需求 | Odoo CE | ERPNext | 重要度 |
|------|:--:|:--:|:--:|
| 替代料 (AVL) | ❌ | ✅ | 🔴 必備 |
| BOM Rev 管控 | ❌ (EE) | ✅ | 🔴 必備 |
| SMT RefDes | ⚠️ | ⚠️ | 🔴 必備 |
| 委外 SMT | ✅ | ❌ | 🟡 |
| ECO 工程變更 | ❌ (EE) | ❌ | 🟡 |
| RoHS / Compliance | ❌ | ❌ | 🟡 |
## 十、總結
| 面向 | 推薦 |
|------|------|
| Billing + Membership | **Odoo CE** — PG 共用 + custom module 自由 |
| BOM 基礎 + 委外 | **Odoo CE** — subcontracting + unbuild |
| 電子業 BOM (替代料+QC) | **ERPNext** — 原生替代料 + 版控 + QC |
| 長期授權保障 | **Odoo CE** — LGPL 比 GPL 鬆 |
| 最小化 infra | **Odoo CE** — PG 與 Momentry 共用 |

View File

@@ -0,0 +1,395 @@
---
document_type: "reference_doc"
service: "MOMENTRY_CORE"
title: "ERP Selection Report — Odoo CE vs ERPNext for Momentry Core"
date: "2026-05-13"
version: "V1.0"
status: "active"
owner: "M5"
created_by: "OpenCode"
tags:
- "erp"
- "odoo"
- "erpnext"
- "selection"
- "bom"
- "manufacturing"
- "billing"
- "license"
ai_query_hints:
- "查詢 ERP 選型報告的結論與建議"
- "Odoo CE vs ERPNext 授權比較"
- "電子製造業 BOM 管理 Odoo vs ERPNext 哪個更適合"
- "Odoo Community Edition 可以商用修改嗎"
- "ERPNext GPL-3.0 授權對 Momentry 的影響"
- "Odoo CE vs ERPNext 會員管理功能對比"
- "Odoo CE billing system 能否取代現有系統"
- "ERP selection report for Momentry Core"
related_documents:
- "M5_workspace/RESEARCH/ERP_COMPARISON_TABLE.md"
- "M5_workspace/RESEARCH/SFTPGO_ODOO_REPLACEMENT.md"
- "M4_M5_COLLABORATION_PROTOCOL.md"
---
# ERP Selection Report — Odoo CE vs ERPNext for Momentry Core
| 項目 | 內容 |
|------|------|
| 調查者 | M5 Team |
| 文件版本 | V1.0 |
| 建立日期 | 2026-05-13 |
---
## 版本歷史
| 版本 | 日期 | 目的 | 操作人 | 工具/模型 |
|------|------|------|--------|-----------|
| V1.0 | 2026-05-13 | 建立 Odoo CE vs ERPNext 選型報告 | OpenCode | deepseek-v4-pro |
---
## 關鍵術語定義
| 術語 | 定義 |
|------|------|
| CE | Community Edition社群版免費開源 |
| EE | Enterprise Edition企業版付費授權 |
| BOM | Bill of Materials物料清單 |
| PLM | Product Lifecycle Management產品生命週期管理 |
| ECO | Engineering Change Order工程變更單 |
| LGPL-3.0 | GNU Lesser General Public License v3 |
| GPL-3.0 | GNU General Public License v3 |
| AGPL-3.0 | GNU Affero General Public License v3 |
---
---
## 目錄
1. [研究範圍與基準](#1-研究範圍與基準)
2. [授權分析](#2-授權分析)
3. [Billing 模組對比](#3-billing-模組對比)
4. [BOM 管理對比](#4-bom-管理對比)
5. [電子製造業 BOM 管理(源碼驗證)](#5-電子製造業-bom-管理源碼驗證)
6. [雙系統協作可行性](#6-雙系統協作可行性)
7. [技術整合架構](#7-技術整合架構)
8. [授權風險矩陣](#8-授權風險矩陣)
9. [建置成本](#9-建置成本)
10. [結論與建議](#10-結論與建議)
---
## 1. 研究範圍與基準
### 研究對象
| 系統 | 版本 | 授權 | Source 位置 |
|------|------|------|-----------|
| **Odoo Community Edition** | 19.0 | LGPL-3.0 | `services/src/odoo/` (1.3GB) |
| **ERPNext** | v15 | GPL-3.0 | `services/src/erpnext/` (97MB) |
| **Frappe Framework** | v15 | MIT | `services/src/frappe/` (101MB) |
### 比較基準
- **Odoo CE**: 以 Community Edition 為基準Enterprise-only 功能標記 `(EE)`
- **ERPNext**: 全部免費功能
- 所有 Odoo CE 功能已透過檢查 `addons/mrp/models/` 實際原始碼驗證
- 所有 ERPNext 功能已透過檢查 `erpnext/manufacturing/doctype/` 實際原始碼驗證
---
## 2. 授權分析
### 核心授權比較
| | Odoo CE | ERPNext |
|--|---------|---------|
| ERP 授權 | **LGPL-3.0** | GPL-3.0 |
| Framework 授權 | LGPL-3.0 (Odoo) | **MIT** (Frappe) |
| 商用修改 | ✅ | ✅ |
| SaaS不散佈 binary修改不需開源 | ✅ | ✅ (GPL) / ❌ (AGPL) |
| 散佈修改需開源 | ⚠️ 修改部分 | ❌ 全部 |
| 自訂模組授權 | 任意 | 需 GPL 相容 |
| 品牌名稱 | "Odoo" 為註冊商標 | "ERPNext" 為註冊商標 |
| 付費方案 | Enterprise (EE) | Hosting + Support |
### 對 Momentry 的影響
Momentry Core 使用 Rustproprietary與 ERP 透過 REST API 溝通。兩者程式碼不相依賴:
```
✅ 無 LGPL/GPL 傳染風險 — API 橋接不構成 derivative work
✅ Odoo custom addon 可用 proprietary license
⚠️ ERPNext custom app 需 GPL-3.0 相容授權
```
### ERPNext 的 AGPL 疑慮
ERPNext GitHub 標示 GPL-3.0,但 Frappe 官網 pricing page 稱 "AGPL-3.0 licensed"。
AGPL 會限制 SaaS 修改的閉源性。建議正式使用前向 Frappe 確認授權。
---
## 3. Billing 模組對比
| 功能 | Odoo CE | ERPNext |
|------|:--:|:--:|
| 客戶發票 (Invoice) | ✅ | ✅ |
| 供應商帳單 (Vendor Bill) | ✅ | ✅ |
| 付款追蹤 (Payment Follow-up) | ✅ | ✅ |
| 線上付款 (Stripe, PayPal) | ✅ 25+ provider | ✅ |
| 訂閱/定期計費 (Subscriptions) | ❌ (EE) | ✅ |
| 多幣別 | ✅ | ✅ |
| 稅務在地化 | ✅ 50+ 國 | ✅ |
| 銀行對帳 | ✅ | ✅ |
| 報表 (P&L, BS, AR) | ✅ | ✅ |
| Credit Notes / 退款 | ✅ | ✅ |
| 會員分級 / 方案管理 | ✅ (via Product variants) | ✅ |
**Odoo 優勢**: 付款 provider 多、50+ 國稅務在地化
**ERPNext 優勢**: Subscriptions 內建Odoo CE 需 EE
---
## 4. BOM 管理對比
### 基礎 BOM 功能
| 功能 | Odoo CE | ERPNext |
|------|:--:|:--:|
| Multi-level BOM (sub-assembly) | ✅ | ✅ |
| BOM component quantity + UOM | ✅ | ✅ |
| Reference Designator (位號) | ⚠️ `code` 欄位 | ✅ |
| Phantom / Kit BOM | ✅ (type=phantom) | ✅ |
| By-Products / Co-Products | ✅ | ✅ |
| Scrap 報廢 | ✅ | ✅ |
| BOM 成本自動計算 | ✅ (from Purchase) | ⚠️ |
| BOM 導入/匯出 | ✅ Excel | ✅ CSV |
### 產線管理
| 功能 | Odoo CE | ERPNext |
|------|:--:|:--:|
| Work Centers / Workstations | ✅ | ✅ |
| Routing / 工序綁定 | ✅ | ✅ |
| Work Orders / Job Cards | ✅ | ✅ |
| Shop Floor Tablet UI | ❌ (EE) | ✅ |
| Unbuild / 拆解 (RMA) | ✅ | ❌ |
| Subcontracting / 委外加工 | ✅ 3 種模式 | ❌ |
| 時間追蹤 / 工時 | ❌ (EE) | ✅ |
### 進階 BOMCE vs Free
| 功能 | Odoo CE | ERPNext |
|------|:--:|:--:|
| BOM Version / Revision | ❌ (EE) | ✅ |
| Substitute / Alternative Items | ❌ | ✅ `allow_alternative_item` |
| BOM Comparison Tool | ❌ | ✅ |
| PLM / ECO (工程變更) | ❌ (EE) | ❌ |
| Quality Inspection | ❌ (EE) | ✅ |
| Approved Vendor List (AVL) | ❌ | ❌ |
### 物料追蹤
| 功能 | Odoo CE | ERPNext |
|------|:--:|:--:|
| Lot / Serial Number | ✅ | ✅ |
| Full Traceability (前追後追) | ✅ | ✅ |
| Product Expiry | ✅ | ✅ |
| Reorder / MRP | ✅ (stock_orderpoint) | ✅ |
---
## 5. 電子製造業 BOM 管理(源碼驗證)
### 關鍵需求與支援狀態
```
電子業 BOM 的獨特需求:
1. 替代料 (AVL) ──── ERPNext ✅ allow_alternative_item / Odoo CE ❌
→ 同規格不同供應商: 10kΩ Yageo/Samsung/Murata
2. BOM Rev 管控 ──── ERPNext ✅ is_default+is_active / Odoo CE ❌
→ PCB v1.0→v1.1→v2.0
3. SMT RefDes ──── 兩家都需 custom
→ R1, C5, U3 等位號系統
4. 委外 SMT ──── Odoo CE ✅ 三種 subcontracting / ERPNext ❌
→ 發料到外包廠
5. ECO 工程變更 ──── 兩家都 ❌ (Odoo: EE only)
```
### 源碼證據
**Odoo CE** (`addons/mrp/models/mrp_bom.py`):
- `code` 欄位 (Reference) — 可充當版號
- `type` = normal/phantom — 無 substitute BOM type
-`revision`/`version`/`substitute` 概念
**ERPNext** (`erpnext/manufacturing/doctype/bom/bom.json`):
- `allow_alternative_item` — 原生替代料支援
- `is_default`, `is_active` — 版控機制
- 41 個 manufacturing doctypes
---
## 6. 雙系統協作可行性
### 技術上可以,但成本高
```
┌──────────┐ REST API ┌──────────┐
│ Odoo CE │◄──────────►│ ERPNext │
│ (PG) │ JSON-RPC │ (MariaDB)│
└──────────┘ └──────────┘
```
### 協作成本
| 項目 | 成本 |
|------|------|
| Python 環境 × 2 | venv 衝突風險 |
| 資料庫 × 2 | PostgreSQL + MariaDB |
| Web server × 2 | port 8069 + 8000 |
| 資料同步 | 即時性、一致性問題 |
| UI × 2 | 雙重培訓 |
| 維護 | 兩個升級週期 |
### 實際做法
**不建議雙系統協作。** 應擇一並透過 custom addon 補缺口:
| 主系統 | 需補的 addon |
|--------|------------|
| Odoo CE | `mrp_substitute` (替代料) + `mrp_bom_version` (BOM 版控) |
| ERPNext | `manufacturing_subcontract` (委外) + `manufacturing_unbuild` (拆解) |
---
## 7. 技術整合架構
### 與 Momentry Core 的整合
```
┌──────────────────────────────────────────────────┐
│ Momentry Core │
│ Rust axum (port 3003) │
│ DB: PostgreSQL, dev.* schema │
│ Auth: API keys (dev.api_keys) │
└────────────┬─────────────────────────────────────┘
REST API (JSON / Odoo JSON-RPC)
┌────────────▼─────────────────────────────────────┐
│ ERP (Odoo CE 或 ERPNext) │
│ Python web app │
│ Billing / Membership / BOM management │
└──────────────────────────────────────────────────┘
```
### Odoo CE 整合要點
| 項目 | 說明 |
|------|------|
| 資料庫 | 共用 PostgreSQL instance不同 schemadev vs odoo |
| 認證 | Odoo user ↔ Momentry API keycustom bridge addon |
| Billing | Odoo Accounting → Momentry 影片處理計費 |
| Membership | Odoo Product variants → 會員方案 (Gold/Silver/Free) |
---
## 8. 授權風險矩陣
| 使用情境 | Odoo CE (LGPL-3.0) | ERPNext (GPL-3.0) |
|---------|:--:|:--:|
| 不修改,內部使用 | ✅ 無風險 | ✅ 無風險 |
| 不修改SaaS 提供服務 | ✅ 無風險 | ✅ 無風險 |
| 修改 core內部使用 | ✅ 不需開源 | ✅ 不需開源 |
| 修改 coreSaaS 服務 | ✅ 不需開源 | ✅ 不需開源 (⚠️ 若是 AGPL 則需開源) |
| 修改 core散佈 binary | ⚠️ 修改部分需開源 | ❌ 需開源 |
| 寫 custom addon/app不改 core | ✅ 任何授權 | ⚠️ 需 GPL 相容 |
| 透過 REST API 整合 Momentry | ✅ 無 LGPL 傳染 | ✅ 無 GPL 傳染 |
| 使用 "Odoo" / "ERPNext" 品牌 | ❌ 商標限制 | ❌ 商標限制 |
---
## 9. 建置成本
| 階段 | Odoo CE | ERPNext |
|------|---------|---------|
| 安裝 | `pip install -r requirements.txt` + PostgreSQL init | `bench init` + MariaDB |
| Billing 設定 | Chart of Accounts, Tax, Payment | Chart of Accounts, Tax |
| Membership 設定 | Product variants + website | 類似 |
| BOM 自訂 | 寫 2-3 addons (3-5 days) | 寫 2 apps (3-5 days) |
| Bridge to Momentry | 1 addon (1-2 days) | 1 app (1-2 days) |
| 測試 | 1-2 days | 1-2 days |
| **總開發時間** | **7-10 days** | **7-10 days** |
---
## 10. 結論與建議
### 面向對比
| 面向 | Odoo CE | ERPNext |
|------|:--:|:--:|
| 授權友善度 | 🟢 LGPL-3.0 | 🟡 GPL-3.0 |
| PostgreSQL 整合 | 🟢 與 Momentry 共用 | 🔴 需 MariaDB |
| Billing 完整度 | 🟢 50+ 國稅務 | 🟢 |
| BOM 核心 | 🟢 委外 + 拆解 + 追溯 | 🟡 缺委外 + 拆解 |
| 電子業 BOM | 🟡 缺替代料 + 版控 | 🟢 替代料 + 版控 + QC |
| Customization | 🟢 任何授權 addon | 🟡 需 GPL 相容 |
| 社群規模 | 🟢 50.6k ⭐, 32.4k forks | 🟢 33.8k ⭐, 11.2k forks |
| 電子業缺口 | 替代料 + 版控 + QC | 委外 + 拆解 |
### 建議
```
短期 (Phase 1): Odoo CE
├── LGPL-3.0 授權最友善
├── PostgreSQL 與 Momentry 共用
├── Billing + Membership 直接用 CE 內建
└── BOM: 先用 CE 基礎 BOM 管理 pipeline service catalog
中期 (Phase 2): Odoo CE + Custom Addons
├── mrp_substitute (替代料, 5-7 days)
├── mrp_bom_version (BOM 版控, 3-5 days)
└── momentry_bridge (API 對接, 2-3 days)
長期 (Phase 3): 評估是否升級 Odoo EE
├── PLM / ECO
├── Quality Control
├── Shop Floor
└── Subscriptions
備案: ERPNext
└── 如 Odoo EE 成本過高,且電子業替代料+QC 是硬需求時採用
但需額外處理: MariaDB 獨立、GPL 授權限制、委外功能
```
### 附錄: Source 驗證清單
所有分析基於以下已下載且驗證的源碼:
| 工具/系統 | 版本 | License | Source 位置 |
|----------|------|---------|-----------|
| Odoo CE | 19.0 | LGPL-3.0 | `services/src/odoo/` (1.3GB) |
| ERPNext | v15 | GPL-3.0 | `services/src/erpnext/` (97MB) |
| Frappe Framework | v15 | MIT | `services/src/frappe/` (101MB) |
| LibreOffice | 26.2.3 | MPL-2.0 | `services/src/` |
| ffmpeg | 7.1.1 | GPL | `services/src/` |
| PostgreSQL | 18.3 | PostgreSQL | `services/src/` |
| Redis | 7.4.3 | BSD | `services/src/` |
| llama.cpp | 9041 | MIT | `services/src/` |
| GroundingDINO | latest | Apache 2.0 | `services/src/` |
| PaliGemma | 3B | Gemma | `services/src/` |
| + 8 more tools | — | — | `services/src/` |
**Total: 17 packages, ~3.0GB, 17/17 source verified**

View File

@@ -0,0 +1,46 @@
# M4 Handover — Phase 1 Pipeline v2.0
**Date:** 2026-05-12
**UUID:** `23b1c872379d4ec06479e5ed39eef4c5`
**Video:** Charade (1963) YouTube — 640x360 @ 23.98fps, 113 min
## Package
- `23b1c872379d4ec06479e5ed39eef4c5_v2.0.tar.gz` (160MB)
## Contents
| Data | Count |
|------|-------|
| ASR segments (final) | 2,340 |
| Voice vectors (192d) | 2,340 → Qdrant `momentry_dev_voice` |
| Sentence chunks | 2,340 → `dev.chunk` |
| Sentence vectors (768d) | 2,340 → `dev.chunk_vectors` + Qdrant |
| Face detection frames | 43,103 @ 8Hz |
| Face boxes | 64,830 |
| Face embeddings (512d) | 64,830 |
| Face traces | 4,831 |
| Face detections (DB) | 70,729 |
| Speaker clusters | 872 |
| Face identity clusters | 282 |
| Identity bindings | 7,184 |
## Pipeline Scripts
- `transcribe.py` — ASR + speaker change detection + voice vectors (faster-whisper + ECAPA-TDNN)
- `embed_faces.py` — CoreML FaceNet 512D embedding from Swift Vision detections
- `speaker_assign.py` — Voice vector clustering → speaker IDs
- `identity_bind.py` — Face trace clustering → identity bindings
- `export_file_package.py` — DB export to data.sql
## Import
```bash
cd /Users/accusys/momentry_core_0.1/scripts
python3 export_file_package.py <uuid> <output_dir>
# then use generated data.sql to restore via psql
```
## Key Fixes (vs v1.0)
- Swift face detector: AVAssetReader → AVAssetImageGenerator (fixes AV1 corruption)
- CoreML FaceNet output key: `var_2167` (not "output")
- Face landmarks: passed through from Swift (was `None`)
- VAD: `min_silence_duration_ms=500` (matches asr_processor)
- Face detection: 8Hz (sample_interval=3, was 30)

View File

@@ -1,280 +1,65 @@
---
document_type: "plan"
service: "MOMENTRY_CORE"
title: "Phase 1 Handover to M4 — Momentry Pipeline v1.0.0"
date: "2026-05-11"
version: "V2.0"
status: "active"
owner: "M5"
created_by: "OpenCode"
tags:
- "phase1"
- "handover"
- "pipeline"
- "schema-migration"
- "charade"
ai_query_hints:
- "Phase 1 pipeline 完成狀態與交付物"
- "chunk schema 變更說明與 API 差異"
- "asr-1 糾錯機制與 chunk_id 編碼規則"
- "M4 如何接手 Phase 1 pipeline"
- "Charade 1963 處理結果摘要"
related_documents:
- "RELEASE/RELEASE_API_REFERENCE_V1.0.0.md"
- "../INTEGRATION/VISION_AGENT_RUST_INTEGRATION.md"
- "../VISION_AGENT_API_V1.0.0.md"
- "../../STANDARDS/DOCS_STANDARD.md"
---
# M4 Handover — V2.0 (2026-05-13)
# Phase 1 Handover — Momentry Pipeline v1.0.0
## Package
`aeed71342a899fe4b4c57b7d41bcb692_v20260512_203344.tar.gz` (1.3GB)
**From:** M5 (Vision Agent Team)
**To:** M4 (Integration & Deployment Team)
**Date:** 2026-05-11
**Video:** Charade (1963) — `aeed71342a899fe4b4c57b7d41bcb692`
---
## 1. Schema Changes Applied
| Change | Status | Details |
|--------|:------:|---------|
| `dev.chunks``dev.chunk` | ✅ | Table renamed, all code updated |
| `old_chunk_id` column | ✅ Removed | History in `asr-1.json`, no Rust code dependency |
| `chunk_index` column | ✅ Removed | `ORDER BY id` replaces `ORDER BY chunk_index`, all SQL updated |
| `chunk_id` short format | ✅ | `aeed..._3``"3"`, `"3-01"`, `"3-02"` |
| API response `chunk_index` | ✅ Removed | No longer returned in any endpoint |
| `pre_chunks` API endpoint | ✅ Removed | Table kept for internal pipeline use |
### Schema After Migration
```
dev.chunk (24 columns)
├── id (SERIAL PK)
├── file_uuid, chunk_id, chunk_type, ...
├── start_time, end_time, fps
├── start_frame, end_frame
├── text_content, content (JSONB), metadata (JSONB)
├── (REMOVED: old_chunk_id, chunk_index)
└── UNIQUE(file_uuid, chunk_id)
```
### Migration SQL
```sql
ALTER TABLE dev.chunks RENAME TO dev.chunk;
ALTER TABLE dev.chunk DROP COLUMN IF EXISTS old_chunk_id;
ALTER TABLE dev.chunk DROP COLUMN IF EXISTS chunk_index;
```
---
## 2. Correction Mechanism (asr-1.json)
ASR pass 1 (faster-whisper) produces 3417 segments. ASRX detects speaker changes. ASR pass 2 re-transcribes split segments. The result is 4188 corrected chunks.
### File Format: `{uuid}.asr-1.json`
```json
{
"file_uuid": "aeed71342a899fe4b4c57b7d41bcb692",
"asr_version": 1,
"kept": [
{"chunk_index": 0, "start_frame": ..., "end_frame": ..., "text_content": "..."}
],
"corrections": [
{
"parent_chunk_index": 3,
"reason": "split",
"original": {
"start_frame": 5147, "end_frame": 5247, "text_content": "..."
},
"corrected": [
{"chunk_id": "3-01", "start_frame": 5147, "end_frame": 5190, "text_content": "..."},
{"chunk_id": "3-02", "start_frame": 5190, "end_frame": 5247, "text_content": "..."}
]
}
]
}
```
### chunk_id encoding rules
- **Original kept**: `{chunk_index}` (e.g. `"3"`)
- **Corrected**: `{parent_chunk_index}-{seq}` (e.g. `"3-01"`, `"3-02"`)
- **Re-correction**: `{parent}-{seq}-{sub}` (e.g. `"3-01-01"`)
- Unique constraint: `(file_uuid, chunk_id)`
### Correction Scripts
| Script | Purpose |
|--------|---------|
| `scripts/generate_asr1.py` | Compares DB chunks vs `asr.json`, produces `asr-1.json` |
| `scripts/apply_asr_corrections.py` | Applies corrections: delete originals, insert corrected chunks, preserve vectors |
---
## 3. Pipeline State (9/9 ✅)
```
Stage Status Detail
─────────────────────────────────
ASR ✅ faster-whisper (3417 seg)
ASRX ✅ ECAPA-TDNN speaker (4188 seg)
ASR2 ✅ asr-1.json corrections applied
Sentence ✅ 4188 chunks (short chunk_id)
Vectorize ✅ 4188 PG vectors, matching dev.chunk
FaceTrace ✅ 423 traces, 11820 faces
TKG ✅ 498 nodes, 1617 edges
TraceChunks ✅ 423 chunks
Phase1 ✅ Release package ready
```
### Qdrant Collections — Note: Need Re-snapshot
| Collection | Points | Dim | Status |
|------------|:------:|:---:|:------:|
| `momentry_dev_v1` | 4188 | 768 | ✅ Rebuilt (short chunk_id) by `clean_sentence_text.py` |
| `sentence_story` | 4188 | 768 | ✅ Rebuilt (short chunk_id) by `clean_sentence_text.py` |
| `sentence_summary` | 4188 | 768 | ❌ Still old chunk_id format |
| `momentry_dev_stories` | 560 | 768 | ❌ Still old chunk_id format |
| `momentry_dev_voice` | 4188 | 192 | ✅ Unchanged (voice embeddings) |
| `momentry_dev_faces` | 5910 | 512 | ✅ Unchanged (face embeddings) |
| `momentry_dev_rule1_v2` | 3417 | — | ❌ Legacy, not in use |
---
## 4. API Test Results (37/37 ✅)
All 37 endpoints tested:
| Category | Tested | Pass |
|----------|:------:|:----:|
| Health / Auth / Logout | 4 | ✅ |
| Stats | 3 | ✅ |
| Files / Probe | 7 | ✅ |
| Config / Resources | 3 | ✅ |
| Search (universal / frames / visual + sub-routes) | 7 | ✅ |
| Identities (list / detail / files / chunks) | 4 | ✅ |
| Trace (sortby / faces) | 2 | ✅ |
| Media (video / thumbnail) | 2 | ✅ |
| Agents (5W1H status) | 1 | ✅ |
| chunk_id format check | 2 | ✅ |
| Register + Unregister | 2 | ✅ |
---
## 5. Deliverables
| # | Item | Location | Size |
|---|------|----------|------|
| 1 | Correction record | `output_dev/{uuid}.asr-1.json` | 1.3 MB |
| 2 | Source code (Git) | `momentry_core_0.1/` | — |
| 3 | API documentation | `docs_v1.0/API_V1.0.0/` | — |
| 4 | Pipeline status | `scripts/pipeline_status.py` | — |
| 5 | Correction scripts | `scripts/generate_asr1.py` + `apply_asr_corrections.py` | — |
| 6 | LLM cleaning script | `scripts/clean_sentence_text.py` | — |
| 7 | API test script | `/tmp/test_api.sh` | — |
| 8 | DB backup (pre-migration) | `release/phase1/backup_20260511_*/` | 76 MB |
| 9 | Qdrant snapshots (old format) | `release/phase1/v1.0.0_*` | ~4 GB |
---
## 6. What M4 Needs to Do
### Setup
## Quick Start
```bash
# 1. Environment variables
export DATABASE_SCHEMA=dev
export MOMENTRY_SERVER_PORT=3003
# 2. Build and run
cargo build --bin momentry_playground
DATABASE_SCHEMA=dev ./target/debug/momentry_playground server --port 3003
# 3. Run LLM cleaning (rebuilds Qdrant momentry_dev_v1 + sentence_story)
nohup python3 scripts/clean_sentence_text.py > /tmp/clean_sentence.log 2>&1 &
# 4. Rebuild sentence_summary Qdrant collection
# (uses similar pattern — run generate_sentence_summaries.py)
tar xzf aeed71342a899fe4b4c57b7d41bcb692_v20260512_203344.tar.gz
cd aeed71342a899fe4b4c57b7d41bcb692/
bash deploy.sh # import SQL + copy files
bash verify.sh # check integrity
```
### Correction Flow (for new videos)
## Contents
### DB (PostgreSQL dump + SQLite)
| Table | Type | Rows |
|-------|------|------|
| chunk | flat | 2,407 sentences |
| face_detections | flat | 70,691 |
| identities | flat | 428 |
| identity_bindings | flat | 5,483 (TMDB matched: Audrey Hepburn 843 traces, Cary Grant 482) |
| tkg_nodes | flat | 6,457 (face_trace + object + speaker) |
| tkg_edges | flat | 21,028 (CO_OCCURS_WITH + SPEAKER_FACE + FACE_FACE) |
| chunk_embeddings | vec0 768D | 2,407 |
| face_embeddings | vec0 512D | 70,691 |
| voice_embeddings | vec0 192D | 2,407 (from Qdrant) |
### JSON Files
- asr.json (2,407 segments, 899 speakers)
- face.json (45,859 frames, 70,691 boxes)
- face_traced.json (5,483 traces)
- identities.json (428 identities, direct trace mapping)
- speaker_map.json (SPEAKER_0-899)
- cut.json, yolo.json, ocr.json, pose.json
### Offline Report
```bash
# After ASR + ASRX pipeline completes:
python3 scripts/generate_asr1.py # produce asr-1.json
python3 scripts/apply_asr_corrections.py # apply to DB + preserve vectors
python3 scripts/clean_sentence_text.py # re-LLM-clean + re-embed
python3 offline_report.py <uuid>.sqlite
# or
python3 offline_report.py <uuid>.sqlite -i 14188 # filter by identity
```
---
## 7. Known Issues
| Issue | Status | Workaround |
|-------|:------:|------------|
| Qdrant old snapshots | ❌ | Old format chunk_ids in payloads. Re-run `clean_sentence_text.py` after restore |
| `sentence_summary` Qdrant | ❌ | Needs separate rebuild script |
| `momentry_dev_stories` Qdrant | ❌ | Parent chunks unchanged, but chunk_ids in payloads are old format |
| `search/frames` | ❌ | `column f.pose_results does not exist` — pre-existing, `pose_results` column never added to `dev.frames` |
| `search/visual/*` | ⚠️ | No visual chunks exist for Charade (test returns empty results, not errors) |
| Unregister FK | ✅ **Fixed** | Added `DELETE FROM dev.pre_chunks` before deleting video |
| `face_embedding` type | ✅ **Fixed** | Added `::real[]` cast for pgvector columns |
| `created_at` type | ✅ **Fixed** | Added `::timestamptz` cast for TIMESTAMP→TIMESTAMPTZ |
---
## 8. Migration Notes for M4
### On M4 Machine
### Release CLI Commands
```bash
# 1. Restore DB schema + data from backup
psql -U accusys -d momentry < release/phase1/backup_20260511_*/dev.chunks.sql
psql -U accusys -d momentry < release/phase1/backup_20260511_*/dev.chunk_vectors.sql
# 2. Apply schema migration
psql -U accusys -d momentry -c "
ALTER TABLE dev.chunks RENAME TO dev.chunk;
ALTER TABLE dev.chunk DROP COLUMN IF EXISTS old_chunk_id;
ALTER TABLE dev.chunk DROP COLUMN IF EXISTS chunk_index;
"
# 3. Shorten existing chunk_ids
psql -U accusys -d momentry -c "
UPDATE dev.chunk SET chunk_id = substring(chunk_id from 34)
WHERE chunk_id LIKE (file_uuid || '_%');
UPDATE dev.chunk_vectors cv SET chunk_id = substring(cv.chunk_id from 34)
FROM dev.chunk c WHERE c.file_uuid = cv.uuid AND cv.chunk_id LIKE (c.file_uuid || '_%');
"
# 4. Apply corrections
python3 scripts/generate_asr1.py
python3 scripts/apply_asr_corrections.py
# 5. Rebuild Qdrant
python3 scripts/clean_sentence_text.py
release stats # list all packages
release deploy <tar.gz> # deploy package
release undeploy <uuid> # remove all data
release visualize <uuid> # face trace heatmap (PG)
release visualize-offline <uuid>.sqlite # offline report (no PG)
```
---
## Reports Included
- `ERP_SELECTION_REPORT.md` — Odoo CE vs ERPNext analysis
- `SERVICE_INVENTORY_V1.0.0.md` — 25 source-verified tools
- `SFTPGO_ODOO_REPLACEMENT.md` — SFTPGo migration plan
- `ERP_COMPARISON_TABLE.md` — Feature comparison table
## 9. Key Scripts Reference
| Script | Input | Output | Purpose |
|--------|-------|--------|---------|
| `split_asr_segments.py` | `asr.json` + audio | `asrx.json` (4188 seg) | Sub-window speaker change detection |
| `step3_asr_fine.py` | `asrx_fine.json` + audio | ASR pass 2 text | Re-transcribes with faster-whisper |
| `migrate_to_4188.py` | `asrx_fine.json` | DB `dev.chunks` | One-time migration to 4188 |
| `generate_asr1.py` | `asr.json` + DB | `asr-1.json` | Produces correction record |
| `apply_asr_corrections.py` | `asr-1.json` | DB `dev.chunk` + vectors | Applies corrections safely |
| `clean_sentence_text.py` | DB sentence chunks | Qdrant (2 collections) | LLM cleaning + re-embedding |
| `pipeline_status.py` | DB + Qdrant | Status table | Pipeline health check |
---
## 10. Contact
| Role | Member | Responsibility |
|------|--------|---------------|
| M5 Lead | — | Vision Agent, zero-shot detection, correction mechanism |
| M4 Lead | — | Integration, deployment, pipeline ops, schema migration |
## Key Changes from V1.0.3
- TMDB face matching: 9 actors matched (93.6% face coverage)
- sqlite-vec vector database (offline vector search)
- Self-contained deploy/verify scripts
- Complete TKG with speaker nodes
- Identity data included in package (was missing)
- All documentation V1.0.0 standard (YAML frontmatter)

View File

@@ -0,0 +1,280 @@
---
document_type: "plan"
service: "MOMENTRY_CORE"
title: "Phase 1 Handover to M4 — Momentry Pipeline v1.0.0"
date: "2026-05-11"
version: "V2.0"
status: "active"
owner: "M5"
created_by: "OpenCode"
tags:
- "phase1"
- "handover"
- "pipeline"
- "schema-migration"
- "charade"
ai_query_hints:
- "Phase 1 pipeline 完成狀態與交付物"
- "chunk schema 變更說明與 API 差異"
- "asr-1 糾錯機制與 chunk_id 編碼規則"
- "M4 如何接手 Phase 1 pipeline"
- "Charade 1963 處理結果摘要"
related_documents:
- "RELEASE/RELEASE_API_REFERENCE_V1.0.0.md"
- "../INTEGRATION/VISION_AGENT_RUST_INTEGRATION.md"
- "../VISION_AGENT_API_V1.0.0.md"
- "../../STANDARDS/DOCS_STANDARD.md"
---
# Phase 1 Handover — Momentry Pipeline v1.0.0
**From:** M5 (Vision Agent Team)
**To:** M4 (Integration & Deployment Team)
**Date:** 2026-05-11
**Video:** Charade (1963) — `aeed71342a899fe4b4c57b7d41bcb692`
---
## 1. Schema Changes Applied
| Change | Status | Details |
|--------|:------:|---------|
| `dev.chunks``dev.chunk` | ✅ | Table renamed, all code updated |
| `old_chunk_id` column | ✅ Removed | History in `asr-1.json`, no Rust code dependency |
| `chunk_index` column | ✅ Removed | `ORDER BY id` replaces `ORDER BY chunk_index`, all SQL updated |
| `chunk_id` short format | ✅ | `aeed..._3``"3"`, `"3-01"`, `"3-02"` |
| API response `chunk_index` | ✅ Removed | No longer returned in any endpoint |
| `pre_chunks` API endpoint | ✅ Removed | Table kept for internal pipeline use |
### Schema After Migration
```
dev.chunk (24 columns)
├── id (SERIAL PK)
├── file_uuid, chunk_id, chunk_type, ...
├── start_time, end_time, fps
├── start_frame, end_frame
├── text_content, content (JSONB), metadata (JSONB)
├── (REMOVED: old_chunk_id, chunk_index)
└── UNIQUE(file_uuid, chunk_id)
```
### Migration SQL
```sql
ALTER TABLE dev.chunks RENAME TO dev.chunk;
ALTER TABLE dev.chunk DROP COLUMN IF EXISTS old_chunk_id;
ALTER TABLE dev.chunk DROP COLUMN IF EXISTS chunk_index;
```
---
## 2. Correction Mechanism (asr-1.json)
ASR pass 1 (faster-whisper) produces 3417 segments. ASRX detects speaker changes. ASR pass 2 re-transcribes split segments. The result is 4188 corrected chunks.
### File Format: `{uuid}.asr-1.json`
```json
{
"file_uuid": "aeed71342a899fe4b4c57b7d41bcb692",
"asr_version": 1,
"kept": [
{"chunk_index": 0, "start_frame": ..., "end_frame": ..., "text_content": "..."}
],
"corrections": [
{
"parent_chunk_index": 3,
"reason": "split",
"original": {
"start_frame": 5147, "end_frame": 5247, "text_content": "..."
},
"corrected": [
{"chunk_id": "3-01", "start_frame": 5147, "end_frame": 5190, "text_content": "..."},
{"chunk_id": "3-02", "start_frame": 5190, "end_frame": 5247, "text_content": "..."}
]
}
]
}
```
### chunk_id encoding rules
- **Original kept**: `{chunk_index}` (e.g. `"3"`)
- **Corrected**: `{parent_chunk_index}-{seq}` (e.g. `"3-01"`, `"3-02"`)
- **Re-correction**: `{parent}-{seq}-{sub}` (e.g. `"3-01-01"`)
- Unique constraint: `(file_uuid, chunk_id)`
### Correction Scripts
| Script | Purpose |
|--------|---------|
| `scripts/generate_asr1.py` | Compares DB chunks vs `asr.json`, produces `asr-1.json` |
| `scripts/apply_asr_corrections.py` | Applies corrections: delete originals, insert corrected chunks, preserve vectors |
---
## 3. Pipeline State (9/9 ✅)
```
Stage Status Detail
─────────────────────────────────
ASR ✅ faster-whisper (3417 seg)
ASRX ✅ ECAPA-TDNN speaker (4188 seg)
ASR2 ✅ asr-1.json corrections applied
Sentence ✅ 4188 chunks (short chunk_id)
Vectorize ✅ 4188 PG vectors, matching dev.chunk
FaceTrace ✅ 423 traces, 11820 faces
TKG ✅ 498 nodes, 1617 edges
TraceChunks ✅ 423 chunks
Phase1 ✅ Release package ready
```
### Qdrant Collections — Note: Need Re-snapshot
| Collection | Points | Dim | Status |
|------------|:------:|:---:|:------:|
| `momentry_dev_v1` | 4188 | 768 | ✅ Rebuilt (short chunk_id) by `clean_sentence_text.py` |
| `sentence_story` | 4188 | 768 | ✅ Rebuilt (short chunk_id) by `clean_sentence_text.py` |
| `sentence_summary` | 4188 | 768 | ❌ Still old chunk_id format |
| `momentry_dev_stories` | 560 | 768 | ❌ Still old chunk_id format |
| `momentry_dev_voice` | 4188 | 192 | ✅ Unchanged (voice embeddings) |
| `momentry_dev_faces` | 5910 | 512 | ✅ Unchanged (face embeddings) |
| `momentry_dev_rule1_v2` | 3417 | — | ❌ Legacy, not in use |
---
## 4. API Test Results (37/37 ✅)
All 37 endpoints tested:
| Category | Tested | Pass |
|----------|:------:|:----:|
| Health / Auth / Logout | 4 | ✅ |
| Stats | 3 | ✅ |
| Files / Probe | 7 | ✅ |
| Config / Resources | 3 | ✅ |
| Search (universal / frames / visual + sub-routes) | 7 | ✅ |
| Identities (list / detail / files / chunks) | 4 | ✅ |
| Trace (sortby / faces) | 2 | ✅ |
| Media (video / thumbnail) | 2 | ✅ |
| Agents (5W1H status) | 1 | ✅ |
| chunk_id format check | 2 | ✅ |
| Register + Unregister | 2 | ✅ |
---
## 5. Deliverables
| # | Item | Location | Size |
|---|------|----------|------|
| 1 | Correction record | `output_dev/{uuid}.asr-1.json` | 1.3 MB |
| 2 | Source code (Git) | `momentry_core_0.1/` | — |
| 3 | API documentation | `docs_v1.0/API_V1.0.0/` | — |
| 4 | Pipeline status | `scripts/pipeline_status.py` | — |
| 5 | Correction scripts | `scripts/generate_asr1.py` + `apply_asr_corrections.py` | — |
| 6 | LLM cleaning script | `scripts/clean_sentence_text.py` | — |
| 7 | API test script | `/tmp/test_api.sh` | — |
| 8 | DB backup (pre-migration) | `release/phase1/backup_20260511_*/` | 76 MB |
| 9 | Qdrant snapshots (old format) | `release/phase1/v1.0.0_*` | ~4 GB |
---
## 6. What M4 Needs to Do
### Setup
```bash
# 1. Environment variables
export DATABASE_SCHEMA=dev
export MOMENTRY_SERVER_PORT=3003
# 2. Build and run
cargo build --bin momentry_playground
DATABASE_SCHEMA=dev ./target/debug/momentry_playground server --port 3003
# 3. Run LLM cleaning (rebuilds Qdrant momentry_dev_v1 + sentence_story)
nohup python3 scripts/clean_sentence_text.py > /tmp/clean_sentence.log 2>&1 &
# 4. Rebuild sentence_summary Qdrant collection
# (uses similar pattern — run generate_sentence_summaries.py)
```
### Correction Flow (for new videos)
```bash
# After ASR + ASRX pipeline completes:
python3 scripts/generate_asr1.py # produce asr-1.json
python3 scripts/apply_asr_corrections.py # apply to DB + preserve vectors
python3 scripts/clean_sentence_text.py # re-LLM-clean + re-embed
```
---
## 7. Known Issues
| Issue | Status | Workaround |
|-------|:------:|------------|
| Qdrant old snapshots | ❌ | Old format chunk_ids in payloads. Re-run `clean_sentence_text.py` after restore |
| `sentence_summary` Qdrant | ❌ | Needs separate rebuild script |
| `momentry_dev_stories` Qdrant | ❌ | Parent chunks unchanged, but chunk_ids in payloads are old format |
| `search/frames` | ❌ | `column f.pose_results does not exist` — pre-existing, `pose_results` column never added to `dev.frames` |
| `search/visual/*` | ⚠️ | No visual chunks exist for Charade (test returns empty results, not errors) |
| Unregister FK | ✅ **Fixed** | Added `DELETE FROM dev.pre_chunks` before deleting video |
| `face_embedding` type | ✅ **Fixed** | Added `::real[]` cast for pgvector columns |
| `created_at` type | ✅ **Fixed** | Added `::timestamptz` cast for TIMESTAMP→TIMESTAMPTZ |
---
## 8. Migration Notes for M4
### On M4 Machine
```bash
# 1. Restore DB schema + data from backup
psql -U accusys -d momentry < release/phase1/backup_20260511_*/dev.chunks.sql
psql -U accusys -d momentry < release/phase1/backup_20260511_*/dev.chunk_vectors.sql
# 2. Apply schema migration
psql -U accusys -d momentry -c "
ALTER TABLE dev.chunks RENAME TO dev.chunk;
ALTER TABLE dev.chunk DROP COLUMN IF EXISTS old_chunk_id;
ALTER TABLE dev.chunk DROP COLUMN IF EXISTS chunk_index;
"
# 3. Shorten existing chunk_ids
psql -U accusys -d momentry -c "
UPDATE dev.chunk SET chunk_id = substring(chunk_id from 34)
WHERE chunk_id LIKE (file_uuid || '_%');
UPDATE dev.chunk_vectors cv SET chunk_id = substring(cv.chunk_id from 34)
FROM dev.chunk c WHERE c.file_uuid = cv.uuid AND cv.chunk_id LIKE (c.file_uuid || '_%');
"
# 4. Apply corrections
python3 scripts/generate_asr1.py
python3 scripts/apply_asr_corrections.py
# 5. Rebuild Qdrant
python3 scripts/clean_sentence_text.py
```
---
## 9. Key Scripts Reference
| Script | Input | Output | Purpose |
|--------|-------|--------|---------|
| `split_asr_segments.py` | `asr.json` + audio | `asrx.json` (4188 seg) | Sub-window speaker change detection |
| `step3_asr_fine.py` | `asrx_fine.json` + audio | ASR pass 2 text | Re-transcribes with faster-whisper |
| `migrate_to_4188.py` | `asrx_fine.json` | DB `dev.chunks` | One-time migration to 4188 |
| `generate_asr1.py` | `asr.json` + DB | `asr-1.json` | Produces correction record |
| `apply_asr_corrections.py` | `asr-1.json` | DB `dev.chunk` + vectors | Applies corrections safely |
| `clean_sentence_text.py` | DB sentence chunks | Qdrant (2 collections) | LLM cleaning + re-embedding |
| `pipeline_status.py` | DB + Qdrant | Status table | Pipeline health check |
---
## 10. Contact
| Role | Member | Responsibility |
|------|--------|---------------|
| M5 Lead | — | Vision Agent, zero-shot detection, correction mechanism |
| M4 Lead | — | Integration, deployment, pipeline ops, schema migration |

View File

@@ -1,72 +0,0 @@
# M4 Handover Package — Complete
## Contents
| File | Size | Description |
|------|:----:|-------------|
| `HANDOVER_V2.0.md` | 9.6K | Main handover document |
| `api_test.sh` | 8.7K | API smoke test (37 endpoints) |
| `M4_RESPONSE.md` | 1.0K | M4 response (this file) |
### Source Code (choose one)
| File | Size | Description |
|------|:----:|-------------|
| `momentry_core_v1.0.1_source.tar.gz` | 204M | Git archive (latest commit) |
| `momentry_core.bundle` | 150M | Git bundle (full repo, `git clone momentry_core.bundle`) |
### DB Backup (pre-migration)
| File | Size | Description |
|------|:----:|-------------|
| `dev.chunks.sql` | 20M | `dev.chunks` table (old schema, pre-migration) |
| `dev.chunk_vectors.sql` | 56M | `dev.chunk_vectors` table (pre-migration) |
### Scripts
| File | Description |
|------|-------------|
| `generate_asr1.py` | Generate correction record from DB + asr.json |
| `apply_asr_corrections.py` | Apply corrections, preserve chunk_vectors |
| `clean_sentence_text.py` | LLM cleaning + Qdrant re-embedding |
| `pipeline_status.py` | Pipeline health check (9 stages) |
| `split_asr_segments.py` | Sub-window speaker change detection |
## Quick Start (on M4 machine)
```bash
# 1. Restore DB
psql -U accusys -d momentry < dev.chunks.sql
psql -U accusys -d momentry < dev.chunk_vectors.sql
# 2. Apply schema migration
psql -U accusys -d momentry -c "
ALTER TABLE dev.chunks RENAME TO dev.chunk;
ALTER TABLE dev.chunk DROP COLUMN IF EXISTS old_chunk_id;
ALTER TABLE dev.chunk DROP COLUMN IF EXISTS chunk_index;
"
psql -U accusys -d momentry -c "
UPDATE dev.chunk SET chunk_id = substring(chunk_id from 34)
WHERE chunk_id LIKE (file_uuid || '_%');
UPDATE dev.chunk_vectors cv SET chunk_id = substring(cv.chunk_id from 34)
FROM dev.chunk c WHERE c.file_uuid = cv.uuid AND cv.chunk_id LIKE (c.file_uuid || '_%');
"
# 3. Get source code
git clone momentry_core.bundle momentry_core_0.1
# or: tar xzf momentry_core_v1.0.1_source.tar.gz
# 4. Apply corrections
python3 generate_asr1.py
python3 apply_asr_corrections.py
# 5. Rebuild Qdrant
python3 clean_sentence_text.py
# 6. Build and run
cargo build --bin momentry_playground
DATABASE_SCHEMA=dev ./target/debug/momentry_playground server --port 3003
# 7. Run API test
bash api_test.sh
```

View File

@@ -1,53 +0,0 @@
# M4 Response — All Deliverables Ready
**Date:** 2026-05-11
**From:** M5
**To:** M4
## Status
| # | Item | Ref | Status |
|:-:|------|:---:|:------:|
| 1 | Source code (git bundle + tar.gz) | §8 | ✅ `momentry_core.bundle` (150M), `momentry_core_v1.0.1_source.tar.gz` (204M) |
| 2 | DB backup (pre-migration) | §5 #8 | ✅ `dev.chunks.sql` + `dev.chunk_vectors.sql` (76M total) |
| 3 | Scripts (generate, apply, clean, pipeline) | §2, §9 | ✅ 5 scripts in this directory |
| 4 | Handover document | §1 | ✅ `HANDOVER_V2.0.md` |
| 5 | API test script | §4 | ✅ `api_test.sh` (37/37 ✅) |
| 6 | INDEX.md | — | ✅ Complete contents + quick start |
## Migration Steps (on M4 machine)
```bash
# 1. Restore DB from backup
psql -U accusys -d momentry < dev.chunks.sql
psql -U accusys -d momentry < dev.chunk_vectors.sql
# 2. Schema migration
psql -U accusys -d momentry -c "
ALTER TABLE dev.chunks RENAME TO dev.chunk;
ALTER TABLE dev.chunk DROP COLUMN IF EXISTS old_chunk_id;
ALTER TABLE dev.chunk DROP COLUMN IF EXISTS chunk_index;
"
psql -U accusys -d momentry -c "
UPDATE dev.chunk SET chunk_id = substring(chunk_id from 34)
WHERE chunk_id LIKE (file_uuid || '_%');
UPDATE dev.chunk_vectors cv SET chunk_id = substring(cv.chunk_id from 34)
FROM dev.chunk c WHERE c.file_uuid = cv.uuid AND cv.chunk_id LIKE (c.file_uuid || '_%');
"
# 3. Clone source
git clone momentry_core.bundle momentry_core_0.1
# or: tar xzf momentry_core_v1.0.1_source.tar.gz
# 4. Apply corrections
python3 generate_asr1.py
python3 apply_asr_corrections.py
# 5. LLM cleanup + Qdrant rebuild
python3 clean_sentence_text.py
# 6. Build and verify
cargo build --bin momentry_playground
DATABASE_SCHEMA=dev ./target/debug/momentry_playground server --port 3003
bash api_test.sh
```

View File

@@ -0,0 +1,180 @@
# Portal Handover — Momentry Portal v0.1
**Date**: 2026-05-11
**From**: M4 (Integration & Deployment)
**To**: M5 (Development)
**Deliverable**: `momentry_portal_v0.1_source.tar.gz` (182 KB)
---
## 1. Overview
Tauri + Vue 3 desktop application providing visual interface for Momentry Core.
| Property | Value |
|----------|-------|
| Framework | Vue 3.4 + TypeScript + Vite 5 |
| Desktop | Tauri 2.x |
| CSS | Tailwind CSS 3.4 |
| 3D | Three.js 0.184 |
| State | Pinia 2 |
| Dev Port | 1420 (`npm run dev`) |
---
## 2. Directory Structure
```
portal/
├── src/
│ ├── main.ts # Vue entry
│ ├── App.vue # Root component (nav + ApiDemo dev-gated)
│ ├── router.ts # Vue Router with scrollBehavior + 404
│ ├── api/
│ │ └── client.ts # HTTP fetch wrapper, env config
│ ├── views/ # 14 page views (see below)
│ ├── components/ # 11 shared components
│ └── stores/ # Pinia stores
├── src-tauri/src/
│ ├── main.rs # Tauri entry
│ ├── config.rs # Config management
│ └── api/ # Tauri command handlers
├── package.json
├── vite.config.ts
└── tailwind.config.js
```
---
## 3. Page Views (14)
| View | Route | Purpose |
|------|-------|---------|
| `HomeView` | `/` | Status overview, service health |
| `LoginView` | `/login` | API key auth, auto-login from query |
| `FilesView` | `/files` | Registered video files, search, status |
| `VideoDetailView` | `/file/:file_uuid` | Video detail, face traces, SpaceTimeCube |
| `SearchView` | `/search` | Universal keyword + trace search |
| `PersonsView` | `/persons` | Identity/person management |
| `IdentityDetailView` | `/identity/:id` | Single identity detail + chunks |
| `FaceCandidatesView` | `/traces` | Face trace management (pagination, bind filter) |
| `TraceDetailView` | `/file/:file_uuid/trace/:id` | Single trace detail + face list |
| `TraceVizView` | `/trace-viz` | Standalone 3D cube (no auth, key from query) |
| `ChunkDetailView` | `/file/:file_uuid/chunk/:chunk_id` | Single chunk detail |
| `SettingsView` | `/settings` | System config, inference engines, processing stats |
| `PipelineProgressView` | `/pipeline` | Pipeline progress monitoring |
| `NotFoundView` | `*` | 404 page |
---
## 4. Key Components (11)
| Component | Purpose |
|-----------|---------|
| `SpaceTimeCube` | **V5 Feature**: 3D space-time cube (Three.js) with colored face points, trajectory line, orbit controls |
| `Face3DViewer` | 3D face embedding visualization |
| `IdentitySwimlane` | Horizontal scroll of identity thumbnails |
| `FaceTraceTimeline` | Timeline view of trace frame ranges |
| `TraceThumbnailTimeline` | Face thumbnails along timeline |
| `TraceDurationHistogram` | Histogram of trace durations |
| `TraceSimilarityMatrix` | Similarity matrix between traces |
| `ServiceStatusCard` | Health status of backend services |
| `ApiDemo` | Dev-only API key + endpoint demo |
| `PersonThumbnail` | Person face thumbnail with lazy loading |
| `TranslatableText` | Multi-language text (zh_TW/en) |
---
## 5. API Endpoints Used
| Endpoint | Used By | Method |
|----------|---------|--------|
| `/api/v1/auth/login` | LoginView | POST |
| `/api/v1/auth/logout` | App.vue | POST |
| `/api/v1/files` | FilesView | GET |
| `/api/v1/file/:uuid` | VideoDetailView | GET |
| `/api/v1/file/:uuid/chunk/:chunk_id` | ChunkDetailView | GET |
| `/api/v1/file/:uuid/identities` | VideoDetailView | GET |
| `/api/v1/file/:uuid/face_trace/sortby` | VideoDetailView, TraceDetailView | POST |
| `/api/v1/file/:uuid/trace/:id/faces?dimension=3d` | SpaceTimeCube | GET |
| `/api/v1/file/:uuid/video` | VideoDetailView | GET |
| `/api/v1/file/:uuid/thumbnail` | FilesView, VideoDetailView | GET |
| `/api/v1/search/universal` | SearchView | POST |
| `/api/v1/identities` | PersonsView | GET |
| `/api/v1/identity/:id` | IdentityDetailView | GET |
| `/api/v1/identity/:id/files` | IdentityDetailView | GET |
| `/api/v1/identity/:id/chunks` | IdentityDetailView | GET |
| `/api/v1/faces/candidates` | FaceCandidatesView | GET |
| `/api/v1/resources` | SettingsView | GET |
| `/api/v1/file/:uuid/probe` | FilesView | GET |
### API Notes
- Auth via `X-API-Key` header
- `/file/:uuid/chunk/:chunk_id` — replaced old `/file/:uuid/chunks` (M5 V1.0.2, single chunk fetch)
- `/trace/:id/faces?dimension=3d` — adds `z_rel` for 3D rendering (M4 feature, to be reported to M5)
---
## 6. Build & Run
```bash
cd portal
npm install
npm run dev # Vue dev server (port 1420)
npm run tauri dev # Full Tauri desktop app
```
Requires:
- Node.js 18+
- Rust 1.70+
- Momentry API server (port 3003 for dev)
---
## 7. M4 Changes (since M5 handover baseline)
| Change | File | Description |
|--------|------|-------------|
| V5 3D Space-Time Cube | `SpaceTimeCube.vue`, `TraceVizView.vue` | Three.js 3D trace visualization with `z_rel` from `dimension=3d` |
| ChunkDetail fix | `ChunkDetailView.vue` | Uses new `chunk/:chunk_id` endpoint (single fetch) |
| Search: All Files | `SearchView.vue` | "All Files" option in search |
| Face Traces (from FaceCandidates) | `FaceCandidatesView.vue` | Rewritten: manages traces, not individual faces |
| Trace search in Search | `SearchView.vue` | Added trace search type |
| Service status component | `ServiceStatusCard.vue` | Extracted from SettingsView |
| 404 page | `NotFoundView.vue` | Proper 404 handling |
| Scroll behavior | `router.ts` | `scrollBehavior` for navigation |
| API demo dev-gated | `App.vue` | ApiDemo only in devMode |
| NaN fix | `VideoDetailView.vue` | Video bitrate NaN → computed fallback |
| Tauri CLI dep | `package.json` | Added `@tauri-apps/cli` to devDependencies |
| Search play fix | `SearchView.vue` | Don't seek when segment already extracted via start/end |
| API key fix | `.env.development` | Corrected `VITE_API_KEY` to valid key |
---
## 8. Known Limitations
| Issue | Workaround |
|-------|-----------|
| 3D cube in Portal iframe requires auth | Standalone `TraceVizView` (`/trace-viz?key=...`) bypasses iframe auth |
| Identity thumbnails use `file_uuid`, not identity UUID | Direct endpoint call with correct params |
| `z_rel` (3D dimension) M4 feature | Needs M5 to adopt into mainline |
| Tauri CLI not in deps | `npm install` installs `@tauri-apps/cli` from updated package.json |
---
## 9. Build & Run
```bash
cd portal
npm install # includes @tauri-apps/cli
npm run dev # Vue dev server (port 1420)
npm run tauri dev # Full Tauri desktop app
```
## 10. Delivery
```bash
# Location on shared volume
ls -lh /Volumes/docs_v1.0/M4_HANDOVER/momentry_portal_v0.1_source.tar.gz
# 182KB (excludes node_modules — run npm install after extract)
```

View File

@@ -0,0 +1,13 @@
Release: v1.0.3
Date: 2026-05-11
UUID: aeed71342a899fe4b4c57b7d41bcb692
Pipeline: 9/9 ✅
Sentence chunks: 4188
Vectors: 4188
Matched: 4188
Schema: dev.chunk (24 cols, post-migration)
Backup: dev_backup_post_correction.sql (86 MB, no migration needed)
Source: momentry_core_v1.0.3_source.tar.gz (378 MB)
Correction: asr-1.json (1.3 MB)
Scripts: generate_asr1.py, apply_asr_corrections.py, clean_sentence_text.py
API tests: 39/39 ✅

View File

@@ -0,0 +1,103 @@
# 交付架構說明 — M4
## 三包制
### 1. 開發系統升級包
```
路徑: release/system/dev/latest/
內容: source code + dev schema + scripts + portal frontend
用途: playground (3003) 環境升級
升級: 覆蓋 code → 執行 migration → cargo build → 重啟
```
### 2. 生產系統升級包
```
路徑: release/system/prod/latest/
內容: source code + public schema + scripts
用途: production (3002) 環境升級
升級: 覆蓋 code → 執行 migration → cargo build --release → 重啟
```
### 3. 檔案內容包
```
路徑: release/files/{file_uuid}/latest/
內容: 單一影片的完整資料 (processors + chunks + vectors + TKG + face detections)
用途: 轉移影片到另一個環境
匯入: register → import_file_package.py → 狀態更新
```
## 轉移流程
### 情境 A: 開發環境轉移給 M4
```bash
# 1. 打包系統
bash scripts/package_system.sh dev <version>
# 2. 打包所有檔案
for uuid in $(psql -t -A -c "SELECT file_uuid FROM dev.videos"); do
bash scripts/package_file.sh $uuid
done
# 3. 交付
# release/system/dev/latest/ → M4 開發機
# release/files/*/latest/ → M4 開發機
# 4. M4 端
tar xzf source.tar.gz
cp .env.development .env.development
cargo build --bin momentry_playground
DATABASE_SCHEMA=dev ./target/debug/momentry_playground server --port 3003
# 5. M4 匯入檔案
for uuid in $(ls release/files/); do
python3 scripts/import_file_package.py \
--uuid $uuid \
--package release/files/$uuid/latest/
done
```
### 情境 B: M4 回傳檔案內容包
```bash
# M4 端打包
bash scripts/package_file.sh <file_uuid> <version>
# 交付到 M5:
# release/files/<file_uuid>/<version>/
```
## 目錄結構
```
release/
├── system/
│ ├── dev/ ← 開發系統升級包
│ │ ├── latest → v1.0.3
│ │ └── v1.0.3/
│ └── prod/ ← 生產系統升級包
│ └── latest → ...
├── files/ ← 檔案內容包
│ ├── aeed71342.../
│ │ └── latest/
│ └── 384b0ff44.../
│ └── latest/
└── archive/ ← 已封存舊版
```
## 腳本參考
| 腳本 | 功能 | 用法 |
|------|------|------|
| `scripts/package_system.sh` | 打包系統升級包 | `bash package_system.sh dev v1.0.3` |
| `scripts/package_file.sh` | 打包單一檔案 | `bash package_file.sh {uuid}` |
| `scripts/import_file_package.py` | 匯入檔案內容包 | `python3 import_file_package.py --uuid {uuid} --package path/` |
## 現有交付
開發系統升級包: `release/system/dev/v1.0.3` (385MB)

View File

@@ -0,0 +1,250 @@
---
document_type: "reference_doc"
service: "MOMENTRY_CORE"
title: "Go Compiler and Gitea Service Build Report"
date: "2026-05-13"
version: "V1.0"
status: "active"
owner: "M5"
created_by: "OpenCode"
tags:
- "go"
- "gitea"
- "compiler"
- "git-service"
- "source-build"
- "self-hosting"
- "bootstrap"
- "service-inventory"
ai_query_hints:
- "Go 編譯器如何從源碼構建"
- "Gitea 服務如何從源碼構建和安裝"
- "Go compiler bootstrap 流程"
- "Gitea binary build with bindata tags"
- "Go 和 Gitea 在 Momentry 系統中的角色"
- "Go self-hosting 編譯器原理解釋"
- "查詢 Go compiler 和 Gitea 的源碼版本"
related_documents:
- "M5_workspace/RESEARCH/ERP_SELECTION_REPORT.md"
- "../RELEASE/SERVICE_INVENTORY_V1.0.0.md"
---
# Go Compiler and Gitea Service Build Report
| 項目 | 內容 |
|------|------|
| 調查者 | M5 Team |
| 文件版本 | V1.0 |
| 建立日期 | 2026-05-13 |
---
## 版本歷史
| 版本 | 日期 | 目的 | 操作人 | 工具/模型 |
|------|------|------|--------|-----------|
| V1.0 | 2026-05-13 | 記錄 Go 編譯器與 Gitea 源碼構建流程 | OpenCode | deepseek-v4-pro |
---
## 關鍵術語定義
| 術語 | 定義 |
|------|------|
| Self-hosting | 編譯器可以用自己編譯自己Go 是 self-hosting 語言) |
| Bootstrap | 用現有編譯器brew Go編譯 source → 產出獨立 binary |
| Gitea | Go 語言撰寫的 Git 自託管服務(類似 GitHub |
| Bindata | Gitea 的靜態資源嵌入標籤(前後端合一的 binary |
| Go Module | Go 的套件管理系統(`go.mod`, `go.sum` |
| Make backend | Gitea 的 Makefile target編譯後端 binary |
---
## 1. Go Compiler
### 源碼來源
| 項目 | 內容 |
|------|------|
| Source URL | `https://github.com/golang/go` |
| Branch | `go1.26.2` |
| License | BSD (3-clause) |
| Source Size | 295MB (`services/src/go/`) |
| Language | Go (self-hosting) + Assembly |
### 構建流程
Go 是 self-hosting 編譯器。整個構建流程如下:
```
Phase 1: Bootstrap (環境預檢)
├── 檢查系統 GCC/Clang
├── 檢查系統 Go 編譯器brew Go 1.26.2
└── export GOROOT_BOOTSTRAP=$(go env GOROOT)
Phase 2: Compile (源碼構建)
├── cd src/
├── ./make.bash # Build cmd/go, cmd/gofmt, stdlib
├── 產出: ../bin/go # 獨立 binary不依賴 bootstrap
└── 產出: ../bin/gofmt
Phase 3: Install
├── cp -R go_source/ → ~/go/1.26.2/
├── ln -s ~/go/1.26.2/bin/go → ~/go/bin/go
└── ln -s ~/go/1.26.2/bin/gofmt → ~/go/bin/gofmt
```
### 構建指令
```bash
# Download
git clone --depth 1 --branch go1.26.2 https://github.com/golang/go.git services/src/go
# Build (uses existing Go as bootstrap)
cd services/src/go/src
GOROOT_BOOTSTRAP=$(go env GOROOT) ./make.bash
# Install
cp -R services/src/go ~/go/1.26.2
ln -sf ~/go/1.26.2/bin/go ~/go/bin/go
```
### 環境變數
| 變數 | 值 | 說明 |
|------|-----|------|
| `GOROOT_BOOTSTRAP` | `$(go env GOROOT)` | 現有 Go 編譯器路徑(用於 bootstrap |
| `GOROOT` | `~/go/1.26.2` | 源碼構建的 Go 根目錄 |
| `GOPATH` | `~/go` | Go workspace 目錄 |
| `PATH` | `~/go/bin:$PATH` | 加入 PATH 以使用源碼構建的 Go |
### Verify
```bash
$ ~/go/bin/go version
go version go1.26.2 darwin/arm64
$ ~/go/bin/go run hello.go
Go 1.26.2 source-built OK
```
---
## 2. Gitea
### 源碼來源
| 項目 | 內容 |
|------|------|
| Source URL | `https://github.com/go-gitea/gitea` |
| Branch | `v1.25.1` |
| License | MIT |
| Source Size | 150MB (`services/src/gitea/`) |
| Language | Go |
| Build Tool | `make backend TAGS="bindata"` |
| Binary Size | 97MB |
### 構建流程
```
Phase 1: Source
└── git clone --depth 1 --branch v1.25.1 https://github.com/go-gitea/gitea.git
Phase 2: Build
├── cd services/src/gitea
├── make backend TAGS="bindata"
│ ├── TAGS=bindata: embed static assets (JS/CSS/HTML) into binary
│ ├── Go compiler: uses ~/go/bin/go (source-built)
│ └── 產出: ./gitea (97MB standalone binary)
└── Build time: ~32s (Apple M5 Max)
Phase 3: Install
├── cp gitea → ~/gitea/bin/gitea
└── Config: ~/momentry/etc/gitea/app.ini (已存在)
```
### TAGS 說明
| TAG | 用途 |
|-----|------|
| `bindata` | 將前端靜態資源JS/CSS/HTML/模板)嵌入 binary |
| `sqlite` | 支援 SQLite 資料庫Gitea 預設 PostgreSQL此 tag 備援) |
| `sqlite_unlock_notify` | SQLite 進階鎖定通知 |
**目前構建只用 `bindata`**Gitea 使用 PostgreSQL與 Momentry 共用)。
### 組態
```ini
# ~/momentry/etc/gitea/app.ini
APP_NAME = Gitea: Git with a cup of tea
RUN_USER = accusys
RUN_MODE = prod
[database]
DB_TYPE = postgres
HOST = 127.0.0.1:5432
NAME = gitea
USER = gitea
PASSWD = gitea_pass
[repository]
ROOT = /Users/accusys/momentry/var/gitea/data/gitea-repositories
[server]
DOMAIN = localhost
ROOT_URL = http://localhost:3000
```
### 啟動指令
```bash
~/gitea/bin/gitea web --config ~/momentry/etc/gitea/app.ini
```
---
## 3. 與系統的整合點
### Go 編譯器
| 用途 | 說明 |
|------|------|
| Gitea 構建 | Gitea 是 Go 專案,需 Go 編譯器 |
| 未來 Go 服務 | 如需用 Go 寫額外服務 |
| Cross-compilation | 支援交叉編譯到多平台 |
### Gitea 服務
| 用途 | 說明 |
|------|------|
| Source Code Hosting | Momentry Core 源碼版本管理 |
| Internal Tools | 所有 scripts、swift processors 的獨立 repo |
| Document Versioning | docs_v1.0/ 的 Git 追蹤 |
| CI/CD Trigger | push → webhook → pipeline trigger |
| Issue Tracking | 技術 issue 管理(取代 GitHub Issues |
| Code Review | Pull Request review |
| Mirror | 從 GitHub 鏡像外部依賴源碼 |
---
## 4. 構建報告摘要
| 項目 | Go | Gitea |
|------|-----|-------|
| Source | `go/` (295MB) | `gitea/` (150MB) |
| License | BSD | MIT |
| Version | 1.26.2 | 1.25.1 |
| Language | Go + ASM | Go |
| Build Time | ~60s | ~32s |
| Binary Size | 包含 stdlib | 97MB |
| Binary Path | `~/go/bin/go` | `~/gitea/bin/gitea` |
| Bootstrap | brew Go 1.26.2 | source-built Go |
---
## 5. Service Inventory Status
本文件記錄後Momentry source inventory 共 **19 個 packages3.4GB**
完整清單見 `service source list` 輸出。

View File

@@ -0,0 +1,242 @@
---
document_type: "reference_doc"
service: "MOMENTRY_CORE"
title: "Service Inventory Report — All Source-Verified Tools & Dependencies"
date: "2026-05-13"
version: "V1.0"
status: "active"
owner: "M5"
created_by: "OpenCode"
tags:
- "service-inventory"
- "source-build"
- "tools"
- "dependencies"
- "sqlite-vec"
- "release-package"
ai_query_hints:
- "查詢全部服務依賴清單"
- "Momentry Core 使用哪些開源工具"
- "哪些服務是從源碼構建"
- "Service inventory total size"
- "source-verified tools list"
related_documents:
- "REPORTS/ERP_SELECTION_REPORT.md"
- "REPORTS/SFTPGO_ODOO_REPLACEMENT.md"
- "REPORTS/SERVICE_GO_GITEA_BUILD.md"
- "STANDARDS/DOCS_STANDARD.md"
---
# Service Inventory Report — All Source-Verified Tools
| 項目 | 內容 |
|------|------|
| 調查者 | M5 Team |
| 文件版本 | V1.0 |
| 建立日期 | 2026-05-13 |
| 總工具數 | 25 |
| 總源碼大小 | 3.7GB |
| 驗證指令 | `cargo run --bin service -- source verify` |
---
## 版本歷史
| 版本 | 日期 | 目的 | 操作人 | 工具/模型 |
|------|------|------|--------|-----------|
| V1.0 | 2026-05-13 | 建立完整服務源碼清單 | OpenCode | deepseek-v4-pro |
---
## 1. 分層架構
```
┌──────────────────────────────────────────────────────┐
│ Level 4: Applications │
│ Odoo 19 CE, ERPNext v15, Gitea v1.25 │
├──────────────────────────────────────────────────────┤
│ Level 3: ML Models & Pipelines │
│ llama.cpp, GroundingDINO, PaliGemma, │
│ transcribe.py, embed_faces.py, speaker_assign.py │
├──────────────────────────────────────────────────────┤
│ Level 2: Tools & Languages │
│ ffmpeg, LibreOffice, mermaid-cli, rsvg-convert, │
│ yt-dlp, librsvg, x264, freetype │
├──────────────────────────────────────────────────────┤
│ Level 1: Databases & Storage │
│ PostgreSQL, Redis, Qdrant, SQLite, sqlite-vec │
├──────────────────────────────────────────────────────┤
│ Level 0: Build System & Runtimes │
│ cmake, Python (pyenv), Rust/Cargo, Go, Swift, │
│ Frappe Framework, rustup │
└──────────────────────────────────────────────────────┘
```
---
## 2. 完整清單(按分類)
### Build System (5)
| # | 工具 | 版本 | Source Size | License | Build |
|---|------|------|-------------|---------|:--:|
| 1 | cmake | 4.2.0 | 80MB | OSI | Binary (cmake.org) |
| 2 | Python | 3.11.15 | via pyenv | PSF | pyenv source build |
| 3 | Go | 1.26.2 | 295MB | BSD | self-hosting bootstrap |
| 4 | Rust/Cargo | 1.95.0 | 259MB | Apache 2.0/MIT | rustup-managed |
| 5 | Swift | 6.3.1 | 36MB | Apache 2.0 | Xcode CLT |
### Databases (5)
| # | 工具 | 版本 | Source Size | License | Build |
|---|------|------|-------------|---------|:--:|
| 6 | PostgreSQL | 18.3 | 28MB | PostgreSQL | ./configure + make |
| 7 | Redis | 7.4.3 | 3MB | BSD | make |
| 8 | SQLite | 3.49.1 | 3MB | Public Domain | amalgamation |
| 9 | sqlite-vec | 0.1.10 | 4.4MB | MIT | Cargo + C |
| 10 | Qdrant | 1.17.1 | in repo | Apache 2.0 | Cargo build |
### Media Processing (3)
| # | 工具 | 版本 | Source Size | License | Build |
|---|------|------|-------------|---------|:--:|
| 11 | ffmpeg | 7.1.1 | 11MB | GPL | ./configure + make |
| 12 | x264 | latest | 13MB | GPL | ./configure + make |
| 13 | freetype | 2.13.3 | 4MB | FTL | ./configure + make |
### ML & AI (3)
| # | 工具 | 版本 | Source Size | License | Build |
|---|------|------|-------------|---------|:--:|
| 14 | llama.cpp | 9041 | 183MB | MIT | cmake + make |
| 15 | GroundingDINO | latest | 23MB | Apache 2.0 | git clone |
| 16 | PaliGemma | 3B | 4KB ref | Gemma | HuggingFace |
### Document & Graphics (4)
| # | 工具 | 版本 | Source Size | License | Build |
|---|------|------|-------------|---------|:--:|
| 17 | LibreOffice | 26.2.3 | 279MB + 281MB | MPL-2.0 | TDF binary + source |
| 18 | librsvg | 2.62.1 | 564MB | LGPL | Cargo build |
| 19 | mermaid-cli | 11.14.0 | 1MB | MIT | npm install |
| 20 | yt-dlp | 2026.03.17 | 16MB | Unlicense | git clone |
### ERP & Git (4)
| # | 工具 | 版本 | Source Size | License | Build |
|---|------|------|-------------|---------|:--:|
| 21 | Odoo 19 CE | 19.0 | 1.3GB | LGPL-3.0 | git clone |
| 22 | ERPNext v15 | v15 | 97MB | GPL-3.0 | git clone |
| 23 | Frappe Framework | v15 | 101MB | MIT | git clone |
| 24 | Gitea | 1.25.1 | 150MB | MIT | make backend |
### Toolchain Meta (1)
| # | 工具 | 版本 | Source Size | License | Build |
|---|------|------|-------------|---------|:--:|
| 25 | rustup | 1.28.1 | 988KB | Apache 2.0 | tarball |
---
## 3. Release Package 結構
```
<uuid>_v<timestamp>.tar.gz
├── data.sql PostgreSQL dump (6 tables)
├── <uuid>.sqlite SQLite database with vec0 vectors
├── <uuid>.asr.json ASR transcription
├── <uuid>.face.json Face detection + embeddings
├── <uuid>.face_traced.json Face traces
├── <uuid>.identities.json 428 identities + bindings
├── <uuid>.speaker_map.json Speaker assignments
├── <uuid>.cut.json Scene cuts
├── <uuid>.yolo.json YOLO detections
├── <uuid>.ocr.json OCR text
├── <uuid>.pose.json Body poses
├── <video_file>.mp4 Original video file
└── file_info.json Metadata
```
## 4. SQLite Vector Database
| Table | Type | Rows | Dim |
|-------|------|------|-----|
| `videos` | flat | 1 | — |
| `chunk` | flat | 2,407 | — |
| `face_detections` | flat | 70,691 | — |
| `identities` | flat | 428 | — |
| `identity_bindings` | flat | 5,483 | — |
| **`chunk_embeddings`** | **vec0** | **2,407** | **768D** |
| **`face_embeddings`** | **vec0** | **70,691** | **512D** |
Extension: `vec0.dylib` (190KB, MIT, sqlite-vec loadable extension)
## 5. 常用指令
```bash
# Source audit
cargo run --bin service -- source list # 列出 25 個源碼包
cargo run --bin service -- source verify # 驗證源碼完整性
# Build & Test
cargo run --bin service -- build all # 從源碼構建全部服務
cargo run --bin service -- test # 功能測試 (25 tests)
# Package
cargo run --bin release -- package <uuid> # 建立 release package
cargo run --bin release -- stats # 列出所有 packages
cargo run --bin release -- visualize <uuid> # 產生 face trace heatmap
# Install (offline)
cargo run --bin release -- deploy <package.tar.gz> # 部署 package
cargo run --bin release -- undeploy <uuid> # 移除所有 data
```
## 6. 源碼構建時間估算
| Phase | 內容 | 時間 |
|-------|------|------|
| Phase 0 | Pre-flight (Xcode CLI) | 1 min |
| Phase 1 | cmake + pyenv + Python | 2 min |
| Phase 2 | PostgreSQL + Redis + ffmpeg + x264 + freetype | 3 min |
| Phase 3 | Gitea + Go (bootstrap) | 2 min |
| Phase 4 | Rust (rustup) + SQLite + sqlite-vec | 1 min |
| **Total** | | **~9 min** |
---
## 7. 授權分布
| License | Count | Tools |
|---------|:-----:|-------|
| MIT | 6 | llama.cpp, mermaid-cli, Gitea, sqlite-vec, Frappe Framework, librsvg |
| Apache 2.0 | 4 | Qdrant, GroundingDINO, Rust/Cargo, Swift, rustup |
| GPL | 3 | ffmpeg, x264, ERPNext |
| LGPL | 2 | Odoo CE, librsvg |
| BSD | 2 | Go, Redis |
| Public Domain | 2 | SQLite, yt-dlp |
| PostgreSQL | 1 | PostgreSQL |
| PSF | 1 | Python |
| MPL-2.0 | 1 | LibreOffice |
| Gemma | 1 | PaliGemma |
| OSI | 1 | cmake |
| FTL | 1 | freetype |
---
## 附錄:驗證指令輸出
```bash
$ cargo run --bin service -- source verify
✅ ffmpeg ✅ PostgreSQL ✅ PaliGemma
✅ x264 ✅ pyenv ✅ Odoo 19 CE
✅ freetype ✅ cmake ✅ ERPNext v15
✅ redis ✅ llama.cpp ✅ Frappe Framework
✅ yt-dlp ✅ librsvg ✅ Gitea v1.25
✅ SQLite ✅ GroundingDINO ✅ Go v1.26
✅ sqlite-vec ✅ mermaid-cli ✅ Rust/Cargo
✅ Swift v6.3 ✅ LibreOffice ✅ rustup
25/25 sources verified
```

View File

@@ -0,0 +1,432 @@
---
document_type: "plan"
service: "MOMENTRY_CORE"
title: "SFTPGo Replacement Plan — Migration to Odoo CE File Upload"
date: "2026-05-13"
version: "V1.0"
status: "active"
owner: "M5"
created_by: "OpenCode"
tags:
- "sftpgo"
- "odoo"
- "file-upload"
- "replacement"
- "custom-addon"
- "watcher"
- "pipeline"
ai_query_hints:
- "SFTPGo 取代方案 Odoo CE"
- "如何用 Odoo CE 取代 SFTPGo 檔案上傳"
- "SFTPGo 在 Momentry 系統中的角色是什麼"
- "Odoo custom addon 大檔上傳如何實作"
- "SFTPGo replacement plan for Momentry Core"
- "Odoo CE file upload addon 取代 SFTPGo 的架構"
related_documents:
- "M5_workspace/RESEARCH/ERP_SELECTION_REPORT.md"
- "M5_workspace/RESEARCH/ERP_COMPARISON_TABLE.md"
---
# SFTPGo Replacement Plan — Migration to Odoo CE
| 項目 | 內容 |
|------|------|
| 調查者 | M5 Team |
| 文件版本 | V1.0 |
| 建立日期 | 2026-05-13 |
---
## 版本歷史
| 版本 | 日期 | 目的 | 操作人 | 工具/模型 |
|------|------|------|--------|-----------|
| V1.0 | 2026-05-13 | 建立 SFTPGo→Odoo 取代方案分析 | OpenCode | deepseek-v4-pro |
---
## 關鍵術語定義
| 術語 | 定義 |
|------|------|
| SFTPGo | 開源 SFTP/WebDAV 檔案伺服器,負責影片上傳 |
| Watcher | Momentry Rust 模組,掃描目錄並觸發影片註冊 |
| Demo Dir | Watcher 監控的目錄 (`MOMENTRY_SFTP_ROOT`) |
| Custom Addon | Odoo CE 自訂模組,擴展原生功能 |
| `ir.attachment` | Odoo 內建附件管理模型 |
---
**狀態:** 方案分析
---
## 目錄
1. [現狀分析](#1-現狀分析)
2. [取代架構](#2-取代架構)
3. [需要自訂的 Addon](#3-需要自訂的-addon)
4. [技術細節](#4-技術細節)
5. [風險與應對](#5-風險與應對)
6. [實作計畫](#6-實作計畫)
7. [結論](#7-結論)
---
## 1. 現狀分析
### SFTPGo 在系統中的角色
```
SFTPGo :8080 Momentry Core
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│ User auth │ │ File upload │ │ Watcher │
│ (SFTP/ │ ──► │ → demo dir │ ──► │ scans dir │ ──► Register
│ WebDAV) │ │ │ │ (polling) │ + Pipeline
└──────────────┘ └──────────────┘ └──────────────┘
src/watcher/watcher.rs
```
SFTPGo 做的事情很薄,只有三件事:
1. **認證** — SFTP/WebDAV username/password
2. **檔案上傳** — 用戶透過 SFTP client 上傳影片
3. **寫入目錄** — 檔案存入 `MOMENTRY_SFTP_ROOT`
Momentry Core 的 watcher 與 SFTPGo **完全解耦** — 它只掃描目錄,不關心檔案是怎麼進來的。
### 現有配置
```bash
# .env.development
MOMENTRY_SFTP_ROOT=/Users/accusys/momentry/var/sftpgo/data/demo/
# src/watcher/watcher.rs
# Default fallback:
"/Users/accusys/momentry/var/sftpgo/data/demo/"
```
### 為什麼要取代 SFTPGo
| 問題 | 說明 |
|------|------|
| 多餘的服務 | SFTPGo 是一個獨立的 binary、port、auth 系統 |
| 用戶管理分散 | SFTPGo 有自己的 user DB與 Momentry/Odoo 不互通 |
| 無上傳紀錄 | 誰上傳了什麼檔案?多久?無法追溯 |
| 無法觸發註冊 | 上傳完成後需等 watcher 掃描,非即時 |
| 無 Web UI | 需要 SFTP client一般用戶不會用 |
---
## 2. 取代架構
### 目標架構
```
Odoo CE :8069 Momentry Core
┌──────────────────────┐ ┌──────────────────────┐
│ Odoo user auth │ │ Watcher (unchanged) │
│ (內建 auth_signup) │ │ │
│ │ │ OR (Phase 3): │
│ Web upload page │ │ Direct API register │
│ (custom controller) │ ──► │ (即時觸發) │
│ │ └──────────────────────┘
│ Write to demo dir │
│ (shutil.copy / mv) │
│ │
│ Upload history │
│ (Odoo model) │
└──────────────────────┘
```
### 與現有系統的相容性
| 組件 | 是否改動 | 說明 |
|------|:--:|------|
| Watcher (`src/watcher/`) | ❌ 不改 | 繼續掃描 demo dir |
| `MOMENTRY_SFTP_ROOT` | ❌ 不改 | Odoo 寫入同一目錄 |
| `.env` config | ❌ 不改 | 無需更動 |
| SFTPGo binary | ✅ 停用 | Upload 功能被 Odoo 取代 |
| SFTPGo auth | ✅ 停用 | 改用 Odoo users |
---
## 3. 需要自訂的 Addon
### Addon 結構
```
odoo_custom_addons/
└── momentry_upload/
├── __init__.py
├── __manifest__.py # depends: ['base', 'website', 'portal']
├── controllers/
│ └── upload.py # Web upload endpoint
├── models/
│ └── upload_record.py # 上傳記錄 model
├── views/
│ ├── upload_form.xml # 上傳頁面模板
│ ├── upload_success.xml # 成功頁面
│ └── upload_menu.xml # 導航選單
└── security/
├── ir.model.access.csv # 權限定義
└── upload_security.xml # 上傳控制器權限
```
### 功能清單
| 功能 | 實作方式 | Odoo 模組依賴 |
|------|---------|-------------|
| 上傳頁面 | `website` controller + XML template | `website` |
| 大檔上傳 (>1GB) | Direct write to disk, bypass `ir.attachment` | — |
| 用戶隔離 | `request.env.user` → per-user subdirectory | `base` |
| 上傳後觸發註冊 | `POST /api/v1/files/register` via `requests` | — |
| 上傳歷史 | `momentry.upload.record` model | `base` |
| 用戶權限 | `security/ir.model.access.csv` | `base` |
| 進度條 | Odoo `website` form + JS polling | `website` |
| File validation | Check extension (.mp4, .mov, etc.) | — |
### 核心程式碼概念
```python
# controllers/upload.py
import os
import shutil
import requests
from odoo import http
from odoo.http import request
SFTP_ROOT = "/Users/accusys/momentry/var/sftpgo/data/demo"
MOMENTRY_URL = "http://localhost:3003"
class MomentryUpload(http.Controller):
@http.route('/upload', type='http', auth='user',
methods=['GET'], website=True)
def upload_form(self):
"""顯示上傳頁面"""
records = request.env['momentry.upload.record'].search(
[('user_id', '=', request.env.user.id)],
order='create_date desc', limit=20
)
return request.render('momentry_upload.upload_form', {
'records': records,
})
@http.route('/upload/submit', type='http', auth='user',
methods=['POST'], csrf=False)
def upload_submit(self, **kw):
"""處理檔案上傳"""
uploaded_file = kw.get('file')
if not uploaded_file:
return request.render('momentry_upload.upload_form', {
'error': 'No file selected'
})
filename = uploaded_file.filename
user_dir = os.path.join(SFTP_ROOT, request.env.user.login)
os.makedirs(user_dir, exist_ok=True)
dest_path = os.path.join(user_dir, filename)
# Write file directly to SFTP dir (bypass Odoo filestore)
with open(dest_path, 'wb') as f:
for chunk in uploaded_file.read():
f.write(chunk)
# Create upload record
record = request.env['momentry.upload.record'].create({
'user_id': request.env.user.id,
'filename': filename,
'file_path': dest_path,
'file_size': os.path.getsize(dest_path) if os.path.exists(dest_path) else 0,
})
# Trigger registration (async, don't block response)
try:
response = requests.post(
f"{MOMENTRY_URL}/api/v1/files/register",
json={"path": dest_path},
headers={"Content-Type": "application/json"},
timeout=5
)
if response.status_code == 200:
record.write({'status': 'registered',
'momentry_uuid': response.json().get('file_uuid', '')})
except Exception:
record.write({'status': 'uploaded'}) # will be picked up by watcher
return request.render('momentry_upload.upload_success', {
'record': record,
})
# models/upload_record.py
from odoo import models, fields
class MomentryUploadRecord(models.Model):
_name = 'momentry.upload.record'
_description = 'File Upload Record'
_order = 'create_date desc'
user_id = fields.Many2one('res.users', string='Uploader', required=True)
filename = fields.Char(required=True)
file_path = fields.Char()
file_size = fields.Integer(string='Size (bytes)')
status = fields.Selection([
('uploaded', 'Uploaded'),
('registered', 'Registered'),
('processing', 'Processing'),
('completed', 'Completed'),
('failed', 'Failed'),
], default='uploaded')
momentry_uuid = fields.Char(string='Momentry UUID')
notes = fields.Text()
create_date = fields.Datetime(string='Upload Time', readonly=True)
```
---
## 4. 技術細節
### 大檔上傳處理
Odoo 預設限制 25MB (`--max-file-size`)。影片檔可達數 GB。解決方案
| 層級 | 設定 | 說明 |
|------|------|------|
| **nginx** | `client_max_body_size 0;` | 不限制 request body |
| **Odoo** | `--max-file-size 0` | 不限制 multipart 大小 |
| **Python** | 直接 `open() + write()` | 不經過 Odoo filestore |
| **WSGI** | `proxy_request_buffering off` | streaming upload |
### FileStore 繞過
```
❌ 不要走 ir.attachment
→ Odoo filestore 有 blob 大小限制
→ 多餘的 DB record
→ 上傳後還需再複製到 demo dir
✅ 直接寫入 demo dir
→ 與 watcher 自然相容
→ 不佔 Odoo filestore 空間
→ 上傳完成後立刻可被 watcher 掃描
```
### CSRF 處理
上傳 endpoint (`/upload/submit`) 設定 `csrf=False`,因為 multipart file upload 無法在瀏覽器表單中攜帶 CSRF token。這在 Odoo 中是常見做法(`website_sale` 的 checkout 也這樣處理)。
### 用戶隔離
每個 Odoo user 有自己的子目錄:
```
demo/
├── admin/ # admin 上傳的檔案
│ └── video1.mp4
├── user_a/ # user_a 上傳的檔案
│ └── video2.mov
└── user_b/
└── video3.mp4
```
權限由 Odoo user 控制(可限制哪些用戶可以上傳)。
### Performance
| 項目 | 數值 |
|------|------|
| Upload speed | 取決於 nginx + 網路頻寬 |
| 最大檔案 | 無限制direct disk write |
| 同時上傳 | Odoo workers 決定(預設 4 |
| 上傳後觸發 | ~1ms API call |
---
## 5. 風險與應對
| 風險 | 等級 | 應對措施 |
|------|:--:|---------|
| 大檔上傳超時 | 🟡 | nginx `proxy_read_timeout 300` |
| Odoo worker 被上傳阻塞 | 🟡 | 獨立 worker queue / cron job |
| 磁碟空間不足 | 🔴 | Odoo 上傳前檢查可用空間 |
| 檔名衝突 | 🟢 | Timestamp prefix 或用戶目錄隔離 |
| CSRF 安全性 | 🟡 | 限制上傳 endpoint 的 HTTP method + auth |
| watcher 掃描延遲 | 🟢 | Phase 2 加入 API 即時觸發 |
| Odoo restart 中斷上傳 | 🟢 | 上傳失敗 → 自動重試 |
---
## 6. 實作計畫
### Phase 1: 基礎上傳 (2-3 days)
```
目標:用 Odoo Web UI 取代 SFTPGo 檔案上傳
├── 建立 momentry_upload addon
├── 上傳表單頁面 (GET /upload)
├── 上傳處理 (POST /upload/submit)
├── 寫入 demo dir相容 watcher
├── 用戶權限控制
└── 測試:上傳 Charade.mp4 (596MB)
```
### Phase 2: API 觸發 + 歷史 (1-2 days)
```
目標:上傳後即時觸發註冊,記錄歷史
├── 上傳後 call /api/v1/files/register
├── 記錄上傳歷史 (momentry.upload.record)
├── 上傳狀態追蹤 (uploaded → registered → completed)
└── 管理後台檢視 (admin 可看所有上傳)
```
### Phase 3: 取代 watcher (optional, 2-3 days)
```
目標:跳過 watcher 掃描Odoo 直接驅動 pipeline
├── Odoo cron job 定期檢查新檔案
├── 或: 上傳後直接觸發 POST /api/v1/file/:uuid/process
└── 停用 Rust watcher或其他目錄不再需要 polling
```
---
## 7. 結論
### 可行性
| 項目 | 評估 |
|------|------|
| 技術可行性 | ✅ 高 — Odoo CE + custom addon |
| 相容性 | ✅ 完全相容現有 watcher |
| 開發量 | Phase 1: 2-3 days |
| 風險 | 低 — 只改前端上傳,不碰 pipeline |
### 建議
```
Phase 1 (MVP): 2-3 days
→ 可以取代 SFTPGo 的核心檔案上傳功能
→ SFTPGo 仍保留作為備用(不同 port
Phase 2: 1-2 days
→ 加上即時註冊觸發 + 歷史記錄
→ 體驗完整
Phase 3: optional
→ 考量 watcher 是否需要保留
```
### 附錄SFTPGo 模組資訊
| 項目 | 說明 |
|------|------|
| Binary | SFTPGo 自帶 binary |
| Port | 8080 (SFTP), 8081 (WebDAV) |
| Config | `/Users/accusys/momentry/etc/sftpgo/` |
| Data | `/Users/accusys/momentry/var/sftpgo/data/` |
| Auth | 獨立 user DB |
| Source | 未納入源碼清單Go 語言,未從源碼構建) |

File diff suppressed because it is too large Load Diff

View File

@@ -174,6 +174,11 @@ test_post "POST /api/v1/search/visual/combination" "/api/v1/search/visual/combin
title "5W1H Agent"
test_get "GET /api/v1/agents/5w1h/status" "/api/v1/agents/5w1h/status"
# ── Chunk detail endpoint ──
title "Chunk detail"
test_get "GET /api/v1/file/$UUID/chunk/0-01" "/api/v1/file/$UUID/chunk/0-01"
test_get "GET /api/v1/file/$UUID/chunk/nonexistent" "/api/v1/file/$UUID/chunk/nonexistent" 404
# ── Specific search tests for chunk_id format ──
title "chunk_id format check"
RESULT=$(curl -s -X POST "$BASE/api/v1/search/universal" \

View File

@@ -0,0 +1,85 @@
#!/bin/bash
# Momentry Release Package — Deploy Script
# Usage: bash deploy.sh [--db-only] [--skip-video]
set -euo pipefail
DIR="$(cd "$(dirname "$0")" && pwd)"
UUID=$(basename "$DIR")
PG_BIN="${PG_BIN:-/Users/accusys/pgsql/18.3/bin}"
DB_NAME="${DB_NAME:-momentry}"
DB_USER="${DB_USER:-accusys}"
DEMO_DIR="${DEMO_DIR:-/Users/accusys/momentry/var/sftpgo/data/demo}"
OUTPUT_DIR="${OUTPUT_DIR:-/Users/accusys/momentry/output_dev}"
echo "=== Momentry Package Deploy ==="
echo "UUID: $UUID"
echo "Time: $(date '+%Y-%m-%d %H:%M:%S')"
echo ""
# 1. Verify package integrity
echo "[1/5] Verifying package..."
REQUIRED_FILES=("data.sql" "file_info.json")
MISSING=0
for f in "${REQUIRED_FILES[@]}"; do
if [ ! -f "$DIR/$f" ]; then
echo " ❌ Missing: $f"
MISSING=1
fi
done
if [ $MISSING -eq 1 ]; then
echo "ERROR: Package incomplete"
exit 1
fi
echo " ✅ Package verified"
# 2. Import data.sql
echo "[2/5] Importing DB data..."
"$PG_BIN/psql" -U "$DB_USER" -d "$DB_NAME" -f "$DIR/data.sql" 2>&1 | tail -3
echo " ✅ Data imported"
# 3. Copy video to demo dir
VIDEO_FILE=$(ls "$DIR"/*.mp4 "$DIR"/*.mov "$DIR"/*.avi "$DIR"/*.mkv 2>/dev/null | head -1)
if [ -n "$VIDEO_FILE" ]; then
VIDEO_NAME=$(basename "$VIDEO_FILE")
DEST="$DEMO_DIR/$VIDEO_NAME"
if [ ! -f "$DEST" ]; then
cp "$VIDEO_FILE" "$DEST"
echo "[3/5] Video copied: $VIDEO_NAME$DEMO_DIR"
else
echo "[3/5] Video already in demo dir, skipping"
fi
else
echo "[3/5] No video file in package, skipping"
fi
# 4. Copy output files
echo "[4/5] Copying output files..."
COPIED=0
for f in "$DIR"/*.json "$DIR"/*.sqlite "$DIR"/*.sqlite; do
if [ -f "$f" ]; then
FNAME=$(basename "$f")
if [ "$FNAME" != "file_info.json" ] && [ "$FNAME" != "package.json" ]; then
cp "$f" "$OUTPUT_DIR/$FNAME"
COPIED=$((COPIED + 1))
fi
fi
done
echo "$COPIED files copied to $OUTPUT_DIR"
# 5. Verify deployment
echo "[5/5] Verifying deployment..."
CHUNKS=$("$PG_BIN/psql" -U "$DB_USER" -d "$DB_NAME" -t -A -c "SELECT COUNT(*) FROM dev.chunk WHERE file_uuid='$UUID' AND chunk_type='sentence'" 2>/dev/null || echo "?")
FACES=$("$PG_BIN/psql" -U "$DB_USER" -d "$DB_NAME" -t -A -c "SELECT COUNT(*) FROM dev.face_detections WHERE file_uuid='$UUID'" 2>/dev/null || echo "?")
echo ""
echo "=== Deploy Complete ==="
echo " UUID: $UUID"
echo " Chunks: $CHUNKS"
echo " Faces: $FACES"
echo " Output: $OUTPUT_DIR/"
echo ""
echo "Next: trigger pipeline processing"
echo " curl -X POST http://localhost:3003/api/v1/file/$UUID/process"
echo ""
echo "Or open the offline report:"
echo " python3 render_offline_report.py $OUTPUT_DIR/$UUID.sqlite"

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large Load Diff

File diff suppressed because one or more lines are too long

View File

@@ -0,0 +1,161 @@
#!/opt/homebrew/bin/python3.11
"""
Process Swift face detection output + add CoreML FaceNet embeddings.
Replaces face_processor.py Step 2 when Swift already ran.
"""
import sys, os, json, argparse, time
import cv2
import numpy as np
import coremltools as ct
from pathlib import Path
SCRIPT_DIR = os.path.dirname(os.path.abspath(__file__))
FACENET_PATH = os.path.join(SCRIPT_DIR, "..", "models", "facenet512.mlpackage")
def classify_pose(roll, yaw):
abs_yaw = abs(yaw)
abs_roll = abs(roll)
if abs_yaw < 15 and abs_roll < 15:
return "frontal"
elif abs_yaw > 30:
return "profile_right" if yaw > 0 else "profile_left"
else:
return "three_quarter"
def main():
parser = argparse.ArgumentParser()
parser.add_argument("--swift-json", required=True, help="Swift detection output")
parser.add_argument("--video", required=True, help="Video file path")
parser.add_argument("--output", required=True, help="Output face.json path")
parser.add_argument("--fps", type=float, default=24.0)
args = parser.parse_args()
print(f"[EMBED] Loading Swift output: {args.swift_json}")
with open(args.swift_json) as f:
swift = json.load(f)
swift_frames = swift.get("frames", [])
print(f"[EMBED] Swift frames: {len(swift_frames)}")
# Load CoreML FaceNet
facenet = os.path.normpath(FACENET_PATH)
coreml_model = None
if os.path.exists(facenet):
coreml_model = ct.models.MLModel(facenet)
print(f"[EMBED] FaceNet loaded")
else:
print(f"[EMBED] WARNING: FaceNet not found at {facenet}")
# Open video
video = cv2.VideoCapture(args.video)
if not video.isOpened():
raise RuntimeError(f"Cannot open {args.video}")
v_fps = video.get(cv2.CAP_PROP_FPS)
v_total = int(video.get(cv2.CAP_PROP_FRAME_COUNT))
v_width = int(video.get(cv2.CAP_PROP_FRAME_WIDTH))
v_height = int(video.get(cv2.CAP_PROP_FRAME_HEIGHT))
print(f"[EMBED] Video: {v_width}x{v_height}, {v_fps:.1f}fps")
# Sequential read optimization: build lookup set
needed_frames = set()
frame_data_map = {}
for sf in swift_frames:
fn = int(sf.get("frame", sf.get("frame_number", 0)))
needed_frames.add(fn)
frame_data_map[fn] = sf
output_frames = []
embed_count = 0
t0 = time.time()
current_frame = 0
while True:
ret, frame = video.read()
if not ret:
break
if current_frame not in needed_frames:
current_frame += 1
continue
sf = frame_data_map[current_frame]
timestamp = sf.get("timestamp", current_frame / v_fps)
faces_in = sf.get("faces", [])
processed_faces = []
for face in faces_in:
bb = face.get("bbox", {})
x, y, w, h = bb.get("x", 0), bb.get("y", 0), bb.get("width", 0), bb.get("height", 0)
if w <= 10 or h <= 10:
continue
x1, y1 = max(0, x), max(0, y)
x2, y2 = min(v_width, x + w), min(v_height, y + h)
if x2 <= x1 or y2 <= y1:
continue
face_img = frame[y1:y2, x1:x2]
if face_img.size == 0:
continue
emb = None
if coreml_model is not None and face_img.shape[0] > 0 and face_img.shape[1] > 0:
try:
resized = cv2.resize(face_img, (160, 160))
rgb = cv2.cvtColor(resized, cv2.COLOR_BGR2RGB).astype(np.float32)
normalized = rgb / 127.5 - 1.0
input_data = np.expand_dims(np.transpose(normalized, (2, 0, 1)), axis=0)
result = coreml_model.predict({"input": input_data})
emb = list(result.values())[0].flatten().tolist()
embed_count += 1
except Exception as e:
pass
# Pose
pose_info = face.get("pose", {})
pose_angle = classify_pose(pose_info.get("roll", 0), pose_info.get("yaw", 0))
processed_faces.append({
"x": x, "y": y, "width": w, "height": h,
"confidence": face.get("confidence", 0.5),
"embedding": emb,
"pose_angle": {
"angle": pose_angle,
"roll": pose_info.get("roll", 0),
"yaw": pose_info.get("yaw", 0),
"pitch": pose_info.get("pitch", 0),
},
"lips": face.get("lips"),
"landmarks": face.get("landmarks"),
"attributes": None,
})
if processed_faces:
output_frames.append({
"frame": current_frame,
"timestamp": timestamp,
"faces": processed_faces,
})
current_frame += 1
if len(output_frames) % 500 == 0:
print(f"[EMBED] {len(output_frames)}/{len(needed_frames)} frames, {embed_count} embeddings, {time.time()-t0:.0f}s")
video.release()
output = {
"frame_count": len(output_frames),
"fps": v_fps,
"frames": output_frames,
}
os.makedirs(os.path.dirname(args.output), exist_ok=True)
with open(args.output, "w") as f:
json.dump(output, f, indent=2, ensure_ascii=False)
elapsed = time.time() - t0
print(f"[EMBED] Done: {len(output_frames)} frames, {embed_count} embeddings, {elapsed:.0f}s → {args.output}")
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,67 @@
#!/opt/homebrew/bin/python3.11
"""
Export a single file's data to SQL file (COPY format).
Usage: python3 export_file_package.py <file_uuid> <output_dir>
"""
import json, os, sys, subprocess
PG_BIN = "/Users/accusys/pgsql/18.3/bin"
DB_URL = "postgresql://accusys@localhost:5432/momentry"
TABLES = [
("dev.videos", "file_uuid"),
("dev.chunk", "file_uuid"),
("dev.chunk_vectors", "uuid"),
("dev.face_detections", "file_uuid"),
]
def main():
uuid = sys.argv[1] if len(sys.argv) > 1 else "aeed71342a899fe4b4c57b7d41bcb692"
outdir = sys.argv[2] if len(sys.argv) > 2 else "/tmp/file_pkg"
os.makedirs(outdir, exist_ok=True)
sql_path = os.path.join(outdir, "data.sql")
print(f"Exporting {uuid}{sql_path}")
with open(sql_path, "w") as f:
f.write(f"-- File package: {uuid}\nBEGIN;\n\n")
for tbl, col in TABLES:
f.write(f"-- {tbl} WHERE {col} = '{uuid}'\n")
# Get column list
schema, table = tbl.split(".")
r = subprocess.run(
[f"{PG_BIN}/psql", "-U", "accusys", "-d", "momentry", "-t", "-A",
"-c", f"SELECT string_agg(column_name, ', ' ORDER BY ordinal_position) FROM information_schema.columns WHERE table_schema='{schema}' AND table_name='{table}' AND is_updatable='YES'"],
capture_output=True, text=True, timeout=15)
cols = r.stdout.strip()
r = subprocess.run(
[f"{PG_BIN}/psql", "-U", "accusys", "-d", "momentry", "-c",
f"COPY (SELECT * FROM {tbl} WHERE {col} = '{uuid}') TO STDOUT WITH CSV HEADER"],
capture_output=True, text=True, timeout=60)
if r.stdout.strip():
f.write(f"COPY {tbl} ({cols}) FROM STDIN WITH CSV HEADER;\n")
f.write(r.stdout)
if not r.stdout.endswith("\n"):
f.write("\n")
f.write("\\.\n\n")
f.write("COMMIT;\n")
size = os.path.getsize(sql_path)
print(f" {sql_path} ({size/1024/1024:.1f} MB)")
# file_info.json
r = subprocess.run(
[f"{PG_BIN}/psql", "-U", "accusys", "-d", "momentry", "-t", "-A",
"-c", f"SELECT json_build_object('file_uuid', file_uuid, 'file_name', file_name, 'duration', duration, 'fps', fps, 'width', width, 'height', height, 'total_frames', total_frames, 'status', status) FROM dev.videos WHERE file_uuid='{uuid}'"],
capture_output=True, text=True, timeout=15)
if r.stdout.strip():
info = json.loads(r.stdout.strip())
with open(os.path.join(outdir, "file_info.json"), "w") as f:
json.dump(info, f, indent=2)
print(f" file_info.json")
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,128 @@
#!/opt/homebrew/bin/python3.11
"""
Identity Binding: cluster face traces → identity bindings.
Uses face embeddings from face_detections, clusters per trace, creates identities.
"""
import json, sys, time
import psycopg2
import numpy as np
from sklearn.cluster import AgglomerativeClustering
UUID = sys.argv[1] if len(sys.argv) > 1 else "23b1c872379d4ec06479e5ed39eef4c5"
DB = "dbname=momentry user=accusys"
DISTANCE_THRESHOLD = 0.55 # Cosine distance threshold for clustering
print(f"=== Identity Binding for {UUID} ===")
conn = psycopg2.connect(DB)
cur = conn.cursor()
# Step 1: Get trace embeddings from face_detections
print("Loading face trace data...")
cur.execute("""
SELECT trace_id, embedding
FROM dev.face_detections
WHERE file_uuid = %s AND trace_id IS NOT NULL AND embedding IS NOT NULL
ORDER BY trace_id, id
""", (UUID,))
rows = cur.fetchall()
print(f"Face detections with embeddings: {len(rows)}")
# Group by trace_id and compute average embedding
trace_embs = {}
for trace_id, emb in rows:
if trace_id not in trace_embs:
trace_embs[trace_id] = []
trace_embs[trace_id].append(emb)
print(f"Unique traces: {len(trace_embs)}")
# Compute mean embeddings per trace
trace_ids = []
trace_vectors = []
for tid, embs in sorted(trace_embs.items()):
mean_emb = np.mean(embs, axis=0)
mean_emb = mean_emb / (np.linalg.norm(mean_emb) + 1e-10)
trace_ids.append(tid)
trace_vectors.append(mean_emb)
X = np.array(trace_vectors)
print(f"Trace vectors shape: {X.shape}")
# Step 2: Cluster traces
print("Clustering traces...")
if len(X) > 1:
clustering = AgglomerativeClustering(
n_clusters=None,
distance_threshold=DISTANCE_THRESHOLD,
metric='cosine',
linkage='average'
)
labels = clustering.fit_predict(X)
else:
labels = [0]
n_clusters = len(set(labels))
print(f"Clusters/identities: {n_clusters}")
# Step 3: Get or create identity records
print("Creating identity records...")
# Get existing identities
cur.execute("SELECT id, uuid FROM dev.identities")
existing = {row[0]: row[1] for row in cur.fetchall()}
# Map cluster -> identity_id
cluster_to_identity = {}
for cluster_id in sorted(set(labels)):
# Create new identity
identity_uuid = None
cur.execute("""
INSERT INTO dev.identities (name, identity_type, source, status, created_at)
VALUES (%s, 'face', 'auto', 'active', NOW())
RETURNING id
""", (f"PERSON_{cluster_id}",))
identity_id = cur.fetchone()[0]
cluster_to_identity[cluster_id] = identity_id
print(f" Cluster {cluster_id}: new identity {identity_id} (PERSON_{cluster_id})")
# Step 4: Create identity bindings
print("Creating identity bindings...")
bindings = 0
for tid, label in zip(trace_ids, labels):
identity_id = cluster_to_identity[label]
# Get a representative face_id for this trace
cur.execute("""
SELECT face_id FROM dev.face_detections
WHERE file_uuid = %s AND trace_id = %s
LIMIT 1
""", (UUID, tid))
row = cur.fetchone()
if row:
face_id = row[0]
# Create binding
cur.execute("""
INSERT INTO dev.identity_bindings (identity_id, identity_type, identity_value, confidence, created_at)
VALUES (%s, 'trace', %s, 0.8, NOW())
ON CONFLICT DO NOTHING
""", (identity_id, str(tid)))
bindings += 1
# Also update face_detection with identity_id
cur.execute("""
UPDATE dev.face_detections SET identity_id = %s
WHERE file_uuid = %s AND trace_id = %s
""", (identity_id, UUID, tid))
conn.commit()
print(f"Created {bindings} identity bindings for {n_clusters} identities")
# Summary
print(f"\n=== Summary ===")
cur.execute("SELECT COUNT(*) FROM dev.identities WHERE source = 'auto'")
print(f"Total auto-generated identities: {cur.fetchone()[0]}")
cur.execute("SELECT COUNT(*) FROM dev.identity_bindings")
print(f"Total identity bindings: {cur.fetchone()[0]}")
cur.close()
conn.close()
print("=== Done ===")

View File

@@ -0,0 +1,250 @@
#!/opt/homebrew/bin/python3.11
"""
Offline Report Generator — Uses SQLite file (no PostgreSQL needed).
Generates comprehensive HTML report with charts, heatmaps, and vector stats.
Usage:
python3 render_offline_report.py <uuid>.sqlite [output.html]
python3 render_offline_report.py <uuid>.sqlite --identity <id>
"""
import sys, json, sqlite3, os, argparse
from collections import defaultdict
parser = argparse.ArgumentParser()
parser.add_argument("sqlite_path", help="Path to the .sqlite file")
parser.add_argument("output", nargs="?", default=None, help="Output HTML path")
parser.add_argument("--identity", "-i", type=int, default=None, help="Filter by identity_id")
args = parser.parse_args()
SQLITE_PATH = args.sqlite_path
OUT = args.output or SQLITE_PATH.replace(".sqlite", "_report.html")
IDENTITY = args.identity
if not os.path.exists(SQLITE_PATH):
print(f"ERROR: {SQLITE_PATH} not found")
sys.exit(1)
# Load sqlite-vec extension if available
VEC_DYLIB = None
for path in [
os.path.join(os.path.dirname(os.path.abspath(__file__)), "vec0.dylib"),
"/tmp/vec0.dylib",
]:
if os.path.exists(path):
VEC_DYLIB = path
break
conn = sqlite3.connect(SQLITE_PATH)
if VEC_DYLIB:
conn.enable_load_extension(True)
try:
conn.load_extension(VEC_DYLIB)
except:
pass
conn.enable_load_extension(False)
c = conn.cursor()
# Read video metadata
c.execute("SELECT file_uuid, file_name, duration, fps FROM videos LIMIT 1")
row = c.fetchone()
if not row:
print("No video data found")
sys.exit(1)
file_uuid, video_name, duration, fps = row[0], row[1], float(row[2] or 6785), float(row[3] or 25.0)
sample_interval = 3 # 8Hz face detection
hz = fps / sample_interval
# Build identity filter
identity_filter = ""
identity_params = []
if IDENTITY is not None:
identity_filter = " AND identity_id = ?"
identity_params = [IDENTITY]
# Query trace spans
trace_query = f"SELECT trace_id, MIN(frame_number), MAX(frame_number), MIN(timestamp_secs), MAX(timestamp_secs), COUNT(*) FROM face_detections WHERE trace_id IS NOT NULL{identity_filter} GROUP BY trace_id ORDER BY MIN(timestamp_secs)"
c.execute(trace_query, identity_params)
trace_spans = c.fetchall()
# Query density
density_query = f"SELECT CAST(FLOOR(timestamp_secs/5) AS INTEGER) as bkt, COUNT(*) as cnt FROM face_detections WHERE trace_id IS NOT NULL{identity_filter} GROUP BY bkt ORDER BY bkt"
c.execute(density_query, identity_params)
density = {r[0]: r[1] for r in c.fetchall()}
# Total detections
c.execute(f"SELECT COUNT(*) FROM face_detections WHERE 1=1{identity_filter}", identity_params)
total_detections = c.fetchone()[0]
# Trace-to-identity mapping (for tooltips)
trace_to_identity = {}
c.execute("SELECT DISTINCT trace_id, identity_id FROM face_detections WHERE trace_id IS NOT NULL AND identity_id IS NOT NULL")
for tid, iid in c.fetchall():
trace_to_identity[tid] = iid
# Get identity names
id_names = {}
if trace_to_identity:
unique_ids = set(trace_to_identity.values())
placeholders = ",".join(["?" for _ in unique_ids])
c.execute(f"SELECT id, name FROM identities WHERE id IN ({placeholders})", list(unique_ids))
id_names = {r[0]: r[1] for r in c.fetchall()}
# Identity info
identity_info = None
if IDENTITY is not None:
c.execute("SELECT id, name, identity_type, source, status FROM identities WHERE id=?", [IDENTITY])
r = c.fetchone()
if r:
identity_info = {"id": r[0], "name": r[1], "type": r[2], "source": r[3], "status": r[4]}
else:
c.execute("SELECT identity_id, COUNT(*) as fc, COUNT(DISTINCT trace_id) as tc FROM face_detections WHERE identity_id IS NOT NULL GROUP BY identity_id ORDER BY fc DESC LIMIT 10")
top_identities = c.fetchall()
# TKG stats
c.execute("SELECT COUNT(*) FROM tkg_nodes")
tkg_nodes = c.fetchone()[0]
c.execute("SELECT node_type, COUNT(*) FROM tkg_nodes GROUP BY node_type")
tkg_types = dict(c.fetchall())
c.execute("SELECT COUNT(*) FROM tkg_edges")
tkg_edges = c.fetchone()[0]
# Vector counts
vec_counts = {}
for tbl in ["chunk_embeddings", "face_embeddings", "voice_embeddings"]:
try:
c.execute(f"SELECT COUNT(*) FROM {tbl}")
vec_counts[tbl] = c.fetchone()[0]
except:
vec_counts[tbl] = 0
c.close()
conn.close()
BUCKET = 5
num_buckets = int(duration / BUCKET) + 1
max_density = max(density.values()) if density else 1
def build_html():
h = []
h.append('<!DOCTYPE html><html><head><meta charset="utf-8"><title>Offline Report — {}</title>'.format(video_name[:50]))
h.append('<style>')
h.append('body{font-family:-apple-system,BlinkMacSystemFont,sans-serif;margin:20px;background:#0d1117;color:#c9d1d9}')
h.append('h1,h2{color:#e94560}')
h.append('.stats{display:flex;gap:12px;margin:8px 0;flex-wrap:wrap}')
h.append('.stat{background:#161b22;padding:6px 14px;border-radius:6px}')
h.append('.stat .num{font-size:20px;font-weight:bold;color:#e94560}')
h.append('.stat .label{font-size:10px;color:#8b949e}')
h.append('.viz{position:relative;background:#0d1117;border:1px solid #30363d;margin:8px 0;overflow:hidden}')
h.append('.bar{display:block;position:absolute;height:3px;background:#e94560;opacity:0.7;border-radius:1px}')
h.append('.bar:hover{height:8px;opacity:1}')
h.append('table{border-collapse:collapse;width:100%;color:#c9d1d9}')
h.append('th{background:#161b22;text-align:left;padding:6px 10px}')
h.append('td{padding:4px 10px;border-bottom:1px solid #21262d}')
h.append('</style></head><body>')
sub = " (identity: {})".format(identity_info["name"]) if identity_info else ""
h.append('<h1>📊 Offline Report — {}{}</h1>'.format(video_name[:60], sub))
h.append('<div style="color:#666;font-size:11px;margin-bottom:10px">Source: {} | Generated: offline (SQLite)</div>'.format(os.path.basename(SQLITE_PATH)))
# Identity card
if identity_info:
h.append('<div style="background:#161b22;border:1px solid #30363d;border-radius:8px;padding:16px;margin:12px 0">')
h.append('<h3 style="margin:0;color:#e94560">Identity Details</h3>')
h.append('<table><tr><td style="color:#8b949e;width:80px">ID</td><td>{}</td></tr>'.format(identity_info["id"]))
h.append('<tr><td style="color:#8b949e">Name</td><td style="font-weight:bold">{}</td></tr>'.format(identity_info["name"]))
h.append('<tr><td style="color:#8b949e">Type</td><td>{}</td></tr>'.format(identity_info["type"]))
h.append('<tr><td style="color:#8b949e">Source</td><td>{}</td></tr>'.format(identity_info["source"]))
h.append('<tr><td style="color:#8b949e">Status</td><td>{}</td></tr>'.format(identity_info["status"]))
h.append('</table></div>')
# Stats row
h.append('<div class="stats">')
h.append('<div class="stat"><div class="num">{:,}</div><div class="label">traces</div></div>'.format(len(trace_spans)))
h.append('<div class="stat"><div class="num">{:,}</div><div class="label">detections</div></div>'.format(total_detections))
h.append('<div class="stat"><div class="num">{:.0f}s</div><div class="label">duration</div></div>'.format(duration))
h.append('<div class="stat"><div class="num">{}</div><div class="label">max/{}s</div></div>'.format(max_density, BUCKET))
h.append('<div class="stat"><div class="num">{:.0f}fps</div><div class="label">video fps</div></div>'.format(fps))
h.append('<div class="stat"><div class="num">{:.0f}Hz</div><div class="label">sample rate</div></div>'.format(hz))
h.append('<div class="stat"><div class="num">{:,}</div><div class="label">{}s buckets</div></div>'.format(num_buckets, BUCKET))
h.append('</div>')
# Database summary
h.append('<h2>Database Contents</h2>')
h.append('<table>')
h.append('<tr><th>Table</th><th style="text-align:right">Rows</th><th>Type</th></tr>')
for name, count in [
("videos", 1), ("chunk", len(trace_spans)),
("face_detections", total_detections), ("identities", len(id_names) if not IDENTITY else 1),
("tkg_nodes", tkg_nodes), ("tkg_edges", tkg_edges),
]:
h.append('<tr><td>{}</td><td style="text-align:right">{:,}</td><td>flat</td></tr>'.format(name, count))
for name, dim in [("chunk_embeddings", 768), ("face_embeddings", 512), ("voice_embeddings", 192)]:
count = vec_counts.get(name, 0)
h.append('<tr><td>{}</td><td style="text-align:right">{:,}</td><td>vec0 ({}D)</td></tr>'.format(name, count, dim))
h.append('</table>')
# TKG breakdown
if tkg_types:
h.append('<h2>TKG Nodes</h2>')
h.append('<div class="stats">')
for ntype, cnt in sorted(tkg_types.items()):
h.append('<div class="stat"><div class="num">{:,}</div><div class="label">{}</div></div>'.format(cnt, ntype))
h.append('</div>')
# 1. Density histogram
h.append('<h2>Face Density Over Time</h2>')
w_px = num_buckets * 2 + 20
h.append('<div class="viz" style="width:{}px;height:80px">'.format(w_px))
for b in range(num_buckets):
v = density.get(b, 0)
h_px = max(2, int(60 * v / max(1, max_density * 0.6))) if v > 0 else 0
if v == 0:
color = "#0d1117"
else:
i = min(v / (max(1, max_density * 0.5)), 1.0)
r = int(233 * i + 13 * (1 - i))
g = int(69 * i + 13 * (1 - i))
bv = int(96 * i + 23 * (1 - i))
color = "rgb({},{},{})".format(r, g, bv)
h.append('<span style="position:absolute;left:{}px;bottom:0;width:2px;height:{}px;background:{}" title="{}s: {} faces"></span>'.format(b*2+10, h_px, color, b*BUCKET, v))
h.append('</div>')
# 2. Trace timeline
h.append('<h2>Trace Timeline</h2>')
show_traces = min(len(trace_spans), 2000)
bar_h = 2
chart_height = show_traces * (bar_h + 1) + 10
h.append('<div class="viz" style="width:{}px;height:{}px">'.format(w_px, chart_height))
for i, (tid, fn0, fn1, t0, t1, cnt) in enumerate(trace_spans[:show_traces]):
left = int(t0 / duration * (w_px - 20)) + 10
width = max(3, int((t1 - t0) / duration * (w_px - 20)))
top = i * (bar_h + 1) + 5
opacity = 1.0 if cnt > 5 else 0.3
identity_note = ""
iid = trace_to_identity.get(tid)
if iid and iid in id_names:
identity_note = ", identity: {}".format(id_names[iid])
h.append('<span class="bar" style="left:{}px;top:{}px;width:{}px;height:{}px;opacity:{}" title="T{}: {:.0f}s{:.0f}s, {} faces{}"></span>'.format(
left, top, width, bar_h, opacity, tid, t0, t1, cnt, identity_note))
h.append('</div>')
# 3. Top identities
if not IDENTITY and top_identities:
h.append('<h2>Top Identities</h2>')
h.append('<table>')
h.append('<tr><th>ID</th><th>Name</th><th style="text-align:right">Faces</th><th style="text-align:right">Traces</th></tr>')
for iid, fc, tc in top_identities:
name = id_names.get(iid, "#{}".format(iid))[:50]
h.append('<tr><td style="color:#8b949e">{}</td><td>{}</td><td style="text-align:right">{:,}</td><td style="text-align:right">{}</td></tr>'.format(iid, name, fc, tc))
h.append('</table>')
h.append('</body></html>')
return '\n'.join(h)
html = build_html()
with open(OUT, 'w') as f:
f.write(html)
print("Saved: {}".format(OUT))
print("Traces: {}, Detections: {}, Duration: {:.0f}s, Sample: {:.0f}Hz".format(len(trace_spans), total_detections, duration, hz))
print("Size: {:.0f}KB".format(len(html) / 1024))

View File

@@ -0,0 +1,164 @@
#!/opt/homebrew/bin/python3.11
"""
Speaker Assignment: cluster voice vectors from Qdrant, assign speaker IDs to DB chunks.
"""
import json, sys, time
import psycopg2
import numpy as np
from urllib.request import Request, urlopen
from sklearn.cluster import AgglomerativeClustering
from sklearn.metrics.pairwise import cosine_similarity
UUID = sys.argv[1] if len(sys.argv) > 1 else "23b1c872379d4ec06479e5ed39eef4c5"
QDRANT = "http://localhost:6333"
DB = "dbname=momentry user=accusys"
COLLECTION = "momentry_dev_voice"
print(f"=== Speaker Assignment for {UUID} ===")
# Step 1: Read voice vectors from Qdrant
print("Reading voice vectors from Qdrant...")
vectors = []
chunk_ids = []
# We need to scroll through all points
offset = None
while True:
data = {"limit": 100, "with_payload": True, "with_vector": True}
if offset is not None:
data["offset"] = offset
req = Request(f"{QDRANT}/collections/{COLLECTION}/points/scroll",
data=json.dumps(data).encode(),
headers={"Content-Type": "application/json"}, method="POST")
resp = json.loads(urlopen(req).read())
result = resp["result"]
points = result.get("points", [])
if not points:
break
for pt in points:
payload = pt.get("payload", {})
cid = payload.get("chunk_id", "")
# Only get vectors for THIS UUID's chunks
# Filter by checking DB later, or rely on Qdrant payload
vectors.append(pt["vector"])
chunk_ids.append(cid)
offset = result.get("next_page_offset")
if offset is None:
break
print(f" Read {len(vectors)} vectors...")
print(f"Total vectors: {len(vectors)}")
# Step 2: Filter to only our UUID's chunks (from DB)
conn = psycopg2.connect(DB)
cur = conn.cursor()
cur.execute("SELECT chunk_id FROM dev.chunk WHERE file_uuid = %s AND chunk_type = 'sentence' ORDER BY id", (UUID,))
db_chunk_ids = set(row[0] for row in cur.fetchall())
print(f"DB chunk_ids: {len(db_chunk_ids)}")
# Filter vectors to match DB chunks
filtered_vectors = []
filtered_chunk_ids = []
for v, cid in zip(vectors, chunk_ids):
if cid in db_chunk_ids:
filtered_vectors.append(v)
filtered_chunk_ids.append(cid)
vectors = filtered_vectors
chunk_ids = filtered_chunk_ids
print(f"Matched vectors: {len(vectors)}")
# Sort by chunk_id (which is numeric string)
indices = sorted(range(len(chunk_ids)), key=lambda i: int(chunk_ids[i]) if chunk_ids[i].isdigit() else 0)
vectors = [vectors[i] for i in indices]
chunk_ids = [chunk_ids[i] for i in indices]
# Step 3: Read speaker_change from asr.json
asr_path = f"/Users/accusys/momentry/output_dev/{UUID}.asr.json"
with open(asr_path) as f:
asr_data = json.load(f)
segments = asr_data.get("segments", [])
speaker_changes = {}
for seg in segments:
speaker_changes[seg["chunk_id"]] = seg.get("speaker_change", False)
# Step 4: Cluster embeddings
print("Clustering...")
X = np.array(vectors)
# Compute cosine distance matrix
# Cosine distance = 1 - cosine_similarity
cos_sim = cosine_similarity(X)
cos_dist = 1 - cos_sim
# Use AgglomerativeClustering with cosine distance
# Determine optimal n_clusters by looking at speaker_change boundaries
# First pass: use speaker_change as hard boundaries to get initial clusters
# Then refine
# Simpler: use a distance threshold
n = len(vectors)
labels = np.full(n, -1, dtype=int)
current_speaker = 0
# Start with first chunk as speaker 0
labels[0] = current_speaker
centroids = [np.array(vectors[0])] # per-cluster centroid
for i in range(1, n):
has_change = speaker_changes.get(chunk_ids[i], False)
vec = np.array(vectors[i])
if has_change:
# Speaker change: check if this is a NEW speaker or returning to a previous one
# Compare with centroid of current speaker vs others
similarities = [float(np.dot(vec, c) / (np.linalg.norm(vec) * np.linalg.norm(c) + 1e-10)) for c in centroids]
best_sim = max(similarities) if similarities else 0
best_cluster = similarities.index(best_sim) if similarities else 0
if best_sim > 0.65 and best_cluster != current_speaker:
# Returning to a previous speaker
labels[i] = best_cluster
elif best_sim < 0.55:
# New speaker
current_speaker = len(centroids)
labels[i] = current_speaker
centroids.append(vec)
else:
# Stay with current speaker (false change detection)
labels[i] = current_speaker
centroids[current_speaker] = (centroids[current_speaker] + vec) / 2
else:
# No speaker change: same speaker as previous
labels[i] = current_speaker
centroids[current_speaker] = (centroids[current_speaker] + vec) / 2
n_speakers = len(set(labels))
print(f"Identified {n_speakers} unique speakers")
# Step 5: Update DB chunks with speaker assignment
print("Updating DB chunks...")
# Map: chunk_id -> speaker_id
speaker_map = {}
for cid, label in zip(chunk_ids, labels):
speaker_map[cid] = f"SPEAKER_{label}"
updated = 0
for cid, spk_id in speaker_map.items():
cur.execute("""
UPDATE dev.chunk SET metadata = COALESCE(metadata, '{}'::jsonb) || %s::jsonb
WHERE file_uuid = %s AND chunk_id = %s AND chunk_type = 'sentence'
""", (json.dumps({"speaker_id": spk_id}), UUID, cid))
updated += 1
conn.commit()
print(f"Updated {updated} chunks with speaker IDs")
# Step 6: Save speaker map
speaker_map_path = f"/Users/accusys/momentry/output_dev/{UUID}.speaker_map.json"
with open(speaker_map_path, "w") as f:
json.dump({"speakers": n_speakers, "assignments": speaker_map}, f, indent=2)
print(f"Speaker map saved: {speaker_map_path}")
cur.close()
conn.close()
print("=== Done ===")

View File

@@ -1,204 +0,0 @@
#!/opt/homebrew/bin/python3.11
"""
Split ASR segments at detected speaker change points.
Uses ECAPA-TDNN sub-window classification against reference centroids.
Output: new asrx_fine.json with fine-grained segments + parent_asr_idx reference.
"""
import json, sys, os, time, argparse, subprocess, tempfile, shutil
import numpy as np
from collections import Counter
from pathlib import Path
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
sys.path.insert(0, os.path.join(os.path.dirname(os.path.abspath(__file__)), "asrx_self"))
from main_fixed import SelfASRXFixed
from speaker_encoder import extract_speaker_embedding, normalize_embeddings
import torchaudio, psycopg2
SUB_WIN = 0.5
SUB_STRIDE = 0.25
CHANGE_CONFIRM = 2
MIN_DUR = 0.7
BATCH_SIZE = 500
def load_reference(uuid, db_url):
conn = psycopg2.connect(db_url)
cur = conn.cursor()
cur.execute("SELECT chunk_index, metadata->>'new_speaker_name' FROM dev.chunks WHERE file_uuid=%s AND chunk_type='sentence' ORDER BY chunk_index", (uuid,))
name_by_idx = dict(cur.fetchall())
conn.close()
asrx_path = f"/Users/accusys/momentry/output_dev/{uuid}.asrx.json"
asrx_full = json.load(open(asrx_path))
ref = {"Cary Grant": [], "Audrey Hepburn": [], "Unknown": []}
for i, seg in enumerate(asrx_full["segments"]):
name = name_by_idx.get(i, "Unknown")
if name in ref and i < len(asrx_full.get("embeddings", [])):
ref[name].append(np.array(asrx_full["embeddings"][i]))
centroids = {}
for name, el in ref.items():
if el:
c = np.mean(el, axis=0)
centroids[name] = c / (np.linalg.norm(c) + 1e-10)
name_to_speaker = {}
for i, seg in enumerate(asrx_full["segments"]):
name = name_by_idx.get(i, "Unknown")
sid = seg["speaker_id"]
name_to_speaker.setdefault(name, sid)
return centroids, name_to_speaker
def extract_audio(video_path, sr=16000):
tmp = tempfile.mkdtemp(prefix="asr_split_")
wav = os.path.join(tmp, "audio.wav")
subprocess.run(["ffmpeg", "-y", "-v", "quiet", "-i", video_path,
"-ar", str(sr), "-ac", "1", "-sample_fmt", "s16", wav], check=True, capture_output=True, timeout=300)
wav_data, sr_actual = torchaudio.load(wav)
if wav_data.shape[0] > 1:
wav_data = wav_data.mean(dim=0, keepdim=True)
return wav_data, sr_actual, tmp
def classify(emb, centroids):
return max(centroids, key=lambda n: float(np.dot(emb, centroids[n])))
def process_batch(asr_segs, wav, sr, centroids, encoder, offset_start=0):
ws = int(SUB_WIN * sr)
sw = int(SUB_STRIDE * sr)
results = []
for si, s in enumerate(asr_segs):
st = s["start"] - offset_start
et = s["end"] - offset_start
dur = et - st
if dur < 1.0:
a = wav[:, int(st*sr):int(et*sr)]
e = extract_speaker_embedding(encoder, a.numpy(), sr)
e /= np.linalg.norm(e) + 1e-10
results.append((s["start"], s["end"], classify(e, centroids), si))
continue
ss = int(st*sr); se = int(et*sr)
sub_e, sub_t = [], []
for wpos in range(ss, se-ws+1, sw):
chunk = wav[:, wpos:wpos+ws]
sub_e.append(extract_speaker_embedding(encoder, chunk.numpy(), sr))
sub_t.append(wpos/sr + offset_start)
if len(sub_e) < 3:
a = wav[:, ss:se]
e = extract_speaker_embedding(encoder, a.numpy(), sr)
e /= np.linalg.norm(e) + 1e-10
results.append((s["start"], s["end"], classify(e, centroids), si))
continue
sub_e = normalize_embeddings(np.array(sub_e))
names = []
for i in range(len(sub_e)):
names.append(classify(sub_e[i], centroids))
# Smooth
sm = list(names)
for i in range(1, len(names)-1):
sm[i] = Counter(names[max(0,i-1):min(len(names),i+2)]).most_common(1)[0][0]
# Find splits
splits = []
prev = sm[0]
for i in range(1, len(sm)):
if sm[i] != prev:
if i+CHANGE_CONFIRM < len(sm) and all(sm[i]==sm[j] for j in range(i, i+CHANGE_CONFIRM+1)):
splits.append(sub_t[i]); prev = sm[i]
elif i+CHANGE_CONFIRM >= len(sm):
splits.append(sub_t[i]); prev = sm[i]
if not splits:
results.append((s["start"], s["end"], Counter(names).most_common(1)[0][0], si))
else:
boundaries = [s["start"]] + splits + [s["end"]]
for pi in range(len(boundaries)-1):
ps, pe = boundaries[pi], boundaries[pi+1]
if pe-ps < MIN_DUR: continue
sub_i = [i for i, t in enumerate(sub_t) if ps <= t < pe]
lbl = Counter([names[i] for i in sub_i]).most_common(1)[0][0] if sub_i else Counter(names).most_common(1)[0][0]
results.append((round(ps,2), round(pe,2), lbl, si))
return results
def main():
parser = argparse.ArgumentParser()
parser.add_argument("--uuid", default="aeed71342a899fe4b4c57b7d41bcb692")
parser.add_argument("--output", help="Output path for fine ASRX JSON")
args = parser.parse_args()
UUID = args.uuid
BASE = "/Users/accusys/momentry/output_dev"
DB_URL = "postgresql://accusys@localhost:5432/momentry?host=/tmp"
VIDEO = "/Users/accusys/momentry/var/sftpgo/data/demo/Charade (1963) Cary Grant & Audrey Hepburn \uff5c Comedy Mystery Romance Thriller \uff5c Full Movie.mp4"
print(f"Processing {UUID}")
centroids, name_to_speaker = load_reference(UUID, DB_URL)
print(f"Centroids: {list(centroids.keys())}")
asr = json.load(open(f"{BASE}/{UUID}.asr.json"))
asr_segs = asr["segments"]
print(f"ASR segments: {len(asr_segs)}")
print("Extracting audio...")
wav, sr, tmp_dir = extract_audio(VIDEO)
print(f"Audio: {wav.shape[1]/sr:.0f}s")
inst = SelfASRXFixed()
encoder = inst.speaker_encoder
all_results = []
t0 = time.time()
for batch_start in range(0, len(asr_segs), BATCH_SIZE):
batch = asr_segs[batch_start:batch_start + BATCH_SIZE]
segs = process_batch(batch, wav, sr, centroids, encoder)
all_results.extend(segs)
pct = (batch_start + len(batch)) * 100 // len(asr_segs)
print(f" {batch_start+len(batch)}/{len(asr_segs)} ({pct}%) -> {len(all_results)} segments [{time.time()-t0:.0f}s]")
shutil.rmtree(tmp_dir, ignore_errors=True)
# Build output
spk_stats = {}
out_segs = []
# Assign sequential SPEAKER_X IDs based on name order
name_order = {name: i for i, name in enumerate(sorted(set(s[2] for s in all_results)))}
for start, end, name, asr_idx in all_results:
sid = f"SPEAKER_{name_order[name]}"
dur = end - start
spk_stats.setdefault(sid, {"count": 0, "duration": 0})
spk_stats[sid]["count"] += 1
spk_stats[sid]["duration"] += dur
out_segs.append({
"start_time": start,
"end_time": end,
"speaker_id": sid,
"speaker_name": name,
"parent_asr_idx": asr_idx,
})
output = {
"uuid": UUID,
"language": "en",
"segments": out_segs,
"speaker_stats": spk_stats,
"total_asr_segments": len(asr_segs),
"total_fine_segments": len(out_segs),
}
output_path = args.output or f"{BASE}/{UUID}.asrx_fine.json"
json.dump(output, open(output_path, "w"), indent=2)
print(f"\nSaved: {output_path}")
print(f"Segments: {len(out_segs)} (was {len(asr_segs)}, +{len(out_segs)-len(asr_segs)})")
print(f"Speakers: {len(spk_stats)}")
for sid, st in sorted(spk_stats.items()):
print(f" {sid}: {st['count']} segs, {st['duration']:.0f}s")
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,284 @@
#!/opt/homebrew/bin/python3.11
"""
One-pass ASR + Speaker Change Detection + Split → asr.json
"""
import json, os, sys, time, argparse, subprocess, tempfile, shutil
import numpy as np
from pathlib import Path
sys.path.insert(0, os.path.dirname(os.path.abspath(__file__)))
sys.path.insert(0, os.path.join(os.path.dirname(os.path.abspath(__file__)), "asrx_self"))
from speaker_encoder import load_speaker_encoder, extract_speaker_embedding, normalize_embeddings
import torchaudio
from faster_whisper import WhisperModel
SUB_WIN = 0.5
SUB_STRIDE = 0.25
MIN_DUR = 0.3
SIM_THRESHOLD = 0.45
CHANGE_CONFIRM = 2
def extract_audio(video_path, tmp_dir, sr=16000):
wav_path = os.path.join(tmp_dir, "audio.wav")
subprocess.run(["ffmpeg", "-y", "-v", "quiet", "-i", video_path,
"-ar", str(sr), "-ac", "1", "-sample_fmt", "s16", wav_path],
check=True, capture_output=True, timeout=300)
wav_data, sr_actual = torchaudio.load(wav_path)
if wav_data.shape[0] > 1:
wav_data = wav_data.mean(dim=0, keepdim=True)
return wav_data, sr_actual
def transcribe_pass1(model, wav_path, vad_params=None):
print(" [faster-whisper] Transcribing...")
if vad_params is None:
vad_params = {"min_silence_duration_ms": 500, "speech_pad_ms": 200}
segments, info = model.transcribe(wav_path, beam_size=5,
vad_filter=True, word_timestamps=True, vad_parameters=vad_params)
pass1 = []
for i, seg in enumerate(segments):
words = []
if seg.words:
for w in seg.words:
words.append({"word": w.word.strip(), "start": round(w.start,3), "end": round(w.end,3)})
pass1.append({
"index": i,
"start": round(seg.start, 3),
"end": round(seg.end, 3),
"text": seg.text.strip(),
"words": words,
})
print(f" Pass1 segments: {len(pass1)}")
return pass1
def detect_speaker_changes(wav_data, sr, pass1_segs, encoder, progress_step=100):
print(" [Speaker Detection] Scanning...")
ws = int(SUB_WIN * sr)
sw = int(SUB_STRIDE * sr)
change_points = [] # List[List[float]] → change times per pass1 segment
t0 = time.time()
for si, seg in enumerate(pass1_segs):
st = int(seg["start"] * sr)
et = int(seg["end"] * sr)
dur = seg["end"] - seg["start"]
if dur < 1.0:
change_points.append([])
continue
sub_embs = []
sub_times = []
for wpos in range(st, et - ws + 1, sw):
chunk = wav_data[:, wpos:wpos+ws]
emb = extract_speaker_embedding(encoder, chunk.numpy(), sr)
emb = emb / (np.linalg.norm(emb) + 1e-10)
sub_embs.append(emb)
sub_times.append(wpos / sr)
if len(sub_embs) < 3:
change_points.append([])
continue
sub_embs = normalize_embeddings(np.array(sub_embs))
cps = []
# Require CHANGE_CONFIRM consecutive low-similarity windows before registering a change
low_run = 0
for i in range(1, len(sub_embs)):
sim = float(np.dot(sub_embs[i-1], sub_embs[i]))
if sim < SIM_THRESHOLD:
low_run += 1
if low_run >= CHANGE_CONFIRM:
# Change point at the START of the low-sim run
cps.append(round(sub_times[i - low_run + 1], 2))
low_run = 0
else:
low_run = 0
change_points.append(cps)
if (si + 1) % progress_step == 0:
pct = (si + 1) * 100 // len(pass1_segs)
print(f" {si+1}/{len(pass1_segs)} ({pct}%) [{time.time()-t0:.0f}s]")
total_changes = sum(len(cps) for cps in change_points)
print(f" Speaker changes detected: {total_changes} in {len(pass1_segs)} segments ({time.time()-t0:.0f}s)")
return change_points
def build_segments(pass1_segs, change_points, wav_data, sr, asr_model, tmp_dir, fps=24.0):
print(" [Split] Building final segments...")
final = []
chunk_idx = 0
for si, seg in enumerate(pass1_segs):
cps = change_points[si]
if not cps:
final.append({
"chunk_id": str(chunk_idx),
"pass1_index": si,
"start_time": seg["start"],
"end_time": seg["end"],
"start_frame": int(seg["start"] * fps),
"end_frame": int(seg["end"] * fps),
"text": seg["text"],
})
chunk_idx += 1
continue
seg["split"] = True
boundaries = [seg["start"]] + cps + [seg["end"]]
for pi in range(len(boundaries) - 1):
ps, pe = boundaries[pi], boundaries[pi+1]
if pe - ps < MIN_DUR:
continue
# Try word_timestamp mapping first (wider tolerance)
sub_words = [w["word"] for w in seg["words"] if w["start"] >= ps - 0.3 and w["end"] <= pe + 0.3]
text = " ".join(sub_words).strip() if sub_words else ""
# Fallback: call faster-whisper on the sub-audio chunk
if not text:
import soundfile as sf
chunk_path = os.path.join(tmp_dir, f"sub_{chunk_idx}.wav")
a_chunk = wav_data[:, int(ps*sr):int(pe*sr)].numpy()[0]
if len(a_chunk) > sr * 0.3: # skip if < 0.3s
sf.write(chunk_path, a_chunk, sr)
try:
sub_segs, _ = asr_model.transcribe(chunk_path, beam_size=5,
vad_filter=True, vad_parameters={"min_silence_duration_ms": 100})
text = " ".join(s.text.strip() for s in sub_segs)
except:
pass
os.remove(chunk_path)
if not text:
text = " ".join([w["word"] for w in seg["words"]
if w["start"] >= ps - 0.5 and w["end"] <= pe + 0.5]).strip()
if not text:
text = seg["text"][:60]
final.append({
"chunk_id": str(chunk_idx),
"pass1_index": si,
"start_time": round(ps, 3),
"end_time": round(pe, 3),
"start_frame": int(ps * fps),
"end_frame": int(pe * fps),
"text": text,
"speaker_change": True,
})
chunk_idx += 1
print(f" Final segments: {len(final)}")
return final
def voice_vectors_to_qdrant(wav_data, sr, final_segs, encoder, qdrant_url="http://localhost:6333"):
print(" [Voice Vectors] Extracting 192D embeddings...")
embeddings = []
t0 = time.time()
for si, seg in enumerate(final_segs):
st = int(seg["start_time"] * sr)
et = int(seg["end_time"] * sr)
a_chunk = wav_data[:, st:et]
emb = extract_speaker_embedding(encoder, a_chunk.numpy(), sr)
emb = emb / (np.linalg.norm(emb) + 1e-10)
embeddings.append({"chunk_id": seg["chunk_id"], "embedding": emb.tolist()})
if (si + 1) % 500 == 0:
print(f" {si+1}/{len(final_segs)} [{time.time()-t0:.0f}s]")
print(f" Writing to Qdrant...")
from urllib.request import Request, urlopen
batch = []
for i, e in enumerate(embeddings):
batch.append({"id": i + 1, "vector": e["embedding"],
"payload": {"chunk_id": e["chunk_id"], "chunk_type": "sentence"}})
if len(batch) >= 100:
req = Request(f"{qdrant_url}/collections/momentry_dev_voice/points?wait=true",
data=json.dumps({"points": batch}).encode(),
headers={"Content-Type": "application/json"}, method="PUT")
try: urlopen(req)
except: pass
batch = []
# Flush remaining
if batch:
req = Request(f"{qdrant_url}/collections/momentry_dev_voice/points?wait=true",
data=json.dumps({"points": batch}).encode(),
headers={"Content-Type": "application/json"}, method="PUT")
try: urlopen(req)
except: pass
print(f" Voice vectors: {len(embeddings)} pts → Qdrant [{time.time()-t0:.0f}s]")
return embeddings
def main():
parser = argparse.ArgumentParser()
parser.add_argument("--video", default="/Users/accusys/momentry/var/sftpgo/data/demo/Charade (1963) Cary Grant & Audrey Hepburn Comedy Mystery Romance Thriller Full Movie.mp4")
parser.add_argument("--output", help="Output path for asr.json", default="/Users/accusys/momentry/output_dev/aeed71342a899fe4b4c57b7d41bcb692.asr.json")
parser.add_argument("--sample", type=int, help="Process only first N pass1 segments (for testing)")
parser.add_argument("--no-qdrant", action="store_true", help="Skip Qdrant upload")
args = parser.parse_args()
t0 = time.time()
# Load models
print("=== Loading Models ===")
asr_model = WhisperModel("small", device="cpu", compute_type="int8")
print(" faster-whisper small loaded")
encoder = load_speaker_encoder()
print(" ECAPA-TDNN loaded")
print()
# Extract audio
print("=== Audio Extraction ===")
tmp_dir = tempfile.mkdtemp(prefix="transcribe_")
wav_data, sr = extract_audio(args.video, tmp_dir)
print(f" Audio: {wav_data.shape[1]/sr:.0f}s, {sr}Hz")
wav_path = os.path.join(tmp_dir, "audio.wav")
print()
# Step 1: faster-whisper pass1
print("=== Step 1: Pass1 Transcription ===")
pass1_segs = transcribe_pass1(asr_model, wav_path)
if args.sample:
pass1_segs = pass1_segs[:args.sample]
print(f" SAMPLE MODE: limiting to {args.sample} segments")
print()
# Step 2: Speaker change detection
print("=== Step 2: Speaker Change Detection ===")
change_points = detect_speaker_changes(wav_data, sr, pass1_segs, encoder)
print()
# Step 3: Build final segments
print("=== Step 3: Build Final Segments ===")
final_segs = build_segments(pass1_segs, change_points, wav_data, sr, asr_model, tmp_dir)
print()
# Step 4: Voice vectors → Qdrant
if not args.no_qdrant:
print("=== Step 4: Voice Vectors → Qdrant ===")
voice_vectors_to_qdrant(wav_data, sr, final_segs, encoder)
print()
# Step 5: Write asr.json
print("=== Step 5: Write asr.json ===")
uuid = os.path.basename(args.output).replace(".asr.json", "")
output = {
"file_uuid": uuid,
"pass1": pass1_segs,
"segments": final_segs,
}
with open(args.output, "w") as f:
json.dump(output, f, indent=2, ensure_ascii=False)
sz = os.path.getsize(args.output)
print(f" {args.output} ({sz/1024:.0f} KB)")
# Cleanup
shutil.rmtree(tmp_dir, ignore_errors=True)
elapsed = time.time() - t0
print(f"\n=== Done ({elapsed:.0f}s) ===")
print(f" Pass1 segments: {len(pass1_segs)}")
print(f" Final segments: {len(final_segs)}")
fp = args.output
print(f" Output: {fp}")
if __name__ == "__main__":
main()

View File

@@ -0,0 +1,57 @@
#!/bin/bash
# Momentry Release Package — Verify Script
# Usage: bash verify.sh
set -euo pipefail
DIR="$(cd "$(dirname "$0")" && pwd)"
UUID=$(basename "$DIR")
PG_BIN="${PG_BIN:-/Users/accusys/pgsql/18.3/bin}"
DB_NAME="${DB_NAME:-momentry}"
DB_USER="${DB_USER:-accusys}"
echo "=== Package Verification ==="
echo "UUID: $UUID"
echo ""
# Check files
FILES=("data.sql" "file_info.json" "$UUID.sqlite" "$UUID.identities.json" "$UUID.asr.json" "$UUID.face.json" "$UUID.speaker_map.json")
echo "## 1. File Integrity"
for f in "${FILES[@]}"; do
if [ -f "$DIR/$f" ]; then
SIZE=$(ls -lh "$DIR/$f" | awk '{print $5}')
echo "$f ($SIZE)"
else
echo " ⚠️ $f (not found)"
fi
done
# Check DB (if accessible)
if "$PG_BIN/psql" -U "$DB_USER" -d "$DB_NAME" -c "SELECT 1" >/dev/null 2>&1; then
echo ""
echo "## 2. Database"
for tbl in chunk face_detections tkg_nodes tkg_edges identities identity_bindings; do
COUNT=$("$PG_BIN/psql" -U "$DB_USER" -d "$DB_NAME" -t -A -c "SELECT COUNT(*) FROM dev.$tbl WHERE file_uuid='$UUID' OR uuid='$UUID'" 2>/dev/null || echo "N/A")
echo " $tbl: $COUNT"
done
else
echo ""
echo "## 2. Database (offline — check $UUID.sqlite)"
if [ -f "$DIR/$UUID.sqlite" ]; then
python3 -c "
import sqlite3
conn = sqlite3.connect('$DIR/$UUID.sqlite')
c = conn.cursor()
for tbl in ['chunk', 'face_detections', 'identities', 'tkg_nodes', 'tkg_edges']:
c.execute(f'SELECT COUNT(*) FROM {tbl}')
print(f' {tbl}: {c.fetchone()[0]}')
conn.close()
" 2>/dev/null || echo " (sqlite3 unavailable)"
fi
fi
echo ""
echo "## 3. Pipeline Status"
echo " $("$PG_BIN/psql" -U "$DB_USER" -d "$DB_NAME" -t -A -c "SELECT status FROM dev.videos WHERE file_uuid='$UUID'" 2>/dev/null || echo "unknown")"
echo ""
echo "=== Verification Complete ==="