momentry_core/docs/PROCESSING_PIPELINE.md

# Video Processing Pipeline

| Item | Details |
|------|------|
| Author | Warren |
| Created | 2026-03-22 |
| Document version | V1.1 |

---

## Version History

| Version | Date | Purpose | Operator | Tool/Model |
|------|------|------|--------|-----------|
| V1.0 | 2026-03-22 | Created the document | Warren | OpenCode |
| V1.1 | 2026-03-26 | Updated diagram text (media_url→file_path) | OpenCode | deepseek-reasoner |

---

## Pipeline Architecture

```
┌─────────────────────────────────────────────────────────────────────────────┐
│                         Video Processing Pipeline                            │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │  Stage 1: JSON generation (Process)                                   │  │
│  │                                                                       │  │
│  │  video.mp4 ──→ [ASR]  ──→ asr.json   (speech recognition)             │  │
│  │            ──→ [CUT]  ──→ cut.json   (scene detection)                │  │
│  │            ──→ [ASRX] ──→ asrx.json  (speaker diarization)            │  │
│  │            ──→ [YOLO] ──→ yolo.json  (object detection)               │  │
│  │            ──→ [OCR]  ──→ ocr.json   (text recognition)               │  │
│  │            ──→ [Face] ──→ face.json  (face detection)                 │  │
│  │            ──→ [Pose] ──→ pose.json  (pose estimation)                │  │
│  └─────────────────────────────────────────────────────────────────────┘  │
│                                      ↓                                      │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │  Stage 2: Database import (Import)                                    │  │
│  │                                                                       │  │
│  │  .json files ──→ PostgreSQL (fs_json = true)                          │  │
│  │                      ↓                                                │  │
│  │                 pre_chunks table (from ASR, CUT)                      │  │
│  │                 frames table (from YOLO, OCR, Face, Pose)             │  │
│  └─────────────────────────────────────────────────────────────────────┘  │
│                                      ↓                                      │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │  Stage 3: Chunk generation (Chunk)                                    │  │
│  │                                                                       │  │
│  │  pre_chunks ──→ [Chunk Rule] ──→ chunks table                         │  │
│  │                      ↓                                                │  │
│  │              clean → plain text                                       │  │
│  └─────────────────────────────────────────────────────────────────────┘  │
│                                      ↓                                      │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │  Stage 4: Vectorization (Vectorize)                                   │  │
│  │                                                                       │  │
│  │  chunks ──→ [Embedding Model] ──→ vectors                             │  │
│  │                            ↓                                          │  │
│  │                     Qdrant (primary vector store)                     │  │
│  │                     PGVector (backup vector store)                    │  │
│  └─────────────────────────────────────────────────────────────────────┘  │
│                                      ↓                                      │
│  ┌─────────────────────────────────────────────────────────────────────┐  │
│  │  Stage 5: Search (Search)                                             │  │
│  │                                                                       │  │
│  │  Natural Language Query ──→ [Embedding] ──→ [Qdrant Search]           │  │
│  │                                    ↓                                  │  │
│  │                           results include file_path                   │  │
│  └─────────────────────────────────────────────────────────────────────┘  │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```

---

## CLI Commands

### Stage 1: JSON Generation (Process)

```bash
# Basic usage
cargo run --bin momentry -- process <uuid_or_path>

# Process only specific modules
cargo run --bin momentry -- process <uuid> --modules asr,cut

# Force reprocessing (ignore the completeness check)
cargo run --bin momentry -- process <uuid> --force

# Resume from the last checkpoint
cargo run --bin momentry -- process <uuid> --resume

# Run selected modules in the cloud (here only yolo)
cargo run --bin momentry -- process <uuid> --modules yolo,face --cloud yolo

# Full example
cargo run --bin momentry -- process /path/to/video.mp4 \
    --modules asr,cut,yolo,ocr \
    --cloud yolo
```

### Stage 2: Database Import (Import)

```bash
# Import currently runs automatically once process completes
# A standalone import command is planned
# cargo run --bin momentry -- import <uuid>
```

### Stage 3: Chunk Generation

```bash
# Generate chunks
cargo run --bin momentry -- chunk <uuid>
```

### Stage 4: Vectorization

```bash
# Vectorize chunks (uses the default model nomic-embed-text-v2-moe:latest)
cargo run --bin momentry -- vectorize <uuid>

# Specify the model explicitly
cargo run --bin momentry -- vectorize <uuid> --model nomic-embed-text-v2-moe:latest
```
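
Because import currently runs as part of `process`, a full run for one video comes down to three commands in order. The sketch below only assembles those command lines from the subcommands and flags documented above. The `build_commands` helper is hypothetical and not part of Momentry, and it assumes the video is addressed by UUID, since `chunk` and `vectorize` are shown taking only `<uuid>`.

```python
import shlex

# Subcommands documented above, in pipeline order. Import is omitted because
# it currently runs automatically at the end of `process`.
STAGES = ("process", "chunk", "vectorize")


def build_commands(uuid, modules=None, model=None):
    """Return one argv list per stage for a single video UUID (hypothetical helper)."""
    base = ["cargo", "run", "--bin", "momentry", "--"]
    commands = []
    for stage in STAGES:
        argv = base + [stage, uuid]
        if stage == "process" and modules:
            argv += ["--modules", ",".join(modules)]
        if stage == "vectorize" and model:
            argv += ["--model", model]
        commands.append(argv)
    return commands


if __name__ == "__main__":
    for argv in build_commands("a1b10138a6bbb0cd", modules=["asr", "cut"]):
        # Printing only; replace with subprocess.run(argv, check=True) to execute.
        print(shlex.join(argv))
```

Running the stages with `check=True` would stop the chain at the first failing command, which matches the strictly sequential layout of the diagram.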

---

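## Processing Mode Options

### --force (force reprocessing)

- Deletes the existing JSON files
- Processes from scratch
- Use when: processing failed, a model was updated, or results must be regenerated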

```bash
# Force YOLO to be reprocessed
cargo run --bin momentry -- process <uuid> --modules yolo --force
```

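### --resume (resume)

- Checks the progress recorded in the existing JSON
- Continues processing from the checkpoint
- Use when: processing was interrupted, or to recover after a system crash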

```bash
# Continue from the last checkpoint
cargo run --bin momentry -- process <uuid> --resume
```

### Default Behavior (Smart Mode)

- JSON complete: skip
- JSON incomplete: warn and skip (requires --resume or --force)
- JSON missing: process

```
Output:
ASR: ✓ Already complete, skipping

⚠️  Found incomplete JSON file: /path/to/yolo.json
   Progress: 73800/412343 (17.9%)
   Use --resume to continue from checkpoint
   Use --force to reprocess from scratch
YOLO: ✓ Already complete, skipping
```
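
The three default-mode rules, together with `--force` and `--resume`, amount to a small decision table. The sketch below is illustrative only: the names `JsonState`, `decide`, and `progress_percent` are hypothetical and not taken from the Momentry source. It assumes that `--force` takes precedence over `--resume` and that `--resume` on a complete file still skips; the document doesn't confirm either.

```python
from enum import Enum


class JsonState(Enum):
    MISSING = "missing"
    INCOMPLETE = "incomplete"
    COMPLETE = "complete"


def decide(state, force=False, resume=False):
    """Return 'process', 'resume', 'skip', or 'warn_skip' for one module."""
    if force:
        return "process"  # assumed precedence: delete existing JSON, start over
    if state is JsonState.MISSING:
        return "process"
    if state is JsonState.COMPLETE:
        return "skip"
    # Remaining case: the JSON exists but is incomplete.
    return "resume" if resume else "warn_skip"


def progress_percent(done, total):
    """Percentage as shown in the warning, rounded to one decimal place."""
    return round(100.0 * done / total, 1)
```

With the figures from the warning above, `progress_percent(73800, 412343)` gives `17.9`, matching the reported `Progress: 73800/412343 (17.9%)`.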

---

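## Available Modules

| Module | Function | Output | Use |
|------|------|------|------|
| asr | Automatic speech recognition | asr.json | Speech-to-text |
| cut | Scene detection | cut.json | Video segmentation |
| asrx | Speaker diarization | asrx.json | Multi-speaker conversation analysis |
| yolo | Object detection | yolo.json | Object recognition |
| ocr | Text recognition | ocr.json | On-screen text |
| face | Face detection | face.json | Face recognition |
| pose | Pose estimation | pose.json | Human body pose |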
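
The architecture diagram routes ASR and CUT output to `pre_chunks` and YOLO, OCR, Face, and Pose output to `frames`; it doesn't state where ASRX output goes. The following sketch validates a `--modules` value and groups the selected modules by destination table. The helper names are hypothetical, and modules without a documented destination are left out of the grouping.

```python
# Module names from the table above.
KNOWN_MODULES = ("asr", "cut", "asrx", "yolo", "ocr", "face", "pose")

# Destination tables as drawn in the architecture diagram.
# asrx is absent because the document does not state its destination.
TABLE_FOR = {
    "asr": "pre_chunks",
    "cut": "pre_chunks",
    "yolo": "frames",
    "ocr": "frames",
    "face": "frames",
    "pose": "frames",
}


def parse_modules(arg):
    """Split a --modules value such as 'asr,cut' and reject unknown names."""
    names = [m.strip() for m in arg.split(",") if m.strip()]
    unknown = [m for m in names if m not in KNOWN_MODULES]
    if unknown:
        raise ValueError(f"unknown module(s): {', '.join(unknown)}")
    return names


def group_by_table(names):
    """Group module names by the table their JSON is imported into."""
    grouped = {}
    for name in names:
        table = TABLE_FOR.get(name)
        if table is not None:
            grouped.setdefault(table, []).append(name)
    return grouped
```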

---

## Embedding Model Selection

### Unified Embedding Model

Momentry Core uses **`nomic-embed-text-v2-moe:latest`** as the single embedding model for all rules:

```bash
# Unified model (used by Rules 1/2/3)
--model nomic-embed-text-v2-moe:latest
```

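### Model Characteristics

| Property | Description |
|------|------|
| **Model name** | `nomic-embed-text-v2-moe:latest` |
| **Vector dimension** | 768 |
| **Multilingual support** | ✅ Full support (English, Chinese, Japanese, Korean, and others) |
| **Architecture** | Mixture of Experts (MoE) |
| **Inference speed** | Fast; suitable for real-time applications |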
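
Since every rule shares one 768-dimensional model, any two stored vectors can be compared directly, and a dimension check catches vectors produced by a different model. The sketch below shows such a check plus a cosine similarity. Cosine is an assumption here, because the document doesn't say which distance metric the Qdrant collection uses, and both function names are hypothetical.

```python
import math

EXPECTED_DIM = 768  # vector dimension listed in the table above


def check_dim(vector):
    """Raise if a vector does not have the expected dimension."""
    if len(vector) != EXPECTED_DIM:
        raise ValueError(f"expected {EXPECTED_DIM} dimensions, got {len(vector)}")
    return vector


def cosine_similarity(a, b):
    """Cosine similarity of two embeddings (assumed metric, see lead-in)."""
    check_dim(a)
    check_dim(b)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```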

### Usage
```bash
# Vectorize command
cargo run --bin momentry -- vectorize <uuid> --model nomic-embed-text-v2-moe:latest
```

---

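## Database Storage

### PostgreSQL (primary relational database)

- Video metadata
- Chunks
- Pre-chunks
- Frames
- User data

### Qdrant (primary vector database)

- Chunk vectors
- Similarity search

### PGVector (backup vector database)

- Copies of the chunk vectors
- Failover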

---

## Pipeline Status Tracking

### PostgreSQL Status Fields

```sql
-- Video processing status
videos.status: 'pending' | 'processing' | 'completed' | 'failed'

-- File processing status
videos.fs_json: true/false
videos.fs_chunks: true/false
videos.fs_vectors: true/false

-- pre_chunks status
pre_chunks.imported: true/false

-- frames status
frames.imported: true/false

-- chunks status
chunks.cleaned: true/false
chunks.vectorized: true/false
```
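
The three `fs_*` flags are enough to tell which stage a video needs next. A minimal sketch, assuming each flag becomes true only when its stage has finished and that the stages always run in the order shown in the diagram (`next_stage` is a hypothetical helper):

```python
def next_stage(fs_json, fs_chunks, fs_vectors):
    """Return the CLI subcommand to run next, or None when nothing is pending."""
    if not fs_json:
        return "process"    # JSON generation; import runs as part of it
    if not fs_chunks:
        return "chunk"
    if not fs_vectors:
        return "vectorize"
    return None
```

For example, a row with `fs_json = true` and both other flags false would map to `chunk`.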

### Progress Query API

```bash
# Query processing progress
curl http://localhost:3002/api/v1/progress/{uuid}

# Example response
{
  "uuid": "a1b10138a6bbb0cd",
  "file_name": "video.mp4",
  "overall_progress": 65,
  "cpu_percent": 45.2,
  "gpu_percent": 98.5,
  "memory_mb": 8500,
  "processors": [
    {"name": "asr", "status": "complete", "progress": 100},
    {"name": "cut", "status": "complete", "progress": 100},
    {"name": "yolo", "status": "progress", "progress": 45},
    {"name": "ocr", "status": "pending", "progress": 0}
  ]
}
```
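
A client can poll this endpoint and group processors by status. The sketch below uses only the Python standard library and the fields visible in the example response. `fetch_progress` and `summarize` are hypothetical helpers, and the status values are limited to the three that appear in the example (`complete`, `progress`, `pending`); the API may define others.

```python
import json
import urllib.request

# Example response copied from this document.
SAMPLE = json.loads("""
{
  "uuid": "a1b10138a6bbb0cd",
  "file_name": "video.mp4",
  "overall_progress": 65,
  "cpu_percent": 45.2,
  "gpu_percent": 98.5,
  "memory_mb": 8500,
  "processors": [
    {"name": "asr", "status": "complete", "progress": 100},
    {"name": "cut", "status": "complete", "progress": 100},
    {"name": "yolo", "status": "progress", "progress": 45},
    {"name": "ocr", "status": "pending", "progress": 0}
  ]
}
""")


def fetch_progress(uuid, base_url="http://localhost:3002"):
    """GET the progress document for one video (needs a running server)."""
    with urllib.request.urlopen(f"{base_url}/api/v1/progress/{uuid}") as resp:
        return json.load(resp)


def summarize(progress):
    """Return (overall_progress, {status: [processor names]})."""
    by_status = {}
    for proc in progress["processors"]:
        by_status.setdefault(proc["status"], []).append(proc["name"])
    return progress["overall_progress"], by_status


if __name__ == "__main__":
    overall, by_status = summarize(SAMPLE)
    print(f"{overall}% overall:", by_status)
```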

---

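## Next Steps

1. **API endpoints** - support the --modules and --cloud parameters
2. **Standalone import command** - separate the import step
3. **Standalone chunk command** - separate chunk generation
4. **Standalone vectorize command** - separate vectorization
5. **Model management** - add, select, and preview models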