docs: file_uuid generation rules for M4
864
docs_v1.0/GUIDES/Demo_EndToEnd.md
Normal file
@@ -0,0 +1,864 @@
---
document_type: "demo_guide"
service: "MOMENTRY_CORE"
title: "Pipeline Demo End-to-End"
date: "2026-05-15"
version: "V1.0"
status: "active"
owner: "M5"
created_by: "OpenCode"
tags:
- "demo"
- "pipeline"
- "end-to-end"
- "api"
ai_query_hints:
- "How to run the end-to-end pipeline demo"
- "Pipeline processing flow"
- "Complete flow for registering a video and triggering processing"
related_documents:
- "GUIDES/API_ENDPOINTS.md"
- "GUIDES/Pipeline_API_Demo.md"
---

# Momentry Core: Pipeline Demo End-to-End

| Item | Details |
|------|------|
| Created by | OpenCode |
| Created on | 2026-05-15 |
| Document version | V1.0 |
| Target audience | developer |
| Prerequisites | An API key and basic familiarity with pipeline concepts |

---

## Table of Contents

### Pipeline Phases

| Phase | Step | What happens |
|-------|------|-------------|
| **Pre-processing** | 1–4 | System check, scan, register, probe |
| **During processing** | 5–6 | Submit job → Worker picks up → Each processor runs (pending→running→completed) |
| **Post-processing** | 7–9 | All results → Search → Identities → Schema verification |

---

## 1. Check System Status

```bash
API="http://api.momentry.ddns.net"
KEY="muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"

# Basic health
curl -sf "$API/health" | jq '{status, version, build_git_hash, uptime_ms}'

# Detailed health
curl -sf "$API/health/detailed" | jq '{
  services,
  schema: .schema.ok,
  scripts: .pipeline.scripts_count,
  integrity: .pipeline.scripts_integrity,
  procs: [.pipeline.processors | to_entries[] | select(.value == true and .key != "total_py_files") | .key]
}'
```

Output (in this capture `schema` is `false` and script integrity is 332/345, so the deployment is not fully healthy; the capture in section 9 shows a healthy state):

```json
{
  "status": "ok",
  "version": "1.0.0",
  "build_git_hash": "c41f7e0c",
  "uptime_ms": 2756192
}
{
  "services": {"postgres": "ok", "redis": "ok", "qdrant": "ok"},
  "schema": false,
  "scripts": 291,
  "integrity": {"matched": 332, "total": 345, "ok": false},
  "procs": ["asr","yolo","face","pose","ocr","cut","caption","scene","story","asrx","probe","visual_chunk"]
}
```
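
The health payload can also be checked programmatically instead of by eye. The following is a minimal sketch, assuming the field names used by the `jq` filter above (`services`, `schema.ok`, `pipeline.scripts_integrity`); the function name `health_problems` is hypothetical, and the code accepts a service entry either as a plain string or as an object with a `status` key, since both shapes appear in this guide.

```python
from typing import Any


def health_problems(detailed: dict[str, Any]) -> list[str]:
    """Return human-readable problems found in a /health/detailed payload."""
    problems: list[str] = []
    for name, state in detailed.get("services", {}).items():
        # A service entry may be a plain string or a {"status": ...} object.
        status = state.get("status") if isinstance(state, dict) else state
        if status != "ok":
            problems.append(f"service {name} is {status}")
    if not detailed.get("schema", {}).get("ok", False):
        problems.append("schema migrations are not fully applied")
    integrity = detailed.get("pipeline", {}).get("scripts_integrity", {})
    if not integrity.get("ok", False):
        matched, total = integrity.get("matched"), integrity.get("total")
        problems.append(f"script integrity {matched}/{total}")
    return problems


# Sample built from the values in the capture above (raw payload shape assumed).
sample = {
    "services": {"postgres": "ok", "redis": "ok", "qdrant": "ok"},
    "schema": {"ok": False},
    "pipeline": {"scripts_integrity": {"matched": 332, "total": 345, "ok": False}},
}
print(health_problems(sample))
```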

---

## 2. Scan Files

Scan the server for all files matching `exasan` (the pattern supports regular expressions):

```bash
curl -sf -H "X-API-Key: $KEY" "$API/api/v1/files/scan?pattern=exasan" | \
  jq '[.files[] | {uuid: .file_uuid, name: .file_name, size: .file_size}]'
```

Output (excerpt):

```json
[
  {"uuid": "dd61fda85fee441f...", "name": "ExaSAN PCIe series - Director Ou Yu-Zhi Shares His Experience.mp4", "size": 6827600},
  {"uuid": "8e2e98c49355935f...", "name": "ExaSAN Webinar by Blake Jones, Vision2see.mp4", "size": 38635889},
  {"uuid": "477d8fa7bc0e1a7...", "name": "Thunderbolt ExaSAN at CCBN.mp4", "size": 13126748}
]
```

**Note**: `files/scan` can also list every file, or be used for batch registration. If no pattern is given, it returns all files under the server's `sftpgo/data/demo/` directory.

---

## 3. Register or Confirm

If the file is not yet registered, use the register API. If it already exists (as in this demo), check its status directly:

```bash
UUID="dd61fda85fee441fdd00ab5528213ff7"

# Check file status
curl -sf -H "X-API-Key: $KEY" "$API/api/v1/file/${UUID}" | jq '{uuid: .file_uuid[0:16], name: .file_name, status, duration, fps}'

# If the file does not exist, register it:
# curl -sf -X POST -H "X-API-Key: $KEY" -H "Content-Type: application/json" \
#   -d '{"file_path": "/path/to/video.mp4"}' \
#   "$API/api/v1/files/register" | jq '.'
```

**Registration flow**:
```
POST /files/register
├─ SHA256 content_hash (dedup check)
├─ file_name conflict check (automatic rename)
├─ Pre-process (SHA256 + ffprobe + UUID → .pre.json)
├─ UUID = f(mac, mtime, path, filename)
├─ Unified probe (video→ffprobe, doc→Python)
└─ INSERT INTO videos
```
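
The first step of the flow above, the SHA256 content hash used for deduplication, can be sketched as follows. This is an illustration only: the helper names are hypothetical, and `illustrative_file_uuid` is a stand-in that merely shows a deterministic function of `(mac, mtime, path, filename)` yielding 32 hex characters. The actual rule `f` is not specified in this guide and may differ entirely.

```python
import hashlib
import tempfile
from pathlib import Path


def content_hash(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file through SHA-256 so large videos need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def is_duplicate(path: Path, known_hashes: set[str]) -> bool:
    """Dedup check: True if a file with identical content is already known."""
    return content_hash(path) in known_hashes


def illustrative_file_uuid(mac: str, mtime: float, path: str, filename: str) -> str:
    """HYPOTHETICAL stand-in for f(mac, mtime, path, filename); not the real rule."""
    material = "|".join([mac, repr(mtime), path, filename]).encode("utf-8")
    return hashlib.sha256(material).hexdigest()[:32]


# Two files with identical bytes hash to the same value, so the second is a duplicate.
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "a.mp4"
    second = Path(tmp) / "b.mp4"
    first.write_bytes(b"same bytes")
    second.write_bytes(b"same bytes")
    print(is_duplicate(second, {content_hash(first)}))
```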

---

## 4. Verify Probe

The probe endpoint returns ffprobe metadata about the registered file.

```bash
# Substitute the actual file_uuid from step 3
FILE_UUID="e1111111111111111111111111111111"

curl -s -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
  "http://api.momentry.ddns.net/api/v1/file/${FILE_UUID}/probe" | python3 -m json.tool
```

Output (abbreviated):

```json
{
  "file_uuid": "e1111111111111111111111111111111",
  "file_name": "demo_test_video.mp4",
  "duration": 5.005,
  "width": 640,
  "height": 480,
  "fps": 24.0,
  "total_frames": 120,
  "cached": true,
  "format": {
    "filename": "/tmp/demo_test_video.mp4",
    "format_name": "mov,mp4,m4a,3gp,3g2,mj2",
    "duration": "5.005000",
    "size": "98304",
    "bit_rate": "157184"
  },
  "streams": [
    {"index": 0, "codec_type": "video", "codec_name": "h264", "width": 640, "height": 480, ...},
    {"index": 1, "codec_type": "audio", "codec_name": "aac", ...}
  ]
}
```

**Error handling** (Bug #3 fix):
- Non-existent UUID → `{"error":"Video not found"}` + HTTP 404
- File deleted from disk → `{"error":"File does not exist at registered path"}` + HTTP 404
- ffprobe failure → `{"error":"ffprobe failed: ..."}` + HTTP 500

### ⚡ Intermediate Check (Bug #3): Probe Error Verification

Verify that both error cases return a proper JSON body and HTTP status code instead of a bare 500:

```bash
PG_BIN="/Users/accusys/pgsql/18.3/bin"  # psql location, same as in the later steps

echo "=== Non-existent UUID → expect 404 ==="
curl -s -w "\nHTTP: %{http_code}\n" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
  "http://api.momentry.ddns.net/api/v1/file/bad_uuid_12345/probe"
# Expect: {"error":"Video not found","file_uuid":"bad_uuid_12345"} HTTP 404

echo ""
echo "=== Non-existent file path → expect 404 ==="
# Temporarily change file_path to a non-existent location
"$PG_BIN/psql" -U accusys -d momentry -c \
  "UPDATE dev.videos SET file_path = '/tmp/NONEXISTENT_FILE' WHERE file_uuid = '${FILE_UUID}'"
curl -s -w "\nHTTP: %{http_code}\n" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
  "http://api.momentry.ddns.net/api/v1/file/${FILE_UUID}/probe"
# Expect: {"error":"File does not exist at registered path",...} HTTP 404
# Restore path
"$PG_BIN/psql" -U accusys -d momentry -c \
  "UPDATE dev.videos SET file_path = '/tmp/demo_test_video.mp4' WHERE file_uuid = '${FILE_UUID}'"
```

Output:
```
=== Non-existent UUID → expect 404 ===
{"error":"Video not found","file_uuid":"bad_uuid_12345"}
HTTP: 404

=== Non-existent file path → expect 404 ===
{"error":"File does not exist at registered path","file_uuid":"e1111111111111111111111111111111","file_path":"/tmp/NONEXISTENT_FILE"}
HTTP: 404
```
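
A client that calls the probe endpoint might map these responses to outcomes as in the sketch below. It relies only on the status codes and `error` strings listed in this section; the function name and the returned labels are hypothetical, not part of the API.

```python
import json


def classify_probe_response(status_code: int, body: str) -> str:
    """Map a probe HTTP status and response body to a short outcome label."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # A non-JSON body is the "bare 500" behaviour the fix is meant to remove.
        return "non_json_error"
    if not isinstance(payload, dict):
        return "unexpected"
    error = str(payload.get("error", ""))
    if status_code == 200 and not error:
        return "ok"
    if status_code == 404 and error == "Video not found":
        return "unknown_uuid"
    if status_code == 404 and error == "File does not exist at registered path":
        return "file_missing_on_disk"
    if status_code == 500 and error.startswith("ffprobe failed"):
        return "ffprobe_failure"
    return "unexpected"


print(classify_probe_response(404, '{"error":"Video not found","file_uuid":"bad_uuid_12345"}'))
```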

---

## 5. Process Video

Trigger pipeline processing for specific processors. The available processors are:

| Processor | Function | Script |
|-----------|----------|--------|
| `asr` | Speech-to-text (faster-whisper) | `asr_processor.py` |
| `cut` | Scene detection (PySceneDetect) | `cut_processor.py` |
| `yolo` | Object detection (YOLOv8) | `yolo_processor.py` |
| `face` | Face detection (InsightFace) | `face_processor.py` |
| `pose` | Pose estimation (MediaPipe) | `pose_processor.py` |
| `ocr` | Text detection (PaddleOCR) | `ocr_processor.py` |
| `asrx` | Speaker diarization | `asrx_processor.py` |
| `visual_chunk` | Visual content analysis | `visual_chunk_processor.py` |
| `scene` | Scene classification | `scene_classifier.py` |
| `story` | Story generation (LLM) | `story_processor.py` |
| `caption` | Caption generation | `caption_processor.py` |

```bash
# Trigger only ASR + CUT for quick test
curl -s -X POST "http://api.momentry.ddns.net/api/v1/file/${FILE_UUID}/process" \
  -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
  -H "Content-Type: application/json" \
  -d '{"processors": ["asr", "cut"]}' | python3 -m json.tool
```

Output:
```json
{
  "job_id": 161,
  "file_uuid": "e1111111111111111111111111111111",
  "status": "PENDING",
  "pids": [],
  "message": "Processing triggered for demo_test_video.mp4"
}
```

**Processing flow**:
```
POST /process → trigger_processing()
├─ Validate file exists (DB lookup)
├─ Create monitor_job (status: PENDING)
├─ Create processor_result rows for each requested processor (status: pending)
└─ Response { job_id, status: "PENDING" }
```

**Note**: If no processors are specified, the following default set is used. It does not include `scene`, `story`, or `caption`, which must be requested explicitly:
```json
{"processors": ["asr", "cut", "yolo", "ocr", "face", "pose", "asrx", "visual_chunk"]}
```
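
Because a misspelled processor name is easy to send, a client may want to validate the list before posting it. The sketch below builds the request body from the processor names in the table above and the default set shown in the note; `build_process_body` is a hypothetical helper, and whether the server itself rejects unknown names is not stated in this guide.

```python
import json
from typing import Iterable, Optional

# Names taken from the processor table in this section.
KNOWN_PROCESSORS = (
    "asr", "cut", "yolo", "face", "pose", "ocr",
    "asrx", "visual_chunk", "scene", "story", "caption",
)
# Default set as listed in the note above.
DEFAULT_PROCESSORS = ("asr", "cut", "yolo", "ocr", "face", "pose", "asrx", "visual_chunk")


def build_process_body(processors: Optional[Iterable[str]] = None) -> str:
    """Return the JSON body for POST /process, rejecting unknown processor names."""
    chosen = list(processors) if processors else list(DEFAULT_PROCESSORS)
    unknown = [name for name in chosen if name not in KNOWN_PROCESSORS]
    if unknown:
        raise ValueError(f"unknown processors: {unknown}")
    # Drop repeated names while keeping the caller's order.
    deduped = list(dict.fromkeys(chosen))
    return json.dumps({"processors": deduped})


print(build_process_body(["asr", "cut"]))
```

The returned string can be passed to `curl -d` in place of the literal body used in the example request.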

### ⚡ Intermediate Check: Verify Job + Processor Results after Trigger

```bash
PG_BIN="/Users/accusys/pgsql/18.3/bin"

# Check monitor_jobs table
# (-x gives expanded output; psql cannot mix SQL and \gx inside a single -c string)
"$PG_BIN/psql" -U accusys -d momentry -x -c "
SELECT id, uuid, status, current_processor,
       to_char(created_at, 'HH24:MI:SS') AS created
FROM dev.monitor_jobs
WHERE uuid = '${FILE_UUID}'
ORDER BY id DESC LIMIT 1
"

# Check processor_results table
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT id, processor, status
FROM dev.processor_results
WHERE file_uuid = '${FILE_UUID}'
ORDER BY id
"
```

Output:
```
-[ RECORD 1 ]------+-----------------------------
id | 161
uuid | e1111111111111111111111111111111
status | PENDING
current_processor | (null)
created | 19:00:30

id | processor | status
----+-----------+---------
1 | asr | pending
2 | cut | pending
```

**Checklist after trigger:**
- [ ] `monitor_jobs.status = 'PENDING'`: job created, awaiting worker
- [ ] `processor_results` rows match requested processors (2 rows for `asr`, `cut`)
- [ ] Each `processor_results.status = 'pending'`: not yet executed

---

## 6. Worker Execution

The worker polls for pending jobs at a fixed interval (`--poll-interval`, in seconds) and executes them, with concurrency capped by `--max-concurrent`.

```bash
DATABASE_SCHEMA=dev cargo run --bin momentry_playground -- worker \
  --max-concurrent 2 --poll-interval 5
```

Or in the background:
```bash
DATABASE_SCHEMA=dev nohup target/debug/momentry_playground worker \
  --max-concurrent 2 --poll-interval 5 > /tmp/worker_demo.log 2>&1 &
```

**Worker flow**:
```
Worker loop (every 5 seconds):
├─ Poll: SELECT * FROM monitor_jobs WHERE status = 'PENDING'
├─ Set job status → RUNNING
├─ For each pending processor:
│ ├─ SHA256 integrity check (verify_script_integrity)
│ │ └─ checksums.sha256 manifest lookup
│ ├─ Execute script via PythonExecutor
│ │ └─ Command: {MOMENTRY_PYTHON_PATH} scripts/<processor>.py <args>
│ ├─ Verify output (file exists, content valid)
│ └─ Update processor_result (completed/failed)
├─ Check completion: all processors done?
├─ Yes → Set job + video status → COMPLETED
└─ No → Wait for next poll cycle
```
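
The integrity step in the flow above can be illustrated with a small manifest check. This is a sketch under assumptions, not the worker's `verify_script_integrity` implementation: it assumes `checksums.sha256` uses the common `sha256sum` layout (`<hex digest>  <file name>` per line) and that entries are keyed by bare file name. The helper names are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def load_manifest(manifest: Path) -> dict[str, str]:
    """Parse lines of the form '<sha256 hex>  <file name>' into {name: digest}."""
    entries: dict[str, str] = {}
    for line in manifest.read_text(encoding="utf-8").splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            digest, name = parts
            # sha256sum marks binary mode with a leading '*'; drop it if present.
            entries[name.strip().lstrip("*")] = digest.lower()
    return entries


def verify_script(script: Path, manifest: dict[str, str]) -> bool:
    """True only if the script is listed and its current hash matches the manifest."""
    expected = manifest.get(script.name)
    if expected is None:
        return False
    return hashlib.sha256(script.read_bytes()).hexdigest() == expected


# Demonstration: a script passes until its content changes.
with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "asr_processor.py"
    script.write_text("print('asr')\n", encoding="utf-8")
    digest = hashlib.sha256(script.read_bytes()).hexdigest()
    manifest_path = Path(tmp) / "checksums.sha256"
    manifest_path.write_text(f"{digest}  asr_processor.py\n", encoding="utf-8")
    manifest = load_manifest(manifest_path)
    print(verify_script(script, manifest))
    script.write_text("print('tampered')\n", encoding="utf-8")
    print(verify_script(script, manifest))
```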

**Worker log output**:
```
[CHECKSUMS] Loaded 345 entries from checksums.sha256
[INTEGRITY] asr_processor.py checksum OK
[ASR] Starting asr_processor.py
[INTEGRITY] cut_processor.py checksum OK
[CUT] Starting cut_processor.py
[ASR] Completed successfully
[CUT] Completed successfully
check_and_complete_job: results=2/2 → Job COMPLETED
```

### ⚡ Intermediate Check: Poll Progress During Worker Execution

While the worker is running, poll the progress endpoint to watch state transitions:

```bash
# Poll every 5 seconds until completed or failed (at most 12 attempts)
FILE_UUID="e1111111111111111111111111111111"
for i in $(seq 1 12); do
  sleep 5
  STATUS=$(curl -sf -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
    "http://api.momentry.ddns.net/api/v1/progress/${FILE_UUID}" \
    | python3 -c "import json,sys;d=json.load(sys.stdin);print(d.get('status','?'))" 2>/dev/null || echo "pending")
  echo "Poll $i: status=$STATUS"
  if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then break; fi
done
```

Output (typical):
```
Poll 1: status=registered ← worker hasn't picked it up yet
Poll 2: status=pending ← worker picked up, job status changed
Poll 3: status=processing ← worker running ASR
Poll 4: status=processing ← worker running CUT
Poll 5: status=completed ← all done
```
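
The same polling logic can be written as a reusable function. In the sketch below the HTTP call is passed in as `fetch_status`, so the loop can be exercised without a server; the function name, the injected callables, and the replayed status sequence are illustrative assumptions, while the terminal states `completed` and `failed` come from the shell loop above.

```python
import time
from typing import Callable

TERMINAL_STATES = {"completed", "failed"}


def wait_for_terminal_status(
    fetch_status: Callable[[], str],
    attempts: int = 12,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[str, list[str]]:
    """Poll until a terminal state is seen or the attempts run out.

    Returns the last status observed and the full history of observed statuses.
    """
    history: list[str] = []
    status = "unknown"
    for _ in range(attempts):
        sleep(interval)
        status = fetch_status()
        history.append(status)
        if status in TERMINAL_STATES:
            break
    return status, history


# Replay the typical sequence shown above without touching the network.
replay = iter(["registered", "pending", "processing", "processing", "completed"])
final, seen = wait_for_terminal_status(lambda: next(replay), sleep=lambda _seconds: None)
print(final, seen)
```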

Check status transitions in DB:

```bash
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT id, processor, status,
       to_char(started_at, 'HH24:MI:SS') AS started,
       to_char(completed_at, 'HH24:MI:SS') AS completed
FROM dev.processor_results
WHERE file_uuid = '${FILE_UUID}'
ORDER BY id
"
```

Output:
```
id | processor | status | started | completed
----+-----------+------------+-----------+-----------
1 | asr | completed | 19:01:02 | 19:01:25
2 | cut | completed | 19:01:02 | 19:01:08
```

### ⚡ Processing Checklist: Step-by-Step Verification

This checklist covers every stage of the pipeline processing flow:

```bash
# ──────────────────────────────────────────────────────
# Stage A: Before Worker Starts
# ──────────────────────────────────────────────────────
PG_BIN="/Users/accusys/pgsql/18.3/bin"
FILE_UUID="e1111111111111111111111111111111"
KEY="muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"

echo "=== A1. Job status = PENDING ==="
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT id, status, current_processor, created_at FROM dev.monitor_jobs WHERE uuid = '${FILE_UUID}'
"

echo "=== A2. Processor results = pending ==="
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT id, processor, status FROM dev.processor_results WHERE file_uuid = '${FILE_UUID}' ORDER BY id
"

# ──────────────────────────────────────────────────────
# Stage B: Worker Running
# ──────────────────────────────────────────────────────
echo "=== Start worker ==="
DATABASE_SCHEMA=dev nohup target/debug/momentry_playground worker \
  --max-concurrent 1 --poll-interval 3 > /tmp/worker_check.log 2>&1 &
WPID=$!

echo "=== B1. Worker picks up job (within 3-10s) ==="
for i in $(seq 1 10); do
  sleep 3
  JOB_STATUS=$("$PG_BIN/psql" -U accusys -d momentry -t -A -c \
    "SELECT status FROM dev.monitor_jobs WHERE uuid = '${FILE_UUID}'" 2>/dev/null)
  VIDEO_STATUS=$("$PG_BIN/psql" -U accusys -d momentry -t -A -c \
    "SELECT status FROM dev.videos WHERE file_uuid = '${FILE_UUID}'" 2>/dev/null)
  echo " Poll $i: job=$JOB_STATUS video=$VIDEO_STATUS"
  # -E (extended regex) keeps the alternation portable across grep implementations
  echo " $(grep -E '\[INTEGRITY\]|\[SCHEMA\]|Starting:|Completed|failed|Job ' /tmp/worker_check.log 2>/dev/null | tail -3)"

  # Check alive
  kill -0 $WPID 2>/dev/null || { echo " Worker died unexpectedly"; break; }

  if [ "$VIDEO_STATUS" = "completed" ] || [ "$VIDEO_STATUS" = "failed" ]; then break; fi
done

echo "=== B2. Each processor status ==="
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT id, processor, status,
       to_char(started_at, 'HH24:MI:SS') AS started,
       to_char(completed_at, 'HH24:MI:SS') AS completed,
       COALESCE(chunks_produced, 0) AS chunks,
       COALESCE(frames_processed, 0) AS frames,
       COALESCE(error_message, '') AS error
FROM dev.processor_results
WHERE file_uuid = '${FILE_UUID}'
ORDER BY id
"

kill $WPID 2>/dev/null || true

# ──────────────────────────────────────────────────────
# Stage C: After Completion
# ──────────────────────────────────────────────────────
echo "=== C1. Video final status ==="
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT file_uuid, file_name, status, duration, fps, total_frames FROM dev.videos WHERE file_uuid = '${FILE_UUID}'
"

echo "=== C2. Chunks produced ==="
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT chunk_type, count(*) FROM dev.chunk WHERE file_uuid = '${FILE_UUID}' GROUP BY chunk_type ORDER BY chunk_type
"

echo "=== C3. Job final status ==="
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT id, status, current_processor FROM dev.monitor_jobs WHERE uuid = '${FILE_UUID}'
"
```

Expected output (all green):
```
=== A1. Job status = PENDING ===
id | status | current_processor | created_at
----+---------+-------------------+-------------------
161| PENDING | | 2026-05-15 19:00:30

=== A2. Processor results = pending ===
id | processor | status
----+-----------+---------
1 | asr | pending
2 | cut | pending

=== Start worker ===
=== B1. Worker picks up job (within 3-10s) ===
 Poll 1: job=PENDING video=registered
 Poll 2: job=RUNNING video=processing
 [INTEGRITY] asr_processor.py checksum OK
 Poll 3: job=RUNNING video=processing
 [ASR] Starting: asr_processor.py
 Poll 4: job=RUNNING video=processing
 [ASR] Completed successfully
 Poll 5: job=RUNNING video=processing
 [CUT] Completed successfully
 Poll 6: job=COMPLETED video=completed

=== B2. Each processor status ===
id | processor | status | started | completed | chunks | frames | error
----+-----------+-----------+-----------+-----------+--------+--------+-------
1 | asr | completed | 19:01:02 | 19:01:25 | 3 | 120 |
2 | cut | completed | 19:01:02 | 19:01:08 | 1 | 120 |

=== C1. Video final status ===
file_uuid | file_name | status | duration | fps | total_frames
--------------+---------------------+-----------+----------+-----+--------------
e11111111... | demo_test_video.mp4 | completed | 5.005 | 24 | 120

=== C2. Chunks produced ===
chunk_type | count
------------+-------
cut | 1
sentence | 3

=== C3. Job final status ===
id | status | current_processor
----+-----------+-------------------
161| COMPLETED | (null)
```

**Checklist during execution:**

| Stage | # | Check | Expected | Pass |
|-------|---|-------|----------|:----:|
| **A. Pre-worker** | A1 | `monitor_jobs.status` | `PENDING` | ☐ |
| | A2 | `processor_results` rows | = requested processor count | ☐ |
| | A3 | Each `processor_results.status` | `pending` | ☐ |
| **B. Running** | B1 | Job picked up (within poll interval) | status → `RUNNING` | ☐ |
| | B2 | SHA256 integrity check in logs | `[INTEGRITY] *.py checksum OK` | ☐ |
| | B3 | Each processor transitions | `pending → running → completed` | ☐ |
| | B4 | `started_at` populated | NOT NULL per processor | ☐ |
| | B5 | Processors complete without error | `error_message` is NULL | ☐ |
| | B6 | Max concurrent respected | ≤ `--max-concurrent` running at once | ☐ |
| **C. Post-completion** | C1 | `videos.status` | `completed` (not `failed`) | ☐ |
| | C2 | `chunks_produced` > 0 | ASR has sentence chunks | ☐ |
| | C3 | `monitor_jobs.status` | `COMPLETED` | ☐ |
| | C4 | `chunk` table has data | rows with this `file_uuid` | ☐ |
| | C5 | Chunk IDs formatted correctly | `{uuid}_{start}_{end}` | ☐ |
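
Check B3 can be automated if the observed statuses of a processor are recorded over time. The sketch below encodes the transition rule from the table (`pending → running → completed`) plus the `failed` outcome mentioned in the worker flow. It assumes there are no retries, which this guide does not confirm, and the names are hypothetical.

```python
from typing import Optional

# Allowed next states for a processor_results row (assumption: no retries).
ALLOWED_TRANSITIONS = {
    "pending": {"running"},
    "running": {"completed", "failed"},
    "completed": set(),
    "failed": set(),
}


def first_invalid_transition(history: list[str]) -> Optional[tuple[str, str]]:
    """Return the first (previous, current) pair that breaks the rules, or None."""
    for previous, current in zip(history, history[1:]):
        if current not in ALLOWED_TRANSITIONS.get(previous, set()):
            return previous, current
    return None


print(first_invalid_transition(["pending", "running", "completed"]))
print(first_invalid_transition(["pending", "completed"]))
```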

---

## 7. Check Results

Monitor job progress:

```bash
# Check job status
curl -s -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
  "http://api.momentry.ddns.net/api/v1/jobs?page=1&page_size=5&status=pending,running,completed,failed" \
  | python3 -c "import json,sys;d=json.load(sys.stdin);[print(f'{j[\"uuid\"]}: {j[\"status\"]}') for j in d.get('jobs',[])]"
```

Output:
```
9eca53f422f668dd59a9995d29dc9388: completed
e1111111111111111111111111111111: completed
```

### ⚡ Intermediate Check (Bug #2): Chunk Fallback Verification

Verify that both new and old chunk_id formats resolve correctly:

```bash
# Pick a chunk_id from the DB
CHUNK_INFO=$("$PG_BIN/psql" -U accusys -d momentry -t -A -c "
SELECT chunk_id, id FROM dev.chunk WHERE file_uuid = '${FILE_UUID}' LIMIT 1
")
NEW_ID=$(echo "$CHUNK_INFO" | cut -d'|' -f1)
DB_ID=$(echo "$CHUNK_INFO" | cut -d'|' -f2)

# Print chunk_id and HTTP status for one chunk reference.
# The body is written to a temp file so the -w status text cannot corrupt the JSON.
fetch_chunk() {
  local code
  code=$(curl -s -o /tmp/chunk_resp.json -w "%{http_code}" \
    -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
    "http://api.momentry.ddns.net/api/v1/file/${FILE_UUID}/chunk/$1")
  python3 -c "import json;d=json.load(open('/tmp/chunk_resp.json'));print(f'chunk_id={d.get(\"chunk_id\")}', end='')" 2>/dev/null
  echo " HTTP $code"
}

echo "=== New format: $NEW_ID ==="
fetch_chunk "$NEW_ID"

echo ""
echo "=== Old integer fallback (id=$DB_ID) ==="
fetch_chunk "$DB_ID"
```

Output:
```
=== New format: e1111111111111111111111111111111_0_5 ===
chunk_id=e1111111111111111111111111111111_0_5 HTTP 200

=== Old integer fallback (id=1075655) ===
chunk_id=e1111111111111111111111111111111_0_5 HTTP 200
```

Both requests return `chunk_id=e1111111111111111111111111111111_0_5`, so the fallback correctly resolves `id=1075655` to the same chunk.
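
The two accepted reference forms can be told apart with a small parser, sketched below. It assumes, based only on the examples in this guide, that a `file_uuid` is 32 lowercase hex characters and that `start` and `end` are non-negative integers whose unit is not specified here. `ChunkRef` and `parse_chunk_ref` are hypothetical names, not the handler's actual code.

```python
import re
from typing import NamedTuple, Union

# New-style id: {file_uuid}_{start}_{end}
CHUNK_ID = re.compile(r"(?P<uuid>[0-9a-f]{32})_(?P<start>\d+)_(?P<end>\d+)", re.ASCII)


class ChunkRef(NamedTuple):
    file_uuid: str
    start: int
    end: int


def parse_chunk_ref(value: str) -> Union[ChunkRef, int]:
    """Return a ChunkRef for the new format, or an int for a legacy row id."""
    match = CHUNK_ID.fullmatch(value)
    if match:
        return ChunkRef(match["uuid"], int(match["start"]), int(match["end"]))
    if value.isascii() and value.isdigit():
        return int(value)  # legacy integer id, resolved by the fallback path
    raise ValueError(f"unrecognised chunk reference: {value!r}")


print(parse_chunk_ref("e1111111111111111111111111111111_0_5"))
print(parse_chunk_ref("1075655"))
```

Trying the structured format first and the integer second mirrors the order of the two requests in the check above.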

### ⚡ Intermediate Check: Verify Chunks after Processing

```bash
PG_BIN="/Users/accusys/pgsql/18.3/bin"

# Count chunks produced
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT chunk_type, count(*) AS count
FROM dev.chunk
WHERE file_uuid = '${FILE_UUID}'
GROUP BY chunk_type
ORDER BY chunk_type
"

# Sample chunk content
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT chunk_id, chunk_type, start_frame, end_frame,
       substring(text_content, 1, 60) AS text_preview
FROM dev.chunk
WHERE file_uuid = '${FILE_UUID}'
ORDER BY start_frame
LIMIT 5
"
```

Output:
```
chunk_type | count
------------+-------
cut | 1
sentence | 3

chunk_id | chunk_type | start_frame | end_frame | text_preview
--------------------------------------------------+------------+-------------+-----------+-----------------------------------------------------
e1111111111111111111111111111111_0_5 | cut | 0 | 120 | demo_test_video_auto_demo.mp4
e1111111111111111111111111111111_0_0 | sentence | 0 | 120 | test pattern test pattern color bars test pattern ...
```

Check per-processor results in DB:

```bash
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT processor, status, error_message,
       to_char(started_at, 'HH24:MI:SS') AS started,
       to_char(completed_at, 'HH24:MI:SS') AS completed,
       COALESCE(chunks_produced, 0) AS chunks
FROM dev.processor_results
WHERE file_uuid='${FILE_UUID}'
ORDER BY id;
"
```

Output:
```
processor | status | error_message | started | completed | chunks
-----------+-----------+---------------+-----------+-----------+--------
asr | completed | | 19:01:02 | 19:01:25 | 3
cut | completed | | 19:01:02 | 19:01:08 | 1
```

**Checklist after processing:**
- [ ] `videos.status = 'completed'`: pipeline finished
- [ ] `processor_results` all show `status = 'completed'`
- [ ] `chunks_produced > 0`: each processor produced output
- [ ] `chunk` table has rows with correct chunk_type (`cut`, `sentence`)
- [ ] `chunk_id` format is `{file_uuid}_{start}_{end}` (Bug #2 fix verified)

---

## 8. Search Chunks

After processing, search the generated chunks:

```bash
# Text search (ASR output)
curl -s -X POST "http://api.momentry.ddns.net/api/v1/search/universal" \
  -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
  -H "Content-Type: application/json" \
  -d "{\"query\": \"test\", \"uuid\": \"${FILE_UUID}\", \"limit\": 5}" \
  | python3 -c "
import json,sys;d=json.load(sys.stdin)
print(f'Total hits: {d[\"total\"]}')
for r in d['results']:
    if r.get('chunk_id'):
        print(f' {r[\"chunk_id\"]}: \"{r.get(\"text\",\"\")[:60]}\" score={r.get(\"score\",0):.3f}')
"
```

Output:
```
Total hits: 3
 e1111111111111111111111111111111_0_5: "test pattern test pattern..." score=0.423
 e1111111111111111111111111111111_5_10: "silence" score=0.215
```

Get a specific chunk by ID:

```bash
# Single chunk detail
curl -s -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
  "http://api.momentry.ddns.net/api/v1/file/${FILE_UUID}/chunk/${FILE_UUID}_0_5" \
  | python3 -c "
import json,sys;d=json.load(sys.stdin)
print(f'Type: {d[\"chunk_type\"]} Rule: {d[\"rule\"]}')
print(f'Frame: {d[\"start_frame\"]}–{d[\"end_frame\"]} FPS: {d[\"fps\"]}')
print(f'Text: {d[\"text_content\"][:100]}')
"
```

---

## 9. Health Check

```bash
# Basic health
curl -sf http://api.momentry.ddns.net/health | python3 -m json.tool

# Detailed health (services + pipeline + schema + resources)
curl -sf http://api.momentry.ddns.net/health/detailed | python3 -c "
import json,sys;d=json.load(sys.stdin)
p=d['pipeline'];s=d['schema']
print(f'Status: {d[\"status\"]}')
print(f'Build: {d[\"build_git_hash\"]}')
print(f'Services: postgres={d[\"services\"][\"postgres\"][\"status\"]} redis={d[\"services\"][\"redis\"][\"status\"]}')
print(f'Schema: {s[\"applied\"][-1][\"filename\"] if s[\"applied\"] else \"none\"} ({len(s[\"applied\"])}/{len(s[\"required\"])} applied, ok={s[\"ok\"]})')
print(f'Scripts: {p[\"scripts_count\"]} files, integrity={p[\"scripts_integrity\"][\"matched\"]}/{p[\"scripts_integrity\"][\"total\"]}')
print(f'Procs: ' + ' '.join([k for k,v in p['processors'].items() if v and k != 'total_py_files']))
"
```

Output:
```
Status: ok
Build: 0e73d2a
Services: postgres=ok redis=ok
Schema: migrate_fix_chunk_id_format.sql (8/8 applied, ok=True)
Scripts: 286 files, integrity=345/345
Procs: asr yolo face pose ocr cut caption scene story asrx probe visual_chunk
```

---

## 10. Schema Version

Each binary embeds a list of required migrations. At startup and via `/health/detailed`, the server verifies all migrations are applied.

```bash
# Check schema version via API
curl -sf http://api.momentry.ddns.net/health/detailed | python3 -c "
import json,sys;d=json.load(sys.stdin)['schema']
print(f'Table exists: {d[\"table_exists\"]}')
print(f'All OK: {d[\"ok\"]}')
for m in d['required']:
    match = '✓' if any(a['filename']==m['filename'] and a['checksum']==m['checksum'] for a in d['applied']) else '✗'
    print(f' {match} {m[\"filename\"]} {m[\"checksum\"][:16]}')
"
```

Output:
```
Table exists: True
All OK: True
 ✓ migrate_add_content_hash.sql 42b81554248c4bec
 ✓ migrate_add_registered_status.sql 566fdfcdc624f6fa
 ✓ migrate_add_schema_version.sql 585b31df6056a937
 ✓ migrate_cleanup_inactive_identities.sql daa52a0827b24a77
 ✓ migrate_fix_chunk_id_format.sql a1b2c3d4e5f6a7b8
 ✓ migrate_public_schema_v4.sql 973908076c614363
 ✓ migrate_public_schema_v4_tables.sql 1d62dc42e4dec8f4
 ✓ migrate_public_v4_complete.sql 2a6fda7d2c5660e4
```

If a migration is missing at startup:
```
[SCHEMA] 7/8 migrations applied. Missing: migrate_fix_chunk_id_format.sql
```
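
The comparison behind this check can be expressed as a set difference on `(filename, checksum)` pairs, which is the same matching rule the inline script above uses. The sketch below is illustrative rather than the server's implementation: the function names are hypothetical, the sample checksums are the truncated prefixes shown above, and the summary line only mirrors the missing-migration log format quoted in this section.

```python
from typing import Optional


def missing_migrations(required: list[dict], applied: list[dict]) -> list[str]:
    """Return filenames of required migrations with no applied (filename, checksum) match."""
    applied_pairs = {(m["filename"], m["checksum"]) for m in applied}
    return [
        m["filename"]
        for m in required
        if (m["filename"], m["checksum"]) not in applied_pairs
    ]


def missing_summary(required: list[dict], applied: list[dict]) -> Optional[str]:
    """Build a line in the style of the startup log, or None if nothing is missing."""
    missing = missing_migrations(required, applied)
    if not missing:
        return None
    done = len(required) - len(missing)
    return f"[SCHEMA] {done}/{len(required)} migrations applied. Missing: {', '.join(missing)}"


required = [
    {"filename": "migrate_add_content_hash.sql", "checksum": "42b81554248c4bec"},
    {"filename": "migrate_fix_chunk_id_format.sql", "checksum": "a1b2c3d4e5f6a7b8"},
]
applied = [required[0]]
print(missing_summary(required, applied))
```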

---

## Summary Checklist

After completing a pipeline run, verify all items:

### Registration

| # | Check | Expected | Pass |
|---|-------|----------|:----:|
| 1 | `videos.status` | `registered` | ☐ |
| 2 | file_uuid consistency | API response uuid = DB uuid | ☐ |
| 3 | Probe returns metadata | `duration > 0`, `fps > 0` | ☐ |
| 4 | Probe error (Bug #3) | Bad UUID → JSON error + 404 | ☐ |

### Processing

| # | Check | Expected | Pass |
|---|-------|----------|:----:|
| 5 | Job created | `monitor_jobs.status = PENDING` | ☐ |
| 6 | Processors queued | `processor_results` rows = requested count | ☐ |
| 7 | Worker picks up job | `monitor_jobs.status → RUNNING` | ☐ |
| 8 | SHA256 integrity (Bug #2) | `[INTEGRITY] *.py checksum OK` | ☐ |
| 9 | Each processor completes | `processor_results.status = completed` | ☐ |
| 10 | No processor errors | `error_message` all NULL | ☐ |
| 11 | Pipeline completes | `videos.status = completed` | ☐ |

### Results

| # | Check | Expected | Pass |
|---|-------|----------|:----:|
| 12 | Chunks produced | `chunk` table has > 0 rows | ☐ |
| 13 | Chunk ID format | `chunk_id = {uuid}_{start}_{end}` | ☐ |
| 14 | Chunk fallback (Bug #2) | Old integer ID → 200 via handler fallback | ☐ |
| 15 | Search works | `POST /search/universal` returns hits | ☐ |
| 16 | Schema version | `schema.ok = true` in `/health/detailed` | ☐ |

---

## Full Automation Script

Save as `demo_full_cycle.sh`:

```bash
#!/bin/bash
set -euo pipefail

API="http://api.momentry.ddns.net"
KEY="muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
PG="/Users/accusys/pgsql/18.3/bin"

# Generate test video
ffmpeg -y -f lavfi -i "testsrc=duration=5:size=640x480:rate=24" \
  -f lavfi -i "anullsrc=r=44100:cl=mono" \
  -c:v libx264 -preset ultrafast -crf 28 -c:a aac -shortest \
  /tmp/auto_demo.mp4 2>/dev/null

# Register
UUID=$(curl -sf -X POST "$API/api/v1/files/register" \
  -H "X-API-Key: $KEY" -H "Content-Type: application/json" \
  -d '{"file_path": "/tmp/auto_demo.mp4"}' | python3 -c "import json,sys;print(json.load(sys.stdin)['file_uuid'])")
echo "Registered: $UUID"

# Process
curl -sf -X POST "$API/api/v1/file/$UUID/process" \
  -H "X-API-Key: $KEY" -H "Content-Type: application/json" \
  -d '{"processors":["asr","cut"]}' > /dev/null
echo "Processing triggered"

# Run worker for a fixed 30 seconds, then stop it
DATABASE_SCHEMA=dev target/debug/momentry_playground worker \
  --max-concurrent 1 --poll-interval 3 &
WPID=$!
sleep 30
kill $WPID 2>/dev/null || true

# Results
"$PG/psql" -U accusys -d momentry -c "
SELECT processor, status FROM dev.processor_results WHERE file_uuid='$UUID' ORDER BY id"
echo "Done: $UUID"
```