---
document_type: "demo_guide"
service: "MOMENTRY_CORE"
title: "Pipeline Demo End-to-End"
date: "2026-05-15"
version: "V1.0"
status: "active"
owner: "M5"
created_by: "OpenCode"
tags:
  - "demo"
  - "pipeline"
  - "end-to-end"
  - "api"
ai_query_hints:
  - "How to run the end-to-end Pipeline demo"
  - "Pipeline processing flow"
  - "Complete flow for registering a video and triggering processing"
related_documents:
  - "GUIDES/API_ENDPOINTS.md"
  - "GUIDES/Pipeline_API_Demo.md"
---

# Momentry Core: Pipeline Demo End-to-End

| Item | Value |
|------|------|
| Created by | OpenCode |
| Created on | 2026-05-15 |
| Document version | V1.0 |
| Target audience | developer |
| Prerequisites | An API Key and basic Pipeline concepts |

---

## Table of Contents

### Pipeline Phases

| Phase | Step | What happens |
|-------|------|-------------|
| **Pre** | 1–4 | System check, scan, register, probe |
| **During** | 5–6 | Submit job → Worker picks up → Each processor runs (pending→running→completed) |
| **Post** | 7–10 | All results → Search → Identities → Schema verification |

---

## 1. Check System Status

```bash
API="http://api.momentry.ddns.net"
KEY="muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"

# Basic health
curl -sf "$API/health" | jq '{status, version, build_git_hash, uptime_ms}'

# Detailed health
curl -sf "$API/health/detailed" | jq '{
  services,
  schema: .schema.ok,
  scripts: .pipeline.scripts_count,
  integrity: .pipeline.scripts_integrity,
  procs: [.pipeline.processors | to_entries[] | select(.value == true and .key != "total_py_files") | .key]
}'
```

Output:

```json
{
  "status": "ok",
  "version": "1.0.0",
  "build_git_hash": "c41f7e0c",
  "uptime_ms": 2756192
}
{
  "services": {"postgres": "ok", "redis": "ok", "qdrant": "ok"},
  "schema": false,
  "scripts": 291,
  "integrity": {"matched": 332, "total": 345, "ok": false},
  "procs": ["asr","yolo","face","pose","ocr","cut","caption","scene","story","asrx","probe","visual_chunk"]
}
```

---

## 2. Scan Files

Scan the server for all files related to `exasan` (regular expressions are supported):

```bash
curl -sf -H "X-API-Key: $KEY" "$API/api/v1/files/scan?pattern=exasan" | \
  jq '[.files[] | {uuid: .file_uuid, name: .file_name, size: .file_size}]'
```

Output (excerpt):

```json
[
  {"uuid": "dd61fda85fee441f...", "name": "ExaSAN PCIe series - Director Ou Yu-Zhi Shares His Experience.mp4", "size": 6827600},
  {"uuid": "8e2e98c49355935f...", "name": "ExaSAN Webinar by Blake Jones, Vision2see.mp4", "size": 38635889},
  {"uuid": "477d8fa7bc0e1a7...", "name": "Thunderbolt ExaSAN at CCBN.mp4", "size": 13126748}
]
```

**Note**: `files/scan` can also scan all files or be used for batch registration. If no pattern is specified, it returns every file under the server's `sftpgo/data/demo/` directory.

---

## 3. Register or Confirm

If the file is not yet registered, use the register API. If it already exists (as in this demo), check its status directly:

```bash
UUID="dd61fda85fee441fdd00ab5528213ff7"

# Check the file status
curl -sf -H "X-API-Key: $KEY" "$API/api/v1/file/${UUID}" | jq '{uuid: .file_uuid[0:16], name: .file_name, status, duration, fps}'

# If the file does not exist, use the register API:
# curl -sf -X POST -H "X-API-Key: $KEY" -H "Content-Type: application/json" \
#   -d '{"file_path": "/path/to/video.mp4"}' \
#   "$API/api/v1/files/register" | jq '.'
```

**Registration flow**:

```
POST /files/register
  ├─ SHA256 content_hash (dedup check)
  ├─ file_name conflict check (automatic rename)
  ├─ Pre-process (SHA256 + ffprobe + UUID → .pre.json)
  ├─ UUID = f(mac, mtime, path, filename)
  ├─ Unified probe (video→ffprobe, doc→Python)
  └─ INSERT INTO videos
```

---

## 4. Probe Check

The probe endpoint returns ffprobe metadata about the registered file.
```bash
# Substitute the actual file_uuid from step 3
FILE_UUID="e1111111111111111111111111111111"

curl -s -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
  "http://api.momentry.ddns.net/api/v1/file/${FILE_UUID}/probe" | python3 -m json.tool
```

Output (abbreviated):

```json
{
  "file_uuid": "e1111111111111111111111111111111",
  "file_name": "demo_test_video.mp4",
  "duration": 5.005,
  "width": 640,
  "height": 480,
  "fps": 24.0,
  "total_frames": 120,
  "cached": true,
  "format": {
    "filename": "/tmp/demo_test_video.mp4",
    "format_name": "mov,mp4,m4a,3gp,3g2,mj2",
    "duration": "5.005000",
    "size": "98304",
    "bit_rate": "157184"
  },
  "streams": [
    {"index": 0, "codec_type": "video", "codec_name": "h264", "width": 640, "height": 480, ...},
    {"index": 1, "codec_type": "audio", "codec_name": "aac", ...}
  ]
}
```

**Error handling** (Bug #3 fix):

- Non-existent UUID → `{"error":"Video not found"}` + HTTP 404
- File deleted from disk → `{"error":"File does not exist at registered path"}` + HTTP 404
- ffprobe failure → `{"error":"ffprobe failed: ..."}` + HTTP 500

### ⚡ Intermediate Check - Bug #3: Probe Error Verification

Verify that both error cases return proper JSON and the expected HTTP code instead of a bare 500:

```bash
# psql location (same path as in step 5); needed for the file-path test below
PG_BIN="/Users/accusys/pgsql/18.3/bin"

echo "=== Non-existent UUID → expect 404 ==="
curl -s -w "\nHTTP: %{http_code}\n" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
  "http://api.momentry.ddns.net/api/v1/file/bad_uuid_12345/probe"
# Expect: {"error":"Video not found","file_uuid":"bad_uuid_12345"} HTTP 404

echo ""
echo "=== Non-existent file path → expect 404 ==="
# Temporarily change file_path to a non-existent location
"$PG_BIN/psql" -U accusys -d momentry -c \
  "UPDATE dev.videos SET file_path = '/tmp/NONEXISTENT_FILE' WHERE file_uuid = '${FILE_UUID}'"

curl -s -w "\nHTTP: %{http_code}\n" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
  "http://api.momentry.ddns.net/api/v1/file/${FILE_UUID}/probe"
# Expect: {"error":"File does not exist at registered path",...} HTTP 404

# Restore path
"$PG_BIN/psql" -U accusys -d momentry -c \
  "UPDATE dev.videos SET file_path = '/tmp/demo_test_video.mp4' WHERE file_uuid = '${FILE_UUID}'"
```

Output:

```
=== Non-existent UUID → expect 404 ===
{"error":"Video not found","file_uuid":"bad_uuid_12345"}
HTTP: 404

=== Non-existent file path → expect 404 ===
{"error":"File does not exist at registered path","file_uuid":"e1111111111111111111111111111111","file_path":"/tmp/NONEXISTENT_FILE"}
HTTP: 404
```

---

## 5. Process Video

Trigger pipeline processing for specific processors. The available processors are:

| Processor | Function | Script |
|-----------|----------|--------|
| `asr` | Speech-to-text (faster-whisper) | `asr_processor.py` |
| `cut` | Scene detection (PySceneDetect) | `cut_processor.py` |
| `yolo` | Object detection (YOLOv8) | `yolo_processor.py` |
| `face` | Face detection (InsightFace) | `face_processor.py` |
| `pose` | Pose estimation (MediaPipe) | `pose_processor.py` |
| `ocr` | Text detection (PaddleOCR) | `ocr_processor.py` |
| `asrx` | Speaker diarization | `asrx_processor.py` |
| `visual_chunk` | Visual content analysis | `visual_chunk_processor.py` |
| `scene` | Scene classification | `scene_classifier.py` |
| `story` | Story generation (LLM) | `story_processor.py` |
| `caption` | Caption generation | `caption_processor.py` |

```bash
# Trigger only ASR + CUT for a quick test
curl -s -X POST "http://api.momentry.ddns.net/api/v1/file/${FILE_UUID}/process" \
  -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
  -H "Content-Type: application/json" \
  -d '{"processors": ["asr", "cut"]}' | python3 -m json.tool
```

Output:

```json
{
  "job_id": 161,
  "file_uuid": "e1111111111111111111111111111111",
  "status": "PENDING",
  "pids": [],
  "message": "Processing triggered for demo_test_video.mp4"
}
```

**Processing flow**:

```
POST /process → trigger_processing()
  ├─ Validate file exists (DB lookup)
  ├─ Create monitor_job (status: PENDING)
  ├─ Create processor_result rows for each requested processor (status: pending)
  └─ Response { job_id, status: "PENDING" }
```

**Note**: If `processors` is omitted, the request behaves as if the following list were sent:

```json
{"processors": ["asr", "cut", "yolo", "ocr", "face", "pose", "asrx", "visual_chunk"]}
```

### ⚡ Intermediate Check - Verify Job + Processor Results after Trigger

```bash
PG_BIN="/Users/accusys/pgsql/18.3/bin"

# Check monitor_jobs table (-x prints the row in expanded format;
# psql does not allow mixing SQL with the \gx meta-command inside -c)
"$PG_BIN/psql" -U accusys -d momentry -x -c "
  SELECT id, uuid, status, current_processor,
         to_char(created_at, 'HH24:MI:SS') AS created
  FROM dev.monitor_jobs
  WHERE uuid = '${FILE_UUID}'
  ORDER BY id DESC LIMIT 1
"

# Check processor_results table
"$PG_BIN/psql" -U accusys -d momentry -c "
  SELECT id, processor, status
  FROM dev.processor_results
  WHERE file_uuid = '${FILE_UUID}'
  ORDER BY id
"
```

Output:

```
-[ RECORD 1 ]------+-----------------------------
id                 | 161
uuid               | e1111111111111111111111111111111
status             | PENDING
current_processor  | (null)
created            | 19:00:30

 id | processor | status
----+-----------+---------
  1 | asr       | pending
  2 | cut       | pending
```

**Checklist after trigger:**

- [ ] `monitor_jobs.status = 'PENDING'`: job created, awaiting worker
- [ ] `processor_results` rows match requested processors (2 rows for `asr`, `cut`)
- [ ] Each `processor.status = 'pending'`: not yet executed

---

## 6. Worker Execution

The worker polls for pending jobs and executes them one by one.
```bash
DATABASE_SCHEMA=dev cargo run --bin momentry_playground -- worker \
  --max-concurrent 2 --poll-interval 5
```

Or in background:

```bash
DATABASE_SCHEMA=dev nohup target/debug/momentry_playground worker \
  --max-concurrent 2 --poll-interval 5 > /tmp/worker_demo.log 2>&1 &
```

**Worker flow**:

```
Worker loop (every 5 seconds):
  ├─ Poll: SELECT * FROM monitor_jobs WHERE status = 'PENDING'
  ├─ Set job status → RUNNING
  ├─ For each pending processor:
  │   ├─ SHA256 integrity check (verify_script_integrity)
  │   │   └─ checksums.sha256 manifest lookup
  │   ├─ Execute script via PythonExecutor
  │   │   └─ Command: {MOMENTRY_PYTHON_PATH} scripts/.py
  │   ├─ Verify output (file exists, content valid)
  │   └─ Update processor_result (completed/failed)
  ├─ Check completion: all processors done?
  ├─ Yes → Set job + video status → COMPLETED
  └─ No → Wait for next poll cycle
```

**Worker log output**:

```
[CHECKSUMS] Loaded 345 entries from checksums.sha256
[INTEGRITY] asr_processor.py checksum OK
[ASR] Starting asr_processor.py
[INTEGRITY] cut_processor.py checksum OK
[CUT] Starting cut_processor.py
[ASR] Completed successfully
[CUT] Completed successfully
check_and_complete_job: results=2/2 → Job COMPLETED
```

### ⚡ Intermediate Check - Poll Progress During Worker Execution

While the worker is running, poll the progress endpoint to watch state transitions:

```bash
# Poll every 5 seconds until completed or failed (at most 12 polls)
FILE_UUID="e1111111111111111111111111111111"
for i in $(seq 1 12); do
  sleep 5
  STATUS=$(curl -sf -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
    "http://api.momentry.ddns.net/api/v1/progress/${FILE_UUID}" \
    | python3 -c "import json,sys;d=json.load(sys.stdin);print(d.get('status','?'))" 2>/dev/null || echo "pending")
  echo "Poll $i: status=$STATUS"
  # Stop as soon as a terminal state is reached
  if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then break; fi
done
```

Output (typical):

```
Poll 1: status=registered    ← worker hasn't picked it up yet
Poll 2: status=pending       ← worker picked up, job status changed
Poll 3: status=processing    ← worker running ASR
Poll 4: status=processing    ← worker running CUT
Poll 5: status=completed     ← all done
```

Check status transitions in DB:

```bash
"$PG_BIN/psql" -U accusys -d momentry -c "
  SELECT id, processor, status,
         to_char(started_at, 'HH24:MI:SS') AS started,
         to_char(completed_at, 'HH24:MI:SS') AS completed
  FROM dev.processor_results
  WHERE file_uuid = '${FILE_UUID}'
  ORDER BY id
"
```

Output:

```
 id | processor | status     | started   | completed
----+-----------+------------+-----------+-----------
  1 | asr       | completed  | 19:01:02  | 19:01:25
  2 | cut       | completed  | 19:01:02  | 19:01:08
```

### ⚡ Processing Checklist - Step-by-Step Verification

This checklist covers every stage of the pipeline processing flow:

```bash
# ──────────────────────────────────────────────────────
# Stage A: Before Worker Starts
# ──────────────────────────────────────────────────────
PG_BIN="/Users/accusys/pgsql/18.3/bin"
FILE_UUID="e1111111111111111111111111111111"
KEY="muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"

echo "=== A1. Job status = PENDING ==="
"$PG_BIN/psql" -U accusys -d momentry -c "
  SELECT id, status, current_processor, created_at
  FROM dev.monitor_jobs WHERE uuid = '${FILE_UUID}'
"

echo "=== A2. Processor results = pending ==="
"$PG_BIN/psql" -U accusys -d momentry -c "
  SELECT id, processor, status
  FROM dev.processor_results WHERE file_uuid = '${FILE_UUID}' ORDER BY id
"

# ──────────────────────────────────────────────────────
# Stage B: Worker Running
# ──────────────────────────────────────────────────────
echo "=== Start worker ==="
DATABASE_SCHEMA=dev nohup target/debug/momentry_playground worker \
  --max-concurrent 1 --poll-interval 3 > /tmp/worker_check.log 2>&1 &
WPID=$!

echo "=== B1. Worker picks up job (within 3-10s) ==="
for i in $(seq 1 10); do
  sleep 3
  JOB_STATUS=$("$PG_BIN/psql" -U accusys -d momentry -t -A -c \
    "SELECT status FROM dev.monitor_jobs WHERE uuid = '${FILE_UUID}'" 2>/dev/null)
  VIDEO_STATUS=$("$PG_BIN/psql" -U accusys -d momentry -t -A -c \
    "SELECT status FROM dev.videos WHERE file_uuid = '${FILE_UUID}'" 2>/dev/null)
  echo "  Poll $i: job=$JOB_STATUS video=$VIDEO_STATUS"
  echo "    $(grep '\[INTEGRITY\]\|\[SCHEMA\]\|Starting:\|Completed\|failed\|Job ' /tmp/worker_check.log 2>/dev/null | tail -3)"
  # Check that the worker process is still alive
  kill -0 $WPID 2>/dev/null || { echo "  Worker died unexpectedly"; break; }
  if [ "$VIDEO_STATUS" = "completed" ] || [ "$VIDEO_STATUS" = "failed" ]; then break; fi
done

echo "=== B2. Each processor status ==="
"$PG_BIN/psql" -U accusys -d momentry -c "
  SELECT id, processor, status,
         to_char(started_at, 'HH24:MI:SS') AS started,
         to_char(completed_at, 'HH24:MI:SS') AS completed,
         COALESCE(chunks_produced, 0) AS chunks,
         COALESCE(frames_processed, 0) AS frames,
         COALESCE(error_message, '') AS error
  FROM dev.processor_results WHERE file_uuid = '${FILE_UUID}' ORDER BY id
"

kill $WPID 2>/dev/null || true

# ──────────────────────────────────────────────────────
# Stage C: After Completion
# ──────────────────────────────────────────────────────
echo "=== C1. Video final status ==="
"$PG_BIN/psql" -U accusys -d momentry -c "
  SELECT file_uuid, file_name, status, duration, fps, total_frames
  FROM dev.videos WHERE file_uuid = '${FILE_UUID}'
"

echo "=== C2. Chunks produced ==="
"$PG_BIN/psql" -U accusys -d momentry -c "
  SELECT chunk_type, count(*)
  FROM dev.chunk WHERE file_uuid = '${FILE_UUID}'
  GROUP BY chunk_type ORDER BY chunk_type
"

echo "=== C3. Job final status ==="
"$PG_BIN/psql" -U accusys -d momentry -c "
  SELECT id, status, current_processor
  FROM dev.monitor_jobs WHERE uuid = '${FILE_UUID}'
"
```

Expected output (all green):

```
=== A1. Job status = PENDING ===
 id | status  | current_processor | created_at
----+---------+-------------------+-------------------
 161| PENDING |                   | 2026-05-15 19:00:30

=== A2. Processor results = pending ===
 id | processor | status
----+-----------+---------
  1 | asr       | pending
  2 | cut       | pending

=== Start worker ===
=== B1. Worker picks up job (within 3-10s) ===
  Poll 1: job=PENDING video=registered
  Poll 2: job=RUNNING video=processing
    [INTEGRITY] asr_processor.py checksum OK
  Poll 3: job=RUNNING video=processing
    [ASR] Starting: asr_processor.py
  Poll 4: job=RUNNING video=processing
    [ASR] Completed successfully
  Poll 5: job=RUNNING video=processing
    [CUT] Completed successfully
  Poll 6: job=COMPLETED video=completed

=== B2. Each processor status ===
 id | processor | status    | started   | completed | chunks | frames | error
----+-----------+-----------+-----------+-----------+--------+--------+-------
  1 | asr       | completed | 19:01:02  | 19:01:25  |      3 |    120 |
  2 | cut       | completed | 19:01:02  | 19:01:08  |      1 |    120 |

=== C1. Video final status ===
 file_uuid    | file_name           | status    | duration | fps | total_frames
--------------+---------------------+-----------+----------+-----+--------------
 e11111111... | demo_test_video.mp4 | completed |    5.005 |  24 |          120

=== C2. Chunks produced ===
 chunk_type | count
------------+-------
 cut        |     1
 sentence   |     3

=== C3. Job final status ===
 id | status    | current_processor
----+-----------+-------------------
 161| COMPLETED | (null)
```

**Checklist during execution:**

| Stage | # | Check | Expected | Pass |
|-------|---|-------|----------|:----:|
| **A. Pre-worker** | A1 | `monitor_jobs.status` | `PENDING` | ☐ |
| | A2 | `processor_results` rows | = requested processor count | ☐ |
| | A3 | Each `processor_results.status` | `pending` | ☐ |
| **B. Running** | B1 | Job picked up (within poll interval) | status → `RUNNING` | ☐ |
| | B2 | SHA256 integrity check in logs | `[INTEGRITY] *.py checksum OK` | ☐ |
| | B3 | Each processor transitions | `pending → running → completed` | ☐ |
| | B4 | `started_at` populated | NOT NULL per processor | ☐ |
| | B5 | Processors complete without error | `error_message` is NULL | ☐ |
| | B6 | Max concurrent respected | ≤ `--max-concurrent` running at once | ☐ |
| **C. Post-completion** | C1 | `videos.status` | `completed` (not `failed`) | ☐ |
| | C2 | `chunks_produced` > 0 | ASR has sentence chunks | ☐ |
| | C3 | `monitor_jobs.status` | `COMPLETED` | ☐ |
| | C4 | `chunk` table has data | rows with this `file_uuid` | ☐ |
| | C5 | Chunk IDs formatted correctly | `{uuid}_{start}_{end}` | ☐ |

---

## 7. Check Results

Monitor job progress:

```bash
# Check job status
curl -s -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
  "http://api.momentry.ddns.net/api/v1/jobs?page=1&page_size=5&status=pending,running,completed,failed" \
  | python3 -c "import json,sys;d=json.load(sys.stdin);[print(f'{j[\"uuid\"]}: {j[\"status\"]}') for j in d.get('jobs',[])]"
```

Output:

```
9eca53f422f668dd59a9995d29dc9388: completed
e1111111111111111111111111111111: completed
```

### ⚡ Intermediate Check - Bug #2: Chunk Fallback Verification

Verify that both new and old chunk_id formats resolve correctly:

```bash
# Pick a chunk_id from the DB
CHUNK_INFO=$("$PG_BIN/psql" -U accusys -d momentry -t -A -c "
  SELECT chunk_id, id FROM dev.chunk WHERE file_uuid = '${FILE_UUID}' LIMIT 1
")
NEW_ID=$(echo "$CHUNK_INFO" | cut -d'|' -f1)
DB_ID=$(echo "$CHUNK_INFO" | cut -d'|' -f2)

# Fetch one chunk by ID. The body is saved to a temp file so that the HTTP code
# written by -w is not appended to the JSON that python parses.
fetch_chunk() {
  local code
  code=$(curl -s -o /tmp/chunk_check.json -w "%{http_code}" \
    -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
    "http://api.momentry.ddns.net/api/v1/file/${FILE_UUID}/chunk/$1")
  python3 -c "import json;d=json.load(open('/tmp/chunk_check.json'));print(f'chunk_id={d.get(\"chunk_id\")}')" 2>/dev/null
  echo "HTTP $code"
}

echo "=== New format: $NEW_ID ==="
fetch_chunk "$NEW_ID"

echo ""
echo "=== Old integer fallback (id=$DB_ID) ==="
fetch_chunk "$DB_ID"
```

Output:

```
=== New format: e1111111111111111111111111111111_0_5 ===
chunk_id=e1111111111111111111111111111111_0_5
HTTP 200

=== Old integer fallback (id=1075655) ===
chunk_id=e1111111111111111111111111111111_0_5
HTTP 200
```

Both return `chunk_id=e1111111111111111111111111111111_0_5`: the fallback correctly resolves `id=1075655` to the same chunk.

### ⚡ Intermediate Check - Verify Chunks after Processing

```bash
PG_BIN="/Users/accusys/pgsql/18.3/bin"

# Count chunks produced
"$PG_BIN/psql" -U accusys -d momentry -c "
  SELECT chunk_type, count(*) AS count
  FROM dev.chunk
  WHERE file_uuid = '${FILE_UUID}'
  GROUP BY chunk_type
  ORDER BY chunk_type
"

# Sample chunk content
"$PG_BIN/psql" -U accusys -d momentry -c "
  SELECT chunk_id, chunk_type, start_frame, end_frame,
         substring(text_content, 1, 60) AS text_preview
  FROM dev.chunk
  WHERE file_uuid = '${FILE_UUID}'
  ORDER BY start_frame
  LIMIT 5
"
```

Output:

```
 chunk_type | count
------------+-------
 cut        |     1
 sentence   |     3

 chunk_id                             | chunk_type | start_frame | end_frame | text_preview
--------------------------------------+------------+-------------+-----------+---------------------------------------------------
 e1111111111111111111111111111111_0_5 | cut        |           0 |       120 | demo_test_video_auto_demo.mp4
 e1111111111111111111111111111111_0_0 | sentence   |           0 |       120 | test pattern test pattern color bars test pattern
 ...
```

Check per-processor results in DB:

```bash
"$PG_BIN/psql" -U accusys -d momentry -c "
  SELECT processor, status, error_message,
         to_char(started_at, 'HH24:MI:SS') AS started,
         to_char(completed_at, 'HH24:MI:SS') AS completed,
         COALESCE(chunks_produced, 0) AS chunks
  FROM dev.processor_results
  WHERE file_uuid='${FILE_UUID}'
  ORDER BY id;
"
```

Output:

```
 processor | status    | error_message | started   | completed | chunks
-----------+-----------+---------------+-----------+-----------+--------
 asr       | completed |               | 19:01:02  | 19:01:25  |      3
 cut       | completed |               | 19:01:02  | 19:01:08  |      1
```

**Checklist after processing:**

- [ ] `video.status = 'completed'`: pipeline finished
- [ ] `processor_results` all show `status = 'completed'`
- [ ] `chunks_produced > 0`: each processor produced output
- [ ] `chunk` table has rows with correct chunk_type (`cut`, `sentence`)
- [ ] `chunk_id` format is `{file_uuid}_{start}_{end}` (Bug #2 fix verified)

---

## 8. Search Chunks

After processing, search the generated chunks:

```bash
# Text search (ASR output)
curl -s -X POST "http://api.momentry.ddns.net/api/v1/search/universal" \
  -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
  -H "Content-Type: application/json" \
  -d "{\"query\": \"test\", \"uuid\": \"${FILE_UUID}\", \"limit\": 5}" \
  | python3 -c "
import json,sys;d=json.load(sys.stdin)
print(f'Total hits: {d[\"total\"]}')
for r in d['results']:
    if r.get('chunk_id'):
        print(f'  {r[\"chunk_id\"]}: \"{r.get(\"text\",\"\")[:60]}\" score={r.get(\"score\",0):.3f}')
"
```

Output:

```
Total hits: 3
  e1111111111111111111111111111111_0_5: "test pattern test pattern..." score=0.423
  e1111111111111111111111111111111_5_10: "silence" score=0.215
```

Get a specific chunk by ID:

```bash
# Single chunk detail
curl -s -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
  "http://api.momentry.ddns.net/api/v1/file/${FILE_UUID}/chunk/${FILE_UUID}_0_5" \
  | python3 -c "
import json,sys;d=json.load(sys.stdin)
print(f'Type: {d[\"chunk_type\"]}  Rule: {d[\"rule\"]}')
print(f'Frame: {d[\"start_frame\"]}–{d[\"end_frame\"]}  FPS: {d[\"fps\"]}')
print(f'Text: {d[\"text_content\"][:100]}')
"
```

---

## 9. Health Check

```bash
# Basic health
curl -sf http://api.momentry.ddns.net/health | python3 -m json.tool

# Detailed health (services + pipeline + schema + resources)
curl -sf http://api.momentry.ddns.net/health/detailed | python3 -c "
import json,sys;d=json.load(sys.stdin)
p=d['pipeline'];s=d['schema']
print(f'Status: {d[\"status\"]}')
print(f'Build: {d[\"build_git_hash\"]}')
print(f'Services: postgres={d[\"services\"][\"postgres\"][\"status\"]} redis={d[\"services\"][\"redis\"][\"status\"]}')
print(f'Schema: {s[\"applied\"][-1][\"filename\"] if s[\"applied\"] else \"none\"} ({len(s[\"applied\"])}/{len(s[\"required\"])} applied, ok={s[\"ok\"]})')
print(f'Scripts: {p[\"scripts_count\"]} files, integrity={p[\"scripts_integrity\"][\"matched\"]}/{p[\"scripts_integrity\"][\"total\"]}')
print(f'Procs: ' + ' '.join([k for k,v in p['processors'].items() if v and k != 'total_py_files']))
"
```

Output:

```
Status: ok
Build: 0e73d2a
Services: postgres=ok redis=ok
Schema: migrate_fix_chunk_id_format.sql (8/8 applied, ok=True)
Scripts: 286 files, integrity=345/345
Procs: asr yolo face pose ocr cut caption scene story asrx probe visual_chunk
```

---

## 10. Schema Version

Each binary embeds a list of required migrations. At startup and via `/health/detailed`, the server verifies that all migrations are applied.
```bash
# Check schema version via API
curl -sf http://api.momentry.ddns.net/health/detailed | python3 -c "
import json,sys;d=json.load(sys.stdin)['schema']
print(f'Table exists: {d[\"table_exists\"]}')
print(f'All OK: {d[\"ok\"]}')
for m in d['required']:
    match = '✓' if any(a['filename']==m['filename'] and a['checksum']==m['checksum'] for a in d['applied']) else '✗'
    print(f'  {match} {m[\"filename\"]} {m[\"checksum\"][:16]}')
"
```

Output:

```
Table exists: True
All OK: True
  ✓ migrate_add_content_hash.sql 42b81554248c4bec
  ✓ migrate_add_registered_status.sql 566fdfcdc624f6fa
  ✓ migrate_add_schema_version.sql 585b31df6056a937
  ✓ migrate_cleanup_inactive_identities.sql daa52a0827b24a77
  ✓ migrate_fix_chunk_id_format.sql a1b2c3d4e5f6a7b8
  ✓ migrate_public_schema_v4.sql 973908076c614363
  ✓ migrate_public_schema_v4_tables.sql 1d62dc42e4dec8f4
  ✓ migrate_public_v4_complete.sql 2a6fda7d2c5660e4
```

If a migration is missing at startup:

```
[SCHEMA] 7/8 migrations applied. Missing: migrate_fix_chunk_id_format.sql
```

---

## Summary Checklist

After completing a pipeline run, verify all items:

### Registration

| # | Check | Expected | Pass |
|---|-------|----------|:----:|
| 1 | `videos.status` | `registered` | ☐ |
| 2 | file_uuid consistency | API response uuid = DB uuid | ☐ |
| 3 | Probe returns metadata | `duration > 0`, `fps > 0` | ☐ |
| 4 | Probe error (Bug #3) | Bad UUID → JSON error + 404 | ☐ |

### Processing

| # | Check | Expected | Pass |
|---|-------|----------|:----:|
| 5 | Job created | `monitor_jobs.status = PENDING` | ☐ |
| 6 | Processors queued | `processor_results` rows = requested count | ☐ |
| 7 | Worker picks up job | `monitor_jobs.status → RUNNING` | ☐ |
| 8 | SHA256 integrity (Bug #2) | `[INTEGRITY] *.py checksum OK` | ☐ |
| 9 | Each processor completes | `processor_results.status = completed` | ☐ |
| 10 | No processor errors | `error_message` all NULL | ☐ |
| 11 | Pipeline completes | `videos.status = completed` | ☐ |

### Results

| # | Check | Expected | Pass |
|---|-------|----------|:----:|
| 12 | Chunks produced | `chunk` table has > 0 rows | ☐ |
| 13 | Chunk ID format | `chunk_id = {uuid}_{start}_{end}` | ☐ |
| 14 | Chunk fallback (Bug #2) | Old integer ID → 200 via handler fallback | ☐ |
| 15 | Search works | `POST /search/universal` returns hits | ☐ |
| 16 | Schema version | `schema.ok = true` in `/health/detailed` | ☐ |

---

## Full Automation Script

Save as `demo_full_cycle.sh`:

```bash
#!/bin/bash
set -euo pipefail

API="http://api.momentry.ddns.net"
KEY="muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
PG="/Users/accusys/pgsql/18.3/bin"

# Generate test video
ffmpeg -y -f lavfi -i "testsrc=duration=5:size=640x480:rate=24" \
  -f lavfi -i "anullsrc=r=44100:cl=mono" \
  -c:v libx264 -preset ultrafast -crf 28 -c:a aac -shortest \
  /tmp/auto_demo.mp4 2>/dev/null

# Register
UUID=$(curl -sf -X POST "$API/api/v1/files/register" \
  -H "X-API-Key: $KEY" -H "Content-Type: application/json" \
  -d '{"file_path": "/tmp/auto_demo.mp4"}' | python3 -c "import json,sys;print(json.load(sys.stdin)['file_uuid'])")
echo "Registered: $UUID"

# Process
curl -sf -X POST "$API/api/v1/file/$UUID/process" \
  -H "X-API-Key: $KEY" -H "Content-Type: application/json" \
  -d '{"processors":["asr","cut"]}' > /dev/null
echo "Processing triggered"

# Run worker
DATABASE_SCHEMA=dev target/debug/momentry_playground worker \
  --max-concurrent 1 --poll-interval 3 &
WPID=$!
sleep 30
kill $WPID 2>/dev/null || true

# Results
"$PG/psql" -U accusys -d momentry -c "
  SELECT processor, status FROM dev.processor_results
  WHERE file_uuid='$UUID' ORDER BY id"

echo "Done: $UUID"
```
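
The automation script waits a fixed `sleep 30` before stopping the worker, which can cut off a slow run or waste time on a fast one. The sketch below shows one possible alternative: poll until a terminal status appears, as step 6 does by hand. The function names `wait_for_terminal_status` and `fetch_progress_status` are hypothetical, and it assumes the progress endpoint exposes a top-level `status` field (as read in step 6) and that `API`, `KEY`, and `UUID` are set as in the script above.

```bash
#!/bin/bash
# wait_for_terminal_status FETCH_CMD [MAX_POLLS] [INTERVAL_SECONDS]
# FETCH_CMD is the name of any command or function that prints the current status.
# Returns 0 on "completed", 1 on "failed", 2 when the poll budget is exhausted.
wait_for_terminal_status() {
  local fetch_cmd="$1" max_polls="${2:-12}" interval="${3:-5}" status i
  for ((i = 1; i <= max_polls; i++)); do
    status=$("$fetch_cmd" 2>/dev/null || true)
    status="${status:-unknown}"   # empty output counts as "unknown"
    echo "Poll $i: status=$status"
    case "$status" in
      completed) return 0 ;;
      failed)    return 1 ;;
    esac
    sleep "$interval"
  done
  echo "Timed out after $max_polls polls"
  return 2
}

# Hypothetical fetcher: reads .status from the progress endpoint used in step 6.
# Assumes API, KEY and UUID are already defined by the calling script.
fetch_progress_status() {
  curl -sf -H "X-API-Key: $KEY" "$API/api/v1/progress/$UUID" | jq -r '.status // "unknown"'
}

# Possible use in place of "sleep 30" (the "|| true" keeps "set -e" from aborting):
#   wait_for_terminal_status fetch_progress_status 20 3 || true
#   kill $WPID 2>/dev/null || true
```

Passing the fetcher by name keeps the loop independent of the API, so the same function could instead poll `dev.videos.status` through `psql` if the progress endpoint is unavailable.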