Momentry Core — Pipeline Demo End-to-End

  • Date: 2026-05-15
  • Build: c41f7e0c
  • Server: http://api.momentry.ddns.net (production)
  • Format: jq for JSON parsing (not python3)
  • Scope: File registration → Pipeline processing (multi-phase) → Post-processing verification


Pipeline Phases

| Phase | Steps | What happens |
|-------|-------|--------------|
| Pre-processing | 1-4 | System check, scan, register, probe |
| Processing | 5-6 | Submit job → Worker picks up → Each processor runs (pending → running → completed) |
| Post-processing | 7-9 | All results → Search → Identities → Schema verification |

1. Check System Status

API="http://api.momentry.ddns.net"
KEY="muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"

# Basic health
curl -sf "$API/health" | jq '{status, version, build_git_hash, uptime_ms}'

# Detailed health
curl -sf "$API/health/detailed" | jq '{
  services, 
  schema: .schema.ok, 
  scripts: .pipeline.scripts_count, 
  integrity: .pipeline.scripts_integrity,
  procs: [.pipeline.processors | to_entries[] | select(.value == true and .key != "total_py_files") | .key]
}'

Output:

{
  "status": "ok",
  "version": "1.0.0",
  "build_git_hash": "c41f7e0c",
  "uptime_ms": 2756192
}
{
  "services": {"postgres": "ok", "redis": "ok", "qdrant": "ok"},
  "schema": false,
  "scripts": 291,
  "integrity": {"matched": 332, "total": 345, "ok": false},
  "procs": ["asr","yolo","face","pose","ocr","cut","caption","scene","story","asrx","probe","visual_chunk"]
}
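
The detailed health response can be checked automatically before a run starts. The following is a minimal sketch, assuming the response has the shape implied by the jq filter and sample output above (string-valued `services`, `schema.ok`, `pipeline.scripts_integrity`). The function name `health_problems` is hypothetical and not part of Momentry.

```shell
# health_problems: read a /health/detailed JSON document on stdin and print
# one line per problem found. Prints nothing when every check reports ok.
# Assumes the field layout shown in the sample output above.
health_problems() {
    jq -r '
        (.services | to_entries[] | select(.value != "ok")
            | "service \(.key): \(.value)"),
        (if .schema.ok then empty else "schema: not ok" end),
        (if .pipeline.scripts_integrity.ok then empty
         else "scripts integrity: \(.pipeline.scripts_integrity.matched)/\(.pipeline.scripts_integrity.total)"
         end)
    '
}

# Usage against the server (API as defined above):
#   curl -sf "$API/health/detailed" | health_problems
```

For the sample response shown above, this would report the schema and the script-integrity mismatch, since both `ok` flags are false.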

2. Scan Files

Scan the server for all files related to exasan (the pattern supports regular expressions):

curl -sf -H "X-API-Key: $KEY" "$API/api/v1/files/scan?pattern=exasan" | \
  jq '[.files[] | {uuid: .file_uuid, name: .file_name, size: .file_size}]'

Output (abbreviated):

[
  {"uuid": "dd61fda85fee441f...", "name": "ExaSAN PCIe series - Director Ou Yu-Zhi Shares His Experience.mp4", "size": 6827600},
  {"uuid": "8e2e98c49355935f...", "name": "ExaSAN Webinar by Blake Jones, Vision2see.mp4", "size": 38635889},
  {"uuid": "477d8fa7bc0e1a7...", "name": "Thunderbolt ExaSAN at CCBN.mp4", "size": 13126748}
]

Note: files/scan can also scan all files or be used for batch registration. If no pattern is specified, it returns every file under the server's sftpgo/data/demo/ directory.


3. Register or Confirm

If the file is not yet registered, use the register API. If it already exists, as in this demo, simply confirm its status:

UUID="dd61fda85fee441fdd00ab5528213ff7"

# Confirm the file status
curl -sf -H "X-API-Key: $KEY" "$API/api/v1/file/${UUID}" | jq '{uuid: .file_uuid[0:16], name: .file_name, status, duration, fps}'

# If the file is not registered yet, use the register API
# curl -sf -X POST -H "X-API-Key: $KEY" -H "Content-Type: application/json" \
#   -d '{"file_path": "/path/to/video.mp4"}' \
#   "$API/api/v1/files/register" | jq '.'

Registration flow:

POST /files/register
  ├─ SHA256 content_hash (dedup check)
  ├─ file_name conflict check (auto-rename)
  ├─ Pre-process (SHA256 + ffprobe + UUID → .pre.json)
  ├─ UUID = f(mac, mtime, path, filename)
  ├─ Unified probe (video→ffprobe, doc→Python)
  └─ INSERT INTO videos
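
The first step of this flow, content-hash deduplication, can be illustrated locally. This is only a sketch under stated assumptions: the server keeps its hashes in the database, whereas here a plain text file with one digest per line stands in for that lookup. The helper names `sha256_of` and `is_duplicate` are hypothetical.

```shell
# sha256_of: print the SHA256 hex digest of a file, using whichever of the
# two common tools is installed (sha256sum on Linux, shasum on macOS).
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | cut -d' ' -f1
    else
        shasum -a 256 "$1" | cut -d' ' -f1
    fi
}

# is_duplicate: succeed if the file's digest already appears in the
# known-hashes file. The text file is a stand-in for the DB lookup.
is_duplicate() {
    local file="$1" known="$2"
    grep -qxF "$(sha256_of "$file")" "$known"
}

# Usage:
#   is_duplicate /path/to/video.mp4 known_hashes.txt && echo "already registered"
```

Because the digest depends only on file content, a renamed or moved copy of the same video is still detected as a duplicate.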

4. Probe Confirmation

The probe endpoint returns ffprobe metadata about the registered file.

# Substitute the actual file_uuid from step 3
FILE_UUID="e1111111111111111111111111111111"

curl -s -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
    "http://api.momentry.ddns.net/api/v1/file/${FILE_UUID}/probe" | python3 -m json.tool

Output (abbreviated):

{
    "file_uuid": "e1111111111111111111111111111111",
    "file_name": "demo_test_video.mp4",
    "duration": 5.005,
    "width": 640,
    "height": 480,
    "fps": 24.0,
    "total_frames": 120,
    "cached": true,
    "format": {
        "filename": "/tmp/demo_test_video.mp4",
        "format_name": "mov,mp4,m4a,3gp,3g2,mj2",
        "duration": "5.005000",
        "size": "98304",
        "bit_rate": "157184"
    },
    "streams": [
        {"index": 0, "codec_type": "video", "codec_name": "h264", "width": 640, "height": 480, ...},
        {"index": 1, "codec_type": "audio", "codec_name": "aac", ...}
    ]
}

Error handling (Bug #3 fix):

  • Non-existent UUID → {"error":"Video not found"} + HTTP 404
  • File deleted from disk → {"error":"File does not exist at registered path"} + HTTP 404
  • ffprobe failure → {"error":"ffprobe failed: ..."} + HTTP 500
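
A client can tell these cases apart by reading the HTTP status separately from the JSON body. The sketch below relies on curl's `-o` and `-w '%{http_code}'` options. The function names are hypothetical, and `API` and `KEY` are assumed to be set as in step 1.

```shell
# classify_probe: map an HTTP status code from the probe endpoint to a short
# label, following the error cases listed above.
classify_probe() {
    case "$1" in
        200) echo "ok" ;;
        404) echo "not-found" ;;      # unknown UUID, or file missing on disk
        500) echo "server-error" ;;   # ffprobe failure, per the list above
        *)   echo "unexpected" ;;
    esac
}

# probe_status: fetch the probe result, write the JSON body to the given
# file, and print only the HTTP status code.
probe_status() {
    local uuid="$1" body_file="$2"
    curl -s -o "$body_file" -w '%{http_code}' \
        -H "X-API-Key: $KEY" "$API/api/v1/file/${uuid}/probe"
}

# Usage:
#   code=$(probe_status "$FILE_UUID" /tmp/probe.json)
#   echo "$(classify_probe "$code"): $(cat /tmp/probe.json)"
```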

Intermediate Check — Bug #3: Probe Error Verification

Verify that both error cases return a JSON error body with the proper HTTP code instead of a bare 500:

echo "=== Non-existent UUID → expect 404 ==="
curl -s -w "\nHTTP: %{http_code}\n" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
    "http://api.momentry.ddns.net/api/v1/file/bad_uuid_12345/probe"
# Expect: {"error":"Video not found","file_uuid":"bad_uuid_12345"}  HTTP 404

echo ""
echo "=== Non-existent file path → expect 404 ==="
# Temporarily change file_path to a non-existent location
PG_BIN="${PG_BIN:-/Users/accusys/pgsql/18.3/bin}"  # same PostgreSQL bin path used in later steps
"$PG_BIN/psql" -U accusys -d momentry -c \
    "UPDATE dev.videos SET file_path = '/tmp/NONEXISTENT_FILE' WHERE file_uuid = '${FILE_UUID}'"
curl -s -w "\nHTTP: %{http_code}\n" -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
    "http://api.momentry.ddns.net/api/v1/file/${FILE_UUID}/probe"
# Expect: {"error":"File does not exist at registered path",...}  HTTP 404
# Restore path
"$PG_BIN/psql" -U accusys -d momentry -c \
    "UPDATE dev.videos SET file_path = '/tmp/demo_test_video.mp4' WHERE file_uuid = '${FILE_UUID}'"

Output:

=== Non-existent UUID → expect 404 ===
{"error":"Video not found","file_uuid":"bad_uuid_12345"}
HTTP: 404

=== Non-existent file path → expect 404 ===
{"error":"File does not exist at registered path","file_uuid":"e1111111111111111111111111111111","file_path":"/tmp/NONEXISTENT_FILE"}
HTTP: 404

5. Process Video

Trigger pipeline processing for specific processors. The available processors are:

| Processor | Function | Script |
|-----------|----------|--------|
| asr | Speech-to-text (faster-whisper) | asr_processor.py |
| cut | Scene detection (PySceneDetect) | cut_processor.py |
| yolo | Object detection (YOLOv8) | yolo_processor.py |
| face | Face detection (InsightFace) | face_processor.py |
| pose | Pose estimation (MediaPipe) | pose_processor.py |
| ocr | Text detection (PaddleOCR) | ocr_processor.py |
| asrx | Speaker diarization | asrx_processor.py |
| visual_chunk | Visual content analysis | visual_chunk_processor.py |
| scene | Scene classification | scene_classifier.py |
| story | Story generation (LLM) | story_processor.py |
| caption | Caption generation | caption_processor.py |

# Trigger only ASR + CUT for quick test
curl -s -X POST "http://api.momentry.ddns.net/api/v1/file/${FILE_UUID}/process" \
    -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
    -H "Content-Type: application/json" \
    -d '{"processors": ["asr", "cut"]}' | python3 -m json.tool

Output:

{
    "job_id": 161,
    "file_uuid": "e1111111111111111111111111111111",
    "status": "PENDING",
    "pids": [],
    "message": "Processing triggered for demo_test_video.mp4"
}

Processing flow:

POST /process → trigger_processing()
  ├─ Validate file exists (DB lookup)
  ├─ Create monitor_job (status: PENDING)
  ├─ Create processor_result rows for each requested processor (status: pending)
  └─ Response { job_id, status: "PENDING" }

Note: If no processors are specified, the default set below is used. It does not include scene, story, or caption:

{"processors": ["asr", "cut", "yolo", "ocr", "face", "pose", "asrx", "visual_chunk"]}
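
The request body can be assembled from a plain list of processor names. This is a minimal sketch: `build_payload` is a hypothetical helper, and it performs no JSON escaping, which is acceptable only because processor names are simple identifiers.

```shell
# build_payload: turn a list of processor names into the JSON body expected
# by POST /api/v1/file/<uuid>/process.
# No escaping is done, so arguments must be plain identifiers.
build_payload() {
    local json="" p
    for p in "$@"; do
        json="${json:+$json, }\"$p\""
    done
    printf '{"processors": [%s]}\n' "$json"
}

# Usage:
#   curl -s -X POST "$API/api/v1/file/${FILE_UUID}/process" \
#       -H "X-API-Key: $KEY" -H "Content-Type: application/json" \
#       -d "$(build_payload asr cut)"
```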

Intermediate Check — Verify Job + Processor Results after Trigger

PG_BIN="/Users/accusys/pgsql/18.3/bin"

# Check monitor_jobs table
# (-x gives expanded output; a \gx meta-command cannot be mixed with SQL inside -c)
"$PG_BIN/psql" -U accusys -d momentry -x -c "
SELECT id, uuid, status, current_processor,
       to_char(created_at, 'HH24:MI:SS') AS created
FROM dev.monitor_jobs
WHERE uuid = '${FILE_UUID}'
ORDER BY id DESC LIMIT 1
"

# Check processor_results table
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT id, processor, status
FROM dev.processor_results
WHERE file_uuid = '${FILE_UUID}'
ORDER BY id
"

Output:

-[ RECORD 1 ]------+-----------------------------
id                 | 161
uuid               | e1111111111111111111111111111111
status             | PENDING
current_processor  | (null)
created            | 19:00:30

 id | processor | status
----+-----------+---------
  1 | asr       | pending
  2 | cut       | pending

Checklist after trigger:

  • monitor_jobs.status = 'PENDING' — job created, awaiting worker
  • processor_results rows match requested processors (2 rows for asr, cut)
  • Each processor.status = 'pending' — not yet executed

6. Worker Execution

The worker polls for pending jobs and executes them one by one.

DATABASE_SCHEMA=dev cargo run --bin momentry_playground -- worker \
    --max-concurrent 2 --poll-interval 5

Or in background:

DATABASE_SCHEMA=dev nohup target/debug/momentry_playground worker \
    --max-concurrent 2 --poll-interval 5 > /tmp/worker_demo.log 2>&1 &

Worker flow:

Worker loop (every 5 seconds):
  ├─ Poll: SELECT * FROM monitor_jobs WHERE status = 'PENDING'
  ├─ Set job status → RUNNING
  ├─ For each pending processor:
  │    ├─ SHA256 integrity check (verify_script_integrity)
  │    │    └─ checksums.sha256 manifest lookup
  │    ├─ Execute script via PythonExecutor
  │    │    └─ Command: venv/bin/python scripts/<processor>.py <args>
  │    ├─ Verify output (file exists, content valid)
  │    └─ Update processor_result (completed/failed)
  ├─ Check completion: all processors done?
  ├─ Yes → Set job + video status → COMPLETED
  └─ No → Wait for next poll cycle
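
The integrity step in this flow can be reproduced by hand. The sketch below assumes that checksums.sha256 uses the standard `sha256sum` line format (`<digest>  <path>`), which the source does not confirm. `verify_script` is a hypothetical shell stand-in for the worker's verify_script_integrity, and the MISMATCH message is the sketch's own wording.

```shell
# sha256_of: print the SHA256 hex digest of a file (sha256sum or shasum).
sha256_of() {
    if command -v sha256sum >/dev/null 2>&1; then
        sha256sum "$1" | cut -d' ' -f1
    else
        shasum -a 256 "$1" | cut -d' ' -f1
    fi
}

# verify_script: look up a script's expected digest in the manifest and
# compare it with the digest of the file on disk.
# Returns 0 on a match and 1 on a mismatch or a missing manifest entry.
verify_script() {
    local manifest="$1" script="$2" name expected actual
    name=$(basename "$script")
    # Assumed manifest line format: "<digest>  <path>"; match on file name.
    expected=$(awk -v n="$name" \
        '{ f = $2; sub(/.*\//, "", f); if (f == n) { print $1; exit } }' "$manifest")
    actual=$(sha256_of "$script")
    if [ -n "$expected" ] && [ "$expected" = "$actual" ]; then
        echo "[INTEGRITY] $name checksum OK"
    else
        echo "[INTEGRITY] $name checksum MISMATCH"
        return 1
    fi
}

# Usage:
#   verify_script checksums.sha256 scripts/asr_processor.py
```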

Worker log output:

[CHECKSUMS] Loaded 345 entries from checksums.sha256
[INTEGRITY] asr_processor.py checksum OK
[ASR] Starting asr_processor.py
[INTEGRITY] cut_processor.py checksum OK  
[CUT] Starting cut_processor.py
[ASR] Completed successfully
[CUT] Completed successfully
check_and_complete_job: results=2/2 → Job COMPLETED
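
The final log line reflects a simple tally of processor results. The sketch below reproduces that tally in shell, assuming a processor counts as finished once its status is completed or failed. How the real worker treats a failed processor when setting the job status is not shown in this document, so the sketch only reports the count. `job_tally` is a hypothetical name.

```shell
# job_tally: given the status of each processor_results row as arguments,
# print "results=<finished>/<total>" and succeed only when all are finished.
# "Finished" means completed or failed (an assumption of this sketch).
job_tally() {
    local total=$# finished=0 s
    for s in "$@"; do
        case "$s" in
            completed|failed) finished=$((finished + 1)) ;;
        esac
    done
    echo "results=${finished}/${total}"
    [ "$total" -gt 0 ] && [ "$finished" -eq "$total" ]
}

# Usage with the statuses read from the DB:
#   job_tally $("$PG_BIN/psql" -U accusys -d momentry -t -A -c \
#       "SELECT status FROM dev.processor_results WHERE file_uuid = '${FILE_UUID}'")
```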

Intermediate Check — Poll Progress During Worker Execution

While the worker is running, poll the progress endpoint to watch state transitions:

# Poll every 5 seconds until completed
FILE_UUID="e1111111111111111111111111111111"
for i in $(seq 1 12); do
    sleep 5
    STATUS=$(curl -sf -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
        "http://api.momentry.ddns.net/api/v1/progress/${FILE_UUID}" \
        | python3 -c "import json,sys;d=json.load(sys.stdin);print(d.get('status','?'))" 2>/dev/null || echo "pending")
    echo "Poll $i: status=$STATUS"
    if [ "$STATUS" = "completed" ] || [ "$STATUS" = "failed" ]; then break; fi
done

Output (typical):

Poll 1: status=registered         ← worker hasn't picked it up yet
Poll 2: status=pending            ← worker picked up, job status changed
Poll 3: status=processing         ← worker running ASR
Poll 4: status=processing         ← worker running CUT
Poll 5: status=completed          ← all done
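
The polling pattern above can be wrapped in a reusable function with a bounded number of attempts. This is a sketch, not part of the Momentry tooling: `poll_until_done` and `get_status` are hypothetical names, and the terminal states completed and failed are taken from the loop above.

```shell
# poll_until_done MAX_POLLS INTERVAL CMD [ARGS...]
# Runs CMD repeatedly; CMD must print the current status on stdout.
# Returns 0 on "completed", 1 on "failed", 2 if MAX_POLLS is exhausted.
poll_until_done() {
    local max_polls="$1" interval="$2" i=1 status
    shift 2
    while [ "$i" -le "$max_polls" ]; do
        status=$("$@" 2>/dev/null) || status="unknown"
        echo "Poll $i: status=$status"
        case "$status" in
            completed) return 0 ;;
            failed)    return 1 ;;
        esac
        sleep "$interval"
        i=$((i + 1))
    done
    return 2
}

# Usage (API, KEY and FILE_UUID as defined earlier):
#   get_status() {
#       curl -sf -H "X-API-Key: $KEY" "$API/api/v1/progress/${FILE_UUID}" | jq -er '.status'
#   }
#   poll_until_done 12 5 get_status
```

Distinct return codes let a calling script tell a failed pipeline apart from one that simply has not finished within the polling window.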

Check status transitions in DB:

"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT id, processor, status,
       to_char(started_at, 'HH24:MI:SS') AS started,
       to_char(completed_at, 'HH24:MI:SS') AS completed
FROM dev.processor_results
WHERE file_uuid = '${FILE_UUID}'
ORDER BY id
"

Output:

 id | processor |  status    | started   | completed
----+-----------+------------+-----------+-----------
  1 | asr       | completed  | 19:01:02  | 19:01:25
  2 | cut       | completed  | 19:01:02  | 19:01:08

Processing Checklist — Step-by-Step Verification

This checklist covers every stage of the pipeline processing flow:

# ──────────────────────────────────────────────────────
# Stage A: Before Worker Starts
# ──────────────────────────────────────────────────────
PG_BIN="/Users/accusys/pgsql/18.3/bin"
FILE_UUID="e1111111111111111111111111111111"
KEY="muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"

echo "=== A1. Job status = PENDING ==="
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT id, status, current_processor, created_at FROM dev.monitor_jobs WHERE uuid = '${FILE_UUID}'
"

echo "=== A2. Processor results = pending ==="
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT id, processor, status FROM dev.processor_results WHERE file_uuid = '${FILE_UUID}' ORDER BY id
"

# ──────────────────────────────────────────────────────
# Stage B: Worker Running
# ──────────────────────────────────────────────────────
echo "=== Start worker ==="
DATABASE_SCHEMA=dev nohup target/debug/momentry_playground worker \
    --max-concurrent 1 --poll-interval 3 > /tmp/worker_check.log 2>&1 &
WPID=$!

echo "=== B1. Worker picks up job (within 3-10s) ==="
for i in $(seq 1 10); do
    sleep 3
    JOB_STATUS=$("$PG_BIN/psql" -U accusys -d momentry -t -A -c \
        "SELECT status FROM dev.monitor_jobs WHERE uuid = '${FILE_UUID}'" 2>/dev/null)
    VIDEO_STATUS=$("$PG_BIN/psql" -U accusys -d momentry -t -A -c \
        "SELECT status FROM dev.videos WHERE file_uuid = '${FILE_UUID}'" 2>/dev/null)
    echo "  Poll $i: job=$JOB_STATUS video=$VIDEO_STATUS"
    echo "  $(grep -E '\[INTEGRITY\]|\[SCHEMA\]|Starting:|Completed|failed|Job ' /tmp/worker_check.log 2>/dev/null | tail -3)"

    # Check alive
    kill -0 $WPID 2>/dev/null || { echo "  Worker died unexpectedly"; break; }
    
    if [ "$VIDEO_STATUS" = "completed" ] || [ "$VIDEO_STATUS" = "failed" ]; then break; fi
done

echo "=== B2. Each processor status ==="
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT id, processor, status,
       to_char(started_at, 'HH24:MI:SS') AS started,
       to_char(completed_at, 'HH24:MI:SS') AS completed,
       COALESCE(chunks_produced, 0) AS chunks,
       COALESCE(frames_processed, 0) AS frames,
       COALESCE(error_message, '') AS error
FROM dev.processor_results
WHERE file_uuid = '${FILE_UUID}'
ORDER BY id
"

kill $WPID 2>/dev/null || true

# ──────────────────────────────────────────────────────
# Stage C: After Completion
# ──────────────────────────────────────────────────────
echo "=== C1. Video final status ==="
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT file_uuid, file_name, status, duration, fps, total_frames FROM dev.videos WHERE file_uuid = '${FILE_UUID}'
"

echo "=== C2. Chunks produced ==="
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT chunk_type, count(*) FROM dev.chunk WHERE file_uuid = '${FILE_UUID}' GROUP BY chunk_type ORDER BY chunk_type
"

echo "=== C3. Job final status ==="
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT id, status, current_processor FROM dev.monitor_jobs WHERE uuid = '${FILE_UUID}'
"

Expected output (all green):

=== A1. Job status = PENDING ===
 id | status  | current_processor | created_at
----+---------+-------------------+-------------------
 161| PENDING |                   | 2026-05-15 19:00:30

=== A2. Processor results = pending ===
 id | processor | status
----+-----------+---------
  1 | asr       | pending
  2 | cut       | pending

=== Start worker ===
=== B1. Worker picks up job (within 3-10s) ===
  Poll 1: job=PENDING video=registered
  Poll 2: job=RUNNING video=processing
  [INTEGRITY] asr_processor.py checksum OK
  Poll 3: job=RUNNING video=processing
  [ASR] Starting: asr_processor.py
  Poll 4: job=RUNNING video=processing
  [ASR] Completed successfully
  Poll 5: job=RUNNING video=processing
  [CUT] Completed successfully
  Poll 6: job=COMPLETED video=completed

=== B2. Each processor status ===
 id | processor |  status   | started   | completed | chunks | frames | error
----+-----------+-----------+-----------+-----------+--------+--------+-------
  1 | asr       | completed | 19:01:02  | 19:01:25 |      3 |    120 |
  2 | cut       | completed | 19:01:02  | 19:01:08 |      1 |    120 |

=== C1. Video final status ===
  file_uuid   |      file_name      |  status   | duration | fps | total_frames
--------------+---------------------+-----------+----------+-----+--------------
 e11111111... | demo_test_video.mp4 | completed |    5.005 |  24 |          120

=== C2. Chunks produced ===
 chunk_type | count
------------+-------
 cut        |     1
 sentence   |     3

=== C3. Job final status ===
 id |  status   | current_processor
----+-----------+-------------------
 161| COMPLETED | (null)

Checklist during execution:

| Stage | # | Check | Expected | Pass |
|-------|---|-------|----------|------|
| A. Pre-worker | A1 | monitor_jobs.status | PENDING | |
| | A2 | processor_results rows | = requested processor count | |
| | A3 | Each processor_results.status | pending | |
| B. Running | B1 | Job picked up (within poll interval) | status → RUNNING | |
| | B2 | SHA256 integrity check in logs | `[INTEGRITY] *.py checksum OK` | |
| | B3 | Each processor transitions | pending → running → completed | |
| | B4 | started_at populated | NOT NULL per processor | |
| | B5 | Processors complete without error | error_message is NULL | |
| | B6 | Max concurrent respected | at most `--max-concurrent` running at once | |
| C. Post-completion | C1 | videos.status | completed (not failed) | |
| | C2 | chunks_produced > 0 | ASR has sentence chunks | |
| | C3 | monitor_jobs.status | COMPLETED | |
| | C4 | chunk table has data | rows with this file_uuid | |
| | C5 | Chunk IDs formatted correctly | `{uuid}_{start}_{end}` | |
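
Checks of the "actual value equals expected value" kind in this checklist can be scripted with a small assertion helper. The sketch below is one possible shape. `check` and `FAILED` are hypothetical names, and the variables in the usage lines are the ones set by the Stage B loop above.

```shell
# check LABEL EXPECTED ACTUAL
# Prints a PASS or FAIL line and counts failures in the global FAILED,
# so a script can exit non-zero at the end if any check failed.
FAILED=0
check() {
    local label="$1" expected="$2" actual="$3"
    if [ "$expected" = "$actual" ]; then
        echo "PASS  $label"
    else
        echo "FAIL  $label (expected '$expected', got '$actual')"
        FAILED=$((FAILED + 1))
    fi
}

# Usage:
#   check "C1 videos.status" "completed" "$VIDEO_STATUS"
#   check "C3 monitor_jobs.status" "COMPLETED" "$JOB_STATUS"
#   [ "$FAILED" -eq 0 ] || exit 1
```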

7. Check Results

Monitor job progress:

# Check job status
curl -s -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
    "http://api.momentry.ddns.net/api/v1/jobs?page=1&page_size=5&status=pending,running,completed,failed" \
    | python3 -c "import json,sys;d=json.load(sys.stdin);[print(f'{j[\"uuid\"]}: {j[\"status\"]}') for j in d.get('jobs',[])]"

Output:

9eca53f422f668dd59a9995d29dc9388: completed
e1111111111111111111111111111111: completed

Intermediate Check — Bug #2: Chunk Fallback Verification

Verify that both new and old chunk_id formats resolve correctly:

# Pick a chunk_id from the DB
CHUNK_INFO=$("$PG_BIN/psql" -U accusys -d momentry -t -A -c "
SELECT chunk_id, id FROM dev.chunk WHERE file_uuid = '${FILE_UUID}' LIMIT 1
")
NEW_ID=$(echo "$CHUNK_INFO" | cut -d'|' -f1)
DB_ID=$(echo "$CHUNK_INFO" | cut -d'|' -f2)

# Fetch a chunk and print its chunk_id plus the HTTP code.
# The body goes to a temp file so the -w output is not mixed into the JSON being parsed.
fetch_chunk() {
    local code
    code=$(curl -s -o /tmp/chunk_check.json -w "%{http_code}" \
        -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
        "http://api.momentry.ddns.net/api/v1/file/${FILE_UUID}/chunk/$1")
    echo "chunk_id=$(jq -r '.chunk_id' /tmp/chunk_check.json) HTTP $code"
}

echo "=== New format: $NEW_ID ==="
fetch_chunk "$NEW_ID"

echo ""
echo "=== Old integer fallback (id=$DB_ID) ==="
fetch_chunk "$DB_ID"

Output:

=== New format: e1111111111111111111111111111111_0_5 ===
chunk_id=e1111111111111111111111111111111_0_5 HTTP 200

=== Old integer fallback (id=1075655) ===
chunk_id=e1111111111111111111111111111111_0_5 HTTP 200

Both return chunk_id=e1111111111111111111111111111111_0_5 — the fallback correctly resolves id=1075655 to the same chunk.
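
The two identifier shapes are easy to tell apart on the client side. The sketch below classifies an identifier by pattern, assuming the new format is a 32-character lowercase hex file_uuid followed by integer start and end values, as in the examples above. Whether the server uses the same rule for its fallback is not stated here, and `chunk_id_kind` is a hypothetical name.

```shell
# chunk_id_kind: classify a chunk identifier by its shape.
#   new     -> <32 hex chars>_<start>_<end>
#   legacy  -> plain integer database id
#   invalid -> anything else
chunk_id_kind() {
    local id="$1"
    if printf '%s\n' "$id" | grep -Eq '^[0-9a-f]{32}_[0-9]+_[0-9]+$'; then
        echo "new"
    elif printf '%s\n' "$id" | grep -Eq '^[0-9]+$'; then
        echo "legacy"
    else
        echo "invalid"
    fi
}

# Usage:
#   chunk_id_kind "$NEW_ID"
#   chunk_id_kind "$DB_ID"
```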

Intermediate Check — Verify Chunks after Processing

PG_BIN="/Users/accusys/pgsql/18.3/bin"

# Count chunks produced
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT chunk_type, count(*) AS count
FROM dev.chunk
WHERE file_uuid = '${FILE_UUID}'
GROUP BY chunk_type
ORDER BY chunk_type
"

# Sample chunk content
"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT chunk_id, chunk_type, start_frame, end_frame,
       substring(text_content, 1, 60) AS text_preview
FROM dev.chunk
WHERE file_uuid = '${FILE_UUID}'
ORDER BY start_frame
LIMIT 5
"

Output:

 chunk_type | count
------------+-------
 cut        |     1
 sentence   |     3

                     chunk_id                     | chunk_type | start_frame | end_frame |                    text_preview
--------------------------------------------------+------------+-------------+-----------+-----------------------------------------------------
 e1111111111111111111111111111111_0_5              | cut        |           0 |       120 | demo_test_video_auto_demo.mp4
 e1111111111111111111111111111111_0_0              | sentence   |           0 |       120 | test pattern test pattern color bars test pattern ...

Check per-processor results in DB:

"$PG_BIN/psql" -U accusys -d momentry -c "
SELECT processor, status, error_message,
       to_char(started_at, 'HH24:MI:SS') AS started,
       to_char(completed_at, 'HH24:MI:SS') AS completed,
       COALESCE(chunks_produced, 0) AS chunks
FROM dev.processor_results
WHERE file_uuid='${FILE_UUID}'
ORDER BY id;
"

Output:

 processor |  status   | error_message | started   | completed | chunks
-----------+-----------+---------------+-----------+-----------+--------
 asr       | completed |               | 19:01:02  | 19:01:25 |      3
 cut       | completed |               | 19:01:02  | 19:01:08 |      1

Checklist after processing:

  • video.status = 'completed' — pipeline finished
  • processor_results all show status = 'completed'
  • chunks_produced > 0 — each processor produced output
  • chunk table has rows with correct chunk_type (cut, sentence)
  • chunk_id format is {file_uuid}_{start}_{end} (Bug #2 fix verified)

8. Search Chunks

After processing, search the generated chunks:

# Text search (ASR output)
curl -s -X POST "http://api.momentry.ddns.net/api/v1/search/universal" \
    -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
    -H "Content-Type: application/json" \
    -d "{\"query\": \"test\", \"uuid\": \"${FILE_UUID}\", \"limit\": 5}" \
    | python3 -c "
import json,sys;d=json.load(sys.stdin)
print(f'Total hits: {d[\"total\"]}')
for r in d['results']:
    if r.get('chunk_id'):
        print(f'  {r[\"chunk_id\"]}: \"{r.get(\"text\",\"\")[:60]}\" score={r.get(\"score\",0):.3f}')
"

Output:

Total hits: 3
  e1111111111111111111111111111111_0_5: "test pattern test pattern..." score=0.423
  e1111111111111111111111111111111_5_10: "silence" score=0.215
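
Hand-escaped JSON in the `-d` argument breaks as soon as the query contains a quote or a backslash. The sketch below builds the same request body with jq's `--arg` and `--argjson` options, which take care of the quoting. `search_payload` is a hypothetical helper, and the field names are the ones used in the request above.

```shell
# search_payload QUERY UUID [LIMIT]
# Build the JSON body for POST /api/v1/search/universal.
# jq handles quoting, so any query string yields valid JSON.
search_payload() {
    local query="$1" uuid="$2" limit="${3:-5}"
    jq -cn --arg q "$query" --arg uuid "$uuid" --argjson limit "$limit" \
        '{query: $q, uuid: $uuid, limit: $limit}'
}

# Usage:
#   curl -s -X POST "$API/api/v1/search/universal" \
#       -H "X-API-Key: $KEY" -H "Content-Type: application/json" \
#       -d "$(search_payload 'test' "$FILE_UUID")"
```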

Get a specific chunk by ID:

# Single chunk detail  
curl -s -H "X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69" \
    "http://api.momentry.ddns.net/api/v1/file/${FILE_UUID}/chunk/${FILE_UUID}_0_5" \
    | python3 -c "
import json,sys;d=json.load(sys.stdin)
print(f'Type: {d[\"chunk_type\"]}  Rule: {d[\"rule\"]}')
print(f'Frame: {d[\"start_frame\"]}-{d[\"end_frame\"]}  FPS: {d[\"fps\"]}')
print(f'Text: {d[\"text_content\"][:100]}')
"

9. Health Check

# Basic health
curl -sf http://api.momentry.ddns.net/health | python3 -m json.tool

# Detailed health (services + pipeline + schema + resources)
curl -sf http://api.momentry.ddns.net/health/detailed | python3 -c "
import json,sys;d=json.load(sys.stdin)
p=d['pipeline'];s=d['schema']
print(f'Status:   {d[\"status\"]}')
print(f'Build:    {d[\"build_git_hash\"]}')
print(f'Services: postgres={d[\"services\"][\"postgres\"][\"status\"]} redis={d[\"services\"][\"redis\"][\"status\"]}')
print(f'Schema:   {s[\"applied\"][-1][\"filename\"] if s[\"applied\"] else \"none\"} ({len(s[\"applied\"])}/{len(s[\"required\"])} applied, ok={s[\"ok\"]})')
print(f'Scripts:  {p[\"scripts_count\"]} files, integrity={p[\"scripts_integrity\"][\"matched\"]}/{p[\"scripts_integrity\"][\"total\"]}')
print(f'Procs:    ' + ' '.join([k for k,v in p['processors'].items() if v and k != 'total_py_files']))
"

Output:

Status:   ok
Build:    0e73d2a
Services: postgres=ok redis=ok
Schema:   migrate_fix_chunk_id_format.sql (8/8 applied, ok=True)
Scripts:  286 files, integrity=345/345
Procs:    asr yolo face pose ocr cut caption scene story asrx probe visual_chunk

10. Schema Version

Each binary embeds a list of required migrations. At startup and via /health/detailed, the server verifies all migrations are applied.

# Check schema version via API
curl -sf http://api.momentry.ddns.net/health/detailed | python3 -c "
import json,sys;d=json.load(sys.stdin)['schema']
print(f'Table exists: {d[\"table_exists\"]}')
print(f'All OK:       {d[\"ok\"]}')
for m in d['required']:
    match = '✓' if any(a['filename']==m['filename'] and a['checksum']==m['checksum'] for a in d['applied']) else '✗'
    print(f'  {match} {m[\"filename\"]}  {m[\"checksum\"][:16]}')
"

Output:

Table exists: True
All OK:       True
  ✓ migrate_add_content_hash.sql  42b81554248c4bec
  ✓ migrate_add_registered_status.sql  566fdfcdc624f6fa
  ✓ migrate_add_schema_version.sql  585b31df6056a937
  ✓ migrate_cleanup_inactive_identities.sql  daa52a0827b24a77
  ✓ migrate_fix_chunk_id_format.sql  a1b2c3d4e5f6a7b8
  ✓ migrate_public_schema_v4.sql  973908076c614363
  ✓ migrate_public_schema_v4_tables.sql  1d62dc42e4dec8f4
  ✓ migrate_public_v4_complete.sql  2a6fda7d2c5660e4

If a migration is missing at startup:

[SCHEMA] 7/8 migrations applied. Missing: migrate_fix_chunk_id_format.sql
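
The same comparison can be done with jq alone. The sketch below lists every required migration that has no applied entry matching on both filename and checksum, which is the rule the Python snippet above uses for its ✓/✗ marks. `missing_migrations` is a hypothetical name, and the `schema.required` and `schema.applied` fields are assumed to have the shape that snippet reads.

```shell
# missing_migrations: read a /health/detailed JSON document on stdin and
# print the filename of each required migration that has no applied entry
# with the same filename and checksum.
missing_migrations() {
    jq -r '
        .schema as $s
        | $s.required[]
        | select(. as $m
                 | any($s.applied[];
                       .filename == $m.filename and .checksum == $m.checksum)
                 | not)
        | .filename
    '
}

# Usage:
#   curl -sf http://api.momentry.ddns.net/health/detailed | missing_migrations
```

Note that a migration whose checksum differs from the required one is reported as missing too, since a filename match alone is not enough.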


Summary Checklist

After completing a pipeline run, verify all items:

Registration

| # | Check | Expected | Pass |
|---|-------|----------|------|
| 1 | videos.status | registered | |
| 2 | file_uuid consistency | API response uuid = DB uuid | |
| 3 | Probe returns metadata | duration > 0, fps > 0 | |
| 4 | Probe error (Bug #3) | Bad UUID → JSON error + 404 | |

Processing

| # | Check | Expected | Pass |
|---|-------|----------|------|
| 5 | Job created | monitor_jobs.status = PENDING | |
| 6 | Processors queued | processor_results rows = requested count | |
| 7 | Worker picks up job | monitor_jobs.status → RUNNING | |
| 8 | SHA256 integrity (Bug #2) | `[INTEGRITY] *.py checksum OK` | |
| 9 | Each processor completes | processor_results.status = completed | |
| 10 | No processor errors | error_message all NULL | |
| 11 | Pipeline completes | videos.status = completed | |

Results

| # | Check | Expected | Pass |
|---|-------|----------|------|
| 12 | Chunks produced | chunk table has > 0 rows | |
| 13 | Chunk ID format | chunk_id = `{uuid}_{start}_{end}` | |
| 14 | Chunk fallback (Bug #2) | Old integer ID → 200 via handler fallback | |
| 15 | Search works | POST /search/universal returns hits | |
| 16 | Schema version | schema.ok = true in /health/detailed | |

Full Automation Script

Save as demo_full_cycle.sh:

#!/bin/bash
set -euo pipefail

API="http://api.momentry.ddns.net"
KEY="muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
PG="/Users/accusys/pgsql/18.3/bin"

# Generate test video
ffmpeg -y -f lavfi -i "testsrc=duration=5:size=640x480:rate=24" \
    -f lavfi -i "anullsrc=r=44100:cl=mono" \
    -c:v libx264 -preset ultrafast -crf 28 -c:a aac -shortest \
    /tmp/auto_demo.mp4 2>/dev/null

# Register
UUID=$(curl -sf -X POST "$API/api/v1/files/register" \
    -H "X-API-Key: $KEY" -H "Content-Type: application/json" \
    -d '{"file_path": "/tmp/auto_demo.mp4"}' | jq -r '.file_uuid')
echo "Registered: $UUID"

# Process
curl -sf -X POST "$API/api/v1/file/$UUID/process" \
    -H "X-API-Key: $KEY" -H "Content-Type: application/json" \
    -d '{"processors":["asr","cut"]}' > /dev/null
echo "Processing triggered"

# Run worker
DATABASE_SCHEMA=dev target/debug/momentry_playground worker \
    --max-concurrent 1 --poll-interval 3 &
WPID=$!
sleep 30
kill $WPID 2>/dev/null || true

# Results
"$PG/psql" -U accusys -d momentry -c "
SELECT processor, status FROM dev.processor_results WHERE file_uuid='$UUID' ORDER BY id"
echo "Done: $UUID"