feat: ASRX hybrid pipeline, identity history, worker fixes, checkpoint system

Accusys
2026-06-02 07:13:23 +08:00
parent e3066c3f49
commit e1572907ae
198 changed files with 43705 additions and 8910 deletions

@@ -0,0 +1,68 @@
# TMDb Pipeline Test 2026-05-17
## Purpose
Verify full TMDb enrichment pipeline: register → process → TMDb prefetch → probe → identity files → downloads.
## Environment
- **Server**: playground (port 3003)
- **Schema**: `dev`
- **TMDB_API_KEY**: `e9cde52197f6f8df4d9db99da93db1fb`
- **Build**: `momentry_playground` (debug, 0 errors)
## Pre-cleanup
Unregistered the old files and deleted their output files:
```bash
POST /api/v1/unregister {"file_uuid": "3abeee81..."}
POST /api/v1/unregister {"file_uuid": "23b1c872..."}
```
## Step 1: Register
| File | UUID | Result |
|------|------|--------|
| Charade main | `bd80fec92b0b6963d177a2c55bf713e2` | ✅ Registered (already_exists due to content_hash match) |
| Charade YouTube | `a6fb22eebefaef17e62af874997c5944` | ✅ Fresh registration |
Register phase completed: probe → CUT → scene classification.
## Step 2: Trigger Processing
```bash
POST /api/v1/file/:uuid/process {}
```
Jobs created:
- Main: job_id=167, status=PENDING
- YouTube: job_id=168, status=PENDING
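The worker was blocked by two schema issues: `processor_results` was missing the `retry_count` column, and a `jsonb_set(text, text, jsonb)` call failed with a signature mismatch. The `retry_count` column was added via ALTER TABLE; the `jsonb_set` mismatch is listed under Known Issues.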
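The `retry_count` fix can be sketched as a small shell helper that prints the DDL. The exact statement that was run is not recorded here, so the column type and default (`INTEGER NOT NULL DEFAULT 0`), the helper name `retry_count_ddl`, and the `DATABASE_URL` variable are assumptions; the `dev` schema is taken from the Environment section.

```bash
# Sketch only: prints the DDL that adds the missing column.
# Assumptions: the column type and default are guesses; the schema defaults to `dev`.
retry_count_ddl() {
  schema="${1:-dev}"
  printf 'ALTER TABLE %s.processor_results ADD COLUMN IF NOT EXISTS retry_count INTEGER NOT NULL DEFAULT 0;\n' "$schema"
}

# Usage (hypothetical connection string in $DATABASE_URL):
#   retry_count_ddl dev | psql "$DATABASE_URL"
```

Using `ADD COLUMN IF NOT EXISTS` keeps the statement safe to re-run against a schema that already has the column.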
## Step 3: TMDb Prefetch (requires pipeline completion first)
```bash
POST /api/v1/agents/tmdb/prefetch
```
## Step 4: TMDb Probe
```bash
POST /api/v1/file/:uuid/tmdb-probe
```
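Steps 2 through 4 can be sketched as one script that issues the three POST requests in order. This is a dry-run sketch, not the test harness actually used: the `localhost` host, the helper names `post` and `run_tmdb_steps`, and the `DRY_RUN` switch are assumptions, and only the Step 2 request is given a body because `{}` is the only body shown above.

```bash
# Sketch only: prints (DRY_RUN=1, the default) or sends the POST requests for Steps 2-4.
# Assumptions: server reachable at localhost:3003; helper names are hypothetical.
BASE_URL="${BASE_URL:-http://localhost:3003}"
DRY_RUN="${DRY_RUN:-1}"

post() {
  path="$1"
  body="${2:-}"
  if [ "$DRY_RUN" = "1" ]; then
    # Dry run: show the request instead of sending it.
    printf 'POST %s%s%s\n' "$BASE_URL" "$path" "${body:+ $body}"
  elif [ -n "$body" ]; then
    curl -sS -X POST "${BASE_URL}${path}" -H 'Content-Type: application/json' -d "$body"
  else
    curl -sS -X POST "${BASE_URL}${path}"
  fi
}

run_tmdb_steps() {
  uuid="$1"
  post "/api/v1/file/${uuid}/process" '{}'   # Step 2: queue processing
  # Step 3 requires the pipeline to finish first; waiting is left to the caller.
  post "/api/v1/agents/tmdb/prefetch"        # Step 3: TMDb prefetch
  post "/api/v1/file/${uuid}/tmdb-probe"     # Step 4: TMDb probe
}

# Usage: run_tmdb_steps a6fb22eebefaef17e62af874997c5944
```

Setting `DRY_RUN=0` sends the requests with `curl`. The worker must already have finished the processing job before the prefetch call is meaningful.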
## Known Issues
1. `jsonb_set(jsonb, text, jsonb)` should be `jsonb_set(jsonb, text[], jsonb)`: the path argument must be `text[]`. This is a pre-existing worker bug.
2. `processor_results.retry_count` column was missing; fixed via ALTER TABLE.
3. The worker must run as a separate process: `./target/debug/momentry_playground worker`
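Issue 1 comes down to the type of the path argument: PostgreSQL's `jsonb_set` takes the path as a `text[]` array, not a single `text` value. A minimal sketch of a corrected call, printed from a shell helper, might look like the following. The worker's actual query is not shown here, so the column name `result`, the key `status`, the value `done`, and the `id` filter are all hypothetical.

```bash
# Sketch only: prints an UPDATE whose jsonb_set path is a text[] array.
# Hypothetical names: column `result`, key `status`, filter column `id`.
jsonb_set_fix_sql() {
  cat <<'SQL'
UPDATE processor_results
SET result = jsonb_set(result, ARRAY['status']::text[], to_jsonb('done'::text))
WHERE id = 1;
SQL
}

# Usage (hypothetical connection string in $DATABASE_URL):
#   jsonb_set_fix_sql | psql "$DATABASE_URL"
```

When the path is bound as a query parameter, an explicit `::text[]` cast on the parameter serves the same purpose.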
## Endpoint Changes in This Test
| Endpoint | Status |
|----------|--------|
| `GET /api/v1/stats/ingest` | ❌ Removed (stats moved to files/scan + identities) |
| `GET /api/v1/files/scan` | Added `total_chunks`, `searchable_chunks`, `pending_videos` |
| `GET /api/v1/identities` | Added `total_identities`, `tmdb_identities`, `auto_identities` |
| `POST /api/v1/agents/tmdb/prefetch` | ✅ Writes identity files directly |
| `POST /api/v1/file/:uuid/tmdb-probe` | ✅ Upserts from disk identity files |
| `GET /api/v1/identity/:uuid/json` | ✅ Download identity JSON |
| `GET /api/v1/file/:uuid/json/:processor` | ✅ Download processor JSON |
| `POST /api/v1/agents/identity/match-from-photo` | 🆕 New |
| `POST /api/v1/agents/identity/match-from-trace` | 🆕 New |