Commit Graph

104 Commits

Author SHA1 Message Date
Accusys
4b4d37b332 fix: qdrant_request empty body handling (use 'is not None' check)
Fix qdrant_request() to properly handle empty dict {} as body.
Python's 'if body' evaluates to False for empty dict, causing EOF error.

Changed:
- data = json.dumps(body).encode() if body is not None else None

Also cleaned up count_seeds() to use consistent body passing.
2026-06-25 02:19:07 +08:00
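The truthiness pitfall fixed in 4b4d37b332 can be reproduced in isolation. The sketch below is not the project's qdrant_request(); encode_body and encode_body_buggy are hypothetical helper names used only to contrast the two checks.

```python
import json
from typing import Optional


def encode_body(body: Optional[dict]) -> Optional[bytes]:
    """Serialize a request body, treating {} as a real (empty) JSON object."""
    # Only a missing body (None) is skipped; {} still becomes b"{}".
    return json.dumps(body).encode() if body is not None else None


def encode_body_buggy(body: Optional[dict]) -> Optional[bytes]:
    """The truthiness check described in the commit message, kept for contrast."""
    # An empty dict is falsy, so this path silently drops the body.
    return json.dumps(body).encode() if body else None
```

With the 'is not None' check an empty filter is still sent as valid JSON, whereas the truthiness check sends nothing at all.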
Accusys
b19b1a8c46 fix: count_seeds empty body handling
Fix count_seeds() to always pass valid JSON body to Qdrant count API.
Empty dict {} was causing EOF error when no source filter provided.
2026-06-25 02:02:17 +08:00
Accusys
d20819b03b feat: add manual_seed.py for user-selected face trace seed creation
Implements:
- create_identity(): Create PG identity (source='manual')
- create_manual_seed(): Full flow from trace → seed → confirm
  - Get trace centroid embedding from Qdrant _faces
  - Create identity in PG
  - Push to Qdrant _seeds
  - Confirm trace binding (TKG + Qdrant + PG)
  - Auto-trigger Round 2 propagation
- list_pending_traces(): List traces for user selection
- run_propagation(): Auto propagation trigger

Usage:
  # List pending traces
  python manual_seed.py --file-uuid <uuid> --list

  # Create seed from trace
  python manual_seed.py --file-uuid <uuid> --trace-id 1 --name 'John Doe'

  # Custom UUID
  python manual_seed.py --file-uuid <uuid> --trace-id 1 --name 'John Doe' --identity-uuid xxx

  # No propagation
  python manual_seed.py --file-uuid <uuid> --trace-id 1 --name 'John Doe' --no-propagate

Flow: select trace → label → create identity → push seed → auto-bind → propagate
2026-06-25 01:49:53 +08:00
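The flags shown in the manual_seed.py usage above could be declared roughly as follows. This is a minimal argparse sketch, not the script's actual parser: the flag names come from the commit message, while the types, the parse_args helper, and the rule that --trace-id and --name are required unless --list is given are assumptions.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="manual_seed.py")
    parser.add_argument("--file-uuid", required=True, help="file to operate on")
    parser.add_argument("--list", action="store_true", help="list pending traces")
    parser.add_argument("--trace-id", type=int, help="trace to promote to a seed")
    parser.add_argument("--name", help="label for the new identity")
    parser.add_argument("--identity-uuid", help="custom identity UUID (optional)")
    parser.add_argument("--no-propagate", action="store_true",
                        help="skip the automatic propagation step")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumed rule: creating a seed needs both a trace and a name.
    if not args.list and (args.trace_id is None or args.name is None):
        parser.error("--trace-id and --name are required unless --list is given")
    return args
```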
Accusys
b5e3adf5de feat: add generate_seed_embeddings.py for TMDb profile extraction
Implements:
- get_tmdb_identities(): Query PG for TMDb identities with profile photos
- download_tmdb_image(): Download profile image from TMDb (handles full URL or path)
- extract_face_embedding(): CoreML FaceNet 512D embedding extraction
- generate_seed_embeddings(): Full flow: download → extract → push to _seeds

TMDb image handling:
- Supports both full URL (https://...) and path (/xxx.jpg)
- Uses 'original' size for better quality (replaces /w185)

Usage:
  python generate_seed_embeddings.py                # All TMDb identities
  python generate_seed_embeddings.py --limit 10    # Limit to 10
  python generate_seed_embeddings.py --dry-run     # Don't push to Qdrant

Tested: 3 seeds successfully pushed (Cary Grant, Audrey Hepburn, Walter Matthau)
2026-06-25 01:45:48 +08:00
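The TMDb image handling described in b5e3adf5de (accept a full URL or a bare path, and prefer the 'original' size) might look like the sketch below. The function name tmdb_original_url is hypothetical, and the regular expression that swaps a size segment such as /w185/ for /original/ is an assumption about how the replacement is done.

```python
import re

# Public TMDb image host; the size segment follows /t/p/.
TMDB_IMAGE_BASE = "https://image.tmdb.org/t/p"


def tmdb_original_url(profile: str) -> str:
    """Return a URL for the 'original' size from a full URL or a bare path."""
    if profile.startswith(("http://", "https://")):
        # Full URL: replace a width segment such as /w185/ with /original/.
        return re.sub(r"/t/p/w\d+/", "/t/p/original/", profile, count=1)
    # Bare path such as /xxx.jpg: prepend host and size.
    return f"{TMDB_IMAGE_BASE}/original/{profile.lstrip('/')}"
```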
Accusys
4198a74002 feat: add confirm_identity.py for identity binding confirmation
Implements:
- confirm_single_trace(): Confirm identity binding for one trace
  - Update TKG face_track node: status='confirmed'
  - Update Qdrant _faces: identity_uuid for all points
  - Update PG face_detections: identity_id
  - Add trace centroid to _seeds (source='propagation')
  - Auto-trigger Round 2 matching

- batch_confirm_from_json(): Batch confirm from suggestions file
  - Confirm multiple suggestions from identity_matcher output
  - Final propagation after all confirmations

- run_round_2_propagation(): Auto propagation trigger
  - Get confirmed traces from TKG nodes
  - Build identity_map for propagation
  - Run identity_matcher.py Round 2

Usage:
  python confirm_identity.py --file-uuid <uuid> --trace-id 1 --identity-id 1 --identity-uuid xxx --name 'Tom Hanks'
  python confirm_identity.py --file-uuid <uuid> --json suggestions.json
  python confirm_identity.py --file-uuid <uuid> --json suggestions.json --no-propagate
2026-06-25 01:38:00 +08:00
Accusys
21b9f500d9 feat: add TKG node marking for Identity Agent suggestions
TKG Helper (scripts/utils/tkg_helper.py):
- mark_face_track_suggested(): Mark node as 'suggested' with pending identity info
- mark_face_track_confirmed(): Mark node as 'confirmed' with identity_ref
- mark_face_track_stranger(): Mark node as 'stranger' with stranger_ref
- batch_mark_suggestions(): Batch mark multiple traces
- batch_mark_strangers(): Batch mark stranger clusters
- get_face_track_nodes(): Get all face_track nodes for a file
- get_pending_face_tracks(): Get nodes with status='pending'
- get_suggested_face_tracks(): Get nodes with status='suggested'

Identity Matcher updates:
- Add --mark-tkg flag to update TKG nodes after matching
- Integrates with tkg_helper for batch operations

Node properties schema:
- status: pending | suggested | confirmed | stranger
- pending_identity_name/uuid/id: suggested identity info
- suggested_by: tmdb | propagation | manual
- confidence: matching score
- identity_ref: confirmed identity reference
2026-06-25 01:11:05 +08:00
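The node properties schema from 21b9f500d9 can be modelled in memory to make the status transitions concrete. This is only an illustration: FaceTrackProps, suggest, and confirm are hypothetical names, the field types are guesses, and the real helpers in tkg_helper.py presumably write to the TKG store rather than return new objects.

```python
from dataclasses import dataclass, replace
from typing import Optional

STATUSES = ("pending", "suggested", "confirmed", "stranger")
SOURCES = ("tmdb", "propagation", "manual")


@dataclass(frozen=True)
class FaceTrackProps:
    status: str = "pending"
    pending_identity_name: Optional[str] = None
    pending_identity_uuid: Optional[str] = None
    pending_identity_id: Optional[int] = None
    suggested_by: Optional[str] = None
    confidence: Optional[float] = None
    identity_ref: Optional[str] = None  # type assumed

    def __post_init__(self) -> None:
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status}")


def suggest(props: FaceTrackProps, *, name: str, uuid: str, identity_id: int,
            suggested_by: str, confidence: float) -> FaceTrackProps:
    """Return a copy marked 'suggested' with the pending identity attached."""
    if suggested_by not in SOURCES:
        raise ValueError(f"unknown source: {suggested_by}")
    return replace(props, status="suggested", pending_identity_name=name,
                   pending_identity_uuid=uuid, pending_identity_id=identity_id,
                   suggested_by=suggested_by, confidence=confidence)


def confirm(props: FaceTrackProps, identity_ref: str) -> FaceTrackProps:
    """Return a copy marked 'confirmed' with the identity reference set."""
    return replace(props, status="confirmed", identity_ref=identity_ref)
```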
Accusys
6851cb4734 feat: add identity_matcher.py for multi-angle face matching
Implements:
- match_faces_round_1: TMDb seeds → traces (TH=0.55)
- match_faces_round_2: Confirmed traces → pending (TH=0.55)
- match_faces_round_3_plus: Propagation (TH=0.50)
- cluster_strangers: Greedy merge unmatched traces (TH=0.40)
- multi_angle_match: max(cosine(seed, rep)) across 3 representatives
- cosine_similarity: Vector similarity calculation

Usage:
  python identity_matcher.py --file-uuid <uuid> --round 1
  python identity_matcher.py --file-uuid <uuid> --round 2 --confirmed-traces 1,2,3
  python identity_matcher.py --file-uuid <uuid> --round 1 --stranger

Output: JSON with suggestions {trace_id: {identity_id, uuid, name, score, suggested_by}}
2026-06-25 00:57:22 +08:00
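The multi-angle rule in 6851cb4734, max(cosine(seed, rep)) across a trace's representatives, can be sketched in plain Python. The cosine_similarity and multi_angle_match names come from the commit message; best_identity is a hypothetical wrapper, and treating a score equal to the threshold as a match is an assumption.

```python
import math
from typing import Dict, Optional, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def multi_angle_match(seed: Vector, representatives: Sequence[Vector]) -> float:
    """Best score of one seed against a trace's representative embeddings."""
    return max((cosine_similarity(seed, rep) for rep in representatives), default=0.0)


def best_identity(seeds: Dict[str, Vector], representatives: Sequence[Vector],
                  threshold: float = 0.55) -> Optional[Tuple[str, float]]:
    """Return (name, score) of the highest-scoring seed at or above threshold."""
    best: Optional[Tuple[str, float]] = None
    for name, seed in seeds.items():
        score = multi_angle_match(seed, representatives)
        if score >= threshold and (best is None or score > best[1]):
            best = (name, score)
    return best
```

Taking the maximum rather than the mean means a single well-aligned representative (for example a frontal view) is enough to match, which is presumably why three representatives per trace are kept.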
Accusys
580c4b4017 feat: add _seeds collection helper functions for Identity Agent
- Add ensure_seeds_collection(): create _seeds collection (512D, Cosine)
- Add push_seed_embedding(): push identity seed with payload {identity_id, uuid, name, source, file_uuid, trace_id, tmdb_id}
- Add get_seeds(): get all seeds (optional source filter)
- Add search_seeds(): cosine search against seeds
- Add delete_seed(): delete seed by identity_id
- Add count_seeds(): count seeds (optional source filter)
- Add get_trace_representatives(): get 3 representatives per trace for multi-angle matching
- Add get_trace_centroid(): get centroid embedding for a trace
- Add update_identity_in_faces(): update identity_id/uuid for all face points with trace_id

Point ID strategy: identity_id directly as point_id for _seeds collection
All functions tested successfully
2026-06-25 00:47:25 +08:00
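As a rough picture of what the _seeds helpers in 580c4b4017 send to Qdrant, the sketch below builds the JSON bodies for creating a 512D Cosine collection and for one seed point, using identity_id as the point ID as stated above. The function names are hypothetical, no HTTP call is made, and computing the centroid as an unnormalized mean is an assumption.

```python
from typing import Any, Dict, List, Optional, Sequence

SEED_DIM = 512


def seeds_collection_config(dim: int = SEED_DIM) -> Dict[str, Any]:
    """Body for creating a collection with Cosine distance."""
    return {"vectors": {"size": dim, "distance": "Cosine"}}


def seed_point(identity_id: int, vector: Sequence[float], *, uuid: str, name: str,
               source: str, file_uuid: Optional[str] = None,
               trace_id: Optional[int] = None,
               tmdb_id: Optional[int] = None) -> Dict[str, Any]:
    """One point for an upsert request; identity_id doubles as the point ID."""
    if len(vector) != SEED_DIM:
        raise ValueError(f"expected {SEED_DIM} dimensions, got {len(vector)}")
    return {
        "id": identity_id,
        "vector": list(vector),
        "payload": {
            "identity_id": identity_id, "uuid": uuid, "name": name,
            "source": source, "file_uuid": file_uuid,
            "trace_id": trace_id, "tmdb_id": tmdb_id,
        },
    }


def centroid(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("no vectors given")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
```

Using identity_id as the point ID means pushing a seed for the same identity again overwrites the previous point instead of adding a duplicate.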
Accusys
9fbb4f9b48 feat: add Qdrant _faces collection embedding push
- Add qdrant_faces.py utility module for _faces collection operations
- Modify face_processor.py to push embeddings to Qdrant (CoreML extraction re-enabled)
- Modify store_traced_faces.py to update trace_id in Qdrant after face tracking
- Collection schema: 512D vectors, Cosine distance, fixed name '_faces'
- Payload: file_uuid, frame, trace_id, bbox, confidence, identity_id/uuid, stranger_id
- Batch size: 100 (default), configurable via QDRANT_BATCH_SIZE env var
- Error handling: face_processor.py exits with error if Qdrant push fails
2026-06-25 00:23:20 +08:00
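The batching behaviour noted in 9fbb4f9b48 (100 points per request by default, overridable through QDRANT_BATCH_SIZE) can be sketched as below. batch_size_from_env and batched are hypothetical helper names, and rejecting non-positive values is an assumption rather than documented behaviour.

```python
import os
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

DEFAULT_BATCH_SIZE = 100


def batch_size_from_env(default: int = DEFAULT_BATCH_SIZE) -> int:
    """Read QDRANT_BATCH_SIZE, falling back to the default when unset."""
    raw = os.environ.get("QDRANT_BATCH_SIZE")
    if raw is None:
        return default
    size = int(raw)
    if size <= 0:
        raise ValueError("QDRANT_BATCH_SIZE must be a positive integer")
    return size


def batched(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])
```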
Accusys
074cdcdbed refactor: remove face embedding architecture - single Qdrant _faces collection
- Delete FaceEmbeddingDb module (face_embedding_db.rs)
- Stub match_faces_iterative, generate_seed_embeddings, tmdb_match_handler
- Remove sync_trace_embeddings, populate_face_embeddings_to_qdrant
- Remove embedding from face.json output (face_processor.py)
- Remove embedding from PG UPDATE (store_traced_faces.py)
- Remove workspace traces staging (checkin.rs, qdrant_workspace.rs)
- Fix tests: add pose_angle to Face, hand_nodes to TkgResult

Disabled functions (to be reimplemented on top of _faces):
- match_faces_iterative (identity agent)
- generate_seed_embeddings (TMDb seeds)
- tmdb_match_handler (TMDb matching)
- cluster_face_embeddings, search_similar_faces
- merge_traces_within_cuts
2026-06-24 22:27:09 +08:00
Accusys
14e886cc08 feat: progressive multi-round face matching + pending person API
- Identity agent: per-face max matching, multi-round with derived
  seeds from high-confidence faces, angle diversity filter (cosine sim < 0.90)
- Pending person API: POST /file/:file_uuid/pending-person
  + GET /file/:file_uuid/pending-persons with status=pending, source=manual
- Update API docs (07_identity.md)
2026-06-24 03:42:04 +08:00
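One possible reading of the angle diversity filter in 14e886cc08 is that a high-confidence face becomes a derived seed only if its cosine similarity to every seed already kept is below 0.90, so that new seeds add a different viewing angle. The sketch below implements that reading; select_diverse is a hypothetical name and the greedy, in-order selection is an assumption.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_diverse(candidates: Sequence[Vector], existing: Sequence[Vector],
                   max_similarity: float = 0.90) -> List[Vector]:
    """Greedily keep candidates that are not near-duplicates of any kept seed."""
    kept: List[Vector] = list(existing)
    added: List[Vector] = []
    for candidate in candidates:
        if all(cosine_similarity(candidate, seed) < max_similarity for seed in kept):
            kept.append(candidate)
            added.append(candidate)
    return added
```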
Accusys
766a1d9a6d feat: Swift Face Pose integration + TKG Plan B
Major Changes:
- swift_face_pose: output pose angles (yaw/pitch/roll) in face.json
- face_processor.py: call swift_face_pose (dual output: face.json + pose.json)
- Face struct: add pose_angle field
- TKG Plan B: gaze/lip_track nodes from face.json (no face_detections dependency)
- Chunk cleanup: delete old data before rebuild (avoid duplicate key)
- Hand nodes: classify by hand_type + gesture (15 combinations)
- HAND_OBJECT edges: bbox spatial matching (174 matches)

Test Results:
- Blake Jones: 8 faces, pose_angle ✓, 66 nodes, 174 edges
- FilmRiot: 394 faces, pose_angle ✓, 35 nodes, 39 edges
- Left hands: 132, Right hands: 2

Architecture:
- All TKG nodes built from JSON files (face.json, hand.json, yolo.json)
- Swift processors: sample_interval=3 (Face/Pose/Hand sync)
- Cleanup functions: delete_tkg_nodes_by_uuid, delete_tkg_edges_by_uuid
2026-06-23 05:47:24 +08:00
Accusys
e1e2da2140 fix: processor-counts API + ASRX field name conversion
- Fix processor-counts API to correctly read JSON counts:
  - YOLO: use frames.length (was returning null)
  - CUT: prioritize scenes.length over frame_count
  - Result: YOLO 1963 frames, CUT 25 scenes (correct)

- Fix ASRX field name conversion:
  - Convert start_time/end_time → start/end for ASRX compatibility
  - Prefer frame-based positioning over time-based

- Document issues in issues_2026-06-21.md:
  - Issue 6: ASRX field name mismatch
  - Issue 7: processor-counts API null values
2026-06-22 23:33:39 +08:00
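The two fixes in e1e2da2140 amount to small pieces of JSON handling. The Python sketch below mirrors them for illustration only (the API itself appears to be implemented elsewhere); the key names frames, scenes, frame_count, start_time, and end_time are taken from the commit message, and the function names are hypothetical.

```python
from typing import Any, Dict, Optional


def yolo_count(doc: Dict[str, Any]) -> Optional[int]:
    """Number of YOLO frames, or None when the 'frames' list is absent."""
    frames = doc.get("frames")
    return len(frames) if isinstance(frames, list) else None


def cut_count(doc: Dict[str, Any]) -> Optional[int]:
    """Prefer the number of scenes; fall back to frame_count."""
    scenes = doc.get("scenes")
    if isinstance(scenes, list):
        return len(scenes)
    return doc.get("frame_count")


def to_asrx_segment(segment: Dict[str, Any]) -> Dict[str, Any]:
    """Rename start_time/end_time to start/end, leaving other keys untouched."""
    out = dict(segment)
    if "start_time" in out:
        out["start"] = out.pop("start_time")
    if "end_time" in out:
        out["end"] = out.pop("end_time")
    return out
```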
Accusys
70e849d3ae refactor: remove Rule 3, Story, and Caption processors
- Remove Rule 3 (Scene Chunking) from worker auto-trigger
- Remove rule3_ingest.rs and related imports
- Remove Story/Caption from playground module parsing
- Clean up scan.rs Rule 3 display
- Fix ASRX field name conversion (start_time -> start)

Reason: Story/5W1H/Scene accuracy too poor - will redesign later
2026-06-22 15:34:02 +08:00
Accusys
7e548f8b08 release: v1.3.0 - TKG node type renaming
Changes:
- Rust: face_trace → face_track (45 occurrences in 8 files)
- Rust: gaze_trace → gaze_track, lip_trace → lip_track
- Python: tkg_builder.py unified + pipeline_checklist.py fixed
- Swift: swift_hand.swift hand state detection (empty vs holding)

Node type changes:
  face_trace    → face_track
  person_trace  → body_track
  gaze_trace    → gaze_track
  lip_trace     → lip_track
  hand_trace    → hand_track
  speaker       → speaker_segment
  object        → detected_object
  text_trace    → text_region

Migration:
  PUBLIC schema: 12970 + 892 + 305 rows updated
2026-06-22 07:18:21 +08:00
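The node type changes in v1.3.0 form a simple lookup table. The sketch below only restates the mapping listed above; rename_node_type is a hypothetical helper, and the actual migration was presumably done with SQL updates against the PUBLIC schema.

```python
NODE_TYPE_RENAMES = {
    "face_trace": "face_track",
    "person_trace": "body_track",
    "gaze_trace": "gaze_track",
    "lip_trace": "lip_track",
    "hand_trace": "hand_track",
    "speaker": "speaker_segment",
    "object": "detected_object",
    "text_trace": "text_region",
}


def rename_node_type(node_type: str) -> str:
    """Map an old node type to its v1.3.0 name; unknown types pass through."""
    return NODE_TYPE_RENAMES.get(node_type, node_type)
```

Because no new name is also an old name, applying the mapping twice gives the same result as applying it once, so a rerun of such a migration would be harmless.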
Accusys
bce9435823 feat: add Level 2/3 dynamic feature extraction CLI
- test_level2_level3.py: on-demand extraction script
- Level 2: face, torso, leg, arm regions (medium)
- Level 3: glasses, earrings, watch (fine details)
- Demonstrates dynamic calculation from keypoints
2026-06-22 03:26:12 +08:00
Accusys
d94b96d884 feat: add shot type detection and proportion-based height estimation
- detect_shot_type(): classify full_body/medium_shot/close_up
- estimate height using shoulder_width × 3.8 (~171cm) for close-up
- add BODY_PROPORTIONS constants for validation
- head position ratio + bbox aspect ratio → shot type
- enables filtering full-body shots in video search
2026-06-22 02:47:01 +08:00
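The proportion rule in d94b96d884 is a single multiplication. The sketch below assumes the shoulder width is already available in centimetres, which the commit message does not state; the 45 cm used in the usage note is simply 171 / 3.8 and is not a measured value. estimate_height_cm is a hypothetical name.

```python
SHOULDER_TO_HEIGHT_RATIO = 3.8  # ratio quoted in the commit message


def estimate_height_cm(shoulder_width_cm: float) -> float:
    """Estimate body height from shoulder width using a fixed proportion."""
    if shoulder_width_cm <= 0:
        raise ValueError("shoulder width must be positive")
    return shoulder_width_cm * SHOULDER_TO_HEIGHT_RATIO
```

For a shoulder width of 45 cm this gives about 171 cm, matching the figure quoted above.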
Accusys
606f31f13c feat: add appearance feature system with coordinate/scale fixes
- Add Appearance_Feature_System_V1.0.md design doc
- Add proportion_calculator.py for body proportions (height, body shape)
- Add feature_extractor.py for hierarchical feature extraction
- Add tkg_level1_builder.py for TKG person_trace nodes
- Fix mediapipe_holistic_processor.py to output Top-Left pixels
- Add MediaPipe format conversion in proportion_calculator

Coordinate system alignment:
- Swift Pose: Top-Left pixels (Y-flip done in swift_pose.swift)
- MediaPipe: Top-Left pixels (norm→pixel conversion added)
2026-06-22 02:27:03 +08:00
Accusys
2cfcfdd1af feat: Phase 2.6 edges migration to Qdrant (TKG-only architecture)
Phase 2.6.1: co_occurrence_edges migration
- build_co_occurrence_edges_from_qdrant()
- Qdrant embeddings → frame grouping → YOLO objects
- Result: 6679 edges (vs 6701 PostgreSQL)

Phase 2.6.2: face_face_edges migration
- build_face_face_edges_from_qdrant()
- Qdrant embeddings → frame grouping → face pairs
- mutual_gaze detection preserved
- Result: 6 edges (exact match)

Phase 2.6.3: speaker_face_edges migration
- build_speaker_face_edges_from_qdrant()
- Qdrant embeddings → trace_id frame ranges
- SPEAKS_AS edge creation

Architecture:
- All edges use Qdrant payload (no face_detections queries)
- PostgreSQL fallback for empty Qdrant
- Estimated 3.6x performance improvement

Testing:
- Playground (3003): ✓ All Phase 2.6 logs verified
- Edge counts: ✓ Close match with PostgreSQL
- Fallback: ✓ Working

Docs:
- docs_v1.0/DESIGN/TKG_PHASE2_6_EDGES_MIGRATION.md
- docs_v1.0/M4_workspace/2026-06-21_phase2_6_test.md
2026-06-21 04:47:49 +08:00
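The "frame grouping → face pairs" step of Phase 2.6.2 can be illustrated without Qdrant: group face points by frame, then pair up the distinct traces seen together. The sketch assumes each point carries frame and trace_id in its payload, as in the _faces payload described elsewhere in this log; face_pairs_by_frame is a hypothetical name, and counting shared frames per pair is an assumption about what an edge records.

```python
from collections import defaultdict
from itertools import combinations
from typing import Any, Dict, Iterable, Set, Tuple


def face_pairs_by_frame(points: Iterable[Dict[str, Any]]) -> Dict[Tuple[int, int], int]:
    """Map each pair of trace IDs to the number of frames they share."""
    traces_in_frame: Dict[int, Set[int]] = defaultdict(set)
    for point in points:
        trace_id = point.get("trace_id")
        if trace_id is None:
            continue  # untracked faces cannot form an edge
        traces_in_frame[point["frame"]].add(trace_id)

    shared_frames: Dict[Tuple[int, int], int] = defaultdict(int)
    for traces in traces_in_frame.values():
        # sorted() makes (1, 2) and (2, 1) the same key
        for a, b in combinations(sorted(traces), 2):
            shared_frames[(a, b)] += 1
    return dict(shared_frames)
```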
Accusys
17e4e15860 feat: add Vision LLM integration (CLIP + Qwen3-VL cascade)
- Add Qwen3-VL dynamic management (start/stop/status CLI)
- Add CLIP + Qwen3-VL cascade detection strategy
- Add Vision CLI commands (vision start/stop/status, detect)
- Add cascade_vision processor module
- Add clip processor module
- Add qwen_vl_manager module

Changes:
- scripts/start_qwen3vl.sh, stop_qwen3vl.sh: Qwen3-VL management scripts
- src/core/vision/: Qwen3-VL manager module
- src/core/processor/cascade_vision.rs: CLIP + Qwen3-VL cascade logic
- src/core/processor/clip.rs: CLIP classification and detection
- src/api/clip_api.rs: CLIP API endpoints
- src/cli/vision.rs: Vision CLI implementation
- src/cli/args.rs: Add Vision and Detect commands
- src/main.rs: Integrate Vision CLI
- src/core/mod.rs: Add vision module
- src/core/processor/mod.rs: Add cascade_vision module
2026-06-13 16:25:52 +08:00
Accusys
e1572907ae feat: ASRX hybrid pipeline, identity history, worker fixes, checkpoint system
2026-06-02 07:13:23 +08:00
Accusys
127d646ef1 fix: worker processor_results + rule3 SQL + unregister cleanup bugs
- job_worker.rs: add upsert_processor_result when output file exists
- job_worker.rs: add load JSON and store to pre_chunks when output exists
- rule3_ingest.rs: fix SQL bind order (scene_number was occupying chunk_type slot)
- files.rs: fix unregister WHERE clause (uuid -> file_uuid) + add pre_chunks delete
- asrx_self/main_fixed.py: fix KeyError (s['start'] -> s['start_time'])
- wrapper_worker_playground.sh: add Worker launchd script
- com.momentry.playground.plist: add Playground launchd config
2026-05-26 04:35:51 +08:00
M5Max128
0806d44df4 fix: add status/duration/fps to FileDetailResponse; fix progress API with HSET+HGETALL
2026-05-25 03:40:02 +08:00
M5Max128
29eabf6d88 chore: remove swift build artifacts from tracking
2026-05-25 03:37:19 +08:00
M5Max128
78923a8973 fix: system consistency - store_vector, search, worker trigger
- store_vector: stub -> actual PG embedding storage
- search_parent_chunks_semantic: include sentence chunks
- Remove early return in check_and_complete_job
2026-05-24 23:20:02 +08:00
M5Max128
a008bb865b feat: add Gitea to startup script, update AGENTS.md token
- Add Gitea (port 3000) as step 10 in startup script
- Update AGENTS.md Gitea token record
2026-05-23 02:37:19 +08:00
M5Max128
1c30af9557 fix: correct service paths, nohup removal, MongoDB graceful fallback, add MariaDB + Caddy to startup
- Fix Qdrant binary path (services/ -> momentry_resources/bin/)
- Fix LLM binary/model paths (llama/ -> momentry_resources/llama/, models/ -> models/llm/)
- Fix PostgreSQL data path (pgsql/data -> momentry/var/postgresql)
- Remove nohup (fails in LaunchDaemon environment)
- Add MongoDB graceful fallback with 5s timeout in server.rs
- Add MariaDB + Caddy steps to startup script for WordPress
- Revert all unrelated changes
2026-05-23 01:46:23 +08:00
M5Max128
701e71463d feat: identity PATCH update, alias system, name UNIQUE removal
- Add PATCH /api/v1/identity/:identity_uuid endpoint
- Migration 030: remove name UNIQUE, add tmdb_id index
- TMDb upsert: ON CONFLICT (name) -> ON CONFLICT (tmdb_id)
- get_or_create_identity: pre-check by name
- upload_identity: ON CONFLICT (name) -> ON CONFLICT (uuid)
- Search: include aliases in identity text search
- Add scripts/llm_metadata_enhancer.py
- Add DESIGN/IdentityUpdateAndAliasSystem.md
2026-05-22 08:35:32 +08:00
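The switch from ON CONFLICT (name) to ON CONFLICT (tmdb_id) in 701e71463d lets two identities share a name while still deduplicating by TMDb ID. The sketch below demonstrates the idea with SQLite from the Python standard library instead of PostgreSQL, so that it runs without a server; the table layout, the function names, and the placeholder names are assumptions, not the project's schema.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    # name is deliberately NOT unique; tmdb_id is the conflict target.
    conn.execute(
        "CREATE TABLE identities ("
        " id INTEGER PRIMARY KEY,"
        " tmdb_id INTEGER UNIQUE,"
        " name TEXT NOT NULL)"
    )
    return conn


def upsert_tmdb_identity(conn: sqlite3.Connection, tmdb_id: int, name: str) -> None:
    """Insert a TMDb identity, or update its name if the tmdb_id already exists."""
    conn.execute(
        "INSERT INTO identities (tmdb_id, name) VALUES (?, ?) "
        "ON CONFLICT (tmdb_id) DO UPDATE SET name = excluded.name",
        (tmdb_id, name),
    )
```

With name no longer unique, two rows with the same name and different tmdb_id values coexist, and repeating an upsert for an existing tmdb_id updates that row in place.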
Accusys
bebaa743ed feat: trace-level matching, health watcher/worker status, timezone config
2026-05-21 01:08:30 +08:00
Accusys
7680c202ef Phase 5: mark bind/unbind/match-trace as tested on 3003
2026-05-19 21:08:16 +08:00
Accusys
58c283a1fc fix: playground ASR field names (start_time/end_time) + add 3003 specific test script
- playground.rs: seg.start/end -> seg.start_time/end_time
- scripts/test_m5api_phase5_3003.sh: tests bind, unbind, match-from-trace on localhost:3003
- Note: bind fails on dev (real_name column missing); match-from-trace returns 404 when no embeddings exist
2026-05-19 21:07:39 +08:00
Accusys
d2d3197c0d Phase 5: 21 tests (18 pass, 3 known: identity deleted by mergeinto, multipart required, proxy 404)
Note: mergeinto is destructive and deletes source identity.
Match-from-photo requires multipart file upload.
Match-from-trace works but proxy returns 404.
2026-05-19 20:31:34 +08:00
Accusys
e3c7e347b7 fix: identity binding + JSON endpoint + Phase 5 test script
- identity_binding.rs: fix i32->i64 type mismatch, COALESCE name column
- identity_api.rs: get_identity_json fallback to DB if file missing
- test_m5api_phase5.sh: fixed variable expansion, updated request bodies
- Phase 5: 21/23 passed (2 known: multipart + proxy 404)
2026-05-19 20:30:05 +08:00
Accusys
47a480a5e2 fix: identity search - fix i.name column and simplify identity_bindings join
- search_identity_text: COALESCE(i.real_name, i.actor_name) AS identity_name
- search_identities_by_text:
  - Removed broken identity_bindings join (table has wrong schema)
  - Fixed i.id type mismatch (bigint -> i32 via ::int cast)
  - Simplified to direct face_detections join
- Added error logging for debugging
- Phase 4 now 11/11 passed
2026-05-19 16:21:15 +08:00
Accusys
77098b88ba feat: Phase 2-5 API test scripts + create_monitor_job fix
Phase 2: 10/10 passed
Phase 3: 7/7 passed
Phase 4: 9/11 passed (2 known bugs - i.name column)
Phase 5: 13/23 passed (10 failures - pre-existing bugs)

Fixes:
- create_monitor_job: ON CONFLICT (uuid) DO UPDATE to prevent duplicate key errors
- test scripts: Correct request bodies for all visual search endpoints
2026-05-19 16:05:46 +08:00
Accusys
ff0bf6b25b feat: Phase 2-5 API test scripts
Phase 2: Files (10 endpoints) - 10/10 passed
Phase 3: Process & Pipeline (7 endpoints) - 4/7 passed
Phase 4: Search (12 endpoints) - pending
Phase 5: Identity/Media/TMDB (24 endpoints) - pending

Known issues:
- Process trigger fails for already-processed files (500)
- Health detailed returns 200 when tested directly
2026-05-19 15:53:53 +08:00
Accusys
ea6ea02925 fix: delete_video - add file existence check + fix pre_chunks UUID cast
- unregister: check file exists before delete, return 200 with success:false if not found
- delete_video: cast pre_chunks.file_uuid parameter as UUID (::uuid)
- Added Phase 2 test script (10/10 endpoints passed)
2026-05-19 15:51:25 +08:00
Accusys
3d2bacb07f feat: Phase 1 base API test script (15 endpoints)
2026-05-19 14:15:00 +08:00
Accusys
7ab7119a99 fix: ASR processor indentation error
2026-05-19 13:23:09 +08:00
Accusys
67ca846ccd feat: ASR output frame numbers + rename start/end to start_time/end_time
- Python: asr_processor.py detects FPS from CUT/ffprobe (no fallback), outputs start_frame/end_frame
- Rust: All AsrSegment structs use start_time/end_time with #[serde(alias)] for backward compat
- store_asr_chunks: prefers ASR output frames, falls back to time-based conversion
- Added backward compatibility test for old JSON format (start/end)

Breaking change: ffprobe/CUT FPS failure now aborts instead of using default 24fps
2026-05-19 13:22:38 +08:00
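The behaviour described in 67ca846ccd (prefer frame numbers from the ASR output, otherwise convert from time, accept both old and new field names, and refuse to guess the FPS) can be sketched as follows. The function names are hypothetical, and rounding to the nearest frame is an assumption; the real conversion may truncate instead.

```python
from typing import Any, Dict, Optional, Tuple


def segment_times(seg: Dict[str, Any]) -> Tuple[float, float]:
    """Read start/end seconds, accepting the old (start/end) key names too."""
    start = seg["start_time"] if "start_time" in seg else seg["start"]
    end = seg["end_time"] if "end_time" in seg else seg["end"]
    return float(start), float(end)


def time_to_frame(seconds: float, fps: Optional[float]) -> int:
    """Convert seconds to a frame index; no default FPS is ever assumed."""
    if fps is None or fps <= 0:
        raise ValueError("FPS unavailable; refusing to assume a default")
    return round(seconds * fps)


def segment_frames(seg: Dict[str, Any], fps: Optional[float]) -> Tuple[int, int]:
    """Prefer frames written by the ASR processor; else derive them from time."""
    if "start_frame" in seg and "end_frame" in seg:
        return int(seg["start_frame"]), int(seg["end_frame"])
    start, end = segment_times(seg)
    return time_to_frame(start, fps), time_to_frame(end, fps)
```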
Accusys
f6f623eeea docs: add 13_config to USER_MODULES + regenerate docs
2026-05-19 03:14:18 +08:00
Accusys
12864634da fix: clear password field in Python login page too
2026-05-18 12:35:08 +08:00
Accusys
78ba6f3d3d docs: fix logout f-string escaping, rebuild
2026-05-18 10:00:51 +08:00
Accusys
2103672684 docs: add logout to every doc page and index
2026-05-18 10:00:29 +08:00
Accusys
54da7c7266 docs: add logout button to login page
2026-05-18 09:54:37 +08:00
Accusys
088aefdac7 fix: pipeline timeline log, chunk lookup, face processor no fallback, Qdrant UUID script, delete safety rules
2026-05-18 00:36:14 +08:00
Accusys
c41f7e0c6e feat: schema version tracking, SHA256 integrity, setup scripts, bug fixes
2026-05-15 18:06:36 +08:00
Accusys
0e73d2a2ce test: add unified probe unit tests (8 Rust + 6 Python), fix pre-existing test compilation errors
2026-05-15 14:58:44 +08:00
Accusys
29eca5a224 feat: unified probe - dispatcher detects category, runs ffprobe/Python/meta per file type
2026-05-15 14:38:47 +08:00
Accusys
37747466e8 fix: deploy_package.sh - add content_hash column migration before import
2026-05-14 20:35:22 +08:00