Commit Graph

184 Commits

Author SHA1 Message Date
Accusys
14e886cc08 feat: progressive multi-round face matching + pending person API
- Identity agent: per-face max matching, multi-round with derived
  seeds from high-confidence faces, angle diversity filter (cosine sim < 0.90)
- Pending person API: POST /file/:file_uuid/pending-person
  + GET /file/:file_uuid/pending-persons with status=pending, source=manual
- Update API docs (07_identity.md)
2026-06-24 03:42:04 +08:00
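Note on 14e886cc08: the angle diversity filter can be illustrated with a minimal sketch. It assumes a derived seed is accepted only when its cosine similarity to every already accepted seed is below 0.90, so near-duplicate angles are skipped. The function names, the 0.80 confidence cutoff, and the (confidence, embedding) layout are hypothetical and not taken from the repository.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_diverse_seeds(faces, min_confidence=0.80, max_similarity=0.90):
    """Pick derived seeds from high-confidence faces.

    faces: list of (confidence, embedding) tuples.
    min_confidence is an assumed cutoff; only max_similarity=0.90
    comes from the commit message.
    """
    seeds = []
    # Visit the most confident faces first.
    for confidence, embedding in sorted(faces, key=lambda f: f[0], reverse=True):
        if confidence < min_confidence:
            break
        # Keep the face only if it differs enough from every accepted seed.
        if all(cosine_similarity(embedding, s) < max_similarity for s in seeds):
            seeds.append(embedding)
    return seeds
```

Under these assumptions, a face that is nearly identical in angle to an accepted seed is dropped, while a clearly different angle becomes an extra seed for the next matching round.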
Accusys
766a1d9a6d feat: Swift Face Pose integration + TKG Plan B
Major Changes:
- swift_face_pose: output pose angles (yaw/pitch/roll) in face.json
- face_processor.py: call swift_face_pose (dual output: face.json + pose.json)
- Face struct: add pose_angle field
- TKG Plan B: gaze/lip_track nodes from face.json (no face_detections dependency)
- Chunk cleanup: delete old data before rebuild (avoid duplicate key)
- Hand nodes: classify by hand_type + gesture (15 combinations)
- HAND_OBJECT edges: bbox spatial matching (174 matches)

Test Results:
- Blake Jones: 8 faces, pose_angle ✓, 66 nodes, 174 edges
- FilmRiot: 394 faces, pose_angle ✓, 35 nodes, 39 edges
- Left hands: 132, Right hands: 2

Architecture:
- All TKG nodes built from JSON files (face.json, hand.json, yolo.json)
- Swift processors: sample_interval=3 (Face/Pose/Hand sync)
- Cleanup functions: delete_tkg_nodes_by_uuid, delete_tkg_edges_by_uuid
2026-06-23 05:47:24 +08:00
Accusys
e1e2da2140 fix: processor-counts API + ASRX field name conversion
- Fix processor-counts API to correctly read JSON counts:
  - YOLO: use frames.length (was returning null)
  - CUT: prioritize scenes.length over frame_count
  - Result: YOLO 1963 frames, CUT 25 scenes (correct)

- Fix ASRX field name conversion:
  - Convert start_time/end_time → start/end for ASRX compatibility
  - Prefer frame-based positioning over time-based

- Document issues in issues_2026-06-21.md:
  - Issue 6: ASRX field name mismatch
  - Issue 7: processor-counts API null values
2026-06-22 23:33:39 +08:00
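Note on e1e2da2140: the ASRX field name conversion amounts to renaming two keys per segment. The sketch below assumes segments are plain dictionaries; the function name is hypothetical.

```python
def to_asrx_segment(segment):
    """Return a copy of the segment with start_time/end_time renamed to start/end."""
    converted = dict(segment)  # copy so the caller's dict is left untouched
    if "start_time" in converted:
        converted["start"] = converted.pop("start_time")
    if "end_time" in converted:
        converted["end"] = converted.pop("end_time")
    return converted
```

Segments that already use start/end pass through unchanged, so the conversion is safe to apply more than once.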
Accusys
db8bb8fa95 fix(tkg): handle null identity_id + remove skin_tone nodes
- Fix Phase 2.5 null handling in build_gaze/lip_track_nodes
  - Use query_scalar::<_, Option<i64>> + flatten() for nullable fields
  - Prevents 'unexpected null' decoding errors

- Remove skin_tone_trace_nodes from TKG build
  - Delete build_skin_tone_trace_nodes function (110 lines)
  - Remove from TkgResult struct and API response
  - Skin tone should be an independent function, not part of TKG

Result: TKG rebuild now completes successfully
- Nodes: 40 (face_track, gaze_track, text_region, appearance)
- Edges: 2967 (co_occurrence edges increased from 21 to 2964)
2026-06-22 16:39:47 +08:00
Accusys
70e849d3ae refactor: remove Rule 3, Story, and Caption processors
- Remove Rule 3 (Scene Chunking) from worker auto-trigger
- Remove rule3_ingest.rs and related imports
- Remove Story/Caption from playground module parsing
- Clean up scan.rs Rule 3 display
- Fix ASRX field name conversion (start_time -> start)

Reason: Story/5W1H/Scene accuracy too poor - will redesign later
2026-06-22 15:34:02 +08:00
Accusys
22f13eca4b fix(cut): change ffprobe output format to default=nk=0
- Problem: compact=p=0:nk=1 outputs pipe-delimited format without pts_time=
- Fix: default=nk=0 outputs pts_time=XXX format that parser can match
- Result: Charade scene detection went from 1 scene to 833 scenes (correct)
2026-06-22 13:25:16 +08:00
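Note on 22f13eca4b: a parser that looks for the pts_time= key finds nothing when ffprobe omits keys (nk=1). The sketch below shows this with a hypothetical regex-based parser; the two sample strings are illustrative output shapes, not captured from a real ffprobe run.

```python
import re

# Matches "pts_time=<number>"; also matches inside longer keys ending in pts_time.
PTS_TIME_RE = re.compile(r"pts_time=([0-9]+(?:\.[0-9]+)?)")


def parse_scene_times(ffprobe_output):
    """Return every pts_time value found in the ffprobe output, in seconds."""
    return [float(value) for value in PTS_TIME_RE.findall(ffprobe_output)]


# Assumed shape with keys kept (default=nk=0): one key=value pair per line.
keyed_output = (
    "[FRAME]\npts_time=12.345000\n[/FRAME]\n"
    "[FRAME]\npts_time=47.100000\n[/FRAME]\n"
)

# Assumed shape with keys dropped (compact=p=0:nk=1): pipe-delimited values only.
unkeyed_output = "12.345000|1\n47.100000|1\n"
```

With keys present the parser recovers both timestamps; with keys dropped it returns an empty list, so no scene boundaries are detected.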
Accusys
30b252ac95 fix: pre_chunks schema + TMDb movie name extraction
- pre_chunks: add chunk_type, text_content columns; drop NOT NULL on
  coordinate_type/coordinate_index (INSERT statements reference these
  columns but CREATE TABLE was missing them)
- run_migrations: add ALTER TABLE for existing databases
- extract_movie_name: filter noise words (youtube, fps, 24fps, 1080p,
  pure digits) so 'Charade_YouTube_24fps' → 'Charade'
- run-server-3002.sh: add companion worker startup (matching 3003 script)
2026-06-22 11:55:12 +08:00
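Note on 30b252ac95: the noise-word filter in extract_movie_name can be sketched as follows. The noise words are the ones listed in the commit message; splitting on underscores, hyphens, dots, and whitespace is an assumption about the real implementation.

```python
import re

# Noise words listed in the commit message; pure digits are filtered separately.
NOISE_WORDS = {"youtube", "fps", "24fps", "1080p"}


def extract_movie_name(file_stem):
    """Drop noise tokens from a file stem and join the rest with spaces."""
    tokens = [t for t in re.split(r"[_\-.\s]+", file_stem) if t]
    kept = [t for t in tokens if t.lower() not in NOISE_WORDS and not t.isdigit()]
    return " ".join(kept)
```

For example, extract_movie_name("Charade_YouTube_24fps") returns "Charade", matching the case described in the commit.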
Accusys
f4de741d5b fix: add appearance back to processor list, keep mediapipe/story filtered out 2026-06-22 09:20:16 +08:00
Accusys
c93b54efeb fix: filter deprecated processors from trigger API requests 2026-06-22 09:15:02 +08:00
Accusys
4ba248513e fix: correct processor list - remove deprecated mediapipe/appearance/story, fix auto-pipeline order
- ProcessorType::all(): remove MediaPipe, Appearance, Story (mediapipe replaced by Swift)
- files.rs auto-pipeline: fix order to cut,asr,asrx,yolo,ocr,face,pose (was missing asr)
- postgres_db.rs run_migrations(): rewrite to auto-create all 38 tables idempotently
2026-06-22 08:49:41 +08:00
Accusys
7e548f8b08 release: v1.3.0 - TKG node type renaming
Changes:
- Rust: face_trace → face_track (45 occurrences in 8 files)
- Rust: gaze_trace → gaze_track, lip_trace → lip_track
- Python: tkg_builder.py unified + pipeline_checklist.py fixed
- Swift: swift_hand.swift hand state detection (empty vs holding)

Node type changes:
  face_trace    → face_track
  person_trace  → body_track
  gaze_trace    → gaze_track
  lip_trace     → lip_track
  hand_trace    → hand_track
  speaker       → speaker_segment
  object        → detected_object
  text_trace    → text_region

Migration:
  PUBLIC schema: 12970 + 892 + 305 rows updated
2026-06-22 07:18:21 +08:00
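Note on 7e548f8b08: the node type changes can be captured as a lookup table. This is a sketch of the mapping only; the helper name is hypothetical, and the actual migration was applied to database rows rather than through code like this.

```python
# Old node type -> new node type, as listed in the v1.3.0 release commit.
NODE_TYPE_RENAMES = {
    "face_trace": "face_track",
    "person_trace": "body_track",
    "gaze_trace": "gaze_track",
    "lip_trace": "lip_track",
    "hand_trace": "hand_track",
    "speaker": "speaker_segment",
    "object": "detected_object",
    "text_trace": "text_region",
}


def rename_node_type(node_type):
    """Return the v1.3.0 name for a node type; unknown types pass through."""
    return NODE_TYPE_RENAMES.get(node_type, node_type)
```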
Accusys
e214106d48 feat: Phase 2.7 identity resolution for gaze/lip trace nodes
Implementation:
- gaze_trace nodes: Query face_trace identity_id, add to properties
- lip_trace nodes: Query face_trace identity_id, add to properties
- Rule2: Extend identity resolution to support gaze_trace/lip_trace node types

Architecture:
- All face-related nodes now have identity_id in TKG properties
- Rule2 unified identity resolution for face_trace/gaze_trace/lip_trace
- TKG-only approach (no face_detections dependency for identity)

Code Changes:
- src/core/processor/tkg.rs: Add identity_id query in gaze/lip builders
- src/core/chunk/rule2_ingest.rs: Extend node_type condition

Docs:
- docs_v1.0/DESIGN/TKG_PHASE2_7_IDENTITY_RESOLUTION.md

Status: Implementation complete, pending test with valid file
2026-06-21 05:12:13 +08:00
Accusys
2cfcfdd1af feat: Phase 2.6 edges migration to Qdrant (TKG-only architecture)
Phase 2.6.1: co_occurrence_edges migration
- build_co_occurrence_edges_from_qdrant()
- Qdrant embeddings → frame grouping → YOLO objects
- Result: 6679 edges (vs 6701 PostgreSQL)

Phase 2.6.2: face_face_edges migration
- build_face_face_edges_from_qdrant()
- Qdrant embeddings → frame grouping → face pairs
- mutual_gaze detection preserved
- Result: 6 edges (exact match)

Phase 2.6.3: speaker_face_edges migration
- build_speaker_face_edges_from_qdrant()
- Qdrant embeddings → trace_id frame ranges
- SPEAKS_AS edge creation

Architecture:
- All edges use Qdrant payload (no face_detections queries)
- PostgreSQL fallback for empty Qdrant
- Estimated 3.6x performance improvement

Testing:
- Playground (3003): ✓ All Phase 2.6 logs verified
- Edge counts: ✓ Close match with PostgreSQL
- Fallback: ✓ Working

Docs:
- docs_v1.0/DESIGN/TKG_PHASE2_6_EDGES_MIGRATION.md
- docs_v1.0/M4_workspace/2026-06-21_phase2_6_test.md
2026-06-21 04:47:49 +08:00
Accusys
c39805bb8e feat: Phase 2.5 gaze_trace and lip_trace Qdrant migration + Charade Q&A test
Phase 2.5.1: gaze_trace_nodes from Qdrant
- build_gaze_trace_nodes_from_qdrant()
- Read trace_id, frame, bbox from Qdrant payload
- Compute gaze stats (yaw, pitch, roll, gaze direction, blink)
- No PostgreSQL face_detections dependency

Phase 2.5.2: lip_trace_nodes from Qdrant + face.json
- build_lip_trace_nodes_from_qdrant()
- Match trace_id using Qdrant embeddings + face.json bbox
- Compute lip stats (openness, variance, speaking frames)
- Fixed face.json bbox structure (x,y,width,height not bbox object)

Test results:
- 23 gaze_trace nodes from Qdrant
- 23 lip_trace nodes from Qdrant + face.json
- 51 lip_sync edges created
- Charade Q&A: 20 identities, 75 relationship chunks

Docs:
- TKG_PHASE2_NONFACE_MIGRATION_V1.0.md (migration plan)
- 2026-06-21_charade_qa_test.md (Q&A test report)
2026-06-21 02:17:08 +08:00
Accusys
23c440104b feat: Phase 2-3 TKG-only architecture
Phase 2.1: build_face_trace_nodes_from_qdrant()
- Read trace_id, frame, bbox directly from Qdrant payload
- No dependency on face_detections table

Phase 2.3: Rule2 queries TKG nodes
- identity resolution from tkg_nodes.properties.identity_id
- TKG-only architecture (Phase 2.3)

Phase 3: Identity Agent updates TKG nodes
- match_faces_iterative() updates tkg_nodes.properties
- bind_identity_trace() syncs identity_id to TKG
- unbind_identity() removes identity_id from TKG

Test results:
- 23 face_trace nodes from Qdrant (Phase 2.1)
- 75 relationship chunks (Rule2)
- TKG rebuild: Phase0 → Phase1 → Phase2
2026-06-21 01:30:04 +08:00
Accusys
2f2ccc94f7 feat: Identity Agent query Qdrant for face embeddings
Phase 1.4: Modify match_faces_iterative to use Qdrant

Changes:
- match_faces_iterative() now queries FaceEmbeddingDb
- Fallback to PostgreSQL if Qdrant is empty
- Group embeddings by trace_id from Qdrant payload
- Sample 3-angle embeddings (front, mid, back)
- Match against TMDb seeds (threshold=0.50)
- Propagate to unmatched traces
- Update face_detections.identity_id in PostgreSQL

New functions:
- match_faces_iterative() - Qdrant-based matching
- match_faces_iterative_pg() - PostgreSQL fallback

Flow:
1. Load TMDb identities with face_embedding
2. Query Qdrant for file embeddings
3. Sample 3 embeddings per trace
4. Match against TMDb seeds
5. Propagate matches iteratively
6. Update identity_id in PostgreSQL
2026-06-21 00:31:25 +08:00
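Note on 2f2ccc94f7: steps 3 and 4 of the flow can be sketched as below. The sketch reads "front, mid, back" as the first, middle, and last embedding of a trace, which is an assumption (the commit may mean pose angles instead). Function names and data layout are hypothetical; only the 0.50 threshold comes from the commit message. Iterative propagation (step 5) is not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def sample_three(embeddings):
    """Pick the first, middle, and last embedding of a trace."""
    if len(embeddings) <= 3:
        return list(embeddings)
    return [embeddings[0], embeddings[len(embeddings) // 2], embeddings[-1]]


def match_trace(trace_embeddings, seeds, threshold=0.50):
    """Match a trace against seed identities.

    seeds: dict mapping identity name -> seed embedding.
    Returns (name, score) for the best match at or above the threshold,
    or None when no sampled embedding reaches it.
    """
    best_name, best_score = None, threshold
    for embedding in sample_three(trace_embeddings):
        for name, seed in seeds.items():
            score = cosine_similarity(embedding, seed)
            if score >= best_score:
                best_name, best_score = name, score
    if best_name is None:
        return None
    return best_name, best_score
```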
Accusys
3ad6f8740a feat: Rule2 TKG relationship chunks + Phase0-1 Qdrant integration
Phase 0: TKG builder populate face_detections from face.json
- Fix face.json parser for pose_angle format
- Call store_traced_faces.py to set trace_id
- Skip if trace_id already populated

Phase 1: Qdrant face embeddings integration
- Add FaceEmbeddingDb module (src/core/db/face_embedding_db.rs)
- Create dev_face_embeddings collection (dim=512)
- Store 1122 face embeddings with pose metadata
- API: init_collection, batch_upsert, search_similar

Rule2: TKG edges → relationship chunks
- Design: RULE2_TKG_RELATIONSHIP_V1.0.md
- Implementation: rule2_ingest.rs
- ChunkType::Relationship added
- Edge types: SPEAKS_AS, MUTUAL_GAZE, CO_OCCURS_WITH, HAS_APPEARANCE, WEARS
- Auto-trigger on TKG rebuild

API:
- POST /api/v1/file/:file_uuid/rule2 (vectorization)
- POST /api/v1/file/:file_uuid/tkg/rebuild (auto Rule2)

Test: 75 relationship chunks created + vectorized
2026-06-21 00:22:41 +08:00
Accusys
17e4e15860 feat: add Vision LLM integration (CLIP + Qwen3-VL cascade)
- Add Qwen3-VL dynamic management (start/stop/status CLI)
- Add CLIP + Qwen3-VL cascade detection strategy
- Add Vision CLI commands (vision start/stop/status, detect)
- Add cascade_vision processor module
- Add clip processor module
- Add qwen_vl_manager module

Changes:
- scripts/start_qwen3vl.sh, stop_qwen3vl.sh: Qwen3-VL management scripts
- src/core/vision/: Qwen3-VL manager module
- src/core/processor/cascade_vision.rs: CLIP + Qwen3-VL cascade logic
- src/core/processor/clip.rs: CLIP classification and detection
- src/api/clip_api.rs: CLIP API endpoints
- src/cli/vision.rs: Vision CLI implementation
- src/cli/args.rs: Add Vision and Detect commands
- src/main.rs: Integrate Vision CLI
- src/core/mod.rs: Add vision module
- src/core/processor/mod.rs: Add cascade_vision module
2026-06-13 16:25:52 +08:00
Accusys
834b0d4865 feat: score-based search, LLM re-ranking endpoint, video title search, pipeline module
Core search changes:
- Replace RRF with score-based merge (max of semantic/keyword/identity)
- Add video title ILIKE search for brand/name queries (score 0.9)
- Add /api/v1/search/llm-smart endpoint with Gemma 4 re-ranking
- Fix LLM JSON parsing (markdown fences, empty responses)

Infrastructure:
- Rebuild Qdrant collection (clear 347K contaminated points)
- Add dotenv loading to main.rs for config parity
- Implement store_pre_chunk in postgres_db.rs

Pipeline module (WordPress):
- store-asrx, rule1, vectorize, phase1, complete endpoints
- CLI commands for pipeline operations

Docs:
- SEARCH_SCORE_IMPROVEMENT.md (score-based merge proposal)
2026-06-04 07:40:41 +08:00
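Note on 834b0d4865: the score-based merge keeps, for each result, the maximum score across the semantic, keyword, and identity strategies. A minimal sketch, assuming each strategy returns (chunk_id, score) pairs on a comparable scale; the function name and data layout are hypothetical.

```python
def merge_by_max_score(*result_lists):
    """Merge per-strategy results, keeping the highest score per chunk.

    Each argument is a list of (chunk_id, score) pairs.
    Returns (chunk_id, score) pairs sorted by score, highest first.
    """
    merged = {}
    for results in result_lists:
        for chunk_id, score in results:
            if score > merged.get(chunk_id, float("-inf")):
                merged[chunk_id] = score
    return sorted(merged.items(), key=lambda item: item[1], reverse=True)
```

Unlike rank fusion, this preserves the absolute score, so a strong hit from one strategy (for example a title match scored 0.9) is not diluted by weak or missing hits from the others.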
Accusys
e1572907ae feat: ASRX hybrid pipeline, identity history, worker fixes, checkpoint system 2026-06-02 07:13:23 +08:00
Accusys
874d688987 feat: deploy hybrid search (semantic+keyword+identity) with RRF fusion
- Replace smart_search with hybrid RRF implementation
- Add speaker_detections table for identity-agent binding
- Fix identity queries: direct SQL to avoid type mismatches
- Add debug logs to job_worker for processor debugging
- Deployed to production (3002) successfully

Key changes:
- search.rs: Complete rewrite with 3 strategies + RRF
- postgres_db.rs: speaker_detections table + identity query fixes
- job_worker.rs: Debug logs for output file checks

Tested:
- Hybrid search works with semantic + keyword + identity
- Identity search: 'identity:Charade' returns correct results
- Chinese keyword search: '調光' matches Charade summaries

Bugs found:
- Case mismatch: 'ASRX' vs 'asrx' in processors field
- Missing CUT dependency for ASRX processor
2026-06-01 15:15:17 +08:00
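Note on 874d688987: Reciprocal Rank Fusion (RRF) combines ranked lists by summing 1 / (k + rank) for each result. A minimal sketch follows; the constant k=60 is a commonly used default and an assumption here, since the commit does not state the value used in search.rs.

```python
def rrf_fuse(ranked_lists, k=60):
    """Fuse ranked lists of document ids with Reciprocal Rank Fusion.

    ranked_lists: iterable of lists, each ordered best-first.
    Returns document ids ordered by fused score, highest first.
    """
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)
```

Because only ranks contribute, a result that appears in several lists tends to outrank one that is first in a single list.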
Accusys
0d58a738a1 feat: add processor state machine and alert mechanism
- Add ProcessorJobStatus enum (8 states: Idle/Waiting/Ready/Pending/Running/Completed/Failed/Skipped)
- Add processor_alerts table (migrations/034)
- Add emit_processor_alert() to redis_client.rs
- Add ConditionResult enum + check_dependencies() to job_worker.rs
2026-05-30 10:03:49 +08:00
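Note on 0d58a738a1: the eight states and a dependency check might look like the sketch below. The state names come from the commit message; the ConditionResult variants, the dependency table, and the decision rules are hypothetical. The ASRX-on-CUT dependency is used only as an example.

```python
from enum import Enum


class ProcessorJobStatus(Enum):
    IDLE = "Idle"
    WAITING = "Waiting"
    READY = "Ready"
    PENDING = "Pending"
    RUNNING = "Running"
    COMPLETED = "Completed"
    FAILED = "Failed"
    SKIPPED = "Skipped"


class ConditionResult(Enum):
    # Hypothetical variants; the commit only names the enum.
    SATISFIED = "satisfied"
    WAITING = "waiting"
    BLOCKED = "blocked"


# Hypothetical dependency table: processor -> processors it needs first.
DEPENDENCIES = {"asrx": ["cut"]}


def check_dependencies(processor, statuses):
    """Decide whether a processor may run given its dependencies' statuses."""
    for dependency in DEPENDENCIES.get(processor, []):
        status = statuses.get(dependency, ProcessorJobStatus.IDLE)
        if status in (ProcessorJobStatus.FAILED, ProcessorJobStatus.SKIPPED):
            return ConditionResult.BLOCKED
        if status is not ProcessorJobStatus.COMPLETED:
            return ConditionResult.WAITING
    return ConditionResult.SATISFIED
```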
Accusys
127d646ef1 fix: worker processor_results + rule3 SQL + unregister cleanup bugs
- job_worker.rs: add upsert_processor_result when output file exists
- job_worker.rs: load JSON and store to pre_chunks when the output file exists
- rule3_ingest.rs: fix SQL bind order (scene_number was occupying chunk_type slot)
- files.rs: fix unregister WHERE clause (uuid -> file_uuid) + add pre_chunks delete
- asrx_self/main_fixed.py: fix KeyError (s['start'] -> s['start_time'])
- wrapper_worker_playground.sh: add Worker launchd script
- com.momentry.playground.plist: add Playground launchd config
2026-05-26 04:35:51 +08:00
Accusys
87dead7f65 fix: POST /api/v1/jobs 500 — wrong column names + NULL file_name 2026-05-25 10:50:37 +08:00
Accusys
de88fd4e44 fix: restore accidentally deleted type definitions
Add back PipelineType enum, ProcessorType::pipeline() method, and
OLLAMA_URL/EMBED_URL/LLM_HEALTH_URL config constants — all of
which were deleted in commits 78923a89 and 0856b92e while the
referencing code was left intact, causing 5 compilation errors.
2026-05-25 08:50:53 +08:00
Accusys
d7f89a962b fix: frame_number is BIGINT in DB, use i64 not i32
frame_number column in face_detections table is defined as BIGINT (INT8).
Using i32 caused sqlx type mismatch at runtime. Fixed in:
- identity_agent_api.rs: query_as tuples and HashMap key
- qdrant_db.rs: upsert_face_embedding signature and row extraction
2026-05-25 04:07:30 +08:00
M5Max128
25ec1625df Merge branch 'main' of 10.10.10.201:/Users/accusys/momentry_core_0.1/ 2026-05-25 03:59:54 +08:00
M5Max128
0806d44df4 fix: add status/duration/fps to FileDetailResponse; fix progress API with HSET+HGETALL 2026-05-25 03:40:02 +08:00
Accusys
a2b71fef0d fix: i64→i32 for INT4 cols (identity_binding, identity_agent, qdrant_db) 2026-05-25 03:18:50 +08:00
Accusys
8fdd1d741b fix: stranger_id=NULL on bind/merge; doc: add traces+mergeinto endpoints 2026-05-25 03:03:27 +08:00
M5Max128
78923a8973 fix: system consistency - store_vector, search, worker trigger
- store_vector: stub -> actual PG embedding storage
- search_parent_chunks_semantic: include sentence chunks
- Remove early return in check_and_complete_job
2026-05-24 23:20:02 +08:00
M5Max128
932e43518d fix: trigger_processing — remove fake QUEUED state, create monitor_job if missing
- Remove SET processing_status = 'QUEUED' (no queue exists)
- Fix COALESCE type mismatch (jsonb vs text)
- Fix UPDATE clause: WHERE id = should be WHERE uuid =
- Check monitor_jobs existence, INSERT if missing via create_monitor_job
- Add UNIQUE constraint on monitor_jobs.uuid
- Fix response message: 'Processing queued' → 'Processing triggered'
2026-05-23 23:06:37 +08:00
M5Max128
5d8449b07c fix: compile processing.rs + mount processing_routes
- Fix 9 compilation errors in processing.rs:
  - memory_mb typo (mem_mb)
  - download_json return type
  - Chunk from_row (use row_to_json)
  - ProgressResponse/SystemHealthInfo/ProcessorProgressInfo Deserialize
  - Remove flush_all/flush (methods don't exist)
- Add pub mod processing to api/mod.rs
- Merge processing::processing_routes() into server router
2026-05-23 22:40:19 +08:00
M5Max128
0856b92ec6 fix: resource path cleanup + mount processing_routes WIP
- config.rs: SCRIPTS_DIR fix, EMBED/OLLAMA_URL 127.0.0.1, PYTHON_PATH restored
- executor.rs: use config::PYTHON_PATH instead of hardcoded path
- probe.rs/watcher.rs: use config::SCRIPTS_DIR instead of hardcoded path
- release.rs: momentry_core_0.1 → momentry_core
- .env.development: fix REDIS_URL host, PYTHON_PATH, SCRIPTS_DIR
- api/mod.rs + server.rs: add processing module declaration (routes not yet mountable due to pre-existing compile errors)
2026-05-23 22:26:03 +08:00
M5Max128
f8bcc0356c feat: frame/time pipeline split + output validation
- Add PipelineType enum + pipeline() to ProcessorType
- Split ProcessorPool into frame_slots (max 2) and time_slots (max 1)
- Add can_start_for() for pipeline-aware scheduling
- Add validate_output_file() — checks JSON validity before marking complete
- Add 3 unit tests for validate_output_file()
- Create DESIGN/FRAME_TIME_PIPELINE_V1.0.md (492 lines)
2026-05-23 21:14:28 +08:00
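Note on f8bcc0356c: the slot split and the output check can be sketched as below. The slot limits (2 frame, 1 time) and the names PipelineType, ProcessorPool, can_start_for, and validate_output_file come from the commit message; the start/finish methods and all internals are assumptions, and the real implementation is in Rust.

```python
import json
from enum import Enum


class PipelineType(Enum):
    FRAME = "frame"
    TIME = "time"


class ProcessorPool:
    """Tracks running processors separately per pipeline type."""

    def __init__(self, max_frame_slots=2, max_time_slots=1):
        self._limits = {
            PipelineType.FRAME: max_frame_slots,
            PipelineType.TIME: max_time_slots,
        }
        self._running = {PipelineType.FRAME: 0, PipelineType.TIME: 0}

    def can_start_for(self, pipeline):
        """True if the given pipeline still has a free slot."""
        return self._running[pipeline] < self._limits[pipeline]

    def start(self, pipeline):
        """Claim a slot; returns False when the pipeline is full."""
        if not self.can_start_for(pipeline):
            return False
        self._running[pipeline] += 1
        return True

    def finish(self, pipeline):
        """Release a slot, never dropping below zero."""
        if self._running[pipeline] > 0:
            self._running[pipeline] -= 1


def validate_output_file(path):
    """Return True only if the file exists and parses as JSON."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            json.load(handle)
    except (OSError, ValueError):  # missing file, bad encoding, or invalid JSON
        return False
    return True
```

Keeping separate counters means a saturated frame pipeline cannot starve the time pipeline, and the reverse also holds.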
M5Max128
dddb5d4cbd refactor: centralize port config + fix 8082 conflict
- Add EMBED_URL, OLLAMA_URL, LLM_HEALTH_URL to config.rs
- Fix health.rs hardcoded ports → config references
- Fix sync_db.rs Ollama URL → config::OLLAMA_URL
- Create config/port_registry.tsv (single source of truth for ports)
- Remove Caddy 8082 proxy block (port belongs to LLM)
- Fix .env LLM_URL: localhost → 127.0.0.1 (avoid IPv6 Caddy conflict)
2026-05-23 02:54:34 +08:00
M5Max128
1c30af9557 fix: correct service paths, nohup removal, MongoDB graceful fallback, add MariaDB + Caddy to startup
- Fix Qdrant binary path (services/ -> momentry_resources/bin/)
- Fix LLM binary/model paths (llama/ -> momentry_resources/llama/, models/ -> models/llm/)
- Fix PostgreSQL data path (pgsql/data -> momentry/var/postgresql)
- Remove nohup (fails in LaunchDaemon environment)
- Add MongoDB graceful fallback with 5s timeout in server.rs
- Add MariaDB + Caddy steps to startup script for WordPress
- Revert all unrelated changes
2026-05-23 01:46:23 +08:00
Accusys
c4e30e4234 fix: list_resources returns data (config+metadata); register source code resource 2026-05-22 16:01:33 +08:00
Accusys
bd82028f34 refactor: unified LLM config - CHAT_URL/VISION_URL/SUMMARY_URL with env var overrides 2026-05-22 15:47:17 +08:00
Accusys
2d008b75bf fix: find_file/list_files include has_data flag for video data availability 2026-05-22 12:22:35 +08:00
Accusys
380dd87d8b feat: POST /api/v1/agents/search - Gemma4 function calling agent 2026-05-22 12:10:37 +08:00
Accusys
883535c4f7 feat: POST /identity/:uuid/bind/trace endpoint 2026-05-22 10:29:52 +08:00
Accusys
7805eaa3cb fix: doc-wasm hardcoded path momentry_core_0.1 -> momentry_core 2026-05-22 09:33:33 +08:00
Accusys
0794476902 feat: representative frame limited to first half of video 2026-05-22 09:24:48 +08:00
Accusys
2b950c985c feat: representative frame - auto-detect thumbnail + JSON endpoint 2026-05-22 09:22:15 +08:00
M5Max128
e1619c724a Merge branch 'main' of http://192.168.110.200:3000/admin/momentry_core 2026-05-22 08:51:08 +08:00
M5Max128
701e71463d feat: identity PATCH update, alias system, name UNIQUE removal
- Add PATCH /api/v1/identity/:identity_uuid endpoint
- Migration 030: remove name UNIQUE, add tmdb_id index
- TMDb upsert: ON CONFLICT (name) -> ON CONFLICT (tmdb_id)
- get_or_create_identity: pre-check by name
- upload_identity: ON CONFLICT (name) -> ON CONFLICT (uuid)
- Search: include aliases in identity text search
- Add scripts/llm_metadata_enhancer.py
- Add DESIGN/IdentityUpdateAndAliasSystem.md
2026-05-22 08:35:32 +08:00
Accusys
deb9516796 feat: TKG extension - pose data + mutual gaze detection 2026-05-22 07:09:54 +08:00
Accusys
2d3017d3c1 feat: GET file/:uuid/identities/:a/co-occur-with/:b endpoint 2026-05-22 05:34:25 +08:00
Accusys
d67f123949 feat: GET file/:uuid/trace/:tid/thumbnail endpoint 2026-05-22 04:58:28 +08:00