fix: ASRX duplication, TKG edges, trace ingest, and add pipeline progress publishing

- ASRX handler no longer stores duplicate 'asr' pre_chunks
- Pre_chunks storage made idempotent (delete-before-insert)
- Rule 1 + trace_ingest changed to query 'asrx' not 'asr'
- Trace chunks removed (dynamic from TKG/Qdrant)
- TKG scroll_face_points fixed: trace_id >= 1 (not == 1)
- TKG AsrxSegmentEntry: start/end -> start_time/end_time (match ASRX JSON)
- Unregister error handling: log instead of silent discard
- Add publish_pipeline_progress calls at each pipeline stage
  (processors, rule1, face_trace, identity_agent, TKG, rule2, completion)
This commit is contained in:
Accusys
2026-07-02 10:43:46 +08:00
parent d791d138f2
commit 3eabd45882
65 changed files with 9481 additions and 3856 deletions

View File

@@ -863,3 +863,42 @@ Before creating any file in `docs_v1.0/` (API_WORKSPACE, GUIDES, REFERENCE, DESI
完整交付程序M4_workspace → M5 → Release → Deploy → Public
`docs_v1.0/OPERATIONS/DELIVERY_PROCEDURE.md`
## Session Summary (2026-07-01: Search Mode Fixes)
### Goal
Fix search modes: Keyword BM25 ranking + People search migration to Qdrant + Qdrant scroll pagination
### Done
- **Keyword/BM25 search (`search_bm25`)**: Replaced hardcoded 1.0 score with PostgreSQL FTS (`ts_rank` + `plainto_tsquery`). Now ranks results by relevance instead of flat 1.0.
- **Smart search merge**: Passes real FTS score through instead of fixed 0.5, so keyword-only results are properly differentiated.
- **Qdrant scroll_points**: Added `offset` parameter for pagination support; new `scroll_all_points()` method handles multi-page scroll automatically.
- **get_identity_traces**: Fixed broken pagination loop (always fetched same first 1000 points) by switching to `scroll_all_points`.
- **People search (`search_persons_internal`)**: Replaced `face_detections` JOIN in universal search with Qdrant `_faces` scroll + Rust aggregation (count per identity per file, frame→second via FPS).
- **People search (`search_persons_by_query`)**: Same migration for the REST API person search endpoint.
- **Payload field fix**: `_faces` uses `frame` (integer) not `timestamp_secs` (float). Fixed both `search_persons_internal` and `search_persons_by_query` to read `frame` and convert via `frame / fps`.
### Key Files Changed
- `src/core/db/qdrant_db.rs`: `scroll_points` → offset pagination, new `scroll_all_points`
- `src/api/identity_binding.rs`: Use `scroll_all_points` instead of broken loop
- `src/api/universal_search.rs`: Rewrote `search_persons_internal` and `search_persons_by_query` to use Qdrant
- `src/core/db/postgres_db.rs`: `search_bm25` → PostgreSQL FTS ranking
- `src/api/search.rs`: Pass real FTS scores in merge, removed unused `KEYWORD_FIXED_SCORE`
### Done This Session
- **Qdrant scroll pagination**: `scroll_points` now accepts `offset` param + returns `next_page_offset`; new `scroll_all_points()` handles multi-page scroll automatically
- **get_identity_traces pagination fix**: No longer fetches same 1000 points in infinite loop
- **Keyword BM25**: `search_bm25` replaced hardcoded 1.0 score with PostgreSQL `ts_rank` + `plainto_tsquery`; `smart_search` passes real FTS scores instead of fixed 0.5
- **People search → Qdrant**: Both `search_persons_internal` and `search_persons_by_query` replaced `face_detections` JOIN with Qdrant `_faces` scroll + Rust aggregation (count/group/sort). Fixed `timestamp_secs` → `frame` + `frame/fps` conversion
- **list_face_candidates → Qdrant**: `identities.rs` unbound faces query now scrolls `_faces` with `is_null: identity_id` filter, sorts by confidence DESC in Rust
- **list_unassigned_traces → Qdrant**: `identities.rs` unbound traces query now scrolls `_faces` with `is_null: identity_id` + `trace_id > 0` filter, groups by (file_uuid, trace_id) in Rust, picks best face per trace
- **get_identity_chunks → identity_bindings**: Replaced `face_detections` frame-range JOIN with `identity_bindings` + `chunk.metadata->>'trace_id'`
- **postgres_db.rs 5 remaining READs → Qdrant**: `get_trace_count_by_file`, `get_trace_frame_count_distribution`, `get_identity_files`, `get_identity_faces`, `get_file_faces` all migrated to `_faces` scroll + Rust aggregation
- **agent/tools.rs fully migrated**: `exec_find_file`, `exec_list_files`, `exec_tkg_query` (8 sub-queries), `exec_identity_text`, `exec_identities_search` — all face_detections JOINs replaced with Qdrant scroll or identity_bindings
- **job_worker.rs + storage.rs**: Remaining face_detections READs migrated to Qdrant scroll
### Remaining face_detections references (all inactive/safe)
- Schema definition (CREATE TABLE/INDEX in `postgres_db.rs`)
- `store_face_detections_batch` — already skipped (Phase 1)
- `workspace_sqlite.rs` — local processing DB, separate from PG
- `bin/release.rs` — standalone release utility