fix: M4 Phase 1 bugs - dev.chunks refs, search_path, uuid column

Bug fixes from M4 report:
- 4 remaining dev.chunks → dev.chunk in SQL queries
- search_path includes public for pgvector extension
- get_chunk_by_chunk_id_and_uuid: uuid → file_uuid
- New endpoint: GET /api/v1/file/:uuid/chunk/:chunk_id
Accusys
2026-05-11 10:21:06 +08:00
parent 39ba5ddf76
commit cac60c6093
17 changed files with 25156 additions and 8 deletions

@@ -0,0 +1,72 @@
# M4 Handover Package — Complete
## Contents
| File | Size | Description |
|------|:----:|-------------|
| `HANDOVER_V2.0.md` | 9.6K | Main handover document |
| `api_test.sh` | 8.7K | API smoke test (37 endpoints) |
| `M4_RESPONSE.md` | 1.0K | M4 response (this file) |
### Source Code (choose one)
| File | Size | Description |
|------|:----:|-------------|
| `momentry_core_v1.0.1_source.tar.gz` | 204M | Git archive (latest commit) |
| `momentry_core.bundle` | 150M | Git bundle (full repo, `git clone momentry_core.bundle`) |
### DB Backup (pre-migration)
| File | Size | Description |
|------|:----:|-------------|
| `dev.chunks.sql` | 20M | `dev.chunks` table (old schema, pre-migration) |
| `dev.chunk_vectors.sql` | 56M | `dev.chunk_vectors` table (pre-migration) |
### Scripts
| File | Description |
|------|-------------|
| `generate_asr1.py` | Generate correction record from DB + asr.json |
| `apply_asr_corrections.py` | Apply corrections, preserve chunk_vectors |
| `clean_sentence_text.py` | LLM cleaning + Qdrant re-embedding |
| `pipeline_status.py` | Pipeline health check (9 stages) |
| `split_asr_segments.py` | Sub-window speaker change detection |
## Quick Start (on M4 machine)
```bash
# 1. Restore DB
psql -U accusys -d momentry < dev.chunks.sql
psql -U accusys -d momentry < dev.chunk_vectors.sql
# 2. Apply schema migration
psql -U accusys -d momentry -c "
ALTER TABLE dev.chunks RENAME TO chunk;  -- RENAME TO takes an unqualified name; the table stays in schema dev
ALTER TABLE dev.chunk DROP COLUMN IF EXISTS old_chunk_id;
ALTER TABLE dev.chunk DROP COLUMN IF EXISTS chunk_index;
"
psql -U accusys -d momentry -c "
UPDATE dev.chunk SET chunk_id = substring(chunk_id from 34)
WHERE chunk_id LIKE (file_uuid || '_%');
UPDATE dev.chunk_vectors cv SET chunk_id = substring(cv.chunk_id from 34)
FROM dev.chunk c WHERE c.file_uuid = cv.uuid AND cv.chunk_id LIKE (c.file_uuid || '_%');
"
# 3. Get source code
git clone momentry_core.bundle momentry_core_0.1
# or: tar xzf momentry_core_v1.0.1_source.tar.gz
# 4. Apply corrections
python3 generate_asr1.py
python3 apply_asr_corrections.py
# 5. Rebuild Qdrant
python3 clean_sentence_text.py
# 6. Build and run (from the source checkout, e.g. momentry_core_0.1 if cloned from the bundle)
cargo build --bin momentry_playground
DATABASE_SCHEMA=dev ./target/debug/momentry_playground server --port 3003
# 7. Run API test (in a second terminal, while the server from step 6 is running)
bash api_test.sh
```
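The `substring(chunk_id from 34)` calls in step 2 drop the first 33 characters of each `chunk_id`. That would correspond to a 32-character `file_uuid` followed by a single `_` separator. The sketch below mirrors that rule in shell so the expected result can be checked on a sample value before touching the database. The 32-character UUID form, the helper name `strip_uuid_prefix`, and the sample IDs are assumptions for illustration, not values taken from the actual data.

```bash
#!/bin/sh
# Mirror of the SQL rule `substring(chunk_id from 34)`.
# SQL positions are 1-based, so "from 34" keeps character 34 onward,
# which is what `cut -c34-` does as well.
# Assumption: file_uuid is 32 characters (hex, no dashes) plus one "_".
strip_uuid_prefix() {
  chunk_id="$1"
  file_uuid="$2"
  case "$chunk_id" in
    # Strip only when the ID starts with "<file_uuid>_".
    # Note: the SQL LIKE pattern treats "_" as a single-character wildcard,
    # so it is slightly more permissive than this literal match.
    "${file_uuid}_"*) printf '%s\n' "$chunk_id" | cut -c34- ;;
    # Anything else is left unchanged, like rows the WHERE clause skips.
    *) printf '%s\n' "$chunk_id" ;;
  esac
}

# Hypothetical sample values, for illustration only.
sample_uuid="0123456789abcdef0123456789abcdef"
strip_uuid_prefix "${sample_uuid}_sentence_0007" "$sample_uuid"
strip_uuid_prefix "sentence_0007" "$sample_uuid"
```

If the stored `file_uuid` values turn out to be 36 characters long (the dashed form), the offset of 34 would leave part of the UUID in place, so it may be worth checking one real row with this helper before running the `UPDATE` statements.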