fix: M4 Phase 1 bugs - dev.chunks refs, search_path, uuid column

Bug fixes from M4 report:
- 4 remaining dev.chunks → dev.chunk in SQL queries
- search_path includes public for pgvector extension
- get_chunk_by_chunk_id_and_uuid: uuid → file_uuid
- New endpoint: GET /api/v1/file/:uuid/chunk/:chunk_id
Accusys
2026-05-11 10:21:06 +08:00
parent 39ba5ddf76
commit cac60c6093
17 changed files with 25156 additions and 8 deletions

@@ -0,0 +1,97 @@
# Bug Report: Schema Migration Bugs Found During M4 Phase 1 Testing
**Date**: 2026-05-11
**From**: M4 (Integration & Testing)
**To**: M5 (Development)
**Priority**: High (3 bugs, all cause 500 errors)
---
## Bug 1: Stale `dev.chunks` References After Table Rename
M5 renamed `dev.chunks` → `dev.chunk`, but 4 SQL queries still hardcoded `dev.chunks`, causing "relation does not exist" errors.
### Affected Files
| File | Line | Old | New |
|------|------|-----|-----|
| `src/core/db/postgres_db.rs` | 4626 | `FROM dev.chunks` | `FROM dev.chunk` |
| `src/api/five_w1h_agent_api.rs` | 779 | `UPDATE dev.chunks SET embedding` | `UPDATE dev.chunk SET embedding` |
| `src/worker/processor.rs` | 1083 | `UPDATE dev.chunks SET metadata` | `UPDATE dev.chunk SET metadata` |
| `src/worker/job_worker.rs` | 1005 | `FROM dev.chunks WHERE` | `FROM dev.chunk WHERE` |
### Why M5 Tests Passed
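M5's database likely still had the old `dev.chunks` table (for example, if `dev.chunk` was created as a copy and the original was never dropped), so the stale queries kept working there. M4 applied the migration exactly as written (`ALTER TABLE ... RENAME`), which leaves no table under the old name.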
### Trigger
- `POST /api/v1/search/smart` → 500 error "relation dev.chunks does not exist"
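
A check along the following lines could catch leftover references before they reach a database where the old table is gone. This is only a sketch: `find_stale_chunks_refs` is a hypothetical helper, not part of the codebase, and it assumes the SQL is available as plain text (for example, source files read into a string).

```rust
/// Returns the 1-based line numbers in `source` that still mention the old
/// table name `dev.chunks`. Hypothetical helper for illustration only.
fn find_stale_chunks_refs(source: &str) -> Vec<usize> {
    const OLD: &str = "dev.chunks";
    source
        .lines()
        .enumerate()
        .filter(|(_, line)| {
            line.match_indices(OLD).any(|(pos, _)| {
                // Skip matches that continue into a longer identifier,
                // such as `dev.chunks_archive`.
                let next = line[pos + OLD.len()..].chars().next();
                !matches!(next, Some(c) if c.is_alphanumeric() || c == '_')
            })
        })
        .map(|(idx, _)| idx + 1)
        .collect()
}
```

The helper only inspects the character after the match, so it does not flag the new name `dev.chunk` or longer identifiers that merely start with the old one.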
---
## Bug 2: `search_path` Missing `public` — pgvector `vector` Type Not Found
The `after_connect` hook runs `SET search_path TO dev`, which replaces the default `"$user", public` and so drops access to the `public` schema, where the pgvector extension lives.
### Affected Code
```rust
// src/core/db/postgres_db.rs:745
sqlx::query(&format!("SET search_path TO {}", schema))
```
### Fix
```rust
sqlx::query(&format!("SET search_path TO {}, public", schema))
```
### Trigger
- Any query using `::vector` cast (e.g., `search_parent_chunks_semantic`) after `SET search_path TO dev`
- Error: "type vector does not exist"
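
Since the schema name is interpolated into the statement with `format!`, one option is to build the string in a small helper that always appends `public` and rejects unexpected input. The sketch below is an assumption, not the project's code: `search_path_sql` is a hypothetical name, and the identifier rule (lowercase ASCII letters, digits, underscore) is deliberately stricter than what PostgreSQL accepts.

```rust
/// Builds the `SET search_path` statement for a connection hook, keeping
/// `public` on the path so extension types such as `vector` stay resolvable.
/// Hypothetical helper; the validation rule is an assumption.
fn search_path_sql(schema: &str) -> Result<String, String> {
    let mut chars = schema.chars();
    // First character: lowercase letter or underscore.
    let valid_start = matches!(chars.next(), Some(c) if c.is_ascii_lowercase() || c == '_');
    // Remaining characters: lowercase letters, digits, or underscore.
    let valid_rest = chars.all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '_');
    if !(valid_start && valid_rest) {
        return Err(format!("invalid schema name: {:?}", schema));
    }
    if schema == "public" {
        // Avoid emitting `public, public`.
        return Ok("SET search_path TO public".to_string());
    }
    Ok(format!("SET search_path TO {}, public", schema))
}
```

With `schema = "dev"` this yields the same statement as the fix above, `SET search_path TO dev, public`.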
---
## Bug 3: `get_chunk_by_chunk_id_and_uuid` Uses Wrong Column Name
The function queries `SELECT ... uuid, ... WHERE uuid = $2` but the `dev.chunk` table column is `file_uuid`, not `uuid`.
### Affected Code
```rust
// src/core/db/postgres_db.rs:2776
"SELECT ... uuid, ... FROM {} WHERE chunk_id = $1 AND uuid = $2"
```
### Fix
```sql
SELECT ... file_uuid as uuid, ... FROM dev.chunk WHERE chunk_id = $1 AND file_uuid = $2
```
### Trigger
- `GET /api/v1/file/:uuid/chunk/:chunk_id` → 500 "column uuid does not exist"
- `POST /api/v1/search/universal` → same error (internally calls this function)
---
## Bug 4 (Observation): PG Has No Embeddings for M5 Charade
M5's Charade data (`aeed71342a899fe4b4c57b7d41bcb692`) has 0 embeddings in PostgreSQL (all 4,188 sentence chunks have `embedding IS NULL`). Vectors are only in Qdrant.
### Impact
- `POST /api/v1/search/smart` returns empty results (requires `embedding IS NOT NULL`)
- `POST /api/v1/search/universal` works (queries Qdrant)
### M4 Workaround
Demo script step 20 changed from `search/smart` to `search/universal` to show meaningful results.
---
## M5 Response
All 4 items addressed and verified (37/37 API tests ✅); bugs 1-3 are fixed, item 4 is by design:
| # | Bug | Fix | Status |
|:-:|-----|-----|:------:|
| 1 | `dev.chunks` refs | 4 files updated: `postgres_db.rs`, `five_w1h_agent_api.rs`, `processor.rs`, `job_worker.rs` | ✅ |
| 2 | `search_path` missing `public` | `postgres_db.rs:745` → `SET search_path TO dev, public` | ✅ |
| 3 | `uuid` vs `file_uuid` | `postgres_db.rs:2777` → `/* ... */ AND file_uuid = $2` with `file_uuid as uuid` alias | ✅ |
| 4 | PG embeddings empty | Confirmed design choice — vectors in Qdrant only. `search/universal` works. | ⚠️ By design |
New source bundle updated at `docs_v1.0/M4_HANDOVER/momentry_core_v1.0.2_source.tar.gz`.

@@ -0,0 +1,70 @@
# API Report: Missing Chunk Detail Endpoint
**Date**: 2026-05-11
**From**: M4 (Integration & Testing)
**To**: M5 (Development)
**Priority**: Medium
---
## Issue
Portal's `ChunkDetailView` needs to fetch a single chunk by `file_uuid` + `chunk_id`. Currently no dedicated endpoint exists for this.
## Proposed Endpoint
```
GET /api/v1/file/:file_uuid/chunk/:chunk_id
```
- Method: `GET`
- Path params: `file_uuid` (String), `chunk_id` (String)
- Response: Single `Chunk` object (same structure as chunk items in search results)
## Existing Building Block
The DB-layer method already exists:
```rust
// src/core/db/postgres_db.rs:2770
pub async fn get_chunk_by_chunk_id_and_uuid(
&self,
chunk_id: &str,
uuid: &str,
) -> Result<Option<Chunk>>
```
Currently only used internally by the search handler (`server.rs:1494, 1534`). Only a route + handler wrapper is needed.
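
The wrapper's logic can be sketched without committing to a web framework: extract the two path parameters, look the chunk up, and map the result to a status code. Everything below is illustrative. The `Chunk` struct, the in-memory `HashMap` standing in for the database call, and the `400` response for malformed paths are assumptions; only the `200`/`404` behaviour and the `(chunk_id, uuid)` argument order come from this report.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the real `Chunk` type; the fields are assumed.
#[derive(Debug, Clone, PartialEq)]
struct Chunk {
    chunk_id: String,
    file_uuid: String,
}

/// Extracts `(file_uuid, chunk_id)` from a path shaped like
/// `/api/v1/file/:file_uuid/chunk/:chunk_id`.
fn parse_chunk_path(path: &str) -> Option<(&str, &str)> {
    let rest = path.strip_prefix("/api/v1/file/")?;
    let (file_uuid, chunk_id) = rest.split_once("/chunk/")?;
    let bad = |s: &str| s.is_empty() || s.contains('/');
    if bad(file_uuid) || bad(chunk_id) {
        return None;
    }
    Some((file_uuid, chunk_id))
}

/// Maps a lookup to an HTTP status. The map is keyed by `(chunk_id, file_uuid)`,
/// mirroring the argument order of `get_chunk_by_chunk_id_and_uuid`.
fn chunk_detail_status(store: &HashMap<(String, String), Chunk>, path: &str) -> u16 {
    match parse_chunk_path(path) {
        None => 400, // assumption: malformed paths are rejected as bad requests
        Some((file_uuid, chunk_id)) => {
            match store.get(&(chunk_id.to_string(), file_uuid.to_string())) {
                Some(_) => 200,
                None => 404,
            }
        }
    }
}
```

In the real handler the `HashMap` lookup would be replaced by the awaited call to `get_chunk_by_chunk_id_and_uuid`, with `Ok(Some(chunk))` serialized as the response body and `Ok(None)` mapped to `404`.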
## Why Not the Old `/file/:uuid/chunks` Endpoint
The old endpoint returned ALL chunks for a file, and the portal filtered client-side. This is inefficient. The new single-chunk endpoint replaces it cleanly.
## Portal Impact
`portal/src/views/ChunkDetailView.vue:245` currently calls:
```
GET /api/v1/file/{uuid}/chunks → filters client-side by chunk_id
```
Will switch to:
```
GET /api/v1/file/{uuid}/chunk/{chunk_id} → direct single result
```
## Temporary Workaround (M4 Only)
M4 has temporarily re-added `GET /api/v1/file/:file_uuid/chunks` for portal compatibility. This will be removed once the new endpoint is available.
## M5 Response
**Status**: ✅ Implemented
**Endpoint**: `GET /api/v1/file/:file_uuid/chunk/:chunk_id`
**Verified**:
- `0-01` → 200 (corrected chunk)
- `1446-01` → 200 (corrected chunk)
- `story_240` → 200 (story chunk)
- `nonexistent` → 404
M4 can remove the temporary workaround (`/file/:uuid/chunks`) and switch Portal to use this endpoint.