fix: M4 Phase 1 bugs - dev.chunks refs, search_path, uuid column
Bug fixes from M4 report:
- 4 remaining dev.chunks → dev.chunk in SQL queries
- search_path includes public for pgvector extension
- get_chunk_by_chunk_id_and_uuid: uuid → file_uuid
- New endpoint: GET /api/v1/file/:uuid/chunk/:chunk_id
97
docs_v1.0/M4_workspace/2026-05-11_Phase1_bug_report.md
Normal file
@@ -0,0 +1,97 @@
# Bug Report: Schema Migration Bugs Found During M4 Phase 1 Testing

**Date**: 2026-05-11
**From**: M4 (Integration & Testing)
**To**: M5 (Development)
**Priority**: High (3 bugs, all cause 500 errors)

---

## Bug 1: Stale `dev.chunks` References After Table Rename

M5 renamed `dev.chunks` → `dev.chunk`, but 4 SQL queries still hardcoded `dev.chunks`, causing "relation does not exist" errors.

### Affected Files

| File | Line | Old | New |
|------|------|-----|-----|
| `src/core/db/postgres_db.rs` | 4626 | `FROM dev.chunks` | `FROM dev.chunk` |
| `src/api/five_w1h_agent_api.rs` | 779 | `UPDATE dev.chunks SET embedding` | `UPDATE dev.chunk SET embedding` |
| `src/worker/processor.rs` | 1083 | `UPDATE dev.chunks SET metadata` | `UPDATE dev.chunk SET metadata` |
| `src/worker/job_worker.rs` | 1005 | `FROM dev.chunks WHERE` | `FROM dev.chunk WHERE` |

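### Why M5 Tests Passed
M5's environment likely still had the old `dev.chunks` table, for example because it was never dropped there. (`ALTER TABLE ... RENAME TO` itself does not keep the source table, so the old name would only survive if M5's database was migrated some other way.) M4 followed the migration exactly (RENAME), so the old table is gone.
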
### Trigger
- `POST /api/v1/search/smart` → 500 error "relation dev.chunks does not exist"
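
Stale references like these can be caught before runtime with a small source scan. The following is a minimal sketch, not part of the project: `find_stale_refs` and `mentions_identifier` are hypothetical helper names, and the scan assumes identifiers consist of ASCII letters, digits, and underscores.

```rust
/// True for characters that can be part of an SQL-style identifier.
fn is_ident_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}

/// True when `name` occurs in `line` as a complete identifier, i.e. it is
/// not embedded in a longer name on either side.
fn mentions_identifier(line: &str, name: &str) -> bool {
    line.match_indices(name).any(|(start, _)| {
        let before = line[..start].chars().next_back();
        let after = line[start + name.len()..].chars().next();
        !before.map_or(false, is_ident_char) && !after.map_or(false, is_ident_char)
    })
}

/// Returns (1-based line number, line text) for every line of `source`
/// that still mentions `old_name`.
fn find_stale_refs(source: &str, old_name: &str) -> Vec<(usize, String)> {
    source
        .lines()
        .enumerate()
        .filter(|(_, line)| mentions_identifier(line, old_name))
        .map(|(idx, line)| (idx + 1, line.to_string()))
        .collect()
}
```

Run over each `.rs` file with `old_name = "dev.chunks"`, this would flag the four lines in the table above while ignoring queries that already use `dev.chunk`.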

---

## Bug 2: `search_path` Missing `public` (pgvector `vector` Type Not Found)

`after_connect` runs `SET search_path TO dev`, which replaces the default `"$user", public` and so loses access to the `public` schema, where the pgvector extension lives.

### Affected Code
```rust
// src/core/db/postgres_db.rs:745
sqlx::query(&format!("SET search_path TO {}", schema))
```

### Fix
```rust
sqlx::query(&format!("SET search_path TO {}, public", schema))
```

### Trigger
- Any query using `::vector` cast (e.g., `search_parent_chunks_semantic`) after `SET search_path TO dev`
- Error: "type vector does not exist"
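
Because the schema name is interpolated into the statement with `format!` rather than bound as a parameter, it may be worth validating it before building the SQL. A minimal sketch under that assumption follows; `search_path_sql` is a hypothetical helper, not a function from the codebase, and it only accepts plain unquoted identifiers.

```rust
/// Builds the `SET search_path` statement for the connection hook.
/// Rejects anything that is not a plain identifier, since the schema
/// name ends up inside the SQL text itself.
fn search_path_sql(schema: &str) -> Result<String, String> {
    let mut chars = schema.chars();
    let valid_start = matches!(chars.next(), Some(c) if c.is_ascii_alphabetic() || c == '_');
    let valid_rest = chars.all(|c| c.is_ascii_alphanumeric() || c == '_');
    if !(valid_start && valid_rest) {
        return Err(format!("invalid schema name: {:?}", schema));
    }
    // Keep `public` on the path so extension types such as `vector` still resolve.
    Ok(format!("SET search_path TO {}, public", schema))
}
```

The returned string could then be passed to `sqlx::query` in place of the inline `format!` call shown in the fix.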

---

## Bug 3: `get_chunk_by_chunk_id_and_uuid` Uses Wrong Column Name

The function queries `SELECT ... uuid, ... WHERE uuid = $2`, but the `dev.chunk` table column is `file_uuid`, not `uuid`.

### Affected Code
```rust
// src/core/db/postgres_db.rs:2776
"SELECT ... uuid, ... FROM {} WHERE chunk_id = $1 AND uuid = $2"
```

### Fix
```sql
SELECT ... file_uuid AS uuid, ... FROM dev.chunk WHERE chunk_id = $1 AND file_uuid = $2
```

### Trigger
- `GET /api/v1/file/:uuid/chunk/:chunk_id` → 500 "column uuid does not exist"
- `POST /api/v1/search/universal` → same error (internally calls this function)
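
The shape of the fix can be sketched as a query builder. This is illustrative only: `chunk_lookup_sql` is a hypothetical name, and the SELECT list is reduced to the two columns named in this report because the full list is elided above. The alias presumably exists so that code reading a `uuid` field from the row keeps working.

```rust
/// Builds the single-chunk lookup against the renamed column.
/// Only `chunk_id` and `file_uuid` are selected here; the real column
/// list is elided in the report and is not reproduced.
fn chunk_lookup_sql(table: &str) -> String {
    format!(
        "SELECT chunk_id, file_uuid AS uuid FROM {} WHERE chunk_id = $1 AND file_uuid = $2",
        table
    )
}
```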

---

## Bug 4 (Observation): PG Has No Embeddings for M5 Charade

M5's Charade data (`aeed71342a899fe4b4c57b7d41bcb692`) has 0 embeddings in PostgreSQL (all 4,188 sentence chunks have `embedding IS NULL`). Vectors are only in Qdrant.

### Impact
- `POST /api/v1/search/smart` returns empty results (requires `embedding IS NOT NULL`)
- `POST /api/v1/search/universal` works (queries Qdrant)

### M4 Workaround
Demo script step 20 was changed from `search/smart` to `search/universal` to show meaningful results.

---

## M5 Response

All 4 items addressed and verified (37/37 API tests ✅). Bugs 1-3 are fixed; item 4 is confirmed as by design:

| # | Bug | Fix | Status |
|:-:|-----|-----|:------:|
| 1 | `dev.chunks` refs | 4 files updated: `postgres_db.rs`, `five_w1h_agent_api.rs`, `processor.rs`, `job_worker.rs` | ✅ |
| 2 | `search_path` missing `public` | `postgres_db.rs:745`: `SET search_path TO dev, public` | ✅ |
| 3 | `uuid` vs `file_uuid` | `postgres_db.rs:2777`: `/* ... */ AND file_uuid = $2` with `file_uuid as uuid` alias | ✅ |
| 4 | PG embeddings empty | Confirmed design choice: vectors in Qdrant only. `search/universal` works. | ⚠️ By design |

The updated source bundle is at `docs_v1.0/M4_HANDOVER/momentry_core_v1.0.2_source.tar.gz`.
70
docs_v1.0/M4_workspace/2026-05-11_chunk_detail_endpoint.md
Normal file
@@ -0,0 +1,70 @@
# API Report: Missing Chunk Detail Endpoint

**Date**: 2026-05-11
**From**: M4 (Integration & Testing)
**To**: M5 (Development)
**Priority**: Medium

---

## Issue

Portal's `ChunkDetailView` needs to fetch a single chunk by `file_uuid` + `chunk_id`. Currently no dedicated endpoint exists for this.

## Proposed Endpoint

```
GET /api/v1/file/:file_uuid/chunk/:chunk_id
```

- Method: `GET`
- Path params: `file_uuid` (String), `chunk_id` (String)
- Response: Single `Chunk` object (same structure as chunk items in search results)

## Existing Building Block

The DB-layer method already exists:

```rust
// src/core/db/postgres_db.rs:2770
pub async fn get_chunk_by_chunk_id_and_uuid(
    &self,
    chunk_id: &str,
    uuid: &str,
) -> Result<Option<Chunk>>
```

It is currently used only internally by the search handler (`server.rs:1494, 1534`), so only a route and a handler wrapper are needed.
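
The wrapper logic is small enough to sketch without a web framework. The code below is a framework-free illustration under stated assumptions: the `Chunk` struct is a stand-in with invented fields, `parse_chunk_path` and `status_for` are hypothetical names, and mapping a DB error to 500 is an assumption (only the 200 and 404 cases are stated in this report).

```rust
/// Stand-in for the project's `Chunk` type; the real fields are not shown in this report.
#[derive(Debug, Clone, PartialEq)]
struct Chunk {
    chunk_id: String,
    file_uuid: String,
}

/// Extracts (file_uuid, chunk_id) from `/api/v1/file/:file_uuid/chunk/:chunk_id`.
fn parse_chunk_path(path: &str) -> Option<(&str, &str)> {
    let rest = path.strip_prefix("/api/v1/file/")?;
    let (file_uuid, chunk_id) = rest.split_once("/chunk/")?;
    // Both params must be single, non-empty path segments.
    let bad = |s: &str| s.is_empty() || s.contains('/');
    if bad(file_uuid) || bad(chunk_id) {
        return None;
    }
    Some((file_uuid, chunk_id))
}

/// Maps the DB-layer result onto an HTTP status code.
fn status_for(lookup: &Result<Option<Chunk>, String>) -> u16 {
    match lookup {
        Ok(Some(_)) => 200, // chunk found
        Ok(None) => 404,    // no such chunk for this file
        Err(_) => 500,      // assumption: DB failures surface as 500
    }
}
```

In the real handler, the router would supply the two path params and the `Option<Chunk>` would come from `get_chunk_by_chunk_id_and_uuid`; note that the DB method takes `chunk_id` first and `uuid` second, the reverse of the order in the path.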

## Why Not the Old `/file/:uuid/chunks` Endpoint

The old endpoint returned ALL chunks for a file, and the portal filtered client-side. This is inefficient because every chunk is transferred just to display one. The new single-chunk endpoint replaces it cleanly.

## Portal Impact

`portal/src/views/ChunkDetailView.vue:245` currently calls:
```
GET /api/v1/file/{uuid}/chunks → filters client-side by chunk_id
```
It will switch to:
```
GET /api/v1/file/{uuid}/chunk/{chunk_id} → direct single result
```

## Temporary Workaround (M4 Only)

M4 has temporarily re-added `GET /api/v1/file/:file_uuid/chunks` for portal compatibility. This will be removed once the new endpoint is available.

## M5 Response

**Status**: ✅ Implemented

**Endpoint**: `GET /api/v1/file/:file_uuid/chunk/:chunk_id`

**Verified**:
- `0-01` → 200 (corrected chunk)
- `1446-01` → 200 (corrected chunk)
- `story_240` → 200 (story chunk)
- `nonexistent` → 404

M4 can remove the temporary workaround (`/file/:uuid/chunks`) and switch Portal to use this endpoint.