refactor: rename search uuid -> file_uuid

Accusys
2026-05-18 01:17:48 +08:00
parent 245ef39f03
commit 4125163f7b
4 changed files with 150 additions and 15 deletions

@@ -0,0 +1,135 @@
<!-- module: search -->
<!-- description: Vector search, BM25, smart search, universal search, visual search -->
<!-- depends: 01_auth -->
## Search APIs
### `POST /api/v1/search/smart`
**Auth**: Required
**Scope**: file-level
Semantic vector search. The query is embedded with EmbeddingGemma-300m (served on port 11436), and the resulting vector is compared by cosine similarity against the pgvector embeddings of `story_parent` and `llm_parent` chunks.
#### Request Parameters
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `file_uuid` | string | Yes | — | File UUID to search within |
| `query` | string | Yes | — | Search text |
| `page` | integer | No | 1 | Page number |
| `page_size` | integer | No | 5 | Items per page |
#### Example
```bash
curl -s -X POST "$API/api/v1/search/smart" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $JWT" \
-d '{"file_uuid": "'"$FILE_UUID"'", "query": "Audrey Hepburn"}'
```
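The same request can be assembled from Python. This is a minimal sketch rather than the project's own client: `build_smart_search_request`, `api_base`, and `jwt` are hypothetical names, the defaults mirror the parameter table above, and the client-side validation is an assumption about what the server expects. The helper only constructs the request and does not send it.

```python
import json
import urllib.request


def build_smart_search_request(api_base, jwt, file_uuid, query, page=1, page_size=5):
    """Build (but do not send) a POST request for /api/v1/search/smart.

    Hypothetical helper: defaults follow the documented ones
    (page=1, page_size=5).
    """
    if not file_uuid or not query:
        raise ValueError("file_uuid and query are required")
    if page < 1 or page_size < 1:
        # Assumption: the API expects positive page numbers and sizes.
        raise ValueError("page and page_size must be positive integers")
    body = {
        "file_uuid": file_uuid,
        "query": query,
        "page": page,
        "page_size": page_size,
    }
    return urllib.request.Request(
        url=api_base.rstrip("/") + "/api/v1/search/smart",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + jwt,
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call; keeping construction separate makes the payload easy to inspect before anything goes over the network.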
#### Response (200)
```json
{
"query": "Audrey Hepburn",
"results": [
{
"parent_id": 12345,
"start_time": 299.0,
"end_time": 300.0,
"summary": "[299s-300s, 1s] Cast: Audrey Hepburn. Total: 1 lines, 5 words...",
"similarity": 0.72
}
],
"strategy": "semantic_vector_search"
}
```
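A caller will often want to drop weak matches before showing them. The sketch below works on a response shaped like the one above; `filter_by_similarity`, `format_hit`, and the `0.5` cutoff are illustrative assumptions, not values defined by the API.

```python
def filter_by_similarity(response, min_similarity=0.5):
    """Return hits at or above min_similarity, highest similarity first.

    The 0.5 default is an arbitrary illustrative cutoff.
    """
    hits = [
        r for r in response.get("results", [])
        if r.get("similarity", 0.0) >= min_similarity
    ]
    return sorted(hits, key=lambda r: r["similarity"], reverse=True)


def format_hit(hit):
    """Render one hit as a single line with its time span and similarity."""
    return "{:.1f}s-{:.1f}s (sim {:.2f}): {}".format(
        hit["start_time"], hit["end_time"], hit["similarity"], hit["summary"]
    )
```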
---
### `POST /api/v1/search/universal`
**Auth**: Required
**Scope**: file-level
Multi-type BM25 full-text search across chunks, frames, and persons. Uses PostgreSQL `tsvector`.
#### Request Parameters
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `query` | string | Yes | — | Search text |
| `file_uuid` | string | No | — | Restrict to specific file |
| `types` | string[] | No | `["chunk","frame","person"]` | Search types |
| `page` | integer | No | 1 | Page number |
| `page_size` | integer | No | 20 | Items per page |
#### Example
```bash
curl -s -X POST "$API/api/v1/search/universal" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $JWT" \
-d '{"file_uuid": "'"$FILE_UUID"'", "query": "Cary Grant"}'
```
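The curl example relies on the default `types`. To restrict the search to one kind of result, the body can be built as in the following sketch. `build_universal_payload` is a hypothetical helper, and treating the three default values as the complete set of allowed types is an assumption. Optional fields are omitted when unset so the server-side defaults apply.

```python
import json

# Assumption: the documented default list is also the full set of valid types.
ALLOWED_TYPES = ("chunk", "frame", "person")


def build_universal_payload(query, file_uuid=None, types=None, page=None, page_size=None):
    """Return the JSON body for /api/v1/search/universal as a string."""
    if not query:
        raise ValueError("query is required")
    payload = {"query": query}
    if types is not None:
        unknown = [t for t in types if t not in ALLOWED_TYPES]
        if unknown:
            raise ValueError("unsupported search types: " + ", ".join(unknown))
        payload["types"] = list(types)
    # Only send optional fields the caller actually set.
    for key, value in (("file_uuid", file_uuid), ("page", page), ("page_size", page_size)):
        if value is not None:
            payload[key] = value
    return json.dumps(payload)
```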
#### Response (200)
```json
{
"results": [
{
"type": "chunk",
"chunk_id": "uuid_1429",
"chunk_type": "story_child",
"start_time": 429.16,
"end_time": 430.5,
"text": "You could have the stamps.",
"score": 0.9
}
],
"total": 20,
"took_ms": 18
}
```
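Because one response can mix chunks, frames, and persons, it can help to bucket results by their `type` field before rendering. A minimal sketch follows; `group_by_type` is a hypothetical helper, and since only the `chunk` result shape is documented above, it relies on nothing but the `type` key.

```python
from collections import defaultdict


def group_by_type(response):
    """Group universal-search results by their "type" field.

    Results without a "type" key are collected under "unknown".
    """
    grouped = defaultdict(list)
    for item in response.get("results", []):
        grouped[item.get("type", "unknown")].append(item)
    return dict(grouped)
```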
---
### `POST /api/v1/search/frames`
**Auth**: Required
**Scope**: file-level
Search face-detection frames by identity name or trace ID.
---
### `POST /api/v1/search/identity_text`
**Auth**: Required
**Scope**: file-level
Search text chunks spoken by a specific identity.
---
### Visual Search
| Method | Endpoint | Description |
|--------|----------|-------------|
| POST | `/api/v1/search/visual` | Search visual chunks |
| POST | `/api/v1/search/visual/class` | Search by object class |
| POST | `/api/v1/search/visual/density` | Search by object density |
| POST | `/api/v1/search/visual/combination` | Search by object combination |
| POST | `/api/v1/search/visual/stats` | Visual chunk statistics |
---
### Embedding Model
| Detail | Value |
|--------|-------|
| **Model** | EmbeddingGemma-300m |
| **Endpoint** | `POST /api/v1/embeddings` on port 11436 |
| **Dimension** | 768 |
| **Storage** | pgvector (`chunk.embedding` column) |
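For reference, the similarity score can be reproduced client-side. The sketch below computes cosine similarity between two 768-dimensional vectors in plain Python. It illustrates the metric only and is not the service's implementation. pgvector's `<=>` operator returns cosine distance, which is 1 minus the cosine similarity; whether the service uses that operator is an assumption. `cosine_similarity` and `EMBEDDING_DIM` are illustrative names.

```python
import math

EMBEDDING_DIM = 768  # dimension listed in the table above


def cosine_similarity(a, b):
    """Cosine similarity of two EMBEDDING_DIM-length vectors."""
    if len(a) != EMBEDDING_DIM or len(b) != EMBEDDING_DIM:
        raise ValueError("expected vectors of length %d" % EMBEDDING_DIM)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

The explicit length check catches a common integration mistake: comparing vectors produced by a different embedding model with a different dimension.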