feat: ASRX hybrid pipeline, identity history, worker fixes, checkpoint system
184
deliverable_v1.1.0/modules/03_register.md
Normal file
@@ -0,0 +1,184 @@
<!-- module: register -->
<!-- description: File registration — register, scan -->
<!-- depends: 01_auth -->

## File Registration

### `POST /api/v1/files/register`

**Auth**: Required
**Scope**: file-level

Register a video file for processing. Returns the file's metadata and UUID.

**New in v0.1.2**: Registration now **automatically triggers the processing pipeline**, so a separate call to `POST /api/v1/file/:file_uuid/process` is no longer needed. The system will:

1. Register the file and run ffprobe
2. Auto-run the offline TMDb probe (reads local identity files, makes no API calls)
3. Create a monitor job for the worker
4. Hand off to the worker, which starts all 10 processors (Cut → ASR → ASRX → YOLO → OCR → Face → Pose → VisualChunk → Story → 5W1H)

If the file already exists (same content hash), the endpoint returns the existing record with `already_exists: true`.

#### Request Parameters

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `file_path` | string | Yes | — | Path to video file on disk |
| `pattern` | string | No | — | Regex pattern for batch register (requires `file_path` to be a directory) |
| `user_id` | integer | No | — | User ID to associate with registration |
| `content_hash` | string | No | — | Pre-computed SHA-256 hash (skips computation) |

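
Supplying `content_hash` lets a client do the hashing itself. A minimal sketch of computing the digest locally, assuming the API expects the lowercase hex form of the SHA-256 of the file's bytes (the encoding is not stated here) and that GNU `sha256sum` is installed; `content_hash_of` is a hypothetical helper name:

```bash
# Hypothetical helper: print the SHA-256 digest of a file as lowercase hex.
# Assumption: the API expects hex; this document does not specify the encoding.
content_hash_of() {
  # Reading from stdin keeps the file name out of sha256sum's output.
  sha256sum < "$1" | cut -d' ' -f1
}
```

The printed value can then be sent as `content_hash` alongside `file_path` in the request body.
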
#### Example

```bash
# Register a single file
curl -s -X POST "$API/api/v1/files/register" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $KEY" \
  -d '{"file_path": "/path/to/video.mp4"}'

# Batch register files matching a pattern in a directory
curl -s -X POST "$API/api/v1/files/register" \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $KEY" \
  -d '{"file_path": "/path/to/dir", "pattern": ".*\\.mp4$"}'
```

#### Response (200)

```json
{
  "success": true,
  "file_uuid": "3a6c1865...",
  "file_name": "video.mp4",
  "file_path": "/path/to/video.mp4",
  "file_type": "video",
  "duration": 120.5,
  "width": 1920,
  "height": 1080,
  "fps": 24.0,
  "total_frames": 2892,
  "already_exists": false,
  "message": "File registered successfully"
}
```

| Field | Type | Description |
|-------|------|-------------|
| `success` | boolean | Always true on 200 |
| `file_uuid` | string | 32-char hex UUID of the registered file |
| `file_name` | string | File name (auto-renamed if name conflict) |
| `file_path` | string | Canonical path on disk |
| `file_type` | string | `"video"`, `"audio"`, or `"unknown"` |
| `duration` | float | Duration in seconds |
| `width` | integer | Video width in pixels |
| `height` | integer | Video height in pixels |
| `fps` | float | Frames per second |
| `total_frames` | integer | Total frame count |
| `already_exists` | boolean | True if same content was already registered |
| `message` | string | Human-readable status |

#### Error Responses

| HTTP | When |
|------|------|
| `401` | Missing or invalid API key |
| `400` | Invalid request body |
| `404` | File path does not exist |

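
A caller usually needs the HTTP status as well as the body to tell these cases apart. A minimal sketch that pairs curl's `-w '%{http_code}'` with a small lookup; `describe_register_status` is a hypothetical helper, and treating any other code as unexpected is an assumption, since only the three errors above are documented:

```bash
# Hypothetical helper: map a status code from /api/v1/files/register
# to the cause documented in the error table.
describe_register_status() {
  case "$1" in
    200) echo "ok" ;;
    400) echo "invalid request body" ;;
    401) echo "missing or invalid API key" ;;
    404) echo "file path does not exist" ;;
    *)   echo "unexpected status $1" ;;
  esac
}

# Usage (needs a live server, so it is left commented out):
# status=$(curl -s -o /tmp/register.json -w '%{http_code}' \
#   -X POST "$API/api/v1/files/register" \
#   -H "Content-Type: application/json" -H "X-API-Key: $KEY" \
#   -d '{"file_path": "/path/to/video.mp4"}')
# describe_register_status "$status"
```
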
---

### `GET /api/v1/files/scan`

**Auth**: Required
**Scope**: file-level

Scan the filesystem directory and list all media files, showing which are registered, processing, or unregistered.

#### Query Parameters

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `page` | integer | No | 1 | Page number (1-based) |
| `page_size` | integer | No | all | Items per page (alias: `limit`) |
| `limit` | integer | No | all | Max items (alias for `page_size`) |
| `pattern` | string | No | — | Regex filter on file name (e.g., `.*\.mp4$`) |
| `sort_by` | string | No | `name` | Sort field: `name`, `size`, `modified`, `status` |
| `sort_order` | string | No | `asc` | Sort direction: `asc` or `desc` |

#### Example

```bash
# Full scan
curl -s "$API/api/v1/files/scan" -H "X-API-Key: $KEY" | jq '{total, registered_count, unregistered_count}'

# Paginated (page 1, 5 per page)
curl -s "$API/api/v1/files/scan?page=1&page_size=5" -H "X-API-Key: $KEY" | jq '{page, total_pages, files: [.files[].file_name]}'

# Regex filter: only mp4 files
curl -s "$API/api/v1/files/scan?pattern=.*\\.mp4$" -H "X-API-Key: $KEY" | jq '{filtered_total, files: [.files[].file_name]}'

# Sort by file size (largest first)
curl -s "$API/api/v1/files/scan?sort_by=size&sort_order=desc&page_size=5" -H "X-API-Key: $KEY" | jq '[.files[] | {file_name, file_size}]'

# Sort by modified time (most recent first)
curl -s "$API/api/v1/files/scan?sort_by=modified&sort_order=desc&page_size=5" -H "X-API-Key: $KEY" | jq '[.files[] | {file_name, modified_time}]'

# Sort by status
curl -s "$API/api/v1/files/scan?sort_by=status&page_size=5" -H "X-API-Key: $KEY" | jq '[.files[] | {file_name, status}]'
```

#### Response (200)

```json
{
  "files": [
    {
      "file_name": "video.mp4",
      "file_size": 12345678,
      "is_registered": true,
      "file_uuid": "3a6c1865...",
      "status": "completed",
      "registration_time": "2026-05-16T12:00:00Z",
      "job_id": 42
    }
  ],
  "total": 107,
  "filtered_total": 80,
  "page": 1,
  "page_size": 20,
  "total_pages": 4,
  "registered_count": 26,
  "unregistered_count": 81
}
```

| Field | Type | Description |
|-------|------|-------------|
| `files` | array | Array of file info objects (paginated) |
| `files[].file_name` | string | File name |
| `files[].relative_path` | string | Path relative to scan root |
| `files[].file_path` | string | Absolute path on disk |
| `files[].file_size` | integer | File size in bytes |
| `files[].modified_time` | string | Last modified timestamp (ISO8601) |
| `files[].is_registered` | boolean | Whether file is registered in DB |
| `files[].file_uuid` | string | 32-char hex UUID (only if registered) |
| `files[].status` | string | `"completed"`, `"processing"`, `"registered"`, `"unregistered"`, or `null` |
| `files[].registration_time` | string | DB registration timestamp (only if registered) |
| `files[].job_id` | integer | Processing job ID (only if a job exists) |
| `total` | integer | Total files found on disk (unfiltered) |
| `filtered_total` | integer | Files matching regex filter |
| `page` | integer | Current page number |
| `page_size` | integer | Items per page |
| `total_pages` | integer | Total pages |
| `registered_count` | integer | Files registered in DB |
| `unregistered_count` | integer | Files not yet registered |

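
In the example response, `total_pages` (4) equals `filtered_total` (80) divided by `page_size` (20), rounded up. A small sketch of that arithmetic, assuming the server counts pages over the filtered set rather than over `total`; `total_pages` here is a hypothetical helper, and it expects a positive `page_size`:

```bash
# Hypothetical helper: ceiling division of filtered_total by page_size.
# Assumption: pages are counted over the regex-filtered set, as the example suggests.
total_pages() {
  local filtered_total=$1 page_size=$2
  echo $(( (filtered_total + page_size - 1) / page_size ))
}
```

With the example values, `total_pages 80 20` prints `4`, and one extra file (`total_pages 81 20`) pushes the count to `5`.
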
#### Notes

| Feature | Behavior |
|---------|----------|
| **Regex** | Case-insensitive (`(?i)` prefix auto-applied). Applied to `file_name`. |
| **Sort order** | Default (`sort_by=name`): registered files first, then alphabetically. `sort_by=status`: alphabetical by status string. |
| **Pagination** | `page_size` and `limit` are aliases. Default: show all results. |
| **Processing order** | `pattern` regex filter → `sort_by`/`sort_order` → `page`/`page_size` slice. |

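
The processing order can be approximated client-side on a plain list of file names. A rough sketch, assuming `grep -iE` is close enough to the server's regex dialect for simple patterns, using byte-order sorting, and ignoring the registered-first grouping of the default sort; `scan_view` is a hypothetical helper that reads one name per line on stdin:

```bash
# Hypothetical helper: filter -> sort -> page slice over file names on stdin.
# Usage: scan_view PATTERN PAGE PAGE_SIZE
scan_view() {
  local pattern=$1 page=$2 page_size=$3
  grep -iE -- "$pattern" |                 # 1. case-insensitive regex filter
    LC_ALL=C sort |                        # 2. sort by name, ascending (byte order)
    awk -v p="$page" -v s="$page_size" \
      'NR > (p - 1) * s && NR <= p * s'    # 3. keep only the requested page
}
```

Because the filter runs first, page numbers refer to positions in the filtered, sorted list, not in the full directory listing.
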