docs: add thumbnail endpoint to 08_media.md

This commit is contained in:
Accusys
2026-05-22 04:58:43 +08:00
parent d67f123949
commit 6378d7be89
3 changed files with 296 additions and 0 deletions

View File

@@ -0,0 +1,252 @@
<!-- module: media -->
<!-- description: Video streaming & frame extraction -->
<!-- depends: 01_auth -->
## Video Streaming & Frame Extraction
All video streaming endpoints support the following common query parameters:
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `mode` | string | No | `normal` | `normal` or `debug` (draws detection overlays) |
| `audio` | string | No | `on` | `on` or `off` |
---
### `GET /api/v1/file/:file_uuid/video`
Stream the full video file with range support for seeking.
**Auth**: Required
**Scope**: file-level
#### Response
- **200**: Video stream (`Content-Type` based on file extension)
- **206**: Partial content (range request)
- Supports `Range` header for seeking
---
### `GET /api/v1/file/:file_uuid/trace/:trace_id/video`
Stream video with highlights for a specific face trace (follows a single person across frames with bounding box overlay).
**Auth**: Required
**Scope**: file-level
---
### `GET /api/v1/file/:file_uuid/trace/:trace_id/representative-face`
Find the best single face to represent this trace. Uses a two-stage selection: SQL (area × confidence → top 10) then FFmpeg `blurdetect` (sharpness → pick the least blurry).
**Auth**: Required
**Scope**: file-level
#### Example
```bash
curl -s "$API/api/v1/file/$FILE_UUID/trace/1939/representative-face" \
-H "X-API-Key: $KEY"
```
#### Response (200)
```json
{
"success": true,
"file_uuid": "aeed71342a899fe4b4c57b7d41bcb692",
"trace_id": 1939,
"face_count": 538,
"representative": {
"frame_number": 68193,
"timestamp_secs": 2727.72,
"bbox": { "x": 347, "y": 378, "width": 427, "height": 427 },
"confidence": 0.760,
"quality_score": 138516,
"blur_score": 9.46
}
}
```
#### Response Fields
| Field | Type | Description |
|-------|------|-------------|
| `trace_id` | integer | Face trace ID |
| `face_count` | integer | Total face detections in this trace |
| `representative.frame_number` | integer | Frame number of the selected face (primary coordinate) |
| `representative.timestamp_secs` | float | Time in seconds (derived from `frame_number / fps`) |
| `representative.bbox` | object | Bounding box `{x, y, width, height}` |
| `representative.confidence` | float | Detection confidence (0.01.0) |
| `representative.quality_score` | float | Pre-selection score (`area × confidence`) |
| `representative.blur_score` | float | FFmpeg blurdetect result (lower = sharper) |
#### Error Responses
---
### `GET /api/v1/file/:file_uuid/trace/:trace_id/thumbnail`
Extract the best face image for a trace as JPEG (320×320). Internally selects the face using the same two-stage algorithm as `representative-face`, then crops via FFmpeg. The result is cacheable for 24 hours.
**Auth**: Required
**Scope**: file-level
#### Example
```bash
curl -s "$API/api/v1/file/$FILE_UUID/trace/1939/thumbnail" \
-H "X-API-Key: $KEY" -o trace_1939_face.jpg
```
#### Response
- **200**: `image/jpeg` binary data (320×320 cropped face)
- **404**: File, trace not found, or no suitable face
- **500**: FFmpeg or database error
| HTTP | When |
|------|------|
| `404` | File, trace not found, or no suitable face |
| `500` | FFmpeg or database error |
---
### `GET /api/v1/file/:file_uuid/video/bbox`
Stream video with bounding box overlay for all detected objects/faces.
**Auth**: Required
**Scope**: file-level
Uses a built-in 5×7 bitmap font renderer to draw labels directly on video frames via FFmpeg `drawtext` filter.
---
### `GET /api/v1/file/:file_uuid/thumbnail`
Extract a single frame from a video as JPEG image. Uses FFmpeg `select` filter.
**Auth**: Required
**Scope**: file-level
#### Query Parameters
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `frame` | integer | Yes | — | Zero-based frame number to extract |
| `x` | integer | No | — | Crop start X (left edge). Requires `y`, `w`, `h`. |
| `y` | integer | No | — | Crop start Y (top edge). Requires `x`, `w`, `h`. |
| `w` | integer | No | — | Crop width in pixels. Requires `x`, `y`, `h`. |
| `h` | integer | No | — | Crop height in pixels. Requires `x`, `y`, `w`. |
All four crop params (`x`, `y`, `w`, `h`) must be provided together or omitted.
#### Example
```bash
# Extract frame 1000 (full frame)
curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/thumbnail?frame=1000" \
-H "Authorization: Bearer $JWT" -o frame_1000.jpg
# Extract and crop face region (x=320, y=240, w=160, h=160)
curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/thumbnail?frame=1000&x=320&y=240&w=160&h=160" \
-H "Authorization: Bearer $JWT" -o face_crop.jpg
```
#### Response
- **200**: `image/jpeg` binary data
- **404**: File not found
- **500**: FFmpeg error (e.g., frame number exceeds video duration)
### `GET /api/v1/file/:file_uuid/clip`
Extract a video clip (time range) as MPEG-TS stream. Uses FFmpeg `-ss` fast seek.
**Auth**: Required
**Scope**: file-level
#### Query Parameters
| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `start_frame` | integer | No* | — | Start frame (zero-based). **Frame-accurate** — use this for precision. |
| `end_frame` | integer | No* | — | End frame (zero-based, inclusive). Requires `start_frame`. |
| `start_time` | float | No* | — | Start time in seconds. Approximate (FPS-dependent). Fallback if frames not given. |
| `end_time` | float | No* | — | End time in seconds. Approximate (FPS-dependent). Fallback if frames not given. |
| `fps` | float | No | video FPS | Override frames-per-second for frame↔time calculation. Defaults to video's detected FPS. |
| `mode` | string | No | `normal` | `normal` or `debug` (draws "CLIP" overlay) |
| `audio` | string | No | `on` | `on` or `off` |
Either (`start_frame`+`end_frame`) OR (`start_time`+`end_time`) must be provided.
#### Example
```bash
# Clip by frame range (primary)
curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/clip?start_frame=0&end_frame=47" \
-H "Authorization: Bearer $JWT" -o clip.ts
# Clip by time range (fallback)
curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/clip?start_time=30&end_time=45" \
-H "Authorization: Bearer $JWT" -o clip.ts
```
#### Response
- **200**: `video/mp2t` MPEG-TS stream
- **400**: Missing/invalid range parameters
- **404**: File not found
- **500**: FFmpeg error
#### Technical Notes
| Detail | Value |
|--------|-------|
| **Backend** | FFmpeg (`ffmpeg-full`) |
| **Seek** | `-ss` before `-i` (fast keyframe seek) |
| **Format** | MPEG-TS (`mpegts` muxer, pipe-safe) |
| **Codec** | H.264 + AAC |
| **Cache** | `Cache-Control: public, max-age=86400` (24h) |
### Video vs Clip: Quality & Format Comparison
Both endpoints support time range extraction, but serve different use cases:
| Feature | `/video` | `/clip` |
|---------|----------|---------|
| **No params** | Streams full file (Range seek) | Returns 400 (params required) |
| **HTTP Range** | ✅ Supported | ❌ Not supported |
| **Encoding** | `-c copy` (zero encoding) | `-c:v libx264 -c:a aac` (re-encode) |
| **Quality** | Original (bit-exact, zero loss) | Compressed (default CRF ≈ 23) |
| **Format** | `video/mp4` | `video/mp2t` (MPEG-TS) |
| **Speed** | Fast (no computation) | Slower (encoding required) |
| **Frame control** | Time-based (`dur = (ef-sf)/fps`) | Precise (`-vframes`) |
| **Debug mode** | ❌ | ✅ `mode=debug` overlay |
| **Cache** | ❌ | ✅ `max-age=86400` |
#### Usage Recommendation
| Scenario | Use |
|----------|-----|
| Full video streaming / player seek | `/video` |
| Quick preview clip (zero quality loss) | `/video?start_frame=...&end_frame=...` |
| Debug frame verification / text overlay | `/clip?mode=debug` |
| Precise frame count control | `/clip` |
| CDN cacheable clip | `/clip` |
---
| Detail | Value |
|--------|-------|
| **Backend** | FFmpeg (`ffmpeg-full`) |
| **Filter** | `select=eq(n\,FRAME)` to select frame, optional `crop=W:H:X:Y` |
| **Output** | Single JPEG via pipe (`image2pipe`, `mjpeg` codec) |
| **Cache** | `Cache-Control: public, max-age=86400` (24h) |
| **Frame number** | Zero-based (`frame=0` = first frame of video) |
---
*Updated: 2026-05-19 12:49:24*

View File

@@ -85,6 +85,28 @@ curl -s "$API/api/v1/file/$FILE_UUID/trace/1939/representative-face" \
#### Error Responses
---
### `GET /api/v1/file/:file_uuid/trace/:trace_id/thumbnail`
Extract the best face image for a trace as JPEG (320×320). Internally selects the face using the same two-stage algorithm as `representative-face`, then crops via FFmpeg. The result is cacheable for 24 hours.
**Auth**: Required
**Scope**: file-level
#### Example
```bash
curl -s "$API/api/v1/file/$FILE_UUID/trace/1939/thumbnail" \
-H "X-API-Key: $KEY" -o trace_1939_face.jpg
```
#### Response
- **200**: `image/jpeg` binary data (320×320 cropped face)
- **404**: File, trace not found, or no suitable face
- **500**: FFmpeg or database error
| HTTP | When |
|------|------|
| `404` | File, trace not found, or no suitable face |

View File

@@ -92,6 +92,28 @@ curl -s "$API/api/v1/file/$FILE_UUID/trace/1939/representative-face" \
---
### `GET /api/v1/file/:file_uuid/trace/:trace_id/thumbnail`
Extract the best face image for a trace as JPEG (320×320). Internally selects the face using the same two-stage algorithm as `representative-face`, then crops via FFmpeg. The result is cacheable for 24 hours.
**Auth**: Required
**Scope**: file-level
#### Example
```bash
curl -s "$API/api/v1/file/$FILE_UUID/trace/1939/thumbnail" \
-H "X-API-Key: $KEY" -o trace_1939_face.jpg
```
#### Response
- **200**: `image/jpeg` binary data (320×320 cropped face)
- **404**: File, trace not found, or no suitable face
- **500**: FFmpeg or database error
---
### `GET /api/v1/file/:file_uuid/video/bbox`
Stream video with bounding box overlay for all detected objects/faces.