Video Streaming & Frame Extraction
All video streaming endpoints support the following common query parameters:
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `mode` | string | No | `normal` | `normal` or `debug` (draws detection overlays) |
| `audio` | string | No | `on` | `on` or `off` |
GET /api/v1/file/:file_uuid/video
Stream the full video file with range support for seeking.
Auth: Required Scope: file-level
Response
- 200: Video stream (`Content-Type` based on file extension)
- 206: Partial content (range request)
- Supports `Range` header for seeking
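A minimal sketch of the client side of a range request, assuming the endpoint follows standard HTTP byte-range semantics (the source only states that `Range` is supported). The helper names `range_header` and `parse_content_range` are hypothetical and not part of the API.

```python
import re
from typing import Dict, Optional, Tuple

# Matches a standard Content-Range value such as "bytes 0-1023/2048".
_CONTENT_RANGE = re.compile(r"bytes (\d+)-(\d+)/(\d+|\*)")


def range_header(start: int, end: Optional[int] = None) -> Dict[str, str]:
    """Build a Range header for bytes start..end (inclusive, open-ended if end is None)."""
    if start < 0:
        raise ValueError("start must be non-negative")
    if end is not None and end < start:
        raise ValueError("end must be >= start")
    last = "" if end is None else str(end)
    return {"Range": f"bytes={start}-{last}"}


def parse_content_range(value: str) -> Tuple[int, int, Optional[int]]:
    """Parse a Content-Range header into (first_byte, last_byte, total or None)."""
    match = _CONTENT_RANGE.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"unrecognized Content-Range: {value!r}")
    total = None if match.group(3) == "*" else int(match.group(3))
    return int(match.group(1)), int(match.group(2)), total
```

A player that seeks would send `range_header(offset)` and expect a 206 response whose `Content-Range` can be read with `parse_content_range`.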
GET /api/v1/file/:file_uuid/trace/:trace_id/video
Stream video with highlights for a specific face trace (follows a single person across frames with bounding box overlay).
Auth: Required Scope: file-level
GET /api/v1/file/:file_uuid/trace/:trace_id/representative-face
Find the best single face to represent this trace. Selection runs in two stages: an SQL query ranks detections by area × confidence and keeps the top 10, then FFmpeg blurdetect measures sharpness and the least blurry candidate is returned.
Auth: Required Scope: file-level
Example
curl -s "$API/api/v1/file/$FILE_UUID/trace/1939/representative-face" \
-H "X-API-Key: $KEY"
Response (200)
{
"success": true,
"file_uuid": "aeed71342a899fe4b4c57b7d41bcb692",
"trace_id": 1939,
"face_count": 538,
"representative": {
"frame_number": 68193,
"timestamp_secs": 2727.72,
"bbox": { "x": 347, "y": 378, "width": 427, "height": 427 },
"confidence": 0.760,
"quality_score": 138516,
"blur_score": 9.46
}
}
Response Fields
| Field | Type | Description |
|---|---|---|
| `trace_id` | integer | Face trace ID |
| `face_count` | integer | Total face detections in this trace |
| `representative.frame_number` | integer | Frame number of the selected face (primary coordinate) |
| `representative.timestamp_secs` | float | Time in seconds (derived from frame_number / fps) |
| `representative.bbox` | object | Bounding box {x, y, width, height} |
| `representative.confidence` | float | Detection confidence (0.0–1.0) |
| `representative.quality_score` | float | Pre-selection score (area × confidence) |
| `representative.blur_score` | float | FFmpeg blurdetect result (lower = sharper) |
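The two-stage selection described for this endpoint can be sketched as follows. This is an illustration under stated assumptions, not the server's implementation: the `Detection` class and function names are hypothetical, the first stage is done in memory rather than in SQL, and `blur_of` is a caller-supplied stand-in for running FFmpeg blurdetect on a candidate.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Detection:
    """One face detection in a trace (hypothetical shape for illustration)."""
    frame_number: int
    width: int
    height: int
    confidence: float


def quality_score(d: Detection) -> float:
    """Pre-selection score: bounding-box area multiplied by confidence."""
    return d.width * d.height * d.confidence


def pick_representative(
    detections: Sequence[Detection],
    blur_of: Callable[[Detection], float],
    shortlist_size: int = 10,
) -> Detection:
    """Stage 1: keep the top candidates by quality. Stage 2: take the sharpest."""
    if not detections:
        raise ValueError("trace has no detections")
    shortlist = sorted(detections, key=quality_score, reverse=True)[:shortlist_size]
    # Lower blur score means sharper, so the minimum wins.
    return min(shortlist, key=blur_of)
```

Limiting the expensive blur measurement to a short list keeps the second stage cheap even for traces with hundreds of detections.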
Error Responses
GET /api/v1/file/:file_uuid/trace/:trace_id/thumbnail
Extract the best face image for a trace as JPEG (320×320). Internally selects the face using the same two-stage algorithm as representative-face, then crops via FFmpeg. The result is cacheable for 24 hours.
Auth: Required Scope: file-level
Example
curl -s "$API/api/v1/file/$FILE_UUID/trace/1939/thumbnail" \
-H "X-API-Key: $KEY" -o trace_1939_face.jpg
Response
- 200: `image/jpeg` binary data (320×320 cropped face)
- 404: File or trace not found, or no suitable face
- 500: FFmpeg or database error
GET /api/v1/file/:file_uuid/identities/:identity_uuid_a/co-occur-with/:identity_uuid_b
Find the first frame where two identities appear together, with representative face thumbnails for both.
Auth: Required Scope: file-level
Example
# First frame in which Audrey Hepburn and Cary Grant appear together
curl -s "$API/api/v1/file/$FILE_UUID/identities/$AUDREY_UUID/co-occur-with/$CARY_UUID" \
-H "X-API-Key: $KEY" | jq '{identity_a: .identity_a.name, identity_b: .identity_b.name, first_frame: .first_cooccurrence.frame_number}'
Response (200)
{
"success": true,
"file_uuid": "aeed71342a899fe4b4c57b7d41bcb692",
"identity_a": {
"identity_uuid": "c3545906-c82d-4b66-aa1d-150bc02decce",
"name": "Audrey Hepburn",
"trace_id": 920
},
"identity_b": {
"identity_uuid": "2b0ddefe-e2a9-4533-9308-b375594604d5",
"name": "Cary Grant",
"trace_id": 919
},
"first_cooccurrence": {
"frame_number": 38165,
"timestamp_secs": 1526.60,
"total_cooccurrence_frames": 3136,
"representative_face_a": {
"frame_number": 38199,
"bbox": { "x": 122, "y": 339, "width": 176, "height": 176 },
"confidence": 0.832,
"thumbnail_url": "/api/v1/file/aeed71342.../trace/920/thumbnail"
},
"representative_face_b": {
"frame_number": 38291,
"bbox": { "x": 511, "y": 315, "width": 192, "height": 192 },
"confidence": 0.791,
"thumbnail_url": "/api/v1/file/aeed71342.../trace/919/thumbnail"
}
}
}
Response Fields
| Field | Type | Description |
|---|---|---|
| `identity_a.name` | string | First identity name |
| `identity_b.name` | string | Second identity name |
| `first_cooccurrence.frame_number` | int | First frame where both appear |
| `first_cooccurrence.timestamp_secs` | float | Time in seconds |
| `first_cooccurrence.total_cooccurrence_frames` | int | Total frames with both present |
| `first_cooccurrence.representative_face_a/b` | object | Best face thumbnail data for each identity |
Error Responses
| HTTP | When |
|---|---|
| 404 | File or identity not found |
| 404 | The two identities never co-occur in this file |
| 500 | Database or FFmpeg error |
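To make the co-occurrence fields concrete, here is a small sketch of how `frame_number` and `total_cooccurrence_frames` relate to the frames in which each identity is detected. It assumes each identity is reduced to a collection of frame numbers; the function name is hypothetical and the server presumably does this in the database.

```python
from typing import Iterable, Optional, Tuple


def first_cooccurrence(
    frames_a: Iterable[int], frames_b: Iterable[int]
) -> Optional[Tuple[int, int]]:
    """Return (first shared frame, number of shared frames), or None if none are shared."""
    shared = set(frames_a) & set(frames_b)
    if not shared:
        # Corresponds to the 404 case: the identities never co-occur.
        return None
    return min(shared), len(shared)
```

For example, `first_cooccurrence([1, 2, 3, 7], [3, 7, 9])` yields frame 3 as the first shared frame with 2 shared frames in total.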
GET /api/v1/file/:file_uuid/video/bbox
Stream video with bounding box overlay for all detected objects/faces.
Auth: Required Scope: file-level
Uses a built-in 5×7 bitmap font renderer to draw labels directly on video frames via FFmpeg drawtext filter.
GET /api/v1/file/:file_uuid/thumbnail
Extract a single frame from a video as a JPEG image, using the FFmpeg select filter.
Auth: Required Scope: file-level
Query Parameters
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `frame` | integer | Yes | — | Zero-based frame number to extract |
| `x` | integer | No | — | Crop start X (left edge). Requires y, w, h. |
| `y` | integer | No | — | Crop start Y (top edge). Requires x, w, h. |
| `w` | integer | No | — | Crop width in pixels. Requires x, y, h. |
| `h` | integer | No | — | Crop height in pixels. Requires x, y, w. |
All four crop params (x, y, w, h) must be provided together or omitted.
Example
# Extract frame 1000 (full frame)
curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/thumbnail?frame=1000" \
-H "Authorization: Bearer $JWT" -o frame_1000.jpg
# Extract and crop face region (x=320, y=240, w=160, h=160)
curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/thumbnail?frame=1000&x=320&y=240&w=160&h=160" \
-H "Authorization: Bearer $JWT" -o face_crop.jpg
Response
- 200: `image/jpeg` binary data
- 404: File not found
- 500: FFmpeg error (e.g., frame number exceeds video duration)
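A client can enforce the all-or-nothing crop rule before sending the request. The sketch below builds the request path and query string; `thumbnail_path` is a hypothetical helper, and the extra checks (non-negative origin, positive size) are assumptions about what the server accepts.

```python
from typing import Optional, Tuple
from urllib.parse import urlencode


def thumbnail_path(
    file_uuid: str,
    frame: int,
    crop: Optional[Tuple[int, int, int, int]] = None,
) -> str:
    """Build the /thumbnail path; crop is (x, y, w, h), passed whole or not at all."""
    if frame < 0:
        raise ValueError("frame is zero-based and must be non-negative")
    params = {"frame": frame}
    if crop is not None:
        x, y, w, h = crop
        if x < 0 or y < 0 or w <= 0 or h <= 0:
            raise ValueError("crop needs a non-negative origin and a positive size")
        params.update({"x": x, "y": y, "w": w, "h": h})
    return f"/api/v1/file/{file_uuid}/thumbnail?{urlencode(params)}"
```

Taking the crop as a single tuple makes it impossible to pass only some of the four values.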
GET /api/v1/file/:file_uuid/clip
Extract a video clip (time range) as an MPEG-TS stream, using FFmpeg -ss fast seek.
Auth: Required Scope: file-level
Query Parameters
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| `start_frame` | integer | No* | — | Start frame (zero-based). Frame-accurate; use this for precision. |
| `end_frame` | integer | No* | — | End frame (zero-based, inclusive). Requires start_frame. |
| `start_time` | float | No* | — | Start time in seconds. Approximate (FPS-dependent). Fallback if frames not given. |
| `end_time` | float | No* | — | End time in seconds. Approximate (FPS-dependent). Fallback if frames not given. |
| `fps` | float | No | video FPS | Override frames-per-second for frame↔time calculation. Defaults to video's detected FPS. |
| `mode` | string | No | `normal` | `normal` or `debug` (draws "CLIP" overlay) |
| `audio` | string | No | `on` | `on` or `off` |
Either (start_frame+end_frame) OR (start_time+end_time) must be provided.
Example
# Clip by frame range (primary)
curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/clip?start_frame=0&end_frame=47" \
-H "Authorization: Bearer $JWT" -o clip.ts
# Clip by time range (fallback)
curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/clip?start_time=30&end_time=45" \
-H "Authorization: Bearer $JWT" -o clip.ts
Response
- 200: `video/mp2t` MPEG-TS stream
- 400: Missing/invalid range parameters
- 404: File not found
- 500: FFmpeg error
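The range rules for `/clip` can be checked client-side before a request is made. This is a sketch with hypothetical helper names; it assumes frame parameters take precedence when both pairs are supplied, which is how the "fallback" wording above reads.

```python
from typing import Dict, Optional, Union

Number = Union[int, float]


def clip_query(
    start_frame: Optional[int] = None,
    end_frame: Optional[int] = None,
    start_time: Optional[float] = None,
    end_time: Optional[float] = None,
) -> Dict[str, Number]:
    """Return the range parameters to send, preferring the frame pair."""
    if start_frame is not None and end_frame is not None:
        if start_frame < 0 or end_frame < start_frame:
            raise ValueError("need 0 <= start_frame <= end_frame")
        return {"start_frame": start_frame, "end_frame": end_frame}
    if start_time is not None and end_time is not None:
        if start_time < 0 or end_time <= start_time:
            raise ValueError("need 0 <= start_time < end_time")
        return {"start_time": start_time, "end_time": end_time}
    raise ValueError("provide start_frame+end_frame or start_time+end_time")


def clip_frame_count(start_frame: int, end_frame: int) -> int:
    """end_frame is inclusive, so the clip spans end - start + 1 frames."""
    return end_frame - start_frame + 1


def frame_to_seconds(frame: int, fps: float) -> float:
    """Convert a zero-based frame number to a timestamp in seconds."""
    return frame / fps
```

With the frame-range example above (`start_frame=0`, `end_frame=47`), `clip_frame_count` gives 48 frames, because the end frame is included.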
Technical Notes
| Detail | Value |
|---|---|
| Backend | FFmpeg (ffmpeg-full) |
| Seek | -ss before -i (fast keyframe seek) |
| Format | MPEG-TS (mpegts muxer, pipe-safe) |
| Codec | H.264 + AAC |
| Cache | Cache-Control: public, max-age=86400 (24h) |
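The notes above suggest an FFmpeg invocation roughly like the one assembled below. This is a sketch of one plausible command line, not the server's actual one: `build_clip_argv` is a hypothetical helper, the input path is a placeholder, and `-frames:v` is used here to cap the number of output video frames.

```python
from typing import List


def build_clip_argv(
    input_path: str,
    start_frame: int,
    end_frame: int,
    fps: float,
    audio: bool = True,
) -> List[str]:
    """Assemble an ffmpeg argv that seeks, re-encodes, and writes MPEG-TS to stdout."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    seek_seconds = start_frame / fps
    frame_count = end_frame - start_frame + 1  # end_frame is inclusive
    argv = [
        "ffmpeg",
        "-ss", f"{seek_seconds:.6f}",  # placed before -i for fast input seeking
        "-i", input_path,
        "-frames:v", str(frame_count),
        "-c:v", "libx264",
    ]
    # Either encode audio as AAC or drop the audio stream entirely.
    argv += ["-c:a", "aac"] if audio else ["-an"]
    argv += ["-f", "mpegts", "pipe:1"]
    return argv
```

Building the command as a list rather than a shell string avoids quoting problems if the result is passed to `subprocess.run`.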
Video vs Clip: Quality & Format Comparison
Both endpoints support time range extraction, but serve different use cases:
| Feature | /video | /clip |
|---|---|---|
| No params | Streams full file (Range seek) | Returns 400 (params required) |
| HTTP Range | ✅ Supported | ❌ Not supported |
| Encoding | -c copy (zero encoding) | -c:v libx264 -c:a aac (re-encode) |
| Quality | Original (bit-exact, zero loss) | Compressed (default CRF ≈ 23) |
| Format | video/mp4 | video/mp2t (MPEG-TS) |
| Speed | Fast (no computation) | Slower (encoding required) |
| Frame control | Time-based (dur = (ef-sf)/fps) | Precise (-vframes) |
| Debug mode | ❌ | ✅ mode=debug overlay |
| Cache | ❌ | ✅ max-age=86400 |
Usage Recommendation
| Scenario | Use |
|---|---|
| Full video streaming / player seek | /video |
| Quick preview clip (zero quality loss) | /video?start_frame=...&end_frame=... |
| Debug frame verification / text overlay | /clip?mode=debug |
| Precise frame count control | /clip |
| CDN cacheable clip | /clip |
Thumbnail Technical Notes
| Detail | Value |
|---|---|
| Backend | FFmpeg (ffmpeg-full) |
| Filter | select=eq(n\,FRAME) to select frame, optional crop=W:H:X:Y |
| Output | Single JPEG via pipe (image2pipe, mjpeg codec) |
| Cache | Cache-Control: public, max-age=86400 (24h) |
| Frame number | Zero-based (frame=0 = first frame of video) |
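The filter row above can be turned into a filtergraph string as sketched here. `thumbnail_filter` is a hypothetical helper; the crop tuple uses the endpoint's (x, y, w, h) order and is rearranged into FFmpeg's `crop=W:H:X:Y` order.

```python
from typing import Optional, Tuple


def thumbnail_filter(
    frame: int, crop: Optional[Tuple[int, int, int, int]] = None
) -> str:
    """Build a filtergraph that selects one zero-based frame and optionally crops it."""
    if frame < 0:
        raise ValueError("frame must be non-negative")
    # The comma inside eq() is escaped so FFmpeg does not read it as a filter separator.
    parts = [f"select=eq(n\\,{frame})"]
    if crop is not None:
        x, y, w, h = crop
        parts.append(f"crop={w}:{h}:{x}:{y}")
    return ",".join(parts)
```

The resulting string is what would be passed as the value of `-vf` when the command is run without a shell.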
Updated: 2026-05-19 12:49:24