## Video Streaming & Frame Extraction All video streaming endpoints support the following common query parameters: | Field | Type | Required | Default | Description | |-------|------|----------|---------|-------------| | `mode` | string | No | `normal` | `normal` or `debug` (draws detection overlays) | | `audio` | string | No | `on` | `on` or `off` | --- ### `GET /api/v1/file/:file_uuid/video` Stream the full video file with range support for seeking. **Auth**: Required **Scope**: file-level #### Response - **200**: Video stream (`Content-Type` based on file extension) - **206**: Partial content (range request) - Supports `Range` header for seeking --- ### `GET /api/v1/file/:file_uuid/trace/:trace_id/video` Stream video with highlights for a specific face trace (follows a single person across frames with bounding box overlay). **Auth**: Required **Scope**: file-level --- ### `GET /api/v1/file/:file_uuid/trace/:trace_id/representative-face` Find the best single face to represent this trace. Uses a two-stage selection: SQL (area × confidence → top 10) then FFmpeg `blurdetect` (sharpness → pick the least blurry). **Auth**: Required **Scope**: file-level #### Example ```bash curl -s "$API/api/v1/file/$FILE_UUID/trace/1939/representative-face" \ -H "X-API-Key: $KEY" ``` #### Response (200) ```json { "success": true, "file_uuid": "aeed71342a899fe4b4c57b7d41bcb692", "trace_id": 1939, "face_count": 538, "representative": { "frame_number": 68193, "timestamp_secs": 2727.72, "bbox": { "x": 347, "y": 378, "width": 427, "height": 427 }, "confidence": 0.760, "quality_score": 138516, "blur_score": 9.46 } } ``` #### Response Fields | Field | Type | Description | |-------|------|-------------| | `trace_id` | integer | Face trace ID | | `face_count` | integer | Total face detections in this trace | | `representative.frame_number` | integer | Frame number of the selected face (primary coordinate) | | `representative.timestamp_secs` | float | Time in seconds (derived from `frame_number / fps`) | | `representative.bbox` | object | Bounding box `{x, y, width, height}` | | `representative.confidence` | float | Detection confidence (0.0–1.0) | | `representative.quality_score` | float | Pre-selection score (`area × confidence`) | | `representative.blur_score` | float | FFmpeg blurdetect result (lower = sharper) | #### Error Responses --- ### `GET /api/v1/file/:file_uuid/trace/:trace_id/thumbnail` Extract the best face image for a trace as JPEG (320×320). Internally selects the face using the same two-stage algorithm as `representative-face`, then crops via FFmpeg. The result is cacheable for 24 hours. **Auth**: Required **Scope**: file-level #### Example ```bash curl -s "$API/api/v1/file/$FILE_UUID/trace/1939/thumbnail" \ -H "X-API-Key: $KEY" -o trace_1939_face.jpg ``` #### Response - **200**: `image/jpeg` binary data (320×320 cropped face) - **404**: File, trace not found, or no suitable face - **500**: FFmpeg or database error --- ### `GET /api/v1/file/:file_uuid/identities/:identity_uuid_a/co-occur-with/:identity_uuid_b` Find the first frame where two identities appear together, with representative face thumbnails for both. **Auth**: Required **Scope**: file-level #### Example ```bash # Audrey Hepburn & Cary Grant 第一次同框 curl -s "$API/api/v1/file/$FILE_UUID/identities/$AUDREY_UUID/co-occur-with/$CARY_UUID" \ -H "X-API-Key: $KEY" | jq '{identity_a: .identity_a.name, identity_b: .identity_b.name, first_frame: .first_cooccurrence.frame_number}' ``` #### Response (200) ```json { "success": true, "file_uuid": "aeed71342a899fe4b4c57b7d41bcb692", "identity_a": { "identity_uuid": "c3545906-c82d-4b66-aa1d-150bc02decce", "name": "Audrey Hepburn", "trace_id": 920 }, "identity_b": { "identity_uuid": "2b0ddefe-e2a9-4533-9308-b375594604d5", "name": "Cary Grant", "trace_id": 919 }, "first_cooccurrence": { "frame_number": 38165, "timestamp_secs": 1526.60, "total_cooccurrence_frames": 3136, "representative_face_a": { "frame_number": 38199, "bbox": { "x": 122, "y": 339, "width": 176, "height": 176 }, "confidence": 0.832, "thumbnail_url": "/api/v1/file/aeed71342.../trace/920/thumbnail" }, "representative_face_b": { "frame_number": 38291, "bbox": { "x": 511, "y": 315, "width": 192, "height": 192 }, "confidence": 0.791, "thumbnail_url": "/api/v1/file/aeed71342.../trace/919/thumbnail" } } } ``` #### Response Fields | Field | Type | Description | |-------|------|-------------| | `identity_a.name` | string | First identity name | | `identity_b.name` | string | Second identity name | | `first_cooccurrence.frame_number` | int | First frame where both appear | | `first_cooccurrence.timestamp_secs` | float | Time in seconds | | `first_cooccurrence.total_cooccurrence_frames` | int | Total frames with both present | | `first_cooccurrence.representative_face_a/b` | object | Best face thumbnail data for each identity | #### Error Responses | HTTP | When | |------|------| | `404` | File or identity not found | | `404` | The two identities never co-occur in this file | | `500` | Database or FFmpeg error | ### `GET /api/v1/file/:file_uuid/video/bbox` Stream video with bounding box overlay for all detected objects/faces. **Auth**: Required **Scope**: file-level Uses a built-in 5×7 bitmap font renderer to draw labels directly on video frames via FFmpeg `drawtext` filter. --- ### `GET /api/v1/file/:file_uuid/thumbnail` Extract a single frame from a video as JPEG image. Uses FFmpeg `select` filter. **Auth**: Required **Scope**: file-level #### Query Parameters | Field | Type | Required | Default | Description | |-------|------|----------|---------|-------------| | `frame` | integer | Yes | — | Zero-based frame number to extract | | `x` | integer | No | — | Crop start X (left edge). Requires `y`, `w`, `h`. | | `y` | integer | No | — | Crop start Y (top edge). Requires `x`, `w`, `h`. | | `w` | integer | No | — | Crop width in pixels. Requires `x`, `y`, `h`. | | `h` | integer | No | — | Crop height in pixels. Requires `x`, `y`, `w`. | All four crop params (`x`, `y`, `w`, `h`) must be provided together or omitted. #### Example ```bash # Extract frame 1000 (full frame) curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/thumbnail?frame=1000" \ -H "Authorization: Bearer $JWT" -o frame_1000.jpg # Extract and crop face region (x=320, y=240, w=160, h=160) curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/thumbnail?frame=1000&x=320&y=240&w=160&h=160" \ -H "Authorization: Bearer $JWT" -o face_crop.jpg ``` #### Response - **200**: `image/jpeg` binary data - **404**: File not found - **500**: FFmpeg error (e.g., frame number exceeds video duration) ### `GET /api/v1/file/:file_uuid/clip` Extract a video clip (time range) as MPEG-TS stream. Uses FFmpeg `-ss` fast seek. **Auth**: Required **Scope**: file-level #### Query Parameters | Field | Type | Required | Default | Description | |-------|------|----------|---------|-------------| | `start_frame` | integer | No* | — | Start frame (zero-based). **Frame-accurate** — use this for precision. | | `end_frame` | integer | No* | — | End frame (zero-based, inclusive). Requires `start_frame`. | | `start_time` | float | No* | — | Start time in seconds. Approximate (FPS-dependent). Fallback if frames not given. | | `end_time` | float | No* | — | End time in seconds. Approximate (FPS-dependent). Fallback if frames not given. | | `fps` | float | No | video FPS | Override frames-per-second for frame↔time calculation. Defaults to video's detected FPS. | | `mode` | string | No | `normal` | `normal` or `debug` (draws "CLIP" overlay) | | `audio` | string | No | `on` | `on` or `off` | Either (`start_frame`+`end_frame`) OR (`start_time`+`end_time`) must be provided. #### Example ```bash # Clip by frame range (primary) curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/clip?start_frame=0&end_frame=47" \ -H "Authorization: Bearer $JWT" -o clip.ts # Clip by time range (fallback) curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/clip?start_time=30&end_time=45" \ -H "Authorization: Bearer $JWT" -o clip.ts ``` #### Response - **200**: `video/mp2t` MPEG-TS stream - **400**: Missing/invalid range parameters - **404**: File not found - **500**: FFmpeg error #### Technical Notes | Detail | Value | |--------|-------| | **Backend** | FFmpeg (`ffmpeg-full`) | | **Seek** | `-ss` before `-i` (fast keyframe seek) | | **Format** | MPEG-TS (`mpegts` muxer, pipe-safe) | | **Codec** | H.264 + AAC | | **Cache** | `Cache-Control: public, max-age=86400` (24h) | ### Video vs Clip: Quality & Format Comparison Both endpoints support time range extraction, but serve different use cases: | Feature | `/video` | `/clip` | |---------|----------|---------| | **No params** | Streams full file (Range seek) | Returns 400 (params required) | | **HTTP Range** | ✅ Supported | ❌ Not supported | | **Encoding** | `-c copy` (zero encoding) | `-c:v libx264 -c:a aac` (re-encode) | | **Quality** | Original (bit-exact, zero loss) | Compressed (default CRF ≈ 23) | | **Format** | `video/mp4` | `video/mp2t` (MPEG-TS) | | **Speed** | Fast (no computation) | Slower (encoding required) | | **Frame control** | Time-based (`dur = (ef-sf)/fps`) | Precise (`-vframes`) | | **Debug mode** | ❌ | ✅ `mode=debug` overlay | | **Cache** | ❌ | ✅ `max-age=86400` | #### Usage Recommendation | Scenario | Use | |----------|-----| | Full video streaming / player seek | `/video` | | Quick preview clip (zero quality loss) | `/video?start_frame=...&end_frame=...` | | Debug frame verification / text overlay | `/clip?mode=debug` | | Precise frame count control | `/clip` | | CDN cacheable clip | `/clip` | --- | Detail | Value | |--------|-------| | **Backend** | FFmpeg (`ffmpeg-full`) | | **Filter** | `select=eq(n\,FRAME)` to select frame, optional `crop=W:H:X:Y` | | **Output** | Single JPEG via pipe (`image2pipe`, `mjpeg` codec) | | **Cache** | `Cache-Control: public, max-age=86400` (24h) | | **Frame number** | Zero-based (`frame=0` = first frame of video) | --- *Updated: 2026-05-19 12:49:24*