momentry_core/deliverable_v1.1.0/modules/08_media.md
2026-05-22 04:58:43 +08:00

Video Streaming & Frame Extraction

All video streaming endpoints support the following common query parameters:

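| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `mode` | string | No | `normal` | `normal` or `debug` (draws detection overlays) |
| `audio` | string | No | `on` | `on` or `off` |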

GET /api/v1/file/:file_uuid/video

Stream the full video file with range support for seeking.

Auth: Required Scope: file-level

Response

  • 200: Video stream (Content-Type based on file extension)
  • 206: Partial content (range request)
  • Supports Range header for seeking
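
A client that seeks within the stream has to build a `Range` request header and read the `Content-Range` header of the 206 response. The sketch below assumes the endpoint follows standard HTTP byte-range semantics; this document only states that `Range` is supported. The helper names `range_header` and `parse_content_range` are illustrative and not part of the API.

```python
import re
from typing import Dict, Optional, Tuple


def range_header(start: int, end: Optional[int] = None) -> Dict[str, str]:
    """Build a Range header for bytes start..end (inclusive).

    When end is None the range is open-ended ("from start to end of file").
    """
    if start < 0 or (end is not None and end < start):
        raise ValueError("invalid byte range")
    value = f"bytes={start}-" if end is None else f"bytes={start}-{end}"
    return {"Range": value}


def parse_content_range(value: str) -> Tuple[int, int, Optional[int]]:
    """Parse a Content-Range value such as 'bytes 0-1023/4096'.

    Returns (first_byte, last_byte, total_size); total_size is None when
    the server reports an unknown length as '*'.
    """
    match = re.fullmatch(r"bytes (\d+)-(\d+)/(\d+|\*)", value.strip())
    if match is None:
        raise ValueError(f"unrecognised Content-Range: {value!r}")
    total = None if match.group(3) == "*" else int(match.group(3))
    return int(match.group(1)), int(match.group(2)), total
```

The dictionary returned by `range_header` can be merged into the request headers alongside the authentication header.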

GET /api/v1/file/:file_uuid/trace/:trace_id/video

Stream video with highlights for a specific face trace (follows a single person across frames with bounding box overlay).

Auth: Required Scope: file-level


GET /api/v1/file/:file_uuid/trace/:trace_id/representative-face

Find the best single face to represent this trace. Selection runs in two stages: an SQL query ranks the trace's detections by area × confidence and keeps the top 10, then FFmpeg blurdetect scores those candidates for sharpness and the least blurry one is picked.

Auth: Required Scope: file-level

Example

curl -s "$API/api/v1/file/$FILE_UUID/trace/1939/representative-face" \
  -H "X-API-Key: $KEY"

Response (200)

{
  "success": true,
  "file_uuid": "aeed71342a899fe4b4c57b7d41bcb692",
  "trace_id": 1939,
  "face_count": 538,
  "representative": {
    "frame_number": 68193,
    "timestamp_secs": 2727.72,
    "bbox": { "x": 347, "y": 378, "width": 427, "height": 427 },
    "confidence": 0.760,
    "quality_score": 138516,
    "blur_score": 9.46
  }
}

Response Fields

| Field | Type | Description |
|-------|------|-------------|
| `trace_id` | integer | Face trace ID |
| `face_count` | integer | Total face detections in this trace |
| `representative.frame_number` | integer | Frame number of the selected face (primary coordinate) |
| `representative.timestamp_secs` | float | Time in seconds (derived from frame_number / fps) |
| `representative.bbox` | object | Bounding box `{x, y, width, height}` |
| `representative.confidence` | float | Detection confidence (0.0 to 1.0) |
| `representative.quality_score` | float | Pre-selection score (area × confidence) |
| `representative.blur_score` | float | FFmpeg blurdetect result (lower = sharper) |
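
The two-stage selection can be sketched in a few lines. This is an illustration of the idea only, not the server's implementation: the `Detection` class, the `pick_representative` function, and the `blur_of` callback (which stands in for an FFmpeg blurdetect measurement) are all hypothetical names, and the real first stage runs in SQL rather than in Python.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Detection:
    frame_number: int
    width: int
    height: int
    confidence: float


def quality_score(d: Detection) -> float:
    """Stage-1 score: bounding-box area multiplied by detection confidence."""
    return d.width * d.height * d.confidence


def pick_representative(
    detections: Iterable[Detection],
    blur_of: Callable[[Detection], float],
    shortlist: int = 10,
) -> Detection:
    """Keep the `shortlist` highest-scoring detections, then return the
    one with the lowest blur value (lower = sharper)."""
    candidates = sorted(detections, key=quality_score, reverse=True)[:shortlist]
    if not candidates:
        raise LookupError("no suitable face")
    return min(candidates, key=blur_of)
```

Because the blur measurement only runs on the shortlist, a small but very sharp face that misses the top 10 by area × confidence is never considered. That trade-off keeps the expensive stage bounded regardless of how many detections a trace has.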

Error Responses


GET /api/v1/file/:file_uuid/trace/:trace_id/thumbnail

Extract the best face image for a trace as JPEG (320×320). Internally selects the face using the same two-stage algorithm as representative-face, then crops via FFmpeg. The result is cacheable for 24 hours.

Auth: Required Scope: file-level

Example

curl -s "$API/api/v1/file/$FILE_UUID/trace/1939/thumbnail" \
  -H "X-API-Key: $KEY" -o trace_1939_face.jpg

Response

  • 200: image/jpeg binary data (320×320 cropped face)
  • 404: File or trace not found, or no suitable face
  • 500: FFmpeg or database error

GET /api/v1/file/:file_uuid/video/bbox

Stream video with bounding box overlay for all detected objects/faces.

Auth: Required Scope: file-level

Uses a built-in 5×7 bitmap font renderer to draw labels directly on video frames via FFmpeg drawtext filter.


GET /api/v1/file/:file_uuid/thumbnail

Extract a single frame from a video as a JPEG image. Uses the FFmpeg select filter.

Auth: Required Scope: file-level

Query Parameters

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `frame` | integer | Yes | | Zero-based frame number to extract |
| `x` | integer | No | | Crop start X (left edge). Requires y, w, h. |
| `y` | integer | No | | Crop start Y (top edge). Requires x, w, h. |
| `w` | integer | No | | Crop width in pixels. Requires x, y, h. |
| `h` | integer | No | | Crop height in pixels. Requires x, y, w. |

All four crop params (x, y, w, h) must be provided together or omitted.
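
The all-or-nothing rule for the crop parameters is easy to check on the client before sending a request. The following is a minimal sketch; `thumbnail_query` is a hypothetical helper name, and the error messages are illustrative rather than what the server returns.

```python
from typing import Optional
from urllib.parse import urlencode


def thumbnail_query(
    frame: int,
    x: Optional[int] = None,
    y: Optional[int] = None,
    w: Optional[int] = None,
    h: Optional[int] = None,
) -> str:
    """Build the query string for the thumbnail endpoint.

    The crop parameters must be given together or not at all.
    """
    if frame < 0:
        raise ValueError("frame is zero-based and must be >= 0")
    crop = {"x": x, "y": y, "w": w, "h": h}
    given = [name for name, value in crop.items() if value is not None]
    if given and len(given) != len(crop):
        missing = [name for name, value in crop.items() if value is None]
        raise ValueError(f"crop needs x, y, w, h together; missing {missing}")
    params = {"frame": frame}
    if given:
        params.update(crop)
    return urlencode(params)
```

For example, `thumbnail_query(1000, 320, 240, 160, 160)` yields the same query string as the second curl command in the example for this endpoint.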

Example

# Extract frame 1000 (full frame)
curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/thumbnail?frame=1000" \
  -H "Authorization: Bearer $JWT" -o frame_1000.jpg

# Extract and crop face region (x=320, y=240, w=160, h=160)
curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/thumbnail?frame=1000&x=320&y=240&w=160&h=160" \
  -H "Authorization: Bearer $JWT" -o face_crop.jpg

Response

  • 200: image/jpeg binary data
  • 404: File not found
  • 500: FFmpeg error (e.g., frame number exceeds video duration)

GET /api/v1/file/:file_uuid/clip

Extract a video clip (frame or time range) as an MPEG-TS stream. Uses FFmpeg -ss fast seek.

Auth: Required Scope: file-level

Query Parameters

| Field | Type | Required | Default | Description |
|-------|------|----------|---------|-------------|
| `start_frame` | integer | No* | | Start frame (zero-based). Frame-accurate; use this for precision. |
| `end_frame` | integer | No* | | End frame (zero-based, inclusive). Requires start_frame. |
| `start_time` | float | No* | | Start time in seconds. Approximate (FPS-dependent). Fallback if frames not given. |
| `end_time` | float | No* | | End time in seconds. Approximate (FPS-dependent). Fallback if frames not given. |
| `fps` | float | No | video FPS | Override frames-per-second for frame↔time calculation. Defaults to video's detected FPS. |
| `mode` | string | No | `normal` | `normal` or `debug` (draws "CLIP" overlay) |
| `audio` | string | No | `on` | `on` or `off` |

Either (start_frame+end_frame) OR (start_time+end_time) must be provided.
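
A client-side check of this rule, together with the frame-to-time conversion the `fps` parameter refers to, might look like the sketch below. The names `clip_params` and `frame_to_seconds` are hypothetical. Preferring the frame pair when both pairs are supplied is an assumption based on the time parameters being described as a fallback, and the ordering checks are assumptions too, since the exact server-side validation is not documented here.

```python
from typing import Dict, Optional, Union

Number = Union[int, float]


def clip_params(
    start_frame: Optional[int] = None,
    end_frame: Optional[int] = None,
    start_time: Optional[float] = None,
    end_time: Optional[float] = None,
    fps: Optional[float] = None,
) -> Dict[str, Number]:
    """Return query parameters for /clip, preferring the frame pair."""
    if start_frame is not None and end_frame is not None:
        if not 0 <= start_frame <= end_frame:  # end_frame is inclusive
            raise ValueError("need 0 <= start_frame <= end_frame")
        params: Dict[str, Number] = {
            "start_frame": start_frame,
            "end_frame": end_frame,
        }
    elif start_time is not None and end_time is not None:
        if not 0 <= start_time < end_time:
            raise ValueError("need 0 <= start_time < end_time")
        params = {"start_time": start_time, "end_time": end_time}
    else:
        raise ValueError("provide start_frame+end_frame or start_time+end_time")
    if fps is not None:
        params["fps"] = fps
    return params


def frame_to_seconds(frame: int, fps: float) -> float:
    """Convert a zero-based frame number to a timestamp in seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frame / fps
```

Rejecting an incomplete pair locally avoids a round trip that would end in a 400 response.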

Example

# Clip by frame range (primary)
curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/clip?start_frame=0&end_frame=47" \
  -H "Authorization: Bearer $JWT" -o clip.ts

# Clip by time range (fallback)
curl -s "$API/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/clip?start_time=30&end_time=45" \
  -H "Authorization: Bearer $JWT" -o clip.ts

Response

  • 200: video/mp2t MPEG-TS stream
  • 400: Missing/invalid range parameters
  • 404: File not found
  • 500: FFmpeg error

Technical Notes

| Detail | Value |
|--------|-------|
| Backend | FFmpeg (ffmpeg-full) |
| Seek | `-ss` before `-i` (fast keyframe seek) |
| Format | MPEG-TS (mpegts muxer, pipe-safe) |
| Codec | H.264 + AAC |
| Cache | `Cache-Control: public, max-age=86400` (24h) |
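
To make the notes above concrete, here is one way an FFmpeg argument list matching them could be assembled. It is a sketch under stated assumptions, not the command the server actually runs: the function name `clip_ffmpeg_args` is hypothetical, the frame count assumes `end_frame` is inclusive, and `-frames:v` is used as the current spelling of `-vframes`.

```python
from typing import List


def clip_ffmpeg_args(
    src: str,
    start_frame: int,
    end_frame: int,
    fps: float,
    audio: bool = True,
) -> List[str]:
    """Assemble an ffmpeg argv that seeks, re-encodes, and writes MPEG-TS to stdout."""
    if fps <= 0 or not 0 <= start_frame <= end_frame:
        raise ValueError("invalid frame range or fps")
    start_secs = start_frame / fps
    frame_count = end_frame - start_frame + 1  # assumes end_frame is inclusive
    args = [
        "ffmpeg",
        "-ss", f"{start_secs:.3f}",  # placed before -i: fast input seek
        "-i", src,
        "-frames:v", str(frame_count),
        "-c:v", "libx264",
    ]
    # Either encode audio as AAC or drop the audio stream entirely.
    args += ["-c:a", "aac"] if audio else ["-an"]
    # MPEG-TS has no trailing index, so it can be written to a pipe.
    args += ["-f", "mpegts", "pipe:1"]
    return args
```

Passing the result to `subprocess.run` as a list avoids shell quoting issues. Putting `-ss` after `-i` instead would decode from the start of the file, which is slower.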

Video vs Clip: Quality & Format Comparison

Both endpoints support time range extraction, but serve different use cases:

| Feature | /video | /clip |
|---------|--------|-------|
| No params | Streams full file (Range seek) | Returns 400 (params required) |
| HTTP Range | Supported | Not supported |
| Encoding | `-c copy` (zero encoding) | `-c:v libx264 -c:a aac` (re-encode) |
| Quality | Original (bit-exact, zero loss) | Compressed (default CRF ≈ 23) |
| Format | video/mp4 | video/mp2t (MPEG-TS) |
| Speed | Fast (no computation) | Slower (encoding required) |
| Frame control | Time-based (dur = (ef-sf)/fps) | Precise (`-vframes`) |
| Debug mode | | `mode=debug` overlay |
| Cache | | `max-age=86400` |

Usage Recommendation

| Scenario | Use |
|----------|-----|
| Full video streaming / player seek | `/video` |
| Quick preview clip (zero quality loss) | `/video?start_frame=...&end_frame=...` |
| Debug frame verification / text overlay | `/clip?mode=debug` |
| Precise frame count control | `/clip` |
| CDN cacheable clip | `/clip` |

Thumbnail Technical Notes

| Detail | Value |
|--------|-------|
| Backend | FFmpeg (ffmpeg-full) |
| Filter | `select=eq(n\,FRAME)` to select frame, optional `crop=W:H:X:Y` |
| Output | Single JPEG via pipe (image2pipe, mjpeg codec) |
| Cache | `Cache-Control: public, max-age=86400` (24h) |
| Frame number | Zero-based (`frame=0` = first frame of video) |
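
The filter string in the table above can be assembled as shown below. This is a hedged sketch, not the server's code: `thumbnail_filter` and `thumbnail_ffmpeg_args` are hypothetical names, and the surrounding flags are an assumption about how a single JPEG would typically be piped out. Note that the FFmpeg crop filter takes its values in `W:H:X:Y` order, while the API parameters are listed as x, y, w, h.

```python
from typing import List, Optional, Tuple


def thumbnail_filter(
    frame: int,
    crop: Optional[Tuple[int, int, int, int]] = None,
) -> str:
    """Build the -vf value. `crop` is (x, y, w, h) in API parameter order."""
    if frame < 0:
        raise ValueError("frame is zero-based and must be >= 0")
    # The comma inside eq() is escaped so the filtergraph parser does not
    # treat it as a separator between filters.
    vf = f"select=eq(n\\,{frame})"
    if crop is not None:
        x, y, w, h = crop
        vf += f",crop={w}:{h}:{x}:{y}"  # crop expects W:H:X:Y
    return vf


def thumbnail_ffmpeg_args(
    src: str,
    frame: int,
    crop: Optional[Tuple[int, int, int, int]] = None,
) -> List[str]:
    """Assemble an ffmpeg argv that writes one JPEG frame to stdout."""
    return [
        "ffmpeg",
        "-i", src,
        "-vf", thumbnail_filter(frame, crop),
        "-frames:v", "1",      # stop after the selected frame
        "-f", "image2pipe",
        "-c:v", "mjpeg",
        "pipe:1",
    ]
```

Because the select filter has to decode every frame up to the requested one, extraction time grows with the frame number, which is one reason a 24-hour cache on the result is useful.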

Updated: 2026-05-19 12:49:24