feat: ASRX hybrid pipeline, identity history, worker fixes, checkpoint system
@@ -294,6 +294,7 @@ curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">"</s
|
||||
<hr />
|
||||
<h3><code>GET /api/v1/file/:file_uuid/thumbnail</code></h3>
|
||||
<p>Extract a single frame from a video as a JPEG image. Uses the FFmpeg <code>select</code> filter.</p>
|
||||
<p>When <code>frame</code> is omitted, the system automatically selects the best representative frame using the TKG bridge (see algorithm below).</p>
|
||||
<p><strong>Auth</strong>: Required
|
||||
<strong>Scope</strong>: file-level</p>
|
||||
<h4>Query Parameters</h4>
|
||||
@@ -311,9 +312,9 @@ curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">"</s
|
||||
<tr>
|
||||
<td><code>frame</code></td>
|
||||
<td>integer</td>
|
||||
<td>Yes</td>
|
||||
<td>—</td>
|
||||
<td>Zero-based frame number to extract</td>
|
||||
<td>No</td>
|
||||
<td>auto-detect</td>
|
||||
<td>Zero-based frame number to extract. Omit for auto-detect.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>x</code></td>
|
||||
@@ -346,8 +347,23 @@ curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">"</s
|
||||
</tbody>
|
||||
</table>
|
||||
<p>All four crop params (<code>x</code>, <code>y</code>, <code>w</code>, <code>h</code>) must be provided together or omitted.</p>
|
||||
<h4>Example</h4>
|
||||
<div class="codehilite"><pre><span></span><code><span class="c1"># Extract frame 1000 (full frame)</span>
|
||||
<h4>Auto-detect Algorithm</h4>
|
||||
<p>When <code>frame</code> is not provided, the endpoint finds the best frame using this fallback chain:</p>
|
||||
<ol>
<li><strong>Main characters</strong>: find the two identities with the most face detections (TMDb source)</li>
<li><strong>Mutual gaze</strong>: if their face traces have a TKG <code>CO_OCCURS_WITH</code> edge with <code>mutual_gaze=true</code>, take its <code>first_frame</code></li>
<li><strong>Co-occurrence</strong>: otherwise, fall back to the first frame where both identities appear together</li>
<li><strong>Single identity</strong>: if only one main identity exists, take its highest-quality face frame</li>
<li><strong>Any identity</strong>: otherwise, fall back to the best-quality face frame across all identities</li>
<li><strong>Error</strong>: if no face exists, return <code>404</code></li>
</ol>
|
||||
<p>The selected frame is constrained to the <strong>first half of the video</strong> (<code>total_frames / 2</code>).</p>
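<p>The fallback chain above can be sketched roughly as follows. This is only an illustration, not the service's code: the <code>Face</code> and <code>GazeEdge</code> records, the <code>pick_frame</code> function and the <code>NoFaceError</code> exception are hypothetical names, and the sketch assumes that the first-half constraint is applied by discarding candidate frames at or beyond <code>total_frames / 2</code>.</p>

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Face:
    frame: int
    identity: Optional[str]  # None when the face has no matched identity
    quality: float           # area x confidence


@dataclass(frozen=True)
class GazeEdge:
    pair: frozenset          # the two identities joined by CO_OCCURS_WITH
    first_frame: int
    mutual_gaze: bool


class NoFaceError(LookupError):
    """Hypothetical error that a handler would map to the documented 404."""


def pick_frame(faces, edges, total_frames):
    """Return (frame, rule) following the documented fallback chain."""
    limit = total_frames / 2
    # Assumption: only faces in the first half of the video are candidates.
    early = [f for f in faces if f.frame < limit]
    if not early:
        raise NoFaceError("no usable face")

    # Step 1: the two identities with the most detections in the whole file.
    counts = Counter(f.identity for f in faces if f.identity is not None)
    main = [identity for identity, _ in counts.most_common(2)]

    if len(main) == 2:
        pair = frozenset(main)
        # Step 2: mutual-gaze edge between the two main identities.
        gaze = [e.first_frame for e in edges
                if e.pair == pair and e.mutual_gaze and e.first_frame < limit]
        if gaze:
            return min(gaze), "mutual_gaze"
        # Step 3: first frame in which both identities appear.
        frames_a = {f.frame for f in early if f.identity == main[0]}
        frames_b = {f.frame for f in early if f.identity == main[1]}
        shared = frames_a & frames_b
        if shared:
            return min(shared), "co_occurrence"
    elif len(main) == 1:
        # Step 4: best face of the only main identity.
        own = [f for f in early if f.identity == main[0]]
        if own:
            return max(own, key=lambda f: f.quality).frame, "single_identity"

    # Step 5: best face overall (assumed to include unmatched faces).
    return max(early, key=lambda f: f.quality).frame, "any_identity"
```

<p>Returning the name of the rule alongside the frame is a choice made for this sketch only; it makes each branch easy to check in isolation.</p>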
|
||||
<h4>Examples</h4>
|
||||
<div class="codehilite"><pre><span></span><code><span class="c1"># Auto-detect best representative frame</span>
|
||||
curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">"</span><span class="nv">$API</span><span class="s2">/api/v1/file/</span><span class="nv">$FILE_UUID</span><span class="s2">/thumbnail"</span><span class="w"> </span><span class="se">\</span>
|
||||
<span class="w"> </span>-H<span class="w"> </span><span class="s2">"X-API-Key: </span><span class="nv">$KEY</span><span class="s2">"</span><span class="w"> </span>-o<span class="w"> </span>representative.jpg
|
||||
|
||||
<span class="c1"># Extract frame 1000 (full frame)</span>
|
||||
curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">"</span><span class="nv">$API</span><span class="s2">/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/thumbnail?frame=1000"</span><span class="w"> </span><span class="se">\</span>
|
||||
<span class="w"> </span>-H<span class="w"> </span><span class="s2">"Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">"</span><span class="w"> </span>-o<span class="w"> </span>frame_1000.jpg
|
||||
|
||||
@@ -359,10 +375,185 @@ curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">"</s
|
||||
<h4>Response</h4>
|
||||
<ul>
|
||||
<li><strong>200</strong>: <code>image/jpeg</code> binary data</li>
|
||||
<li><strong>404</strong>: File not found</li>
|
||||
<li><strong>404</strong>: File not found / No faces in file (auto-detect)</li>
|
||||
<li><strong>500</strong>: FFmpeg error (e.g., frame number exceeds video duration)</li>
|
||||
</ul>
|
||||
<h3><code>GET /api/v1/file/:file_uuid/clip</code></h3>
|
||||
<h4>Technical Details</h4>
|
||||
<table class="table">
<thead>
<tr>
<th>Detail</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Backend</strong></td>
<td>FFmpeg (<code>ffmpeg-full</code>)</td>
</tr>
<tr>
<td><strong>Filter</strong></td>
<td><code>select=eq(n\,FRAME)</code> to select frame, optional <code>crop=W:H:X:Y</code></td>
</tr>
<tr>
<td><strong>Output</strong></td>
<td>Single JPEG via pipe (<code>image2pipe</code>, <code>mjpeg</code> codec)</td>
</tr>
<tr>
<td><strong>Cache</strong></td>
<td><code>Cache-Control: public, max-age=86400</code> (24h)</td>
</tr>
<tr>
<td><strong>Frame number</strong></td>
<td>Zero-based (<code>frame=0</code> = first frame of video)</td>
</tr>
</tbody>
</table>
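<p>To make the filter and output rows concrete, here is a minimal sketch of how such an FFmpeg invocation could be assembled. The helper names <code>parse_crop</code> and <code>build_ffmpeg_args</code> are hypothetical, and the exact argument list the service passes to FFmpeg is not shown in this document; the sketch only combines the documented <code>select</code> and <code>crop</code> filters with standard FFmpeg options.</p>

```python
from typing import Dict, List, Optional, Tuple

Crop = Tuple[int, int, int, int]  # (x, y, w, h)


def parse_crop(params: Dict[str, str]) -> Optional[Crop]:
    """Enforce the documented rule: x, y, w, h all present or all absent."""
    keys = ("x", "y", "w", "h")
    present = [k for k in keys if k in params]
    if not present:
        return None
    if len(present) != len(keys):
        raise ValueError("x, y, w and h must be provided together")
    x, y, w, h = (int(params[k]) for k in keys)
    return x, y, w, h


def build_ffmpeg_args(path: str, frame: int, crop: Optional[Crop] = None) -> List[str]:
    """Build an argument list that writes one JPEG frame to stdout."""
    if frame < 0:
        raise ValueError("frame is zero-based and must not be negative")
    # The comma inside eq() is escaped so that the filtergraph parser
    # does not treat it as a separator between filters.
    filters = [f"select=eq(n\\,{frame})"]
    if crop is not None:
        x, y, w, h = crop
        filters.append(f"crop={w}:{h}:{x}:{y}")
    return [
        "ffmpeg", "-i", path,
        "-vf", ",".join(filters),
        "-frames:v", "1",      # stop after the selected frame
        "-f", "image2pipe",
        "-c:v", "mjpeg",
        "pipe:1",              # write to stdout
    ]
```

<p>Passing the list directly to a process launcher (no shell) avoids a second layer of quoting around the escaped comma.</p>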
|
||||
<hr />
|
||||
<h3><code>GET /api/v1/file/:file_uuid/representative-frame</code></h3>
|
||||
<p>Return JSON metadata about the best representative frame for the video. Uses the same auto-detect algorithm as <code>GET /thumbnail</code> (without crop support).</p>
|
||||
<p><strong>Auth</strong>: Required
|
||||
<strong>Scope</strong>: file-level</p>
|
||||
<h4>Example</h4>
|
||||
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">"</span><span class="nv">$API</span><span class="s2">/api/v1/file/</span><span class="nv">$FILE_UUID</span><span class="s2">/representative-frame"</span><span class="w"> </span><span class="se">\</span>
|
||||
<span class="w"> </span>-H<span class="w"> </span><span class="s2">"X-API-Key: </span><span class="nv">$KEY</span><span class="s2">"</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">'.'</span>
|
||||
</code></pre></div>
|
||||
|
||||
<h4>Response (200)</h4>
|
||||
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
|
||||
<span class="w"> </span><span class="nt">"success"</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"file_uuid"</span><span class="p">:</span><span class="w"> </span><span class="s2">"aeed71342a899fe4b4c57b7d41bcb692"</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"frame_number"</span><span class="p">:</span><span class="w"> </span><span class="mi">38165</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"timestamp_secs"</span><span class="p">:</span><span class="w"> </span><span class="mf">1526.6</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"face_quality"</span><span class="p">:</span><span class="w"> </span><span class="mf">37292.97</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"main_identities"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
|
||||
<span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="nt">"identity_uuid"</span><span class="p">:</span><span class="w"> </span><span class="s2">"c3545906-c82d-4b66-aa1d-150bc02decce"</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Audrey Hepburn"</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"face_count"</span><span class="p">:</span><span class="w"> </span><span class="mi">16456</span>
|
||||
<span class="w"> </span><span class="p">},</span>
|
||||
<span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="nt">"identity_uuid"</span><span class="p">:</span><span class="w"> </span><span class="s2">"2b0ddefe-e2a9-4533-9308-b375594604d5"</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Cary Grant"</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"face_count"</span><span class="p">:</span><span class="w"> </span><span class="mi">10643</span>
|
||||
<span class="w"> </span><span class="p">}</span>
|
||||
<span class="w"> </span><span class="p">],</span>
|
||||
<span class="w"> </span><span class="nt">"traces"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
|
||||
<span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="nt">"trace_id"</span><span class="p">:</span><span class="w"> </span><span class="mi">919</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"identity_uuid"</span><span class="p">:</span><span class="w"> </span><span class="s2">"2b0ddefe-e2a9-4533-9308-b375594604d5"</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Cary Grant"</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"x"</span><span class="p">:</span><span class="w"> </span><span class="mi">764</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"y"</span><span class="p">:</span><span class="w"> </span><span class="mi">237</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"width"</span><span class="p">:</span><span class="w"> </span><span class="mi">199</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"height"</span><span class="p">:</span><span class="w"> </span><span class="mi">199</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"confidence"</span><span class="p">:</span><span class="w"> </span><span class="mf">0.8426</span>
|
||||
<span class="w"> </span><span class="p">},</span>
|
||||
<span class="w"> </span><span class="p">{</span>
|
||||
<span class="w"> </span><span class="nt">"trace_id"</span><span class="p">:</span><span class="w"> </span><span class="mi">920</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"identity_uuid"</span><span class="p">:</span><span class="w"> </span><span class="s2">"c3545906-c82d-4b66-aa1d-150bc02decce"</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">"Audrey Hepburn"</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"x"</span><span class="p">:</span><span class="w"> </span><span class="mi">1143</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"y"</span><span class="p">:</span><span class="w"> </span><span class="mi">312</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"width"</span><span class="p">:</span><span class="w"> </span><span class="mi">215</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"height"</span><span class="p">:</span><span class="w"> </span><span class="mi">215</span><span class="p">,</span>
|
||||
<span class="w"> </span><span class="nt">"confidence"</span><span class="p">:</span><span class="w"> </span><span class="mf">0.8068</span>
|
||||
<span class="w"> </span><span class="p">}</span>
|
||||
<span class="w"> </span><span class="p">]</span>
|
||||
<span class="p">}</span>
|
||||
</code></pre></div>
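<p>One possible use of this response is to request a cropped thumbnail of a single face. The sketch below builds the <code>GET /thumbnail</code> URL from a trace's bounding box. The function name <code>face_thumbnail_url</code> and the base URL are placeholders, and it assumes the trace coordinates are in the same pixel space as the thumbnail crop parameters, which this document does not state explicitly.</p>

```python
from urllib.parse import urlencode


def face_thumbnail_url(api: str, response: dict, name: str) -> str:
    """Build a cropped-thumbnail URL for the first trace matching `name`."""
    trace = next(t for t in response["traces"] if t["name"] == name)
    query = urlencode({
        "frame": response["frame_number"],
        "x": trace["x"],
        "y": trace["y"],
        "w": trace["width"],   # thumbnail uses w/h, the response uses width/height
        "h": trace["height"],
    })
    return f"{api}/api/v1/file/{response['file_uuid']}/thumbnail?{query}"


# Subset of the sample response shown above.
sample = {
    "file_uuid": "aeed71342a899fe4b4c57b7d41bcb692",
    "frame_number": 38165,
    "traces": [
        {"trace_id": 919, "name": "Cary Grant",
         "x": 764, "y": 237, "width": 199, "height": 199},
        {"trace_id": 920, "name": "Audrey Hepburn",
         "x": 1143, "y": 312, "width": 215, "height": 215},
    ],
}

print(face_thumbnail_url("https://api.example", sample, "Audrey Hepburn"))
```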
|
||||
|
||||
<h4>Response Fields</h4>
|
||||
<table class="table">
|
||||
<thead>
|
||||
<tr>
|
||||
<th>Field</th>
|
||||
<th>Type</th>
|
||||
<th>Description</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr>
|
||||
<td><code>frame_number</code></td>
|
||||
<td>integer</td>
|
||||
<td>Selected representative frame number (primary coordinate)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>timestamp_secs</code></td>
|
||||
<td>float</td>
|
||||
<td>Time in seconds (derived from <code>frame_number / fps</code>)</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>face_quality</code></td>
|
||||
<td>float</td>
|
||||
<td>Quality score (<code>area × confidence</code>) of the best face at this frame</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>main_identities</code></td>
|
||||
<td>array</td>
|
||||
<td>Top 2 most frequent TMDb identities in the file</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>main_identities[].name</code></td>
|
||||
<td>string</td>
|
||||
<td>Identity display name</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>main_identities[].face_count</code></td>
|
||||
<td>integer</td>
|
||||
<td>Total number of face detections for this identity</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>traces</code></td>
|
||||
<td>array</td>
|
||||
<td>All face traces present at the selected frame</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>traces[].trace_id</code></td>
|
||||
<td>integer</td>
|
||||
<td>Face trace ID</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>traces[].identity_uuid</code></td>
|
||||
<td>string or null</td>
|
||||
<td>Matched identity UUID</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>traces[].name</code></td>
|
||||
<td>string or null</td>
|
||||
<td>Identity name</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>traces[].x, y, width, height</code></td>
|
||||
<td>integer</td>
|
||||
<td>Bounding box coordinates</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>traces[].confidence</code></td>
|
||||
<td>float</td>
|
||||
<td>Detection confidence (0.0–1.0)</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
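<p>The two derived fields can be checked against the sample response with a few lines of arithmetic. In this sketch the frame rate of 25 fps is an inference from the sample values (38165 / 1526.6), not something the API returns, and the helper names are hypothetical.</p>

```python
def timestamp_secs(frame_number: int, fps: float) -> float:
    """timestamp_secs = frame_number / fps"""
    return frame_number / fps


def face_quality(trace: dict) -> float:
    """Quality score: bounding-box area x detection confidence."""
    return trace["width"] * trace["height"] * trace["confidence"]


# Bounding boxes and confidences taken from the sample response.
traces = [
    {"name": "Cary Grant", "width": 199, "height": 199, "confidence": 0.8426},
    {"name": "Audrey Hepburn", "width": 215, "height": 215, "confidence": 0.8068},
]

best = max(traces, key=face_quality)
print(timestamp_secs(38165, 25.0))
print(best["name"], round(face_quality(best), 2))
```

<p>The recomputed score for the best trace is close to, but not exactly, the <code>face_quality</code> in the sample; one plausible explanation is that <code>confidence</code> is rounded in the JSON output, though this document does not confirm it.</p>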
|
||||
<h4>Error Responses</h4>
|
||||
<table class="table">
|
||||
<thead>
|
||||
<tr>
|
||||
<th>HTTP</th>
|
||||
<th>When</th>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr>
|
||||
<td><code>404</code></td>
|
||||
<td>File not found / No faces in file</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td><code>500</code></td>
|
||||
<td>Database error</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<p>Extract a video clip (time range) as an MPEG-TS stream. Uses FFmpeg <code>-ss</code> fast seek.</p>
|
||||
<p><strong>Auth</strong>: Required
|
||||
<strong>Scope</strong>: file-level</p>
|
||||