feat: ASRX hybrid pipeline, identity history, worker fixes, checkpoint system

This commit is contained in:
Accusys
2026-06-02 07:13:23 +08:00
parent e3066c3f49
commit e1572907ae
198 changed files with 43705 additions and 8910 deletions

View File

@@ -38,7 +38,7 @@ a { color: #0066cc; }
<h2>Search APIs</h2>
<h3><code>POST /api/v1/search/smart</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<strong>Scope</strong>: global / file-level</p>
<p>Semantic vector search using EmbeddingGemma-300m. Generates a query embedding via EmbeddingGemma (port 11436), then searches pgvector <code>story_parent</code> and <code>llm_parent</code> chunks by cosine similarity.</p>
<h4>Request Parameters</h4>
<table class="table">
@@ -53,13 +53,6 @@ a { color: #0066cc; }
</thead>
<tbody>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>Yes</td>
<td></td>
<td>File UUID to search within</td>
</tr>
<tr>
<td><code>query</code></td>
<td>string</td>
<td>Yes</td>
@@ -67,6 +60,13 @@ a { color: #0066cc; }
<td>Search text</td>
</tr>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>No</td>
<td></td>
<td>File UUID to search within. If omitted, searches all files (global search)</td>
</tr>
<tr>
<td><code>limit</code></td>
<td>integer</td>
<td>No</td>
@@ -89,7 +89,14 @@ a { color: #0066cc; }
</tr>
</tbody>
</table>
<h4>Example</h4>
<h4>Example (Global Search)</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/search/smart&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;query&quot;: &quot;Audrey Hepburn&quot;}&#39;</span>
</code></pre></div>
<h4>Example (File-specific Search)</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/search/smart&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
@@ -101,6 +108,7 @@ a { color: #0066cc; }
<span class="w"> </span><span class="nt">&quot;query&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Audrey Hepburn&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;results&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;a6fb22eebefaef17e62af874997c5944&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;parent_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">1087822</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;scene_order&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">1087822</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;start_frame&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">104438</span><span class="p">,</span>
@@ -118,10 +126,26 @@ a { color: #0066cc; }
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>results[].file_uuid</code></td>
<td>string</td>
<td>File UUID where result was found</td>
</tr>
</tbody>
</table>
<hr />
<h3><code>POST /api/v1/search/universal</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<strong>Scope</strong>: global / file-level</p>
<p>Multi-type BM25 full-text search across chunks, frames, and persons. Uses PostgreSQL <code>tsvector</code>.</p>
<h4>Request Parameters</h4>
<table class="table">
@@ -147,7 +171,7 @@ a { color: #0066cc; }
<td>string</td>
<td>No</td>
<td></td>
<td>Restrict to specific file</td>
<td>Restrict to specific file. If omitted, searches all files (global search)</td>
</tr>
<tr>
<td><code>types</code></td>
@@ -179,7 +203,14 @@ a { color: #0066cc; }
</tr>
</tbody>
</table>
<h4>Example</h4>
<h4>Example (Global Search)</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/search/universal&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;query&quot;: &quot;Cary Grant&quot;}&#39;</span>
</code></pre></div>
<h4>Example (File-specific Search)</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/search/universal&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
@@ -191,6 +222,7 @@ a { color: #0066cc; }
<span class="w"> </span><span class="nt">&quot;results&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;type&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;chunk&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;a6fb22eebefaef17e62af874997c5944&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;chunk_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;bd80fec92b0b6963d177a2c55bf713e2_2&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;chunk_type&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;story_child&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;start_frame&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">5103</span><span class="p">,</span>
@@ -199,6 +231,25 @@ a { color: #0066cc; }
<span class="w"> </span><span class="nt">&quot;end_time&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">213.64</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;text&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;[213s-214s] Cary Grant: \&quot;Olá!\&quot;&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;score&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">0.9</span>
<span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;type&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;frame&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;a6fb22eebefaef17e62af874997c5944&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;frame_number&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">5105</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;timestamp&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">212.72</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;score&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">0.7</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;objects&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">null</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;ocr_texts&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">null</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;faces&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">null</span>
<span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;type&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;person&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;a6fb22eebefaef17e62af874997c5944&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;identity_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">12</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;identity_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;a9a901056d6b46ff92da0c3c1a57dff4&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Cary Grant&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;appearance_count&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">542</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;score&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">0.95</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">],</span>
<span class="w"> </span><span class="nt">&quot;total&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">20</span><span class="p">,</span>
@@ -206,16 +257,140 @@ a { color: #0066cc; }
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>results[].type</code></td>
<td>string</td>
<td>Result type: <code>chunk</code>, <code>frame</code>, or <code>person</code></td>
</tr>
<tr>
<td><code>results[].file_uuid</code></td>
<td>string</td>
<td>File UUID where result was found (all types)</td>
</tr>
</tbody>
</table>
<hr />
<h3><code>POST /api/v1/search/frames</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<strong>Scope</strong>: global / file-level</p>
<p>Search face detection frames by identity name or trace ID.</p>
<hr />
<h3><code>POST /api/v1/search/identity_text</code></h3>
<h3><code>GET /api/v1/search/identity_text</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<p>Search text chunks spoken by a specific identity.</p>
<strong>Scope</strong>: global / file-level</p>
<p>Search text chunks → find associated identities. Returns chunks where face detections overlap with text content.</p>
<h4>Query Parameters</h4>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Default</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>q</code></td>
<td>string</td>
<td>Yes</td>
<td></td>
<td>Search text (ILIKE match)</td>
</tr>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>No</td>
<td></td>
<td>Restrict to specific file. If omitted, searches all files (global search)</td>
</tr>
<tr>
<td><code>limit</code></td>
<td>integer</td>
<td>No</td>
<td>50</td>
<td>Max results</td>
</tr>
<tr>
<td><code>page</code></td>
<td>integer</td>
<td>No</td>
<td>1</td>
<td>Page number</td>
</tr>
<tr>
<td><code>page_size</code></td>
<td>integer</td>
<td>No</td>
<td>50</td>
<td>Items per page</td>
</tr>
</tbody>
</table>
<h4>Example (Global Search)</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/search/identity_text?q=love&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span>
</code></pre></div>
<h4>Example (File-specific Search)</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/search/identity_text?file_uuid=</span><span class="nv">$FILE_UUID</span><span class="s2">&amp;q=love&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;total&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">5</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;results&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;a6fb22eebefaef17e62af874997c5944&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;chunk_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;llm_parent_..._256_270&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;start_time&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">256.256</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;end_time&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">270.228</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;text_content&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;...lack of affection...&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;identity_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">9</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;identity_name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Audrey Hepburn&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;identity_source&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;tmdb&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;trace_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">94</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">]</span>
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>results[].file_uuid</code></td>
<td>string</td>
<td>File UUID where chunk was found</td>
</tr>
<tr>
<td><code>results[].identity_id</code></td>
<td>integer</td>
<td>Identity ID if face was detected</td>
</tr>
<tr>
<td><code>results[].trace_id</code></td>
<td>integer</td>
<td>Face trace ID</td>
</tr>
</tbody>
</table>
<hr />
<h3>Visual Search</h3>
<table class="table">
@@ -282,7 +457,7 @@ a { color: #0066cc; }
</tbody>
</table>
<hr />
<p><em>Updated: 2026-05-19 12:49:24</em></p>
<p><em>Updated: 2026-05-27 — Added global search support for smart, universal, identity_text APIs</em></p>
</div>
</body>
</html>

File diff suppressed because it is too large Load Diff

View File

@@ -294,6 +294,7 @@ curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</s
<hr />
<h3><code>GET /api/v1/file/:file_uuid/thumbnail</code></h3>
<p>Extract a single frame from a video as JPEG image. Uses FFmpeg <code>select</code> filter.</p>
<p>When <code>frame</code> is omitted, the system automatically selects the best representative frame using the TKG bridge (see algorithm below).</p>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<h4>Query Parameters</h4>
@@ -311,9 +312,9 @@ curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</s
<tr>
<td><code>frame</code></td>
<td>integer</td>
<td>Yes</td>
<td></td>
<td>Zero-based frame number to extract</td>
<td>No</td>
<td>auto-detect</td>
<td>Zero-based frame number to extract. Omit for auto-detect.</td>
</tr>
<tr>
<td><code>x</code></td>
@@ -346,8 +347,23 @@ curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</s
</tbody>
</table>
<p>All four crop params (<code>x</code>, <code>y</code>, <code>w</code>, <code>h</code>) must be provided together or omitted.</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code><span class="c1"># Extract frame 1000 (full frame)</span>
<h4>Auto-detect Algorithm</h4>
<p>When <code>frame</code> is not provided, the endpoint finds the best frame using this fallback chain:</p>
<ol>
<li><strong>Main characters</strong>: find the two identities with the most face detections (TMDb source)</li>
<li><strong>Mutual gaze</strong>: if their face traces have a TKG <code>CO_OCCURS_WITH</code> edge with <code>mutual_gaze=true</code>, take <code>first_frame</code></li>
<li><strong>Co-occurrence</strong>: fallback to the first frame where both identities appear together</li>
<li><strong>Single identity</strong>: if only one main identity exists, take its highest-quality face frame</li>
<li><strong>Any identity</strong>: fallback to the best-quality face frame across all identities</li>
<li><strong>Error</strong>: if no face exists, returns <code>404</code></li>
</ol>
<p>The selected frame is constrained to the <strong>first half of the video</strong> (<code>total_frames / 2</code>).</p>
<h4>Examples</h4>
<div class="codehilite"><pre><span></span><code><span class="c1"># Auto-detect best representative frame</span>
curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/file/</span><span class="nv">$FILE_UUID</span><span class="s2">/thumbnail&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span>-o<span class="w"> </span>representative.jpg
<span class="c1"># Extract frame 1000 (full frame)</span>
curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/file/bd80fec92b0b6963d177a2c55bf713e2/thumbnail?frame=1000&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">&quot;</span><span class="w"> </span>-o<span class="w"> </span>frame_1000.jpg
@@ -359,10 +375,185 @@ curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</s
<h4>Response</h4>
<ul>
<li><strong>200</strong>: <code>image/jpeg</code> binary data</li>
<li><strong>404</strong>: File not found</li>
<li><strong>404</strong>: File not found / No faces in file (auto-detect)</li>
<li><strong>500</strong>: FFmpeg error (e.g., frame number exceeds video duration)</li>
</ul>
<h3><code>GET /api/v1/file/:file_uuid/clip</code></h3>
<h4>Technical Details</h4>
<table class="table">
<thead>
<tr>
<th>Detail</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>Backend</strong></td>
<td>FFmpeg (<code>ffmpeg-full</code>)</td>
</tr>
<tr>
<td><strong>Filter</strong></td>
<td><code>select=eq(n\,FRAME)</code> to select frame, optional <code>crop=W:H:X:Y</code></td>
</tr>
<tr>
<td><strong>Output</strong></td>
<td>Single JPEG via pipe (<code>image2pipe</code>, <code>mjpeg</code> codec)</td>
</tr>
<tr>
<td><strong>Cache</strong></td>
<td><code>Cache-Control: public, max-age=86400</code> (24h)</td>
</tr>
<tr>
<td><strong>Frame number</strong></td>
<td>Zero-based (<code>frame=0</code> = first frame of video)</td>
</tr>
</tbody>
</table>
<hr />
<h3><code>GET /api/v1/file/:file_uuid/representative-frame</code></h3>
<p>Return JSON metadata about the best representative frame for the video. Uses the same auto-detect algorithm as <code>GET /thumbnail</code> (without crop support).</p>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<h4>Example</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/file/</span><span class="nv">$FILE_UUID</span><span class="s2">/representative-frame&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span><span class="w"> </span><span class="p">|</span><span class="w"> </span>jq<span class="w"> </span><span class="s1">&#39;.&#39;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;aeed71342a899fe4b4c57b7d41bcb692&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;frame_number&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">38165</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;timestamp_secs&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">1526.6</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;face_quality&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">37292.97</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;main_identities&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;identity_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;c3545906-c82d-4b66-aa1d-150bc02decce&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Audrey Hepburn&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;face_count&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">16456</span>
<span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;identity_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;2b0ddefe-e2a9-4533-9308-b375594604d5&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Cary Grant&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;face_count&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">10643</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">],</span>
<span class="w"> </span><span class="nt">&quot;traces&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;trace_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">919</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;identity_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;2b0ddefe-e2a9-4533-9308-b375594604d5&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Cary Grant&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;x&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">764</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;y&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">237</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;width&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">199</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;height&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">199</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;confidence&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">0.8426</span>
<span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;trace_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">920</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;identity_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;c3545906-c82d-4b66-aa1d-150bc02decce&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Audrey Hepburn&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;x&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">1143</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;y&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">312</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;width&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">215</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;height&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">215</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;confidence&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">0.8068</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">]</span>
<span class="p">}</span>
</code></pre></div>
<h4>Response Fields</h4>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>frame_number</code></td>
<td>integer</td>
<td>Selected representative frame number (primary coordinate)</td>
</tr>
<tr>
<td><code>timestamp_secs</code></td>
<td>float</td>
<td>Time in seconds (derived from <code>frame_number / fps</code>)</td>
</tr>
<tr>
<td><code>face_quality</code></td>
<td>float</td>
<td>Quality score <code>area × confidence</code> of the best face at this frame</td>
</tr>
<tr>
<td><code>main_identities</code></td>
<td>array</td>
<td>Top 2 most frequent TMDb identities in the file</td>
</tr>
<tr>
<td><code>main_identities[].name</code></td>
<td>string</td>
<td>Identity display name</td>
</tr>
<tr>
<td><code>main_identities[].face_count</code></td>
<td>integer</td>
<td>Total face detections count</td>
</tr>
<tr>
<td><code>traces</code></td>
<td>array</td>
<td>All face traces present at the selected frame</td>
</tr>
<tr>
<td><code>traces[].trace_id</code></td>
<td>integer</td>
<td>Face trace ID</td>
</tr>
<tr>
<td><code>traces[].identity_uuid</code></td>
<td>string or null</td>
<td>Matched identity UUID</td>
</tr>
<tr>
<td><code>traces[].name</code></td>
<td>string or null</td>
<td>Identity name</td>
</tr>
<tr>
<td><code>traces[].x, y, width, height</code></td>
<td>integer</td>
<td>Bounding box coordinates</td>
</tr>
<tr>
<td><code>traces[].confidence</code></td>
<td>float</td>
<td>Detection confidence (0.01.0)</td>
</tr>
</tbody>
</table>
<h4>Error Responses</h4>
<table class="table">
<thead>
<tr>
<th>HTTP</th>
<th>When</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>404</code></td>
<td>File not found / No faces in file</td>
</tr>
<tr>
<td><code>500</code></td>
<td>Database error</td>
</tr>
</tbody>
</table>
<p>Extract a video clip (time range) as MPEG-TS stream. Uses FFmpeg <code>-ss</code> fast seek.</p>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>

View File

@@ -209,7 +209,191 @@ a { color: #0066cc; }
</tbody>
</table>
<hr />
<p><em>Updated: 2026-05-19 12:49:24</em></p>
<h2>POST /api/v1/agents/search</h2>
<p>Conversational search assistant. Uses Gemma4 function calling to automatically decide which tools to call based on the user's natural language query. Supports multi-turn conversation.</p>
<h3>Request</h3>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;query&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Audrey Hepburn 和 Cary Grant 第一次同框在哪個 frame&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;conversation_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">null</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">null</span>
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>query</code></td>
<td>string</td>
<td></td>
<td>自然語言查詢</td>
</tr>
<tr>
<td><code>conversation_id</code></td>
<td>string</td>
<td></td>
<td>延續對話時傳入;新對話不傳</td>
</tr>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td></td>
<td>Portal 有選中檔案時可指定</td>
</tr>
</tbody>
</table>
<h3>Response</h3>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;conversation_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;conv_abc123&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;answer&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;在 Charade (1963) 中Audrey Hepburn 與 Cary Grant 第一次同框在第 38619 幀(約 1544.76 秒)。&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;need_input&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;sources&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;tool&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;tkg_query&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;result&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;{\&quot;first_cooccurrence\&quot;:{\&quot;frame\&quot;:38619,\&quot;timestamp_secs\&quot;:1544.76}}&quot;</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">]</span>
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>conversation_id</code></td>
<td>string</td>
<td>後續對話需要傳入此 ID</td>
</tr>
<tr>
<td><code>answer</code></td>
<td>string</td>
<td>Agent 的自然語言回答(或反問)</td>
</tr>
<tr>
<td><code>need_input</code></td>
<td>boolean</td>
<td><code>true</code> 表示 agent 需要更多資訊才能回答</td>
</tr>
<tr>
<td><code>suggestions</code></td>
<td>string[]</td>
<td>建議用戶提供的線索(當 <code>need_input=true</code></td>
</tr>
<tr>
<td><code>sources</code></td>
<td>array</td>
<td>引用的工具執行結果</td>
</tr>
</tbody>
</table>
<h3>Conversation Flow</h3>
<div class="codehilite"><pre><span></span><code>Round 1: POST /agents/search { query: &quot;我想看男女主角同框&quot; }
→ need_input: true, suggestions: [&quot;片名&quot;, &quot;演員&quot;, &quot;年代&quot;]
→ answer: &quot;請問是哪部電影?請提供更多線索&quot;
Round 2: POST /agents/search { query: &quot;奧黛麗赫本&quot;, conversation_id: &quot;...&quot; }
→ need_input: false
→ answer: &quot;找到 Charade (1963)Audrey Hepburn 和 Cary Grant...&quot;
</code></pre></div>
<h3>Available Tools</h3>
<p>Agent 內部使用 Gemma4 function calling 自動調用以下工具:</p>
<table class="table">
<thead>
<tr>
<th>Tool</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>find_file</code></td>
<td>透過片名/演員/年份關鍵字搜尋影片,回傳 file_uuid + has_data 狀態</td>
</tr>
<tr>
<td><code>list_files</code></td>
<td>列出近期註冊的影片</td>
</tr>
<tr>
<td><code>tkg_query</code></td>
<td>查詢人物互動資料7 種子類型top_identities、first_cooccurrence、identity_details、mutual_gaze、interaction_network、identity_traces、file_info</td>
</tr>
<tr>
<td><code>smart_search</code></td>
<td>文字內容 ILIKE 搜尋 chunk可指定 file_uuid 限制範圍)</td>
</tr>
<tr>
<td><code>get_identity_detail</code></td>
<td>查詢單一身份的詳細資料角色、TMDb 資訊)</td>
</tr>
<tr>
<td><code>get_file_info</code></td>
<td>查詢影片基本資訊(片長、解析度)</td>
</tr>
<tr>
<td><code>get_representative_frame</code></td>
<td>查詢影片最具代表性的 frame 資訊</td>
</tr>
</tbody>
</table>
<h3>Design Principles</h3>
<ul>
<li><strong>用戶不需要知道 file_uuid</strong> — Agent 會自動用 <code>find_file</code> 搜尋或反問</li>
<li><strong>不推薦無資料的影片</strong><code>has_data=false</code> 的影片不會被推薦給用戶</li>
<li><strong>多輪對話</strong> — 透過 <code>conversation_id</code> 延續上下文agent 會記得之前的交流</li>
<li><strong>並行工具呼叫</strong> — Gemma4 可以一次呼叫多個工具再綜合回答</li>
</ul>
<h3>Model</h3>
<table class="table">
<thead>
<tr>
<th>Detail</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>LLM</strong></td>
<td>Gemma4 26B (Q5_K_M)</td>
</tr>
<tr>
<td><strong>Engine</strong></td>
<td>llama.cpp at <code>localhost:8082</code></td>
</tr>
<tr>
<td><strong>Endpoint</strong></td>
<td><code>/v1/chat/completions</code> (OpenAI-compatible)</td>
</tr>
<tr>
<td><strong>Temperature</strong></td>
<td>0.1</td>
</tr>
<tr>
<td><strong>Max rounds</strong></td>
<td>5 (tool call iterations)</td>
</tr>
<tr>
<td><strong>Conversation TTL</strong></td>
<td>30 minutes</td>
</tr>
</tbody>
</table>
<hr />
<p><em>Updated: 2026-05-22</em></p>
</div>
</body>
</html>