feat: ASRX hybrid pipeline, identity history, worker fixes, checkpoint system

Accusys
2026-06-02 07:13:23 +08:00
parent e3066c3f49
commit e1572907ae
198 changed files with 43705 additions and 8910 deletions

@@ -38,7 +38,7 @@ a { color: #0066cc; }
<h2>Search APIs</h2>
<h3><code>POST /api/v1/search/smart</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<strong>Scope</strong>: global / file-level</p>
<p>Semantic vector search using EmbeddingGemma-300m. Generates a query embedding via EmbeddingGemma (port 11436), then searches pgvector <code>story_parent</code> and <code>llm_parent</code> chunks by cosine similarity.</p>
<h4>Request Parameters</h4>
<table class="table">
@@ -53,13 +53,6 @@ a { color: #0066cc; }
</thead>
<tbody>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>Yes</td>
<td></td>
<td>File UUID to search within</td>
</tr>
<tr>
<td><code>query</code></td>
<td>string</td>
<td>Yes</td>
@@ -67,6 +60,13 @@ a { color: #0066cc; }
<td>Search text</td>
</tr>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>No</td>
<td></td>
<td>File UUID to search within. If omitted, searches all files (global search)</td>
</tr>
<tr>
<td><code>limit</code></td>
<td>integer</td>
<td>No</td>
@@ -89,7 +89,14 @@ a { color: #0066cc; }
</tr>
</tbody>
</table>
<h4>Example</h4>
<h4>Example (Global Search)</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/search/smart&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;query&quot;: &quot;Audrey Hepburn&quot;}&#39;</span>
</code></pre></div>
<h4>Example (File-specific Search)</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/search/smart&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
@@ -101,6 +108,7 @@ a { color: #0066cc; }
<span class="w"> </span><span class="nt">&quot;query&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Audrey Hepburn&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;results&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;a6fb22eebefaef17e62af874997c5944&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;parent_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">1087822</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;scene_order&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">1087822</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;start_frame&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">104438</span><span class="p">,</span>
@@ -118,10 +126,26 @@ a { color: #0066cc; }
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>results[].file_uuid</code></td>
<td>string</td>
<td>UUID of the file where the result was found</td>
</tr>
</tbody>
</table>
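<p>The difference between the two scopes comes down to whether <code>file_uuid</code> appears in the request body. The following is a minimal sketch, not the project's client code: the helper names <code>build_smart_search_body</code> and <code>group_by_file</code> are hypothetical, and the second sample result is invented for illustration. It builds the body for either scope and then uses <code>results[].file_uuid</code> to group hits from a global search by file.</p>

```python
import json
from collections import defaultdict
from typing import Optional


def build_smart_search_body(query: str, file_uuid: Optional[str] = None) -> str:
    """Serialize the JSON body for POST /api/v1/search/smart."""
    body = {"query": query}
    if file_uuid is not None:
        # Including file_uuid restricts the search to one file;
        # leaving it out requests a global search.
        body["file_uuid"] = file_uuid
    return json.dumps(body)


def group_by_file(response: dict) -> dict:
    """Map each file_uuid to the results[] entries found in that file."""
    grouped = defaultdict(list)
    for result in response.get("results", []):
        grouped[result["file_uuid"]].append(result)
    return dict(grouped)


# Trimmed sample shaped like the documented response.
# The second entry is hypothetical and only there to show grouping.
sample = {
    "query": "Audrey Hepburn",
    "results": [
        {"file_uuid": "a6fb22eebefaef17e62af874997c5944", "parent_id": 1087822},
        {"file_uuid": "hypothetical-second-file", "parent_id": 42},
    ],
}

print(build_smart_search_body("Audrey Hepburn"))
print(sorted(group_by_file(sample)))
```

<p>Grouping on the client side is one possible design choice; it assumes every result carries <code>file_uuid</code>, as the field table above describes.</p>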
<hr />
<h3><code>POST /api/v1/search/universal</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<strong>Scope</strong>: global / file-level</p>
<p>Multi-type BM25 full-text search across chunks, frames, and persons. Uses PostgreSQL <code>tsvector</code>.</p>
<h4>Request Parameters</h4>
<table class="table">
@@ -147,7 +171,7 @@ a { color: #0066cc; }
<td>string</td>
<td>No</td>
<td></td>
<td>Restrict to specific file</td>
<td>Restrict to specific file. If omitted, searches all files (global search)</td>
</tr>
<tr>
<td><code>types</code></td>
@@ -179,7 +203,14 @@ a { color: #0066cc; }
</tr>
</tbody>
</table>
<h4>Example</h4>
<h4>Example (Global Search)</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/search/universal&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-d<span class="w"> </span><span class="s1">&#39;{&quot;query&quot;: &quot;Cary Grant&quot;}&#39;</span>
</code></pre></div>
<h4>Example (File-specific Search)</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span>-X<span class="w"> </span>POST<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/search/universal&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Content-Type: application/json&quot;</span><span class="w"> </span><span class="se">\</span>
<span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;Authorization: Bearer </span><span class="nv">$JWT</span><span class="s2">&quot;</span><span class="w"> </span><span class="se">\</span>
@@ -191,6 +222,7 @@ a { color: #0066cc; }
<span class="w"> </span><span class="nt">&quot;results&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;type&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;chunk&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;a6fb22eebefaef17e62af874997c5944&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;chunk_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;bd80fec92b0b6963d177a2c55bf713e2_2&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;chunk_type&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;story_child&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;start_frame&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">5103</span><span class="p">,</span>
@@ -199,6 +231,25 @@ a { color: #0066cc; }
<span class="w"> </span><span class="nt">&quot;end_time&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">213.64</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;text&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;[213s-214s] Cary Grant: \&quot;Olá!\&quot;&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;score&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">0.9</span>
<span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;type&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;frame&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;a6fb22eebefaef17e62af874997c5944&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;frame_number&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">5105</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;timestamp&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">212.72</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;score&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">0.7</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;objects&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">null</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;ocr_texts&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">null</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;faces&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">null</span>
<span class="w"> </span><span class="p">},</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;type&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;person&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;a6fb22eebefaef17e62af874997c5944&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;identity_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">12</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;identity_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;a9a901056d6b46ff92da0c3c1a57dff4&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Cary Grant&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;appearance_count&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">542</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;score&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">0.95</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">],</span>
<span class="w"> </span><span class="nt">&quot;total&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">20</span><span class="p">,</span>
@@ -206,16 +257,140 @@ a { color: #0066cc; }
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>results[].type</code></td>
<td>string</td>
<td>Result type: <code>chunk</code>, <code>frame</code>, or <code>person</code></td>
</tr>
<tr>
<td><code>results[].file_uuid</code></td>
<td>string</td>
<td>UUID of the file where the result was found (present on all result types)</td>
</tr>
</tbody>
</table>
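<p>Because each result type carries different fields, a client will usually branch on <code>results[].type</code> before reading anything else. The sketch below is illustrative only: <code>split_by_type</code> and <code>label</code> are hypothetical helpers, and the per-type fields are taken from the sample response above rather than from a formal schema.</p>

```python
from collections import defaultdict


def split_by_type(results: list) -> dict:
    """Bucket universal-search results by their "type" field."""
    buckets = defaultdict(list)
    for item in results:
        buckets[item["type"]].append(item)
    return dict(buckets)


def label(item: dict) -> str:
    """One-line summary using the fields shown for each type in the sample."""
    kind = item["type"]
    if kind == "chunk":
        return f"chunk {item['chunk_id']}: {item['text']}"
    if kind == "frame":
        return f"frame {item['frame_number']} at {item['timestamp']}s"
    if kind == "person":
        return f"person {item['name']} ({item['appearance_count']} appearances)"
    # Assumption: unknown types are reported rather than raising an error.
    return f"unrecognized type {kind!r}"


# Values copied from the sample response; unrelated fields are omitted.
sample_results = [
    {"type": "chunk", "chunk_id": "bd80fec92b0b6963d177a2c55bf713e2_2",
     "text": '[213s-214s] Cary Grant: "Olá!"', "score": 0.9},
    {"type": "frame", "frame_number": 5105, "timestamp": 212.72, "score": 0.7},
    {"type": "person", "name": "Cary Grant", "appearance_count": 542, "score": 0.95},
]

for item in sample_results:
    print(label(item))
```

<p>Falling back to a generic label for an unknown type is an assumption made here so that the sketch tolerates result types added later.</p>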
<hr />
<h3><code>POST /api/v1/search/frames</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<strong>Scope</strong>: global / file-level</p>
<p>Search face detection frames by identity name or trace ID.</p>
<hr />
<h3><code>POST /api/v1/search/identity_text</code></h3>
<h3><code>GET /api/v1/search/identity_text</code></h3>
<p><strong>Auth</strong>: Required
<strong>Scope</strong>: file-level</p>
<p>Search text chunks spoken by a specific identity.</p>
<strong>Scope</strong>: global / file-level</p>
<p>Searches text chunks and resolves the identities associated with them. Returns matching chunks whose content overlaps with face detections.</p>
<h4>Query Parameters</h4>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Default</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>q</code></td>
<td>string</td>
<td>Yes</td>
<td></td>
<td>Search text (matched with SQL <code>ILIKE</code>, so case-insensitive)</td>
</tr>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>No</td>
<td></td>
<td>Restrict to specific file. If omitted, searches all files (global search)</td>
</tr>
<tr>
<td><code>limit</code></td>
<td>integer</td>
<td>No</td>
<td>50</td>
<td>Max results</td>
</tr>
<tr>
<td><code>page</code></td>
<td>integer</td>
<td>No</td>
<td>1</td>
<td>Page number</td>
</tr>
<tr>
<td><code>page_size</code></td>
<td>integer</td>
<td>No</td>
<td>50</td>
<td>Items per page</td>
</tr>
</tbody>
</table>
<h4>Example (Global Search)</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/search/identity_text?q=love&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span>
</code></pre></div>
<h4>Example (File-specific Search)</h4>
<div class="codehilite"><pre><span></span><code>curl<span class="w"> </span>-s<span class="w"> </span><span class="s2">&quot;</span><span class="nv">$API</span><span class="s2">/api/v1/search/identity_text?file_uuid=</span><span class="nv">$FILE_UUID</span><span class="s2">&amp;q=love&quot;</span><span class="w"> </span>-H<span class="w"> </span><span class="s2">&quot;X-API-Key: </span><span class="nv">$KEY</span><span class="s2">&quot;</span>
</code></pre></div>
<h4>Response (200)</h4>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;total&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">5</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;results&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;a6fb22eebefaef17e62af874997c5944&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;chunk_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;llm_parent_..._256_270&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;start_time&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">256.256</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;end_time&quot;</span><span class="p">:</span><span class="w"> </span><span class="mf">270.228</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;text_content&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;...lack of affection...&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;identity_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">9</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;identity_name&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;Audrey Hepburn&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;identity_source&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;tmdb&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;trace_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="mi">94</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">]</span>
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>results[].file_uuid</code></td>
<td>string</td>
<td>UUID of the file where the chunk was found</td>
</tr>
<tr>
<td><code>results[].identity_id</code></td>
<td>integer</td>
<td>Identity ID, if a face was detected</td>
</tr>
<tr>
<td><code>results[].trace_id</code></td>
<td>integer</td>
<td>Face trace ID</td>
</tr>
</tbody>
</table>
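<p>Since this endpoint takes its parameters in the query string, the search text must be URL-encoded. The sketch below shows one way to assemble the URL with the Python standard library. The helper name <code>identity_text_url</code> and the base URL <code>http://localhost</code> are hypothetical; the parameter names and defaults come from the table above.</p>

```python
from typing import Optional
from urllib.parse import urlencode


def identity_text_url(
    api_base: str,
    q: str,
    file_uuid: Optional[str] = None,
    page: int = 1,
    page_size: int = 50,
) -> str:
    """Build the GET URL for /api/v1/search/identity_text."""
    params = {"q": q}
    if file_uuid is not None:
        # Omitting file_uuid searches all files (global search).
        params["file_uuid"] = file_uuid
    params["page"] = page
    params["page_size"] = page_size
    # urlencode percent-encodes values and encodes spaces as "+".
    return f"{api_base}/api/v1/search/identity_text?{urlencode(params)}"


# "http://localhost" is a placeholder base URL, not a documented address.
print(identity_text_url("http://localhost", "love"))
print(identity_text_url("http://localhost", "lack of affection",
                        file_uuid="a6fb22eebefaef17e62af874997c5944"))
```

<p>The resulting URL can be passed to any HTTP client together with the authentication header used in the examples above.</p>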
<hr />
<h3>Visual Search</h3>
<table class="table">
@@ -282,7 +457,7 @@ a { color: #0066cc; }
</tbody>
</table>
<hr />
<p><em>Updated: 2026-05-19 12:49:24</em></p>
<p><em>Updated: 2026-05-27. Added global search support for the smart, universal, and identity_text APIs.</em></p>
</div>
</body>
</html>