feat: ASRX hybrid pipeline, identity history, worker fixes, checkpoint system

Accusys
2026-06-02 07:13:23 +08:00
parent e3066c3f49
commit e1572907ae
198 changed files with 43705 additions and 8910 deletions


@@ -209,7 +209,191 @@ a { color: #0066cc; }
</tbody>
</table>
<hr />
<p><em>Updated: 2026-05-19 12:49:24</em></p>
<h2>POST /api/v1/agents/search</h2>
<p>Conversational search assistant. Uses Gemma4 function calling to automatically decide which tools to call based on the user's natural language query. Supports multi-turn conversation.</p>
<h3>Request</h3>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;query&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;In which frame do Audrey Hepburn and Cary Grant first appear together?&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;conversation_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">null</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;file_uuid&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">null</span>
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Required</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>query</code></td>
<td>string</td>
<td>Yes</td>
<td>Natural-language query</td>
</tr>
<tr>
<td><code>conversation_id</code></td>
<td>string</td>
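<td>No</td>
<td>Pass it to continue an existing conversation; omit it (or send <code>null</code>, as in the sample) for a new conversation</td>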
</tr>
<tr>
<td><code>file_uuid</code></td>
<td>string</td>
<td>No</td>
<td>Can be set when a file is selected in the Portal</td>
</tr>
</tbody>
</table>
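<p>The request body above can be assembled and sent with the Python standard library. This is a minimal sketch, not the project's client: the host and port in <code>BASE_URL</code> and the helper names <code>build_request</code> and <code>send</code> are assumptions made for illustration; only the path and the three field names come from this page.</p>

```python
import json
import urllib.request

# Assumption: host and port are placeholders; this page only gives the path.
BASE_URL = "http://localhost:8000"


def build_request(query, conversation_id=None, file_uuid=None, base_url=BASE_URL):
    """Build a POST request for /api/v1/agents/search without sending it."""
    payload = {
        "query": query,
        # None is serialized as JSON null, matching the sample request.
        "conversation_id": conversation_id,
        "file_uuid": file_uuid,
    }
    return urllib.request.Request(
        base_url + "/api/v1/agents/search",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

<p>Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.</p>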
<h3>Response</h3>
<div class="codehilite"><pre><span></span><code><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;success&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">true</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;conversation_id&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;conv_abc123&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;answer&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;In Charade (1963), Audrey Hepburn and Cary Grant first appear together at frame 38619 (about 1544.76 seconds).&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;need_input&quot;</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;sources&quot;</span><span class="p">:</span><span class="w"> </span><span class="p">[</span>
<span class="w"> </span><span class="p">{</span>
<span class="w"> </span><span class="nt">&quot;tool&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;tkg_query&quot;</span><span class="p">,</span>
<span class="w"> </span><span class="nt">&quot;result&quot;</span><span class="p">:</span><span class="w"> </span><span class="s2">&quot;{\&quot;first_cooccurrence\&quot;:{\&quot;frame\&quot;:38619,\&quot;timestamp_secs\&quot;:1544.76}}&quot;</span>
<span class="w"> </span><span class="p">}</span>
<span class="w"> </span><span class="p">]</span>
<span class="p">}</span>
</code></pre></div>
<table class="table">
<thead>
<tr>
<th>Field</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>conversation_id</code></td>
<td>string</td>
<td>Pass this ID in follow-up requests to continue the conversation</td>
</tr>
<tr>
<td><code>answer</code></td>
<td>string</td>
<td>The agent's natural-language answer (or a follow-up question)</td>
</tr>
<tr>
<td><code>need_input</code></td>
<td>boolean</td>
<td><code>true</code> means the agent needs more information before it can answer</td>
</tr>
<tr>
<td><code>suggestions</code></td>
<td>string[]</td>
<td>Clues the user is advised to provide (when <code>need_input=true</code>)</td>
</tr>
<tr>
<td><code>sources</code></td>
<td>array</td>
<td>Tool execution results cited in the answer</td>
</tr>
</tbody>
</table>
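<p>A client typically branches on <code>need_input</code> and, for final answers, decodes the <code>result</code> strings in <code>sources</code>, which are JSON-encoded in the sample response. The sketch below shows one way to do that; the function name <code>summarize</code>, the tuple it returns, and the handling of <code>success=false</code> are assumptions, since this page does not describe error responses.</p>

```python
import json


def summarize(response):
    """Return (kind, text, details) for one /agents/search response dict."""
    if not response.get("success", False):
        # Assumption: error responses are not documented here, so only
        # whatever text is present gets surfaced.
        return ("error", response.get("answer"), [])
    if response.get("need_input"):
        # The agent is asking back; suggestions lists the clues it wants.
        return ("ask", response.get("answer"), response.get("suggestions", []))
    # In the sample, each source's result is a JSON string, so decode it.
    sources = [
        (src["tool"], json.loads(src["result"]))
        for src in response.get("sources", [])
    ]
    return ("answer", response.get("answer"), sources)
```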
<h3>Conversation Flow</h3>
<div class="codehilite"><pre><span></span><code>Round 1: POST /agents/search { query: &quot;I want to see the male and female leads on screen together&quot; }
→ need_input: true, suggestions: [&quot;title&quot;, &quot;actor&quot;, &quot;era&quot;]
→ answer: &quot;Which movie do you mean? Please provide more clues.&quot;
Round 2: POST /agents/search { query: &quot;Audrey Hepburn&quot;, conversation_id: &quot;...&quot; }
→ need_input: false
→ answer: &quot;Found Charade (1963): Audrey Hepburn and Cary Grant...&quot;
</code></pre></div>
</code></pre></div>
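<p>The two rounds above reduce to one rule on the client side: reuse the <code>conversation_id</code> returned by the previous response. The following sketch illustrates that rule; <code>run_conversation</code> and the injected <code>send</code> callable are hypothetical names, and <code>send</code> stands in for whatever HTTP call the client actually makes.</p>

```python
def run_conversation(send, queries):
    """Send queries in order, reusing the conversation_id the server returns.

    send is any callable that takes a request dict and returns a response
    dict, for example a thin wrapper around an HTTP POST.
    """
    conversation_id = None  # a new conversation starts without an ID
    transcript = []
    for query in queries:
        request = {
            "query": query,
            "conversation_id": conversation_id,
            "file_uuid": None,
        }
        response = send(request)
        # Carry the ID forward so the agent keeps the earlier context.
        conversation_id = response.get("conversation_id", conversation_id)
        transcript.append(
            (query, response.get("answer"), response.get("need_input", False))
        )
    return conversation_id, transcript
```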
<h3>Available Tools</h3>
<p>Internally, the agent uses Gemma4 function calling to invoke the following tools automatically:</p>
<table class="table">
<thead>
<tr>
<th>Tool</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>find_file</code></td>
<td>Searches for videos by title, actor, or year keywords; returns file_uuid plus has_data status</td>
</tr>
<tr>
<td><code>list_files</code></td>
<td>Lists recently registered videos</td>
</tr>
<tr>
<td><code>tkg_query</code></td>
<td>Queries character-interaction data; 7 subtypes: top_identities, first_cooccurrence, identity_details, mutual_gaze, interaction_network, identity_traces, file_info</td>
</tr>
<tr>
<td><code>smart_search</code></td>
<td>ILIKE text search over chunks (can be scoped to a single file_uuid)</td>
</tr>
<tr>
<td><code>get_identity_detail</code></td>
<td>Returns detailed data for a single identity (role, TMDb info)</td>
</tr>
<tr>
<td><code>get_file_info</code></td>
<td>Returns basic video information (duration, resolution)</td>
</tr>
<tr>
<td><code>get_representative_frame</code></td>
<td>Returns information about the video's most representative frame</td>
</tr>
</tbody>
</table>
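<p>For an OpenAI-compatible chat endpoint, each tool in the table is usually described to the model as a function definition with a JSON Schema for its arguments. The sketch below shows what such a definition could look like for <code>tkg_query</code>. The seven subtype names come from the table; the parameter names <code>query_type</code> and <code>file_uuid</code> and the description text are assumptions, because this page does not list the tools' actual arguments.</p>

```python
# The seven subtypes listed for tkg_query in the table above.
TKG_QUERY_TYPES = [
    "top_identities",
    "first_cooccurrence",
    "identity_details",
    "mutual_gaze",
    "interaction_network",
    "identity_traces",
    "file_info",
]


def tkg_query_tool():
    """Return an OpenAI-style tool definition for tkg_query (illustrative)."""
    return {
        "type": "function",
        "function": {
            "name": "tkg_query",
            "description": "Query character-interaction data for one video.",
            "parameters": {
                "type": "object",
                "properties": {
                    # Hypothetical parameter names, not taken from the source.
                    "query_type": {"type": "string", "enum": TKG_QUERY_TYPES},
                    "file_uuid": {"type": "string"},
                },
                "required": ["query_type", "file_uuid"],
            },
        },
    }
```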
<h3>Design Principles</h3>
<ul>
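<li><strong>Users do not need to know the file_uuid</strong>: the agent looks it up with <code>find_file</code> or asks a follow-up question</li>
<li><strong>No recommendations without data</strong>: videos with <code>has_data=false</code> are not recommended to the user</li>
<li><strong>Multi-turn conversation</strong>: <code>conversation_id</code> carries the context forward, so the agent remembers earlier exchanges</li>
<li><strong>Parallel tool calls</strong>: Gemma4 can call several tools at once and then combine the results into one answer</li>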
</ul>
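<p>The <code>has_data</code> rule can be pictured as a filter over <code>find_file</code> results before anything is shown to the user. This is only a sketch: the function name <code>recommendable</code> and the exact shape of a candidate record (here a dict with <code>file_uuid</code>, <code>has_data</code>, and a hypothetical <code>title</code> key) are assumptions, and the real agent may enforce the rule elsewhere, such as in its prompt.</p>

```python
def recommendable(candidates):
    """Keep only find_file hits whose has_data flag is true."""
    return [c for c in candidates if c.get("has_data")]


# Hypothetical find_file output used for illustration only.
hits = [
    {"file_uuid": "uuid-1", "title": "Charade (1963)", "has_data": True},
    {"file_uuid": "uuid-2", "title": "Unprocessed upload", "has_data": False},
]
visible = recommendable(hits)
```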
<h3>Model</h3>
<table class="table">
<thead>
<tr>
<th>Detail</th>
<th>Value</th>
</tr>
</thead>
<tbody>
<tr>
<td><strong>LLM</strong></td>
<td>Gemma4 26B (Q5_K_M)</td>
</tr>
<tr>
<td><strong>Engine</strong></td>
<td>llama.cpp at <code>localhost:8082</code></td>
</tr>
<tr>
<td><strong>Endpoint</strong></td>
<td><code>/v1/chat/completions</code> (OpenAI-compatible)</td>
</tr>
<tr>
<td><strong>Temperature</strong></td>
<td>0.1</td>
</tr>
<tr>
<td><strong>Max rounds</strong></td>
<td>5 (tool call iterations)</td>
</tr>
<tr>
<td><strong>Conversation TTL</strong></td>
<td>30 minutes</td>
</tr>
</tbody>
</table>
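<p>The settings in this table suggest a bounded tool-calling loop: ask the model, run any tools it requests, feed the results back, and stop after five rounds. The sketch below is one way such a loop could be written, not the project's implementation. <code>run_agent</code>, <code>chat</code>, and <code>registry</code> are hypothetical names; <code>chat</code> stands in for a POST to <code>/v1/chat/completions</code> that returns the assistant message, and the message layout follows the OpenAI chat format, where tool arguments arrive as a JSON string.</p>

```python
import json

MAX_ROUNDS = 5     # "Max rounds" from the table
TEMPERATURE = 0.1  # "Temperature" from the table


def run_agent(chat, tools, registry, messages):
    """Drive a tool-calling loop for at most MAX_ROUNDS iterations.

    chat:     callable(body) returning the assistant message dict.
    tools:    list of OpenAI-style tool definitions.
    registry: dict mapping tool name to a Python callable(**arguments).
    messages: conversation so far; it is extended in place.
    """
    for _ in range(MAX_ROUNDS):
        message = chat(
            {"messages": messages, "tools": tools, "temperature": TEMPERATURE}
        )
        messages.append(message)
        tool_calls = message.get("tool_calls") or []
        if not tool_calls:
            return message.get("content")  # final natural-language answer
        # One assistant turn may request several tools; run each in turn
        # and report every result back under its own tool_call_id.
        for call in tool_calls:
            fn = call["function"]
            args = json.loads(fn["arguments"] or "{}")
            result = registry[fn["name"]](**args)
            messages.append(
                {
                    "role": "tool",
                    "tool_call_id": call["id"],
                    "content": json.dumps(result),
                }
            )
    return None  # round budget exhausted without a final answer
```

<p>Passing <code>chat</code> in as a callable keeps the loop independent of the HTTP layer, so it can be exercised with a stub instead of a running model server.</p>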
<hr />
<p><em>Updated: 2026-05-22</em></p>
</div>
</body>
</html>