
Agent Endpoints

Agent endpoints provide AI-powered capabilities including translation, identity analysis, and 5W1H extraction.

POST /api/v1/agents/translate

Translate text between languages using Gemma4 (llama.cpp, port 8082).

Request

{
  "text": "Hello, welcome to Momentry Core.",
  "target_language": "Traditional Chinese",
  "source_language": "English"
}
| Field | Type | Required | Description |
|---|---|---|---|
| text | string | | Text to translate |
| target_language | string | | Target language name (e.g. "Traditional Chinese", "Japanese") |
| source_language | string | | Source language (default: "auto") |

Response

{
  "success": true,
  "translated_text": "您好,歡迎使用 Momentry Core。",
  "source_language_detected": "English",
  "model_used": "google_gemma-4-26B-A4B-it-Q5_K_M.gguf"
}
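
A minimal Python sketch of calling this endpoint, using only the standard library. The base URL `http://localhost:8000` and the Bearer-token header are assumptions for illustration: this page does not specify the server address or the auth scheme, only that missing or invalid auth yields a 401.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: replace with your server address


def build_translate_payload(text, target_language, source_language="auto"):
    """Build the request body; source_language falls back to the documented default."""
    return {
        "text": text,
        "target_language": target_language,
        "source_language": source_language,
    }


def translate(text, target_language, source_language="auto", token=""):
    """POST the payload and return translated_text, raising if success is false."""
    body = json.dumps(build_translate_payload(text, target_language, source_language))
    request = urllib.request.Request(
        f"{BASE_URL}/api/v1/agents/translate",
        data=body.encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: Bearer auth; the actual scheme is not documented here.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    if not result.get("success"):
        raise RuntimeError(f"translation failed: {result}")
    return result["translated_text"]
```

Keeping the payload builder separate from the network call makes the request shape easy to check without a running server.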

Supported Language Pairs (tested)

| Source | Target | Quality |
|---|---|---|
| English | Traditional Chinese | |
| English | Japanese | |
| Chinese | English | |
| English | French | |
| Chinese | Japanese | |

Model

Errors

| Status | Condition |
|---|---|
| 500 | LLM unreachable or response parse failure |
| 401 | Missing/invalid auth |
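
One way a client might treat these two statuses differently is sketched below. The retry policy is an assumption, not documented server behavior: a 500 caused by an unreachable LLM may be transient, whereas a 401 will not succeed until the credentials change.

```python
import time
import urllib.error


def is_retryable(status):
    """Treat 500 as possibly transient; a 401 needs new credentials, not a retry."""
    return status == 500


def call_with_retry(call, attempts=3, delay_secs=1.0):
    """Run call(); retry on retryable HTTP errors and re-raise everything else."""
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except urllib.error.HTTPError as error:
            # Give up immediately on non-retryable statuses or on the last attempt.
            if not is_retryable(error.code) or attempt == attempts:
                raise
            time.sleep(delay_secs)
```

Here `call` is any zero-argument function that performs the HTTP request with `urllib.request` and returns the parsed result.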

POST /api/v1/agents/5w1h/analyze

Extract 5W1H (Who, What, When, Where, Why, How) from a scene. Uses Gemma4 LLM on port 8082.

Request

{
  "file_uuid": "3abeee81d94597629ed8cb943f182e94",
  "scene_id": 42
}

Response

{
  "success": true,
  "5w1h": {
    "who": ["Cary Grant"],
    "what": ["discussing plans"],
    "when": ["1963"],
    "where": ["Paris"],
    "why": ["vacation"],
    "how": ["in person"]
  }
}
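
A short sketch of turning this response into a typed Python object. The `FiveWOneH` class and `parse_5w1h` function are hypothetical names introduced for illustration; note that the `5w1h` key is not a valid Python identifier, so it has to be read with dictionary access.

```python
from dataclasses import dataclass, field

KEYS = ("who", "what", "when", "where", "why", "how")


@dataclass
class FiveWOneH:
    who: list = field(default_factory=list)
    what: list = field(default_factory=list)
    when: list = field(default_factory=list)
    where: list = field(default_factory=list)
    why: list = field(default_factory=list)
    how: list = field(default_factory=list)


def parse_5w1h(response):
    """Convert an analyze response into a FiveWOneH, defaulting missing keys to []."""
    if not response.get("success"):
        raise ValueError("5W1H analysis did not succeed")
    data = response.get("5w1h", {})
    return FiveWOneH(**{key: list(data.get(key, [])) for key in KEYS})
```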

POST /api/v1/agents/5w1h/batch

Run 5W1H extraction on all scenes in a file as a single batch. Uses the pipeline's parent_chunk_5w1h.py --mode llm.

Request

{
  "file_uuid": "3abeee81d94597629ed8cb943f182e94"
}

GET /api/v1/agents/5w1h/status

Get the status of the 5W1H agent pipeline for a file.


Embedding Model

| Detail | Value |
|---|---|
| Model | EmbeddingGemma-300m |
| Endpoint | POST /v1/embeddings on port 11436 |
| Dimension | 768 |
| Used by | parent_chunk_5w1h.py --embed, story, 5W1H, search |
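
The sketch below shows how a client might build an embedding request and validate the result against the 768-dimension value listed above. It assumes the endpoint follows the OpenAI-style embeddings schema (`input` in the request, `data[i].embedding` in the response) and that the model identifier matches the name in the table; neither is confirmed on this page.

```python
import math

EMBEDDING_DIM = 768  # from the Embedding Model table


def build_embedding_request(texts, model="EmbeddingGemma-300m"):
    """Build a request body (assumption: OpenAI-style schema and model identifier)."""
    return {"model": model, "input": list(texts)}


def extract_embeddings(response):
    """Pull vectors out of an OpenAI-style response and check their dimension."""
    vectors = [item["embedding"] for item in response["data"]]
    for vector in vectors:
        if len(vector) != EMBEDDING_DIM:
            raise ValueError(f"expected {EMBEDDING_DIM} dimensions, got {len(vector)}")
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

Checking the dimension on the client side catches a misconfigured model or port early, before mismatched vectors reach a search index.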

POST /api/v1/agents/search

Conversational search assistant. Uses Gemma4 function calling to automatically decide which tools to call based on the user's natural language query. Supports multi-turn conversation.

Request

{
  "query": "In which frame do Audrey Hepburn and Cary Grant first appear together?",
  "conversation_id": null,
  "file_uuid": null
}
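| Field | Type | Required | Description |
|---|---|---|---|
| query | string | | Natural-language query |
| conversation_id | string | | Pass when continuing a conversation; omit for a new conversation |
| file_uuid | string | | May be specified when a file is selected in the Portal |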

Response

{
  "success": true,
  "conversation_id": "conv_abc123",
  "answer": "In Charade (1963), Audrey Hepburn and Cary Grant first appear together at frame 38619 (about 1544.76 seconds).",
  "need_input": false,
  "sources": [
    {
      "tool": "tkg_query",
      "result": "{\"first_cooccurrence\":{\"frame\":38619,\"timestamp_secs\":1544.76}}"
    }
  ]
}
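| Field | Type | Description |
|---|---|---|
| conversation_id | string | Pass this ID in follow-up requests |
| answer | string | The agent's natural-language answer (or clarifying question) |
| need_input | boolean | true means the agent needs more information before it can answer |
| suggestions | string[] | Suggested clues for the user to provide (when need_input is true) |
| sources | array | Tool execution results cited in the answer |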
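
In the sample response, each `result` in `sources` is itself a JSON-encoded string, so it needs a second decoding step. A small sketch follows; `decode_sources` is a hypothetical helper, and the fallback to the raw value assumes that some tools may return plain text, which this page does not state.

```python
import json


def decode_sources(response):
    """Return (tool, result) pairs, decoding JSON-encoded result strings."""
    decoded = []
    for source in response.get("sources", []):
        raw = source["result"]
        try:
            result = json.loads(raw)
        except (json.JSONDecodeError, TypeError):
            result = raw  # assumption: keep non-JSON results unchanged
        decoded.append((source["tool"], result))
    return decoded
```

Applied to the sample above, the first pair is `tkg_query` with a dictionary containing `first_cooccurrence.frame` and `first_cooccurrence.timestamp_secs`.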

Conversation Flow

Round 1: POST /agents/search { query: "I want to see the male and female leads on screen together" }
         → need_input: true, suggestions: ["title", "actor", "era"]
         → answer: "Which movie do you mean? Please provide more clues"

Round 2: POST /agents/search { query: "Audrey Hepburn", conversation_id: "..." }
         → need_input: false
         → answer: "Found Charade (1963), Audrey Hepburn and Cary Grant..."
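
The two-round flow above can be sketched as a small client-side session that remembers the `conversation_id` between calls. `SearchSession` and the injected `send` callable are hypothetical: `send` stands in for whatever function posts the payload to the search endpoint and returns the parsed JSON response.

```python
class SearchSession:
    """Carries conversation_id across rounds of the conversational search."""

    def __init__(self, send, file_uuid=None):
        self._send = send  # callable: payload dict -> response dict
        self.file_uuid = file_uuid
        self.conversation_id = None  # None starts a new conversation

    def ask(self, query):
        """Send one round and remember the conversation_id for the next one."""
        payload = {
            "query": query,
            "conversation_id": self.conversation_id,
            "file_uuid": self.file_uuid,
        }
        response = self._send(payload)
        self.conversation_id = response.get("conversation_id", self.conversation_id)
        return response
```

A caller would loop on `ask` while `need_input` is true, showing `answer` and `suggestions` to the user and passing the user's reply as the next query.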

Available Tools

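Internally, the agent uses Gemma4 function calling to invoke the following tools automatically:

| Tool | Description |
|---|---|
| find_file | Search for a film by title, actor, or year keyword; returns file_uuid and has_data status |
| list_files | List recently registered films |
| tkg_query | Query character interaction data (7 subtypes: top_identities, first_cooccurrence, identity_details, mutual_gaze, interaction_network, identity_traces, file_info) |
| smart_search | ILIKE text search over chunk content (can be scoped to a file_uuid) |
| get_identity_detail | Look up detailed data for a single identity (role, TMDb information) |
| get_file_info | Look up basic film information (duration, resolution) |
| get_representative_frame | Look up information on the film's most representative frame |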

Design Principles

Model

| Detail | Value |
|---|---|
| LLM | Gemma4 26B (Q5_K_M) |
| Engine | llama.cpp at localhost:8082 |
| Endpoint | /v1/chat/completions (OpenAI-compatible) |
| Temperature | 0.1 |
| Max rounds | 5 (tool call iterations) |
| Conversation TTL | 30 minutes |
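
To illustrate how the temperature and max-rounds settings might fit together, here is a sketch of a bounded tool-calling loop against an OpenAI-compatible chat completions endpoint. It is not the agent's actual implementation: `run_agent`, the injected `chat` callable, and the `tools` mapping are hypothetical, and the message shapes follow the OpenAI tool-calling format, which is assumed to apply here.

```python
import json

MAX_ROUNDS = 5      # from the table above
TEMPERATURE = 0.1   # from the table above


def run_agent(chat, tools, messages, tool_schemas=(), max_rounds=MAX_ROUNDS):
    """Loop until the model answers without tool calls or the round budget runs out.

    chat(payload) returns an OpenAI-style response dict; tools maps a tool
    name to a Python callable that accepts the decoded arguments as keywords.
    """
    messages = list(messages)
    for _ in range(max_rounds):
        response = chat({
            "messages": messages,
            "tools": list(tool_schemas),
            "temperature": TEMPERATURE,
        })
        message = response["choices"][0]["message"]
        messages.append(message)
        tool_calls = message.get("tool_calls") or []
        if not tool_calls:
            return message.get("content")  # final answer
        for call in tool_calls:
            function = call["function"]
            arguments = json.loads(function["arguments"] or "{}")
            result = tools[function["name"]](**arguments)
            # Feed each tool result back so the model can use it next round.
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result),
            })
    return None  # round budget exhausted without a final answer
```

Capping the loop keeps a model that repeatedly requests tools from running indefinitely; what the real agent returns when the cap is reached is not described on this page.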

Updated: 2026-05-22