AGENTS.md - Momentry Core
Rust-based digital asset management system with video analysis and RAG capabilities.
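⚠️ CRITICAL: Development Isolation Principles
Strictly Prohibited
- Never modify any file under the /Users/accusys/wordpress/ directory
- Never modify n8n workflows or settings
- Never modify WordPress or n8n database tables
- Never touch port 3002 (production) except during a release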
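Development Scope
| Scope | Status | Notes |
|---|---|---|
| momentry_core_0.1/ | ✅ Open for development | Main Momentry Core development directory |
| momentry_core_0.1/portal/ | ✅ Open for development | Tauri Portal frontend |
| momentry_core_0.1/src/ | ✅ Open for development | Rust backend code |
| /Users/accusys/wordpress/ | ❌ Do not modify | Owned by the WordPress/Marcom team |
| n8n workflows | ❌ Do not modify | Automation flows, unrelated to dev |
| WordPress/n8n database tables | ❌ Do not modify | Managed by the Marcom team, unrelated to dev |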
Development Environment
| Service | Port | Purpose | Command |
|---|---|---|---|
| Playground | 3003 | The only development environment | cargo run --bin momentry_playground -- server |
| Production | 3002 | ❌ Do not modify | cargo run -- server (release only) |
| Portal (Tauri) | 1420 | Frontend development | npm run tauri dev |
⛔ Strict Test Isolation
- All tests must run against Dev (3003).
- Using localhost:3002 in any test command, demo flow, or API check is ABSOLUTELY FORBIDDEN.
- Even "testing Unregister" or "checking the version" counts as a violation unless it is explicitly labeled "Production Deployment".
- Default behavior: every curl, CLI, or in-code test command must default to the URL http://localhost:3003.
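The isolation rule above can be enforced mechanically before a test command runs. The following is a minimal sketch, not part of the existing codebase: `port_of` and `ensure_dev_url` are hypothetical helper names, and the URL parsing is deliberately simple (no IPv6 literals or userinfo).

```rust
/// Port reserved for production; dev tooling must never target it.
const PRODUCTION_PORT: u16 = 3002;
/// Default base URL for every test command, per the isolation rules.
const DEV_BASE_URL: &str = "http://localhost:3003";

/// Hypothetical helper: extracts the port from a simple
/// `scheme://host:port/path` or `host:port` string.
fn port_of(url: &str) -> Option<u16> {
    // Drop the scheme if one is present.
    let rest = url.split("://").nth(1).unwrap_or(url);
    // Keep only the authority part (everything before the first '/').
    let authority = rest.split('/').next()?;
    let port = authority.rsplit_once(':')?.1;
    port.parse().ok()
}

/// Hypothetical guard: rejects any URL that points at the production port.
fn ensure_dev_url(url: &str) -> Result<&str, String> {
    match port_of(url) {
        Some(PRODUCTION_PORT) => Err(format!("refusing to test against production: {url}")),
        _ => Ok(url),
    }
}
```

Calling `ensure_dev_url(DEV_BASE_URL)` succeeds, while any URL on port 3002 returns an error that a test harness could surface before sending a request.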
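Consequences of Violations
- Modifying WordPress/n8n can disrupt the marcom team's work and the production environment
- Modifying WordPress/n8n database tables can break automation flows and data integrity
- Touching port 3002 can interrupt services that are in use (this is a very serious error)
- All dev testing must run in the playground (3003)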
AI Coding Principles (Karpathy-Inspired)
Behavioral guidelines to reduce common LLM coding mistakes. Source: andrej-karpathy-skills (94K stars)
Tradeoff: These guidelines bias toward caution over speed. For trivial tasks, use judgment.
1. Think Before Coding
Don't assume. Don't hide confusion. Surface tradeoffs.
- State your assumptions explicitly. If uncertain, ask.
- If multiple interpretations exist, present them - don't pick silently.
- If a simpler approach exists, say so. Push back when warranted.
- If something is unclear, stop. Name what's confusing. Ask.
2. Simplicity First
Minimum code that solves the problem. Nothing speculative.
- No features beyond what was asked.
- No abstractions for single-use code.
- No "flexibility" or "configurability" that wasn't requested.
- No error handling for impossible scenarios.
- If you write 200 lines and it could be 50, rewrite it.
Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, simplify.
3. Surgical Changes
Touch only what you must. Clean up only your own mess.
When editing existing code:
- Don't "improve" adjacent code, comments, or formatting.
- Don't refactor things that aren't broken.
- Match existing style, even if you'd do it differently.
- If you notice unrelated dead code, mention it - don't delete it.
When your changes create orphans:
- Remove imports/variables/functions that YOUR changes made unused.
- Don't remove pre-existing dead code unless asked.
The test: Every changed line should trace directly to the user's request.
4. Goal-Driven Execution
Define success criteria. Loop until verified.
Transform tasks into verifiable goals:
- "Add validation" -> "Write tests for invalid inputs, then make them pass"
- "Fix the bug" -> "Write a test that reproduces it, then make it pass"
- "Refactor X" -> "Ensure tests pass before and after"
For multi-step tasks, state a brief plan:
1. [Step] -> verify: [check]
2. [Step] -> verify: [check]
3. [Step] -> verify: [check]
Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification.
These guidelines are working if: fewer unnecessary changes in diffs, fewer rewrites due to overcomplication, and clarifying questions come before implementation rather than after mistakes.
Terminology (V4.0)
| Term | Scope | Description | Example |
|---|---|---|---|
| file_uuid | Video file | Video file identifier (renamed from video_uuid) | 384b0ff44aaaa1f1 |
| identity_uuid | Global identity | Global person identity (cross-file) | a9a90105-6d6b-46ff-92da-0c3c1a57dff4 |
| face_id | Single detection | Single face detection (frame-level) | face_100 |
| trace_id | Face tracking | Face tracking ID (Face Tracker output) | 2 |
| chunk_id | Sentence chunk | Sentence chunk (from pre_chunks via rules) | chunk_1 |
| speaker_id | Speaker segment | Speaker ID (from ASRX) | SPEAKER_0 |
| person_id | ❌ Deprecated | Video-local person ID (removed in V4.0) | - |
Architecture (V4.0)
Face → Identity (Two-layer, direct binding)
↓
person_identities table: REMOVED
file_identities table: ADDED (N:N relationship)
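As an illustration of the two-layer model, the sketch below keeps a direct face-to-identity map plus a set of (file_uuid, identity_uuid) pairs standing in for the N:N file_identities table. The struct and method names are hypothetical, and keying faces by (file_uuid, face_id) is an assumption; the real schema lives in the database layer.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical in-memory model of the V4.0 two-layer binding.
#[derive(Default)]
struct IdentityIndex {
    /// (file_uuid, face_id) -> identity_uuid: direct binding, no person layer.
    face_to_identity: HashMap<(String, String), String>,
    /// (file_uuid, identity_uuid) pairs, mirroring an N:N link table.
    file_identities: HashSet<(String, String)>,
}

impl IdentityIndex {
    /// Binds a detected face to a global identity and records the file link.
    fn bind_face(&mut self, file_uuid: &str, face_id: &str, identity_uuid: &str) {
        self.face_to_identity.insert(
            (file_uuid.to_string(), face_id.to_string()),
            identity_uuid.to_string(),
        );
        self.file_identities
            .insert((file_uuid.to_string(), identity_uuid.to_string()));
    }

    /// Looks up the identity bound to a face, if any.
    fn identity_of(&self, file_uuid: &str, face_id: &str) -> Option<&str> {
        self.face_to_identity
            .get(&(file_uuid.to_string(), face_id.to_string()))
            .map(String::as_str)
    }

    /// Lists, in sorted order, every file in which the identity appears.
    fn files_for(&self, identity_uuid: &str) -> Vec<&str> {
        let mut files: Vec<&str> = self
            .file_identities
            .iter()
            .filter(|(_, id)| id.as_str() == identity_uuid)
            .map(|(file, _)| file.as_str())
            .collect();
        files.sort();
        files
    }
}
```

Because the link set is keyed on both columns, one identity can appear in many files and one file can hold many identities, which is the N:N property the table describes.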
Key Changes (V3.x → V4.0)
| Change | V3.x | V4.0 |
|---|---|---|
| video_uuid | Used everywhere | file_uuid |
| person_identities | Required (303 records) | Removed |
| person_id APIs | 28 endpoints | Removed (except register/bind) |
| Face binding | Person → Identity | Face → Identity (direct) |
| Chunk binding | Manual | Auto (time alignment) |
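The table only says that chunk binding is automatic and based on time alignment. One plausible reading is sketched below: each chunk takes the label of the segment that overlaps it the longest. This is an assumption for illustration, not the project's confirmed algorithm; `Segment`, `overlap`, and `align_chunk` are hypothetical names.

```rust
/// A labeled time span in seconds (hypothetical shape, e.g. a speaker segment).
struct Segment<'a> {
    label: &'a str,
    start: f64,
    end: f64,
}

/// Length of the overlap between two time ranges, or 0.0 if they are disjoint.
fn overlap(a_start: f64, a_end: f64, b_start: f64, b_end: f64) -> f64 {
    (a_end.min(b_end) - a_start.max(b_start)).max(0.0)
}

/// Returns the label of the segment that overlaps the chunk the most.
/// Returns None when no segment overlaps the chunk at all.
fn align_chunk<'a>(
    chunk_start: f64,
    chunk_end: f64,
    segments: &[Segment<'a>],
) -> Option<&'a str> {
    segments
        .iter()
        .map(|s| (s.label, overlap(chunk_start, chunk_end, s.start, s.end)))
        .filter(|(_, o)| *o > 0.0)
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(label, _)| label)
}
```

With segments SPEAKER_0 over 0-4 s and SPEAKER_1 over 4-10 s, a chunk spanning 3-8 s overlaps the first by 1 s and the second by 4 s, so it is bound to SPEAKER_1.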
Build & Run Commands
# Build project (use debug builds for development/testing)
cargo build
cargo build --bin momentry
cargo build --bin momentry_playground
# Build all binaries
cargo build --bins
# Run CLI
cargo run -- --help
cargo run -- register /path/to/video.mp4
cargo run -- server --host 0.0.0.0 --port 3002
# Run playground (development binary)
cargo run --bin momentry_playground -- server
cargo run --bin momentry_playground -- --help
⚠️ CRITICAL: cargo build --release PROHIBITION
- NEVER run cargo build --release unless the user explicitly says "release the binary" or "正式 release" (official release)
- cargo build --release is SLOW and only needed when producing a production binary for deployment
- For all development, testing, debugging, and linting: use cargo build or cargo check
- If uncertain, ALWAYS ask the user first
Binaries
| Binary | Purpose | Port | Redis Prefix | Environment |
|---|---|---|---|---|
| momentry | Production | 3002 | momentry: | .env |
| momentry_playground | Development | 3003 | momentry_dev: | .env.development |
| momentry_player | Video player | - | - | - |
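The per-binary settings in the table can be captured in one place so that a port and its Redis prefix never drift apart. This is a minimal sketch under that assumption; the `Binary` enum and its methods are hypothetical and do not describe the actual config module, and the key suffix format in `redis_key` is invented for illustration.

```rust
/// The two server binaries from the table (the player has no port or prefix).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Binary {
    Momentry,
    Playground,
}

impl Binary {
    /// Default API port for this binary.
    fn port(self) -> u16 {
        match self {
            Binary::Momentry => 3002,
            Binary::Playground => 3003,
        }
    }

    /// Redis key prefix that keeps dev and production data apart.
    fn redis_prefix(self) -> &'static str {
        match self {
            Binary::Momentry => "momentry:",
            Binary::Playground => "momentry_dev:",
        }
    }

    /// Environment file loaded by this binary.
    fn env_file(self) -> &'static str {
        match self {
            Binary::Momentry => ".env",
            Binary::Playground => ".env.development",
        }
    }

    /// Builds a namespaced Redis key; the suffix format is an assumption.
    fn redis_key(self, suffix: &str) -> String {
        format!("{}{}", self.redis_prefix(), suffix)
    }
}
```

For example, `Binary::Playground.redis_key("job:1")` yields `momentry_dev:job:1`, so playground keys cannot collide with production keys in a shared Redis instance.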
Testing
# Run all tests
cargo test
# Run single test by name
cargo test test_name
# Run with output
cargo test -- --nocapture
# Doc tests
cargo test --doc
Linting & Formatting
# Format code (edition=2021, max_width=100, tab_spaces=4)
cargo fmt
cargo fmt -- --check
# Lint
cargo clippy
cargo clippy --all-features
# Check for errors
cargo check
cargo check --all-features
Code Style
General
- Use Rust 2021 edition
- Use tracing for logging (not println!)
- Keep lines under 100 characters
Imports (order: std → external → local)
use std::path::Path;
use anyhow::{Context, Result};
use async_trait::async_trait;
use serde::{Deserialize, Serialize};
use crate::core::chunk::Chunk;
Error Handling
- Use anyhow::Result<T> for application code
- Use thiserror for library code
- Use .context() for error context
- Use anyhow::bail!() for early returns
fn example() -> Result<SomeType> {
    let output = Command::new("ffprobe")
        .args([...])
        .output()
        .context("Failed to run ffprobe")?;
    if !output.status.success() {
        anyhow::bail!("Command failed");
    }
    Ok(result)
}
Naming
- Types/Enums: PascalCase (VideoRecord, ChunkType)
- Functions/Variables: snake_case (get_video_by_uuid)
- Traits: PascalCase with -er suffix (Database, ChunkStore)
- Files: snake_case (postgres_db.rs)
Types
- Use serde::{Deserialize, Serialize} for serializable types
- Use #[serde(rename_all = "snake_case")] for enum variants
- Use explicit numeric types (i64, u32, f64)
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct VideoRecord {
    pub id: i64,
    pub uuid: String,
    pub duration: f64,
    pub width: u32,
}
#[derive(Debug, Clone, Copy, Serialize, Deserialize, PartialEq)]
#[serde(rename_all = "snake_case")]
pub enum ChunkType {
    TimeBased,
    Sentence,
    Cut,
}
Async Programming
- Use tokio runtime with full features
- Use #[async_trait] for async trait methods
#[async_trait]
pub trait Database: Send + Sync {
    async fn init() -> Result<Self>
    where
        Self: Sized;
}
Code Structure
src/
├── main.rs # CLI entry point
├── lib.rs # Library exports
├── core/
│ ├── api_key/ # API key management (anomaly, blacklist, encryption, etc.)
│ ├── chunk/ # Chunking logic
│ ├── config.rs # Centralized configuration (env vars)
│ ├── db/ # Database (PostgreSQL, MongoDB, Redis, Qdrant)
│ ├── embedding/ # Vector embeddings
│ ├── overlay/ # Video overlay
│ ├── probe/ # ffprobe integration
│ ├── processor/ # ASR, OCR, YOLO, Face, Pose, CUT, ASRX
│ │ └── executor.rs # Unified Python script executor
│ ├── storage/ # File management
│ └── thumbnail/ # Thumbnail extraction
├── api/ # HTTP API (axum)
├── player/ # Video player
├── ui/ # TUI components
└── watcher/ # File system watcher
Key Dependencies
- Error handling: anyhow, thiserror
- Async: tokio (full features), async-trait
- CLI: clap (derive)
- Serialization: serde, serde_json, chrono
- Database: sqlx, mongodb, redis (1.0), qdrant-client
- HTTP: axum, tower
- Logging: tracing, tracing-subscriber
- Config: once_cell (lazy static config)
Environment Variables
Server
- MOMENTRY_SERVER_PORT - API server port (default: 3002 for production, 3003 for playground)
- MOMENTRY_REDIS_PREFIX - Redis key prefix (default: momentry: for production, momentry_dev: for playground)
- MOMENTRY_API_KEY - API key for Player online mode testing
Testing API Key
export MOMENTRY_API_KEY="muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69"
# Test Player online mode
cargo run --features player --bin momentry_player -- -o
Database
- DATABASE_URL - PostgreSQL (default: postgres://accusys@localhost:5432/momentry)
Redis
- REDIS_URL - Redis URL (default: redis://:accusys@localhost:6379)
- REDIS_PASSWORD - Redis password (default: accusys)
Paths
- MOMENTRY_OUTPUT_DIR - Output directory (default: /Users/accusys/momentry/output)
- MOMENTRY_BACKUP_DIR - Backup directory
- MOMENTRY_PYTHON_PATH - Python path (default: /opt/homebrew/bin/python3.11)
- MOMENTRY_SCRIPTS_DIR - Scripts directory
Processor Timeouts
- MOMENTRY_ASR_TIMEOUT - ASR timeout in seconds (default: 3600)
- MOMENTRY_CUT_TIMEOUT - CUT timeout in seconds (default: 3600)
- MOMENTRY_DEFAULT_TIMEOUT - Default timeout (default: 7200)
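A timeout variable of this kind is typically read once and turned into a Duration, with the documented default as the fallback. The sketch below shows one way to do that using only the standard library. `timeout_from` and `asr_timeout` are hypothetical helpers, and silently falling back to the default on an unparsable value is an assumption rather than documented behavior.

```rust
use std::time::Duration;

/// Hypothetical helper: parses a timeout in seconds, falling back to
/// `default_secs` when the value is missing or not a valid integer.
fn timeout_from(raw: Option<&str>, default_secs: u64) -> Duration {
    let secs = raw
        .and_then(|v| v.trim().parse::<u64>().ok())
        .unwrap_or(default_secs);
    Duration::from_secs(secs)
}

/// Reads MOMENTRY_ASR_TIMEOUT from the environment (default: 3600 seconds).
fn asr_timeout() -> Duration {
    timeout_from(std::env::var("MOMENTRY_ASR_TIMEOUT").ok().as_deref(), 3600)
}
```

Keeping the parsing in a pure function (`timeout_from`) makes the fallback rule easy to unit-test without touching the process environment.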
TMDb Integration (Face Clustering)
- TMDB_API_KEY - TMDb API key for movie metadata lookup (required for MOMENTRY_TMDB_PROBE_ENABLED=true)
- MOMENTRY_TMDB_PROBE_ENABLED - Enable TMDb probe during registration (default: false)
  - Register phase: searches TMDb by filename, creates identities with tmdb_id/tmdb_profile
  - Post-process phase: matches detected faces against TMDb identities via cosine similarity
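The post-process phase relies on cosine similarity between face embeddings. As a rough sketch of that comparison, the code below computes the similarity and picks the best candidate above a threshold. `best_match`, the candidate layout, and the threshold are illustrative assumptions; the actual matching code and its cutoff are not shown in this document.

```rust
/// Cosine similarity of two vectors.
/// Returns None for mismatched lengths, empty input, or a zero-length vector.
fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Hypothetical matcher: returns the candidate identity whose embedding is
/// most similar to `face`, provided the similarity reaches `threshold`.
fn best_match<'a>(
    face: &[f32],
    candidates: &[(&'a str, Vec<f32>)],
    threshold: f32,
) -> Option<&'a str> {
    candidates
        .iter()
        .filter_map(|(name, emb)| cosine_similarity(face, emb).map(|s| (*name, s)))
        .filter(|(_, s)| *s >= threshold)
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(name, _)| name)
}
```

A face that falls below the threshold for every candidate returns None, which lets the caller leave it unbound instead of forcing a weak match.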
Synonym Expansion
- MOMENTRY_SYNONYM_FILES - Comma-separated paths to synonym JSON files (e.g., data/english_synonyms.json,data/llm_synonyms.json)
- MOMENTRY_SYNONYM_FILE - Single synonym JSON file path (deprecated, use above)
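The two variables can be resolved into one list of paths. The sketch below is a hypothetical `synonym_files` helper that splits the comma-separated value and falls back to the deprecated single-file variable only when the list is empty. That precedence is an assumption, as the document does not state how the two interact.

```rust
use std::path::PathBuf;

/// Hypothetical resolver: `multi` is the value of MOMENTRY_SYNONYM_FILES and
/// `single` is the value of the deprecated MOMENTRY_SYNONYM_FILE.
fn synonym_files(multi: Option<&str>, single: Option<&str>) -> Vec<PathBuf> {
    // Split on commas, trim whitespace, and drop empty entries.
    let from_list: Vec<PathBuf> = multi
        .unwrap_or("")
        .split(',')
        .map(str::trim)
        .filter(|p| !p.is_empty())
        .map(PathBuf::from)
        .collect();
    if !from_list.is_empty() {
        return from_list;
    }
    // Assumed fallback: use the deprecated single-file variable.
    single
        .map(str::trim)
        .filter(|p| !p.is_empty())
        .map(|p| vec![PathBuf::from(p)])
        .unwrap_or_default()
}
```

Trimming each entry means a value written with spaces after the commas still resolves to clean paths.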
Logging
- RUST_LOG or MOMENTRY_LOG_LEVEL - Log level (default: info)
Notes
- Unit tests exist (86 library tests)
- Video processing uses external tools (ffprobe, Python scripts)
- Multi-database architecture (PostgreSQL, MongoDB, Redis, Qdrant)
- Monitor directory is a separate system (not Rust)
- PythonExecutor provides unified script execution with timeout support
- Uses version 1.0.x of the redis crate for improved performance
- FaceNet CoreML model (models/facenet512.mlpackage) replaces InsightFace for embedding extraction (MIT license, ANE-accelerated)
LLM Synonym Generation
Generate synonym database using llama.cpp (Gemma4):
# Generate full database (162 entries, ~5 minutes)
python3 scripts/generate_synonyms_llamacpp.py
# Quick test
python3 scripts/generate_synonyms_llamacpp.py --test
# Resume from existing file
python3 scripts/generate_synonyms_llamacpp.py --resume
# Output: data/llm_synonyms.json (27 Chinese + 135 English words)
Task Management
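Track tasks with todowrite
# Create a task list
/todo Build the config module [in_progress]
/todo Add unit tests [pending]
# Update status
/todo Mark as done [completed]
Batching Recommendations
- Handle 1-2 features at a time
- Verify each feature once it is complete (clippy + test)
- Move on to the next one only after verification passes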
Code Review Checklist
After finishing a task, check that:
- cargo clippy --lib passes
- cargo test --lib passes
- cargo fmt -- --check passes
- Documentation is updated (if needed)
- New features have unit tests
Commit Guidelines
# feat: new feature
git commit -m "feat: add monitor_jobs table"
# fix: bug fix
git commit -m "fix: resolve SQL injection in store_vector"
# refactor: refactoring
git commit -m "refactor: use parameterized queries"
# docs: documentation update
git commit -m "docs: update AGENTS.md with new modules"
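The four prefixes above are easy to check programmatically, for example in a commit-msg hook. The following is a small sketch of such a check. `commit_type` is a hypothetical helper, and treating these four types as the complete allowed set is an assumption based only on the examples listed.

```rust
/// Commit types shown in the guidelines (assumed to be the full allowed set).
const ALLOWED_TYPES: [&str; 4] = ["feat", "fix", "refactor", "docs"];

/// Hypothetical check: returns the commit type when the message has the form
/// `<type>: <summary>` with an allowed type and a non-empty summary.
fn commit_type(message: &str) -> Option<&str> {
    let (prefix, summary) = message.split_once(": ")?;
    if summary.trim().is_empty() {
        return None;
    }
    ALLOWED_TYPES.iter().copied().find(|t| *t == prefix)
}
```

A message such as `feat: add monitor_jobs table` returns `Some("feat")`, while a message with an unlisted type or a missing summary returns None.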
Pre-commit Hook
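The project ships a configured .git/hooks/pre-commit that runs these checks automatically before each commit:
# Checks
1. cargo fmt --check # Rust formatting check
2. cargo clippy --lib # Rust lint check
3. cargo test --lib # Rust unit tests
4. ruff check # Python lint check
5. ruff format --check # Python formatting check
6. markdownlint # Markdown format check
7. shellcheck # Shell script check
# Skip all checks (not recommended)
git commit --no-verify
# Skip specific checks
# (--skip-checks is not a built-in git commit option; confirm how the hook handles it before relying on it)
git commit --skip-checks
Note: the hook only checks staged Rust/Python/Markdown files.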
Python Environment Setup
# Install ruff
pip install ruff==0.11.2
# Format Python files
ruff format scripts/
# Lint Python files
ruff check scripts/
Markdown Environment Setup
# Install markdownlint-cli (uses the system Node.js)
npm install -g markdownlint-cli
# Check Markdown files
markdownlint docs/
# Configuration file
.markdownlint.json
Shell Environment Setup
# Install shellcheck
brew install shellcheck
# Check shell scripts
shellcheck scripts/*.sh monitor/**/*.sh
Note: the hook only blocks on error-level shellcheck findings; style warnings are shown but do not block the commit.
Release Workflow
Pre-release Preparation
Before every production binary release, you must:
1. Create a release tag
git tag -a v0.X.X -m "Release v0.X.X - YYYY-MM-DD"
git push origin v0.X.X
2. Back up a standalone copy of the source code
# Create a dedicated release directory
RELEASE_DIR="/Users/accusys/momentry_core_releases/v0.X.X"
mkdir -p "$RELEASE_DIR"
# Copy the full source tree (excluding unnecessary files)
rsync -av --exclude='.git' --exclude='target' --exclude='node_modules' \
  /Users/accusys/momentry_core_0.1/ "$RELEASE_DIR/"
# Record release information
echo "Release: v0.X.X" > "$RELEASE_DIR/RELEASE_INFO.txt"
echo "Date: $(date)" >> "$RELEASE_DIR/RELEASE_INFO.txt"
echo "Git Commit: $(git rev-parse HEAD)" >> "$RELEASE_DIR/RELEASE_INFO.txt"
echo "Binary: $(ls -la target/release/momentry)" >> "$RELEASE_DIR/RELEASE_INFO.txt"
3. Back up the binaries
cp target/release/momentry "$RELEASE_DIR/momentry_v0.X.X"
cp target/release/momentry_playground "$RELEASE_DIR/momentry_playground_v0.X.X" 2>/dev/null
4. Record the database schema
pg_dump -U accusys -d momentry --schema-only > "$RELEASE_DIR/schema_v0.X.X.sql"
Why This Matters
- Prevents the release binary from drifting out of sync with the current source code
- Makes it easy to trace the code state of a specific release
- Allows quick rollback or diffing when needed
- Ensures the database schema matches the code version
Reference Documents
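| Document | Purpose |
|---|---|
| docs/OPENCODE_GUIDE.md | OpenCode usage guidelines |
| docs/ARCHITECTURE_EVALUATION.md | Architecture optimization items pending evaluation (including GraphRAG) |
| docs/PENDING_ISSUES.md | Tracking of unresolved issues |
| docs/MOMENTRY_CORE_MONITORING.md | Monitoring system specification |
| docs/MOMENTRY_CORE_REDIS_KEYS.md | Redis key design specification |
| docs/PYTHON.md | Python script guidelines |
| docs/FILE_CHANGE_MANAGEMENT.md | File change management guidelines |
| docs/YOLO_RESUME_INTEGRATION.md | YOLO Resume integration record |
| docs/DOCUMENT_EMBEDDING_STRATEGY.md | Parent-Child embedding strategy |
| docs/PROCESSING_PIPELINE.md | Processing pipeline documentation |
| docs/N8N_DEMO_WORKFLOW.md | n8n workflow documentation |
| docs/FRESH_MAC_INSTALLATION.md | Fresh Mac installation guide |
| docs/SERVICES.md | Service overview and management |
| docs/SFTPGO_DEMO_USER.md | SFTPGo user guide |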
Document Change Workflow
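Before modifying a document, consult docs/FILE_CHANGE_MANAGEMENT.md and make sure that:
- Before the change: read the whole document and run the pre-check checklist
- During the change: present a change plan and obtain confirmation
- After the change: show the diff and update the version history
- Verification: run lint/test and review before committing
Rules for AI Tool Edits
When an AI tool modifies a document, it:
- Must first read the entire document (reading only selected sections is not allowed)
- Must propose a change plan for confirmation before editing
- Must show the diff after editing
- Must update the version history table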
PHP Development
WordPress serves as the Momentry Portal, integrating the pages for n8n automation and the sftpgo file service.
Editor Setup
| Editor | LSP | Installation |
|---|---|---|
| VS Code | Intelephense | Extension Marketplace (recommended) |
| Cursor | Intelephense | Extension Marketplace (recommended) |
| CLI | phpactor | ~/bin/phpactor |
Intelephense (VS Code/Cursor)
- Install the extension: search for "Intelephense"
- Settings:
{
"intelephense.stubs": ["wordpress"]
}
phpactor (CLI)
# Installation
brew install composer
curl -sSL https://github.com/phpactor/phpactor/releases/latest/download/phpactor.phar -o ~/bin/phpactor
chmod +x ~/bin/phpactor
# Install WordPress stubs
cd /Users/accusys/wordpress/web
composer require --dev php-stubs/wordpress-stubs
# Build the WordPress index
cd /Users/accusys/wordpress/web
~/bin/phpactor index:build --reset
# Common commands
~/bin/phpactor class:search "WP_User" # Search for a class
~/bin/phpactor index:query WP_User # Show class information
~/bin/phpactor navigate /path/to/file.php # Navigate to a definition
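WordPress Code Locations
| Type | Path |
|---|---|
| Themes | /Users/accusys/wordpress/web/wp-content/themes/ |
| Plugins | /Users/accusys/wordpress/web/wp-content/plugins/ |
Collaboration with the marcom Team
| Role | Responsibility |
|---|---|
| marcom team | Figma design / Elementor build |
| OpenCode | Code implementation / refactoring |
Development Timeline
Phase 1: marcom build (current) → Elementor page build
Phase 2: Handover review (TBD) → Feature confirmation / refactoring assessment
Phase 3: OpenCode refactoring → Pure-code implementation, delivering a version with no Elementor dependency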
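M4 Notification Protocol
Fixed Notification Channel
The only channel for notifying M4: create a response document under M4_workspace/ and git commit it. No verbal notice, instant message, or email is needed.
Naming Rules
docs_v1.0/M4_workspace/YYYY-MM-DD_<topic>_response.md (reply to an M4 question)
docs_v1.0/M4_workspace/YYYY-MM-DD_<topic>.md (proactive notice)
docs_v1.0/M4_workspace/YYYY-MM-DD_<topic>_test_report.md (test report)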
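The three naming patterns differ only in their suffix, so a small helper can build the path and avoid typos. This is an illustrative sketch: `NoticeKind` and `m4_notice_path` are hypothetical names, and the function does not validate the date or topic format.

```rust
/// The three kinds of M4 documents covered by the naming rules.
enum NoticeKind {
    Response,
    Proactive,
    TestReport,
}

/// Hypothetical helper: builds the document path for an M4 notice.
/// `date` is expected as YYYY-MM-DD and `topic` as a short slug.
fn m4_notice_path(date: &str, topic: &str, kind: NoticeKind) -> String {
    let suffix = match kind {
        NoticeKind::Response => "_response",
        NoticeKind::Proactive => "",
        NoticeKind::TestReport => "_test_report",
    };
    format!("docs_v1.0/M4_workspace/{date}_{topic}{suffix}.md")
}
```

For instance, a reply about a topic named `face_model` on 2025-01-15 (an invented example) maps to `docs_v1.0/M4_workspace/2025-01-15_face_model_response.md`.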
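Triggers
| Situation | Action |
|---|---|
| M4 submits an issue report to M4_workspace/ | After fixing it, reply with *_response.md |
| A task requested by M4 is completed | Reply with *_response.md |
| Major change (model replacement, architecture change) | Notify proactively with *.md |
| A new test package is produced | *_test_report.md |
Delivery Checklist
- The document is written to docs_v1.0/M4_workspace/
- git add includes the document
- git commit includes the related changes
- M4 reviews it via git log
See docs_v1.0/M4_workspace/M4_NOTIFICATION_PROTOCOL.md for the detailed protocol.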
Delivery Procedure
For the complete delivery procedure (M4_workspace → M5 → Release → Deploy → Public), see:
docs_v1.0/REFERENCE/DELIVERY_PROCEDURE.md