feat: ASRX hybrid pipeline, identity history, worker fixes, checkpoint system

Accusys
2026-06-02 07:13:23 +08:00
parent e3066c3f49
commit e1572907ae
198 changed files with 43705 additions and 8910 deletions


@@ -1,105 +1,178 @@
# Momentry Core Configuration Management
# Momentry Core Config Management
## Directory Structure
## Directory Structure
```
momentry_core_0.1/
├── .env.example # Config template (tracked in version control)
├── .env # Local config (excluded from version control)
├── .env.local # Local override config (excluded from version control)
├── .env.example # Template (version controlled)
├── .env # Local config (gitignored)
├── .env.development # Playground dev overrides (gitignored)
├── .env.local # Local overrides (gitignored)
├── config/
│ └── README.md # This document
└── src/core/config.rs # Config code
│ ├── README.md # This file
│ └── port_registry.tsv # Central port registry
└── src/core/config.rs # Config code with lazy_static env reading
```
## Config Load Order
## Load Order
1. `.env` - default local config
2. `.env.local` - local overrides (highest priority)
For `momentry_playground` (development):
1. `.env` — shared defaults
2. `.env.development` — dev-specific overrides (loaded by playground binary)
## Environment Variable List
For `momentry` (production):
1. `.env` — production config
### Database Configuration
In Rust: `config.rs` reads env vars with lazy_static, falling back to hardcoded defaults.
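The read-with-fallback pattern described above can be sketched as follows. This is a minimal illustration, not the project's actual `config.rs`: the helper names `env_or`, `env_parse_or`, and `server_port` are hypothetical, and `std::sync::OnceLock` stands in for `lazy_static` so the example needs no external crate.

```rust
use std::env;
use std::str::FromStr;
use std::sync::OnceLock;

/// Returns the value of `key`, or `default` when the variable is unset
/// or not valid Unicode. (Hypothetical helper, not from config.rs.)
fn env_or(key: &str, default: &str) -> String {
    env::var(key).unwrap_or_else(|_| default.to_string())
}

/// Parses `key` into `T`, falling back to `default` when the variable
/// is unset or fails to parse. (Hypothetical helper.)
fn env_parse_or<T: FromStr>(key: &str, default: T) -> T {
    env::var(key)
        .ok()
        .and_then(|raw| raw.trim().parse().ok())
        .unwrap_or(default)
}

/// Reads MOMENTRY_SERVER_PORT once and caches it for the process lifetime,
/// using the documented default of 3002.
fn server_port() -> u16 {
    static PORT: OnceLock<u16> = OnceLock::new();
    *PORT.get_or_init(|| env_parse_or("MOMENTRY_SERVER_PORT", 3002))
}
```

Because the value is cached on first access, changing the environment variable after startup has no effect in this sketch, which matches the usual behavior of lazily initialized statics.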
| Variable | Description | Default |
|------|------|--------|
| `DATABASE_URL` | PostgreSQL connection string | `postgres://accusys@localhost:5432/momentry` |
## Environment Variables
### Redis Configuration
### Server
| Variable | Description | Default |
|------|------|--------|
| `REDIS_URL` | Redis connection string | `redis://:accusys@localhost:6379` |
| `REDIS_PASSWORD` | Redis password | `accusys` |
| Variable | Description | Default |
|----------|-------------|---------|
| `MOMENTRY_SERVER_PORT` | Server port (3002=prod, 3003=dev) | `3002` |
| `MOMENTRY_REDIS_PREFIX` | Redis key prefix | `momentry:` (prod), `momentry_dev:` (dev) |
### Storage Paths
### Database
| Variable | Description | Default |
|------|------|--------|
| `MOMENTRY_OUTPUT_DIR` | Output directory | `/Users/accusys/momentry/output` |
| `MOMENTRY_BACKUP_DIR` | Backup directory | `/Users/accusys/momentry/backup/momentry` |
| `MOMENTRY_SCRIPTS_DIR` | Scripts directory | `/Users/accusys/momentry_core_0.1/scripts` |
| `MOMENTRY_PYTHON_PATH` | Python path | `/opt/homebrew/bin/python3.11` |
| Variable | Description | Default |
|----------|-------------|---------|
| `DATABASE_URL` | PostgreSQL connection string | `postgres://accusys@localhost:5432/momentry` |
| `DATABASE_SCHEMA` | Schema for dev isolation | `dev` |
| `MONGODB_URL` | MongoDB connection string | `mongodb://localhost:27017` |
| `MONGODB_DATABASE` | MongoDB database name | `momentry` (prod), `momentry_dev` (dev) |
| `MONGODB_CACHE_ENABLED` | MongoDB cache toggle | `true` |
| `MONGODB_CACHE_TTL_VIDEOS` | Cache TTL for videos | `300` |
| `MONGODB_CACHE_TTL_SEARCH` | Cache TTL for search | `300` |
| `MONGODB_CACHE_TTL_HYBRID_SEARCH` | Cache TTL for hybrid search | `600` |
| `MONGODB_CACHE_TTL_VIDEO_META` | Cache TTL for video metadata | `3600` |
### Processor Timeouts (seconds)
### Redis
| Variable | Description | Default |
|------|------|--------|
| `MOMENTRY_ASR_TIMEOUT` | ASR processing timeout | `3600` |
| `MOMENTRY_CUT_TIMEOUT` | CUT processing timeout | `3600` |
| `MOMENTRY_DEFAULT_TIMEOUT` | Default timeout | `7200` |
| Variable | Description | Default |
|----------|-------------|---------|
| `REDIS_URL` | Redis connection string | `redis://:accusys@localhost:6379` |
| `REDIS_PASSWORD` | Redis password | `accusys` |
| `REDIS_CACHE_TTL_HEALTH` | Health check cache TTL | `30` |
| `REDIS_CACHE_TTL_VIDEO_META` | Video metadata cache TTL | `3600` |
### Logging
### Qdrant
| Variable | Description | Default |
|------|------|--------|
| `RUST_LOG` | Log level | `info` |
| `MOMENTRY_LOG_LEVEL` | Log level (alternative) | `info` |
| Variable | Description | Default |
|----------|-------------|---------|
| `QDRANT_URL` | Qdrant server URL | `http://localhost:6333` |
| `QDRANT_API_KEY` | Qdrant API key | `Test3200Test3200Test3200` |
| `QDRANT_COLLECTION` | Collection name | `momentry_rule1` (prod), `momentry_dev_rule1_v2` (dev) |
## Usage
### LLM
### 1. Initial Setup
| Variable | Description | Default |
|----------|-------------|---------|
| `MOMENTRY_LLM_CHAT_URL` | Chat/function-calling endpoint | `http://127.0.0.1:8082/v1/chat/completions` |
| `MOMENTRY_LLM_CHAT_MODEL` | Chat model name | `google_gemma-4-26B-A4B-it-Q5_K_M.gguf` |
| `MOMENTRY_LLM_VISION_URL` | Vision LLM endpoint (E4B) | falls back to CHAT_URL |
| `MOMENTRY_LLM_VISION_MODEL` | Vision model name (E4B) | falls back to CHAT_MODEL |
| `MOMENTRY_LLM_SUMMARY_URL` | Summary LLM endpoint (5W1H) | falls back to CHAT_URL |
| `MOMENTRY_LLM_SUMMARY_MODEL` | Summary model name | falls back to CHAT_MODEL |
| `MOMENTRY_LLM_SUMMARY_ENABLED` | Toggle 5W1H summary generation | `true` |
| `MOMENTRY_LLM_SUMMARY_TIMEOUT` | 5W1H timeout in seconds | `120` |
| `MOMENTRY_LLM_CHAT_TIMEOUT` | Chat LLM timeout in seconds | `120` |
| `MOMENTRY_LLM_VISION_TIMEOUT` | Vision LLM timeout in seconds | `120` |
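The "falls back to CHAT_URL" behavior in the table above can be sketched as a three-step lookup: the specific variable, then `MOMENTRY_LLM_CHAT_URL`, then the documented chat default. The function name `resolve_llm_url` is hypothetical, the variables are passed in as a map so the logic is easy to test, and treating an empty value as unset is an assumption rather than confirmed behavior.

```rust
use std::collections::HashMap;

/// Documented default for MOMENTRY_LLM_CHAT_URL.
const DEFAULT_CHAT_URL: &str = "http://127.0.0.1:8082/v1/chat/completions";

/// Resolves an LLM endpoint: the specific key wins, then the chat URL,
/// then the hardcoded default. Empty values are treated as unset
/// (an assumption of this sketch).
fn resolve_llm_url(vars: &HashMap<String, String>, specific_key: &str) -> String {
    let non_empty = |key: &str| {
        vars.get(key)
            .map(|v| v.trim())
            .filter(|v| !v.is_empty())
            .map(str::to_string)
    };
    non_empty(specific_key)
        .or_else(|| non_empty("MOMENTRY_LLM_CHAT_URL"))
        .unwrap_or_else(|| DEFAULT_CHAT_URL.to_string())
}
```

The same shape would apply to the `*_MODEL` variables, with `MOMENTRY_LLM_CHAT_MODEL` as the intermediate fallback.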
### Embedding
| Variable | Description | Default |
|----------|-------------|---------|
| `MOMENTRY_EMBED_URL` | Embedding server URL | `http://localhost:11436` |
### TMDb Integration
| Variable | Description | Default |
|----------|-------------|---------|
| `TMDB_API_KEY` | TMDb API key (required for probe) | (none) |
| `MOMENTRY_TMDB_PROBE_ENABLED` | Enable TMDb probe during register | `false` |
### Paths
| Variable | Description | Default |
|----------|-------------|---------|
| `MOMENTRY_OUTPUT_DIR` | Output directory for processing | `/Users/accusys/momentry/output` |
| `MOMENTRY_BACKUP_DIR` | Backup directory | `/Users/accusys/momentry/backup/momentry` |
| `MOMENTRY_SCRIPTS_DIR` | Python scripts directory | `/Users/accusys/momentry_core_0.1/scripts` |
| `MOMENTRY_PYTHON_PATH` | Python interpreter path | `/opt/homebrew/bin/python3.11` |
| `MOMENTRY_MEDIA_BASE_URL` | Base URL for media serving | (none) |
### Processor Timeouts
| Variable | Description | Default |
|----------|-------------|---------|
| `MOMENTRY_ASR_TIMEOUT` | ASR timeout in seconds | `3600` |
| `MOMENTRY_CUT_TIMEOUT` | CUT timeout in seconds | `3600` |
| `MOMENTRY_DEFAULT_TIMEOUT` | Default timeout in seconds | `7200` |
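Since these timeouts are plain integers in seconds, one way to consume them is to convert the raw string into a `std::time::Duration` with the documented default as a fallback. The helper names `timeout_from` and `asr_timeout` are hypothetical, and rejecting a zero timeout is an assumption of this sketch.

```rust
use std::time::Duration;

/// Converts an optional raw value (seconds) into a Duration. Missing,
/// non-numeric, or zero values fall back to `default_secs`.
fn timeout_from(raw: Option<&str>, default_secs: u64) -> Duration {
    let secs = raw
        .and_then(|v| v.trim().parse::<u64>().ok())
        .filter(|&s| s > 0)
        .unwrap_or(default_secs);
    Duration::from_secs(secs)
}

/// Example wiring for the ASR processor, using the documented 3600 s default.
fn asr_timeout() -> Duration {
    timeout_from(std::env::var("MOMENTRY_ASR_TIMEOUT").ok().as_deref(), 3600)
}
```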
### Logging
| Variable | Description | Default |
|----------|-------------|---------|
| `RUST_LOG` | Rust log level (tracing) | `info` |
| `MOMENTRY_LOG_LEVEL` | Fallback log level | `info` |
### Worker
| Variable | Description | Default |
|----------|-------------|---------|
| `MOMENTRY_WORKER_ENABLED` | Enable background worker | `true` |
| `MOMENTRY_MAX_CONCURRENT` | Max concurrent jobs | `6` |
| `MOMENTRY_POLL_INTERVAL` | Poll interval in seconds | `10` |
| `MOMENTRY_WORKER_BATCH_SIZE` | Batch size | `5` |
### Synonym Expansion
| Variable | Description | Default |
|----------|-------------|---------|
| `MOMENTRY_SYNONYM_FILES` | Comma-separated paths to synonym JSON files | (none) |
| `MOMENTRY_SYNONYM_FILE` | Single synonym file (deprecated) | (none) |
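A comma-separated path list with a deprecated single-file fallback could be parsed roughly as below. The function name `synonym_paths` is hypothetical, and the precedence (the plural variable wins, the deprecated one is used only when the list is empty) is an assumption; the README does not state how the two interact.

```rust
use std::path::PathBuf;

/// Builds the synonym file list. `files` is the raw MOMENTRY_SYNONYM_FILES
/// value, `legacy_file` the raw MOMENTRY_SYNONYM_FILE value.
fn synonym_paths(files: Option<&str>, legacy_file: Option<&str>) -> Vec<PathBuf> {
    // Split on commas, trim whitespace, and drop empty entries.
    let from_list: Vec<PathBuf> = files
        .unwrap_or("")
        .split(',')
        .map(str::trim)
        .filter(|p| !p.is_empty())
        .map(PathBuf::from)
        .collect();
    if !from_list.is_empty() {
        return from_list;
    }
    // Fall back to the deprecated single-file variable.
    legacy_file
        .map(str::trim)
        .filter(|p| !p.is_empty())
        .map(|p| vec![PathBuf::from(p)])
        .unwrap_or_default()
}
```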
### Encryption
| Variable | Description | Default |
|----------|-------------|---------|
| `AUDIT_ENCRYPTION_KEY` | 32-byte hex encryption key (64 hex chars) | (none) |
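A 32-byte key written as 64 hex characters can be validated and decoded at startup so that a malformed value fails early. The sketch below uses only the standard library; `parse_audit_key` is a hypothetical name, and how the project actually decodes the key is not shown in this README.

```rust
/// Decodes a 64-character hex string into a 32-byte key, returning a
/// descriptive error for wrong length or non-hex characters.
fn parse_audit_key(hex: &str) -> Result<[u8; 32], String> {
    let hex = hex.trim();
    if hex.len() != 64 {
        return Err(format!("expected 64 hex characters, got {}", hex.len()));
    }
    // Checking every byte up front also guarantees ASCII, so the
    // two-character slices below always fall on character boundaries.
    if !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return Err("key contains non-hex characters".to_string());
    }
    let mut key = [0u8; 32];
    for (i, byte) in key.iter_mut().enumerate() {
        let pair = &hex[2 * i..2 * i + 2];
        *byte = u8::from_str_radix(pair, 16)
            .map_err(|e| format!("invalid hex at offset {}: {}", 2 * i, e))?;
    }
    Ok(key)
}
```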
## Port Registry
See `config/port_registry.tsv` for the authoritative list of all ports and their owners.
| Port | Service | Owner | Config Key |
|------|---------|-------|------------|
| 5432 | PostgreSQL | postgres | `DATABASE_URL` |
| 6379 | Redis | redis-server | `REDIS_URL` |
| 6333 | Qdrant | qdrant | `QDRANT_URL` |
| 8082 | LLM Chat (A4B) | llama-server | `MOMENTRY_LLM_CHAT_URL` |
| 8083 | LLM Vision (E4B) | llama-server | `MOMENTRY_LLM_VISION_URL` |
| 11434 | Ollama | ollama | `MOMENTRY_OLLAMA_URL` |
| 11436 | Embedding | embeddinggemma_server.py | `MOMENTRY_EMBED_URL` |
| 27017 | MongoDB | mongod | `MONGODB_URL` |
| 3002 | Production API | momentry | `MOMENTRY_SERVER_PORT` |
| 3003 | Playground API | momentry_playground | `MOMENTRY_SERVER_PORT` |
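If a tool needs to read `config/port_registry.tsv` directly, a row parser might look like the sketch below. It assumes the leading fields are port, service, owner process, and config key in that order, with `-` meaning "no config key"; the file's header row is not shown here, so the column order and the `PortEntry` type are assumptions.

```rust
/// One registry row, reduced to the four leading columns.
#[derive(Debug, PartialEq)]
struct PortEntry {
    port: u16,
    service: String,
    owner: String,
    config_key: Option<String>,
}

/// Parses one row; returns None for blank lines, comment lines starting
/// with '#', and rows whose first field is not a valid port number.
fn parse_registry_row(line: &str) -> Option<PortEntry> {
    let line = line.trim();
    if line.is_empty() || line.starts_with('#') {
        return None;
    }
    // Splitting on any whitespace handles tabs as well as aligned spaces.
    let mut fields = line.split_whitespace();
    let port = fields.next()?.parse().ok()?;
    let service = fields.next()?.to_string();
    let owner = fields.next()?.to_string();
    let config_key = fields.next().filter(|k| *k != "-").map(str::to_string);
    Some(PortEntry { port, service, owner, config_key })
}
```

Only the first four fields are read, so trailing columns that contain spaces do not affect the result.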
## Quick Start
```bash
# Copy the template
# 1. Copy template
cp .env.example .env
# Edit the config
nano .env
# 2. Edit .env for production or use .env.development for playground
# 3. Start all services
./scripts/start_momentry.sh
```
### 2. Local Overrides
## Version Control
Create `.env.local` for settings that apply only to the local machine:
```bash
# .env.local example
DATABASE_URL=postgres://local:password@localhost:5432/momentry_dev
MOMENTRY_LOG_LEVEL=debug
```
### 3. Run the Application
```bash
# Load the config and run
source .env && cargo run
# Or use direnv
direnv allow
```
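## Version Control Policy
| File | Version Control | Notes |
|------|---------|------|
| `.env.example` | ✅ Tracked | Template with all options |
| `.env` | ❌ Ignored | Local sensitive config |
| `.env.local` | ❌ Ignored | Local override config |
## Deployment Checklist
- [ ] Copy `.env.example` to `.env`
- [ ] Set the database connection
- [ ] Set the Redis password
- [ ] Configure directory paths
- [ ] Confirm the log level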
| File | Tracked | Purpose |
|------|---------|---------|
| `.env.example` | ✅ Yes | Template with all options documented |
| `.env` | ❌ No | Local sensitive config |
| `.env.development` | ❌ No | Dev-specific overrides |
| `.env.local` | ❌ No | Local overrides (highest priority) |
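The layering described in this README (shared `.env` values, then overrides from a more specific file) can be modeled as a map merge in which later layers win. This is a simplified sketch: `parse_env` and `merge_layers` are hypothetical names, quoting and `export` prefixes are not handled, and the real binaries may rely on a dotenv crate with different precedence rules.

```rust
use std::collections::HashMap;

/// Parses KEY=VALUE lines, skipping blank lines and '#' comments.
fn parse_env(text: &str) -> HashMap<String, String> {
    text.lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .filter_map(|line| line.split_once('='))
        .map(|(key, value)| (key.trim().to_string(), value.trim().to_string()))
        .collect()
}

/// Merges layers in order; a key in a later layer replaces the same key
/// from an earlier one.
fn merge_layers(layers: &[&str]) -> HashMap<String, String> {
    let mut merged = HashMap::new();
    for text in layers {
        merged.extend(parse_env(text));
    }
    merged
}
```

For the playground, the layers would be the contents of `.env` followed by `.env.development`, so a dev-specific `MOMENTRY_SERVER_PORT=3003` replaces the shared `3002`.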


@@ -16,7 +16,9 @@
6379 redis redis-server REDIS_URL redis://...:6379 start_momentry.sh
6333 qdrant qdrant QDRANT_URL http://...:6333 start_momentry.sh
8081 wordpress Caddy - - Caddyfile
8082 llm llama-server MOMENTRY_LLM_CHAT_URL http://...:8082 start_momentry.sh
8082 llm-chat llama-server MOMENTRY_LLM_CHAT_URL http://...:8082 start_momentry.sh
8083 llm-vision llama-server MOMENTRY_LLM_VISION_URL http://...:8083 start_momentry.sh
9000 php-fpm php-fpm - 9000 brew services
11434 ollama ollama MOMENTRY_OLLAMA_URL http://...:11434 start_momentry.sh
11436 embedding embeddinggemma MOMENTRY_EMBED_URL http://...:11436 start_momentry.sh
27017 mongodb mongod MONGODB_URL mongodb://...:27017 start_momentry.sh
# Port Registry - Momentry Core