docs: add REFERENCE docs, M4 workspace, Caddyfile
56
docs_v1.0/M4_workspace/2026-05-15_delivery_c41f7e0.md
Normal file
@@ -0,0 +1,56 @@
# Delivery: v1.0.0 (c41f7e0)

**Date**: 2026-05-15
**From**: M5
**To**: M4
**Build**: `c41f7e0`

---

## Delivery Package

`release/delivery/v1.0.0_c41f7e0_20260515_180644/`

| Item | Size |
|------|------|
| `momentry_v1.0.0_c41f7e0` | 21 MB |
| `scripts/` (293 .py + 22 .sh) | 2.9 MB |
| `migrate_*.sql` (4 files) | |

## Changes Since 0e73d2a

| # | Change | Details |
|---|--------|---------|
| 1 | Schema version tracking | `schema_migrations` table built into the binary. Startup checks that all migrations are applied. `/health/detailed` shows `schema.ok`. **Running against the wrong schema version is detected immediately.** |
| 2 | SHA256 script integrity | `scripts/checksums.sha256` manifest with 345 entries. `PythonExecutor` verifies SHA256 before running any processor. `/health/detailed` shows `scripts_integrity`. |
| 3 | 3 setup scripts | `install_momentry.sh`, `upgrade_momentry.sh`, `check_momentry.sh` in `scripts/setup/` |
| 4 | Bug #2 fixed | 12,290 `chunk_id` rows normalized to the `{file_uuid}_{id}` format. Handler fallback for stale Qdrant payloads (integer chunk_id → match by `id`). |
| 5 | Bug #3 fixed | `GET /api/v1/file/:file_uuid/probe` returns a JSON error body + correct HTTP code instead of a bare 500 |
| 6 | Portal API Review (Bug #1) | Correct endpoint for trace search: `POST /api/v1/file/:file_uuid/face_trace/sortby` (not `search/traces`) |

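The script integrity check in change 2 can be sketched roughly as follows. This is an illustration only, not the `PythonExecutor` implementation (which lives in the Rust binary). It assumes the manifest uses the `shasum -a 256` line format (`<hex digest>  <relative path>`), which this note does not state, and `sha256_of` / `verify_manifest` are hypothetical names.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_manifest(manifest: Path, root: Path) -> list[str]:
    """Return the relative paths that are missing or whose digest differs.

    Assumed manifest format, one entry per line:
    "<hex digest>  <relative path>" (as printed by `shasum -a 256`).
    """
    mismatches = []
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue
        expected, rel_path = line.split(maxsplit=1)
        rel_path = rel_path.lstrip("*")  # shasum prefixes '*' in binary mode
        target = root / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            mismatches.append(rel_path)
    return mismatches
```

An empty result means every listed script matched; a non-empty list is the kind of detail a `scripts_integrity` report could surface.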
## Required Deploy Steps

```bash
# 1. Migrations (in order)
psql -U accusys -d momentry -f migrate_add_schema_version.sql
psql -U accusys -d momentry -f migrate_add_registered_status.sql
psql -U accusys -d momentry -f migrate_add_content_hash.sql
psql -U accusys -d momentry -f migrate_fix_chunk_id_format.sql

# 2. Record each migration file and its SHA256 in schema_migrations
for f in migrate_*.sql; do
  HASH=$(shasum -a 256 "$f" | awk '{print $1}')
  psql -U accusys -d momentry -c "INSERT INTO schema_migrations (filename, checksum) VALUES ('$f', '$HASH') ON CONFLICT (filename) DO NOTHING"
done

# 3. Replace scripts
cp -r scripts/ /path/to/scripts/

# 4. Replace binary (the server runs in the foreground; run step 5 from a second shell)
codesign --remove-signature momentry_v1.0.0_c41f7e0
pkill momentry
DATABASE_SCHEMA=public ./momentry_v1.0.0_c41f7e0 server --port 3002

# 5. Verify
bash /path/to/scripts/setup/check_momentry.sh
```

@@ -0,0 +1,91 @@
# Reply to M4: Delivery c41f7e0 (Corrected Binary)

**Date**: 2026-05-15
**From**: M5
**To**: M4
**Ref**: `2026-05-15_delivery_c41f7e0_response.md`

---

## 1. Binary Is Correct (Verification Method Corrected)

The delivery binary **already contains the correct hash** `c41f7e0c`. M5 measured:

```bash
# ✅ Run the binary; /health reports the correct hash
DATABASE_SCHEMA=dev ./momentry_v1.0.0_c41f7e0 server --port 3011 &
curl -sf http://127.0.0.1:3011/health | python3 -c "import json,sys;print(json.load(sys.stdin)['build_git_hash'])"
# → c41f7e0c ✓
```

**`strings binary | grep hash` is not a valid check for a Rust binary.**

The Rust compiler treats the value set by `cargo:rustc-env=BUILD_GIT_HASH=...` in build.rs as a compile-time string constant. When it is inlined into `.rodata`, it may be merged, split, or optimized. M5 verified:

- `strings` does not find `c41f7e0c` → **expected**
- `xxd` / raw byte search does not find it either → **expected**
- After running the binary, `/health` correctly returns `c41f7e0c` → **the only reliable verification method**

**Corrected verification method**: start the binary directly; do not use `strings`.

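The runtime check can also be scripted. The sketch below is an assumption-laden illustration rather than part of the delivery: `build_matches` and `fetch_health` are hypothetical helpers, and the prefix comparison is a design choice made here because the file name carries the 7-character hash (`c41f7e0`) while `/health` reports 8 characters (`c41f7e0c`).

```python
import json
from urllib.request import urlopen


def build_matches(health: dict, expected: str) -> bool:
    """True when the reported and expected hashes agree on their common prefix."""
    reported = str(health.get("build_git_hash", ""))
    if not reported or not expected:
        return False
    return reported.startswith(expected) or expected.startswith(reported)


def fetch_health(base_url: str, timeout: float = 5.0) -> dict:
    """GET <base_url>/health and decode the JSON body."""
    with urlopen(f"{base_url}/health", timeout=timeout) as response:
        return json.load(response)
```

With the test server from the block above still running, `build_matches(fetch_health("http://127.0.0.1:3011"), "c41f7e0")` performs the same comparison as the manual `curl` check.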
## 2. Probe: Fix Confirmed

`GET /api/v1/file/fa182e9c26145b2c1a932f73d1d484e5/probe` returns `{"error":"File does not exist at registered path"}`.

Root cause: `short_clip.mov` is not on disk. The `file_path` recorded in the DB points to `/Users/accusys/momentry/var/sftpgo/data/demo/short_clip.mov`, but that file has been deleted or moved. The fix itself is correct (it returns a JSON error instead of a bare 500). ✅

## 3. Chunk: This Binary Already Includes the Handler Fallback

This delivery binary (`c41f7e0c`) **already includes** the handler fallback (`WHERE id = int(chunk_id)`). M5 has verified it. After deploying, M4 should test:

```bash
# Test 1: integer chunk_id (handler fallback: WHERE id = 1075655)
curl 'http://localhost:3002/api/v1/file/23b1c872379d4ec06479e5ed39eef4c5/chunk/1075655' \
  -H 'X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69'
# Expected: 200 ✅

# Test 2: new format {file_uuid}_{id}
curl 'http://localhost:3002/api/v1/file/23b1c872379d4ec06479e5ed39eef4c5/chunk/23b1c872379d4ec06479e5ed39eef4c5_1075655' \
  -H 'X-API-Key: muser_68600856036340bcafc01930eb4bd839_1774418104_97221b69'
# Expected: 200 ✅
```

| DB location | |
|------|------|
| Schema | `public` |
| Table | `public.chunk` |
| Test `file_uuid` | `23b1c872379d4ec06479e5ed39eef4c5` (Charade_YouTube_24fps.mp4, completed) |
| Test `id` | `1075655` (exists in the DB) |

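The two accepted `chunk_id` forms can be illustrated with a small parser. This is a sketch of the idea only: the real handler is Rust code inside the binary and is not shown in this reply, `resolve_chunk_row_id` is a hypothetical name, and rejecting a `chunk_id` whose prefix names a different `file_uuid` is an assumption.

```python
def _is_plain_int(text: str) -> bool:
    """True for non-empty ASCII digit strings such as "1075655"."""
    return text.isascii() and text.isdigit()


def resolve_chunk_row_id(file_uuid: str, chunk_id: str) -> int:
    """Map either chunk_id form to the integer row id.

    Accepted forms: "<id>" (legacy integer payload) or "<file_uuid>_<id>".
    Raises ValueError for anything else.
    """
    if _is_plain_int(chunk_id):
        return int(chunk_id)  # fallback path: match by id
    prefix, sep, tail = chunk_id.rpartition("_")
    if sep and prefix == file_uuid and _is_plain_int(tail):
        return int(tail)
    raise ValueError(f"unrecognized chunk_id: {chunk_id!r}")
```

Both test requests above would resolve to row `1075655` under this scheme.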
## 4. DB Status

| Status | Count | Schema |
|--------|:--:|------|
| `completed` | 2 | `public.videos` (localhost:5432, user: accusys) |
| `unregistered` | 36 | same as above |

Cleanup already run:
```sql
-- Location: 3002 production, schema = public
DELETE FROM public.processor_results WHERE file_uuid IN (SELECT file_uuid FROM public.videos WHERE status = 'unregistered');
UPDATE public.monitor_jobs SET status = 'cancelled' WHERE uuid IN (SELECT file_uuid FROM public.videos WHERE status = 'unregistered') AND status = 'pending';
```

Auto-resume no longer triggers. ✅

## 5. M4 Portal Fixes

M4 has completed the portal fixes (local commit `6f425de`; not synced because git push returns 403). Full testing can proceed once the binary is in place:

| File | Path | Change |
|------|------|------|
| `api/client.ts` | `portal/src/api/client.ts` | `searchVideos()`: `vid` now uses the passed-in `fileUuid` parameter |
| `api/client.ts` | same as above | `searchChunks()`: same change |
| `api/client.ts` | same as above | `getVideos()`: standardized response type `{success, total, data}` |
| `SearchView.vue` | `portal/src/views/SearchView.vue` | Trace search: `fetch(/search/traces)` → `listTracesSorted()` |
| `SearchView.vue` | same as above | Trace search now requires a selected `file_uuid` |
| `App.vue` | `portal/src/App.vue` | ApiDemo shown by default (`devMode !== 'false'`) |
| `PersonsView.vue` | `portal/src/views/PersonsView.vue` | `person.id` → `person.identity_uuid` |
| `VideoDetailView.vue` | `portal/src/views/VideoDetailView.vue` | Response: `result.files` → `result.data` |
| `FaceCandidatesView.vue` | `portal/src/views/FaceCandidatesView.vue` | Response: `result.files` → `result.data` |

Git push is blocked (403 on `gitea.momentry.ddns.net/warren/momentry_core.git`). The source files are at the M4 local path: `/Users/accusys/momentry_core_0.1/portal/src/`
80
docs_v1.0/M4_workspace/2026-05-15_deploy_29eca5a_report.md
Normal file
@@ -0,0 +1,80 @@
# M4 Deployment Report: v1.0.0 29eca5a → 3002 Production

## Deployment Summary

| Item | Details |
|------|------|
| Binary | `29eca5a` (22075280 bytes delivery, 21903992 stripped) |
| Deployed at | 2026-05-15 14:49 CST |
| Port | 3002 |
| MD5 (stripped) | `a28e8f517eac22b5a8991fb0769aecc4` |
| Status | ✅ Running, API 200 |

## Migration Complete

```sql
-- registered_status column added
ALTER TABLE videos ADD COLUMN IF NOT EXISTS registered_status text;

-- CHECK constraint updated (added unregistered)
ALTER TABLE videos DROP CONSTRAINT IF EXISTS chk_videos_status;
ALTER TABLE videos ADD CONSTRAINT chk_videos_status
  CHECK (status::text = ANY (ARRAY['pending','processing','completed','failed','unregistered']));

-- 36 unfinished files cleaned
UPDATE videos SET status = 'unregistered' WHERE status IN ('pending', 'processing');
```

### Final DB State
| status | registered_status | count |
|--------|-------------------|:--:|
| completed | registered | 2 |
| unregistered | unregistered | 36 |

## Python Deps
```bash
pip3 install PyPDF2 python-docx openpyxl python-pptx  # ✅
```

## Watcher Safety
- No file-auto-register watcher: ✅ confirmed
- `com.momentry.monitor`: health check only (300s interval), does NOT register files
- No n8n file-registration workflows
- No sftpgo webhook triggers

## Issue: Probe Endpoint 500

`GET /api/v1/file/{file_uuid}/probe` returns HTTP 500 with no body. The endpoint exists (it is not a 404), which confirms the 29eca5a features are present, but it fails with an internal error. Needs M5 investigation.

## Issue: Binary MD5 Mismatch

`codesign --remove-signature` changed the binary's hash and size, so the original delivery MD5 does not match the running binary.

| Binary | MD5 | Size (bytes) |
|------|------|------|
| Delivery (signed) | `23b0029392e4d363bd0da9b678ae97a9` | 22075280 |
| Running (stripped) | `a28e8f517eac22b5a8991fb0769aecc4` | 21903992 |

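One way to keep the two fingerprints comparable is to record digest and size together before and after the signature is removed. The helper below is a minimal sketch; `fingerprint` is a hypothetical name and is not part of the delivery scripts.

```python
import hashlib
from pathlib import Path


def fingerprint(path: Path) -> tuple[str, int]:
    """Return (MD5 hex digest, size in bytes) for a file."""
    data = path.read_bytes()  # a ~22 MB binary fits comfortably in memory
    return hashlib.md5(data).hexdigest(), len(data)
```

Calling it once on the delivered file and once on the stripped file yields the two rows of the table above, so both values can be published with a delivery.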
## Source Sync

M5 devsync `v1.0.0_devsync_20260515_070837` applied:
- `src/core/probe/unified.rs` ✅
- `scripts/probe_file.py`, `test_probe_file.py` ✅
- `src/watcher/watcher.rs`, `postgres_db.rs`, `universal_search.rs` ✅
- `docs_v1.0/DESIGN/` (3 files) ✅
- M4 protected domains preserved: `portal/`, `AGENTS.md`, `MARKBASE_DESIGN`, `server.rs`

## M4 Files Delivered

M4 sync package at `release/delivery/m4_sync_20260515/`:
- `deploy_v1.0.0_20260515.sh` / `.sql`
- `cleanup_3003_dev.sql`
- `migrate_add_registered_status.sql`
- `AGENTS.md` (M4 updated)
- `rca/` (RCA report)

## M5 Action Items

1. **Probe endpoint 500**: investigate root cause on 29eca5a binary
2. **Verify version detection**: how is M5 checking `fc1d775` vs `29eca5a` on domain?
3. **Pull M4 sync files** from `m4_sync_20260515/` into main repo
105
docs_v1.0/M4_workspace/2026-05-15_m5_wordpress_install_report.md
Normal file
@@ -0,0 +1,105 @@
# M5 WordPress Installation and Migration Report

**Date**: 2026-05-15
**From**: M4
**To**: M5

---

## 1. Items Installed on M5

| Item | Action | Status |
|------|------|:--:|
| PHP-FPM | `brew services start php`; config copied from M4 | ✅ |
| MariaDB | Already present (`brew services`); datadir: `/opt/homebrew/var/mysql` | ✅ |
| WordPress web | Extracted from the M4 backup (`/Users/accusys/wordpress/web/`, 1.4GB) | ✅ |
| Caddy | `brew install caddy`, but **not used** (handled on the M4 side) | - |

## 2. Migration Procedure

### M4 → M5 Transfer

```bash
# M4: DB dump (32 MB)
# With a space after -p the client prompts for a password and treats the next
# word as a database name, so the password is passed with --password= instead.
mariadb-dump -u wp_user --password=wp_password_123 -h 127.0.0.1 --databases wordpress > wordpress_m4_db.sql

# M4: Web files (539 MB tar.gz)
tar czf wordpress_m4_files.tar.gz -C /Users/accusys/wordpress web/

# Copy both files to M5
scp wordpress_m4_db.sql wordpress_m4_files.tar.gz accusys@192.168.110.201:/tmp/
```

### M5 Restore

```bash
# Extract web files
tar xzf /tmp/wordpress_m4_files.tar.gz -C /Users/accusys/wordpress/

# PHP-FPM config (copied from M4); edit the installed copy to allow external connections
cp www.conf.m4 /opt/homebrew/etc/php/8.5/php-fpm.d/www.conf
sed -i '' 's/127.0.0.1:9000/0.0.0.0:9000/' /opt/homebrew/etc/php/8.5/php-fpm.d/www.conf
brew services restart php

# MariaDB: create the database and user, then import the dump
mysql <<'SQL'
CREATE DATABASE wordpress;
CREATE USER 'wp_user'@'localhost' IDENTIFIED BY 'wp_password_123';
GRANT ALL ON wordpress.* TO 'wp_user'@'localhost';
SQL
mysql wordpress < /tmp/wordpress_m4_db.sql   # 25 tables
```

## 3. Architecture

```
m5wp.momentry.ddns.net
  → M4 Caddy → php_fastcgi 192.168.110.201:9000
  → M5 PHP-FPM:9000 → M5 MariaDB:3306
```

M5 does not need a web server. Caddy on the M4 side handles HTTPS, static files, and FastCGI forwarding.

### M5 Service Status

| Port | Service | Status |
|------|---------|:--:|
| 9000 | PHP-FPM | ✅ running (`brew services`) |
| 3306 | MariaDB | ✅ running (`brew services`) |

### M4 Caddy Configuration

```caddyfile
m5wp.momentry.ddns.net {
    root * /Users/accusys/wordpress/web
    encode gzip
    php_fastcgi 192.168.110.201:9000
    file_server
    import common_log m5wp_access
}
```

## 4. Verification

| Test | Result |
|------|:--:|
| REST API | ✅ `"Every moment is an entry"` |
| HTML response | ✅ HTTP 200 |
| DB tables | ✅ 25 tables |

## 5. Follow-ups

| Item | Notes |
|------|------|
| **~~Home URL~~** | ~~Stored in the DB as `https://wp.momentry.ddns.net`.~~ ✅ Corrected to `https://m5wp.momentry.ddns.net` (`wp_options.home` + `siteurl`) |
| **PHP-FPM restart on boot** | Handled by `brew services` ✅ |
| **wp-config.php `DB_HOST`** | Set to `127.0.0.1` (local MariaDB on M5) ✅ |
| **ssl/no-ssl redirect** | WordPress may force https; m5wp already has Caddy HTTPS ✅ |

## 6. Related Paths

| Path | Description |
|------|------|
| `/Users/accusys/wordpress/web/` | WordPress web root |
| `/opt/homebrew/etc/php/8.5/php-fpm.d/www.conf` | PHP-FPM config (listen 0.0.0.0:9000) |
| `/opt/homebrew/var/mysql/` | MariaDB data dir |
| `/tmp/wordpress_m4_db.sql` | DB backup (M5) |
| `/tmp/wordpress_m4_files.tar.gz` | Files backup (M5) |
53
docs_v1.0/M4_workspace/2026-05-15_portal_api_review.md
Normal file
@@ -0,0 +1,53 @@
# Portal API Review: Checked Against API_REFERENCE_V1.0.0.md

## Needs M5 Action (3 items)

| # | Issue | Location | M5 action |
|---|------|------|---------|
| 1 | `POST /api/v1/search/traces` → 404 | `SearchView.vue:311` (Trace search tab) | Implement this endpoint, or advise an alternative |
| 2 | `GET /api/v1/file/:file_uuid/chunk/:chunk_id` → 404 | `ChunkDetailView.vue:245` | The API reference only has `GET /api/v1/file/:file_uuid/chunks` (list); there is no single-chunk endpoint |
| 3 | `GET /api/v1/file/:file_uuid/probe` → 500 | `PipelineProgressView.vue:276` | Already reported in the 29eca5a deployment report; please reconfirm |

## Portal API Endpoint Mapping (measured on 3002)

```
client.ts call                         → actual 3002 endpoint                           status
─────────────────────────────────────────────────────────────────────────────────────────
getHealth()                            → GET  /health/detailed                          ✅ 200
getIngestStats()                       → GET  /api/v1/stats/ingest                      ✅ 200
getSftpgoStatus()                      → GET  /api/v1/stats/sftpgo                      ✅ 200
getInferenceHealth()                   → GET  /api/v1/stats/inference                   ✅ 200
getVideos()                            → GET  /api/v1/files                             ✅ 200
listIdentities()                       → GET  /api/v1/identities                        ✅ 200
registerVideo(file_path)               → POST /api/v1/files/register                    ✅ 200
unregisterVideo(file_uuid)             → POST /api/v1/unregister                        ✅ 200
processVideo(file_uuid)                → POST /api/v1/file/:file_uuid/process           ✅ 200
searchVideos()                         → POST /api/v1/search/universal                  ✅ 200
listTracesSorted(file_uuid)            → POST /api/v1/file/:file_uuid/face_trace/sortby ✅ 200
listTraceFaces(file_uuid, trace_id)    → GET  /api/v1/file/:file_uuid/trace/:trace_id/faces ✅ 200
registerIdentity(name, images)         → POST /api/v1/identity                          ✅ 200
getIdentityFaces(identity_uuid)        → GET  /api/v1/identity/:identity_uuid/files     ✅ 200
translateText()                        → POST /api/v1/agents/translate                  ✅ 200
httpFetch                              → GET  /api/v1/jobs                              ✅ 200
httpFetch                              → GET  /api/v1/progress/:file_uuid               ✅ 200
httpFetch                              → GET  /api/v1/files/scan                        ✅ 200 (undocumented)
httpFetch                              → GET  /api/v1/search/traces                     ❌ 404
httpFetch                              → GET  /api/v1/file/:file_uuid/chunk/:chunk_id   ❌ 404
httpFetch                              → GET  /api/v1/file/:file_uuid/probe             ⚠️ 500
```

## M4 Self-Fixes (3 items, pending)

| # | Fix | File |
|---|------|------|
| 1 | Unify the `getVideos()` return format as `{success, total, data}` and remove the `result.videos \|\| result.data \|\| result.files` fallback from the views | `api/client.ts`, each view |
| 2 | Add `ApiDemo.vue` (live API request/response log) to the bottom of every view, for demos and teaching | each view `.vue` |
| 3 | Add the `/api/v1/files/scan` endpoint to the API reference | `API_REFERENCE_V1.0.0.md` |

## Terminology

Use the exact terms below throughout all documents:
- `file_uuid`: do not use `uuid` / `UUID`
- `identity_uuid`: global identity identifier
- `trace_id`: face trace ID
- `chunk_id`: sentence chunk ID
33
docs_v1.0/M4_workspace/2026-05-15_source_sync_request.md
Normal file
@@ -0,0 +1,33 @@
# M4 Source Sync Request

## Background
The `v1.0.0_29eca5a` binary delivered by M5 has been deployed to 3002. M4 completed the work below, and the source needs to be synced back to M5:

## M4 Changes

### Database
| File | Description |
|------|------|
| `release/deploy_v1.0.0_20260515.sql` | Migration: `registered_status` column + cleanup 36 unfinished files |
| `release/cleanup_3003_dev.sql` | 3003 dev schema cleanup |
| `release/migrate_add_registered_status.sql` | `registered_status` column migration |
| `release/deploy_v1.0.0_20260515.sh` | Full deployment script |

### Deployment
- Binary `29eca5a` deployed to `/target/release/momentry`, port 3002 ✅
- CHECK constraint `chk_videos_status` updated: added `unregistered`
- Python deps installed: `PyPDF2`, `python-docx`, `openpyxl`, `python-pptx`
- 36 unfinished files cleaned → `unregistered` status

### Docs
- `docs/maintenance_records/rca/RCA_MARKBASE_HTML_PREVIEW_SCREENSHOT_2026_05_15.md`: HTML preview screenshot bug RCA
- `docs_v1.0/REFERENCE/MARKBASE_DESIGN_V2.0.md`: MarkBase design
- `AGENTS.md`: Updated M4 instructions

## Sync Method
- Git push failed: `403` (M4 has no push permission on `gitea.momentry.ddns.net/warren/momentry_core.git`)
- Copied to `/Volumes/accusys/momentry_core_0.1/release/delivery/m4_sync_20260515/`
- Git commit: `d4e3853` (local only)

## M5 Action
Please pull the M4 changes from the shared volume, merge them into the main repo, and push to the git remote.
92
docs_v1.0/M4_workspace/2026-05-15_worker_crash_response.md
Normal file
@@ -0,0 +1,92 @@
# Reply to M4: Worker Crash Loop (Root Cause Analysis and Fix)

**Date**: 2026-05-15
**From**: M5
**To**: M4
**Ref**: Worker crash-loop (all jobs stuck at pending)

---

## Root Cause

`PythonExecutor::new()` uses `env!("CARGO_MANIFEST_DIR")`, which is a **compile-time constant** in Rust. When the binary was compiled on M5, the paths were hardcoded as:
```
/Users/accusys/momentry_core_0.1/venv/bin/python
/Users/accusys/momentry_core_0.1/scripts/
```

If Python or the scripts are not at these paths on the M4 production server, the worker fails immediately when it runs any processor. Because the init flow propagates the error with `?`, it then fails repeatedly (a crash loop).

## Fix

The executor now uses **runtime environment variables**:

| Env Var | Purpose | Default |
|---------|------|--------|
| `MOMENTRY_PYTHON_PATH` | Python 3.11 binary | `/opt/homebrew/bin/python3.11` |
| `MOMENTRY_SCRIPTS_DIR` | Processor scripts directory | compile-time fallback |

When a variable is unset, it falls back automatically to the original compile-time path, which preserves compatibility.

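The resolution order described above (runtime variable first, baked-in value second) can be sketched as below. It is a Python illustration of the idea, not the Rust `PythonExecutor` code: `resolve_paths` is a hypothetical name, `compiled_scripts_dir` stands in for the compile-time path, and treating an empty variable as unset is an assumption.

```python
import os
from typing import Mapping, Optional

# Default taken from the table in this reply.
DEFAULT_PYTHON = "/opt/homebrew/bin/python3.11"


def resolve_paths(
    compiled_scripts_dir: str,
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Prefer runtime env vars; fall back to the defaults when unset or empty."""
    env = os.environ if env is None else env
    return {
        "python": env.get("MOMENTRY_PYTHON_PATH") or DEFAULT_PYTHON,
        "scripts_dir": env.get("MOMENTRY_SCRIPTS_DIR") or compiled_scripts_dir,
    }
```

Passing `env` explicitly keeps the function testable without touching the real process environment.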
## M4 Deployment Steps

### 1. Set Environment Variables

```bash
export MOMENTRY_PYTHON_PATH="/path/to/your/python3.11"
export MOMENTRY_SCRIPTS_DIR="/path/to/scripts/"
export MOMENTRY_OUTPUT_DIR="/path/to/output/"
```

### 2. Update the Binary

```bash
# Get the new binary from SMB
codesign --remove-signature momentry_v1.0.0_c41f7e0
pkill momentry
DATABASE_SCHEMA=public ./momentry_v1.0.0_c41f7e0 server --port 3002 &
```

### 3. Confirm the Schema

```bash
# Confirm the schema_migrations table has the correct records
psql -U accusys -d momentry -c "SELECT filename, substring(checksum,1,16) FROM schema_migrations ORDER BY id"
# Should print 8 rows; each checksum should match the one built into the binary
```

### 4. Start the Worker

```bash
export MOMENTRY_PYTHON_PATH="/opt/homebrew/bin/python3.11"
export MOMENTRY_SCRIPTS_DIR="/Users/accusys/momentry_core_0.1/scripts"
export MOMENTRY_OUTPUT_DIR="/Users/accusys/momentry/output"

DATABASE_SCHEMA=public ./momentry_v1.0.0_c41f7e0 worker \
  --max-concurrent 2 --poll-interval 5
```

### 5. Verify

```bash
# Confirm the worker has picked up jobs (URL quoted so the shell does not glob the '?')
curl -s 'http://localhost:3002/api/v1/jobs?status=running' | jq '[.jobs[] | {id, uuid: .uuid[0:16], status}]'

# Confirm the worker log shows the SHA256 integrity check passing
# [INTEGRITY] asr_processor.py checksum OK
```

## Binary Update

The delivery package has been updated:
```
release/delivery/v1.0.0_c41f7e0_20260515_180644/momentry_v1.0.0_c41f7e0  (21 MB)
```

## Current Health Check Status (api.momentry.ddns.net)

- Build: `c41f7e0c` ✅
- Services: postgres/redis/qdrant/mongodb all ok ✅
- Schema: **0/8** (migrations needed)
- Scripts integrity: **332/345** (13 mismatches, possibly due to a scripts version difference)
- Processors: 12/12 available ✅
@@ -0,0 +1,36 @@
# M4 Reply: Worker Pool Timeout (Status After Fix)

**Date**: 2026-05-15
**From**: M4
**To**: M5
**Ref**: `2026-05-15_worker_pool_timeout_response.md`

## Status After Fix

| Item | Status |
|------|:--:|
| DB pool config | `DB_MAX_CONNECTIONS=20`, `DB_ACQUIRE_TIMEOUT=120` |
| Server | `c41f7e0c` running |
| Pool timeout | Has not recurred |
| DB | 2 completed + 36 unregistered |

## Worker Behavior

After startup, with 0 pending jobs, the worker exits cleanly (exit code 0). This is not a crash.

```
Starting job worker
  Max concurrent: Some(2)
  Poll interval: Some(5)
  Batch size: None
→ exit 0
```

M5's reply said the worker "sleeps when there is no work", but the actual behavior is a clean exit. Please confirm:
1. When there are no pending jobs, should the worker idle (keep polling) or exit?
2. Does `--batch-size` need to be set?
3. When there are jobs, does the worker process them normally?

## To Be Tested

Test the full worker flow once there is a pending job. There are currently 0 pending jobs, so the worker's clean exit does not affect the system.
@@ -0,0 +1,103 @@
# M5 Reply: Worker Pool Timeout + Schema Issue

**Date**: 2026-05-15
**From**: M5
**To**: M4
**Ref**: `2026-05-15_worker_status_report.md`

---

## 1. Worker Schema: Code Check

M4's report says `src/worker/job_worker.rs` uses `dev.monitor_jobs`. M5 has confirmed that **the current binary (`c41f7e0c`) does not have this problem**:

```rust
// job_worker.rs:70-71: already uses schema::table_name()
let monitor_jobs_table = schema::table_name("monitor_jobs");
let processor_results_table = schema::table_name("processor_results");
```

`schema::table_name()` prefixes the table name automatically based on the `DATABASE_SCHEMA` env var. With `DATABASE_SCHEMA=public`, it produces `public.monitor_jobs`. No further fix is needed.

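The prefixing behavior can be illustrated in a few lines. This is a Python analogue for explanation only; the real `schema::table_name()` is Rust and its source is not shown here. The `public` default when the variable is unset and the identifier check are both assumptions made for this sketch.

```python
import os
import re
from typing import Mapping, Optional

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def table_name(
    table: str,
    env: Optional[Mapping[str, str]] = None,
    default_schema: str = "public",  # assumed default, not confirmed by the source
) -> str:
    """Return "<schema>.<table>", with the schema taken from DATABASE_SCHEMA."""
    env = os.environ if env is None else env
    schema = env.get("DATABASE_SCHEMA") or default_schema
    for ident in (schema, table):
        # The name is spliced into SQL text, so only plain identifiers are allowed.
        if not _IDENT.match(ident):
            raise ValueError(f"invalid SQL identifier: {ident!r}")
    return f"{schema}.{table}"
```

Because every query builds its table name through one function, switching between the `dev` and `public` schemas needs no source change.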
## 2. Pool Timeout Root Cause

Error message:
```
pool timed out while waiting for an open connection
```

Cause: **the DB pool is undersized**. The defaults are `max_connections=10` and `acquire_timeout=60s`. The worker and the API server share the same database; if all 10 connections are in use, the worker cannot obtain a connection during init.

### Solution

Set the environment variables:

```bash
export DB_MAX_CONNECTIONS=20
export DB_ACQUIRE_TIMEOUT=120
```

| Env Var | Default | Recommended | Description |
|---------|--------|--------|------|
| `DB_MAX_CONNECTIONS` | 10 | 20 | Maximum number of connections (shared by worker + server) |
| `DB_ACQUIRE_TIMEOUT` | 60 | 120 | Timeout waiting for a connection (seconds) |

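To see why an exhausted pool produces this error, consider a toy pool built on a semaphore. It is a simplified model written for this explanation, not the pool the binary actually uses; `TinyPool` is a hypothetical class, and only the error text is copied from the log.

```python
import threading


class TinyPool:
    """Toy connection pool: at most max_connections leases at any time."""

    def __init__(self, max_connections: int, acquire_timeout: float) -> None:
        self._slots = threading.BoundedSemaphore(max_connections)
        self._timeout = acquire_timeout

    def acquire(self) -> None:
        # Semaphore.acquire returns False when the timeout expires.
        if not self._slots.acquire(timeout=self._timeout):
            raise TimeoutError("pool timed out while waiting for an open connection")

    def release(self) -> None:
        self._slots.release()
```

Raising the connection limit adds slots, while raising the acquire timeout only lengthens the wait before the same error appears, which is why the two settings are adjusted together.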
## 3. How to Start the Worker

```bash
export DATABASE_SCHEMA=public
export DB_MAX_CONNECTIONS=20
export DB_ACQUIRE_TIMEOUT=120
export MOMENTRY_PYTHON_PATH="/opt/homebrew/bin/python3.11"
export MOMENTRY_SCRIPTS_DIR="/Users/accusys/momentry_core_0.1/scripts"
export MOMENTRY_OUTPUT_DIR="/Users/accusys/momentry/output"

nohup ./momentry_v1.0.0_c41f7e0 worker \
  --max-concurrent 2 \
  --poll-interval 5 \
  > /Users/accusys/momentry/log/momentry_worker.log 2>&1 &
```

## 4. Worker Clean Exit: Root Cause

M4 reported that the worker exits cleanly (exit code 0) with 0 pending jobs. M5 checked and found that **the worker handler in the production binary (`main.rs`) is a stub!**

```
src/main.rs:215  // TODO: Implement worker logic → Ok(())
```

This means the `worker` command of the production `momentry` binary was never actually implemented. The worker logic existed only in `momentry_playground` (the dev binary).

### Fix

The full worker implementation has been added back to `main.rs`. The updated binary now supports:
- `./momentry worker --max-concurrent 2 --poll-interval 5` ✅
- With no pending jobs, the worker **idles (keeps polling)** and does not exit
- When there are jobs, it processes the pipeline automatically

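The difference between the stub and the fixed behavior comes down to the shape of the loop. The sketch below shows an idle-and-poll loop in Python under stated assumptions: `run_worker`, `fetch_pending`, and `process` are hypothetical names, `max_polls` exists only so the example terminates, and concurrency (`--max-concurrent`) is not modeled.

```python
import time
from typing import Callable, Optional, Sequence


def run_worker(
    fetch_pending: Callable[[], Sequence[object]],
    process: Callable[[object], None],
    poll_interval: float = 5.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll for jobs until max_polls is reached (forever when None).

    Returns the number of jobs processed. An empty poll sleeps and retries
    instead of returning, so the worker idles rather than exiting.
    """
    processed = 0
    polls = 0
    while max_polls is None or polls < max_polls:
        polls += 1
        jobs = fetch_pending()
        if not jobs:
            sleep(poll_interval)  # idle: wait, then poll again
            continue
        for job in jobs:
            process(job)
            processed += 1
    return processed
```

A stub that returns immediately corresponds to this function with the loop removed, which matches the clean exit M4 observed.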
## 5. Currently 0 Pending Jobs: Does the Worker Need to Run?

Yes. 35 files currently have status `unregistered`. Once these files enter the system through the registration API, the worker must process the pipeline. We recommend starting the worker now to confirm it is stable.

## 6. Binary Update (Important)

**Please download the binary again.** This fix includes:
1. Worker handler changed from a stub to a full implementation (main.rs)
2. `PythonExecutor` now uses env vars (not the compile-time path)

```
release/delivery/v1.0.0_c41f7e0_20260515_180644/momentry_v1.0.0_c41f7e0  (27 MB)
```

Test the worker:
```bash
export DB_MAX_CONNECTIONS=20
export DB_ACQUIRE_TIMEOUT=120
export MOMENTRY_PYTHON_PATH="/opt/homebrew/bin/python3.11"
export MOMENTRY_SCRIPTS_DIR="/Users/accusys/momentry_core_0.1/scripts"
export MOMENTRY_OUTPUT_DIR="/Users/accusys/momentry/output"

nohup ./momentry_v1.0.0_c41f7e0 worker \
  --max-concurrent 2 --poll-interval 5 \
  > /Users/accusys/momentry/log/momentry_worker.log 2>&1 &
```
85
docs_v1.0/M4_workspace/2026-05-15_worker_status_report.md
Normal file
@@ -0,0 +1,85 @@
# M4 Report: 3002 Worker Status

**Date**: 2026-05-15
**From**: M4
**To**: M5

## Current Worker Status

| Item | Status |
|------|------|
| Worker process | ❌ Not running |
| Worker log | 139,637 crash entries (`pool timed out while waiting for an open connection`) |
| `public.monitor_jobs` | 10 jobs (0 pending, 5 cancelled, 4 failed, 1 completed) |
| Auto-resume | ✅ No longer creating duplicate jobs |

## Problems Found

### 1. Worker Crash Loop

The worker log (`/Users/accusys/momentry/log/momentry_worker.log`) shows the worker repeatedly starting and crashing:

```
Starting job worker
  Max concurrent: Some(2)
Error: pool timed out while waiting for an open connection
Starting job worker    ← restart
Error: pool timed out while waiting for an open connection    ← crashes again
...(139,637 entries)
```

### 2. Hardcoded Schema

The worker source code (`src/worker/job_worker.rs:68-81`) uses `dev.monitor_jobs`:

```rust
sqlx::query(
    "UPDATE dev.monitor_jobs SET status = 'pending', updated_at = NOW()
     WHERE status = 'running'
     AND id NOT IN (
         SELECT DISTINCT job_id FROM dev.processor_results
         WHERE status IN ('pending', 'running')
     )",
)
```

However, 3002 production uses `DATABASE_SCHEMA=public`. If the worker is started with `public`, the stale-job reset queries a `dev` schema that does not exist.

### 3. Duplicate Job Creation

During the worker crash/restart loop, every start added a job to `public.monitor_jobs`:

| job id | file_uuid | Created at |
|--------|-----------|----------|
| 149 | `dd61fda8...` | 19:31 |
| 150 | `dd61fda8...` | 19:37 |
| 151 | `dd61fda8...` | 19:40 |
| 152 | `dd61fda8...` | 19:44 |

A new job was added for the same file_uuid every 3-6 minutes. M4 has cleaned these up (DELETE 4 + UPDATE 4 → cancelled).

### 4. DB Connection Pool Configuration

Settings inside the binary:
```
DB_MAX_CONNECTIONS   DB_ACQUIRE_TIMEOUT
```
These may be set too low, causing `pool timed out`.

## M4 Questions

1. How should the worker be started? Which env vars / schema should it use?
2. Should the worker's schema follow the `DATABASE_SCHEMA` env var (rather than hardcoding `dev`)?
3. What are the recommended DB pool settings?
4. There are currently 0 pending jobs; does the worker need to run?

## Related Paths

| Path | Description |
|------|------|
| `/Users/accusys/momentry/log/momentry_worker.log` | Worker log (139,637 crash entries) |
| `/Users/accusys/momentry/log/momentry_worker.error.log` | Worker error log |
| `public.monitor_jobs` | Jobs table (production schema) |
| `public.processor_results` | Processor results |
| `src/worker/job_worker.rs` | Worker source (hardcoded `dev` schema) |
| `DATABASE_SCHEMA=public` | Production env var |