- scripts/release_pack.py: packages output_json + schema + chunks + vectors
- Phase 1: triggered after ASR+ASRX+Rule 1+vectorization (sentence chunk delivery)
- Phase 2: triggered after full pipeline + 5W1H Agent (full delivery)
- Both phases include all available {uuid}.*.json files
- Non-overlapping directories: release/phase1/ and release/phase2/
Phased Release Delivery
Phase Breakdown
Phase 1: Sentence Chunk Embedding Delivery
Trigger: ASR + ASRX complete + Rule 1 Ingestion complete
Deliverables:
- {uuid}.asr.json
- {uuid}.asrx.json
- chunks (chunk_type = 'sentence')
- chunk_vectors (sentence embedding)
- DB schema + chunks table data
Purpose: end users can run semantic search
Phase 2: 5W1H Summary Chunk Embedding Delivery
Trigger: all processors complete + Rule 3 Ingestion + 5W1H Agent
Deliverables:
- Everything in Phase 1
- {uuid}.cut.json
- {uuid}.yolo.json
- {uuid}.face.json
- {uuid}.pose.json
- {uuid}.ocr.json
- chunks (chunk_type = 'cut', 'visual', 'trace', 'story')
- chunk_vectors (summary embedding)
- identities / identity_bindings / face_detections
Purpose: full search + summaries + person identification
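The two deliverable lists above can be written down as data, which makes the "Phase 2 includes everything in Phase 1" rule explicit. The following is only a sketch: `PhaseSpec`, `PHASE1`, `PHASE2`, and `expected_json_files` are hypothetical names for illustration and are not taken from scripts/release_pack.py.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseSpec:
    """Hypothetical description of what one release phase ships."""
    json_suffixes: tuple[str, ...]      # the "asr" in {uuid}.asr.json
    chunk_types: tuple[str, ...]        # values of chunks.chunk_type
    extra_tables: tuple[str, ...] = ()  # tables beyond chunks / chunk_vectors


# Phase 1: sentence chunk delivery.
PHASE1 = PhaseSpec(
    json_suffixes=("asr", "asrx"),
    chunk_types=("sentence",),
)

# Phase 2: everything in Phase 1 plus the remaining processor outputs.
PHASE2 = PhaseSpec(
    json_suffixes=PHASE1.json_suffixes + ("cut", "yolo", "face", "pose", "ocr"),
    chunk_types=PHASE1.chunk_types + ("cut", "visual", "trace", "story"),
    extra_tables=("identities", "identity_bindings", "face_detections"),
)


def expected_json_files(uuid: str, spec: PhaseSpec) -> list[str]:
    """List the {uuid}.*.json file names a phase is expected to contain."""
    return [f"{uuid}.{suffix}.json" for suffix in spec.json_suffixes]
```

Deriving `PHASE2` from `PHASE1` rather than repeating the entries keeps the two lists from drifting apart if Phase 1 later gains a file.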
Worker Pipeline Integration
ASR complete → ASRX complete
↓
Rule 1 Ingestion (sentence chunks)
↓
Phase 1 Release Packaging ← automatic
↓
Remaining processors continue
↓
Rule 3 Ingestion (cut chunks + 5W1H summary)
↓
Phase 2 Release Packaging ← automatic
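One way to drive the automatic packaging steps in the flow above is a pure function that, given the set of completed stages, reports which phases are due. This is a sketch under assumptions: the stage names, `PHASE1_REQUIRES`, `PHASE2_REQUIRES`, and `phases_ready` are all hypothetical, and "all processors" is taken to mean the five processors whose JSON outputs are listed for Phase 2. The commit summary also lists vectorization as a Phase 1 precondition; if the worker tracks that as a separate stage, it would be added to the first set.

```python
from __future__ import annotations

# Hypothetical stage names; the real worker may label them differently.
PHASE1_REQUIRES = frozenset({"asr", "asrx", "rule1_ingestion"})
PHASE2_REQUIRES = PHASE1_REQUIRES | {
    "cut", "yolo", "face", "pose", "ocr",  # assumed set of remaining processors
    "rule3_ingestion",
    "agent_5w1h",
}


def phases_ready(completed: set[str], already_released: set[str]) -> list[str]:
    """Return the phases whose prerequisites are met and not yet packaged."""
    ready = []
    if "phase1" not in already_released and PHASE1_REQUIRES <= completed:
        ready.append("phase1")
    if "phase2" not in already_released and PHASE2_REQUIRES <= completed:
        ready.append("phase2")
    return ready
```

A worker could call this after each stage finishes and invoke the packaging script for every phase returned; tracking `already_released` keeps a phase from being packaged twice.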
Output Directory Structure
release/
├── phase1/
│   ├── v1.0.0_20260509_120000/
│   │   ├── output_json/       ← asr.json, asrx.json
│   │   ├── schema.sql         ← chunks table DDL
│   │   ├── chunks.csv         ← sentence chunks data
│   │   ├── vectors.csv        ← sentence embeddings
│   │   └── RELEASE_INFO.txt
│   └── latest → v1.0.0_20260509_120000
│
└── phase2/
    ├── v1.0.0_20260509_140000/
    │   ├── output_json/       ← all processor outputs
    │   ├── schema.sql         ← full schema
    │   ├── chunks.csv         ← all chunks
    │   ├── vectors.csv        ← all embeddings
    │   ├── identities.csv     ← person identities
    │   └── RELEASE_INFO.txt
    └── latest → v1.0.0_20260509_140000
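The layout above suggests a directory name built from a version and a timestamp, plus a `latest` symlink per phase. A minimal sketch of creating such a directory and repointing `latest` follows. It assumes the name format is `v{version}_{YYYYMMDD}_{HHMMSS}` (inferred from the example names only) and a POSIX filesystem; `make_release_dir` is a hypothetical helper, not the actual code in scripts/release_pack.py.

```python
from __future__ import annotations

import os
from datetime import datetime
from pathlib import Path


def make_release_dir(root: Path, phase: str, version: str, now: datetime) -> Path:
    """Create release/<phase>/v<version>_<timestamp>/ and repoint `latest`."""
    # Assumed naming scheme, inferred from v1.0.0_20260509_120000.
    name = f"v{version}_{now:%Y%m%d_%H%M%S}"
    release_dir = root / phase / name
    # Fails with FileExistsError if this exact release was already created.
    (release_dir / "output_json").mkdir(parents=True, exist_ok=False)

    # Build the new link under a temporary name, then rename it over the
    # old one so readers never observe a missing `latest`.
    tmp_link = release_dir.parent / "latest.tmp"
    if tmp_link.is_symlink() or tmp_link.exists():
        tmp_link.unlink()
    tmp_link.symlink_to(name)  # relative target, so the tree stays relocatable
    os.replace(tmp_link, release_dir.parent / "latest")
    return release_dir
```

Because each phase has its own parent directory and its own `latest` link, a Phase 2 release never overwrites or repoints anything under phase1/.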