docs: file_uuid generation rules for M4

Accusys
2026-05-17 02:26:09 +08:00
parent 3a6c186575
commit eec2eea880
79 changed files with 23293 additions and 0 deletions

@@ -0,0 +1,120 @@
# Face Pipeline: Detection → Clustering → Trace
**Date**: 2026-05-16
---
## Pipeline
```
Video Frames
┌─────────────────────────────┐
│ 0. Cut Detection │ PySceneDetect
│ scene boundaries │ → chunk (chunk_type='cut')
└─────────────────────────────┘
┌─────────────────────────────┐
│ 1. Face Detection │ detect faces in every frame
│ confidence ≥ 0.5 │ → face_detections (cut_id = the cut that contains the frame)
└─────────────────────────────┘
┌─────────────────────────────┐
│ 2. Face Clustering │ embedding + IoU + distance
│ trace_id assignment │ same person + same cut → same trace_id
│ per-file sequential │ trace_id numbering continues across cuts (never reset to 0)
└─────────────────────────────┘
┌─────────────────────────────┐
│ 3. Face Trace │ continuous tracking across frames
│ per-file sequential │ trace_id = 0, 1, 2, ...
│ scoped by cut │ each trace lies entirely within one cut
└─────────────────────────────┘
┌─────────────────────────────┐
│ 4. Identity Binding │ embedding comparison
│ identity_id assignment │ → known person / stranger
└─────────────────────────────┘
```
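The numbering in stages 2 and 3 can be sketched as follows. This is a minimal illustration, not the code in `scripts/face_tracker.py`: the function name `assign_trace_ids` and the input shape (a list of `(cut_id, tracks)` pairs in file order) are assumptions made for the example.

```python
from itertools import count


def assign_trace_ids(tracks_by_cut):
    """Assign per-file sequential trace_ids.

    tracks_by_cut: list of (cut_id, [track, ...]) pairs in file order
    (hypothetical input shape). Returns (trace_id, cut_id, track) tuples.
    """
    next_id = count(0)  # one counter per file, never reset at a cut boundary
    assigned = []
    for cut_id, tracks in tracks_by_cut:
        for track in tracks:
            # Each track belongs to exactly one cut, so the trace does too.
            assigned.append((next(next_id), cut_id, track))
    return assigned
```

With two tracks in the first cut and one in the second, the third track receives `trace_id` 2 rather than restarting at 0.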
## Scope
```sql
trace_id     per-file sequential; unique as (file_uuid, trace_id)
cut_id       chunk.id WHERE chunk_type='cut'; the scope a trace stays within
identity_id  global FK; not limited to one cut / file
```
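## Constraints
| Constraint | Description |
|------|------|
| Unique | `(file_uuid, trace_id)` |
| Single cut | Each trace lies entirely within one cut (`0` traces span cuts) |
| Independent | `trace_id` ≠ `identity_id`. The former is an object track; the latter distinguishes identities |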
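The single-cut constraint can be checked with a short script. The sketch below assumes each `face_detections` row is available as a dict with `file_uuid`, `trace_id`, and `cut_id` keys. The row shape and the function name `find_cross_cut_traces` are assumptions for illustration.

```python
from collections import defaultdict


def find_cross_cut_traces(rows):
    """Return the (file_uuid, trace_id) keys whose faces span more than one cut.

    rows: iterable of dicts with file_uuid, trace_id, cut_id (assumed shape).
    An empty result means the single-cut constraint holds.
    """
    cuts_per_trace = defaultdict(set)
    for row in rows:
        key = (row["file_uuid"], row["trace_id"])
        cuts_per_trace[key].add(row["cut_id"])
    return {key for key, cuts in cuts_per_trace.items() if len(cuts) > 1}
```

Grouping by `(file_uuid, trace_id)` rather than `trace_id` alone matters because the same `trace_id` value can legitimately occur in different files.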
## Data Volume per Stage
```
Stage                   | Count       | Key
------------------------|-------------|----------------------
Raw faces               | 262,021     | face_detections rows
After clustering        | 6,892       | distinct trace_id
With identity           | 147,602     | identity_id NOT NULL (2,035 identities)
Stranger (unbound)      | 114,419     | identity_id IS NULL
```
## Trace Size Distribution
| Faces per trace | Trace count | Description |
|:---------------:|:-----------:|------|
| 1 | 610 | Flashes by |
| 2-5 | 969 | Brief appearance |
| 6-20 | 1,541 | Fragment |
| 21-100 | 2,218 | Typical |
| 101+ | 1,554 | Main character |
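The five bucket counts sum to 6,892, which matches the number of distinct `trace_id` values reported above. A distribution like this can be computed from per-trace face counts. The sketch below is one way to do it; the names `BUCKETS`, `bucket_label`, and `size_distribution` are hypothetical.

```python
from collections import Counter

# (low, high, label) ranges copied from the table; anything above 100 is "101+".
BUCKETS = [(1, 1, "1"), (2, 5, "2-5"), (6, 20, "6-20"), (21, 100, "21-100")]


def bucket_label(face_count):
    """Map a trace's face count to its table bucket."""
    if face_count < 1:
        raise ValueError("a trace holds at least one face")
    for low, high, label in BUCKETS:
        if low <= face_count <= high:
            return label
    return "101+"


def size_distribution(face_counts):
    """Count traces per bucket, given one face count per trace."""
    return Counter(bucket_label(n) for n in face_counts)
```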
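## Clustering Method
Face Tracker (`scripts/face_tracker.py`) uses three measures to decide whether two detections are the same person:
1. **IoU (Intersection over Union)**: bounding-box overlap between consecutive frames
2. **Cosine distance**: similarity of the face embeddings
3. **Euclidean distance**: distance between bbox centers
The three are combined into one decision: `iou > 0.5 || (cosine < 0.3 && distance < 100px)`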
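The decision rule can be written out as a small function. This is a sketch under stated assumptions, not the contents of `scripts/face_tracker.py`: boxes are taken to be `(x1, y1, x2, y2)` in pixels, embeddings are plain sequences of floats, `cosine` is read as cosine distance (1 minus cosine similarity), and all function names are hypothetical.

```python
import math


def iou(a, b):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def cosine_distance(u, v):
    """1 - cosine similarity; returns 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / norm if norm > 0 else 1.0


def center_distance(a, b):
    """Euclidean distance in pixels between the centers of two boxes."""
    ax, ay = (a[0] + a[2]) / 2, (a[1] + a[3]) / 2
    bx, by = (b[0] + b[2]) / 2, (b[1] + b[3]) / 2
    return math.hypot(ax - bx, ay - by)


def same_person(box_a, emb_a, box_b, emb_b):
    """iou > 0.5 || (cosine < 0.3 && distance < 100px)"""
    return iou(box_a, box_b) > 0.5 or (
        cosine_distance(emb_a, emb_b) < 0.3
        and center_distance(box_a, box_b) < 100
    )
```

Strong box overlap is sufficient on its own. When the overlap is weak, the embedding and the position must both agree, so a similar-looking face far across the frame is not merged into the track.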
## Trace Structure
```json
{
"trace_id": 2, // per-file sequential
"faces": [ // face_detections GROUP BY trace_id
{"face_id": "4587_0", "frame": 4587, "confidence": 0.92},
{"face_id": "4588_0", "frame": 4588, "confidence": 0.91},
...
],
"start_frame": 4587,
"end_frame": 4722,
"face_count": 46,
"identity_id": 101 // NULL = stranger
}
```
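A record of this shape can be assembled from the detections that share a `trace_id`. The sketch below assumes each detection is a dict with `face_id`, `frame`, and `confidence`, as in the example above. The function name `build_trace` is hypothetical, and Python's `None` stands in for `NULL`.

```python
def build_trace(trace_id, detections, identity_id=None):
    """Group one trace's detections into the record shape shown above.

    detections: non-empty list of dicts with face_id, frame, confidence.
    identity_id: None means the trace is an unbound stranger.
    """
    if not detections:
        raise ValueError("a trace needs at least one detection")
    faces = sorted(detections, key=lambda d: d["frame"])
    return {
        "trace_id": trace_id,
        "faces": faces,
        "start_frame": faces[0]["frame"],
        "end_frame": faces[-1]["frame"],
        "face_count": len(faces),
        "identity_id": identity_id,
    }
```

`face_count` counts detections, not frames, so it can be smaller than `end_frame - start_frame + 1` when some frames in the range have no detection.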
## API Queries
```bash
# List traces (with face_count and frame range)
POST /api/v1/file/:uuid/face_trace/sortby
# Faces within a trace (per frame, with optional interpolation)
GET /api/v1/file/:uuid/trace/:trace_id/faces
# Bind a trace to an identity
POST /api/v1/identity/:uuid/bind
```
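For client code, the `:uuid` and `:trace_id` placeholders in the two file-scoped routes can be filled in with small helpers. This sketch only builds paths. The helper names are hypothetical, and request bodies, query parameters, and the host are not specified here, so they are left out.

```python
from urllib.parse import quote


def trace_list_path(file_uuid):
    """Path for the trace listing (POST)."""
    return f"/api/v1/file/{quote(str(file_uuid), safe='')}/face_trace/sortby"


def trace_faces_path(file_uuid, trace_id):
    """Path for the faces within one trace (GET)."""
    uuid_part = quote(str(file_uuid), safe="")
    return f"/api/v1/file/{uuid_part}/trace/{int(trace_id)}/faces"
```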