docs: comply with V1.0 docs standard — add frontmatter, info table, English content

Accusys
2026-05-15 12:09:34 +08:00
parent e4e3e25170
commit e4330a9704

@@ -1,4 +1,6 @@
---
document_type: "reference_doc"
service: "MOMENTRY_CORE"
title: "File Lifecycle — Pre-Processing & Registration"
version: "V1.0"
date: "2026-05-15"
@@ -6,19 +8,26 @@ author: "M5"
status: "draft"
---
# File Lifecycle — Pre-Processing & Registration (All Managed Files)
# File Lifecycle — Pre-Processing & Registration
| Item | Value |
|------|-------|
| Scope | All managed file types (video, image, document, spreadsheet, presentation) |
| Status | Draft |
| Applies to | Watcher pre-processor + Register API |
| Key concept | Two-phase flow: birth certificate (`.pre.json`) → civil registration (DB INSERT) |
> **Applicable to all managed file types**: video, image, document (pdf, docx, pages, key, numbers), spreadsheet, presentation, and any other file registered in the system. The pre-processor registers any file type found by the watcher. ffprobe is used when applicable; files that ffprobe cannot parse receive minimal filesystem metadata as a fallback.
## Metaphor
```
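SHA256 = DNA or fingerprint (the one unchanging biometric trait)
file created time = moment of birth
birthday (UUID anchor) = birth timestamp
.pre.json = birth certificate document
POST /api/v1/files/register = household registration
status = registered = household registration completed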
SHA256 = DNA or fingerprint (immutable biometric identity)
file created time = birth moment
birthday (UUID anchor) = birth timestamp
.pre.json = birth certificate
POST /api/v1/files/register = civil registration
status = registered = citizenship completed
```
## Two-Phase Flow
@@ -43,7 +52,7 @@ File watcher (`src/watcher/watcher.rs`) polls monitored directories every 60 sec
→ birthday = file creation time (RFC 3339)
2. SHA256(full file, streaming 64KB chunks)
→ content_hash = 256-bit digest, 64-character hex string (file DNA/fingerprint)
→ content_hash = 256-bit digest, 64-character hex string (file DNA / fingerprint)
3. ffprobe (or minimal fs metadata fallback for non-video)
→ probe_json
@@ -88,12 +97,12 @@ Stored alongside other processor outputs:
The `birthday` is the file's creation time, obtained from `fs::metadata().created()`. This is the **true birth time** of the file, not the registration time.
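As a hedged illustration, the creation-time lookup can be sketched in Python (chosen for brevity; the watcher itself is Rust). The helper names `to_rfc3339` and `birthday` are hypothetical. The fallback to modification time, used when the `os.stat` result has no `st_birthtime`, is an assumption of this sketch and not something the source specifies.

```python
import os
from datetime import datetime, timezone


def to_rfc3339(epoch_seconds: float) -> str:
    """Format a Unix timestamp as an RFC 3339 UTC string with second precision."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


def birthday(path: str) -> str:
    """Return the file's creation time as an RFC 3339 string."""
    st = os.stat(path)
    # st_birthtime is only present on some platforms (for example macOS and the BSDs).
    # Assumption: where it is missing, fall back to the modification time.
    created = getattr(st, "st_birthtime", st.st_mtime)
    return to_rfc3339(created)
```

For example, `to_rfc3339(1778811300)` yields `2026-05-15T02:15:00Z`.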
```
birthday = 2026-05-15T02:15:00Z ← the file's birth time; it never changes
birthday = 2026-05-15T02:15:00Z ← file birth time, never changes
file_uuid = SHA256(mac | birthday | path | filename)
Same file: same path + filename → same UUID, no matter how many times it is registered
Different files: different content_hash → different UUID, even with the same name
Same file: same path + filename → same UUID, regardless of registration count
Different files: different content_hash → different UUID (even if same name)
```
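The two hashes used in pre-processing can be sketched as follows: a minimal Python illustration, not the watcher's actual Rust code. The function names `content_hash` and `file_uuid` are hypothetical. Joining the four UUID fields with a literal `|` and encoding them as UTF-8 is an assumption; the source gives only the formula `SHA256(mac | birthday | path | filename)`.

```python
import hashlib

CHUNK_SIZE = 64 * 1024  # 64 KB, the chunk size named in the pre-processing steps


def content_hash(path: str) -> str:
    """Stream the whole file through SHA-256 and return the 64-character hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large media files never sit fully in memory.
        for chunk in iter(lambda: fh.read(CHUNK_SIZE), b""):
            digest.update(chunk)
    return digest.hexdigest()


def file_uuid(mac: str, birthday: str, path: str, filename: str) -> str:
    """Derive a deterministic identifier from the four fields in the formula."""
    # Assumption: fields are joined with "|" and encoded as UTF-8 before hashing.
    preimage = "|".join([mac, birthday, path, filename])
    return hashlib.sha256(preimage.encode("utf-8")).hexdigest()
```

Because every input is fixed for a given file, calling `file_uuid` again with the same four values returns the same identifier. As the formula is written, `content_hash` is not one of its inputs, so in this sketch two files that share a name get different UUIDs only through a different `mac`, `birthday`, or `path`.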
## Phase 2: Registration (Citizenship)
@@ -115,11 +124,11 @@ curl -X POST http://localhost:3002/api/v1/files/register \
└─ Not exists OR hash mismatch → compute fresh (existing logic)
2. Dedup check: SELECT file_uuid FROM videos WHERE content_hash = $1
├─ Found → already_exists: true (identical DNA = same person)
├─ Found → already_exists: true (identical DNA = same file)
└─ Not found → continue
3. Name conflict check + auto-rename if needed
└─ charade.mp4 → charade (1).mp4 (same name, different DNA)
└─ charade.mp4 → charade (1).mp4 (same name, different content)
4. INSERT INTO videos (
file_uuid, file_path, file_name, file_type,
@@ -148,8 +157,8 @@ File detected by watcher
[Pre-Processor]
├─ SHA256 (DNA/fingerprint)
├─ ffprobe (vital signs)
├─ SHA256 (DNA / fingerprint)
├─ ffprobe (metadata extraction)
└─ UUID (birth certificate ID)