docs: add processor refactoring assessment from M5Max128 workspace research

M5Max128
2026-05-27 03:59:13 +08:00
parent 955282e587
commit c85794292a


@@ -0,0 +1,352 @@
---
title: Processor Refactoring Assessment (M5Max128 Research)
version: 1.0
date: 2026-05-27
author: M5Max128
status: reference
---
# Processor Refactoring Assessment
> **Scope**: Research notes from M5Max128, intended as an implementation reference for M5Max48
> **Workspace**: ~/workspace/ (22 modules)
## Executive Summary
This assessment evaluates 22 processor modules for refactoring feasibility: which to move to Swift, which to move to Rust, and which to keep in Python.
### Priority Matrix
| Phase | Language | Modules | Effort | Benefit |
|-------|----------|---------|--------|---------|
| 1 | Swift | OCR, Pose, Face | Low | Remove Python wrappers |
| 2 | Rust | TKG, Resume, Redis | Low | Remove infrastructure deps |
| 3 | Rust | Cut | Medium | Pure CPU logic |
| 4 | Swift | YOLO | Medium | ANE acceleration |
| 5 | Python | Others (keep) | - | ML/LLM dependencies |
---
## Phase 1: Swift Modules (Immediate Gain)
### workspace_ocr
| Metric | Value |
|--------|-------|
| Swift Suitability | 10/10 |
| Current State | Thin Python wrapper around swift_ocr |
| Refactoring | Delete Python wrapper, Rust calls swift_ocr directly |
| LOC Change | Python: -122, Rust: ~50 |
| Risk | Low |
| Effort | 1 day |
**Current Architecture**:
```
Rust (ocr.rs) → PythonExecutor → ocr_processor.py → subprocess → swift_ocr
```
**Target Architecture**:
```
Rust (ocr.rs) → subprocess → swift_ocr
```
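A minimal sketch of the direct call, using only `std::process::Command`. The `--input` and `--output` flags and the helper names `swift_command` and `run_swift_tool` are hypothetical; the actual swift_ocr command-line interface is not specified in this assessment.

```rust
use std::io;
use std::path::Path;
use std::process::Command;

/// Build the command that invokes a Swift helper binary directly,
/// without going through the Python wrapper.
///
/// ASSUMPTION: the `--input` / `--output` flags are placeholders; adjust
/// them to whatever the real binary accepts.
fn swift_command(binary: &str, video: &Path, output_json: &Path) -> Command {
    let mut cmd = Command::new(binary);
    cmd.arg("--input").arg(video).arg("--output").arg(output_json);
    cmd
}

/// Run the helper and turn a non-zero exit status into an error that
/// carries the helper's stderr.
fn run_swift_tool(binary: &str, video: &Path, output_json: &Path) -> io::Result<()> {
    let output = swift_command(binary, video, output_json).output()?;
    if output.status.success() {
        Ok(())
    } else {
        let stderr = String::from_utf8_lossy(&output.stderr);
        Err(io::Error::new(
            io::ErrorKind::Other,
            format!("{binary} exited with {}: {}", output.status, stderr.trim()),
        ))
    }
}
```

Because the binary name is a parameter, the same helper would cover swift_pose once its wrapper is removed.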
### workspace_pose
| Metric | Value |
|--------|-------|
| Swift Suitability | 10/10 |
| Current State | Thin Python wrapper around swift_pose |
| Refactoring | Delete Python wrapper, Rust calls swift_pose directly |
| LOC Change | Python: -150, Rust: ~50 |
| Risk | Low |
| Effort | 1 day |
**Current Architecture**:
```
Rust (pose.rs) → PythonExecutor → pose_processor.py → subprocess → swift_pose
```
**Target Architecture**:
```
Rust (pose.rs) → subprocess → swift_pose
```
### workspace_face
| Metric | Value |
|--------|-------|
| Swift Suitability | 9/10 |
| Current State | Swift detect + Python embedding (FaceNet CoreML) |
| Refactoring | Merge detection + embedding into single Swift binary |
| LOC Change | Python: -337, Swift: +100 (embedding) |
| Risk | Medium |
| Effort | 2-3 days |
**Current Architecture**:
```
Stage 1: Python → swift_face (Vision detect) → bbox + landmarks
Stage 2: Python → OpenCV crop → CoreML FaceNet → 512D embedding
```
**Target Architecture**:
```
Swift: Vision detect → crop → VNCoreMLModel (FaceNet) → embedding → face.json
```
### workspace_face_recognition
| Metric | Value |
|--------|-------|
| Status | **Superseded** |
| Recommendation | Do not refactor. Archive/remove. |
| Note | Replaced by face_processor.py (Apple Vision + CoreML) |
---
## Phase 2: Rust Modules (Infrastructure)
### workspace_tkg
| Metric | Value |
|--------|-------|
| Rust Suitability | **10/10** |
| Current State | Python psycopg2 + SQL queries |
| Dependencies | PostgreSQL, JSON I/O (no ML) |
| Refactoring | Pure Rust with sqlx/tokio-postgres |
| LOC Change | Python: -469, Rust: ~350 |
| Risk | Low |
| Effort | 1-2 days |
**Graph Structure**:
```
NODES:
(face_trace) - one per trace_id
(object) - one per YOLO class
(speaker) - one per speaker_id
EDGES:
(face) -[:CO_OCCURS_WITH]-> (object) same frame
(face) -[:SPEAKS_AS]-> (speaker) temporal overlap
(face) -[:CO_OCCURS_WITH]-> (face) same frame
```
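The edge rules above can be sketched in plain Rust before any SQL is written. The `Node` and `Edge` types, the function names, and the `(start, end)` span representation are all hypothetical; storing each face-to-face edge once with the smaller trace id first is also an assumption, not something the current Python code is known to do.

```rust
use std::collections::BTreeSet;

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Node {
    FaceTrace(u32), // one per trace_id
    Object(String), // one per YOLO class
    Speaker(u32),   // one per speaker_id
}

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Edge {
    CoOccursWith(Node, Node),
    SpeaksAs(Node, Node),
}

/// CO_OCCURS_WITH edges implied by the faces and objects seen in one frame.
fn co_occurrence_edges(faces: &[u32], objects: &[&str]) -> BTreeSet<Edge> {
    let mut edges = BTreeSet::new();
    for (i, &face) in faces.iter().enumerate() {
        // face -> object, same frame
        for &object in objects {
            edges.insert(Edge::CoOccursWith(
                Node::FaceTrace(face),
                Node::Object(object.to_string()),
            ));
        }
        // face -> face, same frame (each unordered pair stored once)
        for &other in &faces[i + 1..] {
            if other != face {
                let (a, b) = (face.min(other), face.max(other));
                edges.insert(Edge::CoOccursWith(Node::FaceTrace(a), Node::FaceTrace(b)));
            }
        }
    }
    edges
}

/// Time span in seconds as (start, end).
type Span = (f64, f64);

/// True when two spans share any interval of non-zero length.
fn overlaps(a: Span, b: Span) -> bool {
    a.0 < b.1 && b.0 < a.1
}

/// SPEAKS_AS edges for every face trace that overlaps a speaker segment in time.
fn speaks_as_edges(face_spans: &[(u32, Span)], speaker_spans: &[(u32, Span)]) -> BTreeSet<Edge> {
    let mut edges = BTreeSet::new();
    for &(face, face_span) in face_spans {
        for &(speaker, speaker_span) in speaker_spans {
            if overlaps(face_span, speaker_span) {
                edges.insert(Edge::SpeaksAs(Node::FaceTrace(face), Node::Speaker(speaker)));
            }
        }
    }
    edges
}
```

Using a `BTreeSet` deduplicates edges that recur across frames, which would map naturally onto an upsert when the edges are written to PostgreSQL.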
### workspace_resume_framework
| Metric | Value |
|--------|-------|
| Rust Suitability | **10/10** |
| Current State | Python file I/O + signal handling |
| Dependencies | File I/O, timers (no ML) |
| Refactoring | Pure Rust struct with auto-save |
| LOC Change | Python: -484, Rust: ~150 |
| Risk | Low |
| Effort | 1 day |
**Rust Design**:
```rust
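// Error type assumed to be std::io::Error.
use std::io::Result;
use std::path::PathBuf;
use std::time::{Duration, Instant};

/// Checkpoint state for a resumable processor (design sketch, bodies pending).
struct ResumeFramework {
    path: PathBuf,           // checkpoint file location
    save_interval: Duration, // minimum time between automatic saves
    last_save: Instant,      // time of the most recent save
    position: Option<u64>,   // last known position, if any
}

impl ResumeFramework {
    fn load_checkpoint(&mut self) -> Result<Option<u64>> { todo!() }
    fn save_checkpoint(&self, position: u64) -> Result<()> { todo!() }
    fn auto_save_tick(&mut self, position: u64) -> Result<bool> { todo!() }
    fn finalize(&mut self, total: u64) -> Result<()> { todo!() }
}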
```
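One possible way to fill in the load, save, and auto-save methods, using only the standard library. The checkpoint format (the position as decimal text), the `.tmp` sibling file, and the `new` constructor are assumptions for illustration; `finalize` is left out because its intended behavior is not described here.

```rust
use std::fs;
use std::io::{self, ErrorKind};
use std::path::PathBuf;
use std::time::{Duration, Instant};

struct ResumeFramework {
    path: PathBuf,
    save_interval: Duration,
    last_save: Instant,
    position: Option<u64>,
}

impl ResumeFramework {
    fn new(path: PathBuf, save_interval: Duration) -> Self {
        Self { path, save_interval, last_save: Instant::now(), position: None }
    }

    /// Read the saved position; a missing file means "start from scratch".
    fn load_checkpoint(&mut self) -> io::Result<Option<u64>> {
        self.position = match fs::read_to_string(&self.path) {
            Ok(text) => Some(
                text.trim()
                    .parse::<u64>()
                    .map_err(|e| io::Error::new(ErrorKind::InvalidData, e))?,
            ),
            Err(e) if e.kind() == ErrorKind::NotFound => None,
            Err(e) => return Err(e),
        };
        Ok(self.position)
    }

    /// Write to a temporary file and rename it over the checkpoint, so an
    /// interrupted write does not leave a truncated checkpoint behind.
    fn save_checkpoint(&self, position: u64) -> io::Result<()> {
        let tmp = self.path.with_extension("tmp");
        fs::write(&tmp, position.to_string())?;
        fs::rename(&tmp, &self.path)
    }

    /// Save only when `save_interval` has elapsed since the last save.
    /// Returns whether a save actually happened.
    fn auto_save_tick(&mut self, position: u64) -> io::Result<bool> {
        if self.last_save.elapsed() < self.save_interval {
            return Ok(false);
        }
        self.save_checkpoint(position)?;
        self.last_save = Instant::now();
        self.position = Some(position);
        Ok(true)
    }
}
```

A processor would call `load_checkpoint` once at startup to decide where to resume, then call `auto_save_tick` inside its main loop; the interval check keeps the per-iteration cost to a clock read.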
### workspace_redis_publisher
| Metric | Value |
|--------|-------|
| Rust Suitability | **10/10** |
| Current State | Python redis-py pub/sub |
| Dependencies | Redis TCP (no ML) |
| Refactoring | Pure Rust with redis-rs |
| LOC Change | Python: -195, Rust: ~100 |
| Risk | Low |
| Effort | 1 day |
**Rust Design**:
```rust
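use redis::{AsyncCommands, RedisResult};

/// Publishes processor status messages to a Redis pub/sub channel
/// (design sketch, bodies pending).
struct ProgressPublisher {
    client: redis::Client, // connection factory
    channel: String,       // pub/sub channel name
}

impl ProgressPublisher {
    async fn info(&self, processor: &str, msg: &str) -> RedisResult<()> { todo!() }
    async fn progress(&self, processor: &str, current: u32, total: u32, msg: &str) -> RedisResult<()> { todo!() }
    async fn complete(&self, processor: &str, msg: &str) -> RedisResult<()> { todo!() }
    async fn error(&self, processor: &str, msg: &str) -> RedisResult<()> { todo!() }
}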
```
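The message format on the channel is not specified in this assessment. As a sketch only, the payload for `progress` could be built as a JSON string like this; the field names (`processor`, `type`, `current`, `total`, `percent`, `message`) and both function names are hypothetical and would need to match whatever the existing Python publisher emits.

```rust
/// Escape the characters that JSON requires to be escaped inside a string.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build the payload for a progress update (hypothetical schema).
/// A total of zero is reported as 0.0 percent rather than dividing by zero.
fn progress_payload(processor: &str, current: u32, total: u32, msg: &str) -> String {
    let percent = if total == 0 {
        0.0
    } else {
        100.0 * current as f64 / total as f64
    };
    format!(
        r#"{{"processor":"{}","type":"progress","current":{},"total":{},"percent":{:.1},"message":"{}"}}"#,
        escape_json(processor),
        current,
        total,
        percent,
        escape_json(msg)
    )
}
```

In a real port a JSON library would likely replace the hand-written escaping; it is kept here only so the sketch has no dependencies.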
---
## Phase 3: Rust CPU Logic
### workspace_cut
| Metric | Value |
|--------|-------|
| Rust Suitability | 8/10 |
| Current State | Python PySceneDetect |
| Dependencies | Pure CPU (histogram diff) |
| Refactoring | Port ContentDetector algorithm to Rust |
| LOC Change | Python: -106, Rust: ~300 |
| Risk | Medium |
| Effort | 2-3 days |
| Challenge | HSV histogram + adaptive threshold |
**Algorithm to Port**:
- Frame-to-frame HSV/Luma histogram difference
- Rolling average threshold
- min_scene_len enforcement
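
The three steps above can be sketched as follows, assuming per-frame histograms have already been computed and normalized. This follows the description in this section and is not a port of PySceneDetect's code; `CutParams`, its fields, and the `min_diff` floor are hypothetical.

```rust
use std::collections::VecDeque;

struct CutParams {
    window: usize,        // number of recent differences in the rolling average
    ratio: f32,           // a cut must exceed ratio * rolling average
    min_diff: f32,        // absolute floor so static footage never triggers
    min_scene_len: usize, // minimum number of frames between two cuts
}

/// Sum of absolute per-bin differences between two histograms.
fn histogram_diff(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}

/// Return the indices of frames that start a new scene.
fn detect_cuts(histograms: &[Vec<f32>], p: &CutParams) -> Vec<usize> {
    let mut cuts = Vec::new();
    let mut recent: VecDeque<f32> = VecDeque::new();
    let mut last_cut = 0usize;

    for i in 1..histograms.len() {
        // Step 1: frame-to-frame histogram difference.
        let diff = histogram_diff(&histograms[i - 1], &histograms[i]);

        // Step 2: threshold derived from the rolling average of recent differences.
        let avg = if recent.is_empty() {
            0.0
        } else {
            recent.iter().sum::<f32>() / recent.len() as f32
        };
        let threshold = p.min_diff.max(p.ratio * avg);

        // Step 3: accept the cut only if the current scene is long enough.
        if diff > threshold && i - last_cut >= p.min_scene_len {
            cuts.push(i);
            last_cut = i;
        }

        recent.push_back(diff);
        if recent.len() > p.window {
            recent.pop_front();
        }
    }
    cuts
}
```

Separating histogram extraction from cut detection keeps this part pure and easy to test against the current Python output on the same videos.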
---
## Phase 4: Swift ANE Acceleration
### workspace_yolo
| Metric | Value |
|--------|-------|
| Swift Suitability | 8/10 |
| Current State | Python ultralytics (YOLOv8) |
| Dependencies | CoreML model conversion needed |
| Refactoring | Create swift_yolo with VNCoreMLModel |
| LOC Change | Python: -496, Swift: ~300 |
| Risk | Medium |
| Effort | 2-3 days |
| Challenge | CoreML model conversion, async handling |
**Swift Approach**:
1. Convert YOLOv8 → CoreML: `yolo export model=yolov8s.pt format=coreml`
2. Create swift_yolo.swift with VNCoreMLModel
3. AVAssetReader for frame extraction
4. ANE-accelerated inference
---
## Phase 5: Python Keep (ML/LLM Dependencies)
### Modules to Keep in Python
| Module | Reason |
|--------|--------|
| asr | whisper/faster-whisper (no drop-in Rust/Swift equivalent identified) |
| asrx | speaker diarization (pyannote) |
| audio_taxonomy | librosa/tensorflow |
| lip | MediaPipe lip tracking |
| caption | LLM generation |
| scene | ML scene classification |
| story | LLM generation |
| story_pipeline | LLM pipeline |
| tmdb_agent | API agent |
| identity_agent | LLM agent |
| voice_embedding | ML embedding |
| mediapipe_holistic | MediaPipe (no Rust/Swift binding) |
| visual_chunk | Visual processing |
---
## Implementation Roadmap
### Week 1: Swift Wrapper Removal
1. OCR: Modify `ocr.rs` to call swift_ocr directly
2. Pose: Modify `pose.rs` to call swift_pose directly
3. Test both with sample videos
### Week 2: Rust Infrastructure
4. redis_publisher: Create `src/core/redis_publisher.rs`
5. resume_framework: Create `src/core/resume.rs`
6. TKG: Create `src/core/processor/tkg.rs`
### Week 3: Swift Enhancement
7. Face: Extend swift_face.swift with CoreML embedding
8. Test face embedding pipeline
### Week 4: Rust Algorithm Port
9. Cut: Port ContentDetector to Rust
10. Test scene detection
### Week 5: Swift ANE
11. YOLO: Convert yolov8s → CoreML
12. Create swift_yolo.swift
13. Test object detection
---
## Total Effort Estimate
| Roadmap Week | Modules | LOC (Rust/Swift) | Effort |
|--------------|---------|------------------|--------|
| 1 | OCR, Pose | ~100 | 1-2 days |
| 2 | Redis, Resume, TKG | ~600 | 3-4 days |
| 3 | Face | ~100 | 2-3 days |
| 4 | Cut | ~300 | 2-3 days |
| 5 | YOLO | ~300 | 2-3 days |
| **Total** | | ~1400 | **10-15 days** |
---
## Dependency Removal Summary
| Dependency | Removed By |
|------------|------------|
| Python runtime (refactored modules only; Phase 5 modules still need it) | All Swift/Rust refactors |
| redis-py | redis_publisher (Rust) |
| psycopg2 | TKG (Rust) |
| PySceneDetect | Cut (Rust) |
| ultralytics (YOLO) | swift_yolo |
| OpenCV (face crop) | Face Swift embedding |
| InsightFace | Already superseded |
---
## Appendix: Module Summary Table
| Module | Language | Suitability | Status | Action |
|--------|----------|-------------|--------|--------|
| ocr | Swift | 10/10 | Active | Delete wrapper |
| pose | Swift | 10/10 | Active | Delete wrapper |
| face | Swift | 9/10 | Active | Extend Swift |
| face_recognition | - | - | Superseded | Archive |
| yolo | Swift | 8/10 | Active | Create Swift |
| cut | Rust | 8/10 | Active | Port algorithm |
| tkg | Rust | 10/10 | Active | Pure Rust |
| resume_framework | Rust | 10/10 | Active | Pure Rust |
| redis_publisher | Rust | 10/10 | Active | Pure Rust |
| asr | Python | 2/10 | Keep | ML dependency |
| asrx | Python | 2/10 | Keep | ML dependency |
| audio_taxonomy | Python | 2/10 | Keep | ML dependency |
| lip | Python | 2/10 | Keep | ML dependency |
| caption | Python | 2/10 | Keep | LLM |
| scene | Python | 2/10 | Keep | ML |
| story | Python | 2/10 | Keep | LLM |
| story_pipeline | Python | 2/10 | Keep | LLM |
| tmdb_agent | Python | 4/10 | Keep | API |
| identity_agent | Python | 4/10 | Keep | LLM |
| voice_embedding | Python | 2/10 | Keep | ML |
| mediapipe_holistic | Python | 2/10 | Keep | ML |
| visual_chunk | Python | 3/10 | Keep | Visual |
---
## Version History
| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0 | 2026-05-27 | M5Max128 | Initial assessment from workspace research |