docs: add processor refactoring assessment from M5Max128 workspace research
352
docs_v1.0/DESIGN/Processor_Refactoring_Assessment.md
Normal file
@@ -0,0 +1,352 @@
---
title: Processor Refactoring Assessment (M5Max128 Research)
version: 1.0
date: 2026-05-27
author: M5Max128
status: reference
---

# Processor Refactoring Assessment

> **Scope**: M5Max128 research documentation for M5Max48 implementation reference
> **Workspace**: ~/workspace/ (22 modules)

## Executive Summary

This assessment evaluates 22 processor modules for refactoring into Rust or Swift, or for remaining in Python.

### Priority Matrix

| Phase | Language | Modules | Effort | Benefit |
|-------|----------|---------|--------|---------|
| 1 | Swift | OCR, Pose, Face | Low | Remove Python wrappers |
| 2 | Rust | TKG, Resume, Redis | Low | Remove infrastructure deps |
| 3 | Rust | Cut | Medium | Pure CPU logic |
| 4 | Swift | YOLO | Medium | ANE acceleration |
| 5 | Python | Others (keep) | - | ML/LLM dependencies |

---

## Phase 1: Swift Modules (Immediate Gain)

### workspace_ocr

| Metric | Value |
|--------|-------|
| Swift Suitability | 10/10 |
| Current State | Thin Python wrapper around swift_ocr |
| Refactoring | Delete Python wrapper, Rust calls swift_ocr directly |
| LOC Change | Python: -122, Rust: ~50 |
| Risk | Low |
| Effort | 1 day |

**Current Architecture**:
```
Rust (ocr.rs) → PythonExecutor → ocr_processor.py → subprocess → swift_ocr
```

**Target Architecture**:
```
Rust (ocr.rs) → subprocess → swift_ocr
```
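
The target architecture replaces two hops (PythonExecutor and `ocr_processor.py`) with a single subprocess call from Rust. A minimal sketch of that call using only the standard library follows. The `--input` and `--output` flags and the helper names `swift_ocr_args` and `run_tool` are hypothetical, since the actual command-line interface of swift_ocr is not described here.

```rust
use std::io;
use std::path::Path;
use std::process::Command;

/// Builds the argument list for the Swift OCR binary.
/// ASSUMPTION: `--input` / `--output` are placeholder flags, not the real swift_ocr CLI.
fn swift_ocr_args(video: &Path, output_json: &Path) -> Vec<String> {
    vec![
        "--input".to_string(),
        video.display().to_string(),
        "--output".to_string(),
        output_json.display().to_string(),
    ]
}

/// Runs an external tool, waits for it to exit, and returns its stdout.
/// A spawn failure or a non-zero exit status is reported as an error.
fn run_tool(binary: &str, args: &[String]) -> io::Result<String> {
    let output = Command::new(binary).args(args).output()?;
    if !output.status.success() {
        let stderr = String::from_utf8_lossy(&output.stderr);
        return Err(io::Error::new(
            io::ErrorKind::Other,
            format!("{} exited with {}: {}", binary, output.status, stderr.trim()),
        ));
    }
    Ok(String::from_utf8_lossy(&output.stdout).into_owned())
}
```

Under these assumptions, `ocr.rs` would call `run_tool("swift_ocr", &swift_ocr_args(video, out))`. The same pattern would apply to any other Swift binary that is invoked as a subprocess.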

### workspace_pose

| Metric | Value |
|--------|-------|
| Swift Suitability | 10/10 |
| Current State | Thin Python wrapper around swift_pose |
| Refactoring | Delete Python wrapper, Rust calls swift_pose directly |
| LOC Change | Python: -150, Rust: ~50 |
| Risk | Low |
| Effort | 1 day |

**Current Architecture**:
```
Rust (pose.rs) → PythonExecutor → pose_processor.py → subprocess → swift_pose
```

**Target Architecture**:
```
Rust (pose.rs) → subprocess → swift_pose
```

### workspace_face

| Metric | Value |
|--------|-------|
| Swift Suitability | 9/10 |
| Current State | Swift detect + Python embedding (FaceNet CoreML) |
| Refactoring | Merge detection + embedding into single Swift binary |
| LOC Change | Python: -337, Swift: +100 (embedding) |
| Risk | Medium |
| Effort | 2-3 days |

**Current Architecture**:
```
Stage 1: Python → swift_face (Vision detect) → bbox + landmarks
Stage 2: Python → OpenCV crop → CoreML FaceNet → 512D embedding
```

**Target Architecture**:
```
Swift: Vision detect → crop → VNCoreMLModel (FaceNet) → embedding → face.json
```

### workspace_face_recognition

| Metric | Value |
|--------|-------|
| Status | **Superseded** |
| Recommendation | Do not refactor. Archive/remove. |
| Note | Replaced by face_processor.py (Apple Vision + CoreML) |

---

## Phase 2: Rust Modules (Infrastructure)

### workspace_tkg

| Metric | Value |
|--------|-------|
| Rust Suitability | **10/10** |
| Current State | Python psycopg2 + SQL queries |
| Dependencies | PostgreSQL, JSON I/O (no ML) |
| Refactoring | Pure Rust with sqlx/tokio-postgres |
| LOC Change | Python: -469, Rust: ~350 |
| Risk | Low |
| Effort | 1-2 days |

**Graph Structure**:
```
NODES:
  (face_trace)  - one per trace_id
  (object)      - one per YOLO class
  (speaker)     - one per speaker_id

EDGES:
  (face) -[:CO_OCCURS_WITH]-> (object)   same frame
  (face) -[:SPEAKS_AS]->      (speaker)  temporal overlap
  (face) -[:CO_OCCURS_WITH]-> (face)     same frame
```
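
To make the edge rules above concrete, the following sketch derives the three edge types in memory, before any SQL is involved. The types `Frame`, `Edge`, and `Span`, the use of `u32` ids, and the `min_overlap` parameter are all illustrative assumptions; the persisted schema and the real id types are not specified in this assessment.

```rust
use std::collections::BTreeSet;

/// Detections for one video frame (field names are illustrative).
struct Frame {
    face_traces: Vec<u32>, // trace_id values visible in this frame
    objects: Vec<String>,  // YOLO class names visible in this frame
}

/// One graph edge; the set type below deduplicates repeated observations.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Edge {
    FaceObject { face: u32, object: String }, // CO_OCCURS_WITH, same frame
    FaceFace { a: u32, b: u32 },              // CO_OCCURS_WITH, stored with a < b
    SpeaksAs { face: u32, speaker: u32 },     // SPEAKS_AS, temporal overlap
}

/// Emits CO_OCCURS_WITH edges for every face/object and face/face pair sharing a frame.
fn co_occurrence_edges(frames: &[Frame]) -> BTreeSet<Edge> {
    let mut edges = BTreeSet::new();
    for frame in frames {
        for (i, &face) in frame.face_traces.iter().enumerate() {
            for object in &frame.objects {
                edges.insert(Edge::FaceObject { face, object: object.clone() });
            }
            for &other in &frame.face_traces[i + 1..] {
                if other != face {
                    edges.insert(Edge::FaceFace { a: face.min(other), b: face.max(other) });
                }
            }
        }
    }
    edges
}

/// Time interval in seconds.
#[derive(Clone, Copy)]
struct Span {
    start: f64,
    end: f64,
}

/// Length of the intersection of two intervals, or 0.0 when they are disjoint.
fn overlap_seconds(a: Span, b: Span) -> f64 {
    (a.end.min(b.end) - a.start.max(b.start)).max(0.0)
}

/// Emits SPEAKS_AS edges for face/speaker pairs that overlap by more than `min_overlap` seconds.
fn speaks_as_edges(
    faces: &[(u32, Span)],
    speakers: &[(u32, Span)],
    min_overlap: f64,
) -> BTreeSet<Edge> {
    let mut edges = BTreeSet::new();
    for &(face, face_span) in faces {
        for &(speaker, speaker_span) in speakers {
            if overlap_seconds(face_span, speaker_span) > min_overlap {
                edges.insert(Edge::SpeaksAs { face, speaker });
            }
        }
    }
    edges
}
```

Storing face/face pairs with the smaller id first keeps the relation symmetric without inserting both directions. Whether the real graph stores one or two rows per pair is a design choice left open here.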

### workspace_resume_framework

| Metric | Value |
|--------|-------|
| Rust Suitability | **10/10** |
| Current State | Python file I/O + signal handling |
| Dependencies | File I/O, timers (no ML) |
| Refactoring | Pure Rust struct with auto-save |
| LOC Change | Python: -484, Rust: ~150 |
| Risk | Low |
| Effort | 1 day |

**Rust Design**:
```rust
use std::io::Result; // assumed error type; the design leaves `Result` unspecified
use std::path::PathBuf;
use std::time::{Duration, Instant};

struct ResumeFramework {
    path: PathBuf,           // checkpoint file location
    save_interval: Duration, // minimum time between auto-saves
    last_save: Instant,      // time of the most recent save
    position: Option<u64>,   // last known position, if any
}

// Interface sketch only: method bodies are intentionally left unimplemented.
impl ResumeFramework {
    fn load_checkpoint(&mut self) -> Result<Option<u64>> { todo!() }
    fn save_checkpoint(&self, position: u64) -> Result<()> { todo!() }
    fn auto_save_tick(&mut self, position: u64) -> Result<bool> { todo!() }
    fn finalize(&mut self, total: u64) -> Result<()> { todo!() }
}
```
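
One way the interface above could be filled in with the standard library alone is sketched below. Several details are assumptions rather than part of the design: the checkpoint is stored as a plain decimal number, `auto_save_tick` returns `true` only when it actually wrote the file, `finalize` records `total` as the final position, and the `new` constructor is added for convenience.

```rust
use std::fs;
use std::io::{self, ErrorKind};
use std::path::PathBuf;
use std::time::{Duration, Instant};

struct ResumeFramework {
    path: PathBuf,
    save_interval: Duration,
    last_save: Instant,
    position: Option<u64>,
}

impl ResumeFramework {
    /// Hypothetical constructor; the auto-save timer starts at creation time.
    fn new(path: PathBuf, save_interval: Duration) -> Self {
        Self { path, save_interval, last_save: Instant::now(), position: None }
    }

    /// Returns the stored position, or `None` when no checkpoint file exists yet.
    fn load_checkpoint(&mut self) -> io::Result<Option<u64>> {
        self.position = match fs::read_to_string(&self.path) {
            Ok(text) => {
                let parsed = text
                    .trim()
                    .parse::<u64>()
                    .map_err(|e| io::Error::new(ErrorKind::InvalidData, e))?;
                Some(parsed)
            }
            Err(e) if e.kind() == ErrorKind::NotFound => None,
            Err(e) => return Err(e),
        };
        Ok(self.position)
    }

    /// Writes to a sibling temporary file and renames it over the checkpoint,
    /// so the checkpoint path never holds a half-written value.
    fn save_checkpoint(&self, position: u64) -> io::Result<()> {
        let tmp = self.path.with_extension("tmp");
        fs::write(&tmp, position.to_string())?;
        fs::rename(&tmp, &self.path)
    }

    /// Remembers `position` and saves it only if `save_interval` has elapsed.
    /// Returns whether a save was performed.
    fn auto_save_tick(&mut self, position: u64) -> io::Result<bool> {
        self.position = Some(position);
        if self.last_save.elapsed() < self.save_interval {
            return Ok(false);
        }
        self.save_checkpoint(position)?;
        self.last_save = Instant::now();
        Ok(true)
    }

    /// ASSUMPTION: finishing stores `total` as the final position.
    fn finalize(&mut self, total: u64) -> io::Result<()> {
        self.position = Some(total);
        self.save_checkpoint(total)?;
        self.last_save = Instant::now();
        Ok(())
    }
}
```

A processing loop would call `load_checkpoint` once at startup, `auto_save_tick` after each unit of work, and `finalize` at the end. Signal handling, which the Python version provides, is not covered by this sketch.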

### workspace_redis_publisher

| Metric | Value |
|--------|-------|
| Rust Suitability | **10/10** |
| Current State | Python redis-py pub/sub |
| Dependencies | Redis TCP (no ML) |
| Refactoring | Pure Rust with redis-rs |
| LOC Change | Python: -195, Rust: ~100 |
| Risk | Low |
| Effort | 1 day |

**Rust Design**:
```rust
use redis::{AsyncCommands, RedisResult};

struct ProgressPublisher {
    client: redis::Client, // connection factory for the Redis server
    channel: String,       // pub/sub channel that receives progress events
}

// Interface sketch only: method bodies are intentionally left unimplemented.
impl ProgressPublisher {
    async fn info(&self, processor: &str, msg: &str) -> RedisResult<()> { todo!() }
    async fn progress(&self, processor: &str, current: u32, total: u32, msg: &str) -> RedisResult<()> { todo!() }
    async fn complete(&self, processor: &str, msg: &str) -> RedisResult<()> { todo!() }
    async fn error(&self, processor: &str, msg: &str) -> RedisResult<()> { todo!() }
}
```
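
The four methods above differ only in the event type and, for `progress`, two counters. The sketch below shows that shared structure without depending on a Redis server: it is synchronous, it publishes through a hypothetical `Transport` trait that a redis-rs connection could implement, and the JSON payload layout (`type`, `processor`, `message`, `current`, `total`) is an assumption, since the message format used by the Python publisher is not documented here.

```rust
/// Anything that can deliver a payload to a named channel.
/// ASSUMPTION: a real implementation would wrap a redis-rs connection; that wiring is omitted.
trait Transport {
    fn publish(&mut self, channel: &str, payload: &str) -> Result<(), String>;
}

/// Escapes characters that may not appear raw inside a JSON string.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

struct ProgressPublisher<T: Transport> {
    transport: T,
    channel: String,
}

impl<T: Transport> ProgressPublisher<T> {
    /// Formats one event as a JSON object and publishes it on the configured channel.
    fn send(&mut self, kind: &str, processor: &str, msg: &str, extra: &str) -> Result<(), String> {
        let payload = format!(
            "{{\"type\":\"{}\",\"processor\":\"{}\",\"message\":\"{}\"{}}}",
            kind,
            json_escape(processor),
            json_escape(msg),
            extra
        );
        self.transport.publish(&self.channel, &payload)
    }

    fn info(&mut self, processor: &str, msg: &str) -> Result<(), String> {
        self.send("info", processor, msg, "")
    }

    fn progress(&mut self, processor: &str, current: u32, total: u32, msg: &str) -> Result<(), String> {
        let extra = format!(",\"current\":{},\"total\":{}", current, total);
        self.send("progress", processor, msg, &extra)
    }

    fn complete(&mut self, processor: &str, msg: &str) -> Result<(), String> {
        self.send("complete", processor, msg, "")
    }

    fn error(&mut self, processor: &str, msg: &str) -> Result<(), String> {
        self.send("error", processor, msg, "")
    }
}

/// In-memory transport that records every (channel, payload) pair, useful for tests.
#[derive(Default)]
struct Recorder {
    sent: Vec<(String, String)>,
}

impl Transport for Recorder {
    fn publish(&mut self, channel: &str, payload: &str) -> Result<(), String> {
        self.sent.push((channel.to_string(), payload.to_string()));
        Ok(())
    }
}
```

Keeping the transport behind a trait lets the payload format be unit-tested without a running Redis instance. In the actual port, a JSON library would likely replace the hand-written escaping.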

---

## Phase 3: Rust CPU Logic

### workspace_cut

| Metric | Value |
|--------|-------|
| Rust Suitability | 8/10 |
| Current State | Python PySceneDetect |
| Dependencies | Pure CPU (histogram diff) |
| Refactoring | Port ContentDetector algorithm to Rust |
| LOC Change | Python: -106, Rust: ~300 |
| Risk | Medium |
| Effort | 2-3 days |
| Challenge | HSV histogram + adaptive threshold |

**Algorithm to Port**:
- Frame-to-frame HSV/Luma histogram difference
- Rolling average threshold
- min_scene_len enforcement
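
The three steps listed above can be sketched as follows. This is a simplified illustration, not a port of PySceneDetect's implementation: it assumes per-frame histograms have already been computed and normalized, uses an L1 bin difference, and introduces hypothetical parameters `window`, `ratio`, and `min_diff` to express the rolling-average threshold.

```rust
use std::collections::VecDeque;

/// Sum of absolute bin differences between two histograms of equal length.
fn histogram_diff(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).sum()
}

/// Returns the indices of frames that start a new scene.
///
/// * `window`: number of preceding frame-to-frame differences in the rolling average
/// * `ratio`: a difference must exceed `ratio` times the rolling average
/// * `min_diff`: absolute floor that suppresses cuts on near-static footage
/// * `min_scene_len`: minimum number of frames between two cuts
fn detect_cuts(
    histograms: &[Vec<f32>],
    window: usize,
    ratio: f32,
    min_diff: f32,
    min_scene_len: usize,
) -> Vec<usize> {
    let window = window.max(1);
    let mut cuts = Vec::new();
    let mut recent: VecDeque<f32> = VecDeque::with_capacity(window);
    let mut last_cut = 0usize;

    for i in 1..histograms.len() {
        let diff = histogram_diff(&histograms[i - 1], &histograms[i]);

        // Rolling average of the differences seen just before this frame.
        let avg = if recent.is_empty() {
            0.0
        } else {
            recent.iter().sum::<f32>() / recent.len() as f32
        };

        // A cut needs a spike relative to recent motion and enough distance from the last cut.
        let is_spike = diff >= min_diff && diff > avg * ratio;
        if is_spike && i - last_cut >= min_scene_len {
            cuts.push(i);
            last_cut = i;
        }

        // Slide the window forward.
        if recent.len() == window {
            recent.pop_front();
        }
        recent.push_back(diff);
    }
    cuts
}
```

For example, `detect_cuts(&histograms, 3, 2.0, 0.5, 4)` would flag a frame whose difference is at least 0.5, is more than twice the average of the previous three differences, and lies at least four frames after the previous cut. Frame decoding and HSV conversion, which account for much of the estimated porting effort, are outside this sketch.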

---

## Phase 4: Swift ANE Acceleration

### workspace_yolo

| Metric | Value |
|--------|-------|
| Swift Suitability | 8/10 |
| Current State | Python ultralytics (YOLOv8) |
| Dependencies | CoreML model conversion needed |
| Refactoring | Create swift_yolo with VNCoreMLModel |
| LOC Change | Python: -496, Swift: ~300 |
| Risk | Medium |
| Effort | 2-3 days |
| Challenge | CoreML model conversion, async handling |

**Swift Approach**:
1. Convert YOLOv8 → CoreML: `yolo export model=yolov8s.pt format=coreml`
2. Create swift_yolo.swift with VNCoreMLModel
3. AVAssetReader for frame extraction
4. ANE-accelerated inference

---

## Phase 5: Python Keep (ML/LLM Dependencies)

### Modules to Keep in Python

| Module | Reason |
|--------|--------|
| asr | whisper/faster-whisper (no Rust/Swift equivalent) |
| asrx | speaker diarization (pyannote) |
| audio_taxonomy | librosa/tensorflow |
| lip | MediaPipe lip tracking |
| caption | LLM generation |
| scene | ML scene classification |
| story | LLM generation |
| story_pipeline | LLM pipeline |
| tmdb_agent | API agent |
| identity_agent | LLM agent |
| voice_embedding | ML embedding |
| mediapipe_holistic | MediaPipe (no Rust/Swift binding) |
| visual_chunk | Visual processing |

---

## Implementation Roadmap

### Week 1: Swift Wrapper Removal

1. OCR: Modify `ocr.rs` to call swift_ocr directly
2. Pose: Modify `pose.rs` to call swift_pose directly
3. Test both with sample videos

### Week 2: Rust Infrastructure

4. redis_publisher: Create `src/core/redis_publisher.rs`
5. resume_framework: Create `src/core/resume.rs`
6. TKG: Create `src/core/processor/tkg.rs`

### Week 3: Swift Enhancement

7. Face: Extend swift_face.swift with CoreML embedding
8. Test face embedding pipeline

### Week 4: Rust Algorithm Port

9. Cut: Port ContentDetector to Rust
10. Test scene detection

### Week 5: Swift ANE

11. YOLO: Convert yolov8s → CoreML
12. Create swift_yolo.swift
13. Test object detection

---

## Total Effort Estimate

Rows follow the weeks of the Implementation Roadmap, not the phases of the Priority Matrix.

| Week | Modules | LOC (Rust/Swift) | Effort |
|------|---------|------------------|--------|
| 1 | OCR, Pose | ~100 | 1-2 days |
| 2 | Redis, Resume, TKG | ~600 | 3-4 days |
| 3 | Face | ~100 | 2-3 days |
| 4 | Cut | ~300 | 2-3 days |
| 5 | YOLO | ~300 | 2-3 days |
| **Total** | | ~1400 | **10-15 days** |

---

## Dependency Removal Summary

| Dependency | Removed By |
|------------|------------|
| Python runtime (refactored modules only) | All Swift/Rust refactors |
| redis-py | redis_publisher (Rust) |
| psycopg2 | TKG (Rust) |
| PySceneDetect | Cut (Rust) |
| ultralytics (YOLO) | swift_yolo |
| OpenCV (face crop) | Face Swift embedding |
| InsightFace | Already superseded |

---

## Appendix: Module Summary Table

| Module | Language | Suitability | Status | Action |
|--------|----------|-------------|--------|--------|
| ocr | Swift | 10/10 | Active | Delete wrapper |
| pose | Swift | 10/10 | Active | Delete wrapper |
| face | Swift | 9/10 | Active | Extend Swift |
| face_recognition | - | - | Superseded | Archive |
| yolo | Swift | 8/10 | Active | Create Swift |
| cut | Rust | 8/10 | Active | Port algorithm |
| tkg | Rust | 10/10 | Active | Pure Rust |
| resume_framework | Rust | 10/10 | Active | Pure Rust |
| redis_publisher | Rust | 10/10 | Active | Pure Rust |
| asr | Python | 2/10 | Keep | ML dependency |
| asrx | Python | 2/10 | Keep | ML dependency |
| audio_taxonomy | Python | 2/10 | Keep | ML dependency |
| lip | Python | 2/10 | Keep | ML dependency |
| caption | Python | 2/10 | Keep | LLM |
| scene | Python | 2/10 | Keep | ML |
| story | Python | 2/10 | Keep | LLM |
| story_pipeline | Python | 2/10 | Keep | LLM |
| tmdb_agent | Python | 4/10 | Keep | API |
| identity_agent | Python | 4/10 | Keep | LLM |
| voice_embedding | Python | 2/10 | Keep | ML |
| mediapipe_holistic | Python | 2/10 | Keep | ML |
| visual_chunk | Python | 3/10 | Keep | Visual |

---

## Version History

| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0 | 2026-05-27 | M5Max128 | Initial assessment from workspace research |