diff --git a/docs_v1.0/DESIGN/Processor_Refactoring_Assessment.md b/docs_v1.0/DESIGN/Processor_Refactoring_Assessment.md
new file mode 100644
index 0000000..a6c6972
--- /dev/null
+++ b/docs_v1.0/DESIGN/Processor_Refactoring_Assessment.md
@@ -0,0 +1,352 @@
+---
+title: Processor Refactoring Assessment (M5Max128 Research)
+version: 1.0
+date: 2026-05-27
+author: M5Max128
+status: reference
+---
+
+# Processor Refactoring Assessment
+
+> **Scope**: M5Max128 research documentation for M5Max48 implementation reference
+> **Workspace**: ~/workspace/ (22 modules)
+
+## Executive Summary
+
+This assessment evaluates 22 processor modules for refactoring into Rust or Swift; modules with ML/LLM dependencies stay in Python.
+
+### Priority Matrix
+
+| Phase | Language | Modules | Effort | Benefit |
+|-------|----------|---------|--------|---------|
+| 1 | Swift | OCR, Pose, Face | Low | Remove Python wrappers |
+| 2 | Rust | TKG, Resume, Redis | Low | Remove infrastructure deps |
+| 3 | Rust | Cut | Medium | Pure CPU logic |
+| 4 | Swift | YOLO | Medium | ANE acceleration |
+| 5 | Python | Others (keep) | - | ML/LLM dependencies |
+
+---
+
+## Phase 1: Swift Modules (Immediate Gain)
+
+### workspace_ocr
+
+| Metric | Value |
+|--------|-------|
+| Swift Suitability | 10/10 |
+| Current State | Thin Python wrapper around swift_ocr |
+| Refactoring | Delete Python wrapper, Rust calls swift_ocr directly |
+| LOC Change | Python: -122, Rust: ~50 |
+| Risk | Low |
+| Effort | 1 day |
+
+**Current Architecture**:
+```
+Rust (ocr.rs) → PythonExecutor → ocr_processor.py → subprocess → swift_ocr
+```
+
+**Target Architecture**:
+```
+Rust (ocr.rs) → subprocess → swift_ocr
+```
+
+### workspace_pose
+
+| Metric | Value |
+|--------|-------|
+| Swift Suitability | 10/10 |
+| Current State | Thin Python wrapper around swift_pose |
+| Refactoring | Delete Python wrapper, Rust calls swift_pose directly |
+| LOC Change | Python: -150, Rust: ~50 |
+| Risk | Low |
+| Effort | 1 day |
+
+**Current Architecture**:
+```
+Rust (pose.rs) → PythonExecutor → pose_processor.py → subprocess → swift_pose
+```
+
+**Target Architecture**:
+```
+Rust (pose.rs) → subprocess → swift_pose
+```
+
+### workspace_face
+
+| Metric | Value |
+|--------|-------|
+| Swift Suitability | 9/10 |
+| Current State | Swift detect + Python embedding (FaceNet CoreML) |
+| Refactoring | Merge detection + embedding into single Swift binary |
+| LOC Change | Python: -337, Swift: +100 (embedding) |
+| Risk | Medium |
+| Effort | 2-3 days |
+
+**Current Architecture**:
+```
+Stage 1: Python → swift_face (Vision detect) → bbox + landmarks
+Stage 2: Python → OpenCV crop → CoreML FaceNet → 512D embedding
+```
+
+**Target Architecture**:
+```
+Swift: Vision detect → crop → VNCoreMLModel (FaceNet) → embedding → face.json
+```
+
+### workspace_face_recognition
+
+| Metric | Value |
+|--------|-------|
+| Status | **Superseded** |
+| Recommendation | Do not refactor. Archive/remove. |
+| Note | Replaced by face_processor.py (Apple Vision + CoreML) |
+
+---
+
+## Phase 2: Rust Modules (Infrastructure)
+
+### workspace_tkg
+
+| Metric | Value |
+|--------|-------|
+| Rust Suitability | **10/10** |
+| Current State | Python psycopg2 + SQL queries |
+| Dependencies | PostgreSQL, JSON I/O (no ML) |
+| Refactoring | Pure Rust with sqlx/tokio-postgres |
+| LOC Change | Python: -469, Rust: ~350 |
+| Risk | Low |
+| Effort | 1-2 days |
+
+**Graph Structure**:
+```
+NODES:
+  (face_trace)  - one per trace_id
+  (object)      - one per YOLO class
+  (speaker)     - one per speaker_id
+
+EDGES:
+  (face) -[:CO_OCCURS_WITH]-> (object)   same frame
+  (face) -[:SPEAKS_AS]-> (speaker)       temporal overlap
+  (face) -[:CO_OCCURS_WITH]-> (face)     same frame
+```
+
+### workspace_resume_framework
+
+| Metric | Value |
+|--------|-------|
+| Rust Suitability | **10/10** |
+| Current State | Python file I/O + signal handling |
+| Dependencies | File I/O, timers (no ML) |
+| Refactoring | Pure Rust struct with auto-save |
+| LOC Change | Python: -484, Rust: ~150 |
+| Risk | Low |
+| Effort | 1 day |
+
+**Rust Design**:
+```rust
+struct ResumeFramework {
+    path: PathBuf,            // checkpoint file location
+    save_interval: Duration,  // minimum time between auto-saves
+    last_save: Instant,       // time of the most recent save
+    position: Option<u64>,    // last known position, if a checkpoint exists
+}
+
+impl ResumeFramework { // signatures only; bodies omitted in this design sketch
+    fn load_checkpoint(&mut self) -> Result<Option<u64>>
+    fn save_checkpoint(&self, position: u64) -> Result<()>
+    fn auto_save_tick(&mut self, position: u64) -> Result<bool> // assumed: true if a save occurred
+    fn finalize(&mut self, total: u64) -> Result<()>
+}
+```
+
+### workspace_redis_publisher
+
+| Metric | Value |
+|--------|-------|
+| Rust Suitability | **10/10** |
+| Current State | Python redis-py pub/sub |
+| Dependencies | Redis TCP (no ML) |
+| Refactoring | Pure Rust with redis-rs |
+| LOC Change | Python: -195, Rust: ~100 |
+| Risk | Low |
+| Effort | 1 day |
+
+**Rust Design**:
+```rust
+use redis::AsyncCommands;
+
+struct ProgressPublisher {
+    client: redis::Client, // connection factory for the Redis server
+    channel: String,       // pub/sub channel that progress events are published to
+}
+
+impl ProgressPublisher { // signatures only; bodies omitted in this design sketch
+    async fn info(&self, processor: &str, msg: &str) -> Result<()>
+    async fn progress(&self, processor: &str, current: u32, total: u32, msg: &str) -> Result<()>
+    async fn complete(&self, processor: &str, msg: &str) -> Result<()>
+    async fn error(&self, processor: &str, msg: &str) -> Result<()>
+}
+```
+
+---
+
+## Phase 3: Rust CPU Logic
+
+### workspace_cut
+
+| Metric | Value |
+|--------|-------|
+| Rust Suitability | 8/10 |
+| Current State | Python PySceneDetect |
+| Dependencies | Pure CPU (histogram diff) |
+| Refactoring | Port ContentDetector algorithm to Rust |
+| LOC Change | Python: -106, Rust: ~300 |
+| Risk | Medium |
+| Effort | 2-3 days |
+| Challenge | HSV histogram + adaptive threshold |
+
+**Algorithm to Port**:
+- Frame-to-frame HSV/Luma histogram difference
+- Rolling average threshold
+- min_scene_len enforcement
+
+---
+
+## Phase 4: Swift ANE Acceleration
+
+### workspace_yolo
+
+| Metric | Value |
+|--------|-------|
+| Swift Suitability | 8/10 |
+| Current State | Python ultralytics (YOLOv8) |
+| Dependencies | CoreML model conversion needed |
+| Refactoring | Create swift_yolo with VNCoreMLModel |
+| LOC Change | Python: -496, Swift: ~300 |
+| Risk | Medium |
+| Effort | 2-3 days |
+| Challenge | CoreML model conversion, async handling |
+
+**Swift Approach**:
+1. Convert YOLOv8 → CoreML: `yolo export model=yolov8s.pt format=coreml`
+2. Create swift_yolo.swift with VNCoreMLModel
+3. Use AVAssetReader for frame extraction
+4. Run ANE-accelerated inference
+
+---
+
+## Phase 5: Python Keep (ML/LLM Dependencies)
+
+### Modules to Keep in Python
+
+| Module | Reason |
+|--------|--------|
+| asr | whisper/faster-whisper (no Rust/Swift equivalent) |
+| asrx | speaker diarization (pyannote) |
+| audio_taxonomy | librosa/tensorflow |
+| lip | MediaPipe lip tracking |
+| caption | LLM generation |
+| scene | ML scene classification |
+| story | LLM generation |
+| story_pipeline | LLM pipeline |
+| tmdb_agent | API agent |
+| identity_agent | LLM agent |
+| voice_embedding | ML embedding |
+| mediapipe_holistic | MediaPipe (no Rust/Swift binding) |
+| visual_chunk | Visual processing |
+
+---
+
+## Implementation Roadmap
+
+### Week 1: Swift Wrapper Removal
+
+1. OCR: Modify `ocr.rs` to call swift_ocr directly
+2. Pose: Modify `pose.rs` to call swift_pose directly
+3. Test both with sample videos
+
+### Week 2: Rust Infrastructure
+
+4. redis_publisher: Create `src/core/redis_publisher.rs`
+5. resume_framework: Create `src/core/resume.rs`
+6. TKG: Create `src/core/processor/tkg.rs`
+
+### Week 3: Swift Enhancement
+
+7. Face: Extend swift_face.swift with CoreML embedding
+8. Test face embedding pipeline
+
+### Week 4: Rust Algorithm Port
+
+9. Cut: Port ContentDetector to Rust
+10. Test scene detection
+
+### Week 5: Swift ANE
+
+11. YOLO: Convert yolov8s → CoreML
+12. Create swift_yolo.swift
+13. Test object detection
+
+---
+
+## Total Effort Estimate
+
+| Roadmap Week | LOC (Rust/Swift) | Effort |
+|--------------|------------------|--------|
+| 1 | ~100 | 1-2 days |
+| 2 | ~600 | 3-4 days |
+| 3 | ~100 | 2-3 days |
+| 4 | ~300 | 2-3 days |
+| 5 | ~300 | 2-3 days |
+| **Total** | ~1400 | **10-15 days** |
+
+---
+
+## Dependency Removal Summary
+
+| Dependency | Removed By |
+|------------|------------|
+| Python runtime (refactored modules only) | All Swift/Rust refactors |
+| redis-py | redis_publisher (Rust) |
+| psycopg2 | TKG (Rust) |
+| PySceneDetect | Cut (Rust) |
+| ultralytics (YOLO) | swift_yolo |
+| OpenCV (face crop) | Face Swift embedding |
+| InsightFace | Already superseded |
+
+---
+
+## Appendix: Module Summary Table
+
+| Module | Language | Suitability | Status | Action |
+|--------|----------|-------------|--------|--------|
+| ocr | Swift | 10/10 | Active | Delete wrapper |
+| pose | Swift | 10/10 | Active | Delete wrapper |
+| face | Swift | 9/10 | Active | Extend Swift |
+| face_recognition | - | - | Superseded | Archive |
+| yolo | Swift | 8/10 | Active | Create Swift |
+| cut | Rust | 8/10 | Active | Port algorithm |
+| tkg | Rust | 10/10 | Active | Pure Rust |
+| resume_framework | Rust | 10/10 | Active | Pure Rust |
+| redis_publisher | Rust | 10/10 | Active | Pure Rust |
+| asr | Python | 2/10 | Keep | ML dependency |
+| asrx | Python | 2/10 | Keep | ML dependency |
+| audio_taxonomy | Python | 2/10 | Keep | ML dependency |
+| lip | Python | 2/10 | Keep | ML dependency |
+| caption | Python | 2/10 | Keep | LLM |
+| scene | Python | 2/10 | Keep | ML |
+| story | Python | 2/10 | Keep | LLM |
+| story_pipeline | Python | 2/10 | Keep | LLM |
+| tmdb_agent | Python | 4/10 | Keep | API |
+| identity_agent | Python | 4/10 | Keep | LLM |
+| voice_embedding | Python | 2/10 | Keep | ML |
+| mediapipe_holistic | Python | 2/10 | Keep | ML |
+| visual_chunk | Python | 3/10 | Keep | Visual |
+
+---
+
+## Version History
+
+| Version | Date | Author | Changes |
+|---------|------|--------|---------|
+| 1.0 | 2026-05-27 | M5Max128 | Initial assessment from workspace research |
\ No newline at end of file
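
For the `workspace_cut` entry in Phase 3, the three steps listed under "Algorithm to Port" (histogram difference, rolling average threshold, `min_scene_len` enforcement) could be sketched in Rust roughly as follows. This is a minimal sketch under stated assumptions, not PySceneDetect's actual detector logic: `CutParams`, `hist_diff`, and `detect_cuts` are hypothetical names, the per-frame histograms are assumed to be computed upstream, and the thresholding rule (score above a multiple of the rolling average, with an absolute floor) is one plausible reading of "rolling average threshold".

```rust
/// Hypothetical tuning parameters; names and semantics are illustrative
/// assumptions, not PySceneDetect's configuration.
pub struct CutParams {
    /// Number of previous scores included in the rolling average.
    pub window: usize,
    /// A score must exceed `rolling average * ratio` to count as a cut.
    pub ratio: f64,
    /// Absolute floor for the threshold, so quiet footage does not trigger cuts.
    pub min_score: f64,
    /// Minimum number of frames between the start of a scene and the next cut.
    pub min_scene_len: usize,
}

/// Normalized L1 distance between two histograms with the same bin count.
/// Returns a value in [0, 1]: 0 for identical distributions, 1 for disjoint ones.
pub fn hist_diff(a: &[u32], b: &[u32]) -> f64 {
    assert_eq!(a.len(), b.len(), "histograms must have the same bin count");
    let sum_a: f64 = a.iter().map(|&v| v as f64).sum();
    let sum_b: f64 = b.iter().map(|&v| v as f64).sum();
    if sum_a == 0.0 || sum_b == 0.0 {
        return 0.0; // an empty histogram carries no evidence of a change
    }
    let l1: f64 = a
        .iter()
        .zip(b)
        .map(|(&x, &y)| (x as f64 / sum_a - y as f64 / sum_b).abs())
        .sum();
    l1 / 2.0
}

/// Returns the frame indices at which a new scene starts.
/// `scores[i]` is the difference between frame `i` and frame `i + 1`.
pub fn detect_cuts(scores: &[f64], p: &CutParams) -> Vec<usize> {
    let mut cuts = Vec::new();
    let mut last_cut = 0usize; // frame index where the current scene started
    for (i, &score) in scores.iter().enumerate() {
        let frame = i + 1;
        // Rolling average over up to `window` scores preceding this one.
        let start = i.saturating_sub(p.window);
        let history = &scores[start..i];
        let avg = if history.is_empty() {
            0.0
        } else {
            history.iter().sum::<f64>() / history.len() as f64
        };
        let threshold = (avg * p.ratio).max(p.min_score);
        // min_scene_len enforcement: ignore cuts that arrive too soon.
        if score > threshold && frame - last_cut >= p.min_scene_len {
            cuts.push(frame);
            last_cut = frame;
        }
    }
    cuts
}
```

In use, a caller would compute `hist_diff` for each pair of consecutive frames, collect the results into a `Vec<f64>`, and pass that to `detect_cuts`. Keeping the scoring and the thresholding separate means the HSV/Luma histogram extraction can be swapped out without touching the cut logic.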