markbase/docs/FUSE_DESIGN.md
2026-05-18 17:02:30 +08:00


MarkBase FUSE System Design

Overview

Objective: Implement a virtual file system mount for MarkBase users with FUSE, so files can be accessed directly from macOS Finder and video editing software.

Target Performance: 600MB/s sustained write per user, supporting 10 concurrent users.

Technology Choice: FUSE-T (Kext-less FUSE for macOS)


FUSE-T vs macFUSE Comparison

Core Architecture

| Feature | FUSE-T | macFUSE |
| --- | --- | --- |
| Kernel Design | Kext-less (userspace server) | Kernel Extension + FSKit (macOS 26+) |
| Backend Protocol | NFSv4 / SMB3 / FSKit | Direct kernel FUSE API |
| Installation | Simple (brew install) | Requires System Settings → Privacy & Security |
| Stability | Stable (userspace server) | Potential kernel crash/lock-up |
| License | Free personal use, commercial license required | Open source (BSD-style) |
| macOS Support | All versions | macOS 12+ |
| App Store | Embeddable | Requires special handling |
| API Compatibility | libfuse2/libfuse3 | libfuse2/libfuse3 + macFUSE.framework |
| Install Count | 21,892 (365 days) | 129,388 (365 days) |

Technical Flow

FUSE-T Operation:

User App → libfuse → FUSE-T Server (userspace) → NFS/SMB/FSKit → macOS mount

macFUSE Operation (Legacy):

User App → libfuse → macFUSE kext (kernel) → VFS → macOS mount

macFUSE Operation (macOS 26+):

User App → libfuse → FSKit (userspace) → macOS mount

Performance Considerations

| Factor | Impact |
| --- | --- |
| FUSE-T NFS backend | Extra TCP/IP overhead (~5-10% latency) |
| macFUSE kext | Direct kernel path (fastest) |
| macFUSE FSKit | Userspace path (similar to FUSE-T) |
| Network packet handling | FUSE-T requires NFS RPC conversion |
| Large file writes | Both limited by userspace buffer |

Recommendation

FUSE-T is recommended for MarkBase:

  1. Stability Priority: Avoid kernel panic risk
  2. Deployment Friendly: No approval step in System Settings → Privacy & Security
  3. macOS 26 Support: FSKit backend option (same as macFUSE)
  4. Commercial Distribution: Controllable licensing cost

macFUSE suitable for:

  • Maximum raw performance (direct kernel path)
  • Existing stable kernel extension environment
  • Open source projects (no commercial licensing)

Backend Selection

Backend Types

| Backend | Protocol | macOS Support | Performance | Stability |
| --- | --- | --- | --- | --- |
| NFSv4 | NFS v4 over TCP | All versions | ~5-10% overhead | Very stable |
| SMB3 | SMB 3.0 over TCP | All versions | ~8-12% overhead | Stable |
| FSKit | Apple FSKit API | macOS 26+ | Direct path | Native |

Backend Architecture

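// Clone/Copy are needed because the mount manager hands a copy to each mount.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackendType {
    Nfs4,   // NFSv4 backend (all macOS versions)
    Fskit,  // FSKit backend (macOS 26+)
}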

impl BackendType {
    pub fn mount_options(&self) -> Vec<MountOption> {
        match self {
            BackendType::Nfs4 => vec![
                MountOption::Backend("nfs"),
                MountOption::AutoUnmount,
            ],
            BackendType::Fskit => vec![
                MountOption::Backend("fskit"),
                MountOption::AutoUnmount,
            ],
        }
    }
}

Performance Comparison

Expected Throughput (4K ProRes 4444 Write):

| Backend | Expected Speed | Overhead | Recommendation |
| --- | --- | --- | --- |
| NFSv4 | 550-600 MB/s | 5-10% | Stable baseline |
| FSKit | 600-700 MB/s | Minimal | Performance target |
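
The relationship between a backend's overhead and the throughput it delivers can be made explicit with a little arithmetic. The sketch below assumes the overhead percentages can be read as a fractional loss of throughput, which the tables do not state outright; both function names are illustrative.

```rust
/// Throughput left after a fractional protocol overhead (0.10 = 10%).
fn effective_mb_s(raw_mb_s: f64, overhead: f64) -> f64 {
    raw_mb_s * (1.0 - overhead)
}

/// Raw rate the storage path must sustain so that `target_mb_s` is still
/// delivered after the overhead has been paid.
fn required_raw_mb_s(target_mb_s: f64, overhead: f64) -> f64 {
    target_mb_s / (1.0 - overhead)
}
```

Under that reading, delivering 600 MB/s through a backend with 5-10% overhead needs roughly 632 to 667 MB/s of raw capacity underneath it.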

Module Architecture

Directory Structure

src/fuse/
├── mod.rs              # FUSE core module (entry point)
├── filesystem.rs       # MarkBaseFs implementation
├── handlers.rs         # FUSE operation handlers
├── backend.rs          # Backend selection (NFSv4/FSKit)
├── cache.rs            # LRU cache for metadata
└── mount_manager.rs    # Multi-user concurrent mount

Module Dependencies

# Cargo.toml additions
[dependencies]
fuse = "0.3"            # libfuse Rust bindings, linked against FUSE-T's libfuse (crate is unmaintained; fuser is its maintained fork)
time = "0.3"            # Timestamp handling
lru = "0.12"            # LRU cache implementation
uuid = "1.11"           # UUID handling
tokio = { version = "1", features = ["full"] }

Core Components

1. MarkBaseFs (filesystem.rs)

Purpose: Main filesystem implementation backed by SQLite

pub struct MarkBaseFs {
    user_id: String,
    db: Connection,
    backend: BackendType,
    
    // Caches
    attr_cache: LruCache<u64, FileAttr>,
    path_cache: LruCache<u64, PathBuf>,
    dir_cache: LruCache<u64, Vec<DirEntry>>,
    
    // Write buffer
    write_buffer: HashMap<u64, Vec<u8>>,
    buffer_size: usize,  // Default: 64KB
}

impl Filesystem for MarkBaseFs {
    // Operations implementation...
}

Key Operations:

| Operation | Purpose | Database Query | Cache Strategy |
| --- | --- | --- | --- |
| getattr() | Get file/directory attributes | SELECT * FROM file_nodes WHERE node_id = ? | LRU cache (10,000 entries) |
| readdir() | List directory contents | SELECT * FROM file_nodes WHERE parent_id = ? | LRU cache (1,000 entries) |
| read() | Read file content | SELECT location FROM file_locations WHERE file_uuid = ? | Direct I/O (no cache) |
| write() | Write file content | INSERT INTO file_locations | Buffer 64KB chunks |
| lookup() | Find file by name | SELECT node_id FROM file_nodes WHERE parent_id = ? AND label = ? | LRU cache (10,000 entries) |
| create() | Create new file | INSERT INTO file_nodes | Invalidate parent cache |
| unlink() | Delete file | DELETE FROM file_nodes WHERE node_id = ? | Invalidate parent cache |

2. Backend Selection (backend.rs)

Purpose: Choose optimal backend based on macOS version

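pub fn select_backend() -> BackendType {
    // Compare the major version numerically: as strings, "9.0" >= "26.0"
    // is true. Assumes System::os_version() returns Option<String>, as in
    // the sysinfo crate.
    let major = System::os_version()
        .and_then(|v| v.split('.').next()?.parse::<u32>().ok())
        .unwrap_or(0);

    if major >= 26 {
        // macOS 26+ supports FSKit (native, fastest)
        BackendType::Fskit
    } else {
        // Older macOS, or an unreadable version, uses NFSv4 (stable)
        BackendType::Nfs4
    }
}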
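
Version strings do not order correctly under plain string comparison, so the version check is worth isolating in a helper that can be tested without touching the OS. The following is a minimal std-only sketch; `major_version` and `supports_fskit` are hypothetical names, and the macOS 26 threshold is taken from the tables in this document.

```rust
/// Extract the major version from strings such as "26.0" or "15.3.1".
/// Returns None when the leading component is not a number.
fn major_version(os_version: &str) -> Option<u32> {
    os_version.trim().split('.').next()?.parse().ok()
}

/// Hypothetical predicate: FSKit is assumed to be available from macOS 26.
fn supports_fskit(os_version: &str) -> bool {
    major_version(os_version).map_or(false, |major| major >= 26)
}
```

Falling back to `false` on an unparseable version keeps the stable NFSv4 backend as the default.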

3. Cache Management (cache.rs)

Purpose: Reduce SQLite query overhead

pub struct FuseCache {
    attr_cache: LruCache<u64, CachedAttr>,
    path_cache: LruCache<u64, PathBuf>,
    
    ttl: Duration,  // Time-to-live: 60 seconds
}

#[derive(Clone)]
struct CachedAttr {
    attr: FileAttr,
    cached_at: Instant,
}

impl FuseCache {
    pub fn get_attr(&mut self, ino: u64) -> Option<FileAttr> {
        if let Some(cached) = self.attr_cache.get(&ino) {
            if cached.cached_at.elapsed() < self.ttl {
                return Some(cached.attr.clone());
            }
        }
        None
    }
    
    pub fn put_attr(&mut self, ino: u64, attr: FileAttr) {
        self.attr_cache.put(ino, CachedAttr {
            attr,
            cached_at: Instant::now(),
        });
    }
}
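
An expired entry can also be dropped at lookup time, so that stale attributes do not keep occupying a cache slot. The sketch below is a std-only illustration of that idea, not the `lru`-backed cache described here: `TtlCache` is a hypothetical type, `V` stands in for `FileAttr`, there is no LRU eviction, and the clock is passed in explicitly so the expiry logic can be tested.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Minimal TTL map; `V` stands in for `FileAttr`. No LRU eviction here.
struct TtlCache<V> {
    entries: HashMap<u64, (V, Instant)>,
    ttl: Duration,
}

impl<V: Clone> TtlCache<V> {
    fn new(ttl: Duration) -> Self {
        Self { entries: HashMap::new(), ttl }
    }

    /// Store `value` for `ino`, stamped with the caller-supplied time.
    fn put_at(&mut self, ino: u64, value: V, now: Instant) {
        self.entries.insert(ino, (value, now));
    }

    /// Return a fresh entry, or remove it and report a miss once expired.
    fn get_at(&mut self, ino: u64, now: Instant) -> Option<V> {
        let fresh = match self.entries.get(&ino) {
            Some((_, cached_at)) => now.duration_since(*cached_at) < self.ttl,
            None => return None,
        };
        if fresh {
            self.entries.get(&ino).map(|(value, _)| value.clone())
        } else {
            self.entries.remove(&ino);
            None
        }
    }

    /// Drop an entry, e.g. after create() or unlink() changes the row.
    fn invalidate(&mut self, ino: u64) {
        self.entries.remove(&ino);
    }
}
```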

4. Mount Manager (mount_manager.rs)

Purpose: Handle multi-user concurrent mounts

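use std::collections::HashMap;
use std::path::PathBuf;
use tokio::task::{JoinHandle, JoinSet};

pub struct MountManager {
    mounts: HashMap<String, MountedFs>,
    backend: BackendType,
}

pub struct MountedFs {
    user_id: String,
    mount_path: PathBuf,
    process: JoinHandle<Result<()>>,
}

impl MountManager {
    // Open the user's database and start the FUSE session without borrowing
    // `self`, so mount_all can run several of these concurrently.
    fn start_mount(user_id: String, base_dir: PathBuf, backend: BackendType) -> Result<MountedFs> {
        let mount_path = base_dir.join(format!("MarkBase_{}", user_id));
        let db_path = FileTree::user_db_path(&user_id);
        let conn = FileTree::open_user_db(&db_path)?;
        let fs = MarkBaseFs::new(user_id.clone(), conn, backend.clone());

        // fuse::mount blocks until the filesystem is unmounted, so it runs
        // on the blocking pool instead of an async worker thread.
        let session_path = mount_path.clone();
        let process = tokio::task::spawn_blocking(move || {
            fuse::mount(fs, &session_path, &backend.mount_options())?;
            Ok(())
        });

        Ok(MountedFs { user_id, mount_path, process })
    }

    pub async fn mount_user(&mut self, user_id: String, base_dir: PathBuf) -> Result<()> {
        let mounted = Self::start_mount(user_id.clone(), base_dir, self.backend.clone())?;
        self.mounts.insert(user_id, mounted);
        Ok(())
    }

    pub async fn mount_all(&mut self, users: Vec<String>, base_dir: PathBuf) -> Result<()> {
        let mut tasks = JoinSet::new();

        for user_id in users {
            // Each task needs its own copy of the shared inputs.
            let base_dir = base_dir.clone();
            let backend = self.backend.clone();
            tasks.spawn(async move { Self::start_mount(user_id, base_dir, backend) });
        }

        // Register every mount in this manager so unmount_all can find it.
        while let Some(joined) = tasks.join_next().await {
            let mounted = joined??;
            self.mounts.insert(mounted.user_id.clone(), mounted);
        }

        Ok(())
    }

    pub async fn unmount_all(&mut self) -> Result<()> {
        for mount in self.mounts.values() {
            fuse::unmount(&mount.mount_path)?;
        }
        self.mounts.clear();
        Ok(())
    }
}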

Performance Optimization

Strategy 1: Write Buffering

Problem: FUSE write() syscall overhead

Solution: Buffer writes in 64KB chunks

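use std::fs::OpenOptions;
use std::io::Write;

impl MarkBaseFs {
    fn write(&mut self, ino: u64, offset: u64, data: &[u8], reply: ReplyWrite) {
        // Accumulate in the per-inode buffer. `offset` is not consulted:
        // this sketch assumes sequential appends, as in video capture.
        let buffer = self.write_buffer.entry(ino).or_default();
        buffer.extend_from_slice(data);
        let should_flush = buffer.len() >= 64 * 1024;

        // Flush when buffer >= 64KB; report I/O failures to the caller
        if should_flush && self.flush_buffer(ino).is_err() {
            reply.error(libc::EIO);
            return;
        }

        reply.written(data.len() as u32);
    }

    fn flush_buffer(&mut self, ino: u64) -> Result<()> {
        let Some(buffer) = self.write_buffer.get(&ino) else {
            return Ok(());
        };
        let path = self.get_file_path(ino)?;

        // Append: std::fs::write would truncate the file on every flush
        // and keep only the last 64KB chunk.
        let mut file = OpenOptions::new().create(true).append(true).open(&path)?;
        file.write_all(buffer)?;

        // Drop the buffer only after the data has reached the file.
        self.write_buffer.remove(&ino);
        Ok(())
    }
}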
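
Editing software may also rewrite earlier regions of a file, for example a header updated after capture, so a write buffer generally has to remember where its bytes belong. The sketch below shows one way to do that with the standard library only, assuming a Unix target: `PendingWrite` is a hypothetical type that records the starting offset of its buffered bytes, flushes whenever an incoming write is not contiguous, and writes at the recorded position with `FileExt::write_all_at`.

```rust
use std::fs::OpenOptions;
use std::io;
use std::os::unix::fs::FileExt;
use std::path::Path;

/// Pending bytes for one file plus the file offset where they begin.
struct PendingWrite {
    start: u64,
    data: Vec<u8>,
}

impl PendingWrite {
    fn new() -> Self {
        Self { start: 0, data: Vec::new() }
    }

    /// File offset immediately after the buffered bytes.
    fn end(&self) -> u64 {
        self.start + self.data.len() as u64
    }

    /// Buffer `data` destined for `offset`, flushing first when the write
    /// is not contiguous, and afterwards when `limit` bytes are pending.
    fn write(&mut self, path: &Path, offset: u64, data: &[u8], limit: usize) -> io::Result<()> {
        if !self.data.is_empty() && offset != self.end() {
            self.flush(path)?;
        }
        if self.data.is_empty() {
            self.start = offset;
        }
        self.data.extend_from_slice(data);
        if self.data.len() >= limit {
            self.flush(path)?;
        }
        Ok(())
    }

    /// Write the buffered bytes at their recorded offset, then clear them.
    fn flush(&mut self, path: &Path) -> io::Result<()> {
        if self.data.is_empty() {
            return Ok(());
        }
        // No truncate flag: existing file content outside the range is kept.
        let file = OpenOptions::new().create(true).write(true).open(path)?;
        file.write_all_at(&self.data, self.start)?;
        self.data.clear();
        Ok(())
    }
}
```

Opening the file on every flush keeps the sketch short; a real implementation would more likely keep one open file handle per inode.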

Strategy 2: Metadata Caching

Problem: SQLite query latency (~2-5ms per query)

Solution: LRU cache with 60s TTL

Cache Configuration:

| Cache Type | Size | TTL | Hit Rate Target |
| --- | --- | --- | --- |
| attr_cache | 10,000 entries | 60s | 95% |
| path_cache | 10,000 entries | 60s | 90% |
| dir_cache | 1,000 entries | 60s | 85% |
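
The hit rate targets translate directly into an expected cost per metadata lookup. The helper below is an illustrative calculation rather than part of the design; it assumes a cache hit costs close to nothing compared with a 2-5ms SQLite query.

```rust
/// Average metadata lookup cost given a cache hit rate in [0, 1].
fn expected_lookup_ms(hit_rate: f64, hit_ms: f64, miss_ms: f64) -> f64 {
    hit_rate * hit_ms + (1.0 - hit_rate) * miss_ms
}
```

At a 95% hit rate only one lookup in twenty pays the query cost, so the average falls from 2-5ms to about 0.1-0.25ms under that assumption.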

Strategy 3: FSKit Backend

Problem: NFSv4 TCP/IP overhead (~5-10%)

Solution: Use FSKit backend on macOS 26+

Performance Impact:

| Metric | NFSv4 | FSKit | Improvement |
| --- | --- | --- | --- |
| Write latency | 15ms | 8ms | 47% reduction |
| Read latency | 10ms | 5ms | 50% reduction |
| Throughput | 550 MB/s | 650 MB/s | 18% increase |

Multi-User Concurrent Mount

Architecture

MountManager
├── mount_user(user_id) → spawn tokio task
├── mount_all([user1, user2, ...]) → parallel mount
├── unmount_user(user_id) → graceful shutdown
└── unmount_all() → cleanup all mounts

Mount Paths

/Volumes/
├── MarkBase_warren/     → data/users/warren.sqlite
├── MarkBase_momentry/   → data/users/momentry.sqlite
├── MarkBase_demo/       → data/users/demo.sqlite
├── MarkBase_user1/      → data/users/user1.sqlite
...
└── MarkBase_user10/     → data/users/user10.sqlite

Concurrent Strategy

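// Parallel mount using tokio::task::JoinSet
pub async fn mount_all(&mut self, users: Vec<String>, base_dir: PathBuf) -> Result<()> {
    let mut tasks = JoinSet::new();

    for user_id in users {
        // Each task owns its inputs. `start_mount` is assumed to be an
        // associated function that opens the user's database and spawns the
        // FUSE session without borrowing `self`, returning a MountedFs.
        let base_dir = base_dir.clone();
        let backend = self.backend.clone();
        tasks.spawn(async move { Self::start_mount(user_id, base_dir, backend) });
    }

    // Register each result in this manager; a throwaway MountManager::new()
    // per task would drop the handles as soon as the task finished.
    while let Some(joined) = tasks.join_next().await {
        let mounted = joined??;
        self.mounts.insert(mounted.user_id.clone(), mounted);
    }

    Ok(())
}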
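
The fan-out and collect shape of a parallel mount can be tried without an async runtime. The sketch below uses `std::thread::scope` as a stand-in for `JoinSet`; `start_mount_stub` and `mount_all_stub` are hypothetical, and the stub only derives the mount point instead of mounting anything.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::thread;

/// Hypothetical stand-in for the real mount call: it only derives the
/// mount point and rejects empty user IDs.
fn start_mount_stub(user_id: &str, base_dir: &Path) -> Result<PathBuf, String> {
    if user_id.is_empty() {
        return Err("empty user id".to_string());
    }
    Ok(base_dir.join(format!("MarkBase_{user_id}")))
}

/// Start every mount on its own thread and gather all results into one
/// registry, so no result is lost when a worker finishes.
fn mount_all_stub(users: &[String], base_dir: &Path) -> Result<HashMap<String, PathBuf>, String> {
    let results: Vec<(String, Result<PathBuf, String>)> = thread::scope(|scope| {
        let workers: Vec<_> = users
            .iter()
            .map(|user| scope.spawn(move || (user.clone(), start_mount_stub(user, base_dir))))
            .collect();
        workers
            .into_iter()
            .map(|worker| worker.join().expect("mount worker panicked"))
            .collect()
    });

    // The first failed mount aborts the whole operation.
    let mut registry = HashMap::new();
    for (user, outcome) in results {
        registry.insert(user, outcome?);
    }
    Ok(registry)
}
```

The point of the pattern is that every worker's result ends up in a single registry owned by the caller, which is what a later unmount pass needs.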

Performance Target

| Metric | Target | Measurement |
| --- | --- | --- |
| Mount latency (single user) | <100ms | Time from mount() to ready |
| Mount latency (10 users) | <2s | Parallel mount completion |
| Mount stability | 24h uptime | No crash/lock-up |
| Concurrent writes | 10 × 600MB/s | AJA System Test |

Database Integration

File Node Mapping

UUID to inode mapping:

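use uuid::Uuid;

fn uuid_to_ino(uuid: &str) -> Result<u64, uuid::Error> {
    // Use the first 8 bytes of the parsed UUID as the inode number.
    // (The bytes of the UUID *string* would be ASCII hex digits.)
    let parsed = Uuid::parse_str(uuid)?;
    let mut head = [0u8; 8];
    head.copy_from_slice(&parsed.as_bytes()[..8]);
    Ok(u64::from_be_bytes(head))
}

fn ino_to_uuid(ino: u64) -> String {
    // Lossy: the inode keeps only 8 of the UUID's 16 bytes, so this
    // zero-pads the rest and does not return the original UUID.
    // Recovering it needs a stored inode → UUID mapping.
    let mut bytes = [0u8; 16];
    bytes[..8].copy_from_slice(&ino.to_be_bytes());
    Uuid::from_bytes(bytes).to_string()
}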
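
Truncating a 128-bit UUID to a 64-bit inode cannot be reversed, and two UUIDs that share their first 8 bytes would collide. One way to handle both is to keep an explicit inode to UUID table. The sketch below is std-only and parses the hex form by hand; `InodeTable` is a hypothetical in-memory structure, and the design could equally persist the mapping in SQLite.

```rust
use std::collections::HashMap;

/// Parse a hyphenated or plain hex UUID string into its 128-bit value.
fn parse_uuid(text: &str) -> Option<u128> {
    let hex: String = text.chars().filter(|c| *c != '-').collect();
    if hex.len() != 32 || !hex.chars().all(|c| c.is_ascii_hexdigit()) {
        return None;
    }
    u128::from_str_radix(&hex, 16).ok()
}

/// Inode candidate: the high 64 bits (first 8 bytes) of the UUID.
fn uuid_to_ino(text: &str) -> Option<u64> {
    parse_uuid(text).map(|value| (value >> 64) as u64)
}

/// Hypothetical in-memory inode → UUID table.
#[derive(Default)]
struct InodeTable {
    by_ino: HashMap<u64, String>,
}

impl InodeTable {
    /// Register a UUID and return its inode. Returns None for a malformed
    /// UUID or when a different UUID already owns the same 64-bit prefix.
    fn register(&mut self, uuid: &str) -> Option<u64> {
        let ino = uuid_to_ino(uuid)?;
        let taken_by_other = self
            .by_ino
            .get(&ino)
            .map_or(false, |existing| existing.as_str() != uuid);
        if taken_by_other {
            return None;
        }
        self.by_ino.insert(ino, uuid.to_string());
        Some(ino)
    }

    /// Look up the full UUID that was registered for `ino`.
    fn uuid_for(&self, ino: u64) -> Option<&str> {
        self.by_ino.get(&ino).map(String::as_str)
    }
}
```

With such a table in place, the reverse lookup returns the full original UUID rather than a zero-padded prefix, and a collision surfaces as an explicit failure instead of silently aliasing two files.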

Path Resolution

Virtual path construction:

fn build_virtual_path(&self, ino: u64) -> Result<PathBuf> {
    let mut components = Vec::new();
    let mut current_ino = ino;

    // Walk up parent chain; labels are collected leaf-first
    while current_ino != 0 {
        let node = self.get_node(current_ino)?;
        components.push(node.label);
        current_ino = node.parent_id;
    }

    // Reverse so the path reads root-to-leaf
    Ok(components.into_iter().rev().collect())
}
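
Walking the parent chain yields labels from leaf to root, so they have to be reversed before they form a usable path, and a corrupted parent pointer could otherwise loop forever. The sketch below demonstrates both points against an in-memory map; `Node` is a hypothetical stand-in for a `file_nodes` row, and a `parent_id` of 0 is assumed to end the walk.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Hypothetical stand-in for a `file_nodes` row.
struct Node {
    label: String,
    parent_id: u64, // 0 ends the walk
}

/// Build the root-to-leaf path for `ino`, or None when a node is missing
/// or the parent chain is longer than the table (which implies a cycle).
fn build_virtual_path(nodes: &HashMap<u64, Node>, ino: u64) -> Option<PathBuf> {
    let mut components = Vec::new();
    let mut current = ino;
    while current != 0 {
        if components.len() > nodes.len() {
            return None; // more hops than nodes: the parent chain loops
        }
        let node = nodes.get(&current)?;
        components.push(node.label.as_str());
        current = node.parent_id;
    }
    // Labels were collected leaf-first; reverse for root-to-leaf order.
    Some(components.into_iter().rev().collect())
}
```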

File Location Query

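// Assumes `db` is a rusqlite::Connection.
fn get_file_path(&self, ino: u64) -> rusqlite::Result<PathBuf> {
    // ino_to_uuid is a free function, not a method on MarkBaseFs.
    let uuid = ino_to_uuid(ino);

    // query_row returns Err(QueryReturnedNoRows) when no location exists.
    self.db.query_row(
        "SELECT location FROM file_locations WHERE file_uuid = ?",
        [uuid],
        |row| row.get::<_, String>(0),
    ).map(PathBuf::from)
}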

CLI Commands

Mount Commands

# Single user mount
cargo run -- fuse --mount --user warren --dir /Volumes/MarkBase_warren

# Multi-user concurrent mount
cargo run -- fuse --mount --all --dir /Volumes/

# Specify backend
cargo run -- fuse --mount --user warren --backend fskit

# Unmount
cargo run -- fuse --unmount --dir /Volumes/MarkBase_warren
cargo run -- fuse --unmount --all

Test Commands

# Performance test
cargo run -- fuse --test --user warren --size 6GB

# AJA System Test simulation
cargo run -- fuse --test --aja --config 4K_ProRes

# Stability test
cargo run -- fuse --test --stability --duration 24h

Testing Strategy

Phase 1: POC Verification

Tests:

  1. Hello FUSE mount/unmount
  2. Basic read/write operations
  3. Backend selection (NFSv4 vs FSKit)

Phase 2: SQLite-backed FUSE

Tests:

  1. warren user mount (12,659 nodes)
  2. Directory traversal
  3. File read/write
  4. Metadata caching

Phase 3: Multi-user Concurrent

Tests:

  1. 10 user parallel mount
  2. Concurrent writes (AJA System Test)
  3. 24h stability test
  4. Unmount/shutdown

Phase 4: Performance Validation

Tests:

  1. AJA System Test 4K ProRes 4444 (600MB/s target)
  2. dd baseline comparison
  3. FSKit backend performance
  4. Cache effectiveness (hit rate measurement)
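
Measuring cache effectiveness needs nothing more than a pair of counters per cache. The sketch below shows one way to track them; `CacheStats` and its method names are illustrative, not part of the module layout described in this document.

```rust
/// Hit/miss counters for one cache; names are illustrative.
#[derive(Default)]
struct CacheStats {
    hits: u64,
    misses: u64,
}

impl CacheStats {
    /// Record the outcome of one lookup.
    fn record(&mut self, hit: bool) {
        if hit {
            self.hits += 1;
        } else {
            self.misses += 1;
        }
    }

    /// Fraction of lookups served from cache; None before any lookup.
    fn hit_rate(&self) -> Option<f64> {
        let total = self.hits + self.misses;
        (total > 0).then(|| self.hits as f64 / total as f64)
    }
}
```

Returning `None` before the first lookup avoids reporting a misleading 0% for a cache that has not been exercised yet.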

Risk Assessment

| Risk | Probability | Impact | Mitigation |
| --- | --- | --- | --- |
| FUSE-T installation failure | Low | High | Document brew installation steps |
| NFSv4 performance bottleneck | Medium | Medium | FSKit backend fallback |
| SQLite query latency | Medium | High | LRU caching + connection pooling |
| Kernel panic (macFUSE only) | Low | Critical | Use FUSE-T (kext-less) |
| Multi-user mount deadlock | Low | High | Async mount + timeout handling |
| Write buffer overflow | Low | Medium | Chunked flush + memory limit |

Performance Targets Summary

| Metric | Target | Measurement Method |
| --- | --- | --- |
| Write throughput | >=600MB/s | AJA System Test 4K ProRes 4444 |
| Read throughput | >=800MB/s | AJA System Test 4K ProRes 422 HQ |
| Mount latency (single) | <100ms | Timing measurement |
| Mount latency (10 users) | <2s | Parallel mount timing |
| Concurrent writes | 10 × 600MB/s | AJA concurrent test |
| Uptime stability | 24h no crash | Stability test |
| Cache hit rate | >=90% | Cache statistics |

Last Updated: 2026-05-17
Version: 1.0
Status: Design Complete, Ready for POC Implementation