# NFS Direct Implementation - Better than FUSE

**Date:** 2026-05-17 13:30
**Decision:** Switch from FUSE (fuse-t) to direct NFS server (bold-nfs)
**Confidence:** 85%
**Time estimate:** 4 days

---

## 1. Why Direct NFS is Better

### Comparison: FUSE vs Direct NFS

| Aspect | FUSE (fuse-t) | Direct NFS (bold-nfs) |
|--------|---------------|----------------------|
| **Architecture** | Rust → fuse-backend-rs → go-nfsv4 → mount_nfs | Rust → bold-nfs → mount_nfs |
| **Process count** | 3 (Rust parent + go-nfsv4 child + mount_nfs) | 2 (Rust NFS server + mount_nfs) |
| **Lifecycle** | Complex (fork/exec/socket/mount/die) | Simple (start server → mount → run) |
| **Dependencies** | fuse-t binary (go-nfsv4), fuse-backend-rs | vfs crate, bold-nfs library |
| **Daemon management** | go-nfsv4 lifecycle issue (dies immediately) | Server runs indefinitely |
| **Performance** | Unknown (FUSE overhead + NFS overhead) | Direct NFS (minimal overhead) |
| **Estimated success rate** | 60% (lifecycle issue unresolved) | 85% (simple architecture) |
| **Development time** | 4-7 days debugging | 4 days implementation |

### Key Problems with FUSE (Current Approach)

**Problem 1: go-nfsv4 Lifecycle**
- go-nfsv4 dies immediately after mount_nfs execution
- No actual mount established
- wait_mount() returns OK even though mount failed
- NFS server port not listening after mount attempt

**Problem 2: Complex Process Lifecycle**
- Parent: Rust binary (fuse-backend-rs)
- Child: go-nfsv4 (exec'd process)
- mount_nfs: macOS system command
- Socket communication between parent and child
- Fork/exec complexity → race conditions

**Problem 3: Debugging Difficulty**
- fuse-backend-rs: 640 lines of complex lifecycle code
- go-nfsv4: 23MB binary, closed source
- Cannot modify go-nfsv4 behavior
- Cannot fix lifecycle issue without source code

### Why Direct NFS is Better

**Advantage 1: Simple Architecture**
```
MarkBase NFS Server (Rust)
├── bold-nfs library (NFSv4.0 protocol)
├── MarkBaseFS backend (vfs::FileSystem trait)
└── SQLite database (warren.sqlite)

mount_nfs → connects to NFS server → reads/writes files
```

**Advantage 2: No Lifecycle Issues**
- Server runs indefinitely
- No fork/exec/socket communication
- No go-nfsv4 dependency
- Direct NFS protocol implementation

**Advantage 3: Rust-native**
- bold-nfs is written in Rust (async Tokio)
- Fits our project stack
- Can debug and modify if needed
- MIT license (open source)

**Advantage 4: Proven Architecture**
- bold-nfs has working demo (bold-mem)
- Tested on Linux with mount.nfs4
- NFSv4.0 protocol implemented
- FileManager handles file operations

---

## 2. Implementation Plan

### Phase 1: Test bold-nfs (Day 1)

**Objective:** Verify bold-nfs works on macOS

**Steps:**
1. Clone bold-nfs repo (already done: /tmp/bold-nfs)
2. Build bold-mem demo binary
3. Create test YAML filesystem (memoryfs.yaml)
4. Run bold-mem on port 11112
5. Test macOS mount_nfs connection
6. Verify file reading/writing works

**Expected commands:**
```bash
# Build bold-mem
cd /tmp/bold-nfs
cargo build --release

# Run bold-mem
cargo run --release -p bold-mem -- --debug exec/memoryfs.yaml

# Mount NFS (macOS); the mount point must exist first
mkdir -p /tmp/demo
sudo mount_nfs -o vers=4,port=11112 127.0.0.1:/ /tmp/demo

# Test files
ls /tmp/demo/home/user/
cat /tmp/demo/home/user/file1

# Unmount
sudo umount /tmp/demo
```

**Success criteria:**
- bold-mem starts successfully
- mount_nfs connects without error
- Files visible in /tmp/demo
- File reading works (cat shows content)
- File writing works (create new file)

**Time:** 4-6 hours

---

### Phase 2: Integrate with MarkBase (Day 2-3)

**Objective:** Create MarkBase NFS backend

**Architecture:**
```
MarkBase NFS Server
├── bold-nfs (NFSServer, FileManager)
├── vfs crate (FileSystem trait)
├── MarkBaseFS (vfs::FileSystem implementation)
│   ├── SQLite connection (warren.sqlite)
│   ├── read_dir() → query file_nodes WHERE parent_id=X
│   ├── open_file() → read file from disk (aliases_json.path)
│   ├── metadata() → query file_nodes metadata
│   ├── create_file() → write file to disk + insert node
│   └── remove_file() → delete file + delete node
└── NFS protocol (NFSv4.0)
```

**Implementation steps:**

**Step 1: Create MarkBaseFS struct**
```rust
// src/nfs/markbase_fs.rs
use rusqlite::Connection;
use std::path::PathBuf;
use std::sync::Mutex;

pub struct MarkBaseFS {
    user_id: String,
    db_path: PathBuf,
    // rusqlite::Connection is not Sync, so it is shared behind a Mutex.
    conn: Mutex<Connection>,
}

impl MarkBaseFS {
    pub fn new(user_id: String, db_path: PathBuf) -> Self {
        // NOTE: unwrap() panics if the database cannot be opened.
        let conn = Connection::open(&db_path).unwrap();
        MarkBaseFS {
            user_id,
            db_path,
            conn: Mutex::new(conn),
        }
    }
}
```

**Step 2: Implement FileSystem trait**
```rust
// Exact trait bounds (e.g. `+ Send`) depend on the vfs version in use.
use vfs::{FileSystem, SeekAndRead, VfsFileType, VfsMetadata, VfsResult};

// resolve_path() is a helper (not shown) that maps a path to its file_nodes row.
impl FileSystem for MarkBaseFS {
    fn read_dir(&self, path: &str) -> VfsResult<Box<dyn Iterator<Item = String> + Send>> {
        // Query: SELECT label FROM file_nodes WHERE parent_id = ?
        let conn = self.conn.lock().unwrap();
        let parent_node = self.resolve_path(&conn, path)?;

        let mut stmt = conn
            .prepare("SELECT label FROM file_nodes WHERE parent_id = ?1")
            .unwrap();

        // Collect eagerly so the returned iterator borrows neither the
        // statement nor the mutex guard.
        let children: Vec<String> = stmt
            .query_map([&parent_node.node_id], |row| row.get::<_, String>(0))
            .unwrap()
            .collect::<Result<_, _>>()
            .unwrap();

        Ok(Box::new(children.into_iter()))
    }

    fn open_file(&self, path: &str) -> VfsResult<Box<dyn SeekAndRead + Send>> {
        // Query: SELECT aliases_json FROM file_nodes WHERE node_id = ?
        let conn = self.conn.lock().unwrap();
        let node = self.resolve_path(&conn, path)?;

        let aliases_json: String = conn
            .query_row(
                "SELECT aliases_json FROM file_nodes WHERE node_id = ?1",
                [&node.node_id],
                |row| row.get(0),
            )
            .unwrap();

        let aliases: serde_json::Value = serde_json::from_str(&aliases_json).unwrap();
        let file_path = aliases["path"].as_str().unwrap();

        // Read file from disk
        let file = std::fs::File::open(file_path).unwrap();
        Ok(Box::new(file))
    }

    fn metadata(&self, path: &str) -> VfsResult<VfsMetadata> {
        // Query: SELECT file_size, created_at, updated_at FROM file_nodes WHERE node_id = ?
        let conn = self.conn.lock().unwrap();
        let node = self.resolve_path(&conn, path)?;

        let (size, created, updated): (i64, i64, i64) = conn
            .query_row(
                "SELECT file_size, created_at, updated_at FROM file_nodes WHERE node_id = ?1",
                [&node.node_id],
                |row| Ok((row.get(0)?, row.get(1)?, row.get(2)?)),
            )
            .unwrap();

        Ok(VfsMetadata {
            file_type: if node.node_type == "folder" {
                VfsFileType::Directory
            } else {
                VfsFileType::File
            },
            len: size as u64,
            // timestamps...
        })
    }

    fn exists(&self, path: &str) -> VfsResult<bool> {
        let conn = self.conn.lock().unwrap();
        // Any resolution error is treated as "does not exist".
        Ok(self.resolve_path(&conn, path).is_ok())
    }

    // Implement remaining methods: create_file, remove_file, create_dir, remove_dir, append_file
}
```

**Step 3: Create NFS server binary**
```rust
// src/bin/markbase-nfs.rs
use bold_nfs::NFSServer;
use markbase::nfs::MarkBaseFS;
use std::path::PathBuf;
use vfs::VfsPath;

fn main() {
    let user_id = "warren";
    let db_path = "data/users/warren.sqlite";

    // MarkBaseFS::new takes owned values (String, PathBuf).
    let fs = MarkBaseFS::new(user_id.to_string(), PathBuf::from(db_path));
    let root: VfsPath = fs.into();

    let server = NFSServer::builder(root)
        .bind("127.0.0.1:11112")
        .build();

    println!("MarkBase NFS server starting for user: {}", user_id);
    println!("Listening on: 127.0.0.1:11112");
    println!("Mount command: sudo mount_nfs -o vers=4,port=11112 127.0.0.1:/ /Volumes/MarkBase_warren");

    server.start();
}
```

**Step 4: Test with warren user**
```bash
# Build MarkBase NFS server
cargo build --release --bin markbase-nfs

# Run NFS server
./target/release/markbase-nfs

# Mount NFS volume (macOS)
sudo mkdir -p /Volumes/MarkBase_warren
sudo mount_nfs -o vers=4,port=11112 127.0.0.1:/ /Volumes/MarkBase_warren

# Test file tree
ls /Volumes/MarkBase_warren/
ls /Volumes/MarkBase_warren/home/
ls /Volumes/MarkBase_warren/home/accusys/

# Unmount
sudo umount /Volumes/MarkBase_warren
```

**Success criteria:**
- NFS server starts successfully
- Mount connects without error
- File tree visible (warren.sqlite: 12659 nodes)
- Files readable from mount point

**Time:** 12-16 hours

---

### Phase 3: AJA System Test Validation (Day 4)

**Objective:** Validate write performance

**Test setup:**
1. Mount NFS volume
2. Run AJA System Test
3. Write 4K ProRes 4444 file (1GB)
4. Measure throughput
5. Compare with target (>= 600 MB/s)

**Test commands:**
```bash
# Start NFS server
./target/release/markbase-nfs

# Mount
sudo mount_nfs -o vers=4,port=11112 127.0.0.1:/ /Volumes/MarkBase_warren

# Run AJA System Test (GUI steps, not shell commands):
#   AJA System Test.app
#   → Select /Volumes/MarkBase_warren
#   → Write test: 4K ProRes 4444
#   → File size: 1GB
#   → Record throughput

# Expected result
#   Throughput: >= 600 MB/s sustained write
```

**Performance analysis (estimates, not measurements):**
- NFS overhead: ~5-10% (TCP/IP + XDR encoding)
- vfs overhead: ~2-3% (trait dispatch)
- SQLite overhead: ~1-2% (query latency)
- Disk I/O: NVMe native speed (~2000 MB/s raw)

**Expected calculation (worst case of each range, applied in sequence):**
```
Raw NVMe: 2000 MB/s
NFS overhead: -10% → 1800 MB/s
vfs overhead: -3% → 1746 MB/s
SQLite overhead: -2% → 1711 MB/s

Expected throughput: ~1700 MB/s
Target: >= 600 MB/s (~2.8x the target)
```

**If throughput is lower:**
- Investigate NFS buffer sizes (bold-nfs configuration)
- Check TCP socket options (nodelay, buffer sizes)
- Optimize SQLite queries (indexing, caching)
- Consider write buffering (64KB chunks)

**Time:** 4-6 hours

---

## 3. Alternative: Go NFS Server (libnfs-go)

If bold-nfs has issues, use libnfs-go as fallback.

**Architecture:**
```
MarkBase NFS Server (Go)
├── libnfs-go/server (NFSv4 server)
├── Backend interface (fs.FS interface)
├── MarkBaseBackend (fs.FS implementation)
│   ├── SQLite connection (CGO)
│   └── Go implementation of filesystem operations
└── NFS protocol (NFSv4.0)
```

**Implementation:**
```go
// nfs_backend.go
// Sketch only: the method set and signatures must be matched to the
// interfaces libnfs-go actually expects before this will compile.
package main

import (
	"database/sql"
	"errors"
	"log"

	_ "github.com/mattn/go-sqlite3"
	"github.com/smallfz/libnfs-go/fs" // assumed home of fs.FS / fs.File
	"github.com/smallfz/libnfs-go/server"
)

type MarkBaseBackend struct {
	userID string
	dbPath string
	db     *sql.DB
}

func (b *MarkBaseBackend) Open(path string) (fs.File, error) {
	// Query aliases_json from file_nodes
	// Open file from disk
	// Return file handle
	return nil, errors.New("not implemented")
}

func (b *MarkBaseBackend) ReadDir(path string) ([]string, error) {
	// Query file_nodes WHERE parent_id = ?
	// Return child filenames
	return nil, errors.New("not implemented")
}

func main() {
	dbPath := "data/users/warren.sqlite"
	db, err := sql.Open("sqlite3", dbPath)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	backend := &MarkBaseBackend{userID: "warren", dbPath: dbPath, db: db}

	svr, err := server.NewServerTCP("127.0.0.1:2049", backend)
	if err != nil {
		log.Fatal(err)
	}
	svr.Serve()
}
```

**Trade-offs:**
- **Pros:** Simpler API (fs.FS), mature library
- **Cons:** Go binary (not Rust-native), CGO for SQLite

---

## 4. Risk Analysis

### Risks with Direct NFS

**Risk 1: bold-nfs maturity (30%)**
- bold-nfs is WIP (NFSv4.0 only, v4.1/v4.2 not implemented)
- May have bugs in NFS protocol implementation
- **Mitigation:** Test thoroughly, fallback to libnfs-go

**Risk 2: macOS NFS client compatibility (20%)**
- macOS mount_nfs may require specific NFS options
- NFSv4.0 client behavior on macOS may differ from Linux, where bold-nfs was tested
- **Mitigation:** Test with bold-mem first, adjust options

**Risk 3: vfs trait complexity (15%)**
- FileSystem trait has 9 required methods
- Implementation may have bugs
- **Mitigation:** Use vfs test macros (test_vfs!)

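Much of the Risk 3 surface sits in path resolution: the Step 2 code calls `resolve_path` in every method, but this report does not define it. The block below is a minimal sketch of one way it could work, assuming each node stores its parent's id and its own label, and that the root is the single node with no parent. The `Node` struct, the `Option<i64>` parent id, and `sample_tree` are hypothetical stand-ins for `file_nodes` rows, not the real schema, and the linear scan stands in for a per-segment `WHERE parent_id = ? AND label = ?` query.

```rust
/// Hypothetical in-memory stand-in for one row of the `file_nodes` table.
#[derive(Debug, Clone, PartialEq)]
pub struct Node {
    pub node_id: i64,
    pub parent_id: Option<i64>, // None marks the root (assumption)
    pub label: String,
    pub node_type: &'static str, // "folder" or "file"
}

/// Resolve an absolute path such as "/home/user/file1" by walking one
/// label at a time from the root. Returns None if any segment is missing
/// or if the walk tries to descend through a non-folder.
pub fn resolve_path<'a>(nodes: &'a [Node], path: &str) -> Option<&'a Node> {
    // Start at the root: the node without a parent.
    let mut current = nodes.iter().find(|n| n.parent_id.is_none())?;

    // Empty segments are skipped, so "/", "//home" and "/home/" all work.
    for segment in path.split('/').filter(|s| !s.is_empty()) {
        // Only folders can have children.
        if current.node_type != "folder" {
            return None;
        }
        let parent = current.node_id;
        current = nodes
            .iter()
            .find(|n| n.parent_id == Some(parent) && n.label == segment)?;
    }
    Some(current)
}

/// Small fixture mirroring the /home/user/file1 layout used in Phase 1.
pub fn sample_tree() -> Vec<Node> {
    let node = |node_id, parent_id, label: &str, node_type| Node {
        node_id,
        parent_id,
        label: label.to_string(),
        node_type,
    };
    vec![
        node(1, None, "", "folder"),
        node(2, Some(1), "home", "folder"),
        node(3, Some(2), "user", "folder"),
        node(4, Some(3), "file1", "file"),
    ]
}
```

Keeping the resolver free of SQLite makes its edge cases (root path, trailing slash, descending through a file) checkable with plain unit tests before the vfs test macros are brought in; a real implementation would issue the per-segment lookup against `file_nodes` instead of scanning a slice.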

**Risk 4: Performance (10%)**
- NFS overhead may be higher than expected
- SQLite queries may slow down file operations
- **Mitigation:** Optimize queries, add caching

**Risk 5: AJA System Test compatibility (5%)**
- AJA may not work with NFS mount
- **Mitigation:** AJA works with any mounted volume (NFS supported)

**Total risk:** 80% success probability (acceptable)

---

## 5. Comparison Summary

### FUSE (fuse-t) - Current Approach

**Pros:**
- FUSE-native (filesystem in userspace)
- fuse-backend-rs library available
- FUSE protocol well-documented

**Cons:**
- go-nfsv4 lifecycle issue unresolved
- Complex process lifecycle (fork/exec/socket)
- Dependency on fuse-t binary (23MB)
- 60% success rate, uncertain debugging time

**Recommendation:** **Abandon FUSE approach**

---

### Direct NFS (bold-nfs) - New Approach

**Pros:**
- Simple architecture (server + mount)
- Rust-native (bold-nfs library)
- No go-nfsv4 dependency
- Proven demo (bold-mem works)
- 85% success rate, 4 days implementation

**Cons:**
- bold-nfs is WIP (NFSv4.0 only)
- vfs trait implementation required
- New dependency (bold-nfs library)

**Recommendation:** **Adopt Direct NFS approach**

---

### Go NFS (libnfs-go) - Fallback

**Pros:**
- Mature library (v0.0.7, MIT license)
- Simple API (fs.FS interface)
- Production-ready NFS server

**Cons:**
- Go binary (not Rust-native)
- CGO for SQLite (complexity)
- Separate process management

**Recommendation:** **Fallback if bold-nfs fails**

---

## 6. Decision

**Switch to Direct NFS (bold-nfs)**

**Reasons:**
1. FUSE approach has unresolved lifecycle issue (50+ attempts failed)
2. Direct NFS is simpler (no fork/exec/socket complexity)
3. Rust-native solution fits project stack
4. 85% success rate vs 60% for FUSE
5. 4 days vs 7 days implementation time
6. AJA System Test works with NFS mounts

**Next action:**
1. Test bold-nfs on macOS (Day 1)
2. Implement MarkBaseFS backend (Day 2-3)
3. Validate AJA System Test (Day 4)

**Fallback plan:** If bold-nfs fails → use libnfs-go (Go NFS server)

---

## 7. Implementation Schedule

**Day 1 (2026-05-17):**
- Morning: Clone bold-nfs, build bold-mem
- Afternoon: Test macOS mount_nfs connection
- Evening: Verify file operations work

**Day 2 (2026-05-18):**
- Morning: Create MarkBaseFS struct
- Afternoon: Implement FileSystem trait (read_dir, open_file, metadata)
- Evening: Test with SQLite backend

**Day 3 (2026-05-19):**
- Morning: Implement write operations (create_file, remove_file)
- Afternoon: Create NFS server binary
- Evening: Test full workflow with warren.sqlite

**Day 4 (2026-05-20):**
- Morning: Mount NFS volume for AJA testing
- Afternoon: Run AJA System Test (4K ProRes)
- Evening: Analyze throughput, optimize if needed

**Total: 4 days, 85% confidence**

---

**Report prepared by:** OpenCode AI Assistant
**Session:** FUSE debugging → NFS direct implementation
**Decision date:** 2026-05-17 13:30
**Action:** Start Phase 1 immediately