# NFS Direct Implementation - Better than FUSE

**Date:** 2026-05-17 13:30
**Decision:** Switch from FUSE (fuse-t) to direct NFS server (bold-nfs)
**Confidence:** 85%
**Time estimate:** 4 days

---

## 1. Why Direct NFS is Better

### Comparison: FUSE vs Direct NFS

| Aspect | FUSE (fuse-t) | Direct NFS (bold-nfs) |
|--------|---------------|----------------------|
| **Architecture** | Rust → fuse-backend-rs → go-nfsv4 → mount_nfs | Rust → bold-nfs → mount_nfs |
| **Process count** | 3 (Rust parent + go-nfsv4 child + mount_nfs) | 2 (Rust NFS server + mount_nfs) |
| **Lifecycle** | Complex (fork/exec/socket/mount/die) | Simple (start server → mount → run) |
| **Dependencies** | fuse-t binary (go-nfsv4), fuse-backend-rs | vfs crate, bold-nfs library |
| **Daemon management** | go-nfsv4 lifecycle issue (dies immediately) | Server runs indefinitely |
| **Performance** | Unknown (FUSE overhead + NFS overhead) | Direct NFS (minimal overhead) |
| **Estimated success rate** | 60% (lifecycle issue unresolved) | 85% (simple architecture) |
| **Development time** | 4-7 days debugging | 4 days implementation |

### Key Problems with FUSE (Current Approach)

**Problem 1: go-nfsv4 Lifecycle**
- go-nfsv4 dies immediately after mount_nfs execution
- No actual mount established
- wait_mount() returns OK even though mount failed
- NFS server port not listening after mount attempt

**Problem 2: Complex Process Lifecycle**
- Parent: Rust binary (fuse-backend-rs)
- Child: go-nfsv4 (exec'd process)
- mount_nfs: macOS system command
- Socket communication between parent and child
- Fork/exec complexity → race conditions

**Problem 3: Debugging Difficulty**
- fuse-backend-rs: 640 lines of complex lifecycle code
- go-nfsv4: 23MB binary, closed source
- Cannot modify go-nfsv4 behavior
- Cannot fix lifecycle issue without source code

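Problem 1 notes that wait_mount() reported success although nothing was mounted. One independent check is to look for the mount point in the output of the `mount` command. The following is a minimal sketch, assuming `mount` prints lines shaped like `<source> on <mount point> (<options>)`; the function name `is_mounted` is hypothetical and not part of fuse-backend-rs or bold-nfs.

```rust
/// Returns true if `mount_output` (text printed by the `mount` command)
/// contains an entry whose mount point equals `mount_point`.
/// Assumes lines shaped like: `<source> on <mount point> (<options>)`.
pub fn is_mounted(mount_output: &str, mount_point: &str) -> bool {
    // Ignore a trailing slash so "/tmp/demo/" matches "/tmp/demo".
    let wanted = mount_point.trim_end_matches('/');
    mount_output.lines().any(|line| {
        // Split off the source; lines without " on " are not mount entries.
        let Some((_source, rest)) = line.split_once(" on ") else {
            return false;
        };
        // Cut the trailing " (options)" part, if present.
        let target = match rest.rfind(" (") {
            Some(idx) => &rest[..idx],
            None => rest,
        };
        target.trim_end_matches('/') == wanted
    })
}
```

In practice the text could come from `std::process::Command::new("mount").output()`, so a mount is only treated as established once the mount point actually appears in the list.
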
### Advantages of Direct NFS

**Advantage 1: Simple Architecture**
```
MarkBase NFS Server (Rust)
├── bold-nfs library (NFSv4.0 protocol)
├── MarkBaseFS backend (vfs::FileSystem trait)
└── SQLite database (warren.sqlite)

mount_nfs → connects to NFS server → reads/writes files
```

**Advantage 2: No Lifecycle Issues**
- Server runs indefinitely
- No fork/exec/socket communication
- No go-nfsv4 dependency
- Direct NFS protocol implementation

**Advantage 3: Rust-native**
- bold-nfs is written in Rust (async Tokio)
- Fits our project stack
- Can debug and modify if needed
- MIT license (open source)

**Advantage 4: Proven Architecture**
- bold-nfs has a working demo (bold-mem)
- Tested on Linux with mount.nfs4 (macOS is verified in Phase 1)
- NFSv4.0 protocol implemented
- FileManager handles file operations

---

## 2. Implementation Plan

### Phase 1: Test bold-nfs (Day 1)

**Objective:** Verify bold-nfs works on macOS

**Steps:**
1. Clone bold-nfs repo (already done: /tmp/bold-nfs)
2. Build bold-mem demo binary
3. Create test YAML filesystem (memoryfs.yaml)
4. Run bold-mem on port 11112
5. Test macOS mount_nfs connection
6. Verify file reading/writing works

**Expected commands:**
```bash
# Build bold-mem
cd /tmp/bold-nfs
cargo build --release

# Run bold-mem (keep it running; use a second terminal for the steps below)
cargo run --release -p bold-mem -- --debug exec/memoryfs.yaml

# Mount NFS (macOS); the mount point must exist first
mkdir -p /tmp/demo
sudo mount_nfs -o vers=4,port=11112 127.0.0.1:/ /tmp/demo

# Test reads
ls /tmp/demo/home/user/
cat /tmp/demo/home/user/file1

# Test writes (newfile is an arbitrary test name)
echo "hello" > /tmp/demo/home/user/newfile
cat /tmp/demo/home/user/newfile

# Unmount
sudo umount /tmp/demo
```

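Because the FUSE attempt failed with no NFS port listening, it may help to confirm that the server accepts TCP connections before running mount_nfs. The following is a small sketch using only the standard library; the helper names `is_listening` and `wait_until_listening` are hypothetical, and the address to probe would be the one bold-mem binds (127.0.0.1:11112 in the commands above).

```rust
use std::net::{SocketAddr, TcpStream};
use std::thread;
use std::time::Duration;

/// Returns true if a TCP connection to `addr` succeeds within `timeout`.
pub fn is_listening(addr: SocketAddr, timeout: Duration) -> bool {
    TcpStream::connect_timeout(&addr, timeout).is_ok()
}

/// Polls `addr` up to `attempts` times, sleeping `delay` between tries.
/// Returns true as soon as one connection attempt succeeds.
pub fn wait_until_listening(addr: SocketAddr, attempts: u32, delay: Duration) -> bool {
    for _ in 0..attempts {
        if is_listening(addr, delay) {
            return true;
        }
        thread::sleep(delay);
    }
    false
}
```
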
**Success criteria:**
- bold-mem starts successfully
- mount_nfs connects without error
- Files visible in /tmp/demo
- File reading works (cat shows content)
- File writing works (create new file)

**Time:** 4-6 hours

---

### Phase 2: Integrate with MarkBase (Day 2-3)

**Objective:** Create MarkBase NFS backend

**Architecture:**
```
MarkBase NFS Server
├── bold-nfs (NFSServer, FileManager)
├── vfs crate (FileSystem trait)
├── MarkBaseFS (vfs::FileSystem implementation)
│   ├── SQLite connection (warren.sqlite)
│   ├── read_dir() → query file_nodes WHERE parent_id=X
│   ├── open_file() → read file from disk (aliases_json.path)
│   ├── metadata() → query file_nodes metadata
│   ├── create_file() → write file to disk + insert node
│   └── remove_file() → delete file + delete node
└── NFS protocol (NFSv4.0)
```

**Implementation steps:**

**Step 1: Create MarkBaseFS struct**
```rust
// src/nfs/markbase_fs.rs
use rusqlite::Connection;
use std::path::PathBuf;
use std::sync::Mutex;
use vfs::{FileSystem, VfsMetadata, VfsResult};

/// vfs backend that serves one user's file tree from a SQLite database.
#[derive(Debug)] // vfs::FileSystem requires Debug
pub struct MarkBaseFS {
    user_id: String,
    db_path: PathBuf,
    // rusqlite::Connection is Send but not Sync, so it is shared behind a Mutex
    conn: Mutex<Connection>,
}

impl MarkBaseFS {
    /// Opens the database at `db_path`; panics if it cannot be opened.
    pub fn new(user_id: String, db_path: PathBuf) -> Self {
        let conn = Connection::open(&db_path).expect("failed to open SQLite database");
        MarkBaseFS {
            user_id,
            db_path,
            conn: Mutex::new(conn),
        }
    }
}
```

**Step 2: Implement FileSystem trait**
```rust
use vfs::error::VfsErrorKind;
use vfs::{SeekAndRead, VfsError, VfsFileType};

/// Converts a database or JSON error into a generic vfs error.
fn other_err(e: impl std::fmt::Display) -> VfsError {
    VfsErrorKind::Other(e.to_string()).into()
}

// NOTE: `resolve_path(&conn, path)` is a helper that is not shown here; it is
// expected to map a path to a node exposing `node_id` and `node_type`.
impl FileSystem for MarkBaseFS {
    fn read_dir(&self, path: &str) -> VfsResult<Box<dyn Iterator<Item = String> + Send>> {
        let conn = self.conn.lock().unwrap();
        let parent_node = self.resolve_path(&conn, path)?;

        let mut stmt = conn
            .prepare("SELECT label FROM file_nodes WHERE parent_id = ?1")
            .map_err(other_err)?;

        // Collect eagerly so the returned iterator does not borrow the connection
        let children = stmt
            .query_map([&parent_node.node_id], |row| row.get::<_, String>(0))
            .map_err(other_err)?
            .collect::<Result<Vec<String>, _>>()
            .map_err(other_err)?;

        Ok(Box::new(children.into_iter()))
    }

    fn open_file(&self, path: &str) -> VfsResult<Box<dyn SeekAndRead + Send>> {
        let conn = self.conn.lock().unwrap();
        let node = self.resolve_path(&conn, path)?;

        let aliases_json: String = conn
            .query_row(
                "SELECT aliases_json FROM file_nodes WHERE node_id = ?1",
                [&node.node_id],
                |row| row.get(0),
            )
            .map_err(other_err)?;

        let aliases: serde_json::Value =
            serde_json::from_str(&aliases_json).map_err(other_err)?;
        let file_path = aliases["path"]
            .as_str()
            .ok_or_else(|| other_err("aliases_json has no string \"path\" field"))?;

        // Read the file from disk; io::Error converts into VfsError via `?`
        let file = std::fs::File::open(file_path)?;
        Ok(Box::new(file))
    }

    fn metadata(&self, path: &str) -> VfsResult<VfsMetadata> {
        let conn = self.conn.lock().unwrap();
        let node = self.resolve_path(&conn, path)?;

        let (size, _created, _updated): (i64, i64, i64) = conn
            .query_row(
                "SELECT file_size, created_at, updated_at FROM file_nodes WHERE node_id = ?1",
                [&node.node_id],
                |row| Ok((row.get(0)?, row.get(1)?, row.get(2)?)),
            )
            .map_err(other_err)?;

        Ok(VfsMetadata {
            file_type: if node.node_type == "folder" {
                VfsFileType::Directory
            } else {
                VfsFileType::File
            },
            len: size as u64,
            // Timestamp fields (assumes vfs 0.12+): left unset until the units
            // of created_at/updated_at are confirmed.
            created: None,
            modified: None,
            accessed: None,
        })
    }

    fn exists(&self, path: &str) -> VfsResult<bool> {
        let conn = self.conn.lock().unwrap();
        // Any resolution failure is reported as "does not exist"
        Ok(self.resolve_path(&conn, path).is_ok())
    }

    // Implement remaining methods: create_file, remove_file, create_dir, remove_dir, append_file
}
```

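The trait methods above rely on `resolve_path`, which the plan does not show. One way it might work is to walk the path one component at a time, looking up each child by `(parent_id, label)`. The sketch below uses a `HashMap` as a stand-in for the `file_nodes` table so that it runs without SQLite; the `NodeIndex` type, the integer node ids, and the root id are assumptions for illustration. A real version would presumably issue a query such as `SELECT node_id FROM file_nodes WHERE parent_id = ?1 AND label = ?2` per component.

```rust
use std::collections::HashMap;

/// In-memory stand-in for the `file_nodes` table: (parent_id, label) -> node_id.
pub type NodeIndex = HashMap<(i64, String), i64>;

/// Walks `path` one component at a time, starting from `root_id`.
/// Returns None as soon as a component has no matching child.
pub fn resolve_path(index: &NodeIndex, root_id: i64, path: &str) -> Option<i64> {
    path.split('/')
        // Skip empty components so "", "/", and "a//b" are tolerated.
        .filter(|part| !part.is_empty())
        // Each step replaces the current node with the matching child.
        .try_fold(root_id, |parent, part| {
            index.get(&(parent, part.to_string())).copied()
        })
}
```
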
**Step 3: Create NFS server binary**
```rust
// src/bin/markbase-nfs.rs
use bold_nfs::NFSServer;
use markbase::nfs::MarkBaseFS;
use std::path::PathBuf;
use vfs::VfsPath;

fn main() {
    let user_id = "warren";
    let db_path = "data/users/warren.sqlite";

    // MarkBaseFS::new takes owned values (String, PathBuf)
    let fs = MarkBaseFS::new(user_id.to_string(), PathBuf::from(db_path));
    let root: VfsPath = fs.into();

    // NOTE: the NFSServer builder calls are written as planned; confirm the
    // exact API against the bold-nfs source during Phase 1.
    let server = NFSServer::builder(root)
        .bind("127.0.0.1:11112")
        .build();

    println!("MarkBase NFS server starting for user: {}", user_id);
    println!("Listening on: 127.0.0.1:11112");
    println!("Mount command: sudo mount_nfs -o vers=4,port=11112 127.0.0.1:/ /Volumes/MarkBase_warren");

    server.start();
}
```

**Step 4: Test with warren user**
```bash
# Build MarkBase NFS server
cargo build --release --bin markbase-nfs

# Run NFS server
./target/release/markbase-nfs

# Mount NFS volume (macOS)
sudo mkdir -p /Volumes/MarkBase_warren
sudo mount_nfs -o vers=4,port=11112 127.0.0.1:/ /Volumes/MarkBase_warren

# Test file tree
ls /Volumes/MarkBase_warren/
ls /Volumes/MarkBase_warren/home/
ls /Volumes/MarkBase_warren/home/accusys/

# Unmount
sudo umount /Volumes/MarkBase_warren
```

**Success criteria:**
- NFS server starts successfully
- Mount connects without error
- File tree visible (warren.sqlite: 12659 nodes)
- Files readable from mount point

**Time:** 12-16 hours

---

### Phase 3: AJA System Test Validation (Day 4)

**Objective:** Validate write performance

**Test setup:**
1. Mount NFS volume
2. Run AJA System Test
3. Write 4K ProRes 4444 file (1GB)
4. Measure throughput
5. Compare with target (>= 600 MB/s)

**Test commands:**
```bash
# Start NFS server
./target/release/markbase-nfs

# Mount
sudo mount_nfs -o vers=4,port=11112 127.0.0.1:/ /Volumes/MarkBase_warren

# Run AJA System Test (GUI application; the steps below are done in its window)
open -a "AJA System Test"
#   → Select /Volumes/MarkBase_warren
#   → Write test: 4K ProRes 4444
#   → File size: 1GB
#   → Record throughput

# Expected result
#   Throughput: >= 600 MB/s sustained write
```

**Performance analysis (estimates, not yet measured):**
- NFS overhead: ~5-10% (TCP/IP + XDR encoding)
- vfs overhead: ~2-3% (trait dispatch)
- SQLite overhead: ~1-2% (query latency)
- Disk I/O: NVMe native speed (~2000 MB/s raw)

**Expected calculation (upper end of each overhead range, applied in sequence):**
```
Raw NVMe:        2000 MB/s
NFS overhead:    -10% → 1800 MB/s
vfs overhead:     -3% → 1746 MB/s
SQLite overhead:  -2% → 1711 MB/s

Expected throughput: ~1700 MB/s
Target: >= 600 MB/s (estimate is ~2.8x the target)
```

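The estimate above compounds each overhead on the result of the previous one. As a quick way to re-run the arithmetic with different assumptions, here is a small sketch; the function names `apply_overheads` and `headroom` are illustrative, and the percentages are the plan's own estimates rather than measurements.

```rust
/// Applies each fractional overhead in sequence to `raw_mb_s`.
/// An overhead of 0.10 removes 10% of whatever rate remains at that step.
pub fn apply_overheads(raw_mb_s: f64, overheads: &[f64]) -> f64 {
    overheads
        .iter()
        .fold(raw_mb_s, |rate, loss| rate * (1.0 - loss))
}

/// Ratio of the estimate to the target (2.0 means twice the target).
pub fn headroom(estimate_mb_s: f64, target_mb_s: f64) -> f64 {
    estimate_mb_s / target_mb_s
}
```
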
**If throughput is lower:**
- Investigate NFS buffer sizes (bold-nfs configuration)
- Check TCP socket options (nodelay, buffer sizes)
- Optimize SQLite queries (indexing, caching)
- Consider write buffering (64KB chunks)

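The write-buffering idea in the last bullet can be sketched with the standard library's `BufWriter`, which collects small writes and passes them on in larger chunks. This is only an illustration of the technique: the name `write_buffered` is hypothetical, and whether bold-nfs hands writes to the backend in a way that benefits from this is an open question for Phase 3.

```rust
use std::io::{self, BufWriter, Write};

/// Chunk size suggested in the plan (64 KiB).
pub const CHUNK_SIZE: usize = 64 * 1024;

/// Writes all `pieces` to `sink` through a 64 KiB buffer, so the underlying
/// writer sees a few large writes instead of many small ones.
/// Returns the sink after the buffer has been flushed.
pub fn write_buffered<W: Write>(sink: W, pieces: &[&[u8]]) -> io::Result<W> {
    let mut writer = BufWriter::with_capacity(CHUNK_SIZE, sink);
    for piece in pieces {
        writer.write_all(piece)?;
    }
    // into_inner flushes any remaining bytes and hands back the wrapped writer
    writer.into_inner().map_err(|e| e.into_error())
}
```
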
**Time:** 4-6 hours

---

## 3. Alternative: Go NFS Server (libnfs-go)

If bold-nfs proves unworkable, fall back to libnfs-go.

**Architecture:**
```
MarkBase NFS Server (Go)
├── libnfs-go/server (NFSv4 server)
├── Backend interface (fs.FS interface)
├── MarkBaseBackend (fs.FS implementation)
│   ├── SQLite connection (CGO)
│   └── Go implementation of filesystem operations
└── NFS protocol (NFSv4.0)
```

**Implementation:**
```go
// nfs_backend.go
package main

import (
	"database/sql"
	"errors"
	"log"

	_ "github.com/mattn/go-sqlite3"
	"github.com/smallfz/libnfs-go/fs"
	"github.com/smallfz/libnfs-go/server"
)

// MarkBaseBackend is a sketch of the backend. The method set below is
// illustrative and must be adapted to the interface libnfs-go actually expects.
type MarkBaseBackend struct {
	userID string
	dbPath string
	db     *sql.DB
}

func (b *MarkBaseBackend) Open(path string) (fs.File, error) {
	// Query aliases_json from file_nodes
	// Open file from disk
	// Return file handle
	return nil, errors.New("Open: not implemented")
}

func (b *MarkBaseBackend) ReadDir(path string) ([]string, error) {
	// Query file_nodes WHERE parent_id = ?
	// Return child filenames
	return nil, errors.New("ReadDir: not implemented")
}

func main() {
	dbPath := "data/users/warren.sqlite"

	// Open the SQLite database through the CGO driver imported above
	db, err := sql.Open("sqlite3", dbPath)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	backend := &MarkBaseBackend{
		userID: "warren",
		dbPath: dbPath,
		db:     db,
	}

	svr, err := server.NewServerTCP("127.0.0.1:2049", backend)
	if err != nil {
		log.Fatal(err)
	}

	svr.Serve()
}
```

**Trade-offs:**
- **Pros:** Simpler API (fs.FS), mature library
- **Cons:** Go binary (not Rust-native), CGO for SQLite

---

## 4. Risk Analysis

### Risks with Direct NFS

**Risk 1: bold-nfs maturity (30%)**
- bold-nfs is WIP (NFSv4.0 only, v4.1/v4.2 not implemented)
- May have bugs in NFS protocol implementation
- **Mitigation:** Test thoroughly, fall back to libnfs-go

**Risk 2: macOS NFS client compatibility (20%)**
- macOS mount_nfs may require specific NFS options
- The macOS NFSv4.0 client may behave differently from the Linux client bold-nfs was tested with
- **Mitigation:** Test with bold-mem first, adjust options

**Risk 3: vfs trait complexity (15%)**
- FileSystem trait has 9 required methods
- Implementation may have bugs
- **Mitigation:** Use vfs test macros (test_vfs!)

**Risk 4: Performance (10%)**
- NFS overhead may be higher than expected
- SQLite queries may slow down file operations
- **Mitigation:** Optimize queries, add caching

**Risk 5: AJA System Test compatibility (5%)**
- AJA may not work with NFS mount
- **Mitigation:** AJA System Test is expected to work with any mounted volume, including NFS; confirm this early in Phase 3

**Overall estimate:** 80% success probability (acceptable), slightly more conservative than the 85% headline confidence

---

## 5. Comparison Summary

### FUSE (fuse-t) - Current Approach

**Pros:**
- FUSE-native (filesystem in userspace)
- fuse-backend-rs library available
- FUSE protocol well-documented

**Cons:**
- go-nfsv4 lifecycle issue unresolved
- Complex process lifecycle (fork/exec/socket)
- Dependency on fuse-t binary (23MB)
- 60% success rate, uncertain debugging time

**Recommendation:** **Abandon FUSE approach**

---

### Direct NFS (bold-nfs) - New Approach

**Pros:**
- Simple architecture (server + mount)
- Rust-native (bold-nfs library)
- No go-nfsv4 dependency
- Proven demo (bold-mem works)
- 85% success rate, 4 days implementation

**Cons:**
- bold-nfs is WIP (NFSv4.0 only)
- vfs trait implementation required
- New dependency (bold-nfs library)

**Recommendation:** **Adopt Direct NFS approach**

---

### Go NFS (libnfs-go) - Fallback

**Pros:**
- Mature library (v0.0.7, MIT license)
- Simple API (fs.FS interface)
- Production-ready NFS server

**Cons:**
- Go binary (not Rust-native)
- CGO for SQLite (complexity)
- Separate process management

**Recommendation:** **Fallback if bold-nfs fails**

---

## 6. Decision

**Switch to Direct NFS (bold-nfs)**

**Reasons:**
1. FUSE approach has an unresolved lifecycle issue (50+ attempts failed)
2. Direct NFS is simpler (no fork/exec/socket complexity)
3. Rust-native solution fits project stack
4. 85% estimated success rate vs 60% for FUSE
5. 4 days of implementation vs 4-7 days of further FUSE debugging
6. AJA System Test is expected to work with NFS mounts

**Next actions:**
1. Test bold-nfs on macOS (Day 1)
2. Implement MarkBaseFS backend (Day 2-3)
3. Validate AJA System Test (Day 4)

**Fallback plan:**
If bold-nfs fails → use libnfs-go (Go NFS server)

---

## 7. Implementation Schedule

**Day 1 (2026-05-17):**
- Morning: Clone bold-nfs, build bold-mem
- Afternoon: Test macOS mount_nfs connection
- Evening: Verify file operations work

**Day 2 (2026-05-18):**
- Morning: Create MarkBaseFS struct
- Afternoon: Implement FileSystem trait (read_dir, open_file, metadata)
- Evening: Test with SQLite backend

**Day 3 (2026-05-19):**
- Morning: Implement write operations (create_file, remove_file)
- Afternoon: Create NFS server binary
- Evening: Test full workflow with warren.sqlite

**Day 4 (2026-05-20):**
- Morning: Mount NFS volume for AJA testing
- Afternoon: Run AJA System Test (4K ProRes)
- Evening: Analyze throughput, optimize if needed

**Total: 4 days, 85% confidence**

---

**Report prepared by:** OpenCode AI Assistant
**Session:** FUSE debugging → NFS direct implementation
**Decision date:** 2026-05-17 13:30
**Action:** Start Phase 1 immediately