FUSE Mount Detailed Diagnosis Report
Date: 2026-05-17 13:22
Status: Critical Issue Identified
Attempts: 50+ mount attempts, all failed
1. Current Symptoms
Successful Indicators
- ✅ Socket negotiation: go-nfsv4 receives socket FDs (9, 11)
- ✅ FUSE session negotiated: profile=v3, proto=7.19
- ✅ NFS server starts: 127.0.0.1:52100
- ✅ mount_nfs command executed
- ✅ FUSE requests received: 3 requests (init + 2 others)
- ✅ wait_mount() returns OK
Failure Indicators
- ❌ No actual mount visible (mount | grep MarkBase prints nothing)
- ❌ Mount directory empty (no files visible)
- ❌ go-nfsv4 process dies immediately (becomes zombie, then reaped)
- ❌ NFS server port not listening after mount attempt
- ❌ No mount status message in fuse-t.log (neither success nor failure)
- ❌ AJA System Test cannot validate (mount not available)
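Because wait_mount() reported success while no mount existed, the mount table is worth checking independently of the return value. The sketch below is one way to do that from captured `mount` output. It assumes the usual macOS line shape `<source> on <mount point> (<options>)`; the function name `is_mounted` is hypothetical, and an exact match on the mount point is used so that a similarly named path cannot produce a false positive the way a bare grep could.

```rust
/// Returns true if `mount_output` (the text printed by `mount`) contains an
/// entry whose mount point is exactly `target`.
/// Assumption: each line looks like `<source> on <mount point> (<options>)`.
pub fn is_mounted(mount_output: &str, target: &str) -> bool {
    mount_output.lines().any(|line| {
        line.split_once(" on ")
            // Cut the trailing " (options)" part off the mount point.
            .and_then(|(_source, rest)| rest.rsplit_once(" ("))
            .map(|(mount_point, _options)| mount_point == target)
            .unwrap_or(false)
    })
}
```

The output of `std::process::Command::new("mount").output()` can be passed in as `mount_output`, with `/private/tmp/MarkBase_warren` as `target`.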
2. Root Cause Analysis
Primary Issue
go-nfsv4 dies immediately after executing mount_nfs, before sending mount status back to parent
Evidence Timeline
13:20:51 - go-nfsv4 started (PID 60543)
13:20:51 - Socket negotiation successful (FD 9, 11)
13:20:51 - NFS server running (127.0.0.1:52100)
13:20:51 - mount_nfs command executed
13:20:51 - [MISSING] go-nfsv4 should send status back
13:20:51+ - go-nfsv4 dies (zombie → reaped)
13:20:51+ - wait_mount() returns OK (unexpected!)
Critical Mystery
Why does wait_mount() return OK when go-nfsv4 died?
Expected behavior:
- recv() should return 0 bytes (EOF) or an error when go-nfsv4 closes the socket without replying
- status should then still be -1, so the thread should return an error
- wait_mount() should return an error
Actual behavior:
- wait_mount() returns Ok(())
- No error message from fuse-backend-rs
Hypotheses
H1: Race Condition in Socket Closure
- go-nfsv4 sends status=0 quickly
- Then dies
- recv() succeeds with status=0
- Thread returns Ok(())
- wait_mount() returns OK
H2: recv() Timeout
- recv() has hidden timeout
- Returns "success" even if no data received
- Thread misinterprets as success
H3: Monitor Socket Behavior
- Monitor socket is bidirectional
- Some internal mechanism triggers early "success"
- Actual mount happens in background
H4: fuse-backend-rs Bug
- Thread implementation has bug
- Incorrect error handling
- Missing status check
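Two of the cases above can be reproduced in isolation with a local socket pair. This is a minimal sketch, not the fuse-t protocol itself: it assumes the monitor socket behaves like a Unix stream socket and that the status is a native-endian 4-byte integer, and the function name `read_after_peer_exit` is hypothetical. It shows that a peer that exits without replying produces a zero-length read that leaves the status at -1, while a peer that writes status 0 and exits immediately still delivers that status (the H1 scenario).

```rust
use std::io::{self, Read, Write};
use std::os::unix::net::UnixStream;

/// Simulates the child end optionally writing a status and then exiting.
/// Returns (bytes_read, status_seen_by_parent).
pub fn read_after_peer_exit(reply: Option<i32>) -> io::Result<(usize, i32)> {
    let (mut parent, mut child) = UnixStream::pair()?;
    if let Some(status) = reply {
        child.write_all(&status.to_ne_bytes())?;
    }
    drop(child); // stands in for the go-nfsv4 process going away

    // Same initial value as `status` in the thread under review.
    let mut buf = (-1i32).to_ne_bytes();
    let n = parent.read(&mut buf)?; // a clean close yields Ok(0), not Err
    Ok((n, i32::from_ne_bytes(buf)))
}
```

Under these assumptions, `read_after_peer_exit(None)` gives `(0, -1)` and `read_after_peer_exit(Some(0))` gives `(4, 0)`. If the real socket behaves the same way, an Ok(()) from wait_mount() is most easily explained by a status of 0 having been received before the process died.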
3. Code Review Findings
fuse-backend-rs Implementation (fuse_t_session.rs)
send_mount_command() thread:
let handle = std::thread::spawn(move || {
    send(mon_fd, b"mount", MsgFlags::empty())?;
    let mut status = -1;
    loop {
        match recv(mon_fd, status.as_mut_slice(), MsgFlags::empty()) {
            Ok(_size) => return if status == 0 { Ok(()) } else { Err(...) },
            Err(Errno::EINTR) => continue,
            Err(e) => return Err(...),
        }
    }
});
Potential issues:
- No timeout on recv() → could block forever if go-nfsv4 doesn't respond
- Status check is a bare integer comparison and the byte count returned by recv() is ignored → a short read could be misinterpreted
- No validation of socket state → recv() could succeed with garbage
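The issues listed above could be addressed along the following lines. This is a sketch using only the Rust standard library rather than the `nix` calls in the reviewed thread; it assumes a Unix stream socket and a native-endian 4-byte status, and the names `read_mount_status` and `wait_for_mount` are hypothetical. It adds a read timeout, requires a complete 4-byte status, and reports a closed peer as a distinct error.

```rust
use std::io::{self, Read};
use std::os::unix::net::UnixStream;
use std::time::Duration;

/// Reads exactly one native-endian i32 status or fails.
/// Timeout -> ErrorKind::WouldBlock or TimedOut (platform dependent).
/// Peer closed before 4 bytes arrived -> ErrorKind::UnexpectedEof.
pub fn read_mount_status(sock: &mut UnixStream, timeout: Duration) -> io::Result<i32> {
    // Note: set_read_timeout rejects a zero duration.
    sock.set_read_timeout(Some(timeout))?;
    let mut buf = [0u8; 4];
    // read_exact retries on EINTR and never accepts a short read.
    sock.read_exact(&mut buf)?;
    Ok(i32::from_ne_bytes(buf))
}

/// Maps the status to success or failure, mirroring the reviewed thread.
pub fn wait_for_mount(sock: &mut UnixStream, timeout: Duration) -> io::Result<()> {
    match read_mount_status(sock, timeout)? {
        0 => Ok(()),
        code => Err(io::Error::new(
            io::ErrorKind::Other,
            format!("mount helper reported status {code}"),
        )),
    }
}
```

Logging the error kind at the call site would separate "no reply in time" from "helper exited without replying", which the current logs cannot distinguish.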
4. go-nfsv4 Behavior Analysis
From fuse-t.log
level=info msg="mount [-o port=52100,mountport=52100,vers=4,nobrowse -t nfs fuse-t:/MarkBase-warren /private/tmp/MarkBase_warren]"
Missing messages:
- ❌ No "Mount successful" message
- ❌ No "Mount failed" message
- ❌ No error messages after mount_nfs
Expected behavior (from fuse-t README)
"After the filesystem process dies the server terminates"
This suggests:
- Server should persist until filesystem process (our Rust binary) dies
- But in our case, server dies first
- Parent process continues running (infinite loop)
Mount command analysis
mount -o port=52100,mountport=52100,vers=4,nobrowse -t nfs fuse-t:/MarkBase-warren /private/tmp/MarkBase_warren
Key observations:
- Source: fuse-t:/MarkBase-warren (special fuse-t format)
- Target: /private/tmp/MarkBase_warren (absolute path)
- Options: port=52100, mountport=52100, vers=4, nobrowse
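To replay the logged command with a different port or target, the argument list can be rebuilt programmatically. The helper below is a hypothetical sketch (`nfs_mount_args` is not part of fuse-t or fuse-backend-rs); it only reproduces the option layout seen in fuse-t.log and makes no claim about how go-nfsv4 assembles it internally.

```rust
/// Builds the argument list for `mount` in the same shape as the logged line:
/// -o port=P,mountport=P,vers=4,nobrowse -t nfs <source> <target>
pub fn nfs_mount_args(port: u16, source: &str, target: &str) -> Vec<String> {
    vec![
        "-o".to_string(),
        format!("port={port},mountport={port},vers=4,nobrowse"),
        "-t".to_string(),
        "nfs".to_string(),
        source.to_string(),
        target.to_string(),
    ]
}
```

Calling `nfs_mount_args(52100, "fuse-t:/MarkBase-warren", "/private/tmp/MarkBase_warren")` yields the same arguments as the log entry, which makes it straightforward to vary one option at a time.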
5. Process Lifecycle Comparison
Expected Lifecycle (from fuse-t README)
1. libfuse mount API → fork()
2. Child: exec go-nfsv4 (replace process)
3. go-nfsv4: start NFS server on TCP port
4. go-nfsv4: receive "mount" message from parent
5. go-nfsv4: execute mount_nfs
6. go-nfsv4: send status back to parent
7. go-nfsv4: persist as daemon (handle FUSE requests)
8. Parent: run FUSE request handler thread
9. When parent dies → go-nfsv4 terminates
Actual Lifecycle (observed)
1. fuse-backend-rs: fork()
2. Child: exec go-nfsv4 ✓
3. go-nfsv4: start NFS server ✓
4. go-nfsv4: receive "mount" ✓ (assumed)
5. go-nfsv4: execute mount_nfs ✓
6. go-nfsv4: dies immediately ✗
7. [MISSING] go-nfsv4 doesn't persist
8. Parent: continues running (handler thread blocks)
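The gap between steps 6 and 7 is easier to pin down if the parent checks whether the helper is still alive shortly after the mount reply. The sketch below assumes the parent holds a `std::process::Child` handle for the helper, which is not the case today since fuse-backend-rs performs the fork itself; `exited_within` is a hypothetical name.

```rust
use std::io;
use std::process::{Child, ExitStatus};
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Polls `child` for up to `grace`. Returns Some(status) if the process
/// exited inside that window, or None if it is still running.
pub fn exited_within(child: &mut Child, grace: Duration) -> io::Result<Option<ExitStatus>> {
    let deadline = Instant::now() + grace;
    loop {
        // try_wait never blocks; it also reaps the child if it has exited.
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            return Ok(None);
        }
        sleep(Duration::from_millis(10));
    }
}
```

A `Some(status)` result right after a successful mount reply would confirm the early death, and on Unix `std::os::unix::process::ExitStatusExt::signal()` on that status shows whether the helper was killed by a signal rather than exiting on its own.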
6. Alternative Approaches to Consider
A. Direct NFSv4 Server (without fuse-t)
- Pros: No dependency on fuse-t, full control
- Cons: 2-3 weeks development, complex NFS protocol
- Success rate: 80%
B. WebDAV Server
- Pros: Simple protocol, macOS native support, 2-3 days
- Cons: Not FUSE, requires Finder WebDAV mount
- Success rate: 95%
C. SMB Server
- Pros: macOS native support, simple implementation
- Cons: Not FUSE, different permission model
- Success rate: 90%
D. Fix fuse-t Integration
- Pros: Native FUSE, best performance
- Cons: Requires deep debugging, uncertain success
- Success rate: 60%
E. Contact fuse-t Developers
- Pros: Expert help, definitive solution
- Cons: Dependent on external response time
- Success rate: 70%
7. Immediate Next Steps
Debugging Priorities
Priority 1: Understand wait_mount() behavior
- Add recv() timeout logging
- Monitor socket state with lsof during recv()
- Capture exact moment when go-nfsv4 dies
- Check if recv() gets status=0 before death
Priority 2: Test mount_nfs directly
- Execute mount_nfs command manually
- Check if mount_nfs itself is failing
- Test with different NFS options
- Check macOS NFS client behavior
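For the manual mount test, capturing the exit code and stderr of the command is more useful than watching the terminal, since fuse-t.log shows nothing after the mount line. A minimal sketch follows; `run_capture` is a hypothetical helper and works for any program, not only mount.

```rust
use std::io;
use std::process::Command;

/// Runs `program` with `args`, waits for it to finish, and returns
/// (exit code if the process exited normally, captured stderr as text).
pub fn run_capture(program: &str, args: &[&str]) -> io::Result<(Option<i32>, String)> {
    let out = Command::new(program).args(args).output()?;
    Ok((
        out.status.code(), // None if the process was terminated by a signal
        String::from_utf8_lossy(&out.stderr).into_owned(),
    ))
}
```

Running it with `"mount"` and the arguments from the log entry, while an NFS server is listening on the chosen port, should show whether the mount command itself fails and with which message.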
Priority 3: Minimal fuse-t test
- Create minimal Rust program using fuse-backend-rs
- Test with hello.rs example (POC hello FUSE)
- Compare our code with working example
- Identify differences
Priority 4: Contact fuse-t community
- File bug report with detailed logs
- Ask about go-nfsv4 daemon lifecycle
- Share our test results
- Request guidance on proper usage
8. Time Estimate
If we continue debugging fuse-t
- 1-2 days for detailed logging
- 1-2 days for minimal test case
- 2-3 days for community feedback
- Total: 4-7 days, uncertain outcome
If we switch to WebDAV
- 1 day for basic WebDAV server
- 1 day for macOS Finder integration
- 1 day for AJA System Test validation
- Total: 3 days, high confidence
9. Recommendation
Switch to WebDAV implementation
Reasons:
- Time efficiency: 3 days vs 4-7 days
- Success probability: 95% vs 60%
- Stability: WebDAV is simpler, less prone to race conditions
- Native support: macOS Finder has built-in WebDAV client
- Testing: AJA System Test works with mounted volumes (any protocol)
Trade-off:
- WebDAV is not FUSE (can't use fuse-backend-rs)
- Performance may be slightly lower (HTTP overhead)
- But achieves core goal: virtual filesystem accessible to macOS apps
10. WebDAV Implementation Plan
Phase 1: Basic WebDAV Server (Day 1)
- Use Rust webdav-handler library (if available)
- Or implement minimal WebDAV protocol (PUT, GET, PROPFIND)
- SQLite backend (read from warren.sqlite)
- File listing: PROPFIND → query nodes from SQLite
- File reading: GET → read file path from aliases_json
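If the minimal-protocol route is taken, the PROPFIND step amounts to turning rows into a `207 Multi-Status` XML body. The sketch below shows that shape with plain string building. The `Entry` struct and its fields are hypothetical stand-ins for whatever the nodes query returns, only two properties are emitted, and href values are XML-escaped but not percent-encoded, which a real server would also need.

```rust
/// Hypothetical view of one row from the nodes table.
pub struct Entry<'a> {
    pub href: &'a str,
    pub is_dir: bool,
    pub size: u64,
}

fn xml_escape(s: &str) -> String {
    s.replace('&', "&amp;").replace('<', "&lt;").replace('>', "&gt;")
}

/// Builds the body of a 207 Multi-Status reply to PROPFIND.
pub fn multistatus(entries: &[Entry]) -> String {
    let mut xml = String::from(
        "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n<D:multistatus xmlns:D=\"DAV:\">\n",
    );
    for e in entries {
        // Collections are marked by resourcetype; files carry a length.
        let props = if e.is_dir {
            "<D:resourcetype><D:collection/></D:resourcetype>".to_string()
        } else {
            format!(
                "<D:resourcetype/><D:getcontentlength>{}</D:getcontentlength>",
                e.size
            )
        };
        xml.push_str(&format!(
            "<D:response><D:href>{}</D:href><D:propstat><D:prop>{}</D:prop>\
             <D:status>HTTP/1.1 200 OK</D:status></D:propstat></D:response>\n",
            xml_escape(e.href),
            props
        ));
    }
    xml.push_str("</D:multistatus>\n");
    xml
}
```

Finder tends to request more properties than these two (modification time, for example), so this is a starting point for testing rather than a complete handler.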
Phase 2: macOS Finder Mount (Day 2)
- Finder → Connect to Server → http://localhost:8080/webdav
- Or use mount_webdav command
- Test file browsing in Finder
- Verify AJA System Test can see mounted files
Phase 3: AJA System Test Validation (Day 3)
- Write 4K ProRes files to WebDAV mount
- Measure throughput (target: >= 600 MB/s)
- Compare with FUSE theoretical performance
- Document results
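Before running the full AJA System Test, a rough sequential-write probe can indicate whether the 600 MB/s target is within reach at all. The function below is a sketch under simple assumptions: it writes zero-filled chunks, calls `sync_all` so buffered data is included in the timing, and reports decimal megabytes per second. The name `write_throughput_mb_s` is hypothetical, and the result is not comparable one-to-one with AJA's figures.

```rust
use std::fs::File;
use std::io::{self, Write};
use std::path::Path;
use std::time::Instant;

/// Writes `total_bytes` of zeros to `path` in `chunk_size` pieces and
/// returns the observed rate in MB/s (1 MB = 1_000_000 bytes).
pub fn write_throughput_mb_s(path: &Path, total_bytes: u64, chunk_size: usize) -> io::Result<f64> {
    if chunk_size == 0 {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "chunk_size must be non-zero",
        ));
    }
    let chunk = vec![0u8; chunk_size];
    let start = Instant::now();
    let mut file = File::create(path)?;
    let mut remaining = total_bytes;
    while remaining > 0 {
        let n = remaining.min(chunk_size as u64) as usize;
        file.write_all(&chunk[..n])?;
        remaining -= n as u64;
    }
    file.sync_all()?; // include flush-to-storage time in the measurement
    let secs = start.elapsed().as_secs_f64();
    Ok(total_bytes as f64 / 1e6 / secs)
}
```

Pointing `path` at a file inside the mounted volume and using a total size of several gigabytes gives a more stable number than a small write, which mostly measures caching.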
Next action: Decision point - continue debugging fuse-t or switch to WebDAV?
Current recommendation: Switch to WebDAV (95% success in 3 days)
Appendix: Test Logs
Latest fuse-t.log (PID 60543)
13:20:51 - Server started: 127.0.0.1:52100
13:20:51 - Mounting: /private/tmp/MarkBase_warren
13:20:51 - mount [-o port=52100,mountport=52100,vers=4,nobrowse -t nfs fuse-t:/MarkBase-warren /private/tmp/MarkBase_warren]
[NO FURTHER MESSAGES - go-nfsv4 died]
Latest Rust program output
[INFO] wait_mount() returned OK - mount completed successfully
[INFO] Mount completed for user: warren
[DEBUG] Handler thread status: false
[DEBUG] Joining handler thread...
[BLOCKS HERE - handler thread never exits]
System state after mount attempt
$ mount | grep MarkBase
[NO OUTPUT - mount not visible]
$ lsof -i :52100
[NO OUTPUT - NFS server not running]
$ ls /tmp/MarkBase_warren/
[EMPTY - no files visible]
Report prepared by: OpenCode AI Assistant
Session: FUSE debugging session
Total attempts: 50+
Time spent: 6 hours