# MarkBase FUSE System Design

## Overview

**Objective**: Implement a virtual file system mount for MarkBase users with FUSE, so that files can be accessed directly from macOS Finder and video editing software.

**Target Performance**: 600 MB/s sustained write per user, with 10 concurrent users.

**Technology Choice**: FUSE-T (kext-less FUSE for macOS)

---

## FUSE-T vs macFUSE Comparison

### Core Architecture

| Feature | FUSE-T | macFUSE |
|---------|--------|---------|
| **Kernel Design** | Kext-less (userspace server) | Kernel Extension + FSKit (macOS 26+) |
| **Backend Protocol** | NFSv4 / SMB3 / FSKit | Direct kernel FUSE API |
| **Installation** | Simple (brew install) | Requires System Settings → Privacy & Security |
| **Stability** | Stable (userspace server) | Potential kernel crash/lock-up |
| **License** | Free personal use, commercial license required | Open source (BSD-style) |
| **macOS Support** | All versions | macOS 12+ |
| **App Store** | Embeddable | Requires special handling |
| **API Compatibility** | libfuse2/libfuse3 | libfuse2/libfuse3 + macFUSE.framework |
| **Install Count** | 21,892 (365 days) | 129,388 (365 days) |

### Technical Flow

**FUSE-T Operation:**
```
User App → libfuse → FUSE-T Server (userspace) → NFS/SMB/FSKit → macOS mount
```

**macFUSE Operation (Legacy):**
```
User App → libfuse → macFUSE kext (kernel) → VFS → macOS mount
```

**macFUSE Operation (macOS 26+):**
```
User App → libfuse → FSKit (userspace) → macOS mount
```

### Performance Considerations

| Factor | Impact |
|--------|--------|
| FUSE-T NFS backend | Extra TCP/IP overhead (~5-10% latency) |
| macFUSE kext | Direct kernel path (fastest) |
| macFUSE FSKit | Userspace path (similar to FUSE-T) |
| Network packet handling | FUSE-T requires NFS RPC conversion |
| Large file writes | Both limited by userspace buffer |

### Recommendation

**FUSE-T is recommended for MarkBase:**

1. **Stability Priority**: Avoids the kernel panic risk of a kernel extension
2. **Deployment Friendly**: No approval in System Settings → Privacy & Security needed
3. **macOS 26 Support**: FSKit backend option (same as macFUSE)
4. **Commercial Distribution**: Controllable licensing cost

**macFUSE is suitable for:**

- Maximum raw performance (direct kernel path)
- Environments that already run a stable kernel extension
- Open source projects (no commercial licensing)

---

## Backend Selection

### Backend Types

| Backend | Protocol | macOS Support | Performance | Stability |
|---------|----------|---------------|-------------|-----------|
| **NFSv4** | NFS v4 over TCP | All versions | ~5-10% overhead | Very stable |
| **SMB3** | SMB 3.0 over TCP | All versions | ~8-12% overhead | Stable |
| **FSKit** | Apple FSKit API | macOS 26+ | Direct path | Native |

### Backend Architecture

```rust
// `MountOption` is assumed to come from the FUSE bindings in use.
// Clone/Copy are derived because the mount manager hands the backend
// to each per-user filesystem.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum BackendType {
    Nfs4,  // NFSv4 backend (all macOS versions)
    Fskit, // FSKit backend (macOS 26+)
}

impl BackendType {
    pub fn mount_options(&self) -> Vec<MountOption> {
        match self {
            BackendType::Nfs4 => vec![
                MountOption::Backend("nfs"),
                MountOption::AutoUnmount,
            ],
            BackendType::Fskit => vec![
                MountOption::Backend("fskit"),
                MountOption::AutoUnmount,
            ],
        }
    }
}
```

### Performance Comparison

**Expected Throughput (4K ProRes 4444 Write):**

| Backend | Expected Speed | Overhead | Recommendation |
|---------|----------------|----------|----------------|
| NFSv4 | 550-600 MB/s | 5-10% | Stable baseline |
| FSKit | 600-700 MB/s | Minimal | Performance target |

---

## Module Architecture

### Directory Structure

```
src/fuse/
├── mod.rs              # FUSE core module (entry point)
├── filesystem.rs       # MarkBaseFs implementation
├── handlers.rs         # FUSE operation handlers
├── backend.rs          # Backend selection (NFSv4/FSKit)
├── cache.rs            # LRU cache for metadata
└── mount_manager.rs    # Multi-user concurrent mount
```

### Module Dependencies

```toml
# Cargo.toml additions
[dependencies]
fuse = "0.3"       # FUSE-T Rust bindings (or libfuse3)
time = "0.3"       # Timestamp handling
lru = "0.12"       # LRU cache implementation
uuid = "1.11"      # UUID handling
tokio = { version = "1", features = ["full"] }
```

---

## Core Components

### 1. MarkBaseFs (filesystem.rs)

**Purpose**: Main filesystem implementation backed by SQLite

```rust
use std::collections::HashMap;
use std::path::PathBuf;

use fuse::{FileAttr, Filesystem};
use lru::LruCache;
use rusqlite::Connection; // assumption: the SQLite handle is rusqlite's

pub struct MarkBaseFs {
    user_id: String,
    db: Connection,
    backend: BackendType,

    // Caches, keyed by inode number
    attr_cache: LruCache<u64, FileAttr>,
    path_cache: LruCache<u64, PathBuf>,
    dir_cache: LruCache<u64, Vec<DirEntry>>,

    // Write buffer: pending bytes per inode
    write_buffer: HashMap<u64, Vec<u8>>,
    buffer_size: usize, // Flush threshold in bytes (default: 64 KiB)
}

impl Filesystem for MarkBaseFs {
    // Operations implementation...
}
```

**Key Operations:**

| Operation | Purpose | Database Query | Cache Strategy |
|-----------|---------|----------------|----------------|
| `getattr()` | Get file/directory attributes | `SELECT * FROM file_nodes WHERE node_id = ?` | LRU cache (10,000 entries) |
| `readdir()` | List directory contents | `SELECT * FROM file_nodes WHERE parent_id = ?` | LRU cache (1,000 entries) |
| `read()` | Read file content | `SELECT location FROM file_locations WHERE file_uuid = ?` | Direct I/O (no cache) |
| `write()` | Write file content | `INSERT INTO file_locations` | Buffer 64 KiB chunks |
| `lookup()` | Find file by name | `SELECT node_id FROM file_nodes WHERE parent_id = ? AND label = ?` | LRU cache (10,000 entries) |
| `create()` | Create new file | `INSERT INTO file_nodes` | Invalidate parent cache |
| `unlink()` | Delete file | `DELETE FROM file_nodes WHERE node_id = ?` | Invalidate parent cache |

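### 2. Backend Selection (backend.rs)

**Purpose**: Choose the optimal backend based on the macOS version

```rust
// Assumption: `System` is `sysinfo::System`, whose `os_version()`
// returns an `Option<String>` such as "26.0".
pub fn select_backend() -> BackendType {
    // Compare the major version numerically. Comparing version strings
    // lexicographically would rank "9.0" above "26.0".
    let major = System::os_version()
        .and_then(|v| v.split('.').next()?.parse::<u32>().ok())
        .unwrap_or(0);

    if major >= 26 {
        // macOS 26+ supports FSKit (native, fastest)
        BackendType::Fskit
    } else {
        // Older macOS (or an unknown version) uses NFSv4 (stable)
        BackendType::Nfs4
    }
}
```
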
### 3. Cache Management (cache.rs)

**Purpose**: Reduce SQLite query overhead

```rust
use std::path::PathBuf;
use std::time::{Duration, Instant};

use fuse::FileAttr;
use lru::LruCache;

pub struct FuseCache {
    attr_cache: LruCache<u64, CachedAttr>,
    path_cache: LruCache<u64, PathBuf>,

    ttl: Duration, // Time-to-live: 60 seconds
}

#[derive(Clone)]
struct CachedAttr {
    attr: FileAttr,
    cached_at: Instant,
}

impl FuseCache {
    /// Returns the cached attributes if present and younger than `ttl`.
    /// Takes `&mut self` because an LRU lookup updates the recency order.
    pub fn get_attr(&mut self, ino: u64) -> Option<FileAttr> {
        if let Some(cached) = self.attr_cache.get(&ino) {
            if cached.cached_at.elapsed() < self.ttl {
                return Some(cached.attr.clone());
            }
        }
        // Missing or expired: the caller re-queries SQLite and calls put_attr
        None
    }

    /// Stores attributes and restarts their TTL.
    pub fn put_attr(&mut self, ino: u64, attr: FileAttr) {
        self.attr_cache.put(ino, CachedAttr {
            attr,
            cached_at: Instant::now(),
        });
    }
}
```

### 4. Mount Manager (mount_manager.rs)

**Purpose**: Handle multi-user concurrent mounts

```rust
use std::collections::HashMap;
use std::path::PathBuf;

use tokio::task::{JoinHandle, JoinSet};

// `Result` is the crate's error alias (not shown here).

pub struct MountManager {
    mounts: HashMap<String, MountedFs>,
    backend: BackendType,
}

pub struct MountedFs {
    user_id: String,
    mount_path: PathBuf,
    process: JoinHandle<Result<()>>,
}

/// Opens the user's database and starts the FUSE session in the background.
fn start_mount(user_id: String, base_dir: PathBuf, backend: BackendType) -> Result<MountedFs> {
    let mount_path = base_dir.join(format!("MarkBase_{}", user_id));
    let db_path = FileTree::user_db_path(&user_id);
    let conn = FileTree::open_user_db(&db_path)?;
    let fs = MarkBaseFs::new(user_id.clone(), conn, backend.clone());

    // fuse::mount blocks until the filesystem is unmounted, so it runs on
    // the blocking thread pool rather than on an async worker thread.
    let task_path = mount_path.clone();
    let process = tokio::task::spawn_blocking(move || -> Result<()> {
        fuse::mount(fs, &task_path, &backend.mount_options())?;
        Ok(())
    });

    Ok(MountedFs { user_id, mount_path, process })
}

impl MountManager {
    pub async fn mount_user(&mut self, user_id: String, base_dir: PathBuf) -> Result<()> {
        let mounted = start_mount(user_id.clone(), base_dir, self.backend.clone())?;
        self.mounts.insert(user_id, mounted);
        Ok(())
    }

    pub async fn mount_all(&mut self, users: Vec<String>, base_dir: PathBuf) -> Result<()> {
        let mut tasks = JoinSet::new();

        for user_id in users {
            let (base_dir, backend) = (base_dir.clone(), self.backend.clone());
            tasks.spawn(async move { start_mount(user_id, base_dir, backend) });
        }

        // Register every mount with this manager so unmount_all() can find it
        while let Some(joined) = tasks.join_next().await {
            let mounted = joined??;
            self.mounts.insert(mounted.user_id.clone(), mounted);
        }

        Ok(())
    }

    pub async fn unmount_all(&mut self) -> Result<()> {
        for mount in self.mounts.values() {
            fuse::unmount(&mount.mount_path)?;
        }
        self.mounts.clear();
        Ok(())
    }
}
```

---

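## Performance Optimization

### Strategy 1: Write Buffering

**Problem**: Per-request overhead of FUSE `write()` calls

**Solution**: Buffer writes and flush them in 64 KiB chunks

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};

impl MarkBaseFs {
    // Simplified handler: assumes sequential, append-only writes (the usual
    // pattern for video capture), so `_offset` is not honored here.
    fn write(&mut self, ino: u64, _offset: u64, data: &[u8], reply: ReplyWrite) {
        // Accumulate in buffer
        let buffer = self.write_buffer.entry(ino).or_default();
        buffer.extend_from_slice(data);

        // Flush when buffer >= 64 KiB
        if buffer.len() >= 64 * 1024 {
            if let Err(err) = self.flush_buffer(ino) {
                reply.error(err.raw_os_error().unwrap_or(libc::EIO));
                return;
            }
        }

        reply.written(data.len() as u32);
    }

    // Must also be called from flush()/release() so that the final,
    // partially filled buffer reaches disk.
    fn flush_buffer(&mut self, ino: u64) -> io::Result<()> {
        if let Some(buffer) = self.write_buffer.remove(&ino) {
            // get_file_path is assumed to return a Result
            let path = self
                .get_file_path(ino)
                .map_err(|e| io::Error::new(io::ErrorKind::Other, e.to_string()))?;
            // Append rather than overwrite, so earlier flushes are preserved
            let mut file = OpenOptions::new().create(true).append(true).open(&path)?;
            file.write_all(&buffer)?;
        }
        Ok(())
    }
}
```
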
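
The handler above treats every write as an append. Editing software can also write at arbitrary offsets, so a buffer that records where its pending bytes belong may be needed. The following is a minimal sketch of that idea using only the standard library; `WriteBuffer`, `Pending`, and the method names are hypothetical and not part of MarkBase.

```rust
use std::collections::HashMap;
use std::fs::OpenOptions;
use std::io::{self, Seek, SeekFrom, Write};
use std::path::Path;

/// Pending bytes for one file, starting at byte offset `start`.
struct Pending {
    start: u64,
    data: Vec<u8>,
}

/// Hypothetical offset-aware write buffer keyed by inode.
pub struct WriteBuffer {
    threshold: usize,
    pending: HashMap<u64, Pending>,
}

impl WriteBuffer {
    pub fn new(threshold: usize) -> Self {
        Self { threshold, pending: HashMap::new() }
    }

    /// Buffers `data` for `ino` at `offset`, flushing to `path` when the
    /// write is not contiguous or the threshold is reached.
    pub fn write(&mut self, ino: u64, path: &Path, offset: u64, data: &[u8]) -> io::Result<()> {
        // A write that does not continue the buffered run forces a flush first.
        let contiguous = self
            .pending
            .get(&ino)
            .map_or(true, |p| p.start + p.data.len() as u64 == offset);
        if !contiguous {
            self.flush(ino, path)?;
        }

        let entry = self
            .pending
            .entry(ino)
            .or_insert_with(|| Pending { start: offset, data: Vec::new() });
        entry.data.extend_from_slice(data);

        if entry.data.len() >= self.threshold {
            self.flush(ino, path)?;
        }
        Ok(())
    }

    /// Writes any pending bytes for `ino` at their recorded offset.
    pub fn flush(&mut self, ino: u64, path: &Path) -> io::Result<()> {
        if let Some(p) = self.pending.remove(&ino) {
            // Open without truncation so earlier flushes are preserved.
            let mut file = OpenOptions::new().create(true).write(true).open(path)?;
            file.seek(SeekFrom::Start(p.start))?;
            file.write_all(&p.data)?;
        }
        Ok(())
    }

    /// Number of bytes currently buffered for `ino`.
    pub fn pending_len(&self, ino: u64) -> usize {
        self.pending.get(&ino).map_or(0, |p| p.data.len())
    }
}
```

In a real handler, `flush` would also run on `release()` and `fsync()`, and a per-inode memory limit would bound the total buffered data.
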
### Strategy 2: Metadata Caching

**Problem**: SQLite query latency (~2-5 ms per query)

**Solution**: LRU cache with 60 s TTL

**Cache Configuration:**

| Cache Type | Size | TTL | Hit Rate Target |
|------------|------|-----|-----------------|
| attr_cache | 10,000 entries | 60s | 95% |
| path_cache | 10,000 entries | 60s | 90% |
| dir_cache | 1,000 entries | 60s | 85% |

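
To see what the hit rate targets buy, the expected lookup cost can be computed as a weighted average of the hit and miss costs. The sketch below assumes a cache hit costs about 1 µs (an illustrative figure, not a measurement) and uses the 2-5 ms query latency quoted above for misses. Under those assumptions a 95% hit rate brings the average lookup to roughly 0.1-0.25 ms.

```rust
/// Expected per-lookup latency in microseconds for a given cache hit rate.
/// `hit_us` and `miss_us` are the costs of a cache hit and of a SQLite query.
pub fn expected_latency_us(hit_rate: f64, hit_us: f64, miss_us: f64) -> f64 {
    hit_rate * hit_us + (1.0 - hit_rate) * miss_us
}

/// Prints the expected latency range for each hit rate target.
pub fn print_cache_estimates() {
    // Hit rate targets from the cache configuration table.
    for (name, hit_rate) in [("attr_cache", 0.95), ("path_cache", 0.90), ("dir_cache", 0.85)] {
        let best = expected_latency_us(hit_rate, 1.0, 2_000.0);
        let worst = expected_latency_us(hit_rate, 1.0, 5_000.0);
        println!("{name}: {best:.0}-{worst:.0} us per lookup");
    }
}
```
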
### Strategy 3: FSKit Backend

**Problem**: NFSv4 TCP/IP overhead (~5-10%)

**Solution**: Use the FSKit backend on macOS 26+

**Expected Performance Impact:**

| Metric | NFSv4 | FSKit | Improvement |
|--------|-------|-------|-------------|
| Write latency | 15 ms | 8 ms | 47% reduction |
| Read latency | 10 ms | 5 ms | 50% reduction |
| Throughput | 550 MB/s | 650 MB/s | 18% increase |

---

## Multi-User Concurrent Mount

### Architecture

```
MountManager
├── mount_user(user_id) → spawn tokio task
├── mount_all([user1, user2, ...]) → parallel mount
├── unmount_user(user_id) → graceful shutdown
└── unmount_all() → cleanup all mounts
```

### Mount Paths

```
/Volumes/
├── MarkBase_warren/     → data/users/warren.sqlite
├── MarkBase_momentry/   → data/users/momentry.sqlite
├── MarkBase_demo/       → data/users/demo.sqlite
├── MarkBase_user1/      → data/users/user1.sqlite
...
└── MarkBase_user10/     → data/users/user10.sqlite
```

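
The layout above follows a fixed naming rule: `MarkBase_<user_id>` under the mount base and `users/<user_id>.sqlite` under the data directory. A minimal sketch of that rule is shown here. The function names and the user ID character check are assumptions added for illustration; the check simply keeps an ID such as `../x` from escaping the intended directories.

```rust
use std::path::{Path, PathBuf};

/// Accepts only non-empty IDs made of ASCII letters, digits, '_' and '-'.
/// (Hypothetical rule; the real user ID format is not specified here.)
fn is_safe_user_id(user_id: &str) -> bool {
    !user_id.is_empty()
        && user_id
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-')
}

/// Mount point for a user, for example /Volumes/MarkBase_warren.
pub fn mount_path(base_dir: &Path, user_id: &str) -> Option<PathBuf> {
    if !is_safe_user_id(user_id) {
        return None;
    }
    Some(base_dir.join(format!("MarkBase_{}", user_id)))
}

/// SQLite database for a user, for example data/users/warren.sqlite.
pub fn user_db_path(data_dir: &Path, user_id: &str) -> Option<PathBuf> {
    if !is_safe_user_id(user_id) {
        return None;
    }
    Some(data_dir.join("users").join(format!("{}.sqlite", user_id)))
}
```
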
### Concurrent Strategy

```rust
// Parallel mount using tokio::task::JoinSet
pub async fn mount_all(&mut self, users: Vec<String>, base_dir: PathBuf) -> Result<()> {
    let mut tasks = JoinSet::new();

    for user_id in users {
        let base_dir = base_dir.clone();
        let backend = self.backend.clone();
        // `start_mount` stands for the per-user setup: open the database
        // and spawn the blocking FUSE session. Each task owns its inputs.
        tasks.spawn(async move { start_mount(user_id, base_dir, backend) });
    }

    // Collect results and register each mount with this manager, so that
    // unmount_all() can find it later
    while let Some(joined) = tasks.join_next().await {
        let mounted = joined??;
        self.mounts.insert(mounted.user_id.clone(), mounted);
    }

    Ok(())
}
```

### Performance Target

| Metric | Target | Measurement |
|--------|--------|-------------|
| Mount latency (single user) | <100 ms | Time from mount() to ready |
| Mount latency (10 users) | <2 s | Parallel mount completion |
| Mount stability | 24h uptime | No crash/lock-up |
| Concurrent writes | 10 × 600 MB/s | AJA System Test |

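
A quick back-of-the-envelope check puts the concurrent write target in perspective. Assuming decimal megabytes (1 MB = 1,000,000 bytes) and the 64 KiB flush threshold from the write buffering strategy, 10 writers at 600 MB/s add up to 6,000 MB/s, and each writer flushes its buffer roughly 9,155 times per second. The helper names below are illustrative only.

```rust
/// Aggregate write throughput in MB/s for `users` concurrent writers.
pub fn aggregate_mb_per_s(users: u64, per_user_mb_per_s: u64) -> u64 {
    users * per_user_mb_per_s
}

/// Buffer flushes per second for one writer at `mb_per_s` (decimal MB)
/// with a flush threshold of `buffer_bytes`.
pub fn flushes_per_second(mb_per_s: u64, buffer_bytes: u64) -> u64 {
    (mb_per_s * 1_000_000) / buffer_bytes
}
```
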

---

## Database Integration

### File Node Mapping

**UUID to inode mapping:**

```rust
use uuid::Uuid;

fn uuid_to_ino(uuid: &str) -> Option<u64> {
    // Use the first 8 bytes of the parsed 128-bit UUID as the inode number.
    // Returns None if the string is not a valid UUID.
    let parsed = Uuid::parse_str(uuid).ok()?;
    let mut first = [0u8; 8];
    first.copy_from_slice(&parsed.as_bytes()[..8]);
    Some(u64::from_be_bytes(first))
}

fn ino_to_uuid(ino: u64) -> String {
    // Convert inode back to UUID form (zero-padded). The last 8 bytes of
    // the original UUID cannot be recovered, so this is NOT the original
    // UUID; an exact reverse mapping needs a lookup table or a DB column.
    let mut bytes = [0u8; 16];
    bytes[..8].copy_from_slice(&ino.to_be_bytes());
    Uuid::from_bytes(bytes).to_string()
}
```

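
Because truncating a UUID to 8 bytes cannot be reversed, lookups that go from inode back to UUID need the mapping stored somewhere. One possible approach, sketched here under the assumption that inode numbers may be allocated sequentially, is a bidirectional table. `InodeTable` and its methods are hypothetical names, not part of MarkBase.

```rust
use std::collections::HashMap;

/// Hypothetical bidirectional inode table: allocates sequential inode
/// numbers and remembers which UUID each one belongs to.
pub struct InodeTable {
    next_ino: u64,
    by_uuid: HashMap<String, u64>,
    by_ino: HashMap<u64, String>,
}

impl InodeTable {
    /// Starts allocating at 2, since FUSE reserves inode 1 for the mount root.
    pub fn new() -> Self {
        Self { next_ino: 2, by_uuid: HashMap::new(), by_ino: HashMap::new() }
    }

    /// Returns the inode for `uuid`, allocating one on first use.
    pub fn ino_for(&mut self, uuid: &str) -> u64 {
        if let Some(&ino) = self.by_uuid.get(uuid) {
            return ino;
        }
        let ino = self.next_ino;
        self.next_ino += 1;
        self.by_uuid.insert(uuid.to_owned(), ino);
        self.by_ino.insert(ino, uuid.to_owned());
        ino
    }

    /// Exact reverse lookup; `None` for inodes that were never allocated.
    pub fn uuid_for(&self, ino: u64) -> Option<&str> {
        self.by_ino.get(&ino).map(String::as_str)
    }
}
```

An in-memory table like this assigns numbers in first-seen order, so inodes are not stable across remounts. Persisting the mapping (for example in the user's SQLite database) would likely be needed if clients depend on stable file identities.
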
### Path Resolution

**Virtual path construction:**

```rust
// `Result` is the crate's error alias; get_node is assumed to return it.
fn build_virtual_path(&self, ino: u64) -> Result<PathBuf> {
    let mut components = Vec::new();
    let mut current_ino = ino;

    // Walk up the parent chain, collecting labels from leaf to root
    while current_ino != 0 {
        let node = self.get_node(current_ino)?;
        components.push(node.label);
        current_ino = node.parent_id;
    }

    // Reverse so the path reads root-first
    Ok(components.into_iter().rev().collect())
}
```

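
A parent chain stored in a database can be corrupted into a cycle, in which case walking it never reaches the root. The sketch below shows the same walk against an in-memory node table with a depth limit as a guard. `Node`, `resolve_path`, and `max_depth` are illustrative names, and the use of `0` as the "no parent" marker is carried over from the loop condition above.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Minimal stand-in for a row of `file_nodes`.
pub struct Node {
    pub label: String,
    pub parent_id: u64, // 0 marks "no parent" in this sketch
}

/// Resolves `ino` to a root-first path. Returns `None` if a node is
/// missing or the chain exceeds `max_depth` (for example, due to a cycle).
pub fn resolve_path(nodes: &HashMap<u64, Node>, ino: u64, max_depth: usize) -> Option<PathBuf> {
    let mut labels: Vec<&str> = Vec::new();
    let mut current = ino;

    while current != 0 {
        if labels.len() >= max_depth {
            return None;
        }
        let node = nodes.get(&current)?;
        labels.push(node.label.as_str());
        current = node.parent_id;
    }

    // Labels were collected leaf-first, so reverse them.
    Some(labels.into_iter().rev().collect())
}
```
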
### File Location Query

```rust
// Returns the on-disk location of the file, or the query error
// (including "no rows" when the UUID has no recorded location).
fn get_file_path(&self, ino: u64) -> rusqlite::Result<PathBuf> {
    let uuid = self.ino_to_uuid(ino);

    self.db
        .query_row(
            "SELECT location FROM file_locations WHERE file_uuid = ?",
            [uuid],
            |row| row.get::<_, String>(0),
        )
        .map(PathBuf::from)
}
```

---

## CLI Commands

### Mount Commands

```bash
# Single user mount
cargo run -- fuse --mount --user warren --dir /Volumes/MarkBase_warren

# Multi-user concurrent mount
cargo run -- fuse --mount --all --dir /Volumes/

# Specify backend
cargo run -- fuse --mount --user warren --backend fskit

# Unmount
cargo run -- fuse --unmount --dir /Volumes/MarkBase_warren
cargo run -- fuse --unmount --all
```

### Test Commands

```bash
# Performance test
cargo run -- fuse --test --user warren --size 6GB

# AJA System Test simulation
cargo run -- fuse --test --aja --config 4K_ProRes

# Stability test
cargo run -- fuse --test --stability --duration 24h
```

---

## Testing Strategy

### Phase 1: POC Verification

**Tests:**
1. Hello FUSE mount/unmount
2. Basic read/write operations
3. Backend selection (NFSv4 vs FSKit)

### Phase 2: SQLite-backed FUSE

**Tests:**
1. Mount for the `warren` user (12,659 nodes)
2. Directory traversal
3. File read/write
4. Metadata caching

### Phase 3: Multi-user Concurrent

**Tests:**
1. 10-user parallel mount
2. Concurrent writes (AJA System Test)
3. 24h stability test
4. Unmount/shutdown

### Phase 4: Performance Validation

**Tests:**
1. AJA System Test 4K ProRes 4444 (600 MB/s target)
2. `dd` baseline comparison
3. FSKit backend performance
4. Cache effectiveness (hit rate measurement)

---

## Risk Assessment

| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| FUSE-T installation failure | Low | High | Document brew installation steps |
| NFSv4 performance bottleneck | Medium | Medium | FSKit backend fallback |
| SQLite query latency | Medium | High | LRU caching + connection pooling |
| Kernel panic (macFUSE only) | Low | Critical | Use FUSE-T (kext-less) |
| Multi-user mount deadlock | Low | High | Async mount + timeout handling |
| Write buffer overflow | Low | Medium | Chunked flush + memory limit |

---

## Performance Targets Summary

| Metric | Target | Measurement Method |
|--------|--------|--------------------|
| Write throughput | >=600 MB/s | AJA System Test 4K ProRes 4444 |
| Read throughput | >=800 MB/s | AJA System Test 4K ProRes 422 HQ |
| Mount latency (single) | <100 ms | Timing measurement |
| Mount latency (10 users) | <2 s | Parallel mount timing |
| Concurrent writes | 10 × 600 MB/s | AJA concurrent test |
| Uptime stability | 24h no crash | Stability test |
| Cache hit rate | >=90% | Cache statistics |

---

**Last Updated**: 2026-05-17
**Version**: 1.0
**Status**: Design Complete, Ready for POC Implementation