markbase/docs/FUSE_DESIGN.md
2026-05-18 17:02:30 +08:00
# MarkBase FUSE System Design
## Overview
**Objective**: Implement a virtual file system mount for MarkBase users using FUSE, enabling direct file access through macOS Finder and video editing software.
**Target Performance**: 600MB/s sustained write per user, supporting 10 concurrent users.
**Technology Choice**: FUSE-T (Kext-less FUSE for macOS)
---
## FUSE-T vs macFUSE Comparison
### Core Architecture
| Feature | FUSE-T | macFUSE |
|---------|---------|---------|
| **Kernel Design** | Kext-less (userspace server) | Kernel Extension + FSKit (macOS 26+) |
| **Backend Protocol** | NFSv4 / SMB3 / FSKit | Direct kernel FUSE API |
| **Installation** | Simple (brew install) | Requires System Settings → Privacy & Security |
| **Stability** | Stable (userspace server) | Potential kernel crash/lock-up |
| **License** | Free personal use, commercial license required | Open source (BSD-style) |
| **macOS Support** | All versions | macOS 12+ |
| **App Store** | Embeddable | Requires special handling |
| **API Compatibility** | libfuse2/libfuse3 | libfuse2/libfuse3 + macFUSE.framework |
| **Install Count** | 21,892 (365 days) | 129,388 (365 days) |
### Technical Flow
**FUSE-T Operation:**
```
User App → libfuse → FUSE-T Server (userspace) → NFS/SMB/FSKit → macOS mount
```
**macFUSE Operation (Legacy):**
```
User App → libfuse → macFUSE kext (kernel) → VFS → macOS mount
```
**macFUSE Operation (macOS 26+):**
```
User App → libfuse → FSKit (userspace) → macOS mount
```
### Performance Considerations
| Factor | Impact |
|--------|--------|
| FUSE-T NFS backend | Extra TCP/IP overhead (~5-10% latency) |
| macFUSE kext | Direct kernel path (fastest) |
| macFUSE FSKit | Userspace path (similar to FUSE-T) |
| Network packet handling | FUSE-T requires NFS RPC conversion |
| Large file writes | Both limited by userspace buffer |
### Recommendation
**FUSE-T is recommended for MarkBase:**
1. **Stability Priority**: Avoid kernel panic risk
2. **Deployment Friendly**: No System Settings → Privacy & Security approval needed
3. **macOS 26 Support**: FSKit backend option (same as macFUSE)
4. **Commercial Distribution**: Controllable licensing cost
**macFUSE suitable for:**
- Maximum raw performance (direct kernel path)
- Existing stable kernel extension environment
- Open source projects (no commercial licensing)
---
## Backend Selection
### Backend Types
| Backend | Protocol | macOS Support | Performance | Stability |
|---------|----------|---------------|-------------|-----------|
| **NFSv4** | NFS v4 over TCP | All versions | ~5-10% overhead | Very stable |
| **SMB3** | SMB 3.0 over TCP | All versions | ~8-12% overhead | Stable |
| **FSKit** | Apple FSKit API | macOS 26+ | Direct path | Native |
### Backend Architecture
```rust
// Clone/Copy are needed because the backend is handed to each mount task.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackendType {
    Nfs4,  // NFSv4 backend (all macOS support)
    Fskit, // FSKit backend (macOS 26+)
}

impl BackendType {
    pub fn mount_options(&self) -> Vec<MountOption> {
        match self {
            BackendType::Nfs4 => vec![
                MountOption::Backend("nfs"),
                MountOption::AutoUnmount,
            ],
            BackendType::Fskit => vec![
                MountOption::Backend("fskit"),
                MountOption::AutoUnmount,
            ],
        }
    }
}
```
### Performance Comparison
**Expected Throughput (4K ProRes 4444 Write):**
| Backend | Expected Speed | Overhead | Recommendation |
|---------|----------------|----------|----------------|
| NFSv4 | 550-600 MB/s | 5-10% | Stable baseline |
| FSKit | 600-700 MB/s | Minimal | Performance target |
---
## Module Architecture
### Directory Structure
```
src/fuse/
├── mod.rs # FUSE core module (entry point)
├── filesystem.rs # MarkBaseFs implementation
├── handlers.rs # FUSE operation handlers
├── backend.rs # Backend selection (NFSv4/FSKit)
├── cache.rs # LRU cache for metadata
└── mount_manager.rs # Multi-user concurrent mount
```
### Module Dependencies
```toml
# Cargo.toml additions
[dependencies]
fuse = "0.3" # FUSE-T Rust bindings (or libfuse3)
time = "0.3" # Timestamp handling
lru = "0.12" # LRU cache implementation
uuid = "1.11" # UUID handling
tokio = { version = "1", features = ["full"] }
```
---
## Core Components
### 1. MarkBaseFs (filesystem.rs)
**Purpose**: Main filesystem implementation backed by SQLite
```rust
pub struct MarkBaseFs {
    user_id: String,
    db: Connection,
    backend: BackendType,
    // Caches
    attr_cache: LruCache<u64, FileAttr>,
    path_cache: LruCache<u64, PathBuf>,
    dir_cache: LruCache<u64, Vec<DirEntry>>,
    // Write buffer
    write_buffer: HashMap<u64, Vec<u8>>,
    buffer_size: usize, // Default: 64KB
}

impl Filesystem for MarkBaseFs {
    // Operations implementation...
}
```
**Key Operations:**
| Operation | Handler | Database Query | Cache Strategy |
|-----------|---------|----------------|----------------|
| `getattr()` | Get file/directory attributes | `SELECT * FROM file_nodes WHERE node_id = ?` | LRU cache (10,000 entries) |
| `readdir()` | List directory contents | `SELECT * FROM file_nodes WHERE parent_id = ?` | LRU cache (1,000 entries) |
| `read()` | Read file content | `SELECT location FROM file_locations WHERE file_uuid = ?` | Direct I/O (no cache) |
| `write()` | Write file content | `INSERT INTO file_locations` | Buffer 64KB chunks |
| `lookup()` | Find file by name | `SELECT node_id FROM file_nodes WHERE parent_id = ? AND label = ?` | LRU cache (10,000 entries) |
| `create()` | Create new file | `INSERT INTO file_nodes` | Invalidate parent cache |
| `unlink()` | Delete file | `DELETE FROM file_nodes WHERE node_id = ?` | Invalidate parent cache |
### 2. Backend Selection (backend.rs)
**Purpose**: Choose optimal backend based on macOS version
```rust
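use sysinfo::System; // assumed source of System::os_version() -> Option<String>

pub fn select_backend() -> BackendType {
    // Compare the numeric major version. Comparing version strings is
    // lexicographic and would rank "9.0" above "26.0".
    let major = System::os_version()
        .and_then(|v| v.split('.').next()?.parse::<u32>().ok())
        .unwrap_or(0);
    if major >= 26 {
        // macOS 26+ supports FSKit (native, fastest)
        BackendType::Fskit
    } else {
        // Older macOS (or an unknown version) uses NFSv4 (stable)
        BackendType::Nfs4
    }
}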
```
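Version selection is easier to unit-test when the comparison works on a plain string argument instead of a live system query. The following is a minimal sketch under that assumption; the `Backend` enum is an illustrative stand-in for `BackendType`, and `major_version` and `backend_for_version` are hypothetical helper names, not part of the existing design.

```rust
/// Illustrative stand-in for the document's `BackendType`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Backend {
    Nfs4,
    Fskit,
}

/// Extracts the leading numeric component of a version string such as "26.0.1".
pub fn major_version(version: &str) -> Option<u32> {
    version.trim().split('.').next()?.parse().ok()
}

/// Picks FSKit for major version 26 or newer and NFSv4 otherwise.
/// Unparseable input falls back to NFSv4, the backend listed as
/// supported on all macOS versions.
pub fn backend_for_version(version: &str) -> Backend {
    match major_version(version) {
        Some(major) if major >= 26 => Backend::Fskit,
        _ => Backend::Nfs4,
    }
}
```

Falling back to NFSv4 on unparseable input is a design choice of this sketch: it prefers the backend that works everywhere over failing the mount.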
### 3. Cache Management (cache.rs)
**Purpose**: Reduce SQLite query overhead
```rust
use lru::LruCache;
use std::path::PathBuf;
use std::time::{Duration, Instant};

pub struct FuseCache {
    attr_cache: LruCache<u64, CachedAttr>,
    path_cache: LruCache<u64, PathBuf>,
    ttl: Duration, // Time-to-live: 60 seconds
}

#[derive(Clone)]
struct CachedAttr {
    attr: FileAttr,
    cached_at: Instant,
}

impl FuseCache {
    pub fn get_attr(&mut self, ino: u64) -> Option<FileAttr> {
        let cached = self.attr_cache.get(&ino)?;
        if cached.cached_at.elapsed() < self.ttl {
            return Some(cached.attr.clone());
        }
        // The entry outlived its TTL: evict it so the slot can be reused.
        self.attr_cache.pop(&ino);
        None
    }

    pub fn put_attr(&mut self, ino: u64, attr: FileAttr) {
        self.attr_cache.put(ino, CachedAttr {
            attr,
            cached_at: Instant::now(),
        });
    }
}
```
### 4. Mount Manager (mount_manager.rs)
**Purpose**: Handle multi-user concurrent mounts
```rust
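use std::collections::HashMap;
use std::path::PathBuf;
use tokio::task::{JoinHandle, JoinSet};

pub struct MountManager {
    mounts: HashMap<String, MountedFs>,
    backend: BackendType,
}

pub struct MountedFs {
    user_id: String,
    mount_path: PathBuf,
    process: JoinHandle<Result<()>>,
}

impl MountManager {
    /// Opens the user's database and starts the mount without borrowing
    /// `self`, so the same code can run inside a spawned task.
    fn start_mount(user_id: String, base_dir: PathBuf, backend: BackendType) -> Result<MountedFs> {
        let mount_path = base_dir.join(format!("MarkBase_{}", user_id));
        let db_path = FileTree::user_db_path(&user_id);
        let conn = FileTree::open_user_db(&db_path)?;
        let fs = MarkBaseFs::new(user_id.clone(), conn, backend.clone());
        // fuse::mount blocks until the filesystem is unmounted, so it runs on
        // the blocking pool. The task gets its own copy of the mount path.
        let task_path = mount_path.clone();
        let process = tokio::task::spawn_blocking(move || {
            fuse::mount(fs, &task_path, &backend.mount_options())?;
            Ok(())
        });
        Ok(MountedFs { user_id, mount_path, process })
    }

    pub async fn mount_user(&mut self, user_id: String, base_dir: PathBuf) -> Result<()> {
        let mounted = Self::start_mount(user_id.clone(), base_dir, self.backend.clone())?;
        self.mounts.insert(user_id, mounted);
        Ok(())
    }

    pub async fn mount_all(&mut self, users: Vec<String>, base_dir: PathBuf) -> Result<()> {
        let mut tasks = JoinSet::new();
        for user_id in users {
            let (dir, backend) = (base_dir.clone(), self.backend.clone());
            tasks.spawn_blocking(move || Self::start_mount(user_id, dir, backend));
        }
        // Register each mount on this manager as its setup completes.
        while let Some(joined) = tasks.join_next().await {
            let mounted = joined??;
            self.mounts.insert(mounted.user_id.clone(), mounted);
        }
        Ok(())
    }

    pub async fn unmount_all(&mut self) -> Result<()> {
        for mount in self.mounts.values() {
            fuse::unmount(&mount.mount_path)?;
        }
        self.mounts.clear();
        Ok(())
    }
}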
```
---
## Performance Optimization
### Strategy 1: Write Buffering
**Problem**: FUSE write() syscall overhead
**Solution**: Buffer writes in 64KB chunks
```rust
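use std::fs::OpenOptions;
use std::io::Write;

impl MarkBaseFs {
    fn write(&mut self, ino: u64, offset: u64, data: &[u8], reply: ReplyWrite) {
        // Accumulate in buffer. This assumes sequential appends: `offset` is
        // not consulted, so random-access writes need separate handling.
        let buffer = self.write_buffer.entry(ino).or_default();
        buffer.extend_from_slice(data);
        // Flush when buffer >= 64KB. Decide first so the borrow of
        // write_buffer ends before flush_buffer borrows self again.
        let should_flush = buffer.len() >= 64 * 1024;
        if should_flush && self.flush_buffer(ino).is_err() {
            reply.error(libc::EIO);
            return;
        }
        reply.written(data.len() as u32);
    }

    fn flush_buffer(&mut self, ino: u64) -> Result<()> {
        // Take the buffer out of the map so no borrow is held during I/O.
        if let Some(buffer) = self.write_buffer.remove(&ino) {
            // get_file_path is assumed to return a Result.
            let path = self.get_file_path(ino)?;
            // Append: std::fs::write would truncate the file on every flush
            // and keep only the last chunk.
            let mut file = OpenOptions::new().create(true).append(true).open(&path)?;
            file.write_all(&buffer)?;
        }
        Ok(())
    }
}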
```
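Video applications do not always write strictly sequentially, so a buffer keyed only by inode can merge bytes that belong at different offsets. The sketch below shows one way to keep the 64KB coalescing while tracking offsets: it buffers only contiguous runs and hands back chunks that are ready to be written. `WriteBuffer`, `PendingWrite`, and `Chunk` are hypothetical names for illustration, and the actual disk write is left to the caller.

```rust
use std::collections::HashMap;

/// A pending run of contiguous bytes for one file.
#[derive(Debug, Default)]
pub struct PendingWrite {
    start: u64,
    data: Vec<u8>,
}

/// A chunk that is ready to be written to backing storage at `offset`.
#[derive(Debug, PartialEq, Eq)]
pub struct Chunk {
    pub offset: u64,
    pub data: Vec<u8>,
}

pub struct WriteBuffer {
    threshold: usize,
    pending: HashMap<u64, PendingWrite>,
}

impl WriteBuffer {
    pub fn new(threshold: usize) -> Self {
        WriteBuffer { threshold, pending: HashMap::new() }
    }

    /// Buffers `data` for `ino` at `offset` and returns the chunks that
    /// must be written out now (zero, one, or two).
    pub fn write(&mut self, ino: u64, offset: u64, data: &[u8]) -> Vec<Chunk> {
        let mut ready = Vec::new();
        let entry = self.pending.entry(ino).or_default();
        let contiguous =
            entry.data.is_empty() || entry.start + entry.data.len() as u64 == offset;
        if !contiguous {
            // A seek happened: emit the current run and start a new one.
            ready.push(Chunk { offset: entry.start, data: std::mem::take(&mut entry.data) });
        }
        if entry.data.is_empty() {
            entry.start = offset;
        }
        entry.data.extend_from_slice(data);
        if entry.data.len() >= self.threshold {
            ready.push(Chunk { offset: entry.start, data: std::mem::take(&mut entry.data) });
        }
        ready
    }

    /// Drains whatever is still buffered for `ino`, for example on flush or release.
    pub fn flush(&mut self, ino: u64) -> Option<Chunk> {
        let pending = self.pending.remove(&ino)?;
        if pending.data.is_empty() {
            None
        } else {
            Some(Chunk { offset: pending.start, data: pending.data })
        }
    }
}
```

Returning chunks instead of writing inside the buffer keeps the logic free of I/O, so it can be tested without a mounted filesystem.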
### Strategy 2: Metadata Caching
**Problem**: SQLite query latency (~2-5ms per query)
**Solution**: LRU cache with 60s TTL
**Cache Configuration:**
| Cache Type | Size | TTL | Hit Rate Target |
|------------|------|-----|-----------------|
| attr_cache | 10,000 entries | 60s | 95% |
| path_cache | 10,000 entries | 60s | 90% |
| dir_cache | 1,000 entries | 60s | 85% |
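
The hit rate targets in this table can only be checked if each cache counts its lookups. A minimal sketch of such a counter follows; `CacheStats` and its methods are hypothetical names, and wiring it into `FuseCache` is left open.

```rust
/// Hit/miss counters for one cache.
#[derive(Debug, Default, Clone, Copy)]
pub struct CacheStats {
    hits: u64,
    misses: u64,
}

impl CacheStats {
    /// Records the outcome of one lookup.
    pub fn record(&mut self, hit: bool) {
        if hit {
            self.hits += 1;
        } else {
            self.misses += 1;
        }
    }

    /// Fraction of lookups served from cache, or None before any lookup.
    pub fn hit_rate(&self) -> Option<f64> {
        let total = self.hits + self.misses;
        if total == 0 {
            None
        } else {
            Some(self.hits as f64 / total as f64)
        }
    }

    /// True when at least one lookup was recorded and the rate reaches `target`.
    pub fn meets_target(&self, target: f64) -> bool {
        self.hit_rate().map_or(false, |rate| rate >= target)
    }
}
```

For example, calling `record(cache.get_attr(ino).is_some())` on each lookup and comparing `hit_rate()` against 0.95 would cover the `attr_cache` row.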
### Strategy 3: FSKit Backend
**Problem**: NFSv4 TCP/IP overhead (~5-10%)
**Solution**: Use FSKit backend on macOS 26+
**Expected Performance Impact (design estimates, not yet measured):**
| Metric | NFSv4 | FSKit | Improvement |
|--------|-------|-------|-------------|
| Write latency | 15ms | 8ms | 47% reduction |
| Read latency | 10ms | 5ms | 50% reduction |
| Throughput | 550 MB/s | 650 MB/s | 18% increase |
---
## Multi-User Concurrent Mount
### Architecture
```
MountManager
├── mount_user(user_id) → spawn tokio task
├── mount_all([user1, user2, ...]) → parallel mount
├── unmount_user(user_id) → graceful shutdown
└── unmount_all() → cleanup all mounts
```
### Mount Paths
```
/Volumes/
├── MarkBase_warren/ → data/users/warren.sqlite
├── MarkBase_momentry/ → data/users/momentry.sqlite
├── MarkBase_demo/ → data/users/demo.sqlite
├── MarkBase_user1/ → data/users/user1.sqlite
...
└── MarkBase_user10/ → data/users/user10.sqlite
```
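Because the mount path is built from the user ID, an ID containing a path separator or `..` could place the mount outside the base directory. The sketch below derives the path shown above and rejects such IDs. The function name `mount_path_for` and the allowed character set (ASCII letters, digits, `_`, `-`) are assumptions of this example, not rules stated elsewhere in the design.

```rust
use std::path::{Path, PathBuf};

/// Builds `<base_dir>/MarkBase_<user_id>`, or returns None when the user ID
/// is empty or contains characters outside the assumed safe set.
pub fn mount_path_for(base_dir: &Path, user_id: &str) -> Option<PathBuf> {
    let valid = !user_id.is_empty()
        && user_id
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '_' || c == '-');
    if !valid {
        return None;
    }
    Some(base_dir.join(format!("MarkBase_{user_id}")))
}
```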
### Concurrent Strategy
```rust
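// Parallel mount using tokio::task::JoinSet
pub async fn mount_all(&mut self, users: Vec<String>, base_dir: PathBuf) -> Result<()> {
    let mut tasks = JoinSet::new();
    for user_id in users {
        let (dir, backend) = (base_dir.clone(), self.backend.clone());
        // start_mount: associated helper that opens the user's database and
        // starts the mount without borrowing `self`; it returns a MountedFs.
        tasks.spawn_blocking(move || Self::start_mount(user_id, dir, backend));
    }
    // Collect results and register each mount on this manager, so the
    // handles are not lost in a throwaway MountManager.
    while let Some(joined) = tasks.join_next().await {
        let mounted = joined??;
        self.mounts.insert(mounted.user_id.clone(), mounted);
    }
    Ok(())
}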
```
### Performance Target
| Metric | Target | Measurement |
|--------|--------|-------------|
| Mount latency (single user) | <100ms | Time from mount() to ready |
| Mount latency (10 users) | <2s | Parallel mount completion |
| Mount stability | 24h uptime | No crash/lock-up |
| Concurrent writes | 10 × 600MB/s | AJA System Test |
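
The concurrent write target implies an aggregate bandwidth and a per-flush time budget that are worth making explicit. The sketch below works through the arithmetic, assuming decimal megabytes (1 MB = 1,000,000 bytes) and the 64KB flush size from the write buffering strategy; the constant and function names are illustrative.

```rust
pub const USERS: u64 = 10;
/// 600 MB/s per user, assuming decimal megabytes.
pub const PER_USER_BYTES_PER_SEC: u64 = 600 * 1_000_000;
/// 64KB flush size from the write buffering strategy.
pub const FLUSH_CHUNK_BYTES: u64 = 64 * 1024;

/// Total bytes per second the backing storage must absorb.
pub fn aggregate_bytes_per_sec() -> u64 {
    USERS * PER_USER_BYTES_PER_SEC
}

/// Number of 64KB flushes per second for one user, rounded up.
pub fn flushes_per_sec_per_user() -> u64 {
    (PER_USER_BYTES_PER_SEC + FLUSH_CHUNK_BYTES - 1) / FLUSH_CHUNK_BYTES
}

/// Average time available for one flush, in microseconds.
pub fn flush_budget_micros() -> f64 {
    FLUSH_CHUNK_BYTES as f64 / PER_USER_BYTES_PER_SEC as f64 * 1_000_000.0
}
```

Under these assumptions the aggregate is 6 GB/s, and each user needs about 9,156 flushes per second, or roughly 109 microseconds per flush. A single SQLite query at the 2-5ms latency cited under Strategy 2 would exceed that budget many times over, which is why metadata lookups need to stay off the per-flush path.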
---
## Database Integration
### File Node Mapping
**UUID to inode mapping:**
```rust
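use uuid::Uuid;

fn uuid_to_ino(uuid: &str) -> Result<u64, uuid::Error> {
    // Use the first 8 bytes of the parsed UUID (not of its text form)
    // as the inode number
    let bytes = Uuid::parse_str(uuid)?.into_bytes();
    let mut high = [0u8; 8];
    high.copy_from_slice(&bytes[..8]);
    Ok(u64::from_be_bytes(high))
}

fn ino_to_uuid(ino: u64) -> String {
    // Convert inode back to a UUID with the low 8 bytes zero-padded.
    // The truncation above is lossy, so this only round-trips UUIDs that
    // already end in zeros; other UUIDs need a stored inode-to-UUID mapping.
    let mut bytes = [0u8; 16];
    bytes[..8].copy_from_slice(&ino.to_be_bytes());
    Uuid::from_bytes(bytes).to_string()
}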
```
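Truncating a 128-bit UUID to a 64-bit inode number cannot be reversed in general, and two UUIDs that share their first 8 bytes would collide. One alternative, shown here as a sketch rather than as the design's chosen method, is an explicit two-way table that hands out sequential inode numbers. `InodeTable` is a hypothetical name; numbering starts at 2 because FUSE reserves inode 1 for the root directory.

```rust
use std::collections::HashMap;

/// Two-way mapping between UUID strings and inode numbers.
pub struct InodeTable {
    next: u64,
    by_uuid: HashMap<String, u64>,
    by_ino: HashMap<u64, String>,
}

impl InodeTable {
    pub fn new() -> Self {
        // Inode 1 is the FUSE root, so allocation starts at 2.
        InodeTable { next: 2, by_uuid: HashMap::new(), by_ino: HashMap::new() }
    }

    /// Returns the inode for `uuid`, allocating a new one on first sight.
    pub fn ino_for(&mut self, uuid: &str) -> u64 {
        if let Some(&ino) = self.by_uuid.get(uuid) {
            return ino;
        }
        let ino = self.next;
        self.next += 1;
        self.by_uuid.insert(uuid.to_owned(), ino);
        self.by_ino.insert(ino, uuid.to_owned());
        ino
    }

    /// Looks up the UUID that was assigned to `ino`, if any.
    pub fn uuid_for(&self, ino: u64) -> Option<&str> {
        self.by_ino.get(&ino).map(String::as_str)
    }
}
```

An in-memory table like this assigns different numbers after each remount; if inode numbers must stay stable, the mapping would have to be persisted, for example in the user's SQLite database.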
### Path Resolution
**Virtual path construction:**
```rust
fn build_virtual_path(&self, ino: u64) -> Result<PathBuf> {
    let mut labels = Vec::new();
    let mut current_ino = ino;
    // Walk up parent chain (collects labels leaf first)
    while current_ino != 0 {
        let node = self.get_node(current_ino)?;
        labels.push(node.label);
        current_ino = node.parent_id;
    }
    // Reverse so the path reads root-to-leaf.
    Ok(labels.into_iter().rev().collect())
}
```
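The parent-chain walk can be exercised without a database by running it over an in-memory node map. The sketch below does that and also guards against a corrupted parent chain that loops back on itself. The `Node` struct and `virtual_path` function are illustrative stand-ins; as in the code above, a `parent_id` of 0 is assumed to mark the top of the tree.

```rust
use std::collections::{HashMap, HashSet};
use std::path::PathBuf;

/// Minimal stand-in for a row of `file_nodes`.
pub struct Node {
    pub label: String,
    pub parent_id: u64,
}

/// Builds the root-to-leaf path for `ino`, or None when a node is missing
/// or the parent chain contains a cycle.
pub fn virtual_path(nodes: &HashMap<u64, Node>, ino: u64) -> Option<PathBuf> {
    let mut labels = Vec::new();
    let mut seen = HashSet::new();
    let mut current = ino;
    while current != 0 {
        // insert() returns false when the inode was already visited.
        if !seen.insert(current) {
            return None;
        }
        let node = nodes.get(&current)?;
        labels.push(node.label.as_str());
        current = node.parent_id;
    }
    // Labels were collected leaf first; reverse them into path order.
    Some(labels.into_iter().rev().collect())
}
```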
### File Location Query
```rust
fn get_file_path(&self, ino: u64) -> Result<PathBuf> {
    let uuid = ino_to_uuid(ino);
    // query_row returns a Result, so the error is propagated to the caller
    let location: String = self.db.query_row(
        "SELECT location FROM file_locations WHERE file_uuid = ?",
        [uuid],
        |row| row.get(0),
    )?;
    Ok(PathBuf::from(location))
}
```
---
## CLI Commands
### Mount Commands
```bash
# Single user mount
cargo run -- fuse --mount --user warren --dir /Volumes/MarkBase_warren
# Multi-user concurrent mount
cargo run -- fuse --mount --all --dir /Volumes/
# Specify backend
cargo run -- fuse --mount --user warren --backend fskit
# Unmount
cargo run -- fuse --unmount --dir /Volumes/MarkBase_warren
cargo run -- fuse --unmount --all
```
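The `--backend` flag needs to be turned into a backend value before mounting. A minimal sketch of that parsing using the standard `FromStr` trait follows. `BackendArg` is a stand-in for `BackendType`, and accepting `nfs4` and `nfsv4` as aliases for `nfs` is an assumption of this example; only `nfs` and `fskit` appear elsewhere in this document.

```rust
use std::str::FromStr;

/// Illustrative stand-in for the document's `BackendType`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackendArg {
    Nfs4,
    Fskit,
}

impl FromStr for BackendArg {
    type Err = String;

    /// Parses a backend name case-insensitively.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s.to_ascii_lowercase().as_str() {
            "nfs" | "nfs4" | "nfsv4" => Ok(BackendArg::Nfs4),
            "fskit" => Ok(BackendArg::Fskit),
            other => Err(format!("unknown backend '{other}', expected nfs or fskit")),
        }
    }
}
```

With this in place, the flag value can be converted with `value.parse::<BackendArg>()`, and an unknown name produces an error message instead of a silent default.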
### Test Commands
```bash
# Performance test
cargo run -- fuse --test --user warren --size 6GB
# AJA System Test simulation
cargo run -- fuse --test --aja --config 4K_ProRes
# Stability test
cargo run -- fuse --test --stability --duration 24h
```
---
## Testing Strategy
### Phase 1: POC Verification
**Tests:**
1. Hello FUSE mount/unmount
2. Basic read/write operations
3. Backend selection (NFSv4 vs FSKit)
### Phase 2: SQLite-backed FUSE
**Tests:**
1. warren user mount (12,659 nodes)
2. Directory traversal
3. File read/write
4. Metadata caching
### Phase 3: Multi-user Concurrent
**Tests:**
1. 10 user parallel mount
2. Concurrent writes (AJA System Test)
3. 24h stability test
4. Unmount/shutdown
### Phase 4: Performance Validation
**Tests:**
1. AJA System Test 4K ProRes 4444 (600MB/s target)
2. dd baseline comparison
3. FSKit backend performance
4. Cache effectiveness (hit rate measurement)
---
## Risk Assessment
| Risk | Probability | Impact | Mitigation |
|------|-------------|--------|------------|
| FUSE-T installation failure | Low | High | Document brew installation steps |
| NFSv4 performance bottleneck | Medium | Medium | FSKit backend fallback |
| SQLite query latency | Medium | High | LRU caching + connection pooling |
| Kernel panic (macFUSE only) | Low | Critical | Use FUSE-T (kext-less) |
| Multi-user mount deadlock | Low | High | Async mount + timeout handling |
| Write buffer overflow | Low | Medium | Chunked flush + memory limit |
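
The timeout mitigation for mount deadlocks can be sketched with standard library channels: run the mount attempt on its own thread and stop waiting after a deadline. `run_with_timeout` and `MountOutcome` are hypothetical names, and the closure stands in for whatever call performs the mount.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

#[derive(Debug, PartialEq, Eq)]
pub enum MountOutcome {
    Ready,
    TimedOut,
    Failed(String),
}

/// Runs `mount` on a separate thread and waits at most `timeout` for it to report.
pub fn run_with_timeout<F>(mount: F, timeout: Duration) -> MountOutcome
where
    F: FnOnce() -> Result<(), String> + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // After a timeout the receiver is gone, so a failed send is ignored.
        let _ = tx.send(mount());
    });
    match rx.recv_timeout(timeout) {
        Ok(Ok(())) => MountOutcome::Ready,
        Ok(Err(reason)) => MountOutcome::Failed(reason),
        Err(mpsc::RecvTimeoutError::Timeout) => MountOutcome::TimedOut,
        Err(mpsc::RecvTimeoutError::Disconnected) => {
            MountOutcome::Failed("mount thread exited without reporting".to_string())
        }
    }
}
```

A timed-out attempt keeps running on its thread in this sketch; the caller stops waiting but would still need a separate way to cancel or clean up a mount that is stuck.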
---
## Performance Targets Summary
| Metric | Target | Measurement Method |
|--------|--------|---------------------|
| Write throughput | >=600MB/s | AJA System Test 4K ProRes 4444 |
| Read throughput | >=800MB/s | AJA System Test 4K ProRes 422 HQ |
| Mount latency (single) | <100ms | Timing measurement |
| Mount latency (10 users) | <2s | Parallel mount timing |
| Concurrent writes | 10 × 600MB/s | AJA concurrent test |
| Uptime stability | 24h no crash | Stability test |
| Cache hit rate | >=90% | Cache statistics |
---
**Last Updated**: 2026-05-17
**Version**: 1.0
**Status**: Design Complete, Ready for POC Implementation