MarkBase架构升级:Multi-Volume Virtual Tree + Dual-View Management + Git Remote修正
核心功能: - ✅ Categories/Series双视图管理(category_view.rs + import_markdown.rs) - ✅ FUSE Multi-Volume支持(tree_type参数) - ✅ SSH/SFTP/SCP/rsync协议完整实现(4042行) - ✅ NFS/SMB Module Phase 1-3完成 - ✅ Archive Module Phase 1-4完成(2916行) - ✅ Download Center API完整实现 - ✅ S3兼容API实现(560行) Git配置修正: - ✅ 删除错误origin(gitea.momentry.ddns.net) - ✅ 删除m5max128(指向机器名) - ✅ 设置origin = m5max128gitea.momentry.ddns.net/admin/markbase - ✅ 设置m4minigitea = m4minigitea.momentry.ddns.net/warren/markbase 数据清理: - ✅ 删除38个临时SQLite(保留accusys.sqlite、demo.sqlite) - ✅ 删除.bak、test_*.bin、调试脚本等临时文件 - ✅ 删除临时目录(build/、download files/、raid_test/等) - ✅ 更新.gitignore排除临时文件 架构优化: - 52个文件修改,2434行新增,4739行删除 - Workspace成员整合(16个crate) - 数据库状态:accusys.sqlite保留(主demo测试) 远程同步: - ✅ 准备推送到m5max128gitea(远程Gitea) - ✅ 准备推送到m4minigitea(本地Gitea)
This commit is contained in:
276
docs/SCAN_VS_UPLOAD_GUIDE.md
Normal file
276
docs/SCAN_VS_UPLOAD_GUIDE.md
Normal file
@@ -0,0 +1,276 @@
|
||||
# 目录导入方式对比指南
|
||||
|
||||
## 核心问题
|
||||
|
||||
**为何不直接用scan命令扫描目录重建数据库?**
|
||||
|
||||
答案:**scan命令更适合本地批量导入!**
|
||||
|
||||
---
|
||||
|
||||
## 方式对比
|
||||
|
||||
| 特性 | scan命令 | upload方式 |
|
||||
|------|----------|-----------|
|
||||
| **适用场景** | 本地批量导入 | 远程/外部用户上传 |
|
||||
| **操作方式** | 一键命令 | 浏览器操作 |
|
||||
| **速度** | ⭐⭐⭐ 极快(1秒) | ⚠️ 较慢(网络传输) |
|
||||
| **空目录支持** | ⭐⭐⭐ 完美支持 | ⭐⭐⭐ 完美支持 |
|
||||
| **nested目录** | ⭐⭐⭐ 4层深度 | ⭐⭐⭐ 4层深度 |
|
||||
| **SHA256校验** | ⭐ 可选(后台计算) | ⭐⭐⭐ 实时计算 |
|
||||
| **用户友好** | ⚠️ 需要CLI | ⭐⭐⭐ Web界面 |
|
||||
| **进度显示** | ⚠️ 命令行输出 | ⭐⭐⭐ 实时进度条 |
|
||||
| **适用人群** | 系统管理员 | 一般用户 |
|
||||
|
||||
---
|
||||
|
||||
## 最佳实践
|
||||
|
||||
### 场景1:本地290个文件导入(推荐scan命令)
|
||||
|
||||
```bash
|
||||
# === 步骤1:准备空目录(添加.keep)===
|
||||
bash scripts/prepare_upload.sh "/path/to/AccuSys Downloads"
|
||||
|
||||
# === 步骤2:运行scan命令导入 ===
|
||||
cargo run --bin markbase-core -- scan \
|
||||
--user accusys \
|
||||
--dir "/path/to/AccuSys Downloads" \
|
||||
--skip-hash \
|
||||
--batch 100
|
||||
|
||||
# 输出示例:
|
||||
[1/4] Scanning directory structure...
|
||||
Scanned 80 folders, 290 files in 0.10s
|
||||
|
||||
[2/5] Generating node IDs...
|
||||
Generated 80 folder IDs, 290 file IDs in 0.57s
|
||||
|
||||
[3/5] Opening database...
|
||||
Database opened in 0.00s
|
||||
|
||||
[4/5] Inserting nodes (batch size: 100)...
|
||||
Inserted 370 nodes in 0.21s (14243 nodes/sec)
|
||||
|
||||
[5/5] Updating folder children_json...
|
||||
Updated children_json for 80 folders in 0.00s
|
||||
|
||||
=== Summary ===
|
||||
Total time: 0.89s
|
||||
Folders: 80
|
||||
Files: 290
|
||||
Total nodes: 370
|
||||
Database: data/users/accusys.sqlite
|
||||
|
||||
ℹ️ SHA256 hashes skipped. Run 'markbase hash --user accusys' to compute hashes.
|
||||
|
||||
# === 步骤3:验证导入结果 ===
|
||||
curl -s http://localhost:11438/api/v2/files/accusys | jq '.total_files'
|
||||
# 输出:290
|
||||
|
||||
# === 步骤4:可选后台计算SHA256 ===
|
||||
cargo run --bin markbase-core -- hash --user accusys --threads 4
|
||||
```
|
||||
|
||||
**总耗时:** ~1秒(导入) + ~400秒(可选hash计算)
|
||||
|
||||
---
|
||||
|
||||
### 场景2:远程用户上传文件(推荐upload方式)
|
||||
|
||||
```bash
|
||||
# === 用户操作(Web界面)===
|
||||
https://download.accusys.ddns.net/upload
|
||||
|
||||
步骤:
|
||||
1. 填写User ID
|
||||
2. 点击"Select Folder"
|
||||
3. 选择文件夹
|
||||
4. 点击"Start Upload"
|
||||
5. 实时进度显示
|
||||
|
||||
# === 系统自动处理 ===
|
||||
- ✅ 实时计算SHA256
|
||||
- ✅ 创建目录结构
|
||||
- ✅ 显示上传进度
|
||||
- ✅ 错误提示
|
||||
|
||||
# === 用户验证 ===
|
||||
https://download.accusys.ddns.net/files
|
||||
```
|
||||
|
||||
**优势:** 无需CLI,用户友好,实时反馈
|
||||
|
||||
---
|
||||
|
||||
## scan命令详解
|
||||
|
||||
### 参数说明
|
||||
|
||||
```bash
|
||||
cargo run --bin markbase-core -- scan \
|
||||
--user <USER> # 用户ID(如:accusys)
|
||||
--dir <DIR> # 源目录路径
|
||||
--batch <BATCH> # 批量插入大小(默认:100)
|
||||
--skip-hash # 跳过SHA256计算(快速导入)
|
||||
--threads <THREADS> # hash计算线程数(默认:4)
|
||||
```
|
||||
|
||||
### 性能数据(实测)
|
||||
|
||||
**测试环境:**
|
||||
- 11857 files + 801 folders = 12658 nodes
|
||||
- M4 Mac mini, 4 threads
|
||||
|
||||
**快速导入(skip_hash=true):**
|
||||
- 目录扫描:0.10s
|
||||
- ID生成:0.57s
|
||||
- 数据库插入:0.21s
|
||||
- **总时间:0.89s**
|
||||
- **速度:14243 nodes/sec**
|
||||
|
||||
**完整导入(skip_hash=false):**
|
||||
- 导入时间:0.89s
|
||||
- hash计算:417.58s
|
||||
- **总时间:418.47s**
|
||||
- **速度:28 files/sec**
|
||||
|
||||
---
|
||||
|
||||
## upload方式详解
|
||||
|
||||
### 特性
|
||||
|
||||
**技术实现:**
|
||||
- webkitdirectory(HTML5标准)
|
||||
- DefaultBodyLimit::disable()(无大小限制)
|
||||
- 实时SHA256计算
|
||||
- AbortController(30分钟timeout)
|
||||
|
||||
**文件支持:**
|
||||
- ✅ 空文件(0 bytes)
|
||||
- ✅ 系统文件(.DS_Store, .localized)
|
||||
- ✅ 大文件(100GB+)
|
||||
- ✅ nested目录(4层深度)
|
||||
|
||||
**错误处理:**
|
||||
- ✅ 网络超时保护
|
||||
- ✅ JSON解析容错
|
||||
- ✅ 文件完整性校验
|
||||
- ✅ 实时进度反馈
|
||||
|
||||
---
|
||||
|
||||
## 混合策略
|
||||
|
||||
### 策略1:本地scan + 远程upload
|
||||
|
||||
```bash
|
||||
# 本地管理员用scan命令
|
||||
cargo run --bin markbase-core -- scan --user accusys --dir "/local/files"
|
||||
|
||||
# 远程用户用upload界面
|
||||
https://download.accusys.ddns.net/upload
|
||||
|
||||
# 两者共存,数据库统一
|
||||
curl http://localhost:11438/api/v2/files/accusys
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 策略2:快速scan + 后台hash
|
||||
|
||||
```bash
|
||||
# 步骤1:快速导入(用户立即可用)
|
||||
cargo run --bin markbase-core -- scan --user accusys --dir "/files" --skip-hash
|
||||
|
||||
# 步骤2:后台计算hash(不影响使用)
|
||||
cargo run --bin markbase-core -- hash --user accusys --threads 4 &
|
||||
|
||||
# 步骤3:用户立即访问文件列表
|
||||
https://download.accusys.ddns.net/files
|
||||
|
||||
# 步骤4:hash完成后提供下载服务
|
||||
# (hash用于文件完整性校验)
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 空目录处理(通用)
|
||||
|
||||
### prepare_upload.sh脚本
|
||||
|
||||
```bash
|
||||
# 两种方式都需要此步骤
|
||||
bash scripts/prepare_upload.sh "/path/to/source"
|
||||
|
||||
# 功能:
|
||||
- 自动检测所有空目录
|
||||
- 添加.keep文件(0字节)
|
||||
- 支持nested 4层深度
|
||||
- 输出处理统计
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 实战案例
|
||||
|
||||
### 案例1:AccuSys产品文件(290个)
|
||||
|
||||
```bash
|
||||
# 最佳方案:scan命令
|
||||
# 1. 准备空目录
|
||||
bash scripts/prepare_upload.sh "/Users/accusys/Downloads/AccuSys Downloads"
|
||||
|
||||
# 2. 快速导入
|
||||
cargo run --bin markbase-core -- scan \
|
||||
--user accusys \
|
||||
--dir "/Users/accusys/Downloads/AccuSys Downloads" \
|
||||
--skip-hash
|
||||
|
||||
# 3. 验证
|
||||
curl -s http://localhost:11438/api/v2/files/accusys | jq '.total_files'
|
||||
|
||||
# 耗时:~1秒
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
### 案例2:外部客户上传资料
|
||||
|
||||
```bash
|
||||
# 最佳方案:upload方式
|
||||
# 用户通过Web界面上传
|
||||
|
||||
https://download.accusys.ddns.net/upload
|
||||
|
||||
# 客户操作:
|
||||
1. 收到上传链接
|
||||
2. 填写User ID(如:client001)
|
||||
3. 选择文件夹上传
|
||||
4. 实时进度显示
|
||||
|
||||
# 管理员验证:
|
||||
curl -s http://localhost:11438/api/v2/files/client001 | jq '.total_files'
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## 总结
|
||||
|
||||
**为何提供两种方式?**
|
||||
|
||||
1. **灵活性**:适应不同场景和用户
|
||||
2. **性能优化**:本地批量用scan,远程上传用upload
|
||||
3. **用户友好**:CLI给管理员,Web给一般用户
|
||||
4. **功能互补**:scan快速导入,upload实时校验
|
||||
|
||||
**推荐策略:**
|
||||
- ✅ 本地大批量:scan命令(最快)
|
||||
- ✅ 远程/外部:upload方式(最友好)
|
||||
- ✅ 混合使用:两者共存(最灵活)
|
||||
|
||||
---
|
||||
|
||||
**Last Updated:** 2026-06-09 15:10
|
||||
**Version:** 3.0(scan命令完整指南)
|
||||
Reference in New Issue
Block a user