Add a new backend store that enables iSCSI targets backed by
S3-compatible object storage (AWS S3, MinIO, Ceph RGW, etc.).
The implementation uses a chunked storage strategy where the virtual
block device is divided into fixed-size chunks (default 4 MiB), each
stored as an independent S3 object. This enables efficient random
read/write access on top of object storage.
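
The chunk mapping described above can be sketched as follows. This is a minimal illustration, not the code in this commit: chunkSpan and splitRange are hypothetical names, and the sketch only shows how a byte range on the virtual device might be split across fixed-size chunks, each of which would correspond to one S3 object.

```go
package main

import "fmt"

// chunkSpan describes the part of one chunk touched by a request.
// (Hypothetical type, introduced for illustration only.)
type chunkSpan struct {
	Index  int64 // chunk number; would select the S3 object
	Offset int64 // first byte inside that chunk
	Length int64 // number of bytes that fall in that chunk
}

// splitRange maps the byte range [off, off+length) of the virtual
// device onto per-chunk spans of at most chunkSize bytes each.
func splitRange(off, length, chunkSize int64) []chunkSpan {
	var spans []chunkSpan
	for length > 0 {
		inner := off % chunkSize
		n := chunkSize - inner // bytes left in the current chunk
		if n > length {
			n = length
		}
		spans = append(spans, chunkSpan{Index: off / chunkSize, Offset: inner, Length: n})
		off += n
		length -= n
	}
	return spans
}

func main() {
	const chunkSize = 4 << 20 // 4 MiB, the default named above
	// A 1 KiB request that straddles the boundary between chunks 0 and 1.
	for _, s := range splitRange(chunkSize-512, 1024, chunkSize) {
		fmt.Printf("chunk %d: offset %d, length %d\n", s.Index, s.Offset, s.Length)
	}
}
```

A request that crosses a chunk boundary yields one span per chunk, which is what makes concurrent multi-chunk I/O possible.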
Key features:
- Chunked storage with configurable chunk size
- Sparse device support (unwritten chunks treated as zeros)
- Concurrent multi-chunk I/O via errgroup
- Per-chunk locking for safe read-modify-write
- AWS SDK v2 with default credential chain
- In-process gofakes3 test server (no Docker needed)
- 12 unit tests + 2 integration tests
Also update the CI workflow to run the S3 backend tests and document
the S3 backend in the README.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add comprehensive benchmark suite (io_bench_test.go):
- BenchmarkEndToEndRead/Write: full SCSI stack (512B to 256KB)
- BenchmarkEndToEndReadParallel/WriteParallel: concurrent I/O
- BenchmarkFileBackingStoreRead/Write: isolated backing store
pprof-guided optimizations:
- Guard hot-path log.Debugf calls with a log.GetLevel() check in scsi.go,
sbc.go, and backingstore.go; this eliminates the 22% CPU overhead of
logrus Entry allocation, which occurred even with debug logging disabled
- Add FileBackingStore.ReadAt for zero-copy reads directly into
caller's buffer, bypassing Read()'s per-call make([]byte, tl)
- Use ReadAt via interface assertion in bsPerformCommand to read
directly into InSDBBuffer, eliminating allocation + copy
Results (256KB reads): +42% throughput, allocs reduced from 10 to 5
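
The ReadAt fast path can be sketched roughly as below. This is an illustration under assumptions, not the code in this commit: store, memStore, and readInto are hypothetical names, and the only real API relied on is the standard io.ReaderAt interface. The idea is that a caller holding a destination buffer checks for ReadAt via an interface assertion and falls back to the allocating Read otherwise.

```go
package main

import (
	"fmt"
	"io"
)

// store is a hypothetical stand-in for a backing store interface
// whose Read allocates a fresh buffer on every call.
type store interface {
	Read(off, n int64) ([]byte, error)
}

// memStore is a toy in-memory store used only for this sketch.
type memStore struct{ data []byte }

func (m *memStore) Read(off, n int64) ([]byte, error) {
	if off < 0 || off > int64(len(m.data)) {
		return nil, io.ErrUnexpectedEOF
	}
	buf := make([]byte, n) // per-call allocation
	copy(buf, m.data[off:])
	return buf, nil
}

// ReadAt implements io.ReaderAt: it fills the caller's buffer directly.
func (m *memStore) ReadAt(p []byte, off int64) (int, error) {
	if off < 0 || off >= int64(len(m.data)) {
		return 0, io.EOF
	}
	n := copy(p, m.data[off:])
	if n < len(p) {
		return n, io.EOF
	}
	return n, nil
}

// readInto prefers ReadAt when the store provides it, so no
// intermediate buffer is allocated or copied.
func readInto(s store, dst []byte, off int64) (int, error) {
	if ra, ok := s.(io.ReaderAt); ok {
		return ra.ReadAt(dst, off)
	}
	buf, err := s.Read(off, int64(len(dst)))
	if err != nil {
		return 0, err
	}
	return copy(dst, buf), nil
}

func main() {
	s := &memStore{data: []byte("abcdefgh")}
	dst := make([]byte, 4)
	n, err := readInto(s, dst, 2)
	fmt.Println(n, err, string(dst))
}
```

Stores that do not implement ReadAt keep working through the fallback, so the optimization stays opt-in per backing store.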
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Read path: eliminate a redundant allocation in bsPerformCommand by
removing the pre-allocation before bs.Read() and the append loop used for
zero-fill; copy directly and zero-fill in place instead
- parseHeader: use command pool (getCommand) instead of direct allocation,
reducing GC pressure on the hot path
- Unmap: use a shared 1MB zero buffer instead of allocating per-descriptor,
dramatically reducing allocations for large unmap operations
- Network I/O: add a 256KB bufio.Writer to iSCSI connections, batching
small PDU writes into fewer syscalls; flush after txHandler completes
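
The effect of the buffered writer can be sketched as below. This is an assumption-laden illustration rather than the code in this commit: countingWriter and sendPDUs are hypothetical, the 48-byte payloads stand in for small PDUs, and the writer is created per call here for brevity, whereas the change above attaches it to the connection.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
)

// countingWriter stands in for a network connection and counts how
// many Write calls (roughly, syscalls) reach it.
type countingWriter struct {
	calls int
	bytes int
}

func (c *countingWriter) Write(p []byte) (int, error) {
	c.calls++
	c.bytes += len(p)
	return len(p), nil
}

// sendPDUs writes every PDU through a 256 KiB bufio.Writer and
// flushes once at the end, so small writes are coalesced.
func sendPDUs(w io.Writer, pdus [][]byte) error {
	bw := bufio.NewWriterSize(w, 256*1024)
	for _, p := range pdus {
		if _, err := bw.Write(p); err != nil {
			return err
		}
	}
	return bw.Flush() // without this, buffered data would never be sent
}

func main() {
	pdus := make([][]byte, 100)
	for i := range pdus {
		pdus[i] = make([]byte, 48)
	}
	cw := &countingWriter{}
	if err := sendPDUs(cw, pdus); err != nil {
		fmt.Println("send failed:", err)
		return
	}
	fmt.Printf("underlying writes: %d, bytes: %d\n", cw.calls, cw.bytes)
}
```

The trade-off is that a response is not on the wire until Flush runs, which is why the flush point matters.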
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The UNMAP command was a no-op in all backing stores, causing unmapped
blocks to retain stale data instead of returning zeros per SCSI spec.
- Implement Unmap in FileBackingStore to zero out unmapped blocks
- Implement Unmap in IOUringBackingStore to zero out unmapped blocks
- Enable Unmap in RemBackingStore (was commented out)
- Change UnmapBlockDescriptor.TL from uint32 to uint64 to prevent
integer overflow when converting block count to byte length with
large block shifts
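
The overflow and the zeroing can be illustrated with a small sketch. It is not the code in this commit: byteLen, zeroRange, and zeroBuf are hypothetical names, an in-memory slice stands in for the backing file, and reusing a single zero buffer is just one way to avoid per-descriptor allocations.

```go
package main

import "fmt"

// zeroBuf is a shared 1 MiB block of zeros that is only ever read.
var zeroBuf = make([]byte, 1<<20)

// byteLen converts a block count to a byte length in 64-bit
// arithmetic, so large counts do not wrap around.
func byteLen(blocks uint64, blockShift uint) uint64 {
	return blocks << blockShift
}

// zeroRange overwrites dev[off:off+length] with zeros, copying from
// zeroBuf in pieces instead of allocating a buffer per call.
func zeroRange(dev []byte, off, length uint64) {
	for length > 0 {
		n := uint64(len(zeroBuf))
		if n > length {
			n = length
		}
		copy(dev[off:off+n], zeroBuf[:n])
		off += n
		length -= n
	}
}

func main() {
	// 2^23 blocks of 512 bytes is 4 GiB, which does not fit in 32 bits.
	blocks := uint32(1 << 23)
	const blockShift = 9
	fmt.Println("uint32:", blocks<<blockShift)                      // wraps around
	fmt.Println("uint64:", byteLen(uint64(blocks), blockShift)) // full length

	dev := []byte{1, 2, 3, 4, 5, 6, 7, 8}
	zeroRange(dev, 2, 4)
	fmt.Println(dev)
}
```

With a 32-bit length, the shifted value silently wraps, so an UNMAP covering 4 GiB or more would zero too little or nothing at all.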
Fixes #119
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- support Block Limits VPD page (0xB0)
- add UNMAP to REPORT SUPPORTED OPERATION CODES
- READ CAPACITY(16): set LBPME when thin provisioning is enabled
- move Thinprovisioning and BlockShift to config
- add Unmap to BackingStore
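
As a rough sketch of the LBPME change, the READ CAPACITY(16) parameter data could be built as below. This is not the code in this commit: readCapacity16 and its parameters are hypothetical, only the fields relevant here are filled in, and the layout (last LBA in bytes 0-7, block length in bytes 8-11, LBPME as bit 7 of byte 14) follows my reading of SBC-3.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// readCapacity16 builds a 32-byte READ CAPACITY(16) response.
// numBlocks must be greater than zero. Fields not set here
// (protection, alignment, LBPRZ) are left as zero in this sketch.
func readCapacity16(numBlocks uint64, blockShift uint, thin bool) []byte {
	buf := make([]byte, 32)
	binary.BigEndian.PutUint64(buf[0:8], numBlocks-1)            // last LBA
	binary.BigEndian.PutUint32(buf[8:12], uint32(1)<<blockShift) // block length in bytes
	if thin {
		buf[14] |= 0x80 // LBPME: logical block provisioning management enabled
	}
	return buf
}

func main() {
	data := readCapacity16(2048, 9, true)
	fmt.Printf("last LBA %d, block length %d, LBPME %v\n",
		binary.BigEndian.Uint64(data[0:8]),
		binary.BigEndian.Uint32(data[8:12]),
		data[14]&0x80 != 0)
}
```

Initiators typically check this bit before issuing UNMAP, which is why it is tied to the thin provisioning setting.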