add end-to-end IO benchmarks and fix pprof-identified hotspots

Add comprehensive benchmark suite (io_bench_test.go):
- BenchmarkEndToEndRead/Write: full SCSI stack (512B to 256KB)
- BenchmarkEndToEndReadParallel/WriteParallel: concurrent IO
- BenchmarkFileBackingStoreRead/Write: isolated backing store
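
For illustration only, a size-swept read benchmark might look like the
sketch below. It is not the contents of io_bench_test.go: benchReadAt is
a hypothetical helper, it measures a plain os.File rather than the SCSI
stack, and it uses testing.Benchmark so it runs outside `go test`.

```go
package main

import (
	"fmt"
	"os"
	"testing"
)

// benchReadAt is a hypothetical helper (not from the commit). It measures
// os.File.ReadAt into a reused buffer for a single transfer size.
func benchReadAt(size int) testing.BenchmarkResult {
	return testing.Benchmark(func(b *testing.B) {
		f, err := os.CreateTemp("", "io-bench-*")
		if err != nil {
			b.Fatal(err)
		}
		defer os.Remove(f.Name())
		defer f.Close()

		// Fill the file so every read of `size` bytes succeeds.
		if _, err := f.Write(make([]byte, size)); err != nil {
			b.Fatal(err)
		}

		buf := make([]byte, size) // allocated once, outside the timed loop
		b.SetBytes(int64(size))   // lets the result report MB/s
		b.ReportAllocs()
		b.ResetTimer()
		for i := 0; i < b.N; i++ {
			if _, err := f.ReadAt(buf, 0); err != nil {
				b.Fatal(err)
			}
		}
	})
}

func main() {
	// Endpoints match the 512B to 256KB range named in the commit message;
	// the 4KB midpoint is an arbitrary choice for this sketch.
	for _, size := range []int{512, 4096, 256 * 1024} {
		r := benchReadAt(size)
		fmt.Printf("%7d B: %s %s\n", size, r.String(), r.MemString())
	}
}
```

Reporting bytes and allocations per operation alongside ns/op is what
makes throughput and alloc-count comparisons across commits possible.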

pprof-guided optimizations:
- Guard hot-path log.Debugf calls with a log.GetLevel() check in
  scsi.go, sbc.go, and backingstore.go. This removes the 22% CPU
  overhead from logrus Entry allocation, which was incurred even
  with debug logging disabled
- Add FileBackingStore.ReadAt for zero-copy reads directly into
  caller's buffer, bypassing Read()'s per-call make([]byte, tl)
- Use ReadAt via interface assertion in bsPerformCommand to read
  directly into InSDBBuffer, eliminating allocation + copy
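
The interface-assertion fast path can be sketched as follows. This is a
minimal illustration, not the project's code: BackingStore, readerAt,
fileStore, readInto and demo are hypothetical names, and the real
interface and bsPerformCommand signatures may differ.

```go
package main

import (
	"fmt"
	"os"
)

// BackingStore is a hypothetical stand-in for the project's interface;
// it only guarantees the allocating Read path.
type BackingStore interface {
	Read(offset, tl int64) ([]byte, error)
}

// readerAt is the optional fast path: read into a caller-owned buffer.
type readerAt interface {
	ReadAt(p []byte, offset int64) (int, error)
}

// fileStore is a hypothetical file-backed store implementing both paths.
type fileStore struct{ f *os.File }

// Read allocates a fresh buffer on every call (the cost being avoided).
func (s *fileStore) Read(offset, tl int64) ([]byte, error) {
	buf := make([]byte, tl)
	n, err := s.f.ReadAt(buf, offset)
	return buf[:n], err
}

// ReadAt fills the caller's buffer directly, with no extra allocation.
func (s *fileStore) ReadAt(p []byte, offset int64) (int, error) {
	return s.f.ReadAt(p, offset)
}

// readInto prefers ReadAt when the store provides it and otherwise falls
// back to Read followed by a copy into dst.
func readInto(bs BackingStore, dst []byte, offset int64) (int, error) {
	if ra, ok := bs.(readerAt); ok {
		return ra.ReadAt(dst, offset)
	}
	data, err := bs.Read(offset, int64(len(dst)))
	return copy(dst, data), err
}

// demo writes a small temp file and reads part of it through readInto.
func demo() (string, error) {
	f, err := os.CreateTemp("", "bs-demo-*")
	if err != nil {
		return "", err
	}
	defer os.Remove(f.Name())
	defer f.Close()
	if _, err := f.WriteString("hello world"); err != nil {
		return "", err
	}
	dst := make([]byte, 5)
	n, err := readInto(&fileStore{f: f}, dst, 6)
	if err != nil {
		return "", err
	}
	return string(dst[:n]), nil
}

func main() {
	got, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

Keeping ReadAt behind a type assertion means stores that implement only
Read keep working unchanged; they simply take the slower fallback.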

Results (256KB reads): +42% throughput, allocs reduced from 10 to 5

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Lei Xue
2026-03-14 19:41:48 +08:00
parent 87c25cf5cd
commit a5628f4ec0
5 changed files with 380 additions and 10 deletions


@@ -433,7 +433,9 @@ func SBCReadWrite(host int, cmd *api.SCSICommand) api.SAMStat {
 	// Calculate total blocks
 	totalBlocks = dev.Size >> dev.BlockShift
-	log.Debugf("SBCReadWrite: opcode=0x%x, lba=%d, tl=%d, totalBlocks=%d", opcode, lba, tl, totalBlocks)
+	if log.GetLevel() >= log.DebugLevel {
+		log.Debugf("SBCReadWrite: opcode=0x%x, lba=%d, tl=%d, totalBlocks=%d", opcode, lba, tl, totalBlocks)
+	}
 	// Verify that we are not doing i/o beyond the end-of-lun
 	// Even when transfer length is 0, we must validate the LBA is within range