add end-to-end IO benchmarks and fix pprof-identified hotspots

Add comprehensive benchmark suite (io_bench_test.go):
- BenchmarkEndToEndRead/Write: full SCSI stack (512B to 256KB)
- BenchmarkEndToEndReadParallel/WriteParallel: concurrent IO
- BenchmarkFileBackingStoreRead/Write: isolated backing store

pprof-guided optimizations:
- Guard hot-path log.Debugf with log.GetLevel() check in scsi.go,
  sbc.go, backingstore.go. This eliminates the 22% CPU overhead of
  logrus Entry allocation, which was incurred even with debug
  logging disabled
- Add FileBackingStore.ReadAt for zero-copy reads directly into
  caller's buffer, bypassing Read()'s per-call make([]byte, tl)
- Use ReadAt via interface assertion in bsPerformCommand to read
  directly into InSDBBuffer, eliminating allocation + copy

Results (256KB reads): +42% throughput, allocs reduced from 10 to 5

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 19:41:48 +08:00
parent 87c25cf5cd
commit a5628f4ec0
5 changed files with 380 additions and 10 deletions
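
The ReadAt change described in the commit message relies on an optional-interface check: if the backing store also implements io.ReaderAt, read straight into the caller's buffer, otherwise fall back to the allocating Read. A minimal sketch of that pattern follows. It is not the code from this commit: BackingStore, memStore, plainStore and readInto are hypothetical names chosen for illustration, and the real bsPerformCommand and InSDBBuffer are not reproduced here.

```go
package main

import (
	"fmt"
	"io"
)

// BackingStore is a hypothetical minimal interface. Read allocates a new
// slice on every call, which is the cost the fast path avoids.
type BackingStore interface {
	Read(offset, length int64) ([]byte, error)
}

// memStore is an illustrative in-memory store that also implements io.ReaderAt.
type memStore struct{ data []byte }

func (m *memStore) Read(offset, length int64) ([]byte, error) {
	if offset < 0 || length < 0 || offset+length > int64(len(m.data)) {
		return nil, io.ErrUnexpectedEOF
	}
	out := make([]byte, length) // per-call allocation
	copy(out, m.data[offset:offset+length])
	return out, nil
}

func (m *memStore) ReadAt(p []byte, off int64) (int, error) {
	if off < 0 || off >= int64(len(m.data)) {
		return 0, io.EOF
	}
	n := copy(p, m.data[off:])
	if n < len(p) {
		return n, io.EOF // short read, as the io.ReaderAt contract requires
	}
	return n, nil
}

// plainStore exposes only Read, so it exercises the fallback path.
type plainStore struct{ inner *memStore }

func (p plainStore) Read(offset, length int64) ([]byte, error) {
	return p.inner.Read(offset, length)
}

// readInto fills dst from the store at offset, preferring ReadAt when the
// store provides it.
func readInto(bs BackingStore, dst []byte, offset int64) (int, error) {
	if ra, ok := bs.(io.ReaderAt); ok {
		return ra.ReadAt(dst, offset) // no intermediate buffer
	}
	buf, err := bs.Read(offset, int64(len(dst)))
	if err != nil {
		return 0, err
	}
	return copy(dst, buf), nil // extra allocation plus copy
}

func main() {
	store := &memStore{data: []byte("abcdefgh")}
	dst := make([]byte, 4)
	n, err := readInto(store, dst, 2)
	fmt.Println(n, string(dst), err)
}
```

The interface assertion keeps stores without ReadAt working unchanged, so the fast path is opt-in per implementation.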
--- a/pkg/scsi/sbc.go
+++ b/pkg/scsi/sbc.go
@@ -433,7 +433,9 @@ func SBCReadWrite(host int, cmd *api.SCSICommand) api.SAMStat {

 	// Calculate total blocks
 	totalBlocks = dev.Size >> dev.BlockShift
-	log.Debugf("SBCReadWrite: opcode=0x%x, lba=%d, tl=%d, totalBlocks=%d", opcode, lba, tl, totalBlocks)
+	if log.GetLevel() >= log.DebugLevel {
+		log.Debugf("SBCReadWrite: opcode=0x%x, lba=%d, tl=%d, totalBlocks=%d", opcode, lba, tl, totalBlocks)
+	}

 	// Verify that we are not doing i/o beyond the end-of-lun
 	// Even when transfer length is 0, we must validate the LBA is within range