libiscsi

Author	SHA1	Message	Date
sanjay-cpu	e9cefe7e42	.travis.yml: Also build for the ppc64le architecture [ bvanassche: Added amd64 architecture and edited commit message ]	2020-08-12 20:07:10 -07:00
Bart Van Assche	44facb175b	Merge pull request #335 from qiankehan/iscsi-ls iscsi-ls: Fix iser url scheme parsing	2020-08-12 19:59:57 -07:00
Bart Van Assche	fb0f3691ed	Merge pull request #334 from heroin-moose/fix-hardcoded-block-size test-tool: Use block_size instead of hardcoded 512 bytes	2020-08-11 18:35:51 -07:00
Bart Van Assche	0749990afb	Merge pull request #333 from ddiss/iscsi-dd-cleanup Iscsi-dd cleanup	2020-08-11 18:34:32 -07:00
Han Han	6db782bb0a	iscsi-ls: Fix iser url scheme parsing libiscsi can parse two iSCSI URL schemes: 'iscsi://' and 'iser://'. Fix the missing iser parsing, introduced by `12222077`. Signed-off-by: Han Han <hhan@redhat.com>	2020-08-11 21:57:12 +08:00
Consus	9e160e02b6	test-tool: Use block_size instead of hardcoded 512 bytes Fix another couple of tests that fail on 4Kn drives: * iSCSI.iSCSITMF.AbortTaskSimpleAsync * iSCSI.iSCSITMF.LUNResetSimpleAsync	2020-08-10 15:59:49 +03:00
David Disseldorp	37e6e112b6	examples/iscsi-dd: use common init function for src and dst endpoints Remove a bunch of duplicate code by sharing a function for source and destination endpoint initialization. Signed-off-by: David Disseldorp <ddiss@suse.de>	2020-08-10 01:18:06 +02:00
David Disseldorp	eb4c8e20ff	examples/iscsi-dd: use common iscsi_endpoint struct Signed-off-by: David Disseldorp <ddiss@suse.de>	2020-08-10 00:35:28 +02:00
Ronnie Sahlberg	c54c4cd202	iscsi-perf: Add explicit casts to avoid two warnings Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>	2020-07-31 09:21:33 +10:00
Bart Van Assche	76f9578d1c	Merge pull request #332 from bytedance/fix_cancellation_handling Fix cancellation handling	2020-07-18 20:04:29 -07:00
Xie Yongji	ee47dc7338	socket: Make the pdu timeout handling aware of old iscsi context We should check the pdus in old iscsi context when scanning timeout tasks during reconnecting. Signed-off-by: Xie Yongji <xieyongji@bytedance.com>	2020-06-23 19:49:07 +08:00
wanghonghao	aad136e5b9	libiscsi: Make the cancellation aware of the pdus in old iscsi context We should check the pdus in old iscsi context when cancelling tasks. Signed-off-by: wanghonghao <wanghonghao@bytedance.com>	2020-06-23 19:48:26 +08:00
Xie Yongji	e9c1f10258	pdu: Remove the checking for iscsi->is_loggedin in iscsi_cancel_pdus() I don't see any problem with calling the callback during connect/login in iscsi_cancel_pdus(), so let's remove this check. Otherwise, we have no way to be aware of a cancellation during login, which can cause something like iscsi_login_sync() to hang. Signed-off-by: Xie Yongji <xieyongji@bytedance.com>	2020-06-23 19:45:18 +08:00
Xie Yongji	10868c491d	libiscsi: Avoid discontinuities in cmdsn ordering in some cases We should plug the cmdsn gap in order to continue to use the session when pdus are cancelled before being sent out. Signed-off-by: Xie Yongji <xieyongji@bytedance.com>	2020-06-23 19:45:14 +08:00
Bart Van Assche	0e9b29751c	Merge pull request #331 from heroin-moose/fix-block-size test-tool: Use block_size instead of hardcoded 512 bytes	2020-06-17 07:04:40 -07:00
Consus	7258fbfd83	test-tool: Use block_size instead of hardcoded 512 bytes This commit fixes some tests (like ProutReserve) on pure 4k devices.	2020-06-17 15:50:44 +03:00
Bart Van Assche	bddbc01829	Merge pull request #330 from tmakatos/master assorted RPM fixes	2020-06-05 21:13:44 -07:00
Thanos Makatos	9705017e37	exclude ld_iscsi.so from RPM This is an omission from commit `e6bcdf5fdb`. Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>	2020-06-02 02:33:27 -07:00
Thanos Makatos	f82a899fc5	include iser-private.h in make dist tarball Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>	2020-06-01 05:58:01 -07:00
Bart Van Assche	33c66f2c39	test-tool, compare and write: Reduce maximum number of blocks from 256 to 255 Since the NUMBER OF LOGICAL BLOCKS field in the COMPARE AND WRITE command is an 8 bit field, the maximum value that can be encoded is 255. Signed-off-by: Bart Van Assche <bvanassche@acm.org>	2020-05-21 18:39:49 -07:00
Bart Van Assche	b4d59cd29c	test_compareandwrite_invalid_dataout_size: Simplify this test Assign the NUMBER OF LOGICAL BLOCKS field in the COMPARE AND WRITE PDU directly. Use the terminology from SBC-4, namely NUMBER OF LOGICAL BLOCKS instead of TL. Signed-off-by: Bart Van Assche <bvanassche@acm.org>	2020-05-21 18:39:49 -07:00
Bart Van Assche	9fcdce3101	test-tool: Use asprintf() in sg_send_scsi_cmd() Use asprintf() instead of snprintf() + strdup(). Signed-off-by: Bart Van Assche <bvanassche@acm.org>	2020-05-15 10:40:00 -07:00
Bart Van Assche	b9effb556f	test-tool: Fix a comment in sg_send_scsi_cmd() Signed-off-by: Bart Van Assche <bvanassche@acm.org>	2020-05-14 11:50:57 -07:00
Bart Van Assche	6aa5acb659	test-tool: Split send_scsi_command() This patch does not change any functionality. Signed-off-by: Bart Van Assche <bvanassche@acm.org>	2020-05-14 11:47:56 -07:00
Bart Van Assche	e61d5d6241	Merge pull request #328 from bytedance/fix_rewrite_immediate_pdu_cmdsn socket: fix rewrite cmdsn of immediate pdus	2020-05-14 10:37:25 -07:00
wanghonghao	7e59b9bd23	socket: fix rewrite cmdsn of immediate pdus The cmdsn of a data-out pdu struct is less than `expcmdsn` since it comes from its cmd pdu. A data-out pdu doesn't actually carry a cmdsn on the wire, so this doesn't matter for the pdu itself, but if we rewrite the cmdsn of an immediate pdu with it, it will cause an error. Related error logs: libiscsi: iscsi_write_to_socket: outqueue[0]->cmdsn < expcmdsn (3648bab5 < 3648bab9) opcode 00 [iqn.2003-01.org.linux-iscsi.tgt0] libiscsi: reconnect initiated [iqn.2003-01.org.linux-iscsi.tgt0] libiscsi: connecting to portal 127.0.0.1 [iqn.2003-01.org.linux-iscsi.tgt0] libiscsi: connection established (127.0.0.1:62404 -> 127.0.0.1) [iqn.2003-01.org.linux-iscsi.tgt0] Signed-off-by: wanghonghao <wanghonghao@bytedance.com>	2020-05-09 12:22:00 +08:00
Bart Van Assche	4db1e0a463	Merge pull request #325 from bytedance/iser_fix_chap_and_reduce_mallocs iSER: fix compatibility with CHAP and eliminate unnecessary memory allocations	2020-04-14 19:00:03 -07:00
wanghonghao	a22a9bb7db	iser: eliminate unnecessary memory allocations Allocate `iser_pdu` from small allocation pool. Lifecycle of `iscsi_in_pdu` is inside the function in iSER transport. Allocate it on stack. Signed-off-by: wanghonghao <wanghonghao@bytedance.com>	2020-04-14 18:49:10 +08:00
wanghonghao	2b9b097c35	iser: use `login_resp_buf` until login is finished This commit fixes compatibility with CHAP. The iSER transport posts `login_resp_buf` (which is larger than `rx_desc`) as a work request (WR) only once, but there may be multiple requests and responses during the login phase (e.g. when CHAP is used), and login can't finish in such cases. Signed-off-by: wanghonghao <wanghonghao@bytedance.com>	2020-04-14 18:49:03 +08:00
Bart Van Assche	ce09f48f02	Merge pull request #324 from bytedance/iser_conn Fix iSER connection establishment	2020-04-11 08:59:44 -07:00
wanghonghao	a03744c80a	init: free iscsi->opaque before check mallocs/frees counter Signed-off-by: wanghonghao <wanghonghao@bytedance.com>	2020-04-11 12:46:44 +08:00
wanghonghao	0659c74302	reconnect: collect mallocs/frees of the previous reconnection Signed-off-by: wanghonghao <wanghonghao@bytedance.com>	2020-04-11 12:46:44 +08:00
wanghonghao	843a01cbd8	iser: aggregate ack completion queue (CQ) events Signed-off-by: wanghonghao <wanghonghao@bytedance.com>	2020-04-11 12:46:40 +08:00
wanghonghao	bd9524b4ce	iser: free tx_desc of queued/inflight pdus The tx_desc and memory region buffer assigned to iser pdus should be given back to the tx_desc list and allocator before all memory regions are freed. This may happen during reconnecting/disconnecting. Signed-off-by: wanghonghao <wanghonghao@bytedance.com>	2020-04-11 12:46:34 +08:00
wanghonghao	2212021747	iser: enhance connection procedure This patch fixes the following problems in the current connection method: 1. iscsi_iser_connect() waits until the connection is established or has failed, and may block the caller for a long time. 2. Although there is a cm_thread that handles communication events, it has no effect after the connection is established. 3. Resources are not released properly after a reconnection fails, and once we try to reconnect again, the resources leak permanently (see iscsi_reconnect()). This patch eliminates cm_thread and handles communication events in the caller thread. Connection procedure: 1. Create a mock fd with eventfd() (or just use old_iscsi->fd while reconnecting), and assign it to iscsi->fd. 2. Create the communication event channel, make it non-blocking and dup the notifier fd to iscsi->fd. 3. Handle communication events in the iscsi_which_events()/iscsi_service() loop until the connection is established or has failed. 4. If the connection is established successfully, dup the notifier fd of completion queue (CQ) events to iscsi->fd. 5. Handle completion queue (CQ) events in the iscsi_which_events()/iscsi_service() loop. The entire procedure is non-blocking. Once the connection is established, communication events are checked whenever iscsi_service() is called with revents=0 or queue_pdu() is called with a NOP pdu. When the connection fails, the iser transport cleans itself up before invoking callbacks. Signed-off-by: wanghonghao <wanghonghao@bytedance.com>	2020-04-07 10:38:37 +08:00
wanghonghao	cdcb35e6c6	iser: destroy communication events channel on release Signed-off-by: wanghonghao <wanghonghao@bytedance.com>	2020-04-06 20:46:58 +08:00
Bart Van Assche	b4b0a79164	Merge pull request #323 from bytedance/iser_memory_region_allocator iser: dynamic memory region allocator	2020-04-05 21:26:23 -07:00
wanghonghao	68ce3363aa	iser: dynamic memory region allocator Implement an allocator for allocating memory regions of different lengths. The allocator registers 4MB memory chunks as memory regions, and selects a free segment from one of them each time. 4KB is the minimum allocation unit, and free segments in the same chunk can be merged into a larger free segment by the rule of buddy allocation. As a result, the size of allocated segments is always a power of 2; this may waste some space but produces less fragmentation. In each chunk, a complete binary tree (which is actually an array) is used to maintain free segments. Each node records the order of the largest segment that can be allocated from its subtree. Here's a miniature example. A chunk with all segments free: level 4 4(0x1) level 3 3(0x2) 3(0x3) level 2 2(0x4) 2(0x5) 2(0x6) 2(0x7) level 1 1(0x8) 1(0x9) 1(0xa) 1(0xb) 1(0xc) 1(0xd) 1(0xe) 1(0xf) After allocating a 16KB (order=3) memory region: level 4 3(0x1) level 3 0(0x2) 3(0x3) level 2 2(0x4) 2(0x5) 2(0x6) 2(0x7) level 1 1(0x8) 1(0x9) 1(0xa) 1(0xb) 1(0xc) 1(0xd) 1(0xe) 1(0xf) It takes 1 comparison to determine whether a chunk can satisfy a request and at most 11 loops to find the leftmost free segment that meets the requirements. The value of each node is not more than 11, and an 8-bit integer is enough to store it, so only 2048 bytes are required for each tree. And since the entire tree is in a contiguous piece of memory and no rotations are needed, it's far more efficient than self-balancing trees of the same size. Different 4MB chunks are linked as a list, and the selection order is from head to tail each time. If no existing chunk can satisfy the allocation, the allocator registers another 4MB chunk and adds it to the tail. +---------+ +---------+ +---------+ |4MB chunk| ---> |4MB chunk| ---> |4MB chunk| +---------+ +---------+ +---------+ In most cases, smaller IOs can get memory regions from the first or second chunk without traversing much of the list, and if we really send a lot of large IOs, the cost of the traversal is rarely critical. Finally, the chunks can only provide memory regions of at most 4MB; if a larger memory region is needed, the allocator registers/deregisters a memory region directly for that buffer, bypassing the chunks. Signed-off-by: wanghonghao <wanghonghao@bytedance.com>	2020-04-01 11:35:42 +08:00
Bart Van Assche	0bbd90e0de	Merge pull request #321 from ddiss/xcopy_type_a0_sds test: XCOPY type 0x0A segment descriptor support	2020-03-18 11:52:52 -07:00
David Disseldorp	a89583aec3	test/xcopy_simple: add XCOPY test with 0x0A segment descriptor Check for 0x0A descriptor support via SCSI_COPY_RESULTS_OP_PARAMS. If supported, perform a 0x0A type XCOPY using target's maximum_segment_length (or 256M if larger). Signed-off-by: David Disseldorp <ddiss@suse.de>	2020-03-18 18:59:43 +01:00
David Disseldorp	c8992f45b1	test: add support for 0x0A type XCOPY segment descriptors Type 0x0A segment descriptors provide a 32-bit wide number-of-bytes field, allowing for much larger copy offloads with a single segment descriptor. Signed-off-by: David Disseldorp <ddiss@suse.de>	2020-03-18 18:59:34 +01:00
David Disseldorp	385c6e8ae9	test/xcopy_simple: zero destination range before copy Signed-off-by: David Disseldorp <ddiss@suse.de>	2020-03-18 18:59:34 +01:00
zhenwei pi	46e978ce97	iser: fix hang in rdma_destroy_id Hit iser hang in rdma_destroy_id with trace: #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185 #1 0x00007f96ecbbcbb3 in rdma_destroy_id () from /usr/lib/librdmacm.so.1 #2 0x00005632027311d4 in iser_conn_release (iser_conn=iser_conn@entry=0x7f96d4027440) at iser.c:261 #3 0x0000563202731428 in iscsi_iser_connect (iscsi=0x563205206c70, sa=<optimized out>, ai_family=<optimized out>) at iser.c:1516 #4 0x000056320273dd3c in iscsi_connect_async (iscsi=iscsi@entry=0x563205206c70, portal=portal@entry=0x563205207084 "210.32.124.205:3260", cb=cb@entry=0x56320272b220 <iscsi_connect_cb>, private_data=private_data@entry=0x7f96d4008b00) at socket.c:389 #5 0x000056320272b325 in iscsi_full_connect_async (iscsi=0x563205206c70, portal=0x563205207084 "210.32.124.205:3260", lun=1, cb=cb@entry=0x56320272aef0 <iscsi_reconnect_cb>, private_data=private_data@entry=0x0) at connect.c:230 #6 0x000056320272b711 in iscsi_reconnect (iscsi=<optimized out>) at connect.c:473 #7 0x00005632026810a8 in iscsi_timed_check_events (opaque=0x563205206ae0) at block/iscsi.c:387 Currently pthread_cancel is used to kill cmthread forcefully; cmthread may exit without calling rdma_ack_cm_event, leaving unacknowledged events in librdmacm. rdma_destroy_id hangs until the upper layer acks all the cm events. Since the cm thread handles the DISCONNECTED event after the qp is destroyed and then exits by itself, join the cm thread to wait for it to exit gracefully. Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>	2020-03-17 20:31:46 -07:00
zhenwei pi	dcf95f9780	iser: fix crash for sending pdu during reconnecting Hit the crash stack: #0 iser_initialize_headers (iser_pdu=0x7f1a3404ef50, iser_conn=0x0) at iser.c:514 #1 iscsi_iser_send_pdu (iscsi=0x7f1a3406d700, pdu=0x7f1a3404ef50) at iser.c:714 #2 0x000055e3160f0157 in iscsi_scsi_command_async (iscsi=0x7f1a3406d700, iscsi@entry=0x55e317fbcc70, lun=lun@entry=1, task=task@entry=0x7f1a34026610, cb=cb@entry=0x55e316044c10 <iscsi_co_generic_cb>, d=d@entry=0x7f15feeb7710, private_data=private_data@entry=0x7f15feeb77e0) at iscsi-command.c:282 #3 0x000055e3160f1616 in iscsi_write10_iov_task (iscsi=0x55e317fbcc70, lun=1, lba=lba@entry=10401896, data=data@entry=0x0, datalen=4096, blocksize=<optimized out>, wrprotect=0, dpo=0, fua=0, fua_nv=0, group_number=0, cb=0x55e316044c10 <iscsi_co_generic_cb>, private_data=0x7f15feeb77e0, iov=0x7f1a34042090, niov=1) at iscsi-command.c:1107 #4 0x000055e31604680f in iscsi_co_writev (bs=<optimized out>, sector_num=<optimized out>, nb_sectors=<optimized out>, iov=0x7f1a3404e380, flags=<optimized out>) at block/iscsi.c:640 #5 0x000055e31601e89c in bdrv_driver_pwritev (bs=bs@entry=0x55e317fb6570, offset=offset@entry=5325770752, bytes=bytes@entry=4096, qiov=qiov@entry=0x7f1a3404e380, qiov_offset=qiov_offset@entry=0, flags=flags@entry=0) at block/io.c:1220 The reason is that during async reconnection, before the reconnect callback function is invoked, we have closed the old connection and the new connection is not ready. At the same time, the upper layer still sends pdus to the old iscsi context. In this patch, until reconnecting succeeds, just add the pdu to waitpdu without sending it. As suggested by Bart, do not show iser related logs here. Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> [ bvanassche: reformatted patch ]	2020-03-17 20:27:49 -07:00
Bart Van Assche	37fb15e8eb	Merge pull request #320 from ddiss/xcopy_type_a0_sds_prep test/xcopy: minor clean up and refactoring	2020-03-17 17:13:48 -07:00
David Disseldorp	da7c1c4b0a	test: use spec defined XCOPY segment descriptor lengths The SPC4r37 specification defines XCOPY Segment Descriptor lengths for each SD type. The spec defined SD lengths don't account for the four SD header bytes. To make it easier to match the spec to the implementation, use the spec defined lengths in get_desc_len() with a four byte SD header addition. Signed-off-by: David Disseldorp <ddiss@suse.de>	2020-03-18 00:49:36 +01:00
David Disseldorp	8dbf6e51e2	test: use scsi_set_uintX marshalling helpers for XCOPY Signed-off-by: David Disseldorp <ddiss@suse.de>	2020-03-18 00:49:36 +01:00
Bart Van Assche	65caf10cab	iser: Remove a superfluous pointer check Checking whether a pointer is NULL after it has been dereferenced is not useful. This was detected by Coverity. Signed-off-by: Bart Van Assche <bvanassche@acm.org>	2020-03-07 12:37:51 -08:00
Bart Van Assche	bac94c18b8	libiscsi: Reduce the size of struct iscsi_context Reduce the size of struct iscsi_context by reordering the members of this data structure. Additionally, change the rdma_ack_timeout value from 'unsigned char' into 'uint8_t' to make it clear that this variable represents an integer. Signed-off-by: Bart Van Assche <bvanassche@acm.org>	2020-03-07 12:37:51 -08:00
zhenwei pi	6ed1ffb7b2	iser: support rdma ack timeout optimization Since commit 2c1619edef61a03cb516efaa81750784c3071d10 in the Linux kernel and 55843c4ab8f559679d28c559cc4d681836be769b in rdma-core, the RDMA CMA supports RDMA_OPTION_ID_ACK_TIMEOUT. This is useful for the RDMA out-of-sequence case. Because this feature was added recently, we have to check for it in autogen.sh before building the source code. Depending on the production environment, tuning the rdma ack timeout can yield the best performance. As suggested by Bart and Ronnie, instead of using a fixed timeout value, add two methods to set the rdma ack timeout value. 1. Add the URL variable 'LIBISCSI_RDMA_ACK_TIMEOUT'. This works for a specific connection. 2. Add the environment variable 'LIBISCSI_RDMA_ACK_TIMEOUT'. This works as a common setting for all the connections of a process. Testing under different packet loss rates and different ack timeouts, running fio (iodepth=1) in a guest OS, I got these results: latency under packet loss rate 0.00001: timeout 19: avg 170.22, pct99.9 215; timeout 10: avg 160.08, pct99.9 215; timeout 8: avg 146.39, pct99.9 177; timeout 7: avg 148.37, pct99.9 211. Latency under packet loss rate 0.0001: timeout 19: avg 949.23, pct99.9 306; timeout 10: avg 818.53, pct99.9 378; timeout 8: avg 615.84, pct99.9 189; timeout 7: avg 618.89, pct99.9 310. Based on this test report, setting the ack timeout to 8 (1048.576 usec) is a good choice in my test environment. Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>	2020-03-07 12:37:46 -08:00
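
Commit 6ed1ffb7b2 equates an ack timeout value of 8 with 1048.576 usec, which is consistent with the InfiniBand-style encoding in which the timeout is 4.096 usec multiplied by 2 to the power of the value. The following is a minimal sketch under that assumption; `rdma_ack_timeout_ns` is a hypothetical helper written for illustration and is not part of libiscsi or rdma-core. The 5-bit range check is likewise an assumption about the width of the field.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a libiscsi or rdma-core function).
 * Converts an encoded ack timeout value into nanoseconds, assuming
 * the encoding timeout = 4.096 usec * 2^value, i.e. 4096 ns << value.
 */
static uint64_t rdma_ack_timeout_ns(uint8_t value)
{
    assert(value <= 31);            /* assumed 5-bit field */
    return (uint64_t)4096 << value;
}
```

Under this assumption the four values tested in the commit message (7, 8, 10 and 19) correspond to roughly 0.52 ms, 1.05 ms, 4.19 ms and 2.15 s respectively.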

1994 Commits
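
The array-backed buddy tree described in commit 68ce3363aa can be sketched in a few functions. This is an illustrative reconstruction from the commit message only, not the code in iser.c: the names `buddy_tree`, `buddy_init`, `buddy_alloc`, `buddy_free` and `buddy_offset` are hypothetical, and the tree depth is made a parameter so that the 4-level miniature example from the commit message can be reproduced. Each node stores the order of the largest free segment in its subtree, index 1 is the root, and the children of node `n` are `2n` and `2n + 1`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_ORDER 11u   /* 4 MB chunk in 4 KB units: 2^10 leaves */
#define UNIT_SIZE 4096u /* minimum allocation unit in bytes */

struct buddy_tree {
    unsigned max_order;                   /* order of the root node */
    uint8_t  free_order[1u << MAX_ORDER]; /* index 0 unused: 2048 bytes */
};

/* Order of the segment a node covers: root = max_order, leaves = 1. */
static unsigned node_order(const struct buddy_tree *t, unsigned node)
{
    unsigned order = t->max_order;
    while (node > 1) { node >>= 1; order--; }
    return order;
}

static void buddy_init(struct buddy_tree *t, unsigned max_order)
{
    assert(max_order >= 1 && max_order <= MAX_ORDER);
    memset(t, 0, sizeof(*t));
    t->max_order = max_order;
    for (unsigned i = 1; i < (1u << max_order); i++)
        t->free_order[i] = (uint8_t)node_order(t, i);
}

/* Recompute ancestors of a node whose own order is `order`. */
static void update_ancestors(struct buddy_tree *t, unsigned node, unsigned order)
{
    for (unsigned p = node >> 1; p >= 1; p >>= 1, order++) {
        unsigned l = t->free_order[2 * p], r = t->free_order[2 * p + 1];
        if (l == order && r == order)   /* both buddies fully free: merge */
            t->free_order[p] = (uint8_t)(order + 1);
        else
            t->free_order[p] = (uint8_t)(l > r ? l : r);
    }
}

/* Returns the index of the leftmost fitting node, or 0 if the chunk is full. */
static unsigned buddy_alloc(struct buddy_tree *t, unsigned order)
{
    assert(order >= 1 && order <= t->max_order);
    if (t->free_order[1] < order)       /* one comparison per chunk */
        return 0;
    unsigned node = 1;
    for (unsigned level = t->max_order; level > order; level--) {
        unsigned left = 2 * node;
        node = (t->free_order[left] >= order) ? left : left + 1;
    }
    t->free_order[node] = 0;            /* mark the segment as allocated */
    update_ancestors(t, node, order);
    return node;
}

static void buddy_free(struct buddy_tree *t, unsigned node)
{
    unsigned order = node_order(t, node);
    t->free_order[node] = (uint8_t)order;
    update_ancestors(t, node, order);
}

/* Byte offset of a node's segment inside its chunk. */
static size_t buddy_offset(const struct buddy_tree *t, unsigned node)
{
    unsigned order = node_order(t, node);
    unsigned first = 1u << (t->max_order - order); /* leftmost node on level */
    return (size_t)(node - first) * ((size_t)UNIT_SIZE << (order - 1));
}
```

With `buddy_init(&t, 4)`, a call to `buddy_alloc(&t, 3)` selects node 0x2 and lowers the root value from 4 to 3, matching the "after allocating a 16KB (order=3) memory region" state in the commit message. Freeing both order-3 segments restores the root to 4, which is the buddy merge.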