Commit Graph

1998 Commits

Author SHA1 Message Date
David Disseldorp
54b3dcaa30 test-tool: add LogoutDuringIOAsync test case
This attempts to reproduce upstream LIO reports of a use after free bug
when logout occurs alongside concurrent I/O.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2020-08-18 17:12:28 +02:00
David Disseldorp
4080c09839 test-tool: rename async write dispatch/complete counters
Add an io_ prefix, to differentiate between I/O and future iSCSI Logout
request tracking.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2020-08-18 16:43:54 +02:00
Bart Van Assche
3f50a1462c Merge pull request #338 from ddiss/pdu_cancel_use_after_free
pdu: Fix use after free during cancellation
2020-08-18 07:00:27 -07:00
David Disseldorp
87272919ad pdu: fix use after free during cancellation
Fixes: 10868c4 ("libiscsi: Avoid discontinuities in cmdsn ordering in some cases")
Signed-off-by: David Disseldorp <ddiss@suse.de>
2020-08-18 15:38:43 +02:00
sanjay-cpu
e9cefe7e42 .travis.yml: Also build for the ppc64le architecture
[ bvanassche: Added amd64 architecture and edited commit message ]
2020-08-12 20:07:10 -07:00
Bart Van Assche
44facb175b Merge pull request #335 from qiankehan/iscsi-ls
iscsi-ls: Fix iser url scheme parsing
2020-08-12 19:59:57 -07:00
Bart Van Assche
fb0f3691ed Merge pull request #334 from heroin-moose/fix-hardcoded-block-size
test-tool: Use block_size instead of hardcoded 512 bytes
2020-08-11 18:35:51 -07:00
Bart Van Assche
0749990afb Merge pull request #333 from ddiss/iscsi-dd-cleanup
Iscsi-dd cleanup
2020-08-11 18:34:32 -07:00
Han Han
6db782bb0a iscsi-ls: Fix iser url scheme parsing
Libiscsi supports parsing two iSCSI URL schemes: 'iscsi://' and 'iser://'.
Fix the missing iser parsing, a regression introduced by 12222077.

Signed-off-by: Han Han <hhan@redhat.com>
2020-08-11 21:57:12 +08:00
Consus
9e160e02b6 test-tool: Use block_size instead of hardcoded 512 bytes
Fix another couple of tests that fail on 4Kn drives:

* iSCSI.iSCSITMF.AbortTaskSimpleAsync
* iSCSI.iSCSITMF.LUNResetSimpleAsync
2020-08-10 15:59:49 +03:00
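
The idea behind the block_size fix above can be sketched as follows. This is a minimal illustration, not the test-tool's actual code: io_buffer_len() is a hypothetical helper, and block_size is assumed to hold the logical block size reported by the device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: size of a data buffer covering num_blocks logical
 * blocks. Deriving the length from the device's block size (instead of a
 * hardcoded 512) keeps the transfer length correct on 4Kn drives.
 */
static size_t io_buffer_len(uint32_t num_blocks, uint32_t block_size)
{
    assert(block_size > 0);
    return (size_t)num_blocks * block_size;
}
```

With a hardcoded 512, an 8-block transfer on a 4Kn device would be sized at 4096 bytes instead of the 32768 bytes the device expects.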
David Disseldorp
37e6e112b6 examples/iscsi-dd: use common init function for src and dst endpoints
Remove a bunch of duplicate code by sharing a function for source and
destination endpoint initialization.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2020-08-10 01:18:06 +02:00
David Disseldorp
eb4c8e20ff examples/iscsi-dd: use common iscsi_endpoint struct
Signed-off-by: David Disseldorp <ddiss@suse.de>
2020-08-10 00:35:28 +02:00
Ronnie Sahlberg
c54c4cd202 iscsi-perf: Add explicit casts to avoid two warnings
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
2020-07-31 09:21:33 +10:00
Bart Van Assche
76f9578d1c Merge pull request #332 from bytedance/fix_cancellation_handling
Fix cancellation handling
2020-07-18 20:04:29 -07:00
Xie Yongji
ee47dc7338 socket: Make the pdu timeout handling aware of old iscsi context
We should check the pdus in the old iscsi context when
scanning for timed-out tasks during a reconnect.

Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
2020-06-23 19:49:07 +08:00
wanghonghao
aad136e5b9 libiscsi: Make the cancellation aware of the pdus in old iscsi context
We should check the pdus in old iscsi context when
cancelling tasks.

Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
2020-06-23 19:48:26 +08:00
Xie Yongji
e9c1f10258 pdu: Remove the checking for iscsi->is_loggedin in iscsi_cancel_pdus()
I don't see any problem with calling the callback during
connect/login in iscsi_cancel_pdus(), so let's remove this
check. Otherwise, there is no way to notice a cancellation
during login, which can cause calls such as
iscsi_login_sync() to hang.

Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
2020-06-23 19:45:18 +08:00
Xie Yongji
10868c491d libiscsi: Avoid discontinuities in cmdsn ordering in some cases
We should plug the cmdsn gap so that the session
remains usable when pdus are cancelled before being
sent.

Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
2020-06-23 19:45:14 +08:00
Bart Van Assche
0e9b29751c Merge pull request #331 from heroin-moose/fix-block-size
test-tool: Use block_size instead of hardcoded 512 bytes
2020-06-17 07:04:40 -07:00
Consus
7258fbfd83 test-tool: Use block_size instead of hardcoded 512 bytes
This commit fixes some tests (like ProutReserve) on pure 4k devices.
2020-06-17 15:50:44 +03:00
Bart Van Assche
bddbc01829 Merge pull request #330 from tmakatos/master
assorted RPM fixes
2020-06-05 21:13:44 -07:00
Thanos Makatos
9705017e37 exclude ld_iscsi.so from RPM
This is an omission from commit
e6bcdf5fdb.

Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
2020-06-02 02:33:27 -07:00
Thanos Makatos
f82a899fc5 include iser-private.h in make dist tarball
Signed-off-by: Thanos Makatos <thanos.makatos@nutanix.com>
2020-06-01 05:58:01 -07:00
Bart Van Assche
33c66f2c39 test-tool, compare and write: Reduce maximum number of blocks from 256 to 255
Since the NUMBER OF LOGICAL BLOCKS field in the COMPARE AND WRITE command
is an 8 bit field, the maximum value that can be encoded is 255.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
2020-05-21 18:39:49 -07:00
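
The 8-bit limit described above can be illustrated with a small sketch. The helper name caw_encode_blocks() and the constant CAW_MAX_BLOCKS are hypothetical and are not taken from the test-tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Largest value that fits in the 8-bit NUMBER OF LOGICAL BLOCKS field. */
#define CAW_MAX_BLOCKS UINT8_MAX

/*
 * Hypothetical helper: store a block count in the 8-bit field.
 * Returns 0 on success, -1 if the count cannot be represented.
 */
static int caw_encode_blocks(uint32_t blocks, uint8_t *field)
{
    assert(field != NULL);
    if (blocks > CAW_MAX_BLOCKS)
        return -1;
    *field = (uint8_t)blocks;
    return 0;
}
```

Without the range check, a count of 256 truncated to 8 bits would become 0, which is why the tests cap the count at 255.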
Bart Van Assche
b4d59cd29c test_compareandwrite_invalid_dataout_size: Simplify this test
Assign the NUMBER OF LOGICAL BLOCKS field in the COMPARE AND WRITE PDU
directly. Use the terminology from SBC-4, namely NUMBER OF LOGICAL BLOCKS
instead of TL.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
2020-05-21 18:39:49 -07:00
Bart Van Assche
9fcdce3101 test-tool: Use asprintf() in sg_send_scsi_cmd()
Use asprintf() instead of snprintf() + strdup().

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
2020-05-15 10:40:00 -07:00
Bart Van Assche
b9effb556f test-tool: Fix a comment in sg_send_scsi_cmd()
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
2020-05-14 11:50:57 -07:00
Bart Van Assche
6aa5acb659 test-tool: Split send_scsi_command()
This patch does not change any functionality.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
2020-05-14 11:47:56 -07:00
Bart Van Assche
e61d5d6241 Merge pull request #328 from bytedance/fix_rewrite_immediate_pdu_cmdsn
socket: fix rewrite cmdsn of immediate pdus
2020-05-14 10:37:25 -07:00
wanghonghao
7e59b9bd23 socket: fix rewrite cmdsn of immediate pdus
The cmdsn of a data-out pdu struct is less than `expcmdsn` since it is copied
from its cmd pdu. A data-out pdu doesn't actually carry a cmdsn on the wire, so
the value doesn't matter for the data-out pdu itself, but if we rewrite the
cmdsn of an immediate pdu with it, it will cause an error.

Related error logs:
libiscsi: iscsi_write_to_socket: outqueue[0]->cmdsn < expcmdsn (3648bab5 < 3648bab9) opcode 00 [iqn.2003-01.org.linux-iscsi.tgt0]
libiscsi: reconnect initiated [iqn.2003-01.org.linux-iscsi.tgt0]
libiscsi: connecting to portal 127.0.0.1 [iqn.2003-01.org.linux-iscsi.tgt0]
libiscsi: connection established (127.0.0.1:62404 -> 127.0.0.1) [iqn.2003-01.org.linux-iscsi.tgt0]

Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
2020-05-09 12:22:00 +08:00
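
The "cmdsn < expcmdsn" check in the log above compares 32-bit sequence numbers that wrap around. iSCSI sequence numbers use serial number arithmetic, so a plain unsigned comparison is not enough near the wrap point. A minimal sketch of such a comparison follows; serial32_compare() is a name chosen for this illustration and is not claimed to match libiscsi's own helper.

```c
#include <stdint.h>

/*
 * Compare two 32-bit serial numbers.
 * Returns -1 if a precedes b, 0 if they are equal, 1 if a follows b.
 * The case where the two values are exactly 2^31 apart is undefined in
 * serial number arithmetic; this sketch arbitrarily reports -1 for it.
 */
static int serial32_compare(uint32_t a, uint32_t b)
{
    if (a == b)
        return 0;
    /* The unsigned difference wraps modulo 2^32, which is what we want. */
    return (uint32_t)(a - b) < 0x80000000u ? 1 : -1;
}
```

For the values in the log, serial32_compare(0x3648bab5, 0x3648bab9) reports that the queued cmdsn precedes expcmdsn, which is the condition that triggered the reconnect.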
Bart Van Assche
4db1e0a463 Merge pull request #325 from bytedance/iser_fix_chap_and_reduce_mallocs
iSER: fix compatibility with CHAP and eliminate unnecessary memory allocations
2020-04-14 19:00:03 -07:00
wanghonghao
a22a9bb7db iser: eliminate unnecessary memory allocations
Allocate `iser_pdu` from the small allocation pool.
In the iSER transport, the lifetime of `iscsi_in_pdu` is confined to a single
function, so allocate it on the stack.

Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
2020-04-14 18:49:10 +08:00
wanghonghao
2b9b097c35 iser: use login_resp_buf until login is finished
This commit fixes compatibility with CHAP.

The iSER transport only posts `login_resp_buf` (which is larger than `rx_desc`)
as a work request (WR) once, but there may be multiple requests and responses
during the login phase (e.g. when CHAP is used), and login can't finish in such
cases.

Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
2020-04-14 18:49:03 +08:00
Bart Van Assche
ce09f48f02 Merge pull request #324 from bytedance/iser_conn
Fix iSER connection establishment
2020-04-11 08:59:44 -07:00
wanghonghao
a03744c80a init: free iscsi->opaque before check mallocs/frees counter
Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
2020-04-11 12:46:44 +08:00
wanghonghao
0659c74302 reconnect: collect mallocs/frees of the previous reconnection
Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
2020-04-11 12:46:44 +08:00
wanghonghao
843a01cbd8 iser: aggregate ack completion queue (CQ) events
Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
2020-04-11 12:46:40 +08:00
wanghonghao
bd9524b4ce iser: free tx_desc of queued/inflight pdus
The tx_desc and memory region buffer assigned to iser pdus should be returned
to the tx_desc list and the allocator before all memory regions are freed.
This may happen during reconnecting/disconnecting.

Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
2020-04-11 12:46:34 +08:00
wanghonghao
2212021747 iser: enhance connection procedure
This patch fixes the following problems in the current connection
method:
1. iscsi_iser_connect() waits until the connection is established or has
failed, and may block the caller for a long time.
2. Although a cm_thread handles communication events, it has no effect
after the connection is established.
3. Resources are not released properly after a reconnection fails, and once
we try to reconnect again, the resources are leaked permanently
(see iscsi_reconnect()).

This patch eliminates cm_thread and handles communication events in the
caller's thread.
Connection procedure:
1. Create a mock fd with eventfd() (or just use old_iscsi->fd while
reconnecting), and assign it to iscsi->fd.
2. Create the communication event channel, make it non-blocking and dup its
notifier fd to iscsi->fd.
3. Handle communication events in the iscsi_which_events()/iscsi_service()
loop until the connection is established or has failed.
4. If the connection is established successfully, dup the notifier fd of
completion queue (CQ) events to iscsi->fd.
5. Handle completion queue (CQ) events in the
iscsi_which_events()/iscsi_service() loop.
The entire procedure is non-blocking.

Once the connection is established, communication events are checked whenever
iscsi_service() is called with revents=0 or queue_pdu() is called with a
NOP pdu.

When the connection fails, the iser transport cleans itself up before the
callbacks are invoked.

Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
2020-04-07 10:38:37 +08:00
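
The fd handling in steps 1, 2 and 4 above relies on keeping one stable fd number for the application to poll while the underlying event source changes. A minimal Linux-only sketch of that technique follows. The helper names are hypothetical, and an ordinary pipe stands in for the RDMA notifier fds in the usage example; this is not the iser.c code.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Step 1: a placeholder fd the application can poll before any real
 * event source exists. */
static int make_placeholder_fd(void)
{
    return eventfd(0, EFD_NONBLOCK);
}

/* Steps 2 and 4: make the notifier non-blocking, then alias it onto the
 * stable fd number so existing pollers transparently see the new source. */
static int redirect_fd(int stable_fd, int notifier_fd)
{
    int flags;

    assert(stable_fd >= 0 && notifier_fd >= 0);
    flags = fcntl(notifier_fd, F_GETFL);
    if (flags < 0 || fcntl(notifier_fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;
    return dup2(notifier_fd, stable_fd) < 0 ? -1 : 0;
}

/* Non-blocking readiness check, as an event loop would perform it. */
static int fd_readable(int fd)
{
    struct pollfd p = { .fd = fd, .events = POLLIN };

    return poll(&p, 1, 0) == 1 && (p.revents & POLLIN) != 0;
}
```

Because dup2() replaces what the stable fd number refers to, the caller's event loop does not need to learn about a new descriptor when the transport moves from communication events to completion queue events.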
wanghonghao
cdcb35e6c6 iser: destroy communication events channel on release
Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
2020-04-06 20:46:58 +08:00
Bart Van Assche
b4b0a79164 Merge pull request #323 from bytedance/iser_memory_region_allocator
iser: dynamic memory region allocator
2020-04-05 21:26:23 -07:00
wanghonghao
68ce3363aa iser: dynamic memory region allocator
Implement an allocator for memory regions of different lengths.
The allocator registers 4MB memory chunks as memory regions, and selects a
free segment from one of them for each allocation.

4KB is the minimum allocation unit, and free segments in the same chunk can be
merged into a larger free segment following the buddy allocation rule. As a
result, the size of every allocated segment is a power of 2; this may waste
some space but causes less fragmentation.

In each chunk, a complete binary tree (which is actually an array) is used
to track free segments. Each node records the order of the largest segment
that can be allocated from its subtree. Here's a miniature example.

A chunk with all segments free:
level 4                         4(0x1)
level 3            3(0x2)                      3(0x3)
level 2     2(0x4)        2(0x5)       2(0x6)        2(0x7)
level 1 1(0x8) 1(0x9) 1(0xa) 1(0xb) 1(0xc) 1(0xd) 1(0xe) 1(0xf)

After allocating a 16KB (order=3) memory region:
level 4                         3(0x1)
level 3            0(0x2)                      3(0x3)
level 2     2(0x4)        2(0x5)       2(0x6)        2(0x7)
level 1 1(0x8) 1(0x9) 1(0xa) 1(0xb) 1(0xc) 1(0xd) 1(0xe) 1(0xf)

It takes 1 comparison to determine whether a chunk can satisfy a request and
at most 11 loop iterations to find the leftmost free segment that meets the
requirements.
The value of each node is at most 11, so an 8-bit integer is enough
to store it and only 2048 bytes are required for each tree. And since the
entire tree is in a contiguous piece of memory and no rotations are needed,
it's far more efficient than self-balancing trees of the same size.

Different 4MB chunks are linked as a list, and the selection order is from
head to tail each time. If no existing chunk can satisfy the allocation,
the allocator registers another 4MB chunk and adds it to the tail.
+---------+       +---------+      +---------+
|4MB chunk| --->  |4MB chunk| ---> |4MB chunk|
+---------+       +---------+      +---------+

In most cases, smaller IOs can get memory regions from the first or
second chunk without traversing much of the list, and if we really send a lot
of large IOs, the cost of the traversal is rarely critical.

Finally, the chunks can only provide memory regions of up to 4MB; if a larger
memory region is needed, the allocator registers/deregisters a memory region
directly, bypassing the chunks.

Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
2020-04-01 11:35:42 +08:00
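
The array-backed buddy tree described in the commit above can be sketched for a single chunk as follows. This is a minimal illustration under stated assumptions, not the code from iser.c: all names are hypothetical, memory registration and the chunk list are left out, and the free path (which the commit message does not spell out) is one straightforward way to merge buddies.

```c
#include <assert.h>
#include <stdint.h>

#define BUDDY_MAX_ORDER 11                       /* 4MB chunk = 2^10 units of 4KB */
#define BUDDY_NODES     (1u << BUDDY_MAX_ORDER)  /* nodes 1..2047, index 0 unused */
#define BUDDY_UNIT      4096u

struct buddy_chunk {
    uint8_t order[BUDDY_NODES];  /* largest order allocatable from each subtree */
};

/* Index of the first node on a level (the root is level BUDDY_MAX_ORDER). */
static unsigned level_first(int level)
{
    return 1u << (BUDDY_MAX_ORDER - level);
}

static uint8_t max_u8(uint8_t a, uint8_t b)
{
    return a > b ? a : b;
}

static void buddy_init(struct buddy_chunk *c)
{
    c->order[0] = 0;
    for (int level = BUDDY_MAX_ORDER; level >= 1; level--)
        for (unsigned n = level_first(level); n < 2 * level_first(level); n++)
            c->order[n] = (uint8_t)level;
}

/* Returns the byte offset of a free segment of size 4KB << (order - 1),
 * or -1 if the chunk cannot satisfy the request. */
static long buddy_alloc(struct buddy_chunk *c, int order)
{
    unsigned node = 1;
    int level = BUDDY_MAX_ORDER;

    assert(order >= 1 && order <= BUDDY_MAX_ORDER);
    if (c->order[1] < order)      /* one comparison decides for the chunk */
        return -1;
    while (level > order) {       /* descend, preferring the left child */
        node *= 2;
        if (c->order[node] < order)
            node++;
        level--;
    }
    c->order[node] = 0;           /* mark the whole segment as used */
    long offset = (long)(node - level_first(level)) *
                  (long)(BUDDY_UNIT << (level - 1));
    for (unsigned p = node / 2; p >= 1; p /= 2)
        c->order[p] = max_u8(c->order[2 * p], c->order[2 * p + 1]);
    return offset;
}

/* Assumed free path: restore the node, then merge fully free buddies. */
static void buddy_free(struct buddy_chunk *c, long offset, int order)
{
    unsigned node = level_first(order) +
                    (unsigned)(offset / (long)(BUDDY_UNIT << (order - 1)));

    assert(c->order[node] == 0);
    c->order[node] = (uint8_t)order;
    for (int level = order + 1; level <= BUDDY_MAX_ORDER; level++) {
        node /= 2;
        uint8_t l = c->order[2 * node], r = c->order[2 * node + 1];
        c->order[node] = (l == level - 1 && r == level - 1)
                             ? (uint8_t)level : max_u8(l, r);
    }
}
```

The sketch uses the full 11 levels rather than the 4-level miniature from the diagrams. Allocating order 3 from a fresh chunk zeroes the leftmost level-3 node and lowers the root from 11 to 10, mirroring the root dropping from 4 to 3 in the miniature example.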
Bart Van Assche
0bbd90e0de Merge pull request #321 from ddiss/xcopy_type_a0_sds
test: XCOPY type 0x0A segment descriptor support
2020-03-18 11:52:52 -07:00
David Disseldorp
a89583aec3 test/xcopy_simple: add XCOPY test with 0x0A segment descriptor
Check for 0x0A descriptor support via SCSI_COPY_RESULTS_OP_PARAMS. If
supported, perform a 0x0A type XCOPY using target's
maximum_segment_length (or 256M if larger).

Signed-off-by: David Disseldorp <ddiss@suse.de>
2020-03-18 18:59:43 +01:00
David Disseldorp
c8992f45b1 test: add support for 0x0A type XCOPY segment descriptors
Type 0x0A segment descriptors provide a 32-bit wide number-of-bytes
field, allowing for much larger copy offloads with a single segment
descriptor.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2020-03-18 18:59:34 +01:00
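
To make the size difference concrete, the sketch below compares how much data a single segment descriptor can describe. It assumes, based on my reading of SPC-4 rather than on this log, that the older 0x02 descriptor carries a 16-bit number-of-blocks field. The helper names are hypothetical, and the 256M cap mirrors the limit mentioned in commit a89583aec3.

```c
#include <stdint.h>

#define XCOPY_TEST_CAP (256ull * 1024 * 1024)  /* 256M cap used by the test */

/* Assumption: a type 0x02 descriptor has a 16-bit block count. */
static uint64_t sd02_max_bytes(uint32_t block_size)
{
    return (uint64_t)UINT16_MAX * block_size;
}

/* A type 0x0A descriptor has a 32-bit byte count. */
static uint64_t sd0a_max_bytes(void)
{
    return UINT32_MAX;
}

/* Copy length for the test: the target's maximum segment length,
 * or 256M if that is larger. */
static uint64_t xcopy_test_len(uint64_t maximum_segment_length)
{
    return maximum_segment_length < XCOPY_TEST_CAP
               ? maximum_segment_length : XCOPY_TEST_CAP;
}
```

With 512-byte blocks, a 0x02 descriptor tops out just under 32MB under this assumption, while a 0x0A descriptor can describe almost 4GB.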
David Disseldorp
385c6e8ae9 test/xcopy_simple: zero destination range before copy
Signed-off-by: David Disseldorp <ddiss@suse.de>
2020-03-18 18:59:34 +01:00
zhenwei pi
46e978ce97 iser: fix hang in rdma_destroy_id
Hit iser hang in rdma_destroy_id with trace:
 #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
 #1  0x00007f96ecbbcbb3 in rdma_destroy_id () from /usr/lib/librdmacm.so.1
 #2  0x00005632027311d4 in iser_conn_release (iser_conn=iser_conn@entry=0x7f96d4027440) at iser.c:261
 #3  0x0000563202731428 in iscsi_iser_connect (iscsi=0x563205206c70, sa=<optimized out>, ai_family=<optimized out>)
     at iser.c:1516
 #4  0x000056320273dd3c in iscsi_connect_async (iscsi=iscsi@entry=0x563205206c70,
     portal=portal@entry=0x563205207084 "210.32.124.205:3260", cb=cb@entry=0x56320272b220 <iscsi_connect_cb>,
     private_data=private_data@entry=0x7f96d4008b00) at socket.c:389
 #5  0x000056320272b325 in iscsi_full_connect_async (iscsi=0x563205206c70,
     portal=0x563205207084 "210.32.124.205:3260", lun=1, cb=cb@entry=0x56320272aef0 <iscsi_reconnect_cb>,
     private_data=private_data@entry=0x0) at connect.c:230
 #6  0x000056320272b711 in iscsi_reconnect (iscsi=<optimized out>) at connect.c:473
 #7  0x00005632026810a8 in iscsi_timed_check_events (opaque=0x563205206ae0) at block/iscsi.c:387

Currently pthread_cancel is used to kill the cm thread forcefully. The cm
thread may exit without calling rdma_ack_cm_event, leaving an
unacknowledged event in librdmacm. rdma_destroy_id then hangs until the
upper layer acks all cm events.

After the qp is destroyed, the cm thread handles the DISCONNECTED event
and exits by itself, so join the cm thread to wait for it to exit
gracefully.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
2020-03-17 20:31:46 -07:00
zhenwei pi
dcf95f9780 iser: fix crash for sending pdu during reconnecting
Hit the crash stack:
 #0  iser_initialize_headers (iser_pdu=0x7f1a3404ef50, iser_conn=0x0) at iser.c:514
 #1  iscsi_iser_send_pdu (iscsi=0x7f1a3406d700, pdu=0x7f1a3404ef50) at iser.c:714
 #2  0x000055e3160f0157 in iscsi_scsi_command_async (iscsi=0x7f1a3406d700, iscsi@entry=0x55e317fbcc70,
     lun=lun@entry=1, task=task@entry=0x7f1a34026610, cb=cb@entry=0x55e316044c10 <iscsi_co_generic_cb>,
     d=d@entry=0x7f15feeb7710, private_data=private_data@entry=0x7f15feeb77e0) at iscsi-command.c:282
 #3  0x000055e3160f1616 in iscsi_write10_iov_task (iscsi=0x55e317fbcc70, lun=1, lba=lba@entry=10401896,
     data=data@entry=0x0, datalen=4096, blocksize=<optimized out>, wrprotect=0, dpo=0, fua=0, fua_nv=0,
     group_number=0, cb=0x55e316044c10 <iscsi_co_generic_cb>, private_data=0x7f15feeb77e0, iov=0x7f1a34042090,
     niov=1) at iscsi-command.c:1107
 #4  0x000055e31604680f in iscsi_co_writev (bs=<optimized out>, sector_num=<optimized out>,
     nb_sectors=<optimized out>, iov=0x7f1a3404e380, flags=<optimized out>) at block/iscsi.c:640
 #5  0x000055e31601e89c in bdrv_driver_pwritev (bs=bs@entry=0x55e317fb6570, offset=offset@entry=5325770752,
     bytes=bytes@entry=4096, qiov=qiov@entry=0x7f1a3404e380, qiov_offset=qiov_offset@entry=0, flags=flags@entry=0)
     at block/io.c:1220

The reason is that during async reconnection, before the reconnect
callback function is invoked, the old connection has been closed
and the new connection is not ready.
At the same time, the upper layer still sends pdus to the old iscsi context.

With this patch, until the reconnect succeeds, the pdu is just added to
waitpdu without being sent.
As suggested by Bart, do not show iser related log here.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
[ bvanassche: reformatted patch ]
2020-03-17 20:27:49 -07:00
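
The approach taken by the fix above, parking PDUs instead of sending them while no usable connection exists, can be sketched roughly as follows. The structures and function names are hypothetical stand-ins and do not reflect libiscsi's actual data structures or its waitpdu handling.

```c
#include <assert.h>
#include <stddef.h>

struct pdu {
    struct pdu *next;
    int sent;
};

struct conn {
    int reconnecting;        /* set while no usable connection exists */
    struct pdu *wait_head;   /* PDUs parked until the reconnect completes */
    struct pdu *wait_tail;
};

static void wait_append(struct conn *c, struct pdu *p)
{
    p->next = NULL;
    if (c->wait_tail)
        c->wait_tail->next = p;
    else
        c->wait_head = p;
    c->wait_tail = p;
}

/* Returns 1 if the PDU was sent immediately, 0 if it was parked. */
static int submit_pdu(struct conn *c, struct pdu *p)
{
    assert(c != NULL && p != NULL);
    if (c->reconnecting) {
        wait_append(c, p);   /* never touch the torn-down transport */
        return 0;
    }
    p->sent = 1;             /* stand-in for handing the PDU to the transport */
    return 1;
}

/* Called once the new connection is ready; returns the number of PDUs sent. */
static int reconnect_done(struct conn *c)
{
    int n = 0;

    c->reconnecting = 0;
    for (struct pdu *p = c->wait_head; p != NULL; p = p->next) {
        p->sent = 1;
        n++;
    }
    c->wait_head = c->wait_tail = NULL;
    return n;
}
```

The key property is that the send path never dereferences transport state (iser_conn in the crash above) while a reconnect is in progress.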
Bart Van Assche
37fb15e8eb Merge pull request #320 from ddiss/xcopy_type_a0_sds_prep
test/xcopy: minor clean up and refactoring
2020-03-17 17:13:48 -07:00
David Disseldorp
da7c1c4b0a test: use spec defined XCOPY segment descriptor lengths
The SPC4r37 specification defines XCOPY Segment Descriptor lengths for
each SD type. The spec defined SD lengths don't account for the four SD
header bytes. To make it easier to match the spec to the implementation,
use the spec defined lengths in get_desc_len() with a four byte SD
header addition.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2020-03-18 00:49:36 +01:00
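
The length convention described above can be sketched as follows. This is an illustration, not the test code's get_desc_len(): the function and enum names are hypothetical, and the two DESCRIPTOR LENGTH values (0x18 and 0x1C) are quoted from my reading of SPC-4 and should be checked against the specification.

```c
#define XCOPY_SD_HDR_LEN 4   /* header bytes not counted by the spec length */

enum sd_type {
    SD_BLK_TO_BLK     = 0x02,
    SD_BLK_TO_BLK_OFF = 0x0A
};

/* Spec-defined DESCRIPTOR LENGTH for a segment descriptor type,
 * or -1 for a type this sketch does not know. */
static int sd_spec_len(int type)
{
    switch (type) {
    case SD_BLK_TO_BLK:
        return 0x18;   /* assumed value, verify against SPC-4 */
    case SD_BLK_TO_BLK_OFF:
        return 0x1C;   /* assumed value, verify against SPC-4 */
    default:
        return -1;
    }
}

/* Bytes the descriptor occupies in the parameter list:
 * spec length plus the four header bytes. */
static int sd_total_len(int type)
{
    int len = sd_spec_len(type);

    return len < 0 ? -1 : len + XCOPY_SD_HDR_LEN;
}
```

Keeping the spec value and the header size separate lets each constant be checked directly against the specification's tables.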