After commit 10868c491d the CmdSN tests no
longer work because iscsi->cmdsn is incremented during the second test
phase (sending "good" TUR after "bad" TUR), so the resulting CmdSN is
not ExpCmdSN but ExpCmdSN + 1, which the target does not acknowledge.
This commit fixes the issue by setting iscsi->cmdsn to iscsi->expcmdsn - 1.
Affected tests:
* iSCSI.iSCSIcmdsn.iSCSICmdSnTooLow
* iSCSI.iSCSIcmdsn.iSCSICmdSnTooHigh
In the test iSCSI.iSCSITMF.LUNResetSimpleAsync
CU_ASSERT_EQUAL(reconnect_succeeded, 1) must be called after the async
TMF command completes. Hence move that assert into the TMF completion
callback.
This patch fixes a race condition.
[bvanassche: edited commit message]
This attempts to reproduce upstream LIO reports of a use-after-free bug
when logout occurs alongside concurrent I/O.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Libiscsi supports parsing two iSCSI URL schemes: 'iscsi://' and 'iser://'.
Fix the missing iser parsing introduced by 12222077.
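For illustration, recognizing both schemes can be sketched as a small prefix
table. The enum and function names below are hypothetical and are not the
identifiers libiscsi uses; only the two scheme strings come from this commit.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical transport tags; libiscsi's own names may differ. */
enum url_transport {
    URL_TRANSPORT_NONE = 0,
    URL_TRANSPORT_TCP,
    URL_TRANSPORT_ISER
};

/*
 * Recognize the scheme prefix of a URL.  On success, return the transport
 * and point *rest at the text following "://".  On failure, return
 * URL_TRANSPORT_NONE and set *rest to NULL.
 */
static enum url_transport parse_url_scheme(const char *url, const char **rest)
{
    static const struct {
        const char *prefix;
        enum url_transport transport;
    } schemes[] = {
        { "iscsi://", URL_TRANSPORT_TCP },
        { "iser://",  URL_TRANSPORT_ISER },
    };
    size_t i;

    assert(url != NULL && rest != NULL);
    for (i = 0; i < sizeof(schemes) / sizeof(schemes[0]); i++) {
        size_t len = strlen(schemes[i].prefix);

        if (strncmp(url, schemes[i].prefix, len) == 0) {
            *rest = url + len;
            return schemes[i].transport;
        }
    }
    *rest = NULL;
    return URL_TRANSPORT_NONE;
}
```

Keeping the schemes in one table means a newly supported scheme cannot be
forgotten in one of several separate string comparisons.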
Signed-off-by: Han Han <hhan@redhat.com>
Remove a bunch of duplicate code by sharing a function for source and
destination endpoint initialization.
Signed-off-by: David Disseldorp <ddiss@suse.de>
I don't see any problem with calling the callback
during connect/login in iscsi_cancel_pdus(), so let's
remove this check. Otherwise, we have no way to be aware
of a cancellation during login, which can cause calls such as
iscsi_login_sync() to hang.
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
We should plug the cmdsn gap so that the session
can continue to be used when pdus are cancelled before
being sent out.
Signed-off-by: Xie Yongji <xieyongji@bytedance.com>
Since the NUMBER OF LOGICAL BLOCKS field in the COMPARE AND WRITE command
is an 8 bit field, the maximum value that can be encoded is 255.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Assign the NUMBER OF LOGICAL BLOCKS field in the COMPARE AND WRITE PDU
directly. Use the terminology from SBC-4, namely NUMBER OF LOGICAL BLOCKS
instead of TL.
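A minimal sketch of the two points above: the field is assigned directly and
values above 255 are rejected. The helper name is hypothetical and this is
not the libiscsi implementation. The byte offsets (opcode 0x89, LBA in bytes
2..9, NUMBER OF LOGICAL BLOCKS in byte 13) follow my reading of the SBC
COMPARE AND WRITE CDB layout and should be checked against the standard.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define COMPARE_AND_WRITE_OPCODE  0x89
#define COMPARE_AND_WRITE_CDB_LEN 16

/*
 * Fill a 16-byte COMPARE AND WRITE CDB (hypothetical helper).
 * Returns 0 on success, or -1 if num_blocks does not fit into the
 * 8-bit NUMBER OF LOGICAL BLOCKS field.
 */
static int build_compare_and_write_cdb(uint8_t cdb[COMPARE_AND_WRITE_CDB_LEN],
                                       uint64_t lba, uint32_t num_blocks)
{
    int i;

    assert(cdb != NULL);
    if (num_blocks > UINT8_MAX)
        return -1;

    memset(cdb, 0, COMPARE_AND_WRITE_CDB_LEN);
    cdb[0] = COMPARE_AND_WRITE_OPCODE;
    /* Bytes 2..9: LOGICAL BLOCK ADDRESS, big-endian. */
    for (i = 0; i < 8; i++)
        cdb[2 + i] = (uint8_t)(lba >> (8 * (7 - i)));
    /* Byte 13: NUMBER OF LOGICAL BLOCKS, assigned directly. */
    cdb[13] = (uint8_t)num_blocks;
    return 0;
}
```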
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
The cmdsn of a data-out pdu struct is less than `expcmdsn` because it is copied
from its cmd pdu. A data-out pdu doesn't actually carry a cmdsn on the wire, so
the value doesn't matter for the pdu itself, but if we rewrite the cmdsn of an
immediate pdu with it, it will cause an error.
Related error logs:
libiscsi: iscsi_write_to_socket: outqueue[0]->cmdsn < expcmdsn (3648bab5 < 3648bab9) opcode 00 [iqn.2003-01.org.linux-iscsi.tgt0]
libiscsi: reconnect initiated [iqn.2003-01.org.linux-iscsi.tgt0]
libiscsi: connecting to portal 127.0.0.1 [iqn.2003-01.org.linux-iscsi.tgt0]
libiscsi: connection established (127.0.0.1:62404 -> 127.0.0.1) [iqn.2003-01.org.linux-iscsi.tgt0]
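The "cmdsn < expcmdsn" check in the log above is a comparison of 32-bit
sequence numbers that may wrap around. A minimal sketch of such a comparison
using serial number arithmetic follows. The function name is hypothetical and
the code is not taken from libiscsi.

```c
#include <stdint.h>

/*
 * Compare two 32-bit sequence numbers with serial number arithmetic.
 * Returns -1 if a precedes b, 0 if they are equal, 1 if a follows b.
 * The unsigned subtraction wraps, so the result stays correct when the
 * counter rolls over from 0xffffffff to 0.
 */
static int serial32_compare(uint32_t a, uint32_t b)
{
    uint32_t diff = a - b;

    if (diff == 0)
        return 0;
    return (diff & 0x80000000u) ? -1 : 1;
}
```

With the values from the log, serial32_compare(0x3648bab5, 0x3648bab9) is
negative, which is the condition reported before the reconnect.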
Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
Allocate `iser_pdu` from the small allocation pool.
The lifecycle of `iscsi_in_pdu` is confined to the function in the iSER
transport, so allocate it on the stack.
Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
This commit fixes compatibility with CHAP.
The iSER transport posts `login_resp_buf` (which is larger than `rx_desc`) as a
work request (WR) only once, but there may be multiple requests and responses
during the login phase (e.g. when CHAP is used), and login can't finish in such
cases.
Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
The tx_desc and memory region buffers assigned to iser pdus should be given
back to the tx_desc list and the allocator before all memory regions are freed.
This may happen during reconnecting/disconnecting.
Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
This patch fixes the following problems in the current connection
method:
1. iscsi_iser_connect() waits until the connection is established or has failed,
and may block the caller for a long time.
2. Although a cm_thread handles communication events, it has no effect
after the connection is established.
3. Resources are not released properly after a reconnection fails, and once we
try to reconnect again, the resources leak permanently
(see iscsi_reconnect()).
This patch eliminates cm_thread and handles communication events in the caller
thread.
Connection procedure:
1. Create a mock fd with eventfd() (or just use old_iscsi->fd while
reconnecting), and assign it to iscsi->fd.
2. Create a communication event channel, make it non-blocking and dup the
notifier fd to iscsi->fd.
3. Handle communication events in an iscsi_which_events()/iscsi_service() loop
until the connection is established or has failed.
4. If the connection is established successfully, dup the notifier fd of
completion queue (CQ) events to iscsi->fd.
5. Handle completion queue (CQ) events in an
iscsi_which_events()/iscsi_service() loop.
The entire procedure is non-blocking.
Once the connection is established, communication events are checked whenever
iscsi_service() is called with revents=0 or queue_pdu() is called with a NOP pdu.
When the connection fails, the iser transport cleans itself up before invoking
callbacks.
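The fd handling in steps 1, 2 and 4 can be sketched with plain POSIX/Linux
calls. This is an illustration under assumptions, not the iSER transport code:
the helper names are hypothetical, and in the usage note a pipe stands in for
the notifier fd of an event channel.

```c
#include <fcntl.h>
#include <poll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Step 1: create a placeholder fd whose number the caller keeps polling. */
static int create_stable_fd(void)
{
    return eventfd(0, EFD_NONBLOCK);
}

/*
 * Steps 2 and 4: make notifier_fd non-blocking, then let stable_fd refer
 * to the same open file.  The fd number seen by the caller never changes.
 */
static int retarget_stable_fd(int stable_fd, int notifier_fd)
{
    int flags = fcntl(notifier_fd, F_GETFL);

    if (flags < 0 || fcntl(notifier_fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;
    return dup2(notifier_fd, stable_fd) < 0 ? -1 : 0;
}

/* Zero-timeout readiness check, standing in for one event loop pass. */
static int fd_has_input(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    return poll(&pfd, 1, 0) == 1 && (pfd.revents & POLLIN);
}
```

A caller would create the stable fd once, call retarget_stable_fd() with the
read end of a pipe, and see fd_has_input() on the stable fd turn true after a
write to the pipe. Because dup2() replaces what the fd number refers to, an
application that registered that number in its own poll set does not need to
be told that the underlying channel changed.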
Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
Implement an allocator for memory regions of different lengths.
The allocator registers 4MB memory chunks as memory regions, and selects a
free segment from one of them for each allocation.
4KB is the minimum allocation unit, and free segments in the same chunk can be
merged into a larger free segment by the rules of buddy allocation. As a result,
the size of every allocated segment is a power of 2; this may waste some space
but produces less fragmentation.
In each chunk, a complete binary tree (which is actually an array) is used
to maintain free segments. Each node records the order of the largest segment
that can be allocated from its subtree. Here's a miniature example.
A chunk with all segments free:
level 4                                4(0x1)
level 3                3(0x2)                          3(0x3)
level 2        2(0x4)          2(0x5)          2(0x6)          2(0x7)
level 1    1(0x8)  1(0x9)  1(0xa)  1(0xb)  1(0xc)  1(0xd)  1(0xe)  1(0xf)
After allocating a 16KB (order=3) memory region:
level 4                                3(0x1)
level 3                0(0x2)                          3(0x3)
level 2        2(0x4)          2(0x5)          2(0x6)          2(0x7)
level 1    1(0x8)  1(0x9)  1(0xa)  1(0xb)  1(0xc)  1(0xd)  1(0xe)  1(0xf)
It takes 1 comparison to determine whether a chunk can satisfy a request, and
at most 11 loops to find the leftmost free segment that meets the requirements.
The value of each node is not more than 11, so an 8-bit integer is enough
to store it, and only 2048 bytes are required for each tree. And since the
entire tree is in a contiguous piece of memory and no rotations are needed,
it's far more efficient than self-balancing trees of the same size.
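The tree described above can be sketched as follows. This is a minimal
illustration and not the allocator added by this patch: all names are
hypothetical, offsets are expressed in minimum units (4KB in the text), and
memory registration is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

#define BUDDY_MAX_ORDER 11  /* 2^(11-1) = 1024 units of 4KB = 4MB */

struct buddy_tree {
    int max_order;                       /* level of the root node */
    uint8_t node[1 << BUDDY_MAX_ORDER];  /* heap layout, node[0] unused */
};

static uint8_t max_u8(uint8_t a, uint8_t b) { return a > b ? a : b; }

/* Mark every segment free: each node's value equals its own level. */
static void buddy_init(struct buddy_tree *t, int max_order)
{
    int level, i;

    assert(max_order >= 1 && max_order <= BUDDY_MAX_ORDER);
    t->max_order = max_order;
    for (level = max_order; level >= 1; level--) {
        int first = 1 << (max_order - level);  /* first index on level */

        for (i = first; i < 2 * first; i++)
            t->node[i] = (uint8_t)level;
    }
}

/* Recompute the ancestors of node idx, which sits on the given level. */
static void buddy_fix_ancestors(struct buddy_tree *t, int idx, int level)
{
    while (idx > 1) {
        uint8_t l, r;

        idx /= 2;
        level++;
        l = t->node[2 * idx];
        r = t->node[2 * idx + 1];
        /* Two fully free children merge into one larger free segment. */
        t->node[idx] = (l == level - 1 && r == level - 1)
                       ? (uint8_t)level : max_u8(l, r);
    }
}

/* Returns the offset in minimum units, or -1 if the chunk is too full. */
static int buddy_alloc(struct buddy_tree *t, int order)
{
    int idx = 1, level = t->max_order;

    if (order < 1 || t->node[1] < order)  /* the single comparison */
        return -1;
    while (level > order) {               /* one step per level */
        idx *= 2;                         /* prefer the left child */
        if (t->node[idx] < order)
            idx++;
        level--;
    }
    t->node[idx] = 0;                     /* children are left untouched */
    buddy_fix_ancestors(t, idx, level);
    return (idx - (1 << (t->max_order - order))) << (order - 1);
}

static void buddy_free(struct buddy_tree *t, int offset, int order)
{
    int idx = (1 << (t->max_order - order)) + (offset >> (order - 1));

    t->node[idx] = (uint8_t)order;
    buddy_fix_ancestors(t, idx, order);
}
```

With buddy_init(&t, 4), a call to buddy_alloc(&t, 3) returns offset 0 and
leaves node 0x1 at 3 and node 0x2 at 0 while nodes 0x4 and 0x5 keep the value
2, which matches the second diagram. Freeing the segment restores node 0x2 to
3 and the root to 4.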
The 4MB chunks are linked as a list, and they are tried in order from
head to tail each time. If no existing chunk can satisfy the allocation,
the allocator registers another 4MB chunk and adds it to the tail.
+---------+ +---------+ +---------+
|4MB chunk| ---> |4MB chunk| ---> |4MB chunk|
+---------+ +---------+ +---------+
In most cases, smaller IOs get their memory regions from the first or
second chunk without traversing much of the list, and if we really send a lot
of large IOs, the cost of the traversal is rarely critical.
Finally, a chunk can serve a memory region of at most 4MB. If a larger
memory region is needed, the allocator registers/deregisters a memory region
directly, bypassing the chunks.
Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
Check for 0x0A descriptor support via SCSI_COPY_RESULTS_OP_PARAMS. If
supported, perform a 0x0A type XCOPY using the target's
maximum_segment_length (or 256M if larger).
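The decision logic can be sketched as two small helpers. Both names are
hypothetical, and the flat list of descriptor type codes is an assumption
about how the parsed operating parameters are exposed to the test, not a
libiscsi structure.

```c
#include <stddef.h>
#include <stdint.h>

#define XCOPY_SEG_DESCR_TYPE_0A 0x0a
#define XCOPY_LEN_CAP (256u * 1024u * 1024u)  /* 256M, the cap named above */

/* Return 1 if wanted appears in the list of implemented descriptor types. */
static int descriptor_type_supported(const uint8_t *types, size_t count,
                                     uint8_t wanted)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (types[i] == wanted)
            return 1;
    }
    return 0;
}

/* Use the target's maximum_segment_length, capped at 256M. */
static uint32_t xcopy_segment_bytes(uint32_t maximum_segment_length)
{
    return maximum_segment_length > XCOPY_LEN_CAP ? XCOPY_LEN_CAP
                                                  : maximum_segment_length;
}
```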
Signed-off-by: David Disseldorp <ddiss@suse.de>
Type 0x0A segment descriptors provide a 32-bit wide number-of-bytes
field, allowing for much larger copy offloads with a single segment
descriptor.
Signed-off-by: David Disseldorp <ddiss@suse.de>