When an iSCSI connection enters the reconnection phase, the backoff
time (next_reconnect) increases with reconnection retry_cnt. However,
if the client detects that the target has recovered before reaching
next_reconnect, calling iscsi_reconnect/iscsi_force_reconnect has no
any effect, making fast reconnection impossible.
This patch introduces an interface to reset next_reconnect, so that
the client can reset the backoff time upon detecting target recovery
and achieve faster reconnection.
Resolves: https://github.com/sahlberg/libiscsi/issues/428
Signed-off-by: raywang <honglei.wang@smartx.com>
Implement a more generic wrapper API for message digests, so
that it is easier to also include gnutls as an option.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Originally, we use this in scsi-lowlevel.c only, this works as static
function. It also could be used to dump ISCSI opcode, so move it into
common utils.h/utils.c.
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Describe the reason again:
A WRITE16 command[w] handles R2T, and queues DATAOUT PDU m,x,y,z:
outqueue->DATAOUT[x]->DATAOUT[y]->DATAOUT[z]...
outqueue_current->DATAOUT[m]
waitqueue->WRITE16[w]...
1, Once x, y, z gets released in initiator side, the target still expects the remaining
DATAOUT PDUs.
2, Once command w timeout and callback to uplayer, uplayers usually releases memory of
iscsi task(include memory referenced by iovec.iov_base). DATAOUT[m] would access
invalid memory iovce.iov_base.
So invoke WRITEx command callback until draining DATAOUT PDUs.
Co-developed-by: Rui Zhang <zhangrui.1203@bytedance.com>
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Once error occurs on socker read/write, libiscsi tries to reconnect
silently. Add necessary log message for this case.
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Now we have more friendly opcode message once error occurs:
libiscsi:1 command timed out from waitqueue [...]
libiscsi:2 PDU header: 01 c1 ... e9 88[READ16] 00 00 00 ...
^(human readable opcode string)
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
Distinguish command timeout from outqueue or waitqueue. For example,
A command sequence: ...NOP,READ,WRITE...
NOP OUT command has no dependence on backend media, it is expected to
response soon. Once NOP timeout in libiscsi:
a, the command is already sent to target, the target side does *not*
response.
b, the command is still pending in libiscsi.
Separate the two cases for trouble shooting.
Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
The existing implementation of iscsi_task_mgmt_lun_reset_async cancels
all tasks in ready-to-send and wait-for-completion queues.
If the ISCSI context has in-flight tasks for a different LUNs or
tasks that are not LUN-specific (such as NOPIN, NOPOUT), those tasks
are not supposed to be affected by the LUN reset.
Also, the tasks for the LUN being reset may have in-flight responses
not affected by a concurrent LUN reset; they have to be handled
accordingly.
This change cancels only the tasks for the LUN being reset if they are
in the ready-to-send queue ('outqueue'). The tasks in the wait-for-
completion queue should be cancelled on LUN reset completion.
For example:
iscsi_task_mgmt_lun_reset_async(iscsi, lun, lun_reset_cb, ctxt);
....
....
void lun_reset_cb(struct iscsi_context * iscsi, int status,
void * command_data, void * private_data)
{
// 'response' field per ISCSI spec rfc7143 section 11.6.1
uint8_t iscsi_response = *(uint8_t *)command_data;
if (iscsi_response == 0) {
// The LUN has been reset. No further replies are expected
// for in-flight tasks for that LUN. Explicitly cancelling
// the tasks in wait-for-completion queue.
for (.. scsi_task-s in flight ..) {
iscsi_scsi_cancel_task(iscsi, task);
}
} ...
}
Users need to check if a task is queued to send (not sent yet)
before invoking functions such as iscsi_task_mgmt_abort_task_async.
Otherwise, because task management requests are automatically
treated as "immediate", a request to abort a task is sent before
the task itself.
if (iscsi_scsi_is_task_in_outqueue(iscsi_, task_)) {
iscsi_scsi_cancel_task(iscsi_, task_);
} else {
iscsi_task_mgmt_abort_task_async(iscsi_, task_, AbortCb, context_);
}
When execute iscsi_task_mgmt_lun_reset_async function,
pdus are already removed from waitpdu list. In iscsi_service
function, this will call iscsi_process_pdu and release
pdu from waitpdu again, which cause segmentation fault.
Whether waitpud list is NULL should be checked here to avoid
the problem.
Signed-off-by: geruijun <geruijun@huawei.com>
If a test sets the use_immediate_data parameter to ISCSI_IMMEDIATE_DATA_NO
for the iSCSI context, then the test expects that a data associated with
the Write command to be sent in a separate PDU.
But if for execute the command it is necessary to login on a target then
the use_immediate_data was previously set, will be rewrite during
the processing of the Login Response packet.
This happen during the iSCSI.iSCSIdatasn.iSCSIDataSnInvalid
(test_iscsi_datasn_invalid.c) test:
--> iSCSI 114 SCSI: Write(10) LUN: 0x01 (LBA: 0x00000064, Len: 1)
<-- iSCSI 114 Ready To Transfer
--> iSCSI 578 SCSI: Data Out LUN: 0x01 (Write(10) Request Data)
--> iSCSI 550 Login Command
Here we lose use_immediate_data value for iSCSI session.
<-- iSCSI 426 Login Response (Success)
--> iSCSI 114 SCSI: Test Unit Ready LUN: 0x01
<-- iSCSI 114 SCSI: Response LUN: 0x01 (Test Unit Ready) (Good)
And this Write command includes payload into iSCSI PDU packet, but should not do it.
--> iSCSI 578 SCSI: Write(10) LUN: 0x01 (LBA: 0x00000064, Len: 1)SCSI: Data Out LUN: 0x01 (Write(10) Request Data)
<-- iSCSI 114 SCSI: Response LUN: 0x01 (Write(10)) (Good)
--> iSCSI 114 SCSI: Write(10) LUN: 0x01 (LBA: 0x00000064, Len: 2)
<-- iSCSI 114 Ready To Transfer
--> iSCSI 578 SCSI: Data Out LUN: 0x01 (Write(10) Request Data)
--> iSCSI 626 SCSI: Data Out LUN: 0x01 (Write(10) Request Data)
Signed-off-by: Sergey Samoylenko <s.samoylenko@yadro.com>
In iscsi_send_data_out() a PDU is allocated, but there is no error
handling logic to free it if the PDU cannot be queued.
iscsi_allocate_pdu() may allocate memory for the PDU. This memory may be
leaked if iscsi_queue_pdu() fails since there is no call to free it.
Orignally there was a free() call but it was removed as a part of a
cleanup adding checks for NULL pdu callbacks.
Fixes: 423b82efa4 ("pdu: check callback for NULL everywhere")
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
If a connection attempt is hung, then iscsi_reconnect() won't do anything. This
makes sense if we'd just re-try to connect to the same target, but if (for
example) login redirect might point us to a different, healthy, target, it
should be possible to restart the full connection process on request.
Signed-off-by: John Levon <john.levon@nutanix.com>
Instead of adding __attribute__((unused)) to unused arguments, add the
-Wno-unused-parameter compiler flag.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Instead of defining the macro _R_(), define __attribute__() as a macro for
compilers that do not support __attribute__(), namely Microsoft Visual
Studio.
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
As iser_pdu->desc->data_dir is not initialised when sending a PDU.
The value remains what it was when it was used last time. Thus
a PDU could be considered to have data if it previously had and
might cause segmentation fault.
For example if a pdu is a reset task management task with no data
to transfer and the pdu is previously used as a read task. Thus
it would cause fault like below:
> struct scsi_iovector *iovector_in = &task->iovector_in;
0 0x00007ffff7bcb2d1 in iser_rcv_completion (rx_desc=0x555555b79e48, iser_conn=0x555555b573a0) at iser.c:1349
1 0x00007ffff7bcb53e in iser_handle_wc (wc=0x7fffffffdc00, iser_conn=0x555555b573a0) at iser.c:1426
2 0x00007ffff7bcb685 in cq_event_handler (iser_conn=0x555555b573a0) at iser.c:1468
3 0x00007ffff7bcb81b in cq_handle (iser_conn=0x555555b573a0) at iser.c:1516
4 0x00007ffff7bc8b28 in iscsi_iser_service (iscsi=0x555555b58710, revents=1) at iser.c:118
5 0x00007ffff7bc3862 in iscsi_service (iscsi=0x555555b58710, revents=1) at socket.c:1016
6 0x00007ffff7bc3f6c in event_loop (iscsi=0x555555b58710, state=0x7fffffffe000) at sync.c:71
7 0x00007ffff7bc4605 in iscsi_task_mgmt_sync (iscsi=0x555555b58710, lun=0, function=ISCSI_TM_LUN_RESET, ritt=4294967295, rcmdsn=0) at sync.c:281
8 0x00007ffff7bc46cf in iscsi_task_mgmt_lun_reset_sync (iscsi=0x555555b58710, lun=0) at sync.c:312
9 0x000055555555500d in iscsi_lun_reset_sync (iscsi=0x555555b58710) at iscsiclient_lun_reset.c:34
10 0x0000555555555680 in main (argc=7, argv=0x7fffffffe1c8) at iscsiclient_lun_reset.c:211
Signed-off-by: Hou Pu <houpu@bytedance.com>
The target sometimes sends a logout request to libiscsi
in case it is going down or for some other reason.
The opcode of such a request is ISCSI_PDU_ASYNC_MSG.
On receiving these kinds of PDU, there is no related pdu on the
list of iscsi->waitpdu. Just skip finding them from iscsi->waitpdu.
Or segment fault might happen.
Also rename nop_target label to no_waitpdu to be more clear.
Signed-off-by: Hou Pu <houpu@bytedance.com>
This field is documented in SPC-5 (r17 4.4.3). Unlike the descriptor
type, the fixed Information field is four bytes wide.
Signed-off-by: David Disseldorp <ddiss@suse.de>
The Information descriptor type is defined in SPC-5 (r17 4.4.2.2) and
may be used to provide the data offset on COMPARE_AND_WRITE miscompare.
Signed-off-by: David Disseldorp <ddiss@suse.de>
Explicitly check that the sense data descriptor ADDITIONAL LENGTH field
matches the expected value for sense key specific sense data
descriptors.
Signed-off-by: David Disseldorp <ddiss@suse.de>
According to SPC-5 (r17), the sense data descriptor format follows:
byte field
---- -----
0: DESCRIPTOR TYPE
1: ADDITIONAL LENGTH
2-n: Sense data descriptor specific
The VALID bit is a sense data descriptor specific flag, and is not
present in the only sense data descriptor supported by libiscsi -
Sense key specific sense data descriptors.
Drop the generic VALID bit check, in preparation for handling it on
a sense data descriptor specific basis.
Suggested-by: Bart Van Assche <bvanassche@acm.org>
Signed-off-by: David Disseldorp <ddiss@suse.de>