730 Commits

Author SHA1 Message Date
zhenwei pi
46e978ce97 iser: fix hang in rdma_destroy_id
Hit iser hang in rdma_destroy_id with trace:
 #0  pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
 #1  0x00007f96ecbbcbb3 in rdma_destroy_id () from /usr/lib/librdmacm.so.1
 #2  0x00005632027311d4 in iser_conn_release (iser_conn=iser_conn@entry=0x7f96d4027440) at iser.c:261
 #3  0x0000563202731428 in iscsi_iser_connect (iscsi=0x563205206c70, sa=<optimized out>, ai_family=<optimized out>)
     at iser.c:1516
 #4  0x000056320273dd3c in iscsi_connect_async (iscsi=iscsi@entry=0x563205206c70,
     portal=portal@entry=0x563205207084 "210.32.124.205:3260", cb=cb@entry=0x56320272b220 <iscsi_connect_cb>,
     private_data=private_data@entry=0x7f96d4008b00) at socket.c:389
 #5  0x000056320272b325 in iscsi_full_connect_async (iscsi=0x563205206c70,
     portal=0x563205207084 "210.32.124.205:3260", lun=1, cb=cb@entry=0x56320272aef0 <iscsi_reconnect_cb>,
     private_data=private_data@entry=0x0) at connect.c:230
 #6  0x000056320272b711 in iscsi_reconnect (iscsi=<optimized out>) at connect.c:473
 #7  0x00005632026810a8 in iscsi_timed_check_events (opaque=0x563205206ae0) at block/iscsi.c:387

Currently pthread_cancel is used to kill the cm thread forcefully, so the
cm thread may exit without calling rdma_ack_cm_event, leaving an
unacknowledged event in librdmacm. rdma_destroy_id then hangs until the
upper layer acknowledges all cm events.

Destroying the qp makes the cm thread handle a DISCONNECTED event and
exit by itself, so join the cm thread and wait for it to exit
gracefully.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
2020-03-17 20:31:46 -07:00
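
Note on 46e978ce97: the join-instead-of-cancel shutdown can be sketched with plain pthreads. This is an illustration under stated assumptions, not the libiscsi code: `event_queue`, `cm_worker` and `teardown` are hypothetical names, and a one-slot in-process queue stands in for the librdmacm event channel. The point it tries to show is that the worker acknowledges every event, including the final DISCONNECTED one, before it returns.

```c
#include <pthread.h>
#include <stdbool.h>

enum ev { EV_NONE, EV_OTHER, EV_DISCONNECTED };

/* Hypothetical stand-in for a CM event channel with a single slot. */
struct event_queue {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	enum ev pending;   /* event waiting for the worker, if any */
	int posted;        /* events handed to the queue */
	int acked;         /* events the worker acknowledged */
};

/* Worker: take an event, acknowledge it, and exit by itself on DISCONNECTED. */
static void *cm_worker(void *arg)
{
	struct event_queue *q = arg;
	bool done = false;

	while (!done) {
		pthread_mutex_lock(&q->lock);
		while (q->pending == EV_NONE)
			pthread_cond_wait(&q->cond, &q->lock);
		done = (q->pending == EV_DISCONNECTED);
		q->pending = EV_NONE;
		q->acked++;   /* always acknowledge before leaving the loop */
		pthread_mutex_unlock(&q->lock);
	}
	return NULL;
}

/* Teardown: post DISCONNECTED, then join instead of cancelling the thread. */
static int teardown(struct event_queue *q, pthread_t worker)
{
	pthread_mutex_lock(&q->lock);
	q->pending = EV_DISCONNECTED;
	q->posted++;
	pthread_cond_signal(&q->cond);
	pthread_mutex_unlock(&q->lock);
	pthread_join(worker, NULL);
	return q->posted - q->acked;   /* unacknowledged events left behind */
}
```

With pthread_cancel the worker could be killed between receiving and acknowledging an event; joining lets it finish its loop iteration first, so `teardown` is expected to report zero leftover events.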
zhenwei pi
dcf95f9780 iser: fix crash when sending pdu during reconnect
Hit the crash stack:
 #0  iser_initialize_headers (iser_pdu=0x7f1a3404ef50, iser_conn=0x0) at iser.c:514
 #1  iscsi_iser_send_pdu (iscsi=0x7f1a3406d700, pdu=0x7f1a3404ef50) at iser.c:714
 #2  0x000055e3160f0157 in iscsi_scsi_command_async (iscsi=0x7f1a3406d700, iscsi@entry=0x55e317fbcc70,
     lun=lun@entry=1, task=task@entry=0x7f1a34026610, cb=cb@entry=0x55e316044c10 <iscsi_co_generic_cb>,
     d=d@entry=0x7f15feeb7710, private_data=private_data@entry=0x7f15feeb77e0) at iscsi-command.c:282
 #3  0x000055e3160f1616 in iscsi_write10_iov_task (iscsi=0x55e317fbcc70, lun=1, lba=lba@entry=10401896,
     data=data@entry=0x0, datalen=4096, blocksize=<optimized out>, wrprotect=0, dpo=0, fua=0, fua_nv=0,
     group_number=0, cb=0x55e316044c10 <iscsi_co_generic_cb>, private_data=0x7f15feeb77e0, iov=0x7f1a34042090,
     niov=1) at iscsi-command.c:1107
 #4  0x000055e31604680f in iscsi_co_writev (bs=<optimized out>, sector_num=<optimized out>,
     nb_sectors=<optimized out>, iov=0x7f1a3404e380, flags=<optimized out>) at block/iscsi.c:640
 #5  0x000055e31601e89c in bdrv_driver_pwritev (bs=bs@entry=0x55e317fb6570, offset=offset@entry=5325770752,
     bytes=bytes@entry=4096, qiov=qiov@entry=0x7f1a3404e380, qiov_offset=qiov_offset@entry=0, flags=flags@entry=0)
     at block/io.c:1220

The reason is that during an async reconnect, before the reconnect
callback function is invoked, the old connection has already been closed
and the new connection is not ready yet.
Meanwhile, the upper layer still sends pdus to the old iscsi context.

With this patch, until the reconnect succeeds, the pdu is only added to
waitpdu and is not sent.
As suggested by Bart, no iser-related log is shown here.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
[ bvanassche: reformatted patch ]
2020-03-17 20:27:49 -07:00
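
Note on dcf95f9780: the idea of parking PDUs while a reconnect is in flight can be sketched as follows. All names here (`struct ctx`, `submit_pdu`, `reconnect_done`) are hypothetical, and the way parked PDUs are flushed afterwards is an assumption of this sketch rather than something the commit message states.

```c
#include <stdbool.h>
#include <stddef.h>

struct pdu {
	struct pdu *next;
	int id;
};

/* Hypothetical context: only the fields this sketch needs. */
struct ctx {
	bool reconnecting;        /* true until the reconnect callback has run */
	struct pdu *waitq;        /* PDUs parked until the connection is ready */
	struct pdu *waitq_tail;
	int sent;                 /* number of PDUs handed to the transport */
};

/* Park the PDU while reconnecting; otherwise hand it to the transport. */
static void submit_pdu(struct ctx *c, struct pdu *p)
{
	if (c->reconnecting) {
		p->next = NULL;
		if (c->waitq_tail)
			c->waitq_tail->next = p;
		else
			c->waitq = p;
		c->waitq_tail = p;
		return;
	}
	c->sent++;   /* stand-in for the real transport send */
}

/* Called once the new connection is ready: flush parked PDUs in order. */
static void reconnect_done(struct ctx *c)
{
	struct pdu *p = c->waitq;

	c->reconnecting = false;
	c->waitq = c->waitq_tail = NULL;
	while (p) {
		struct pdu *next = p->next;

		submit_pdu(c, p);
		p = next;
	}
}
```

The send path never touches a connection object while `reconnecting` is set, which is the situation that produced the `iser_conn=0x0` frame in the trace above.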
Bart Van Assche
65caf10cab iser: Remove a superfluous pointer check
Checking whether a pointer is NULL after it has been dereferenced is not
useful. This was detected by Coverity.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
2020-03-07 12:37:51 -08:00
Bart Van Assche
bac94c18b8 libiscsi: Reduce the size of struct iscsi_context
Reduce the size of struct iscsi_context by reordering the members of this
data structure. Additionally, change the rdma_ack_timeout value from
'unsigned char' into 'uint8_t' to make it clear that this variable
represents an integer.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
2020-03-07 12:37:51 -08:00
zhenwei pi
6ed1ffb7b2 iser: support rdma ack timeout optimization
Since commit 2c1619edef61a03cb516efaa81750784c3071d10 in the Linux kernel
and 55843c4ab8f559679d28c559cc4d681836be769b in rdma-core, rdma cma
supports RDMA_OPTION_ID_ACK_TIMEOUT. It is useful for the RDMA
out-of-sequence case. Because this feature was added recently, we have to
check for it in autogen.sh before building the source code.

Depending on the production environment, tuning the rdma ack timeout can
give the best performance. As suggested by Bart and Ronnie, instead of
using a fixed timeout value, add two ways to set the rdma ack timeout:
1. Add the URL variable 'LIBISCSI_RDMA_ACK_TIMEOUT'. This works
for a specific connection.
2. Add the environment variable 'LIBISCSI_RDMA_ACK_TIMEOUT'. This works
as a common setting for all connections of a process.

Testing with different packet loss rates and different ack timeouts,
running fio (iodepth=1) in a guest OS, gave these results:
latency under packet loss rate 0.00001:
	timeout 19: avg 170.22, pct99.9 215
	timeout 10: avg 160.08, pct99.9 215
	timeout 8 : avg 146.39, pct99.9 177
	timeout 7 : avg 148.37, pct99.9 211

latency under packet loss rate 0.0001:
	timeout 19: avg 949.23, pct99.9 306
	timeout 10: avg 818.53, pct99.9 378
	timeout 8 : avg 615.84, pct99.9 189
	timeout 7 : avg 618.89, pct99.9 310

Based on this test report, setting the ack timeout to 8 (1048.576 usec)
is a good choice in my test environment.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
2020-03-07 12:37:46 -08:00
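
Note on 6ed1ffb7b2: the value 8 maps to 1048.576 usec because the ack timeout is an exponent, with the actual timeout being 4.096 usec * 2^value. A small sketch of that arithmetic and of parsing the setting from a string follows. The helper names are hypothetical, and the accepted range of 0..31 is an assumption of this sketch, not something taken from the commit.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Timeout in nanoseconds for an ack timeout exponent: 4.096 usec * 2^exp. */
static uint64_t ack_timeout_ns(unsigned int exp)
{
	return (uint64_t)4096 << exp;
}

/*
 * Parse an exponent from a string such as the value of the
 * LIBISCSI_RDMA_ACK_TIMEOUT environment variable.
 * Returns -1 on missing, malformed or out-of-range input.
 * The 0..31 bound is an assumption of this sketch.
 */
static int parse_ack_timeout(const char *s)
{
	char *end;
	long v;

	if (s == NULL || *s == '\0')
		return -1;
	errno = 0;
	v = strtol(s, &end, 10);
	if (errno != 0 || *end != '\0' || v < 0 || v > 31)
		return -1;
	return (int)v;
}
```

A caller could pass `getenv("LIBISCSI_RDMA_ACK_TIMEOUT")` straight to `parse_ack_timeout`, since a NULL result from getenv is treated as "not set".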
Bart Van Assche
529350eab1 Do not cast away constness
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
2020-02-28 21:54:49 -08:00
Bart Van Assche
3804f3c2e0 Remove the discard_const() macro
Declare dynamically allocated strings as 'char *' instead of 'const char *'.
Remove the discard_const() macro. Do not test whether or not a pointer is
NULL before calling free() because it is allowed to pass NULL to free().

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
2020-02-28 21:54:49 -08:00
zhenwei pi
b98454ae97 iser: fix resource leak during reconnect
After iser reconnects successfully, the iser driver should close the old
connection and release its resources.

This patch fixes the resource leak. It has been tested extensively and
works fine.

Test env:
192.168.122.204: run as a software gateway
192.168.122.205: run iser target, default gateway 192.168.122.204
192.168.122.206: run QEMU as initiator, default gateway 192.168.122.204

run script on 192.168.122.204:
for i in `seq 1 100`
do
	iptables -s 192.168.122.205/32 -A FORWARD -m statistic --mode random --probability 1 -j DROP
	iptables -s 192.168.122.206/32 -A FORWARD -m statistic --mode random --probability 1 -j DROP
	sleep 30
	iptables -F
	sleep 30
done

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
2020-02-28 18:29:41 +08:00
zhenwei pi
3ccbceb6ff lib/connect.c: fix wrong transport type for iser reconnect
A new iscsi context is created with the TCP transport type, but the
reconnect logic currently lacks a call to iscsi_init_transport to switch
the transport to iser, so iser can never reconnect successfully.

Use the original transport to initialize the new iscsi context.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
2020-02-27 22:13:08 +08:00
zhenwei pi
e114be4156 iser: fix memory leak for cm thread
If a thread is created without any attr, it is joinable (not detached).
This means we need to call pthread_join to reclaim the thread's stack.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
2020-02-27 10:03:46 +08:00
zhenwei pi
b4ba92094e iser: fix segfault at iser_reg_mr
Hit segfault at iser_reg_mr during attaching disk with backtrace:
 #0  0x000055ace9635b0f in iser_reg_mr (iser_conn=0x55aceca33820) at iser.c:1060
 #1  iser_connected_handler (cma_id=<optimized out>) at iser.c:1300
 #2  iser_cma_handler (event=0x7f29ef1f7950, cma_id=<optimized out>, iser_conn=0x55aceca33820) at iser.c:1326
 #3  cm_thread (arg=0x55aceca33820) at iser.c:1380
 #4  0x00007f2e2c31c4a4 in start_thread (arg=0x7f29ef1f8700) at pthread_create.c:456
 #5  0x00007f2e2c05ed0f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:97
 (gdb) p *iser_conn->tx_desc
 Cannot access memory at address 0x20

This issue can be reproduced easily by attaching several disks of iser
protocol:
 # virsh attach-device stretch iser0.xml
 # virsh attach-device stretch iser1.xml
 ...

Initialize instances with zero to avoid random value pointer.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
2020-02-27 10:03:46 +08:00
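
Note on b4ba92094e: the fix relies on zero-initialized objects having NULL pointer members. A minimal sketch of that pattern, using a hypothetical `struct conn` whose `tx_desc` member only mirrors the name seen in the gdb output above:

```c
#include <stdlib.h>

/* Hypothetical connection object; not the real iser_conn layout. */
struct conn {
	void *tx_desc;
	int state;
};

/* calloc() zero-fills, so pointer members start out NULL on common platforms. */
static struct conn *conn_new(void)
{
	return calloc(1, sizeof(struct conn));
}

/* Reliable only because a fresh object never holds a stale pointer value. */
static int conn_has_tx_desc(const struct conn *c)
{
	return c->tx_desc != NULL;
}
```

With malloc() instead of calloc(), `tx_desc` would hold whatever bytes happened to be in the allocation, and a check like `conn_has_tx_desc` could pass on garbage such as the 0x20 address in the trace.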
zhenwei pi
d020bb003d iser: set iser cm thread proc name as "iscsi_cm_thread"
libiscsi is usually linked into QEMU, and QEMU sets thread names through
its own function. But the iser cm thread is created privately by
libiscsi, so QEMU can't name this thread. After attaching an iser disk,
we find a new thread named 'qemu-system-x86' in the QEMU process.

With this patch, the iser cm thread runs with the thread name
'iscsi_cm_thread'.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
2020-02-27 10:03:46 +08:00
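
Note on d020bb003d: on Linux, one way for a thread to name itself is prctl(PR_SET_NAME). The sketch below is Linux-specific and is not meant to claim which call the commit actually uses; the two wrapper names are hypothetical. The name 'iscsi_cm_thread' is 15 characters, which is the longest name Linux stores.

```c
#include <string.h>
#include <sys/prctl.h>

/* Set the calling thread's name; Linux keeps at most 15 characters plus NUL. */
static int set_thread_name(const char *name)
{
	return prctl(PR_SET_NAME, (unsigned long)name, 0, 0, 0);
}

/* Read the calling thread's name back into buf, which must hold 16 bytes. */
static int get_thread_name(char buf[16])
{
	memset(buf, 0, 16);
	return prctl(PR_GET_NAME, (unsigned long)buf, 0, 0, 0);
}
```

Both calls act on the calling thread, so in a setup like the one described they would be issued from inside the thread function itself.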
zhenwei pi
e2a7fdfb36 socket: fix disconnect corner case for iser
iscsi->fd is never initialized in the iser driver, so iscsi_disconnect
never works for an iser context.

iscsi->fd is a member variable used by the TCP context, so let the iscsi
TCP driver handle iscsi->fd and just call iscsi_disconnect in
iscsi_destroy_context. Luckily, the TCP driver already handles the
invalid iscsi->fd case in iscsi_tcp_disconnect.

Also fix a NULL pointer case in iscsi_disconnect.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
2020-02-27 10:02:02 +08:00
zhenwei pi
a391176a6d init: fix memory leak in iscsi_create_context
The iscsi instance is allocated in iscsi_create_context. If we then
return NULL, nothing can free it anymore.

This path cannot currently be hit, but fix it anyway.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
2020-02-25 22:35:54 +08:00
Bart Van Assche
e1978f991a lib: Fix scsi_maintenancein_datain_getfullsize()
Coverity reported dead code.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
2020-02-23 20:11:14 -08:00
Bart Van Assche
0ad5c2c648 libiscsi: Fix a format specifier
Use %lu to format unsigned long instead of %d.

This was detected by Coverity.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
2020-02-23 20:11:14 -08:00
Matt Coleman
3b52de7c1c Simplify logic that determines when to send headers
The prior condition could be summarized as:
```
if ((first && second) || second) {
```

This will always evaluate to `second`:

```
((true && true) || true) == true
((false && true) || true) == true
((true && false) || false) == false
((false && false) || false) == false
```

Reported-by: Jeffrey Knapp <jknapp@datto.com>
2020-02-14 12:17:10 -05:00
Ronnie Sahlberg
148e5f69e8 remove FIXME
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
2020-02-11 17:06:41 +10:00
Paul Carlisle
f2d750260a Fix data segment length comparison to unsigned long
In logic.c, data segment parameters in the text segment are converted to
signed longs. Changing strtol to strtoul fixes compiler errors on
certain platforms that warn about comparing a signed long with
uint32_t using MIN.
2020-01-27 16:59:10 -08:00
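
Note on f2d750260a: a sketch of the unsigned comparison after switching to strtoul. The `MIN` macro and the function name here are written for this example and are not copied from logic.c.

```c
#include <stdint.h>
#include <stdlib.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Clamp a segment length parsed from text to a local limit.
 * strtoul() yields an unsigned long, so both MIN() operands are unsigned
 * and no signed/unsigned comparison takes place.
 */
static uint32_t clamp_segment_length(const char *value, uint32_t local_max)
{
	unsigned long parsed = strtoul(value, NULL, 10);

	return (uint32_t)MIN(parsed, (unsigned long)local_max);
}
```

The final cast is safe because the result can never exceed `local_max`, which already fits in 32 bits.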
wanghonghao
d200d7b862 iser: queue pdus when cmdsn exceeds maxcmdsn
A PDU is sent directly in iscsi_iser_queue_pdu even if the cmdsn of
it exceeds maxcmdsn, and it may be ignored by the target.

Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
2019-12-10 14:43:39 +08:00
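
Note on d200d7b862: whether a CmdSN "exceeds" MaxCmdSN has to be decided with 32-bit serial number arithmetic, because both counters wrap. A minimal sketch of such a check follows. The function and enum names are hypothetical and this is not the comparison helper libiscsi itself uses.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when cmdsn is at or before maxcmdsn in 32-bit serial number order. */
static bool cmdsn_in_window(uint32_t cmdsn, uint32_t maxcmdsn)
{
	uint32_t ahead = cmdsn - maxcmdsn;   /* wraps modulo 2^32 */

	return ahead == 0 || ahead >= UINT32_C(0x80000000);
}

enum action { SEND_NOW, QUEUE };

/* Decide whether a PDU may go out now or has to wait for a larger window. */
static enum action classify(uint32_t cmdsn, uint32_t maxcmdsn)
{
	return cmdsn_in_window(cmdsn, maxcmdsn) ? SEND_NOW : QUEUE;
}
```

A plain `cmdsn <= maxcmdsn` would give the wrong answer right after either counter wraps past 0xffffffff, which is why the difference is inspected instead.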
wanghonghao
dfbc6697ae iser: send immediate data
When ImmediateData=Yes, DataSegmentLength is set in iSCSI layer
but immediate data is not sent in the RCaP message.

Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
2019-12-10 14:43:39 +08:00
Ronnie Sahlberg
88f67e8cf8 Merge pull request #292 from sumitrai/TMF_OVERFLOW_DATA_SIZE_CRASH
lib/iser.c: fix overflow_data_size NULL ptr dereference
2019-11-01 07:30:19 +10:00
Bart Van Assche
30fc526c6e Link with -lpthread if iSER is enabled
This patch fixes the following linker error:

/usr/bin/ld: ../lib/.libs/libiscsipriv.a(libiscsipriv_la-iser.o): undefined reference to symbol 'sem_post@@GLIBC_2.2.5'
/usr/bin/ld: //lib/x86_64-linux-gnu/libpthread.so.0: error adding symbols: DSO missing from command line

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
2019-10-31 14:16:38 -07:00
Ronnie Sahlberg
65c425ef82 It is a TODO not a FIXME
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
2019-09-27 21:04:56 -07:00
David Disseldorp
d42fcd89ce lib: use const for add_data buffers
The buffer is memcopied into the PDU. const makes it a little clearer
that the caller isn't handing over ownership.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2019-09-18 13:30:04 +02:00
David Disseldorp
9c78c6d2af build: add convenience library which exports all symbols
Add a new libiscsipriv.la noinst convenience library, which can then be
used by test-tool for low-level PDU manipulation.

Link: https://github.com/sahlberg/libiscsi/issues/297
Signed-off-by: David Disseldorp <ddiss@suse.de>
2019-09-18 01:25:15 +02:00
David Disseldorp
4ecc34706b discovery: permit SendTargets on normal sessions
rfc3720 indicates that SendTargets on discovery *and* normal operational
sessions must be supported by targets:
   A system that contains targets MUST support discovery sessions on
   each of its iSCSI IP address-port pairs, and MUST support the
   SendTargets command on the discovery session.
   ...
   A target MUST support the SendTargets command on operational
   sessions...

Signed-off-by: David Disseldorp <ddiss@suse.de>
2019-09-11 16:17:22 +02:00
Ronnie Sahlberg
eea5d3ba8e new version 1.19.0
Minor updates to the library

Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
2019-07-14 08:04:44 +10:00
Sumit Rai
a664ca8c43 lib/iser.c: fix overflow_data_size NULL ptr dereference
Discovered this while running iSCSI.iSCSITMF AbortTaskSimpleAsync
test case. For Task Management command iser_pdu->iscsi_pdu.scsi_cbdata
is not set. When test case tries to send Task Management command
via common API iser_send_command() - it calls overflow_data_size
which tries to dereference scsi_cbdata leading to SEGFAULT.

Added a non-NULL check for scsi_cbdata before accessing it.
2019-06-20 10:15:11 +05:30
Sumit Rai
b3a1d99e27 Added support for iSER related iSCSI keys
Added support for negotiating below keys:
RDMAExtensions, TargetRecvDataSegmentLength, and
InitiatorRecvDataSegmentLength.

These are required to support iSER. See RFC5046 Section 6.
2019-06-12 14:10:31 +05:30
Tim Crawford
9347cfebf2 Replace file variables with .dir-locals.el
Signed-off-by: Tim Crawford <tcrawford@datto.com>
2019-02-21 11:54:02 -05:00
Bart Van Assche
663aad13ba Avoid that iscsi_reconnect() crashes
In the else branch, set the tmp_iscsi->old_iscsi pointer instead of the
iscsi->old_iscsi pointer.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
2019-01-13 11:48:28 -08:00
David Disseldorp
9d31150e9d socket: improve ISCSI_HEADER_SIZE readability
The ISCSI_HEADER_SIZE macro accesses iscsi->header_size, so pass it in
as a parameter to make it easier to follow callers.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2018-10-25 23:38:59 +02:00
David Disseldorp
009892b017 socket: calculate hdr_size once per PDU process loop
ISCSI_HEADER_SIZE is determined based on the iscsi->header_digest
setting, which may change via iscsi_process_pdu().

Signed-off-by: David Disseldorp <ddiss@suse.de>
2018-10-25 23:38:59 +02:00
David Disseldorp
96dc6e7ebd socket: check for malloc failure before dereference
Signed-off-by: David Disseldorp <ddiss@suse.de>
2018-10-25 23:38:59 +02:00
David Disseldorp
15021faf19 lib: properly pass through NOP-In data segment
data_pos corresponds to the data_segment_length (+ padding), so should
always be passed to the NOP-In callback if greater than zero.

Fixes: https://github.com/sahlberg/libiscsi/issues/278

Signed-off-by: David Disseldorp <ddiss@suse.de>
2018-10-25 23:38:41 +02:00
Ronnie Sahlberg
04d6c326b8 Add some missing includes
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
2018-10-17 05:43:10 +10:00
Ronnie Sahlberg
83dbc4ff84 Merge pull request #275 from franciozzy/isid_2
Call srand() only once
2018-10-17 05:39:44 +10:00
Ronnie Sahlberg
6fa5eaff13 Merge pull request #274 from bonzini/for-gcc-8
Fix warnings from Coverity and GCC 8
2018-10-09 08:11:58 +10:00
Felipe Franciosi
41af44eba1 iscsi_create_context: call srand() only once
iscsi_create_context() calls srand() every time a new context is
generated. That practice is questionable, as the seed does not need to
change before each call to rand(). As a matter of fact, doing so defeats
the purpose of using rand() altogether. Furthermore, the current
implementation is not thread safe.

This improves ISID generation by using /dev/urandom (when available) as
a seed, and calling srand() only once. In case of errors, fall back to
something similar to the previous implementation (albeit
thread-safe).

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
2018-10-05 18:10:07 +01:00
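
Note on 41af44eba1: a sketch of seeding rand() once per process from /dev/urandom with a time/pid fallback. It uses a C11 `atomic_flag` as the once-guard, which is a choice made for this example; the function name is hypothetical and the commit may guard the call differently.

```c
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

static atomic_flag seeded = ATOMIC_FLAG_INIT;

/* Seed rand() at most once per process. Returns 1 if this call did the seeding. */
static int seed_rand_once(void)
{
	unsigned int seed = 0;
	FILE *f;
	int ok = 0;

	if (atomic_flag_test_and_set(&seeded))
		return 0;   /* some earlier caller already claimed the seeding */

	f = fopen("/dev/urandom", "rb");
	if (f != NULL) {
		ok = fread(&seed, sizeof(seed), 1, f) == 1;
		fclose(f);
	}
	if (!ok)   /* fallback, similar in spirit to the old time/pid seed */
		seed = (unsigned int)time(NULL) ^ (unsigned int)getpid();
	srand(seed);
	return 1;
}
```

The flag only guarantees that srand() runs once. It does not make later rand() calls thread-safe, and a second thread could call rand() before the first has finished seeding.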
Paolo Bonzini
bffafc1c30 avoid truncation when logging message that includes target name
2018-10-04 14:06:39 +02:00
Paolo Bonzini
679d0abe7c avoid fallthrough
2018-10-04 14:06:39 +02:00
Paolo Bonzini
f507c94774 sync: remove unnecessary checks
state is always non-NULL in iscsi_sync_cb and iscsi_discovery_cb.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-04 14:06:39 +02:00
Paolo Bonzini
f0fcee72c4 iser: fix posting of receive descriptors
The old code is effectively always posting iser_conn->min_posted_rx
descriptors, since it is

   if (outstanding + iser_conn->min_posted_rx <= iser_conn->qp_max_recv_dtos) {
       if(iser_conn->qp_max_recv_dtos - outstanding > iser_conn->min_posted_rx)
           count = iser_conn->min_posted_rx;
       else
           count = iser_conn->qp_max_recv_dtos - outstanding;

which is equivalent to

    if(iser_conn->qp_max_recv_dtos - outstanding >= iser_conn->min_posted_rx)
        if(iser_conn->qp_max_recv_dtos - outstanding > iser_conn->min_posted_rx)
            count = iser_conn->min_posted_rx;
        else
            count = iser_conn->min_posted_rx;

So the "if" is redundant and the "min_posted_rx" is actually behaving more
like a _maximum_ number of posted descriptors in one iser_post_recvm.
Fix it with the (presumably) intended logic and remove a goto along
the way.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-04 14:06:39 +02:00
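
Note on f0fcee72c4: the claim that the old code always posts min_posted_rx descriptors can be checked by brute force. The sketch below copies the old condition into a hypothetical helper `old_count` and walks every value of `outstanding`. It does not attempt to reproduce the fixed logic, which the commit message does not spell out.

```c
/* The count chosen by the old code, or 0 when the outer test fails. */
static int old_count(int outstanding, int min_posted_rx, int qp_max_recv_dtos)
{
	int count = 0;

	if (outstanding + min_posted_rx <= qp_max_recv_dtos) {
		if (qp_max_recv_dtos - outstanding > min_posted_rx)
			count = min_posted_rx;
		else
			count = qp_max_recv_dtos - outstanding;
	}
	return count;
}

/* Returns 1 if every result is either 0 (nothing posted) or min_posted_rx. */
static int old_count_is_constant(int min_posted_rx, int qp_max_recv_dtos)
{
	int outstanding;

	for (outstanding = 0; outstanding <= qp_max_recv_dtos; outstanding++) {
		int count = old_count(outstanding, min_posted_rx, qp_max_recv_dtos);

		if (count != 0 && count != min_posted_rx)
			return 0;
	}
	return 1;
}
```

The else branch is only reached when the free space equals min_posted_rx exactly, so it can never produce a different count, which matches the reasoning in the commit message.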
Ronnie Sahlberg
8f8632f0be Merge pull request #273 from franciozzy/isid_fix
Isid fix
2018-10-02 06:43:59 +10:00
Paolo Bonzini
346fb947cb iser_rcv_completion: unify error handling
Move the iscsi_set_error to iser_post_recv, and avoid leaking the
input buffer "in".

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-01 13:22:36 +02:00
Felipe Franciosi
50fb64df91 iscsi_create_context: improve ISID randomness
The current random seed for determining a new context's ISID is
calculated by XOR'ing time(), getpid() and "iscsi". When invoked from
iscsi_reconnect(), all three inputs are likely to be identical,
resulting in identical ISIDs.

That happens because iscsi_reconnect() malloc()s a temporary "iscsi"
which is then free()d at the end of the call. Successive calls to
malloc() (from that function) are therefore likely to reuse the same
address for the context.

When multiple sessions are used for different LUNs of the same target,
and reconnects happen within the same second (the precision given by
time()), then multiple login attempts will happen with identical values,
violating the ISID RULE as described in Section 3.4.3 of RFC3720.

This fixes the issue by introducing a sequence number to the ISID seed
generation.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
2018-09-30 19:32:06 +01:00
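
Note on 50fb64df91: a sketch of how a sequence number breaks the tie when time, pid and context address all repeat. Mixing the sequence number in with XOR is an assumption made here for illustration, and `isid_seed` is a hypothetical name. The three other inputs are passed as parameters so the collision case can be reproduced.

```c
#include <stdatomic.h>
#include <stdint.h>

static atomic_uint isid_seq;   /* zero-initialized; bumped on every call */

/*
 * Mix time, pid and context address with a per-process sequence number.
 * Two calls with identical inputs still differ in the sequence value.
 */
static uint32_t isid_seed(uint32_t now, uint32_t pid, uintptr_t ctx_addr)
{
	uint32_t seq = atomic_fetch_add(&isid_seq, 1u);

	return now ^ pid ^ (uint32_t)ctx_addr ^ seq;
}
```

Without the `seq` term, two reconnects in the same second that reuse the same heap address would compute the same seed, which is the failure described above.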
Felipe Franciosi
1891d502a0 iscsi_reconnect: improve local variable naming
The current iscsi context in iscsi_reconnect() is called "old_iscsi",
whilst the temporary context is called "iscsi". That is rather
confusing, and this fixes that by calling the current context "iscsi"
and the temporary context "tmp_iscsi".

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
2018-09-30 11:38:38 +01:00
Felipe Franciosi
b377eece90 lib/connect.c: Fix whitespace formatting
This fixes some indentation in iscsi_reconnect_cb() where spaces
were used instead of hard tabs.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
2018-09-30 11:12:30 +01:00
David Disseldorp
c88e9715ab lib/scsi: fix SCSI_PERSISTENT_RESERVE_READ_KEYS handling
When unmarshalling a SCSI_PERSISTENT_RESERVE_READ_KEYS response,
scsi_persistentreservein_datain_unmarshall() assumes that the ADDITIONAL
LENGTH field represents the number of keys packed in the key array.
This is incorrect, as the key array data buffer may be truncated while
ADDITIONAL LENGTH is left intact, as per SPC5r17 4.2.5.6:

  If the information being transferred to the Data-In Buffer includes
  fields containing counts ..., then the contents of these fields shall
  not be altered to reflect the truncation, if any, that results from an
  insufficient ALLOCATION LENGTH value, unless the standard that
  describes the Data-In Buffer format states otherwise.

Determine the number of keys returned based on the minimum of the
data-in length and the ADDITIONAL LENGTH value.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2018-05-31 23:10:28 +02:00
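
Note on c88e9715ab: the key count rule can be sketched as below. The READ KEYS parameter data starts with a 4-byte PRGENERATION and a 4-byte ADDITIONAL LENGTH, followed by 8-byte reservation keys. The helper names are hypothetical and this is not the libiscsi unmarshalling code.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 32-bit value. */
static uint32_t be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Number of complete 8-byte keys actually present in a READ KEYS response.
 * Uses the smaller of ADDITIONAL LENGTH and the bytes really received.
 */
static size_t read_keys_count(const unsigned char *buf, size_t datain_len)
{
	size_t additional, available;

	if (datain_len < 8)
		return 0;   /* header itself is incomplete */
	additional = be32(buf + 4);
	available = datain_len - 8;
	return (additional < available ? additional : available) / 8;
}
```

For example, a 24-byte buffer whose ADDITIONAL LENGTH says 32 holds only two keys, even though the target has four registered.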