Commit Graph

369 Commits

Author SHA1 Message Date
Bart Van Assche
2a5a0b3291 Enable -Wno-unused-parameter
Instead of adding __attribute__((unused)) to unused arguments, add the
-Wno-unused-parameter compiler flag.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
2021-05-23 13:23:41 -07:00
Bart Van Assche
ea6b2282d4 Use __attribute__((format(printf, ...))) directly
Instead of defining the macro _R_(), define __attribute__() as a macro for
compilers that do not support __attribute__(), namely Microsoft Visual
Studio.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
2021-05-23 11:52:40 -07:00
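The approach described in ea6b2282d4 can be sketched as follows. This is a minimal illustration, assuming the only goal is to make the attribute disappear on compilers without GNU-style attributes; `log_message` is a hypothetical helper and is not taken from libiscsi.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Compilers without GNU-style attributes (for example MSVC): let the
 * keyword expand to nothing so annotated declarations still compile. */
#if !defined(__GNUC__) && !defined(__clang__)
#define __attribute__(x)
#endif

/* Hypothetical helper. On GCC/Clang the attribute asks the compiler to
 * type-check the variadic arguments against the format string: the format
 * is parameter 3 and the variadic arguments start at parameter 4. */
int log_message(char *buf, size_t len, const char *fmt, ...)
    __attribute__((format(printf, 3, 4)));

int log_message(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(buf != NULL && fmt != NULL);
    va_start(ap, fmt);
    n = vsnprintf(buf, len, fmt, ap);
    va_end(ap);
    return n;
}
```

With this in place, a call such as `log_message(buf, sizeof(buf), "lun %d", 7)` is checked at compile time on GCC and Clang and compiles unchanged elsewhere.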
Anastasia Kovaleva
1b7d1743ae test-tool: Add overwrite check for all test cases
A check that the residual data does not overwrite existing data blocks has now
been added for all test data, to improve the uniformity of test runs,
increase test readability and remove duplicate test data records.
2021-02-08 15:28:26 +03:00
Anastasia Kovaleva
2e8c571955 test-tool: Refactoring residuals write tests
The tests in test_write10_residuals.c, test_write12_residuals.c and
test_write16_residuals.c follow very similar scenarios. Several EDTL and
SPDTL combinations are the same for the different-length write command
tests and form the core of the test scenario for these commands. The way
these combinations are tested does not differ much either. The main
parameters describing the test scenario can therefore be moved into a
separate structure, and the scenario itself into a separate function.
This improves readability and reduces duplicated code in these tests.
2021-02-08 15:23:49 +03:00
Hou Pu
03fa3f627c iser: fix segmentation fault when task management pdu is received
iser_pdu->desc->data_dir is not initialised when sending a PDU, so it
keeps the value from the last time it was used. A PDU could therefore
be considered to have data because it previously had some, which
might cause a segmentation fault.

For example, if a PDU is a reset task management request with no data
to transfer, and the PDU was previously used for a read task, it
causes a fault like the one below:

> struct scsi_iovector *iovector_in = &task->iovector_in;

0  0x00007ffff7bcb2d1 in iser_rcv_completion (rx_desc=0x555555b79e48, iser_conn=0x555555b573a0) at iser.c:1349
1  0x00007ffff7bcb53e in iser_handle_wc (wc=0x7fffffffdc00, iser_conn=0x555555b573a0) at iser.c:1426
2  0x00007ffff7bcb685 in cq_event_handler (iser_conn=0x555555b573a0) at iser.c:1468
3  0x00007ffff7bcb81b in cq_handle (iser_conn=0x555555b573a0) at iser.c:1516
4  0x00007ffff7bc8b28 in iscsi_iser_service (iscsi=0x555555b58710, revents=1) at iser.c:118
5  0x00007ffff7bc3862 in iscsi_service (iscsi=0x555555b58710, revents=1) at socket.c:1016
6  0x00007ffff7bc3f6c in event_loop (iscsi=0x555555b58710, state=0x7fffffffe000) at sync.c:71
7  0x00007ffff7bc4605 in iscsi_task_mgmt_sync (iscsi=0x555555b58710, lun=0, function=ISCSI_TM_LUN_RESET, ritt=4294967295, rcmdsn=0) at sync.c:281
8  0x00007ffff7bc46cf in iscsi_task_mgmt_lun_reset_sync (iscsi=0x555555b58710, lun=0) at sync.c:312
9  0x000055555555500d in iscsi_lun_reset_sync (iscsi=0x555555b58710) at iscsiclient_lun_reset.c:34
10 0x0000555555555680 in main (argc=7, argv=0x7fffffffe1c8) at iscsiclient_lun_reset.c:211

Signed-off-by: Hou Pu <houpu@bytedance.com>
2020-11-08 15:45:00 +08:00
David Disseldorp
6eb4b7b6e5 lib: parse Information sense descriptor type
The Information descriptor type is defined in SPC-5 (r17 4.4.2.2) and
may be used to provide the data offset on COMPARE_AND_WRITE miscompare.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2020-10-19 14:54:18 +02:00
wanghonghao
843a01cbd8 iser: aggregate ack completion queue (CQ) events
Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
2020-04-11 12:46:40 +08:00
wanghonghao
2212021747 iser: enhance connection procedure
This patch is used to fix the following problems in the current connection
method:
1. iscsi_iser_connect() waits until the connection is established or failed,
and may block the caller for a long time.
2. Although there is a cm_thread that handles communication events, it
has no effect after the connection is established.
3. Resources are not released properly after a reconnection fails, and once we
try to reconnect again, the resources leak permanently
(see iscsi_reconnect()).

This patch eliminates cm_thread and handles communication events in the caller
thread.
Connection procedure:
1. Create a mock fd by eventfd() (or just use old_iscsi->fd while reconnecting),
and assign it to iscsi->fd.
2. Create communication event channel, make it non-blocking and dup the
notifier fd to iscsi->fd.
3. Handle communication events by iscsi_which_events()/iscsi_service() loop
until the connection is established or has failed.
4. If connection is established successfully, dup the notifier fd of completion
queue (CQ) events to iscsi->fd.
5. Handle completion queue (CQ) events by iscsi_which_events()/iscsi_service()
loop.
The entire procedure is non-blocking.

Once the connection is established, communication events are checked whenever
iscsi_service() is called with revents=0 or queue_pdu() is called with a NOP pdu.

When the connection fails, the iser transport cleans itself up before invoking callbacks.

Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
2020-04-07 10:38:37 +08:00
wanghonghao
68ce3363aa iser: dynamic memory region allocator
Implement an allocator for memory regions of different lengths.
The allocator registers 4MB memory chunks as memory regions and selects a
free segment from one of them on each allocation.

4KB is the minimum allocation unit, and free segments in the same chunk can be
merged into a larger free segment following the rules of buddy allocation. As a
result, the size of every allocated segment is a power of 2; this may waste some
space but produces fewer fragments.

In each chunk, a complete binary tree (which is actually an array) is used
to maintain free segments. Each node records the order of the largest segment
that can be allocated from its subtree. Here's a miniature example.

A chunk with all segments free:
level 4                         4(0x1)
level 3            3(0x2)                      3(0x3)
level 2     2(0x4)        2(0x5)       2(0x6)        2(0x7)
level 1 1(0x8) 1(0x9) 1(0xa) 1(0xb) 1(0xc) 1(0xd) 1(0xe) 1(0xf)

After allocating a 16KB (order=3) memory region:
level 4                         3(0x1)
level 3            0(0x2)                      3(0x3)
level 2     2(0x4)        2(0x5)       2(0x6)        2(0x7)
level 1 1(0x8) 1(0x9) 1(0xa) 1(0xb) 1(0xc) 1(0xd) 1(0xe) 1(0xf)

It takes 1 comparison to determine whether a chunk can satisfy a request and at
most 11 loop iterations to find the leftmost free segment that meets the requirements.
The value of each node is at most 11, so an 8-bit integer is enough
to store it and only 2048 bytes are required for each tree. And since the
entire tree is in a contiguous piece of memory and no rotations are needed,
it's far more efficient than self-balancing trees of the same size.

The 4MB chunks are linked as a list, and they are always searched from
head to tail. If no existing chunk can satisfy the allocation,
the allocator registers another 4MB chunk and adds it to the tail.
+---------+       +---------+      +---------+
|4MB chunk| --->  |4MB chunk| ---> |4MB chunk|
+---------+       +---------+      +---------+

In most cases, smaller IOs can get memory regions from the first or
second chunk without traversing much of the list, and when a lot of large IOs
are sent, the cost of the traversal is rarely critical.

Finally, a chunk can only provide memory regions of up to 4MB.
If a larger memory region is needed, the allocator registers/deregisters a
memory region directly, bypassing the chunks.

Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
2020-04-01 11:35:42 +08:00
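The tree bookkeeping described in 68ce3363aa can be sketched as below. This is an illustrative reconstruction of the miniature 8-leaf example only, not the code from the commit: the names `buddy_tree`, `buddy_init`, `buddy_alloc` and `buddy_free` are hypothetical, offsets are expressed in minimum allocation units, and memory registration is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ORDER  4                  /* miniature tree: 8 leaves */
#define NODE_COUNT (1u << MAX_ORDER)  /* array-backed tree, index 0 unused */

/* order[n] is the order of the largest free segment in the subtree of n. */
struct buddy_tree {
    uint8_t order[NODE_COUNT];
};

static uint8_t max_u8(uint8_t a, uint8_t b) { return a > b ? a : b; }

/* Mark every segment free: each node starts with the order of its level. */
void buddy_init(struct buddy_tree *t)
{
    unsigned node, level = MAX_ORDER, next = 2;

    t->order[0] = 0;
    for (node = 1; node < NODE_COUNT; node++) {
        if (node == next) {           /* first node of the next level down */
            level--;
            next <<= 1;
        }
        t->order[node] = (uint8_t)level;
    }
}

/* Allocate the leftmost free segment of the given order.
 * Returns its offset in minimum units, or -1 if the chunk cannot satisfy it. */
int buddy_alloc(struct buddy_tree *t, unsigned order)
{
    unsigned node = 1, level = MAX_ORDER;
    int offset;

    assert(order >= 1 && order <= MAX_ORDER);
    if (t->order[1] < order)          /* one comparison decides per chunk */
        return -1;
    while (level > order) {           /* prefer the left child when it fits */
        node = (t->order[2 * node] >= order) ? 2 * node : 2 * node + 1;
        level--;
    }
    t->order[node] = 0;
    offset = (int)((node - (1u << (MAX_ORDER - order))) << (order - 1));
    for (node /= 2; node >= 1; node /= 2)
        t->order[node] = max_u8(t->order[2 * node], t->order[2 * node + 1]);
    return offset;
}

/* Free a segment and merge buddies: a parent whose two children are both
 * completely free becomes one free segment of the next higher order. */
void buddy_free(struct buddy_tree *t, unsigned offset, unsigned order)
{
    unsigned node = (1u << (MAX_ORDER - order)) + (offset >> (order - 1));
    unsigned level = order;

    t->order[node] = (uint8_t)level;
    while (node > 1) {
        uint8_t l, r;

        node /= 2;
        level++;
        l = t->order[2 * node];
        r = t->order[2 * node + 1];
        t->order[node] = (l == level - 1 && r == level - 1)
                             ? (uint8_t)level : max_u8(l, r);
    }
}
```

Allocating order 3 from a fresh tree sets node 0x2 to 0 and the root to 3, matching the second diagram in the commit message. Scaling the sketch to the 4MB/4KB case described there would mean `MAX_ORDER` 11 and a 2048-entry array.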
David Disseldorp
c8992f45b1 test: add support for 0x0A type XCOPY segment descriptors
Type 0x0A segment descriptors provide a 32-bit wide number-of-bytes
field, allowing for much larger copy offloads with a single segment
descriptor.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2020-03-18 18:59:34 +01:00
Bart Van Assche
bac94c18b8 libiscsi: Reduce the size of struct iscsi_context
Reduce the size of struct iscsi_context by reordering the members of this
data structure. Additionally, change the rdma_ack_timeout value from
'unsigned char' into 'uint8_t' to make it clear that this variable
represents an integer.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
2020-03-07 12:37:51 -08:00
zhenwei pi
6ed1ffb7b2 iser: support rdma ack timeout optimization
Since 2c1619edef61a03cb516efaa81750784c3071d10 for linux kernel and
55843c4ab8f559679d28c559cc4d681836be769b for rdma-core, rdma cma
supports RDMA_OPTION_ID_ACK_TIMEOUT. It's useful for RDMA out of
sequence case. Because this feature is added recently, we have to
check this in autogen.sh before building source code.

Depending on the production environment, tuning the rdma ack timeout could give
the best performance. As suggested by Bart and Ronnie, instead of using
a fixed timeout value, add two methods to set the rdma ack timeout value:
1. Add the URL variable 'LIBISCSI_RDMA_ACK_TIMEOUT'. This works
for a specified connection.
2. Add the environment variable 'LIBISCSI_RDMA_ACK_TIMEOUT'. This works as a
common setting for all the connections of a process.

Testing under different packet loss rates and different ack timeouts, running
fio (iodepth=1) in a guest OS, I got these results:
latency under packet loss rate 0.00001:
	timeout 19: avg 170.22, pct99.9 215
	timeout 10: avg 160.08, pct99.9 215
	timeout 8 : avg 146.39, pct99.9 177
	timeout 7 : avg 148.37, pct99.9 211

latency under packet loss rate 0.0001:
	timeout 19: avg 949.23, pct99.9 306
	timeout 10: avg 818.53, pct99.9 378
	timeout 8 : avg 615.84, pct99.9 189
	timeout 7 : avg 618.89, pct99.9 310

Based on this test report, setting the ack timeout to 8 (1048.576 usec) is
a good choice in my test environment.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
2020-03-07 12:37:46 -08:00
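The "8 (1048.576 usec)" figure in 6ed1ffb7b2 follows from the InfiniBand encoding of the local ACK timeout, where the configured value is an exponent: the interval is 4.096 usec multiplied by 2 to the power of the value. A small sketch of that arithmetic, with `rdma_ack_timeout_ns` as a hypothetical helper name that is not part of libiscsi:

```c
#include <assert.h>
#include <stdint.h>

/* Convert an ack timeout exponent into nanoseconds using
 * 4.096 usec * 2^timeout. The formula is applied literally here;
 * any special meaning of particular values is not modelled. */
uint64_t rdma_ack_timeout_ns(uint8_t timeout)
{
    assert(timeout <= 31);            /* the field is 5 bits wide */
    return UINT64_C(4096) << timeout;
}
```

For the values tested in the commit message this gives roughly 0.52 ms for 7, 1.05 ms for 8, 4.19 ms for 10 and 2.15 s for 19.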
Bart Van Assche
3804f3c2e0 Remove the discard_const() macro
Declare dynamically allocated strings as 'char *' instead of 'const char *'.
Remove the discard_const() macro. Do not test whether or not a pointer is
NULL before calling free() because it is allowed to pass NULL to free().

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
2020-02-28 21:54:49 -08:00
Bart Van Assche
0ad5c2c648 libiscsi: Fix a format specifier
Use %lu to format unsigned long instead of %d.

This was detected by Coverity.

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
2020-02-23 20:11:14 -08:00
wanghonghao
51391285d8 iser: remove __packed from struct iser_cm_hdr declaration
`__packed` was not previously defined, so it was treated as a variable
declaration.

Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
2019-12-10 14:42:38 +08:00
wanghonghao
22d7360b5e iser: fix struct iser_rx_desc
iSER header is followed by iSCSI PDU without any pad in an RCaP Message.

Signed-off-by: wanghonghao <wanghonghao@bytedance.com>
2019-12-09 13:17:55 +08:00
Bart Van Assche
a55f11ee68 Improve iser_rx_desc alignment
Align iscsi_header[] and data[] on an 8-byte boundary instead of on a 4-byte
boundary. With this patch applied pahole produces the following output:

struct iser_rx_desc {
        struct iser_hdr    iser_header;                  /*     0    28 */
        char                       pad1[4];              /*    28     4 */
        char                       iscsi_header[48];     /*    32    48 */
        /* --- cacheline 1 boundary (64 bytes) was 16 bytes ago --- */
        char                       data[128];            /*    80   128 */
        /* --- cacheline 3 boundary (192 bytes) was 16 bytes ago --- */
        struct ibv_sge     rx_sg;                        /*   208    16 */
        struct ibv_mr *            hdr_mr;               /*   224     8 */
        char                       pad2[24];             /*   232    24 */

        /* size: 256, cachelines: 4, members: 7 */
};

Additionally, this patch fixes the following build errors:

iser.c: In function 'iser_alloc_rx_descriptors':
iser.c:916:11: error: taking address of packed member of 'struct iser_rx_desc' may result in an unaligned pointer value [-Werror=address-of-packed-member]
  916 |   rx_sg = &rx_desc->rx_sg;
      |           ^~~~~~~~~~~~~~~
iser.c: In function 'iser_post_recvm':
iser.c:955:20: error: taking address of packed member of 'struct iser_rx_desc' may result in an unaligned pointer value [-Werror=address-of-packed-member]
  955 |   rx_wr->sg_list = &rx_desc->rx_sg;
      |                    ^~~~~~~~~~~~~~~

Signed-off-by: Bart Van Assche <bvanassche@acm.org>
2019-10-31 14:16:56 -07:00
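The layout reported by pahole in a55f11ee68 can be checked at compile time. The sketch below mirrors that output with stand-in types so that it builds without the RDMA headers: `iser_hdr_standin`, `sge_standin` and `rx_desc_sketch` are hypothetical names, and only the sizes and offsets quoted in the commit message are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins with the sizes shown in the pahole output (28 and 16 bytes). */
struct iser_hdr_standin { char bytes[28]; };
struct sge_standin { uint64_t addr; uint32_t length; uint32_t lkey; };

struct rx_desc_sketch {
    struct iser_hdr_standin iser_header;   /* offset   0, size  28 */
    char                    pad1[4];       /* offset  28, size   4 */
    char                    iscsi_header[48]; /* offset 32, size 48 */
    char                    data[128];     /* offset  80, size 128 */
    struct sge_standin      rx_sg;         /* offset 208, size  16 */
    void                   *hdr_mr;        /* offset 224 */
    char                    pad2[24];
};

/* Fail the build if either buffer drifts off an 8-byte boundary. */
_Static_assert(offsetof(struct rx_desc_sketch, iscsi_header) % 8 == 0,
               "iscsi_header must be 8-byte aligned");
_Static_assert(offsetof(struct rx_desc_sketch, data) % 8 == 0,
               "data must be 8-byte aligned");
```

Because the struct is not packed, taking the address of `rx_sg` yields a properly aligned pointer, which is what the -Waddress-of-packed-member errors quoted above were about.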
David Disseldorp
d42fcd89ce lib: use const for add_data buffers
The buffer is memcopied into the PDU. const makes it a little clearer
that the caller isn't handing over ownership.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2019-09-18 13:30:04 +02:00
Khazhismel Kumykov
22648d9dbf add some missing includes
avoids forward referenced enums, and includes standard type defns

Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
2019-07-14 12:49:05 +10:00
David Disseldorp
9d31150e9d socket: improve ISCSI_HEADER_SIZE readability
The ISCSI_HEADER_SIZE macro accesses iscsi->header_size, so pass it in
as a parameter to make it easier to follow callers.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2018-10-25 23:38:59 +02:00
Tim Crawford
32cfd3c2f8 Make iscsi_set_noautoreconnect public
In our use at Datto, we have come across several issues related to the
automatic reconnect logic (mainly its interaction with POLLHUP). This
allows us to disable the functionality, at the expense of writing our
own reconnect logic.

Related: #241
2017-12-13 16:24:20 -05:00
Felipe Franciosi
3c4925e8da pdu: Introduce iscsi_cancel_pdus()
Introduce a helper exported from lib/pdu.c which cancels all pdus for a
given context. This patch eliminates repeated code from various other
files which have the same purpose. The only functional difference is
that the cancellation done from iscsi-command.c was (incorrectly) not
checking for iscsi->is_loggedin before issuing callbacks.

Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
2017-11-25 17:03:01 +00:00
Tim Crawford
aba0f7da1a Replace WIN32 with _WIN32
Using WIN32 depends on the build environment defining the variable.
_WIN32 is a predefined MSVC macro and is always available.

Signed-off-by: Tim Crawford <crawfxrd@gmail.com>
2017-11-29 10:07:44 -05:00
Tim Crawford
cdb437c545 Fix compilation with VS2017
The primary issue is that in MSVC 14.00 (VS2015) Microsoft added
snprintf as a function to the standard library and prevents users from
defining it to something else (typically, this was _snprintf). So, only
define it when using _MSC_VER < 1900.

Other changes are:
- Fix macro definition of dup2
- Add macro for getpid
- Add function definition for win32_dup
- Add missing EXTERNs

Signed-off-by: Tim Crawford <crawfxrd@gmail.com>
2017-11-29 10:07:44 -05:00
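The version guard described in cdb437c545 can be sketched like this. It is an illustration of the `_MSC_VER < 1900` condition only; `format_portal` is a hypothetical helper and the rest of the Windows compatibility header is not reproduced.

```c
#include <assert.h>
#include <stdio.h>

/* Before VS2015 (_MSC_VER 1900) MSVC only shipped _snprintf, so map the
 * standard name onto it there. Newer MSVC provides snprintf itself and
 * rejects attempts to redefine it. */
#if defined(_MSC_VER) && _MSC_VER < 1900
#define snprintf _snprintf
#endif

/* Hypothetical helper: format "host:port" into buf. */
int format_portal(char *buf, size_t len, const char *host, int port)
{
    int n;

    assert(buf != NULL && len > 0);
    n = snprintf(buf, len, "%s:%d", host, port);
    buf[len - 1] = '\0';   /* _snprintf does not always NUL-terminate */
    return n;
}
```

The explicit terminator is there because `_snprintf` leaves the buffer unterminated when the output fills it, unlike C99 `snprintf`.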
Ronnie Sahlberg
f750101980 Add initial visual studio project files and fix the win32 build
Win32 has been rotting for a while. This patch adds vs17 build files
as well as fixing up all build errors that have accumulated.
There are still build warnings but those can be addressed in a followup
patch.

Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
2017-05-11 21:19:14 -07:00
Ronnie Sahlberg
33d0b63717 Merge branch 'master' into read_batch_pdu2
2017-01-07 08:42:12 -08:00
Ronnie Sahlberg
310674224c Merge pull request #233 from plieven/for_upstream
fix crc32c header checksums and some bugs
2017-01-07 08:32:32 -08:00
Peter Lieven
eb7c1d9b0c socket: process PDUs directly after receiving them
This eliminates the need for an inqueue.

Signed-off-by: Peter Lieven <pl@kamp.de>
2017-01-06 11:48:17 +01:00
Ronnie Sahlberg
8784a2b65f Bump API version
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
2017-01-05 19:33:40 -08:00
Peter Lieven
443b104833 crc32c: use uint_t types
Signed-off-by: Peter Lieven <pl@kamp.de>
2017-01-05 14:39:15 +01:00
Peter Lieven
c68d2c0ddb init: introduce iscsi_smalloc
Signed-off-by: Peter Lieven <pl@kamp.de>
2017-01-05 12:18:03 +01:00
David Disseldorp
4bc5f962e2 Libiscsi: add support for EXTENDED COPY
Build on existing scsi_cdb_extended_copy() functionality. This operation
can be used to offload inter and intra LU copies to the target.

The API is rather primitive, in that the caller needs to construct the
parameter buffer, including CSCD and segment descriptor lists, etc.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2017-01-04 19:33:00 +01:00
David Disseldorp
f475436c5a Libiscsi: add support for RECEIVE COPY RESULTS
Build on existing scsi_cdb_receive_copy_results() functionality. This
request is most commonly used to determine target-side EXTENDED COPY
operational parameters.

Signed-off-by: David Disseldorp <ddiss@suse.de>
2017-01-04 19:33:00 +01:00
Peter Lieven
b81e9a28a6 Batch pdu read in function iscsi_read_from_socket()
iscsi_read_from_socket can currently only read one PDU in each iscsi_service invocation even
if there is more data available on the socket. This patch reads all PDUs until the socket
would block. It enqueues all complete read PDUs and then processes them in order of arrival.

Signed-off-by: Peter Lieven <pl@kamp.de>
2017-01-02 15:52:19 +01:00
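The "read until the socket would block" idea from b81e9a28a6 can be sketched with plain POSIX calls. This is not the libiscsi implementation: `set_nonblocking` and `read_until_would_block` are hypothetical helpers, and PDU framing and queueing are left out so that only the draining loop is shown.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode. Returns 0 on success. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Keep reading until the descriptor would block, the peer closes it,
 * or the buffer is full. Returns the number of bytes read, or -1 on a
 * real error. */
ssize_t read_until_would_block(int fd, char *buf, size_t cap)
{
    size_t total = 0;

    assert(buf != NULL);
    while (total < cap) {
        ssize_t n = read(fd, buf + total, cap - total);

        if (n > 0) {
            total += (size_t)n;       /* more data may still be pending */
            continue;
        }
        if (n == 0)
            break;                    /* peer closed the connection */
        if (errno == EINTR)
            continue;                 /* interrupted, retry */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            break;                    /* drained everything available */
        return -1;
    }
    return (ssize_t)total;
}
```

A caller would then split the returned bytes into complete PDUs and process them in arrival order, as the commit message describes.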
Ronnie Sahlberg
d3ef192021 Add synchronous function iscsi_discovery_sync()
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
2016-10-09 11:54:10 -07:00
Ronnie Sahlberg
39001203b7 TESTS: simple support for READDEFECTDATA10/12
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
2016-09-22 22:43:16 -07:00
Peter Lieven
fa123fc397 abstract transport to static driver functions and opaque driver specific information.
This splits a transport into static driver-specific functions for the common
iscsi commands. Optionally, driver-specific opaque memory is introduced,
which is currently only used by the iSER transport.
Finally, many functions were changed to static.

Signed-off-by: Peter Lieven <pl@kamp.de>
2016-08-05 11:28:43 +02:00
Ronnie Sahlberg
37507c994a Add back iscsi_queue_pdu
We need a public symbol for iscsi_queue_pdu. This is now just a
simple wrapper around iscsi->t->queue_pdu

https://github.com/sahlberg/libiscsi/issues/212

Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
2016-07-11 18:37:25 -07:00
Peter Lieven
765d492aa0 pdu: dump PDU header on ILLEGAL REQUEST sense
Signed-off-by: Peter Lieven <pl@kamp.de>
2016-07-07 11:38:21 +02:00
Ronnie Sahlberg
01de246bdc Add a feature macro for ISER and bump api version
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
2016-06-03 19:09:39 -07:00
Roy Shterman
a628264ef0 Libiscsi: iSER implementation
This commit includes the complete iSER implementation in the libiscsi
library and utilities.

It also adds an iser option to the URL.

Change-Id: I55ca8a9d4db802e72eb991061260dbb0bd0ef9ba
Signed-off-by: Roy Shterman <roysh@mellanox.com>
2016-06-03 18:59:01 -07:00
Roy Shterman
47b6881b97 Libiscsi: Adding abstraction to async functions
The future iSER implementation will provide different implementations
of all socket-related functions. In iSER we get an event only when
there is a new entry in the completion queue, as opposed to TCP, where we
get an event when we can write to the socket.

1. iscsi_get_fd -
	TCP - returns socket fd.
	ISER - returns completion queue channel fd.
2. iscsi_service -
	TCP -   processing the event type got from the socket
		and handles it.
	ISER -  rearming the event mechanism in the completion queue
		and polling all available completion queue entries for
		process.
3. iscsi_which_events -
	TCP -   returns which type of event the library is waiting for
		(Read, Write or both).
	ISER -  in iSER we are waiting only for POLLIN event, hence this
		function always returns POLLIN.

Signed-off-by: Roy Shterman <roysh@mellanox.com>
2016-06-03 18:54:02 -07:00
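The per-transport split listed in 47b6881b97 is the usual C pattern of a table of function pointers. The sketch below shows that pattern for two of the three functions; all type, field and function names in it (`transport_ops`, `conn`, `tcp_ops`, `iser_ops` and so on) are hypothetical and do not reproduce the libiscsi structures.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

struct conn;

/* One table of operations per transport. */
struct transport_ops {
    int (*get_fd)(const struct conn *c);
    int (*which_events)(const struct conn *c);
};

struct conn {
    const struct transport_ops *t;
    int socket_fd;           /* used by the TCP flavour */
    int cq_channel_fd;       /* used by the iSER flavour */
    int has_queued_output;   /* non-zero when data is waiting to be sent */
};

/* TCP: poll the socket, and ask for POLLOUT only when there is output. */
static int tcp_get_fd(const struct conn *c) { return c->socket_fd; }
static int tcp_which_events(const struct conn *c)
{
    return POLLIN | (c->has_queued_output ? POLLOUT : 0);
}

/* iSER: poll the completion queue channel, which only signals POLLIN. */
static int iser_get_fd(const struct conn *c) { return c->cq_channel_fd; }
static int iser_which_events(const struct conn *c)
{
    (void)c;
    return POLLIN;
}

const struct transport_ops tcp_ops  = { tcp_get_fd,  tcp_which_events  };
const struct transport_ops iser_ops = { iser_get_fd, iser_which_events };

/* Generic entry points dispatch through the table. */
int conn_get_fd(const struct conn *c)
{
    assert(c != NULL && c->t != NULL);
    return c->t->get_fd(c);
}

int conn_which_events(const struct conn *c)
{
    assert(c != NULL && c->t != NULL);
    return c->t->which_events(c);
}
```

Callers keep using the generic entry points in their poll loop and never need to know which transport is behind the connection.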
Roy Shterman
c85042bacb Libiscsi: Introducing new functions for zero-copy write operations
iscsi-command:  Add new functions for all write operations (WRITE10,
                WRITE12, WRITE16, WRITEOR, etc.) for cases where the user
                wants to pass their own io vectors (avoiding a memcpy).

                The new functions are called iscsi_write*_iov_task and look
                very similar to iscsi_write*_task, except that they take a
                scsi_iovec pointer and the number of scsi_iovec entries as
                parameters.

Change-Id: I719552b4cbda4f937975b5df7e77b4844e48cd16
Signed-off-by: Roy Shterman <roysh@mellanox.com>
2016-06-03 18:51:27 -07:00
Roy Shterman
e00e47d28d Libiscsi: Introducing new functions for zero-copy read operations
iscsi-command:  Add new functions for all read operations (READ6,
                READ10, READ12, READ16, etc.) for cases where the
                user wants to pass their own io vectors (avoiding a memcpy).

                The new functions are called iscsi_read*_iov_task and look
                very similar to iscsi_read*_task, except that they take a
                scsi_iovec pointer and the number of scsi_iovec entries as
                parameters.

Change-Id: Ice6bdb9227d72b20f495927f17d6757c124e4c84
Signed-off-by: Roy Shterman <roysh@mellanox.com>
2016-06-03 18:47:55 -07:00
Roy Shterman
6c1bdb4808 Libiscsi: Adding free_pdu function to transport abstraction
Signed-off-by: Roy Shterman <roysh@mellanox.com>
2016-06-03 18:47:23 -07:00
Roy Shterman
dff69584e0 Libiscsi: Adding disconnect function to transport abstraction
all library: change disconnect to iscsi->t->disconnect

1. In TCP we only need to set fd to -1, as there are no
other transport resources. In the future iSER transport we will need to
clean up resources and destroy the rdma connection.

Signed-off-by: Roy Shterman <roysh@mellanox.com>
2016-06-03 18:46:50 -07:00
Roy Shterman
bc64420bad Libiscsi: Changing header iscsi_in_pdu
socket: need to malloc hdr

include/iscsi-private: change iscsi_in_pdu hdr to char*
		       instead of a static array for more convenient iser
		       pdu creation.

To use iscsi_in_pdu in iSER without making a copy of it,
we need to change hdr from a static array to a pointer.
Because of that, the iscsi_tcp flow needs to szmalloc (small zero malloc)
hdr when creating a new iscsi_in_pdu. iscsi_in_pdu is malloced
once per received pdu. This change reduces the size of the iscsi_in_pdu
struct but adds an extra allocation of the same size that was removed
from the struct.

Signed-off-by: Roy Shterman <roysh@mellanox.com>
2016-06-03 18:46:18 -07:00
Roy Shterman
2671e10565 Libiscsi: Adding new_pdu function to transport abstraction
Signed-off-by: Roy Shterman <roysh@mellanox.com>
2016-06-03 18:45:44 -07:00
Roy Shterman
e3df0bbf96 Libiscsi: Adding queue pdu function to transport abstraction
include/iscsi-private: adding queue_pdu in transport function pointers
                       struct
include/iscsi: declaration of tcp_queue_pdu function

socket: adding queue_pdu function to transport initialization

all_library: changing iscsi_queue_pdu into iscsi->t->queue_pdu

Signed-off-by: Roy Shterman <roysh@mellanox.com>
2016-06-03 18:44:51 -07:00
Roy Shterman
0d6362ffe6 Libiscsi: Adding connect function to transport abstraction
socket: adding tcp_connect function and implement it
	in the last common part of iSER and TCP in iscsi_connect_async

Signed-off-by: Roy Shterman <roysh@mellanox.com>
2016-06-03 18:43:46 -07:00