This patch finally introduces a small allocation pool
which recycles all the small portions of memory that
are used for headers and pdu structures. This was
the initial idea behind wrapping all memory functions
in libiscsi.
The results of booting are test system up to the login
prompt are quite impressive:
BEFORE:
libiscsi:5 memory is clean at iscsi_destroy_context() after 10712 mallocs, 18 realloc(s) and 10712 free(s)
AFTER:
libiscsi:5 memory is clean at iscsi_destroy_context() after 41 mallocs, 18 realloc(s), 41 free(s) and 10584 reused small allocations
Signed-off-by: Peter Lieven <pl@kamp.de>
Default to 0 meaning no timeout.
Implement a test for iSCS to test what happens if we send a command
with CMDSN being higher than the target allows.
In this case we dont strictly know what will happen, just that what should
NOT happen is the target responding with success.
But we have to be prepared for any kind of failure, including a timeout,
scsi sense, or even iscsi reject or session failure.
iscsi_data.size is used as parameter to memory
operation functions (such as malloc) which except
size_t rather than int.
Signed-off-by: Peter Lieven <pl@kamp.de>
Ordering by itt adds the special case of handling
itt == 0xffffffff. Order by CmdSN instead as described
by RFC3720.
Signed-off-by: Peter Lieven <pl@kamp.de>
Queing packets with itt = 0xffffffff (e.g. NOP-Out replies) ahead
of queue because they might have a higher CmdSN than following
packets. This again could lead to a deadlock AND its a protocol
violotion:
RFC3720 Section 3.2.2.1 second last paragraph on Page 21:
"On any connection, the iSCSI initiator MUST send the commands in
increasing order of CmdSN, except for commands that are retransmitted
due to digest error recovery and connection recovery."
Signed-off-by: Peter Lieven <pl@kamp.de>
This patch adds support for read/writev to directly read and write
from/to iovectors. Before this patch on read and write from/to socket
the operation was limited by the iovec boundaries. If there is enough
data in the buffer or enough buffer space available its now possible
to transfer the whole data in one atomic operaion.
Signed-off-by: Peter Lieven <pl@kamp.de>
This will set TCP_NODELAY on the socket connection
to the target. This is the first step to improve latency.
For systems supporting TCP_CORK we plan to add a cork
around certain PDUs e.g. DATA-Out, but this needs further
testing.
Signed-off-by: Peter Lieven <pl@kamp.de>
We where modifying out_offset and out_len in iscsi_write_to_socket().
If the packet that was being sent before reconnect was a write command
the was a significant change that out_offset and out_len where already
touched. When requeing the packet after reconnect we where
sending garbage!
Signed-off-by: Peter Lieven <pl@kamp.de>
A storage might sent an R2T response for a WRITE command while
we still sending out the WRITE command PDU. This is especially
the case when the command PDU carries immediata data.
Without this patch the R2T response will get lost as
the cmdpdu for the R2T cannot be found in iscsi_process_pdu()
leading to a deadlock.
Signed-off-by: Peter Lieven <pl@kamp.de>
This patch fixes a deadlock case where all available
cmdsns have been used for command PDUs which need
additional Data-OUT PDUs to succeed (e.g. a write16
which is larger than first_burst_len).
In this case the target will never increase the maxcmdsn
leading to a deadlock in iscsi_write_to_socket().
If we receive the R2T for such a write request we will
queue the Data-OUT PDUs but at the end of queue. As
a result they will never be sent as the outqueue is already
stuck.
We fix this by sorting the outqueue by ascending itt.
Signed-off-by: Peter Lieven <pl@kamp.de>
This finally makes T1040 trigger the cmdsn overflow deadlock.
Its a bad idea to mangle on negotiated parameters as this leads
to protocol errors.
We also need to avoid overlapping writes as this leads also
to protocol errors.
Additionally we test with and with (if available) immediate data.
Signed-off-by: Peter Lieven <pl@kamp.de>
RFC3720 says that cmdsn comparison must be done using
serial32 arithmetic. This will definetly avoid a deadlock
if cmdsn wraps from 2^32-1 to 0.
Signed-off-by: Peter Lieven <pl@kamp.de>
Change iscsi_scsi_command_async() to write data-out using iovectors
attached to the scsi task structure instead of copying the data into
the buffer holding the header.
Still allow passing the data via an argument to the funtcion so that the
ABI does not change but then just conver the data to an iovector.
Update the write_to_socket functions to know about the iovectors and write them
as part of the pdu.
Convert write10_task to use iovectors.
This will allow 'zero-copy' writes through libiscsi.
However, as 'zero-copy writes does mean that we do more send() calls into
the kernel this may degrade performance for very small i/o.
A scsi write will not take at least 2 send() calls.
One send call for the iscsi header structure and a second send call for the
payload data.
This will be more expensive than the old memcpy() of payload data plus one send() call since the send() will be a lot more expensive than memcpy() of a small amount of data.
If a process opens more than once connection the interfaces are assigned
round-robin from the available ones. However, different processes will
uses a random interface of as starting point. This patch will also
make the redirect case handled correctly.
This patch adds a wrapper around all memory allocations and frees.
The idea is to get warned immediately if the application leaks memory.
Additionally the wrapper functions make it easy to add different
memory allocators or memory pools in the future.
Quoth gcc-4.6.3:
libtool: compile: gcc -DHAVE_CONFIG_H -I. -I. -I./include "-D_U_=__attribute__((unused))" -Wall -W -Wshadow -Wstrict-prototypes -Wpointer-arith -Wcast-align -Wwrite-strings -g -O2 -MT lib/socket.lo -MD -MP -MF lib/.deps/socket.Tpo -c lib/socket.c -fPIC -DPIC -o lib/.libs/socket.o
lib/socket.c:445:1: warning: 'inline' is not at beginning of declaration [-Wold-style-declaration]
Fix this and make it a static function.
Also remove trailing whitespace from this file while at it.
Signed-off-by: Arne Redlich <arne.redlich@googlemail.com>
Send a large number of DATA-OUT PDUs that do not have a matching SCSI-COMMAND
PDU and verify that the target responds correctly. Either by terminating the
session or by just ignoring the data.
Verify also that the target is not "surprised" and crashes.
Signed-off-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
qemu-kvm/qemu-img starts in+out polls from the socket before the connection is
established. This leads to a hang if the connection cant be established
(i.e. the target is down when qemu-kvm is started).
before:
LIBISCSI_DEBUG=2 qemu-img convert -f iscsi -O raw iscsi://127.0.0.1/iqn.2004-04.com.qnap:ts-809u:iscsi.lieven20.c53dac/0 /dev/null
libiscsi: connecting to portal 127.0.0.1 [iqn.2004-04.com.qnap:ts-809u:iscsi.lieven20.c53dac]
libiscsi: read from socket failed, errno:111 [iqn.2004-04.com.qnap:ts-809u:iscsi.lieven20.c53dac]
libiscsi: connection to 127.0.0.1 established [iqn.2004-04.com.qnap:ts-809u:iscsi.lieven20.c53dac]
->success!!
after:
LIBISCSI_DEBUG=1 qemu-img convert -f iscsi -O raw iscsi://127.0.0.1/iqn.2004-04.com.qnap:ts-809u:iscsi.lieven20.c53dac/0 /dev/null
libiscsi: iscsi_service: socket error Connection refused(111) while connecting. [iqn.2004-04.com.qnap:ts-809u:iscsi.lieven20.c53dac]
libiscsi: Failed to connect to iSCSI socket. iscsi_service: socket error Connection refused(111) while connecting. [iqn.2004-04.com.qnap:ts-809u:iscsi.lieven20.c53dac]
qemu-img: iSCSI: Failed to connect to LUN : Failed to connect to iSCSI socket. iscsi_service: socket error Connection refused(111) while connecting.
qemu-img: Could not open 'iscsi://127.0.0.1/iqn.2004-04.com.qnap:ts-809u:iscsi.lieven20.c53dac/0': Invalid argument
qemu-img: Could not open 'iscsi://127.0.0.1/iqn.2004-04.com.qnap:ts-809u:iscsi.lieven20.c53dac/0'
This patch adds support for setting TCP_SYNCNT to overwrite
the system default values. This allows indirect support
for a configurable connect timeout.
Linux uses a exponential backoff for SYN retries starting
with 1 second.
This means for a value n for TCP_SYNCNT, the connect will
effectively timeout after 2^(n+1)-1 seconds.