If we do a write that spans more than first-burst-length amount of data
and we can use immediate data.
This will lead to a sequence
I->T SCSI_COMMAND + furst_burst_length of immediate data
T->I R2T
I->T DATA-OUT with the remainder of the data.
T->I SCSI_RESPONSE
In this situation we still have to set the F-bit in the flags for the ISCSI_COMMAND.
The F-bit indicates that the PDU is the final PDU in the current sequence
and is what will trigger the target to process what it has currently received
after which the target will either respond with a SCSI-RESULT or DATA-IN if this
was the final sequence of the command,
Or in the case this was not the last sequence, then the target will send a R2T
to request the next sequence for this command.
In this scenario above we have two sequences.
the first sequence is the SCSI_COMMAND with immediate data and since it has filled the full amount of immediate data and can not send any more data until an R2T, this means this pdu ends the sequence and thus has the F-bit set.
Once we receive the R2T the second sequence starts, which also will have the F-bit set in the PDU.
This fixes the IMMEDIATE_DATA_YES case if the paylaod did
not fit into the first PDU and fixes no unsolicited data
sent out iff IMMEDIATE_DATA_NO and INITIAL_R2T_NO.
Signed-off-by: Peter Lieven <pl@kamp.de>
set the cmdsn in the pdu struct so we can compare with
maxcmdsn when sending to socket even if data-out pdus
do not carry a cmdsn on the wire.
if we would not set pdu->cmdsn comparsion with
iscsi->maxcmdsn would cause a deadlock after
maxcmdsn increases to 2^31.
Signed-off-by: Peter Lieven <pl@kamp.de>
Remove the size field as it is not used. If we would keep it
we would have to calculate it in scsi_task_set_iov_in/out which
would add unneccassry wals to the iovec array.
Signed-off-by: Peter Lieven <pl@kamp.de>
The send logic was completely broken for any cases except
ISCSI_INITIAL_R2T_NO and ISCSI_IMMEDIATE_DATA_YES.
The final flag was set wrong or no data was sent.
It was also broken if the data did not fit into the cmd_pdu as
the consecutive pdus did not have the scsi_cbdata set which
lead to a segfault in iscsi_get_user_out_buffer().
Unfortunately we need to include scsi-lowlevel.h again in iscsi-private.h.
This should be fixed asap by introduction of an iscsi_task struct
to avoid to store iscsi relevant data in the scsi_task.
Signed-off-by: Peter Lieven <pl@kamp.de>
Conflicts:
lib/iscsi-command.c
Signed-off-by: Peter Lieven <pl@kamp.de>
obsolete the old api.
After discussions, It is probably not the right thing to do.
Lets leave the old api as is. Most simpler applications will
continue to pass a single linear buffer to the iscsi_<send-data-out-to-device>_task() for convenience so it would be wrong to force them all to
migrate to amore complex iovector api.
Convert any linear buffers, if provided, to a one element data-out iovector
if a buffer was provided.
This patch fixes a deadlock case where all available
cmdsns have been used for command PDUs which need
additional Data-OUT PDUs to succeed (e.g. a write16
which is larger than first_burst_len).
In this case the target will never increase the maxcmdsn
leading to a deadlock in iscsi_write_to_socket().
If we receive the R2T for such a write request we will
queue the Data-OUT PDUs but at the end of queue. As
a result they will never be sent as the outqueue is already
stuck.
We fix this by sorting the outqueue by ascending itt.
Signed-off-by: Peter Lieven <pl@kamp.de>
This finally makes T1040 trigger the cmdsn overflow deadlock.
Its a bad idea to mangle on negotiated parameters as this leads
to protocol errors.
We also need to avoid overlapping writes as this leads also
to protocol errors.
Additionally we test with and with (if available) immediate data.
Signed-off-by: Peter Lieven <pl@kamp.de>
This patch avoid incrementing itt to 0xffffffff which is
a reserved value for immediate pdus. Avoid incrementing
it to 0xfffffff to avoid unexpected behaviour.
Signed-off-by: Peter Lieven <pl@kamp.de>
RFC3720 says that cmdsn comparison must be done using
serial32 arithmetic. This will definetly avoid a deadlock
if cmdsn wraps from 2^32-1 to 0.
Signed-off-by: Peter Lieven <pl@kamp.de>
so that i/o from the initiator stops flowing and are just queued locally
until teh cmdsn window opens up again.
Send a lot of writes that block until we have processed the R2T
back from the target and verify we can still make process when we
have saturated the cmdsn window and are also blocked waiting for R2Ts.
We can simply next the iscsi_scsi_cbdata in the iscsi_pdu struct
since it is only used inside the iscsi_pdu.
This saves one malloc for each pdu.
Signed-off-by: Peter Lieven <pl@kamp.de>
Change iscsi_scsi_command_async() to write data-out using iovectors
attached to the scsi task structure instead of copying the data into
the buffer holding the header.
Still allow passing the data via an argument to the funtcion so that the
ABI does not change but then just conver the data to an iovector.
Update the write_to_socket functions to know about the iovectors and write them
as part of the pdu.
Convert write10_task to use iovectors.
This will allow 'zero-copy' writes through libiscsi.
However, as 'zero-copy writes does mean that we do more send() calls into
the kernel this may degrade performance for very small i/o.
A scsi write will not take at least 2 send() calls.
One send call for the iscsi header structure and a second send call for the
payload data.
This will be more expensive than the old memcpy() of payload data plus one send() call since the send() will be a lot more expensive than memcpy() of a small amount of data.
This has the nice side effect to remove the compiler warning
"dereferencing type-punned pointer will break strict-aliasing rules"
which occur since gcc-4.7.
There are 79 locations where the warning occurs. All of them are in
statements where the htonl/htons/ntohl/ntohs functions are used, e.g.:
in lib/pdu.c itt = ntohl(*(uint32_t *)&in->hdr[16]);
in lib/scsi-lowlevel.c *(uint32_t *)&task->cdb[2] = htonl(lba);
The warning is not related to the htonl/htons/ntohl/ntohs functions but
to the casting/dereferencing operation. If the dereferenced variable is
already a pointer, the warning does not not occur, e.g. this one:
in lib/pdu.c itt = ntohl(*(uint32_t *)&in->data[16]);
The warning is caused by the -fstrict-aliasing option. The
-fstrict-aliasing option is enabled at optimization levels -O2, -O3, -Os.
Signed-off-by: Bernhard Kohl <bernhard.kohl@gmx.net>
As long as we have no struct iscsi_task we cannot track the
free() of data.indata buffer. This leads to a leaked memory
report in iscsi_context_destroy().
We fix this by assuming it has been freed when we pass it
it to scsi_task.
Signed-off-by: Peter Lieven <pl@kamp.de>
This patch defines an scsi_iovec struct which is guaranteed
to be POSIX compatible. It furthermore adds support for
in+out iovectors for bi-directional operations
Signed-off-by: Peter Lieven <pl@kamp.de>
If an application passes buffers to libiscsi for reading they are
currently added to a linked list one by one. This leads to a malloc
for each buffer object plus O(n) walks trough the list to add
the buffer to then end of the list. Additionally the buffer read
routine takes up to O(n) iterations to find the right buffer
for a request.
This patch introduces an scsi_iovector struct to pass buffers to
an scsi task. Adding a new buffer is in O(1) and finding the
right buffer to also. Malloc requirements are in O(log(n)).
Additionally the scsi_iovector struct is itended to be binary
compatible to an QEMUIOVector allowing to pass this structure
directly to the library.
Initial tests have been made booting an Ubuntu LTS 12.04.1
Desktop server up to the login prompt. The following observations
have been made with regards to scsi_malloc calls:
original implementation: ~11.500 mallocs
using iovector instead of list: ~ 7.500 mallocs
passing the iovector directly: 0 mallocs
To enable this feature in qemu for testing the following patch might
be used:
diff --git a/block/iscsi.c b/block/iscsi.c
index a6a819d..2809c15 100644
--- a/block/iscsi.c
+++ b/block/iscsi.c
@@ -390,11 +390,16 @@ iscsi_aio_readv(BlockDriverState *bs, int64_t sector_num,
return NULL;
}
+#if defined(LIBISCSI_FEATURE_IOVECTOR)
+ assert(sizeof(struct QEMUIOVector) == sizeof(struct scsi_iovector));
+ scsi_iovector_assign(acb->task, (struct scsi_iovector*) acb->qiov);
+#else
for (i = 0; i < acb->qiov->niov; i++) {
scsi_task_add_data_in_buffer(acb->task,
acb->qiov->iov[i].iov_len,
acb->qiov->iov[i].iov_base);
}
+#endif
iscsi_set_events(iscsilun);
---
Signed-off-by: Peter Lieven <pl@kamp.de>