TASK SET FULL Definition(s) in FC-TAPE and SAM-2

Gerard Roudier groudier at club-internet.fr
Sat Mar 20 01:10:16 PST 1999


* From the T10 Reflector (t10 at symbios.com), posted by:
* Gerard Roudier <groudier at club-internet.fr>
*

George,

On Sat, 20 Mar 1999, George M. Ericson wrote:

> Gerard,
>
> Excellent summary of the real world.

Thanks.

> In our FC experience at CLARiiON, the multi-initiator case you cite is a
> real problem with nasty consequences for a device driver that does not
> provide some hysteresis around queue full and busy.  My recommendation for
> device driver writers is to delay for a short time (1-5 seconds should do)
> before retrying a busy.  (Note that for FC systems, immediate retries to a
> really busy device will generate on the order of 15,000 to 30,000 interrupts
> per second, more than enough to bring many servers to their knees.)  For
> queue full, treating the indication as an indication of the available queue
> size is a fine short-term policy decision.  However, it is not good at all
> as a medium/long-term policy.  Within a relatively short time (probably tens
> of seconds), the device driver should ratchet the assumed queue size back up.
> Unless told by some other mechanism, the device driver should not assume a
> hard upper bound for queue size based on a queue full indication.

In my opinion, the difficulty initiators have in coping gracefully with
every compliant target implementation of the BUSY and QUEUE FULL
conditions is the result of the specifications not providing guidelines
that address this problem.

For the BUSY status, what you want is neither to waste resources nor to be
too late requeuing the command. No fixed requeue delay can address the
issue. In real life, such a problem has been addressed using heuristics
based on variable delay values. For example, an implementation note such
as the following should be added to the specs, in my opinion:

a) The initiator may requeue a command immediately after having received
   a BUSY status from the device for the first attempt at queuing this
   command.

b) If the device reports BUSY for the second attempt of a command, the
   initiator should delay subsequent attempts of this command using the
   following algorithm for the computation of subsequent delays:

   - Select an initial delay value, which may be fixed or depend on some
     knowledge of the device.

   - Use twice the value of the delay used for the previous attempt until
     some maximum value is reached, and use this maximum value for
     subsequent attempts.

For example:  (from 10 ms to 1 second)
------------
- try 1    delay=0
- try 2    delay=10 ms
- try 3    delay=20 ms
- try 4    delay=40 ms

- try 9    computed delay=1280ms > 1000ms --> use delay=1000ms
- try 10   delay=1000ms
- try 11   delay=1000ms

and so on...

Such a heuristic ensures that the initiator will be late by no more than
the actual time the device takes to become READY (it recovers in less than
twice the actual BUSY time of the device, if that time is > 10 ms), and it
takes care not to waste resources.
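
The delay computation above can be sketched in a few lines. This is only
an illustration, not text proposed for the specs: the function name, the
parameter names and the default values (10 ms initial delay, 1000 ms
maximum) are assumptions taken from the example, and the delay is indexed
by the number of BUSY statuses received so far for the command.

```python
def busy_retry_delay_ms(busy_count, first_delay_ms=10, max_delay_ms=1000):
    """Delay (in ms) to wait before requeuing a command that has just been
    answered with BUSY for the busy_count-th time (1-based).

    Hypothetical helper: names and defaults are illustrative assumptions.
    """
    if busy_count < 1:
        raise ValueError("busy_count must be >= 1")
    if busy_count == 1:
        # Rule (a): after the first BUSY, requeue immediately.
        return 0
    # Rule (b): start from the selected value, then double the previous
    # delay until the maximum is reached, and stay at the maximum.
    delay = first_delay_ms
    for _ in range(busy_count - 2):
        delay *= 2
        if delay >= max_delay_ms:
            return max_delay_ms
    return min(delay, max_delay_ms)
```

With the defaults, the first four BUSY statuses give delays of 0, 10, 20
and 40 ms, and the delay never exceeds 1000 ms however long the device
stays BUSY.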

For the QUEUE FULL problem, notes could follow this pattern:

a) A device that has returned QUEUE FULL status for a command should
   not return this status for the next 10 commands if the initiator
   does not attempt to increase the number of commands beyond what was in
   the TASK SET at the previous return of QUEUE FULL.

b) A device should not return QUEUE FULL if the current TASK SET for
   the device is empty, but should prefer the BUSY status.


(a) ensures that the simplest heuristic will not encounter more than 10%
    of QUEUE FULL statuses at any time, and allows a simple heuristic
    to get far better results.

(b) avoids triggering the 'EMPTY and FULL at the same time' problem which,
    the human brain not being as perfect as computers :-), may:
    - break the driver code, or
    - hang the device driver queuing mechanism.
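
As an illustration of what a simple initiator-side heuristic relying on
note (a) might look like, here is a minimal sketch. The class, its method
names and the ceiling of 64 commands are hypothetical and not taken from
any existing driver. It holds the assumed queue depth at the number of
commands that were outstanding when QUEUE FULL was returned, and probes
one command deeper only after 10 commands have completed.

```python
class QueueDepthTracker:
    """Initiator-side estimate of how many commands a device accepts.

    Hypothetical sketch: names and default values are assumptions.
    """

    def __init__(self, max_depth=64, hold_commands=10):
        self.max_depth = max_depth          # driver-side ceiling (assumed)
        self.hold_commands = hold_commands  # the "10 commands" of note (a)
        self.limit = max_depth              # current assumed queue depth
        self.completed_since_full = 0

    def can_queue(self, outstanding):
        """True if one more command may be sent with `outstanding` queued."""
        return outstanding < self.limit

    def on_queue_full(self, outstanding):
        """The device refused a command while `outstanding` were queued."""
        # Never go below 1: with an empty task set the initiator must
        # still be allowed to retry, otherwise its queuing would hang.
        self.limit = max(1, outstanding)
        self.completed_since_full = 0

    def on_command_complete(self):
        """A command completed without QUEUE FULL."""
        self.completed_since_full += 1
        if (self.completed_since_full >= self.hold_commands
                and self.limit < self.max_depth):
            # Ratchet the assumed depth back up, one command at a time.
            self.limit += 1
            self.completed_since_full = 0
```

The driver would call can_queue() before sending, on_queue_full() when
the status is returned, and on_command_complete() for every other
completion. A QUEUE FULL received with no outstanding command leaves the
limit at 1 instead of 0, which is the case note (b) is concerned with.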

Regards,
   Gérard.

> Regards,
> George Ericson
> CLARiiON Advanced Storage Division, Data General
> GEricson at CLARiiON.com
>
> -----Original Message-----
> From: owner-t10 at Symbios.COM [mailto:owner-t10 at Symbios.COM]On Behalf Of
> Gerard Roudier
> Sent: Wednesday, March 17, 1999 4:15 PM
> To: gop at us.ibm.com
> Cc: t10 at Symbios.COM
> Subject: Re: TASK SET FULL Definition(s) in FC-TAPE and SAM-2
>
> On Wed, 17 Mar 1999 gop at us.ibm.com wrote:
>
> > Joe,
> > There is a significant difference in how an initiator is supposed to
> > respond to a BUSY vs a TASK SET FULL status.
> >
> > A BUSY only tells the initiator that the command could not be received by
> > the target and the initiator should try again later. There is no
> > indication from the target as to how much later.
> >
> > A TASK SET FULL also tells the initiator that the command could not be
> > received by the target and that the initiator should not try again until
> > it receives a command complete indication from a currently outstanding
> > command. So in the case of a TASK SET FULL there is an indication from the
> > target as to when the initiator has a chance of having the command
> > accepted.
>
> The TASK SET FULL only tells the initiator that the device decided that it
> does not have enough resources to accept the command. It only informs the
> initiator that the device has stated the cause to be a lack of
> resources. It is the initiator's responsibility not to behave
> stupidly in such a situation, and in real life it is not that easy to
> implement good heuristics for TASK SET FULL conditions that deal
> gracefully with all existing device firmware implementations.
>
> Indeed, if the device has outstanding commands, the behaviour you suggest
> is reasonable, but if it hasn't any, the initiator still must recover from
> the situation. This has happened with some early drives, at least, likely
> when write caching was enabled, because commands are completed as soon as
> data is copied into the cache and the device can base the lack of
> resources condition on cache usage, or just share the cache memory for
> data and command queues. This may also happen in a multi-initiator
> environment, regardless of the way the target handles its resources. Such
> behaviours are quite compliant with the specifications.
>
> The behaviour of device firmware regarding the TASK SET FULL condition
> basically follows one of these patterns:
>
> 1) Devices that implement a fixed command queue depth. We can find
>    mid-range hard disks that support a fixed value of 63 or 64
>    commands, for example.
>    Such devices return TASK SET FULL if you try to queue more commands.
>    Needless to say, this is easy to handle from the initiator.
>
> 2) Devices that return TASK SET FULL on some complex condition on
>    the amount of resources used. Such devices may return
>    TASK SET FULL with a variable number of outstanding commands,
>    especially when write caching is enabled. These devices require the
>    initiator to be a bit more clever in handling TASK SET FULL conditions
>    so that performance is not affected. Numerous high-end hard disks
>    behave so.
>
> And, btw, if several initiators are queuing commands to the same target,
> obviously, a TASK SET FULL can be returned by this target to an initiator
> that has no command currently queued to this target.
>
> > This difference is important because the way most initiators respond to
> > BUSY is to immediately resend the command, which has the effect of
> > clogging up the SCSI bus with needless activity. Whereas TASK SET FULL
> > causes the initiators to hold back resending the command, thus reducing
> > traffic on the SCSI bus.
>
> A device that has outstanding commands from an initiator and returns BUSY
> status when it is queued with a new command by this initiator is probably
> rare. Anyway, requeuing the command immediately when a device returns BUSY
> is probably not the way to go, whether or not outstanding commands from the
> initiator exist. My opinion is that the BUSY status is not a condition
> that must be assumed to last a very short time on average, and so
> requeuing the command immediately is probably close to the worst thing to
> do. But since the BUSY condition does not happen so often and is generally
> a not too long transient condition, such behaviour from initiators is
> probably harmless in real life. Perhaps BUSY should be handled the same way
> as TASK SET FULL if it happens while the target has outstanding commands
> from the initiator, and the requeue of the command should be delayed by a
> short time (tens of milliseconds) if no outstanding commands currently
> exist from that initiator.
>
> > Bye for now,
> > George Penokie
> >
> > Dept PPV  114-2 N212
> > E-Mail:    gop at us.ibm.com
> > Internal:  553-5208
> > External: 507-253-5208   FAX: 507-253-2880
>
> Regards,
>    Gérard.
>
> Gérard ROUDIER.

*
* For T10 Reflector information, send a message with
* 'info t10' (no quotes) in the message body to majordomo at symbios.com




