96-274r0 -- Proposed Change in QErr=1 Behavior for SPC-2

Gene Milligan Gene_Milligan at notes.seagate.com
Fri Nov 22 14:34:49 PST 1996


* From the SCSI Reflector (scsi at symbios.com), posted by:
* Gene Milligan <Gene_Milligan at notes.seagate.com>
*
Yes it requires the behavior already defined. It does not require the proposed 
behavior.

They are not arrows, they are cream puffs.

Gene

To: scsi @ Symbios.COM @ INTERNET
cc:  (bcc: Gene Milligan)
From: cbinford @ ppdpost.ks.symbios.com ("Binford, Charles") @ INTERNET
Date: 11/22/96 02:07:00 PM PST
Subject: RE: 96-274r0 -- Proposed Change in QErr=1 Behavior for SPC-2

* From the SCSI Reflector (scsi at symbios.com), posted by:
* "Binford, Charles" <cbinford at ppdpost.ks.symbios.com>
*

Please point your arrows at me, not Ralph.  Ralph proposed this on my behalf 
since I have not been attending the T10 meetings lately.  Those of you who 
have been following this reflector for a while may remember I brought this 
up about two years ago.  The subject of QErr and multi-initiator behavior 
was bounced around on the reflector a while, then the subject came up at the 
next SCSI Working Group.  Unfortunately, I was not able to attend, and 
without someone pushing, an issue dies real quick.  I decided to drop the 
issue.  The reason I bring it up now is because Peter Johansson's Basic Queuing 
model *requires* it.

See my specific responses to Gerry's objections below.

Thanks,
Charles Binford
Symbios Logic

 ----------
>From: scsi-owner
>To: scsi
>Subject: RE: 96-274r0 -- Proposed Change in QErr=1 Behavior for SPC-2
>Date: Thursday, November 21, 1996 1:10PM
>
>------------------------------------------------------------------------------
>* From the SCSI Reflector (scsi at symbios.com), posted by:
>* Gerry Houlder <Gerry_Houlder at notes.seagate.com>
>*
>Quoting Ralph's message:
>
>>This proposal solves the problem by eliminating the interaction between
>>a CHECK CONDITION status sent to one initiator and the handling of tasks
>>belonging to other initiators.  This proposal also can be viewed as
>>reducing implementation complexity in the device server.
>
>(1) This proposal doesn't reduce the implementation complexity in the device
>server. It requires the device server to test each command to see which
>initiator it is from and then choose to either abort it or maintain it. This
>is MUCH more difficult than aborting everything unconditionally.

Ralph's (my) proposal merely says treat the QErr like an Abort Task Set, 
instead of a Clear Task Set.  Devices which support queuing ALREADY know how 
to do it.  It is not increased complexity, merely a modified algorithm.
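
The difference in scope can be sketched in a few lines. This is only an
illustrative model: the Task class and both function names are hypothetical
and are not taken from SPC-2 or from any drive firmware.

```python
# Hypothetical model of a device server's task set; the names here are
# illustrative assumptions, not part of any SCSI standard or firmware.
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    initiator: str  # which host queued the command
    tag: int        # queue tag chosen by that host


def clear_task_set(tasks, faulting_initiator):
    """Current QErr=1 scope: every queued task is aborted, whoever sent it."""
    return []


def abort_task_set(tasks, faulting_initiator):
    """Proposed QErr=1 scope: abort only the faulting initiator's tasks."""
    return [t for t in tasks if t.initiator != faulting_initiator]


# Host B receives the CHECK CONDITION while host A has two tasks queued.
queue = [Task("A", 1), Task("A", 2), Task("B", 1)]
after_current = clear_task_set(queue, "B")    # host A's tasks are lost too
after_proposed = abort_task_set(queue, "B")   # host A's tasks survive
```

The per-initiator filter is the same test a device already has to make to
service an Abort Task Set, which is the point of the paragraph above.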

The idea of *reducing* complexity comes from the notion that there may be a 
way for a "smart" device to help avoid the endless loop potential 
demonstrated in Ralph's original posting.  I haven't figured out how a 
device would do that, but if I do, I'm sure it would add complexity. 
The modified QErr proposal sidesteps the issue by not having any QErr / 
multi-initiator interaction.

>(2) He is actually trying to reduce the initiator complexity, not the device
>server complexity. The fact is that the QErr feature was designed for
>single initiator systems and multi-initiator systems were supposed to use
>the ACA ACTIVE technique (or at least normal contingent allegiance) so
>commands from other initiators weren't blown away. This is more work for
>the initiator to conditionally decide which other commands it has sent
>should be aborted because of the failed command. He just wants to
>push the hard work down to the device instead of doing it himself !!
>

We almost agree on one point, Gerry!  You state, "the QErr feature was 
designed for single initiator systems...".  I agree it works fine there. 
There would be absolutely NO change in behavior with the QErr proposal 
for a single initiator.  Am I reading too much into your statement to say 
you agree it doesn't work for multi-initiator???

>(3) This flies against the "simple queuing" proposal presented by Peter
>Johansson, based on an earlier proposal by Jim McGrath. That proposal
>unequivocally stated that it is more difficult to do "conditional aborts".
>This is why ABORT TASK and ABORT (where only one initiator's commands
>are aborted) were removed from the simple queuing set.

When were ABORT TASK and ABORT removed??  All I could find on the web was 
revision 3 of Peter's proposal, and it still had ABORT TASK.  The minutes of 
the November meeting mentioned rev 4, but there was no mention of what was 
changed.

As I said before, *this* proposal is what caused me to bring up the issue 
again.  I like everything about the basic queueing model except the QErr bit 
being *required* to be on.  I would prefer to not use QErr for a single 
initiator.  But for a multi-initiator environment I believe the QErr totally 
breaks.  Consider the following scenario.  I think it is reasonable:

 - Host A is issuing queued I/Os to drive 1 - maintaining 4 outstanding I/Os
 - Host B starts talking to drive 1, gets a Check Condition for Power-up Unit
   Attention.
 - The QErr bit causes host A's 4 I/Os to be aborted but he doesn't know it.
 *** 10 Second time-out  ***
 - Meanwhile Host B continues issuing I/Os to drive 1 - also maintaining 4
   outstanding I/Os
 - Host A finally times out - Issues an Abort Task Set (no effect on host B)
 - Host A attempts to re-issue I/Os which were timed out
 - Host A gets a Check Condition for Commands Cleared by Another Initiator
 - The QErr bit causes host B's 4 I/Os to be aborted but he doesn't know it.
 *** 10 Second time-out  ***
 - Host A continues his recovery and gets back to maintaining 4 outstanding
   I/Os
 - Host B finally times out - Issues an Abort Task Set (no effect on host A)
......

Get the picture?  I DON'T want all of those TIME-OUTS just because another 
initiator started talking to the same device.
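
The ping-pong in the scenario above can be reproduced with a toy model. The
sketch below is an assumption-laden simplification: the simulate function, its
qerr_scope values and the fixed queue depth of 4 are all hypothetical, and it
only counts silent time-outs rather than modeling the 10 second delay.

```python
def simulate(qerr_scope, rounds=6):
    """Count silent time-outs for two hosts sharing one drive.

    qerr_scope is a hypothetical switch: "all" models the current QErr=1
    behavior (abort every initiator's tasks), "initiator" models the
    proposed behavior (abort only the faulting initiator's tasks).
    """
    queued = {"A": 4, "B": 0}              # outstanding I/Os per host
    pending_ua = {"A": False, "B": True}   # B has not yet seen the power-up UA
    timeouts = 0
    host, other = "B", "A"                 # host B is about to start talking
    for _ in range(rounds):
        if not pending_ua[host]:
            break                          # no CHECK CONDITION, so no QErr action
        pending_ua[host] = False           # host receives the CHECK CONDITION
        if qerr_scope == "all" and queued[other] > 0:
            queued[other] = 0              # other host's I/Os silently aborted
            pending_ua[other] = True       # "commands cleared by another initiator"
            timeouts += 1                  # other host finds out only by timing out
        queued[host] = 4                   # faulting host recovers, refills its queue
        host, other = other, host          # the timed-out host retries next
    return timeouts


current = simulate("all")         # one time-out per round, never settles
proposed = simulate("initiator")  # host A is never disturbed
```

With the scope limited to the faulting initiator the loop ends after host B's
first command, while the "all" scope produces a time-out in every round for as
long as both hosts keep their queues full.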

No, we have not seen this problem in our products, but that is because we 
don't use QErr with the drives.  Just for grins I did do a little test with 
two hosts and one drive.  Yes, the problem I described above can occur with 
today's SCSI drives (my simple test didn't have any error recovery).

>
>(4) This proposal also affects Charles Monia's proposal to add an "ACA
>like" behavior for RESERVATION CONFLICT, QUEUE FULL, etc. Perhaps
>Ralph should support Charles' proposal and use that technique instead of
>changing an older feature.

I don't see the connection between the scope of the QErr bit and Charles 
Monia's proposal.

>
>Just for the record, I don't like this change. Ralph is messing with a feature
>whose behavior has been well defined for many years and is liked by a
>lot of customers JUST THE WAY IT IS.

But are those customers using QErr in a multi-initiator environment????  If 
so please help me understand how it works.

>*
>* For SCSI Reflector information, send a message with
>* 'info scsi' (no quotes) in the message body to majordomo at symbios.com
>



