Deferred error handling with ACA

Jeff Williams jlw at hpbs1506.boi.hp.com
Wed Sep 28 07:46:36 PDT 1994




Gerry wrote:

> I'm not so concerned about when the "deferred error" is reported, I am
> concerned about what the drive should do between the time the deferred
> error occurs and the time it is able to report it. For example, an
> unrecovered write error with write caching enabled. My programmers and
> customers have a strong
> preference to not begin any new commands (excepting other write operations 
> that were merged with the command with the failure) until the deferred 
> error is reported. Does Bob's proposal suggest that previously queued 
> commands from other initiators can/should be executed during this time? 

There are two issues.  One is what you do from the time you detect the
ACA-causing event until it is successfully reported; the other is what you
do after it is reported until it is cleared.

In the first case, there is nothing that the model can state since, by the
definitions within the queuing model, any commands which have been received
into the task set and have been "activated" could complete or take other
actions at any time.  Therefore, you have no clue what your command states
are when the ACA-causing event occurs.  For example, if I have received 2
simple writes and all of their write data, and an error is reported on
write 2, write 1 may or may not have completed, and you cannot inquire as to
its completion state.  Because of this, there can be no meaningful
requirements on the state of the queue when an ACA occurs.  In fact, the
only information you can count on is that any ordered command received
after the command in error will not be executing, since the requirement for
"de-queuing" is that the previous one complete, i.e. send the CHECK which
creates the ACA and freezes the queue.
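
As a rough illustration of the point about ordered commands, here is a
minimal Python sketch.  It is not taken from the SCSI-3 draft: the Task
class, the may_start() helper and the two-attribute model (SIMPLE and
ORDERED only) are assumptions made purely for this example.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    attribute: str       # "SIMPLE" or "ORDERED" (hypothetical toy model)
    ended: bool = False   # True once status has been returned


def may_start(task_set, index):
    """Return True if the task at `index` is allowed to begin executing."""
    task = task_set[index]
    older = task_set[:index]
    if task.attribute == "ORDERED":
        # An ordered task waits until every older task has ended.
        return all(t.ended for t in older)
    # A simple task is held back only by older ordered tasks.
    return all(t.ended for t in older if t.attribute == "ORDERED")


# Two simple writes followed by an ordered command, none ended yet.
task_set = [
    Task("write 1", "SIMPLE"),
    Task("write 2", "SIMPLE"),
    Task("ordered cmd", "ORDERED"),
]
```

In this sketch both writes may be active at once, so nothing can be said
about write 1 when write 2 fails, while the ordered command cannot start
until write 2 has ended (which, for the failing command, means the CHECK
has been sent).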

The SCSI-3 queuing model is explicit on what can and cannot occur during
the second case, i.e. after the ACA is reported and until it is cleared.


> Should new commands from other initiators be queued or even executed 
> rather than completed with ACA Active status? We would rather treat 
> commands for other (non-failing) initiators as if the ACA begins as soon 
> as the error occurred, not when it was reported to the failing initiator 
> (which could be a long time span). I think we need to add some wording to 
> specify how to handle this situation. Most customers like the concept of 
> "stop everything else until the error has been reported", and existing 
> ACA wording doesn't recommend this or (in some people's opinion) may not 
> even allow this.

The ACA does not exist until the CHECK is returned.  This is the same
definition as SCSI-2 CA.  In SCSI-2, if you send BUSY status to non-error
initiators during CA, you should not send it until the CA is created by
the sending of the CHECK.  Now, what is the harm in sending the BUSY
or the ACA ACTIVE early?  Probably nothing, except that the queuing model
does not allow it.
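
To make the timing concrete, here is a small Python sketch of the rule that
the ACA begins only when the CHECK is returned.  It is a toy model under
stated assumptions, not the queuing model itself: ToyLogicalUnit and its
methods are hypothetical names, and treating every new command during the
ACA as ending with ACA ACTIVE is a simplification of the real rules.

```python
from enum import Enum


class Status(Enum):
    GOOD = "GOOD"
    CHECK_CONDITION = "CHECK CONDITION"
    ACA_ACTIVE = "ACA ACTIVE"


class ToyLogicalUnit:
    """Toy model: the ACA exists only once CHECK CONDITION is returned."""

    def __init__(self):
        self.pending_error_for = None  # initiator owed a deferred error
        self.aca_for = None            # faulted initiator while the ACA exists

    def detect_deferred_error(self, initiator):
        # The error is known internally, but no ACA exists yet.
        self.pending_error_for = initiator

    def submit(self, initiator):
        """Return the status for a new command from `initiator`."""
        if self.aca_for is not None:
            # Simplification: every new command ends with ACA ACTIVE.
            return Status.ACA_ACTIVE
        if initiator == self.pending_error_for:
            # Returning the CHECK is the event that creates the ACA.
            self.pending_error_for = None
            self.aca_for = initiator
            return Status.CHECK_CONDITION
        return Status.GOOD

    def clear_aca(self):
        self.aca_for = None


lu = ToyLogicalUnit()
lu.detect_deferred_error("A")
before_report = lu.submit("B")   # error detected but not yet reported
report = lu.submit("A")          # the CHECK goes back to the failing initiator
after_report = lu.submit("B")    # the ACA now exists
```

Under these assumptions, initiator B is serviced normally in the window
between detection and reporting, which is the behavior Gerry's customers
would rather avoid.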


Regards,
Jeff
----------------------------------------------------------------------
Jeffrey L. Williams                                HP TelNet: 396-5030
Disk Controllers Lab                     Telephone: (1) (208) 396-5030
Disk Memory Division                     Facsimile: (1) (208) 396-6858
Hewlett-Packard Co.                    ARPANET: jlw at hpdmd48.boi.hp.com
----------------------------------------------------------------------



