SCSI, Networks & Sense Data

Ken Hallam 0003450626 at mcimail.com
Fri Mar 17 14:58:00 PST 1995


-- [ From: Kenneth J. Hallam * EMC.Ver #2.0 ] --

Kurt,

> | > More specifically, what would happen if an error occurred on a command
> | > sent with NACA=1, and a parity error occurred on DATA IN during the
> | > subsequent Request Sense data transfer?  Would the sense data be lost
> | > permanently regardless of whether ACA is in effect?
> | 
> | Although I thought the subject was SCSI-2 parity errors and ACA is unique
> | to SCSI-3, it would seem the Target has responsibility to preserve sense
> | data until it has been SUCCESSFULLY transferred to the Initiator. If the
> | Target got the 05h message from the host after the Request Sense DATA IN
> | phase, it should assume it was not a successful transfer and retain the
> | sense data.
> | 
> | Now how many implementations actually do all of the above is a good
> | question.
> 
> Hi Ken,
> 
> I was hoping somebody would address this issue, since Dal and the rest
> of the FC-AL working group have been struggling with it for some time.
> In a way, this issue strikes at probably the core difference between
> channel-attached storage and network-attached storage.  
> 
> I agree that one solution is to require that Target verify transfer of
> sense data before clearing it.  However, SAM does not explicitly
> require the Target to detect whether or not the Initiator has
> successfully received sense data before the Target is allowed to clear it:
> 
>   "Sense data shall be preserved by the logical unit for the initiator
>    until it is TRANSFERRED by one of the methods listed below or until
>    another task from that initiator is entered into the task set."
>
> I believe the term "transferred" is derived from the channel-based
> philosophy that 'anything you send makes it to the other end of the
> cable'.
>

Although SAM does not require confirmation of Sense data delivery, I was
only suggesting it might be a form of "proper etiquette" for a Target to do
this. I agree that a change to SAM in this area will be a long and tough
battle. I also agree that SCSI makes a lousy network protocol. It was
designed and built by I/O channel bigots and will always be awkward in
serial, network-style implementations. However, it's here and we're stuck
with it.
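
As a rough illustration of that etiquette (not something SAM requires), the
sketch below models a logical unit that discards sense data only when the
Request Sense DATA IN phase was not followed by the 05h message. The class,
method names, and the NO_SENSE placeholder are hypothetical, and the other
SAM clearing rule (a new task from the same initiator) is deliberately not
modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Message code 05h, the "05h message" mentioned in the quoted text.
INITIATOR_DETECTED_ERROR = 0x05

# Placeholder reply when nothing is pending; not a real sense data layout.
NO_SENSE = bytes(18)


@dataclass
class LogicalUnit:
    """Hypothetical model: one pending sense buffer per initiator ID."""
    pending_sense: Dict[int, bytes] = field(default_factory=dict)

    def record_error(self, initiator: int, sense: bytes) -> None:
        # Called when a command from this initiator ends in error.
        self.pending_sense[initiator] = sense

    def request_sense(self, initiator: int,
                      message_after_data_in: Optional[int] = None) -> bytes:
        # Send whatever is pending for this initiator.
        sense = self.pending_sense.get(initiator, NO_SENSE)
        # Etiquette: discard only if the initiator did not flag the
        # DATA IN phase with the 05h message.
        if message_after_data_in != INITIATOR_DETECTED_ERROR:
            self.pending_sense.pop(initiator, None)
        return sense


lu = LogicalUnit()
lu.record_error(7, b"\x70\x00\x03")
first = lu.request_sense(7, INITIATOR_DETECTED_ERROR)  # transfer flagged bad
retry = lu.request_sense(7)                            # clean transfer
after = lu.request_sense(7)                            # nothing left pending
```

Here the retry returns the same bytes as the flagged transfer, and only the
clean transfer releases the buffer.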

Jeff Stai in his "SCSI-Encyclopedia" started the concept of SCSI Etiquette.
It's a way to advocate friendly protocol behavior that still falls within
the rules of SCSI. The problem, as you mention, is that we are all dragged
down by the lowest common denominator out there. Very few SCSI designs do
much of anything in terms of supplying the Initiator with useful Sense data,
and very few initiators do much with it even if they get it. So the
philosophy becomes "Why bother to do anything with an error? The host will
just issue an Abort anyhow."

Maybe by encouraging good SCSI etiquette by designers of future products we
can raise the LCD to a point where a useful exchange of error information
can occur. Yeah, Right.

The problem is that as we go farther and faster with the new serial
interface schemes, the possibility of transient errors becomes very real.
The channel guys are spoiled by the level of system integrity they have
enjoyed over the years. One parity error could be considered a direct
indication that the bus was sick. Check cables and terminators, as
something has gone seriously wrong. We never get parity errors on a
properly configured bus! Sure, as long as everyone was in the same room,
probably sharing the same ground reference and with good solid window
margins, that was a good bet. Not so with a campus-wide net of peripherals,
most of which have nothing in common electrically, not to mention the
questionable quality of the switches and butted-together cables users will
insist on using.

Preservation of Sense data is not the Link's problem. As you suggest, it
belongs in the Upper Layer Protocol.
But having said that, how do we get it accomplished? SAM is still based on
the channel-centric view of the universe and is not likely to change. Or is
it? Is there a change in philosophy regarding error recovery out there?
I doubt it, as the prevailing attitude is that of blunt force. Any error
indication means invoke the software driver, issue an abort, reset the
device and try again. An error means it is BROKEN, not a transient. This is
not only the prevailing attitude but, based on experience (with terrible
SCSI error recovery implementations), it is the correct one.

Best Regards,

Ken Hallam






