FCP and Target-Initiated Recovery Abort

Kurt Chan kc at core.rose.hp.com
Wed Jul 6 14:45:33 PDT 1994


From:   Kurt Chan
To:     FC, SCSI Reflectors
Subj:   FCP and Target-Initiated Recovery Abort
Date:   Wed Jul  6 13:25:20 PDT 1994

This is a request for FCP to reconsider its requirement for
Target-Initiated Recovery Abort.

---

Paraphrasing FCP Rev 008b:

| An Exchange is in an ambiguous state if:
| 
| - the FCP_Port has Sequence Initiative and there exists an unacknowledged
|   frame for the Sequence, or 
| 
| - the FCP_Port has transferred Sequence Initiative but the transfer has
|   not been confirmed
| 
| At a Target FCP_Port, if TARGET RESET or CLEAR TASK SET is received
| from an Initiator, any Exchanges outstanding with other Initiators are
| in an ambiguous state.

... FCP then proceeds to require "whichever port detects the ambiguous
state" to terminate the open Exchanges.

First off, bullet 1 assumes class 2.  In class 3, no frames are
"acknowledged" via FC ACKs.  If Read XFR_RDY is disabled in class 3,
no frames coming from the Target are even acknowledged at the *FCP*
level much less the FC-PH level.  This wording therefore is
inaccurate.

Secondly, I disagree with this definition of "ambiguous", or the need
to define ambiguity from the Target's perspective.  FCP Exchanges have
two and only two states from the Originator's perspective:  complete
or open (incomplete).  It is sufficient to consider ONLY the Originator's
perspective for the purposes of error recovery and data integrity.
The Target's idea of what it thinks are "ambiguous" Exchange states
can be considered advisory information to the entity responsible for
ensuring Exchange data integrity: the host.

In class 3, the Exchange is only complete when FCP_RSP has been
received for that Exchange.  If no FCP_RSP has been received, the
Exchange is open (incomplete).  It can't get any simpler.

In class 3, there are two states of incompletion at the Host:

Case 1:   CMD sent, no XFR_RDY or Read Data received within a timeout
          period.

Case 2:   CMD sent, at least one XFR_RDY or Read Data frame received
          but no FCP_RSP received within a timeout period.
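
A minimal sketch of this Host-side view, in Python.  The class,
field, and function names are hypothetical and come from no FCP
document; the check is assumed to run when the timeout expires.

```python
from dataclasses import dataclass
from enum import Enum


class HostView(Enum):
    COMPLETE = "complete"        # FCP_RSP received
    OPEN_CASE_1 = "open/case 1"  # nothing received from the Target
    OPEN_CASE_2 = "open/case 2"  # XFR_RDY or Read Data seen, no FCP_RSP


@dataclass
class Exchange:
    ox_id: int                         # illustrative Exchange identifier
    saw_xfr_rdy_or_data: bool = False  # any XFR_RDY or Read Data frame
    saw_fcp_rsp: bool = False          # FCP_RSP received


def classify(x: Exchange) -> HostView:
    """Originator's view of an Exchange: complete, or open (Case 1 or 2)."""
    if x.saw_fcp_rsp:
        return HostView.COMPLETE
    if x.saw_xfr_rdy_or_data:
        return HostView.OPEN_CASE_2
    return HostView.OPEN_CASE_1
```

Nothing in the test depends on what the Target believes about the
Exchange; only frames actually received at the Host are consulted.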

In Case 1, the Target may not have ever received the CMD.  However, if
it receives a Recovery Abort for a command that it never received, no
harm should come.  If an Originator REALLY needs to differentiate
between Case 1 and Case 2, then it has several tools:

- ABTS without L_S 
- RES/RSS
- ULP tools

All of these allow the Host to discover and synchronize to the
Target's state before actually performing Recovery Abort.  Without
such synchronization, there is no guarantee that the Target did not
already execute the command and simply failed to report the result.

One example that comes to mind is tape drives which accept relative
motion commands.  If a tape driver did not receive a good FCP_RSP and
needs to know whether or not a MOVE command was executed, it must READ
POSITION before performing the next Write.  Note the following
differences:

- A good FCP_RSP received *is* a reliable indication that the command
  *was* performed.

- A missing FCP_RSP (or any other intermediate ACK/interlock) is NOT
  necessarily a reliable indication that the command was NOT performed

Now consider TARGET RESET and CLEAR TASK SET, both of which clear CMDs
for other Initiators.  If other Initiators have open (incomplete)
commands for a Target that has received either of these, then

1. The Target can generate a Unit Attention Condition to those
   Initiators which had open commands, and

2. The Initiators of those open commands can perform Recovery Abort
   for each of the open commands.

There is NO FUNCTIONAL difference between the Target performing these
aborts or the Initiators performing these aborts.  Practically
speaking, however, it allows implementation of FC Error Recovery to be
centralized in the Initiator, where many prefer it reside.
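
A sketch of the Initiator-centered alternative, under stated
assumptions: the Initiator class and its methods are hypothetical,
how the Unit Attention reaches the Initiator is left out, and the
actual Recovery Abort is reduced to a placeholder that records the
Exchange identifier.

```python
class Initiator:
    """Hypothetical Initiator-side bookkeeping for one Target."""

    def __init__(self):
        self.open = {}      # Exchange id -> command, no FCP_RSP yet
        self.aborted = []   # Exchange ids handed to Recovery Abort

    def send(self, ox_id, cmd):
        self.open[ox_id] = cmd

    def on_fcp_rsp(self, ox_id):
        self.open.pop(ox_id, None)      # Exchange is now complete

    def on_unit_attention(self):
        # The Target discarded our tasks; clean up every open Exchange
        # from this side instead of waiting for the Target to do it.
        for ox_id in sorted(self.open):
            self.aborted.append(ox_id)  # placeholder for Recovery Abort
        self.open.clear()
```

The Initiator's own table of open Exchanges is the only input, so
the cleanup does not rely on the Target's record of outstanding
tasks.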

Regarding responses to FCP Comments:

| A)	Use of Error Abort	(Jim Coomes, Seagate, Jim_Coomes at notes.seagate.com)
| 
| Comment:
| 
| 	We would like to eliminate requirements for target to send ABTS
| 	to host after:
| 
| 	    	FCP Logout
| 	    	Target reset
| 	    	Clear Task Set
| 		Abort Task Set
| 
| 	Here are the reasons:
| 
|         For FCP logout, explicitly aborting every task from that host
|         is not needed since the host knows that it should not expect
|         any further interaction with this target.
| 
|         For the other three, the current FCP requirement is contrary
|         to SCSI.  SCSI and SAM require targets to simply discard the
|         tasks, creating a unit attention condition for each affected
|         host (or all hosts, with Target Reset).
| 
|         The requirement to use ABTS in these situations adds
|         substantial complexity to both the target and the host, so
|         should be eliminated.
| 
| Proposed Response:
| 
|         A clean layering of the architecture is desirable to allow the
|         transmission of TCP/IP, FCP, and other Fibre Channel FC-4's
|         through the same FC host adapter at the same time.

"Clean layering of the architecture" is a red herring.  I claim that
requiring Targets to perform Recovery Abort is UNclean.  FC Error
recovery is not something that host SW will typically want to entrust
to Targets.  Spontaneous BUS FREE on catastrophic failure is one
thing, but a surgical ABTS from a Target is a worrisome intrusion to
host SW rather than an assist.  This assumes the Target's database and
knowledge of outstanding Tasks is more reliable or takes precedence
over the Host's - not a good assumption.

|         It is true that the Target Reset, Clear Task Set, and Abort
|         Task Set cause the SCSI logical tasks to be discarded.
|         However, these functions are interpreted by the Task Manager,
|         a SCSI layer, and not by the Fibre Channel port.  The
|         mechanism that the FCP uses to clean up the Fibre Channel port
|         resources whose state is possibly unknown is the Recovery
|         Abort function.

Once again, there are no "unknown" task states from an Initiator's
perspective, only from a Target's perspective (which I claim we can
and want to discount in order to Keep It Simple).  

|         Without this function, the layering gets quite muddled and it
|         is likely that interactions between Fibre Channel error
|         recovery and SCSI task management functions could leave the
|         system in states that are not defined by the standards.

I dispute this.  Tasks are either complete or open (incomplete) from a
Host perspective.  Powering off a device, performing an FCP logout, or
issuing a CLEAR TASK SET or TARGET RESET leaves tasks belonging to
other Initiators incomplete (perhaps permanently so), but not
necessarily unknown or ambiguous, since Initiators have tools to
discover the state of the Target.

| 	I would suggest we leave FCP Rev 8b unchanged.

I would suggest that we reword FCP as follows to accommodate EITHER
method:

   TARGET RESET, when set to one, performs a reset of the SCSI device
   as defined in SAM.  TARGET RESET resets all tasks for all
   Initiators and resets all internal states of the Target to their
   initial power on and default values as established by PRLI.  A Unit
   Attention condition is created for all Initiators.

   The TARGET RESET is transmitted by the Initiator (Exchange
   Originator) using a new Exchange.  All Open Exchanges with that
   Target shall be aborted using Recovery Abort procedures before any
   new Exchanges are originated.


   CLEAR TASK SET causes all tasks from all Initiators in the
   specified task set to be aborted as defined in SAM.  If there are
   Initiators other than the Initiator which issued the CLEAR TASK SET
   with tasks in the task set, a Unit Attention condition is created
   for those Initiators.

   The CLEAR TASK SET is transmitted by the Initiator (Exchange
   Originator) using a new Exchange.  All Open Exchanges with the
   Target shall be aborted using Recovery Abort procedures before any
   new Exchanges are originated.

     ... where "Open Exchange" is defined as an Exchange for which
     no FCP_RSP has been received by the Exchange Originator.
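
The ordering rule in the proposed wording can be sketched as a simple
gate at the Initiator that issued the task management function.
TargetSession and its methods are hypothetical names.  One assumption
beyond the text: the Exchange carrying TARGET RESET or CLEAR TASK SET
is treated like any other, so it stays open until its own FCP_RSP
arrives.

```python
class TargetSession:
    """Hypothetical per-Target state at the issuing Initiator."""

    def __init__(self):
        self.open = set()        # Exchanges with no FCP_RSP received
        self.must_clean = False  # set once a reset/clear has been sent
        self._next = 0

    def _new_exchange(self):
        ox_id = self._next
        self._next += 1
        self.open.add(ox_id)
        return ox_id

    def originate(self):
        if self.must_clean:
            raise RuntimeError("abort Open Exchanges before originating")
        return self._new_exchange()

    def complete(self, ox_id):
        self.open.discard(ox_id)          # FCP_RSP received

    def send_task_management(self):
        ox_id = self._new_exchange()      # sent using a new Exchange
        self.must_clean = True
        return ox_id

    def recovery_abort_all(self):
        aborted = sorted(self.open)       # placeholder for Recovery Abort
        self.open.clear()
        self.must_clean = False
        return aborted
```

Whether the Target also aborts some of these Exchanges makes no
difference to the gate; it opens once the Initiator's own list of
Open Exchanges is empty.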

Note that a Target can still issue Recovery Abort under these rules
for those Exchanges for which it has not transmitted FCP_RSP.



