FCP Rev 9 and Target-Initiated ABTS

Kurt Chan kc at core.rose.hp.com
Thu Jul 28 11:45:07 PDT 1994


From:  Kurt Chan
To:    FC/SCSI Reflectors
Subj:  Target-Initiated ABTS and FCP
Date:  7/28/94

After discussing this subject with Bob Snively at the FCSI meeting
yesterday, Bob and I agree that some changes are needed to FCP rev 9
if Targets want to claim compliance to FCP and simultaneously avoid
initiating ABTS.

The rationale is that FCP currently allows Initiators to RELY on
Target-Initiated ABTS, and therefore Exchanges could inadvertently be
left Open when Clear Task Set or Target Reset is received by a
multiple-initiator Target. Inadvertent reuse of an open OX_ID
is the data integrity hazard to be avoided.

In studying this issue further, I think I also now understand some of
the difficulty in getting class 3 and class 1/2 implementors to read
the same meaning into the words in FCP.  I've attempted to clarify
the essential behaviors of both below. 

For those anxious for a bottom line, I believe my conclusions lead to
a happy ending for all:

- class 1/2 Responders indeed must be able to initiate ABTS when
  responding to Task Management Flags (although I believe that there
  is only ONE "ambiguous" Exchange state that the Responder is
  REQUIRED to resolve because the Originator cannot).

- class 3 Responders do not have to initiate ABTS since, by design, they
  do not have reliable knowledge of Exchange closure at the Originator.

PS (In the following discussion, Targets are Exchange Responders and
    SCSI Initiators are Exchange Originators)

My objective is to obtain consensus on the FCP wording changes below
among as many implementors as possible so as to effect a change to rev
9 before Aug 12. Please read and comment.


FCP OVERVIEW
------------
FCP Rev8b currently defines "Exchange ambiguity" from the Responder
perspective as follows (paraphrased):

> - If the Responder is in the process of transferring Sequence
>   Initiative back to the Originator but has not yet been able to
>   confirm the transfer, then the Exchange associated with that
>   Sequence is in an "ambiguous" state.
> 
> - If a Responder Reset or Clear Task Set has been received from an
>   Originator, then all Exchanges with Originators other than the one
>   that sent the Task Management flag are in an "ambiguous" state.

My personal feeling is that the state of Sequence Initiative within
the Exchange has nothing to do with the Exchange STATE.  I believe the
STATE of an Exchange is either OPEN or CLOSED, and those states can be
unambiguously defined from either the Originator or the Responder
context based on FC-PH rules.

I believe the current FCP document, when it refers to Exchange state, is
actually referencing SEQUENCE state (i.e., whether or not the Sequence is
deliverable or whether Initiative has been successfully transferred).
However, I claim that the state of SEQUENCES within the Exchange is
uninteresting when the unit of recovery is the EXCHANGE, as it is with
ABTS-LS.  Attempting to use Sequence Initiative as the determining factor
for deciding who is obligated to originate ABTS is overly complex,
particularly when the ultimate goal is Exchange termination not Sequence
termination.

If the unit of recovery were the Sequence, then the status of
intermediate Sequences would be important.  However, when dealing with
EXCHANGE recovery this concern degrades to simply worrying about the
state of the FINAL Sequence of the Exchange:  FCP_RSP.


CLOSED vs OPEN EXCHANGE, CLASS 3 
--------------------------------
Consider the following definition of OPEN and CLOSED Exchanges from the
Originator perspective in class 3:

- An OPEN Exchange is one for which the Originator has transmitted an
  FCP_CMD but has not received an FCP_RSP for that Exchange.

- A CLOSED Exchange is any other Exchange:
  a) FCP_CMD has not yet been transmitted by the Originator 
  b) FCP_RSP has already been received for a transmitted FCP_CMD
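
As an illustration only (not part of FCP), the Originator-side
definition above can be written as a small function.  The names
ExchangeState and originator_state_class3 are invented for this sketch.

```python
from enum import Enum


class ExchangeState(Enum):
    OPEN = "open"
    CLOSED = "closed"


def originator_state_class3(cmd_sent: bool, rsp_received: bool) -> ExchangeState:
    """Class 3 Exchange state as seen by the Originator (illustrative only)."""
    # OPEN: FCP_CMD transmitted, FCP_RSP not yet received.
    if cmd_sent and not rsp_received:
        return ExchangeState.OPEN
    # CLOSED: every other case, i.e. (a) FCP_CMD not yet transmitted,
    # or (b) FCP_RSP already received for a transmitted FCP_CMD.
    return ExchangeState.CLOSED
```

Note that the function needs only facts the Originator itself observes.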

Exchange State can only be reliably defined by the ORIGINATOR of the
Exchange.  The Responder can never reliably determine whether or not
an Exchange is Closed at the Originator, but the Originator can ALWAYS
reliably determine whether or not an Exchange is Closed at the
Responder.  It is this advantage that the Originator holds over the
Responder that requires the Originator to be responsible for recovery
operations, not the Responder.

The reason it is difficult for the RESPONDER to define Exchange state
is due to the definition of "Closed" at the Responder.  The Responder
considers the Exchange CLOSED in class 3 when it transmits the FCP_RSP.
However, since the FCP_RSP may never reach the Originator, not all
Exchanges which are Closed at the Responder can also be considered
Closed at the Originator.

THIS IS THE PRIMARY DIFFERENCE - in class 3 the Originator knows for
certain if an Exchange is truly Closed at the Responder by virtue of
receiving FCP_RSP, but the only time a Class 3 Responder knows that an
Exchange has been Closed at the Originator is when it sees the
Originator reuse the OX_ID!  Note that this means Class 3 Exchange
Responders must clear all Exchange resources as FCP_RSP is
*transmitted*.

The other boundary condition occurs when the Originator and Responder
disagree about whether or not an Exchange has been OPENed.  However,
this is not nearly as dangerous.  The worst case that can happen is
that the Originator believes it has opened an Exchange when it really
has not (since the FCP_CMD was lost).  In this case, the Originator
performs a Recovery Abort to a Responder which has no knowledge of the
Exchange being aborted, which is harmless.
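
The two class 3 boundary conditions (lost FCP_RSP and lost FCP_CMD) can
be modeled with a toy function.  This is a sketch under simplifying
assumptions: class3_views and its flags are hypothetical names, and
each flag simply records what happened on the wire.

```python
def class3_views(cmd_delivered: bool, rsp_sent: bool, rsp_delivered: bool) -> dict:
    """Each side's view of one class 3 Exchange after FCP_CMD was transmitted.

    Hypothetical model: the result says whether each side still
    considers the Exchange Open.
    """
    # A Responder can only send FCP_RSP for a command it received, and a
    # frame can only be delivered if it was sent.
    rsp_sent = rsp_sent and cmd_delivered
    rsp_delivered = rsp_delivered and rsp_sent
    return {
        # Originator: Open from FCP_CMD transmission until FCP_RSP reception.
        "originator_open": not rsp_delivered,
        # Responder: Open from FCP_CMD reception until FCP_RSP transmission.
        "responder_open": cmd_delivered and not rsp_sent,
    }


# Lost FCP_RSP: Closed at the Responder, still Open at the Originator.
lost_rsp = class3_views(cmd_delivered=True, rsp_sent=True, rsp_delivered=False)
# Lost FCP_CMD: the Responder never knew about the Exchange.
lost_cmd = class3_views(cmd_delivered=False, rsp_sent=False, rsp_delivered=False)
```

In this model, whenever the Originator sees the Exchange as Closed the
Responder does too, so an abort list built by the Originator always
covers every Exchange the Responder might still hold Open.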

In summary, since class 3 Responders are always ignorant of Exchange
Completion state at the Originator, I propose that only the Originator
can reliably perform Recovery Abort on Open Exchanges, since only it
knows which Exchanges are not yet Closed.


CLOSED vs OPEN EXCHANGE, CLASS 1/2
----------------------------------
In classes 1 or 2, an Exchange is Closed at the Responder when it
receives the ACK to FCP_RSP. An Exchange is Open at the Responder
from the time it receives an FCP_CMD until it receives the ACK to
FCP_RSP.

Like class 3, the Originator considers an Exchange Open as soon as it
transmits FCP_CMD.  It considers an Exchange Closed when it either
receives the FCP_RSP or transmits the ACK to the FCP_RSP (depending on
whether it's from the vantage point of the FC-2 or ULP within the
Originator).

Therefore the roles are now reversed!  Only the RESPONDER can reliably
measure Exchange completion in Classes 1/2.  The Originator may close
the Exchange at its end when it transmits the ACK to FCP_RSP, but if
the Responder never receives the ACK then

a) the Exchange could still be open at the Originator if the FCP_RSP
   was lost, or
b) the Exchange might be closed if it was the ACK that was lost.
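
Cases a) and b) can be illustrated with a toy model.  The function
class12_outcome and its field names are hypothetical; the sketch only
shows that a missing ACK looks the same to the Responder in both cases.

```python
def class12_outcome(rsp_delivered: bool, ack_delivered: bool) -> dict:
    """Outcome of the final FCP_RSP/ACK handshake in class 1/2 (toy model)."""
    # The Originator only ACKs an FCP_RSP that it actually received.
    ack_delivered = ack_delivered and rsp_delivered
    return {
        # What the Responder can observe.
        "responder_sees_ack": ack_delivered,
        # Originator: Open until FCP_RSP is received.
        "originator_open": not rsp_delivered,
    }


lost_rsp = class12_outcome(rsp_delivered=False, ack_delivered=False)  # case a)
lost_ack = class12_outcome(rsp_delivered=True, ack_delivered=False)   # case b)
```

Both results have responder_sees_ack set to False, while
originator_open differs between them.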

The danger is that if the Responder cannot initiate Recovery Abort,
then it may have Exchange resources dangling that will never get
cleaned up, and therefore may be INADVERTENTLY REUSED by the
Originator, and could alias into the old Exchange.

The good news is that there is only ONE condition where the Responder
is actually required to initiate Recovery Abort, and that is when

1) it has transmitted FCP_RSP, and
2) a Task Management flag is received which clears that task before the 
   ACK can be returned

For all other Exchange states or boundary conditions in class 1/2, the
Originator knows with certainty whether or not the Responder MAY have
an Open Exchange, and therefore the Originator can initiate Recovery
Abort reliably.
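
The single condition above can be stated as a predicate.  This is a
sketch, not FCP text; responder_must_abort and its parameter names are
assumptions made for the example.

```python
def responder_must_abort(service_class: int, rsp_sent: bool,
                         ack_received: bool, cleared_by_task_mgmt: bool) -> bool:
    """True only in the one case where the Responder must start Recovery Abort."""
    if service_class not in (1, 2):
        # Class 3 has no ACK, so this condition never arises there.
        return False
    # 1) FCP_RSP has been transmitted, and
    # 2) a Task Management flag cleared the task before the ACK arrived.
    return rsp_sent and not ack_received and cleared_by_task_mgmt
```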


PATHOLOGICAL BEHAVIOR
---------------------
A question arose during "Devil's Advocate" discussions with Bob
regarding what happens if a Host loses context of its Exchanges
(i.e., goes insane or powerfails).  The concern is that without
Responder-initiated Recovery Abort there are data integrity problems
at the FC-PH level.

My recommended behavior in the Loop profile will be as follows:

1) Wait a minimum of R_A_TOV

2) Initiate fabric discovery protocol (FLOGI)

3) Initiate N_Port physical discovery protocol 
    - OPN to AL_PA space on local loop topology
    - Server query for attached D_IDs on fabric topology

4) Initiate FC-2 discovery protocol (PLOGI)

5) Initiate FC-4 discovery protocol (PRLI)

This procedure has the effect of clearing all outstanding
Exchanges between N_Ports and clearing all pending frames 
for all Exchanges that may reside in the fabric without
having to perform explicit Recovery Aborts for each
potentially open Exchange.
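
The ordering of the five steps can be sketched as a driver routine.
Everything here is hypothetical: the step labels, the
recover_lost_context name, and the caller-supplied actions stand in
for whatever a real port driver would do at each step.

```python
import time
from typing import Callable, Dict, List

# Steps 2 to 5 in the order given above; the labels are for this sketch only.
DISCOVERY_ORDER = ["FLOGI", "N_Port discovery", "PLOGI", "PRLI"]


def recover_lost_context(
    r_a_tov_seconds: float,
    actions: Dict[str, Callable[[], None]],
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Run the recovery sequence and return the labels of the steps performed."""
    performed = []
    # Step 1: wait a minimum of R_A_TOV before doing anything else.
    sleep(r_a_tov_seconds)
    performed.append("wait R_A_TOV")
    # Steps 2-5: each action is a caller-supplied placeholder.
    for label in DISCOVERY_ORDER:
        actions[label]()
        performed.append(label)
    return performed
```

The sleep parameter is injectable so the ordering can be exercised
without actually waiting.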


SUGGESTED FCP WORDING
---------------------
Ref: Pages 27-8 of FCP Rev8b 

Current wording for TARGET RESET:

| TARGET RESET, when set to one, performs a reset to the SCSI device as
| defined in SAM.  TARGET RESET resets all tasks for all initiators and
| resets all internal states of the target to their initial power on
| and default values as established by PRLI.  A unit attention condition
| is created for all initiators.
| 
| The TARGET RESET is transmitted by the initiator (Exchange Originator)
| using a new Exchange.  The initiator and target clear all resources
| that can be cleared unambiguously.  Any open Exchanges that are in an
| ambiguous state shall be terminated by whichever port detects the
| ambiguous state using a Recovery Abort.  For a target or initiator
| FCP_Port, an Exchange is in an ambiguous state if the FCP_Port has
| sequence initiative and there exists an unacknowledged frame for the
| sequence or if the FCP_Port has transferred sequence initiative but
| the transfer of the initiative has not been confirmed.  For a target
| FCP_Port, an Exchange is also in an ambiguous state if the Exchange
| exists between the target FCP_Port and an initiator other than the
| initiator FCP_Port that performed the TARGET RESET.

A few comments:

1)  Ownership of Sequence Initiative on intermediate Sequences of an
    Exchange is irrelevant when the unit of recovery is the Exchange.

2)  The term "unacknowledged frame" assumes class 1/2, and does
    not address class 3.

3)  Intermediate unacknowledged frames are irrelevant when the unit of
    recovery is the Exchange.

One of the compelling reasons we decided to make the Exchange the unit
of recovery is to avoid dealing with Frames and Sequences where
possible.  I suggest we look only at the boundary conditions where
Exchanges are opened and closed, and put required behavior only in
those places.  Therefore I suggest the second paragraph above be
replaced with:

"The TARGET RESET is transmitted by the SCSI Initiator (Exchange
 Originator) using a new Exchange.  The SCSI Initiator and Target
 shall clear all resources that can be cleared unambiguously.  The
 Target shall perform Recovery Abort on Exchanges for which an FCP_RSP
 frame has been transmitted in classes 1 or 2, but no ACK has been
 received.  Upon discovery of a Unit Attention condition with a
 Target, SCSI Initiators shall perform Recovery Abort on all Exchanges
 with that Target for which FCP_CMD has been transmitted, but FCP_RSP
 has not been received."

Similarly, Clear Task Set should be reworded from:

| The CLEAR TASK SET is transmitted by the initiator (Exchange
| Originator) using a new Exchange.  The initiator and target clear all
| resources that can be cleared unambiguously.  Any open Exchanges that
| are in an ambiguous state shall be terminated by whichever port
| detects the ambiguous state using a Recovery Abort.  For a target or
| initiator FCP_Port, an Exchange is in an ambiguous state if the
| FCP_Port has sequence initiative and there exists an unacknowledged
| frame for the sequence or if the FCP_Port has transferred sequence
| initiative but the transfer of the initiative has not been confirmed.
| For a target FCP_Port, an Exchange is also in an ambiguous state if
| the Exchange exists between the target FCP_Port and an initiator
| other than the initiator FCP_Port that performed the CLEAR TASK SET.

to the identical wording as suggested above, except replacing TARGET
RESET with CLEAR TASK SET.

Also note that these words do not prohibit a Target from aborting more
Exchanges than just the ones for which an ACK to FCP_RSP is
outstanding - having the Target abort ALL Exchanges which are still
Open at the Target would be compliant (albeit redundant) behavior.

Implementing just the minimal requirement set also reduces the
likelihood of "ABTS collisions" (see FCSI SCSI Profile for discussion
of this subject).
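
The minimal requirement set in the proposed wording can be sketched as
two selection functions over per-Exchange records.  The Exchange
record and both function names are invented for this example; in
practice each port would hold only the fields for its own side.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Exchange:
    ox_id: int
    service_class: int          # 1, 2 or 3
    cmd_sent: bool = False      # Initiator view: FCP_CMD transmitted
    rsp_received: bool = False  # Initiator view: FCP_RSP received
    rsp_sent: bool = False      # Target view: FCP_RSP transmitted
    ack_received: bool = False  # Target view: ACK to FCP_RSP received


def target_abort_list(exchanges: List[Exchange]) -> List[int]:
    """OX_IDs the Target aborts: class 1/2, FCP_RSP sent, no ACK received."""
    return [x.ox_id for x in exchanges
            if x.service_class in (1, 2) and x.rsp_sent and not x.ack_received]


def initiator_abort_list(exchanges: List[Exchange]) -> List[int]:
    """OX_IDs the Initiator aborts on Unit Attention: FCP_CMD sent, no FCP_RSP."""
    return [x.ox_id for x in exchanges if x.cmd_sent and not x.rsp_received]
```

A Target that returned every Exchange still Open at its end would
select a superset of target_abort_list, which the proposed wording
also permits.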

---

Kurt Chan        
kc at core.rose.HP.com



