FCP-2: Lost FCP_CMND, Unacknowledged classes.
santoshr at cup.hp.com
Tue Jun 25 17:48:20 PDT 2002
* From the T10 Reflector (t10 at t10.org), posted by:
* Santosh Rao <santoshr at cup.hp.com>
We have 3 issues regarding Annex C, Fig. C.2 and the text in Section 8.2.
Issue 1:
Section 8.2 states :
"If the destination FCP_Port of the REC request determines that the
originator S_ID, OX_ID, RX_ID or task retry id are inconsistent, it
shall respond with a FCP_RJT with a rsn_code of "unable to perform
command request" and rsn_expln of "invalid OXID-RXID combination".
Annex C Fig C.2 states :
"The LS_RJT (Logical Error, Invalid OXID-RXID combination) for the REC
indicates that the exchange is unknown."
The 2 quoted sections above are inconsistent about the reason code to be
used in the FCP_RJT. From the target's perspective, when it receives a
REC with an OXID-RXID combination for which it has no exchange state,
both of the above sections of FCP-2 apply.
Which reason code should be returned in this case ?
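Until the two passages are reconciled, an initiator may want to treat either
reason code, paired with the same explanation, as "exchange unknown". A
minimal sketch of such a check follows. The function name and the use of
plain strings (instead of the numeric codes carried on the wire) are
assumptions made for illustration only; they are not taken from FCP-2.

```python
# Hypothetical helper: names and string matching are illustrative assumptions.
UNKNOWN_EXCHANGE_REASONS = {
    "logical error",                      # wording used in Annex C Fig C.2
    "unable to perform command request",  # wording used in Section 8.2
}
UNKNOWN_EXCHANGE_EXPLANATION = "invalid oxid-rxid combination"


def is_unknown_exchange(rsn_code, rsn_expln):
    """Return True if a reject to a REC means the target has no exchange state.

    Both reason codes quoted from FCP-2 are accepted, because the two
    passages disagree on which one the target sends.
    """
    return (
        rsn_code.strip().lower() in UNKNOWN_EXCHANGE_REASONS
        and rsn_expln.strip().lower() == UNKNOWN_EXCHANGE_EXPLANATION
    )
```

This only papers over the inconsistency on the initiator side; it does not
answer which code the target is supposed to return.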
Issue 2:
How should the initiator differentiate between an FCP_RJT from a target due
to a lost FCP_CMND (the scenario described in Annex C Fig C.2) and the
case where the target has discarded exchange state due to the expiration
of RR_TOV after sending FCP_RSP ?
In both of the above cases, our interpretation of FCP-2 is that the
initiator will see an FCP_RJT response to the REC with :
    rsn_code  = "Logical Error" or "Unable to perform command request"
    rsn_expln = "Invalid OXID-RXID combination"
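To make the ambiguity concrete, here is a small sketch of a target that keeps
only a set of known exchanges. The Target class, its method names, and the
OX_ID value are hypothetical; the model also assumes the target picks the
Section 8.2 reason code. It is meant only to show that the two histories
produce the same reply to the REC.

```python
from dataclasses import dataclass, field

# Reply assumed for an unknown exchange (Section 8.2 wording).
REJECT = ("Unable to perform command request", "Invalid OXID-RXID combination")


@dataclass
class Target:
    """Hypothetical target model: tracks only which exchanges it still knows."""
    exchanges: set = field(default_factory=set)

    def receive_cmnd(self, ox_id):
        # FCP_CMND arrived: exchange state is created.
        self.exchanges.add(ox_id)

    def rr_tov_expired(self, ox_id):
        # FCP_RSP was sent and RR_TOV ran out: exchange state is discarded.
        self.exchanges.discard(ox_id)

    def receive_rec(self, ox_id):
        # Known exchange: accept. Unknown exchange: reject.
        if ox_id in self.exchanges:
            return ("ACC", None)
        return REJECT


# Case A: FCP_CMND was lost, so the target never saw the exchange.
target_a = Target()
reply_lost_cmnd = target_a.receive_rec(0x10)

# Case B: command executed, FCP_RSP lost, state discarded after RR_TOV.
target_b = Target()
target_b.receive_cmnd(0x10)
target_b.rr_tov_expired(0x10)
reply_state_discarded = target_b.receive_rec(0x10)

# The initiator sees identical replies in both cases.
print(reply_lost_cmnd == reply_state_discarded)
```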
In this case, the initiator cannot apply the same error recovery for the
2 cases. In the lost FCP_CMND case, the initiator may safely re-issue
the command. The latter case could occur in the following manner :
- Initiator issues a command which does not involve data xfer.
- Target sends FCP_RSP, FCP_RSP is lost.
- Initiator's REC_TOV timer expires and the initiator sends REC.
- REC times out after RA_TOV(els) (which is > RR_TOV, for fabric)
- Initiator aborts REC and issues another REC
- Target sends FCP_RJT response since it has discarded the exchange
In the above case, the initiator MUST NOT re-issue the FCP_CMND, since
this can potentially cause data corruption with tape devices (e.g.,
re-issuing a SCSI command like SPACE or WRITE FILEMARKS after it has
already been executed successfully can corrupt the tape data).
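The sequence in the list above can be put on a timeline. The sketch below
assumes that the first REC (or its reply) is lost, that the target starts
RR_TOV when it sends FCP_RSP, and that all times are measured in seconds from
the moment the initiator sent FCP_CMND. The function name and the timer
values in the usage note are made up for illustration and are not defaults
from any standard.

```python
def rec_retry_outcome(rec_tov, ra_tov_els, rr_tov, rsp_sent_at):
    """Return (retry_rec_at, state_discarded_at, target_reply).

    rec_tov      -- delay before the initiator sends the first REC
    ra_tov_els   -- time the initiator waits for a REC reply before retrying
    rr_tov       -- time the target keeps exchange state after FCP_RSP
    rsp_sent_at  -- time at which the target sent the (lost) FCP_RSP
    """
    first_rec_at = rec_tov                    # REC_TOV expires, first REC sent
    retry_rec_at = first_rec_at + ra_tov_els  # first REC timed out, retry sent
    state_discarded_at = rsp_sent_at + rr_tov
    # The retry is accepted only if the target still holds the exchange.
    reply = "ACC" if retry_rec_at < state_discarded_at else "RJT"
    return retry_rec_at, state_discarded_at, reply
```

With illustrative values such as rec_tov=3, ra_tov_els=10, rr_tov=5 and
rsp_sent_at=1, the retry goes out at 13 while the exchange state was dropped
at 6, so the target rejects it. Whenever RA_TOV(els) exceeds RR_TOV and the
FCP_RSP was sent before the first REC, the result is the same.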
Can someone clarify how FCP-2 differentiates these 2 cases ? Without
the ability to differentiate between these 2 cases, the use of SLER in a
lost FCP_CMND scenario can result in potential data corruption with tape
Issue 3:
Section 12.5.2 states that if a REC response is not received within
RA_TOV(els), the initiator shall abort the REC and send another REC in a
Since the initiator detects the REC timeout only after RA_TOV (or 2 *
RA_TOV, as per proposed change in FCP-3) and this time value is larger
than RR_TOV, the target would have discarded exchange information after
Hence, what is the point in retrying the REC ? It only exposes the
initiator to the issue described under "Issue 2".
Any clarifications would be appreciated.