[Fwd: FCP-2: Lost FCP_CMND, Unacknowledged classes.]

Santosh Rao santoshr at cup.hp.com
Fri Jun 28 15:41:47 PDT 2002


* From the T10 Reflector (t10 at t10.org), posted by:
* Santosh Rao <santoshr at cup.hp.com>
*
Hello,

I did not see any response to this, so I am raising the issue
again. We would appreciate any clarification from the FCP-2 editors and
implementors. Can this be fixed in FCP-2?

Thanks,
Santosh


Santosh Rao wrote:
> 
> Hello,
> 
> We have 3 issues regarding Annex C Fig C.2 and the text in Section 8.2
> for REC.
> 
> Issue 1
> =======
> Section 8.2 states :
> 
> "If the destination FCP_Port of the REC request determines that the
> originator S_ID, OX_ID, RX_ID or task retry id are inconsistent, it
> shall respond with a FCP_RJT with a rsn_code of "unable to perform
> command request" and rsn_expln of "invalid OXID-RXID combination".
> 
> Annex C Fig C.2 states :
> 
> "The LS_RJT (Logical Error, Invalid OXID-RXID combination) for the REC
> indicates that the exchange is unknown."
> 
> The two sections quoted above disagree on the reason code to be used in
> the FCP_RJT. From the target's perspective, when it receives a REC with
> an OXID-RXID combination for which it has no exchange state, both of the
> above sections of FCP-2 apply.
> 
> Which reason code should be returned in this case?
> 
> Issue 2
> =======
> How should the initiator differentiate between an FCP_RJT from a target
> due to a lost FCP_CMND (the scenario described in Annex C Fig C.2) and
> the case where the target has discarded exchange state due to the
> expiration of RR_TOV after sending FCP_RSP?
> 
> In both of the above cases, our interpretation of FCP-2 is that the
> initiator will see an FCP_RJT response to the REC with:
> rsn_code = "Logical Error" or "Unable to perform command request"
> rsn_expln = "Invalid OXID-RXID combination"
> 
> However, the initiator cannot apply the same error recovery to both
> cases. In the lost FCP_CMND case, the initiator may safely re-issue
> the command. The discarded-state case could occur as follows:
> 
> - Initiator issues a command which does not involve data transfer.
> - Target sends FCP_RSP; the FCP_RSP is lost.
> - Initiator's REC_TOV timer expires and the initiator sends REC.
> - REC times out after RA_TOV(els) (which is > RR_TOV, for fabric).
> - Initiator aborts the REC and issues another REC.
> - Target sends an FCP_RJT response since it has discarded the exchange
> state.
> 
> In the above case, the initiator MUST NOT re-issue the FCP_CMND, since
> doing so can cause data corruption on tape devices (e.g., re-issuing a
> SCSI command such as SPACE or WRITE FILEMARKS after it has already
> executed successfully can corrupt the data on tape).
> 
> Can someone clarify how FCP-2 differentiates these two cases? Without
> the ability to differentiate between them, the use of SLER in a lost
> FCP_CMND scenario can result in data corruption on tape devices.
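
To make the ambiguity in Issue 2 concrete, here is a minimal Python sketch. All names in it (TargetCase, Reject, rec_reply, SAFE_RECOVERY) are hypothetical and only model the reject payload described above; none of them come from FCP-2 or from any driver. It assumes the target returns the "Logical Error" reason code from Fig C.2; the argument is the same with the Section 8.2 code, because the reply does not depend on which case the target is in.

```python
from dataclasses import dataclass
from enum import Enum, auto


class TargetCase(Enum):
    CMND_LOST = auto()        # FCP_CMND never reached the target
    STATE_DISCARDED = auto()  # command ran, FCP_RSP lost, state dropped after RR_TOV


@dataclass(frozen=True)
class Reject:
    rsn_code: str
    rsn_expln: str


def rec_reply(case: TargetCase) -> Reject:
    """Model the target's reply to a REC when it holds no exchange state."""
    # Assumption: Fig C.2 reason code. The reply is the same for every case.
    return Reject("Logical Error", "Invalid OXID-RXID combination")


# Recovery that would be safe if the initiator knew which case it was in.
SAFE_RECOVERY = {
    TargetCase.CMND_LOST: "re-issue FCP_CMND",
    TargetCase.STATE_DISCARDED: "do not re-issue FCP_CMND",
}

replies = {case: rec_reply(case) for case in TargetCase}
ambiguous = len(set(replies.values())) == 1          # one reply for both cases
needs_distinct = len(set(SAFE_RECOVERY.values())) > 1  # but two recoveries
print(ambiguous and needs_distinct)
```

The initiator sees only the Reject value, so it has nothing to key the choice of recovery on.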
> 
> Issue 3
> =======
> Section 12.5.2 states that if a REC response is not received within
> RA_TOV(els), the initiator shall abort the REC and send another REC in a
> new exchange.
> 
> Since the initiator detects the REC timeout only after RA_TOV (or 2 *
> RA_TOV, as per the proposed change in FCP-3), and this value is larger
> than RR_TOV, the target will already have discarded its exchange
> information by the time the second REC arrives.
> 
> Hence, what is the point of retrying the REC? It only exposes the
> initiator to the issue described under "Issue 2".
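
The timing argument in Issue 3 can be sketched as a small calculation. The function name and all numeric values below are hypothetical placeholders, chosen only so that RA_TOV(els) > RR_TOV as assumed above; they are not values taken from FCP-2. Transit delays are ignored, and time 0 is the moment the initiator sends FCP_CMND.

```python
def state_alive_at_second_rec(rsp_sent_ms, rec_tov_ms, ra_tov_els_ms, rr_tov_ms):
    """Return True if the target still holds exchange state when REC #2 is sent."""
    first_rec_ms = rec_tov_ms                     # REC_TOV expires, first REC sent
    second_rec_ms = first_rec_ms + ra_tov_els_ms  # first REC times out, retry sent
    discard_ms = rsp_sent_ms + rr_tov_ms          # state dropped RR_TOV after FCP_RSP
    return second_rec_ms < discard_ms


# Placeholder values only (milliseconds), with RA_TOV(els) > RR_TOV.
print(state_alive_at_second_rec(rsp_sent_ms=100, rec_tov_ms=3000,
                                ra_tov_els_ms=10000, rr_tov_ms=2000))
```

Under these assumptions the retried REC always reaches a target that has already dropped the exchange, which is the situation described in Issue 2.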
> 
> Any clarifications would be appreciated.


-- 
The world is so fast that there are days when the person who says 
it can't be done is interrupted by the person who is doing it.
	~ Anon