FCP-2 recovery problem

Zeitler, Carl Carl.Zeitler at COMPAQ.com
Fri Jun 23 08:45:26 PDT 2000


* From the T10 Reflector (t10 at t10.org), posted by:
* "Zeitler, Carl" <Carl.Zeitler at compaq.com>
*
Rob, all outstanding  Exchanges between an S_ID - D_ID pair must have
different OX_IDs.  If OX_ID = NNNN is outstanding to Device A in your
example, you cannot reuse that OX_ID for that device or any other device on
the same S_ID - D_ID pair until the Exchange is completed. 

Yes, you would have to keep track of the bit on a per OX_ID basis, just as
you have to keep track of the OX_IDs, to prevent their reuse.  
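
As a rough illustration of the bookkeeping described above, here is a minimal sketch of an initiator-side table for one S_ID - D_ID pair. The class name OxidTable, its methods, and the pool_size parameter are hypothetical and do not come from FCP-2 or FC-FS; the sketch only assumes that an OX_ID stays reserved until its Exchange completes and that the bit flips each time the same OX_ID is reused.

```python
class OxidTable:
    """Hypothetical per S_ID - D_ID table of outstanding OX_IDs and their bits."""

    def __init__(self, pool_size):
        self.pool_size = pool_size   # number of OX_ID values this sketch hands out
        self.outstanding = set()     # OX_IDs whose Exchange is still open
        self.last_bit = {}           # OX_ID -> bit used the last time it was opened

    def open_exchange(self):
        """Pick an OX_ID that is not outstanding and flip its remembered bit."""
        for ox_id in range(self.pool_size):
            if ox_id not in self.outstanding:
                bit = self.last_bit.get(ox_id, 1) ^ 1   # first use yields 0
                self.last_bit[ox_id] = bit
                self.outstanding.add(ox_id)
                return ox_id, bit
        raise RuntimeError("no free OX_ID for this S_ID - D_ID pair")

    def close_exchange(self, ox_id):
        """Release the OX_ID; the bit is kept so the next use can flip it."""
        self.outstanding.discard(ox_id)
```

Note that last_bit survives close_exchange, so the bit is state that outlives the Exchange itself.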

Regards, Carl

Carl Zeitler
Compaq Computer Corporation
MS 150801, 20555 SH249, Houston, TX 77070
Phone:281-518-5258 Fax: 281-514-5270
E-Mail: Carl.Zeitler at compaq.com


-----Original Message-----
From: robbyb at us.ibm.com [mailto:robbyb at us.ibm.com]
Sent: Friday, June 23, 2000 10:21 AM
To: Zeitler, Carl; t10 at t10.org
Subject: RE: FCP-2 recovery problem




Carl,
Even in a single LUN device you can get into problems with this solution if
you re-use the OX_ID to different devices, if I'm understanding your proposal
correctly.

Device A receives  OX_ID = NNNN  Qualifier Bit = 0
(Device A receives no commands for the time being)
..
Device B receives OX_ID = NNNN Qualifier Bit = 1
..
Device A is sent a command with OX_ID = NNNN Qualifier Bit = 0 and the
command is lost
..
Device A is sent an REC with OX_ID = NNNN Qualifier Bit = 0

As you can see from the above scenario, a single bit doesn't fix the
problem.  Also, it requires that a field be kept on a per OX_ID basis even
while the exchange does not exist.  I don't believe we currently have any
requirement that requires active information to be kept on a "dead" exchange.
With a 4 byte field, just using a random value is adequate.  There's no need
to do accounting, and certainly no need to do accounting on a per exchange
basis.
Regards,
Rob
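
The scenario above can be traced in a few lines. This is only a sketch: the Target class, the run_scenario and one_bit helpers, and the value 0x1234 standing in for NNNN are made up for illustration. It assumes each device still remembers the qualifier of the last command it received on an OX_ID, and that the initiator keeps one qualifier per OX_ID rather than per device.

```python
import random

NNNN = 0x1234  # placeholder OX_ID value, not taken from the thread


class Target:
    """Hypothetical device that remembers the last qualifier seen per OX_ID."""

    def __init__(self):
        self.seen = {}

    def command(self, ox_id, qualifier):
        self.seen[ox_id] = qualifier

    def rec_matches(self, ox_id, qualifier):
        # True means the device would answer the REC as if it knew the Exchange.
        return self.seen.get(ox_id) == qualifier


def one_bit():
    """Qualifier source that flips a single bit on each use: 0, 1, 0, ..."""
    state = {"bit": 1}

    def next_qualifier():
        state["bit"] ^= 1
        return state["bit"]

    return next_qualifier


def run_scenario(next_qualifier):
    dev_a, dev_b = Target(), Target()
    dev_a.command(NNNN, next_qualifier())   # Device A receives OX_ID = NNNN
    dev_b.command(NNNN, next_qualifier())   # Device B receives the same OX_ID
    lost = next_qualifier()                 # sent to Device A, but the command is lost
    return dev_a.rec_matches(NNNN, lost)    # REC for the lost command


stale_match_with_bit = run_scenario(one_bit())
stale_match_with_random = run_scenario(lambda: random.getrandbits(32))
```

With the single bit, the REC for the lost command carries the same qualifier Device A saw on the earlier Exchange, so the stale Exchange matches. With a random 32-bit value a stale match requires two draws to collide, which is possible but very unlikely.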


I made 2 boo boos.  Dave Ford got it right.  The VI Handles are in the Device
Header.  The payloads of REC/SRR are not changed under your method.

In my mind, the proposed solution is elegant and works, but the elegance is
not needed to solve this Class 3 error recovery corner case for tapes.  I
believe it preferable to preserve the Parameter field in the frame header
for more pervasive problems and for future expansion.

The only thing necessary to solve this problem is to be able to distinguish
OX_ID A from OX_ID A'.  A single bit, think of it as an extension of the
OX_ID, is all that is necessary to make this distinction.  The
implementation (Initiator) would have to remember the setting of this bit
(let's call it Herman) on a per OX_ID basis and flip Herman on the next usage
of the same OX_ID.

Herman can be specified by Byte 1, Bit 7 in the FCP_CMD IU, which is
currently reserved.

Herman can be specified in Byte 5, Bit 7 of the REC Payload, which is
currently reserved.  The REC is ACCed or Rejected as a function of the
presence of the OX_ID and the setting of Herman in the Target (Initiator for
the FCP Confirm case).  There is no need to return Herman in the ACC/Reject
Payload.

Herman can be specified in Byte 13, Bit 7 of the SRR Payload, which is
currently reserved.  The SRR is ACCed or Rejected as a function of the
presence of the OX_ID and the setting of the bit in the Target. There is no
need to return Herman in the ACC/Reject Payload.
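
To make the bit positions concrete, here is a small sketch of setting and checking Herman. The byte offsets are the ones given above; everything else is an assumption for illustration: the helper names, the 16-byte placeholder buffers, the "ACC"/"Reject" strings, and the reading of "Bit 7" as the most significant bit of the byte.

```python
HERMAN_MASK = 0x80  # assumption: "Bit 7" means the most significant bit of the byte

# Byte offsets as stated in the message above.
FCP_CMD_HERMAN_BYTE = 1
REC_HERMAN_BYTE = 5
SRR_HERMAN_BYTE = 13


def set_herman(payload, byte_index, herman):
    """Write Herman into bit 7 of payload[byte_index], leaving other bits alone."""
    if herman:
        payload[byte_index] |= HERMAN_MASK
    else:
        payload[byte_index] &= ~HERMAN_MASK & 0xFF


def get_herman(payload, byte_index):
    """Read Herman back as 0 or 1."""
    return (payload[byte_index] & HERMAN_MASK) >> 7


def answer_rec(known, ox_id, rec_payload):
    """Target side: accept only if the OX_ID is present and its Herman matches.

    `known` is a hypothetical dict mapping OX_ID -> Herman seen in the FCP_CMD.
    """
    herman = get_herman(rec_payload, REC_HERMAN_BYTE)
    return "ACC" if known.get(ox_id) == herman else "Reject"


rec = bytearray(16)                      # placeholder buffer, not the real REC layout
set_herman(rec, REC_HERMAN_BYTE, 1)
reply_same = answer_rec({0x1234: 1}, 0x1234, rec)    # Herman matches
reply_stale = answer_rec({0x1234: 0}, 0x1234, rec)   # OX_ID present, Herman differs
```

The SRR check would follow the same pattern with SRR_HERMAN_BYTE in place of REC_HERMAN_BYTE.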

I believe this solves the problem and should not impact current
implementations, assuming they are not checking reserved fields or assuming
that they are 0. REC will have to be changed in FC-FS if it has already been
incorporated.  Other changes would be contained in FCP-2.


Regards, Carl

Carl Zeitler
Compaq Computer Corporation
MS 150801, 20555 SH249, Houston, TX 77070
Phone:281-518-5258 Fax: 281-514-5270
E-Mail: Carl.Zeitler at compaq.com


-----Original Message-----
From: Baldwin, Dave [mailto:Dave.Baldwin at emulex.com]
Sent: Thursday, June 22, 2000 3:10 PM
To: Zeitler, Carl
Cc: T10 at t10.org; 'FC Reflector'
Subject: Re: FCP-2 recovery problem


Carl,

I don't see how you are going to make this work in a multi-LUN environment.
Additionally, if your handle is in the REC (payload or header), then this is
the same change to FC-FS as was needed with my proposal. In fact, it could be
argued that my behavior doesn't need to go into FC-FS (it doesn't hurt anyone
using the current definition), while if you are suggesting changing the REC
payload, then that MUST go into FC-FS.

If you are willing to come up with a complete proposal, including the rules
each device must follow, I would be willing to look at it. When does the
initiator change the bit? On a target, LUN, or exchange basis? Where is this
bit put in REC and SRR?

A 32-bit handle (same as FC-VI) guarantees recovery can be performed
accurately. Why mess with perfection? ;-)

Best regards,
Dave Baldwin

"Zeitler, Carl" wrote:

> *
> * From the fc reflector, posted by:
> * "Zeitler, Carl" <Carl.Zeitler at compaq.com>
> *
> OK, I agree for long links it could have a performance impact.
>
> Rather than using a 4 byte handle in the Parameter field in the header, the
> same thing could be done with a 1 BIT handle in the FCP_Cmd IU, where the
> bit would flip between 0 and 1 for the same OX_ID.  This 1 BIT Handle would
> then be carried in REC/SRR as previously described.  But this would confine
> the change to FCP without involving FS and all the broad implications that
> implies. This is in keeping with FC_VI which carries a handle in its
> payload, rather than the header.
>
> Regards, Carl
>
> Carl Zeitler
> Compaq Computer Corporation
> MS 150801, 20555 SH249, Houston, TX 77070
> Phone:281-518-5258 Fax: 281-514-5270
> E-Mail: Carl.Zeitler at compaq.com
>
> -----Original Message-----
> From: robbyb at us.ibm.com [mailto:robbyb at us.ibm.com]
> Sent: Thursday, June 22, 2000 9:31 AM
> To: Zeitler, Carl; T10 at t10.org
> Subject: RE: FCP-2 recovery problem
>
> Carl,
> I'm opposed to FCP_CONF due to performance considerations.  Not all chips
> handle FCP_CONF automatically.
> Even if support is built into hardware it adds an extra round trip, which
> for a device that generally doesn't support queueing
> causes significant performance penalties in the bigger SANs.
>
> The Parameter field fix is very easy to implement and does not affect
> performance.  I agree that it seems more of
> an FC-FS construct than an FCP-2 construct.
> Regards,
> Rob Basham
> IBM Tapes
>
> * From the T10 Reflector (t10 at t10.org), posted by:
> * "Zeitler, Carl" <Carl.Zeitler at compaq.com>
> *
> Dave, it would seem to me that extending the definition of the Parameter
> field should be handled in FC-FS, not FCP-2.  This is a more global issue in
> my mind and would apply to all FC-4s, not just FCP-2.
>
> I understand the recovery problem that you have described and it needs to be
> addressed.  Are you opposed to the use of FCP_CONF as part of the solution
> for performance reasons? Or just that you can't see how the use of FCP_CONF
> solves the problem?
>
> Regards, Carl
>
> Carl Zeitler
> Compaq Computer Corporation
> MS 150801, 20555 SH249, Houston, TX 77070
> Phone:281-518-5258 Fax: 281-514-5270
> E-Mail: Carl.Zeitler at compaq.com
>
> -----Original Message-----
> From: Baldwin, Dave [mailto:Dave.Baldwin at emulex.com]
> Sent: Tuesday, June 20, 2000 8:42 PM
> To: Fibre Reflector; T10 Reflector
> Cc: Robert Snively (Brocade)
> Subject: FCP-2 recovery problem
>
> After considering several solutions to this issue, I have come up with
> the attached proposal to solve the problem. Since Emulex is not a T10
> member, I am not sure where to put the proposal, or how to number it.
> Any guidance in this area would be appreciated.
>
> There are other potential solutions to the problem, but they are
> generally more complex and cause performance degradation. Matt's
> suggestion to use FCP_CONF and then send REC to make sure the FCP_CONF
> made it to the target would have to be used on all non-write commands,
> thus degrading performance (but it does work).
>
> There are OX_ID games one could play, but they would require lengthy
> delays in releasing resources, or complex queuing behaviors that will
> cause problems or performance degradation in some implementations.
> Here is an example of a hole one could hit trying to control OX_ID use:
>
> Behavior: Control the OX_ID such that there are never two identical
> consecutive OX_IDs sent to an FCP-2 LUN.
>
> Flaw: This behavior would fail in a multi-LUN FCP-2 target. Here is one
> scenario:
>
> Command1 -----> OX_ID=1, LUN=0
> OX_ID=1     <----- Response1
>
> (Send 64k other commands to other targets in less than 2 seconds)
>
> Command2 ------> OX_ID=1, LUN=1   X (command dropped)
>
> REC             -------> OX_ID=1, RX_ID=0xFFFF
>
> Good response sent <------- ACC
>
> The Initiator sent the REC to check on Command2 to LUN=1. The target
> sent an ACC with information about Command1 to LUN=0. This leads to
> invalid recovery. Controlling OX_ID reuse to a LUN is not sufficient to
> solve this problem.
>
> Please review the attached proposal and send your comments back to the
> reflector.
>
> Best regards,
> Dave Baldwin
> Emulex Corporation
> *
> * For T10 Reflector information, send a message with
> * 'info t10' (no quotes) in the message body to majordomo at t10.org

