FCP-2 recovery problem

Zeitler, Carl Carl.Zeitler at COMPAQ.com
Sun Jun 25 10:52:38 PDT 2000


* From the T10 Reflector (t10 at t10.org), posted by:
* "Zeitler, Carl" <Carl.Zeitler at compaq.com>
*
All my thinking in the past has been implicitly based on an I-T nexus and NEVER
on an I-T-L nexus.  This adds a whole new dimension to the equation. If we
are going to go back and add LUNs to qualify an Exchange, I believe we have
some major, major re-thinking to do.  This would have to be a global change
to Fibre Channel, at the FC-FS level.  Class 2 doesn't work either for
I-T-L.  RX_IDs are just as ambiguous as OX_IDs; they are I-T based, not
I-T-L based.  To be completely pure and pristine, which I believe we have to
be at this level, we would have to add 8 bytes to the FC Header.  Maybe we
need to follow the FC-VI track of putting the LUN in a Device Header for
FCP-2, if this is what needs to be done to solve this problem!!

Regards,


Carl


-----Original Message-----
From: David A. Peterson [mailto:dap at storage.network.com]
Sent: Friday, June 23, 2000 7:21 PM
To: Binford, Charles
Cc: T10 at t10.org; 'FC Reflector'
Subject: Re: FCP-2 recovery problem



I'd rather qualify the REC/SRR using OX_ID, CRN, LUN.
A max of 255 outstanding commands to a queueing tape lun works for me.
Are we trying to make the error recovery work/bullet-proof for disk?
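
As an illustration only (not text from FCP-2), here is a minimal Python
sketch of why the LUN qualifier matters in the multi-LUN scenario quoted
below. The dictionary used as an exchange table and the functions
rec_lookup_it and rec_lookup_itl are hypothetical names invented for this
sketch; a real target would hold this state in its exchange context.

```python
# Hypothetical model of the exchange state a target retains after sending
# a response, so that a later REC/SRR can recover a dropped response.
# Key: (OX_ID, CRN, LUN) -> state string.
saved = {(1, 6, 1): "response sent"}   # completed command to LUN 1


def rec_lookup_it(saved, ox_id, crn):
    """REC qualified by OX_ID and CRN only (no LUN)."""
    return [state for (o, c, _lun), state in saved.items()
            if o == ox_id and c == crn]


def rec_lookup_itl(saved, ox_id, crn, lun):
    """REC qualified by OX_ID, CRN and LUN."""
    return saved.get((ox_id, crn, lun))


# The command to LUN 2 (OX_ID=1, CRN=6) was dropped, so the target never
# saw it. Without the LUN, the REC matches the stale LUN 1 exchange:
print(rec_lookup_it(saved, 1, 6))       # ['response sent']
# With the LUN, the target can tell that the exchange is unknown:
print(rec_lookup_itl(saved, 1, 6, 2))   # None
```

The first lookup is the ambiguous case: the target would answer for the
completed LUN 1 command even though the initiator is asking about the lost
LUN 2 command.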

> "Binford, Charles" wrote:
> 
> Oops.  I forgot to consider I_T vs. I_T_L behavior on this.  I don't
> think the CRN (at 8 bits) is large enough to close the hole (it wraps
> too soon).
> 
> Summary:  Back to Dave's proposal.  I'll retract my layering argument
> in favor of something that will work. :-)  I consider my layering
> argument less important than the compatibility issues of enlarging the
> payload.  There are not enough reserved bytes to not change the
> payload size and have enough bits to close the hole.  Therefore, I'm
> back to Dave's change the parameter data.
> 
> *******
> Details for those interested in why my CRN suggestion won't work:
> 
> Suppose I have two non-queuing tape devices behind a bridge being
> represented as two LUNs behind a single target.  Take Dave's original
> scenario, add my CRN field in the REC and we still come up short.
> 
> **** from Dave's original email with CRN info added *****
> Initiator                                        Target
> 
> CMD ---------------------------->
> 
> 1. A command (e.g. Test Unit Ready) is sent to the target with
> OX_ID=1. (CRN=5)
> 
>            <---------------------------     Response
> 
> 2. A "good" response is sent back to the initiator. The initiator gets
> the response and knows the TUR command has been completed, so the
> exchange resources are freed. The target has sent the response, so it
> saves the exchange information just in case the initiator needs to
> recover a dropped response with REC/SRR.
> 
> CMD ---------------------------->  X (dropped frame)
> 
> 3. A new command (e.g. SPACE forward 1 block) is sent to the target
> with OX_ID = 1 (AND CRN=6). This OX_ID reuse can occur for many reasons
> in various systems. The command never makes it to the target because of
> a bit error.
> 
> REC ------------------------------>
> 
> 4. The initiator sends an REC ELS command to the target to make sure
> all is well with OX_ID 1 (and CRN=6).
> 
> ******* end cut and paste from Dave's email ******
> 
> At this point my plan is working - single LUN.  But consider
> Multi-LUN.  If you change the scenario only slightly, we break:
> 
> CMD ------OX_ID=1, CRN=6, LUN=1 ---------------------->
> 
>            <---------------------------     Response
> 
> CMD -----OX_ID=1, CRN=6, LUN=2 ----->  X (dropped frame)
> 
> REC -----OX_ID=1, CRN=6  -------------------->
> 
> The target can't tell if the host wants LUN=1 data (lost response) or if
> the host is really talking about the lost CMD to LUN=2.
> 
> With Dave's proposal an initiator can increment the payload field of
> the FCP_CMD frame every time he sends it out.  He only needs to track
> it for the open exchanges.  Because this is a 32 bit field the wrap
> time is so long we don't have to worry about it.  That is why it
> works!
> 
> Charles Binford
> LSI Logic Storage Systems
> (316) 636-8566
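
To put rough numbers on the wrap-time argument above, here is a small
Python sketch. It assumes the identifier is simply incremented once per
command, and that zero is skipped for the 8-bit CRN (the thread treats a
zero CRN as "not used"). The rate of 10,000 commands per second is an
arbitrary figure chosen for illustration, not a number from the thread,
and the function names are invented for this sketch.

```python
def values_before_wrap(bits, skip_zero=False):
    """Number of distinct values a counter of this width yields before
    it repeats. skip_zero models a field where 0 means 'not used'."""
    n = 2 ** bits
    return n - 1 if skip_zero else n


def seconds_to_wrap(bits, commands_per_second, skip_zero=False):
    """Time until the counter repeats at a steady command rate."""
    return values_before_wrap(bits, skip_zero) / commands_per_second


RATE = 10_000  # assumed commands per second, for illustration only

print(values_before_wrap(8, skip_zero=True))    # 255
print(values_before_wrap(32))                   # 4294967296
print(seconds_to_wrap(8, RATE, skip_zero=True))  # seconds for 8-bit CRN
print(seconds_to_wrap(32, RATE) / 3600)          # hours for 32-bit field
```

At that assumed rate the 8-bit value repeats within a fraction of a
second, while the 32-bit value takes on the order of days.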
> 
> -----Original Message-----
> From: David A. Peterson [mailto:dap at storage.network.com]
> Sent: Friday, June 23, 2000 12:29 PM
> To: Binford, Charles
> Cc: T10 at t10.org; 'FC Reflector'
> Subject: Re: FCP-2 recovery problem
> 
> Howdy All,
> Finally catching up on emails.
> Charles' proposal is exactly what I was thinking also. In my reading of
> the CRN text in FCP-2 today, it does not explicitly state the CRN is
> based on the I_T_L nexus, but the EPDC bit is contained in the lun
> control mode page, so I guess it is implied. Would be nice to see some
> text stating this.
> 
> Anyways, I think the proposal would work if the CRN is based on the
> I_T_L nexus (i.e. not I_T nexus).
> 
> Dave
> 
> "Binford, Charles" wrote:
> >
> > *
> > * From the fc reflector, posted by:
> > * "Binford, Charles" <cbinford at lsil.com>
> > *
> > Carl, I fail to see how adding one bit does any more than extend the
> > OX_ID field by one bit.  Now instead of roll over at 64k, you roll over
> > at 128k.  However, in many implementations I'm familiar with the OX_ID
> > range used is much smaller.  This is because people are using it as
> > designed by the FC committee - to be an HW lookup index to the
> > resources associated with the exchange.  Most FC chips don't have
> > resources to support 64K exchanges per d_id and thus the real range for
> > OX_ID is much smaller.  (We'd be guilty of designing a very limiting
> > architecture if the typical implementation was actually bumping up
> > against the limitations of the standard.)
> >
> > The other point (which is more important to this discussion) I want to
> > make about OX_ID assignments is that because it is an index into chip
> > resources the management of it is often in a very low layer of the
> > driver that has no knowledge of the payload.  The layer building the
> > FCP_CMD payload has no clue what OX_ID is going to be assigned, the
> > layer assigning the OX_ID has no clue about any 'Hermann' bit that may
> > or may not need to be set.
> >
> > While I believe Dave Baldwin's solution will work, I also have some
> > reservations about it.  Again it is the layering thing.  FC driver
> > interfaces would have to be changed to allow the upper layer to specify
> > a value to be placed in the header (built by a lower layer).  I'd
> > rather place the new data in the payload that is built by the same
> > layer that understands what is going on.
> >
> > ********* here is my proposal **********
> > For any command that can't be simply aborted and retried (i.e. not
> > Inquiry, etc.) use a non-zero CRN value.  Define the reserved byte in
> > the REC payload to hold the CRN value if non-zero.  A target receiving
> > an REC with a non-zero "CRN" value shall match it and the OX_ID before
> > determining whether to ACC or RJT the command.
> >
> > This is basically the same thing as Dave's proposal with the following
> > modifications:
> > - new data in payload instead of header
> > - only 8 bits instead of 32
> >
> > I'd argue that 8 bits is sufficient for the same reasons it is large
> > enough for command delivery ordering.  There is not a need to queue
> > larger than 255 for sequential devices.
> >
> > Comments??
> >
> > Charles Binford
> > LSI Logic Storage Systems
> > (316) 636-8566
> >
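
The matching rule in Charles Binford's original proposal (quoted above)
might be sketched as follows. This is a simplified illustration under
stated assumptions, not the FCP-2 procedure: the saved_exchanges mapping
and the handle_rec function are hypothetical, and a real REC response
carries more state than a bare ACC or RJT.

```python
def handle_rec(saved_exchanges, ox_id, crn=0):
    """Return "ACC" or "RJT" for an REC.

    saved_exchanges is a hypothetical mapping of OX_ID -> CRN of the
    command the target last saw on that OX_ID.
    A zero CRN in the REC means 'no CRN supplied', so only the OX_ID
    is matched in that case.
    """
    if ox_id not in saved_exchanges:
        return "RJT"
    if crn != 0 and saved_exchanges[ox_id] != crn:
        return "RJT"
    return "ACC"


# Single-LUN scenario from the thread: the target completed a command
# with OX_ID=1, CRN=5; the next command (OX_ID=1, CRN=6) was dropped.
saved = {1: 5}
print(handle_rec(saved, 1, crn=6))  # RJT: target never saw CRN 6
print(handle_rec(saved, 1, crn=5))  # ACC: matches the saved exchange
print(handle_rec(saved, 1))         # ACC: OX_ID-only match (the hole)
```

The last call shows the original problem: without a CRN, the reused OX_ID
alone matches the stale exchange.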
*
* For T10 Reflector information, send a message with
* 'info t10' (no quotes) in the message body to majordomo at t10.org



