FCP-2 recovery problem
David A. Peterson
dap at storage.network.com
Fri Jun 23 17:20:42 PDT 2000
* From the T10 Reflector (t10 at t10.org), posted by:
* "David A. Peterson" <dap at storage.network.com>
*
I'd rather qualify the REC/SRR using OX_ID, CRN, LUN.
A max of 255 outstanding commands to a queueing tape LUN works for me.
Are we trying to make the error recovery work/bullet-proof for disk?
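
A minimal sketch of where the 255 figure comes from, assuming the CRN is an 8-bit value kept per LUN and that zero is reserved to mean "no CRN" (as in the proposal quoted below). The class and method names are hypothetical and nothing here is taken from FCP-2 text; it only shows that a sequential, non-zero 8-bit counter cannot be reused while its previous holder is still outstanding.

```python
class CrnAllocator:
    """Hypothetical per-LUN CRN allocator (illustration only)."""

    CRN_VALUES = 255  # 8-bit field, zero assumed reserved: values 1..255

    def __init__(self):
        self._next = 1              # next CRN to hand out
        self._outstanding = set()   # CRNs of commands not yet completed

    def allocate(self):
        crn = self._next
        if crn in self._outstanding:
            # The counter has wrapped onto a command that is still open;
            # reusing the value would make REC/SRR matching ambiguous.
            raise RuntimeError("CRN %d is still outstanding" % crn)
        self._outstanding.add(crn)
        # Advance 1 -> 2 -> ... -> 255 -> 1, never producing zero.
        self._next = crn % self.CRN_VALUES + 1
        return crn

    def complete(self, crn):
        self._outstanding.discard(crn)
```

With this scheme a LUN can never have more than 255 commands open at once, which is the ceiling mentioned above.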
> "Binford, Charles" wrote:
>
> Oops. I forgot to consider I_T vs. I_T_L behavior on this. I don't
> think the CRN (at 8 bits) is large enough to close the hole (it wraps
> too soon).
>
> Summary: Back to Dave's proposal. I'll retract my layering argument
> in favor of something that will work. :-) I consider my layering
> argument less important than the compatibility issues of enlarging the
> payload. There are not enough reserved bytes to not change the
> payload size and have enough bits to close the hole. Therefore, I'm
> back to Dave's change the parameter data.
>
> *******
> Details for those interested in why my CRN suggestion won't work:
>
> Suppose I have two non-queuing tape devices behind a bridge being
> represented as two LUNs behind a single target. Take Dave's original
> scenario, add my CRN field in the REC and we still come up short.
>
> **** from Dave's original email with CRN info added *****
> Initiator Target
>
> CMD ---------------------------->
>
> 1. A command (e.g. Test Unit Ready) is sent to the target with
> OX_ID=1. (CRN=5)
>
> <--------------------------- Response
>
> 2. A "good" response is sent back to the initiator. The initiator gets
> the response and knows the TUR command has been completed, so the
> exchange resources are freed. The target has sent the response, so it
> saves the exchange information just in case the initiator needs to
> recover a dropped response with REC/SRR.
>
> CMD ----------------------------> X (dropped frame)
>
> 3. A new command (e.g. SPACE forward 1 block) is sent to the target
> with OX_ID = 1 (AND CRN=6). This OX_ID reuse can occur for many
> reasons in various systems. The command never makes it to the target
> because of a bit error.
>
> REC ------------------------------>
>
> 4. The initiator sends an REC ELS command to the target to make sure
> all is well with OX_ID 1 (and CRN=6).
>
> ******* end cut and paste from Dave's email ******
>
> At this point my plan is working - single LUN. But consider
> Multi-LUN. If you change the scenario only slightly we break:
>
> CMD ------OX_ID=1, CRN=6, LUN=1 ---------------------->
>
> <--------------------------- Response
>
> CMD -----OX_ID=1, CRN=6, LUN=2 -----> X (dropped frame)
>
> REC -----OX_ID=1, CRN=6 -------------------->
>
> The target can't tell if the host wants LUN=1 data (lost response) or
> if the host is really talking about the lost CMD to LUN=2.
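
The ambiguity in the multi-LUN scenario above can be sketched as follows. This is an illustration under stated assumptions, not target firmware: the record layout and the `find_exchange` helper are hypothetical, and the CRN is assumed to be counted per LUN, so both LUNs can legitimately be at CRN=6 at the same time.

```python
def find_exchange(saved, ox_id, crn, lun=None):
    """Return the saved exchange records that a REC could be referring to.

    `saved` is a list of dicts the target kept after sending a response.
    If `lun` is None the REC carries only OX_ID and CRN.
    """
    return [
        rec for rec in saved
        if rec["ox_id"] == ox_id
        and rec["crn"] == crn
        and (lun is None or rec["lun"] == lun)
    ]


# The target completed the LUN=1 command and saved its exchange info.
# The LUN=2 command was dropped, so the target has no record of it.
saved = [{"ox_id": 1, "crn": 6, "lun": 1, "state": "response sent"}]

# REC with OX_ID and CRN only: the target finds the LUN=1 record even
# though the initiator is asking about the lost LUN=2 command.
print(find_exchange(saved, ox_id=1, crn=6))

# If the LUN were also part of the match, nothing would be found and the
# target could tell that the command never arrived.
print(find_exchange(saved, ox_id=1, crn=6, lun=2))
```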
>
> With Dave's proposal an initiator can increment the payload field of
> the FCP_CMD frame every time he sends it out. He only needs to track
> it for the open exchanges. Because this is a 32 bit field the wrap
> time is so long we don't have to worry about it. That is why it
> works!
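
A rough back-of-the-envelope comparison of the two field widths. The command rate is a made-up figure chosen only to make the arithmetic concrete, and the helper names are hypothetical; the 8-bit case assumes zero is reserved, as in the CRN proposal.

```python
def commands_before_wrap(bits, zero_reserved=False):
    """Number of distinct values a counter of this width can hand out."""
    return 2 ** bits - (1 if zero_reserved else 0)


def seconds_to_wrap(bits, commands_per_second, zero_reserved=False):
    """Time until the counter repeats a value at a steady command rate."""
    return commands_before_wrap(bits, zero_reserved) / commands_per_second


ASSUMED_RATE = 10_000  # commands per second, hypothetical

for bits, reserved in ((8, True), (32, False)):
    secs = seconds_to_wrap(bits, ASSUMED_RATE, reserved)
    print("%2d-bit field: %d values, wraps after %.4g s"
          % (bits, commands_before_wrap(bits, reserved), secs))
```

At the assumed rate the 8-bit value repeats in a fraction of a second, while the 32-bit value takes days, far longer than any exchange would stay open.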
>
> Charles Binford
> LSI Logic Storage Systems
> (316) 636-8566
>
> -----Original Message-----
> From: David A. Peterson [mailto:dap at storage.network.com]
> Sent: Friday, June 23, 2000 12:29 PM
> To: Binford, Charles
> Cc: T10 at t10.org; 'FC Reflector'
> Subject: Re: FCP-2 recovery problem
>
> Howdy All,
> Finally catching up on emails.
> Charles's proposal is exactly what I was thinking also. In my reading
> of the CRN text in FCP-2 today, it does not explicitly state the CRN
> is based on the I_T_L nexus, but the EPDC bit is contained in the LUN
> control mode page, so I guess it is implied. Would be nice to see some
> text stating this.
>
> Anyways, I think the proposal would work if the CRN is based on the
> I_T_L nexus (i.e. not I_T nexus).
>
> Dave
>
> "Binford, Charles" wrote:
> >
> > *
> > * From the fc reflector, posted by:
> > * "Binford, Charles" <cbinford at lsil.com>
> > *
> > Carl, I fail to see how adding one bit does any more than extend the
> > OX_ID field by one bit. Now instead of roll over at 64k, you roll
> > over at 128k. However, in many implementations I'm familiar with the
> > OX_ID range used is much smaller. This is because people are using
> > it as designed by the FC committee - to be an HW lookup index to the
> > resources associated with the exchange. Most FC chips don't have
> > resources to support 64K exchanges per d_id and thus the real range
> > for OX_ID is much smaller. (We'd be guilty of designing a very
> > limiting architecture if the typical implementation was actually
> > bumping up against the limitations of the standard.)
> >
> > The other point (which is more important to this discussion) I want
> > to make about OX_ID assignments is that because it is an index into
> > chip resources the management of it is often in a very low layer of
> > the driver that has no knowledge of the payload. The layer building
> > the FCP_CMD payload has no clue what OX_ID is going to be assigned,
> > the layer assigning the OX_ID has no clue about any 'Hermann' bit
> > that may or may not need to be set.
> >
> > While I believe Dave Baldwin's solution will work, I also have some
> > reservations about it. Again it is the layering thing. FC driver
> > interfaces would have to be changed to allow the upper layer to
> > specify a value to be placed in the header (built by a lower layer).
> > I'd rather place the new data in the payload that is built by the
> > same layer that understands what is going on.
> >
> > ********* here is my proposal **********
> > For any command that can't be simply aborted and retried (i.e. not
> > Inquiry, etc.) use a non-zero CRN value. Define the reserved byte in
> > the REC payload to hold the CRN value if non-zero. A target
> > receiving an REC with a non-zero "CRN" value shall match it and the
> > OX_ID before determining if to ACC or RJT the command.
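
The matching rule in this proposal might look roughly like the sketch below for the single-LUN case. It is not a model of the real REC ELS: the `handle_rec` function, the record layout, and the bare "ACC"/"RJT" strings are all hypothetical stand-ins, and a zero CRN is assumed to mean "match on OX_ID only".

```python
def handle_rec(saved_exchanges, ox_id, crn):
    """Decide how a target answers a REC under the proposed CRN rule.

    `saved_exchanges` maps OX_ID -> record of the last exchange the target
    saw with that OX_ID. Returns "ACC" if the REC refers to that exchange,
    otherwise "RJT".
    """
    record = saved_exchanges.get(ox_id)
    if record is None:
        return "RJT"            # no exchange known for this OX_ID
    if crn != 0 and record["crn"] != crn:
        return "RJT"            # same OX_ID, but a different command
    return "ACC"


# Target state after the Test Unit Ready (OX_ID=1, CRN=5) completed.
saved_exchanges = {1: {"crn": 5}}

print(handle_rec(saved_exchanges, ox_id=1, crn=6))  # REC for the lost command
print(handle_rec(saved_exchanges, ox_id=1, crn=5))  # REC for the completed one
```

Without the CRN check, the first REC would be answered from the stale record for the completed command, which is the hole the thread is trying to close.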
> >
> > This is basically the same thing as Dave's proposal with the
> > following modification:
> > - new data in payload instead of header
> > - only 8 bits instead of 32
> >
> > I'd argue that 8 bits is sufficient for the same reasons it is large
> > enough for command delivery ordering. There is not a need to queue
> > larger than 255 for sequential devices.
> >
> > Comments??
> >
> > Charles Binford
> > LSI Logic Storage Systems
> > (316) 636-8566
> >