FCP-2 recovery problem

Binford, Charles cbinford at lsil.com
Fri Jun 23 16:24:54 PDT 2000


* From the T10 Reflector (t10 at t10.org), posted by:
* "Binford, Charles" <cbinford at lsil.com>
*

Oops.  I forgot to consider I_T vs. I_T_L behavior on this.  I don't
think the CRN (at 8 bits) is large enough to close the hole (it wraps
too soon).  

Summary:  Back to Dave's proposal.  I'll retract my layering argument in
favor of something that will work. :-)  I consider my layering argument
less important than the compatibility issues of enlarging the payload.
There are not enough reserved bytes to get enough bits to close the hole
without changing the payload size.  Therefore, I'm back to Dave's
proposal to change the parameter data.


******* 
Details for those interested in why my CRN suggestion won't work: 

Suppose I have two non-queuing tape devices behind a bridge, being
represented as two LUNs behind a single target.  Take Dave's original
scenario, add my CRN field in the REC, and we still come up short. 

**** from Dave's original email with CRN info added ***** 
Initiator                                        Target 

CMD ----------------------------> 

1. A command (e.g. Test Unit Ready) is sent to the target with OX_ID=1.
(CRN=5) 

           <---------------------------     Response 

2. A "good" response is sent back to the initiator. The initiator gets 
the response and knows the TUR command has been completed, so the 
exchange resources are freed. The target has sent the response, so it 
saves the exchange information just in case the initiator needs to 
recover a dropped response with REC/SRR. 

CMD ---------------------------->  X (dropped frame) 

3. A new command (e.g. SPACE forward 1 block) is sent to the target with
OX_ID = 1 (AND CRN=6). This OX_ID reuse can occur for many reasons in
various systems. The command never makes it to the target because of a
bit error. 

REC ------------------------------> 

4. The initiator sends an REC ELS command to the target to make sure all
is well with OX_ID 1 (and CRN=6). 

******* end cut and paste from Dave's email ****** 

At this point my plan is working for a single LUN.  But consider
multi-LUN.  If you change the scenario only slightly, it breaks:

CMD ------OX_ID=1, CRN=6, LUN=1 ----------------------> 

           <---------------------------     Response 

CMD -----OX_ID=1, CRN=6, LUN=2 ----->  X (dropped frame) 


REC -----OX_ID=1, CRN=6  --------------------> 

The target can't tell whether the host wants the LUN=1 data (lost
response) or is really asking about the lost CMD to LUN=2.
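
To make the ambiguity concrete, here is a minimal sketch.  It assumes a
hypothetical target that keys its saved exchange state on (OX_ID, CRN)
only, with the CRN counted per I_T_L nexus so that two LUNs can be at the
same CRN value at the same time.  The function and variable names are
made up for illustration; they are not from FCP-2.

```python
# Hypothetical model of a target that saves exchange state after sending a
# response, so that a later REC can recover a dropped response.
saved_exchanges = {}  # (ox_id, crn) -> LUN the saved state belongs to


def target_receive_cmd(ox_id, crn, lun):
    """Record the exchange when a command actually reaches the target."""
    saved_exchanges[(ox_id, crn)] = lun


def target_receive_rec(ox_id, crn):
    """Look up an exchange for an REC, which carries OX_ID and CRN but no LUN."""
    return saved_exchanges.get((ox_id, crn))


# The CMD to LUN 1 arrives and completes; the target keeps its exchange info.
target_receive_cmd(ox_id=1, crn=6, lun=1)

# The CMD to LUN 2 (same OX_ID, same CRN) is dropped on the wire, so the
# target never sees it and target_receive_cmd() is never called for it.

# The initiator's REC is about the LUN 2 command, but it matches LUN 1 state.
matched_lun = target_receive_rec(ox_id=1, crn=6)
print(matched_lun)
```

In this model the REC finds the saved LUN 1 exchange, so the target would
answer for the wrong command.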

With Dave's proposal an initiator can increment the parameter field of
the FCP_CMD frame every time it sends one out.  It only needs to track
the value for the open exchanges.  Because this is a 32 bit field, the
wrap time is so long we don't have to worry about it.  That is why it
works!
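
A rough sketch of that bookkeeping, under my own assumptions: a single
counter per initiator/target pair (so it is independent of the LUN), and
a hypothetical Initiator class whose names are purely illustrative.

```python
# Hypothetical initiator-side bookkeeping: one 32-bit counter per target,
# bumped for every FCP_CMD sent and remembered only while the exchange is open.
class Initiator:
    def __init__(self):
        self.next_tag = 0
        self.open_exchanges = {}  # ox_id -> tag sent with that command

    def send_cmd(self, ox_id):
        tag = self.next_tag
        self.next_tag = (self.next_tag + 1) % 2**32  # wrap at 32 bits
        self.open_exchanges[ox_id] = tag
        return tag

    def complete(self, ox_id):
        # Once the response is in, there is nothing left to track.
        self.open_exchanges.pop(ox_id, None)


ini = Initiator()
first = ini.send_cmd(ox_id=1)   # e.g. the TUR
ini.complete(ox_id=1)
second = ini.send_cmd(ox_id=1)  # OX_ID reused, but the tag is different

# Number of commands sent before a value repeats, for each field width.
wrap_8_bit = 2**8    # 8-bit CRN (one fewer if zero is reserved)
wrap_32_bit = 2**32  # 32-bit field
print(first, second, wrap_8_bit, wrap_32_bit)
```

Because the counter does not restart per LUN, an REC carrying the tag of
the dropped command cannot match the saved state of an earlier command
that happened to reuse the same OX_ID.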


Charles Binford 
LSI Logic Storage Systems 
(316) 636-8566 


-----Original Message----- 
From: David A. Peterson [mailto:dap at storage.network.com] 
Sent: Friday, June 23, 2000 12:29 PM 
To: Binford, Charles 
Cc: T10 at t10.org; 'FC Reflector' 
Subject: Re: FCP-2 recovery problem 


Howdy All, 
Finally catching up on emails. 
Charles's proposal is exactly what I was thinking also. In my reading of 
the CRN text in FCP-2 today, it does not explicitly state the CRN is 
based on the I_T_L nexus, but the EPDC bit is contained in the LUN 
control mode page, so I guess it is implied. Would be nice to see some 
text stating this. 

Anyways, I think the proposal would work if the CRN is based on the 
I_T_L nexus (i.e. not I_T nexus). 

Dave 

"Binford, Charles" wrote: 
> 
> * 
> * From the fc reflector, posted by: 
> * "Binford, Charles" <cbinford at lsil.com> 
> * 
> Carl, I fail to see how adding one bit does any more than extend the
> OX_ID field by one bit.  Now instead of roll over at 64k, you roll over
> at 128k.  However, in many implementations I'm familiar with the OX_ID
> range used is much smaller.  This is because people are using it as
> designed by the FC committee - to be an HW lookup index to the
> resources associated with the exchange.  Most FC chips don't have
> resources to support 64K exchanges per d_id and thus the real range for
> OX_ID is much smaller.  (We'd be guilty of designing a very limiting
> architecture if the typical implementation was actually bumping up
> against the limitations of the standard.)
> 
> The other point (which is more important to this discussion) I want to
> make about OX_ID assignments is that because it is an index into chip
> resources the management of it is often in a very low layer of the
> driver that has no knowledge of the payload.  The layer building the
> FCP_CMD payload has no clue what OX_ID is going to be assigned, the
> layer assigning the OX_ID has no clue about any 'Hermann' bit that may
> or may not need to be set.
> 
> While I believe Dave Baldwin's solution will work, I also have some
> reservations about it.  Again it is the layering thing.  FC driver
> interfaces would have to be changed to allow the upper layer to specify
> a value to be placed in the header (built by a lower layer).  I'd
> rather place the new data in the payload that is built by the same
> layer that understands what is going on.
> 
> ********* here is my proposal ********** 
> For any command that can't be simply aborted and retried (i.e. not
> Inquiry, etc.) use a non-zero CRN value.  Define the reserved byte in
> the REC payload to hold the CRN value if non-zero.  A target receiving
> an REC with a non-zero "CRN" value shall match it and the OX_ID before
> determining if to ACC or RJT the command.
> 
> This is basically the same thing as Dave's proposal with the following
> modifications:
> - new data in payload instead of header
> - only 8 bits instead of 32
> 
> I'd argue that 8 bits is sufficient for the same reasons it is large
> enough for command delivery ordering.  There is not a need to queue
> larger than 255 for sequential devices.
> 
> Comments?? 
> 
> Charles Binford 
> LSI Logic Storage Systems 
> (316) 636-8566 
> 

