FCP-2 recovery problem
Carl.Zeitler at COMPAQ.com
Tue Jun 27 04:58:58 PDT 2000
* From the T10 Reflector (t10 at t10.org), posted by:
* "Zeitler, Carl" <Carl.Zeitler at compaq.com>
If I issue ABTS to OX_ID 1, which LUN steps up to the plate?
I do not believe any of the schemes presented to date work on a LUN basis.
ABTS is necessary to do recovery. It uses a bit in the Parameter field to
distinguish between aborting the Sequence and aborting the Exchange, so the
Parameter field cannot be used to carry a 4-byte identifier. The LUN, or
some means of making the OX_ID unique on a LUN basis, must be part of ABTS.
Otherwise the ABTS is ambiguous to the Target.
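
To make the ambiguity concrete, here is a minimal sketch of a Target's
exchange table under an I_T_L nexus. The Exchange record, the table layout,
the S_ID value and both lookup functions are hypothetical and exist only for
illustration; the second lookup assumes some LUN handle is available to
ABTS, which is exactly what is missing today.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Exchange:
    s_id: int    # initiator port ID (illustrative value below)
    ox_id: int   # originator exchange ID chosen by the initiator
    lun: int     # logical unit the command was addressed to
    state: str   # "open" = response sent but not confirmed, "current" = in progress


def abts_candidates(exchanges, s_id, ox_id):
    """Every exchange that an ABTS naming only S_ID and OX_ID could mean."""
    return [x for x in exchanges if x.s_id == s_id and x.ox_id == ox_id]


def abts_candidates_with_lun(exchanges, s_id, ox_id, lun):
    """Same lookup if ABTS also carried a LUN handle (an assumption)."""
    return [x for x in abts_candidates(exchanges, s_id, ox_id) if x.lun == lun]


# The initiator has reused OX_ID 1 on a second LUN while the Target still
# holds context for the first exchange.
table = [
    Exchange(s_id=0x010203, ox_id=1, lun=1, state="open"),
    Exchange(s_id=0x010203, ox_id=1, lun=2, state="current"),
]

ambiguous = abts_candidates(table, 0x010203, 1)             # two matches
resolved = abts_candidates_with_lun(table, 0x010203, 1, 2)  # one match
```

With only S_ID and OX_ID the Target finds two exchanges and cannot tell
which one to abort; with a LUN handle the lookup is unique.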
So I think we are back to the possibilities I have proposed. If it is
deemed necessary to have a nexus of D_ID/S_ID/OX_ID/RX_ID/LUN, then, to be
squeaky clean, the LUN (or some handle) has to be carried in a part of the
frame header that ABTS can also use. The only place for it is in a Device
Header.
This is necessary only for Class 3, I believe. For Class 2, the ACK to
FCP_RESP will wipe out the context for the completed Exchange. If the ACK
to the response gets whacked, then the ABTS/RRQ process will also wipe out
the Exchange. If the ACK to FCP_RESP and the following FCP_CMD using the
same OX_ID both get wiped, then this is a double error and the outstanding
Exchange is aborted. There are other double error scenarios that we have
covered in Class 2, and I believe that if we follow the rule of aborting the
current Exchange in the event of a double error, we stay out of trouble and
avoid data integrity problems.
The other solution is to steer away from an I_T_L nexus and stick with an
I_T nexus. If this is the case, then the Herman bit solves the problem, as
there can only be one outstanding Exchange that is considered "open" (see
D.5 Class 3, T10/00-137r5). Herman can make the distinction between the old
"open" Exchange and the new, current Exchange having the same OX_ID.
Another possibility is to use FCP_CONF. FCP_CONF "closes" the Exchange. If
FCP_CONF is not received by the Target, it issues REC. If the OX_ID in the
REC payload is the same as an outstanding OX_ID, the current Exchange is
aborted. In some cases, hopefully infrequent, this could unnecessarily
clobber a validly received command.
So the saga of this seemingly trivial Class 3 error recovery corner case
continues. The problem appears trivial, but the solutions are downright
Compaq Computer Corporation
MS 150801, 20555 SH249, Houston, TX 77070
Phone:281-518-5258 Fax: 281-514-5270
E-Mail: Carl.Zeitler at compaq.com
From: Baldwin, Dave [mailto:Dave.Baldwin at emulex.com]
Sent: Monday, June 26, 2000 10:11 PM
To: Matt Wakeley
Cc: T10 at t10.org; 'FC Reflector'
Subject: Re: FCP-2 recovery problem
* From the fc reflector, posted by:
* "Baldwin, Dave" <Dave.Baldwin at emulex.com>
It would not work with the change you are suggesting. If I send OX_ID=1 and
CRN=6 to LUN=1, then send CRN 7 through CRN 5 (wraparound) to LUN=2 (maybe a
disk), then send OX_ID=1 and CRN=6 to LUN=1 (which gets dropped), I have the
Matt Wakeley wrote:
> * From the fc reflector, posted by:
> * Matt Wakeley <matt_wakeley at agilent.com>
> If the CRN was *not* based on LU, your proposal would work...
> "Binford, Charles" wrote:
> > At this point my plan is working - single LUN. But consider Multi-LUN.
> > If you change the scenario only slightly we break:
> > CMD ------OX_ID=1, CRN=6, LUN=1 ---------------------->
> > <--------------------------- Response
> > CMD -----OX_ID=1, CRN=6, LUN=2 -----> X (dropped frame)
> If the above had a CRN of 7 (since both LUs are in the same target),
> everything would work fine.
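
The three scenarios in this thread can be walked through with a small
model. This is only a sketch under stated assumptions: the Target class,
its last_seen table and the looks_received check are hypothetical stand-ins
for "the Target decides whether a command arrived by matching OX_ID and
CRN", which is inferred from the discussion and not taken from any draft.
It also assumes an 8-bit CRN that wraps from 255 back to 1, and that the
intervening commands in the wraparound case use other OX_IDs.

```python
CRN_MAX = 255  # assumption: 8-bit CRN, with 0 not used as a sequence value


def next_crn(crn):
    """Advance a CRN, wrapping from CRN_MAX back to 1."""
    return 1 if crn == CRN_MAX else crn + 1


class Target:
    """Hypothetical Target state for one initiator."""

    def __init__(self):
        self.last_seen = {}  # OX_ID -> CRN of the last command received on it

    def receive(self, ox_id, crn):
        self.last_seen[ox_id] = crn

    def looks_received(self, ox_id, crn):
        """True if the Target would believe this command already arrived."""
        return self.last_seen.get(ox_id) == crn


def per_lu_crn():
    """CRN per LU: OX_ID=1, CRN=6 to LUN 1 arrives; OX_ID=1, CRN=6 to LUN 2 is dropped."""
    t = Target()
    t.receive(ox_id=1, crn=6)
    return t.looks_received(ox_id=1, crn=6)   # stale entry matches


def target_wide_crn():
    """CRN per Target: the dropped command would carry CRN 7 instead."""
    t = Target()
    t.receive(ox_id=1, crn=6)
    return t.looks_received(ox_id=1, crn=next_crn(6))


def target_wide_crn_wrapped():
    """CRN per Target, but CRN 7 through 5 are used up before OX_ID 1 is reused."""
    t = Target()
    t.receive(ox_id=1, crn=6)
    crn, ox_id = 6, 2
    while True:
        crn = next_crn(crn)
        if crn == 6:          # wrapped all the way around
            break
        t.receive(ox_id, crn)
        ox_id += 1
    return t.looks_received(ox_id=1, crn=crn)  # dropped command has CRN 6 again
```

In this model per_lu_crn() and target_wide_crn_wrapped() both report the
dropped command as already received, while target_wide_crn() does not. A
Target-wide CRN fixes the simple case but not the wraparound case.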
* For T10 Reflector information, send a message with
* 'info t10' (no quotes) in the message body to majordomo at t10.org