Persistent Reservation Proposal - Group Reservations

Kevin D Butt kdbutt at us.ibm.com
Mon Dec 24 07:39:34 PST 2007


Attachment #1: nameless-772-2-1.html

Ray,
Please see this font.
Kevin D. Butt
SCSI & Fibre Channel Architect, Tape Firmware
MS 6TYA, 9000 S. Rita Rd., Tucson, AZ 85744
Tel: 520-799-2869 / 520-799-5280
Fax: 520-799-2723 (T/L:321)
Email address: kdbutt at us.ibm.com
http://www-03.ibm.com/servers/storage/ 
From: "Raymond Gilson" <raymond_gilson at symantec.com>
Sent: 12/22/2007 07:06 AM
To: Kevin D Butt/Tucson/IBM at IBMUS
Cc: Christine R Knibloe/Tucson/IBM at IBMUS, "Knight, Frederick"
<Frederick.Knight at netapp.com>, "Roger Cummings"
<roger_cummings at symantec.com>, <t10 at t10.org>
Subject: RE: Persistent Reservation Proposal - Group Reservations
Comments in line
From: Kevin D Butt [mailto:kdbutt at us.ibm.com] 
Sent: Friday, December 21, 2007 5:33 PM
To: Raymond Gilson
Cc: Christine R Knibloe; Knight, Frederick; Roger Cummings; t10 at t10.org
Subject: RE: Persistent Reservation Proposal - Group Reservations
Ray, 
Thanks for joining in.  Let me summarize what I think has been said by all 
parties who have joined in these discussions. 
1) (From Ray) Some applications will have trouble providing a list of 
Transport IDs. 
2) (From Fred) There is a desire to allow members of a cluster that were 
not active at the creation of the reservation to join in. 
3) (From Kevin) Who can join or participate in a group reservation is 
required to be controlled such that only those initiators that are part of 
the cluster (i.e. group) can join. 
4) PREEMPT must be allowed (i.e., what we do cannot lock out PREEMPT or 
make it not work correctly) 
5) Both Registrants Only and All Registrants types of functionality need 
to be provided for. 
6) There should be an option for all target ports. 
If we can't require a list of Transport IDs then it seems that the 
suggested "shared secret" (not cryptographic, just some unique value that 
is protected through obfuscation) is probably the best way to do this. 
This would be something akin to requiring the same reservation key value. 
However, using the reservation key does not seem to be plausible because 
of how it is being used today.  What we need would be some different value 
that would not be reportable via a Persistent Reserve In.  This would keep 
third-party initiators from joining the reservation.  If this new Group 
Reservation Identifier (GRID) were added, that would take care of #1, #2, 
and #3 above. 
For #5 above, it seems that we could provide that by adding a bit in the 
reserve command that indicates "All Participants are reservation holders". 
If set to one, then it acts like All Registrants.  If set to zero, then 
it acts like Registrants Only.  This includes the behavior on unregistering. 
For #6 we use the ALL_TGT_PORTS bit the same as other reservations today. 
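The GRID idea above can be sketched as a toy model.  This is only an illustration under stated assumptions: the class and method names, the status strings, and the choice to ignore the holder bit on a join are all hypothetical and come from neither the proposal nor SPC-4.

```python
from dataclasses import dataclass, field


@dataclass
class GroupReservation:
    grid: int            # shared secret; never reported by PR In
    all_holders: bool    # proposed "All Participants are reservation holders" bit
    creator: str         # I_T nexus that created the reservation
    participants: set = field(default_factory=set)


class Target:
    """Toy logical unit; every name here is hypothetical."""

    def __init__(self):
        self.registrations = {}   # I_T nexus -> reservation key
        self.reservation = None

    def register(self, nexus, key):
        self.registrations[nexus] = key

    def reserve_group(self, nexus, grid, all_holders):
        # Only registered I_T nexuses may reserve or join.
        if nexus not in self.registrations:
            return "RESERVATION CONFLICT"
        if self.reservation is None:
            self.reservation = GroupReservation(grid, all_holders, nexus, {nexus})
            return "GOOD"
        # Joining an existing group reservation requires the same GRID.
        if grid != self.reservation.grid:
            return "RESERVATION CONFLICT"
        self.reservation.participants.add(nexus)
        return "GOOD"

    def read_reservation(self):
        """PR In view: the GRID is deliberately left out."""
        if self.reservation is None:
            return None
        return {"type": "GROUP", "all_holders": self.reservation.all_holders}

    def is_holder(self, nexus):
        r = self.reservation
        if r is None or nexus not in r.participants:
            return False
        # Bit set: every participant is a holder.  Bit clear: only the creator.
        return r.all_holders or nexus == r.creator
```

A registrant that does not know the GRID stays registered but cannot become a participant, which is the behavior wanted for #1, #2, and #3.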
Issues still to be resolved: 
a) Some systems won't want to require all initiators to send a Persistent 
Reserve Out command.  Possible solution is to allow reserving multiple 
initiators if a Transport ID list is sent.  Additional initiators could 
join later if they have the GRID.  However this would make it more 
complicated and if it is not needed I would rather not add this option.  
In a practical sense, I cannot see how this could be avoided (all 
initiators sending PRO) -- since PR requires trust and good behavior, each 
initiator must make no assumption about what the protection level is 
currently set at -- so it must verify that the settings are correct and as 
expected.  If the settings aren't as expected, it must bail out, or go 
into error recovery to attempt to avoid messing up some other application 
(a fist fight on the SAN for device control does nobody any good).  I see 
no reason to provide for this (I do know that the current command allows a 
registration for multiple ports, but I cannot imagine using it in the real 
world). 
<<kdbutt: I am certainly willing to agree.  All could still be registered 
by using the all_i_pt bit. However, I suspect there will be those that 
will find this unacceptable.  Anybody who needs a way to add all 
initiators who are currently registered to the group reservation, please 
speak up (and comment on a method to accomplish this).>>
b) If the first I_T nexus sets the "All Participants are reservation 
holders" to zero when it creates the reservation and then a subsequent I_T 
nexus sets it to one, what is the behavior?  Change the type?  Reject? 
Also, what is reported in the Report Full Status if All Participants are 
reservation holders is set to zero?  
I don't think this is a problem -- once a reservation is established it 
cannot be changed without a preempt, clear, or removal of the old.  If 
this isn't true, I would want the attempt to change it to get 
rejected.  I would expect a change to require a preempt type operation. 
<<kdbutt: I think the correct response for a new participant that attempts 
to change the type is to reject a command that attempts to change the 
type.
For what to return in Report Full Status if "All Participants are 
reservation holders" is set to zero, I am concerned about confusion.  In 
reality, only the first is a reservation holder and therefore only the 
first should set the reservation_holder bit to one.  However, there would 
now be two groups that cannot be distinguished.  The first is not the 
reservation_holder but is part of the reservation, and the second is neither 
the reservation_holder nor part of the reservation.  I think we should 
probably add a "group reservation participant" bit to distinguish the 
two.>>
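A small sketch of the two points above: rejecting a join that tries to change the type, and reporting a separate participant bit in Report Full Status.  The field names r_holder and grp_participant and the "REJECTED" status are placeholders chosen for illustration, not names from any draft.

```python
def join(reservation, nexus, all_holders):
    """Add nexus to the group unless it tries to change the reservation type."""
    if all_holders != reservation["all_holders"]:
        return "REJECTED"   # placeholder status; the real one is undecided
    reservation["participants"].append(nexus)
    return "GOOD"


def report_full_status(registrations, reservation):
    """One row per registered I_T nexus, with both bits shown."""
    creator = reservation["participants"][0]   # first participant created it
    rows = []
    for nexus in registrations:
        participant = nexus in reservation["participants"]
        if reservation["all_holders"]:
            holder = participant
        else:
            holder = participant and nexus == creator
        rows.append({
            "nexus": nexus,
            "r_holder": int(holder),
            "grp_participant": int(participant),
        })
    return rows
```

With the holder bit set to zero, the extra bit is what separates a participant that is not the holder from a registrant that is outside the reservation altogether.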
c) If we go to this method of using the GRID to determine who can join, 
then the Reservation Key may or may not be different. 
c-1) if the Reservation Key is different, then a PREEMPT of a Reservation 
Key will do what? 
c-2) if the Reservation Key is the same, then a PREEMPT will act the same 
as an AR or RO reservation today. 
c-3) Do we require the Reservation Key to be the same? 
Preempt is of a reservation, not a key.  The keys currently are not 
compared, and have no valid use (by the device) except that each initiator 
has registered one, and only one at a time.  We don't want to change this 
behavior -- a key is a random number assigned for some external purpose that 
the device records and reports.  (My application requires this to operate 
properly) 
<<kdbutt: Look at clause 5.6.10.4 of SPC-4r11.  This looks to me like the 
Reservation Key is used to decide between unregistering I_T nexuses with 
the sent reservation key or if the reservation key is that of the 
reservation holder, then removing the reservation and registrations of all 
that have that reservation key.  My intent is not to change the current 
behavior.>>
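The reading of PREEMPT given above can be written down as a simplified model.  It follows only the description in this message, not the full text of the clause; the function name and data layout are hypothetical, and what happens afterwards (for example, the preempting nexus establishing its own reservation) is left out.

```python
def preempt(registrations, holder, preemptor, service_action_key):
    """Apply a simplified PREEMPT.

    registrations: dict mapping I_T nexus -> reservation key
    holder: the I_T nexus holding the reservation, or None
    Returns (remaining_registrations, remaining_holder).
    """
    if preemptor not in registrations:
        raise PermissionError("PREEMPT must come from a registered I_T nexus")
    holder_key = registrations.get(holder)
    # Every registration carrying the given key is removed, except the
    # preemptor's own registration.
    remaining = {
        nexus: key
        for nexus, key in registrations.items()
        if key != service_action_key or nexus == preemptor
    }
    # If the key is the reservation holder's key, the reservation goes too.
    if holder is not None and service_action_key == holder_key:
        holder = None
    return remaining, holder
```

In this model the key selects which registrations are removed, and the reservation is touched only when the key happens to be the holder's.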
d) Does this approach still have the issues that Roger was concerned about 
(e.g., the corner cases)? 
I hope the use of a GRID would not introduce any new issues to SPR -- it 
only prevents a registrant from becoming a reservation participant without 
some external knowledge.  It doesn't prevent a registrant from issuing a 
preempt, clear, or any other error recovery operation (and MUST not). 
Thanks, 
Kevin D. Butt
From: "Raymond Gilson" <raymond_gilson at symantec.com>
Sent: 12/21/2007 12:45 PM
To: "Roger Cummings" <roger_cummings at symantec.com>, "Knight, Frederick"
<Frederick.Knight at netapp.com>, Kevin D Butt/Tucson/IBM at IBMUS
Cc: <t10 at t10.org>, Christine R Knibloe/Tucson/IBM at IBMUS
Subject: RE: Persistent Reservation Proposal - Group Reservations
Several years ago I was trying to figure out a way to introduce a "JOIN" 
function to the SPR.  The initiator would register, but that would not 
grant it access to a reservation of the "joined only" type.  To join it, 
the initiator would have to send a join SPR command -- we could add a 
"shared secret" field to the join, so that only those initiators that knew 
the secret could join. 
I think we will have a great deal of trouble with a "white list" approach 
-- as an application, I have no idea what my port ID is (or anything else 
for that matter). 
Would something like this make sense? 
Thanks, 
Ray Gilson 
From: owner-t10 at t10.org [mailto:owner-t10 at t10.org] On Behalf Of Roger 
Cummings
Sent: Tuesday, December 18, 2007 10:24 AM
To: Knight, Frederick; Kevin D Butt
Cc: t10 at t10.org; Christine R Knibloe
Subject: RE: Persistent Reservation Proposal - Group Reservations
Fred, 
The way you clean up from a disaster is to Preempt, that's what it's there 
for. Most of the applications that I know that will actually issue a 
Preempt make it a very special function that doesn't happen in the normal 
flow, and one app at least DOES require manual intervention of an operator 
before kicking off the preempt. 
Yes, today, a Preempt has to be issued through a registered I_T nexus, but 
a registration with the SPEC_I_PT bit doesn't have to come from an already 
registered initiator - see Table 33 in SPC-4, and I don't believe Kevin 
changed that in his proposal. 
For the future, however we define a "group" for the purposes of new 
reservation types, we will have to make sure that an Initiator outside of 
the "group" can issue a Preempt to handle the disaster recovery case. 
Regards, 
Roger 
From: Knight, Frederick [mailto:Frederick.Knight at netapp.com] 
Sent: Tuesday, December 18, 2007 10:56 AM
To: Roger Cummings; Kevin D Butt
Cc: t10 at t10.org; Christine R Knibloe
Subject: RE: Persistent Reservation Proposal - Group Reservations
My question has had to do with differentiating the disaster clean up 
case from the non-cooperating host case. 
How do I clean up from a disaster?  If all my "reserved" initiators 
melt down, and there aren't any of them left anymore (because of 
a site disaster, or whatever), how does some other node come along 
and clean up so it can gain access? 
Would it require manual intervention?  Or, is there a way in the protocol 
that I can register and preempt the group reservation (does the use 
of the SPEC_I_PT bit allow this as you have suggested Roger).  I 
thought the SPEC_I_PT had to come from an already registered 
initiator (which in a disaster, none exist anymore). 
    Fred Knight 
From: Roger Cummings [mailto:roger_cummings at symantec.com] 
Sent: Tuesday, December 18, 2007 10:03 AM
To: Kevin D Butt; Knight, Frederick
Cc: t10 at t10.org; Christine R Knibloe
Subject: RE: Persistent Reservation Proposal - Group Reservations
Kevin, 
I'm sorry, I don't think it's as cut and dried as you make out. This gets 
into some of the corner cases that I listed in my first response. 
The point to be made in response to Fred's case is that a third-party can 
create registrations for a downed initiator (via the SPEC_I_PT bit), so 
that when it comes up again it will be able to participate in the 
reservation without having to register itself. 
Also, you say that "We have made provisions for adding members once the 
reservation exists, but only one of the reservation holders can add 
another entity." Two things in response to that: 
1) I didn't see any specific provision for adding members in your 
proposal, so I presume you'd just issue another RESERVE with the same type 
and the whole list of transport IDs to be included again, and thus the 
Target would have a whole lot of work to do again to set up another 
reservation. 
2) Is that really what you want, that a member of the existing group can 
reissue the RESERVE with a whole bunch of different TransportIDs, perhaps 
excluding some that were previously there? 
Regards, 
Roger 
From: owner-t10 at t10.org [mailto:owner-t10 at t10.org] On Behalf Of Kevin D 
Butt
Sent: Monday, December 17, 2007 3:54 PM
To: Knight, Frederick
Cc: t10 at t10.org; Christine R Knibloe
Subject: RE: Persistent Reservation Proposal - Group Reservations
Fred, 
This is being proposed for SPC. 
There are multiple types of reservations.  In an environment where one 
node of a cluster must join later, one of the other types can be used. 
Either that or have an existing node in your cluster add the new node. The 
whole intent of this Group reservation is to lock out everybody that is 
not explicitly specified during the reserve.  We have made provisions for 
adding members once the reservation exists, but only one of the 
reservation holders can add another entity.  The new entity cannot add 
itself.  This is the whole point of reservations (i.e., lock out others 
from doing stuff while I think I have exclusive rights). 
To put it in other words, to allow somebody to join the reservation of 
their own accord without permission is EXACTLY what I am trying to protect 
against. 
Thanks, 
Kevin D. Butt
From: "Knight, Frederick" <Frederick.Knight at netapp.com>
Sent: 12/17/2007 01:38 PM
To: Kevin D Butt/Tucson/IBM at IBMUS
Subject: RE: Persistent Reservation Proposal - Group Reservations
Sorry, you can't require everyone to register before the reserve.
That's like saying my whole cluster can't boot because 1 node is down.
You need to have a way for a "down" initiator to join the fun after the
fact.
I helped write a host cluster product that used a shared tape (failover
model).  The backup application would write to the tape.  If a system
failure ever happened, the backup application would fail over to a
different host.  It would skip backwards on the tape for a few records,
recognize where it left off, and then resume operation.
BUT, for some protection, we used reservations to make sure only 1
initiator at a time could access the tape.  The interesting point,
however, is that we were in the process of upgrading from old SCSI-2
RESERVE to using PR, because we also have multiple HBAs in the host, and
we wanted to be able to use more than 1 of those HBAs (so we needed
multiple reservations - aka PR).  Having this idea (group reservations)
would have been a real nice addition.
As for the RA/AR differences, it seemed to be timing.  Registrants Only
was fairly early on (as I remember), and so implemented by several O/S
vendors.  Later on, some issues were found (which got complicated
spec-ees added to address), but also, the All Registrants was added
(which didn't have those issues).  But, since there were
implementations, it couldn't be removed like the other old PR types that
no one ever used.  Anyway, I agree, they offer basically the same
capabilities, but RO is already out there, and AR is probably what new
implementers are using (it's easier to understand and implement from the
host side).  Most of the differences are already documented, so there
wouldn't be that much extra for you to write to have both types (which I
think would be better than a bit somewhere - do it the same way all the
others are done).  But, you could also just do the AR version, and let
someone else add the RO version if they want it.
Are you proposing this for tape only? or SPC in general?  I assume SPC in 
general. 
   Fred Knight 
From: Kevin D Butt [mailto:kdbutt at us.ibm.com] 
Sent: Monday, December 17, 2007 9:51 AM
To: Roger Cummings
Cc: t10 at t10.org
Subject: RE: Persistent Reservation Proposal - Group Reservations
Roger, 
Thank you for your feedback.  I am certainly willing to entertain other 
methods for accomplishing the end goal in an easier fashion.  I am not 
sure I understand how your proposed method makes it more backward 
compatible.  In my proposal PRin would show a different type of 
reservation and hence the application clients would not try to join the 
reservation because they don't know about the type.  In your proposal, 
application clients would not be allowed to register.  This is a deviation 
from what they can always do today - unless there is a resource issue. 
This seems more disruptive to me.  I would assume that there would be a 
new additional sense code added for UNABLE TO REGISTER BECAUSE A GROUP 
RESERVATION IS IN PLACE (or analogous).  This would be a new thing for 
failure to register and there would be pain at the register point. Perhaps 
that is better than at the reserve point - but I would think that it would 
be better handled as a reservation conflict since that is what it is 
instead of something the application client does not understand. 
As for "all registrants" type vs. "registrants only" I didn't see where 
the difference would be interesting, but I am not opposed to providing a 
way to switch between which of these two types is done.  Whether it is 
additional types or some bit during registration etc. 
As for some of the corner cases mentioned below, if each I_T nexus that is 
supposed to be part of the group reservation is required to be registered 
before the reservation is made, and if the reservation is released when 
the last group reservation participant is unregistered, then I think we 
don't have an issue. 
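The two conditions in the paragraph above can be sketched as follows.  This is a minimal illustration with hypothetical function names: a reserve is refused unless every listed I_T nexus is already registered, and the reservation is released when the last participant unregisters.

```python
def reserve(registrations, transport_ids):
    """Return the participant set, or None if any listed nexus is unregistered."""
    wanted = set(transport_ids)
    if not wanted or not wanted <= set(registrations):
        return None
    return wanted


def unregister(registrations, participants, nexus):
    """Remove a registration; return the new participant set, or None if released."""
    registrations.pop(nexus, None)
    if participants is None:
        return None
    remaining = participants - {nexus}
    # An empty group means the reservation is released.
    return remaining or None
```

Under these two rules there is never a reservation that names an unregistered nexus, and never a reservation with no participants left.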
I would prefer that we work together to shape a mutually beneficial 
proposal as opposed to have "competing" proposals.  I am willing to modify 
my proposal where it can be made easier and such.  I am not sold that my 
proposed method is the only way or even the best - it's just the way I 
thought of doing it.  I admit that I have always been very confused about 
the usefulness of RA and AR types.  They make absolutely no sense in the 
tape world. 
Thanks, 
Kevin D. Butt
From: "Roger Cummings" <roger_cummings at symantec.com>
Sent by: owner-t10 at t10.org
Sent: 12/14/2007 12:10 PM
To: Kevin D Butt/Tucson/IBM at IBMUS
Cc: <t10 at t10.org>
Subject: RE: Persistent Reservation Proposal - Group Reservations
Kevin, 
First of all, let me say that I completely support what you're trying to 
do here. I think that providing a method in persistent reservations (PRs) 
to support shared access between ONLY a specifically-designated set of 
systems is a worthy goal, and something we should do in SPC-4. 
Adding a set of Transport IDs to Reserve as per your document 08-024 & 
08-025 is certainly possible, but it's a massive change to the way that 
PRs work today, and it throws up a bunch of nasty corner cases and 
backwards compatibility issues. 
The massive change comes from the fact that now the Target will have to 
remember which registrations are in the Reservation, and which are not. It 
will probably have to preserve all of the transport information for the 
life of the reservation. 
The corner cases are things like, what happens if there's no longer a 
registration that corresponds to the transport ID in the Reserve? Does the 
Reserve succeed? What happens if a registration comes in later, after the 
reservation has been established - does that device get access? 
Backwards compatibility issues may arise like this: An existing device 
registers, and finds it has no access, so it does a PR In and finds out 
that a reservation is in place, retries its access and still it has no 
access. What does it do next, preempt the reservation because it assumes 
the Target is broken? 
Reserve also has to be an "atomic" command, and I've always thought that 
was why its functionality is as compact as it is today. Most of the 
complex operations related to addresses and keys are done at registration 
time, and those operations don't have to be atomic. 
One more thing: you chose for your new "group" reservations to follow the 
"all registrants" approach in terms of the definition of the reservation 
holder. While that's fine by me (obviously), I suspect there are also 
situations where group reservations that follow the "registrants only" 
approach might be useful. 
The bottom line from my point of view is this: Your proposal is feasible 
and we can probably make it work. But I wonder if there's an easier way to 
achieve the same goal that is more compatible with existing practice and 
requires less of a change in functionality on the Target side. 
What if we didn't add any new reservation types, but instead added some 
new functionality to the registration process? What I'm thinking of is a new 
Register feature that causes the Target to kill all existing 
registrations, create the registrations identified in the transport IDs in 
the Register command, and not accept any future registrations. That way, 
we don't need any changes to Reserve, and an Initiator with existing 
functionality would just not be able to register and therefore would not 
be confused. 
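A rough sketch of the Register feature described above, for comparison with the Reserve-based approach.  The method name register_exclusive and the choice to give every listed transport ID the same key are assumptions made for illustration only.

```python
class Target:
    """Toy logical unit for the 'closed registration' idea; names are hypothetical."""

    def __init__(self):
        self.registrations = {}   # transport ID -> reservation key
        self.closed = False       # True once future registrations are refused

    def register(self, transport_id, key):
        """Ordinary registration; refused once registration has been closed."""
        if self.closed:
            return False
        self.registrations[transport_id] = key
        return True

    def register_exclusive(self, key, transport_ids):
        """Kill existing registrations, register the listed IDs, close the door."""
        self.registrations = {tid: key for tid in transport_ids}
        self.closed = True
```

An existing initiator that tries to register afterwards simply fails at the register step, so Reserve itself needs no changes.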
Does that make sense to you? Is there a chance this is an easier approach? 
If so, I'll write up a detailed proposal that's the equivalent of 08-025r0 
and we can compare and contrast at the next CAP. 
Again, thanks for getting this started, I think it's a worthwhile endeavor 
and I'll be glad to put some cycles towards defining this sort of 
functionality for SPC-4. 
Regards, 
Roger 
From: owner-t10 at t10.org [mailto:owner-t10 at t10.org] On Behalf Of Kevin D 
Butt
Sent: Monday, December 10, 2007 4:18 PM
To: t10 at t10.org
Subject: Persistent Reservation Proposal - Group Reservations
I have posted two documents related to an additional Persistent 
Reservation Type.  The first document is a presentation on where 
persistent reservations are today and where they fall short in the 
scenarios covered by the proposal.  It also covers the intent of the 
proposal and what will be proposed.  The second is the actual proposal. 
Your PDF file will be posted at:
http://www.t10.org/ftp/t10/document.08/08-024r0.pdf 
http://www.t10.org/ftp/t10/document.08/08-025r0.pdf
Normally, the posting/archiving process takes about 30 minutes. 
Kevin D. Butt