Persistent Reservation Proposal - Group Reservations

Knight, Frederick Frederick.Knight at
Thu Dec 27 06:58:31 PST 2007


Would moving access restrictions from being based on the registration to
being based on a specific reservation type help for this?
Today, a bunch of initiators register - and that basically has no impact
on anyone's access.  When 1 initiator does the reservation, then that action
impacts all previous registrations (allowing continued access), and all
(non-registered) initiators (denying them access).  Anyone who registers
after that immediately joins the existing reservation.  That is what
group reservation is trying to deal with (getting free access under the
reservation just by doing a register - which is easy to do).
Could we create reservation types for:
  write exclusive - reserved only
  exclusive access - reserved only
This would create reservation types that require reservation actions to
access.  A simple registration all by itself would still have no impact
on access until an initiator also performed the reservation step.  Once any
initiator uses this new reservation type, a registered node would lose
access (a reservation conflict status) until that initiator also performed a
reserve function (with type reserved only).  This also means there would be
reservation holders (since every initiator does a reserve); so no need
to deal with the one reservation holder case (#5 below).
Once this reservation type (reserved only) is in place, an initiator
that is already registered but not reserved could not do I/O or change the
reservation type (reserves with other reservation types would fail).  Only a
reserved initiator could change the reservation type (with a new reserve).
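The difference between today's behavior and the suggested "reserved only" types can be sketched roughly as follows. This is a minimal model, not anything taken from SPC-4 or a proposal document: the class `LogicalUnit`, the constants, and the method names are all hypothetical, and it ignores keys, scope, and preempt entirely.

```python
# Illustrative model only: names below are hypothetical, not from SPC-4.
REGISTRANTS_ONLY = "write exclusive - registrants only"  # existing type
RESERVED_ONLY = "write exclusive - reserved only"        # type suggested above


class LogicalUnit:
    def __init__(self):
        self.registered = set()  # initiators that have registered
        self.reserved = set()    # initiators that have done the reserve step
        self.res_type = None     # None: no reservation in place

    def register(self, initiator):
        # Registration alone never changes anyone's access.
        self.registered.add(initiator)

    def reserve(self, initiator, res_type):
        if initiator not in self.registered:
            return "RESERVATION CONFLICT"
        changing_type = self.res_type == RESERVED_ONLY and res_type != RESERVED_ONLY
        if changing_type and initiator not in self.reserved:
            # Registered-but-not-reserved initiators cannot change the type.
            return "RESERVATION CONFLICT"
        self.res_type = res_type
        self.reserved.add(initiator)
        return "GOOD"

    def may_write(self, initiator):
        if self.res_type is None:
            return True
        if self.res_type == REGISTRANTS_ONLY:
            return initiator in self.registered  # today: registering is enough
        return initiator in self.reserved        # reserved only: must also reserve
```

With a registrants only reservation, a second registered initiator can write immediately; with the reserved only type it gets no access until it also issues its own reserve of that type.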
This would cover all cases below (1-6) except for #5.  As for #4
the process could be a little more protected.  With all the existing
types, the initiator just registers and preempts.  With these new types,
initiator would have to register, reserve, and then preempt.  Would that
the #4 requirement, or do you feel preempt can't have any changes at
My opinion is that a new reservation type could when it is used, create
requirements.  On the other hand, if you want to have preempt without
then we could exempt that 1 function from the reserve requirement.
The question would be a group ID.  Is one needed? or would the simple
to require a matching reservation (of type reserved only) be enough?
Using a
simple shared value wouldn't work for this idea because of the problem
would create for preempt (register, reserve with shared value, then
if you don't know the shared value, you can't preempt; so that would
using a shared value impractical; unless we exempt preempt from the
reserve requirement, and just allow register; preempt (without a
then, this approach could work.
More comments below on the existing proposal.
    Fred Knight
From: Kevin D Butt [mailto:kdbutt at] 
Sent: Monday, December 24, 2007 10:40 AM
To: Raymond Gilson
Cc: Christine R Knibloe; Knight, Frederick; Roger Cummings; t10 at
Subject: RE: Persistent Reservation Proposal - Group Reservations
Please see this font. 
Kevin D. Butt
SCSI & Fibre Channel Architect, Tape Firmware
MS 6TYA, 9000 S. Rita Rd., Tucson, AZ 85744
Tel: 520-799-2869 / 520-799-5280
Fax: 520-799-2723 (T/L:321)
Email address: kdbutt at
From: "Raymond Gilson" <raymond_gilson at>
Sent: 12/22/2007 07:06 AM
To: Kevin D Butt/Tucson/IBM at IBMUS
Cc: Christine R Knibloe/Tucson/IBM at IBMUS, "Knight, Frederick"
<Frederick.Knight at>, "Roger Cummings"
<roger_cummings at>, <t10 at>
Subject: RE: Persistent Reservation Proposal - Group Reservations
Comments in line
From: Kevin D Butt [mailto:kdbutt at] 
Sent: Friday, December 21, 2007 5:33 PM
To: Raymond Gilson
Cc: Christine R Knibloe; Knight, Frederick; Roger Cummings; t10 at
Subject: RE: Persistent Reservation Proposal - Group Reservations
Thanks for joining in.  Let me summarize what I think has been said by
all parties who have joined in these discussions. 
1) (From Ray) Some applications will have trouble providing a list of
Transport IDs. 
2) (From Fred) There is a desire to allow members of a cluster that were
not active at the creation of the reservation to join in. 
3) (From Kevin) Who can join or participate in a group reservation is
required to be controlled such that only those initiators that are part
of the cluster (i.e. group) can join. 
4) PREEMPT must be allowed (i.e., what we do cannot lock out PREEMPT or
make it not work correctly) 
5) Both Registrants Only and All Registrants types of functionality need
to be provided for. 
6) There should be an option for all target ports 
If we can't require a list of Transport ID's then it seems that the
suggested "shared secret" (not cryptographic, just some unique value
that is protected through obfuscation) is probably the best way to do
this.  This would be something akin to requiring the same reservation
key value.  However, using the reservation key does not seem to be
plausible because of how it is being used today.  What we need would be
some different value that would not be reportable via a Persistent
Reserve In.  This would keep third-party initiators from joining the
reservation.  If this new Group Reservation Identifier (GRID) were
added, that would take care of #1, #2, and #3 above. 
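As a rough illustration of the GRID idea just described: the sketch below assumes the identifier is an opaque byte string compared for equality, and that the status query never echoes it back. The class and method names are hypothetical; nothing here is defined by SPC-4 or by the posted proposal.

```python
# Hypothetical sketch of a Group Reservation Identifier (GRID) check.
class GroupReservation:
    def __init__(self, creator, grid):
        self._grid = grid               # shared value, never reported back
        self.participants = {creator}   # initiators inside the group

    def join(self, initiator, grid):
        # Only an initiator that supplies the matching GRID may join.
        if grid != self._grid:
            return "RESERVATION CONFLICT"
        self.participants.add(initiator)
        return "GOOD"

    def read_reservation(self):
        # Stand-in for a Persistent Reserve In style query: it lists the
        # participants but deliberately leaves the GRID out.
        return {"participants": sorted(self.participants)}
```

A cluster member that was down when the reservation was created can still join later, as long as it was given the GRID out of band; a third party that only reads the reservation state learns nothing that lets it join.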
For #5 above, it seems that we could provide that by adding a bit in the
reserve command that indicates "All Participants are reservation
holders".  If set to one, then it acts like an All Registrants.  If set
to zero then it acts like Registrants Only.  This includes in the
I'm not sure a bit is the right place for this.  Right now, it's
specified in the reservation type (registrants only, or all
registrants).  Creating a bit creates a place for conflicting
information to be supplied.  Are you suggesting this bit would apply to
only some reservation types, and be unused for other reservation types?
What would it mean if you did a registrants only reservation type, but
set the all registrants bit?
Another method would be to use all the existing reservation types (for
#5 above), but add a GRID bit to specify that the reservation applies to
only those that supply a matching GRID (all others get reservation
conflict until they supply a matching GRID).  Then it could in fact
apply to all reservation types.   
For #6 we use the ALL_TGT_PORTS bit the same as other reservations
Issues still to be resolved: 
a) Some systems won't want to require all initiators to send a
Persistent Reserve Out command.  Possible solution is to allow reserving
multiple initiators if a Transport ID list is sent.  Additional
initiators could join later if they have the GRID.  However this would
make it more complicated and if it is not needed I would rather not add
this option. 
In a practical sense, I cannot see how this could be avoided (all
initiators sending PRO) -- since PR requires trust and good behavior,
each initiator must make no assumption about what the protection level
is currently set at -- so it must verify the settings as the correct and
expected.  If the settings aren't as expected, it must bail out, or go
into error recovery to attempt to avoid messing up some other
application (a fist fight on the SAN for device control does nobody any
good).  I see no reason to provide for this (I do know that the current
command allows a registration for multiple ports, but I cannot imagine
using it in the real world).  
I'm not sure I understand the issue here.  How can a system that doesn't
want to send PR commands take advantage of the features offered by that
command?  Are you thinking of multi-path systems (where a single host
system has multiple initiators with access to the same target)?  How
does this new proposal make this different than the situation today
(where they need to use the transport ID list and the spec_i_pt bit), or
send PR-OUT from every initiator?  I guess I'm mostly agreeing that
good behavior is already required.
<<kdbutt: I am certainly willing to agree.  All could still be
registered by using the all_i_pt bit. However, I suspect there will be
those that will find this unacceptable.  Anybody who needs a way to add
all initiators who are currently registered to the group reservation,
please speak up (and comment on a method to accomplish this).>> 
I would suggest we do not want a way to add all currently registered
initiators to the group.  This would tend to have the potential to
enlarge the group beyond what is intended.  I'd prefer a method that
requires explicit action.
b) If the first I_T nexus sets the "All Participants are reservation
holders" to zero when it creates the reservation and then a subsequent
I_T nexus sets it to one, what is the behavior?  Change the type?
Reject?  Also, what is reported in the Report Full Status if All
Participants are reservation holders is set to zero?   
I don't think this is a problem -- once a reservation is established it
cannot be changed without a preempt, clear, or removal of the old.  If
this isn't true, I would want the attempt to change it to get
rejected.  I would expect a change to require a preempt type operation.
<<kdbutt: I think the correct response for a new participant that
attempts to change the type is to reject a command that attempts to
change the type. 
I don't think you can require a preempt/clear in order to change the
type.  The whole point of PR is that a reservation is present at all
times; you can change the type, you can move the owner of the
reservation (such as preempt on a registrants only type), but you never
want to lose the protection provided by the PR (see note 10 in SPC4 -
section 5.6 - clearing).
For what to return in Report Full Status if  "All Participants are
reservation holders" is set to zero, I am concerned about confusion.  In
reality, only the first is a reservation holder and therefore only the
first should set the reservation_holder bit to one.  However, there
would now be two groups that cannot be distinguished.  The first is
not the reservation_holder but part of the reservation and the second is
not the reservation_holder and not part of the reservation.  I think we
should probably add a "group reservation participant" bit to distinguish
the two.>> .   
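To make the reporting concern concrete, here is a small sketch of how two bits per registrant would separate the three cases when "All Participants are reservation holders" is zero. The field `group_participant` stands for the suggested new bit and is hypothetical, as are the class and function names.

```python
from dataclasses import dataclass


@dataclass
class FullStatusDescriptor:
    initiator: str
    reservation_holder: bool  # set only for the initiator that created the reservation
    group_participant: bool   # hypothetical new bit suggested in this thread


def report_full_status(creator, participants, registrants):
    """Build one descriptor per registered initiator."""
    return [
        FullStatusDescriptor(i, i == creator, i in participants)
        for i in registrants
    ]
```

Without the second field, a registrant that joined the group and a registrant outside it would both show only `reservation_holder` set to zero.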
c) If we go to this method of using the GRID to determine who can join,
then the Reservation Key may or may not be different. 
c-1) if the Reservation Key is different, then a PREEMPT of a
Reservation Key will do what? 
c-2) if the Reservation Key is the same, then a PREEMPT will act the
same as an AR or RO reservation today. 
c-3) Do we require the Reservation Key to be the same?
Preempt is of a reservation, not a key.  The keys currently are not
compared, and have no valid use (by the device) except that each
initiator has registered one, and only one at a time.  We don't want to
change this behavior -- a key is a random number assigned for some
external purpose that the device records and reports.  (My application
requires this to operate properly) 
<<kdbutt: Look at clause of SPC-4r11.	This looks to me like
the Reservation Key is used to decide between unregistering I_T nexuses
with the sent reservation key or if the reservation key is that of the
reservation holder, then removing the reservation and registrations of
all that have that reservation key.  My intent is not to change the
current behavior.>> 
Agreed Kevin.  A Preempt should impact the registration/reservation of
all those initiators with a key that matches the one that is being
preempted - the same as current behavior.
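The key-matching behavior agreed on here can be sketched as below. This is a simplified model under stated assumptions: it ignores reservation type, scope, and unit attentions, it keeps the preemptor's own registration even if its key matches, and the function name and argument layout are hypothetical.

```python
def preempt(registrations, holder, preemptor, target_key):
    """Simplified preempt: registrations maps initiator -> reservation key.

    Returns (surviving registrations, reservation holder after the preempt).
    """
    if preemptor not in registrations:
        raise PermissionError("RESERVATION CONFLICT: preemptor is not registered")
    # Remove every registration whose key matches the preempted key.
    survivors = {
        initiator: key
        for initiator, key in registrations.items()
        if key != target_key or initiator == preemptor
    }
    # If the holder was among those removed, the reservation moves to the
    # preemptor; the device is never left without the reservation.
    if holder is not None and holder not in survivors:
        holder = preemptor
    return survivors, holder
```

For example, with registrations `{"A": 1, "B": 1, "C": 2}` and holder `"A"`, a preempt of key 1 issued by `"C"` leaves only `"C"` registered and holding the reservation.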
d) Does this approach still have the issues that Roger was concerned
about (e.g., the corner cases)? 
I hope the use of a GRID would not introduce any new issues to SPR -- it
only prevents a registrant from becoming a reservation participant
without some external knowledge.  It doesn't prevent a registrant from
preempt, clear, or any other error recovery operations (and MUST not). 
I think this is one of the questions.  Error recovery is often one of
the cases where you end up with fist-fights out in the SAN over who owns
the device.  Hosts do exactly what you suggested above (host 1 checks
with PR-IN, doesn't like what it sees, and preempts and "fixes" it;
then, host 2 does exactly the same - and the fight is on).  It's
perfectly valid to want to leave this working as is.  I understand that
desire.  I just would like to discuss the possibility of improving the
situation.  If we can't or have other requirements not to change it,
that's fine. 
Kevin D. Butt
SCSI & Fibre Channel Architect, Tape Firmware
MS 6TYA, 9000 S. Rita Rd., Tucson, AZ 85744
Tel: 520-799-2869 / 520-799-5280
Fax: 520-799-2723 (T/L:321)
Email address: kdbutt at
From: "Raymond Gilson" <raymond_gilson at>
Sent: 12/21/2007 12:45 PM
To: "Roger Cummings" <roger_cummings at>, "Knight, Frederick"
<Frederick.Knight at>, Kevin D Butt/Tucson/IBM at IBMUS
Cc: <t10 at>, Christine R Knibloe/Tucson/IBM at IBMUS
Subject: RE: Persistent Reservation Proposal - Group Reservations
Several years ago I was trying to figure out a way to introduce a "JOIN"
function to the SPR.  The initiator would register, but that would not
grant it access to a reservation of the "joined only" type.  To join it,
the initiator would have to send a join SPR command -- we could add a
"shared secret" field to the join, so that only those initiators that
knew the secret could join. 
I think we will have a great deal of trouble with a "white list"
approach -- as an application, I have no idea what my port ID is (or
anything else for that matter). 
Would something like this make sense? 
Ray Gilson 
From: owner-t10 at [mailto:owner-t10 at] On Behalf Of Roger Cummings
Sent: Tuesday, December 18, 2007 10:24 AM
To: Knight, Frederick; Kevin D Butt
Cc: t10 at; Christine R Knibloe
Subject: RE: Persistent Reservation Proposal - Group Reservations
The way you clean up from a disaster is to Preempt, that's what it's
there for. Most of the applications that I know that will actually issue
a Preempt make it a very special function that doesn't happen in the
normal flow, and one app at least DOES require manual intervention of an
operator before kicking off the preempt. 
Yes, today, a Preempt has to be issued through a registered I_T nexus,
but a registration with the SPEC_I_PT bit doesn't have to come from an
already registered initiator - see Table 33 in SPC-4, and I don't
believe Kevin changed that in his proposal. 
For the future, however we define a "group" for the purposes of new
reservation types, we will have to make sure that an Initiator outside
of the "group" can issue a Preempt to handle the disaster recovery case.
From: Knight, Frederick [mailto:Frederick.Knight at] 
Sent: Tuesday, December 18, 2007 10:56 AM
To: Roger Cummings; Kevin D Butt
Cc: t10 at; Christine R Knibloe
Subject: RE: Persistent Reservation Proposal - Group Reservations
My question has had to do with differentiating the disaster clean up 
case from the non-cooperating host case. 
How do I clean up from a disaster?  If all my "reserved" initiators 
melt down, and there aren't any of them left anymore (because of 
a site disaster, or whatever), how does some other node come along 
and clean up so it can gain access? 
Would it require manual intervention?  Or, is there a way in the
that I can register and preempt the group reservation (does the use 
of the SPEC_I_PT bit allow this as you have suggested Roger).  I 
thought the SPEC_I_PT had to come from an already registered 
initiator (which in a disaster, none exist anymore). 
   Fred Knight 
From: Roger Cummings [mailto:roger_cummings at] 
Sent: Tuesday, December 18, 2007 10:03 AM
To: Kevin D Butt; Knight, Frederick
Cc: t10 at; Christine R Knibloe
Subject: RE: Persistent Reservation Proposal - Group Reservations
I'm sorry, I don't think it's as cut and dried as you make out. This
gets into some of the corner cases that I listed in my first response. 
The point to be made in response to Fred's case is that a third-party
can create registrations for a downed initiator (via the SPEC_I_PT bit),
so that when it comes up again it will be able to participate in the
reservation without having to register itself. 
Also, you say that "We have made provisions for adding members once the
reservation exists, but only one of the reservation holders can add
another entity." Two things in response to that: 
1) I didn't see any specific provision for adding members in your
proposal, so I presume you'd just issue another RESERVE with the same
type and the whole list of transport IDs to be included again, and thus
the Target would have a whole lot of work to do again to set up another
2) Is that really what you want, that a member of the existing group can
reissue the RESERVE with a whole bunch of different TransportIDs,
perhaps excluding some that were previously there? 
From: owner-t10 at [mailto:owner-t10 at] On Behalf Of Kevin D Butt
Sent: Monday, December 17, 2007 3:54 PM
To: Knight, Frederick
Cc: t10 at; Christine R Knibloe
Subject: RE: Persistent Reservation Proposal - Group Reservations
This is being proposed for SPC. 
There are multiple types of reservations.  In an environment where one
node of a cluster must join later, one of the other types can be used.
Either that or have an existing node in your cluster add the new node.
The whole intent of this Group reservation is to lock out everybody that
is not explicitly specified during the reserve.  We have made provisions
for adding members once the reservation exists, but only one of the
reservation holders can add another entity.  The new entity cannot add
itself.  This is the whole point of reservations (i.e., lock out others
from doing stuff while I think I have exclusive rights).
To put it in other words, to allow somebody to join the reservation of
their own accord without permission is EXACTLY what I am trying to
protect against. 
Kevin D. Butt
SCSI & Fibre Channel Architect, Tape Firmware
MS 6TYA, 9000 S. Rita Rd., Tucson, AZ 85744
Tel: 520-799-2869 / 520-799-5280
Fax: 520-799-2723 (T/L:321)
Email address: kdbutt at
From: "Knight, Frederick" <Frederick.Knight at>
Sent: 12/17/2007 01:38 PM
To: Kevin D Butt/Tucson/IBM at IBMUS
Subject: RE: Persistent Reservation Proposal - Group Reservations
Sorry, you can't require everyone to register before the reserve.
That's like saying my whole cluster can't boot because 1 node is down.
You need to have a way for a "down" initiator to join the fun after the fact.
I helped write a host cluster product that used a shared tape (failover
model).  The backup application would write to the tape.  If a system failure
ever happened, the backup application would failover to a different host.  It
would skip backwards on the tape for a few records, recognize where it left
off, and then resume
BUT, for some protection, we used reservations to make sure only 1 initiator
at a time could access the tape.  The interesting point however, is that we
were in the process of upgrading from old SCSI-2 RESERVE to using PR.
Because, we
have multiple HBAs in the host, and we wanted to be able to use more than 1 of
those HBAs (so we needed multiple reservations - aka PR).  Having this
(group reservations) would have been a real nice addition.
As for the RO/AR differences.  It seemed to be timing.  Registrants Only was
fairly early on (as I remember), and so implemented by several O/S vendors.
Later on, some issues were found (which got complicated spec-ees added to
address), but also, the All Registrants was added (which didn't have those
issues).  But, since there were implementations, it couldn't be removed like
the other old PR types that no one ever used.  Anyway, I agree, they offer
basically the same capabilities, but RO is already out there, and AR is
probably what new implementers are using (it's easier to understand and
implement from the host side).  Most of the differences are already
so there wouldn't be that much extra for you to write to have both types
(which I think would be better than a bit somewhere - do it the same way all
the others are done).  But, you could also just do the AR version, and let
someone else add the RO version if they want it.
Are you proposing this for tape only? or SPC in general?  I assume SPC
in general. 
  Fred Knight 
From: Kevin D Butt [mailto:kdbutt at] 
Sent: Monday, December 17, 2007 9:51 AM
To: Roger Cummings
Cc: t10 at
Subject: RE: Persistent Reservation Proposal - Group Reservations
Thank you for your feedback.  I am certainly willing to entertain other
methods for accomplishing the end goal in an easier fashion.  I am not
sure I understand how your proposed method makes it more backward
compatible.  In my proposal PRin would show a different type of
reservation and hence the application clients would not try to join the
reservation because they don't know about the type.  In your proposal,
application clients would not be allowed to register.  This is a
deviation from what they can always do today - unless there is a
resource issue.  This seems more disruptive to me.  I would assume that
there would be a new additional sense code added for UNABLE TO REGISTER
BECAUSE A GROUP RESERVATION IS IN PLACE (or analogous).  This would be a
new thing for failure to register and there would be pain at the
register point.  Perhaps that is better than at the reserve point - but
I would think that it would be better handled as a reservation conflict
since that is what it is instead of something the application client
does not understand. 
As for "all registrants" type vs. "registrants only" I didn't see where
the difference would be interesting, but I am not opposed to providing a
way to switch between which of these two types is done.  Whether it is
additional types or some bit during registration etc. 
As for some of the corner cases mentioned below, if each I_T nexus that
is supposed to be part of the group reservation is required to be
registered before the reservation is made, and if the reservation is
released when the last group reservation participant is unregistered,
then I think we don't have an issue. 
I would prefer that we work together to shape a mutually beneficial
proposal as opposed to have "competing" proposals.  I am willing to
modify my proposal where it can be made easier and such.  I am not sold
that my proposed method is the only way or even the best - it's just the
way I thought of doing it.  I admit that I have always been very
confused about the usefulness of RO and AR types.  They make absolutely
no sense in the tape world. 
Kevin D. Butt
SCSI & Fibre Channel Architect, Tape Firmware
MS 6TYA, 9000 S. Rita Rd., Tucson, AZ 85744
Tel: 520-799-2869 / 520-799-5280
Fax: 520-799-2723 (T/L:321)
Email address: kdbutt at
From: "Roger Cummings" <roger_cummings at>
Sent by: owner-t10 at
Sent: 12/14/2007 12:10 PM
To: Kevin D Butt/Tucson/IBM at IBMUS
Cc: <t10 at>
Subject: RE: Persistent Reservation Proposal - Group Reservations
First of all, let me say that I completely support what you're trying to
do here. I think that providing a method in persistent reservations
(PRs) to support shared access between ONLY a specifically-designated
set of systems is a worthy goal, and something we should do in SPC-4. 
Adding a set of Transport IDs to Reserve as per your document 08-024 &
08-025 is certainly possible, but it's a massive change to the way that
PRs work today, and it throws up a bunch of nasty corner cases and
backwards compatibility issues. 
The massive change comes from the fact that now the Target will have to
remember which registrations are in the Reservation, and which are not.
It will probably have to preserve all of the transport information for
the life of the reservation. 
The corner cases are things like, what happens if there's no longer a
registration that corresponds to the transport ID in the Reserve? Does
the Reserve succeed? What happens if a registration comes in later,
after the reservation has been established - does that device it get
Backwards compatibility issues may arise like this: An existing device
registers, and finds it has no access, so it does a PR In and finds out
that a reservation is in place, retries its access and still it has no
access. What does it do next, preempt the reservation because it assumes
the Target is broken? 
Reserve also has to be an "atomic" command, and I've always thought that
was why its functionality is as compact as it is today. Most of the
complex operations related to addresses and keys are done at
registration time, and those operations don't have to be atomic. 
One more thing: you chose for your new "group" reservations to follow
the "all registrants" approach in terms of the definition of the
reservation holder. While that's fine by me (obviously), I suspect there
are also situations where group reservations that follow the
"registrants only" approach might be useful. 
The bottom line from my point of view is this: Your proposal is feasible
and we can probably make it work. But I wonder if there's an easier way
to achieve the same goal that is more compatible with existing practice
and requires less of a change in functionality on the Target side. 
What if we didn't add any new reservations types, but instead added some
new functionality to the registration process? What I'm thinking of is a
new Register feature that causes the Target to kill all existing
registrations, create the registrations identified in the transport IDs
in the Register command, and not accept any future registrations. That
way, we don't need any changes to Reserve, and an Initiator with
existing functionality would just not be able to register and therefore
would not be confused. 
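The Register-based alternative described above could be modeled along these lines. This is only a sketch: the method name `register_closed_group` is hypothetical, and returning a reservation conflict for a refused registration is an assumption, since the status to report is not settled in this message.

```python
# Hypothetical model of a Register feature that closes the registrant set.
class Target:
    def __init__(self):
        self.registrations = {}  # transport ID -> reservation key
        self.closed = False      # True once no new registrations are accepted

    def register(self, transport_id, key):
        if self.closed:
            # Assumed status; the thread leaves the exact response open.
            return "RESERVATION CONFLICT"
        self.registrations[transport_id] = key
        return "GOOD"

    def register_closed_group(self, transport_ids, key):
        # Kill all existing registrations, create the listed ones,
        # and refuse any future registrations.
        self.registrations = {tid: key for tid in transport_ids}
        self.closed = True
        return "GOOD"
```

Because the restriction lives in registration, the Reserve step itself is untouched: an initiator outside the list never becomes a registrant, so it never reaches the point of being denied access under a reservation it believes it has joined.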
Does that make sense to you? Is there a chance this is an easier
approach? If so, I'll write up a detailed proposal that's the equivalent
of 08-025r0 and we can compare and contrast at the next CAP. 
Again, thanks for getting this started, I think it's a worthwhile
endeavor and I'll be glad to put some cycles towards defining this sort
of functionality for SPC-4. 
From: owner-t10 at [mailto:owner-t10 at] On Behalf Of Kevin D Butt
Sent: Monday, December 10, 2007 4:18 PM
To: t10 at
Subject: Persistent Reservation Proposal - Group Reservations
I have posted two documents related to an additional Persistent
Reservation Type.  The first document is a presentation on where
persistent reservations are today and where they fall short in the
scenarios covered by the proposal.  It also covers the intent of the
proposal and what will be proposed.  The second is the actual proposal 
Your PDF file will be posted at:
Normally, the posting/archiving process takes about 30 minutes. 
Kevin D. Butt
SCSI & Fibre Channel Architect, Tape Firmware, IBM
MS 6TYA, 9000 S. Rita Rd., Tucson, AZ 85744
Tel: 520-799-2869 / 520-799-5280
Fax: 520-799-2723 (T/L:321)
Email address: kdbutt at 
