Re PLEASE EVALUATE ... Persistent Reserve

Lansing J Sloan anduin.ocf.llnl.gov!ljsloan
Thu Feb 2 12:18:46 PST 1995


SCSI Folks,

Here's a two-part response on the proposal Bob Snively sent out
on "Persistent Reserve."  The second part looks at the proposal
from the perspective of distributed SCSI I/O (in the sense
that I presented it at the January 11 1995 SCSI WG meeting).
Although quotations are from the first (January 24) e-mail,
I made a few changes in view of Bob's January 26 update.


PART 1: GENERAL COMMENTS (IGNORING DISTRIBUTED SCSI I/O)

> Bill Dallas of DEC has suggested a slightly different
> approach to the multi-port multi-initiator problem. ...

On the whole, these ideas look good.

...

> It is only cleared by a power cycle or by
> a properly qualified Persistent Reserve from another 
> Initiator.  ...

I understand that Fibre Channel, Serial Bus, and
SSA all potentially assign addresses dynamically, and that the
addresses can be re-assigned dynamically when the interconnect
medium reconfigures.  To the extent possible, X3T10 should ensure
that peripherals are alerted to fabric reconfiguration.  Peripherals
should treat reconfiguration as a potentially serious problem,
since the address granted a reservation may suddenly belong
to a different system.

With the technologies that dynamically reassign addresses,
ports typically have identifiers, supposed to be unique
world-wide, that do not change.

In particular, should such reconfiguration be an event for which
persistent reservations are cleared?  Alternatively, can
persistent reservations help when reconfiguration occurs?

...

> This command has no effect on reservations established by the
> normal RESERVE/RELEASE command and will receive a Reservation ...

Given the decision that every command description must specify
interactions with reservations, presumably every command must
specify how it interacts with Persistent reservations, also.

If a peripheral implements persistent reservations, should it
remain mandatory for the peripheral to implement (the mandatory
parts of) normal reservations?

...

> Discussion with Roger Cummings and Bill Dallas on
> Persistent Reservation provided these additional recommendations
> and clarifications:

...

> d)  Roger feels that the protection of peripherals from
>     unauthorized hosts should be done by the switch.

I concur, except where distributed I/O is concerned.  There I disagree.

...

Other

Normal SCSI reservations have several provisions for atomic
behavior.  However, some normal reservation operations cannot be
performed atomically.  X3T10 should make persistent reservations
at least as atomic as normal reservations.  X3T10 might also
consider making those operations atomic for normal reservations.

Specifically, X3T10 should consider providing a way to atomically 
supersede a single (LUN or Extent) reservation with multiple
Extent reservations that have distinct extent reservation
identification values.  X3T10 should also consider providing a way
to atomically supersede reservations of multiple adjacent Extents
with a single reservation of a single Extent, even when the
superseded reservations have distinct extent reservation
identification values.  The basic purpose is to be able to
fragment and re-combine extents while never having any interval
of time when an extent is unreserved.
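
To make the fragment and re-combine idea concrete, here is a minimal
sketch in Python.  It is only an illustration under stated assumptions,
not a proposed command format: the names Extent, split, merge and
covered are hypothetical, a LUN reservation is modeled as one extent
covering the whole LUN, and "atomic" is modeled by building the
complete replacement table before the caller installs it in a single
assignment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    start: int    # first logical block covered
    end: int      # one past the last logical block covered
    holder: str   # initiator the reservation is held for
    res_id: int   # extent reservation identification value


def by_start(extents):
    return sorted(extents, key=lambda e: e.start)


def covered(table, start, end):
    """True if every block in [start, end) lies inside some reservation."""
    pos = start
    for e in by_start(table):
        if pos >= end:
            break
        if e.start > pos:
            return False          # blocks pos..e.start are unreserved
        pos = max(pos, e.end)
    return pos >= end


def split(table, res_id, pieces):
    """Supersede one reservation with several adjacent ones.

    pieces is a list of (end, holder, new_res_id) in ascending order of
    end; together the pieces must cover the old reservation exactly.
    """
    old = next((e for e in table if e.res_id == res_id), None)
    if old is None:
        raise ValueError("no such reservation")
    new, start = [], old.start
    for end, holder, new_id in pieces:
        if not start < end <= old.end:
            raise ValueError("pieces must ascend and stay inside the old extent")
        new.append(Extent(start, end, holder, new_id))
        start = end
    if start != old.end:
        raise ValueError("pieces must cover the old extent exactly")
    return by_start([e for e in table if e.res_id != res_id] + new)


def merge(table, res_ids, holder, new_id):
    """Supersede adjacent reservations, even with distinct ids, with one."""
    ids = set(res_ids)
    old = by_start(e for e in table if e.res_id in ids)
    if not old or len(old) != len(ids):
        raise ValueError("unknown reservation id")
    for a, b in zip(old, old[1:]):
        if a.end != b.start:
            raise ValueError("reservations are not adjacent")
    merged = Extent(old[0].start, old[-1].end, holder, new_id)
    return by_start([e for e in table if e.res_id not in ids] + [merged])
```

Because split and merge return a complete new table, the old table
stays in force until the caller replaces it, so a check such as
covered(table, 0, lun_size) holds both before and after the swap.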

(Such atomic operations might or might not be important for
distributed I/O operations, as is discussed later.)


A suggestion by Jim McGrath for "group reservations" is deferred
until distributed SCSI I/O is discussed.


PART 2: REGARDING DISTRIBUTED SCSI I/O

As I presented Distributed SCSI I/O at the January 1995 Working
Group meeting, a "network-attached peripheral" such as a SCSI
peripheral is supposed to be fully controlled by its owner, say a
file server.  Some initiators (e.g., those that are part of the
file server) are "trusted initiators" that should have full
control of the peripheral.  Other initiators are "untrusted" and
should not be able to control the peripheral except to the extent
authorized by a trusted initiator.  Somehow a peripheral is
configured to know which initiators are trusted.

To provide high availability, a file server may contain
multiple redundant initiators, and the proposal is valuable for
specifying how a healthy trusted initiator can take over from a
dead trusted initiator.

Thus the comments here are directed toward ensuring that
untrusted initiators cannot acquire control.

Since the distributed SCSI I/O ideas do not represent approved
ideas of X3T10, it should be apparent that the details below
may change depending upon whether and just how X3T10 specifies
distributed SCSI I/O in the future.

...

> Bill defines a new function called "Persistent Reserve".
> It has all the properties of a normal reserve except that
> it is not cleared by Target or LUN Reset (Bus Device Reset) or by 
> SCSI RST for those protocols that have that capability.  

If Distributed SCSI I/O relies on reservations,
this feature seems quite desirable.

> It is only cleared by a power cycle or by
> a properly qualified Persistent Reserve from another 
> Initiator.  ...

For Distributed SCSI I/O, clearing after a power cycle
might be a problem.  To avoid problems,
after a power cycle a peripheral must make itself unavailable
to any untrusted initiators.  It might do so either by reserving
itself to a trusted initiator automatically after a power cycle,
or by having a rule that untrusted initiators never can access
unreserved LUNs and Extents.

Thus it is important that X3T10 permit peripherals to reserve
themselves automatically to trusted initiators after a power
cycle or to make unreserved LUNs and Extents inaccessible to
untrusted initiators.  (Ditto when interconnection fabrics
reconfigure and reassign addresses.)

Further, X3T10 should permit peripherals to reject all Persistent
Reserve commands from untrusted initiators, even if they are
"properly qualified."  (More generally, X3T10 should permit
peripherals to reject most commands from untrusted initiators,
once details of distributed I/O are worked out.)
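
The two power-cycle alternatives and the rejection rule above can be
sketched as follows.  This is a rough Python model under assumptions of
my own choosing, not anything X3T10 has specified: the class Lun, its
method names, and the initiator names in the usage note are all
hypothetical, and a single whole-LUN reservation stands in for LUN and
Extent reservations alike.

```python
class Lun:
    """Toy model of one LUN that distinguishes trusted initiators."""

    def __init__(self, trusted, auto_reserve_to=None):
        self.trusted = frozenset(trusted)
        if auto_reserve_to is not None and auto_reserve_to not in self.trusted:
            raise ValueError("can only auto-reserve to a trusted initiator")
        self.auto_reserve_to = auto_reserve_to
        self.reserved_for = None
        self.power_cycle()

    def power_cycle(self):
        # A power cycle clears the reservation.  Alternative 1: the
        # peripheral immediately reserves itself to a trusted initiator.
        # With auto_reserve_to=None the LUN is simply left unreserved.
        self.reserved_for = self.auto_reserve_to

    def reserve(self, requester, beneficiary):
        # Reservation commands from untrusted initiators are rejected,
        # even if otherwise properly qualified.
        if requester not in self.trusted:
            return False
        self.reserved_for = beneficiary
        return True

    def release(self, requester):
        if requester not in self.trusted:
            return False
        self.reserved_for = None
        return True

    def may_access(self, initiator):
        if self.reserved_for is None:
            # Alternative 2: unreserved storage is off limits to
            # untrusted initiators.
            return initiator in self.trusted
        return initiator == self.reserved_for
```

For example, with Lun({"fs-a"}) an untrusted "client-1" is refused until
"fs-a" reserves the LUN on its behalf, and is refused again after a
power cycle.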

...


> Discussion with Roger Cummings and Bill Dallas on 
> Persistent Reservation provided these additional recommendations
> and clarifications:

...

> d)  Roger feels that the protection of peripherals from
>     unauthorized hosts should be done by the switch.

With distributed SCSI I/O, a key intent was for peripherals
to exchange data directly with untrusted initiators when
blessed by trusted initiators.  That means it should be
possible to exchange "data" but not certain "control" directly
with untrusted initiators.  We believe such fine,
distributed-SCSI-specific distinctions are best made by the SCSI
ports instead of the switch.

However, a switch can play several important roles.  Most
important is that it deliver to the correct destination and
that it prevent forgery of source identifiers.  If the SCSI
ports distinguish "trusted" from "untrusted" initiators based
on initiator identifiers, it is necessary to ensure that those
identifiers are correct.  Switches might have such capabilities,
and I understand current Fibre Channel switches do have them.

Another possible switch service is to permit the creation of
logical sub-fabrics.  For instance, it would be useful to create
a sub-fabric that interconnects all of a file server's SCSI
ports (including the network-attached peripheral ports).  The
addresses used on such a sub-fabric might be distinctive.  In that
way, a peripheral could be easily configured to recognize trusted
initiators because the initiators on the sub-fabric would be the
trusted ones.  I am not aware of switches that provide quite this
service, however.

...

> f)  All the task management functions violate reservations.

So distributed SCSI I/O should ensure untrusted initiators
cannot perform them.

...

Other

Possible ideas for more atomic reservation operations were
discussed before discussing distributed SCSI I/O.  Their 
importance for distributed I/O depends on which mechanisms,
if any, X3T10 adopts.  Here are some considerations.

If the distributed I/O functions do not depend on reservations,
obviously distributed I/O has no special need for atomicity.
This might be the case if distributed I/O were limited to
transfers between processors and peripherals and if the
peripherals acted as initiators (e.g., the COPY command).

If the distributed I/O functions rely on reservations and if
peripherals prohibit access by untrusted initiators to unreserved
LUNs and Extents, then there is little requirement for new atomic
capability.  To allow access by an untrusted initiator, a trusted
initiator changes a LUN (or Extent(s)) from unreserved to third
party reserved.  When access should be stopped, the trusted
initiator releases the reservation.  Normal reserve rules are
already sufficiently atomic, and persistent reservations should
have the same or better atomic properties.

If the distributed I/O functions rely on reservations, if
peripherals permit access by untrusted initiators to unreserved
LUNs, and if only LUN (and not Extent) reservations are used,
normal reserve rules are already sufficiently atomic, and
persistent reservations should have the same or better atomic
properties.

If the distributed I/O functions rely on Extent reservations and
if peripherals allow access by untrusted initiators to unreserved
LUNs and Extents, then it appears that more atomicity is needed.
To prevent improper access, a trusted initiator keeps a LUN or
all Extents reserved to itself except that it makes third party
reservations to allow access by untrusted initiators to specific 
extents.  In some way, a reservation for a LUN (or Extent) must be
replaceable by multiple reservations of Extents such that
  o  no LUN or Extent is ever unreserved,
  o  some Extent(s) can remain reserved and inaccessible, and
  o  other Extent(s) become reserved to an untrusted initiator.
Similarly, when the third-party reservations are finished there
should be a mechanism to merge adjacent Extents, even if they
have different extent reservation identifiers, to prevent
Extent fragmentation, while ensuring
  o  no LUN or Extent is ever unreserved.
Some ideas were listed before Distributed SCSI I/O was discussed.
This paragraph points out why such ideas may be important.
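
The four cases above reduce to a small decision rule, sketched here in
Python.  The function and parameter names are hypothetical, and the
rule only restates the cases as I have described them; it would change
with whatever mechanisms X3T10 eventually adopts.

```python
def needs_new_atomic_operations(relies_on_reservations,
                                untrusted_may_use_unreserved,
                                uses_extent_reservations):
    """Return True only in the case where more atomicity appears needed."""
    if not relies_on_reservations:
        return False   # distributed I/O does not depend on reservations
    if not untrusted_may_use_unreserved:
        return False   # unreserved storage is already inaccessible
    # Untrusted initiators can reach unreserved storage, so the only
    # remaining question is whether Extents can become unreserved
    # while reservations are being fragmented or merged.
    return uses_extent_reservations
```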


When I talked at the January 1995 SCSI Working Group, Jim
McGrath asked if I wanted "group reservations" to allow
multiple controlling initiators.  I'd not been thinking in
such terms.  However, this is an intriguing idea.  It should
fit well with some of the ideas already in use at the National
Storage Laboratory (NSL).  (Specifically, it seems to fit in
well when a processor, such as a massively parallel processor, has
multiple ports and exchanges data with a peripheral.  It seems
desirable to give a reservation to the set of initiator ports.
That way, if one port is momentarily busy, an idle port can be
substituted.)  I urge X3T10 to consider group reservations.
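
As a rough illustration of what a group reservation might buy, here is
a minimal Python sketch.  It assumes a reservation is simply held by a
set of initiator ports; GroupReservation, pick_port and the port names
in the usage note are hypothetical and do not come from any proposal.

```python
class GroupReservation:
    """A reservation held by a set of initiator ports instead of one."""

    def __init__(self, ports):
        self.ports = frozenset(ports)

    def permits(self, port):
        # Any member of the group may use the reserved LUN or Extent.
        return port in self.ports


def pick_port(reservation, busy):
    """Return some port in the group that is not busy, or None."""
    for port in sorted(reservation.ports):
        if port not in busy:
            return port
    return None
```

With a group of "mpp-0" and "mpp-1", pick_port selects "mpp-1" whenever
"mpp-0" is busy, and the peripheral accepts either port without a new
reservation being made.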

Lansing Sloan (ljsloan at llnl.gov)
reflector: scsi



