Drive XOR -- proposal for distributed XOR

Paul Hodges phodges at VNET.IBM.COM
Fri Mar 17 15:52:08 PST 1995


At the March 6 meeting of the XOR study group, I made a brief
presentation of a scheme in which the XOR function for Rebuild and
Regenerate was distributed across the drives of an array, rather than
concentrated at one temporary initiator.  I promised to publish more
detail on the reflector to allow discussion before the May meeting.

Taking Regenerate as an example, the current proposal requires one
target to issue multiple READ commands and to do multiple XOR's on the
data.  In the proposal below, a chain of N drives is defined.  Each
drive receives a REGENERATE command from an initiator higher in the
chain and issues a single REGENERATE command to a drive lower in the
chain; the last drive in the chain receives a READ instead.  Each drive
receives data from the "lower" drive, does a single XOR, and passes the
result to the "higher" drive (or controller).  Thus, with a failed
drive, the additional workload is automatically shared among all
operational drives.
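
To make the workload argument concrete, here is a rough Python sketch
that counts, per drive, the commands issued and XOR's performed for one
Regenerate over n operational drives.  The function names and the
simple counting model are my own assumptions for illustration; they are
not taken from 94-111r6.

```python
def centralized(n):
    """Assumed model of the current proposal: drive 0 is the single target.

    Returns one (commands_issued, xors_done) pair per operational drive.
    """
    # The target reads its own data, issues n-1 READs and does n-1 XORs.
    return [(n - 1, n - 1)] + [(0, 0)] * (n - 1)


def chained(n):
    """Assumed model of the chained scheme over the same n drives."""
    # Every drive but the last issues one command and does one XOR;
    # the last drive only services a READ.
    return [(1, 1)] * (n - 1) + [(0, 0)]


if __name__ == "__main__":
    for n in (3, 5, 9):
        print(n, centralized(n), chained(n))
```

The total number of XOR's is the same in both models; only the
distribution across drives changes.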

The necessary commands and the sequences to do the RAID functions are
described below.


COMMAND DESCRIPTIONS
--------------------

The format of the commands is the same as that described in 94-111r6,
but the REBUILD and REGENERATE commands operate differently.  Obviously,
if both behaviors are allowed, there must be a control bit defined to
distinguish between them.


1. REGENERATE -- The target receiving a REGENERATE command also receives
Parameter Data describing the chain of M drives from which the XOR input
will be received.  Drive M corresponds to the last Source Descriptor in
the list.  The drive:

(a) Reads data from its disk.

(b) As a temporary initiator, issues a command directed to drive M:

      If there are two or more source descriptors in the received
      Parameter Data, the drive constructs a new parameter list by
      dropping the Source Descriptor for drive M.  It then issues a
      REGENERATE command, with Parameter Data containing source
      descriptors corresponding to drives 1 through (M-1).

      If there is only one source descriptor in the received Parameter
      Data, the drive issues a READ command.

(c) When data is received from drive M, performs the XOR and transmits
the result to the initiator from which it received the original command:

     If no Intermediate Data was received with the command, the drive
     XOR's the data from the disk and the data received from drive M.

     If Intermediate Data was received with the command, the drive XOR's
     the Intermediate Data, the data from the disk, and the data
     received from drive M. (used for hybrid subsystem)


2. REBUILD -- The target receiving a REBUILD command also receives
Parameter Data describing the chain of N drives from which data is to be
rebuilt.  Drive N corresponds to the last Source Descriptor in the
list.  The drive:

(a) As a temporary initiator, issues a command directed to drive N:

      If there are two or more source descriptors in the received
      Parameter Data, the drive constructs a new parameter list by
      dropping the Source Descriptor for drive N.  It then issues a
      REGENERATE command, with Parameter Data containing source
      descriptors corresponding to drives 1 through (N-1).

      If there is only one source descriptor in the received Parameter
      Data (N=1), the drive issues a READ command.  (used for dual copy
      and for hybrid subsystem)

(b) When data is received from drive N, writes data to its disk:

      If no Intermediate Data was received with the REBUILD command, the
      drive writes the data received from drive N to disk.

      If Intermediate Data was received with the REBUILD command, the
      drive XOR's the Intermediate Data with the data received from
      drive N, and writes the result to disk.  (used for updating parity
      with a failed data drive, and for hybrid subsystem)
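
The REBUILD steps can be sketched the same way.  Again this is a
minimal Python illustration under assumed names (xor2, regenerate,
rebuild) with drives modelled as dictionary entries; the REGENERATE
helper here omits Intermediate Data for brevity.

```python
def xor2(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


def regenerate(disks, drive, sources):
    """Chained REGENERATE without Intermediate Data (command 1)."""
    last = sources[-1]
    if len(sources) >= 2:
        received = regenerate(disks, last, sources[:-1])
    else:
        received = disks[last]               # plain READ
    return xor2(disks[drive], received)


def rebuild(disks, drive, sources, intermediate=None):
    """REBUILD as executed by `drive`; the last entry of `sources` is drive N."""
    last = sources[-1]
    if len(sources) >= 2:                    # (a) drop N, send REGENERATE
        received = regenerate(disks, last, sources[:-1])
    else:                                    # N=1: READ (dual copy, hybrid)
        received = disks[last]
    if intermediate is not None:             # (b) fold in Intermediate Data
        received = xor2(intermediate, received)
    disks[drive] = received                  # write the result to disk


if __name__ == "__main__":
    disks = {"A": b"\x01\x02", "B": b"\x10\x20", "C": b"\x0f\xf0"}
    rebuild(disks, "new", ["A", "B", "C"])   # rebuild a replaced drive
    rebuild(disks, "mirror", ["A"])          # dual copy: N=1
    print(disks["new"].hex(), disks["mirror"].hex())
```

Unlike REGENERATE, the drive executing REBUILD never reads its own
disk; it only writes.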


RAID OPERATIONS
---------------

-- REGENERATING DATA, READ OPERATION ADDRESSED TO A FAILED DRIVE:

The controller/host issues REGENERATE to a selected drive.  A chain of
REGENERATE commands to the remaining operational drives results, with
the "last" drive receiving a READ command.

The drive executing the READ reads data from its disk and transmits it
to the drive from which it received the READ command.

Each of the drives executing a REGENERATE reads data from its disk,
receives data in response to the command it issued, XOR's the two and
transmits the result to the drive (or controller) from which it received
the REGENERATE command.


-- REBUILDING AFTER REPAIR:

The controller/host issues REBUILD to the drive to be rebuilt.  A chain
of REGENERATE commands to the remaining drives results, with the "last"
drive receiving a READ command.

The drive executing the READ reads data from its disk and transmits it
to the drive from which it received the READ command.

Each of the drives executing a REGENERATE reads data from its disk,
receives data in response to the command it issued, XOR's the two and
transmits the result to the drive from which it received the REGENERATE
command.

The drive executing the REBUILD receives data in response to the command
it issued and writes it to disk.


-- UPDATING PARITY, WRITE OPERATION ADDRESSED TO A FAILED DRIVE:

The controller/host issues REBUILD to the parity drive, with
Intermediate Data (the data that would have been written to the data
drive).  A chain of REGENERATE commands to the remaining operational
drives results, with the "last" drive receiving a READ command.

The drive executing the READ reads data from its disk and transmits it
to the drive from which it received the READ command.

Each of the drives executing a REGENERATE reads data from its disk,
receives data in response to the command it issued, XOR's the two and
transmits the result to the drive from which it received the REGENERATE
command.

The drive executing the REBUILD receives data in response to the command
it issued, XOR's that data with the Intermediate Data received from the
controller, and writes the result to disk.
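
A small worked example may help show why this leaves the array
consistent.  The Python below is only an arithmetic check with made-up
byte values and an assumed three-data-drive layout (d0, d1, d2 plus
parity, with d1 failed); it does not model the command exchange itself.

```python
def xor(*blocks: bytes) -> bytes:
    """Bytewise XOR of any number of equal-length blocks."""
    out = bytes(len(blocks[0]))              # all-zero starting block
    for block in blocks:
        out = bytes(x ^ y for x, y in zip(out, block))
    return out


# Surviving data drives and the data the host wants on the failed drive d1.
d0, d2 = b"\x11\x22", b"\x0f\xf0"
new_d1 = b"\xaa\x55"                         # sent as Intermediate Data

# What the REGENERATE/READ chain returns to the parity drive.
chain_result = xor(d0, d2)

# The parity drive XOR's that with the Intermediate Data and writes it.
new_parity = xor(new_d1, chain_result)

# A later Regenerate for d1 XOR's the surviving drives, parity included.
recovered = xor(d0, d2, new_parity)
assert recovered == new_d1
```

The old parity is never read: the new parity is computed entirely from
the new data and the surviving data drives.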



Regards, Paul Hodges
-------------------------------------------------------------------------
:    Telephone:  408-256-6224                     Fax:  408-256-5151    :
:    Internet:   phodges at vnet.ibm.com             IBMMAIL:  USIBMCC9    :
-------------------------------------------------------------------------



