T10/97-155R0.TXT More Discussion of Tapes in PLDA

Doug Hagerman, Digital Equipment, 508-841-2145, Flames to /dev/null 27-Mar-1997 0932 hagerman at starch.enet.dec.com
Thu Mar 27 06:25:40 PST 1997


* From the SCSI Reflector (scsi at symbios.com), posted by:
* "Doug Hagerman, Digital Equipment, 508-841-2145, Flames to /dev/null  27-Mar-1997 0932" <hagerman at starch.ENET.dec.com>
*
"More Discussion of Tapes in PLDA"		T10/97-155R0

9703013

1. Introduction 

Here is another attempt at putting together a "what to do about
tapes on Fibre Channel" proposal.

2. Overview of the Fibre Channel Tape Problem

Tape devices have different performance requirements than disks.
The special characteristics of tapes in an FC-AL environment are
summarized as follows:

a. If a tape command or data transfer fails on the interconnect,
recovery requires more than simply reissuing the command.
The operating system driver software must manage the position of
the media by issuing a sequence of repositioning commands in addition
to reissuing the failed I/O command. This code is in SCSI tape
drivers now, but the mechanical process required to complete the
recovery may be time-consuming.

b. Relying on the SCSI command timeout to detect errors is unacceptable
because the timeout value must be set to a large number (e.g. 10 minutes)
to enable normal tape device operation. Such a timeout value is
acceptable only if the error rate at the physical level is low enough
that the timeout is exercised no more than once or twice a day.

c. When devices are swapped on an FC-AL loop the loop signal is
disrupted. It may not be possible to predict when this will occur,
but in some environments many devices may be swapped in a day.

d. The FC-AL loop may under normal conditions experience fairly
frequent random bit errors. A normal parallel SCSI bus experiences
errors at an extremely low rate--weeks may pass between parity errors.
It is not known how frequently bit errors will occur on a normally
operating FC-AL loop. Worst-case calculations indicate that
hardware complying with the standards may deliver a bit error
as often as once every 10 seconds.
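
The calculation behind the worst-case figure in item d is not given
here. The sketch below shows one way a number of that order can arise.
All three inputs are assumptions made for illustration and are not taken
from this proposal: a bit error rate of 1e-12 per link, a line rate of
1.0625 Gbit/s, and a large loop of 126 links that every frame must cross.

```python
# Illustrative worst-case arithmetic; every constant below is an assumption.
BIT_ERROR_RATE = 1e-12    # assumed worst-case bit error rate per link
LINE_RATE_BPS = 1.0625e9  # assumed line rate in bits per second
LINKS_IN_LOOP = 126       # assumed number of links in a large loop


def seconds_between_errors(ber, rate_bps, links):
    """Mean time between bit errors anywhere on the loop, in seconds."""
    errors_per_second = ber * rate_bps * links
    return 1.0 / errors_per_second


def errors_per_day(ber, rate_bps, links):
    """Expected number of bit errors on the loop in 24 hours."""
    return 86400.0 / seconds_between_errors(ber, rate_bps, links)


interval = seconds_between_errors(BIT_ERROR_RATE, LINE_RATE_BPS, LINKS_IN_LOOP)
per_day = errors_per_day(BIT_ERROR_RATE, LINE_RATE_BPS, LINKS_IN_LOOP)
print(f"one bit error roughly every {interval:.1f} s")
print(f"roughly {per_day:.0f} bit errors per day")
```

Under these assumptions the interval is a few seconds, the same order
of magnitude as the figure quoted in item d. Not every bit error lands
in a tape transfer, but even a small fraction of them would exceed the
"once or twice a day" budget in item b.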

The error rate actually delivered is open to debate. However, in
order to minimize risk at the system level, the PLDA profile must
have a "fence" against the worst case. The following is based on
that assumption.

A secondary goal is to avoid the introduction of Class 2 as a special
case for tapes. This is particularly important in the case of
subsystem controllers that must support both disk and tape device
models. How is the driver to know whether to send a given
INQUIRY command using Class 2 or Class 3? Must the driver handle
INQUIRY commands differently from READ or WRITE commands?

The best place to fix the tape problem is at the FCP level as
described in PLDA. FC-PH and SCSI are long-established, and changes
to SCSI driver software or FC-PH hardware are not desirable.
Furthermore, it has already been agreed that FCP could change if
a need can be demonstrated. Small changes to FCP and PLDA cause
the minimum amount of disturbance to the status quo.

3. Reliable Tape Transfers to Be Constrained in Size

It is widely (though not universally) agreed that ALL tape transfers
may be classified as one of the following:

a. Transfers where data integrity is required, and where a maximum
of 64kB will be transferred in any SCSI I/O command, or

b. Transfers where bulk data is being moved and a data error should be
ignored, and where the maximum transfer size may be greater than 64kB.

This is convenient both for tape drives and for controllers that
support tape drives on the back end. In both cases it helps
to be able to buffer all the data for a command before beginning
media movement.
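
The two classes above can be expressed as a small check that a driver or
controller might apply before accepting a command. This is only a
sketch: the function names, the 2048-byte frame payload, and the choice
to reject an oversized integrity-required transfer are assumptions made
for illustration and are not part of this proposal.

```python
# Hypothetical helper names; the limit comes from section 3, the rest is assumed.
RELIABLE_MAX_BYTES = 64 * 1024  # 64kB limit for integrity-required transfers
FRAME_PAYLOAD = 2048            # assumed data payload per frame, in bytes


def frames_needed(nbytes, payload=FRAME_PAYLOAD):
    """Number of data frames needed to carry nbytes (ceiling division)."""
    return -(-nbytes // payload)


def classify_transfer(nbytes, integrity_required):
    """Return 'reliable' for class a and 'bulk' for class b.

    An integrity-required transfer above the limit fits neither class,
    so this sketch rejects it (an assumed policy).
    """
    if not integrity_required:
        return "bulk"
    if nbytes <= RELIABLE_MAX_BYTES:
        return "reliable"
    raise ValueError("integrity-required transfer exceeds 64kB limit")


print(classify_transfer(64 * 1024, True))     # class a
print(classify_transfer(1024 * 1024, False))  # class b
print(frames_needed(64 * 1024))               # frames to buffer at the limit
```

With the assumed 2048-byte payload, a transfer at the 64kB limit
occupies 32 frames, so the whole command fits in a modest buffer
before media movement starts.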

4. Overview of Proposed Solution

Under this proposal, transfers would look like this:

========
WRITE: Transfer of "n" sequences. Maximum total data is 64kB, or
about 32 frames--typically in one or only a few sequences.

Initiator          Target

FCP_CMD ---------->
        <---------- FCP_XFR_RDY
DATA  a ---------->     This DATA sequence transferred successfully
DATA  b ---------->
DATA  c ---------->
DATA  d -----X.....     Error occurs at "X"
                        Error is detected by target using sequence count
                        All further frames and sequences ignored
                        Target waits RA_TOV to age any pending frames
        <---------- FCP_XFR_RDY
                        With offset set back to "a"
DATA  a ---------->
DATA  b ---------->
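
The part of the WRITE exchange shown above can be modeled in a few
lines. This is a simplified sketch, not the proposed FCP behavior
itself: the Frame fields, the function names, the 2048-byte payload and
the retry limit are all assumptions made for illustration. The RA_TOV
wait is only logged, not timed, and the sketch assumes the retried
transfer eventually completes.

```python
from dataclasses import dataclass

FRAME_PAYLOAD = 2048  # assumed data payload per frame, in bytes


@dataclass
class Frame:
    seq_cnt: int  # running frame count, used by the target to spot a gap
    offset: int   # relative offset of this payload within the transfer
    data: bytes


def make_frames(buf):
    """Split buf into data frames with consecutive counts, from offset 0."""
    return [Frame(cnt, off, buf[off:off + FRAME_PAYLOAD])
            for cnt, off in enumerate(range(0, len(buf), FRAME_PAYLOAD))]


def target_receive(frames, total_len):
    """Return the received bytes, or None if any frame was lost."""
    received = bytearray()
    expected = 0
    for frame in frames:
        if frame.seq_cnt != expected:
            return None  # gap detected: all further frames are ignored
        received += frame.data
        expected += 1
    if len(received) != total_len:
        return None  # trailing frames never arrived
    return bytes(received)


def write_with_retry(buf, drop_on_attempt, max_attempts=3):
    """Model the exchange; drop_on_attempt maps attempt -> lost frame index."""
    log = []
    for attempt in range(max_attempts):
        log.append("FCP_XFR_RDY offset=0")  # target asks for data from "a"
        lost = drop_on_attempt.get(attempt)
        delivered = [f for i, f in enumerate(make_frames(buf)) if i != lost]
        result = target_receive(delivered, len(buf))
        if result is not None:
            log.append("transfer complete")
            return result, log
        log.append("error detected; wait RA_TOV")
    return None, log


data = bytes(range(256)) * 256  # 64kB of test data
result, log = write_with_retry(data, {0: 3})  # lose one frame, first attempt
print("\n".join(log))
```

The point of the model is that recovery stays inside the exchange: the
target discards the damaged attempt and issues a second FCP_XFR_RDY, so
the initiator never has to fall back on the long SCSI command timeout.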