Tape Presentation - Re: FCL Error SSWG Conference Call for Today

Michael E. O'Donnell mike_odonnell at stortek.com
Tue Apr 15 12:30:44 PDT 1997

* From the SCSI Reflector (scsi at symbios.com), posted by:
* "Michael E. O'Donnell" <mike_odonnell at stortek.com>

Jim, Pete (et al.),

	After my presentation during the April X3T11 meeting, several people
asked that I post the text of the following paper both to the FC and
SCSI reflectors.

	=-=-= Mike =-=-=
  *   Michael E. O'Donnell            Storage Technology Corp      *
  *   mike_odonnell at stortek.com       2270 So. 88th Street         *
  *   303.673.3286                    Louisville, Co. 80028-4223   *

Pete Popov wrote:
> Jim,
> >       The tape people gave some more presentations on problems with
> >       tape and FC.  I would like to propose (and get response from
> >       people) the following on this issue: given our October deadline,
> >       and the fact that none of the alternatives that address this
> >       problem (tape operations in a non queued or queued but only 1
> >       command deep environment, so we do not have the out of order
> >       command problem) cleanly, why not require class 2 behavior when
> >       talking to a tape drive?
> Is this tape presentation on the problems with tape and FC available
> in some sort of printed form?
> Thank you,
> --
>  Pete Popov
>  Sony Electronics
>  Advanced Storage Development
>  3300 Zanker Rd, SJ3B2
>  San Jose, CA 95134
>  (408)955-5265 phone
>  (408)955-5066 fax
>  pete_popov at asd.sel.sony.com

Content-Type: text/plain; charset=iso-8859-1; name="ansi0402.txt"
Content-Transfer-Encoding: quoted-printable
Content-Disposition: inline; filename="ansi0402.txt"

                          Storage Technology Corporation

Date: 02 April 1997

To: X3T11 Membership

From: Michael E. O'Donnell
      Advisory Development Engineer
      Systems Development
      (303) 673-3286
      (303) 673-2568 (fax)
      e-mail - mike_odonnell at stortek.com

Subject: Tape Characteristics on FC-AL (Class 3)

Fibre Channel error detection as specified in PLDA rev 1.10 is
significantly lacking in detecting and recovering from errors for tape
devices at the FC-2 layer.  The current PLDA and FC Class 3 cannot
provide all the necessary mechanisms to guarantee frame delivery at the
FC-2 layer for tape drives.  I believe that inherently this is due to
the emphasis placed on loop performance and the `serial-less' behaviour
of disk (which was the original focus of this profile).  Note that there
are protocols in place above FC-2 to provide data integrity, but host
vendors would prefer this problem be solved with less ULP intervention.

This paper introduces several fundamental FC tape drive characteristics.
Additionally, it provides the reader an insight into the problems
introduced when unconfirmed frame delivery is the protocol used for tape
devices.

Fibre Channel Tape Device Assumptions

Tape devices operating on Fibre Channel interfaces do not provide native
transfer rates at the FC link rate.  Additionally, tape devices
typically employ internal buffers to mask the time to get to a block of
data (mechanical motions contribute to this delay: motor ramp-up times,
rewind, etc.).  For both these reasons, it seems practical to state that
tape devices attaching to FC loops will implement some buffering (both
to `hide' the slower native device data rates and to hide mechanical
delays).

Assumption: Tape Devices provide customer data buffering. Size and
            buffer management are vendor unique.

The use of tape devices implies serial operations.  The host and device
are typically in sync with regard to each other's perception of record
or block location on tape.  Hosts can instruct tape devices to position
implicitly (read, write) or explicitly (rewind, locate, space, etc.).
Tape drives perform no positioning on their own (short of their own
error recovery) so as to stay in sync with the host's perception of the
current block position on tape.  Write data is appended to current data
on media (versus disk writes, where data is re-written at the same
location until the disk is commanded to locate to a different sector).
For read operations, tape devices read a block of data and are then
positioned to read the next sequential block on tape (upon receipt of
the initiator's next read request).

There are occasions (due to media management or loss of the implicit
synchronization between the host and drive) where the host must
re-establish its view of tape/block position with that of the drive.
SCSI interface commands are provided that return block ID information to
the host.
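
The position check described above can be sketched as follows.  This is
only an illustration: DriveStub, read_position and locate are
hypothetical names standing in for whatever block ID and positioning
commands a given drive supports, not an actual driver API.

```python
class DriveStub:
    """Hypothetical stand-in for a tape drive that can report and set
    its current block position."""

    def __init__(self, position):
        self.position = position

    def read_position(self):
        # Models a command that returns the drive's current block ID.
        return self.position

    def locate(self, block_id):
        # Models an explicit positioning command.
        self.position = block_id


def resynchronize(expected_block, drive):
    """Compare the host's expected block with the drive's reported
    block and reposition the drive if they disagree.

    Returns True if the drive had to be repositioned."""
    reported = drive.read_position()
    if reported == expected_block:
        return False
    drive.locate(expected_block)
    return True


drive = DriveStub(position=77)      # drive has moved past the host's view
moved = resynchronize(76, drive)    # host still expects block 76
print(moved, drive.position)
```

The point of the sketch is that the host never guesses: it asks the
drive where it is before deciding whether any repositioning is needed.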

While queuing of tape commands provides some performance benefits at the
interface level, commands cannot be re-ordered by the tape device, as
data associated with the read/write operations would also be reordered
(which is unacceptable behaviour for a tape drive).

Assumption: Tape Devices execute commands serially. Commands (while they
            can be queued) are not re-ordered.
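
A minimal sketch of this assumption, using a hypothetical
SerialTapeQueue class: commands may be accepted ahead of execution, but
they leave the queue strictly in arrival order.

```python
from collections import deque


class SerialTapeQueue:
    """Hypothetical command queue for a tape device: queuing is
    allowed, re-ordering is not (strict first-in, first-out)."""

    def __init__(self):
        self._pending = deque()

    def enqueue(self, command):
        # Commands are only ever added at the tail.
        self._pending.append(command)

    def execute_next(self):
        # Commands are only ever taken from the head, so the data they
        # carry reaches the media in the order the host issued them.
        if not self._pending:
            return None
        return self._pending.popleft()


queue = SerialTapeQueue()
for cmd in ("write A", "write B", "write C"):
    queue.enqueue(cmd)

executed = [queue.execute_next() for _ in range(3)]
print(executed)
```

A disk-style scheduler would be free to pick any pending command; here
there is deliberately no such choice to make.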

Block sizes written to and read from a tape drive are not restricted by
any practical characteristic or limitation of the interface.  While
there may be block sizes that make more efficient use of the FC
interface, tape drives are generally not designed with these limitations
as a factor.  Customers still think in terms of block size on the media
and not the characteristics of the interface.  The interface merely
provides the path for the data to travel and not an intermediate nor
final storage location.

Assumption: Block or record size on media is unrelated to that
            transferred across the interface.

The timeout values chosen to ensure that a device completes an initiator
request are DIFFERENT from those values that are used to ensure proper
responses on the FC interface.  Again, due to mechanical reasons, the
serial nature of tape media, etc., device operations can execute for
many minutes (for example, format media or erase).  The behaviour of the
interface is independent of stream devices.  Common timers should not be
used for error detection mechanisms.  FC timers ensure efficient use of
the interface and are used to detect unresponsive nodes (not device
latencies or device timeouts).

Assumption: Device (ULP) error detection timers are different than those
            for detecting FC interface errors.
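
To illustrate why the two timers should be kept apart, here is a small
sketch.  The names and the timeout values (2 seconds for the interface,
one hour for the device) are assumptions chosen only for the example;
they are not taken from PLDA or any other profile.

```python
from dataclasses import dataclass


@dataclass
class Timers:
    fc_response_tov_s: float        # interface-level: is the node responding?
    device_completion_tov_s: float  # ULP-level: has the device finished?


def classify(elapsed_s, node_responsive, timers):
    """Classify a pending operation using two independent timers."""
    # An unresponsive node is an interface problem, judged on the
    # short interface timer.
    if not node_responsive and elapsed_s > timers.fc_response_tov_s:
        return "interface error"
    # A responsive node is only judged against the long device timer.
    if elapsed_s > timers.device_completion_tov_s:
        return "device timeout"
    return "in progress"


timers = Timers(fc_response_tov_s=2.0, device_completion_tov_s=3600.0)

# A 20-minute erase on a responsive node is not an error.
print(classify(1200.0, True, timers))
# Five seconds of silence from the node is an interface problem.
print(classify(5.0, False, timers))
```

With a single shared timer, either the long erase would be reported as a
failure or the unresponsive node would go unnoticed for an hour.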

Fundamental Problems with Tape Operations on a Class 3 FC AL

While Upper Level Protocols can be required for error
detection/management, detection at the FC-2 layer should be attempted
first.  Several HBA vendors contend that using ULP timeouts is an
inefficient mechanism to detect and recover from FC frame transmission
errors.

The fundamental problem with using Class 3 today is that there is no
confirmation of frame delivery ("ship and pray").  Some frame delivery
can be deduced by the sender as follows:

I.      a command was successfully received because:
         - an FCP Transfer ready was sent by the command recipient
         - or FCP read data was received
         - or a response was received

II.     write data was successfully received because:
         - an FCP Transfer Ready was received
         - or a response was received.
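
The two rules above can be written down as a small lookup.  This is a
sketch only: the frame labels are illustrative (FCP_XFER_RDY is used
here for the Transfer Ready frame) and delivery_deduced is a
hypothetical helper, not part of any FC stack.

```python
# For each frame type an initiator sends, the frame types whose later
# arrival implies that the sent frame was delivered.
IMPLIED_BY = {
    "FCP_CMD": {"FCP_XFER_RDY", "FCP_DATA", "FCP_RSP"},   # rule I
    "FCP_DATA": {"FCP_XFER_RDY", "FCP_RSP"},              # rule II (write data)
}


def delivery_deduced(sent_frame, received_after):
    """Return True if any frame received afterwards implies that
    sent_frame arrived at its destination."""
    implying = IMPLIED_BY.get(sent_frame, set())
    return bool(implying & set(received_after))


# A command followed by read data: delivery of the command is implied.
print(delivery_deduced("FCP_CMD", ["FCP_DATA"]))
# A command followed by silence: nothing can be deduced.
print(delivery_deduced("FCP_CMD", []))
# A response has no entry in the table: nothing ever confirms it.
print(delivery_deduced("FCP_RSP", ["FCP_CMD"]))
```

Note that the table has no entry for frames the target sends back, which
is exactly the gap in Class 3.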

However, receipt of FCP data and FCP responses cannot be implicitly
detected.  Today, PLDA defines detection of these missing frames by
using timeouts (which is also stated as being optional for targets!).

Potential Tape Operational Errors

Because of this shortcoming there are several scenarios where the
initiator and target can get 'out of sync' with regards to block
positioning on tape.

                  Out of Sync Example #1

Assume the initiator and target are both positioned to process the 76th
block on tape.  The initiator opens the target, sends an FCP_CMD frame
to read block 76 but receives NO response frame or data frame from the
target.

Initiator                           Target

      --- FCP_CMD (read) ------->

            < missing frames >
            < ULP_TOV >

In this scenario, the initiator assumes the target did not get the
command and re-issues it.  Assuming the command DID NOT arrive at the
target, the initiator is still 'in sync' with the drive.  A read
operation would return block 76.

                  Out of Sync Example #2

Assume the initiator and target are both positioned as above (to read
the 76th block on tape).  The initiator opens the target, sends an
FCP_CMD frame to read block 76 but receives NO response frame or data
frame from the target.

Initiator                                    Target

      --- FCP_CMD (read) ---------------------->

                   <~~~~ (corrupt FCP_DATA) ~~~~
                   <~~~~ (corrupt FCP_RSP ) ~~~~

      < missing frames >
      < ULP_TOV >

In this scenario, the initiator assumes the target did not get the
command and re-issues it.  However, the assumption is incorrect!  The
drive DID process the read command and sent FCP_DATA and FCP_RSP.  The
drive is now positioned to process block 77.  If the initiator re-issues
the read command (assuming that the drive never received the command, as
in Out of Sync Example #1), the drive processes block 77 (in the case of
an FCP_CMD read, it would return the WRONG data!).
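
The difference between the two examples can be reproduced with a toy
model.  This is a sketch under simplifying assumptions: TapeDrive and
read_with_retry are hypothetical, the drive holds blocks in memory, and
the initiator retries exactly once after its ULP timeout expires.

```python
class TapeDrive:
    """Hypothetical sequential drive: every read returns the current
    block and advances the position by one."""

    def __init__(self, blocks, position):
        self.blocks = blocks
        self.position = position

    def read(self):
        data = self.blocks[self.position]
        self.position += 1
        return data


def read_with_retry(drive, lose_command, lose_reply):
    """Initiator-side read with one blind retry after a timeout."""
    if not lose_command:
        data = drive.read()          # drive executes and advances
        if not lose_reply:
            return data              # normal case, no timeout
    # Timeout expired.  The initiator cannot tell a lost command from
    # a lost reply, so it simply re-issues the read.
    return drive.read()


blocks = {n: f"block-{n}" for n in range(70, 80)}

# Example #1: the command never reached the drive.
d1 = TapeDrive(blocks, 76)
r1 = read_with_retry(d1, lose_command=True, lose_reply=False)

# Example #2: the drive executed the read but its frames were lost.
d2 = TapeDrive(blocks, 76)
r2 = read_with_retry(d2, lose_command=False, lose_reply=True)

print(r1, r2)
```

From the initiator's side both runs look identical (a timeout followed
by a successful read), yet only the first one delivers block 76.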

These examples can be extended to ANY tape operational request where the
drive is requested to alter its current position (i.e., Space
operations, Locate, etc.).  In certain instances, security issues arise.
For example, a drive is commanded to Erase the media following execution
of a 'duplicate' Space operation (like Out of Sync Example #2 above).
In effect, the drive could be commanded to Erase a record that should
not have been erased!

There have been several suggestions made to provide confirmed
frame delivery for reliable tape operations. This can be addressed by:

   +  status quo (i.e. FC-2 time out detection and letting the ULP
      manage FC-2 detected errors). This is what occurs today.
   +  operating tape in Class 2 service on current FC-AL topologies
      (changes required to PLDA or new PLDA-2)
   +  operating tape in Class 2 service only (on its own loop, switched
      or Point to Point connection)
   +  adding FCP controls to improve error detection and recovery at
      the FC-2 layer (requires FCP and PLDA changes).
   +  adding acks to tape operations.

"From a tape operational standpoint, it is imperative that Fibre Channel
error detection and recovery mechanisms ensure that the initiator and
target are in sync with respect to each other's view of block position."

* For SCSI Reflector information, send a message with
* 'info scsi' (no quotes) in the message body to majordomo at symbios.com