Mallikarjun C. cbm at rose.hp.com
Tue Jul 23 10:55:52 PDT 2002

INCITS T11.3 Mail Reflector
Paul and all,

I am not a tape expert, but let me add some general comments on this topic.

When an IU (or PDU) is lost due to a transport exception, the transport layer is
generally in the best position to detect and track the lost IU at that granularity.
Escalating to the SCSI ULP level has the virtue of being transport-agnostic,
but has the shortcoming that recovery is possible only at I/O
granularity.  My understanding of READ POSITION/LOCATE is that, besides
being tape-specific, they don't enable sub-I/O recovery the way transport-level
recovery does.  Please correct me if I'm mistaken.

In data center environments, I/O-level recovery is perhaps reasonable(*1).  But in long-haul
networking applications such as remote tape vaulting or asynchronous remote mirroring,
transport-level recovery is highly desirable because of the latency involved in
retransmitting the data for the entire I/O.  Long-haul networks are also more
prone to errors because of the amount of networking gear along the way, both in terms
of random errors and path failures.  iSCSI chose to define transport-level error
recovery for this reason, but made its deployment optional.
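
As a rough illustration of the latency argument above, the following sketch
compares resending an entire I/O with resending a single lost PDU.  Every
figure in it (I/O size, PDU size, link rate, round-trip time) and the function
name are hypothetical assumptions chosen for illustration, not measurements,
and the cost model (one round trip plus serialization time) is deliberately
simplistic.

```python
def retransmit_seconds(num_bytes, link_bytes_per_sec, rtt_sec):
    """Simplified cost of resending num_bytes: one round trip plus
    the time to serialize the bytes onto the link."""
    return rtt_sec + num_bytes / link_bytes_per_sec


# All values below are assumed for illustration only.
IO_SIZE = 1024 * 1024   # 1 MiB I/O
PDU_SIZE = 8 * 1024     # 8 KiB PDU
LINK = 12_500_000       # 100 Mbit/s long-haul link, in bytes per second
RTT = 0.05              # 50 ms round-trip time

io_level = retransmit_seconds(IO_SIZE, LINK, RTT)    # resend the whole I/O
pdu_level = retransmit_seconds(PDU_SIZE, LINK, RTT)  # resend only the lost PDU

print(f"I/O-level recovery: {io_level * 1000:.1f} ms")
print(f"PDU-level recovery: {pdu_level * 1000:.1f} ms")
```

Under these assumptions the gap grows with the I/O size and shrinks as the
link gets faster, which is consistent with I/O-level recovery being more
tolerable inside a data center than over a long-haul link.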

(*1) With one caveat.  In high-availability applications such as HP Service Guard
        that also guarantee service responsiveness, it is crucial to bound the time
        allowed for each I/O, even in a data center environment.  This again argues for
        transport-level recovery.

So I guess I'm suggesting that SCSI-level recovery is desirable in certain situations,
but predicting that the end is near for transport-level recovery seems premature.

Mallikarjun Chadalapaka
Networked Storage Architecture
Network Storage Solutions
Hewlett-Packard MS 5668 
Roseville CA 95747
cbm at rose.hp.com

----- Original Message ----- 
From: <Paul.A.Suhler at seagate.com>
To: <t10 at t10.org>; <t11_3 at mail.t11.org>; <snia-backup at snia.org>
Sent: Tuesday, July 23, 2002 10:10 AM
Subject: [T11.3] Re: Use of READ POSITION/LOCATE

> Dave Peterson asked for some more input in re the system vendor who
> strongly preferred transport level error detection and recovery.
> My understanding of their reasoning is that backup applications cannot
> tolerate the failure of even one command because of a transport error; it would
> mean the termination of the backup session.  I was told that even one
> failed command per week was way too many, since that night's backup would
> have to be repeated.
> Can we have input from Veritas, Legato, CA, and other backup vendors on
> tolerance for errors?  Can you recover automatically from loss of a medium
> (in a library)?  From failure of a command at the transport level?  If not,
> then the transport layer has to do everything possible to keep a command from
> failing and we have to keep all of the FCP-2 error recovery in FCP-3.
> Similarly for other transports.
> And what is your use of the two commands referenced -- to recover from
> medium or other errors, or simply to access particular backup sets and
> files on a successfully-written medium?  Or both?
> Thanks,
> Paul Suhler
> Seagate Removable Storage Solutions
> To Unsubscribe:
> mailto:t11_3-request at mail.t11.org?subject=unsubscribe
