SSC-2 Note 48

Kevin D Butt kdbutt at us.ibm.com
Thu Mar 7 09:00:22 PST 2002


* From the T10 Reflector (t10 at t10.org), posted by:
* "Kevin D Butt" <kdbutt at us.ibm.com>
*
Joe,
      I agree with your concern about Note 48: it can lead to data
integrity problems when used with certain formats.  For example, suppose
there is an undetected compression failure during a write, such that an
illegal (or incorrect) compression codeword is generated, and suppose the
part of the compressed data stream (CDS) which encompasses the bad codeword
falls after the access point in dataset N but before the end of dataset N.
In that case we can still decompress dataset N-1; even a record spanning
out of dataset N-1 into dataset N would be decompressible, because by the
definition of an access point that record would begin before it.
Similarly, we could decompress starting at an access point in dataset N+1.
The one thing we would almost certainly fail at is decompressing from the
access point in dataset N to the access point in dataset N+1, and we would
typically get a CRC failure.  There might be many records between these two
points.

Thus we have some length of CDS (as little as 8 bytes, or as many as
~806000 if the access points are in sequential datasets) which we cannot
decompress properly.  The problem is that the record boundaries are
embedded in the CDS, and an illegal codeword would typically make it
impossible for us to discern where those boundaries are.  What we do know
is how many records (and filemarks) are supposed to be contained in that
length of CDS.

As an example, we might have 600000 bytes of CDS and know that it
corresponds to 3 records.  For the sake of argument, let's say the illegal
codeword is in the second of these records.  In that case we could even
decompress the first record and give it to the host without error.  But we
would not typically know how many bytes of the CDS are associated with the
second record, and consequently we don't know how many are associated with
the third.  Let's consider this case in the context of your proposed
rewording of Note 48:


   =========================================


   When compressed data is encountered on the medium that the device is
   unable to decompress, the device should treat each logical block of the
   data similarly to a block that cannot be read due to a permanent read
   media error, i.e.: transfer all data to the initiator up to the
   beginning of the first non-decompressible block; set a contingent
   allegiance indicating the error (0x03, 0x11, 0x0E - CANNOT DECOMPRESS
   USING DECLARED ALGORITHM?); set the VALID, ILI, and INFORMATION fields
   according to the original (uncompressed) state of the block; and set the
   current logical position to the following logical block, whether
   decompressible or not.


   This will allow the initiator to issue subsequent reads to the device,
   each failing, until the non-decompressible region is exited. This
   mechanism is directly analogous to the method the initiator may use to
   'step' its way through a damaged area of tape (a sequence of logical
   blocks with media errors).


   =========================================
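Your proposed semantics can be modeled with a small sketch (the class and
names below are mine, purely illustrative, not from the standard): each
read of a non-decompressible block fails with the proposed sense data, but
the logical position still advances, which is what lets the initiator step
through the damaged span:

```python
# Sense data proposed in the rewording: sense key 03h (MEDIUM ERROR),
# ASC 11h, ASCQ 0Eh (CANNOT DECOMPRESS USING DECLARED ALGORITHM).
SENSE_CANNOT_DECOMPRESS = (0x03, 0x11, 0x0E)

class TapeModel:
    """Toy model of a drive following the reworded Note 48."""

    def __init__(self, blocks):
        # blocks: list of (data, decompressible); data is None for a block
        # whose CDS cannot be decoded.
        self.blocks = blocks
        self.pos = 0

    def read(self):
        data, ok = self.blocks[self.pos]
        # Position advances whether or not the block was decompressible --
        # the key point of the rewording.
        self.pos += 1
        if not ok:
            return {"status": "CHECK CONDITION",
                    "sense": SENSE_CANNOT_DECOMPRESS}
        return {"status": "GOOD", "data": data}

tape = TapeModel([(b"rec1", True), (None, False),
                  (None, False), (b"rec4", True)])
for _ in range(4):
    print(tape.read()["status"])
# Two failing reads in the middle, then the next good record is reached.
```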


I agree with your broad strokes -- that is, the application should get
error codes for the second and third records, but it should be able to
traverse these and then continue reading the next record (which would
correspond to the first record of dataset N+1).  As for the specifics:
given that we were able to decompress the first record (let's say it
corresponds to 198000 bytes of CDS), how do we apportion the remaining
402000 bytes between the remaining two records so that we can set ILI?
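A back-of-the-envelope way to see how underdetermined that apportionment
is (this counting is mine, purely illustrative): without the boundaries
embedded in the CDS, every composition of the damaged span into the known
record count is equally plausible:

```python
from math import comb

def possible_splits(total_bytes: int, records: int) -> int:
    """Number of ways to split total_bytes of CDS into `records` non-empty
    compressed records: compositions of an integer, C(total-1, records-1)."""
    return comb(total_bytes - 1, records - 1)

# Two remaining records sharing 402000 bytes of undecodable CDS --
# one candidate split per possible boundary position:
print(possible_splits(402_000, 2))
```

With no way to prefer one split over another, any ILI/INFORMATION values
the drive reports for those records would be guesses.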

The key point of the whole Note, from my perspective, is "and set the
current logical position to the following logical block, whether
decompressible or not".  On this I agree with you completely; it is the
only way we would be able to allow an application to traverse a
non-decompressible region without data integrity issues.


On your other point, some people think of 'logical block' and 'record' as
fully interchangeable, but there are some subtle differences.  A 'logical
block' might refer to either a record or a filemark.


On the issue of filemarks, let's say the 402000 bytes discussed above were
associated with two records and a filemark (instead of just two records).
In this case we cannot know whether the filemark was the first, second, or
third entity in the 402000 bytes (in this specific case it could only be
the third if the access point in DS N+1 were at zero, but that is a side
issue).   The question is whether, even with the rewording of Note 48, we
don't still have a data integrity issue.  As an example, it might be the
intention of the application to read each block until a filemark is
encountered and then grab the next 20 records.  In that case it makes a
big difference whether the filemark is the first, second, or third entity
in the non-decompressible area.  Even if the application's intent were
more generic, and it would grab all of the records after the filemark
(e.g. until the next filemark), it cannot know whether it is missing some
non-decompressible records (e.g. if the filemark were the first or second
entity), or whether it will in fact get everything it wants (e.g. the
filemark is the third entity).
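The three possible placements can be enumerated in a toy sketch (again my
own illustration): for an application that skips to the filemark and then
reads on, the number of wanted-but-unreadable records differs in every
case, and the drive cannot tell the application which case it is in:

```python
from itertools import permutations

# The damaged span holds two records ("R") and one filemark ("FM"), in an
# order the drive cannot recover from the undecodable CDS.
orderings = sorted(set(permutations(("R", "R", "FM"))))

for order in orderings:
    # Records after the filemark are ones the application wants but cannot
    # read; records before it would have been skipped over anyway.
    wanted_but_lost = len(order) - 1 - order.index("FM")
    print(order, "-> unreadable records after FM:", wanted_but_lost)
```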


Also, is it possible to post the proper error indicators (set a contingent
allegiance indicating the error (0x03, 0x11, 0x0E - CANNOT DECOMPRESS USING
DECLARED ALGORITHM?); set the VALID, ILI, and INFORMATION fields according
to the original (uncompressed) state of the block) without knowing whether
the logical entity was a record or a filemark?

Kevin D. Butt
IBM Tape Products
SCSI and Fibre Channel Microcode Development
6TYA, 9032 S. Rita Rd.
Tucson, AZ  85744
Office:  (520)799-5280, Tie-line 321
Lab: (520)799-2869
Fax:  (520)799-4062
Email:  kdbutt at us.ibm.com

*
* For T10 Reflector information, send a message with
* 'info t10' (no quotes) in the message body to majordomo at t10.org
