Fwd: T10 DIF for Error correction
shivaram.u at quadstor.com
Wed May 15 13:11:06 PDT 2013
Formatted message: http://www.t10.org/cgi-bin/ac.pl?t=r&f=r1305153_f.htm
Thanks a lot for the reply. Many points are now clear to me.
>> The APPLICATION TAG field originally was planned to correspond to physical
>> or logical drive letters in a system. This provides detection for a logical
>> block that is transferred without error (no CRC errors), directed to the
>> correct logical block address (no Reference Tag errors) but to the wrong
>> physical or virtual drive. Note that this use requires only one or a small
>> number of application tag values to be in use on any particular drive.
I just realized that the device doesn't really need to check the PI on a
read. This depends on the setting as reported by the Extended Inquiry VPD
page which means that an application cannot control it. Or can it ?
Assuming that the device does check application tags, what would be the
assumption on an application tag mismatch on a READ ?
For example the application asks for block X with reference tag Y and
application tag Z. The device determines that the block X reads fine, the
reference tag Y is as expected but the application tag Z does not match.
1. Does the device assume that the request from the client/application is
incorrect ?
2. Does the device assume that the logical block read is incorrect ?
If a device controller assumes (2), it can try to perform error recovery;
however, if (1) is assumed, then the command will be terminated with an error.
But I'm curious to know what a controller would assume on a mismatch in the
protection information when a possible redundant copy may be available.
On Wed, May 15, 2013 at 8:23 PM, Gerry Houlder
<gerry.houlder at seagate.com> wrote:
> It looks like you have given a lot of thought to use of the Protection
> Information (also called DIF) provided by the SBC-3 standard. Let me point
> out the original intent for the different parts of the Protection
> information fields.
> The GUARD field is a CRC that is useful for detecting the change of a few
> bits within a logical block. All transport protocols (e.g., SAS) use a CRC
> to protect data in transit, but these CRC fields tend to be generated
> before being placed on a bus, checked at the other end of the bus, and
> discarded. The CRC fields in these protocols might be more powerful (i.e.,
> can detect a larger number of changed bits) than the CRC used in the PI
> algorithm, but because they are only in place for a small part of the data
> path there are holes in the protection. The PI guard field stays in place
> across all data paths and storage elements and fills in all of these holes.
> The REFERENCE TAG field provides a value that is unique for each logical
> block (at least within a capacity of 1_0000_0000h logical blocks). This
> provides detection of logical blocks that were transferred without error
> (i.e., no CRC errors) but were written to the wrong Logical Block Address
> within a drive (or an extent with the same Application Tag). This can happen
> if RAID controllers or string controllers incorrectly translate an address
> or get a single bit error in an address.
> The APPLICATION TAG field originally was planned to correspond to physical
> or logical drive letters in a system. This provides detection for a logical
> block that is transferred without error (no CRC errors), directed to the
> correct logical block address (no Reference Tag errors) but to the wrong
> physical or virtual drive. Note that this use requires only one or a small
> number of application tag values to be in use on any particular drive.
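As an illustration of the three fields described above, here is a minimal Python sketch of the 8-byte protection information tuple (2-byte guard, 2-byte application tag, 4-byte reference tag, big-endian). The helper names `make_pi` and `check_pi` are hypothetical, and the reference tag is assumed to be the low 32 bits of the LBA, as in Type 1 protection.

```python
import struct
from typing import List

def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16 with polynomial 0x8BB7, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

def make_pi(data: bytes, lba: int, app_tag: int) -> bytes:
    """Build the 8-byte PI tuple: guard, application tag, reference tag."""
    # Assumption: reference tag = low 32 bits of the LBA (Type 1 convention).
    return struct.pack(">HHI", crc16_t10dif(data), app_tag, lba & 0xFFFFFFFF)

def check_pi(data: bytes, pi: bytes, expected_lba: int,
             expected_app_tag: int) -> List[str]:
    """Return the names of the PI fields that fail their check."""
    guard, app_tag, ref_tag = struct.unpack(">HHI", pi)
    errors = []
    if guard != crc16_t10dif(data):
        errors.append("guard")            # data bits changed
    if ref_tag != expected_lba & 0xFFFFFFFF:
        errors.append("reference tag")    # block landed at the wrong LBA
    if app_tag != expected_app_tag:
        errors.append("application tag")  # block belongs to another drive/extent
    return errors
```

Because the reference tag is only 32 bits wide, two LBAs that differ by exactly 1_0000_0000h produce the same tag, which is the uniqueness limit mentioned above.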
> Using Type 2 protection and 32 byte commands provides the most flexible
> use of Application Tags. It also is required for RAID systems, because the
> striping of RAID systems results in the logical block address that the host
> is writing to being translated to another logical block address, with those
> addresses being spread over several physical drives. Type 2 PI allows the
> original host logical block address to be translated (by the RAID
> controller) to a different logical block address by putting the host LBA
> value in the EXPECTED REFERENCE TAG field and the LBA where the logical
> block is to be written into the LOGICAL BLOCK ADDRESS field. In this
> manner, none of the PI fields in the data stream need to be modified.
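The address translation described above might be sketched as follows. This is only an illustration under stated assumptions: a simple RAID-0 style striping layout, and hypothetical names (`Cdb32`, `translate`) that do not come from SBC-3. The point is that the host LBA travels in the expected reference tag while the drive-local LBA goes in the LOGICAL BLOCK ADDRESS field, so the stored PI is never rewritten.

```python
from typing import NamedTuple

class Cdb32(NamedTuple):
    drive: int              # which physical drive receives the command
    lba: int                # LOGICAL BLOCK ADDRESS field (drive-local address)
    expected_ref_tag: int   # EXPECTED REFERENCE TAG field (host LBA, low 32 bits)
    expected_app_tag: int   # expected application tag carried in the CDB

def translate(host_lba: int, n_drives: int, stripe_blocks: int,
              app_tag: int) -> Cdb32:
    """Map a host LBA to a drive and drive-local LBA (assumed RAID-0 layout)."""
    stripe, offset = divmod(host_lba, stripe_blocks)
    drive = stripe % n_drives
    drive_lba = (stripe // n_drives) * stripe_blocks + offset
    # The reference tag stored with the block still reflects the host LBA,
    # so the controller passes the host LBA as the expected value.
    return Cdb32(drive, drive_lba, host_lba & 0xFFFFFFFF, app_tag)
```

For example, with 4 drives and 8 blocks per stripe unit, host LBA 35 maps to drive 0 at drive-local LBA 11, while the expected reference tag stays 35.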
> On Wed, May 15, 2013 at 5:03 AM, Shivaram Upadhyayula <
> shivaram.u at quadstor.com> wrote:
>> I am looking for a way to do automatic error correction (self healing) using
>> T10 protection information. I have written down a possible solution below.
>> Questions are
>> a) will it work ?
>> b) Is such a thing already supported by implementations ?
>> c) Easier, obvious alternatives
>> Feedback, pointers and criticism will be very helpful.
>> Best Regards,
>> Note: A copy can be got from
>> Many filesystems now incorporate self healing when a corrupt data block is
>> detected. However self healing requires RAID within the
>> manager layer or at least functionality to retrieve an alternate copy for
>> verification and writing back the good copy. Many filesystems/applications
>> would need to change significantly to incorporate such techniques which
>> isn't possible.
>> Guard checksums (T10 DIF) already allow for verifying the integrity of
>> data received and stored. A device can (if it supports it) automatically do
>> error handling if guard checking is enabled. For example on a READ
>> 1. Compute the guard for the data block
>> 2. Verify that the guard stored is the guard which is computed
>> 3. If not, retrieve an alternate block if available, do steps 1 and 2 on
>> it and, if found to be good, return that data. Additionally if possible,
>> fix the incorrect block.
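The three steps above can be sketched in Python as follows. This is a toy model, not a device implementation: each mirror is assumed to be a dict mapping an LBA to a `(data, stored_guard)` pair, and `read_block` is a hypothetical name.

```python
from typing import Dict, List, Tuple

def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16 with polynomial 0x8BB7, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

Mirror = Dict[int, Tuple[bytes, int]]  # lba -> (data, stored guard)

def read_block(mirrors: List[Mirror], lba: int) -> bytes:
    """Return the first copy whose guard verifies, repairing the bad copies."""
    bad = []
    for idx, mirror in enumerate(mirrors):
        data, guard = mirror[lba]
        if crc16_t10dif(data) == guard:        # steps 1 and 2
            for b in bad:                      # optional: fix incorrect blocks
                mirrors[b][lba] = (data, guard)
            return data
        bad.append(idx)                        # step 3: try an alternate copy
    raise IOError("no copy of block %d passes the guard check" % lba)
```

If no copy verifies, the sketch raises an error rather than returning data that is known to be bad.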
>> Retrieving an alternate block could be as simple as reading a redundant
>> copy from an alternate disk in a mirrored configuration. In the case of,
>> for example, a triple mirrored configuration, there can be more than one
>> redundant copy.
>> In the case of a RAID configuration where parity is used, retrieving an
>> alternate block would amount to recreating the block from the parity block
>> and all other blocks other than the incorrect block.
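For the parity case, a minimal sketch of the reconstruction, assuming a single XOR parity block per stripe (RAID-5 style). The function names are hypothetical.

```python
from functools import reduce
from typing import List

def xor_blocks(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def reconstruct(stripe: List[bytes], parity: bytes, bad_index: int) -> bytes:
    """Recreate the block at bad_index from the parity and the other blocks."""
    others = [blk for i, blk in enumerate(stripe) if i != bad_index]
    return xor_blocks(others + [parity])
```

The recreated block would then go through the same guard check as any other alternate copy before being returned.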
>> However there does exist the possibility (in theory) that the final block
>> received by the application is not what it expected. T10 DIF allows an
>> application to store a tag with the data and receive the same on a READ.
>> If during a READ there is a tag mismatch, the application can assume an
>> incorrect data block.
>> Using Application Tags for error correction
>> Let's suppose that, due to stale data on one of the disks or perhaps due to
>> a CRC hash collision, the guard checksum matches, and the data is sent back
>> to the application.
>> Assuming that the application is capable of determining that the data it
>> received is not what it expected (application tag mismatch), the application
>> tag can then be useful to inform the device to look for redundant copies.
>> However the application tag verification can be complicated to use.
>> Application tag verification can be enabled in the control mode page and the
>> device can be notified of the expected application tag for a certain
>> range of logical blocks using the application tag mode page. This applies
>> to all READ and WRITE commands except for the 32 byte CDB variants.
>> In the case of the 32 byte CDB variants the expected tag is passed along in
>> the CDB itself.
>> Now this is fine if only a small number of application tags are used.
>> However if the application wishes to use a different tag for each logical
>> block (or a very small range of logical blocks, say per filesystem block) it
>> will be cumbersome to set mode pages for all the application tags. And there
>> is no reason for an application to know the expected tag for a block in
>> advance. Due to this limitation the following will work only with
>> type 2 (??? or is it possible otherwise)
>> One good way to use the application tag would be
>> 1. Send the application tags during WRITE 32 but disable application tag
>> checking. Guard check will ensure that the data received is in order and the
>> application tag will be stored along with the block
>> 2. Receive the data without application tag checking. The application tags
>> are sent along in the READ 32 response without any checks.
>> 3. If the application receives the tag it expects, then fine.
>> 4. If not, reissue the read command, this time with application tag check
>> enabled. This would be by issuing a READ 32 command per logical block or
>> range of blocks with the same application tags, with the expected application
>> tag specified in the CDB itself.
>> 5. Now when the device reads the blocks and finds a mismatch in the
>> application tag it could then do the appropriate error handling, which
>> could be
>> a. Do nothing in which case the application has unexpected data as
>> b. Retrieve an alternate copy for which the guard checks fine and the
>> application tag matches. In this case the application has a good copy of
>> the data.
>> c. Retrieve an alternate copy for which the guard check fails or the
>> application tag still has a mismatch. This is equivalent to (a) from an
>> application perspective.
>> d. In the case of (b), the device can transparently write back the
>> data, guard and application tag to the incorrect block
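The read side of this proposal (steps 2 to 5) might look like the sketch below. It is a toy model under stated assumptions: `MirroredDevice`, `read32` and `app_read` are hypothetical names, each mirror is a dict mapping an LBA to `(data, guard, application tag)`, and the device implements options (b) and (d), raising an error when no copy qualifies, which corresponds to (c).

```python
from typing import Dict, List, Optional, Tuple

def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16 with polynomial 0x8BB7, initial value 0, no reflection."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x8BB7) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

Mirror = Dict[int, Tuple[bytes, int, int]]  # lba -> (data, guard, app tag)

class MirroredDevice:
    def __init__(self, mirrors: List[Mirror]) -> None:
        self.mirrors = mirrors

    def read32(self, lba: int,
               expected_app_tag: Optional[int] = None) -> Tuple[bytes, int]:
        """Guard is always checked; the tag is checked only when one is given."""
        rejected = []
        for idx, mirror in enumerate(self.mirrors):
            data, guard, app_tag = mirror[lba]
            guard_ok = crc16_t10dif(data) == guard
            tag_ok = expected_app_tag is None or expected_app_tag == app_tag
            if guard_ok and tag_ok:
                for r in rejected:  # option (d): write back the good copy
                    self.mirrors[r][lba] = (data, guard, app_tag)
                return data, app_tag
            rejected.append(idx)
        raise IOError("no acceptable copy of block %d" % lba)

def app_read(dev: MirroredDevice, lba: int, expected_tag: int) -> bytes:
    """Steps 2-4: read unchecked first, re-read with tag checking on mismatch."""
    data, tag = dev.read32(lba)
    if tag == expected_tag:
        return data
    return dev.read32(lba, expected_app_tag=expected_tag)[0]
```

In the stale-data scenario, the first mirror holds an old block with a valid guard but an old tag; the unchecked read returns it, the application notices the tag mismatch, and the second read makes the device fall through to the current copy and repair the stale one.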
>> When things may not work
>> 1. SBC3 r35c mentions data deduplication and states "De-duplication shall not
>> affect protection information, if any.". There is a definite possibility of
>> losing an Application Tag over here. How are application tags and reference
>> tags handled here ?
>> One possibility is that the implementation deduplicates the application
>> data but still has the ability to retain the original application tag.
>> However this would mean that when returning the application data, the
>> device will return the retained original application tag with the incorrect
>> data. This would mean that the application would see the correct application
>> tag but incorrect data. But it is really left to the device implementation to
>> avoid this situation in the first place.
>> 2. Unless only a small number of application tags are used, only Type 2
>> will work
QUADStor Storage Virtualization : Thin Provisioning, Data Deduplication,
VAAI, High Availability