Hewlett-Packard
Computer Peripherals Bristol


DC Control Proposal for Sequential Access devices
with special regard to DDS DAT drives.


Document : X3T9.2/90-119 Rev 2
Date     : 14th December 1990
Author   : Steve Krupa, HP CPB


Hewlett Packard, Computer Peripherals Bristol,
Filton Rd, Stoke Gifford, Bristol, BS12 6QZ

Tel : + 272 799910
Fax : + 272 236091


INTRODUCTION
**************

This is the second revision of the document X3T9.2/90-119. The
initial document was dated 16th July 1990. Revision 1 was dated
29th October 1990.

This document proposes a SCSI MODE SENSE/SELECT page to control and
report on the operation of a DC DDS drive, with specific regard to
Data Compression components. It is intended for, but not necessarily
limited to, support of DDS-DC drives which make use of lossless
compression algorithms based on substitution; e.g. those of the
Lempel-Ziv family.

This document addresses a number of issues with regard to data
compression.

a) How does a host use the available functionality in SCSI-2 to
   control data compression?

b) What additional features could a device which supports data
   compression offer to the host, and how should these be made
   available to the host?

c) If a device which does not support data compression encounters a
   piece of media containing compressed data, what action should it
   take?

d) What form does the host-device interface take in order to allow
   the host to perform software decompression if required? This is
   an issue for both current functionality in SCSI-2 and for any
   additional feature set.

Document Structure :

Part 1 : Describes the current level of support for DC control in
         SCSI 2

Part 2 : Describes the format of a new Data Compression Mode Page

Part 3 : Describes how a host interacts with a device using the
         Data Compression Mode Page with particular emphasis on
         software decompression.


PART 1
**********

DATA COMPRESSION CONTROL using DEVICE CONFIGURATION MODE PAGE
*****************************************************************

This section describes the support available in SCSI-2 for Data
Compression control.

1.1 DC support in X3T9.2 SCSI-2 Rev 10
====================================

The only support currently available in the SCSI set is a one-byte
field, the SDCA field (byte 14), in the Device Configuration Page
(page 10h). This page is specific to Sequential Access devices. The
use of this field is in many ways vendor-specific - the QIC
manufacturers have reached a separate agreement on its meaning.

1.1.2 X3T9.2 definition
-------------------

Byte value     Description
-------------------------------------------------------------
00h            Disable Compression
01h            Select target's default compression algorithm
02h - 7Fh      Select compression algorithm #
80h - FFh      Vendor Specific

1.1.3 QIC definition
----------------

-------------------------------------------------
|  7  |  6  |  5  |  4  |  3  |  2  |  1  |  0  |
-------------------------------------------------
| DC  | On  |     QIC-Approved DC Algorithm     |
| on  |Drive|                                   |
-------------------------------------------------

Byte Value     Description
------------------------------------------------------------
00h            Disable Compression
01h - 7Fh      Invalid
10xxxxxxb      DC on - host performs algorithm # xxxxxxb
11xxxxxxb      DC on - drive performs algorithm # xxxxxxb

As can be seen from the QIC definition, QIC chooses to ignore the
X3T9.2 definition of byte values 01h-7Fh.

Because one byte is not enough to fulfill the functional
requirements for data compression control, the SDCA field will not
be supported.
The justification for this is :

a) DC algorithm identifiers are 8 bits wide and therefore would
   fill this field on their own.

b) The complexity of allowing the host system to retrieve
   compressed data which the device cannot decompress requires
   more than 1 byte.

c) Most drivers currently available set the SDCA field to 0 and may
   therefore inadvertently disable compression.


PART 2
********

DATA COMPRESSION CHARACTERISTICS MODE PAGE
********************************************

This section describes a new Mode Page to be used for control of
data compression in DDS drives. It adds new support for DC in the
SCSI standard which is not possible using the current features.

2.1 Proposal for SCSI DC Characteristics Mode Page
================================================

2.1.1 DC Page Definition
--------------------

The Data Compression Characteristics Mode Page is defined as :

=======================================================================
| Bit |   7   |   6   |   5   |   4   |   3   |   2   |   1   |   0   |
|Byte |       |       |       |       |       |       |       |       |
|=====+===============================================================|
|  0  | Reserved (0)  |                Page Code (0Fh)                |
|-----+---------------------------------------------------------------|
|  1  |                       Page Length (04h)                       |
|-----+---------------------------------------------------------------|
|  2  |  DCE  |                     Reserved (0)                      |
|-----+---------------------------------------------------------------|
|  3  |                     Compression Algorithm                     |
|-----+---------------------------------------------------------------|
|  4  |  DDE  |  RED  |                 Reserved (0)                  |
|-----+---------------------------------------------------------------|
|  5  |                    Decompression Algorithm                    |
|-----+---------------------------------------------------------------|

2.1.1.1 DCE : Data Compression Enable

The Data Compression Enable field is used to enable and disable
data compression.

On a MODE SELECT command this field allows the host to enable or
disable data compression.
If the host sets this field to 0 then the device will disable data
compression, and any subsequent data sent to the device by the host
will be written to the media uncompressed. If the host sets this
field to 1 then the device will compress any subsequent data sent
to it by the host before writing it to the media. The algorithm
used to compress the data will be the one specified in the
Compression Algorithm field. In this case the Compression Algorithm
field must contain a valid data compression algorithm identifier.

On a MODE SENSE command, this field allows the host to determine
whether compression is enabled or disabled. A value of 1 indicates
that compression is enabled, and a value of 0 indicates that
compression is disabled.

2.1.1.2 Compression Algorithm :

The Compression Algorithm field determines the algorithm the device
will use to compress data sent to it by the host.

On a MODE SELECT command, this field allows the host to select a
compression algorithm. A value of 0 will direct the device to
deselect all compression algorithms and is only valid if the DCE
field is 0; otherwise a CHECK CONDITION status with ILLEGAL REQUEST
sense key will be generated. A value of 1 will direct the device to
select its default compression algorithm, whilst a value of N,
where N > 1, will direct the device to select compression algorithm
N. If the device does not support the indicated algorithm then it
will issue a CHECK CONDITION status with ILLEGAL REQUEST sense key.

On a MODE SENSE command, this field will contain the registered
algorithm identifier for the currently selected compression
algorithm. Note that this field is never 1, since the default
algorithm is reported by its registered identifier.

2.1.1.3 DDE : Data Decompression Enable

The Data Decompression Enable field is used to enable and disable
data decompression.

On a MODE SELECT command, this field allows the host to enable or
disable decompression. If the host sets this field to 0 then
decompression is disabled.
Any compressed data encountered on the media will be returned to
the host as a single variable-length record. See section 3 for
information on returning compressed data to the host. If the host
sets this field to 1 then decompression will be enabled. The device
will attempt to decompress any compressed data it encounters on the
media.

On a MODE SENSE command, this field allows the host to determine
whether or not the device will attempt to decompress any compressed
data it encounters on the media. A value of 0 indicates that the
device will not attempt decompression. A value of 1 indicates that
the device will attempt to decompress any compressed data
encountered during a READ.

2.1.1.4 RED : Report Error on Decompression

The Report Error on Decompression field allows the host to define
how and when CHECK CONDITIONS are generated during READ commands
when boundaries between uncompressed and compressed data are
encountered.

On a MODE SELECT command, this field allows the host to specify
when CHECK CONDITIONS are reported to the host during READS.

If the host sets this field to 0 then a CHECK CONDITION will be
generated by the device every time it encounters compressed data
which it cannot decompress (because, for example, the data was
compressed using an algorithm which the device does not support, or
because decompression is disabled).

If the host sets this field to 1 then a CHECK CONDITION will be
generated by the device in response to a READ command only when the
format of the data changes. A format change occurs when data
changes :

    FROM                                    TO

    uncompressed                            compressed by unsupported
                                            algorithm

    compressed by supported algorithm       compressed by unsupported
                                            algorithm

    compressed by unsupported algorithm     uncompressed

    compressed by unsupported algorithm     compressed by supported
                                            algorithm

    compressed by unsupported algorithm A   compressed by unsupported
                                            algorithm B

Note that this field only has an effect on a READ command.
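
The two RED settings can be illustrated with a small sketch. It is
not part of the proposal: the function names and the algorithm
identifiers 20h, 30h and 31h are hypothetical, and folding the DDE
setting into "cannot decompress" is an assumption based on the
RED = 0 description above. The sketch treats a data item that the
device decompresses itself as reaching the host uncompressed, which
reproduces the five rows of the format change list.

```python
UNCOMPRESSED = 0  # identifier 0 is used here to mean "not compressed"


def returned_format(item_alg, supported, dde=True):
    """Format in which a data item reaches the host (hypothetical helper).

    item_alg is 0 for an uncompressed record, otherwise the algorithm
    identifier of a compressed entity. An entity the device decompresses
    itself reaches the host as uncompressed data.
    """
    if item_alg != UNCOMPRESSED and dde and item_alg in supported:
        return UNCOMPRESSED
    return item_alg


def read_raises_check_condition(red, prev_alg, next_alg, supported, dde=True):
    """Would a READ of next_alg, following prev_alg, report a CHECK CONDITION?"""
    nxt = returned_format(next_alg, supported, dde)
    if red == 0:
        # RED = 0: every item the device cannot decompress is reported.
        return nxt != UNCOMPRESSED
    # RED = 1: only a change in the returned format is reported.
    prev = returned_format(prev_alg, supported, dde)
    return nxt != prev


if __name__ == "__main__":
    supported = {0x20}  # made-up identifier of the drive's own algorithm
    print(read_raises_check_condition(1, 0x00, 0x30, supported))
    print(read_raises_check_condition(1, 0x30, 0x30, supported))
    print(read_raises_check_condition(0, 0x30, 0x30, supported))
```

With RED = 1 a run of entities compressed by the same unsupported
algorithm is reported once, at the start of the run; with RED = 0
every entity in the run is reported.
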
Other commands including SPACE and LOCATE will not be affected by
the setting of this field.

On a MODE SENSE command, this field allows the host to determine
whether CHECK CONDITIONS will be generated on format boundaries or
on encountering any compressed data which cannot be decompressed. A
value of 0 signifies that CHECK CONDITIONS will be generated on
encountering any compressed data which the device cannot
decompress. A value of 1 signifies that CHECK CONDITIONS will be
generated on format changes only.

2.1.1.5 Decompression Algorithm

The Decompression Algorithm field allows the host to monitor the
format of the data on the media.

On a MODE SELECT command, this field has no meaning and will be
ignored by the device.

On a MODE SENSE command, this field allows the host to determine
the type of data item last sent to the host in response to a READ
command. A value of 0 indicates that the last data item returned
was an uncompressed record, whilst a value of N (N > 01h) indicates
that the last data item returned was an entity compressed using
algorithm N. Note that this is valid even if the data was
decompressed by the device. A value of 01h is never returned as it
is a reserved value.

Note that even though this field is updated by the device and
therefore changes, a MODE PARAMETERS CHANGED CHECK CONDITION will
not be generated.


PART 3
********

APPLICATION NOTE
******************

3.1 SCSI Protocol Issues
======================

On a typical DDS-DC tape, two different data item types may be
encountered. A data item is either :

a) An uncompressed record

b) An entity compressed using algorithm L

Entities are written by DDS drives which support on-board data
compression. An entity is made up of a number of same-sized records
compressed using the device's compression algorithm and prefixed by
an entity header, which is an uncompressed descriptor containing
information about the data within the entity.
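
As a rough illustration of how a host might find out which of these
two data item types it last received, the sketch below packs and
parses the six-byte page of section 2.1.1 and interprets the
Decompression Algorithm field. It is a sketch under stated
assumptions, not a definitive layout: the function names are
hypothetical, and the bit positions (DCE as bit 7 of byte 2, DDE as
bit 7 and RED as bit 6 of byte 4) are read off the page table from
left to right.

```python
PAGE_CODE = 0x0F    # Page Code from byte 0 of the proposed page
PAGE_LENGTH = 0x04  # Page Length from byte 1 of the proposed page


def pack_dc_page(dce, comp_alg, dde, red):
    """Build the 6-byte page for a MODE SELECT (hypothetical helper)."""
    if not 0 <= comp_alg <= 0xFF:
        raise ValueError("algorithm identifier must fit in one byte")
    if dce and comp_alg == 0:
        # Mirrors the rule that algorithm 0 is only valid when DCE is 0.
        raise ValueError("DCE = 1 requires a non-zero Compression Algorithm")
    byte2 = 0x80 if dce else 0x00                               # assumed bit 7
    byte4 = (0x80 if dde else 0x00) | (0x40 if red else 0x00)   # assumed bits 7, 6
    # Byte 5 (Decompression Algorithm) is ignored on MODE SELECT, so send 0.
    return bytes([PAGE_CODE, PAGE_LENGTH, byte2, comp_alg, byte4, 0x00])


def parse_dc_page(page):
    """Split a page returned by MODE SENSE into its fields."""
    if len(page) < 6 or (page[0] & 0x3F) != PAGE_CODE or page[1] != PAGE_LENGTH:
        raise ValueError("not a Data Compression Characteristics page")
    return {
        "dce": bool(page[2] & 0x80),
        "compression_algorithm": page[3],
        "dde": bool(page[4] & 0x80),
        "red": bool(page[4] & 0x40),
        "decompression_algorithm": page[5],
    }


def last_data_item(decomp_alg):
    """Interpret the Decompression Algorithm field from a MODE SENSE."""
    if decomp_alg == 0:
        return "uncompressed record"
    if decomp_alg == 1:
        raise ValueError("01h is reserved and is never returned")
    return "entity compressed using algorithm %02Xh" % decomp_alg


if __name__ == "__main__":
    page = pack_dc_page(dce=True, comp_alg=0x20, dde=True, red=True)
    print(page.hex())
    print(parse_dc_page(page))
    print(last_data_item(0x20))
```

The identifier 20h is made up for the example; real identifiers
would be the registered algorithm identifiers the proposal refers
to.
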

      Entity            Entity            Record
        |                 |                 |
        V                 V                 V
-------------------------------------------------------
|                 |                 |                 |
-------------------------------------------------------
      Type N            Type M

A device which supports compression algorithm N can decompress
entities of type N when it encounters them on the media, and return
the decompressed data to the host transparently. It can also return
uncompressed records to the host as it finds them on the media.

If such a device encounters an entity of type M (written by a
different device using a different compression algorithm) then it
doesn't generally know how to decompress it. Similarly, if a device
which doesn't support data compression at all encounters an entity
on the media then it cannot decompress it.

Some hosts may support software decompression, where they
themselves are capable of decompressing entities. This requires
that the device be able to return a compressed entity to the host.
The host must also be aware that the data it is receiving is
COMPRESSED data and not a normal uncompressed record.

Using the Data Compression Characteristics Mode page, the host has
a choice of two different methods for handling compressed data
which the device is unable to decompress. These two methods are
selected using the RED field.

If the host wishes to be able to read any compressed data on the
media which the device is unable to decompress, and at the same
time wishes to minimise the number of CHECK CONDITIONS it receives
from the device, then it will set the RED field to 1. This will
indicate to the device that it should only report DECOMPRESSION
EXCEPTION CHECK CONDITIONS at format boundaries: where the type of
data it will return to the host changes between uncompressed and
compressed data, or where the compressed data it will return to the
host was compressed using a different algorithm from that of the
data returned before the DECOMPRESSION EXCEPTION.
In this mode the host must keep track of whether it is receiving
uncompressed or compressed data in response to READ commands. It is
important for the host to be aware that after any OTHER media-access
command is sent to the device, subsequent READ commands will NOT
report a DECOMPRESSION EXCEPTION CHECK CONDITION if the device
encounters uncompressed data, but WILL report such a CHECK
CONDITION upon first encountering compressed data which it cannot
decompress. A media-access command is defined as any command which
causes logical tape motion.

If the host is not concerned about the number of DECOMPRESSION
EXCEPTION CHECK CONDITIONS it receives then it will set the RED
field to 0. In this mode, the host does not need to keep track of
whether it is receiving uncompressed or compressed data in response
to a READ command. The device will issue a DECOMPRESSION EXCEPTION
CHECK CONDITION in response to any READ command which encounters
compressed data which it cannot decompress.

Note that in both modes, progress is made along the tape towards
EOP. If a host does not wish to perform software decompression,
then it can still read all of the uncompressed data on the media.

3.2 Host/Drive Interaction for Software Decompression
===================================================

3.2.1 DC drive - initial operating mode example
-------------------------------------------

After drive reset, a drive which supports data compression may, for
example, power up with compression enabled and algorithm N
selected. The Data Compression Characteristics Page contains the
following values :

DCE = 1                    : Compression enabled.
Compression Algorithm = N  : Compression algorithm N selected.
DDE = 1                    : Decompression enabled.
RED = 1                    : CHECK CONDITION returned on format
                             change only.

When the RED field is first set and decompression is enabled, the
device will be in an initial state that is expecting to read either
uncompressed data or data compressed with a supported algorithm.
If the RED field is set and decompression is disabled, the device
will be in an initial state that is expecting to read uncompressed
data. If the expected data is not read then a CHECK CONDITION is
generated. This corresponds to a format change. Because the RED
field is set to 1 the device will now be in a secondary state where
it is expecting to read data in the format of that which has just
been encountered (i.e. that which forced the generation of the
CHECK CONDITION). Again, if the expected data is not read then a
CHECK CONDITION will be generated.

Note that any media-access command other than a READ will return
the device to its initial state.

3.2.2 Non-DC drive - initial operating mode example.
------------------------------------------------

A drive which doesn't support data compression may, for example,
power up with the Data Compression Characteristics Page containing
the following values :

DCE = 0                    : Compression disabled.
Compression Algorithm = 0  : No Compression algorithm selected.
DDE = 0                    : Decompression disabled.
RED = 0                    : CHECK CONDITION returned on
                             encountering compressed data.

Because the RED field is set to 0 and the device doesn't support
data decompression, any compressed data encountered will generate a
CHECK CONDITION.

3.2.3 Example of Software Decompression control
-------------------------------------------

From power-on, the host issues a number of READ commands to the
drive, all of which successfully return uncompressed data. On the
next READ command, however, the drive detects an entity of type M
on the media, where M is an unsupported compression algorithm. The
drive cannot decompress the entity; it therefore treats it as a
single variable-length record and returns either the number of
bytes in one block or the total number of bytes in the entity,
whichever is smaller.

At this point, it is necessary for the drive to inform the host
that it has encountered a data item on the media which it cannot
decompress.
It does this by issuing a CHECK CONDITION to the host and setting
up the sense data as follows :

Valid = 1
    To indicate that the Information field contains residual
    information from the failed READ command. Note this will only
    be set if the entity length was different from the requested
    block length.

Sense Key = No Sense (00h)
    This is used as an 'escape' code to indicate that the host
    should look at the ASC/ASCQ to determine the reason for the
    CHECK CONDITION.

Information = READ residue
    The READ command failed with a residue as given in this field.
    Note this will only be set if the entity length was different
    from the requested block length.

Command-Specific Information = Number of records in data item
    The number of records in the entity is obtained from the entity
    header which the drive can read. Note that in the case of a
    compressed-to-uncompressed format change, this field will
    contain 1 to indicate that 1 uncompressed record was
    encountered.

Additional Sense Code = DECOMPRESSION EXCEPTION (68h)
    This ASC indicates the reason for the CHECK CONDITION as being
    a DECOMPRESSION EXCEPTION.

Additional Sense Code Qualifier = XX
    Algorithm identifier for compressed entity encountered. Note
    that in the case of a compressed-to-uncompressed format change,
    this field will contain 0 to indicate that uncompressed data
    was encountered.

The drive is now positioned on the EOP side of the entity.

If the host doesn't support software decompression then, if it so
wishes, it can continue reading. Note that in most cases, this type
of host would clear the RED bit during initial device configuration
so that it only ever received a DECOMPRESSION EXCEPTION when it
encountered entities. All records would be returned without this
type of CHECK CONDITION.

If the host supports software decompression then it must check the
sense data to see if it has received all the data from the entity.
If it is reading in variable mode, it does this by looking at the
residual count in the Information field.
If this field is non-negative then the host has received all the
compressed data and will not therefore need to SPACE reverse and
reread the entity. If, however, the Information field is negative,
then the requested block length is less than the actual entity
length, and the host must SPACE reverse and reread the whole entity
in order to successfully perform the software decompression. The
host does this by looking in the Command-Specific Information field
in order to find the number of records in the entity. It then
issues a SPACE reverse with the count field set to the 2's
complement of this value. This will position the device at the
start of the entity. (Note that by subtracting the Information
field, i.e. the residual count, from the requested block length,
the host can determine the actual entity size and reserve enough
buffer space to receive the data.)

As long as the RED bit is set, the host will be able to continue
reading entities from the device until either an entity is
encountered which has been compressed using a different algorithm,
or uncompressed data is encountered.

Note that in fixed mode, the host will not be able to determine the
size of the encountered entity from the Information field and the
requested block length, as the residual information will be in
terms of blocks, not bytes. It is up to the host in this case to
take the appropriate action.

Note that whenever the host needs to SPACE reverse over an entity
because it has not managed to read all the data the first time
around, the device will return to its initial state as far as the
RED bit is concerned and will issue a DECOMPRESSION EXCEPTION CHECK
CONDITION in response to the following READ command.

--
Steve Krupa
Hewlett Packard
Computer Peripherals Bristol

Address:                       Email:
Filton Road                    stevek@hpcpbla
Stoke Gifford
Bristol
BS12 6QZ
UK

Phone:                         Fax:
(44) 272 799910 x22237         (44) 272 236091
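
The variable-mode recovery arithmetic of section 3.2.3 can be
sketched as follows. This is an illustration only: the function name
is hypothetical, the Information field is assumed to arrive as four
big-endian bytes holding a two's complement residual, and the SPACE
count is assumed to be the 24-bit count field of the six-byte SPACE
command. The lengths and record count in the example are made up.

```python
def plan_entity_reread(requested_length, information, record_count):
    """Decide whether an entity must be reread (hypothetical helper).

    requested_length: byte count asked for by the READ (variable mode).
    information:      4-byte Information field from the sense data.
    record_count:     Command-Specific Information field, i.e. the
                      number of records in the entity.
    """
    # Residual count: negative when the entity was longer than requested.
    residual = int.from_bytes(information, "big", signed=True)
    # Actual entity size = requested length minus the residual count.
    entity_size = requested_length - residual
    if residual >= 0:
        # The whole entity was transferred; no reread is needed.
        return {"reread": False, "entity_size": entity_size}
    # SPACE reverse over the entity: count is the 2's complement of the
    # record count, truncated to the assumed 24-bit count field.
    space_count = (-record_count) & 0xFFFFFF
    return {"reread": True, "entity_size": entity_size,
            "space_count": space_count}


if __name__ == "__main__":
    # A 32768-byte READ that met a 40000-byte entity holding 8 records.
    info = (32768 - 40000).to_bytes(4, "big", signed=True)
    plan = plan_entity_reread(32768, info, 8)
    print(plan["reread"], plan["entity_size"], hex(plan["space_count"]))
```

After the SPACE reverse, the host would reissue the READ with a
length of at least entity_size, expecting the DECOMPRESSION
EXCEPTION CHECK CONDITION described above to be reported again.
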