X3T9.2/90-198
Steve Krupa HP Presentation
December 3, 1990

Hewlett-Packard          Data Compression          1

   D a t a   C o m p r e s s i o n   C h a r a c t e r i s t i c s
   -------------------------------------------------------------------

   M o d e   P a g e
   ---------------------

   W h y   d o   w e   n e e d   i t ?

.page.

Hewlett-Packard          Data Compression          2

   R e q u i r e m e n t s
   -------------------------

   * To allow the host to enable/disable compression

   * To allow the host to select the compression algorithm

   * To allow the host to retrieve processed data from the device

   * To allow the host to monitor compression ratio

   * To allow the full power of the compression engine to be
     utilised, in order to maximise compression ratio

.page.

Hewlett-Packard          Data Compression          3

   H P ' s   O b j e c t i v e s
   ------------------------------

   * To indicate the insufficient support for Data Compression
     in SCSI-2

   * To raise the issues in everybody's minds

   * To provide a standard mechanism by which host systems and
     sequential access devices can manage Data Compression

     Other options :
       . Produce a vendor unique solution
       . Produce a DDS unique solution

   * To get valuable feedback from device driver writers, SCSI
     systems engineers and sequential access device manufacturers

   Non-Objectives :

   * To get a rubber stamp from X3T9.2

   * To enforce the use of any one particular compression algorithm

   * To limit the support to any one type of sequential access
     device

.page.

Hewlett-Packard          Data Compression          4

   S o   W h a t   A r e   T h e   I s s u e s ?
   ----------------------------------------------------

   Non-Issues :

   * Enabling/disabling compression

   * Selection of compression algorithm

   * Monitoring of compression ratio

   Issues :

   * Returning processed data to the host

   * Utilising the full power of the compression engine

.page.
Hewlett-Packard          Data Compression          5

   I s s u e   1 :  Returning Processed Data to the Host
   -------------------------------------------------------

   * An entity (generic term) is one or more processed records
     along with some control information

   * The control information includes the registered identifier
     for the algorithm which was used to process the data

.page.

Hewlett-Packard          Data Compression          6

   * A tape may contain a mixture of unprocessed records and
     entities containing data processed using different algorithms

   BOP                                                          EOP
   <--                                                          -->
   ----------------------------------------------------------------
   |   R    |   En   |   R    |   Em   |   R    |   R    |   En   |
   ----------------------------------------------------------------

     R  = Unprocessed Record
     En = Entity containing data processed by algorithm N
     Em = Entity containing data processed by algorithm M

   * A host may require that a device which either
       . does not support compression/decompression
     or
       . only supports decompression for algorithm M
     attempts to read the tape

     What should the device do on encountering an entity of
     type En ?

.page.

Hewlett-Packard          Data Compression          7

   O p t i o n s
   ---------------

   * Report a CHECK CONDITION and disallow the host from
     retrieving the processed data

   * Treat the entity as a single variable length record
       . return as much data as requested
       . report CHECK CONDITION with ILI bit set and ASC, ASCQ
         indicating ENTITY DETECTED

   * Treat the entity as a format error/discontinuity
       . return no data from the entity
       . report CHECK CONDITION with the ASC, ASCQ indicating
         ENTITY DETECTED
       . return as much additional information about the entity
         as possible

   In all cases, the final position will be on the EOP side of
   the entity.

.page.
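The three options above can be compared with a small model. The following is only a sketch under stated assumptions: the class names (Record, Entity, Drive, Option, ReadResult), the record_count field and the string "ENTITY DETECTED" standing in for an ASC, ASCQ pair are all hypothetical and are not taken from any X3T9.2 document. The drive modelled here has no decompression support at all, and ILI handling for ordinary records is left out.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Union


@dataclass
class Record:
    """An unprocessed record (R in the tape diagram)."""
    data: bytes


@dataclass
class Entity:
    """One or more processed records plus control information."""
    algorithm_id: int   # registered identifier of the algorithm used
    record_count: int   # hypothetical field: records the entity represents
    payload: bytes      # the processed data itself


class Option(Enum):
    REJECT = "disallow retrieval"
    SINGLE_RECORD = "single variable length record"
    DISCONTINUITY = "format error/discontinuity"


@dataclass
class ReadResult:
    data: bytes
    check_condition: bool = False
    ili: bool = False
    sense: Optional[str] = None   # meaning of ASC, ASCQ; not a real code value
    extra: Optional[dict] = None  # additional information about the entity


class Drive:
    """Hypothetical drive that cannot decompress any entity."""

    def __init__(self, tape: List[Union[Record, Entity]], option: Option):
        self.tape = tape
        self.option = option
        self.position = 0  # index of the next item towards EOP

    def read(self, length: int) -> ReadResult:
        item = self.tape[self.position]
        # In all cases the final position is on the EOP side of the item.
        self.position += 1
        if isinstance(item, Record):
            return ReadResult(data=item.data[:length])
        if self.option is Option.REJECT:
            return ReadResult(data=b"", check_condition=True)
        if self.option is Option.SINGLE_RECORD:
            # Return as much processed data as was requested.
            return ReadResult(data=item.payload[:length],
                              check_condition=True, ili=True,
                              sense="ENTITY DETECTED")
        # Format error/discontinuity: no data, but describe the entity.
        return ReadResult(data=b"", check_condition=True,
                          sense="ENTITY DETECTED",
                          extra={"algorithm_id": item.algorithm_id,
                                 "record_count": item.record_count,
                                 "entity_length": len(item.payload)})
```

Constructing a Drive over a tape such as [Record, Entity, Record] and calling read(512) twice under each Option shows the trade-off: only SINGLE_RECORD hands processed bytes to the host, only DISCONTINUITY reports the record count, and all three leave the position just past the entity.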
Hewlett-Packard          Data Compression          8

   O p t i o n :  Treat it as a single variable length record
   -----------------------------------------------------------

   Advantages :

   * It is simple

   * If all the data in the entity is returned to the host, no
     repositioning is required

   Disadvantages :

   * It is NOT a single record. It is a representation of one or
     more records

   * It has implications for positioning

          LBA=60                     LBA=65
            |                          |
            V                          V
   ----------------------------------------------
   | 512    |          Entity          | 512    |
   | byte   |      (5 x 512 byte       | byte   |
   | record |         records)         | record |
   ----------------------------------------------
             <       1800 bytes       >

     If the entity is considered as a single variable length
     record, are the Logical Block Addresses shown above correct ?

   * Not all the data may be returned - the entity may be larger
     than the requested size. How does the host recover position
     in order to read the whole entity ?

   * Does the ILI bit get set depending on entity length or
     unprocessed record size ?

     For example, in the case above, the unprocessed record length
     of the records in the entity is 512 bytes - the same as the
     surrounding unprocessed records - but the entity length is
     1800 bytes.

.page.

Hewlett-Packard          Data Compression          9

   O p t i o n :  Treat it as a format error/discontinuity
   --------------------------------------------------------

   Advantages :

   * The host only gets the type of data it is expecting. There
     is no mixing of processed/unprocessed data

   * Positioning is handled robustly - CHECK CONDITION sense data
     includes number of records in entity

   Disadvantages :

   * The host must Space reverse, change mode and reread in order
     to retrieve ANY of the entity. This makes the process slow
     and involves more repositioning

.page.
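The positioning and ILI questions raised for the single variable length record option can be made concrete with the numbers from the diagram. This is a sketch only: it assumes that LBA=60 marks the start of the entity and LBA=65 the record after it, and it uses a simplified rule that ILI is reported whenever requested and actual lengths differ. The function and variable names are hypothetical.

```python
RECORD_SIZE = 512        # unprocessed record length, from the example
RECORDS_IN_ENTITY = 5    # records represented by the entity
ENTITY_LENGTH = 1800     # processed length of the entity in bytes
ENTITY_LBA = 60          # assumption: LBA=60 marks the start of the entity


def lba_after_entity(entity_lba: int, records_in_entity: int,
                     count_as_single_record: bool) -> int:
    """Address of the first block on the EOP side of the entity."""
    step = 1 if count_as_single_record else records_in_entity
    return entity_lba + step


def ili_set(requested_length: int, actual_length: int) -> bool:
    """Simplified rule: ILI when requested and actual lengths differ."""
    return requested_length != actual_length


# Counting the entity as five records matches the diagram (65);
# counting it as one record would make the next address 61.
as_records = lba_after_entity(ENTITY_LBA, RECORDS_IN_ENTITY, False)
as_single = lba_after_entity(ENTITY_LBA, RECORDS_IN_ENTITY, True)

# A host reading 512 bytes sees ILI if the entity length is compared,
# but not if the unprocessed record size is compared.
ili_by_entity_length = ili_set(RECORD_SIZE, ENTITY_LENGTH)
ili_by_record_size = ili_set(RECORD_SIZE, RECORD_SIZE)

# Compression ratio implied by the example: 2560 bytes stored in 1800.
compression_ratio = (RECORDS_IN_ENTITY * RECORD_SIZE) / ENTITY_LENGTH

print(as_records, as_single)                     # 65 61
print(ili_by_entity_length, ili_by_record_size)  # True False
print(round(compression_ratio, 2))               # 1.42
```

Under these assumptions the two conventions disagree by four block addresses for every such entity. This is why the format error/discontinuity option reports the number of records in the entity in its sense data.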
Hewlett-Packard          Data Compression          10

   I s s u e   2 :  Utilising the full power of the compression engine
   -------------------------------------------------------------------

   * Some compression engines have a wide variety of optimisation
     features

   * Is it appropriate for these features to be made available to
     the host ?

   * The host (application) has more knowledge about the type of
     data being written to the device and can plan optimisation
     accordingly

   * Does the host really care about algorithm-specific features ?

.page.

Hewlett-Packard          Data Compression          11

   O p t i o n s
   ---------------

   * Leave all optimisation and algorithm-specific features up to
     the device
       . simple but limiting

   * Offer all features to the host
       . complex but flexible
       . vendor/algorithm unique page
       . vendor/algorithm unique extension to Data Compression
         Characteristics page

   * Features in current document, X3T9.2/90-119, for Data
     Compression Characteristics page, are for HP's compression
     engine, 1XB4, which implements the DCLZ compression algorithm

.page.

Hewlett-Packard          Data Compression          12

   D a t a   C o m p r e s s i o n  -  Related Committees and Documents
   ---------------------------------------------------------------------

   * X3B5 (Digital Magnetic Tape Committee) is working on data
     compression in close liaison with :
       . X3B11
       . X3B11.1
       . X3B8

   * Related Documents
       . X3B5/90-322   DCLZ Algorithm
       . X3B5/90-323   DDS-DC Format

.page.

--
Steve Krupa
Hewlett Packard
Computer Peripherals Bristol

Address:                      Email:
Filton Road                   stevek@hpcpbla
Stoke Gifford
Bristol BS12 6QZ
UK

Phone:                        Fax:
(44) 272 799910 x22237        (44) 272 236091