June 14, 1990                                                X3T9.2/90-088

TO:   X3T9.2 / SCSI-3
FROM: Thomas Wicklund / Ciprico
RE:   Caching page proposal (Comments on document 90-021 R2)

After looking at this page I have a few comments on some of the proposed additions. Note that many of my comments concern areas which will be left vendor specific, but I want to clearly state that these are ambiguities.

1. ABPF bit:

This bit as defined is somewhat redundant and ambiguous. The existing definition of the minimum pre-fetch field makes this bit redundant. As currently defined, pre-fetch should be aborted if a new command is ready to execute and at least the minimum pre-fetch has been read. The ABPF bit would cause pre-fetch to be terminated upon selection. In most cases, this would terminate pre-fetch a very short time sooner than under the current definition. I don't think the short time difference is enough to justify adding a new control bit.

The definition "terminated upon selection" is also unclear. If a SCSI device has several logical units, does pre-fetch terminate upon any selection or only upon a selection for the logical unit doing the pre-fetch? There are also cases where terminating pre-fetch upon selection will slow the device down (e.g. sequential untagged reads, where each read after the first depends on pre-fetch continuing). Therefore, I suggest that this bit be removed and the existing minimum pre-fetch definition be cleaned up instead.

2. DISC bit:

I don't understand the last sentence in this paragraph, which ends "be truncated (or wrapped) at time discontinuities". What does the term "wrapped" mean?

3. SIZE bit:

The last sentence is unclear: "Simultaneous use of both number of segments and segment size is vendor specific." There is no combination of the SIZE and IC bits which allows the target to look at both the number of segments and the segment size, so I don't see how a target can look at both fields simultaneously and comply with the definition of the fields.

4. DRA bit:

A.
The DRA bit duplicates the maximum pre-fetch. A maximum pre-fetch length of 0 accomplishes the same thing, so why create two ways of doing the same thing?

B. In addition, as currently defined, a DRA of 1 (disable read-ahead) only disables pre-fetch if the pre-fetch incurs overhead time. If a device can pre-fetch without adding any overhead, it is allowed to pre-fetch regardless of the setting of this bit.

C. If this bit is incorporated, it should be renamed "Disable Pre-fetch (DPF)" to be consistent with the terminology used elsewhere in the caching page.

5. Define segmentation:

The current document doesn't define what cache segmentation means. If a read command is issued, must it work completely within one cache segment or can it use multiple segments? If a device is set up to pre-fetch until its cache is full (large pre-fetch limits), must it stop pre-fetch when one cache segment is full or can it allocate and fill all cache segments? Both interpretations are allowed as currently defined. Other issues are unclear. If a read (otherwise cacheable) is larger than the cache segment size, is one cache segment used as a circular buffer, is the non-cache buffer used, or are multiple cache segments used?

6. Cache segmentation:

Note that setting a specific number of cache segments or a specific segment size is a hit-or-miss process. The initiator has no knowledge of what values the target supports in the "Number of Cache Segments" and "Cache Segment Size" fields. In a device which uses bank select techniques (e.g. a 256K cache and internal DMA which addresses 64K directly) there may be arbitrary restrictions on these fields. The non-cache segment size will also be affected by hardware architecture in the same way (for instance, the device might have several small non-cache segments rather than one large one).

7. Non Cache Segment Size:

The last sentence of this field's definition needs work.
It currently states: "The impact of the Non Cache Buffer Size equal 0 or the sum of this field plus the Cache Segment Size greater than the buffer size is vendor specific." The above sentence assumes a single cache segment in computing whether the buffer size has been exceeded. It should be re-worded to use the product of the Cache Segment Size and the Number of Cache Segments.

8. Rounding:

Some notes on how the target should handle rounding should be added. With this page the target will tend to round parameters by default, since the initiator has no idea of the target's buffer size or of the restrictions on breaking it up. For example, I assume that setting a Non Cache Segment Size of 1 byte will not cause the device to read the disk 1 byte at a time (though I'm aware of hardware which could do it).

9. Field sizes:

There are currently SCSI devices with 1MB and larger cache sizes. I think the 3 byte Non Cache Segment Size field (16MB maximum) will be too small. Similarly, a Cache Segment Size field of 2 bytes (64K byte maximum) is probably too small. Cache sizes are growing and might easily exceed the defined field sizes. As an alternative, the cache segment size information could be defined in terms of block sizes.

10. Performance implications:

The caching page (and the current DPO bit in the read / write commands) can have some very great performance impacts if implemented literally by a device. For example:

A. If the Non Cache Segment Size is 0 and the DPO bit is set, it's possible to deadlock if all cache segments are in use. DPO set means that no cache segment may be freed for the current command, and a Non Cache Segment Size of 0 means there's no non-cache buffer. I hope nobody does a literal implementation.

B. A small Non Cache Segment Size: If the non-cache segment size is equal to 1 block, does the device read a block at a time (implying 1 block per physical rotation)? Poorly chosen values will result in some very poor performance.

C.
Ignoring parameters: I anticipate that many devices will treat the caching page as advisory and ignore it where it would cause ridiculously low performance. If so, this defeats the purpose of the caching page and leaves the initiator without real control.

11. Multiple Logical Units:

While bridge controllers are slowly going out of date, this modified caching page will either require a very complex caching algorithm for multiple logical unit controllers or imply that changes to the caching page for one LUN will also affect the caching page for other LUNs which are present. I can't think of any other case where one LUN affects another (other than Send Diagnostic, which provides for it). This becomes very complex if a device has two LUNs with different block sizes (though I admit I've never known anybody to do this in practice).