June 14, 1990                                                X3T9.2/90-088

TO:   X3T9.2 / SCSI-3
FROM: Thomas Wicklund / Ciprico
RE:   Caching page proposal (Comments on document 90-021 R2)

After looking at this page I have a few comments on some of the proposed additions. Note that many of my comments concern areas which will be left vendor specific, but I want to clearly state that these are ambiguities.

1. ABPF bit:

This bit as defined is somewhat redundant and ambiguous. The existing definition of the minimum pre-fetch field makes this bit redundant. As currently defined, pre-fetch should be aborted if a new command is ready to execute and at least the minimum pre-fetch has been read. The ABPF bit would cause pre-fetch to be terminated upon selection. In most cases, this would terminate pre-fetch a very short time sooner than under the current definition. I don't think the short time difference is enough to justify adding a new control bit.

The definition "terminated upon selection" is also unclear. If a SCSI device has several logical units, does pre-fetch terminate upon any selection or only upon a selection for the logical unit doing the pre-fetch? There are also cases where terminating pre-fetch upon selection will slow the device down (e.g. sequential untagged reads, where each read after the first depends on pre-fetch continuing). Therefore, I suggest that this bit be removed and the existing minimum pre-fetch definition be cleaned up instead.

2. DISC bit:

I don't understand the last sentence in this paragraph, which ends "be truncated (or wrapped) at time discontinuities". What does the term "wrapped" mean?

3. SIZE bit:

The last sentence is unclear: "Simultaneous use of both number of segments and segment size is vendor specific." There is no combination of the SIZE and IC bits which allows the target to look at both the number of segments and the segment size, so I don't see how a target can look at both fields simultaneously and comply with the definition of the fields.

4. DRA bit:

A.
The DRA bit duplicates the maximum pre-fetch. A maximum pre-fetch length of 0 accomplishes the same thing, so why create two ways of doing the same thing?

B. In addition, as currently defined, a DRA of 1 (disable read-ahead) only disables pre-fetch if the pre-fetch incurs overhead time. If a device can pre-fetch without adding any overhead, it is allowed to pre-fetch regardless of the setting of this bit.

C. If this bit is incorporated, it should be renamed "Disable Pre-fetch (DPF)" to be consistent with the terminology used elsewhere in the caching page.

5. Define segmentation:

The current document doesn't define what cache segmentation means. If a read command is issued, must it work completely within one cache segment or can it use multiple segments? If a device is set up to pre-fetch until its cache is full (large pre-fetch limits), must it stop pre-fetch when one cache segment is full or can it allocate and fill all cache segments? Both interpretations are allowed as currently defined. Other issues are unclear. If a read (otherwise cacheable) is larger than the cache segment size, is one cache segment used as a circular buffer, is the non-cache buffer used, or are multiple cache segments used?

6. Cache segmentation:

Note that setting a specific number of cache segments or a specific segment size is a hit-or-miss process. The initiator has no knowledge of what values the target supports in the "Number of Cache Segments" and "Cache Segment Size" fields. In a device which uses bank select techniques (e.g. a 256K cache and internal DMA which addresses 64K directly) there may be arbitrary restrictions on these fields. The non-cache segment size will also be affected by hardware architecture in the same way (for instance, the device might have several small non-cache segments rather than one large one).

7. Non Cache Segment Size:

The last sentence of this field's definition needs work.
It currently states: "The impact of the Non Cache Buffer Size equal 0 or the sum of this field plus the Cache Segment Size greater than the buffer size is vendor specific." The above sentence assumes a single cache segment in computing whether the buffer size has been exceeded. It should be re-worded to use the product of the Cache Segment Size and the Number of Cache Segments.

8. Rounding:

Some notes on how the target should handle rounding should be added. With this page the target will tend to round parameters by default, since the initiator has no idea of the target's buffer size or of the restrictions on breaking it up. For example, I assume that setting a Non Cache Segment Size of 1 byte will not cause the device to read the disk 1 byte at a time (though I'm aware of hardware which could do it).

9. Field sizes:

There are currently SCSI devices with 1MB and larger cache sizes. I think the 3 byte Non Cache Segment Size field (16MB maximum) will be too small. Similarly, a Cache Segment Size field of 2 bytes (64K byte maximum) is probably too small. Cache sizes are growing and might easily exceed the defined field sizes. As an alternative, the cache segment size information could be defined in terms of block sizes.

10. Performance implications:

The caching page (and the current DPO bit in the read / write commands) can have some very great performance impacts if implemented literally by a device. For example:

A. If the Non Cache Segment Size is 0 and the DPO bit is set, it's possible to deadlock if all cache segments are in use. DPO set means that no cache segment may be freed for the current command, and a Non Cache Segment Size of 0 means there's no non-cache buffer. I hope nobody does a literal implementation.

B. A small Non Cache Segment Size: If the non-cache segment size is equal to 1 block, does the device read a block at a time (implying 1 block per physical rotation)? Poorly chosen values will result in some very poor performance.

C.
Ignoring parameters: I anticipate that many devices will treat the caching page as advisory and ignore it where it would cause ridiculously low performance. If so, this defeats the purpose of the caching page and leaves the initiator without real control.

11. Multiple Logical Units:

While bridge controllers are slowly going out of date, this modified caching page will either require a very complex caching algorithm for multiple logical unit controllers or imply that changes to the caching page for one LUN will also affect the caching page for other LUNs which are present. I can't think of any other case where one LUN affects another (other than Send Diagnostic, which provides for it). This becomes very complex if a device has two LUNs with different block sizes (though I admit I've never known anybody to do this in practice).