stds-1394: empirical character of SBP2 scatter/gather lists?

Eric Anderson ewa at
Tue Jul 3 23:07:14 PDT 2001

* From the T10 Reflector (t10 at, posted by:
* "Eric Anderson" <ewa at>
By the way, Apple uses the "unrestricted page table" exclusively
(except for the occasional ORB with no page table at all); we could
not identify any reason to bother with the "normalized page table".
It offers no advantage to either the OS or the target, as far as I
can tell, since support for unrestricted page tables is mandatory.

Furthermore, because the unrestricted page table is an unconstrained
scatter/gather list, we expose this in our API, and allow drivers to
specify noncontiguous buffers of any alignment they choose.  The
moral:  As a target, you should fully support unrestricted page
tables, and make no assumptions about the alignment of any PTE, or
any boundary between PTEs.  This is mandatory in SBP-2; experience
suggests that your device will fail if you do not comply fully with
the spec.  This includes page table entries with lengths that are
not a multiple of four, and start addresses that are also not a
multiple of four, nor the same non-multiple of four as the previous
start address.
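
To make the point concrete, here is a minimal sketch in Python (an
illustration only, not firmware from any shipping target) of walking an
unrestricted page table with no alignment assumptions.  The names
plan_transfers, page_table and max_payload_bytes are hypothetical, the
page table is modelled simply as (segment_base, segment_length) pairs,
and any other constraint a real target must honor is ignored.

```python
def plan_transfers(page_table, max_payload_bytes):
    """Split each (segment_base, segment_length) entry into bus requests.

    Nothing here assumes quadlet alignment: a segment may start at any
    byte address and have any byte length.
    """
    if max_payload_bytes <= 0:
        raise ValueError("max_payload_bytes must be positive")
    requests = []
    for segment_base, segment_length in page_table:
        offset = 0
        while offset < segment_length:
            # Never cross from one page table entry into the next.
            chunk = min(max_payload_bytes, segment_length - offset)
            requests.append((segment_base + offset, chunk))
            offset += chunk
    return requests


# Hypothetical page table: odd start addresses, odd lengths, and a
# different misalignment from one entry to the next.
page_table = [(0x12345FFF, 0x0001), (0x6789A002, 0x0203), (0xBCDEF001, 0x0005)]
requests = plan_transfers(page_table, max_payload_bytes=0x200)
```

The only arithmetic involved is byte addresses and byte counts; a target
that rounds either one to a multiple of four somewhere along the way is
the kind of target described above.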

Several commercially available 1394-ATA bridges have bugs in this
area, and we painfully double-buffer some I/O to them as a result,
at a great loss in performance.  I will not identify them; ask your
silicon suppliers very clearly if they have this problem or not.


>* From the T10 Reflector (t10 at, posted by:
>* "John Nels Fuller" <jfuller at>
>Which OS are you using?  What you are asking about is primarily a function
>of the OS.  The target device should not care about how the page table is
>specified other than the size of packets it must create (addresses are
>numbers that are filled into the headers).  I know from a past life that
>Windows 2000 and XP will generate the kind of page table that you expect
>(but with page size of 0), but that Win 98 and ME will be much more
>arbitrary (in fact Win 98 was the reason that unrestricted page tables
>were added to the standard).  If you want to question Microsoft's
>you should email 1394 at
>John Nels Fuller
>Principal Engineer -- Standards
>Interconnect Architecture Lab
>Sony US Research Labs
>24034 NE 29th Street
>Sammamish, WA   98074-5468   USA
>Primary phone (cell): +1 206 409 0338
>Office (no messages): +1 425 558 0464
>Fax: +1 425 558 0575
>email: jfuller at
>-----Original Message-----
>From: owner-stds-1394 at
>[mailto:owner-stds-1394 at]On Behalf Of Pat LaVarre
>Sent: Tuesday, June 26, 2001 5:46 PM
>To: t10 at
>Cc: p.lavarre at
>Subject: stds-1394: empirical character of SBP2 scatter/gather lists?
> > mailto:stds-1394 at
>Hey, hi, has anyone here already dug into the empirical character of ORBs?
>How about in particular the actual content of the theoretically
>"unrestricted page table"s to which ORBs point?  I ask because me,
>over bus traces, I'm seeing both More and Less variety than I knew to
>expect in page tables.
>Me, I first learned to spell FireWire about three weeks ago.  Now I'm
>looking to connect with less clueless folk, hoping to provoke you to share
>your history of pain and thereby shortcut my learning curve.  Me, I'm
>trying to make sense of actual bus traces.
>I'm wondering how completely I have misunderstood what's going on in how
>popular shipping operating systems actually manage the virtual memory
>involved in block i/o?
>Me, I had the idea that block i/o involved a length of blocks at a virtual
>address.  This virtual memory, once pinned, would appear in physical
>memory as a leading fragment of a page, then a series of whole pages, then a
>trailing fragment.
>Offline two different volunteers told me my summary is none too clear, so
>let's try a specific example.  Given a block device that stores x200 (512)
>bytes per block and a host that allocates physical memory in chunks of
>x1000 (4096) bytes each, I thought I might someday see the scatter/gather list
>for a physically fragmented, but virtually contiguous, region of memory look
>something like:
>         for a total of ............ x3400 bytes
>         from address x1234:5FFF for x0001 bytes
>         from address x6789:A000 for x1000 bytes
>         from address xBCDE:F000 for x2000 bytes
>         from address x0123:4000 for x03FF bytes
>A scatter/gather list like this has LOTS of structure.
>All but the first fragment begins at a x1000-byte boundary.  All but the
>last fragment ends at a x1000-byte boundary.  The byte length of any
>middle fragment is a multiple of x1000.  The sum of the lengths of the first and
>last fragments is a multiple of x200 bytes.  Like I said, LOTS of
>structure.
>But the scatter/gather lists (aka "unrestricted page table"s) that I see
>in actual bus traces of 1394 sbp2 traffic have both less and more structure
>than this example.
>In particular, I see four surprises ...
>1) I see more structure than I expect:
>Some hosts seemingly enforce a less than minimal alignment: every starting
>"segment_base" address appears as a multiple of x10 and correspondingly
>every byte length ends up a multiple of x10.  I imagine somebody out there
>is double-buffering.
>WHERE in CSR space does the device get to share its opinion of what
>double-buffering is?
>2) I see less structure than I expect:
>I've heard on some hosts the virtual memory page size is x1000 bytes.
>Indisputably (log2(x1000) - 8) is 4.  But the "page_size" field of the
>ORBs I see isn't 4: it is 0.  Somebody out there wants to say I'll be
>seeing "unrestricted page table"s?
>WHY favour "unrestricted page table"s?
>WHERE in CSR space does the device get to express its preference for
>restricted page tables?
>3) Less structure than I expect:
>Smack in the middle of an sbp2 page table, on occasion, I see
>"segment_length"s that are not multiples of x1000.
>HOW can an app make that happen?
>4) Less structure than I expect:
>I see a variety of "max_payload" values in the ORBs.  I think I understand
>the "max_payload" value, if small, can tell the device to subdivide access
>of even a single element of a page table.
>As the "max_payload" value, mostly I see 7 to 9, where I was expecting to
>see xA i.e. (log2(x1000) - 2).
> > mailto:sbp3 at
>Please tell me if I'd do better to ask there, rather than here
>(t10 at
>Thanks in advance.    PatLaVarre 
>* For T10 Reflector information, send a message with
>* 'info t10' (no quotes) in the message body to majordomo at
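
The arithmetic in the quoted message can be checked mechanically.  The
following is a minimal Python sketch, using only the example list and
the two log2 formulas quoted above; the function names are hypothetical,
and the "multiple of x1000" rule is read as applying to the middle
fragments only.

```python
PAGE = 0x1000   # host page size from the quoted example
BLOCK = 0x200   # device block size from the quoted example

# The scatter/gather list from the quoted message: (address, length).
sg_list = [
    (0x12345FFF, 0x0001),
    (0x6789A000, 0x1000),
    (0xBCDEF000, 0x2000),
    (0x01234000, 0x03FF),
]


def total_length(sg):
    return sum(length for _, length in sg)


def has_expected_structure(sg, page=PAGE, block=BLOCK):
    """Check the four structural properties listed in the quoted message."""
    starts_ok = all(base % page == 0 for base, _ in sg[1:])
    ends_ok = all((base + length) % page == 0 for base, length in sg[:-1])
    middles_ok = all(length % page == 0 for _, length in sg[1:-1])
    blocks_ok = (sg[0][1] + sg[-1][1]) % block == 0
    return starts_ok and ends_ok and middles_ok and blocks_ok


def page_size_field(page_bytes):
    """log2(page_bytes) - 8, for a power-of-two page size."""
    return page_bytes.bit_length() - 1 - 8


def max_payload_field(payload_bytes):
    """log2(payload_bytes) - 2, for a power-of-two payload size."""
    return payload_bytes.bit_length() - 1 - 2


def payload_bytes_from_field(field):
    """Inverse of max_payload_field."""
    return 1 << (field + 2)
```

By the same formula, the observed max_payload values of 7 to 9
correspond to payloads of x200 (512) to x800 (2048) bytes.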

