DavidW at DavidW at
Thu Feb 27 15:03:03 PST 1997

* From the SCSI Reflector (scsi at, posted by:
* DavidW at

Oh goody. Another reflector debate.  I may actually get good at this after 
a while.  Responses to responses to responses somewhere below.


Frank Campbell <f_campbell at> Wrote:
| David,
| My comments/responses are embedded below.
| Frank Campbell
| QLogic
| DavidW at wrote:
| > 
| > Frank,
| > 
| > Many of the things that you point out as concerns are considered features by
| > many of us who worked on SBP-2.  Details embedded below.
| > 
| > David Wooten
| > Compaq
| > 
| > Frank Campbell <f_campbell at> Wrote:
| > |
| > | Before we consider withdrawal of SBP we should subject SBP-2 to greater
| > | scrutiny.
| > |
| > | There are aspects of SBP-2 which I believe should give rise to concern:
| > |
| > | 1.    SBP-2 requires that targets deal with scatter/gather and paging
| > | issues within the host.
| > 
| > Actually, I think the way that SBP-2 is described right now, the
| > scatter/gather tables are a performance optimization.  We could have 
| > a transfer that had to be scattered/gathered with lots of individual commands.
| > Each of these commands would have taken more bus cycles to fetch and more
| > overhead on the device to process (in order to get reasonable performance, the
| > target would have had to prefetch a large number of commands and discover 
| > association through the target address.)  So, after much debate, the group
| > decided that these tables were less complex than the alternative(s).
| Unfortunately, I was not present at these debates. Why would it have
| been necessary to break up commands? IDE, SCSI and Fibrechannel do not
| provide scatter/gather. Why is it necessary for 1394?

Hmm.  My understanding of IDE is that it does have scatter/gather.  It might 
not be visible on the IDE cable but it's there.  Either the CPU is moving the 
data to/from the disk from/to memory or we are using the new whizzy DMA.  In 
either case, a scatter/gather table is present and being used.  Now, when we 
go to 1394, neither of these methods is particularly desirable, and, in fact, 
using the CPU to move the data is impossible.  We could have added something 
like the IDE DMA hardware but we have found that this doesn't work when we are 
trying to handle o-o-o execution of commands or when we have to deal with a 
large number of devices (IDE DMA only deals with one thing at a time.)

SCSI also has scatter/gather.  It is provided by the host controllers in a 
manner that is similar to what is done for IDE DMA.  This was the only option 
for SCSI because SCSI has no provisions for passing memory addresses across 
the bus.  We don't like the SCSI interface because of this shortcoming.  It 
tends to require a lot of complexity in a host controller in order to deal 
with the scatter/gather of each device (or you have to serialize the requests 
which gives up performance.)  
Anyway, the fact that devices can use memory addresses is a good thing because 
that allows us to build devices that can share the burden of scatter/gather.  
Since each and every device we add can be its own bus master, we don't tend to 
run out of system resources (DMA channels) as fast.  It's kind of like the 
difference between ISA and PCI.  The bus mastering capability of PCI is _much_ 
better than the third party DMA of ISA.
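
As a rough illustration of the kind of table under discussion, here is a 
minimal sketch of splitting a virtually contiguous buffer into physical 
segments at page boundaries.  This is not the SBP-2 page table format; the 
4096-byte page size, the page_map dictionary and the build_sg_table name are 
all assumptions made up for the example.

```python
PAGE_SIZE = 4096  # assumed host page size, for illustration only


def build_sg_table(virt_addr, length, page_map):
    """Split a virtually contiguous buffer into (physical address, length) segments.

    page_map is a hypothetical dict from virtual page number to the
    physical base address of that page.
    """
    segments = []
    while length > 0:
        page, offset = divmod(virt_addr, PAGE_SIZE)
        chunk = min(length, PAGE_SIZE - offset)  # never cross a page boundary
        segments.append((page_map[page] + offset, chunk))
        virt_addr += chunk
        length -= chunk
    return segments


# A 512-byte sector buffer that straddles a page boundary.
page_map = {1: 0x80000, 2: 0x23000}
table = build_sg_table(0x1F00, 512, page_map)
for phys, size in table:
    print(hex(phys), size)
```

With a table like this in hand, whoever moves the data (CPU, host controller 
or a bus-mastering device) can walk the segments without needing the buffer 
to be physically contiguous.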


| > 
| > | 2.    SBP-2 requires that targets have the capability to break transfers 
| > | on odd byte boundaries.
| > 
| > This is simply a property of the application.  Many applications have the need
| > to move data to arbitrary boundaries.  Without this capability in SBP-2, the
| > host would have to double-buffer, which adds latency and decreases performance.
| > Also, our BIOS experts told us that they simply did not have any 
| > amount of space laying around to support double buffering.  So, we could have
| > 'simplified' the device but that would have made it a bad fit to the
| > application.
| > 
| > Many disk drive people spent a lot of time wrestling with this.  They had
| > wanted sector alignment, but then conceded to quadlet alignment and then,
| > after some more work, decided that byte alignment was merely an 
| > and not a significant problem.
| Again, since other interfaces do not provide this capability, why is it
| necessary for SBP-2 to provide it? How do your BIOS experts deal with
| odd byte boundaries and scatter/gather for IDE and SCSI?

Again, the device doesn't solve this problem, it is left to the interface to 
solve it.  Or, in the case of IDE, the CPU gets to solve it.  

| > 
| > | 3.    SBP-2 uses host addresses as handles with no means of identifying
| > | stale handles, exposing the risk of stale handles causing corruption.
| > 
| > I'm not sure what you mean here.  SBP-2 uses actual device addresses.  If
| > the node address gets stale due to a bus reset, the target stops its 
| > and waits for the target to fixup the node addresses and reissue the 
| My objection is that in the event that a stale handle is sent by a
| target to a host, there is nothing in SBP-2 to identify it. Since the
| handles are host addresses, a stale handle that involves a write can
| cause corruption in host memory. If the assertion is that stale handles
| will never occur and targets will always behave perfectly, then there is
| no problem. 

The possibility of stale 'handles' is well understood and comprehended in 
gruesome detail by SBP-2 and every other 1394 document that I know of.

| > 
| > | 4.    SBP-2 uses more bus transactions than necessary to perform I/O. As 
| > | worst case example, a single sector read from disk requires up to 14 bus
| > | arbitration cycles and 14 packet transfers.
| > 
| > I'm not sure where you get your 14 number from.  Granted, one might think
| > of a protocol that uses fewer bus transactions but I think it would be hard to
| > outperform SBP-2 from a global context.  I/O optimization is not simply a
| > matter of using the fewest I/O bus transactions.  Optimization includes the
| > time taken to get from endpoint to endpoint or to/from the application 
| > the target.  SBP-2 does a better job of this than anything we have seen on
| > other buses.
| What I quoted was a worst case example. 1394 is a split transaction bus
| and hence, worst case, a read or write involves 2 bus arbitrations for 2
| packets, request and response. For a single disk block read we have the
| following:
| 	host writes to doorbell address to notify target
| 	target reads the most recent ORB to get the address of the new ORB
| 	target reads the new ORB
| 	target reads page map
| 	target writes data up to page boundary
| 	target writes data from page boundary to end
| 	target writes status
| Seven reads or writes. Seven requests and seven responses. Fourteen
| packets/bus arbitrations.
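
The arithmetic behind the figure of fourteen can be written out as a short 
sketch.  The step list is taken from the message above; the assumption that 
every transaction is split into a separate request and response packet is 
the stated worst case, and the variable names are made up for the example.

```python
# Each entry is one read or write transaction from the list above:
# (node that initiates the transaction, what it does).
STEPS = [
    ("host", "write to doorbell address"),
    ("target", "read most recent ORB to find the new ORB"),
    ("target", "read the new ORB"),
    ("target", "read page map"),
    ("target", "write data up to page boundary"),
    ("target", "write data from page boundary to end"),
    ("target", "write status"),
]

# Worst case assumed here: every transaction is split, so it needs a
# request packet and a separate response packet, each with its own
# bus arbitration.
PACKETS_PER_SPLIT_TRANSACTION = 2

transactions = len(STEPS)
worst_case_packets = transactions * PACKETS_PER_SPLIT_TRANSACTION
print(transactions, worst_case_packets)
```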

Oh, well, I guess that could happen but... Although it is not precluded, I 
doubt that the device is going to send an ack_pending on a write to the 
doorbell register. And I'm pretty sure that the host is not going to send an 
ack_pending for the status write but it is not precluded.  Actually, I could 
get the number of arbitrations up to some _very_ large number simply by 
throwing in some ack_busy responses.  Thing I'm having trouble understanding 
is, what does it matter?  I will admit that if one is doing isolated accesses, 
the protocol isn't as efficient as it might be.  On the other hand, when we 
have isolated accesses, optimizing performance isn't a big deal.  The place 
where we really tried to optimize SBP-2 was when we had lots to do.  Then, the 
ability of the device to pace itself and optimize its performance without CPU 
intervention is of _great_ benefit to the system.
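
To put rough numbers on the ack_pending and ack_busy point, here is a small 
sketch.  It assumes that a write completed by the acknowledge alone (no 
ack_pending) costs one packet, that a split transaction costs two, and that 
each ack_busy retry costs one extra packet and arbitration.  The function 
name and the retry count of 10 are invented for illustration.

```python
def packet_count(split, unified, busy_retries=0):
    """Rough packet/arbitration count for one I/O.

    split        -- transactions needing a request and a separate response
    unified      -- writes completed by the acknowledge alone (no ack_pending)
    busy_retries -- requests resent after an ack_busy (assumed: one extra
                    packet and one extra arbitration per retry)
    """
    return 2 * split + unified + busy_retries


# All seven transactions split: the worst case quoted above.
all_split = packet_count(split=7, unified=0)

# Doorbell write and status write complete without ack_pending.
two_unified = packet_count(split=5, unified=2)

# Same as above, plus an arbitrary ten busy retries.
with_busy = packet_count(split=5, unified=2, busy_retries=10)

print(all_split, two_unified, with_busy)
```

The count moves in either direction depending on how the nodes acknowledge, 
which is why a single worst-case figure says little about loaded-bus 
behavior.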

| > 
| > | 5.    SBP-2 hosts use DMA addresses supplied by targets, creating security
| > | problems.
| > 
| > What security problem?  The addresses are created by trusted software
| > (drivers).  So why do we have a security problem?  What addresses could the
| > device use that would not create a security problem?
| The addresses may be created by 'trusted' software, but they are passed
| to target devices which then are required to return them some time
| later. I haven't seen anything in SBP-2 that would allow the 'trusting'
| system to verify that they have not been changed/corrupted by the
| target.

We trust 'target' devices to not corrupt addresses all the time.  Go look at 
any bus where there are bus masters (e.g.,  PCI) and you will see 'target' 
devices that are inherently no more or less trustworthy than is a 1394 target. 
 Besides, there are so many other ways for devices to screw us up that getting 
the address right is not a significant problem.  For example, if the disk 
returns the wrong data when the swap space is accessed, we can put it in 
exactly the right place in system memory (assuming that there is a right place 
for the wrong data) and the system crashes.  So, why are you worried about the 
addresses getting hosed?

| > 
| > |
| > | Items 1,2 and 4 relate to the cost and complexity of target hardware and
| > | firmware. Items 3 and 5 relate to reliability and security.
| > |
| > | Frank Campbell
| > | Qlogic Corporation
| > | Costa Mesa, California.
| > |

I think the folks who worked on SBP-2 did a pretty good job of taking 
advantage of the attributes of 1394 to do some pretty good system optimization. 
 Every group (disk drive, host controller, systems, software, firmware, etc.) 
was represented in putting the SBP-2 spec together and I think it 
represents a reasonably optimum approach to putting devices on 1394.
* For SCSI Reflector information, send a message with
* 'info scsi' (no quotes) in the message body to majordomo at
