Object oriented issues
Dave_B_Anderson at seagate.com
Thu Jul 20 13:50:24 PDT 2000
* From the T10 Reflector (t10 at t10.org), posted by:
* Dave_B_Anderson at seagate.com
Thanks for your comments. This issue of data transfers in both directions
is a tough problem for SCSI, but at the same time, I think required for
efficiently implementing OBSD. Perhaps Jim is right when, in another memo,
he wrote that some other protocol, not SCSI, probably would be more
appropriate. Also, Jim's suggestions for alternative solutions to the
problem are worthwhile, though I actually think that more instances will
arise for employing bi-directional transfers. If that is the case, then it
is better that we come to this realization sooner rather than later. Still, I would
really prefer to find a way to make it work in SCSI, if at all possible.
We can always port the definition to another protocol, and certainly there
are some candidates in the wings eager for a chunk of the storage
interconnect business - TCP/IP and InfiniBand, for example. But there is
so much invested by the industry in SCSI, and it has proven itself over and
over to have the flexibility and extensibility to meet our needs for so
long, that it would seem really unfortunate if we cannot find a SCSI
solution to this requirement, as well. I shudder to think about having to
bring another protocol up to the level of SCSI, in terms of maturity,
stability and solutions - i.e. all the problems that have been addressed
and resolved in it.
This is probably heresy to you, and most of the rest of T10 probably does
not want to hear anything that smells like changing SAM, but we have
actually developed and are shipping SPI drives that implement a command
that sends data in both directions. (The host in this case is a large
system that has low level control over command processing and is able to
accept the second, in-bound data stream.) It has worked very well for this
customer. As this was done as a proprietary command for a single customer,
we have not pursued approaching T10 about accommodating the command.
Nevertheless, it was not that hard to do and, as I said, continues to work
Seagate would certainly be willing to propose the needed changes.
Appreciate your thoughts,
Robert Snively <rsnively at Brocade.COM> on 07/18/2000 10:40:23 AM
To: "'hafner at almaden.ibm.com'" <hafner at almaden.ibm.com>, t10 at t10.org
cc: Dave_Anderson at notes.Seagate.COM
Subject: RE: Object oriented issues
Making data transfer flow both ways requires the management of TWO data
pointers, including the capability of explicitly modifying both of them.
While serial SCSI usually contains the pointer embedded in the data or
in the data request, parallel SCSI does not have that capability and must
create a new labeling process for the pointers.
Normally, the data pointers are actually DMA engines, implying that two
simultaneous DMA engines would be required for each command, one inbound
and one outbound. That doubles the DMA state that has to be maintained for
both serial and parallel SCSI host adapters, assuming that there are a pair
of re-usable DMA engines on each host adapter.
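
To make the bookkeeping concrete, here is a minimal sketch of the per-command state described above. It is only an illustration under stated assumptions: the names DataPointer and CommandContext are hypothetical, the 4096-byte write and 8-byte ObjectID sizes are made up, and no real host adapter or SCSI library is modeled. It shows that a bi-directional command carries two independently modifiable pointers where an ordinary command carries one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DataPointer:
    """Hypothetical transfer state for one direction of one command."""
    offset: int = 0      # bytes already transferred
    remaining: int = 0   # bytes still to transfer

    def advance(self, nbytes: int) -> None:
        """Account for nbytes moved in this direction."""
        if not 0 <= nbytes <= self.remaining:
            raise ValueError("transfer exceeds remaining length")
        self.offset += nbytes
        self.remaining -= nbytes

    def modify(self, delta: int) -> None:
        """Explicitly move the pointer, e.g. backwards to resend data."""
        if not -self.offset <= delta <= self.remaining:
            raise ValueError("pointer moved outside the buffer")
        self.offset += delta
        self.remaining -= delta


@dataclass
class CommandContext:
    """Hypothetical per-command state kept by a host adapter."""
    tag: int
    data_out: Optional[DataPointer] = None  # host to device
    data_in: Optional[DataPointer] = None   # device to host

    def active_pointers(self) -> int:
        """Number of pointers (DMA contexts) this command needs."""
        return sum(p is not None for p in (self.data_out, self.data_in))


# An ordinary WRITE needs one pointer.
write_only = CommandContext(tag=1, data_out=DataPointer(remaining=4096))

# A combined CREATE and WRITE (sizes are assumptions) needs two.
create_and_write = CommandContext(
    tag=2,
    data_out=DataPointer(remaining=4096),
    data_in=DataPointer(remaining=8),
)
create_and_write.data_out.advance(512)
create_and_write.data_in.advance(8)
```

Each pointer has to be saved, restored and modified on its own, which is where the doubling of state comes from.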
And, as one of you folks has already pointed out, the error processing
opens a whole new can of worms.
The savings of overhead with a simultaneous CREATE AND WRITE is probably
not significant in serial SCSI. The overheads on the link are low compared
with the processing overheads required to perform and commit the CREATE
function on the logical unit. Note that the CREATE should probably be
pretty much an uninterruptible operation. Locking the WRITE to it forces
the loss of a revolution on a disk device that doesn't cache. Depending on
how you implement this, it could create an extended period of
or busyness for a device. On devices that do cache, data integrity is
threatened because you must not only record the data, but record the
of the object and the descriptors of the recorded data before you can be
data integrity. And of course, at the individual disk level, the whole object
oriented approach is somewhat suspect unless file system level mirroring
is provided. RAIDs should be okay, since they implement recording
and redundant non-volatile caching.
If the overhead you are actually worrying about is related to the
and creation of each object oriented command, then you have a far more
Doable, yes. Wise, no. Unless we can contain the data in a fixed maximum
sub-field of the command (say 32 bytes) so that it is never transmitted in
"data phase", let's instead look at Jim's solutions below.
> -----Original Message-----
> From: hafner at almaden.ibm.com [mailto:hafner at almaden.ibm.com]
> Sent: Monday, July 17, 2000 4:14 PM
> To: t10 at t10.org
> Cc: Dave_Anderson at notes.seagate.com
> * From the T10 Reflector (t10 at t10.org), posted by:
> * hafner at almaden.ibm.com
> There was an interesting discussion at the last T10 meeting on OSDs
> (osd-r01). I presented some suggestions in 00-262r0, and a reply
> (00-295r0) was supplied by Dave Anderson (Seagate). In many cases, we
> agreed on many things. A few things I raised questions about because I
> wasn't clear on the issues and requirements. A few things we disagree on
> mostly in terms of implementation. But....
> One issue that I want to open for discussion here is the CREATE. I didn't
> see how a CREATE could include write data (DataOut) and still get back an
> ObjectID in DataIn, so I suggested separating the two operations.
> Dave remarked that "this seemed wasteful in an environment where there are
> a lot of small file creates" (from 00-295r0, last page, item 6).
> My response is:
> 1) if or until there is bidirectional data on a single SCSI command, I
> don't see an immediate and good alternative but...
> 2) one could CREATE+WRITE with no returned ObjectID, and then follow that
> with a second command to request a report on the created object's ID
> (that's still two commands though there is better atomicity of
> 3) one could CREATE+WRITE w/ suggested ObjectID and then the OSD can return
> Status GOOD if the ObjectID was acceptable (to the OSD) and some other
> status (CHECK CONDITION and include in the additional sense data the actual
> ObjectID that was assigned by the OSD). This is a hack (in my opinion) but
> workable, though it does distribute "namespace" responsibilities in a
> different way.
> 4) The filesystem that expects to open lots of small files could issue a
> number of CREATE commands and cache the ObjectIDs for when a real file
> needs to be created. This modifies the filesystem behavior, but is not
> 5) We can mitigate the latency and overhead of many CREATES (as in (4)) by
> having a CREATE MULTIPLE (create 'n' objects) which would return a list of
> ObjectIDs. Interesting error scenarios arise in this case, however.
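
As a rough illustration of alternatives (3), (4) and (5) in the list above, the following sketch models them against an in-memory stand-in for an OSD. Everything here is an assumption made for the example: ToyOSD, ObjectIDPool, the method names, the integer ObjectIDs and the batch size are hypothetical and are not taken from osd-r01 or any T10 document. The point is only that each command moves data in a single direction.

```python
from collections import deque

GOOD = "GOOD"
CHECK_CONDITION = "CHECK CONDITION"


class ToyOSD:
    """Hypothetical in-memory stand-in for an object-based storage device."""

    def __init__(self):
        self._next_id = 1
        self.objects = {}

    def _allocate(self):
        # Skip any IDs already taken by host-suggested ObjectIDs.
        while self._next_id in self.objects:
            self._next_id += 1
        oid = self._next_id
        self._next_id += 1
        self.objects[oid] = b""
        return oid

    def create_multiple(self, n):
        """Alternative (5): one command, n ObjectIDs returned (DataIn only)."""
        return [self._allocate() for _ in range(n)]

    def write(self, oid, data):
        """Plain WRITE to an existing object (DataOut only)."""
        if oid not in self.objects:
            raise KeyError(oid)
        self.objects[oid] = data
        return GOOD

    def create_and_write(self, suggested_id, data):
        """Alternative (3): DataOut only; a replacement ObjectID, if any,
        is reported alongside CHECK CONDITION (standing in for sense data)."""
        if suggested_id not in self.objects:
            self.objects[suggested_id] = data
            return GOOD, None
        oid = self._allocate()
        self.objects[oid] = data
        return CHECK_CONDITION, oid


class ObjectIDPool:
    """Alternative (4): the filesystem caches pre-created ObjectIDs."""

    def __init__(self, osd, batch=4):
        self.osd = osd
        self.batch = batch
        self._ids = deque()

    def take(self):
        if not self._ids:
            self._ids.extend(self.osd.create_multiple(self.batch))
        return self._ids.popleft()


osd = ToyOSD()
pool = ObjectIDPool(osd, batch=4)

oid = pool.take()                 # refills the cache with one CREATE MULTIPLE
osd.write(oid, b"small file")     # later file creates cost one WRITE each

# Suggesting an ID the device has already handed out forces a reassignment.
status, assigned = osd.create_and_write(2, b"x")
# Suggesting an unused ID is accepted as-is.
status2, assigned2 = osd.create_and_write(100, b"y")
```

The error scenarios mentioned under (5), such as a CREATE MULTIPLE that only partly succeeds, are deliberately left out of this sketch.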
> Anybody else got thoughts on this?
> Anybody want to bite off changing SAM to allow for bi-directional data?
> Jim Hafner
> * For T10 Reflector information, send a message with
> * 'info t10' (no quotes) in the message body to majordomo at t10.org