CAM-2

MGauthier MGauthier at iit.nrc.ca
Tue Jan 10 09:39:00 PST 1995


Subject:     RE>CAM-2 
> Date: Wed, 4 Jan 1995 19:53:29 -0500
> From: Bill Dallas <dallas at zk3.dec.com>
> Message-Id: <9501050053.AA14855 at wasted.zk3.dec.com>
> To: scsi at WichitaKS.ATTGIS.COM
> 
> The CAM-2 working group will be meeting on January 10, 1995 to discuss the
> options for CCB structure sizing and some general rules.  The CAM-2
> working group will then present to the general SCSI working group its
> thoughts on the structure and sizing of the CCBs and the reasoning for its
> selection.  It is hoped that the general SCSI working group will comment on the
> selection (e.g., approve, disapprove,  enhance the CCB, sizing and
> general rules or have specific ideas).
> 
> The migration from the CAM specification, which is based upon the SCSI-2
> specification, to the CAM-2 specification, which will be based on SCSI-3,
> presents a number of problems.  The largest problem that SCSI-3
> presents to CAM is the conversion from a dense addressing space to an
> extremely sparse addressing space.  The expansion of the target and
> LUN from 8 bits to 64 bits will break all the currently written CAM
> software.  The term break used in the previous sentence means that the
> currently written software will have to be modified and restructured to
> conform with CAM-2 and SCSI-3.

I concur; not only because the new CAM-2 CCBs will have bigger fields,
and different layout, but also because the assumption that one can
organise active logical units by arrays indexed by SCSI ID and LUN
is no longer true.  Even simple things like scanning the bus for
devices, i.e., trying every possible ID (and LUN for responding IDs),
won't work in many environments.  Scanning for reachable devices
likely cannot be done by client/application software anymore, but
would have to be handled through CAM; appropriate primitives
(CAM commands) for getting a list of active devices need to be
adapted or added.
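
To illustrate the indexing point: with 64-bit IDs and LUNs, a dense
[id][lun] array is no longer possible, so a driver has to search a sparse
table keyed by the identifiers.  A minimal sketch in C follows.  All names
here (lu_entry, lu_find, u64_same) are hypothetical and are not part of
CAM or CAM-2, and the sketch assumes that unsigned int is 32 bits wide.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int U_32;   /* assumption: unsigned int is 32 bits */
typedef struct U_64 { U_32 msb; U_32 lsb; } U_64;

/* One known logical unit; hypothetical, not a CAM structure. */
struct lu_entry {
    U_64  target;        /* 64-bit target identifier */
    U_64  lun;           /* 64-bit logical unit number */
    void *driver_data;   /* whatever the peripheral driver keeps per unit */
};

static int u64_same(const U_64 *a, const U_64 *b)
{
    return a->msb == b->msb && a->lsb == b->lsb;
}

/* Linear search of a small table; returns NULL if the unit is unknown. */
static struct lu_entry *lu_find(struct lu_entry *tab, size_t n,
                                const U_64 *target, const U_64 *lun)
{
    size_t i;

    assert(target != NULL && lun != NULL);
    for (i = 0; i < n; i++) {
        if (u64_same(&tab[i].target, target) && u64_same(&tab[i].lun, lun))
            return &tab[i];
    }
    return NULL;
}
```

A linear search is only reasonable for a handful of units; a hash table
keyed on the same pair would be the obvious replacement for larger sets.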

> Other problems with the CAM specification today are its lack of
> expansion capability for new features to both the CAM and SCSI
> specifications and its assumption of a 32 bit processor world.  The CAM-2
> working group will be working towards a specification that is easily
> expanded, fully dynamic and processor word size independent.
> 
> The following are some ideas that should help in solving some of these
> issues.
> 
> Processor word size independence:
> To allow transportability (not binary compatibility) of CAM-2 peripheral
> drivers, and SIM/HBA's between different machine platforms and
> operating systems, a cam types definition file (cam_types.h) will be
> provided by the supplier of the xpt.  The cam_types.h header file shall
> define the CCB structures member sizes.  While the CCB's size varies
> based on machine word size, they are of fixed size and structure
> member offsets are fixed for a specific machine word size.

Your example definitions seem to imply, however, that the size would
stay the same for all architectures but only the way in which it is
defined varies (e.g., defining two U_32's instead of one U_64).
Any change in size would be only due to an architecture's natural
alignment, and that could likely be normalized with proper padding.

> The supplier of the xpt shall also supply a CCB definitions file (cam.h)
> which shall use the defined CCBs as specified by CAM-2.  The CCBs
> specified in CAM-2 shall specify specific member names of the CCB and
> the size of each member. 

Why does it have to be supplied by the xpt vendor, and not be the same
for all architectures of a given word size?  (not many word sizes to deal
with, likely only 32 and 64 bits, though 16 bits and 128 bits might be
considered; each could have standard definition...?).

> Alignment (address boundaries) of the CCB and its members shall also be
> specified by CAM-2. It shall be the responsibility of the xpt supplier to
> preserve those alignments as specified by the alignment rules. The
> alignment rules allow CCB members to be at a known offset for 16, 32,
> 64, etc. bit processors. 
> 
> I have chosen the C language to represent the types definition file.
> The cam_types.h file contents shall contain type definitions of the following:
>  typedef char   I_8;                    /* 8 bits */

You should say "signed char" rather than "char", which may be signed
or unsigned depending on compiler and compiler options.  Actually, it is
probably safer to say "signed" in other signed definitions below.
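
As a rough illustration of the suggestion above, here is a sketch of the
8-bit and 16-bit definitions with explicit signedness, plus checks that
the sizes are what the names promise.  The CAM_STATIC_CHECK macro and the
cam_types_selfcheck() function are hypothetical helpers for this example
only; they are not part of the proposed cam_types.h.

```c
#include <assert.h>
#include <limits.h>

typedef signed char    I_8;     /* 8 bits, always signed    */
typedef unsigned char  U_8;     /* 8 bits, always unsigned  */
typedef signed short   I_16;    /* 16 bits, always signed   */
typedef unsigned short U_16;    /* 16 bits, always unsigned */

/* Compile-time check: the array size is negative, so compilation
 * fails, whenever cond is false. */
#define CAM_STATIC_CHECK(name, cond) typedef char name[(cond) ? 1 : -1]

CAM_STATIC_CHECK(cam_check_char_bits, CHAR_BIT == 8);
CAM_STATIC_CHECK(cam_check_i8_size,  sizeof(I_8)  == 1);
CAM_STATIC_CHECK(cam_check_i16_size, sizeof(I_16) == 2);

/* Run-time confirmation that the signedness matches the type names. */
static void cam_types_selfcheck(void)
{
    assert((I_8)-1 < 0);
    assert((U_8)-1 > 0);
    assert((I_16)-1 < 0);
    assert((U_16)-1 > 0);
}
```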

>  typedef unsigned char    U_8;          /* 8 bits */
>  typedef short  I_16;                   /* 16 bits     */
>  typedef unsigned short   U_16;         /* 16 bits     */
> /* For different machine word sizes ( e.g., 32 or 64 bit ) */
> #ifdef 32_BIT
>  typedef  long I_32;                     /* 32 bits          */
>  typedef unsigned long  U_32;            /* 32 bits          */
> #endif /* 32_BIT */
> #ifdef 64_BIT
> typedef int I_32;                        /* 32 bits     */
> typedef unsigned int U_32;               /* 32 bits     */
> #endif /* 64_BIT */
> 
> Alignment Rules:
>      All (x)_8 (chars) shall begin on a 8 bit address boundary and shall
>      be 8 bits in length.
> 
>      All (x)_16 (shorts) shall begin on a 16 bit address boundary
>      and shall be 16 bits in length.

Probably shouldn't say (shorts) or even (chars) as these don't necessarily
have the given number of bits.  ("Octets" should by definition however always
consist of eight bits.)

>      All x_32  (32 bits) shall begin on a 32 bit address boundary and
>      shall be 32 bits in length.

Shouldn't the same apply for 64 bits?  Why not simply a general rule,
that any basic type should be aligned according to its size?

>      All pointers shall begin on a machine word address boundary and shall
>      be of machine word size.

Careful, is machine word size necessarily the size of a pointer?
On 8086's (and 68000's?), I would think machine word size is 16,
but pointers are 32 bits.  (A general rule would sidestep this issue.)
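
The general rule suggested above (align every basic type to its own
size) reduces to one small computation, sketched below.  The function
name align_up is an assumption made for this example; the rule itself
does not depend on what a "machine word" is.

```c
#include <assert.h>
#include <stddef.h>

/* Round offset up to the next multiple of size.
 * size must be a power of two (1, 2, 4, 8, ...). */
static size_t align_up(size_t offset, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (offset + size - 1) & ~(size - 1);
}
```

For example, a 4-byte member that would otherwise start at offset 5 is
placed at align_up(5, 4), which is 8; the three bytes in between are
padding.  A pointer is handled the same way, using whatever size pointers
have on the platform.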

>      All structures and unions and arrays shall begin and end on a
>      machine word boundary. If they don't they shall be padded out to
>      the machine word boundary.
> 
>      All CCBs shall begin on a machine word boundary.

Should alignment to cache lines also be considered?  In high-performance
implementations on certain hardware, cache-alignment might simplify
low-level code where it has to deal with explicit cache management.
Of course this may make sense with a 68040's 16-byte cache lines, but
probably not with 1024+ byte cache lines in certain multi-level caching
schemes.

>      Due to compiler differences between machine platforms and
>      operating systems if the next defined member type does not align
>      with the specified alignment then it shall be padded to force
>      alignment.
> 
>      Padding shall use the following type:
>           U_8          :8;     /* Alignment Padding            */

Couldn't bigger sizes also be used as long as padding with them is aligned
to their size?  And allow using arrays of the biggest type for long padding
sections aligned to the biggest type?  Otherwise the above gets rather
verbose with long padding areas.
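
A sketch of what such padding could look like, assuming named padding
members are acceptable (C does not allow unnamed array members) and that
unsigned int is 32 bits.  The structure and its member names are
hypothetical, chosen only to show byte padding next to word-sized padding.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char U_8;
typedef unsigned int  U_32;   /* assumption: unsigned int is 32 bits */

struct pad_example {
    U_8  code;       /* offset 0 */
    U_8  pad0[3];    /* byte padding up to the next 32-bit boundary */
    U_32 flags;      /* offset 4 */
    U_32 pad1[4];    /* 16 reserved bytes as one array of a wider type */
    U_32 tail;       /* offset 24 */
};

/* Confirms that the explicit padding yields the intended offsets. */
static void pad_example_check(void)
{
    assert(offsetof(struct pad_example, flags) == 4);
    assert(offsetof(struct pad_example, pad1) == 8);
    assert(offsetof(struct pad_example, tail) == 24);
}
```

The 16 reserved bytes take one line here instead of sixteen "U_8 :8;"
lines, and because pad1 starts on a 32-bit boundary it introduces no
hidden compiler padding.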

> I have looked at a number of different options for the CAM-2 CCBs from a
> GPP similar structure arrangement to what has been proposed/talked in
> the distant past CAM-2 Working Group meetings. I believe the following
> CCB Header definitions are the most optimal for CAM-2.  The below
> CAM-2 header definitions allow for CAM and CAM-2 peripheral drivers
> and SIM/HBAs to co-exist in a system also allowing for backwards
> compatibility.
>
> I have rejected the GPP similar structure arrangement due to the following
> reasons:
>      Too large a departure from the current definitions.
> 
>      The use of too many data pointers.  The use of pointers to describe
>      data is very flexible but has many drawbacks.  The allocation of the
>      storage areas impacts O.S. performance the more you allocate the
>      greater the impact.  Has a tendency to be wasteful of a critical
>      resource (system memory) where most operating systems have
>      power of 2 kernel memory allocators so when you ask for a buffer of
>      36 bytes you get a 64 byte bucket.

I would be interested to have a look at this "GPP" proposal, if
possible -- at least to see the history of what was rejected and
understand the above reasons for its rejection.  And perhaps avoid
repeating anything, though I somewhat doubt I would have come up
with extra pointers... (???)


> CAM (CAM-1) CCB header:
> /* Common CCB CAM header definition. 
>  * For 32 bit machines 
>  */
> typedef struct ccb_header {
>      void *my_addr;                /* The address of this CCB */
>      U_16 cam_ccb_len;             /* Length of the entire CCB */
>      U_8 cam_func_code;            /* XPT function code */
>      U_8 cam_status;               /* Returned CAM subsystem status */
>      U_16 cam_hrsvd0;              /* Reserved */
>      U_8 cam_path_id;              /* Path ID for the request */
>      U_8 cam_target_id;            /* Target device ID */
>      U_8 cam_target_lun;           /* Target LUN number */
>      U_32 cam_flags;               /* Flags for operation of the subsys */
> }CCB_HEADER;                       /* structure ends on 32 bit boundary */
> 
> CAM 2 CCB header:
> /* Common CCB CAM 2 header definition. 
>  * For 32 bit machines 
>  */
> typedef struct ccb_header2 {

You might give thought to making identifiers unique in the first 8 chars,
for those older compilers that don't take any more into account.  This
hasn't been done in the past (e.g., cam_target_id vs cam_target_lun),
but we've had to change structure definitions in our environment to
follow our portability guidelines.

>      void *my_addr;                /* The address of this CCB */
>      U_16 cam_ccb_len;             /* Length of the entire CCB */
>      U_8 cam_func_code;            /* XPT function code/CAM 2 signifier */
>      U_8 cam_status;               /* Returned CAM subsystem status */
>      U_32 cam_path_mid;            /* Path ID for the request (Most
>                                     * significant) 
>                                     */
>      U_32 cam_path_lid;            /* Path ID for the request (Least
>                                     * significant) 
>                                     */

Here I would rather not see large integers split up explicitly at this level.
At least not if expressing them in C.  Please!  For one, it forces a byte-order
(something which has not so far been discussed).  Although a machine might
be mostly 32 bit, it may have instructions that help it deal with 64 bits,
and that should be left to its natural order if possible.  It also makes things
more consistent.  I would rather, in 32 bit environments, define something
like:

    typedef struct I_64 {
       I_32   msb;
       I_32   lsb;
    } I_64;		/* 64-bit signed integer, as a structure */

    typedef struct U_64 {
       U_32   msb;
       U_32   lsb;
    } U_64;		/* 64-bit unsigned integer, as a structure */

You could then define struct ccb_header2 more cleanly and naturally
using U_64's.  One might even keep the same definition for various
architectures (compiler-specific variations might still occur because
of padding, etc. considerations, but it is easier to only have to deal with
writing special header files for exceptional compilers, which has to be
done in any case anyway).

If one wants to avoid the potential confusion of the "U_64" type with
a real integer, one might name it something else like "U_64_struct"
or whatnot.  However these pseudo U_64's can be treated like integers
to the extent that they can be assigned and declared as storage (the only
thing to watch out for is passing structures as parameters by value to
functions, which may work differently on different compilers or
environments).
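
To make the "magic cookie" treatment concrete, here is a small sketch of
the only operations such a U_64 structure really needs: assignment,
equality, and a test for the all 1's value.  The helper names are
hypothetical.  The helpers take pointers rather than structures by value,
which sidesteps the parameter-passing differences mentioned above.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int U_32;   /* assumption: unsigned int is 32 bits */
typedef struct U_64 { U_32 msb; U_32 lsb; } U_64;

/* Nonzero if both halves match. */
static int u64_equal(const U_64 *a, const U_64 *b)
{
    assert(a != NULL && b != NULL);
    return a->msb == b->msb && a->lsb == b->lsb;
}

/* Sets every bit, e.g. to form an all 1's path id. */
static void u64_set_all_ones(U_64 *v)
{
    assert(v != NULL);
    v->msb = 0xFFFFFFFFu;
    v->lsb = 0xFFFFFFFFu;
}

/* Nonzero if every bit is set. */
static int u64_is_all_ones(const U_64 *v)
{
    assert(v != NULL);
    return v->msb == 0xFFFFFFFFu && v->lsb == 0xFFFFFFFFu;
}
```

Plain structure assignment (b = a) copies both halves, so no helper is
needed for that.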

Little arithmetic, if any, will probably ever be done with U_64's; if I
understand/recall correctly, SCSI IDs and LUNs are mostly treated as
"magic cookies" in SCSI-3.  As for path ids, what they are is yet to be
defined here, though if 64 bits are ever to be used, they will almost
certainly be magic cookies as well (modulo the all 1's path id for XPT,
and without excluding the possibility of other special values).

I expect that 64 bit path ids will be used by providing another
primitive for explicitly scanning for path ids.  That is, XPT would
provide another command, "get next path id", which takes a path id as a
parameter:  the XPT would return the next path id following the one
provided, or the first path id if given a special value (say, the XPT
all 1's path id).  The XPT would determine the format of the 64-bit
path ids, and would format them such that the "get next path id"
primitive can be executed efficiently.  Alternatively, and this may be
more flexible in the end, scanning for path ids could require a
context, with the same context used for getting successive path ids,
similarly to the way opendir() and readdir() work in POSIX.

This primitive would be trivial to implement in current CAM-1 software
when migrating it to CAM-2, if it just uses the least significant 8
bits (or any 'n' lsbits) of path ids as an index into an array of SIMs
as is currently done in CAM-1.  So why not define such a "get next path
id" primitive right away?  It doesn't add complexity, yet allows for
immediate expansion and compatibility with arbitrary 64-bit path ids
(at the discretion of the XPT of a given architecture/environment).
Thus instead of saying "we define path ids as taking up 64 bits but
only the 8 lsbits are significant at this time" as below, we make it
trivial to implement path ids that only use the 8 lsbits but allow full
use of all 64 bits.

Will not some similar mechanism be needed for scanning for SCSI IDs and
LUNs, since they can no longer be simply scanned numerically?  Having
some similar mechanism for discovering path ids right away would be
more consistent, and simple to understand.  Has any design been put
forward for a method of scanning for SCSI IDs and LUNs?  Maybe "get
next SCSI ID" and/or "get next LUN" primitives, with scanning context
as suggested above for scanning path ids?  Or is there something
totally different needed, some implementation requirements I just
haven't considered?  It just seems the obvious solution.
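
A minimal sketch of the context-based variant of "get next path id",
under the assumption that the XPT keeps its SIMs in a small array and
uses only the low 8 bits of the 64-bit path id.  Every name below
(xpt_table, path_scan, path_scan_open, path_scan_next) is hypothetical;
nothing here is taken from a CAM-2 draft.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int U_32;   /* assumption: unsigned int is 32 bits */
typedef struct U_64 { U_32 msb; U_32 lsb; } U_64;

#define MAX_SIMS 0xF0

/* Hypothetical XPT-side table: slot i is in use if registered[i] != 0. */
struct xpt_table {
    unsigned char registered[MAX_SIMS];
};

/* Scan context, in the spirit of the handle used by opendir()/readdir(). */
struct path_scan {
    const struct xpt_table *xpt;
    unsigned next_slot;
};

static void path_scan_open(struct path_scan *s, const struct xpt_table *xpt)
{
    assert(s != NULL && xpt != NULL);
    s->xpt = xpt;
    s->next_slot = 0;
}

/* Stores the next registered path id in *out and returns 1,
 * or returns 0 once every slot has been visited. */
static int path_scan_next(struct path_scan *s, U_64 *out)
{
    assert(s != NULL && out != NULL);
    while (s->next_slot < MAX_SIMS) {
        unsigned slot = s->next_slot++;
        if (s->xpt->registered[slot]) {
            out->msb = 0;
            out->lsb = slot;   /* only the low 8 bits are significant here */
            return 1;
        }
    }
    return 0;
}
```

A caller never looks inside the returned id; it only hands it back to the
XPT.  An XPT with a different id format would change path_scan_next()
without affecting any caller.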

>      U_32 cam_target_mid;          /* Target device ID (Most significant) */
>      U_32 cam_target_lid;          /* Target device ID (Least significant) */
>      U_32 cam_target_mlun;         /* Target LUN number  (Most
>                                     * significant)
>                                     */
>      U_32 cam_target_llun;         /* Target LUN number  (Least
>                                     * significant)
>                                     */
>      U_32 cam2_func_code;          /* The Real CAM 2 function code. */     
>      U_32 cam_flags;               /* Flags for operation of the subsys */
>      U_8 :8;                       /* Reserved for expansion */
>      U_8 :8;                       /* Reserved for expansion */
>      U_8 :8;                       /* Reserved for expansion */
>      U_8 :8;                       /* Reserved for expansion */
>      U_32 cam_target_flags;        /* Target mode flags */
>      U_8 :8;                       /* Reserved for expansion */
>      U_8 :8;                       /* Reserved for expansion */
>      U_8 :8;                       /* Reserved for expansion */
>      U_8 :8;                       /* Reserved for expansion */
>      U_8 :8;                       /* Reserved for expansion */
>      U_8 :8;                       /* Reserved for expansion */
>      U_8 :8;                       /* Reserved for expansion */
>      U_8 :8;                       /* Reserved for expansion */
>      U_8 :8;                       /* Reserved for expansion */
>      /* The above reservation of 8 bytes is done to allow for future routing

(nitpicking) There is one byte too many (9 are present).

>       * of the request over a communication path (e.g., network).  The
>       * boundaries of  what a host (system) is has rapidly changed from
>       * what it was 2 years ago. While I haven't thought this out completely
>       * yet (maybe 16 bytes is needed source and destination addresses) I
>       * have placed it here as a marker to provoke some thought into it.
>       */
> }CCB_HEADER2;            /* structure ends on 32 bit boundary */

With the above suggestions this structure would look like:

/* Common CCB CAM 2 header definition. 
 * For 32 bit machines 
 */
typedef struct ccb2_header {
     void *my_addr;                /* The address of this CCB */
     U_16 cam_ccb_len;             /* Length of the entire CCB */
     U_8 cam_func_code;            /* XPT function code/CAM 2 signifier */
     U_8 cam_status;               /* Returned CAM subsystem status */
     U_64 cam_path_id;             /* Path ID for the request */
     U_64 cam_target_id;           /* Target device ID */
     U_64 cam_target_lun;          /* Target LUN number */
     U_32 cam2_func_code;          /* The Real CAM 2 function code. */     
     U_32 cam_flags;               /* Flags for operation of the subsys */
     U_32 cam_pad0;                /* Reserved for expansion */
     U_32 cam_target_flags;        /* Target mode flags */
     U_64 cam_pad1;                /* Reserved for expansion */
} CCB2_HEADER;          /* structure ends on 32 bit (and 64-bit) boundary;
                           would also end on 64-byte (512-bit) boundary if
                           an additional 64 bits were added */
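
The size arithmetic for this layout can be checked mechanically.  The
sketch below repeats the structure with one deliberate change, which is
an assumption of this example: my_addr is declared as a U_32 stand-in
for the 32-bit pointer, so that the 32-bit layout can be checked even
when compiled on a host with wider pointers.  It also assumes 16-bit
short and 32-bit int.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char  U_8;
typedef unsigned short U_16;   /* assumption: 16 bits */
typedef unsigned int   U_32;   /* assumption: 32 bits */
typedef struct U_64 { U_32 msb; U_32 lsb; } U_64;

typedef struct ccb2_header {
    U_32 my_addr;           /* stand-in for a 32-bit void * (see above) */
    U_16 cam_ccb_len;       /* Length of the entire CCB */
    U_8  cam_func_code;     /* XPT function code/CAM 2 signifier */
    U_8  cam_status;        /* Returned CAM subsystem status */
    U_64 cam_path_id;       /* Path ID for the request */
    U_64 cam_target_id;     /* Target device ID */
    U_64 cam_target_lun;    /* Target LUN number */
    U_32 cam2_func_code;    /* The real CAM 2 function code */
    U_32 cam_flags;         /* Flags for operation of the subsys */
    U_32 cam_pad0;          /* Reserved for expansion */
    U_32 cam_target_flags;  /* Target mode flags */
    U_64 cam_pad1;          /* Reserved for expansion */
} CCB2_HEADER;

/* Confirms the offsets and the 56-byte total for the 32-bit layout. */
static void ccb2_header_check_layout(void)
{
    assert(offsetof(CCB2_HEADER, cam_path_id) == 8);
    assert(offsetof(CCB2_HEADER, cam2_func_code) == 32);
    assert(offsetof(CCB2_HEADER, cam_pad1) == 48);
    assert(sizeof(CCB2_HEADER) == 56);
}
```

At 56 bytes, one more 64-bit field brings the header to exactly 64 bytes.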

> Note there is a restriction that there can only be 0x0 to 0xef SIM/HBAs in
> CAM-2. This restriction will be lifted when CAM-3 (if there is one) is
> defined. When this occurs it is expected that all SIM/HBAs must be
> migrated to CAM-2.

There is no need to be so restrictive to maintain a smooth upgrade path.
Likely initial implementations may only use 8 bits, but a method that
allows use of full 64 bits can be defined right away.  See comments above.
Perhaps what should be said is that at most 0xf0 SIM/HBAs may be
registered at any time in CAM *if* it supports CAM-1 CCBs (one can
imagine newer environments that could reasonably support CAM-2
without support for CAM-1, if all drivers have to be adapted to this
new OS/environment anyway).  Let it be clear that this requirement
is solely a consequence of support for both CAM-1 and CAM-2 CCBs
at the same time by an XPT.  One can also imagine that an XPT could
support both types of CCBs by maintaining an array (of no more than
0xf0 elements) of HBA/SIMs each with their own full 64-bit path id,
and taking the 64-bit path id in CAM-2 CCBs and the 8-bit index in
CAM-1 CCBs.  One might ask why one would do that.  Perhaps if there
are more than 0xf0 SIM/HBAs, only the subset of SIM/HBAs that are
accessed by CAM-1 peripheral drivers would have an entry in the
array and be accessible by CAM-1 CCBs.  After all, if we don't expect
more than 0xf0 SIM/HBAs, why have 64 bits for path ids in the
first place?
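
A sketch of the dual lookup described above, assuming an XPT that keeps
at most 0xF0 CAM-1-visible slots and stores a full 64-bit path id in each.
The structure and function names (sim_slot, xpt, xpt_find_cam1,
xpt_find_cam2) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char U_8;
typedef unsigned int  U_32;   /* assumption: unsigned int is 32 bits */
typedef struct U_64 { U_32 msb; U_32 lsb; } U_64;

#define XPT_CAM1_SLOTS 0xF0

struct sim_slot {
    int   in_use;
    U_64  path_id;   /* full CAM-2 path id chosen by the XPT */
    void *sim;       /* the registered SIM/HBA */
};

struct xpt {
    struct sim_slot slots[XPT_CAM1_SLOTS];
};

/* CAM-1 CCB: the 8-bit path id is simply an index into the array. */
static struct sim_slot *xpt_find_cam1(struct xpt *x, U_8 index)
{
    assert(x != NULL);
    if (index >= XPT_CAM1_SLOTS || !x->slots[index].in_use)
        return NULL;
    return &x->slots[index];
}

/* CAM-2 CCB: the 64-bit path id is matched against each slot. */
static struct sim_slot *xpt_find_cam2(struct xpt *x, const U_64 *id)
{
    size_t i;

    assert(x != NULL && id != NULL);
    for (i = 0; i < XPT_CAM1_SLOTS; i++) {
        struct sim_slot *s = &x->slots[i];
        if (s->in_use && s->path_id.msb == id->msb && s->path_id.lsb == id->lsb)
            return s;
    }
    return NULL;
}
```

SIM/HBAs beyond the 0xF0 limit would live in some other XPT structure
reachable only through the 64-bit lookup; that part is left out here.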

> Rules for the CAM 2 XPT:
>      The XPT shall support both CAM and CAM-2 CCBs.
> 
>      The XPT shall in xpt_action() determine if this is a CAM or CAM-2
>      function and route accordingly.
> 
>      The XPT shall be addressed by a cam_path_id of 0xFF (CAM) and
>      a cam_path_mid of 0xFFFFFFFF with a cam_path_lid of
>      0xFFFFFFFF (CAM-2). This means the CAM-2 XPT can be
>      addressed by both addresses so peripheral drivers can determine if
>      this is a CAM or CAM-2 XPT.
> 
>      The XPT shall report in the CAM Path Inquiry CCB that it is a
>      CAM-2 XPT.
> 
> Rules for the SIM/HBAs:
>      The SIM/HBAs shall report in the CAM Path Inquiry CCB that it is a
>      CAM or CAM-2 SIM/HBA.
> 
>      The SIM/HBAs that support CAM-2 shall report in the CAM-2 Path
>      Inquiry CCB that it supports CAM-2 CCBs and it shall report if it
>      supports CAM CCBs (optional).
> 
>      The SIM/HBA shall determine through a CAM Path Inquiry CCB that
>      it is a CAM or CAM-2 XPT.

By "determine", do you mean "report"?  (If yes, it seems redundant
with the preceding paragraph, if not, it seems contradictory with it...?)

Also, the XPT (presumably) always determines the path id, including
64-bit path ids for CAM-2.  This would mean that if the SIM/HBA
supports CAM-2, it should learn of its 64-bit path id(s) from the XPT.
I.e., xpt_bus_register() should return a 64-bit path id (by reference
in 32-bit systems where functions cannot return more than 32 bits at
a time), and xpt_bus_deregister() should take a 64-bit path id as
parameter (again, possibly by reference if needed).
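
As a sketch of what "by reference" could mean here, the two functions
below register and deregister a SIM with the path id passed through a
pointer.  The names xpt_bus_register64 and xpt_bus_deregister64, their
signatures, and the return convention (0 on success, -1 on failure) are
all assumptions made for this example; they are not the CAM-1 functions
and not a CAM-2 proposal.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int U_32;   /* assumption: unsigned int is 32 bits */
typedef struct U_64 { U_32 msb; U_32 lsb; } U_64;

#define XPT_MAX_SIMS 0xF0

struct xpt {
    void *sims[XPT_MAX_SIMS];   /* NULL marks a free slot */
};

/* Registers sim and writes the path id chosen by the XPT to *path_id_out.
 * Returns 0 on success, -1 if no slot is free. */
static int xpt_bus_register64(struct xpt *x, void *sim, U_64 *path_id_out)
{
    U_32 i;

    assert(x != NULL && sim != NULL && path_id_out != NULL);
    for (i = 0; i < XPT_MAX_SIMS; i++) {
        if (x->sims[i] == NULL) {
            x->sims[i] = sim;
            path_id_out->msb = 0;
            path_id_out->lsb = i;
            return 0;
        }
    }
    return -1;
}

/* Releases the slot named by *path_id.
 * Returns 0 on success, -1 if the id is not currently registered. */
static int xpt_bus_deregister64(struct xpt *x, const U_64 *path_id)
{
    assert(x != NULL && path_id != NULL);
    if (path_id->msb != 0 || path_id->lsb >= XPT_MAX_SIMS)
        return -1;
    if (x->sims[path_id->lsb] == NULL)
        return -1;
    x->sims[path_id->lsb] = NULL;
    return 0;
}
```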

> Rules for CAM-2 peripheral drivers:
> 
>      Peripheral drivers shall determine through the CAM Path Inquiry
>      CCB if the XPT is a CAM or CAM-2 XPT.
> 
>      Peripheral drivers shall determine through the CAM Path Inquiry
>      CCB if the addressed SIM/HBA is a CAM or CAM-2 SIM/HBA.

Presumably to know whether its CCBs are supported, or to know which
CCB to use in talking to the XPT if it (the PD) supports both.

>      A peripheral driver shall support CAM or CAM-2 and optionally both.


I should mention that we are currently designing a SCSI support
architecture for our realtime message-passing multiprocessor OS, Harmony
(see <http://wwwsel.iit.nrc.ca/harmony.html>).  Because it is fully
message-passing, without necessarily shared memory between processes
(e.g., on multiprocessor systems connected by means other than a common bus),
CAM request structures don't fit naturally (or efficiently) into messages.
So I have completely redesigned CAM structures to provide us with a
native SCSI interface between PD and SIM/HBA that is more easily and
efficiently translated into messages -- without, incidentally, losing
efficiency or simplicity for non-message-passing environments.
As such they are no longer CAM, but may be useful in helping design
CAM-2 structures.

The main difference is that "input" and "output" fields have been
identified and separated.  This way output fields (sent by peripheral
driver to SIM/HBA [via XPT]) can be sent in one contiguous message,
and "input" fields be received just as efficiently as well.  Target-mode
only fields have also been separated and are only present in messages
exchanged by target-mode peripheral drivers (the design for target-mode
support is still incomplete however).  Fields that are common to all
commands of a linked command chain had also been separated from fields
that are unique (or partially unique) for each linked command, however
linked commands are in general seldom encountered and are not usually
among the commands that form the bulk of I/O requests such as READ and
WRITE (is there experience to the contrary?).  So support for optimizing
messages when linked commands are present is being dropped, since any
added complexity for optimizing linked commands will most likely slow
down the more general case of unlinked commands.  I will show examples
of these structures in a subsequent message.
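
Until then, the following sketch only illustrates the separation idea; it
is not the Harmony design.  The field choices and all names (io_out,
io_in, io_request, io_pack_out, io_unpack_in) are invented for this
example.  The point is that each direction is one contiguous block that
can be copied into or out of a message in a single step.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char U_8;
typedef unsigned int  U_32;   /* assumption: unsigned int is 32 bits */
typedef struct U_64 { U_32 msb; U_32 lsb; } U_64;

/* "Output" fields: sent by the peripheral driver towards the SIM/HBA. */
struct io_out {
    U_64 target;
    U_64 lun;
    U_32 flags;
    U_32 data_len;
    U_8  cdb[16];
};

/* "Input" fields: returned to the peripheral driver. */
struct io_in {
    U_32 status;
    U_32 residual;
};

struct io_request {
    struct io_out out;
    struct io_in  in;
};

/* Copies only the output part into a message buffer.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
static size_t io_pack_out(const struct io_request *r, void *msg, size_t cap)
{
    assert(r != NULL && msg != NULL);
    if (cap < sizeof r->out)
        return 0;
    memcpy(msg, &r->out, sizeof r->out);
    return sizeof r->out;
}

/* Copies a received reply into the input part of the request. */
static void io_unpack_in(struct io_request *r, const void *msg)
{
    assert(r != NULL && msg != NULL);
    memcpy(&r->in, msg, sizeof r->in);
}
```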

One thing that has been avoided is duality of meaning of certain fields,
such as a data pointer being either a pointer to a block of data or a
pointer to a scatter/gather list.  Rather, only a pointer to a scatter/
gather list is present, but a single scatter/gather element (pointer +
size) is present in the I/O request structure, which the scatter/gather
list pointer can point to in the simple case of a single block of data
being transferred.  A similar thing is being done for the command
descriptor block (CDB).  However, one point that is still unclear to me is
whether SCSI-3 commands can ever exceed 16 bytes.  If not, then the simplest
thing to do is to provide 16 bytes for the CDB and not support
interpretation of the CDB as a pointer to the real CDB.  If they can,
and there is no fixed and reasonably short bound on the size of the CDB,
then what we'd do is provide 16 bytes for the CDB and a separate pointer
to the CDB.  The pointer would always be used, and may point either to
the 16-byte CDB array within the request structure, or to an external
array of command data bytes if more than 16 bytes are needed.
Avoiding such duality makes the code easier to understand, and *more
efficient* (since code doesn't have to check the meaning of a field;
it simply always accesses the same, more general field).
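
A sketch of this "always use the pointer" arrangement, covering both the
scatter/gather list and the CDB.  The structure layout and the names
(sg_elem, io_request, io_set_single_buffer, io_set_cdb, io_total_len)
are hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef unsigned char U_8;

#define CDB_INLINE_MAX 16

struct sg_elem {
    void  *addr;
    size_t len;
};

struct io_request {
    struct sg_elem *sg_list;     /* always valid; may point at sg_single */
    size_t          sg_count;
    struct sg_elem  sg_single;   /* storage for the one-buffer case */
    const U_8      *cdb;         /* always valid; may point at cdb_bytes */
    size_t          cdb_len;
    U_8             cdb_bytes[CDB_INLINE_MAX];
};

/* Simple case: one buffer, described by the embedded element. */
static void io_set_single_buffer(struct io_request *r, void *buf, size_t len)
{
    assert(r != NULL);
    r->sg_single.addr = buf;
    r->sg_single.len = len;
    r->sg_list = &r->sg_single;
    r->sg_count = 1;
}

/* Short CDBs are copied inline; longer ones stay in the caller's array,
 * which must remain valid for the life of the request. */
static void io_set_cdb(struct io_request *r, const U_8 *cdb, size_t len)
{
    assert(r != NULL && cdb != NULL);
    if (len <= CDB_INLINE_MAX) {
        memcpy(r->cdb_bytes, cdb, len);
        r->cdb = r->cdb_bytes;
    } else {
        r->cdb = cdb;
    }
    r->cdb_len = len;
}

/* Consumer side: no flag to test, the list is walked the same way always. */
static size_t io_total_len(const struct io_request *r)
{
    size_t i, total = 0;

    assert(r != NULL);
    for (i = 0; i < r->sg_count; i++)
        total += r->sg_list[i].len;
    return total;
}
```

The only branch is in io_set_cdb(), on the producer side; code that
consumes the request reads sg_list and cdb unconditionally.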

I would have liked to attend the CAM WG meeting, however this is
beyond the scope of our budget at this time.  I am, nevertheless,
very much interested in what happens in the development of CAM-2.


Regards,

-Marc

--
Marc E. Gauthier
Software Engineering Lab, Institute for Information Technology (SEL,IIT)
National Research Council Canada, Bldg M-50,  Ottawa ON Canada  K1A 0R6
+1 613 991 6975   fax:  +1 613 952 7151    email:  mgauthier at iit.nrc.ca





