5/95 FC-AL Direct Attach Disk Minutes

Kurt Chan kc at core.rose.hp.com
Fri May 12 10:15:46 PDT 1995

From:  Kurt Chan 
To:    FC, SCSI Reflectors 
Subj:  5/95 FC-AL Direct Attach Disk Adhoc Minutes
Date:  5/12/95 

               FC-AL Direct Attach Disk Adhoc Meeting Minutes
                            Harrisburg, PA 

    Dal Allan           ENDL                    dal.allan at mcimail.com 
    Charles Binford     Symbios Logic           charles.binford at symbios.com
    Kurt Chan           HP Roseville            kc at core.rose.hp.com
    Mike Chenery        Fujitsu                 mchenery at fcpa.fujitsu.com
    Jim Coomes          Seagate                 jim_coomes at notes.seagate.com
    Edward Fong         Amdahl                  edward.fong at spg.amdahl.com
    Giles Frazier       IBM Austin              gfrazier at vnet.ibm.com
    Gene Freeman        Symbios Logic           gene.freeman at symbios.com
    Ed Frymoyer         HP FCSI                 emf at best.com
    Ed Gardner          Quantum                 gardner at acm.org
    Gary Goodwin        IBM SPC                 ggoodwin at vnet.ibm.com
    Doug Hagerman       DEC                     hagerman at starch.enet.dec.com
    Norm Harris         Adaptec                 nharris at eng.adaptec.com
    Bill Hutchison      HP                      hutch at boi.hp.com
    Roger Hungerford    HP DMD                  rogerh at hpdml16.boi.hp.com
    Gene Milligan       Seagate                 
    Charles Monia       DEC                     monia at shr.dec.com
    Brian R. Smith      Infinity Software       isi_info at io.com
    Bob Snively         Sun Microsystems        bob.snively at sun.com
    Gary Stephens       FSI Consulting          6363897 at mcimail.com
    Horst Truestedt     IBM Rochester           truested at vnet.ibm.com
    Peter Walford       FCSI/DemoGraFx          walford at btr.com
    Gary Watson         Trimm Technologies      trimm at netcom.com

Dal reported that FC-PH2 will adopt a new N_Port common service parameter
which indicates whether or not the port was successful in acquiring its
"preferred" address.  There was considerable discussion over what the meaning
of "preferred" is. Some terminology before describing the two login bit
proposals:

Dal's interpretation of preferred address or AL_PA:
 a) On a fabric, an address requested by the N_Port in FLOGI (24 bits), or
 b) On FC-AL a hard-assigned AL_PA, or
 c) On FC-AL, the previously acquired AL_PA.

Using the above definitions, an NL_Port would be defined as "not
having" a preferred address if it has no hard-assigned AL_PA and is in
the process of powering up or performing power-on reset upon receipt
of a Hard Reset LIP.  A Target is not required to remember its
previously acquired address across power-cycles or power-on resets.

If the port has a hard-assigned address but is currently using a soft
address, it was agreed that the current address will become the preferred
address following the next LIP unless the port has been power-cycled or the
LIP is a directed Hard Reset LIP, in which case the hard-assigned address
becomes the preferred address.
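
The FC-AL rules above can be sketched as a small function.  This is only an
illustration of the minutes' wording, not text from FC-PH2 or the profile;
the function and parameter names are hypothetical, and "None" stands for
"no preferred address".

```python
from typing import Optional


def preferred_al_pa(hard_al_pa: Optional[int],
                    previous_al_pa: Optional[int],
                    power_on_or_hard_reset: bool) -> Optional[int]:
    """Return the AL_PA an NL_Port would prefer at the next initialization.

    hard_al_pa             -- hard-assigned AL_PA, or None if there is none
    previous_al_pa         -- AL_PA acquired in the previous initialization
    power_on_or_hard_reset -- True after a power cycle, power-on reset, or
                              directed Hard Reset LIP
    """
    if power_on_or_hard_reset:
        # The previously acquired address need not be remembered, so only
        # a hard-assigned address (if any) can be preferred.
        return hard_al_pa
    if previous_al_pa is not None:
        # Following an ordinary LIP the current (possibly soft) address
        # becomes the preferred address, even if a hard address exists.
        return previous_al_pa
    return hard_al_pa


# Hard address 0xE8, currently running on soft address 0xEF:
print(hex(preferred_al_pa(0xE8, 0xEF, False)))  # ordinary LIP
print(hex(preferred_al_pa(0xE8, 0xEF, True)))   # power cycle / Hard Reset LIP
```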

2. NPA and HANA
The first proposal for reporting address acquisition during N_Port login 
was called "NPA" (No Preferred Address).

NPA = 0 means:
 - port does not have a preferred address, or
 - on a fabric or point-point, the N_Port successfully acquired its preferred 
   address, or
 - on FC-AL following power-on reset, the NL_Port has a hard AL_PA and 
   successfully acquired it, or
 - on FC-AL following a LIP which was NOT accompanied by power-on reset, 
   the NL_Port successfully acquired its previously acquired AL_PA. 
NPA = 1 means the port had a preferred address and:
 - on a fabric or point-point, the preferred address was not acquired
 - on a loop, the current AL_PA is not the preferred AL_PA
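
As a rough sketch of the NPA definition above (names are hypothetical, and
the preferred address is assumed to have been determined separately, with
"None" meaning the port has no preferred address):

```python
from typing import Optional


def npa_bit(preferred_address: Optional[int], current_address: int) -> int:
    """Return the proposed NPA login bit for a port.

    preferred_address -- the port's preferred address (requested N_Port ID
                         on a fabric or point-point, hard or previously
                         acquired AL_PA on a loop), or None if it has none
    current_address   -- the address the port actually holds now
    """
    if preferred_address is None:
        return 0  # no preferred address at all
    # 0 if the preferred address was acquired, 1 otherwise
    return 0 if current_address == preferred_address else 1


print(npa_bit(None, 0xEF))   # no preferred address
print(npa_bit(0xE8, 0xE8))   # preferred address acquired
print(npa_bit(0xE8, 0xEF))   # preferred address not acquired
```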

The second proposal was a "HANA" bit (Hard Address Not Acquired).

HANA = 0 means:
  - On a fabric or point-point, the N_Port does not have a preferred address 
    or the N_Port successfully acquired its preferred address, or
  - On FC-AL, the NL_Port does not have a hard address, or the NL_Port has a 
    hard address which matches its current address
HANA = 1 means:
  - On a fabric or point-point, the N_Port has a preferred address which was 
    not granted to the N_Port during the most recent FLOGI, or
  - On FC-AL, the NL_Port has a hard address, which does not match the
    current soft or fabric-assigned address 
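
A comparable sketch of the HANA definition follows.  The names are
hypothetical; on a fabric or point-point link the "hard address" argument
is assumed to carry the preferred address requested in FLOGI.

```python
from typing import Optional


def hana_bit(hard_address: Optional[int], current_address: int) -> int:
    """Return the proposed HANA login bit for a port.

    hard_address    -- hard AL_PA on FC-AL (or the preferred address on a
                       fabric or point-point link), None if there is none
    current_address -- the address the port actually holds now
    """
    if hard_address is None:
        return 0  # nothing hard to acquire, so nothing to report
    return 1 if hard_address != current_address else 0


# A port with no hard address reports 0 regardless of which soft AL_PA it
# ended up with, so a change in a "soft" configuration is not flagged.
print(hana_bit(None, 0xE2))
print(hana_bit(0xE8, 0xE8))  # hard address acquired
print(hana_bit(0xE8, 0xE4))  # hard address lost to a conflict
```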

The primary difference between HANA and NPA is that HANA allows
implementations for which address conflicts are considered pathological to
only report when hard addresses were not acquired.  It would not alert users
to changes in "soft" configurations following initialization, whereas NPA
would.

   Note 1:  in order to protect "naive" hosts which have no concept of
   functioning with hard address conflicts, the current profile
   requires a device which did not obtain its hard
   address to not proceed past PRLI (see 8.3).  A HANA/NPA login bit
   may allow Targets to function with soft addresses by providing a
   reporting mechanism to hosts alerting ULPs of configuration

   Note 2: an additional ELS or PLOGI field which allows a target to
   report its hard address would enhance both NPA and HANA proposals.

Since the FC-PH2 editors are looking for guidance before the June Direct
Attach disk adhoc meeting, this subject will be discussed in detail over the
disk_attach reflector.  To subscribe, send an email message with a blank
subject line to majordomo at dtc.wdc.com with a single line of text:

                subscribe disk_attach your_email_address

You will receive an automated reply to your subscribe request.

One problem the user community currently has with loop addresses is that
the AL_PAs are meaningless to a user configuring SFF-8045 drives, since
the enumerated 7-bit values from Annex K in FC-AL are used, not the 8-bit
AL_PAs that are part of the 24-bit native N_Port Identifier.

The profile will add a normative clause which binds the 7-bit hard addresses
in SFF-8045 and the 8-bit AL_PAs described in FC-AL.
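
To illustrate the kind of binding involved, the sketch below derives a
7-bit to 8-bit table.  It rests on two assumptions that are not stated in
these minutes and should be checked against the FC-AL annex itself: that
the valid AL_PAs are the neutral-disparity 8B/10B data characters up to
hex EF, and that the enumeration runs from the highest AL_PA (index 0)
down to hex 00.  All names are hypothetical.

```python
# 5-bit sub-block values (the x in Dx.y) whose 6-bit code is unbalanced
UNBALANCED_5B = {0, 1, 2, 4, 8, 15, 16, 23, 24, 27, 29, 30, 31}
# 3-bit sub-block values (the y in Dx.y) whose 4-bit code is unbalanced
UNBALANCED_3B = {0, 4, 7}


def is_neutral(byte: int) -> bool:
    """True if the 8B/10B data character for this byte has neutral disparity.

    A character is neutral when both sub-blocks are balanced, or when both
    are unbalanced (the second then cancels the first).
    """
    x, y = byte & 0x1F, byte >> 5
    return (x in UNBALANCED_5B) == (y in UNBALANCED_3B)


# Assumed: AL_PAs are the neutral values below hex F0, highest value first.
AL_PAS = sorted((b for b in range(0xF0) if is_neutral(b)), reverse=True)

INDEX_TO_AL_PA = dict(enumerate(AL_PAS))
AL_PA_TO_INDEX = {al_pa: index for index, al_pa in INDEX_TO_AL_PA.items()}

for index in (0, 1, 2, 0x7D, 0x7E):
    print(f"7-bit value {index:#04x} -> AL_PA {INDEX_TO_AL_PA[index]:#04x}")
```

A table lookup like this is what a console display or rotary switch would
need in order to show users the 7-bit value while the loop itself operates
on the 8-bit AL_PA.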

It was agreed that a common goal is for as few users as possible to need
a copy of this table to perform their configuration or administrative
tasks.  This includes any users of system console displays as well as users
of displays or switches mounted on cabinets or storage devices.

However, this objective ran somewhat contrary to the suggestion that
system software needs to have a unified address scheme for all
storage, and that switching between 24 and 23 bit address schemes
based on whether or not a device is loop-attached vs fabric-attached
is not desirable.  Furthermore, public loops would pose some
difficulties in determining which address scheme was used in software,
even if a login bit could report loop vs fabric vs point-point

The only thing the group could agree on was that we would have a new term for
the 7-bit enumerated address: Loop_ID (in keeping with the term "SCSI ID").
There will be no profile recommendation regarding switches or displays,
although one disk cabinet vendor mentioned they would use rotary switches to
represent a hard offset within the Loop_ID space.  In total, there are six
possible addresses a device could retain and display:

     1) Hard AL_PA (or N_Port ID for fabric attach)
     2) Hard Loop_ID
     3) Previously acquired soft AL_PA
     4) Previously acquired soft Loop_ID
     5) Current AL_PA (or N_Port ID for fabric attach)
     6) Current Loop_ID

In Monterey we agreed that aliasing would not be a required feature of
NL_Ports.  Therefore, class 3 frames sent with the "wrong" Domain+Area
(nonzero in the case of Private NL_Port and zeroes in the case of Public
NL_Ports) could be discarded and result in timeouts on PDISC or PLOGI.

For Private NL_Ports, this means they cannot practically "discover" the
coexisting Public NL_Ports residing on the loop.  For Public NL_Ports, this
would mean they would have to make a two-pass discovery to find the Private
NL_Ports (one pass with Domain+Area = FL_Port, and another pass with
Domain+Area = 0).  Even though this does not affect private loops, a note
describing this procedure was added to version 1.60.  The procedure requires
that Public NL_Ports close the loop after sending PDISC or PLOGI so as not to
consume bus bandwidth in the event of a timeout, and that concurrent
Exchanges be supported during Target Discovery.
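
The two-pass discovery could look roughly like the following.  This is a
sketch only: the helper names and the example Domain+Area value are
hypothetical, and the actual PDISC/PLOGI transmission, loop close, and
timeout handling are left out.

```python
from typing import Iterable, Iterator, Tuple


def d_id(domain_area: int, al_pa: int) -> int:
    """Build a 24-bit D_ID from a 16-bit Domain+Area and an 8-bit AL_PA."""
    return ((domain_area & 0xFFFF) << 8) | (al_pa & 0xFF)


def discovery_targets(fl_port_domain_area: int,
                      al_pas: Iterable[int]) -> Iterator[Tuple[str, int]]:
    """Yield (pass name, D_ID) pairs for a Public NL_Port's discovery."""
    al_pas = list(al_pas)
    # Pass 1: Domain+Area of the FL_Port, which reaches Public NL_Ports.
    for al_pa in al_pas:
        yield ("public", d_id(fl_port_domain_area, al_pa))
    # Pass 2: Domain+Area of zero, which reaches Private NL_Ports.
    for al_pa in al_pas:
        yield ("private", d_id(0x0000, al_pa))


for pass_name, dest in discovery_targets(0x0102, [0xEF, 0xE8]):
    print(f"{pass_name:8s} D_ID = {dest:06X}")
```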

This subject was brought up again when version 1.50 of the profile was
discovered to have an error. There are four cases where the Node and
Port names could be bound on a dual-ported device. The first three
cases involve both port and node names being extended (60-bit) addresses
('n' = one byte):

        CASE_1                   CASE_2                 CASE_3
  01nnnnnn   02nnnnnn      01nnnnnn  02nnnnnn     01nnnnnn   02nnnnnn
       00nnnnnn                 00nnnnnn                00nnnnnn

The fourth case has the node name equal to a 48-bit IEEE address, with the
port names extended:
			   01nnnnnn 02nnnnnn

Seagate implements case 3.  Ed mentioned that 48 bits is not large enough for
the consumer disk drive market, even though Peter mentioned that allowing a
node to be a 48-bit entity would allow it to become part of the general
networking community and perhaps act as a catalyst for the address expansion
proposals being considered in IEEE.

Gary and Charles Binford brought up the notion that it is unnecessary
to require the lower 48 bits to be the same, since the WWNs are
already bound in login.  Requiring them to be bound by name either
prohibits a multi-vendor node solution where the ports are acquired
from different vendors than the node (e.g., an array controller), or
requires that ports acquire their WWNs from their nodes - viewed as
unnecessarily restrictive.

Therefore, case 3 will be adopted for single-LUN targets, with a mapping
to CH1 and CH2 in SFF-8045.  Other devices will not be required to have any
naming consistency between ports or nodes.  For ease of notation, the first
nibble of each name will be reserved to distinguish between ports and nodes,
with a common 56-bit base address for all three names, yielding 256 times
more node names than case 4.

6. E_D_TOV
Currently E_D_TOV is selected by an Initiator to be the largest value
of the discovered Targets.  This implies that if a new Target is
inserted with a longer value, the Initiator must relogin with all
Targets to update the E_D_TOV value, which would abort all open

Before the discussion degenerated into an argument about
initialization timers, Dal said that until we have a proposal for
empirical determination by initiators and optimization among multiple
initiators, we should simply make 10 seconds the default E_D_TOV for
all Targets and Initiators, with the ability for Initiators to LOWER
the chosen value, but not increase it.  LIP initializes E_D_TOV in all
devices to 10 seconds.
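
One possible reading of Dal's suggestion is sketched below.  The class and
method names are hypothetical, and the choice to reject any increase until
the next LIP is an interpretation of "LOWER ... but not increase", not
something the group spelled out.

```python
DEFAULT_E_D_TOV_MS = 10_000  # 10 second default for Targets and Initiators


class EDTovState:
    """Tracks the E_D_TOV in effect on one device."""

    def __init__(self) -> None:
        self.value_ms = DEFAULT_E_D_TOV_MS

    def on_lip(self) -> None:
        # LIP initializes E_D_TOV in all devices to the default.
        self.value_ms = DEFAULT_E_D_TOV_MS

    def request(self, proposed_ms: int) -> int:
        """Apply an Initiator's proposed value; only lowering is accepted."""
        if 0 < proposed_ms < self.value_ms:
            self.value_ms = proposed_ms
        return self.value_ms


state = EDTovState()
print(state.request(2_000))   # lowered
print(state.request(5_000))   # increase ignored
state.on_lip()
print(state.value_ms)         # back to the default
```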

On the subject of initialization, Dal questioned the need for the loop
master to time anything, since it need take no action until frames are
actually returned.  Initialization can be retried based on ULP timers,
which need not be specified since they are a function of the
"patience" a system chooses to exhibit.

Giles pointed out that there are actually three timers:

1) AL_Time (15ms) - the max physical loop propagation delay 
2) SW time to pass around all LISA, LIPA, LIHA frames (this should be
   "default" E_D_TOV = 10 sec since it could be slow)
3) "working" E_D_TOV used by communicating N_Ports based on expected
   round trip delays to frames, including any turnaround times in the
   nodes and ports

We walked through the BB_Credit examples in Annex B and made several
corrections, particularly to example 3.  Example 3 must take into
consideration the rule that says a full duplex OPN Initiator must have 2x
Login_BB_Credit available in the event a CLS was issued at the same time
Login_BB_Credit's worth of frames was returning from the OPN Recipient, and
an immediate OPN also used up Login_BB_Credit.  Bob pointed out that any
nonzero Login_BB_Credit need not really be available on CLS, but by the next
time frames were received following an OPN. The difference could be 2 round
trip loop times or more.

The group agreed that the only unique value that FCP_RSP confirmation
provides is to ensure delivery of "important" status.  That being
said, Giles suggested a log page could be used to log "important"
errored status.

Dal liked this approach since he felt it keeps responsibility in the
host for recovery, with minimum Target impact, and that it solves the
problem for ALL interfaces, not just FC-AL (SSA and SIP/SPI also have
the same problem, although it may not have been addressed yet).

It was also noted that lost FCP_RSP would be more likely on FC than on
SPI due to hot plugging, bypassing, and congestion.

Gary suggested the idea of using the OPNr replicate function on FC-AL
as confirmation.  However, it was noted that just having the returned
frame is no guarantee of receipt, since there may not have been
buffers available to receive the autosense data.

This was discussed also at great length in the SCSI-3 Working Group
the next day, with no consensus as to whether one, neither, or both
Log Page and IU solutions should be worked on.

It was agreed that loop tenancies and Sequences are orthogonal concepts:
- more than 1 Sequence may be sent in a loop tenancy, and 
- more than 1 loop tenancy may be used to send a single Sequence.

The term "burst" is defined in the SPC disconnect-reconnect mode page:

   "The maximum burst size field indicates the maximum amount of data
    that the device server shall transfer during a single data moving

FCP chose to map Burst Size to the Sequence Length associated with a
XFR_RDY (see FCP Rev 11, clause 7.2).

Ed reviewed his X3T11 proposal to change the mandatory 6-word latency
to a nominal value in FC-AL so that existing implementations, which
can have maximum latencies in the range of 5.08 to 6.58 words, will be

Ed suggested the following use of fields within this page:

Buffer Full/Empty Ratios: No mapping

Bus Inactivity Limit:     The max time that the target may defer closing
                          the loop without sending frames 

Disconnect Time Limit:    No mapping

Connect Time Limit:       The max time that the target may keep the loop 
                          open (i.e., max loop tenancy).

Maximum Burst Size:       Maximum Sequence Length

EMPD:                     Modify wording to map to non-Sequential Sequence 
                          delivery on Reads (RO of first frame of each Read 
                          Data Sequence is not required to be continuously 

Ed also noted that the three time units would have to be re-scaled
appropriately for FC speeds.

Giles presented some charts demonstrating performance as a function of
node delay, BB_Credit, and Number of ARB cycles per I/O. It was suggested
by Dal that he consider publishing an X3T11 Technical Report (perhaps in
collaboration with SGI) as a repository for this information. This may
also form the basis for changes to future, performance-enhanced profiles.

The question was raised that since the minimum payload length is now
256 bytes, what we should do with the FCP_RSP payload, which was
limited in length to accommodate a 128-byte limit.  The goal was to
keep FCP_RSP a single-frame Sequence.  The 128-byte limit for
fabric-attach was to accommodate early Ancor implementations.  Since
Private Loops prohibit use of fabrics, the 256-byte limit was adopted
to increase Read Exchange lengths and also accommodate LILP/LIRP frames
in the future (which require more than 128 bytes of payload).

Since FCP_RSP_INFO is now limited to 8 bytes in FCP, 12 bytes can be
reclaimed and used for FCP_SNS_INFO (total of 96 bytes).  However,
this still falls short of the 256 bytes allowed by SPC Request Sense
data (one byte of allocation length).  Also, some array vendors may
send back VU "layered" sense data which exceeds 96 bytes.
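
The byte budget can be checked with a few lines of arithmetic.  The sizes
of the fixed FCP_RSP fields below are an assumption about the FCP payload
layout and do not come from these minutes; only the 128-byte limit, the
8-byte FCP_RSP_INFO, the 12 reclaimed bytes, and the 96-byte total do.

```python
FCP_RSP_PAYLOAD_LIMIT = 128  # bytes, single-frame FCP_RSP

# Assumed fixed fields preceding FCP_RSP_INFO (sizes in bytes).
FIXED_FIELDS = {
    "reserved": 8,
    "FCP_STATUS": 4,
    "FCP_RESID": 4,
    "FCP_SNS_LEN": 4,
    "FCP_RSP_LEN": 4,
}

FCP_RSP_INFO_MAX = 8   # new limit on FCP_RSP_INFO
RECLAIMED = 12         # bytes freed by the new limit

fixed_total = sum(FIXED_FIELDS.values())
sense_budget = FCP_RSP_PAYLOAD_LIMIT - fixed_total - FCP_RSP_INFO_MAX
previous_sense_budget = sense_budget - RECLAIMED

print("fixed fields:         ", fixed_total)
print("FCP_SNS_INFO budget:  ", sense_budget)
print("budget before change: ", previous_sense_budget)
print("shortfall vs 256:     ", 256 - sense_budget)
```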

Rather than increase the minimum RX data field size to 384 or 512
bytes, it was decided that it should stay at 128 bytes with 96 bytes
of sense data allowed, which should be enough for single-LUN, non-SCC
devices.  Proposals for the definition of hierarchical sense data
within the context of SCC would be treated separately.

Question:  the profile still only provides a "partial" compliance to
SPC by providing 96 instead of 256 bytes of sense information.  Is
this acceptable?  If not, we may want to consider allowing Targets to
send and requiring Initiators to be able to receive 512 byte frames,
while still only requiring 256 byte frame reception by Targets.

Friday, June 16 (X3T11 week)
Rochester, MN
Radisson Plaza (507-281-8000)
830am - 2pm (no lunch break)

Monday, July 10 (X3T10 week)
Colorado Springs
Red Lion (719-576-8900)
9am - 5pm
