5/95 FC-AL Direct Attach Disk Minutes
Kurt Chan
kc at core.rose.hp.com
Fri May 12 10:15:46 PDT 1995
From: Kurt Chan
To: FC, SCSI Reflectors
Subj: 5/95 FC-AL Direct Attach Disk Adhoc Minutes
Date: 5/12/95
FC-AL Direct Attach Disk Adhoc Meeting Minutes
Harrisburg, PA
5/8/95
Dal Allan         ENDL                dal.allan at mcimail.com
Charles Binford   Symbios Logic       charles.binford at symbios.com
Kurt Chan         HP Roseville        kc at core.rose.hp.com
Mike Chenery      Fujitsu             mchenery at fcpa.fujitsu.com
Jim Coomes        Seagate             jim_coomes at notes.seagate.com
Edward Fong       Amdahl              edward.fong at spg.amdahl.com
Giles Frazier     IBM Austin          gfrazier at vnet.ibm.com
Gene Freeman      Symbios Logic       gene.freeman at symbios.com
Ed Frymoyer       HP FCSI             emf at best.com
Ed Gardner        Quantum             gardner at acm.org
Gary Goodwin      IBM SPC             ggoodwin at vnet.ibm.com
Doug Hagerman     DEC                 hagerman at starch.enet.dec.com
Norm Harris       Adaptec             nharris at eng.adaptec.com
Bill Hutchison    HP                  hutch at boi.hp.com
Roger Hungerford  HP DMD              rogerh at hpdml16.boi.hp.com
Gene Milligan     Seagate
Charles Monia     DEC                 monia at shr.dec.com
Brian R. Smith    Infinity Software   isi_info at io.com
Bob Snively       Sun Microsystems    bob.snively at sun.com
Gary Stephens     FSI Consulting      6363897 at mcimail.com
Horst Truestedt   IBM Rochester       truested at vnet.ibm.com
Peter Walford     FCSI/DemoGraFx      walford at btr.com
Gary Watson       Trimm Technologies  trimm at netcom.com
1. PREFERRED ADDRESS TERMINOLOGY
--------------------------------
Dal reported that FC-PH2 will adopt a new N_Port common service parameter
which indicates whether or not the port was successful in acquiring its
"preferred" address. There was considerable discussion over what "preferred"
means. Some terminology before describing the two login bit proposals:
Dal's interpretation of preferred address or AL_PA:
a) On a fabric, an address requested by the N_Port in FLOGI (24 bits), or
b) On FC-AL a hard-assigned AL_PA, or
c) On FC-AL, the previously acquired AL_PA.
Using the above definitions, an NL_Port would be defined as "not
having" a preferred address if it has no hard-assigned AL_PA and is in
the process of powering up or performing power-on reset upon receipt
of a Hard Reset LIP. A Target is not required to remember its
previously acquired address across power-cycles or power-on resets.
If the port has a hard-assigned address but is currently using a soft
address, it was agreed that the current address will become the preferred
address following the next LIP unless the port has been power-cycled or the
LIP is a directed Hard Reset LIP, in which case the hard-assigned address
becomes the preferred address.
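The FC-AL rules above can be sketched as a small selection function. This is
only an illustration of the definitions as recorded in these minutes; the
function name, its parameters, and the example AL_PA values are hypothetical
and do not come from FC-AL or the profile.

```python
from typing import Optional


def preferred_al_pa(hard_al_pa: Optional[int],
                    previous_al_pa: Optional[int],
                    power_on_or_hard_reset: bool) -> Optional[int]:
    """Return the AL_PA an NL_Port would prefer at the next initialization.

    hard_al_pa             -- hard-assigned AL_PA, or None if there is none
    previous_al_pa         -- AL_PA held before this LIP, or None
    power_on_or_hard_reset -- True after a power cycle, power-on reset, or
                              a directed Hard Reset LIP
    A return value of None means "no preferred address".
    """
    if power_on_or_hard_reset:
        # The previously acquired address need not be remembered, so only
        # a hard-assigned address (if any) can be preferred.
        return hard_al_pa
    if previous_al_pa is not None:
        # Ordinary LIP: the address currently in use stays preferred, even
        # if the port also has a different hard-assigned address.
        return previous_al_pa
    return hard_al_pa
```

For example, a port with hard address 0xE8 that is currently running on soft
address 0x01 would prefer 0x01 after an ordinary LIP, but 0xE8 after a power
cycle or a directed Hard Reset LIP.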
2. NPA and HANA
----------------
The first proposal for reporting address acquisition during N_Port login
was called "NPA" (No Preferred Address).
NPA = 0 means:
- port does not have a preferred address, or
- on a fabric or point-point, the N_Port successfully acquired its preferred
address, or
- on FC-AL following power-on reset, the NL_Port has a hard AL_PA and
successfully acquired it, or
- on FC-AL following a LIP which was NOT accompanied by power-on reset,
the NL_Port successfully acquired its previously acquired AL_PA.
NPA = 1 means the port had a preferred address and:
- on a fabric or point-point, the preferred address was not acquired
- on a loop, the current AL_PA is not the preferred AL_PA
The second proposal was a "HANA" bit (Hard Address Not Acquired).
HANA = 0 means:
- On a fabric or point-point, the N_Port does not have a preferred address
or the N_Port successfully acquired its preferred address, or
- On FC-AL, the NL_Port does not have a hard address, or the NL_Port has a
hard address which matches its current address
HANA = 1 means:
- On a fabric or point-point, the N_Port has a preferred address which was
not granted to the N_Port during the most recent FLOGI, or
- On FC-AL, the NL_Port has a hard address which does not match the
current soft or fabric-assigned address
The primary difference between HANA and NPA is that HANA allows
implementations for which address conflicts are considered pathological to
only report when hard addresses were not acquired. It would not alert users
to changes in "soft" configurations following initialization, whereas NPA
would.
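A minimal sketch of how the two bits would differ on a loop, assuming the
FC-AL cases listed above. The helper names and the use of None for "no such
address" are illustrative choices, not part of either proposal.

```python
from typing import Optional


def npa_bit(preferred_al_pa: Optional[int], current_al_pa: int) -> int:
    """NPA = 1 only if a preferred AL_PA existed and was not acquired."""
    return int(preferred_al_pa is not None
               and current_al_pa != preferred_al_pa)


def hana_bit(hard_al_pa: Optional[int], current_al_pa: int) -> int:
    """HANA = 1 only if a hard AL_PA exists and is not the current one."""
    return int(hard_al_pa is not None and current_al_pa != hard_al_pa)
```

Consider a port with no hard address that previously held soft AL_PA 0x01
and came out of a LIP with 0x02: `npa_bit(0x01, 0x02)` is 1, while
`hana_bit(None, 0x02)` is 0. This is the "soft" configuration change that
NPA reports and HANA does not.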
Note 1: in order to protect "naive" hosts which have no concept of
functioning with hard address conflicts, the current profile
requires a device which did not obtain its hard address to not
proceed past PRLI (see 8.3). A HANA/NPA login bit may allow Targets
to function with soft addresses by providing a reporting mechanism
to hosts alerting ULPs of configuration changes.
Note 2: an additional ELS or PLOGI field which allows a target to
report its hard address would enhance both NPA and HANA proposals.
Since the FC-PH2 editors are looking for guidance before the June Direct
Attach disk adhoc meeting, this subject will be discussed in detail over the
disk_attach reflector. To subscribe, send an email message with a blank
subject line to majordomo at dtc.wdc.com with a single line of text:
subscribe disk_attach your_email_address
You will receive an automated reply to your subscribe request.
3. 7-BIT AND 8-BIT ADDRESSES
-----------------------------
One problem the user community currently has with loop addresses is that
the AL_PAs are meaningless to a user configuring SFF-8045 drives, since
the enumerated 7-bit values from Annex K in FC-AL are used, not the 8-bit
AL_PAs that are part of the 24-bit native N_Port Identifier.
The profile will add a normative clause which binds the 7-bit hard addresses
in SFF-8045 and the 8-bit AL_PAs described in FC-AL.
It was agreed that a common goal is for as few users as possible to need a
copy of this table to perform their configuration or administrative tasks.
This includes any users of system console displays as well as users of
displays or switches mounted on cabinets or storage devices.
However, this objective ran somewhat contrary to the suggestion that
system software needs to have a unified address scheme for all
storage, and that switching between 24 and 23 bit address schemes
based on whether or not a device is loop-attached vs fabric-attached
is not desirable. Furthermore, public loops would pose some
difficulties in determining which address scheme was used in software,
even if a login bit could report loop vs fabric vs point-point
attachment.
The only thing the group could agree on was that we would have a new term for
the 7-bit enumerated address: Loop_ID (in keeping with the term "SCSI ID").
There will be no profile recommendation regarding switches or displays,
although one disk cabinet vendor mentioned they would use rotary switches to
represent a hard offset within the Loop_ID space. In total, there are six
possible addresses a device could retain and display:
1) Hard AL_PA (or N_Port ID for fabric attach)
2) Hard Loop_ID
3) Previously acquired soft AL_PA
4) Previously acquired soft Loop_ID
5) Current AL_PA (or N_Port ID for fabric attach)
6) Current Loop_ID
4. TARGET DISCOVERY AND ALIASES
-------------------------------
In Monterey we agreed that aliasing would not be a required feature of
NL_Ports. Therefore, class 3 frames sent with the "wrong" Domain+Area
(nonzero in the case of Private NL_Port and zeroes in the case of Public
NL_Ports) could be discarded and result in timeouts on PDISC or PLOGI.
For Private NL_Ports, this means they cannot practically "discover" the
coexisting Public NL_Ports residing on the loop. For Public NL_Ports, this
would mean they would have to make a two-pass discovery to find the Private
NL_Ports (one pass with Domain+Area = FL_Port, and another pass with
Domain+Area = 0). Even though this does not affect private loops, a note
describing this procedure was added to version 1.60. The procedure requires
that Public NL_Ports close the loop after sending PDISC or PLOGI so as not to
consume bus bandwidth in the event of a timeout, and that concurrent
Exchanges be supported during Target Discovery.
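The two-pass addressing can be sketched as follows. This only shows how the
candidate 24-bit D_IDs would be formed for each pass; it does not model
closing the loop, timeouts, or concurrent Exchanges. The function names and
the example Domain+Area value are hypothetical.

```python
from typing import Iterable, List


def d_id(domain_area: int, al_pa: int) -> int:
    """Pack a 16-bit Domain+Area and an 8-bit AL_PA into a 24-bit D_ID."""
    if not 0 <= domain_area <= 0xFFFF or not 0 <= al_pa <= 0xFF:
        raise ValueError("field out of range")
    return (domain_area << 8) | al_pa


def discovery_d_ids(fl_domain_area: int, al_pas: Iterable[int]) -> List[int]:
    """List the D_IDs a Public NL_Port would try, in pass order."""
    al_pas = list(al_pas)
    # Pass 1: Domain+Area of the FL_Port, reaching other Public NL_Ports.
    first_pass = [d_id(fl_domain_area, a) for a in al_pas]
    # Pass 2: Domain+Area of zero, reaching Private NL_Ports.
    second_pass = [d_id(0, a) for a in al_pas]
    return first_pass + second_pass
```

With an assumed FL_Port Domain+Area of 0x0102 and AL_PAs 0x01 and 0xE8, the
first pass targets 0x010201 and 0x0102E8, and the second pass targets
0x000001 and 0x0000E8.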
5. NODE AND PORT NAMING
-----------------------
This subject was brought up again when version 1.50 of the profile was
discovered to have an error. There are four cases where the Node and
Port names could be bound on a dual-ported device. The first three
cases involve both port and node names being extended (60-bit) addresses
('n' = one byte):
CASE_1 CASE_2 CASE_3
01nnnnnn 02nnnnnn 01nnnnnn 02nnnnnn 01nnnnnn 02nnnnnn
00nnnnnn 00nnnnnn 00nnnnnn
The fourth case has the node name equal to a 48-bit IEEE address, with the
port names extended:
01nnnnnn 02nnnnnn
nnnnnn
Seagate implements case 3. Ed mentioned that 48 bits is not large enough for
the consumer disk drive market, even though Peter mentioned that allowing a
node to be a 48-bit entity would allow it to become part of the general
networking community and perhaps act as a catalyst for the address expansion
proposals being considered in IEEE.
Gary and Charles Binford brought up the notion that it is unnecessary
to require the lower 48 bits to be the same, since the WWNs are
already bound in login. Requiring them to be bound by name either
prohibits a multi-vendor node solution where the ports are acquired
from different vendors than the node (e.g., an array controller), or
requires that ports acquire their WWNs from their nodes - viewed as
unnecessarily restrictive.
Therefore, case 3 will be adopted for single-LUN targets, with a mapping
to CH1 and CH2 in SFF-8045. Other devices will not be required to have any
naming consistency between ports or nodes. For ease of notation, the first
nibble of each name will be reserved to distinguish between ports and nodes,
with a common 56-bit base address for all three names, yielding 256 times
more node names than case 4.
6. E_D_TOV
----------
Currently E_D_TOV is selected by an Initiator to be the largest value
of the discovered Targets. This implies that if a new Target is
inserted with a longer value, the Initiator must relogin with all
Targets to update the E_D_TOV value, which would abort all open
Exchanges.
Before the discussion degenerated into an argument about
initialization timers, Dal said that until we have a proposal for
empirical determination by initiators and optimization among multiple
initiators, we should simply make 10 seconds the default E_D_TOV for
all Targets and Initiators, with the ability for Initiators to LOWER
the chosen value, but not increase it. LIP initializes E_D_TOV in all
devices to 10 seconds.
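Dal's interim rule can be illustrated with a small state holder. This is a
sketch of the rule as stated in these minutes only; the class and method
names are hypothetical, and silently ignoring an attempted increase (rather
than rejecting it with an error) is an assumption.

```python
DEFAULT_E_D_TOV_S = 10.0  # default agreed in the meeting, in seconds


class EDTov:
    """Tracks the E_D_TOV value in effect on one port."""

    def __init__(self) -> None:
        self.value_s = DEFAULT_E_D_TOV_S

    def request(self, new_value_s: float) -> float:
        """Apply an Initiator's request and return the value in effect.

        Only a lower, positive value is accepted; a request to raise the
        value leaves the current value unchanged.
        """
        if 0 < new_value_s < self.value_s:
            self.value_s = new_value_s
        return self.value_s

    def on_lip(self) -> None:
        """LIP returns E_D_TOV to the 10 second default."""
        self.value_s = DEFAULT_E_D_TOV_S
```

Because the value can only move downward between LIPs, inserting a new
Target would not force the Initiator to relogin with the existing Targets
just to raise the timer.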
On the subject of initialization, Dal questioned the need for the loop
master to time anything, since it need take no action until frames are
actually returned. Initialization can be retried based on ULP timers,
which need not be specified since they are a function of the
"patience" a system chooses to exhibit.
Giles pointed out that there are actually three timers:
1) AL_Time (15ms) - the max physical loop propagation delay
2) SW time to pass around all LISA, LIPA, LIHA frames (this should be
"default" E_D_TOV = 10 sec since it could be slow)
3) "working" E_D_TOV used by communicating N_Ports based on expected
round trip delays to frames, including any turnaround times in the
nodes and ports
7. BB_CREDIT EXAMPLES
---------------------
We walked through the BB_Credit examples in Annex B and made several
corrections, particularly to example 3. Example 3 must take into
consideration the rule that says a full duplex OPN Initiator must have 2x
Login_BB_Credit available in the event a CLS was issued at the same time
Login_BB_Credit's worth of frames was returning from the OPN Recipient, and
an immediate OPN also used up Login_BB_Credit. Bob pointed out that any
nonzero Login_BB_Credit need not really be available on CLS, but by the next
time frames were received following an OPN. The difference could be 2 round
trip loop times or more.
8. FCP_RSP IUs
---------------
The group agreed that the only unique value that FCP_RSP confirmation
provides is to ensure delivery of "important" status. That being
said, Giles suggested a log page could be used to log "important"
errored status.
Dal liked this approach since he felt it keeps responsibility in the
host for recovery, with minimum Target impact, and that it solves the
problem for ALL interfaces, not just FC-AL (SSA and SIP/SPI also have
the same problem, although it may not have been addressed yet).
It was also noted that lost FCP_RSP would be more likely on FC than on
SPI due to hot plugging, bypassing, and congestion.
Gary suggested the idea of using the OPNr replicate function on FC-AL
as confirmation. However, it was noted that just having the returned
frame is no guarantee of receipt, since there may not have been
buffers available to receive the autosense data.
This was discussed also at great length in the SCSI-3 Working Group
the next day, with no consensus as to whether one, neither, or both
Log Page and IU solutions should be worked on.
9. BURSTS, SEQUENCES, AND LOOP TENANCIES
----------------------------------------
It was agreed that loop tenancies and Sequences are orthogonal concepts:
- more than 1 Sequence may be sent in a loop tenancy, and
- more than 1 loop tenancy may be used to send a single Sequence.
The term "burst" is defined in the SPC disconnect-reconnect mode page:
"The maximum burst size field indicates the maximum amount of data
that the device server shall transfer during a single data moving
operation."
FCP chose to map Burst Size to the Sequence Length associated with a
XFR_RDY (see FCP Rev 11, clause 7.2).
10. RETRANSMISSION LATENCY
--------------------------
Ed reviewed his X3T11 proposal to change the mandatory 6-word latency
to a nominal value in FC-AL so that existing implementations, which
can have maximum latencies in the range of 5.08 to 6.58 words, will be
compliant.
11. DISCONNECT-RECONNECT MODE PAGE
----------------------------------
Ed suggested the following use of fields within this page:
Buffer Full/Empty Ratios: No mapping
Bus Inactivity Limit: The max time that the target may defer closing
the loop without sending frames
Disconnect Time Limit: No mapping
Connect Time Limit: The max time that the target may keep the loop
open (i.e., max loop tenancy).
Maximum Burst Size: Maximum Sequence Length
EMPD: Modify wording to map to non-Sequential Sequence
delivery on Reads (RO of first frame of each Read
Data Sequence is not required to be continuously
increasing).
Ed also noted that the three time units would have to be re-scaled
appropriately for FC speeds.
12. PERFORMANCE
---------------
Giles presented some charts demonstrating performance as a function of
node delay, BB_Credit, and Number of ARB cycles per I/O. It was suggested
by Dal that he consider publishing an X3T11 Technical Report (perhaps in
collaboration with SGI) as a repository for this information. This may
also form the basis for changes to future, performance-enhanced profiles.
13. FCP_RSP PAYLOAD LENGTH
--------------------------
The question was raised of what we should do with the FCP_RSP payload
now that the minimum payload length is 256 bytes; the payload had been
limited in length to accommodate a 128-byte limit. The goal was to
keep FCP_RSP a single-frame Sequence. The 128-byte limit for
fabric-attach was to accommodate early Ancor implementations. Since
Private Loops prohibit use of fabrics, the 256-byte limit was adopted
to increase Read Exchange lengths and also accommodate LILP/LIRP frames
in the future (which require more than 128 bytes of payload).
Since FCP_RSP_INFO is now limited to 8 bytes in FCP, 12 bytes can be
reclaimed and used for FCP_SNS_INFO (total of 96 bytes). However,
this still falls short of the 256 bytes allowed by SPC Request Sense
data (one byte of allocation length). Also, some array vendors may
send back VU "layered" sense data which exceeds 96 bytes.
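The byte budget behind the 96-byte figure can be worked through as below.
The 24 bytes of fixed FCP_RSP fields (reserved bytes, FCP_STATUS, FCP_RESID,
FCP_SNS_LEN, FCP_RSP_LEN) is an assumption about the IU layout that is not
stated in these minutes; the function name is hypothetical.

```python
FIXED_FIELDS_LEN = 24  # assumed length of the fixed FCP_RSP fields, bytes


def sense_budget(payload_limit: int, rsp_info_len: int,
                 fixed_len: int = FIXED_FIELDS_LEN) -> int:
    """Bytes left for FCP_SNS_INFO in a single-frame FCP_RSP."""
    remaining = payload_limit - fixed_len - rsp_info_len
    if remaining < 0:
        raise ValueError("fixed fields and FCP_RSP_INFO exceed the payload")
    return remaining
```

Under that assumption, `sense_budget(128, 8)` gives the 96 bytes quoted
above, and `sense_budget(128, 20)` gives 84 bytes for the case before the 12
bytes were reclaimed from FCP_RSP_INFO.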
Rather than increase the minimum RX data field size to 384 or 512
bytes, it was decided that it should stay at 128 bytes with 96 bytes
of sense data allowed, which should be enough for single-LUN, non-SCC
devices. Proposals for the definition of hierarchical sense data
within the context of SCC would be treated separately.
Question: the profile still only provides a "partial" compliance to
SPC by providing 96 instead of 256 bytes of sense information. Is
this acceptable? If not, we may want to consider allowing Targets to
send and requiring Initiators to be able to receive 512 byte frames,
while still only requiring 256 byte frame reception by Targets.
14. NEXT MEETINGS
-----------------
Friday, June 16 (X3T11 week)
Rochester, MN
Radisson Plaza (507-281-8000)
8:30am - 2pm (no lunch break)
Monday, July 10 (X3T10 week)
Colorado Springs
Red Lion (719-576-8900)
9am - 5pm