SAS Expander Pathway Recovery

Elliott, Robert (Server Storage) elliott at hp.com
Sat Oct 4 13:16:38 PDT 2003


* From the T10 Reflector (t10 at t10.org), posted by:
* "Elliott, Robert (Server Storage)" <elliott at hp.com>
*
> -----Original Message-----
> From: Gil Romo [mailto:gil.romo at qlogic.com] 
> Sent: Tuesday, September 30, 2003 6:04 PM
> To: t10 at t10.org
> Subject: SAS Expander Pathway Recovery
> 
> * From the T10 Reflector (t10 at t10.org), posted by:
> * Gil Romo <gil.romo at qlogic.com>
> *
> As shown in SAS1.1 Table 80, the pathway recovery priority of 
> a request is determined from its pathway blocked count,
> source SAS address, and connection rate fields.
> 
> Can a single device (ie, source SAS address) create a 
> deadlock with itself?

No, its requests can never cross each other and end
up waiting for each other to complete.  They can run
in parallel and compete with each other, though.

> Blocking requests cannot have the same source SAS address.  
> Therefore, the connection rate field is not required for 
> pathway recovery priority.

A wide port can send multiple requests out multiple phys
at the same time.  They must all have the same source SAS 
address, but could have different values for destination 
SAS address, pathway blocked count, arbitration wait time, 
and connection rate.

SAS-1.0 doesn't describe the requests included in pathway 
recovery very well.  We ended up with two different
approaches merged together.

There are several requests that could be compared when a
phy's Partial Pathway Timeout timer expires and the ECM
needs to decide whether to retry that phy's request with 
OPEN_REJECT (PATHWAY BLOCKED):

a) the blocked partial pathway (if any) that is using the
destination phy (i.e. some other phy has established a 
partial connection through this expander, but it is 
receiving AIP (WAITING ON PARTIAL) and is not fully 
connected yet).

b) requests from other phys that are requesting the 
destination phy.  All those phys are also running their
own Partial Pathway Timers.

c) the request from the phy in question

Assuming the a) pathway or one of the b) requests is
from the same source SAS address, including the connection 
rate field favors the faster request and ends up retrying
the slower request.  The theory is that the faster request
will finish its connection sooner and free up the domain sooner.
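
As a rough illustration of how the connection rate acts as a
tie-breaker, here is a minimal Python sketch.  The field order
(pathway blocked count most significant, then source SAS address,
then connection rate) and the rule that a larger value means higher
priority are my assumptions for the example; Table 80 is the
authority.  The Request type and recovery_priority function are
hypothetical names, not anything from the standard.

```python
from typing import NamedTuple


class Request(NamedTuple):
    """Hypothetical model of the fields compared for pathway recovery."""
    pathway_blocked_count: int
    source_sas_address: int      # 64-bit SAS address
    connection_rate_gbps: float  # 1.5 or 3.0


def recovery_priority(req, include_rate=True):
    """Return a comparison key; a larger tuple means higher priority.

    Assumed ordering: blocked count, then source address, then
    (optionally) connection rate as the least significant field.
    """
    key = (req.pathway_blocked_count, req.source_sas_address)
    if include_rate:
        key += (req.connection_rate_gbps,)
    return key


# Two requests from the same wide port, so the source address matches.
slow = Request(2, 0x5000_0000_0000_0001, 1.5)
fast = Request(2, 0x5000_0000_0000_0001, 3.0)

# With the rate included, the faster request wins the comparison.
rate_breaks_tie = recovery_priority(fast) > recovery_priority(slow)

# Without the rate, the two requests compare equal.
equal_without_rate = (
    recovery_priority(fast, include_rate=False)
    == recovery_priority(slow, include_rate=False)
)
```

With the rate dropped, something else has to resolve the tie between
two requests from the same wide port; the sketch does not model that.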

The a) b) c) approach offers the ability of retrying only
the lowest priority request; the a) c) approach leads to
retrying all but the highest priority request.

There is a bad side-effect of including the connection
rate, however, with the a) b) c) approach.  A 1.5 Gbps 
request can run over a 3 Gbps phy, but the reverse is not 
true.  Retrying the slower request won't help a waiting 
3 Gbps request proceed, while retrying the faster request
will let both 1.5 and 3 Gbps requests proceed.
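
The asymmetry can be sketched in a few lines of Python.  The
simplifying assumption here is mine: the pathway freed by a retried
request is only known to support that request's rate.  Function names
are hypothetical.

```python
def can_run_over(request_rate_gbps, phy_rate_gbps):
    """A request can use a phy whose rate is at least its own."""
    return request_rate_gbps <= phy_rate_gbps


def unblocked_by_retrying(retried_rate_gbps, waiting_rates_gbps):
    """Waiting requests able to use the pathway freed by the retry.

    Assumption: the freed pathway supports exactly the retried
    request's rate and nothing faster.
    """
    return [r for r in waiting_rates_gbps
            if can_run_over(r, retried_rate_gbps)]


waiting = [1.5, 3.0]
after_slow_retry = unblocked_by_retrying(1.5, waiting)  # [1.5]
after_fast_retry = unblocked_by_retrying(3.0, waiting)  # [1.5, 3.0]
```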

In SAS-1.1, I think we should make it clear that a) and c) are
the only requests compared and drop the connection rate from
the compared fields.  The "retry all but highest" approach
seems to be the safer one; it leads to extra retries in some
scenarios, but recovers more quickly from deadlocks.

The last paragraph in 7.12.4.4 needs work too:

Current:
The ECM shall instruct the arbitrating expander phy to reject the
connection request by transmitting OPEN_REJECT (PATHWAY BLOCKED) when
the Partial Pathway Timeout timer expires and the pathway recovery
priority of the arbitrating expander phy (i.e., the expander phy
requesting the connection) is less than the pathway recovery priority of
all expander phys within the destination port with an arbitration status
of WAITING_ON_PARTIAL.

Change to:
The ECM shall instruct the arbitrating expander phy to reject the
connection request by transmitting OPEN_REJECT (PATHWAY BLOCKED) when
the Partial Pathway Timeout timer expires and the pathway recovery
priority of the arbitrating expander phy (i.e., the expander phy
requesting the connection) is less than the pathway recovery priority of
<any of the> expander phys within the destination port <that are sending
Phy Status (Blocked Partial Pathway) to the ECM>.

Reasons:
1. <all> to <any of the> agrees with the "retry all but highest"
approach
2. The paragraph already mentions only a) and c), not group b)
3. The phrase "arbitration status of WAITING_ON_PARTIAL" is unclear; it
means sending or receiving AIP (WAITING ON PARTIAL), which is internally
communicated with the Phy Status (Blocked Partial Pathway) message
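
To make the <all> versus <any of the> difference concrete, here is a
small Python sketch of the two wordings.  Priorities are modeled as
plain integers and the function names are hypothetical; the list
holds the priorities of the destination-port phys that are reporting
a blocked partial pathway.

```python
def reject_current(timer_expired, arb_priority, blocked_priorities):
    """Current wording: reject only if the arbitrating phy's priority
    is less than that of ALL blocked partial pathways."""
    return (timer_expired
            and bool(blocked_priorities)  # all() is True on an empty list
            and all(arb_priority < p for p in blocked_priorities))


def reject_proposed(timer_expired, arb_priority, blocked_priorities):
    """Proposed wording: reject if the arbitrating phy's priority is
    less than that of ANY blocked partial pathway."""
    return (timer_expired
            and any(arb_priority < p for p in blocked_priorities))


# Arbitrating phy sits between two blocked partial pathways.
blocked = [3, 8]
current_result = reject_current(True, 5, blocked)    # False
proposed_result = reject_proposed(True, 5, blocked)  # True
```

Under the proposed wording the arbitrating phy backs off as soon as
one higher-priority blocked pathway exists, which matches the "retry
all but highest" approach.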


> Gilbert Romo
> Circuits & Integration
> QLogic Corporation, Aliso Viejo, California
> Office: 949-389-6266
> E-mail: gil.romo at qlogic.com

--
Rob Elliott, elliott at hp.com
Hewlett-Packard Industry Standard Server Storage Advanced Technology
https://ecardfile.com/id/RobElliott
*
* For T10 Reflector information, send a message with
* 'info t10' (no quotes) in the message body to majordomo at t10.org



