Subject: RE: BCT and PHY CAP exchange -- Possible interoperability issue
Date: Wed, 12 Sep 2007 11:15:18 -0700
From: "Shah, Amit M" <amit.m.shah@intel.com>
To: "Stephen FINCH" <steve.finch@st.com>, <t10@t10.org>, "Elliott, Robert (Server Storage)" <Elliott@hp.com>
X-Message-Number: 8059
Attachment #1: image001.gif

Hi Steve,

What I am trying to point out is that a bad implementation could lead to interoperability issues. This can happen because the standard does not provide high-level guidance on how PHYCAP should be implemented, so there is a possibility that vendors will choose one of the bad implementations. I am not sure why you call the bad implementation a non-compliant implementation.

Also, when I said that the receiver can start BCT when ComWake Detect is detected, I meant that the receiver will start BCT on the next clock after CW Detect. So in one BCT time, a bad implementation will wait for only one CW event (either Detect or Complete). Maybe it was misleading, as I didn't draw that one-clock difference in the diagram.

Also, when I was talking about clock crossing issues, I did not mean drift. I meant a simple back-to-back sync flop that synchronizes the burst / idle detect signals from the SERDES into the HW logic that detects the ComWake sequence. Also, in the SERDES itself, if the detection signal for idle / burst gets asserted a few psec / nsec late, and that results in the HW missing the CW sequence detection, it could be a problem.

Steve replied:

>>When the first (START) bit is detected - i.e., ComWake Detect in the diagram below, we are in the middle of the first BCT time. This is not the start of the first BCT. If one were to load the BCT timer with a start value that is appropriate (maybe the middle of the full BCT count) when the first ComWake Detect occurs, then all of the remaining ComWake Detects will be positioned roughly in the middle of each remaining BCT. Drift in clock frequencies will not move it outside of the window. Now in each of the BCT's you will see zero or one ComWake Detects. Never two.

Amit: I am not sure I understand your response in this paragraph. But the diagram shows two BCT timers, one for TX and one for RX. What I show is that as a TX, we should start BCT when we start transmission of the CW sequence. But as a receiver, we can only rely on the CW Detect or Complete events. So as a receiver we can only start our BCT on Detect or Complete (the standard does not mention this explicitly). Also, I agree that in one BCT we will never see two ComWake Detects. As mentioned above, even detecting one ComWake Detect could be a problem if we try to detect it on the last clock edge of BCT.

So to recap the problem: bad implementations rely on the fact that they will detect a CW sequence (either a Detect or Complete) at the end of the BCT time (most probably on the last clock before BCT expires). Now if, due to internal clock crossing (back-to-back sync flops) or a SERDES timing issue, we miss the CW sequence by a clock, we will misinterpret the PHY CAP.

So what I am asking is why the standard can't provide higher-level guidance so that vendors don't choose the wrong implementation. I have been bitten by this kind of implementation issue in the past and I am just trying to be proactive about it.

Let me know if we can talk for a couple of minutes so that I can clarify the issue at hand...

Thanks,
Amit Shah
Intel Corporation

________________________________
From: Stephen FINCH [mailto:steve.finch@st.com]
Sent: Wednesday, September 12, 2007 8:51 AM
To: Shah, Amit M; t10@t10.org; 'Elliott, Robert (Server Storage)'
Subject: RE: BCT and PHY CAP exchange -- Possible interoperability issue

I had a lot of problems understanding what the issue is. I don't see interoperability problems, I see a bad implementation that won't work reliably. The transmitters of both ends are transmitting the same thing. There is no difference.
The good receiver and the bad receiver see the same stream. One works reliably, the other doesn't. One is a compliant implementation, the other isn't.

That being said, I would suggest that the description of the implementations is, in itself, wrong.

When the first (START) bit is detected - i.e., ComWake Detect in the diagram below, we are in the middle of the first BCT time. This is not the start of the first BCT. If one were to load the BCT timer with a start value that is appropriate (maybe the middle of the full BCT count) when the first ComWake Detect occurs, then all of the remaining ComWake Detects will be positioned roughly in the middle of each remaining BCT. Drift in clock frequencies will not move it outside of the window. Now in each of the BCT's you will see zero or one ComWake Detects. Never two.

I don't see what needs to be changed. Except some bad implementations.

Regards,
Steve Finch
STMicroelectronics

________________________________
From: owner-t10@t10.org [mailto:owner-t10@t10.org] On Behalf Of Shah, Amit M
Sent: Tuesday, September 11, 2007 11:26 PM
To: t10@t10.org; Elliott, Robert (Server Storage)
Cc: Shah, Amit M
Subject: BCT and PHY CAP exchange -- Possible interoperability issue

Hello,

The SAS standard does not talk about the high-level implementation of the BCT timer and PHY CAP exchange at all. It mentions "Each phy capabilities bit is one bit cell time (BCT) and contains either COMWAKE (indicating one) or D.C. idle (indicating zero)".

Right now it is open to the designer's interpretation of the standard, and I am afraid that different vendors will have different implementations of BCT / PHYCAP, which will cause interoperability issues in the future.

Recently I came across one interpretation of BCT and PHY Capability which I thought was not correct. This implementation is accurate w.r.t. the SAS standard (as the SAS standard does not mention any rules or SM to govern the BCT / PHY CAP exchange), but I believe it will not work in real silicon.
Here are the details of what I observed...

BCT is 2200 OOBI. The transmitter of ComWake (CW) has to transmit CW for 2200 OOBI, which is equivalent to BCT. The receiver of this CW can detect the negation time a little sooner.

TX Negation time = 186.6 nsec.
RX Negation time > 175 nsec.

So basically the receiver can completely detect a CW (with negation time) about 11 nsec earlier.

Now let's look at this diagram.... You can see when TX is transmitting CW. Also, after some latency, RX receives the CW. RX can detect the complete CW sequence along with the negation time 11 nsec earlier (as mentioned above). That is the reason there is some IDLE shown between two CW sequences as seen by the receiver.

During an OOB sequence, there are only two distinct events that are reliable for a receiver. One is "ComWake Detected" and the other is "ComWake Completed". Logically, RX can only start its BCT timer on a ComWake Detect (I don't know why the SAS standard does not mention this critical detail) or a ComWake Completed, as starting the BCT timer on any random burst sequence could be fatal for the PHY CAP detection sequence.

What I think is a good implementation....

1. Receiver starts the BCT on the "first" CW Detect.
2. While BCT is still active, if it recognizes a CW Completed, it marks it as the start bit.
3. Once the BCT timer expires, it will start the next BCT timer automatically (this time it will not start it based on CW Detect).
4. Now during the new BCT time, if the receiver detects a CW Completed, it should capture it as a "1", else a "0".
5. Go back to step #3.

Advantage: The decoding of the PHY CAP bit during BCT time will happen about 186 nsec before BCT expires. So this approach is less susceptible to design implementation / clock crossing issues. Better for interoperability.

BAD implementation # 1 (This is the wrong implementation that I came across)....

1. Receiver starts the BCT on the "first" CW Detect.
2. The first CW Detect is also marked internally as the START BIT.
3. While BCT is still active, if it recognizes another CW Detect, it marks it as a "1", else it will mark it as a "0" (as part of PHY CAP).
4. Once the BCT timer expires, it will start the next BCT timer automatically (this time it will not start it based on CW Detect).
5. Go back to step #3.

BAD implementation # 2....

1. Receiver starts the BCT on the "first" CW Completed.
2. The first CW Completed is also marked internally as the START BIT.
3. While BCT is still active, if it recognizes another CW Completed, it marks it as a "1", else it will mark it as a "0" (as part of PHY CAP).
4. Once the BCT timer expires, it will start the next BCT timer automatically (this time it will not start it based on CW Completed).
5. Go back to step #3.

The last two implementations are marked as "BAD implementation" because the internal decoding of a "1" or a "0" (after start bit detection) is based on a CW event which will happen only when BCT is about to expire. Theoretically, BCT is equal to a CW. So one clock before BCT expires, a new CW event should be detected to decode the PHY CAP bit as a "1". In the real world, if BCT expires and the CW event occurs one clock later, that event will be missed and a wrong PHY CAP will be captured.

Due to clock crossing of signals or due to the SERDES implementation, if a CW Detect (bad implementation #1) or a CW Completed (bad implementation #2) is detected one clock late, the whole PHY CAP exchange will be bad.

So these bad implementations are narrowing the detection of a "1" or a "0" during a BCT to a one-clock period, and I am worried about that.

By not defining BCT and PHYCAP at a higher level, the standard is relying on vendors to implement this functionality as they feel is right. But I am worried that a wrong implementation of BCT / PHY CAP will lead to a lot of interoperability issues in the future.

It would be good if the SAS standard would throw some light on this issue....

Thanks,
Amit Shah
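
The one-clock sensitivity described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not any vendor's logic: time is counted in receiver clocks with one clock per OOBI, the DETECT_OFFSET and COMPLETE_OFFSET values are hypothetical, the BCT timer starts on the clock after the first CW Detect, and "late" delays every event after the start bit by that many clocks. The function names are made up for this illustration. In this model a late event lands in the following BCT window rather than disappearing, but the captured PHY CAP is wrong either way.

```python
BCT = 2200              # bit cell time in clocks (assumes one clock per OOBI)
DETECT_OFFSET = 300     # hypothetical: clocks from start of a COMWAKE to "CW Detect"
COMPLETE_OFFSET = 2180  # hypothetical: clocks from start of a COMWAKE to "CW Completed"
TIMER_START = DETECT_OFFSET + 1  # timer starts on the clock after the first CW Detect


def event_times(bits, offset, late=0):
    """Clock at which each transmitted 1 raises its event; bits[0] is the start bit.

    `late` delays every event after the start bit by that many clocks.
    """
    return [k * BCT + offset + (late if k else 0)
            for k, bit in enumerate(bits) if bit]


def decode_on_completed(bits, late=0):
    """'Good' scheme: window k (k >= 1) is a 1 if a CW Completed falls inside it."""
    completed = event_times(bits, COMPLETE_OFFSET, late)
    windows = {(t - TIMER_START) // BCT for t in completed}
    return [1 if k in windows else 0 for k in range(1, len(bits))]


def decode_on_detect(bits, late=0):
    """'Bad implementation #1': window k holds bit k+1 if another CW Detect falls in it."""
    detects = event_times(bits, DETECT_OFFSET, late)[1:]  # skip the start-bit detect
    windows = {(t - TIMER_START) // BCT for t in detects}
    return [1 if k in windows else 0 for k in range(len(bits) - 1)]


if __name__ == "__main__":
    bits = [1, 1, 0, 1, 0]  # start bit followed by capability bits 1, 0, 1, 0
    for late in (0, 1):
        print(late, decode_on_completed(bits, late), decode_on_detect(bits, late))
```

With bits = [1, 1, 0, 1, 0], both decoders return [1, 0, 1, 0] when events arrive on time. With late=1, decode_on_completed still returns [1, 0, 1, 0], while decode_on_detect returns [0, 1, 0, 1], because each Detect that nominally lands on the last clock of a BCT now falls into the next one.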