Subject: RE: SPC-4: Self Describing Command Timeouts (05-284r2)
Date: Wed, 9 Aug 2006 12:25:13 -0400
From: "Knight, Frederick" <Frederick.Knight@netapp.com>
To: "Kevin D Butt" <kdbutt@us.ibm.com>
Cc: <t10@t10.org>
X-Message-Number: 7146

In my former life as a host driver writer, this is exactly the kind of feature we wanted. We couldn't just send commands and wait forever. We had to invent maximum times we were willing to wait, and pray that they were right (or determine the worst case value through testing - which was then pretty bad for "well behaved" devices). So, in that regard, I think this is something the host people will welcome.

However, I agree that queues are an issue. With intelligent HBAs, the host puts the request (command) in a queue, and the HBA sends it whenever IT decides to. That queueing is not something you can deal with in the device; it must be dealt with by the hosts outside of this proposal.

The queues within the device, however, are important, I think. The host wants to know how long to wait. If it takes longer than that, then the host will typically do recovery, which may be an abort and a retry (for disks), or maybe even failing the operation outright, or tape repositioning, data validation, and retry.

So, if device queues are important, does the task attribute become important? If the request is HEAD OF QUEUE, does it have a shorter timeout than a request that is SIMPLE QUEUE? If it's SIMPLE QUEUE, what impact does the current queue depth have on the timeout value? Since this command is likely to be issued once by the host, the value should really account for the worst case queue depth. Should the host be forced to ask multiple times for different task attributes? I don't think so.

Should we add more timeouts to the command timeout descriptor? From the host side, I'd be interested in knowing the maximum (when the queue is full).
From the device side, I'd like to tell the host about timing events such as ALUA state transitions (what my maximum time will be in the TPGS "transitioning" state). But does that get too long a list of timeouts? Which ones will the host really use?

+--------------------------------------------+
| timeouts length                            | (0-1)
+--------------------------------------------+
| reserved                                   | (2)
+--------------------------------------------+
| restricted                                 | (3)
+--------------------------------------------+
| minimum command timeout                    | (4-7)
+--------------------------------------------+
| error recovery timeout                     | (8-11)
+--------------------------------------------+
| timeout when queue is full                 | (12-15)
+--------------------------------------------+
| maximum time in TPGS transitioning state   | (16-19)
+--------------------------------------------+

Tapes could use the queue full timeout location to specify the worst case when they have to flush their buffers.

I hope we'll be getting some comments from some host side folks, and that they will really use this. I think it's a really good idea!

Fred Knight

_____

From: Kevin D Butt [mailto:kdbutt@us.ibm.com]
Sent: Sunday, August 06, 2006 2:24 PM
To: Pat LaVarre
Cc: t10@t10.org
Subject: RE: SPC-4: Self Describing Command Timeouts (05-284r2)

Pat,

Thanks for the response. It sounds like, for your interests, the main delays are at the host side and not in the device queue(s). In SCSI terms, the host queue is still considered part of the application client (as I understand it) and therefore not something that I can address. So I agree that from the host perspective the time from Command Out to Status In is what it has to work with.
On the other hand, target devices only have the time from receipt of the command to the time the status is sent. The difference between these times is whatever bus delays there are. However, since the Command Timeout values are in units of seconds, I believe that the bus/fabric delay time is negligible. With this in mind, I am trying to concentrate my efforts on providing a Command Timeout value from receipt of the command to the sending of the status.

My proposal currently does not take into consideration any time that a command might sit in the target device queue prior to entering the enabled task state. My proposal covers the issues from when the command enters the enabled task state to when it enters the task ended state. The sense that I had when the previous version was discussed in CAP is that I will have a difficult time getting this passed without somehow addressing the time spent in the queue waiting to enter the enabled task state (i.e. when the command is in the dormant task state).

The only way I have thought of that might work is to use a Task Management function like query task and attach a timeout to the return status (if that is even possible in the SCSI architecture). However, this does not meet the goal of the proposal. The goal of the proposal is to have a method that allows an application to call an API from the device driver and provide a timeout value for the completion of that command. This requires a priori knowledge of how long that command will take.

Thanks,

Kevin D. Butt
SCSI & Fibre Channel Architect, Tape Firmware
MS 6TYA, 9000 S. Rita Rd., Tucson, AZ 85744
Tel: 520-799-2869 / 520-799-5280
Fax: 520-799-2723 (T/L:321)
Email address: kdbutt@us.ibm.com
http://www-03.ibm.com/servers/storage/

Pat LaVarre <p.lavarre@ieee.org>
08/06/2006 07:25 AM
To: Kevin D Butt/Tucson/IBM@IBMUS
cc:
Subject: RE: SPC-4: Self Describing Command Timeouts (05-284r2)

Kevin,

Clear explanation of how Reservations help Tape devices, thank you.
> The delay injected by ... the command(s) in the queue
> prior to this one) do not often have an effect on the host doing data I/O.

Yes.

For Disk and all the more for DVD/CD devices, in my low-end commodity peripheral world, the cache & queues are mostly in the host, not in the device. The write cache in the host can be huge, even as large as the device, and not aggressively flushed. If an early write request stumbles across a difficult-to-write area, then that time delays all the remaining requests.

The only measurable time that reliably fits within limits is the time from Command Out to Status In measured at the bus, not as measured at a level above the queue.

-----Original Message-----
From: owner-t10@t10.org on behalf of Kevin D Butt
Sent: Sat 8/5/2006 10:05 PM
To: t10@t10.org
Subject: SPC-4: Self Describing Command Timeouts (05-284r2)

A new version of my "self-describing" command time-outs has been posted. This is a major revision from the one posted last November. I have a few issues to solve that I would appreciate help with. The main one is how to sufficiently address or skirt the delay injected by the time in the queue. My thoughts and experience are in the tape realm, and I don't have a good feel for disk or enclosure or MMC.

In the tape realm, reservations are often used to ensure that only one host is doing data I/O (or time-intensive activities) at a time. While multiple hosts may be talking to the drive, most are just polling to see if it is there or if it is available (i.e. doesn't have an active reservation). In this scenario, the command time-outs as I have described them will solve a high percentage of the issues related to unknown command time-outs. The delay injected by the queue (or the command(s) in the queue prior to this one) do not often have an effect on the host doing data I/O.

Anyway, I need to understand better the issues seen by the other device types. I would also appreciate any suggestions.
2006/08/05 22:47:18
Your request to upload a file or files to the T10 site has been accepted.

Your PDF file will be posted at:
http://www.t10.org/ftp/t10/document.05/05-284r2.pdf

Normally, the posting/archiving process takes about 30 minutes.

Kevin D. Butt
SCSI & Fibre Channel Architect, Tape Firmware
MS 6TYA, 9000 S. Rita Rd., Tucson, AZ 85744
Tel: 520-799-2869 / 520-799-5280
Fax: 520-799-2723 (T/L:321)
Email address: kdbutt@us.ibm.com
http://www-03.ibm.com/servers/storage/
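
For readers who want to experiment with the extended descriptor Fred Knight sketches in his reply (timeouts length in bytes 0-1 through the TPGS transitioning time in bytes 16-19), the following is a minimal Python sketch of how a host might decode it. It is not taken from 05-284r2 or from any published standard. The names CommandTimeouts, parse_command_timeouts and worst_case_wait_seconds are hypothetical, the big-endian byte order is an assumption based on the usual SCSI convention for multi-byte fields, and the "take the largest value" policy is only one possible reading of the wish to know the maximum wait. Values are treated as seconds, as stated in the thread.

```python
import struct
from typing import NamedTuple


class CommandTimeouts(NamedTuple):
    """Fields of the 20-byte layout suggested in the thread (hypothetical names)."""
    timeouts_length: int             # bytes 0-1
    minimum_command_timeout: int     # bytes 4-7, seconds
    error_recovery_timeout: int      # bytes 8-11, seconds
    queue_full_timeout: int          # bytes 12-15, seconds
    tpgs_transitioning_timeout: int  # bytes 16-19, seconds


# Assumption: big-endian, no padding. H = 2 bytes, B = 1 byte, I = 4 bytes,
# so the whole layout is 2 + 1 + 1 + 4 * 4 = 20 bytes.
_LAYOUT = struct.Struct(">HBBIIII")


def parse_command_timeouts(data: bytes) -> CommandTimeouts:
    """Decode the first 20 bytes of data; reserved and restricted bytes are ignored."""
    if len(data) < _LAYOUT.size:
        raise ValueError(f"need {_LAYOUT.size} bytes, got {len(data)}")
    length, _reserved, _restricted, minimum, recovery, queue_full, tpgs = (
        _LAYOUT.unpack_from(data)
    )
    return CommandTimeouts(length, minimum, recovery, queue_full, tpgs)


def worst_case_wait_seconds(t: CommandTimeouts) -> int:
    """Illustrative host policy only: wait for the largest command timeout."""
    return max(
        t.minimum_command_timeout,
        t.error_recovery_timeout,
        t.queue_full_timeout,
    )
```

A host driver could call parse_command_timeouts once on the returned descriptor bytes and cache the result, which matches the expectation in the thread that the host asks for these values once rather than per command.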