current list of RAID ASC/ASCQ codes

Doug Hagerman "serve::hagerman" at starch.enet.dec.com
Wed Apr 27 11:36:40 PDT 1994

                                                        X3T10/94-024 Rev 3


        To:    John Lohmeyer
               Chairman, ANSI X3T10 (SCSI)

        From:  Doug Hagerman
               Digital Equipment Corporation
               SHR 3-2/W3
               334 South Street
               Shrewsbury, MA  01545

        Phone: 508-841-2145
        FAX:   508-841-6100
        Mail:  hagerman at starch.enet.dec.com

        Date:  27 April 1994

        Subject: Error Handling for SCSI Controllers



        This paper is a proposal for some additional error codes to
        handle situations encountered in storage subsystems,
        particularly RAID subsystems.



        1.   April RAB Host Interface Meeting

        At the April 1994 meeting we discussed whether the ASC/ASCQ
        list should be a "short" list of generic error codes or a
        "long" list of detailed codes.  I listed some of the pros
        and cons of the two possibilities using the following slide:

        "Why do we need ASC/ASCQs anyway?

        1. The reason for ASC/ASCQs is so that an initiator can
        automatically take an action dependent on the error
        condition. Thus the only reason for different codes is if
        there is a different action needed by the initiator. This
        test should be applied to every proposed ASC/ASCQ, and can
        be expected to result in a small number of codes (as George
        proposes).


        2. There needs to be a standard way to get detailed
        information about an error. This is needed for field service
        and also to allow third party configuration and user
        interface software. This could be done by:

        a. A long list of ASC/ASCQ codes (as I originally proposed),
        or

        b. Some kind of standard mechanism to return a pointer into
        a detailed list of errors and text. The list and text would
        be vendor-specific but the access method would be standard.

        A completely vendor-specific approach to this would make
        third party user interfaces much more difficult."

        After some discussion, the group decided to support the idea
        of the shorter generic list, based primarily on the argument
        that constructing a detailed list that covers all possible
        implementations would be difficult and would need frequent
        updates as new implementations come along.

        A mechanism is retained for specifying vendor unique codes,
        so if one needs more detailed event reporting this is still
        possible.
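
        The "different action" test from the slide can be pictured as
        a small dispatch table on the initiator side. The following is
        a minimal sketch, not part of the proposal. It uses only the
        four codes in this draft that already carry assigned values.
        The action names and the functions choose_action and
        is_vendor_specific are hypothetical, and the vendor-specific
        check assumes the SCSI-2 convention that ASC or ASCQ values
        of 80h and above are vendor specific.

```python
# Hypothetical initiator-side dispatch table: one entry per ASC/ASCQ pair.
# The action strings are illustrative assumptions, not part of the draft.
ACTIONS = {
    (0x5D, 0x01): "replace_unit",        # Logical Unit Failure
    (0x5D, 0x02): "retry_then_replace",  # Timeout on Logical Unit
    (0x04, 0x05): "wait_and_retry",      # Not Ready, rebuild in progress
    (0x04, 0x06): "wait_and_retry",      # Not Ready, recalculation in progress
}


def is_vendor_specific(asc, ascq):
    """Assumes the SCSI-2 convention: 80h-FFh in either field is vendor specific."""
    return asc >= 0x80 or ascq >= 0x80


def choose_action(asc, ascq):
    """Return the (hypothetical) action an initiator takes for a given code."""
    if is_vendor_specific(asc, ascq):
        # Detailed vendor events are logged; no automatic action is implied.
        return "log_vendor_event"
    return ACTIONS.get((asc, ascq), "report_to_operator")
```

        Two codes that map to the same action, as the two Not Ready
        codes do in this sketch, are the kind of case the slide's
        test would question.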


        2.   Current Draft List of Codes

        The long list was trimmed by combining and rearranging some
        codes as shown below. This list includes the updates that
        were agreed upon at the meeting. The total number of new
        codes is now 21.




        Combine:

        xxh  xxh            A  REQUEST SENSE COMMAND TO DRIVE FAILED
        xxh  xxh            A  PREMATURE COMPLETION OF A DRIVE COMMAND
        xxh  xxh            A  COMMAND FAILED--SCSI ID VERIFICATION FAILED
        xxh  xxh            A  TEST UNIT READY OR READ CAPACITY
                                COMMAND FAILED
        xxh  xxh            A  DRIVE FAILED BECAUSE IT FAILED A FORMAT
                                UNIT COMMAND
        xxh  xxh            A  DRIVE FAILED BECAUSE IT FAILED A TEST UNIT
                                READY COMMAND OR READ CAPACITY COMMAND
                                OR DURING FORMAT OR RECONSTRUCTION
                                OPERATION

        Into:

        xxh  00h            A  Command to Logical Unit Failed


        Combine:

        xxh  xxh            A  DRIVE FAILED BECAUSE OF A FAILED WRITE
                                OPERATION (REPLACE DRIVE NOW)
        xxh  xxh            A  DRIVE FAILED BECAUSE AUTOMATIC REALLOCATION
                                FAILED
        xxh  xxh            A  DRIVE FAILED BECAUSE RECONSTRUCTION FAILED
                                ON DRIVE BEING RECONSTRUCTED
        xxh  xxh            A  DRIVE FAILED BECAUSE RECONSTRUCTION FAILED
                                BECAUSE OF READ ERROR ON SOURCE DRIVE
        xxh  xxh            A  DRIVE FAILED DUE TO HARDWARE COMPONENT
                                DIAGNOSTICS FAILURE
        xxh  xxh            A  DRIVE FAILED BECAUSE OF A DEFERRED ERROR
                                REPORTED BY DRIVE
        xxh  xxh            A  WATCHDOG TIMER TIMEOUT
        xxh  xxh            A  DISCONNECT TIMEOUT
        xxh  xxh            A  CHIP COMMAND TIMEOUT
        xxh  xxh            A  BYTE TRANSFER TIMEOUT
        xxh  xxh            A  EXCESSIVE MEDIA ERROR RATE
        xxh  xxh            A  EXCESSIVE SEEK ERROR RATE
        xxh  xxh            A  EXCESSIVE GROWN DEFECTS
        xxh  xxh            A  NO RESPONSE FROM ONE OR MORE DRIVES
        xxh  xxh            A  NON-FAILED DRIVE WAS UNAVAILABLE
                                FOR OPERATIONS

        Into:

        5Dh  01h            A  Logical Unit Failure
        5Dh  02h            A  Timeout on Logical Unit


        Combine:

        xxh  xxh            A  LUN ALREADY EXISTS; CANNOT DO "ADD LUN"
                                FUNCTION
        xxh  xxh            A  LUN DOES NOT EXIST; CANNOT DO "REPLACE
                                LUN" FUNCTION
        xxh  xxh            A  DRIVE ALREADY EXISTS; CANNOT DO "ADD
                                DRIVE" FUNCTION
        xxh  xxh            A  DRIVE DOES NOT EXIST; CANNOT DO REQUESTED
                                FUNCTION FOR IT
        xxh  xxh            A  DRIVE CAN'T BE DELETED; IT'S PART OF A LUN
        xxh  xxh            A  DISK DEFINED MULTIPLE TIMES FOR LUN
        xxh  xxh            A  TOO MANY DISKS DEFINED
        xxh  xxh            A  NO SPACE AVAILABLE FOR LUN
        xxh  xxh            A  ERROR IN PROCESSING A SUBSYSTEM MODE PAGE
        xxh  xxh            A  DRIVE INQUIRY DATA MISMATCH BETWEEN DRIVES
                                IN THE LUN
        xxh  xxh            A  DRIVE CAPACITY MISMATCH BETWEEN DRIVES
                                IN THE LUN
        xxh  xxh            A  DRIVE BLOCK SIZE MISMATCH BETWEEN DRIVES
                                IN THE LUN
        xxh  xxh            A  ROM CODE INDICATES NO DRIVE IS PRESENT
                                ALTHOUGH INFORMATION STORED ON DISKS
                                INDICATES DRIVE SHOULD BE PRESENT
        xxh  xxh            A  MODE PARAMETERS FOR DRIVES IN LUN
                                DON'T MATCH
        xxh  xxh            A  WRONG DRIVE WAS REPLACED
        xxh  xxh            A  DRIVE NOT RETURNING REQUIRED MODE SENSE PAGE

        Into:

        xxh  00h            A  Configuration Failure
        xxh  01h            A  Configuration of Incapable Logical
                                Units Failed
        xxh  02h            A  Add Logical Unit Failed
        xxh  03h            A  Modification of Logical Unit Failed
        xxh  04h            A  Exchange of Logical Unit Failed
        xxh  05h            A  Remove of Logical Unit Failed
        xxh  06h            A  Attachment of Logical Unit Failed
        xxh  07h            A  Creation of Logical Unit Failed


        Combine:

        xxh  xxh            A  NO DISKS DEFINED FOR LUN

        Into:

        xxh  00h            A  Logical Unit not configured


        Combine:

        xxh  xxh            A  PARITY/DATA MISMATCH
        xxh  xxh            A  COMPONENT FAILURE AFFECTING MULTIPLE
                                CHANNELS

        Into:

        xxh  00h            A  Data Loss on Logical Unit
        xxh  01h            A  Multiple Logical Unit Failures
        xxh  02h            A  PARITY/DATA MISMATCH


        Combine:

        xxh  xxh            A  OPERATION NOT ALLOWED DURING RECONSTRUCTION
        xxh  xxh            A  REBUILD IN PROGRESS
        xxh  xxh            A  RECALCULATION IN PROGRESS

        Into:

        04h  05h            A  Logical Unit Not Ready, REBUILD IN
                                PROGRESS
        04h  06h            A  Logical Unit Not Ready, RECALCULATION
                                IN PROGRESS
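
        An initiator sees these codes in sense data. As a sketch,
        assuming the SCSI-2 fixed-format sense data layout (error
        code 70h or 71h in the low seven bits of byte 0, sense key in
        the low nibble of byte 2, ASC in byte 12, ASCQ in byte 13),
        the two Not Ready codes above could be recognized as follows.
        The function names and the sample buffer are hypothetical.

```python
def parse_fixed_sense(sense):
    """Return (sense_key, asc, ascq) from fixed-format sense data."""
    if len(sense) < 14:
        raise ValueError("sense data too short to hold ASC/ASCQ")
    if (sense[0] & 0x7F) not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    return sense[2] & 0x0F, sense[12], sense[13]


def rebuild_state(sense):
    """Map the two draft Not Ready codes to a short description, else None."""
    _key, asc, ascq = parse_fixed_sense(sense)
    if (asc, ascq) == (0x04, 0x05):
        return "rebuild in progress"
    if (asc, ascq) == (0x04, 0x06):
        return "recalculation in progress"
    return None


# Hypothetical 18-byte sense buffer: sense key 2h (NOT READY), ASC 04h, ASCQ 05h.
sample = bytes([0x70, 0x00, 0x02, 0, 0, 0, 0, 0x0A,
                0, 0, 0, 0, 0x04, 0x05, 0, 0, 0, 0])
state = rebuild_state(sample)
```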


        These proposed codes duplicate existing ASC/ASCQs, sense
        keys, or status values (shown in the left-hand column) and
        are therefore redundant:

        08h  01h      xxh  xxh            A  COMMAND TIMEOUT
        08h  00h      xxh  xxh            A  BUS ERRORS
        Queue Full    xxh  xxh            A  NO COMMAND CONTROL
                                                STRUCTURES AVAILABLE
        08h  00h      xxh  xxh            A  UNEXPECTED BUS PHASE
        43h  00h      xxh  xxh            A  MESSAGE REJECT RECEIVED
                                                ON A VALID MESSAGE
        43h  00h      xxh  xxh            A  SYNCHRONOUS NEGOTIATION
                                                ERROR
        11h  00h      xxh  xxh            A  DATA RETURNED FROM DRIVE
                                                IS INVALID
        5Dh  00h      xxh  xxh            A  MAXIMUM NUMBER OF ERRORS
                                                FOR THIS I/O EXCEEDED
        0Ch  00h or   xxh  xxh            A  UNRECOVERED READ/WRITE
        11h  00h                             ERROR
        05h  00h      xxh  xxh            A  NO RESPONSE FROM ONE OR
                                                MORE DRIVES
        40h  NN       xxh  xxh            A  NV MEMORY AND DRIVE
                                                METADATA INDICATE
                                                CONFLICTING DRIVE
                                                CONFIGURATIONS
        1Bh  00h      xxh  xxh            A  SYNCHRONOUS TRANSFER
                                                VALUE DIFFERENCES
                                                BETWEEN DRIVES
        Illegal Req.  xxh  xxh            A  INVALID ACTION TO TAKE
        24h or 26h    xxh  xxh            A  INVALID BIT SPECIFIED
        24h or 26h    xxh  xxh            A  TEXT STRING OVERFLOW
        04h  03h      xxh  xxh            A  INTERVENTION REQUIRED
        Res. Conf.    xxh  xxh            A  RESERVATION CONFLICT


        Add these new codes to SCSI-3:

        xxh  xxh            A  INFORMATIONAL, REFER TO LOG
        xxh  xxh            A  REDUNDANCY LEVEL GOT BETTER
        xxh  xxh            A  REDUNDANCY LEVEL GOT WORSE
        xxh  xxh            A  STATE CHANGE HAS OCCURRED