Comments on SPC-5 Extended Copy Enhancements for Tape (14-268r1)

Kevin D Butt kdbutt at us.ibm.com
Tue Jan 13 15:26:27 PST 2015


Dennis,
My engineers have the following comments related to the SPC-5 Extended 
Copy Enhancements for Tape (14-268r1) proposal.
Seems like this really contains two mostly distinct functions/proposals -- 
adding descriptor 19h to extended copy is fairly distinct and much more 
straightforward than the mirror copy which is fundamentally different from 
a model perspective and is much more complex and subtle.
question: Should the mirror command be mutually exclusive with the normal 
extended copy, or should handling any interaction (e.g., extended copy 
issued while mirroring) be left to the implementer?
issue: The mirror copy may need to add more specific information, 
including more detail on which commands/modes are affected.  Clearly the 
intent is for medium writes and positional equivalents (including 
reposition stubs for read [not compares, etc]), and table 8 indicates this, 
but it is missing some basic medium commands (RREV [recommendation: at 
least this needs to be added], RBD).  Additionally it does not address the 
nuances of fixed block commands (block sizes [initial setup or especially 
changes]), some setup commands (set capacity, format (and the mode page 
partitioning setup)), and other configuration / modal related aspects like 
encryption, LBP, compression, etc.  Perhaps [most/some of] these are 
outside of scope (e.g., pre-command setup), but at least LBP is clearly 
relevant as it affects the sent/written data.  LBP also has the [mandatory] 
mode page policy per I_T problem, so like any per I_T setting it cannot be 
set up prior to the command [the CM I_T must perform the setup] (there may 
be other things like this depending on the device policy usages).  Also 
since the mirror command is effectively an ongoing [always immed=1] 
command that keeps processing other commands, changes to such modes could 
occur during the command lifecycle itself (and not get propagated to the 
tertiary devices).  It seems like this model could be problematic in 
various ways, but such is the nature of many semi-transparent modes.  I 
understand where they are coming from with this, but there are a lot of 
things that could be done [by the AC] that might interact poorly and no 
clear/clean way to limit them.  This is especially true as the source/copy 
manager drive continues to appear to the AC as a normal tape drive and is 
forwarding SOME of the [medium] commands it gets, so when mode changes are 
done there is no clear way to propagate them (or to even know which should 
and which should not -- some could affect data integrity (e.g., LBP) and 
not cause other errors that terminate the process).  I do not think it is 
reasonable to add mode select to forwarded commands (this will cause other 
problems, as it assumes same device type, media types, density, etc, and 
may undo intentional differences).  Forwarding mode select would also not 
work in cases where multiple initiators are used for the primary commands 
and mode changes are implicit (no mode select at the time of effective 
change) due to I_T scope settings.  recommendation: I think the 
users/implementers of the mirror operation need to be aware/careful in 
just what they do while the mirror is active.  I would think that at a 
minimum some text in the model highlighting some of these potential issues 
[to guide both copy manager device and application implementers] would be 
wise.  Not sure how this kind of thing (warning) is typically addressed in 
the standards.  Perhaps this is kind of like the current extended copy 
pre-setup notes (SPC4r36t 5.16.4.1 Prior to sending a third party copy 
command), but should include a general note related to the need for 
caution and a [partial] set of notes related to both pre-mirror command 
sending setup as well as identifying some partial commands/functions 
ramifications to be aware of while mirroring is active to illustrate or 
highlight some of the typical issues that might be of particular concern. 
I would recommend at least LBP (as this is an undetected DI case unless the 
mirror manager implicitly manages/alters it as needed) and fixed block 
commands.  Alternately a much more comprehensive approach which actively 
limits/rejects certain changes to modes and other potentially dangerous 
operations could be done, though that seems a bit impractical (as it ties 
heavily into SSC) -- while a given implementation could do this, I do not 
think it should be standardized (though the notes mentioned before might 
include something that indicates that a device server may limit certain 
functions [preclude their use] if mirroring is active).
The error surfacing of the mirror seems to be a little bit underspecified 
(see the RCDE=1 bullet below).  Most of it is probably ok, but I have 
included my understanding to frame this issue [in case it is relevant to 
any discussion].  In absence of any other specification, I expect that 
much like the commands that base extended copy segment processing issues 
to CSCDs, any [forwarded] command with non-good status (e.g., CC, after 
considering CTEOM) terminates the [mirror] segment and stops/fails the 
mirror operation.  warning: As currently specified, this would mean that a 
read into a FM (or EOD) would stop the mirror (assuming that the volumes 
are indeed in sync and the subordinate devices return the CC for the 
[source] read, which is mapped to space +1 blocks on the subordinates and 
would get CC when encountering the FM).  Some other possibly common command 
sequences and conditions (a [source] space blocks command encountering FM) 
would also terminate the mirroring.
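
To make the filemark concern concrete, here is a minimal sketch of the behavior described above. It is only a toy model under stated assumptions: `ToySubordinate`, its `layout` list, and `mirror_reads` are hypothetical names invented for illustration, and the only rules modelled are "a source read is forwarded as space +1 blocks" and "any non-good status ends the mirror".

```python
from dataclasses import dataclass

GOOD = "GOOD"
CHECK_CONDITION = "CHECK CONDITION"


@dataclass
class ToySubordinate:
    """Hypothetical subordinate tape: 'B' is a block, 'FM' is a filemark."""
    layout: list
    position: int = 0

    def space_blocks(self, count):
        # Space forward over `count` blocks; hitting a filemark or running
        # off the end of the layout (EOD) ends with CHECK CONDITION.
        for _ in range(count):
            if self.position >= len(self.layout):
                return CHECK_CONDITION
            mark = self.layout[self.position]
            self.position += 1
            if mark == "FM":
                return CHECK_CONDITION
        return GOOD


def mirror_reads(subordinate, read_count):
    """Forward each source read as SPACE +1; return how many reads the mirror survived."""
    for done in range(read_count):
        if subordinate.space_blocks(1) != GOOD:
            return done  # the mirror is terminated by this forwarded command
    return read_count
```

With a layout of two blocks followed by a filemark, the third forwarded read hits the filemark and the mirror ends there, even though reading into a filemark is an ordinary event for a tape application.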
The following is my interpretation of RCDE=0:  The mirror operation stops 
(generating LID4 status) on the first failed command (command with 
non-good status [after considering CTEOM]), but does not report it 
anywhere [at or near that time].  The main commands continue to be 
processed as expected, but commands are no longer forwarded to any of the 
subordinate CSCDs [the operation is no longer active].  It is the AC's 
responsibility to poll for LID status at whatever interval(s) it wishes, 
to verify that the mirror was/is successful/active.  Seems like 
most ACs would usually want to use RCDE=1 to avoid very late detection of 
a mirror failure...
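
The RCDE=0 reading above can be sketched as a small state model. This is an illustration of that interpretation only, not text from the proposal: the class name, the `poll` method (a stand-in for the AC requesting list ID status), and the assumption that the primary command itself always succeeds are all hypothetical.

```python
class MirrorSessionRcde0:
    """Toy model of the RCDE=0 interpretation; all names are hypothetical."""

    def __init__(self):
        self.active = True
        self.first_failure = None  # recorded silently, never pushed to the AC
        self.forwarded = 0

    def process(self, command, subordinate_status):
        """Handle one forwardable primary command; return what the AC sees."""
        if self.active:
            self.forwarded += 1
            if subordinate_status != "GOOD":
                # First non-good forwarded status stops the mirror.
                self.active = False
                self.first_failure = (command, subordinate_status)
        # Assumption: the primary command itself succeeded, so the AC
        # sees GOOD whether or not the mirror is still running.
        return "GOOD"

    def poll(self):
        """Stand-in for the AC polling the list ID status."""
        return {"active": self.active, "first_failure": self.first_failure}
```

In this model every primary command after the failure still returns GOOD, so the only way the AC learns that the mirror stopped is by calling `poll`, which is the late-detection risk noted above.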
issue: It seems like RCDE=1 needs a bit more clarification.  I assume this 
is a DEFERRED error reported on the next eligible command (e.g., not INQ, 
SNS, RLUNS, etc)?  Or is it a deferred [or non-deferred] error on the next 
command that would have been forwarded?  The latter seems best.
Since this is a copy like third party I_T (primary device is I), there may 
be some UAs that will be seen on the first forwarded command(s).  This is 
not mentioned for normal extended copy manager issued commands, so 
probably does not need to be mentioned here.
issue: Once it is determined that a command needs to be forwarded, I 
assume it is forwarded to each subordinate device [even if one or more 
fails].  It is up to the copy manager (device implementation) whether they 
are issued concurrently [best performance] or in some other sequence -- 
but the expectation is that any given command will be attempted on each 
CSCD (and if one or more fail the entire mirroring operation terminates 
per above).
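
The forwarding expectation above can be sketched as follows. This is a sequential illustration under assumptions: `forward_to_all` and the `subordinates` mapping of names to send functions are hypothetical, and a real copy manager may well issue the commands concurrently.

```python
def forward_to_all(command, subordinates):
    """Attempt `command` on every subordinate, even if an earlier one fails.

    `subordinates` maps a hypothetical CSCD name to a callable that sends
    the command and returns its status string.
    Returns (mirror_continues, statuses_by_name).
    """
    # Every subordinate is attempted; no early exit on failure.
    statuses = {name: send(command) for name, send in subordinates.items()}
    # One or more non-good statuses terminates the whole mirror operation.
    mirror_continues = all(status == "GOOD" for status in statuses.values())
    return mirror_continues, statuses
```

Collecting every status before deciding keeps the subordinates as close to each other as possible and gives the copy manager the full set of failures to report.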
issue: I did not see any indication on what should be done if the primary 
forwardable command fails on the lead device.  Does the mirror get 
terminated?  Or is the command simply not forwarded (and the mirroring 
remains active)?  Does the behavior vary by command (e.g., block transfer 
commands work differently than positioning commands), in the attempt to 
maintain position (e.g., if a space blocks on the primary hits a FM, 
should it be done on the subordinates?).
issue: related to the above, are non-failure CCs where [at least some part 
of] the command was performed (e.g., ILI on read, non-deferred SK 1, tape 
alert device exception sense) still forwarded or somehow handled or do 
they terminate the mirror? or is the forwarding related to the internal 
execution of the command?  What about DCCs (notably recovered errors) -- 
they fail extended copy (which is why setup recommends disabling PER)?
table 117, 1Ah and 6.Y.3 have the LOAD/UNLOAD command as a terminating 
condition -- is this really the desired behavior (it includes the medium 
ready case where the load is a locate to BOP 0), or should it be when the 
medium is unloaded?  issue / recommendation: At a minimum any volume 
unload condition [manual (button), command, etc] should terminate the 
mirror operation (this could either be the [only] specified condition, or 
should be added). 
issue: Descriptor sense usage:
5.16.4.10: The "shall" here is very strong and it is not clear just what it 
means (e.g., who/how is descriptor mode forced or enforced? device? or is 
this a directive to the AC?).  Is this a check, or setup [implicit 
commands by the copy manager (the rest of the extended copy does not 
typically do much of this)]?  If the copy manager needs to do this 
[configuration], how does it ensure cleanup [if needed]?  Since the 
descriptor mode uses forwarded sense descriptors (which can forward either 
fixed or descriptor sense), it seems like only the copy manager I_T really 
needs to be in descriptor mode [and that is only needed for the more 
complex case where errors occur on more than one CSCD].  This is a much 
simpler thing to check/enforce (rather than requiring [internally 
generated commands to check or set up] that mode on all involved CSCDs) -- 
e.g., the I_T that is processing the extended copy mirror command needs to 
be set to descriptor mode [the command is failed if it is not].  Since 
the AC is required to be able to parse this format, it should be the one 
required to do the setup before the mirror [or copy] command.  In the 
mirror case there is a hole here of course in that the format could be 
changed while the mirroring is active. 
A similar comment to above applies to the similar descriptor sense 
requirement for descriptor 19h (again it seems like it is only really 
needed for the copy manager I_T, not all CSCD I_Ts).
It might also be possible to soften this to an AC recommendation which 
allows better determination of the error location, which is not strictly 
needed (depending on what kind of recovery action, if any, is attempted in 
mirror failure cases).  There are other ways to do this recovery.  The 
advantage of allowing fixed sense format is broader compatibility with 
existing software stacks, especially with respect to the mirror, as 
requiring descriptor sense makes what could otherwise be a mostly 
transparent function not work in a number of environments.  My 
recommendation is that these clauses indicating descriptor sense be 
relaxed to something indicating that if the application client wants this 
location information, it should [rather than shall] configure the copy 
manager I_T for descriptor sense, but the operations (both mirror and 19h) 
should work with fixed sense and additional CSCDs; the error location will 
just not be reported as precisely.
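
The difference between the strict ("shall") reading and the relaxed ("should") recommendation can be sketched as a small policy function. This is only an illustration of the two options discussed above: the function name, its parameters, and the two-flag result are hypothetical and do not come from the proposal.

```python
def accept_command(copy_manager_descriptor_sense, strict):
    """Decide how a mirror or 19h command is treated under each policy.

    copy_manager_descriptor_sense: True if the I_T processing the command
        is configured for descriptor format sense data.
    strict: True models the "shall" reading (command failed if the I_T is
        in fixed sense mode); False models the relaxed "should" reading.
    Returns (accepted, per_cscd_error_location_available).
    """
    if copy_manager_descriptor_sense:
        return True, True    # accepted, failing CSCD can be identified
    if strict:
        return False, False  # fixed sense: command is failed outright
    return True, False       # fixed sense: runs, with less precise location
```

Under the relaxed policy an existing fixed-sense software stack can still use the function; it only gives up the ability to tell which CSCD failed.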
issue: Table 146: missing segment 1Ah (should have a row indicating 
descriptor 1Ah is reserved for use with the mirror copy command).
issue: 6.Y.3: limiting additional CSCDs to a maximum of 3 seems arbitrary 
(for a total of 4 subordinate CSCDs).  It does tie to the enumerated failure 
list [sense data source for descriptor sense format, which is a fairly 
limited 4 bit field], so if the entire sense data source field were used 
this allows 14.  Is there a better way to do this so the CSCD count could 
be extended to a larger count if desired?  <admittedly there may be 
practical implementation/performance reasons to keep it fairly small 
anyway, but I always try to think a little bit about arbitrary limits and 
whether they are appropriate>.
recommendation: In any case, since a given implementation may have a 
smaller limit (0 [not all that interesting for descriptor 19h], 1, 2 or 
3), it seems the maximum number of additional CSCDs supported should be 
reflected in a descriptor(s) (new one(s) or use some of the reserved part 
of 9101h) in inquiry page 8Fh to reflect capabilities.  This is consistent 
with most of the other aspects of extended copy.  This could be a single 
value that applies to both usages (descriptor 19h (copy segment) and 1Ah 
(mirror command)), or indicated separately for each of these usages.
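
As a sketch of how an application client might use such a reported limit, the following models both options mentioned above (one shared value, or separate values per usage). The class and field names are hypothetical; no such descriptor layout exists in the proposal as reviewed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AdditionalCscdLimits:
    """Hypothetical capability values an AC might read from the device."""
    max_for_19h: int
    max_for_1ah: Optional[int] = None  # None: one value covers both usages

    def limit(self, descriptor):
        """Return the maximum additional CSCDs for descriptor 19h or 1Ah."""
        if descriptor == 0x19:
            return self.max_for_19h
        if descriptor == 0x1A:
            if self.max_for_1ah is None:
                return self.max_for_19h
            return self.max_for_1ah
        raise ValueError("only descriptors 19h and 1Ah are modelled")

    def accepts(self, descriptor, additional_cscds):
        """True if the requested count fits the reported capability."""
        return 0 <= additional_cscds <= self.limit(descriptor)
```

For example, `AdditionalCscdLimits(3)` models a single shared limit of 3, while `AdditionalCscdLimits(3, 1)` models a device that allows 3 additional CSCDs for descriptor 19h but only 1 for the mirror command.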
Kevin D. Butt
SCSI Architect, Tape Firmware, T10 Standards
Data Protection & Retention
MS 6TYA, 9000 S. Rita Rd., Tucson, AZ 85744
Tel: 520-799-5280
Fax: 520-799-2723 (T/L:321)
Email address: kdbutt at us.ibm.com


