Re: New Corrigendum to The Unicode Standard

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Wed Aug 22 2007 - 00:48:03 CDT

    On 8/21/2007 8:26 PM, Philippe Verdy wrote:
    > I was not making a formal proposal, just proposing something to help
    > refer to the version with corrigendums applied. OK the letter "d" is
    > used for drafts, but drafts don't need to be referred to for long term,
    > that's not the case for compliant implementations.
    >
    You either propose or you don't propose. This isn't a formal reply by
    the way, just my personal frustration with the way you cloud every issue
    with fanciful 'suggestions' that are just too often based on an
    incomplete understanding of how the standard actually works.
    > Choose letter c (like corrigendum) if you prefer or an extra dot.
    > So the current version would be 5.0.0c6 or 5.0.0.6.
    >
    > From Asmus Freytag
    >
    >> The Unicode Standard (and website) make very clear that a corrigendum
    >> does not actually modify a version. It also doesn't supersede a version.
    >> What it does, is to allow implementers to claim conformance to a version
    >> with the corrigendum applied.
    >>
    >
    > It does when it modifies character properties, because you can't comply with
    > both the base version and the version with the corrigendum,
    Corrigenda, by their very nature, never allow an implementation to
    comply with both the base version of the standard and the same version
    with a corrigendum. Corrigenda are issued only where defects have been
    found with *normative* information, and where it can be expected that
    implementers would want to patch an implementation to correct just those
    defects and not wait for the next version of the standard.

    Granted, as you point out below, if you don't support a feature subject
    to a corrigendum, then that corrigendum has no material effect on your
    implementation, but that's an obvious corollary.
    > unless:
    >
    > (1) you agree NOT to use the characters whose properties have
    > changed (and in that case this has a global effect on ALL past versions); or
    >
    > (2) you don't use Unicode 5 and keep with Unicode 4 only (something
    > that is certainly not desirable for the long term)
    >
    > As the compliance level will be needed here not only for renderers (the way
    > they handled the BiDi algorithm and reorder and mirror characters), but also
    > the applications that may generate text for the intended rendering, you can
    > say what you want, but this change means that Unicode 5.0 without the
    > corrigendum is not compliant with the previous versions for these characters
    > whose properties were changed in an incompatible way before being corrected.
    >
    > The Bidi algorithm is normative, it is a Standard Annex, and its intro
    > explicitly says:
    > "A Unicode Standard Annex (UAX) forms an integral part of the
    > Unicode Standard(...)"
    >
    Skipping to here as the premise for the argument above misses the point.
    > This means that the base version is deprecated, even if it was published,
    > and that implementing Unicode without the corrigendums when you know they
    > exist should not be recommended.
    The word 'deprecation' has a formal meaning that does not apply here.
    You are correct that one might not want to recommend the support of the
    5.0.0 bidi algorithm over the one with corrigendum 6 applied, but that's
    not a formal statement of deprecation.
    > But anyway, you'll still need to implement
    > it according to some version for which there's still no corrigendum; if you
    > comply with it, your implementation may become incompatible with a future
    > corrigendum. That's why I suggest being able to refer to Unicode versions
    > with or without corrigendums explicitly.
    >
    It's already the case that you can "refer to Unicode versions with
    or without corrigendums explicitly." References to versions with applied
    corrigenda can be made as shown at Unicode References
    <http://www.unicode.org/versions/#References>. (the last sentence is a
    verbatim quote from http://www.unicode.org/versions/corrigenda.html).

    > I mean here: Unicode 5.0.0c0 for the version without the corrigendums and
    > Unicode 5.0.0.c6 for the current version with 6 corrigendums applied.
    >
    There's no need to introduce a numbering scheme here. And it's misleading
    in that multiple corrigenda can be applied to some versions of the
    standard, and there's no requirement that they be applied in sequence.
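
    To make the point concrete, here is a minimal Python sketch. It is not
    part of any official Unicode tooling: the class name ConformanceClaim,
    its methods, the label wording, and the corrigendum numbers used in the
    usage note are all hypothetical. It models a conformance claim as a base
    version plus an unordered set of applied corrigenda, rather than as a
    single trailing counter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConformanceClaim:
    """Hypothetical model: a base version plus a set of applied corrigenda."""

    base_version: str
    corrigenda: frozenset = frozenset()

    def with_corrigendum(self, number: int) -> "ConformanceClaim":
        # Applying a corrigendum adds it to the set; order is not recorded.
        return ConformanceClaim(self.base_version, self.corrigenda | {number})

    def label(self) -> str:
        # Illustrative wording only, not the official reference format.
        if not self.corrigenda:
            return f"Unicode {self.base_version}"
        nums = ", ".join(str(n) for n in sorted(self.corrigenda))
        return f"Unicode {self.base_version} + corrigenda [{nums}]"
```

    Under these assumptions, applying hypothetical corrigenda 1 and 2 in
    either order yields equal claims, and a claim with only corrigendum 2
    applied is representable. A label such as "c2" cannot distinguish that
    last case from "both 1 and 2 applied".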

    The following speculations are built on some fundamental
    misunderstandings of how the corrigenda mechanism works. As I find the
    proposed 'solutions', whether the use of numberings or graphs, entirely
    redundant and superfluous, I'm not going to take the trouble to improve
    them by pointing out specific issues with them. I merely want to warn
    all readers that these do not correctly represent the official
    information on corrigenda found at
    http://www.unicode.org/versions/corrigenda.html.

    Skipping.....
    > The versions in the 5.0 family have of course a very large common ground,
    > but complying to all of the family members at the same time means dropping
    > the support for the characters whose properties have changed (as if they
    > were undefined characters in all versions of Unicode before the corrigendum).
    > If you want to comply to the whole set of Unicode versions, in an upward
    > compatible way, the common subset of supported characters will be reduced
    > even more.
    >
    > Note that under this rule, this means that the Unicode 5.0 change created
    > the reduced subset, it's not the corrigendum itself that is doing that
    > because it attempts to restore the compatibility with past versions; however
    > corrigendums are not upward compatible with uncorrected versions, instead
    > what they do is effectively move a past version (or version with prior
    > corrigendums) into a separate branch, out of the versions trunk.
    >
    > It's a matter of logic; compliance level should not be made fuzzy.
    >
    > So I mean this compliance graph tree (partial) which is distinct from the
    > historic tree, because the branches out of the trunk are in a reversed
    > order, where each branch from the trunk makes an incompatible change, and
    > elements at the lowest position in a branch are the most compatible with the
    > trunk (the vertical link between them is in fact describing two separate
    > branches because they are also containing mutually incompatible
    > differences):
    >
    > (Latest)
    > ||
    > || 5.0.0c0 (the Unicode 5.0 book, uncorrected)
    > || |
    > || 5.0.0c1
    > || |
    > || 5.0.0c2
    > || |
    > || 5.0.0c3
    > || |
    > || 5.0.0c4
    > || |
    > || 5.0.0c5
    > || |
    > |+-----+
    > ||
    > 5.0.0c6 (Unicode 5.0, with all corrigenda)
    > ||
    > 4.1.0c0 (Unicode 4.1, has no corrigendum)
    > ||
    > || 4.0.1c0 (Unicode 4.0.1 uncorrected)
    > || |
    > || 4.0.1c1
    > || |
    > || 4.0.1c2
    > || |
    > || 4.0.1c3
    > || |
    > || 4.0.1c4
    > || |
    > |+-----+
    > ||
    > 4.0.1c5 (Unicode 4.0.1, with all corrigenda)
    > ||
    > 4.0.0c0 (The Unicode 4.0 book, has no corrigendum)
    > ||
    > || 3.2.0c0 (Unicode 3.2, uncorrected)
    > || |
    > || 3.2.0c1
    > || |
    > || 3.2.0c2
    > || |
    > || 3.2.0c3
    > || |
    > |+-----+
    > ||
    > 3.2.0c4 (Unicode 3.2, with all corrigenda)
    > ||
    > || 3.1.1c0 (Unicode 3.1.1, uncorrected)
    > || |
    > || 3.1.1c1
    > || |
    > || 3.1.1c2
    > || |
    > |+-----+
    > 3.1.1c3 (Unicode 3.1.1, with all corrigenda)
    > || ||
    > 3.1.0c0 (Unicode 3.1.0, has no corrigendum)
    > ||
    > 3.0.1c2 (Unicode 3.0.1, with all corrigenda)
    > ||
    > || 3.0.1c0 (Unicode 3.0.1, uncorrected)
    > || |
    > || 3.0.1c1
    > || |
    > |+-----+
    > ||
    > 3.0.0c0 (The Unicode 3.0 book, has no corrigendum)
    > ||
    > 2.1.9c0 (Unicode 2.1.9, has no corrigendum)
    > ||
    > (...) (intermediate versions have no corrigendum)
    > ||
    > 2.0.0c0 (The Unicode 2.0 book, has no corrigendum)
    > ||
    > 1.1.5c0
    > ||
    > 1.1.0c0
    > ||
    > 1.0.1c0
    > ||
    > || 1.0.0c0 (The Unicode 1.0 book, has no corrigendum)
    > || |
    > |+-----+
    > ||
    > (Root) (no version assigned, common part between 1.0.1c0 and 1.0.0)
    >
    > I've not verified completely this tree, there may exist some other
    > incompatibilities between two successive versions in the trunk, in which
    > case there would be other branches.
    >
    .. to here.

    The tree above is entirely fanciful and gives a materially misleading
    representation of the issue. Please consult the official information at
    http://www.unicode.org/versions/corrigenda.html instead. It's clear and
    concise and, being official, it's what you should rely on.

    A./


