Re: Case mappings

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sun Mar 13 2011 - 16:06:33 CST


    2011/3/13 Doug Ewell <doug@ewellic.org>:
    > Philippe Verdy wrote:
    >
    >>> Modifying all existing electronic text to include such an invisible
    >>> control character,
    >>
    >> Why all texts ? This was not in the proposal.
    >
    >>> and requiring all users and processes to enter it reliably,
    >>
    >> Why all users ? Here again not in the proposal. In fact all characters
    >> are encoded for an undefined number of users, possibly small, but not for
    >> all users. The existence of the character would be there for those users for
    >> whom the difference does matter.
    >
    > If users or processes who want to take advantage of this special character
    > cannot depend on it being there in all texts, it may as well not be there at
    > all, as they will have to fall back on the same heuristics that they are
    > trying to avoid.
    >
    > In any case, I'd best get out of the business of telling users like QSJN UKR
    > that such-and-so character would be a bad idea or that Unicode will not
    > encode it, even if that is what I personally believe.

    This is a chicken-and-egg problem. If you follow this path of
    reasoning, let's just stop discussing any further progress or
    additions in Unicode. Without any doubt, we would still be using ASCII
    for almost everything in Latin, and all texts would have remained
    ambiguous.
    There's a well-known problem, but no willingness to propose a
    solution for it. Telling people not to use any case mapping on their
    encoded texts is just a way of telling them: don't use a standard
    Unicode algorithm, which amounts to breaking the standard itself by
    making it unusable for practical problems.
    I don't follow you there. A new character offers a clean long-term
    solution, even if texts encoded without it will remain around for a
    long time (they can be corrected at any time, at every occurrence
    where the absence of the explicit combining character would cause
    problems). Even though Unicode is now widely deployed, past texts
    using only ASCII have not disappeared; in the same way, we still see
    texts using a single dash for unrelated things, or the same ASCII
    double quote for distinct quotation marks.
    I'm not advocating the addition of new letters when just a couple of
    combining characters, explicitly marking the expected case semantics,
    can solve all this for all pluricameral scripts and all their cased
    letters.


