Re: Exemplar Characters

From: Mark Davis (mark.davis@icu-project.org)
Date: Tue Nov 15 2005 - 17:34:08 CST

    I have always had misgivings as well. Luckily, we avoided other
    proposals for disunifying identical symbols (such as the "abbreviation"
    period vs the "sentence termination" period vs the "decimal point"
    period), but I think we would have been better off not distinguishing
    02BC and 2019 either. As it is, software really should treat them
    as essentially equivalent, since you have little idea which one any
    given user will have chosen.
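
    As a rough illustration of what treating the two as equivalent could
    mean for matching, here is a minimal Python sketch. The names
    fold_apostrophes and loosely_equal are hypothetical, and folding to
    U+2019 rather than to U+02BC is an arbitrary assumption made only for
    the example.

```python
# Hypothetical helpers: fold U+02BC onto U+2019 before comparing strings.
# The choice of U+2019 as the target is an assumption, not a recommendation.
APOSTROPHE_FOLD = str.maketrans({
    "\u02BC": "\u2019",  # MODIFIER LETTER APOSTROPHE -> RIGHT SINGLE QUOTATION MARK
})


def fold_apostrophes(text: str) -> str:
    """Return text with U+02BC replaced by U+2019."""
    return text.translate(APOSTROPHE_FOLD)


def loosely_equal(a: str, b: str) -> bool:
    """Compare two strings while ignoring the 02BC / 2019 distinction."""
    return fold_apostrophes(a) == fold_apostrophes(b)
```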

    (The mere fact that there can be a phonetic difference between
    sequences with and without the character is not conclusive evidence
    for which one should be chosen. Compare "I'LL" and "ILL": they are
    pronounced differently, yet the mark in "I'LL" is an apostrophe.)

    Mark

    Asmus Freytag wrote:

    > On 11/15/2005 11:26 AM, Marc Bruguières wrote:
    >
    >>
    >> Michael Everson:
    >>
    >>
    >>> At 09:42 -0500 2005-11-15, Chris Harvey wrote:
    >>>
    >>>
    >>>
    >>>> Would this mean that the choice between U+2019 and U+02BC is
    >>>> decided by the phonetic realisation of the apostrophe?
    >>>>
    >>>
    >>> Maybe. There isn't a rule, any more than there is a rule about the
    >>> phonetic value of <c> in any particular language.
    >>>
    >>> Polynesian languages should all use the modifier letters, for
    >>> consistency. It's a glottal there.
    >>>
    >>
    >>
    >> Did they before Unicode? Do they now? If their usage differs,
    >> isn't this causing a bit of confusion? (I doubt U+02BC is used very
    >> much, as it is not in standard fonts like Times Roman on XP SP2 and
    >> U+2019 is readily available...)
    >>
    >> Does word highlighting work less well in Breton than in Polynesian
    >> languages because Breton, let's say, would use U+2019 and the others
    >> U+02BC? I don't think so. At least that is not the case in Word 2003
    >> on XP; in fact, U+2019 for Breton works better inside words than
    >> U+02BC, which breaks them. Incidentally, that is strange for a
    >> modifier letter, I would say.
    >> Isn't this a case of overunification?
    >
    > I think you meant 'over-dis-unification' here.
    >
    >> Looks the same to users, seems it should behave the same way (in fact
    >> whether an apostrophe breaks a word or not is language dependent[1]).
    >> Why two characters? For extra confusion and spoofing fun?
    >>
    >>
    > I've always been troubled by 02BC / 2019 myself. Since it's not
    > possible to distinguish the two uses of 2019, it seems to make little
    > sense to pull out some partial use and assign it to 02BC. And the fact
    > that there's no visual distinction is really troubling.
    >
    > Your comments imply that implementers have voted with their software
    > and decided to unify rather than to support 02BC. That this is not
    > limited to MS is shown by the fact that Google does not support this
    > character as a modifier letter. c <02BC> t, brings up c't (the German
    > Computer magazine), but so does "c,t".
    >
    > A./
    >
    >> --
    >> Marc
    >>
    >> [1] That is even too simple: "à pied d'oeuvre" makes 4 words (à,
    >> pied, de, oeuvre, four distinct entries in a dictionary) but
    >> "aujourd'hui" or "chef-d'oeuvre" are single words (one entry in a
    >> dictionary). Strictly speaking, a dictionary is necessary, although
    >> obviously cheaper software may approximate algorithmically.
    >>



    This archive was generated by hypermail 2.1.5 : Tue Nov 15 2005 - 17:36:08 CST