Unicode abuse (was: Re: But E0000 Custom Language Tags Are Actually *Required* For Use By Unicode)

From: Doug Ewell (dewell@adelphia.net)
Date: Sat Mar 05 2005 - 16:29:38 CST

    Asmus Freytag <asmusf at ix dot netcom dot com> wrote:

    >> Among other drawbacks, they only encode the basic Latin and Greek
    >> alphabets, not digits, punctuation, or accented letters
    > They are also intended to convey a specific set of distinctions
    > that are applied to the use of these forms in Mathematics.
    > ...
    > However, these uses are different from the way italics or bold
    > are used in text, which usually is to lend emphasis, or to
    > differentiate parts of the text from each other, or in a merely
    > stylistic way, as in formatting section and chapter headers.

    That's one of the "other drawbacks" I meant. They're not "for" that.

    My MathText tool at http://users.adelphia.net/~dewell/mathtext.html,
    which blatantly abuses Unicode by converting plain text to math symbols
    and back, was intended as an April Fool's joke, although it is fully
    functional and has reportedly been used for font testing.
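
    For readers who have not seen this kind of conversion, here is a
    minimal sketch in Python. It is not the MathText source; the function
    names are hypothetical, and it handles only the Mathematical Italic
    range (U+1D434 onward) for the basic Latin letters. Italic small h is
    not encoded in that block, so U+210E PLANCK CONSTANT stands in for it.

```python
# Illustrative sketch only; not the actual MathText implementation.
ITALIC_UPPER = 0x1D434  # MATHEMATICAL ITALIC CAPITAL A
ITALIC_LOWER = 0x1D44E  # MATHEMATICAL ITALIC SMALL A
PLANCK = "\u210E"       # PLANCK CONSTANT fills the reserved slot for italic h


def _build_table():
    """Map A-Z and a-z to their Mathematical Italic counterparts."""
    table = {}
    for i in range(26):
        table[chr(ord("A") + i)] = chr(ITALIC_UPPER + i)
        table[chr(ord("a") + i)] = chr(ITALIC_LOWER + i)
    table["h"] = PLANCK  # U+1D455 is unassigned, so override it
    return table


TO_MATH = _build_table()
FROM_MATH = {math: plain for plain, math in TO_MATH.items()}


def to_math_italic(text):
    """Replace basic Latin letters; leave everything else untouched."""
    return "".join(TO_MATH.get(ch, ch) for ch in text)


def from_math_italic(text):
    """Reverse the mapping; characters outside the table pass through."""
    return "".join(FROM_MATH.get(ch, ch) for ch in text)
```

    Digits, punctuation, and accented letters pass through unchanged,
    which is the drawback mentioned above. The reverse direction also
    turns a genuine Planck constant into a plain "h", so the round trip
    is lossy for real mathematical text.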

    This brings up the topic of "Unicode abuse" in general. Conformance to
    the Unicode Standard (see DUTR #33, of which Asmus is a co-author)
    generally refers to support for and adherence to the "letter of the
    law," things like implementing normalization and casing correctly. It's
    not quite so easy to quantify adherence to the "spirit of the law," in
    terms of things like abusing math characters and compatibility
    characters, or using directional overrides where they don't harm
    anything and aren't invalid, but also aren't necessary or appropriate.
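
    As a rough illustration of why this is hard to quantify, the sketch
    below counts the kinds of characters just mentioned. It relies only
    on Python's standard unicodedata module; the function names and the
    category labels are my own assumptions. It can flag candidates, but
    it cannot tell whether any particular use is appropriate.

```python
import unicodedata

# LEFT-TO-RIGHT OVERRIDE and RIGHT-TO-LEFT OVERRIDE
BIDI_OVERRIDES = {"\u202D", "\u202E"}


def classify(ch):
    """Return a label for a 'suspect' character, or None."""
    cp = ord(ch)
    # Mathematical Alphanumeric Symbols block
    if 0x1D400 <= cp <= 0x1D7FF:
        return "math_alphanumeric"
    if ch in BIDI_OVERRIDES:
        return "bidi_override"
    # Compatibility decompositions carry a tag such as <font> or <compat>
    if unicodedata.decomposition(ch).startswith("<"):
        return "compatibility"
    return None


def count_suspects(text):
    """Count suspect characters by label (hypothetical helper)."""
    counts = {}
    for ch in text:
        label = classify(ch)
        if label is not None:
            counts[label] = counts.get(label, 0) + 1
    return counts
```

    Note that this is purely a "letter of the law" test. A no-break
    space or a math symbol in an actual formula is counted just the same
    as one used for decoration.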

    This almost falls into the same category as spoofing, which is being
    addressed in a different UTR, but seems different somehow.

    Clearly it's a bad thing to use characters for inappropriate purposes,
    but there is a long history of people doing this with ASCII, and I've
    even seen math symbols used 𝑓𝑜𝑟 𝑒𝑚𝑝ℎ𝑎𝑠𝑖𝑠 on this list.

    How does one go about measuring the type of "conformance" that relates
    to using Unicode for the "right" purposes versus the "wrong" purposes?

    -Doug Ewell
     Fullerton, California
