From: Ernest Cline
Date: Wed Mar 17 2004 - 14:30:05 EST

    > [Original Message]
    > From: Peter Kirk
    > On 17/03/2004 07:12, Ernest Cline wrote:
    > >Well, in the event that Unicode ever does add DOTTED J to go with
    > >DOTLESS J, I sincerely hope that it does not follow the example of
    > >DOTTED I and DOTLESS I. It would have been better in my opinion
    > >to have encoded upper and lower case forms of both characters
    > >separate from the ordinary I. That would have placed language
    > >specific burdens not on the casing algorithm of Unicode but on the
    > >transfer of data from legacy character sets. It's probably too late
    > >to change this for the I, but hopefully this can be avoided for J if
    > >a distinct dotted J character is needed.
    > It was too late to change this one even before Unicode was dreamed up,
    > in fact as soon as anyone started using legacy character sets to write
    > Turkish and used the ordinary ASCII i and I for Turkish dotted i and
    > dotless I respectively. Any documents in mixed Turkish and European
    > languages, without explicit language markup, would be hopelessly messed
    > up, and the burden which you wanted to put "on the transfer of data from
    > legacy character sets" would have implied the need to rewrite all such
    > documents.

    Mixed documents in Turkish and other European languages that lack
    language markup have the same problem no matter where the burden
    is placed: whenever a casing rule is applied, some I's will receive
    inappropriate glyphs. The problem is just as pronounced with either
    method, and the need to rewrite such documents to ensure proper
    casing is the same.
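
To make the casing issue concrete, here is a minimal Python sketch of
language-tailored casing under the current encoding, where Turkish shares
the ASCII i and I with other languages. The helper names upper_tr and
lower_tr are hypothetical, made up for this illustration; they are not part
of any library, and the sketch ignores the combining-dot sequences that a
full implementation would also have to handle.

```python
# Code points that exist in Unicode today for the Turkish pairs.
DOTLESS_LOWER = "\u0131"  # LATIN SMALL LETTER DOTLESS I
DOTTED_UPPER = "\u0130"   # LATIN CAPITAL LETTER I WITH DOT ABOVE


def upper_tr(text: str) -> str:
    """Uppercase using the Turkish pairing: i -> U+0130, U+0131 -> I."""
    # Map the dotted i by hand first; the default upper() already
    # takes the dotless i (U+0131) to the plain ASCII I.
    return text.replace("i", DOTTED_UPPER).upper()


def lower_tr(text: str) -> str:
    """Lowercase using the Turkish pairing: U+0130 -> i, I -> U+0131."""
    # Both capitals are mapped by hand before the default lower() runs,
    # so that neither one gets the language-neutral mapping.
    return (
        text.replace(DOTTED_UPPER, "i")
        .replace("I", DOTLESS_LOWER)
        .lower()
    )


if __name__ == "__main__":
    word = "istanbul"
    print(upper_tr(word))            # Turkish rule applied
    print(word.upper())              # default, language-neutral rule
    print(lower_tr(word.upper()))    # mixing the two rules changes the word
```

Without language markup, nothing tells the program which of the two rules
to apply to a given I, which is the situation described above.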

    I will admit that my preferred solution has higher initial costs, but
    its lower long-term costs lead me to favor it. In any case, changing to
    my preferred solution now would not be worth the confusion it would
    cause. If there is ever a successor to Unicode, this idea would be worth
    examining then, but such an event is at least twenty years away.
