Re: markup on combining characters

From: Dean Snyder
Date: Thu Sep 09 2004 - 16:27:28 CDT


    Mike Ayers wrote at 11:06 AM on Thursday, September 9, 2004:

    >>Does the issue of markup colors vs. font colors even fall on Unicode ground?
    > No, it shouldn't. Unicode deals with characters, not parts of
    >characters, despite making use of character composition to form some of
    >those characters. As such, getting involved in sub-character issues, such
    >as how to color parts of characters, is out of scope. That's how I see it.
    >But of course, my vision doesn't mean much. What does the UTC see? Is this
    >still then an undecided issue?

    But this goes back to a serious mistake made long ago by Unicode when
    encoding, e.g., Hebrew.

    The vowel points in Hebrew are CHARACTERS, they are NOT the logical
    equivalent to Latin accents, for example, even though they may seem to
    superficially resemble them to non-Hebraists. They are not dependent, or
    "sub-", characters - they are full-fledged characters that just happen to
    be written above, in, and under the consonants. They are, for example,
    actually read sequentially and discretely and at times independently. But
    Unicode strapped these Hebrew vowel points with the combining mark
    property, thus making them dependent on base characters; and here we are,
    stuck with a wrong-headed "legacy" with all its concomitant problems.
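
    For readers who want to see the property in question, the following is a
    minimal sketch using Python's standard unicodedata module. The sample
    code points and the helper name describe() are illustrative choices, not
    anything taken from the standard or from this thread. It simply prints
    the general category and canonical combining class that the Unicode
    Character Database assigns to a few Hebrew points, with one consonant for
    contrast.

```python
import unicodedata

# Illustrative sample (an assumption of this sketch): four Hebrew points
# and one consonant, ALEF, for contrast.
SAMPLES = ["\u05B0", "\u05B7", "\u05B8", "\u05BC", "\u05D0"]


def describe(ch):
    """Return (name, general category, canonical combining class) for ch."""
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.combining(ch),
    )


for ch in SAMPLES:
    name, category, ccc = describe(ch)
    print(f"U+{ord(ch):04X} {name}: category={category}, combining class={ccc}")
```

    The points should be reported with category Mn (nonspacing mark) and a
    nonzero combining class, while the consonant is reported as Lo (other
    letter) with combining class 0. That is the dependence on a base
    character described above.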

    My thinking is that Unicode created and perpetuates this problem, so
    Unicode must come up with a solution for it. Therefore it DOES "fall on
    Unicode ground."

    I'm way too busy right now to devote any time to this, but I would
    suggest that Unicode put ALL Hebrew proposals that intersect this issue
    on hiatus until the Hebrew and Unicode experts on the Hebrew email list
    (and elsewhere) come up with long-term strategies for dealing with this
    central issue. We don't need a continuing stream of ad hoc bandages and
    baling wire to mask a fundamental design flaw.

    I have a feeling that some time in the future the Unicode encoding of
    Hebrew vowels will have to be relegated to legacy status and be
    completely re-done, abandoning the combining mark fiasco.


    Dean A. Snyder

    Assistant Research Scholar
    Manager, Digital Hammurabi Project
    Computer Science Department
    Whiting School of Engineering
    218C New Engineering Building
    3400 North Charles Street
    Johns Hopkins University
    Baltimore, Maryland, USA 21218

    office: 410 516-6850
    cell: 717 817-4897
