Re: Character disunification

From: Mark E. Shoulson
Date: Tue Dec 21 2004 - 10:48:31 CST


    Jony Rosenne wrote:

    >>You did suggest something like this during one of the various Hebrew
    >>character debates. But it doesn't hold up well in general. By that
    >>logic, we also now need to encode LATIN LETTER U OR V, LATIN
    >>J (both in CAPITAL and SMALL versions), plus LATIN SMALL
    >>SHORT S (though we could probably manage to use just U+0073
    >>for that and encode SHORT S separately). But I don't think anyone
    >>would want such a confusing state of affairs. Spelling things right
    >>is hard enough when there's only *one* choice for each letter!
    >The example isn't relevant. These disunifications are very old - you could
    >have added C/G - and the I and U are commonly used for the ambiguous
    >characters.

    And Dean's example of Cuneiform characters wasn't also "very old"?
    Besides, we still have documents from that "very old" time (only a few
    centuries ago, hardly old at all. Why, many of them are printed!),
    documents which *do* use the glyphs ambiguously. If it's sensible to
    encode ambiguous characters, then these are probably the most sensible
    cases for it. But it isn't.

    "The I and U are commonly used for the ambiguous characters"? Great.
    The stemmed version of the cantillation is also commonly used for the
    ambiguous character. Nothing you've said here distinguishes the case of
    U/V from the one under discussion.

    (Actually, the unification of yerah-ben-yomo and atnah hafukh may
    be older than U/V and I/J. Books were printed with U/V and I/J
    not distinguished for quite a long time, certainly into the 17th century
    (viz. Shakespeare's First Folio, for a famous example), but the
    cantillations were conflated in quite early printings of the
    Bible--though not necessarily the earliest.)

