Re: Hebrew script in IDN (was Exemplar Characters)

From: Mark Davis (mark.davis@icu-project.org)
Date: Thu Nov 17 2005 - 17:51:04 CST

    It is not that clear-cut. Identifiers by their nature cannot include all
    words and phrases valid in all languages. For IDN, for example, one
    can't express the perfectly reasonable English word "can't", or a word
    like "I.B.M.".
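
To make the "can't" and "I.B.M." cases concrete, here is a minimal Python sketch. It assumes the traditional letter-digit-hyphen (LDH) rule for host name labels as a stand-in for the ASCII side of what a domain label may contain. The helper name is_ldh_label is hypothetical, and this is not a full IDNA implementation.

```python
import re

# Letter-digit-hyphen (LDH) label: 1 to 63 characters, letters, digits and
# hyphen only, with no hyphen at the start or end.
# Assumption for illustration: only the ASCII rules are modelled here.
_LDH = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def is_ldh_label(label: str) -> bool:
    """Return True if `label` is a well-formed LDH label."""
    return _LDH.fullmatch(label) is not None


for word in ("cant", "can't", "I.B.M."):
    # The full stop is the label separator, so it can never be part of a label.
    labels = word.split(".")
    print(word, labels, [is_ldh_label(label) for label in labels])
```

The apostrophe makes "can't" fail the check outright, while "I.B.M." is split at each full stop into separate labels (plus an empty trailing one) rather than being treated as a single word.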

    I did introduce a proposal in March for considering the status of some
    word characters, which turned into a discussion in the UTC of whether
    to add certain items to the identifier definition.

    http://www.unicode.org/L2/L2005/05083-wordprops.txt

    I'll copy that section here for those without access:

    0027 ; # Po APOSTROPHE
    002D ; # Pd HYPHEN-MINUS
    002E ; # Po FULL STOP
    003A ; # Po COLON
    00B7 ; # Po MIDDLE DOT
    058A ; # Pd ARMENIAN HYPHEN
    05F3 ; # Po HEBREW PUNCTUATION GERESH
    05F4 ; # Po HEBREW PUNCTUATION GERSHAYIM
    200C ; # Cf ZERO WIDTH NON-JOINER // for Indic?
    200D ; # Cf ZERO WIDTH JOINER // for Indic?
    2010 ; # Pd HYPHEN
    2019 ; # Pf RIGHT SINGLE QUOTATION MARK
    2027 ; # Po HYPHENATION POINT
    30A0 ; # Pd KATAKANA-HIRAGANA DOUBLE HYPHEN

    The UTC decided against adding them to the identifier definition.
    If we were to change that for the Hebrew punctuation, we would have to
    see a documented case for it.
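
The general categories in the list above can be checked against a Unicode character database. The following is a minimal sketch using Python's standard unicodedata module. The names CANDIDATES and describe are only illustrative, and the output depends on the Unicode version bundled with the interpreter.

```python
import unicodedata

# Code points copied from the list in this message.
CANDIDATES = [
    0x0027, 0x002D, 0x002E, 0x003A, 0x00B7, 0x058A, 0x05F3,
    0x05F4, 0x200C, 0x200D, 0x2010, 0x2019, 0x2027, 0x30A0,
]


def describe(cp: int) -> str:
    """Format one code point in the same style as the list above."""
    ch = chr(cp)
    return f"{cp:04X} ; # {unicodedata.category(ch)} {unicodedata.name(ch)}"


for cp in CANDIDATES:
    print(describe(cp))

# None of the candidates is a letter, mark, or digit: they are all
# punctuation (P*) or format (Cf) characters.
assert all(unicodedata.category(chr(cp))[0] in "PC" for cp in CANDIDATES)
```

Because every candidate falls in a punctuation or format category, none of them qualifies through the letter, mark, and digit categories on which an identifier definition is normally built, so each would need to be added as an explicit exception.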

    Mark

    Michael Everson wrote:

    > At 17:42 +0100 2005-11-17, Cary Karp wrote:
    >
    >>> "These punctuation marks may not be available in all fonts (and legacy
    >>> encodings), so an implementation should be prepared to degrade
    >>> gracefully.
    >>> U0027 APOSTROPHE for GERESH and U0022 QUOTATION MARK for GERSHAYIM are
    >>> acceptable fallbacks."
    >>
    >>
    >> The problem is that these fallbacks are not available in IDN under
    >> any circumstances.
    >
    >
    > If that is the case then surely the real characters must be allowed.


