Re: Hebrew script in IDN (was Exemplar Characters)

From: Neil Harris (neil@tonal.clara.co.uk)
Date: Thu Nov 17 2005 - 14:05:53 CST


    JR wrote:
    >
    >> -----Original Message-----
    >> From: unicode-bounce@unicode.org
    >> [mailto:unicode-bounce@unicode.org] On Behalf Of Michael Everson
    >> Sent: Thursday, November 17, 2005 7:12 PM
    >> To: Unicode Discussion
    >> Subject: RE: Hebrew script in IDN (was Exemplar Characters)
    >>
    >>
    >> At 17:42 +0100 2005-11-17, Cary Karp wrote:
    >>
    >>>> "These punctuation marks may not be available in all fonts (and
    >>>> legacy encodings), so an implementation should be prepared to
    >>>> degrade gracefully. U+0027 APOSTROPHE for GERESH and U+0022
    >>>> QUOTATION MARK for GERSHAYIM are acceptable fallbacks."
    >>>>
    >>> The problem is that these fallbacks are not available in IDN under
    >>> any circumstances.
    >>>
    >> If that is the case then surely the real characters must be allowed.
    >>
    >
    > They are not included in the Hebrew keyboard.
    >
    > Jony

    Then Hebrew-language users are going to have great difficulty in typing
    URLs containing IDNs that contain GERESH or GERSHAYIM, unless either
    their input method can translate the characters automatically, or their
    user-agents have special language-context-sensitive URL parsing code
    which allows the apostrophe and double-quote characters in the context
    of Hebrew letters, and then converts them to the correct characters in
    the IDN. The latter is unlikely, as it would break numerous RFCs.

    Are there unambiguous transformation rules that could be used to
    transform input using the substitute characters to output using the
    correct characters, or will there always be an ambiguity with other uses
    of these characters?
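
    To make the question concrete, here is a minimal sketch of one possible
    rule set, not a proposal for any standard. It assumes that an apostrophe
    directly after a Hebrew letter (U+05D0..U+05EA) means GERESH (U+05F3),
    and that a double quote directly between two Hebrew letters means
    GERSHAYIM (U+05F4). The function name and both rules are hypothetical.

```python
import re

# Assumption: only the basic Hebrew letters U+05D0..U+05EA count as context.
HEBREW_LETTER = "\u05D0-\u05EA"
GERESH = "\u05F3"      # HEBREW PUNCTUATION GERESH
GERSHAYIM = "\u05F4"   # HEBREW PUNCTUATION GERSHAYIM

# Hypothetical rule 1: an apostrophe immediately after a Hebrew letter.
_APOSTROPHE_AFTER_HEBREW = re.compile(f"(?<=[{HEBREW_LETTER}])'")
# Hypothetical rule 2: a double quote with a Hebrew letter on both sides.
_QUOTE_BETWEEN_HEBREW = re.compile(
    f'(?<=[{HEBREW_LETTER}])"(?=[{HEBREW_LETTER}])'
)


def substitute_fallbacks(label: str) -> str:
    """Replace ASCII fallbacks with GERESH/GERSHAYIM in Hebrew context."""
    label = _QUOTE_BETWEEN_HEBREW.sub(GERSHAYIM, label)
    return _APOSTROPHE_AFTER_HEBREW.sub(GERESH, label)
```

    These rules leave Latin text such as "it's" alone, but they are not
    unambiguous in running text: a closing single quote after a Hebrew word
    would be turned into GERESH. Inside a single domain label, where quoting
    does not occur, the ambiguity may be smaller.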

    -- Neil


