IDNA2008 Contextual rules clarification

From: Chigurupati, Nagesh
Date: Fri Oct 29 2010 - 15:42:12 CDT

    I have a question regarding some of the contextual rules in RFC 5892. For
    example, the contextual rule in Appendix A.4, Greek Lower Numeral Sign
    (U+0375), states the following:

    If Script(After(cp)) .eq. Greek Then True;

    If the Greek Lower Numeral Sign (U+0375) is the last code point in the
    IDN, should it be allowed? RFC 5892 also contains these statements:

    Before(FirstChar) evaluates to Undefined.
    After(LastChar) evaluates to Undefined.

    Can I assume that "Undefined" is not equal to "Greek", and therefore
    input sequences with a trailing Greek Lower Numeral Sign are always
    disallowed by the specification?

    The Hebrew Punctuation Geresh (U+05F3), Hebrew Punctuation Gershayim
    (U+05F4), etc. pose a similar question. The rule for these code points
    states the following:

    If Script(Before(cp)) .eq. Hebrew Then True;

    So, if the first code point is U+05F3, should it be disallowed, since
    there is no preceding code point whose script could be checked against
    Hebrew?
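
    To make the reading in question concrete, here is a minimal Python
    sketch of it. It assumes that "Undefined" compares unequal to every
    script name, which is exactly the interpretation being asked about and
    not something the RFC is confirmed here to say. The helper names are
    hypothetical, and the script lookup is a crude guess based on character
    names, not the real Unicode Script property.

```python
import unicodedata

# Stand-in for the RFC's "Undefined" value (an assumption of this sketch).
UNDEFINED = None

def script_of(ch):
    """Crude script guess from the character name; NOT the real Script property."""
    if ch is UNDEFINED:
        return UNDEFINED
    name = unicodedata.name(ch, "")
    if name.startswith("GREEK"):
        return "Greek"
    if name.startswith("HEBREW"):
        return "Hebrew"
    return "Other"

def before(label, i):
    """Code point preceding position i, or UNDEFINED at the first position."""
    return label[i - 1] if i > 0 else UNDEFINED

def after(label, i):
    """Code point following position i, or UNDEFINED at the last position."""
    return label[i + 1] if i + 1 < len(label) else UNDEFINED

def greek_lower_numeral_sign_ok(label, i):
    # If Script(After(cp)) .eq. Greek Then True; otherwise the rule does not hold.
    # Assumption: UNDEFINED is simply "not equal to Greek".
    return script_of(after(label, i)) == "Greek"

def hebrew_geresh_ok(label, i):
    # If Script(Before(cp)) .eq. Hebrew Then True; otherwise the rule does not hold.
    # Assumption: UNDEFINED is simply "not equal to Hebrew".
    return script_of(before(label, i)) == "Hebrew"
```

    Under this assumption, a trailing U+0375 and a leading U+05F3 both fail
    their rules, because there is no neighbouring code point to supply a
    script.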

    Nagesh Chigurupati
