Re: Specification for XID_Start and XID_Continue

From: Mike (mike-list@pobox.com)
Date: Tue Aug 14 2007 - 13:15:54 CDT


    I ran into the same problem and never found a precise
    answer. In my code, I ended up hard-coding the exceptions:

    // XID_Start check: reject a fixed list of code points, then defer
    // to the ID_Start check for everything else.
    inline bool Char::IsXidStart () const
    {
         switch (u_Char_) // u_Char_ holds the code point of the Char
         {
             // Rejected here no matter what IsIdStart() would return.
             case 0x037A: case 0x0E33: case 0x0EB3: case 0x309B: case 0x309C:
             case 0xFC5E: case 0xFC5F: case 0xFC60: case 0xFC61: case 0xFC62:
             case 0xFC63: case 0xFDFA: case 0xFDFB: case 0xFE70: case 0xFE72:
             case 0xFE74: case 0xFE76: case 0xFE78: case 0xFE7A: case 0xFE7C:
             case 0xFE7E: case 0xFF9E: case 0xFF9F:
             {
                 return false;
             }

             default:
                 break;
         }

         return IsIdStart();
    }

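    // XID_Continue check: one code point is accepted outright, a fixed
    // list is rejected, and everything else defers to IsIdContinue().
    inline bool Char::IsXidContinue () const
    {
         switch (u_Char_)
         {
             // Accepted without consulting IsIdContinue().
             case 0xB7:
             {
                 return true;
             }

             // Rejected here no matter what IsIdContinue() would return.
             case 0x037A: case 0x309B: case 0x309C: case 0xFC5E: case 0xFC5F:
             case 0xFC60: case 0xFC61: case 0xFC62: case 0xFC63: case 0xFDFA:
             case 0xFDFB: case 0xFE70: case 0xFE72: case 0xFE74: case 0xFE76:
             case 0xFE78: case 0xFE7A: case 0xFE7C: case 0xFE7E:
             {
                 return false;
             }

             default:
                 break;
         }

         return IsIdContinue();
    }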
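
    The same exception lists could also be kept as sorted tables
    rather than switch statements. What follows is only a minimal
    sketch of that idea, assuming the base ID_Start / ID_Continue
    check is passed in as a predicate. The names kNotXidStart,
    kNotXidContinue, IsInTable, AlwaysTrue and AlwaysFalse are
    hypothetical and not part of the code above; the tables simply
    copy the case labels from the two functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>

// Code points rejected by IsXidStart above, copied from its case labels.
// Must stay sorted: the lookup below uses binary search.
constexpr char32_t kNotXidStart[] = {
    0x037A, 0x0E33, 0x0EB3, 0x309B, 0x309C,
    0xFC5E, 0xFC5F, 0xFC60, 0xFC61, 0xFC62,
    0xFC63, 0xFDFA, 0xFDFB, 0xFE70, 0xFE72,
    0xFE74, 0xFE76, 0xFE78, 0xFE7A, 0xFE7C,
    0xFE7E, 0xFF9E, 0xFF9F,
};

// Code points rejected by IsXidContinue above (also sorted).
constexpr char32_t kNotXidContinue[] = {
    0x037A, 0x309B, 0x309C, 0xFC5E, 0xFC5F,
    0xFC60, 0xFC61, 0xFC62, 0xFC63, 0xFDFA,
    0xFDFB, 0xFE70, 0xFE72, 0xFE74, 0xFE76,
    0xFE78, 0xFE7A, 0xFE7C, 0xFE7E,
};

// Returns true if cp appears in the sorted table.
template <std::size_t N>
bool IsInTable(const char32_t (&table)[N], char32_t cp)
{
    // Debug-only check of the binary_search precondition.
    assert(std::is_sorted(std::begin(table), std::end(table)));
    return std::binary_search(std::begin(table), std::end(table), cp);
}

// isIdStart is an assumed stand-in for Char::IsIdStart().
template <typename Pred>
bool IsXidStart(char32_t cp, Pred isIdStart)
{
    if (IsInTable(kNotXidStart, cp))
        return false;
    return isIdStart(cp);
}

// isIdContinue is an assumed stand-in for Char::IsIdContinue().
template <typename Pred>
bool IsXidContinue(char32_t cp, Pred isIdContinue)
{
    if (cp == 0xB7) // accepted unconditionally, as in the switch above
        return true;
    if (IsInTable(kNotXidContinue, cp))
        return false;
    return isIdContinue(cp);
}

// Hypothetical predicates for trying the sketch without real Unicode data.
inline bool AlwaysTrue(char32_t) { return true; }
inline bool AlwaysFalse(char32_t) { return false; }
```

    With real data, the predicate would be whatever implements the
    ID_Start or ID_Continue lookup; AlwaysTrue and AlwaysFalse are
    there only to exercise the exception tables in isolation.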

    Mike

    Martin v. Löwis wrote:
    > I'm trying to locate the precise specification for the
    > XID_Start and XID_Continue properties. According to
    >
    > http://unicode.org/Public/UNIDATA/UCD.html
    >
    > they are derived properties, so there should be an
    > algorithm somewhere describing how they are computed
    > (given other properties). The UCD says that the
    > specification is in UAX#31, which says I should
    > read
    >
    > http://unicode.org/reports/tr31/#NFKC_Modifications
    >
    > However, looking at 5.1, I cannot find a precise
    > specification of these properties. For example,
    > 5.1.2 says "Certain characters...", but does not
    > seem to provide a complete list of such characters.
    > It ends with "In particular, the following four
    > characters...". Again, that reads like an example -
    > is it meant as a complete specification?
    >
    > Likewise, 5.1.3 talks about "certain Arabic presentation
    > forms", without giving a complete list which precisely
    > are excluded from XID_Start and XID_Continue.
    >
    > Any insights appreciated,
    >
    > Martin


