Re: Specification for XID_Start and XID_Continue

From: Mark Davis
Date: Wed Aug 15 2007 - 11:02:18 CDT

    The reason middle dot wasn't mentioned was that the UTC has decided to add
    it to ID_Continue in Unicode 5.1; see the proposed update. (Middle dot was
    handled specially: instead of removing the character in step #1, the
    character causing a problem in its decomposition was added.)

    The differences can be seen by comparing the sets [:id_continue:] and
    [:xid_continue:].

    I think it would be useful to add a more detailed description of the
    derivation; I'll propose that to the editorial committee.


    On 8/15/07, "Martin v. Löwis" wrote:
    > > I glean this as the algorithm:
    > >
    > > Add middle dot to ID_CONTINUE
    > >
    > > If an ID_START or ID_CONTINUE character has a decomposition containing a
    > > character other than middle dot that's not in ID_CONTINUE, then remove
    > > that character from ID_START or ID_CONTINUE.
    > >
    > > If an ID_START has a decomposition that begins with a character that's
    > > not an ID_START, remove it from ID_START.
    > Thanks, this is exactly what I was looking for - at least for Unicode
    > 4.1, this algorithm produces an outcome equal to the published tables.
    > Could that be added to UAX#31?
    > Regards,
    > Martin
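
    The three quoted steps can be sketched in Python as below. This is only
    an illustration under stated assumptions, not the published derivation:
    "decomposition" is taken to mean the full compatibility decomposition
    (NFKD), the names nfkd and derive_xid are hypothetical, and the input
    sets are toy data rather than the real ID_Start and ID_Continue tables.

```python
import unicodedata

MIDDLE_DOT = "\u00b7"


def nfkd(ch):
    # Assumption: "decomposition" in the quoted steps means NFKD.
    return unicodedata.normalize("NFKD", ch)


def derive_xid(id_start, id_continue, decompose=nfkd):
    """Hypothetical helper: apply the three quoted steps to the given sets."""
    # Step 1: add middle dot to ID_Continue.
    cont = set(id_continue) | {MIDDLE_DOT}
    start = set(id_start)

    def closed(ch):
        # Step 2 test: every character of the decomposition is in
        # ID_Continue (middle dot passes because it was added above).
        return all(c in cont for c in decompose(ch))

    xid_continue = {ch for ch in cont if closed(ch)}
    # Step 3: the decomposition must also begin with an ID_Start character.
    xid_start = {ch for ch in start
                 if closed(ch) and decompose(ch)[0] in start}
    return xid_start, xid_continue


# Toy input, not the real property tables.
toy_start = {"a", "\u037a", "\u0e32", "\u0e33"}
toy_continue = toy_start | {"0", "\u0345", "\u0e4d"}
xid_start, xid_continue = derive_xid(toy_start, toy_continue)
```

    In this toy run, U+037A drops out of both sets because its compatibility
    decomposition contains a space, while U+0E33 stays in the continue set
    but leaves the start set because its decomposition begins with the
    combining mark U+0E4D.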

