From: Kenneth Whistler (
Date: Mon Feb 23 2009 - 13:41:49 CST

    > theoretically your
    > implementation wouldn't be conformant to UCA for the

    recte: Unicode Normalization Forms

    I spend so much time thinking about the UCA that my fingers
    seem disconnected from my brain sometimes when typing
    TLAs. ;-)

    > million combining character sequence, but realistically,
    > who would care?

    Also, one should note that the Stream-Safe Text Format
    was added to UAX #15 precisely because of worries about
    unbounded sequences of non-starters and their impact
    on the ability to normalize correctly in protocols that
    may use streaming text.

    An implementation concerned about warding off worst-case
    normalization behavior for potentially malicious sequences
    of non-starters in data could simply declare that it
    is using the Stream-Safe Text Process (see D8 in Section 21
    of UAX #15, and Conformance clause UAX15-C4). That automatically
    sets what I was calling the governor count to 30. Any
    non-starter sequence longer than 30 characters results in
    insertion of a CGJ (U+034F COMBINING GRAPHEME JOINER, itself a
    starter), thereby automatically bounding the searchback
    for canonical reordering.
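
    To make the bounding effect concrete, here is a minimal Python
    sketch of that kind of process. It is an illustration only, not
    the normative algorithm text from UAX #15 and not a claim about
    any particular implementation. The names stream_safe and
    MAX_NON_STARTERS are made up for this example; the counting is
    done over the NFKD decomposition of each character, using the
    standard unicodedata module.

```python
import unicodedata

CGJ = "\u034F"          # COMBINING GRAPHEME JOINER, combining class 0
MAX_NON_STARTERS = 30   # hypothetical name for the "governor count"


def stream_safe(text: str) -> str:
    """Insert CGJ so that no run of non-starters exceeds 30 (sketch)."""
    out = []
    count = 0  # non-starters seen since the last starter
    for ch in text:
        # Count over the compatibility decomposition of the character.
        decomp = unicodedata.normalize("NFKD", ch)
        classes = [unicodedata.combining(c) for c in decomp]

        # Number of non-starters at the front of the decomposition.
        leading = 0
        for ccc in classes:
            if ccc == 0:
                break
            leading += 1

        # Adding this character would exceed the limit: break the run.
        if count + leading > MAX_NON_STARTERS:
            out.append(CGJ)
            count = 0

        if leading == len(classes):
            # Decomposition is all non-starters: the run keeps growing.
            count += leading
        else:
            # A starter occurred: only trailing non-starters carry over.
            trailing = 0
            for ccc in reversed(classes):
                if ccc == 0:
                    break
                trailing += 1
            count = trailing

        out.append(ch)
    return "".join(out)
```

    With this sketch, a base letter followed by 40 copies of U+0301
    comes back with a single CGJ after the 30th accent, so a
    normalizer never has to look back across more than 30
    non-starters when reordering.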

