Re: Medievalist ligature character in the PUA

From: Doug Ewell (doug@ewellic.org)
Date: Fri Dec 25 2009 - 13:29:24 CST

    On Mon Dec 14 2009 14:26:16 CST, Julian Bradfield <jcb plus unicode at
    inf dot ed dot ac dot uk> wrote:

    > I'm sure someone can come up with an example of two utf-8 canonically
    > equivalent strings that both make (different) sense in some other
    > encoding.

    For perhaps the wrong reason, this reminded me of:

    NESTLÉ®

    my canonical example of a plausible Latin-1 string that could be
    interpreted (wrongly, of course) as UTF-8. The last two characters are
    U+00C9 U+00AE, and the corresponding Latin-1 byte values 0xC9 0xAE are
    UTF-8 for ɮ U+026E LATIN SMALL LETTER LEZH.
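
    The misreading described above can be checked with a minimal Python
    sketch. The variable names are illustrative only and are not part of
    the original message; the sketch assumes nothing beyond the standard
    library.

```python
import unicodedata

# Hypothetical names, chosen for illustration only.
text = "NESTL\u00c9\u00ae"               # NESTLÉ® (ends in U+00C9 U+00AE)
latin1_bytes = text.encode("latin-1")    # last two bytes are 0xC9 0xAE

# Wrongly treating the Latin-1 bytes as UTF-8 still decodes without error,
# because 0xC9 0xAE happens to be a well-formed two-byte UTF-8 sequence.
misread = latin1_bytes.decode("utf-8")

# Manual decode of the two-byte sequence:
# 110xxxxx 10yyyyyy -> code point = (xxxxx << 6) | yyyyyy
code_point = ((0xC9 & 0x1F) << 6) | (0xAE & 0x3F)

print(repr(misread))
print(f"U+{code_point:04X}", unicodedata.name(misread[-1]))
```

    The ASCII letters N, E, S, T, L are single bytes in both encodings, so
    only the final two bytes change meaning.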

    I probably need a new canonical example, because this one isn't wholly
    realistic; Nestlé doesn't appear to be a registered trademark (the legal
    name appears to be Nestlé S.A.) and the name is not generally spelled
    with all-caps.

    --
    Doug Ewell  |  Thornton, Colorado, USA  |  http://www.ewellic.org
    RFC 5645, 4645, UTN #14  |  ietf-languages @ http://is.gd/2kf0s
    

