Re: unicode Digest V10 #106

From: Andrew Lipscomb (ewwa@chattanooga.net)
Date: Mon Jun 01 2009 - 09:01:59 CDT


    > This quote says that which code points are invalid depends on how
    > you read the standard; perhaps someone here can clarify :-):
    > http://en.wikipedia.org/wiki/UTF-8#Invalid_code_points
    >
    > In particular, it would be great to know if the range U+0080, ...,
    > U+009F is invalid.
    >
    > Hans Aberg

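    Those code points are valid, provided they are encoded properly
    (in UTF-8, as the two-byte sequences C2 80 through C2 9F). However,
    their appearance may indicate that an error occurred in processing,
    as the C1 controls are rare in real Unicode text (and, with the
    exception of U+0085, are discouraged in XML). They most often
    arise when Windows-1252 data is decoded as if it were ISO-8859-1
    (Latin-1): the bytes 0x80-0x9F, which Windows-1252 uses for
    printable characters such as curly quotes, then come out as C1
    controls.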

    In other words, not invalid, but suspicious.
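
    To make the distinction concrete, here is a minimal Python sketch.
    The helper name find_c1_controls and the sample string are my own
    illustrations, not anything from the standard; the codec names
    "cp1252" and "latin-1" are the ones Python's standard library uses
    for Windows-1252 and ISO-8859-1. It shows that the C1 code points
    round-trip through UTF-8 without complaint, and that decoding
    Windows-1252 bytes as Latin-1 is one way they end up in text.

```python
import re

# The C1 control range under discussion: U+0080 through U+009F.
C1_CONTROLS = re.compile("[\u0080-\u009f]")


def find_c1_controls(text):
    """Return (index, 'U+XXXX') pairs for each C1 control found in text.

    Hypothetical helper for illustration: it flags suspicious characters,
    it does not declare the text invalid.
    """
    return [(m.start(), "U+%04X" % ord(m.group()))
            for m in C1_CONTROLS.finditer(text)]


# 1. The code points themselves are valid: they encode to well-formed
#    two-byte UTF-8 sequences and decode back unchanged.
c1_pair = "\u0080\u009f"
utf8_bytes = c1_pair.encode("utf-8")      # bytes C2 80 C2 9F
round_tripped = utf8_bytes.decode("utf-8")

# 2. The usual way they sneak in: Windows-1252 bytes read as Latin-1.
original = "\u201cquoted\u201d"           # curly double quotes
raw = original.encode("cp1252")           # bytes 93 ... 94
misread = raw.decode("latin-1")           # quotes become U+0093, U+0094
suspicious = find_c1_controls(misread)

# 3. If that is what happened, reversing the wrong step recovers the text.
repaired = misread.encode("latin-1").decode("cp1252")
```

    A hit from a check like this is only a hint that a Windows-1252 /
    Latin-1 mix-up may have occurred; text that really does contain C1
    controls would be flagged too, so treat it as a warning rather than
    as a validity test.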

    ------------------------------------------------------via webmail----
    Andrew Lipscomb
    ewwa@chattanooga.net


