RE: unicode on Linux

From: Francois Yergeau (FYergeau@alis.com)
Date: Wed Oct 29 2003 - 09:59:46 CST

Next message: Doug Ewell: "Re: osmanya script"
Previous message: Jim Allan: "RE: Merging combining classes, was: New contribution N2676"
Maybe in reply to: Shao, Yiying: "unicode on Linux"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]
Mail actions: [ respond to this message ] [ mail a new topic ]

Philippe Verdy wrote:
> The idea that "if a text (without BOM) looks like valid UTF-8, then it is
> UTF-8; else it uses another legacy encoding" does not work in practice
> and also leads to too many false positives.

Can you point to actual data or cases? I don't mean theoretical ones; I can
make up my own.
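
For reference, the heuristic under discussion can be sketched roughly as below. This is only an illustration, not code from anyone in this thread: the function name guess_encoding and the ISO-8859-1 fallback are assumptions made for the example. The last call shows one of the made-up, theoretical false positives: two Latin-1 bytes that also happen to form a valid UTF-8 sequence.

```python
def guess_encoding(data: bytes, fallback: str = "iso-8859-1") -> str:
    """Hypothetical helper: report 'utf-8' if the bytes decode as strict
    UTF-8, otherwise report the assumed legacy fallback encoding."""
    try:
        # Strict decoding rejects stray lead bytes, bad continuation
        # bytes and truncated sequences.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return fallback
    return "utf-8"


# Genuine UTF-8 text is recognised as such.
print(guess_encoding("François".encode("utf-8")))

# The same word in Latin-1 contains a lone 0xE7 byte, which is not
# valid UTF-8, so the fallback is reported.
print(guess_encoding("François".encode("latin-1")))

# Made-up false positive: the Latin-1 text "Ã©" is the byte pair
# C3 A9, which is also the valid UTF-8 encoding of "é".
print(guess_encoding("Ã©".encode("latin-1")))
```

Note that pure ASCII input always passes the check, since ASCII is a subset of UTF-8. How often real legacy text produces the third kind of result is exactly the question of actual data raised above.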

> Some problems do exist however, with the relaxed rules for UTF-8 as it
> was defined in the IESG RFC.

Errr, relaxed? Care to elaborate? Are you referring to RFC 2279?

> These old texts (that are valid for this old
> version of the UTF-8 encoding) still exist now

What's particular about these old texts?

-- 
François


This archive was generated by hypermail 2.1.5 : Thu Jan 18 2007 - 15:54:25 CST