Re: How-To handle i18n when you don't know charset?

From: John O'Conner (john.oconner@eng.sun.com)
Date: Thu Jul 06 2000 - 14:35:45 EDT


> Should I require more information from the UI or is there a way to scan
> the data that I am currently being passed and determine the appropriate
> charset?

If the UI insists on passing non-Unicode text, you must insist on more
information: a tag, perhaps, that identifies the charset encoding.
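
As a rough sketch of what such a tag buys you, assume the UI hands over the
raw bytes together with a charset name such as "ISO-8859-1". The class and
method names below (TaggedTextDecoder, decode) are hypothetical, and the
java.nio.charset classes used come from JDK releases later than the ones
current when this was written. Once the tag is known, the bytes can be
turned into a Java String and re-encoded as UTF-8 without guessing:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decodes bytes using a charset tag supplied by the caller.
final class TaggedTextDecoder {

    // Charset.forName throws an unchecked exception if the tag is not a
    // legal or supported charset name, so a bad tag fails loudly.
    static String decode(byte[] raw, String charsetTag) {
        Charset charset = Charset.forName(charsetTag);
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // "cafe" with an e-acute, encoded in ISO-8859-1: one byte per character.
        byte[] latin1 = {0x63, 0x61, 0x66, (byte) 0xE9};

        String text = decode(latin1, "ISO-8859-1");
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);

        // The e-acute needs two bytes in UTF-8, so this prints:
        // 4 chars, 5 UTF-8 bytes
        System.out.println(text.length() + " chars, " + utf8.length + " UTF-8 bytes");
    }
}
```

Without the tag, the same four bytes could just as well be read as
ISO-8859-2 or windows-1252, and nothing in the data says which is right.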

> Since I'm in a Java environment, isn't there a way to go to UTF-8 and
> from UTF-8 determine the corresponding ISO (and other) charset?

One can do some statistical analysis on the code values and perhaps
determine the language and then the charset encoding. However, this seems
unreasonably difficult. If you are in a Java environment, why not pass UTF-8
or UTF-16 values around everywhere?
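
To illustrate how little the bytes alone can tell you, here is a minimal
sketch of about the only check that is cheap and reliable: whether the data
is well-formed UTF-8. It is not a charset detector. The class and method
names (Utf8Check, isValidUtf8) are hypothetical, and the java.nio.charset
API it relies on comes from later JDK releases. A strict decoder rejects
malformed input instead of silently substituting replacement characters:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reports whether a byte array is well-formed UTF-8.
final class Utf8Check {

    static boolean isValidUtf8(byte[] raw) {
        // REPORT makes the decoder throw on bad input rather than replace it.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(raw));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] utf8 = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);

        System.out.println(isValidUtf8(utf8));   // true
        System.out.println(isValidUtf8(latin1)); // false: 0xE9 starts a sequence that never completes
    }
}
```

A "false" here only says the data is not UTF-8. It does not say which
single-byte charset it is, since every byte sequence decodes without error
as ISO-8859-1. That is the part that needs either a tag or the statistical
guesswork mentioned above.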

-- John O'Conner


