Re: unicode entities, "beginner" questions...

From: suzume@mx82.tiki.ne.jp
Date: Sun Mar 13 2005 - 18:37:24 CST


    On 2005/03/14, at 0:57, Philippe VERDY wrote:

    > I understand your frustration, but most of these problems come from
    > the need, in application programming interfaces, to remain compatible
    > with legacy interfaces.
    > For example, I manage a set of translations for a Java app as sets
    > of .properties files. Unfortunately, the Java API for handling
    > resource bundles still does not know (even in Java 1.5) how to
    > recognize UTF-8 encoded files (even if we include a leading BOM), so
    > the Java resource bundle loader will only process files using the
    > legacy ISO-8859-1 or US-ASCII character sets. Any other Unicode
    > character must be encoded with so-called "Unicode escapes" (of the
    > form "\uXXXX", where XXXX is a hex-encoded UTF-16 code unit).

    You may be interested to know that OmegaT handles .properties files, so
    translators have access to the text without you having to prepare it
    for them.

    Take a look at: www.omegat.org if you are interested.

    JC Helary

    > This is frustrating, and when managing translations it is nearly
    > impossible to find translators who have the technical knowledge
    > required to work with this format. For this reason, I give the
    > translators templates written as UTF-8 encoded files (with a leading
    > BOM), which are much more user-friendly. I let them work on this
    > version, and I use the UTF-8 file as the reference file for all
    > translations.
    > The actual .properties files are generated automatically by a
    > home-made validation tool that checks the overall format, checks for
    > duplicate resource keys, reorders the keys, makes them properly
    > delimited with no extra spaces, and checks punctuation and the
    > presence of variable place-holders. It also creates the actual
    > .properties file in a way similar to the Java JDK tool "native2ascii
    > -encoding UTF-8" (except that my tool converts from UTF-8 to
    > ISO-8859-1, leaving all ISO-8859-1 characters unescaped, including
    > those that are not US-ASCII). The tool also works with an internal
    > CVS-based history tool, and it can generate comments to help
    > translators; these comments are preserved in the UTF-8 reference
    > source file but filtered out of the generated final .properties
    > files, where all comments and blank lines are removed.
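
    A rough sketch of just the escaping step described above (not the
    actual tool: the file names are invented, and the key reordering,
    duplicate checks and comment filtering are left out). It copies
    ISO-8859-1 characters through untouched, escapes everything else, and
    skips the leading BOM of the UTF-8 template:

        import java.io.*;

        public class EscapeNonLatin1 {
            public static void main(String[] args) throws IOException {
                try (Reader in = new InputStreamReader(
                         new FileInputStream("messages.utf8.properties"), "UTF-8");
                     Writer out = new OutputStreamWriter(
                         new FileOutputStream("messages.properties"), "ISO-8859-1")) {
                    int c = in.read();
                    if (c == 0xFEFF) {
                        c = in.read();          // skip the leading BOM
                    }
                    for (; c != -1; c = in.read()) {
                        if (c <= 0xFF) {
                            out.write(c);       // ISO-8859-1 characters stay as-is
                        } else {
                            // escape each UTF-16 code unit, as native2ascii does
                            out.write(String.format("\\u%04X", c));
                        }
                    }
                }
            }
        }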

    Thank you, Jukka and Philippe, for your answers. I think I got what I
    was looking for.

    Sincerely,

    Jean-Christophe


