RE: How to remove accents while conforming to language standards?

From: Erkki I Kolehmainen <eik_at_iki.fi>
Date: Mon, 4 Nov 2013 22:38:29 +0200

Mind you, well over ten years ago there was an EU-funded project in CEN/TC304, European Localization Requirements, to specify such fallbacks for use in Europe. Even though there was conceivably more need at that time to accommodate antiquated systems, the project had to be terminated because no reasonable consensus could be reached.

Sincerely, Erkki I. Kolehmainen

-----Original Message-----
From: unicode-bounce_at_unicode.org [mailto:unicode-bounce_at_unicode.org] On Behalf Of Jukka K. Korpela
Sent: 4 November 2013 21:53
To: unicode_at_unicode.org
Subject: Re: How to remove accents while conforming to language standards?

2013-11-04 21:00, Jennifer Wong wrote:

> The use case is that customers want to integrate data from our
> enterprise solution to their ASCII-based downstream systems.

This is very different from the question about removing accents while
conforming to language standards. The very goal makes it impossible to
conform to language standards. The next question should be what the data
will be used for, and how.

> Thus all accents need to be removed.

I would not jump to that conclusion. Just because some system is
ASCII-based does not mean that you cannot in any way handle non-ASCII
data. You can encode non-ASCII characters in ASCII in many ways. To take
a trivial example, you could convert È to E` and later possibly convert
it back, though in such approaches you need to be careful to make the
conversion reversible (if it needs to be). In some cases, out-of-band
information could be included, e.g. entering a name in a simplified form
in ASCII but accompanied by a note (in ASCII) describing accents that
have been omitted.
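
As an illustration of the reversible approach described above, here is a minimal Python sketch. It is only one possible scheme, not an established standard: the marker table MARKS, the backslash escape, and the function names encode and decode are all assumptions made for this example. The idea is to decompose the text (NFD), write each supported combining mark as an ASCII marker after its base letter, and escape any marker character that occurs literally in the data so that decoding stays unambiguous.

```python
import unicodedata

# Hypothetical marker table (an assumption for this sketch):
# combining mark -> ASCII stand-in written after the base letter.
MARKS = {
    "\u0300": "`",   # combining grave accent
    "\u0301": "'",   # combining acute accent
    "\u0302": "^",   # combining circumflex accent
    "\u0308": '"',   # combining diaeresis
}
REVERSE = {marker: mark for mark, marker in MARKS.items()}
ESC = "\\"  # escape character for literal markers and literal backslashes


def encode(text):
    """Encode text as ASCII; raise ValueError for characters not covered."""
    out = []
    for ch in unicodedata.normalize("NFD", text):
        if ch in MARKS:
            out.append(MARKS[ch])
        elif ch == ESC or ch in REVERSE:
            out.append(ESC + ch)  # protect a literal marker or backslash
        elif ord(ch) < 128:
            out.append(ch)
        else:
            raise ValueError("no ASCII encoding defined for %r" % ch)
    return "".join(out)


def decode(ascii_text):
    """Invert encode(); the result is returned in NFC form."""
    out = []
    i = 0
    while i < len(ascii_text):
        ch = ascii_text[i]
        if ch == ESC:
            out.append(ascii_text[i + 1])  # take the escaped character literally
            i += 2
        elif ch in REVERSE:
            out.append(REVERSE[ch])        # marker -> combining mark
            i += 1
        else:
            out.append(ch)
            i += 1
    return unicodedata.normalize("NFC", "".join(out))


print(encode("È"))          # E`
print(decode(encode("È")))  # È
```

The round trip is exact only up to Unicode normalization: decode(encode(s)) gives the NFC form of s. Using the apostrophe as a marker means every ordinary apostrophe in the data gets escaped, which a real scheme might want to avoid; characters outside the table are rejected here rather than silently dropped.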

Even if it is acceptable to do lossy mappings (like just dropping all
accents, or mapping, say, Ä to AE without worrying about possible AE in
original data), the crucial question is how the data will be used, now
and in the future.
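
The two lossy mappings mentioned above can be sketched in Python as follows. The function names strip_accents and transliterate, and the small German-style table, are assumptions for illustration only; they do not represent any agreed fallback. The sketch shows why such mappings cannot be undone: distinct inputs collapse to the same ASCII output.

```python
import unicodedata


def strip_accents(text):
    """Drop all combining marks after canonical decomposition (lossy)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Hypothetical table for this sketch: umlaut letters -> two-letter fallbacks.
FALLBACKS = str.maketrans({
    "Ä": "AE", "Ö": "OE", "Ü": "UE",
    "ä": "ae", "ö": "oe", "ü": "ue",
})


def transliterate(text):
    """Map umlaut letters to digraphs (lossy); input is normalized to NFC first."""
    return unicodedata.normalize("NFC", text).translate(FALLBACKS)


print(strip_accents("Pääkkönen"))  # Paakkonen
print(transliterate("Müller"))     # Mueller
print(transliterate("Mueller"))    # Mueller
```

After the second mapping, "Müller" and an original "Mueller" can no longer be told apart, which is exactly the "possible AE in original data" problem. Note also that strip_accents does not guarantee ASCII output: letters such as ø or ß have no canonical decomposition and pass through unchanged.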

Received on Mon Nov 04 2013 - 14:40:09 CST