Re: Removing accents and diacritics from a word from Asmus Freytag (c) via Unicode on 2019-07-17 (Unicode Mail List Archive)

From: Asmus Freytag (c) via Unicode <unicode_at_unicode.org>
Date: Wed, 17 Jul 2019 16:55:03 -0700

On 7/17/2019 11:37 AM, Tex wrote:
>
> Asmus, are you including the case where an accented character maps to
> two unaccented characters?
>
> e.g. Å to AA or Ä to AE
>
If that's covered by the same term; but it's not a simple
"typewriter/telegraph" fallback.
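
To make the distinction concrete, here is a minimal Python sketch, not a definitive implementation. It assumes that "removing accents" means decomposing to NFD and dropping combining marks, and that the two-letter cases (Å to AA, Ä to AE) are handled by a separate lookup table. The function names and the table contents are hypothetical, chosen only for illustration; which multi-letter fallbacks are appropriate depends on the language and convention.

```python
import unicodedata

# Hypothetical table of two-letter fallbacks (illustration only).
# Real conventions differ by language, so treat these entries as assumptions.
MULTI_LETTER_FALLBACK = {"Å": "AA", "å": "aa", "Ä": "AE", "ä": "ae"}


def strip_diacritics(text: str) -> str:
    """Simple fallback: decompose to NFD, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def fold_with_fallback(text: str, table=MULTI_LETTER_FALLBACK) -> str:
    """Apply the multi-letter table first, then strip remaining marks."""
    # Compose first so that each accented letter is a single code point
    # that can be looked up in the table.
    composed = unicodedata.normalize("NFC", text)
    replaced = "".join(table.get(ch, ch) for ch in composed)
    return strip_diacritics(replaced)


print(strip_diacritics("São Tomé"))   # Sao Tome
print(fold_with_fallback("Århus"))    # AArhus
```

The first function is purely mechanical; the second needs a table that encodes a spelling convention, which is why the two cases may not belong under the same term.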

>
> From: Unicode [mailto:unicode-bounces_at_unicode.org] On Behalf Of
> Asmus Freytag (c) via Unicode
> Sent: Wednesday, July 17, 2019 11:07 AM
> To: Norbert Lindenberg
> Cc: Unicode Mailing List
> Subject: Re: Removing accents and diacritics from a word
>
> On 7/17/2019 11:02 AM, Norbert Lindenberg wrote:
>
> “Misspelling”?
>
> Not helpful. Anybody have a serious suggestion?
>
> A./
>
> On Jul 17, 2019, at 10:37, Asmus Freytag via Unicode <unicode_at_unicode.org> wrote:
>
> A question has come up in another context:
>
> Is there any linguistic term for describing the process of removing accents and diacritics from a word to create its “base form”, e.g. São Tomé to Sao Tome?
>
> The term "string normalization" appears preferable in a computing context.
>
> Any ideas?
>
> A./
>
Received on Wed Jul 17 2019 - 18:55:15 CDT