Re: Tailoring of normalization

From: Rick McGowan
Date: Tue Feb 04 2003 - 11:58:54 EST


    (Number one: Please don't CC me on this discussion. I'm on the list and I
    don't need two copies of every mail.)

    Paul Hoffman wrote:

    > If I read this issue correctly, this might have a *huge* effect on
    > the IETF protocols that do normalization. Before I alert the usual
    > suspects, could someone describe what the expected bounds of this
    > will be?

    I can't speak to the bounds, but I can say that this should not affect
    IETF protocols. The existing normalization forms will not change, and
    protocols that specify a particular normalization form should be fine.

    The fact is, when it comes to normalization, one size doesn't fit all
    purposes. Even with canonical decomposition (NFD), not everyone wants to
    decompose everything fully every time. Take, for instance, the Japanese
    hiragana "ga", "ba", and "pa" series. Formally, these decompose in NFD into
    a base kana followed by a combining voiced or semi-voiced sound mark, but
    for some purposes people may not want them decomposed. Being able to
    specify NFD plus some tailoring would be an advantage for such users.
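
    To make the hiragana case concrete, here is a rough Python sketch. It
    first shows what standard NFD does to "ga", then applies a tailored
    variant that leaves a chosen set of characters composed. The names
    tailored_nfd and KEEP_COMPOSED are hypothetical, invented for this
    illustration; no such tailoring is defined by the Unicode Standard, and
    this is only one way it might look.

```python
import unicodedata

# Hypothetical tailoring set (an assumption for illustration):
# hiragana GA (U+304C), BA (U+3070), PA (U+3071) stay precomposed.
KEEP_COMPOSED = frozenset("\u304c\u3070\u3071")


def tailored_nfd(text, keep=KEEP_COMPOSED):
    """Apply NFD to text, but leave characters in `keep` untouched.

    Runs of ordinary characters are normalized together so that
    canonical reordering still happens inside each run.
    """
    out = []
    run = []
    for ch in text:
        if ch in keep:
            # Flush the pending run through standard NFD, then copy
            # the protected character through unchanged.
            out.append(unicodedata.normalize("NFD", "".join(run)))
            run = []
            out.append(ch)
        else:
            run.append(ch)
    out.append(unicodedata.normalize("NFD", "".join(run)))
    return "".join(out)


# Standard NFD: GA becomes KA (U+304B) + combining voiced sound mark (U+3099).
plain = unicodedata.normalize("NFD", "\u304c")

# Tailored: GA is kept, while e-acute (U+00E9) still decomposes.
tailored = tailored_nfd("\u304c\u00e9")
```

    Note that this sketch only avoids decomposing characters that arrive
    precomposed. It does not recompose input that is already decomposed, and
    it does not reorder combining marks across a protected character, so a
    real tailoring would need a more careful definition.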

    I'm just explaining, not defending anything in particular.

