Re: Normalization in panlingual application

From: John D. Burger (john@mitre.org)
Date: Thu Sep 20 2007 - 12:55:31 CDT

    Asmus Freytag wrote:

    > IDN still operates on a restricted domain of characters, many
    > characters that are part of ordinary text are disallowed from the
    > get-go (I haven't checked where that subset is at recently, but
    > that's the general idea). At the minimum, the transformations that
    > are designed into IDN would need to be modified or extended to
    > handle such characters. Because of that alone, the normalization
    > and folding aspect of IDN is unlikely to be suitable for general
    > text. There are likely additional issues.
    >
    > If you suggest that any scheme in which you can't represent the
    > word "can't" is suitable for the class of applications that the
    > original poster represents, then I fail to follow you.

    But that's due to IDN's restricted domain, yes? I guess my thought
    was that, if the transformations from IDN could be applied to a
    larger domain of characters, then IDN might provide a fifth
    normalization form appropriate for a broad class of applications.
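
    To make the idea concrete, here is a minimal sketch in Python of
    what such a folding might look like on unrestricted text. It is
    only an approximation of the mapping steps (compatibility
    normalization plus case folding), not IDN's actual Nameprep
    profile, and it deliberately skips the prohibited-character checks
    that make up the restricted domain. The function name
    idn_style_fold is hypothetical.

```python
import unicodedata


def idn_style_fold(text: str) -> str:
    """Approximate an IDN-like folding on arbitrary text.

    Assumption: NFKC normalization plus Unicode case folding stands in
    for IDN's mapping steps. No characters are rejected, so this is not
    a conforming Nameprep implementation.
    """
    # Compatibility-normalize first so fullwidth forms, ligatures, etc.
    # collapse to their ordinary equivalents.
    normalized = unicodedata.normalize("NFKC", text)
    # Case-fold, then normalize again because folding can produce
    # sequences that are no longer in NFKC.
    return unicodedata.normalize("NFKC", normalized.casefold())


if __name__ == "__main__":
    for sample in ["Can't", "Straße", "\uff26\uff55\uff4c\uff4c"]:
        print(repr(sample), "->", repr(idn_style_fold(sample)))
```

    Because nothing is rejected, the apostrophe in "can't" passes
    through unchanged here; whether the remaining mappings are
    appropriate for general text is exactly the open question.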

    But my hope for a cookie-cutter solution appears to be forlorn. :)

    - John D. Burger
       MITRE


