Re: supplementing and/or overriding decomposition property

From: Janusz S. Bień
Date: Fri Apr 08 2011 - 23:51:40 CDT


    On Wed, 06 Apr 2011 (Janusz S. Bień) wrote:

    > I need to provide the decomposition mappings for some PUA
    > characters. I would like to use the same format to override standard
    > compatibility decomposition (at the moment I would like just to block
    > the conversion of long s to the standard one).
    > Do you have any suggestion for a format to store and maintain such
    > data?

    For the archive:

    Jakub Wilk suggested the format of NormalizationCorrections.txt:

    # Interpretation of the fields:
    # Field 0: Unicode code point
    # Field 1: Original (erroneous) decomposition
    # Field 2: Corrected decomposition
    # Field 3: Version of Unicode for which the correction was
    # entered into UnicodeData.txt, in n.n.n format.
    # Comment: Indicates the Unicode Corrigendum which documents
    # the correction

    I intend to leave Field 1 empty and to use Field 3 for the character
    origin (e.g. MUFI) and Field 4 for the character name and possibly
    other comments.
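
    A minimal sketch of how a file in this adapted layout could be read
    and applied, assuming semicolon-separated fields as in
    NormalizationCorrections.txt. The sample records are invented for
    illustration: the PUA code point E000, its decomposition, and the
    origin label EXAMPLE are hypothetical, and mapping long s to itself
    is only one possible way to express "block the conversion". The
    function names parse_overrides and decompose are likewise
    assumptions, not part of any existing tool.

```python
import unicodedata

# Hypothetical sample data: Field 1 is left empty, Field 3 holds the
# origin, Field 4 the name and comments (layout described above).
SAMPLE = """\
# code point; (unused); decomposition; origin; name / comments
E000;;0061 0301;EXAMPLE;HYPOTHETICAL PUA LETTER (invented for this sketch)
017F;;017F;Unicode;LATIN SMALL LETTER LONG S, mapped to itself to block folding
"""


def parse_overrides(text):
    """Return {character: (decomposition, origin, note)} from the records."""
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        fields += [""] * (5 - len(fields))  # pad missing trailing fields
        code, _unused, decomp, origin, note = fields[:5]
        target = "".join(chr(int(cp, 16)) for cp in decomp.split())
        table[chr(int(code, 16))] = (target, origin, note)
    return table


def decompose(text, overrides):
    """Use the override when one exists, otherwise NFKD of the character."""
    out = []
    for ch in text:
        if ch in overrides:
            out.append(overrides[ch][0])
        else:
            out.append(unicodedata.normalize("NFKD", ch))
    return "".join(out)
```

    This is a simplification: it decomposes character by character, so
    it neither reorders combining marks across characters nor decomposes
    the result of an override any further.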



    Prof. dr hab. Janusz S. Bien -  Uniwersytet Warszawski (Katedra Lingwistyki Formalnej)
    Prof. Janusz S. Bien - Warsaw University (Department of Formal Linguistics)
