Re: ASCII and Unicode lifespan

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Thu May 19 2005 - 07:00:04 CDT


    From: "Hans Aberg" <haberg@math.su.se>
    > It seems me obvious that such developments will happen, regardless what
    > one does at Unicode. The best Unicode can do, in my opinion, is helping
    > such developments. Then such developments could be done in new standards
    > within the scope of the Unicode consortium.

    As long as ISO/IEC 10646-1 remains a standard, this won't happen in
    Unicode.
    And even if there is a successor, I think it will probably introduce its
    "better" representation by providing equivalences with the existing
    ISO/IEC 10646-1 code points, so even Unicode will adapt to the change.

    I don't like the idea of "patchwork". If you think so because scripts are
    encoded separately when some of them could have been unified, or because
    some scripts were unified when they should not have been, you forget that
    Unicode and ISO/IEC 10646 are also responding to these arguments by making
    the necessary disunifications (for example Coptic/Greek recently, although
    the characters were not really split).

    If you want another encoding model, I can give a few ideas:
    - reencoding Korean jamos as simple jamos
    - adding a model for interlinear annotation that effectively works for
    Chinese and Hebrew
    - adding a model for vertical and boustrophedon presentations and their
    effect on mirrored characters and text layout
    - making a better layered model for canonical/compatibility equivalences
    - making a better model for clusters/subclusters (forget the "double"
    diacritics), including at the syllabic level.
    - adding a layered model for text in general, that structures it into
    processable units like paragraphs, sentences and words (also needed to
    mitigate the difficulties caused by East-Asian scripts).
    ...
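
    For context on the jamo and equivalence items above, here is a small
    sketch of how the current model already relates these forms. It only uses
    Python's standard unicodedata module; the sample characters and the
    code_points helper are my own choices for illustration, not part of any
    proposal.

```python
import unicodedata

def code_points(text):
    """Return the code points of a string as U+XXXX labels (illustrative helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]

# A precomposed Hangul syllable is canonically equivalent to a jamo sequence.
syllable = "\uD55C"                                  # HANGUL SYLLABLE HAN
jamos = unicodedata.normalize("NFD", syllable)
print(code_points(jamos))                            # three conjoining jamos

# Canonical equivalence: precomposed letter vs. base letter + combining mark.
e_acute = "\u00E9"
print(code_points(unicodedata.normalize("NFD", e_acute)))

# Compatibility equivalence: a ligature only decomposes under NFKD,
# while the canonical form NFD leaves it untouched.
ligature = "\uFB01"                                  # LATIN SMALL LIGATURE FI
print(code_points(unicodedata.normalize("NFD", ligature)))
print(code_points(unicodedata.normalize("NFKD", ligature)))
```

    The two kinds of equivalence are exposed here only through separate
    normalization forms, which is the flat arrangement that a better layered
    model would presumably replace.
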
    Unfortunately, we still have to live with all these limitations. However,
    all this goes too far beyond the objective of ISO/IEC 10646, which is to
    reconcile the many incompatible charsets that have been developed
    everywhere. This objective is still the most wanted one today, as
    conversion between charsets is needed in so many places, and ISO/IEC 10646
    plays the role of a "kernel" intermediate representation.
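
    To make the "kernel" role concrete, here is a minimal sketch of
    converting between two legacy charsets by passing through Unicode code
    points. The transcode function and the choice of Latin-1 and Mac Roman
    are assumptions made for the example; the codecs themselves ship with
    Python.

```python
def transcode(data, source, target):
    """Convert bytes from one charset to another via Unicode.

    Hypothetical helper: each legacy charset only needs a mapping to and
    from the pivot, never a direct table to every other charset.
    """
    text = data.decode(source)      # legacy bytes -> Unicode code points
    return text.encode(target)      # Unicode code points -> legacy bytes

latin1_bytes = "caf\u00E9".encode("latin-1")
mac_bytes = transcode(latin1_bytes, "latin-1", "mac_roman")

# The byte values differ between the two charsets, but the text survives
# the round trip because both map to the same code points.
print(latin1_bytes, mac_bytes)
print(mac_bytes.decode("mac_roman"))
```

    With N charsets, this needs 2N mappings instead of N*(N-1) direct ones,
    which is why the pivot is so useful in practice.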

    For the future, it seems that the definition of "plain-text" is likely to be
    extended, to cover things that are considered part of "rich text formatting"
    today. Today's "limitations" of Unicode and ISO/IEC 10646 are, however,
    easily solved by adding those upper layers of processing on top of them,
    and for this reason it is very unlikely that there will be a big
    revolution in ISO/IEC 10646-2 (and the corresponding Unicode successor)...


