From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Thu May 19 2005 - 07:00:04 CDT
From: "Hans Aberg" <haberg@math.su.se>
> It seems me obvious that such developments will happen, regardless what
> one does at Unicode. The best Unicode can do, in my opinion, is helping
> such developments. Then such developments could be done in new standards
> within the scope of the Unicode consortium.
As long as ISO/IEC 10646-1 remains a standard, this won't happen in
Unicode.
And even if there's a successor, I think it will probably create a "better"
representation while providing equivalences with the past ISO/IEC 10646-1
code points, so even Unicode will adapt to the change.
I don't like the idea of a "patchwork". If you think this because scripts are
encoded separately when some of them could have been unified, or because
some scripts were unified when they should not have been, you forget that
Unicode and ISO/IEC 10646 are also responding to these arguments by making
the necessary disunifications (for example Coptic/Greek recently, although
the characters were not really split).
If you want another encoding model, I can give a few ideas:
- reencoding Korean jamos as simple jamos
- adding a model for interlinear annotation that effectively works for
Chinese and Hebrew
- adding a model for vertical and boustrophedon presentations and their
effect on mirrored characters and text layout
- making a better layered model for canonical/compatibility equivalences
- making a better model for clusters/subclusters (forget the "double"
diacritics), including at the syllabic level.
- adding a layered model for text in general, that structures it into
processable units like paragraphs, sentences and words (also needed to
palliate the difficulties caused by East-Asian scripts).
...
Unfortunately, we still have to live with all these limitations. However, all
this goes too far from the objective of ISO/IEC 10646, which is to reconcile
the many incompatible charsets that have been developed everywhere. This
objective is still the most wanted one today, as conversion between charsets
is needed in so many places, and ISO/IEC 10646 plays the role of a "kernel"
intermediate representation.
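To illustrate what I mean by a "kernel" intermediate representation, here is
a minimal sketch in Python, assuming its standard codecs for ISO 8859-1 and
CP437. The function name convert() is only my own illustration, not part of
any standard: each legacy charset needs just one mapping to and from the
universal code points, instead of one table per pair of charsets.

```python
def convert(data: bytes, source: str, target: str) -> bytes:
    """Convert bytes between two legacy charsets via Unicode code points.

    Hypothetical helper for illustration only.
    """
    # Step 1: legacy bytes -> Unicode (the pivot representation).
    text = data.decode(source)
    # Step 2: Unicode -> the other legacy charset.
    # Raises UnicodeEncodeError if the target charset lacks a character.
    return text.encode(target)


# U+00E9 (e with acute) is 0xE9 in ISO 8859-1 and 0x82 in CP437.
latin1_bytes = "\u00e9".encode("iso-8859-1")
cp437_bytes = convert(latin1_bytes, "iso-8859-1", "cp437")
```

With N charsets this needs N mappings rather than N*(N-1) direct conversion
tables, which is why the pivot role matters in practice.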
For the future, it seems that the definition of "plain text" is likely to be
extended to cover things that are considered part of "rich text formatting"
today. The "limitations" of Unicode and ISO/IEC 10646 today are, however,
easily solved by adding those upper layers of processing on top of them, and
for this reason it is very unlikely that there will be a big revolution in
ISO/IEC 10646-2 (and the corresponding Unicode successor)...