Latin 0 Implementation considerations

From: Alain LaBonté SCT (alb@riq.qc.ca)
Date: Tue Jul 08 1997 - 23:33:57 EDT


Here is a personal contribution I submitted to ISO/IEC JTC1/SC2 as
additional information on the official proposal for Latin 0. I hope it
helps. At least one manufacturer asked for it, so that no information
stays hidden, in particular useful, harmless information that was
instrumental in making the case for the proposal even though it had only
been discussed and never written down. In fact, I feared that it would
offend ISO/IEC experts who want to stick to the ideal world of standards
without reference to the real world. I have been through this in other
projects, where I was mildly blamed for making proprietary references,
but I guess that habits have changed with the new paradigm mandated by
the JTC1 reengineering, in which new projects have to be based on market
issues. The proposal, in any case, was based on real-world problems.

So I hope that Microsoft, IBM and IBM-compatible mainframe manufacturers,
email software makers, browser makers, and so on, will be interested. If
this proves useful, it will have made my day (;

If it hurts, please ignore it. I don't want to raise a controversy; on the
contrary, this is a good-will contribution, for information only.

Alain LaBonté
Iraklion

_______________________________________________________
Implementing Latin 0 as a smooth replacement of Latin 1

The primary purpose of this project is to provide the missing standard
link that will help interchange character data (via legitimate conversion
tables) between private character sets in the 8-bit world and ISO/IEC
10646, for the EURO SIGN and, mainly, for complete coverage of French and
Finnish. It also allows this table to be implemented in strictly
conforming ISO/IEC 8-bit coding environments, for example to make it
possible to write ISO standards using ISO standard character technology!

In private environments where the identified issues exist (implementing
the EURO SIGN is one issue; correcting Latin-1 so that it fully covers its
scope for French and Finnish is a long-standing one), a good
implementation strategy applies when the new characters are already
supported in a private code that does not conform to the ISO/IEC 8-bit
character set structure. The strategy is to retire the replaced Latin-1
characters from the 8-bit-to-8-bit character conversion tables and
instead, for compatibility, to map the new characters that already exist
in the private code to and from the replaced positions, as if they
occupied those positions. This requires no change to applications that
exchange data between the private environment and the outside world.

As an example, in the MS-Windows 1252 code page, the French OE ligatures
and the Finnish S CARON letters are already coded in what is known as the
C1 space of the ISO/IEC 8-bit character set structure. For interchange
with the outside 8-bit world, and without changing MS-Windows legacy
data, the conversion tables could be changed to use the locations
allocated in Latin 0 for these characters. Of course, characters that do
not exist yet, like the EURO SIGN and capital and lower case Z CARON,
will have to be added. This could be accomplished using either private
code positions or the standard replaced positions, a choice that has to
be made by the proprietary table designers.
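
The table-driven conversion described above can be sketched in a few
lines of Python. This is only an illustration under stated assumptions,
not any vendor's actual table: it relies on Python's built-in "cp1252"
and "iso8859_15" codecs (the latter being the standard later published
for the Latin 0 repertoire), the helper name build_table is hypothetical,
and the SUB control (0x1A) used for characters with no Latin 0 position
is an arbitrary choice. Note also that Python's cp1252 table is a later
revision that already contains the EURO SIGN and Z CARON.

```python
# Minimal sketch: derive an 8-bit-to-8-bit table from two codec tables.
# build_table and the 0x1A fallback are illustrative assumptions.

def build_table(src: str, dst: str, fallback: int = 0x1A) -> bytes:
    """Return a 256-byte table mapping each src byte to its dst byte."""
    table = bytearray(256)
    for b in range(256):
        try:
            # Go through Unicode (ISO/IEC 10646) as the pivot.
            table[b] = bytes([b]).decode(src).encode(dst)[0]
        except UnicodeError:
            # Undefined in src, or retired from dst (e.g. CURRENCY SIGN).
            table[b] = fallback
    return bytes(table)

to_latin0 = build_table("cp1252", "iso8859_15")

# Characters coded in the C1 space of code page 1252 land on the
# positions allocated for them in Latin 0.
for ch in "ŒœŠšŸ":
    src_byte = ch.encode("cp1252")[0]
    print(f"U+{ord(ch):04X}: 0x{src_byte:02X} -> 0x{to_latin0[src_byte]:02X}")

# Legacy data is converted on interchange only; it is never rewritten.
legacy = "Œuvre".encode("cp1252")
interchange = legacy.translate(to_latin0)
```

The reverse table, build_table("iso8859_15", "cp1252"), gives the "back"
half of the back-and-forth mapping.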

Another example: IBM-defined EBCDIC code pages generally stick to
standard ISO/IEC SC2 character set repertoires because, technically, they
cannot implement more than 191 characters in tables in the 8-bit world
(the rest of the 8-bit table being entirely used for control characters).
Microsoft, for which this constraint did not exist, pragmatically went
beyond that limit. In this case IBM and IBM-compatible mainframe
manufacturers would probably choose to create a new code page that
follows the new ISO/IEC Latin 0 repertoire, using exactly the same code
conversion to the outside, ISO/IEC standard world as before for Latin 1.

This pragmatic information was considered by the editor before the Latin 0
proposal was made.


