Re: MES as an ISO standard?

From: Markus G. Kuhn (kuhn@cs.purdue.edu)
Date: Tue Jul 01 1997 - 00:37:23 EDT


Jonathan Rosenne wrote on 1997-07-01 03:59 UTC:
> At 23:23 30/06/97 -0400, Markus Kuhn wrote:
> >I think, it would be an excellent idea to draft a new standard
> >
> > ISO 15646:1998 -- Multi-byte coded character set for
> > European languages
> >
> >that specifies MES or a very similar Unicode subset with around
> >1000 characters.
>
> 10646 already has the concept of subsets, all you want is an editorial to
> Annex A to add another one.

Definitely not. You misunderstood me. I am well aware of Annex A and
I do not like Annex A.

I definitely would like to see this as a new ISO standard with a new ISO
standard number and a new title. Make it clearly recognizable as a different
character set for the naive, moderately computer-literate person out
there who wonders what characters her email software supports.

I want to have a short identifier like "ISO 15646" for this subset
that I can use in all those systems that use ISO standard numbers
in order to identify character sets. Examples are the
X Window System, MIME, and HTTP.

Just as ASCII is not only a subset in some appendix A of ISO 8859-1,
I want to have a separate name that makes clear that I am NOT talking
about ISO 10646 when I talk about this new simpler 16-bit character set
for European languages.

Yes, this is just sort of a marketing issue and not strictly a
technical problem. But the problem we are trying to solve here
*IS* a marketing problem. Full ISO 10646 will not replace ASCII in the
next 30 years in those 90% of applications that are not special
i18n word processors. Period. I doubt that "ISO 10646
Level 0 Subset 63 UCS-2 as defined in ISO/IEC 10646-1 Am. 15 and
revised by ISO/IEC 10646-1 Am. 19" will replace it either. But "ISO 15646"
might have a chance to make 16-bit characters attractive to the
more than 90% of developers without special i18n and global linguistic
training who are today scared to death by bidi algorithms,
combining characters, and presentation forms. UTF-8 is already
enough for them to worry about.

Make this a separate document that is as simple to read as ISO 8859-1.
Make it a 16-bit character set, drop all the exotic 31-bit architecture and
terminology, and make it just a code table that people can implement and
use as easily as they have implemented ISO 8859-1. Those who are
interested will appreciate that everything is compatible with ISO 10646;
those who are not do not have to worry about all the gory details of
the all-in-one-kitchen-sink standard ISO 10646/Unicode.
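
As a rough illustration of the "just a code table" idea, here is a minimal
Python sketch of a repertoire check against a small 16-bit subset. The
ranges are placeholders chosen for the example, not the actual MES table,
and the names EXAMPLE_RANGES, in_repertoire, and unsupported are
hypothetical.

```python
# Placeholder repertoire for illustration only: these ranges are
# assumptions, NOT the actual MES code table.
EXAMPLE_RANGES = [
    (0x0020, 0x007E),  # Basic Latin (printable ASCII)
    (0x00A0, 0x00FF),  # Latin-1 Supplement
    (0x0100, 0x017F),  # Latin Extended-A
]


def in_repertoire(ch, ranges=EXAMPLE_RANGES):
    """Return True if the single character ch lies in one of the ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def unsupported(text, ranges=EXAMPLE_RANGES):
    """Return the characters of text that fall outside the repertoire."""
    return [ch for ch in text if not in_repertoire(ch, ranges)]
```

An implementation that only claims the subset could call
unsupported(text) before display or transmission and reject or replace
whatever comes back, without knowing anything about the rest of
ISO 10646.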

If I have learned one important thing about standardization, it is
that all-in-one standard documents plus additional profiles
are a nice idea for the working groups, but in the real world, profiles
only cause confusion and people just ignore them. Most people will
not read the standard. They just expect that if product A and B
implement this ISO standard, then both will work together.

Markus

-- 
Markus G. Kuhn, Computer Science grad student, Purdue
University, Indiana, USA -- email: kuhn@cs.purdue.edu


