Re: Unicode Standard & ISO-10646 Standard

From: Kenneth Whistler (kenw@sybase.com)
Date: Tue May 11 1999 - 21:44:04 EDT


Dear Mr. Ricardo Bermell-Benet,

Ed Hart answered a number of your inquiries. I will attempt to
answer the rest without duplicating too much of Ed's excellent answers.

> I don't fully understand the relation between ISO-10646 and Unicode.
>
> The most I know is that ISO-10646 is an international standard,
> proposed by public national organizations of many countries, and
> Unicode is a "standard based on" or "implementation of" (?) ISO-10646,
> proposed by major American (United States) private organizations.
>
> Neither on the Unicode site nor on other sites related to these
> standards have I found a satisfactory answer.
>
> Following http://www.unicode.org/ one finds:
> << The Unicode Standard is a subset of and code for code identical
> to the International Standard ISO/IEC 10646-1:1993 >>

This unfortunate wording on the home page of the Unicode website
could be improved. For a more accurate statement, you can follow
the links down to the Technical Introduction on the website:

http://www.unicode.org/unicode/standard/principles.html

which states:

"[The Unicode Standard] is fully compatible with the International
Standard ISO/IEC 10646-1:1993, and contains all of the same
characters and encoding points as ISO/IEC 10646."

>
> Is it a subset? If so, why not use ISO-10646 as a better (wider) standard?

No, it isn't, except in the trivial sense that a set is also
a subset of itself. At any particular point of synchronization
of Amendments (Unicode 2.0 = IS 10646-1 + Amendments 1-7; Unicode 3.0
= IS 10646-1 + Amendments 1-31), the two standards contain exactly the
same repertoire of encoded characters.

> Isn't ISO-10646 usable directly for the same purposes?

Of course.
 
> "Code for code"? What precisely does "to code" mean here?

It means that U+0041 LATIN CAPITAL LETTER A in the Unicode Standard is
U+0041 LATIN CAPITAL LETTER A in IS 10646-1, and U+3042 HIRAGANA
LETTER A in the Unicode Standard is U+3042 HIRAGANA LETTER A in
IS 10646-1, and so on for all 38,800+ characters in both standards.
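
As a rough illustration of that code-for-code identity, here is a minimal
Python sketch. It assumes Python's bundled unicodedata module (which tracks
a later version of the Unicode Standard than the one discussed here), and
the helper name describe() is invented for this example:

```python
import unicodedata


def describe(code_point: int) -> str:
    """Return 'U+XXXX NAME' for a code point, using Python's bundled Unicode data."""
    # chr() maps the integer code point to a one-character string;
    # unicodedata.name() looks up the character's formal name.
    return f"U+{code_point:04X} {unicodedata.name(chr(code_point))}"


# The two characters cited above: same code point, same name in both standards.
for cp in (0x0041, 0x3042):
    print(describe(cp))
```

The names printed should match the ones given above for both code points.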

> Is Unicode "code" identical to ISO "code"?

Yes, for the UTF-16 and UTF-8 encoding forms. The Unicode Standard
does not currently allow the UCS-4 encoding form of IS 10646-1.
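
To make the encoding forms concrete, the following Python sketch shows the
bytes that result for a single character. It is only an approximation under
stated assumptions: it uses Python's built-in codecs, fixes big-endian byte
order with no byte order mark, and stands in Python's "utf-32-be" codec for
the four-octet UCS-4 form. The helper name encoded_forms() is invented for
this example:

```python
def encoded_forms(ch: str) -> dict:
    """Return the hex bytes of one character in three encoding forms.

    Big-endian byte order, no byte order mark. 'utf-32-be' is used here
    as a stand-in for the four-octet UCS-4 form (an assumption of this sketch).
    """
    return {
        "UTF-8": ch.encode("utf-8").hex(" "),
        "UTF-16": ch.encode("utf-16-be").hex(" "),
        "UCS-4": ch.encode("utf-32-be").hex(" "),
    }


# U+0041 LATIN CAPITAL LETTER A and U+3042 HIRAGANA LETTER A
for ch in ("\u0041", "\u3042"):
    print(f"U+{ord(ch):04X}", encoded_forms(ch))
```

For U+3042 this gives three bytes in UTF-8 (e3 81 82), two in UTF-16
(30 42), and four in the UCS-4 stand-in (00 00 30 42); the code point is
the same in every case, only its serialization differs.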

> Is the difference that Unicode supplies algorithms?

And lots of other things as well.

>
> It seems that (full) documentation for Unicode won't be online and free
> (gratis) for ever and ever. Why?

Because it is expensive to publish large, complex books, and even
non-profit, volunteer organizations have to cover the costs of their
operations.

As for ISO, the Unicode Consortium is investigating what it will take
to put a meaningful portion of the standard online. The code charts
for the current version of the standard are already available online.
See:

http://charts.unicode.org/charts.html

What is unlikely to happen is for a complete image of the entire book
to appear online for people to download and print their own copies.

By the way, I doubt that *anything* is going to be "online and free
for ever and ever." Somebody foots the bill for maintaining the
online infrastructure and authoring costs of *any* free site.

>
> Is the Unicode Consortium a "for profit" organization, an altruistic
> one, or some kind of mixture? (Please note I'm not criticising; I
> simply don't know the answer.)
>
> ISO is a public organization, so one can expect (hopefully) its
> documentation will some day be online and free. Moreover, one may
> expect ISO standards won't ever have restrictions of royalties,
> nor restrictions of use (nor of documentation) for arbitrary
> users or organizations. What warranties of that kind does the
> Unicode Consortium supply?

The copyright for the publication itself, The Unicode Standard,
Version 2.0 (or 3.0, or whatever), is retained by the Unicode
Consortium. That is just regular practice for publication of
books. The Consortium and the publisher express no warranties
for the standard -- that is also regular practice, since there
could be (and were) errors in the publication, and implementors
are on notice that they need to take reasonable care in implementation.

Other than that, it is clear that the Unicode Standard is an open
*standard*. It is intended for general, open, use by anyone
who wishes to make use of the standard. There are absolutely
no restrictions on use of the standard; there are not and cannot
be royalties applied to its use.

By the way, please note that ISO retains copyright on ISO
standards. This is merely to protect them from copyright pirates
with a copying machine setting up a "Standards-'R-Us" post
office box and undercutting ISO's cost of operations and
production of the standards. So the restrictions of use
on the *documentation* of the standard are similar in either
case.

Regards,

--Ken Whistler, Technical Director, Unicode, Inc.


