Re: Charsets + encoding + codesets

From: Keld Jørn Simonsen (keld@dkuug.dk)
Date: Thu Oct 09 1997 - 11:16:48 EDT


Martin J. Dürst writes:

> On Tue, 7 Oct 1997, Keld Jørn Simonsen wrote:
>
> > Kenneth Whistler writes:
>
> > I would rather say that the character set of 10646 is the repertoire
> > of 10646 which is the characters in the codepoints of 10646. This
> > is a finite repertoire, although it may differ with each
> > amendment. But you can always count the characters in there.
> >
> > For Unicode it is a different story. Unicode can represent an
> > undefined number of "abstract characters" which is the Unicode
> > equivalent term to the ISO term "character". (I even use that
> > term to clarify the difference to a "coded character").
> > Unicode's repertoire is thus infinite.
>
> I agree with others:
> Unicode has exactly the same codepoints/characters, and therefore
> the same repertoire, as ISO 10646. It can represent exactly
> the same (unlimited) "abstract characters"/diacritic
> combinations, as well as the same words, sentences,...
> as ISO 10646. It may use different terminology in some
> cases, but that doesn't make it different. It includes
> some more detailed specifications that ISO 10646 includes
> more implicitly. But even ISO 10646 speaks about how to
> combine base characters and combining marks to represent
> diacritic combinations,...

Well, different terminology may make the specifications different,
but with respect to the repertoire, I do agree from the recent
discussion that the repertoires of 10646 and Unicode are the same
(with minor differences due to amendments etc).
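
To make the distinction in this thread concrete, here is a minimal
sketch of the difference between the finite set of coded characters and
the open-ended set of base-plus-combining-mark sequences. It assumes
Python 3 and its standard unicodedata module, neither of which is part
of the original discussion. The helper code_points() is a hypothetical
name used for illustration only, and NFC normalization is used merely
as a convenient way to show which combinations have a code point of
their own.

```python
import unicodedata

def code_points(text):
    """Return the code points of text as a list of U+XXXX strings."""
    return ["U+%04X" % ord(ch) for ch in text]

# A base letter followed by a combining mark: two coded characters
# that together represent one diacritic combination.
decomposed = "u\u0308"    # LATIN SMALL LETTER U + COMBINING DIAERESIS
precomposed = "\u00fc"    # LATIN SMALL LETTER U WITH DIAERESIS

print(code_points(decomposed))
print(code_points(precomposed))

# This combination also exists as a single coded character, so
# composing the sequence (NFC) yields that one code point.
composed = unicodedata.normalize("NFC", decomposed)

# "q" with a diaeresis has no code point of its own. It can still be
# represented, but only as a sequence of two coded characters.
no_precomposed = unicodedata.normalize("NFC", "q\u0308")
print(code_points(no_precomposed))
```

The number of code points is countable at any given amendment level,
while the number of such sequences is not bounded, and this holds for
the shared code points whichever of the two standards one cites.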

keld


