Code pages and Unicode (wasn't really: RE: Endangered Alphabets) from Doug Ewell on 2011-08-19 (Unicode Mail List Archive)

From: Doug Ewell <doug_at_ewellic.org>
Date: Fri, 19 Aug 2011 07:27:27 -0700

srivas sinnathurai <sisrivas at blueyonder dot co dot uk> wrote:

> PUA is not structured

It's not supposed to be. It's a private-use area. You use it the way
you see fit.

> and not officially programmable to accommodate
> numerous code pages.

None of Unicode is designed around code-page switching. It's a flat
code space. This is true even for ISO 10646, which nominally divides
the space into groups and planes and rows.
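
To make the "nominal" part concrete, here is a minimal sketch in Python. It assumes the original four-octet UCS-4 layout of ISO 10646 (group, plane, row, cell, one octet each, with a 7-bit group), and the function names are made up for illustration. The division is nothing more than bit arithmetic on a single integer; no switching state is involved.

```python
def decompose(cp):
    """Split a code point into nominal (group, plane, row, cell) octets.

    Assumes the original UCS-4 layout: 7-bit group, then 8 bits each
    for plane, row and cell. Hypothetical helper, for illustration only.
    """
    if not 0 <= cp <= 0x7FFFFFFF:
        raise ValueError("outside the 31-bit UCS-4 range")
    return (cp >> 24) & 0x7F, (cp >> 16) & 0xFF, (cp >> 8) & 0xFF, cp & 0xFF


def compose(group, plane, row, cell):
    """Inverse of decompose(): the four octets are just one flat number."""
    return (group << 24) | (plane << 16) | (row << 8) | cell


if __name__ == "__main__":
    for cp in (0x0041, 0xE000, 0x1F600):
        print(f"U+{cp:04X} -> {decompose(cp)}")
```

Round-tripping any value through decompose() and compose() gives back the same integer, which is all "flat" means here.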

As a programmer, I don't understand what "not officially programmable"
means here. I've written lots of programs that use and understand the
PUA.
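
As one small example of what "using and understanding" the PUA can look like, the sketch below (Python; the helper names are hypothetical) finds private-use characters in a string. The three ranges are the BMP private-use area and the two supplementary private-use planes, minus the noncharacters at the end of each plane. What those code points mean is, of course, up to whatever private agreement is in force.

```python
import unicodedata

# The three private-use ranges in Unicode.
PUA_RANGES = (
    (0xE000, 0xF8FF),      # BMP Private Use Area
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A (plane 15)
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B (plane 16)
)


def is_private_use(ch):
    """Return True if the single character ch is a private-use code point."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


def find_private_use(text):
    """List (index, 'U+XXXX') for every private-use character in text."""
    return [(i, f"U+{ord(ch):04X}")
            for i, ch in enumerate(text) if is_private_use(ch)]


if __name__ == "__main__":
    sample = "a\ue001b\U000f0000"
    print(find_private_use(sample))
    # The same information is available as general category "Co".
    print([unicodedata.category(ch) for ch in sample])
```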

> Take the ISO 8859-1, 2, 3, and so on .....
> These are now allocating the same code points to many languages and
> for other purposes.

Character encodings don't allocate code points to languages. They
allocate code points to characters, which are used to write text in
languages. This is not a trivial distinction; it is crucial to
understanding how character encodings work.
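
A short Python sketch of the distinction, using the standard codecs for three parts of ISO 8859 (the function name is only for illustration). The byte 0xE6 is one code point in each part, but each part assigns it to a different character; Unicode gives each of those characters a code point of its own.

```python
# Three parts of ISO 8859, named by their standard Python codec aliases.
PARTS = ("iso-8859-1", "iso-8859-2", "iso-8859-5")


def interpretations(raw):
    """Decode the same bytes under each ISO 8859 part listed in PARTS."""
    return {name: raw.decode(name) for name in PARTS}


if __name__ == "__main__":
    for name, ch in interpretations(b"\xe6").items():
        # Each part maps byte 0xE6 to a different character, and each
        # of those characters has its own Unicode code point.
        print(f"{name}: 0xE6 -> {ch!r} (U+{ord(ch):04X})")
```

None of the three assigns 0xE6 to a language. Latin small letter ae, Latin small letter c with acute and Cyrillic small letter tse are characters, each used in the writing of more than one language.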

> Similarly, a structured and official allocations to any many
> requirements can be done using the same codes, say 16,000 of them.

If you want to use ISO 2022, just use ISO 2022.
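
For anyone who hasn't seen it in action, here is a minimal sketch using Python's standard iso2022_jp codec, which is one ISO 2022-based encoding (the helper name is hypothetical). It shows the cost of the switching model: the same two bytes decode differently depending on which escape sequence came before them.

```python
# Escape sequences used by ISO-2022-JP to switch character sets.
TO_JISX0208 = b"\x1b$B"  # designate JIS X 0208
TO_ASCII = b"\x1b(B"     # designate ASCII


def strip_escapes(data):
    """Remove the two escape sequences above, leaving only payload bytes."""
    return data.replace(TO_JISX0208, b"").replace(TO_ASCII, b"")


if __name__ == "__main__":
    text = "A\u65e5\u672cB"  # 'A', two kanji, 'B'
    encoded = text.encode("iso2022_jp")
    print(encoded)  # the escapes are visible in the byte string

    # The bare payload bytes for the first kanji, with no escape before
    # them, are read in the initial ASCII state as two ASCII characters.
    payload = strip_escapes("\u65e5".encode("iso2022_jp"))
    print(repr(payload.decode("iso2022_jp")))
    print(repr((TO_JISX0208 + payload + TO_ASCII).decode("iso2022_jp")))
```

In Unicode the kanji is simply U+65E5 wherever it appears, with no preceding state to track.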

I guess what I'm missing is why the code-page switching model is
considered superior, in any way, to the flat code space of
Unicode/10646.

--
Doug Ewell | Thornton, Colorado, USA | RFC 5645, 4645, UTN #14
www.ewellic.org | www.facebook.com/doug.ewell | @DougEwell

Received on Fri Aug 19 2011 - 09:28:46 CDT
