RE: Surrogate points

From: Lars Kristan (lars.kristan@hermes.si)
Date: Tue Feb 01 2005 - 04:02:11 CST

Next message: D. Starner: "RE: Surrogate points"

Previous message: Donald Z. Osborn: "Re: The Yoruba under-diacritic"
Next in thread: D. Starner: "RE: Surrogate points"
Maybe reply: D. Starner: "RE: Surrogate points"
Maybe reply: Peter Constable: "RE: Surrogate points"

Hans Aberg wrote:
> A relatively minor change to the UTF-16 would make that condition to go
> away. The current UTF-16 implementations would merely need to be aware of
> that these character numbers may be used, and become altered appropriately.
> This is not an urgent change as these character numbers will not be filled
> very soon.

Extending UTF-16 would not be easy. An incompatible implementation would not
succeed, since in that case you can simply use UTF-8, which we already
concluded can be extended should the need arise.
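
To make the limit concrete, here is a minimal Python sketch of the
surrogate-pair arithmetic, for illustration only. The function and constant
names are my own assumptions and do not come from any library. It shows that
a surrogate pair carries 20 bits of payload, so the highest codepoint UTF-16
can reach as currently defined is U+10FFFF.

```python
# Illustrative sketch of UTF-16 surrogate-pair arithmetic.
# All names here are hypothetical; only the numeric ranges come from UTF-16.
HIGH_START = 0xD800   # first high (lead) surrogate
LOW_START = 0xDC00    # first low (trail) surrogate
BMP_LIMIT = 0x10000   # first codepoint beyond the Basic Multilingual Plane
MAX_CODEPOINT = 0x10FFFF


def to_surrogates(cp):
    """Split a supplementary codepoint into a (high, low) surrogate pair."""
    if not BMP_LIMIT <= cp <= MAX_CODEPOINT:
        raise ValueError("not encodable as a surrogate pair")
    offset = cp - BMP_LIMIT                      # a 20-bit value
    return HIGH_START + (offset >> 10), LOW_START + (offset & 0x3FF)


def from_surrogates(high, low):
    """Combine a (high, low) surrogate pair back into a codepoint."""
    return BMP_LIMIT + ((high - HIGH_START) << 10) + (low - LOW_START)


# The largest pair (0xDBFF, 0xDFFF) gives the top of the UTF-16 range.
MAX_UTF16 = from_surrogates(0xDBFF, 0xDFFF)
```

Any codepoint above MAX_UTF16 has no pair left to represent it, which is why
an extension could not be done without touching existing implementations.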

But all this is rather irrelevant. There are enough codepoints. Yes, such
statements have been made before and were proven wrong, but that fact alone
does not mean the same thing will happen again. You are misapplying Moore's
law here. Assigning codepoints doesn't have much to do with growth in
processing power or storage capacity. It has to do with people. And not the
number of people, but the number of scripts and the characters in them.
Those don't grow much; in fact, they are probably declining.

A new glyph will probably pop up now and then, and the demand for it is most
likely to come from mathematics or physics. I would not be concerned that we
will run out of codepoints on that account. Things that would fill up the
codepoints would be:
* Artificial scripts, like Klingon
* Formatting (escape) codes
* An alien race

Unicode doesn't encode the first two, and the third one is unlikely. And if
it happens, I bet we would adopt their encoding (along with the rest of the
technology) rather than vice versa.

A decision was once made that 21 bits will suffice. And so far it seems it
will indeed. The industry will not be willing to make an investment in
something that we may never need. And as I already said, if we ever run out,
UTF-16 will probably be long gone by then.
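
As a rough check on the size of that code space, the arithmetic can be
sketched in Python. The variable names are illustrative assumptions; the
ranges are those of the current design, where the top codepoint is U+10FFFF
and the surrogate block D800..DFFF is reserved.

```python
# Illustrative arithmetic only; variable names are hypothetical.
MAX_CODEPOINT = 0x10FFFF

# Number of bits needed to represent the highest codepoint.
bits_needed = MAX_CODEPOINT.bit_length()

# Total codepoints, and how many 65,536-codepoint planes that makes.
total_codepoints = MAX_CODEPOINT + 1
planes = total_codepoints // 0x10000

# Surrogates are reserved for UTF-16 and never assigned to characters.
surrogate_count = 0xDFFF - 0xD800 + 1
assignable = total_codepoints - surrogate_count

print(bits_needed, planes, total_codepoints, assignable)
```

This counts what the encoding can address, not how many codepoints are
actually assigned.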

Lars

P.S.: Speaking of new glyphs, I wonder how one is supposed to introduce
one. One needs to prove that a glyph is in use, but nowadays a glyph cannot
be used (efficiently) until it is encoded. Catch-22?


This archive was generated by hypermail 2.1.5 : Tue Feb 01 2005 - 04:13:36 CST