RE: is there any way to change already defined character codes?

From: 11digitboy@bolt.com
Date: Tue Aug 08 2000 - 20:04:02 EDT


--
Robert Lozyniak
Accusplit pedometer manufactures can go suck eggs
My page: http://walk.to/11
11digitboy@bolt.com - email
(917) 421-3909 x1133 - voicemail/fax

---- Marco.Cimarosti@icl.com wrote:
> Sandro Karumidze wrote:
> > The issue is that in Unicode there is a sequence of Georgian
> > characters different from what these people think it should be.
> > [...] In the beginning of this century 5 characters were dropped
> > [...]
> > In Unicode these 5 characters follow the 33. There is a different
> > point of view that those 5 should be included among the others.
>
> (You definitely need an official reply, but let's go on with some more
> informal chatting.)
>
> I foresee that this would not be considered a good reason to change
> anything.
>
> The order of characters in Unicode (or in any other character encoding) is
> not important. The scope of a character set is to assign a unique number to
> each character, not to define an "alphabetical order".

Yeah. Just look at the kanji digits!
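The kanji-digit remark is easy to check. Below is a minimal sketch that prints the code point of each kanji digit from one to nine and then sorts them by code point; the names `KANJI_DIGITS` and `by_code_point` are made up for this illustration.

```python
# The kanji digits one through nine, written here in numeric order.
KANJI_DIGITS = "一二三四五六七八九"


def by_code_point(text):
    """Return the characters of `text` as a list sorted by code point."""
    return sorted(text, key=ord)


if __name__ == "__main__":
    # Show each digit's numeric value next to its code point.
    for value, ch in enumerate(KANJI_DIGITS, start=1):
        print(f"{value}: {ch} U+{ord(ch):04X}")
    # Code point order is not numeric order.
    print("".join(by_code_point(KANJI_DIGITS)))
```

Sorting by code point does not give back one through nine in sequence, which is the point being made: the encoding assigns numbers, it does not promise an order.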

> If you notice, the situation that you describe is true for *all* the
> alphabets in Unicode.
>
> E.g., if you look at the Latin part, you see that the 26 letters used in
> modern English are all contiguously ordered in two areas: U+0041 to U+005A
> (uppercase) and U+0061 to U+007A (lowercase).

Yeah, but so what? All you gotta do is turn the 6th bit off and there you go!

> But that's the end of the story! All the other 100's of Latin letters are
> scattered all over, using no consistent order.

Too bad Unicode values can't be fractions!!
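For what it's worth, the "6th bit" trick can be sketched like this. It relies only on the two ranges quoted above (U+0041 to U+005A and U+0061 to U+007A) being exactly 0x20 apart; the names `CASE_BIT`, `ascii_upper` and `ascii_lower` are invented for this example.

```python
CASE_BIT = 0x20  # the 6th bit counting from 1 (bit 5 counting from 0)


def ascii_upper(ch):
    """Clear the case bit of a basic lowercase letter; leave anything else alone."""
    if "a" <= ch <= "z":
        return chr(ord(ch) & ~CASE_BIT)
    return ch


def ascii_lower(ch):
    """Set the case bit of a basic uppercase letter; leave anything else alone."""
    if "A" <= ch <= "Z":
        return chr(ord(ch) | CASE_BIT)
    return ch


if __name__ == "__main__":
    print("".join(ascii_upper(c) for c in "unicode"))
    print("".join(ascii_lower(c) for c in "UNICODE"))
```

The range checks matter. Outside the basic 26 letters the trick is not safe: clearing the same bit in U+00FF (ÿ) gives U+00DF (ß), while the real uppercase of ÿ is U+0178. That is the "scattered all over" situation described in the quoted text.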

> The same is true for Cyrillic, Greek, Hebrew, Arabic, and so on. Have a look
> at those blocks: the basic letters for post-czar Russian, modern Greek,
> Israeli Hebrew, modern Arabic etc. are consistently ordered, but the letters
> for other languages that use the same alphabets (or ancient letters for the
> same languages) are scattered all over with no specific order.
>
> The reason why no one cares about the order of characters is that it is
> *impossible* to determine a "correct" order.
>
> In alphabets used by more than one language (e.g. Latin, Cyrillic, Arabic,
> Devanagari, etc.), the alphabetic order is normally different for each
> language.
>
> Moreover, many languages have more than one alphabetic order, all equally
> valid and in current usage.
>
> For this reason the problem of "alphabetic order" has been pulled apart from
> character sets, and addressed separately.
>
> In Unicode, the issue of "collation" is handled by an ad-hoc optional
> algorithm, that is part of the standard but is separated from the encoding
> issue itself.
>
> The algorithm is titled "Unicode Technical Report #10: Unicode Collation
> Algorithm", and you can find it here:
> http://www.unicode.org/unicode/reports/tr10/ .
>
> *That* is the place to check whether Georgian letters are in the correct
> order or not. And if they are not, you have two options:
>
> 1) Ask Unicode to change it: here you *do* have some chances to be listened
> to, if you have valid arguments.
>
> 2) Change it yourself: unlike the character values, the collation algorithm
> is designed to be flexible and customizable.
>
> Regards,
> _ Marco
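To make option 2 concrete, here is a toy sketch of sorting by a caller-supplied alphabet instead of by code point. This is NOT the Unicode Collation Algorithm from TR10, only an illustration of keeping the ordering separate from the code values. It assumes the 33 modern Georgian letters sit at U+10D0 to U+10F0 with the 5 dropped letters right after them at U+10F1 to U+10F5. The `tailored` order is purely hypothetical and is not claimed to be the traditional Georgian order, and `make_sort_key` is a made-up helper name.

```python
# 33 modern letters, then the 5 dropped letters that follow them in Unicode.
GEORGIAN_MODERN = [chr(cp) for cp in range(0x10D0, 0x10F1)]
GEORGIAN_ARCHAIC = [chr(cp) for cp in range(0x10F1, 0x10F6)]


def make_sort_key(alphabet):
    """Build a sort key that ranks characters by their position in `alphabet`."""
    rank = {ch: i for i, ch in enumerate(alphabet)}
    unknown = len(alphabet)  # characters not in the alphabet sort last

    def key(word):
        # Fall back to the code point to keep unknown characters in a stable order.
        return [(rank.get(ch, unknown), ord(ch)) for ch in word]

    return key


# HYPOTHETICAL tailoring: move the first archaic letter (U+10F1) so that it
# comes right after the 7th modern letter. The position is only an example.
tailored = (GEORGIAN_MODERN[:7] + GEORGIAN_ARCHAIC[:1]
            + GEORGIAN_MODERN[7:] + GEORGIAN_ARCHAIC[1:])

if __name__ == "__main__":
    words = [chr(0x10D7), chr(0x10F1), chr(0x10D0)]
    print(sorted(words))                               # code point order
    print(sorted(words, key=make_sort_key(tailored)))  # tailored order
```

With the plain sort, U+10F1 lands after every modern letter; with the tailored key it lands between U+10D6 and U+10D7. The character codes themselves never change, only the table used for comparison.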




This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:06 EDT