Re: Sample of symbols useful in Classics

From: Edward Cherlin (cherlin@snowcrest.net)
Date: Thu Aug 15 1996 - 12:58:11 EDT

Next message: Rick McGowan: "future allocations"
Previous message: dat@pobox.com: "Re: Unicode & Han"

Rick McGowan wrote:
>A while ago today Ed Cherlin said...
>
>> "In the past, Unicode has been somewhat cramped, and the tendency has been
>> to unify characters where possible without infringing on other standards.
>> Now with UTF-16 we can afford to separate character sets for clearly
>> different uses."
>
>No, no, no, no, no, no. Please don't even start thinking this way. Don't
>encourage other people to think this way.
>
>Unification principles are primary. They are independent of how much space
>we have at our disposal. Just because we have space does not mean we should
>rush to fill it with junk as fast as we can. The principles are still the
>principles.
>
>You should read the allocation paper that Becker and I did a long time ago,
>which has been adopted, at least in principle, by both UTC and WG2. It is a
>rational plan with some good rules of thumb about character encoding.

Be glad to. Where is it?

>Encoding a new character is the LAST RESORT.
>
> Rick

I have had a lot of experience on this point, and I respectfully disagree.
While I would not want to see a wild proliferation of characters (a code
page for every language, for example, would be utterly disastrous), there
are some character sets which should be more clearly separated. For
example, APL should not have been unified even as much as it was with math
and ASCII. The reason is that the unified characters were not identified as
such, leaving APL implementors at odds about which star or tilde character
code point to use for APL EXPONENTIAL FUNCTION and APL LOGICAL NOT
FUNCTION, so that APL in Unicode is still in a horrible state of confusion.
It is certainly nothing like as bad as it used to be, but it is much worse
than necessary.

For example, I can produce an attempt at Unicode APL in which ASCII
asterisk and APL EXPONENTIAL FUNCTION are clearly distinct characters
(asterisk has six points and is used for various punctuation functions,
while the APL character has five points, represents the Exponential
function, and used to be regarded as an element of several overstrike
characters). At the same time, ASCII tilde was unified with the APL symbol.
Other implementors made other choices. A significant number of other
characters have the same problem. I wrote a detailed article on the problem
for APL News (Springer-Verlag) which I can make available.
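
To make the ambiguity concrete, here is a minimal Python sketch that lists
some characters an implementor could plausibly pick for the APL star and
tilde. The candidate lists and the helper names (CANDIDATES, describe,
same_text) are illustrative assumptions, not a mapping taken from any APL
implementation, and Python's unicodedata module reports names from its own
bundled Unicode version rather than the 1996 standard.

```python
import unicodedata

# Candidate code points for each APL function. This pairing is an
# illustrative assumption, not a mapping given in the message.
CANDIDATES = {
    "APL exponential (star)": ["\u002A", "\u2217", "\u22C6"],
    "APL logical not (tilde)": ["\u007E", "\u223C"],
}

def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

def same_text(a, b):
    """Compare two strings after NFKC normalization."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)

for role, chars in CANDIDATES.items():
    print(role)
    for ch in chars:
        print("  ", describe(ch))

# The same expression written with two different "stars" stays distinct
# even after compatibility normalization.
print(same_text("2\u002A3", "2\u22C63"))
```

If two implementations each pick a different candidate, their program text
will not compare equal or search correctly against the other's, which is the
kind of confusion described above.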

In the case of the Bible marks we were discussing, it is clear that they
are mostly quite different from the various glyphs that look similar in
various fonts. Perhaps a few can be unified, and perhaps the alpha in
question is one of them, but as I understand the matter now, I would
probably be against it. It cannot possibly be either the math or APL alpha,
however.

Edward Cherlin      Helping Newbies to become "knowbies"    Point Top 5%
Vice President      http://www.newbie.net/Mentors/Cherlin   of Web sites
NewbieNet, Inc.     Everything should be made as simple as possible,
cherlin@newbie.net  __but no simpler__.  Albert Einstein


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:31 EDT