Re: Sample of symbols useful in Classics (was: Apple's Unicode)

From: Edward Cherlin (cherlin@snowcrest.net)
Date: Wed Aug 14 1996 - 15:29:52 EDT


Ronald Wood wrote:
>On Wed, 14 Aug 1996, Otto Stolz wrote:
>> On Mon Aug 12, 17:07, Ronald S. Wood <wood@cs.dal.ca> has asked for the
>> ISO 10646 / Unicode codings for some symbols used in classics.
[snip]
>> I think that the symbols used in Classics were not considered when
>> Unicode was defined, yet Unicode / ISO 10646 comprises characters
>> suitable for some of them.
[snip]

My $.02 worth: This is a reasonably well-defined set of characters
essential to a fairly large set of users. Let some of those users come up
with a proposal, argue the merits of all the niggling details, and then
give them a page.

>> Nevertheless, some of the symbols used by the Deutsche Bibelgesellschaft
>> can be coded in Unicode / ISO 10646, viz.:
>> alfa 03B1 GREEK SMALL LETTER ALPHA (sub BASIC GREEK)
>> or 237A APL FUNCTIONAL SYMBOL ALPHA (sub MISCELLANEOUS TECHNICAL)

No, no, please, no. Please nobody use APL characters for something other
than APL. They have weird spacing (monospaced) and semantics (written
left-to-right, parsed right-to-left, with very specific meanings). We can
argue whether this particular DB alpha is a Greek alpha, or yet another
special alpha (like the math and APL versions) but it is NOT APL.
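
To make the distinction concrete, here is a minimal sketch in Python using
the standard unicodedata module. It only reports what the character database
says about the two code points quoted above; the helper name describe is my
own invention for illustration.

```python
import unicodedata

GREEK_ALPHA = "\u03b1"  # the code point quoted above for the Greek letter
APL_ALPHA = "\u237a"    # the APL symbol that merely looks similar


def describe(ch):
    """Return (code point, character name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))


for ch in (GREEK_ALPHA, APL_ALPHA):
    print(describe(ch))
```

The Greek alpha is classified as a lowercase letter (category Ll) and has an
uppercase partner, while the APL alpha is classified as a symbol (category
So) and has no case mapping, so text processing treats the two differently.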

>I only included lower case alpha to indicate that the following
>characters were superscript. 0x03B1 is the only appropriate encoding for the
>Greek language.
[various cases and suggestions snipped]

In the past, Unicode has been somewhat cramped, and the tendency has been
to unify characters where possible without infringing on other standards.
Now with UTF-16 we can afford to separate character sets for clearly
different uses.
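
For anyone who has not looked at how UTF-16 reaches beyond the first 65,536
code points, here is a rough sketch of the surrogate-pair arithmetic in
Python. The function name to_surrogates is hypothetical; the arithmetic is
the standard UTF-16 rule for code points from U+10000 through U+10FFFF.

```python
def to_surrogates(code_point):
    """Split a supplementary code point into a UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need surrogates")
    offset = code_point - 0x10000          # a 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


# 1024 high surrogates times 1024 low surrogates give the extra room.
EXTRA_CODE_POINTS = 0x400 * 0x400

high, low = to_surrogates(0x10000)
print(f"{high:04X} {low:04X}", EXTRA_CODE_POINTS)
```

The pair for the first supplementary code point is D800 DC00, and the scheme
as a whole adds 1,048,576 code points beyond the original 16-bit space.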

>I am also concerned that the need for idiosyncratic fonts be reduced (if
>not eliminated). For ancient Greek, I know of maybe 5-6 fonts with very
>different sets of characters. Linguist Software's SymbolGreek uses spacing
>and non-spacing characters to leave room for the Nestle-Aland editorial
>symbols (LS distributes Bible e-texts). GreekKeys fonts take the Unicode
>approach by encoding all characters/diacritical combinations, leaving no
>space for even the most useful editorial symbols. And other fonts may
>include editorial symbols, but put them in a different order, etc...

Actually, Unicode is going to increase the number of idiosyncratic fonts,
but to a considerable extent will allow them to work together properly. We
must remember that a font is not an input method, and that a keyboard
layout or set of symbol menus can include a variety of characters from
different fonts (as in math input editors). Thus you could set up your
system to use GreekKeys letters and SymbolGreek editorial marks, because
they would be identified by Unicode code points, not font positions. You
could type letters and diacritical marks as separate elements or as
precomposed combinations, no matter which approach your font takes. The Mac
keyboard and the Windows International keyboard both allow typing of most
diacritics separately, for display and printing using precomposed forms.
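
As an illustration of typing separate elements and ending up with a
precomposed form, here is a small sketch using Unicode normalization from
Python's standard unicodedata module. The choice of alpha with a smooth
breathing is just my example; any letter and mark with a precomposed
equivalent would behave the same way.

```python
import unicodedata

ALPHA = "\u03b1"             # GREEK SMALL LETTER ALPHA
SMOOTH_BREATHING = "\u0313"  # COMBINING COMMA ABOVE

# What an input method might deliver: base letter, then the mark.
typed = ALPHA + SMOOTH_BREATHING

# NFC composes the sequence into a single precomposed character where one exists.
precomposed = unicodedata.normalize("NFC", typed)

# NFD goes the other way, back to base letter plus combining mark.
decomposed = unicodedata.normalize("NFD", precomposed)

print(len(typed), len(precomposed), unicodedata.name(precomposed))
```

Either form identifies the same text, so a font that carries only the
precomposed glyph and a font that positions marks separately can both
display it.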

>I would like to see a well-defined set of characters partly because I am
>dealing with the problem of converting the texts of the Thesaurus Linguae
>Graecae (a comprehensive CD-ROM database of Ancient Greek texts) into
>displayable encodings. I am using Unicode as the intermediate
>representation. Many of the symbols are obscure, but some have general
>use. For the time being, I will use the private use area, but I would
>hope that I could, at some time, exchange a Unicode file with a scholar
>and know that she could read it.
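
The interim private-use approach might look something like the following
Python sketch. Everything here is an assumption for illustration: the symbol
names are placeholders, not real TLG symbols, and the helper names are mine.
Only the range U+E000 through U+F8FF is the actual private use area of the
Basic Multilingual Plane.

```python
PUA_START, PUA_END = 0xE000, 0xF8FF  # BMP private use area

# Placeholder names; a real table would come from a survey of the texts.
SYMBOLS = ["editorial-mark-1", "editorial-mark-2", "editorial-mark-3"]


def assign_private_use(names, start=PUA_START):
    """Give each symbol name the next free private-use code point."""
    table = {}
    for offset, name in enumerate(names):
        code_point = start + offset
        if code_point > PUA_END:
            raise ValueError("ran out of private use code points")
        table[name] = chr(code_point)
    return table


def encode(names, table):
    """Turn a list of symbol names into a string of private-use characters."""
    return "".join(table[name] for name in names)


def decode(text, table):
    """Recover symbol names; works only for a reader holding the same table."""
    reverse = {ch: name for name, ch in table.items()}
    return [reverse[ch] for ch in text]


table = assign_private_use(SYMBOLS)
print([f"U+{ord(ch):04X}" for ch in table.values()])
```

The catch is visible in decode: the file is readable only by someone who has
the same table, which is exactly why standard code points would be better
for exchange.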

Does this mean that there will be a CD-ROM of TLG in Unicode? And is there
any chance of the price coming down to the textbook level, say $40-50,
instead of the $300 they have been charging, so that all beginning Classics
students and amateurs could automatically buy one? I've had my eye on it
ever since it was announced, but nobody I know around here has a copy at
that price.

>I suspect that 128 codepoints would suffice for the most common symbols, but,
>as I mentioned, I have not done a comprehensive survey.
>
>Sorry for the length!

Don't apologize. The correct length for a technical discussion is long
enough to state all of the problems, proposed solutions, and pros and cons
clearly (see my sig).

>-Ronald S. Wood
> Halifax, NS, Canada

Edward Cherlin, Vice President, NewbieNet, Inc.
cherlin@newbie.net
http://www.newbie.net/Mentors/Cherlin
Helping Newbies to become "knowbies"
Point Top 5% of Web sites
Everything should be made as simple as possible, __but no simpler__.
Albert Einstein


