Miscellaneous comments/questions.

From: Alex Bochannek (alex@p9.com)
Date: Wed Jul 12 2000 - 18:06:59 EDT


I just returned from a lengthy trip through parts of Europe and
thought I would mention some observations.

In Greece, I noticed that almost all signs used monotonic Greek. I saw
some older road signs and a couple of store signs that used polytonic
Greek, but according to a Greek acquaintance, everybody is very happy
to not have to deal with it anymore. When did the switch actually
happen? He claimed it was only about a decade ago?

What was interesting to see was how the printing of the tonos
varied. For the most part it did look like a steeper acute as
described in Chapter 7.2 of Unicode 3. A number of times, I did see a
variation though which looked more like, e.g., U+03B1 U+0307, but I
suspect that to be just a font style. I also noticed that frequently,
certain characters are written in variants which at first were
completely indecipherable to me. I especially recall the beta
(U+03D0), theta (U+03D1), and maybe pi (U+03D6) as well as the
upper-case upsilon (U+03D2). As someone who learned classical Greek in
school, it added to the problems I already had with the modern
pronunciation of a lot of the letters ;-)
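
For anyone who wants to look at the code points mentioned above, here is
a minimal sketch using Python's standard unicodedata module. It only
shows what the character database says about them: the variant forms
fold to the ordinary letters under compatibility normalization (NFKC),
alpha with tonos decomposes to alpha plus a combining acute, and alpha
plus a combining dot above has no precomposed form. The helper name
describe() is made up for this illustration, and the exact character
names depend on the Unicode version bundled with the Python in use.

```python
import unicodedata

# Variant letter forms mentioned above: beta, theta, pi, and upsilon symbols.
variants = ["\u03d0", "\u03d1", "\u03d6", "\u03d2"]

def describe(ch):
    """Return (code point, character name, NFKC-folded character)."""
    return (f"U+{ord(ch):04X}",
            unicodedata.name(ch),
            unicodedata.normalize("NFKC", ch))

for ch in variants:
    print(*describe(ch))

# Alpha with tonos (U+03AC) decomposes canonically into two code points.
tonos_parts = [f"U+{ord(c):04X}"
               for c in unicodedata.normalize("NFD", "\u03ac")]

# Alpha + combining dot above has no precomposed form, so NFC keeps
# the sequence as two code points.
dot_above = unicodedata.normalize("NFC", "\u03b1\u0307")

print(tonos_parts, len(dot_above))
```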

One thing I found very confusing was the mixing of Latin and Greek
script which is very common on billboards. A couple of times I found
myself unable to tell whether a word was spelled in Latin or Greek
since it only used glyphs which both scripts share and hence I could
not derive the proper pronunciation at first. It was interesting to
see some brand name products and proper names transcribed while
sometimes Latin script is used in mid-sentence for foreign words.
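
The ambiguity described above can be sketched in a few lines of Python.
The table below is a hand-picked list of Greek capitals whose usual
glyphs coincide with a Latin capital; it is an assumption made for this
illustration, not official confusables data, and the function names are
hypothetical.

```python
# Greek capital -> Latin capital with (usually) the same glyph.
# Hand-picked for illustration only.
GREEK_TO_LATIN = {
    "\u0391": "A", "\u0392": "B", "\u0395": "E", "\u0396": "Z",
    "\u0397": "H", "\u0399": "I", "\u039a": "K", "\u039c": "M",
    "\u039d": "N", "\u039f": "O", "\u03a1": "P", "\u03a4": "T",
    "\u03a5": "Y", "\u03a7": "X",
}
LATIN_LOOKALIKES = set(GREEK_TO_LATIN.values())

def is_script_ambiguous(word):
    """True if every letter, once upper-cased, has a twin in the other script."""
    upper = word.upper()
    return bool(upper) and all(
        c in GREEK_TO_LATIN or c in LATIN_LOOKALIKES for c in upper
    )

def as_latin_lookalike(word):
    """Replace Greek capitals by their Latin look-alikes where one exists."""
    return word.upper().translate(str.maketrans(GREEK_TO_LATIN))

greek_nea = "\u039d\u0395\u0391"         # Greek word written with nu, epsilon, alpha
greek_taxi = "\u03a4\u0391\u039e\u0399"  # contains xi, which has no Latin twin

print(is_script_ambiguous(greek_nea), as_latin_lookalike(greek_nea))
print(is_script_ambiguous(greek_taxi))
print(is_script_ambiguous("HOTEL"))      # L has no Greek twin in the table
```

A sign set entirely in capitals from this table gives the reader no
visual clue about the script, which is exactly the situation on some of
the billboards.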

A similar issue was very interesting to observe in France and
Germany. The use of the English language in advertising seems to run
rampant in Germany while almost all ads that include English in France
(mostly tag lines) are followed by an asterisk and the literal French
translation somewhere near the edge of the sign. At first I thought it
was somewhat silly but when I saw how the German language currently is
absorbing English words like a sponge, the footnotes seemed to make
sense.

While in Germany, I bought a children's book that was first published
in 1921 and used a simplified Fraktur. As a native German, I had no
problems reading it, but for my wife who doesn't have German as her
native language, the long-s did throw her off at first. After I
explained the logic behind it, it was a lot easier, but she did make a
good point as to why it isn't used in the "sp" digraph. Maybe Otto can
shed some light on this?

In looking at older Fraktur text, it was very interesting to see how
foreign words are set in an Antiqua font similar to how in English
text foreign words are often in italics (and similar to the use of
Latin script in Greek above.) This brings up a font question I have
been wondering about for a long time: How similar are typesetting
features of fonts across different scripts? It seems that most
European scripts have print and cursive versions (I saw some beautiful
cursive signs in Greece), serif and mono-spaced fonts, and boldness
and slant seem to be common as well. But what about other scripts? It
seems that all(?) scripts currently represented in Unicode have at
least some typographical tradition albeit only scholarly in some
cases. How many of these features overlap, i.e., how much sense
does it make to define a serif font for CJK scripts? What about
italics in Arabic? Can there be a font family which covers all the
scripts in Unicode and which complies with the local typographic
esthetics? I apologize for the glyph-centric nature of the question

Two other topics of discussion that came up in recent weeks were very
interesting to me: Time zones and location names. The latter was
something I have been curious about myself for a while. It is true
that in Germany, for example, the state (Bundesland) is rarely
indicated when referring to a location. When ambiguity arises,
regional names or other landmarks are used to distinguish, sometimes
to the point of becoming part of the name. Examples: Hamm (Westfalen)
and Frankfurt am Main versus Frankfurt an der Oder. Even more
interesting to me though would be the local names of places, and I would
love to find a world atlas that first indicates every location's name
in the local language and script, then the accepted Latin
transliteration, and finally the name in English (or, say, German, if
published in Germany.) Are the large publishing houses equipped to
produce something like this? Or more importantly, would they use
Unicode for it? What about smaller printers (like for business cards?)

The other issue that was brought up about time zones is fascinating. A
while ago, when I was looking into locale issues, it occurred to me
that there really needs to be a comprehensive database of "cultural
defaults." For extensive localization, you need to know more than just
date format, language, and script (OK, I am oversimplifying the extent
of the locale information.) What I would like to see is a database
that lets me enter a location (e.g., as coordinates) and a date and
that then gives me extensive information about the most commonly used
language, script, currency, measurement units, local date/time,
holidays, and maybe even typesetting rules (which quotation marks to
use?) or travel information like electrical standard at the location
at that date. I know that would be a tremendous amount of work, but I
would think that it should be possible to find interested volunteers
who can fill in this data over time. Just having this data available
as it pertains to the current situation in the world would be
extremely helpful for localization work (wasn't there some discussion
on localized pictograms earlier?) Does this information already exist
somewhere? I am certain that parts of it do.
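
To make the idea a little more concrete, here is a minimal sketch in
Python of what one record and a lookup by coordinates and date might
look like. Everything in it is an assumption for illustration: the
class and field names are hypothetical, the bounding box for Germany is
deliberately crude (it overlaps neighboring countries), and the two
sample records simplify the currency changeover to a single date. A
real database would need proper region geometry and far more fields.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class CulturalDefaults:
    region: str
    bbox: tuple             # crude box in degrees: (south, west, north, east)
    valid_from: date        # inclusive
    valid_to: date          # exclusive
    language: str
    script: str
    currency: str
    quotation_marks: tuple  # (opening, closing)

# Illustrative sample data only; values are simplified placeholders.
GERMANY_BOX = (47.0, 6.0, 55.0, 15.0)
GERMAN_QUOTES = ("\u201e", "\u201c")
RECORDS = [
    CulturalDefaults("Germany", GERMANY_BOX, date(1990, 10, 3),
                     date(2002, 1, 1), "de", "Latn", "DEM", GERMAN_QUOTES),
    CulturalDefaults("Germany", GERMANY_BOX, date(2002, 1, 1),
                     date.max, "de", "Latn", "EUR", GERMAN_QUOTES),
]

def lookup(lat, lon, when):
    """Return the first record covering the point on the given date, or None."""
    for rec in RECORDS:
        south, west, north, east = rec.bbox
        inside = south <= lat <= north and west <= lon <= east
        if inside and rec.valid_from <= when < rec.valid_to:
            return rec
    return None

# Roughly Frankfurt am Main, queried for two different dates.
print(lookup(50.11, 8.68, date(2000, 7, 12)).currency)
print(lookup(50.11, 8.68, date(2005, 1, 1)).currency)
```

Keeping a validity interval on every record is what lets the same
coordinates give different answers for different dates.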

Finally, after going through several hundred emails from the list that
had accumulated while I was out, I have to agree with Jens
Siebert's wish to make the list more transparent, but
unfortunately also with Frank da Cruz's assessment that attempts like
this are often doomed. I personally would not like to see the list be
split into multiple subtopic lists since I find the range of
discussions quite enlightening and there is the risk that we will lose
some cross-pollination if experts only subscribe to their respective
lists. I am in favor of subject line tagging and keyword headers and
think that, if adhered to, a combination of a very small number of
agreed-upon subject tags and a more free-form keyword header could be
extremely useful for sorting and filtering (if you choose to.)
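
As a rough sketch of how such filtering could work on the receiving
side, the following Python uses the standard email module to pull a
bracketed tag out of the subject line and to split a Keywords header
into terms. The bracket convention, the tag "fonts", and the sample
message are assumptions made up for this illustration; nothing here
reflects a convention the list has actually agreed on.

```python
import re
from email import message_from_string

# A leading "[tag]" in the subject, optionally preceded by "Re:" prefixes.
TAG_RE = re.compile(r"^\s*(?:Re:\s*)*\[([^\]]+)\]", re.IGNORECASE)

def classify(raw_message):
    """Return (subject tag or None, list of lower-cased keywords)."""
    msg = message_from_string(raw_message)
    match = TAG_RE.match(msg.get("Subject", ""))
    tag = match.group(1).strip().lower() if match else None
    keywords = [k.strip().lower()
                for k in msg.get("Keywords", "").split(",")
                if k.strip()]
    return tag, keywords

# Hypothetical sample message.
SAMPLE = (
    "From: someone@example.org\n"
    "Subject: Re: [fonts] Serif designs for CJK\n"
    "Keywords: typography, CJK, serif\n"
    "\n"
    "Body text.\n"
)

print(classify(SAMPLE))
```

A mail filter could then sort on the tag alone and leave the free-form
keywords for searching.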


