Re: Romanized Cyrillic bibliographic data--viable fonts?

From: William Overington (WOverington@ngo.globalnet.co.uk)
Date: Tue Aug 27 2002 - 07:20:25 EDT


James Kass wrote as follows.

>Unless a font is fixed width, Latin combiners can't currently
>consistently combine well without "smart font technology"
>support enabled on the system. So, don't blame the Arial
>Unicode MS font if these glyphs don't always merge well.
>
>While awaiting Latin OpenType support, it might be a good
>idea to take a look at a well populated fixed width pan-Unicode
>font like Everson Mono.

James had also previously written.

>Best regards,

>James Kass,
>who is now adding U+FE20 .. U+FE23 to the font here.

I have had a look at the problem and decided, as the saying might go, that
even the best cook cannot make herb and carrot sauce starting with parsnips
and some huge quantity of thyme!

The combining characters need to cover ligatures for both TS and iu, so they
need to be placed high and arranged so that the result looks reasonable.

So, I began to think that the best display option would probably, in the
long term, be for an advanced format font to carry all of the necessary
glyphs and to produce a glyph in response to an appropriate four character
sequence.

However, the problem remains for people with other than the very latest
equipment, so I have decided to add some of these ligatures into the golden
ligatures collection.

This is quite an interesting task as, starting from the reference to the
pdf file, I worked back to the directory and there found various other pdf
files, some for other languages which use Cyrillic characters.

http://lcweb.loc.gov/catdir/cpso/romanization/russian.pdf

http://lcweb.loc.gov/catdir/cpso/romanization

Thus far, I have made the following allocations for the golden ligatures
collection. Please note that my approach is mathematical rather than
linguistic.

The first four pairs are for romanizing Russian names and unknown terms.

U+E7A0 for T U+FE20 S U+FE21
U+E7A1 for t U+FE20 s U+FE21

U+E7A2 for I U+FE20 E U+FE21
U+E7A3 for i U+FE20 e U+FE21

U+E7A4 for I U+FE20 U U+FE21
U+E7A5 for i U+FE20 u U+FE21

U+E7A6 for I U+FE20 A U+FE21
U+E7A7 for i U+FE20 a U+FE21

The next two pairs are additions for Belorussian and Ukrainian.

U+E7A8 for I U+FE20 O U+FE21
U+E7A9 for i U+FE20 o U+FE21

U+E7AA for Z U+FE20 H U+FE21
U+E7AB for z U+FE20 h U+FE21

I have started writing it all up for our web site, where it will hopefully
be posted, making clear that these code point allocations are for
producing displays, not for storing text in databases that need to be
searched and sorted.
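
As a rough illustration of that distinction, here is a minimal Python sketch.
It assumes the twelve allocations listed above and nothing more; the names
ALLOCATIONS, to_display and to_storage are hypothetical and are not part of
the golden ligatures collection or of any font software. The idea is that
text is stored with the four character sequences and is converted to the
Private Use Area code points only at the moment of display.

```python
# Allocations copied from the list above: PUA code point -> base letter pair.
ALLOCATIONS = {
    0xE7A0: "TS", 0xE7A1: "ts",
    0xE7A2: "IE", 0xE7A3: "ie",
    0xE7A4: "IU", 0xE7A5: "iu",
    0xE7A6: "IA", 0xE7A7: "ia",
    0xE7A8: "IO", 0xE7A9: "io",
    0xE7AA: "ZH", 0xE7AB: "zh",
}

LEFT_HALF = "\uFE20"   # COMBINING LIGATURE LEFT HALF
RIGHT_HALF = "\uFE21"  # COMBINING LIGATURE RIGHT HALF


def sequence_for(pair):
    """Build the four character sequence: letter, U+FE20, letter, U+FE21."""
    return pair[0] + LEFT_HALF + pair[1] + RIGHT_HALF


# Lookup tables for both directions (hypothetical helper names).
TO_DISPLAY = {sequence_for(pair): chr(cp) for cp, pair in ALLOCATIONS.items()}
TO_STORAGE = {pua: seq for seq, pua in TO_DISPLAY.items()}


def to_display(text):
    """Replace each listed four character sequence by its PUA code point."""
    for seq, pua in TO_DISPLAY.items():
        text = text.replace(seq, pua)
    return text


def to_storage(text):
    """Expand each PUA code point back to its four character sequence."""
    for pua, seq in TO_STORAGE.items():
        text = text.replace(pua, seq)
    return text


stored = "t\uFE20s\uFE21ar"      # t U+FE20 s U+FE21 followed by "ar"
shown = to_display(stored)       # U+E7A1 followed by "ar"
assert to_storage(shown) == stored
```

Sequences that are not in the list, such as a mixed case T U+FE20 s U+FE21,
pass through unchanged in this sketch, so they would still depend on the font
combining the half marks itself.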

However, I would like to make the list of encodings more comprehensive and
would welcome feedback on which ligatures to include. The files
churchsl.pdf and nonslav.pdf from the above named directory are the source
material that I have found so far which has not yet been covered in the
above encodings. Suggestions of other source material are welcome.

I have looked through them and found some very interesting characters, such
as what looks like a ligature of o macron and t, and also a t s ligature
with a dot above the whole ligature, as well as other ligatures along
similar lines. I would appreciate any information that anyone can provide
about expressing those in Unicode, either to the mailing list or, if a
writer prefers, privately by email.

The current documents about other ligatures already in the golden ligatures
collection can be found from the following introduction and index page.

http://www.users.globalnet.co.uk/~ngo/golden.htm

William Overington

27 August 2002


