Re: Combining diacriticals and Cyrillic

From: Andrew Cunningham
Date: Thu Jul 10 2003 - 20:40:43 EDT

    Hi Vladimir

    Yes, in theory your answer is Unicode, i.e. Cyrillic plus combining
    diacritics.

    In practice, though, how well this works will differ from operating
    system to operating system.
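
    To make the "Cyrillic plus combining diacritics" idea concrete, here is
    a minimal Python sketch. The helper name accent() is hypothetical and
    only for illustration. It shows that most accented Bulgarian vowels stay
    as a base letter plus a combining mark even after NFC normalization,
    because Unicode has no precomposed code point for them, while a few
    letters (such as i with grave) do compose to a single code point.

```python
import unicodedata

COMBINING_GRAVE = "\u0300"  # COMBINING GRAVE ACCENT
COMBINING_ACUTE = "\u0301"  # COMBINING ACUTE ACCENT


def accent(letter: str, mark: str = COMBINING_ACUTE) -> str:
    """Hypothetical helper: append a combining mark and normalize to NFC."""
    return unicodedata.normalize("NFC", letter + mark)


# CYRILLIC SMALL LETTER A (U+0430) + acute: no precomposed form exists,
# so the result remains a two-code-point sequence.
stressed_a = accent("\u0430")

# CYRILLIC SMALL LETTER I (U+0438) + grave: NFC composes this to U+045D.
i_grave = accent("\u0438", COMBINING_GRAVE)

for label, text in (("a + acute", stressed_a), ("i + grave", i_grave)):
    print(label, [f"U+{ord(ch):04X}" for ch in text])
```

    Whether the two-code-point sequence is then drawn correctly is entirely
    up to the font and the rendering system, which is where the platform
    differences come from.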

    I did a quick test on Windows in both word processors and web browsers.
    Everything displayed correctly (given certain combinations of fonts and

    There are two elements that need to be addressed:

    1) Appropriate fonts. I only know of two that are suitable. Code2000 (v.
    1.13) has the appropriate OpenType tables (I believe it uses the
    OpenType MarkToBase feature; others on the list will correct me if my
    memory is faulty). The second font is Doulos SIL (v. 0.6 Beta). This
    font has both OpenType tables and Graphite tables. Graphite is a
    rendering system developed by SIL International.

    2) You need a rendering system that supports the features. On Windows,
    this means that you will need a version of Uniscribe that supports the
    use of combining diacritics with Cyrillic characters. Currently none is
    available, except for the version in the MS Office 2003 Beta. I did a
    quick test using the two fonts above, and the characters displayed
    correctly. So from the point of view of word processing, there is a
    solution coming. This approach will also work with other applications
    that support Uniscribe, although you might have to wait until Microsoft
    releases a service pack that contains the Uniscribe update.

    I assume that Microsoft will update one or more fonts with the necessary
    features when they release Office 2003.

    I also tested in some Graphite-enabled software (WorldPad and a
    Graphite-enabled version of Mozilla). It seemed to work fine as well.

    Vladimir wrote:

    > Dear Ladies and Gentlemen,
    > Currently there is an ongoing effort in Bulgaria to resolve an issue concerning the way we write in Bulgarian.
    > Our problem is:
    > Usually a regular Bulgarian user does not need to write accented characters. There is one moderately common exception to this, but generally we do fine without accented characters. The problem is that in some special cases, or in more serious linguistic work, one definitely needs to be able to write accented characters (accented vowels).
    > One of the ideas is to invent new ASCII-based encodings containing the accented characters we need. This would add further disorder to the current mess of Cyrillic encodings, and would introduce problems with automated spellchecking.
    > Generally I believe it would be best to devise a Unicode-based solution.
    > One such solution is, for example, combining diacritical marks with the Cyrillic letters.
    > I composed a demo page:
    > and then made 10-20 shots of the results on Opera and IE on Linux, Windows 98 and Windows XP:
    > You can see that this approach yields _quite_ inconsistent and useless results, depending on the font, application and operating system being used.
    > Finally, I wonder if you could give us some advice:
    > 1. Is it possible somehow to improve this approach? I imagine, e.g., the font providing prepared combined symbols whenever the application asks for a combined Cyrillic+diacritical, instead of leaving the application to do the combination.
    > 2. Do you see another Unicode-based approach to the Bulgarian problem?
    > 3. Do you believe the solution should be sought outside Unicode?
    > Please excuse me for wasting your time,
    > Vladimir,
    > Bulgaria
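
    On the spellcheck concern raised in the quoted message: one advantage of
    staying within Unicode is that a stress mark can be removed before a
    dictionary lookup without touching letters that are genuinely distinct.
    The following is only a sketch under stated assumptions. The function
    name strip_stress() is hypothetical, and it assumes stress is marked
    solely with U+0301 (texts that mark stress with a grave accent would
    need a different rule).

```python
import unicodedata

# Assumption: only COMBINING ACUTE ACCENT is used as a stress mark.
STRESS_MARKS = {"\u0301"}


def strip_stress(text: str) -> str:
    """Hypothetical helper: drop stress marks, keep all other diacritics."""
    # Decompose so every combining mark becomes a separate code point.
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if ch not in STRESS_MARKS)
    # Recompose so letters such as short i (U+0439) return to one code point.
    return unicodedata.normalize("NFC", kept)


# "mayka" with a stress mark on the first vowel; U+0439 is short i.
stressed_word = "\u043c\u0430\u0301\u0439\u043a\u0430"
plain_word = strip_stress(stressed_word)

print([f"U+{ord(ch):04X}" for ch in plain_word])
```

    Because the breve of short i and the grave of U+045D are not in
    STRESS_MARKS, both letters survive the round trip unchanged.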

    Andrew Cunningham
    Multilingual Technical Officer
    Online Projects Team, Vicnet
    State Library of Victoria
    328 Swanston Street
    Melbourne  VIC  3000
    Ph. +61-3-8664-7430
    Fax: +61-3-9639-2175
