Re: Cyrillic - accented/acuted vowels

From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Fri May 06 2005 - 09:09:33 CDT

    On Sun, 1 May 2005, David Ulbrich wrote:

    > Has anyone come across the problem with accented/acuted vowels and
    > iota-vowels in Russian/Ukrainian/Belarusian...? Though only used in
    > textbooks and dictionaries as standard, the absence of these characters
    > brings about really difficult problems in printing, often solved in a
    > quite hardly acceptable way. Combining these characters with diacritic
    > combination signs really does not give good results, and I do believe
    > this would deserve separate signs.

    As others have noted, good-quality implementations are possible even
    though the characters are not encoded as separate Unicode characters
    but only representable in Unicode by writing a base character followed by
    a combining diacritic mark. But commonly used software, even when it is
    capable of displaying the combined character at all, is indeed rather
    poor at rendering it.
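
    To make the "base character followed by a combining mark" point
    concrete, here is a minimal Python sketch using only the standard
    unicodedata module. The variable names and the code_points helper are
    mine, for illustration only. It shows that normalization to NFC
    produces a single precomposed character for Latin o with acute, but
    leaves Cyrillic o with acute as two code points, since no precomposed
    character exists for it.

```python
import unicodedata

COMBINING_ACUTE = "\u0301"  # U+0301 COMBINING ACUTE ACCENT

latin_o = "o" + COMBINING_ACUTE          # Latin small o + combining acute
cyrillic_o = "\u043e" + COMBINING_ACUTE  # Cyrillic small o + combining acute

# NFC composes a sequence only if a precomposed character is encoded.
latin_nfc = unicodedata.normalize("NFC", latin_o)
cyrillic_nfc = unicodedata.normalize("NFC", cyrillic_o)


def code_points(text):
    """List the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(code_points(latin_nfc))     # one code point: U+00F3
print(code_points(cyrillic_nfc))  # still two: U+043E followed by U+0301
```

    So for the Cyrillic case the quality of the result depends entirely on
    how the font and rendering software position the combining mark.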

    I don't think it would be useful, or even realistic, to add such
    characters to Unicode - the general idea seems to be that new precomposed
    characters will not be added. This saves work and coding space, and it
    helps to avoid long discussions. After all, commonly used characters with
    diacritic marks have already been incorporated into Unicode as precomposed
    characters, so the rest are rather specialized. Well, Cyrillic letters
    with diacritics aren't _that_ rare - as you mention, they appear in
    textbooks and dictionaries (and grammars), and occasionally even in normal
    text (e.g., I've seen an accent on Cyrillic o in the word that is
    transliterated as "bolshaya", since in this word the position of the
    stress distinguishes the meanings 'big' and 'bigger').
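
    As an illustration of the "bolshaya" case, the following sketch marks
    the stress by inserting U+0301 after the stressed vowel. The function
    name mark_stress and the index-based interface are assumptions made
    for this example, not an established API.

```python
COMBINING_ACUTE = "\u0301"  # U+0301 COMBINING ACUTE ACCENT


def mark_stress(word, index):
    """Return word with U+0301 inserted after the character at index."""
    return word[: index + 1] + COMBINING_ACUTE + word[index + 1 :]


word = "большая"  # transliterated "bolshaya"

stress_on_o = mark_stress(word, 1)  # stress on the first syllable: 'bigger'
stress_on_a = mark_stress(word, 5)  # stress on the second syllable: 'big'

print(stress_on_o)
print(stress_on_a)
```

    The two results differ only in where the combining mark sits, and
    removing the mark from either gives back the same unmarked word.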

    If we think that the characters deserve "characterhood" in Unicode, the
    natural step would be to define names for them, as described in UAX #34,
    "Unicode Named Character Sequences"
    ( http://www.unicode.org/reports/tr34/ ).
    I was actually somewhat surprised to see that the list of currently
    defined named character sequences does not contain any Cyrillic letters
    with a diacritic mark. Maybe the idea has not become popular. After all,
    defining such a sequence does not guarantee anything, and has no immediate
    effect - but it might be a hint to implementors that the characters need
    special attention. In particular, the list _could_ be used so that separate
    glyphs are designed for those characters, instead of relying on the
    general algorithms that handle the rendering of diacritic marks.
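
    A rough sketch of how an implementation might use such a list: treat
    any sequence that has a name as one unit, so that a dedicated glyph
    can be selected for it, and fall back to generic mark positioning for
    everything else. The table entries below are hypothetical names
    invented for this example (no such sequences are in the current
    list), and split_units is likewise only an illustrative helper that
    handles two-code-point sequences.

```python
# Hypothetical named sequences; these names are NOT taken from the
# official list and serve only to illustrate the idea.
NAMED_SEQUENCES = {
    "\u043e\u0301": "CYRILLIC SMALL LETTER O WITH ACUTE",
    "\u0430\u0301": "CYRILLIC SMALL LETTER A WITH ACUTE",
}


def split_units(text):
    """Split text into units: named two-code-point sequences stay together,
    everything else becomes a single-code-point unit."""
    units = []
    i = 0
    while i < len(text):
        pair = text[i : i + 2]
        if pair in NAMED_SEQUENCES:
            units.append(pair)  # candidate for a separately designed glyph
            i += 2
        else:
            units.append(text[i])  # left to the general rendering path
            i += 1
    return units


units = split_units("бо\u0301льшая")
for unit in units:
    print(unit, NAMED_SEQUENCES.get(unit, "(generic rendering)"))
```

    A font or renderer could consult the same table to decide which
    sequences deserve their own glyph design.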

    -- 
    Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
    

