Re: Combining diacriticals and Cyrillic

From: William Overington (WOverington@ngo.globalnet.co.uk)
Date: Mon Jul 14 2003 - 04:14:32 EDT

    A possibly useful thing to do would be to make a list of those characters
    which you wish to produce which are not already encoded as precomposed
    characters in Unicode, sort them into alphabetical order and publish a list
    of them with code point assignments in the Private Use Area starting at
    U+EF00.
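
    As a rough sketch of how such a list might be drawn up, the Python below
    keeps only those base-plus-accent sequences which Unicode normalisation
    cannot compose into a single encoded character, sorts them, and numbers
    them from U+EF00. The function name assign_pua, the sample input and the
    use of code point order as a stand-in for alphabetical order are
    assumptions made for illustration only.

```python
import unicodedata

PUA_START = 0xEF00  # starting code point suggested in this message
PUA_END = 0xF8FF    # last code point of the BMP Private Use Area

def assign_pua(sequences, start=PUA_START):
    """Map each base+combining sequence to a Private Use Area code point.

    Hypothetical helper: sequences that NFC already composes into one
    encoded character are skipped, since they need no private assignment.
    """
    normalized = {unicodedata.normalize("NFC", s) for s in sequences}
    # Assumption: plain code point order stands in for alphabetical order.
    needed = sorted(s for s in normalized if len(s) > 1)
    if needed and start + len(needed) - 1 > PUA_END:
        raise ValueError("list does not fit in the BMP Private Use Area")
    return {seq: start + i for i, seq in enumerate(needed)}

# Illustrative input: Cyrillic vowels with a combining acute or grave accent.
wanted = ["\u0430\u0301", "\u043e\u0301", "\u0438\u0300", "\u0435\u0300"]
table = assign_pua(wanted)
for seq, cp in table.items():
    print(f"U+{cp:04X}  " + " ".join(f"U+{ord(c):04X}" for c in seq))
```

    In this sample the two grave-accented letters drop out of the list because
    Unicode already encodes them as U+045D and U+0450, leaving only the two
    acute-accented vowels to be assigned.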

    This would mean that fonts could be produced with each of those precomposed
    glyphs accessible from a Private Use Area code point.

    Please know that you can use any code points in the Private Use Area which
    you choose, yet I am suggesting U+EF00 upwards so that the code points would
    be consistent with my suggested use of the Private Use Area for interactive
    television broadcasts.

    For producing graphics files for the web or for local hardcopy printing it
    would be possible to use those glyphs directly from the Private Use Area,
    thereby producing an elegant graphic. As Unicode code point information is
    not placed in a graphic when lettering is added to a graphic, the result
    would not show that the Private Use Area had been used.

    I have devised a method called a eutocode typography file for use with
    languages of the Indian subcontinent. It would seem potentially useful for
    your application as well.

    http://www.users.globalnet.co.uk/~ngo/ast03300.htm

    As far as I know the eutocode typography file has not yet been implemented
    in any software applications; it is primarily a suggestion for the future in
    relation to interactive television, yet it may be useful elsewhere.

    http://www.users.globalnet.co.uk/~ngo/ast00000.htm

    Software would need to be developed (by you or by other interested people),
    yet essentially what is needed is software to take an input document and
    process it according to information in a eutocode typography file. In this
    way the Private Use Area codes would not be used for interchanging
    information, yet would be used locally so as to produce an elegant display.
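
    The local-only use of the Private Use Area described above might be
    sketched as follows. This does not implement the eutocode typography file
    format, which is not specified in this message; a plain Python dictionary
    stands in for it, and the table contents and the function names
    for_display and for_interchange are hypothetical.

```python
import unicodedata

# Hypothetical table: combining sequence -> Private Use Area character.
TO_PUA = {"\u0430\u0301": "\uEF00", "\u043e\u0301": "\uEF01"}
FROM_PUA = {pua: seq for seq, pua in TO_PUA.items()}

def for_display(text):
    """Replace listed sequences with PUA characters, for local rendering only."""
    text = unicodedata.normalize("NFC", text)
    # Longest sequences first, so a longer entry is never split by a shorter one.
    for seq in sorted(TO_PUA, key=len, reverse=True):
        text = text.replace(seq, TO_PUA[seq])
    return text

def for_interchange(text):
    """Restore the standard combining sequences before the text leaves the machine."""
    for pua, seq in FROM_PUA.items():
        text = text.replace(pua, seq)
    return text

source = "\u0437\u0430\u0301\u043c\u044a\u043a"  # Cyrillic word with an accented vowel
shown = for_display(source)
restored = for_interchange(shown)
```

    Only the string returned by for_display would be handed to a font which
    has the precomposed glyphs; anything stored or sent elsewhere would go
    through for_interchange first, so no Private Use Area code point is
    interchanged.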

    The best long-term solution, in my opinion, would be to send in a proposal
    to the Unicode Consortium to add the precomposed characters into regular
    Unicode. However, this takes time and may not be successful, and a Private
    Use Area solution does permit progress to be made now.

    Please know that my suggestion of publishing a list of Private Use Area code
    points may be regarded as controversial by some readers of this list and it
    is possible that you may be advised not to do it by some other readers.

    However, in my opinion, publication of code points for some uses of the
    Private Use Area does have some benefits for some applications. In this
    case it would at least achieve some consistency amongst those font makers
    who might like to add the precomposed characters into existing fonts. In
    relation to advanced format fonts the use of the Private Use Area code point
    in addition to the encoded access method does have the benefit of allowing
    access to the glyphs to people who are using a PC which does not have
    facilities for using the encoded access method of the advanced format font.

    William Overington

    14 July 2003

    -----Original Message-----
    From: vladimirg@need.bg <vladimirg@need.bg>
    To: unicode@unicode.org <unicode@unicode.org>
    Date: Thursday, July 10, 2003 10:23 AM
    Subject: Combining diacriticals and Cyrillic

    >Dear Ladies and Gentlemen,
    >
    >Currently there is an ongoing effort in Bulgaria trying to resolve an
    issue concerning the way we write in Bulgarian.
    >
    >Our problem is:
    >
    >Usually a regular Bulgarian user does not need to write accented
    characters. There is one middle-sized exception to this, but generally we do
    fine without accented characters. The problem is that in some special cases
    or more serious linguistic work, one definitely needs to be able to write
    accented characters (accented vowels).
    >
    >One of the ideas is to invent a new ASCII-based encoding, containing the
    accented characters we need. This would introduce additional disorder into
    the current mess of Cyrillic encodings, and would introduce problems with
    automated spellcheck.
    >
    >Generally I believe it would be best to invent a Unicode-based solution.
    >
    >Such a solution is, for example, combining diacritical signs with the
    Cyrillic symbols.
    >
    >I composed a demo page:
    >http://v.bulport.com/bugs/opera/426/balhaah_lonex_org/
    >
    >and then made 10-20 shots of the results on Opera and IE on Linux, Windows
    98 and Windows XP:
    >http://v.bulport.com/bugs/opera/426/balhaah_lonex_org/shots.html
    >
    >You can see that this approach yields _quite_ inconsistent and useless
    results, depending on the font, application and operating system being used.
    >
    >Finally, I wonder if you could give us some advice:
    >
    >1.
    >Is it possible somehow to improve this approach? I imagine, e.g., that the font
    could provide prepared combined symbols whenever the application asks for a
    combined Cyrillic+diacritical, instead of leaving the application to do the
    combination.
    >
    >2.
    >Do you see another Unicode-based approach to the Bulgarian problem?
    >
    >3.
    >Do you believe the approach should be looked for outside Unicode?
    >
    >Please excuse me for wasting your time,
    >Vladimir,
    >Bulgaria



    This archive was generated by hypermail 2.1.5 : Mon Jul 14 2003 - 05:07:06 EDT