Re: minimizing size (was Re: allocation of Georgian letters)

From: John H. Jenkins
Date: Fri Feb 08 2008 - 12:06:07 CST


    On Feb 8, 2008, at 3:52 AM, Sinnathurai Srivas wrote:

    > 1/
    > My question was: what are the criteria used to class a language as
    > one that requires complex rendering versus one that requires no
    > complex rendering? For example, Tamil could easily be implemented
    > without the need for any complex rendering. However, Tamil is
    > currently implemented using complex rendering. This was one of the
    > main discussions, and I have not seen a viable answer that
    > categorically states for such-and-such TECHNICAL reasons Tamil was
    > made one that requires complex rendering.

    There are a number of things people do with text which they want
    computers to help them do. The display of text (on screens or on
    paper) is one. This display comes in both low-end (typically
    newspaper quality) and high-end (books). In addition, people want to
    use computers for processes which depend on the *content* of the
    text. This would include searching, sorting, text-to-speech, and so on.
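To make that concrete, here is a minimal Python sketch (the `fold` helper is my own, purely illustrative): content-dependent processes such as searching want to compare what the text *says*, regardless of how it happened to be typed or displayed.

```python
import unicodedata

def fold(s: str) -> str:
    # Normalize to NFC, then case-fold, so that comparison depends on
    # the content of the text rather than on its input or display form.
    return unicodedata.normalize("NFC", s).casefold()

# German sharp s case-folds to "ss", so these match:
assert fold("Straße") == fold("STRASSE")

# Composed and decomposed accents normalize to the same content:
assert fold("re\u0301sume\u0301") == fold("résumé")
```
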

    Typically, a text encoding designed to optimize one process creates
    problems for the others. This is particularly true for attempts to
    support high-end typography. For years, for example, font vendors
    would provide "expert" versions of their fonts which contained the
    various swashes, ligatures, and other glyphs needed to typeset a
    book. These expert fonts made for beautiful display, but processes
    depending on the *content* of the text became hopelessly complex.
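A small Python sketch of the problem, using the standard `unicodedata` module: if ligature glyphs are stored directly in the text (as an "expert" encoding would), naive content search fails, and it takes Unicode's compatibility normalization (NFKC) to fold the presentation forms back into plain letters.

```python
import unicodedata

# Text stored with the fi (U+FB01) and ff (U+FB00) ligature code points:
expert = "\ufb01nal o\ufb00er"   # displays as "ﬁnal oﬀer"

# A naive content search for the word "final" fails on the ligated text:
assert "final" not in expert

# NFKC compatibility normalization maps the ligatures back to letters:
plain = unicodedata.normalize("NFKC", expert)
assert plain == "final offer"
```
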

    Unicode is designed to provide support for all textual processes and
    all the world's languages. This is the (technical) reason for
    adopting the character-glyph model, which in turn implies complex
    rendering as a requirement for even minimal rendering of some scripts,
    such as Arabic and the various South Asian scripts.

    I'll grant I'm not an expert on the South Asian scripts, but that's
    the basic technical reason. In order to provide support for *all*
    computerized processes involving text, Unicode has adopted an approach
    which requires complex rendering. No script absolutely *requires*
    this if rendering is the only process you're interested in. But
    simplifying rendering to such an extent inevitably makes other things
    more difficult.
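The Arabic case can be shown in a short Python sketch: Unicode does encode the positional glyph variants of a letter (as compatibility "presentation forms", kept for legacy round-tripping), but under the character-glyph model the text stores only the abstract letter, and the renderer chooses the contextual glyph.

```python
import unicodedata

# One abstract character, U+0628 ARABIC LETTER BEH, has four positional
# glyph variants encoded as compatibility presentation forms:
forms = {
    "isolated": "\ufe8f",
    "final":    "\ufe90",
    "initial":  "\ufe91",
    "medial":   "\ufe92",
}

# NFKC maps every presentation form back to the single abstract letter,
# which is what content processes (search, sort, speech) operate on:
for position, glyph in forms.items():
    assert unicodedata.normalize("NFKC", glyph) == "\u0628"
```
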

    > 2/
    > My question was: most proper publishing software does not yet
    > support complex rendering. How many years has it been since Unicode
    > came into being? When is this going to be resolved, or do we plan
    > on choosing an alternative encoding, as Unicode is not working?

    Well, what applications are you thinking of, and on what platforms?
    As I say, Word on Windows is fine for almost everything in Unicode,
    and Pages on Mac OS X is fine for all of it. It is resolved now in
    that sense.

    If your question is how long will it be before *all* publishing
    software supports all of Unicode, that is likely to be a long time. A
    lot of people still try to cut down on their development time by
    taking shortcuts which mean they don't support all of Unicode. So
    long as the European West and East Asia dominate the computer market,
    this is going to continue to be the case.

    But for most people running current software on current hardware, the
    problem is solved.

    > 3/
    > As for bitmap, I meant the "rigidly-fixed-width-character"
    > requirements.
    > At present, complex rendering (which is not working yet in these
    > systems) will produce extremely wide glyphs which will not be
    > accommodated by rigidly-fixed-width requirements. What is the plan
    > to resolve this?

    This is not a Unicode problem. By this I don't mean that Unicode
    doesn't care whether the problem is solved; I mean that the
    problem exists whether or not you use Unicode. Certainly rigidly-
    fixed-width-character layout is unnatural for all the world's scripts
    except those of East Asia. It is certainly possible to design a Unicode
    display system to support this if you really want to, but I imagine
    that any system limited to this kind of display is likely to have
    other limitations (e.g., limited number of glyphs available) which
    would make rendering of some of the more glyph-heavy scripts difficult.
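As a sketch of what such a fixed-width-cell system looks like in practice (Python, using the standard East Asian Width property; the `cell_width` helper is my own illustrative convention, in the style of a terminal):

```python
import unicodedata

def cell_width(ch: str) -> int:
    # Terminal-style convention: East Asian Wide ("W") and Fullwidth
    # ("F") characters occupy two cells, everything else one.
    # (Combining marks and joining behavior are ignored in this sketch.)
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1

assert cell_width("A") == 1   # Latin letter: one cell
assert cell_width("漢") == 2  # CJK ideograph: two cells

# An Arabic letter also reports one cell, but a real renderer must
# still join and reshape it contextually, which a rigid cell grid
# cannot express:
assert cell_width("\u062a") == 1
```
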

    > 4/
    > Storage size was one issue. Given the available technologies for
    > dealing with large sizes, I do not think this alone can be the
    > reason why an encoding should change, but it can be changed if
    > there are other compelling reasons for change.

    Unicode has proven to be an elegant solution to a wide variety of
    problems. It is not an ideal solution to all problems. So far, the
    advantages of Unicode have outweighed any disadvantages -- certainly
    this has been true for the large companies such as Microsoft which
    dominate the personal computer market. Any solution which aims to
    displace Unicode should do at least as well in solving the problems
    that Unicode solves, as well as being clearly superior in solving
    other problems.

    John H. Jenkins

    This archive was generated by hypermail 2.1.5 : Fri Feb 08 2008 - 12:08:59 CST