Re: Why people still want to encode precomposed letters

From: David Starner
Date: Sun Nov 16 2008 - 19:12:19 CST


    On Sun, Nov 16, 2008 at 1:12 PM, Jukka K. Korpela wrote:
    > David Starner wrote:
    >> On Sat, Nov 15, 2008 at 5:55 PM, Jukka K. Korpela wrote:
    >>> Too bad if you really need those characters. But encoding new
    >>> letters with diacritics as code points wouldn't help. Even if it
    >>> were possible to add them into Unicode, it would take many many
    >>> years before they have been added there and implemented widely in
    >>> fonts that are available on people's computers.
    >> I don't believe that.
    > Consider, for example, the case of DIAMETER SIGN U+2300, a fairly common
    > technical symbol.

    That strikes me as a different case; a lot of fonts that support a
    wide selection of letters don't support many symbols, even ones that
    might be expected.

    > If that's how far we've got in 15 years with a common, language-independent
    > symbol, how fast do you expect to make progress with language-specific
    > precomposed characters, typically for languages spoken by a relatively small
    > number of people?

    I think that's a non sequitur.

    >> I'm seeing Latin Extended C on Alanwood's page just fine,
    > I am not. I can see about half of the characters – and my computer has a
    > fairly good repertoire of extensive fonts, including Code2000 and Doulos
    > SIL. The problem is that currently I am using, as most of the world always
    > uses, Internet Explorer as web browser.

    The fact that one tool doesn't work well is not a good reason to
    dismiss all of them.

    > One of several fonts, you mean, I suppose. Well, it might be trivial to ask
    > people to do such things, but to begin with, a huge number of people can't
    > install any fonts on their computer at work or on a computer in some public
    > premises that they are using.

    I'm not sure that saying we should give up and go home is the right
    solution. If the goal is to support as many people using their
    languages as possible, then I think we should worry about those we can
    help, not just fret over those who can't change their systems.

    >> How many characters
    >> that would have been encoded in Unicode 5.0 if it hadn't been for that
    >> rule have several fonts that well-support them?
    > I cannot quite interpret (or actually even parse) the question. It seems to
    > postulate a rule that a character cannot be added to Unicode unless there
    > are several fonts that "well-support" it. Is there really such a rule?

    After I wrote this message, I glanced at it and realized I had gotten
    rather elliptical at points. Sorry. Try "How many (composed)
    characters that would have been encoded in Unicode 5.0 if it hadn't
    been for that rule (that new precomposed characters can't be encoded)
    have several fonts that well-support them?"

    > What I wrote was simply an argument saying that even if Unicode principles
    > were changed to allow new characters with diacritic marks added as
    > precomposed – which is an unrealistic assumption, made only for the sake of
    > argument – this would not magically make such characters usable in practice.

    I think it would make those characters significantly more available
    than the current situation. Since it is, in fact, an unrealistic
    assumption, I think Unicode should encourage those font makers making
    relatively complete fonts to include such characters. As one wild
    idea, the code charts could include a Latin Supplemental page
    listing characters with diacritics that are in use in orthographies,
    laid out just like a normal code chart, except that the code points
    are replaced with base+combining sequences.
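
    To make "base+combining code points" concrete, here is a minimal
    sketch using Python's standard unicodedata module. The open o with
    acute is my own choice of example, not one taken from the thread,
    and describe() is just a hypothetical helper for printing. It shows
    that NFC normalization composes a sequence only when a precomposed
    character already exists, so any other letter with a diacritic
    remains a two-code-point sequence that fonts have to handle as such.

```python
import unicodedata


def describe(text):
    """Return (code point, character name) pairs for each code point in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# "e" + COMBINING ACUTE ACCENT has a precomposed form,
# so NFC composes it into a single code point.
e_acute = unicodedata.normalize("NFC", "e\u0301")

# LATIN SMALL LETTER OPEN O + COMBINING ACUTE ACCENT has no precomposed form
# (illustrative choice), so NFC leaves it as a base+combining sequence.
open_o_acute = unicodedata.normalize("NFC", "\u0254\u0301")

print(describe(e_acute))
print(describe(open_o_acute))
```

    A chart page of the kind suggested above would list entries like
    the second one, labelled with the sequence rather than with a
    single code point.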

    Andrew Cunningham writes:

    > To add to Jukka's comment, if UTC were to accept precomposed characters,
    > which languages would benefit? The languages with a sufficient number of technical
    > users or agencies that would be in a position to submit a proposal.

    Unicode has done a pretty good job before of encoding characters used
    only by Lakota or by other individual languages; I don't see why it
    would be different if they were still encoding precomposed characters.
