Re: Looking for code ranges on specific languages.

From: Mark Davis
Date: Thu Jul 17 2008 - 18:01:12 CDT


    On Thu, Jul 17, 2008 at 3:19 PM, David Starner wrote:

    > On Thu, Jul 17, 2008 at 5:05 PM, Jonathan Woodburn wrote:
    > > Admittedly, Chinese is a huge character set; however, the font is still
    > > aimed at a low memory footprint.
    > What's a low memory footprint for you? You can fit about any
    > Latin-Greek-Cyrillic font on the market in a fraction of the space of
    > a small Chinese font. The question is why it's worth stressing about a
    > minimal fragment of the Latin characters, especially as even if you're
    > just doing English, someone is going to mention Dvořák or the like at
    > some point.
    > > However, I'm getting the impression that perhaps my understanding of
    > > Unicode is misinformed (or simply uninformed). Is every character not
    > > found in a common table for every language (i.e. Latin characters +
    > > foreign language accents + Cyrillic + Chinese, etc.)?
    > That's one way to describe it.
    > > If this is an exhaustive list, it will be a little tedious to read the
    > > HTML Source, but will certainly work. :)
    > It's not an exhaustive list; note that it doesn't include ô, ö, or é
    > in the English column. Even if you dismiss rôle and coöperate as
    > archaic, café is still fairly common.

    CLDR distinguishes two sets of characters for each language: a main set and
    an auxiliary set. The letter é is in the auxiliary set for English. There is
    much more detail on this in UTS #35.
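
To make the main/auxiliary distinction concrete, here is a minimal Python sketch. The names EN_MAIN, EN_AUXILIARY_SAMPLE, classify and classify_text are hypothetical, and the data is a hand-typed sample rather than anything read from CLDR: it assumes a-z as the English main set and lists only é, the one auxiliary member named above.

```python
# Hypothetical, deliberately tiny sample data for illustration only.
# Real exemplar sets come from the CLDR locale data described in UTS #35.
EN_MAIN = set("abcdefghijklmnopqrstuvwxyz")  # assumption: a-z as the main set
EN_AUXILIARY_SAMPLE = {"é"}  # only the member named in this thread

def classify(ch, main=EN_MAIN, auxiliary=EN_AUXILIARY_SAMPLE):
    """Return 'main', 'auxiliary', or 'other' for a single character."""
    lowered = ch.lower()  # compare case-insensitively
    if lowered in main:
        return "main"
    if lowered in auxiliary:
        return "auxiliary"
    return "other"

def classify_text(text):
    """Map each non-space character of text to the set it falls in."""
    return {ch: classify(ch) for ch in text if not ch.isspace()}

if __name__ == "__main__":
    print(classify_text("café"))
```

A font builder could use a check like this to decide which glyphs are required for a language and which are merely advisable. Because the sample auxiliary set is incomplete, a result of 'other' here does not mean the character is absent from the real data.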

    > > 1. Are all characters for every language found in a single Unicode
    > > definition so that U+XXXX can express any character?
    > Yes and no. You need to support combining characters for some
    > languages, though none of the languages you're looking at.
    > > 2. Would it be necessary to create individual fonts for particular
    > > (non-coexisting) languages?
    > Again, yes and no. Your languages are fine, but fine typography will
    > set the accent in Polish and French differently, and Russian and
    > Serbian italics use a different form for one of the letters, etc.
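
On the point about combining characters in the answer to question 1, a small Python sketch using the standard unicodedata module shows the difference between text that has precomposed code points and text that does not. The function names are hypothetical, and testing for a non-zero canonical combining class is a simplification, since some combining marks have class 0.

```python
import unicodedata

def combining_marks(text):
    """List the combining marks that appear once text is fully decomposed (NFD)."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns the canonical combining class (0 = none).
    return [f"U+{ord(c):04X}" for c in decomposed if unicodedata.combining(c)]

def needs_combining_support(text):
    """True if marks remain even after composing (NFC), that is, when some
    base+mark sequence has no single precomposed code point."""
    composed = unicodedata.normalize("NFC", text)
    return any(unicodedata.combining(c) for c in composed)

if __name__ == "__main__":
    name = "Dvo\u0159\u00e1k"  # Dvořák, written with precomposed letters
    print(combining_marks(name))          # marks hidden inside the precomposed letters
    print(needs_combining_support(name))  # precomposed forms exist for both
    print(needs_combining_support("q\u0303"))  # q + combining tilde has no precomposed form
```

Text such as Dvořák can be rendered from precomposed glyphs alone, whereas a sequence like q followed by U+0303 can only be shown correctly if the font and renderer position combining marks.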
