Re: Looking for code ranges on specific languages.

From: Mark Davis (mark.davis@icu-project.org)
Date: Thu Jul 17 2008 - 18:01:12 CDT

    Mark

    On Thu, Jul 17, 2008 at 3:19 PM, David Starner <prosfilaes@gmail.com> wrote:

    > On Thu, Jul 17, 2008 at 5:05 PM, Jonathan Woodburn <jonathan@woodburn.cc>
    > wrote:
    > > Admittedly, Chinese is a huge character set; however, the font is still
    > > aimed at a low memory footprint.
    >
    > What's a low memory footprint for you? You can fit about any
    > Latin-Greek-Cyrillic font on the market in a fraction of the space of
    > a small Chinese font. The question is why it's worth stressing about a
    > minimal fragment of the Latin characters, especially as even if you're
    > just doing English, someone is going to mention Dvořák or the like at
    > some point.
    >
    > > However, I'm getting the impression that perhaps my understanding of
    > > Unicode is misinformed (or simply uninformed). Is every character not
    > > found in a common table for every language (i.e. Latin characters +
    > > foreign language accents + Cyrillic + Chinese, etc.)?
    >
    > That's one way to describe it.
    >
    > > If this is an exhaustive list, it will be a little tedious to read the
    > > HTML Source, but will certainly work. :)
    >
    > It's not an exhaustive list; note that it doesn't include ô, ö, or é
    > in the English column. Even if you dismiss rôle and coöperate as
    > archaic, café is still fairly common.

    CLDR distinguishes two sets of exemplar characters for each language: a
    main set and an auxiliary set. é is in the auxiliary set for English.
    UTS #35 covers this in much more detail.
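
As a rough illustration of the main/auxiliary split, here is a minimal Python sketch. The sets are hand-written stand-ins, not data taken from CLDR: the main set is assumed to be plain a-z, and the auxiliary set holds only é, the one member mentioned in this thread. The names EN_MAIN, EN_AUXILIARY and classify are made up for the example.

```python
# Hand-written stand-ins for CLDR exemplar data; NOT copied from CLDR.
EN_MAIN = set("abcdefghijklmnopqrstuvwxyz")   # assumed main set for English
EN_AUXILIARY = {"\u00e9"}                     # é, the only member named in this thread


def classify(text, main=EN_MAIN, auxiliary=EN_AUXILIARY):
    """Sort the letters of text into 'main', 'auxiliary' and 'other' buckets."""
    buckets = {"main": set(), "auxiliary": set(), "other": set()}
    for ch in text.lower():
        if not ch.isalpha():
            continue  # skip spaces, digits, punctuation and combining marks
        if ch in main:
            buckets["main"].add(ch)
        elif ch in auxiliary:
            buckets["auxiliary"].add(ch)
        else:
            buckets["other"].add(ch)
    return buckets


if __name__ == "__main__":
    for word in ("caf\u00e9", "Dvo\u0159\u00e1k"):
        result = classify(word)
        print(word, {name: sorted(chars) for name, chars in result.items()})
```

With real CLDR data the auxiliary set would be larger, so letters that land in "other" here may well be auxiliary there. A font that covers only the main set would show gaps for anything in the other two buckets.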

    >
    > > 1. Are all characters for every language found in a single Unicode
    > > definition so that U+XXXX can express any character?
    >
    > Yes and no. You need to support combining characters for some
    > languages, though none of the languages you're looking at.
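
To make the point about combining characters concrete, here is a small Python sketch using the standard unicodedata module. The helper name compose is made up for the example. NFC normalization folds a base letter plus combining mark into a single precomposed code point where one exists; where none exists, the sequence stays as it is, and a font or renderer has to position the mark itself.

```python
import unicodedata


def compose(text):
    """Return text in NFC, which uses precomposed characters where they exist."""
    return unicodedata.normalize("NFC", text)


# e + COMBINING ACUTE ACCENT has a precomposed form (U+00E9).
e_acute = compose("e\u0301")

# q + COMBINING ACUTE ACCENT has no precomposed form, so it stays a sequence.
q_acute = compose("q\u0301")

for label, s in (("e + acute", e_acute), ("q + acute", q_acute)):
    codepoints = " ".join(f"U+{ord(c):04X}" for c in s)
    print(label, "->", codepoints)
```

The q example is artificial, but it behaves the same way as real orthographies whose letter and accent combinations were never given precomposed code points.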
    >
    > > 2. Would it be necessary to create individual fonts for particular
    > > (non-coexisting) languages?
    >
    > Again, yes and no. Your languages are fine, but fine typography will
    > set the accent in Polish and French differently, and Russian and
    > Serbian italics use a different form for one of the letters, etc.
    >


