Re: Why people still want to encode precomposed letters

From: philip chastney (philip_chastney@yahoo.com)
Date: Tue Nov 25 2008 - 04:18:39 CST


    --- On Mon, 24/11/08, John Hudson <john@tiro.ca> wrote:

    From: John Hudson <john@tiro.ca>
    Subject: Re: Why people still want to encode precomposed letters
    To:
    Cc: unicode@unicode.org
    Date: Monday, 24 November, 2008, 5:45 PM

    philip chastney wrote:

    > using OpenType tables brings lots of benefits to the font, but drastically
    restricts the software it can be used with

    Can you expand on this? From my perspective, OT Layout support looks pretty
    good in most places, the most notable weaknesses being in Apple's support,
    but they are working on improving that. What do you consider 'drastic
    restrictions'?
    I meant that there are relatively few applications which make use of OpenType tables
    > I should have said "the average designer of large fonts",
    because I get the impression that the "average" font designer still
    finds the full extent of Latin-1 a bit exotic

    Those would be below-average font developers.
    yes, and they produce a lot of fonts: scrapbooking fonts, grunge fonts, lots of commercial fonts
    [...] As of about five years ago, most
    of the type designers I know have gravitated to a pan-European Latin set, with a
    few extending to Cyrillic and Greek.
    I have no doubt that is true, but while the designers you know may all be the best in their field, it doesn't make them the majority  --  that would be like taking observations of Olympic athletes and applying them to weekend joggers
     
    but none of the foregoing is relevant to the main issue here
    Support for combining marks remains minimal, though, in large part because of
    the heritage of 8-bit sets with all precomposed diacritics. If we hadn't
    been able to get away without supporting combining marks for so long, our tools
    and workflows would be much more advanced than they are.

    > but with that change, I still think that the average designer of large
    fonts will want their font to be useful in as many contexts as possible, and
    will therefore generate as many pre-formed composites as possible

    But those pre-formed composites need to be accessed by some mechanism other
    than straight glyph-to-character mapping, since many will not have precomposed
    encodings in Unicode. So you still need OpenType Layout GSUB support, even if
    you don't think you can rely on GPOS support. I wonder what are the current
    holes in OTL support, such that the GSUB <ccmp> feature is supported but
    the GPOS <mark> and <mkmk> are not? Are these holes big enough --
    and unlikely to be soon filled in -- to encourage font developers to add
    thousands of precomposed glyphs to their fonts? I'm not keen on the idea,
    because I'd really like to reduce glyph set sizes, not increase them.
    please note, I am not advocating the use of preformed composites  --  the world would be a slightly better place if the idea of base+mark(s) could have been sold to interested parties before Unicode 1.0 was ever published, but that's not the way it happened
     
    and I'm sorry if my imprecise expression has diverted us from the main issue
     
    but back to the here and now  --  to recap: the only way to support the open-ended requirements of the philologists and the mathematicians is via OpenType tables
     
    these same OpenType tables can be used to support known composites in natural languages, and would make for a neater, smaller font, with better coverage of accented letters than the use of preformed composites, and would (could, should) eliminate any visual differences between, say, U+00E9 and the sequence U+0065, U+0301
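
As a rough illustration of the equivalence in question, here is a minimal Python sketch using only the standard `unicodedata` module. The helper names `canonically_equivalent` and `has_precomposed_form` are made up for this example, and the second one is only a heuristic: it asks whether NFC normalization folds a base+mark sequence into a single code point, which is the case for e + acute but not for, say, q + acute.

```python
import unicodedata

PRECOMPOSED = "\u00e9"       # LATIN SMALL LETTER E WITH ACUTE
DECOMPOSED = "\u0065\u0301"  # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT


def canonically_equivalent(a, b):
    """Two strings are canonically equivalent if their NFD forms match."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


def has_precomposed_form(sequence):
    """Heuristic: True if NFC composes the whole sequence into one code point."""
    return len(unicodedata.normalize("NFC", sequence)) == 1


if __name__ == "__main__":
    print(canonically_equivalent(PRECOMPOSED, DECOMPOSED))
    print(has_precomposed_form("e\u0301"))  # encoded as U+00E9
    print(has_precomposed_form("q\u0301"))  # no precomposed q with acute
```

A sequence for which `has_precomposed_form` is false can only be rendered well through the font's layout tables (or a glyph reached via substitution), which is the case the philologists and mathematicians keep running into.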
     
    but we lack an exhaustive list of known composites
     
    such a list would be a list of characters -- it's not a language issue -- which is one reason Unicode.org would be a suitable repository
     
    we know such a list cannot be called "namedsequences.txt", but it isn't really important what it's called, so long as it exists
     
    I may be wrong here, but a feeling seems to have grown that Unicode cannot/will not host such a list, whereas the precise truth is that Unicode cannot/will not host such a list if it's named "namedsequences.txt"
     
    how do we test the truth of that statement? would Unicode.org host a file called "knowncomposites.txt" (or "knowncomposites.xls")? who decides this sort of thing?
     
    once such a file exists, it could be used to test existing fonts for completeness, and it would provide a specification standard for new fonts
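
To make the testing idea concrete, here is a small Python sketch under loudly stated assumptions. No such file exists, so the format is invented for illustration: one sequence per line, written as space-separated hexadecimal code points, with `#` starting a comment. The font is represented simply as a set of mapped code points (how that set is extracted from a real font is left out), and the function names are hypothetical. The check only confirms that the base and each mark are mapped at all; it says nothing about whether the font's GSUB or GPOS tables position the marks properly.

```python
def parse_known_composites(text):
    """Parse the assumed format: hex code points per line, '#' comments."""
    sequences = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue  # skip blank and comment-only lines
        sequences.append(tuple(int(cp, 16) for cp in line.split()))
    return sequences


def uncovered(sequences, mapped_code_points):
    """Return the sequences with at least one code point the font lacks."""
    return [
        seq for seq in sequences
        if not all(cp in mapped_code_points for cp in seq)
    ]


SAMPLE_LIST = """\
# hypothetical excerpt of a known-composites list
0065 0301   # e + combining acute
0071 030C   # q + combining caron
"""

if __name__ == "__main__":
    font_code_points = {0x0065, 0x0301}  # made-up coverage for the example
    for seq in uncovered(parse_known_composites(SAMPLE_LIST), font_code_points):
        print(" ".join(f"U+{cp:04X}" for cp in seq))
```

A fuller test would also shape each sequence with the font and inspect the result, but even this coarse pass would show which entries in the list a font cannot attempt.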
     
    also, it would provide a suitable filing place for newly discovered composites in minority languages, and the world would be a slightly better place, as a result
     
    /phil
     

     


