Re: Why people still want to encode precomposed letters

From: John Hudson (john@tiro.ca)
Date: Tue Nov 25 2008 - 12:52:00 CST

Next message: verdy_p: "Re: Why people still want to encode precomposed letters"

Previous message: John Hudson: "Re: Why people still want to encode precomposed letters"
In reply to: philip chastney: "Re: Why people still want to encode precomposed letters"

philip chastney wrote:

> I meant that there are relatively few applications which make use of
> OpenType tables

Examples? In my experience, there are relatively few applications today
that do not make use of OTL tables, either directly or via system or
third-party layout engines. The level of support for particular layout
features may vary, but the gaps generally involve UI limitations, and
features that are intended to be on by default are the most likely to be
supported. GPOS mark-to-base and mark-to-mark positioning is supported
for European scripts in any app using recent versions of the Uniscribe
engine or ICU, including web browsers. In fact, app support for this
feature is more advanced and widespread than font support is.

> but back to the here and now -- to recap: the only way to support the
> open-ended requirements of the philologists and the mathematicians is
> via OpenType tables

[Aside: mathematics requires things that OTL alone cannot handle, hence
Microsoft's 'math' font table and dedicated math layout engine.]

> these same OpenType tables can be used to support known composites in
> natural languages, and would make for a neater, smaller, font, with
> better coverage of accented letters than the use of preformed
> composites, and would (could, should) eliminate any visual differences
> between, say, U+00e9 and U+0065, U+0301

Let's clarify a point of terminology: in font production, a 'composite'
is a glyph that uses one or more other glyphs as components, e.g. a
precomposed glyph for displaying a diacritic base+mark(s) combination. I
believe what you are talking about -- in terms of making neater, smaller
fonts -- is dynamic mark positioning, not composites, which is also what
I am advocating. But in my experience there is neither a need nor a
significant benefit (relative to the cost) in trying to define a target
set of combinations for dynamic mark positioning, because the same
mechanism can be designed to address arbitrary combinations.
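
As a small illustration of the encoding side of this, here is a sketch
in Python using only the standard unicodedata module. It shows that the
precomposed letter and the base+mark sequence from the quoted text are
canonically equivalent, and that a Cyrillic letter plus acute has no
precomposed form at all, so it can only be displayed via a composite
glyph or dynamic mark positioning. The variable names are my own, chosen
for illustration.

```python
import unicodedata

# Precomposed e-acute versus base letter e followed by combining acute.
precomposed = "\u00e9"
decomposed = "e\u0301"

# The two strings differ as code point sequences...
differ_raw = precomposed != decomposed

# ...but normalisation maps each form onto the other.
nfc = unicodedata.normalize("NFC", decomposed)
nfd = unicodedata.normalize("NFD", precomposed)

# Cyrillic small a plus combining acute: there is no precomposed
# character for NFC to compose to, so the sequence stays as two
# code points.
cyrillic = unicodedata.normalize("NFC", "\u0430\u0301")
```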

This discussion began with the observation that certain fonts appear to
support generic mark positioning for arbitrary combinations of Latin
letters plus marks, but not for Cyrillic letters. I happen to know some
of these fonts inside-out, and know that, in fact, they do support
*some* Cyrillic plus mark combinations: they support a targeted subset
that someone, somewhere identified as 'known combinations'. So the very
impetus of this discussion is an example of the downside of relying on a
subset of known combinations rather than generic positioning for
arbitrary combinations: you end up not supporting unknown combinations
which may, in fact, occur. The sad irony of such fonts is that exactly
the same anchor position that is used to position e.g. an acute accent
above a Cyrillic letter could be used to position any other above-centre
mark, and generic positioning costs nothing more in terms of font
development work than anchors for a subset of marks.
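
To make the anchor argument concrete, here is a minimal sketch in
Python of positioning by shared anchor class. It is not how any
particular layout engine or font tool is implemented; the glyph names,
anchor class names and coordinates are all invented for illustration,
and advance widths are ignored so that the base and the mark share one
origin.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anchor:
    x: int
    y: int


# Hypothetical base glyphs: one anchor per mark class, in font units.
BASE_ANCHORS = {
    "a": {"top": Anchor(250, 500), "bottom": Anchor(250, 0)},
    "cyr_a": {"top": Anchor(255, 500), "bottom": Anchor(255, 0)},
}

# Hypothetical marks: the class each mark attaches to, plus the mark's
# own anchor point.
MARKS = {
    "acutecomb": ("top", Anchor(0, 480)),
    "gravecomb": ("top", Anchor(0, 480)),
    "macroncomb": ("top", Anchor(0, 480)),
    "dotbelowcomb": ("bottom", Anchor(0, -30)),
}


def mark_offset(base, mark):
    """Return the (dx, dy) shift that puts the mark's anchor on the
    base glyph's anchor of the same class."""
    mark_class, mark_anchor = MARKS[mark]
    base_anchor = BASE_ANCHORS[base][mark_class]
    return (base_anchor.x - mark_anchor.x, base_anchor.y - mark_anchor.y)


# One "top" anchor on the Cyrillic letter serves every above-centre
# mark; no list of known combinations is consulted anywhere.
offsets = {m: mark_offset("cyr_a", m) for m in MARKS}
```

The lookup is keyed on anchor class rather than on base+mark pairs, so
adding a new above-centre mark means one new entry in the mark table and
no change to any base glyph.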

> once such a file exists, it could be used to test existing fonts for
> completeness, and it would provide a specification standard for new fonts

> also, it would provide a suitable filing place for newly discovered
> composites in minority languages, and the world would be a slightly
> better place, as a result

Such a list would be useful, as John Jenkins noted, in providing
real-world test cases. But I am not convinced that it is worth the
effort. In 1998, I compiled lists of base+mark combinations occurring in
the orthographies of about 200 African languages. It not only took a
long time, but the resulting data was unreliable because of instability
in the orthographies and their regional implementations. Precise,
targeted subsets of known combinations are difficult to compile and of
limited use. Generic positioning of marks categorised by shared anchors
is much easier to achieve and provides flexible results.

John Hudson

-- 
Tiro Typeworks        www.tiro.com
Gulf Islands, BC      tiro@tiro.com
You can't build a healthy democracy with people
who believe in little green men from Venus.
                    -- Arthur C. Clarke


This archive was generated by hypermail 2.1.5 : Tue Nov 25 2008 - 12:54:47 CST