Re: Language Tagging And Unicode

From: Christopher John Fynn (
Date: Wed Jan 19 2000 - 08:33:04 EST

David C. Brown (Windows dev) wrote:

> There is currently no support for language based script shaping
> in Uniscribe, but we did make provision in the design of the API:

> The Uniscribe ScriptItemize API takes a SCRIPT_CONTROL parameter containing
> various flags used alongside the Unicode codepoints. One of the fields in
> SCRIPT_CONTROL is uDefaultLanguage - the idea being for Uniscribe to pass
> this through to OpenType so that fonts which include alternate language
> based shaping (or alternate glyph lookup) can be controlled by the client.

Does this mean that Windows' OpenType layout services do support
language systems (even if MS is not currently making use of this in
Office and other applications)?

The OpenType spec says e.g. for GSUB (glyph substitution) lookup:

1 Locate the current script in the GSUB ScriptList table.
2 If the language system is known, search the script for the correct
   LangSys table; otherwise, use the script's default language system
   (DefaultLangSys table).
3 The LangSys table provides index numbers into the GSUB FeatureList
   table to access a required feature and a number of additional features.
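
The three steps above can be sketched roughly as follows. This is only a toy model, not how any real layout engine is implemented: the tables are plain Python dictionaries rather than binary font tables, the function name features_for is hypothetical, and the script, language and feature data are made up for illustration.

```python
# Toy model of the GSUB tables involved. Real fonts store these as binary
# tables; all data below is invented for illustration.
FEATURE_LIST = ["ccmp", "liga", "salt"]

SCRIPT_LIST = {
    "cyrl": {
        "default_lang_sys": {"required": None, "features": [0, 1]},
        "lang_sys": {
            # OpenType language system tags are four characters, space padded.
            "SRB ": {"required": 2, "features": [0, 1]},
        },
    },
}


def features_for(script_tag, lang_tag=None):
    """Return the feature tags that apply to a script/language pair."""
    # Step 1: locate the current script in the ScriptList.
    script = SCRIPT_LIST.get(script_tag)
    if script is None:
        return []
    # Step 2: use the matching LangSys if the language is known and the
    # font has one; otherwise fall back to DefaultLangSys.
    lang_sys = script["lang_sys"].get(lang_tag, script["default_lang_sys"])
    # Step 3: the LangSys gives indices into the FeatureList: an optional
    # required feature plus a number of additional features.
    indices = list(lang_sys["features"])
    if lang_sys["required"] is not None:
        indices.insert(0, lang_sys["required"])
    return [FEATURE_LIST[i] for i in indices]


print(features_for("cyrl", "SRB "))  # ['salt', 'ccmp', 'liga']
print(features_for("cyrl"))          # ['ccmp', 'liga']
```

An application that skips step 2 behaves like the second call for every language, so the Serbian-specific entry in this made-up font would never be reached.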

Do MS applications that use OT currently bypass (2) and just go straight to
the default language system for any given script? (Text in Word at
least has a language attribute so presumably that application "knows"
the language system.)

I've been trying to build OT fonts based on the assumption that applications
will use the LangSys table. Am I wasting my time?

BTW Does anyone know if Adobe's InDesign makes use of language system?

> Note that by 'complex scripts' Uniscribe implies some form of script
> specific processing as part of the rendering process. For example
> Uniscribe analyses Arabic codepoints to identify initial, medial, final
> and alone categories before using OpenType to do shaping.

> So it is not that Cyrillic is simple, just that we reserve the specific
> 'complex script' for scripts best handled by more than OpenType alone.
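
To make the Arabic example concrete, here is a rough sketch of the kind of positional analysis described above. It is not Uniscribe's actual code: the function name positional_forms is hypothetical, the joining table covers only a handful of letters, and transparent marks and ZWJ/ZWNJ are ignored. The result for each letter is named after the OpenType feature tag ('init', 'medi', 'fina', 'isol') that would then be applied.

```python
# Joining types for a few Arabic letters (D = dual-joining, R = right-joining).
# A real shaper would use the full Unicode joining data; characters missing
# from this table are treated as non-joining.
JOINING = {
    "\u0627": "R",  # ALEF
    "\u0628": "D",  # BEH
    "\u062A": "D",  # TEH
    "\u062F": "R",  # DAL
    "\u0631": "R",  # REH
    "\u0643": "D",  # KAF
    "\u0644": "D",  # LAM
    "\u0645": "D",  # MEEM
    "\u0648": "R",  # WAW
}


def positional_forms(text):
    """Classify each character as init, medi, fina or isol."""
    forms = []
    for i, ch in enumerate(text):
        jt = JOINING.get(ch)
        prev_jt = JOINING.get(text[i - 1]) if i > 0 else None
        next_jt = JOINING.get(text[i + 1]) if i + 1 < len(text) else None
        # A letter connects to the preceding letter only if that letter is
        # dual-joining and this letter can join at all.
        joins_prev = jt in ("D", "R") and prev_jt == "D"
        # Only dual-joining letters connect to the following letter.
        joins_next = jt == "D" and next_jt in ("D", "R")
        if joins_prev and joins_next:
            forms.append("medi")
        elif joins_prev:
            forms.append("fina")
        elif joins_next:
            forms.append("init")
        else:
            forms.append("isol")
    return forms


# KAF TEH ALEF BEH
print(positional_forms("\u0643\u062A\u0627\u0628"))
# ['init', 'medi', 'fina', 'isol']
```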

Presumably when it comes time to implement different forms of
Arabic script (Arabic, Persian, Urdu, etc.) or whenever it is felt
necessary to distinguish between Russian Cyrillic and Serbian
Cyrillic you will make use of LangSys - or are you going to
try and determine the language by analysing the text and then
switch to another font if that is available?

This currently looks like a bit of a "chicken & egg" situation to me. If
developers don't make use of "language system" then font developers won't
bother to put language system specific glyph variants and their lookups
in OT fonts; and if font developers don't include language system specific
features in their fonts, application developers won't bother to look for
them.

If the situation is that there are currently no applications that will make
use of language specific glyph variants in fonts, it seems to me that we
should be encouraging application developers to add this feature - not that
we need to add variant characters to the Unicode standard in order to work
round this.

- Chris
