Re: Ways to show Unicode contents on Windows?

From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
Date: Fri, 19 Jul 2013 21:20:39 +0100

Peter Constable <petercon_at_microsoft.com> wrote:
> Ilya Zakharevich wrote:

> > Why would one NEED to upgrade the OS to use Old Italic?

> You can't expect an OS like Windows XP to support Old Italic
> characters that weren't even defined in Unicode at the time it
> shipped.

That actually came as a great surprise to me. I once naively thought
that all that had to be done was to update the version of the Unicode
Character Database (UCD) that the system was using, and then only new
*properties* should be causing major trouble. Now scripts needing
reordering have their own problems, but that sort of problem is what
SIL developed Graphite for. (I fear the case for Microsoft Office to
support Graphite is steadily reducing.)

The problem with changes to the UCD arises partly because enough
developers prefer speed and compactness to flexibility.
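
To make that concrete, here is a minimal Python sketch. It relies only
on the standard library's unicodedata module, whose tables are compiled
into the interpreter from one fixed UCD version; the helper name
describe is purely illustrative.

```python
import unicodedata

def describe(ch):
    """Return (name, general category) from the UCD tables built into this interpreter."""
    # A character encoded after the built-in UCD version has no name here
    # and is reported as 'Cn' (unassigned). The standard library offers no
    # way to load a newer UCD at run time.
    return unicodedata.name(ch, "<unknown to this UCD>"), unicodedata.category(ch)

print("Built-in UCD version:", unicodedata.unidata_version)
print(describe("\U00010300"))  # Old Italic, encoded in Unicode 3.1
print(describe("\u0378"))      # a code point unassigned at the time of writing
```

The tables are fast and compact precisely because they are fixed at
build time, which is the trade-off described above.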

> That said, it turns out that a given version of Windows does support
> later-encoded characters such as Old Italic that have no special
> requirements fairly well -- provided you have a font and format your
> content with that font.

Are you sure this tolerance isn't by design?

> It is the case of simple rendering. Given a font, and a keyboard
> layout (both doable in user-land), it should “just work”. Or I am
> missing something?

The biggest thing you're missing is too much cleverness, and the second
is centralisation.

Word switches keyboard at the very least as you step through text,
which in simple cases is quite helpful. Also, Office has (at least)
three current fonts - one for simple scripts, one for complex scripts,
and one for CJK scripts. This in itself can cause problems with new
scripts - I have a fair bit of Tai Tham text in Open Document format
that has the wrong size because LibreOffice hesitantly changed the
script's classification from simple to complex.

The centralisation issue is that Indic rearrangement and selection of
Arabic and Syriac contextual forms seemed obvious things to abstract
away from fonts and handle centrally. Consequently, text is split by
script and each script run handled separately.
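
Splitting by script can be sketched roughly as follows. This is a
simplified illustration, not how any particular shaping engine does
it: the table SCRIPT_RANGES is a hypothetical hand-written stand-in
for the UCD Script property, and the rule that Common characters join
the run in progress is an assumption made for brevity.

```python
# Hypothetical, hand-written table of a few ranges. A real itemiser
# would use the Script property from the UCD, not this short list.
SCRIPT_RANGES = [
    (0x0041, 0x005A, "Latin"),
    (0x0061, 0x007A, "Latin"),
    (0x0600, 0x06FF, "Arabic"),
    (0x0700, 0x074F, "Syriac"),
    (0x0E00, 0x0E7F, "Thai"),
    (0x1A20, 0x1AAF, "Tai Tham"),
    (0x10300, 0x1032F, "Old Italic"),
]

def script_of(ch):
    """Look up the script of one character in the table above."""
    cp = ord(ch)
    for lo, hi, name in SCRIPT_RANGES:
        if lo <= cp <= hi:
            return name
    return "Common"  # spaces, digits, punctuation, and anything not listed

def script_runs(text):
    """Split text into (script, substring) runs."""
    runs = []
    for ch in text:
        script = script_of(ch)
        if runs and (script == "Common" or script == runs[-1][0]):
            # Same script, or a Common character: extend the current run.
            runs[-1][1] += ch
        elif runs and runs[-1][0] == "Common":
            # Leading Common characters adopt the first real script.
            runs[-1][0] = script
            runs[-1][1] += ch
        else:
            runs.append([script, ch])
    return [(script, chunk) for script, chunk in runs]

print(script_runs("ab \u0e01\u0e02"))
```

Note that a script missing from the table is silently folded into a
neighbouring run, which is one way a newly encoded script can end up
handled by the wrong code path.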

Combining the two, we can certainly have Word XP asking whether a
font supports a script, and refusing to use it for the script if it
doesn't declare that it does. I had to fiddle the OS/2 table of a Tai Tham
hack font (Lannaworld) to be able to use it. The font maps Latin and
Thai characters to Tai Tham glyphs, but when I downloaded the font it
didn't declare support for the 'Basic Latin' character range or the
'Latin-1' encoding. To get the font to work, I not only had to dodge
the constraints on Thai character sequences, I also had to change the
OS/2 table to declare that the font supported the Latin range and
encoding.
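
A minimal sketch of what such a declaration amounts to, not the tool
or the values actually used. The bit numbers are those given for
ulUnicodeRange1 and ulCodePageRange1 in the OpenType OS/2 table
specification; the starting masks and the helper names declare and
declares are hypothetical. Reading and writing the font file itself
is left out.

```python
# Bit numbers from the OpenType OS/2 table specification.
UNICODE_RANGE_BITS = {"Basic Latin": 0, "Latin-1 Supplement": 1, "Thai": 24}
CODE_PAGE_BITS = {"Latin 1 (cp1252)": 0, "Thai (cp874)": 16}

def declare(mask, bits, *names):
    """Return mask with the bits for the named ranges set."""
    for name in names:
        mask |= 1 << bits[name]
    return mask

def declares(mask, bits, name):
    """True if mask has the bit for the named range set."""
    return bool(mask & (1 << bits[name]))

# Hypothetical starting point: a font that declares only Thai.
ul_unicode_range1 = declare(0, UNICODE_RANGE_BITS, "Thai")
ul_code_page_range1 = declare(0, CODE_PAGE_BITS, "Thai (cp874)")

# An application checking these bits would reject the font for Latin text.
print(declares(ul_unicode_range1, UNICODE_RANGE_BITS, "Basic Latin"))

# After adding the Latin range and the Latin 1 code page, it would not.
ul_unicode_range1 = declare(ul_unicode_range1, UNICODE_RANGE_BITS, "Basic Latin")
ul_code_page_range1 = declare(ul_code_page_range1, CODE_PAGE_BITS, "Latin 1 (cp1252)")
print(hex(ul_unicode_range1), hex(ul_code_page_range1))
```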
 
I still don't think we've got to the bottom of Doug's PUA problem. For
all I know, he may have been violating the agreement he made with
Microsoft for the use of the PUA. I'm not aware of Microsoft
publishing a consolidated statement of this agreement, but I've a
feeling some characters are reserved for symbol fonts and yet others are
reserved for Thai glyphs. It's also conceivable that he trespassed on
the PUA assignments decreed by China for Tibetan.

Richard.
Received on Fri Jul 19 2013 - 15:24:39 CDT
