Re: Chess symbols, ZWJ, Opentype and holly type ornaments.

From: William Overington (WOverington@ngo.globalnet.co.uk)
Date: Fri Jun 21 2002 - 06:04:15 EDT


>>Suppose that one wishes to produce a chess diagram in a Unicode compliant
>>manner in a document produced using Word 97 running on either a Windows 95
>>platform or a Windows 98 platform, with a view also to save the document as
>>plain text. One way to do that would be to use a chess fount which is
>>mapped to my collection of code points for a chess fount. I would be
>>interested to know of how, if at all, that could be done in a document
>>produced using Word 97 running on either a Windows 95 platform or a Windows
>>98 platform using regular Unicode or XML.
>
>For someone unfamiliar with Windows and Word 97, are these proposed
>operations (inputting PUA character sequences, saving them as UTF-8 plain
>text, opening and correctly reading the document produced) technically
>possible on the systems mentioned, assuming you have an appropriate font?
>

Well, what I say can be done can indeed be done, yet not quite in the manner
suggested in your question.

The Private Use Area characters are input by using Insert from the menu bar,
then choosing Symbol from the drop down menu. A dialogue box is produced.
The symbol can be selected from a display of small boxed representations of
all of the characters available in the current fount. There is a cursor
which shows a magnified view of the character currently being considered for
insertion. It is possible to set the system up for characters to be custom
linked to the keyboard. For example, I once had Word 97 on a Windows 95
system set up so that I could enter the twelve accented characters used for
Esperanto by using such combinations as Alt+c for c circumflex and
Alt+Shift+c for C circumflex. I found that I could key in Esperanto about
as fast as I can key English (30 words per minute, which is not fast as
keying goes).

Saving is to a file type known as "Unicode Text". Reading the file back in
may display it wrongly, as one needs to specify the fount to read it
properly. For
example, Word 97 on a Windows 95 system may be set up to have Times New
Roman as the default fount. Suppose that that fount on that machine does
not have the fi ligature in the Private Use Area. Suppose that one creates
a file using the Tahoma typeface using the fi ligature from the Private Use
Area. One then saves the file as "Unicode Text". Closing down Word 97
then starting it up again and reading in the "Unicode Text" file brings back
the information, yet does not display it properly as the fount being used is
the default fount with which Word 97 on that PC starts up, namely Times New
Roman and that fount on that machine does not have the fi ligature in the
Private Use Area. However, displaying using the Tahoma fount displays the
fi ligature correctly.

As a passing note, Microsoft Corporation very generously provides more
modern versions of various founts for free download from its website. The
Times New Roman fount on the PC in the above example has simply not been
replaced with an updated version.

The format "Unicode Text" is not defined within the Word 97 help as far as I
can find, so I tried an experiment: producing a file in Word 97 Unicode Text
format, changing its file name extension to .raw, reading it into Paint Shop
Pro as a grey scale picture, then zooming in on the picture and deducing the
byte values from the grey levels.

I entered the word actor using an experimental fount of my own, with a, o
and r at their ordinary code points and a ct ligature at U+E707, as in my
golden ligatures collection of code points for ligatures.

http://www.users.globalnet.co.uk/~ngo/golden.htm

The grey levels, one per byte of the file, came out as follows.

255
254
97
0
7
231
111
0
114
0
13
0
10
0

After that, all of the grey scale picture which I produced is colour 0, so
I am unsure whether there were zero or more 0, 0 pairs after the 10, 0 in
the file.

I do not know the format of that Unicode Text file, but I think that it is
not UTF-8.

However, one can save in that format and read back in from that format.
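
The fourteen values listed above are consistent with UTF-16 in little-endian
byte order: 255, 254 is the byte order mark, and each following pair is one
16-bit code unit with the low byte first. The sketch below checks that
reading in Python, which is an assumption of this note and not a tool used in
the experiment. It shows only what these particular bytes decode to, not a
general specification of what Word 97 writes.

```python
# Byte values read off the grey scale picture, in file order.
observed = bytes([255, 254, 97, 0, 7, 231, 111, 0, 114, 0, 13, 0, 10, 0])

# Python's "utf-16" codec uses a leading byte order mark, if present,
# to choose the byte order, and drops the mark from the result.
decoded = observed.decode("utf-16")

# List each decoded character as a U+XXXX code point.
code_points = [f"U+{ord(ch):04X}" for ch in decoded]
print(code_points)

# For comparison, the byte values the same text would have in UTF-8.
as_utf8 = list(decoded.encode("utf-8"))
print(as_utf8)
```

The decoded code points are a, U+E707, o, r, then carriage return and line
feed, which matches the word actor with the ct ligature in second place. In
UTF-8 the ligature would occupy three bytes and no 0 bytes would follow the
ordinary letters, so the observed pattern does not fit UTF-8.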

Also, I have found that one can read such a file into Microsoft WordPad
under Windows 98.

One can even, in WordPad under Windows 98, get the ct ligature character
into the program using Alt 59143.
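
The number 59143 is the decimal form of the hexadecimal code point E707. A
minimal sketch in Python of that conversion follows; the helper names
alt_number and is_private_use are hypothetical, introduced only for this
illustration, and whether a given program accepts such Alt entry is not
something the sketch can show.

```python
def alt_number(code_point: str) -> int:
    """Convert a 'U+XXXX' style code point to its decimal number."""
    return int(code_point.upper().replace("U+", ""), 16)


def is_private_use(number: int) -> bool:
    """True if the number lies in the Private Use Area U+E000..U+F8FF."""
    return 0xE000 <= number <= 0xF8FF


ct_ligature = alt_number("U+E707")
print(ct_ligature, is_private_use(ct_ligature))
```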

An interesting document on that technique is as follows.

http://www.users.globalnet.co.uk/~ngo/pai04200.htm

That page also contains a link to the Microsoft free founts download page.

William Overington

21 June 2002


