Re: The golden ligatures collection ct ligature code in use.

From: William Overington (WOverington@ngo.globalnet.co.uk)
Date: Mon Jun 03 2002 - 07:35:06 EDT


Thank you for your posting. I feel that I have learned a lot from your
explanation.

>It was simple. I started SC UniPad <http://www.unipad.org> with a new
>document, opened Character Map, entered E707 in the edit control and
>pressed Enter, then copied and pasted the character into Outlook
>Express. I could have entered the character into UniPad in other ways,
>too, like typing "\ue707", then highlighting the sequence and converting
>it from ASCII-UCN to Unicode.

Thank you for the link. I have downloaded a copy of SC UniPad and tried it
out. I am very impressed. I managed to produce the U+E707 character in a
document by both methods, displaying it in the two possible ways as well.
Very impressive.
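
The second method, converting a typed sequence such as "\ue707" from
ASCII-UCN to Unicode, can be sketched in a few lines of Python. This is
only an illustration of the idea, not a description of how SC UniPad works
internally. The function name ucn_to_unicode is made up for the example,
and it handles only the four-hex-digit \uXXXX form.

```python
import re

def ucn_to_unicode(text):
    """Replace each \\uXXXX escape (four hex digits) with the character it names."""
    return re.sub(
        r"\\u([0-9A-Fa-f]{4})",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )

# A raw string keeps the backslash, so this is the plain ASCII text a user would type.
typed = r"Respe\ue707fully"
converted = ucn_to_unicode(typed)
```

After conversion the six typed characters "\ue707" have become the single
Private Use Area character U+E707.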

As I am interested in Esperanto, I tried entering some Esperanto text, which
worked well.

I then decided to copy and paste this into a PowerPoint 97 presentation
on a PC running Windows 98 and found that the accented character (a c^)
was not passed across properly.

I then tried pasting it into Word 97, which accepted the c^ character.

So I then highlighted the Esperanto text in Word and did a copy and paste
from Word to PowerPoint and the accented character transferred correctly. I
then formatted the text in PowerPoint to 200 points, italic and green.

So, it appears that SC UniPad used in conjunction with Word and PowerPoint
can be used to prepare elegant presentations in the languages of the world.
Wow!

If anyone would like to try SC UniPad out and would like to use Esperanto as
an example to try to repeat my experiment with perhaps different versions of
PowerPoint and Word and perhaps extending the experiment to other packages
yet does not know any Esperanto words, please know that there are a few,
including some with accents, in some of the illustrations of the following
page.

http://www.users.globalnet.co.uk/~ngo/euto0008.htm

>> Here it came out as a black rectangle in Outlook Express.
>
>That's right, it did. And it did for almost everyone else (except my
>neighbor, James Kass). That was my whole point of using it; unless you
>were using the One and Only Font to read my message, you would see a
>black rectangle, which is WORSE -- not better -- than if I had just used
>"c" and "t".
>

I feel that what happened is very interesting and should go down in Unicode
history as "The Respectfully Experiment".

William published a list of code points for ligatures. Quite independently
of each other, James used the list to add a code point for a ct ligature
to his fount, and Doug used the list to include a code point for a ct
ligature in his posting. Doug posted to the Unicode list, not directly to
James,
without any direct prior arrangement with James over the use of these
Private Use Area codes. The message was received by James' computer and
displayed correctly. So, the meaning of the sender was communicated to the
recipient using a code from the Private Use Area with a meaning obtained
from a published list.

This is, I feel, an experiment which should be properly documented in the
Unicode archives.

----

>> So, Doug has proved the benefit of my list existing and you have
>> proved the benefit of, in the future, using my suggested
>> classification codes.
>
>Nobody did any such thing. We proved that the use of this proposed
>ligature character obscures the intended text unless a custom-built font
>is used, and guarantees that a search of the Unicode Mailing List
>archive for the word "Respectfully" will never return a hit for that
>message.

I feel that an interesting aspect of all of this is that if one looks very carefully at what James did there may be benefit for the future, for James did something which I did not imagine happening. The thing is, James used my list within an OpenType fount, as a method of providing a designation for a glyph normally used within the fount, thereby making it accessible directly from outside the fount as well.

I have started to gather some documents about OpenType and am hoping to learn more about OpenType by studying them. I have noticed that on a PC OpenType really needs Windows 2000, yet that there is a way to use the founts on a PC with older operating systems, though not in such an effective way.

So, I wonder if perhaps the route to go on this is to say that, yes, the use of special code points for ligature characters is very useful at the present time for people with older operating systems, but that as time goes on the trend should be towards more modern solutions using OpenType and the ZWJ and ZWNJ code point system. Special code points for ligatures would then remain within an OpenType fount as a way of gaining direct access to its glyphs if needed for a special purpose. If I understand it all correctly, that approach would also make the ligatures and other special characters in an OpenType fount available directly to someone using, say, PowerPoint 97 on a Windows 98 platform.

On the matter of searching the archives for the word "Respectfully" and not getting a hit returned for that message, I wonder if that situation need persist for ever.

Suppose that the software which copies documents into the archive automatically converted any use of a golden ligatures code point into the constituent letters and ZWJ characters, and that, when an end user requested a search, the search engine ignored ZWJ characters. A search for the word "Respectfully" would then find the message.
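
The archive idea in the paragraph above can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any existing archive software: the table holds only the one pairing that appears in this thread (U+E707 for the ct ligature), and the function names archive_form, search_key and archive_contains are made up for the example.

```python
ZWJ = "\u200d"  # ZERO WIDTH JOINER

# Assumed table: Private Use Area code point -> constituent letters joined by ZWJ.
# Only the ct ligature discussed in this thread is listed.
LIGATURES = {"\ue707": "c" + ZWJ + "t"}

def archive_form(text):
    """Expand each whole-ligature code point into letters plus ZWJ."""
    return "".join(LIGATURES.get(ch, ch) for ch in text)

def search_key(text):
    """Drop ZWJ characters so that they are ignored when searching."""
    return text.replace(ZWJ, "")

def archive_contains(archived_text, query):
    """Search the archived text for the query, ignoring ZWJ on both sides."""
    return search_key(query) in search_key(archived_text)

posted = "Respe\ue707fully yours"
archived = archive_form(posted)
found_before = "Respectfully" in posted
found_after = archive_contains(archived, "Respectfully")
```

With these definitions a plain search of the posted text misses the word, while the search of the archived text finds it.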

I realize that that last paragraph misses out the issue of the non-uniqueness of a Private Use Area code point designation, yet if the ligatures were promoted to regular Unicode, that problem would not exist.

Now, in order to assist in implementing that process, there could also be a new character defined which could be used if so desired, and which would have zero width and be ignored in searches. This could be named WATERMARK-LIKE MEMORY THAT A WHOLE LIGATURE WAS ORIGINALLY USED FOR THE FOLLOWING LIGATURE and could be inserted to show that a whole ligature was used in the input.

There may be readers interested in having a go at writing some experimental software which would process texts containing ligatures in this manner and then recover copies in a choice of either the ZWJ format or the whole code point format for ligatures, so that users of older equipment could select the choice to suit their equipment. Such software could also interrogate an archive database so as to find out what percentage of ligature usage originally used the ZWJ method and what percentage originally used the whole code point method. For such readers I am pleased to suggest the code point U+E7C1 WATERMARK-LIKE MEMORY THAT A WHOLE LIGATURE WAS ORIGINALLY USED FOR THE FOLLOWING LIGATURE, in the hope that experimental databases produced by various experimenters might all be compatible one with another.
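
As a starting point for such an experiment, here is a rough sketch of the watermark scheme. It rests on assumptions of my own rather than on anything fixed: the watermark U+E7C1 is placed immediately before the expanded ZWJ sequence, the table holds only the ct pairing (U+E707) from this thread, and all of the function names are invented for the example.

```python
ZWJ = "\u200d"        # ZERO WIDTH JOINER
WATERMARK = "\ue7c1"  # suggested watermark code point, assumed to precede the ligature

# Assumed table: whole-ligature code point -> letters joined by ZWJ.
LIGATURES = {"\ue707": "c" + ZWJ + "t"}

def to_archive(text):
    """Expand whole-ligature codes, marking each one with the watermark."""
    return "".join(
        WATERMARK + LIGATURES[ch] if ch in LIGATURES else ch for ch in text
    )

def to_zwj_format(archived):
    """Recover a copy that uses only the ZWJ method."""
    return archived.replace(WATERMARK, "")

def to_whole_code_format(archived):
    """Recover a copy that uses whole code points, for older equipment."""
    text = to_zwj_format(archived)
    for code, sequence in LIGATURES.items():
        text = text.replace(sequence, code)
    return text

def usage_percentages(archived):
    """Return (percent originally whole code, percent originally ZWJ)."""
    total = sum(archived.count(seq) for seq in LIGATURES.values())
    whole = sum(archived.count(WATERMARK + seq) for seq in LIGATURES.values())
    if total == 0:
        return (0.0, 0.0)
    return (100.0 * whole / total, 100.0 * (total - whole) / total)

# One ligature entered as a whole code point, one entered with ZWJ.
archived = to_archive("Respe\ue707fully") + " perfec" + ZWJ + "t"
```

Because the watermark survives in the archived copy, the original choice of method can be counted later, while either output format can still be produced on request.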

Naturally, it would be desirable, if that feature is thought worth using, to promote the meaning of U+E7C1 as defined above to regular Unicode at the same time as promoting some or all of the golden ligatures collection and also implementing any other desired ligatures, such as those needed for calligraphy, in regular Unicode.

Naturally, it might be better all round if these code point and ligature pairings were promoted to plane 0 of regular Unicode, so that people can rely upon unique regular Unicode code points rather than having to choose between using the Private Use Area and having no facility at all. They could be accompanied by whatever notes people think it best to include so as not to affect the primary use of the ZWJ and ZWNJ codes for ligation control. The golden ligatures collection is intended as a work of art in itself, and one of the important aspects of art is its influence upon the world at large, in influencing the way that people think of things, so, as a work of art, it is already successful.

I wonder if people might like to discuss whether such promotion, accompanied by a table of how to break down each ligature into its constituents, and accompanied by whatever notes might be necessary about avoiding direct use in the long term as far as possible, would be feasible please.

I feel that one factor to be considered, though not the only one, is that older computing equipment, pre-Windows 2000, is likely to be in use for many years yet, though certainly to a lessening extent as time goes by.

I would be interested in reading a debate on this matter, as people with deep knowledge of the workings of OpenType could perhaps think this idea through and maybe also arrive at other possibilities which could be used. Also, any possible problems in doing this could be raised.

If there is room in plane 0 to do so, maybe a comprehensive set of code points could be allocated.

Readers interested in the possibilities might like to have a look at a newly written document on some Private Use Area code points for numbers, which is now available from within the following index page.

http://www.users.globalnet.co.uk/~ngo/golden.htm

In relation to this, readers might like to know that there is a sample font collection for various Fraktur faces available for free download on the www.waldenfont.com site, providing only some of the characters for each typeface, yet the Gutenberg face has all ten digits included in the sample and many of the samples have all of the digit characters included as well. There is certainly enough information there to get a good understanding of the encoding aspects.

William Overington

3 June 2002



This archive was generated by hypermail 2.1.2 : Mon Jun 03 2002 - 06:04:15 EDT