Re: The golden ligatures collection ct ligature code in use.

From: Doug Ewell (dewell@adelphia.net)
Date: Sat Jun 01 2002 - 21:04:25 EDT


Sorry if this message got sent already. (Outlook Express promised me it
was just saving a draft so I could edit it later.)

William Overington <WOverington at ngo dot globalnet dot co dot uk>
wrote:

> Yet ConScript has now withdrawn that allocation and now uses that
> code point for Ewellic.
>
> http://www.evertype.com/standards/csur/ewellic.html
>
> What is interesting is as to how Doug produced that effect. How was
> it done please?

I don't think the "effect" William was talking about was getting my
proposal for Ewellic listed in ConScript, but rather entering the
character U+E707 into an e-mail.

It was simple. I started SC UniPad <http://www.unipad.org> with a new
document, opened Character Map, entered E707 in the edit control and
pressed Enter, then copied and pasted the character into Outlook
Express. I could have entered the character into UniPad in other ways,
too, like typing "\ue707", then highlighting the sequence and converting
it from ASCII-UCN to Unicode.
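
For anyone without UniPad, the same ASCII-UCN conversion can be roughly
approximated in a few lines of Python. This is only a sketch, not what
UniPad does internally; the variable names are made up for
illustration, and the literal below is assumed to stand in for the six
characters typed into the editor.

```python
import unicodedata

# Assumption: these six ASCII characters stand in for what was typed
# into the editor before conversion.
ucn = r"\ue707"

# The "unicode_escape" codec interprets \uXXXX sequences in ASCII bytes.
ch = ucn.encode("ascii").decode("unicode_escape")

print(len(ucn), "ASCII characters ->", len(ch), "character")
print(f"U+{ord(ch):04X}")
print(unicodedata.category(ch))  # "Co" is the Private Use category
```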

As for the problem of sending a PUA character in an e-mail, I simply
made sure the encoding of my message was set to UTF-8. (More about
UTF-8 below.)

> Here it came out as a black rectangle in Outlook Express.

That's right, it did. And it did for almost everyone else (except my
neighbor, James Kass). That was my whole point of using it; unless you
were using the One and Only Font to read my message, you would see a
black rectangle, which is WORSE -- not better -- than if I had just used
"c" and "t".

> So I did
> two things. Firstly I looked in the message source and found the
> string =EE=9C=87 in the line of text. Secondly I did a copy and
> paste of the text from Outlook Express to Word 97 and then did a
> Save as HTML and then I looked at the source code of the HTML file
> which was produced. This produced the number 59143 in the sequence
> &#59143; so I then looked in the list at the following web page.
>
> http://www.Joern.De/tipsn128.htm#Ligaturen
>
> There, to my delight, was the number 59143 alongside my choice of
> U+E707 for the ct ligature.

Well, you probably could have guessed on your own what belongs between
"Respe" and "fully". :-)

> This is interesting, as the fact that your system was set up for
> ConScript and Doug wrote using a character from what is now called
> the golden ligatures collection provides a good practical example of
> the need for the use of the classification codes which I suggested
> some time ago.

No, it doesn't. It illustrates the supreme lack of interoperability
which would result from the quasi-standard use of these ligature code
points. You saw a black box and so did most other users. And even you,
the inventor of these things, had to jump through hoops to discover what
the text meant.

> If the Conscript registry is defined to be in one type tray and the
> golden ligatures collection is defined to be in another type tray,
> then, in future software, the two different meanings associated with
> the code point U+E707 could be clearly signalled, indeed the two
> meanings could both be signalled in the same document!
>
> I am wondering what is the coding that Doug used, namely =EE=9C=87
> in the line of text.

It's called quoted-printable UTF-8. U+E707 is expressed in UTF-8 with
the bytes 0xEE 0x9C 0x87. These three non-ASCII bytes are then
converted to the nine-character ASCII string "=EE=9C=87" so they will
pass through e-mail channels. This is basic character encoding stuff,
something you really should have a handle on if you are going to propose
grand new uses for Unicode.
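
The two steps can be checked with a short Python sketch using the
standard quopri module. The variable names are illustrative only, and
this shows the encoding itself, not how Outlook Express implements it.

```python
import quopri

ch = "\ue707"

# Step 1: Unicode code point -> UTF-8 bytes.
utf8 = ch.encode("utf-8")
print(" ".join(f"0x{b:02X}" for b in utf8))

# Step 2: non-ASCII bytes -> quoted-printable ASCII ("=" plus two hex digits).
qp = quopri.encodestring(utf8)
print(qp.decode("ascii"))

# Round trip: quoted-printable -> UTF-8 bytes -> the original character.
decoded = quopri.decodestring(qp).decode("utf-8")
print(decoded == ch)
```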

> I have also analysed the other black rectangle which appears in your
> posting by the same process. It comes out as decimal 9785 which
> converts to hexadecimal 2639 which, upon looking in the code charts,
> gives a variation on a smiley, namely a frowning face.

Now that glyph *is* in quite a few existing fonts. If you are using
Windows 2000 or XP and a fairly common Microsoft-provided font, you
should have seen it.
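
The decimal-to-hexadecimal lookups described above (59143 and 9785) do
not need Word 97 or a trip through HTML source. As a sketch, Python's
standard html and unicodedata modules give the same answers; the
placeholder text for the unnamed private-use character is my own.

```python
import html
import unicodedata

# The two numeric character references found in the saved HTML.
for ref in ("&#59143;", "&#9785;"):
    ch = html.unescape(ref)
    # Private-use code points have no character name, so supply a default.
    name = unicodedata.name(ch, "(no name: private use)")
    print(f"{ref} -> U+{ord(ch):04X} {name}")
```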

> So, Doug has proved the benefit of my list existing and you have
> proved the benefit of, in the future, using my suggested
> classification codes.

Nobody did any such thing. We proved that the use of this proposed
ligature character obscures the intended text unless a custom-built font
is used, and guarantees that a search of the Unicode Mailing List
archive for the word "Respectfully" will never return a hit for that
message.
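
The search problem is easy to reproduce. In the sketch below, the
string as_sent is a reconstruction of the word as it appeared in that
message (an assumption based on the "Respe" and "fully" fragments
above), and a plain substring test stands in for whatever the archive
search actually does.

```python
import unicodedata

intended = "Respectfully"
as_sent = "Respe\ue707fully"   # U+E707 in place of "c" and "t"

# A plain text search for the word finds nothing.
print(intended in as_sent)

# Compatibility normalization does not help: a private-use code point
# has no decomposition, so the text comes back unchanged.
normalized = unicodedata.normalize("NFKC", as_sent)
print(normalized == as_sent)
print(intended in normalized)
```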

-Doug Ewell
 Fullerton, California


