From: Doug Ewell (firstname.lastname@example.org)
Date: Sun Mar 16 2003 - 19:05:48 EST
Chris Jacobs <c dot t dot m dot jacobs at hccnet dot nl> wrote:
> A codepoint in itself does not specify a character.
> Font + codepoint does specify a character.
> Charset + codepoint also can specify a character.
All true for non-Unicode fonts. But then one is left to wonder why we
are discussing this on the Unicode list.
> Say font A has on E000 an apple symbol, while font B has there a
> banana symbol.
> Say for this reason I gave font B an offset of 0100
> Then on my system U+E000 in plaintext should indeed display an apple
> symbol and U+E100 a banana symbol.
> But if there are more fonts with an apple symbol U+E000 does not
> specify the font to use.
This isn't conformant and won't work.
Say I have font A with its apple at E000 and font C with an orange at
E000, and for this reason I gave font C an offset of 0100. Then I add
font B, with its banana at E000, and I have to give it an offset of 0200
because font C was installed first.
As a result, the same code point (U+E100) will display as an orange
(from font C) on my system and as a banana (from font B) on Chris's,
even though font B is installed on both.
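To make the collision concrete, here is a minimal Python sketch of the
per-user offset scheme. The one-symbol fonts, the install orders, and the
assign_offsets and render helpers are illustrative assumptions, not
anything defined by Unicode or by a real font system.

```python
PUA_BASE = 0xE000
BLOCK = 0x0100  # assumed spacing between fonts, as in the example above

# Each hypothetical font places its single symbol at E000.
FONTS = {"A": "apple", "B": "banana", "C": "orange"}

def assign_offsets(install_order):
    """Give each font the next free offset, in the order it was installed."""
    return {font: i * BLOCK for i, font in enumerate(install_order)}

def render(code_point, offsets):
    """Return the symbol shown for code_point on a system with these offsets."""
    for font, offset in offsets.items():
        if code_point == PUA_BASE + offset:
            return FONTS[font]
    return None  # no installed font covers this code point

chris = assign_offsets(["A", "B"])      # B gets offset 0100
doug = assign_offsets(["A", "C", "B"])  # C gets 0100, B gets 0200

print(render(0xE100, chris))  # banana
print(render(0xE100, doug))   # orange
```

The plain text carries only the code point, so nothing in it records which
offset table the author was using.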
This destroys interoperability even more than the use of custom "hacked"
ASCII fonts, because there is no way to control the madness if users are
allowed to select their own offsets. Fifty different users might all
have the same offset, or might have 50 different offsets. The author
has no way of knowing who can read his text and who cannot.
> It is the only consistent way to let plaintext utilities work properly
> in the PUA
It is anything but consistent.
> Of course such plaintext cannot as such be interchanged with other
> systems, but if needed it could be converted to a format which can be
> interchanged, the info if a certain codepoint represents an apple or a
> banana would be still there.
If apples and bananas are really needed as characters in a font -- which
is debatable -- then Chris might consider creating a Fruit And Vegetable
Ornaments Registry ("FAVOR") to assign APPLE to U+E000 and BANANA to
U+E001 once and for all.
This registry would cohabit in the PUA with the ConScript Unicode
Registry and the Golden Ligatures Collection and who knows what else,
and would be a de facto standard for the six authors worldwide who feel
compelled to represent apples and bananas in plain text.
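Unlike per-user offsets, a shared registry fixes the assignments for
everyone. A minimal sketch, assuming only the two FAVOR assignments
proposed above; the FAVOR table and the describe helper are hypothetical
names used for illustration.

```python
# Hypothetical registry table: only the two assignments proposed above.
FAVOR = {0xE000: "APPLE", 0xE001: "BANANA"}

def describe(text):
    """Name each code point in text using the shared registry table."""
    return [FAVOR.get(ord(ch), f"U+{ord(ch):04X}") for ch in text]

print(describe("\uE000\uE001"))  # ['APPLE', 'BANANA']
```

Any two systems that agree on the table will interpret the same plain
text the same way, whichever fonts happen to be installed.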
> Which Tengwar font?
> He could specify the font as Tengwar Sindarin and use whatever
> codepoint Dan Smith gave it.
> Or he could specify Code2001 and use E000
> If he used a Dan Smith's Tengwar font the charset should not be
> specified as unicode.
Unicode and the PUA were invented to put an end to this kind of
confusion and uncertainty. If the subject is going to be non-Unicode
fonts, I suggest moving this discussion to a more appropriate mailing
list.