Michael Everson
Thu Jul 04 2002 - 17:19:40 EDT

At 19:19 +0000 2002-07-04, Timothy Partridge wrote:
>Michael Everson said in reply to me
>> >- No scripts with a limited body of text in existence. (No need to exchange
>> >or analyse on computer.) E.g. Phaistos disk script
>> If the Phaistos disk were bilingual and deciphered, it could be added
>> even if there were only one document. Why not?
>I have the impression that there is a philosophy that the standard is for
>storing text on a computer for the purposes of processing and interchange
>and should avoid being cluttered with things that aren't needed. This seems
>to have precedence over covering all the important scripts.

Oh? What "important" scripts aren't on the Roadmap? The "important"
(national) scripts have all been encoded; Tibetan, Sinhala, Ethiopic,
Thaana, Myanmar, and Khmer were among the last "national" scripts
which were encoded. Of course we add Arabic characters for Urdu and
Georgian and so on as they appear.

>In the case of the Phaistos disk the length of text is so short that there
>seems little benefit in storing the plain characters even if it had text in
>another language too. Surely an image of the original or an accurate drawing
>is better at communicating the content and gives more of a feel for the disk
>than just the characters?

I suppose. But if Phaistos were deciphered I would be pleased to encode it.

>Personally I would be happy to see this particular script in the standard.
>Indeed scripts that appear on the cover of the Unicode book or CD have a
>rather poor record of appearing inside.

Oh no! We're found out!

>(The Greek from the Rosetta stone is there and I presume the KangXi
>dictionary page too as of 3.1, although I don't read CJK.)

I imagine the KangXi characters have been encoded by now. Egyptian is
on the list but is a big job. You may know I've done a fair bit
towards encoding Egyptian.

> > >- No symbols that are just a picture of something with no other meaning
> > >e.g. a dog. (These tend not to have a fixed conventional form.)
>> For instance, Blissymbols has a dog symbol in it. Granted,
>> Blissymbols is a separate script so maybe that isn't so convincing.
>> But what if a series of hotel symbols were added, with things like NO
>> SMOKING, NO DOGS, and GUIDE DOGS? Those do have some sort of
>> real semantic even though the glyphs may vary.
>Agreed although as you say there is a semantic involved. Interestingly in
>the NO... cases the semantic is provided by the circle with the diagonal
>line U+20E0. So there might be a desire to put a cigarette symbol into the
>standard (which is just a picture) so it could be granted a semantic by
>being combined with U+20E0. (In practice I doubt these symbols are used in
>running text so there is little call for them. Come to think of it, the
>colours involved are often standardised so a graphic would be better,
>discounting another thread.)

They appear in a great many travel guides as meaningful characters.
Research on things like this continues.
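
For what it's worth, the combining behaviour Timothy describes can be
sketched in a few lines of Python. This is only an illustration: the
helper name prohibit() is hypothetical, and the base character "A" is
an arbitrary stand-in for whatever pictorial symbol one might want to
negate. How well the result renders depends entirely on the font.

```python
import unicodedata

# U+20E0, the combining mark mentioned in the quoted text above.
ENCLOSING_CIRCLE_BACKSLASH = "\u20e0"

def prohibit(base: str) -> str:
    """Return base followed by U+20E0 (hypothetical helper, illustration only)."""
    return base + ENCLOSING_CIRCLE_BACKSLASH

seq = prohibit("A")  # "A" is only a stand-in base character

# List each code point with its character name and general category.
for ch in seq:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} ({unicodedata.category(ch)})")
```

The point is that the "NO" semantic lives in the enclosing mark, which
is a separate code point following the base, so the base symbol has to
exist as a character in its own right before it can be negated this way.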

> > >- No archaic styles of existing characters. E.g. dotless j.
>> There are some archaic characters already encoded, and N'Ko is going
>> to have two of them. Probably.
>I was thinking of same character different style of writing it. In other
>words easily done with a font change. (The mathematical styles are a special
>case because each style is effectively a different set of characters with
>their own meaning.) A character that used to be used but is no longer is
>perfectly legitimate.

Like LONG S? :-)

> > >- No control codes for fancy text. E.g. begin bold
>> We have BEGIN SLUR in Western Music already. Might have use for BEGIN
>> and END CARTOUCHE in Egyptian -- or might not. Research continues.
>I gave the example of bold because it doesn't change the essential meaning
>of the text in a significant manner. In your examples there is meaning
>involved; the cartouche is a determiner for a royal name for instance.

Ah. I didn't track your use of the word "fancy". Still it's
interesting to note that Budge printed red hieroglyphic text with a
bold black line above the entire text which was red in the original.
Possibly a cause to encode "BEGIN RUBRIC". (It's never neat and tidy,
is it?)

> > >- No characters that can be obtained by using a different font with
> > > existing characters and have no semantic difference from the existing
> > > characters.
>> Such as?
>Inclined Cyrillic Be. I have that in a font on my computer but I don't
>expect Unicode to have a separate character for it.

What is "inclined Cyrillic Be"?

>Or (to my mind unfortunately) Greek letters in a Coptic style.

Copticists agree that this unification was false, and the paper N2444
Coptic supplementation in the BMP at has won the day and
it is agreed that the two scripts be disunified and the Coptic
letters answering to the Greek be added to the standard in future.

