Re: Synthetic scripts (was: Re: Private Use Agreements and Unapproved Characters)

From: Thomas Chan (tc31@cornell.edu)
Date: Fri Mar 15 2002 - 21:20:37 EST


On Fri, 15 Mar 2002, Kenneth Whistler wrote:

> Ben Monroe wrote:
> > As it is a personal spelling, I never expected
> > Unicode to map a code point to this character to me.
>
> For those not following the Japanese in the UTF-8, Ben's name
> is Monryuu Ben in kanji. This is a sound-based name coinage
> for an English name. Mon 'gate' ryuu ~ ryoo 'dragon'. (Sorry,
> but I can't tell just from the kanji just exactly what
> pronunciation you would use.)
> And sticking the dragon inside the gate, which is then
> structured like a radical, is creating a phonological rebus
> that departs from the ordinary way that radical + phonological
> component characters are constructed. No wonder your teacher
> marvelled at how to pronounce it -- Han characters aren't
> constructed by putting two syllables together in one character
> to create a disyllabic pronunciation.

I thought it was a case of an obscure personal name character, until I saw
the connection to "Monroe".

There are a small number of these with polysyllabic Chinese or
Sino-Xenic readings where the reading is a concatenation of the readings
of its components, such as U+55E7 jia1lun2 'gallon' (U+52A0 U+4F96),
which are really ligatures. Some like U+337B heisei (Japanese era name)
(U+5E73 U+6210) are easy to recognize, but it would be easy for characters
like the "Monroe" one to slip through in the absence of information about
it.
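
As a rough illustration of why some of these are easier to spot than others,
here is a minimal Python sketch using only the standard unicodedata module.
The table and the helper names (LIGATURE_LIKE, describe,
decomposes_to_components) are made up for this example; the code points are
the ones cited above. It checks whether Unicode's own compatibility
decomposition spells out the components: it does for U+337B, but an ordinary
unified ideograph like U+55E7 carries no such mapping.

```python
import unicodedata

# Composite code point -> component code points, as cited in the text above.
# This table is hand-built for illustration; it is not a standard data file.
LIGATURE_LIKE = {
    0x55E7: (0x52A0, 0x4F96),  # jia1lun2 'gallon'
    0x337B: (0x5E73, 0x6210),  # heisei (Japanese era name)
}


def describe(cp):
    """Return 'U+XXXX <char> <Unicode name>' for a code point."""
    ch = chr(cp)
    return f"U+{cp:04X} {ch} {unicodedata.name(ch, '<unnamed>')}"


def decomposes_to_components(cp, parts):
    """True if NFKC normalization of cp yields exactly the given parts."""
    expected = "".join(chr(p) for p in parts)
    return unicodedata.normalize("NFKC", chr(cp)) == expected


for cp, parts in LIGATURE_LIKE.items():
    print(describe(cp))
    for p in parts:
        print("    ", describe(p))
    print("    decomposition on record:", decomposes_to_components(cp, parts))
```

Nothing comparable would exist for a private coinage like the "Monroe"
character, so a table like the one above would have to be compiled by hand.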

This is different from another small set where a polysyllabic
Chinese/Sino-Xenic reading is not a concatenation of the readings of its
components, such as U+544E ying1chi3 'English foot' (U+5C3A chi3 'foot'
plus a 'mouth' radical--indicating a semantic connection/modification?).

However, not everything that looks like a ligature really is one. For
example, U+6B6A wai1 'crooked' appears to spell out the phrase bu4 zheng4
'not straight' (U+4E0D U+6B63); U+81AD Cantonese chun 'animal egg' appears
to spell out the phrase mei sing yuk 'not yet become flesh' (U+672A U+6210
U+8089); and U+7526 su1 'to revive' appears to spell out a synonymous word
geng4sheng1 'to revive' (U+66F4 U+751F).

 
> > Should I really have any reason to expect Unicode to deal with this?
>
> Nope. Any more than it should deal with the fanciful but
> ubiquitous good luck coinages like the shuang1xi3 'double happiness'
> "character".

U+56CD and U+21155--the latter seems to be more common in printed matter.
Almost no Chinese dictionaries include them, but Korean ones seem to.
However, I think this case may have made the jump from ligature to
independent character, as it has acquired a monosyllabic xi3 reading, and
appears in the title of at least two movies (rather than appearing
independently as decoration). But more examples of this class can be
found at the "shinji" page[1] of "Kanji no shashin jiten"[2]
(content is in Japanese). Apparently Mojikyou has been okay with encoding
some of them.

[1] http://homepage2.nifty.com/Gat_Tin/kanji/sinji.htm
[2] http://homepage2.nifty.com/Gat_Tin/kanji/kaindex.htm

 
Thomas Chan
tc31@cornell.edu


