From: Tex Texin (tex@i18nguy.com)
Date: Sun Jul 06 2003 - 21:51:12 EDT
Thanks very much Ben.
The radical I see used on several web pages corresponds to either U+4E88 or
U+5B50 (child); the two are very similar.
If you look at the Unicode charts, the character 79ED has a radical on the
right which doesn't look like the child character at all (to me).
http://www.unicode.org/charts/PDF/U4E00.pdf
For Ext. B, the character James suggested, U+25771, also has the child radical.
However, the other two, U+25797 and U+25791, do not.
http://www.unicode.org/charts/PDF/U20000.pdf
If you look at the charts you can see what I am referring to.
The character is often represented by two characters as I did on my page or as
a glyph image. Very often it is not displayed at all, and is skipped in lists
of these characters used for numbers. That's why I am kind of stuck: I don't
want to recommend a single JIS-based character if it has been rejected
by many users, and I also don't want to lead in a new direction unless there
is a preponderance of agreement that it is the right thing to do. It might be best
to continue with the two-character or glyph approach.
I agree that this number is not going to be used a lot and therefore it may
not bear a great investment in sweating over which character(s) to use, but on
the other hand I like to make my pages accurate with reasonable
recommendations.
I was hoping Unicode 4.0 would have a clear solution to the problem. If
the character U+25771 were in the BMP, and if font vendors told me they were
going to support it reasonably soon, then it would seem to me to be the right
thing to recommend (with a caution perhaps) going forward. Given that it is part
of Ext. B, support seems far away at best, so it is not a good recommendation.
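
To make the BMP versus Ext. B distinction concrete, here is a minimal Python
sketch (an illustration only; the helper names are hypothetical). It computes
the UTF-16 code units for the four candidate characters named in this thread,
showing that U+79ED fits in one code unit while the Ext. B candidates each
need a surrogate pair. It assumes valid scalar values and does not check for
lone surrogate code points.

```python
# Illustrative sketch only; helper names are hypothetical, not from any library.

def is_bmp(code_point):
    """Return True if the code point lies in the Basic Multilingual Plane."""
    return 0x0000 <= code_point <= 0xFFFF


def utf16_units(code_point):
    """Return the list of UTF-16 code units that encode one code point."""
    if is_bmp(code_point):
        return [code_point]
    # Supplementary planes: split the 20-bit offset into two 10-bit halves.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)   # low (trail) surrogate
    return [high, low]


def describe(code_point):
    """Format one line summarising the plane and UTF-16 form of a code point."""
    units = " ".join("%04X" % u for u in utf16_units(code_point))
    plane = "BMP" if is_bmp(code_point) else "supplementary (needs surrogates)"
    return "U+%04X: %s, UTF-16 = %s" % (code_point, plane, units)


# The shi/jo candidates mentioned in this thread.
CANDIDATES = [0x79ED, 0x25771, 0x25797, 0x25791]

if __name__ == "__main__":
    for cp in CANDIDATES:
        print(describe(cp))
```

Any software that handles only single 16-bit units will mishandle the last
three candidates, which is the practical concern behind "surrogate support".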
At this point, I probably should footnote the character and provide the
suggestions you have documented (which I appreciate!). Before I do, let me
know what you think of the glyphs in the Unicode charts, to make sure that the
right-side radicals there are something you would agree are reasonable
alternatives to child. They look nothing like U+4E88 or U+5B50.
I am hoping this won't take 1 jo/shi of emails to straighten out!
BTW, I should mention my knowledge around Kanji is next to nil and my sources
were mostly other web pages I searched out, so this is purely a layman's effort
on my part, and given the accuracy of the web, I will change positions easily.
tex
Ben Monroe wrote:
>
> [UTF-8]
>
> Tex Texin wrote:
>
> > On shi/jo the glyph I see in Windows charmap doesn't look
> > right. Perhaps it is my particular set of fonts. I expect to
> > see a radical on the right that looks like the character for
> > child, and charmap shows something else. I'll wait to see if
> > someone else chimes in pro or con.
>
> The right side of the character probably has U+4E88 予 instead of U+5B50 子 (child). These two characters are different. As I mentioned before, there are several different glyphs used to write shi/jo.
> Several of the forms are U+79ED 秭, U+25797 𥞗, U+25791 𥞑, and U+25771 𥝱.
> These all express the value of 10^24 and are read as shi or jo, depending on your source.
>
> > Also, I wonder what the correct thing to recommend would be?
> > Assuming surrogate support was consistently available, and
> > fonts were available containing this character (are there any
> > today?), since the character was not generally being written
> > as a single character until now (and I am still not sure if the
> > pair U+79BE U+4E88 is the correct alternative), would it be
> > right to recommend this for people to use in number writing
> > going forward? I tend to think of Ext. B as there for historic
> > and special characters, not those that might be used every day.
>
> If you are worried about surrogate support and font availability, then U+79ED may be the best; it is attested and documented, and listed in modern dictionaries. Both Koujien and Daijirin (available online at http://dictionary.goo.ne.jp/index.html?kind=jn&mode=0) use this glyph for their entry of "shi". Otherwise, go for U+25771, which seems to be attested the most in documents. Daijirin uses this glyph for its entry of "jo", but Koujien does not list it.
>
> However, these are not really "every day" characters, at least in my experience. Most people will know "chou", some will know "kei", fewer will know "gai", and even fewer will know "shi/jo". I would be a little surprised if many people could list the rest off the top of their head without prior special study or other references.
>
> Ben Monroe
> [For those looking for my original e-mail message that Tex responded to, I accidentally sent it under a new address forgetting to update my subscription information after my e-mail address changed. (Old one is still being forwarded to this one.)]
-- 
-------------------------------------------------------------
Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
Xen Master                          http://www.i18nGuy.com
XenCraft                            http://www.XenCraft.com
Making e-Business Work Around the World
-------------------------------------------------------------
This archive was generated by hypermail 2.1.5 : Sun Jul 06 2003 - 22:32:21 EDT