Re: The character for 10**24 in Japanese numbers (jo)

From: Allen Haaheim (haaheima@interchange.ubc.ca)
Date: Tue Jul 08 2003 - 01:17:46 EDT

    Hi,
    Any similarity between U+5b50 子 at 3 strokes, and U+4e88 予 at 4, is
    superficial. For example, U+8ffa 迺 is a common variant for 4e43 乃 ("is,"
    "then") but can't be confused with 逎 900e ("alcoholic beverage"). (Context
    usually makes that clear pretty quickly!) Since 79ed and 25771 seem to be
    interchangeable for your purpose, and both are still in fairly
    common use, I agree a note would be a good idea.

    (Someone correct me please, if the following method should not be used for
    Japanese--I'm sure it is fine with Chinese.) If you can't get 25771 to work,
    for the note you might consider using a certain "Ideographic Description
    Character," namely U+2FF0 ⿰, which indicates that the two characters to
    follow are meant to represent the left and right components of the
    originally desired character. Otherwise you will simply have two incorrect
    graphs (79be 禾 and 4e88 予), even if what you really mean is obvious
    enough. E.g., something like "Also written as ⿰禾予."
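
    A minimal sketch of building such a note in Python, assuming the
    left/right description with U+2FF0 is acceptable for your page. The
    function name describe_left_right is made up for illustration; it simply
    concatenates the description character with the two components.

```python
# U+2FF0 IDEOGRAPHIC DESCRIPTION CHARACTER LEFT TO RIGHT
IDC_LEFT_TO_RIGHT = "\u2ff0"


def describe_left_right(left, right):
    """Return a description sequence: U+2FF0, then the left and right components."""
    return IDC_LEFT_TO_RIGHT + left + right


# 79be (grain) on the left, 4e88 on the right, as suggested in the message.
ids = describe_left_right("\u79be", "\u4e88")
note = "Also written as " + ids + "."

print(note)
# List the code points so the sequence can be checked by eye.
print(["U+%04X" % ord(ch) for ch in ids])
```

    The result is three separate characters, not one: a reader (or a font)
    still sees a description of the graph, not the graph itself.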

    There is an explanation in English here (bottom of page):
    http://deall.ohio-state.edu/grads/chan.200/cjkv/elevator.html

    Allen

    When you raise chickens, you let them eat what they want,
    As soon as they fatten, you boil them in a pot.
    This plan is the best for the man who owns them,
    But for heaven's sake, don't let the chickens know!

    Yüan Mei (1776), trans. J.D. Schmidt

    ----- Original Message -----
    From: "Tex Texin" <tex@i18nguy.com>
    To: "Ben Monroe" <bendono@comcast.net>
    Cc: <unicode@unicode.org>
    Sent: Sunday, July 06, 2003 6:51 PM
    Subject: Re: The character for 10**24 in Japanese numbers (jo)

    Thanks very much Ben.

    The radical I see used on several web pages corresponds to either U+4E88 or
    U+5B50 (child); they are very similar.
    If you look at the Unicode charts, the character 79ED has a radical on the
    right which doesn't look like the child character at all (to me).
    http://www.unicode.org/charts/PDF/U4E00.pdf

    For ext. B, the character James suggested U+25771 also has the child
    radical.
    However, the other two U+25797, U+25791 do not.
    http://www.unicode.org/charts/PDF/U20000.pdf

    If you look at the charts you can see what I am referring to.

    The character is often represented by two characters as I did on my page or
    as a glyph image. Very often it is not displayed at all, and is skipped in
    lists of these characters used for numbers. That's why I am kind of stuck
    not wanting to recommend a single JIS-based character, if they have been
    rejected by many users, and also not wanting to lead in a new direction,
    unless there is a preponderance of agreement it is the right thing to do. It
    might be best to continue with the two-character or glyph approach.

    I agree that this number is not going to be used a lot and therefore it may
    not bear a great investment in sweating over which character(s) to use, but
    on the other hand I like to make my pages accurate with reasonable
    recommendations.

    I was hoping Unicode 4.0 would have a clear solution to the problem, and if
    the character U+25771 were in the BMP, and if font vendors told me they were
    going to support it reasonably soon, then it seems to me to be the right
    thing to recommend (with a caution perhaps) going forward. Given it is part
    of Ext. B, support seems far away at best and therefore not a good
    recommendation.
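
    For anyone weighing the Ext. B concern, here is a rough Python sketch (not
    something from the thread; the function name utf16_units is hypothetical)
    of why U+25771 needs surrogate support in UTF-16 while U+79ED does not.

```python
def utf16_units(code_point):
    """Return the UTF-16 code units for a code point (illustrative helper)."""
    if code_point < 0x10000:
        # Inside the BMP: one 16-bit unit, no surrogates needed.
        return [code_point]
    # Outside the BMP: split the 20-bit offset into a surrogate pair.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return [high, low]


for cp in (0x79ED, 0x25771):
    units = utf16_units(cp)
    in_bmp = len(units) == 1
    print("U+%04X" % cp, "BMP" if in_bmp else "supplementary",
          ["%04X" % u for u in units])
```

    Whether a given font or application actually renders the supplementary
    character is a separate question that this check cannot answer.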

    At this point, I probably should footnote the character and provide the
    suggestions you have documented (which I appreciate!). Before I do, let me
    know what you think of the glyphs in the Unicode charts, to make sure that
    the right-side radicals there are something you would agree are reasonable
    alternatives to child. They look nothing like U+4E88 or U+5B50.

    I am hoping this won't take 1 jo/shi of emails to straighten out!

    BTW, I should mention my knowledge around Kanji is next to nil and my
    sources were mostly other web pages I searched out, so this is purely a
    layman's effort on my part and given the accuracy of the web, I will change
    positions easily.

    tex

    Ben Monroe wrote:
    >
    > [UTF-8]
    >
    > Tex Texin wrote:
    >
    > > On shi/jo the glyph I see in Windows charmap doesn't look
    > > right. Perhaps it is my particular set of fonts. I expect to
    > > see a radical on the right that looks like the character for
    > > child, and charmap shows something else. I'll wait to see if
    > > someone else chimes in pro or con.
    >
    > The right side of the character probably has U+4E88 予 instead of U+5B50
    > 子 (child). These two characters are different. As I mentioned before,
    > there are several different glyphs used to write shi/jo.
    > Several of the forms are U+79ED 秭, U+25797 𥞗, U+25791 𥞑, and
    > U+25771 𥝱.
    > These all express the value of 10^24 and are read as shi or jo, depending
    > on your source.
    >
    > > Also, I wonder what the correct thing to recommend would be?
    > > Assuming surrogate support was consistently available, and
    > > fonts were available containing this character (are there any
    > > today?), since the character was not generally being written
    > > as a single character until now (and I am still not sure if the
    > > pair U+79BE U+4E88 is the correct alternative), would it be
    > > right to recommend this for people to use in number writing
    > > going forward? I tend to think of Ext. B as there for historic
    > > and special characters, not those that might be used every day.
    >
    > If you are worried about surrogate support and font availability, then
    > U+79ED may be the best, which is attested and documented, and listed in
    > modern dictionaries. Both Koujien and Daijirin (available online at
    > http://dictionary.goo.ne.jp/index.html?kind=jn&mode=0) use this glyph for
    > their entries of "shi". Otherwise, go for U+25771, which seems to be
    > attested the most in documents. Daijirin uses this glyph for its entry of
    > "jo", but Koujien does not list it.
    >
    > However, these are not really "every day" characters, at least in my
    > experience. Most people will know "chou", some will know "kei", fewer will
    > know "gai", and even fewer will know "shi/jo". I would be a little
    > surprised if many people could list the rest off the top of their head
    > without prior special study or other references.
    >
    > Ben Monroe
    > [For those looking for my original e-mail message that Tex responded to, I
    > accidentally sent it under a new address forgetting to update my
    > subscription information after my e-mail address changed. (Old one is
    > still being forwarded to this one.)]

    -- 
    -------------------------------------------------------------
    Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
    Xen Master                          http://www.i18nGuy.com
    XenCraft             http://www.XenCraft.com
    Making e-Business Work Around the World
    -------------------------------------------------------------
    

