RE: Composition of not included Chinese characters

From: Philippe Verdy
Date: Mon Sep 24 2007 - 22:08:20 CDT


    Personally, I read it as


    instead of this more complex composition (if that makes sense):


    i.e. 16 characters in the IDS instead of 18.
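
    The 16-versus-18 count can be checked mechanically. The following is a minimal Python sketch, not part of the original message: it counts the code points in the IDS quoted further down in this thread and compares the result with the 16-code-point limit cited there. The names `QUOTED_IDS`, `MAX_IDS_CODE_POINTS`, `ids_length` and `within_limit` are hypothetical, chosen only for this illustration.

```python
# IDS as quoted later in this thread (John H. Jenkins's proposal).
QUOTED_IDS = "⿺辶⿱穴⿰月⿰⿲⿱幺長⿱言馬⿱幺長刂"

# Limit cited in the quoted reply (p. 427 of the Standard, as of 2007).
MAX_IDS_CODE_POINTS = 16


def ids_length(ids: str) -> int:
    """Return the number of Unicode code points in an IDS string.

    A Python str is a sequence of code points, so len() is enough here;
    both the description characters and the components count toward the total.
    """
    return len(ids)


def within_limit(ids: str, limit: int = MAX_IDS_CODE_POINTS) -> bool:
    """Return True if the IDS does not exceed the given code point limit."""
    return ids_length(ids) <= limit


print(ids_length(QUOTED_IDS))    # 18
print(within_limit(QUOTED_IDS))  # False
```

    Counting code points rather than bytes or glyphs is the relevant measure, because the limit quoted in the thread is stated in Unicode code points.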


    The difference is in the central rectangle, whose area is divided into 6 parts (shown here on a yellow background, if you can see this message in HTML):















    * I see a stack of two rows, each one subdivided into 3 horizontal parts (in each row, the left and right parts are identical: 幺言幺 and 長馬長). You seem to read this area as three columns (the first column being equal to the last one), each column being a stack of two rows.

    * Also, you seem to decompose the top-right corner (surrounded by U+8FB6 辶) into four areas, the top two being used by 穴 and only the lower right being subdivided. I think the horizontal space should be subdivided more equally to make it more readable, by splitting the lower part so that 月 uses more or less the same width as 刂, with the central area a bit wider to make its components more legible. Even with a scanned copy of this character, I think you will see various ways to manage the tiny space left to make the 6 central items legible.
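
    The two readings of the central six-part block can be compared with a short sketch. The IDS strings from this message did not survive in the archived copy, so the strings built below are a reconstruction from the description above (rows first versus columns first) and not the author's own text. The names `TOP`, `BOTTOM`, `by_rows` and `by_columns` are hypothetical.

```python
# The six components of the central block, as described in the message.
TOP = ["幺", "言", "幺"]
BOTTOM = ["長", "馬", "長"]


def by_rows(top, bottom):
    """Read the block as two stacked rows, each split into three parts.

    ⿱ stacks the two rows; ⿲ places the three parts of a row side by side.
    """
    return "⿱" + "⿲" + "".join(top) + "⿲" + "".join(bottom)


def by_columns(top, bottom):
    """Read the block as three columns, each a stack of two components.

    ⿲ places the three columns side by side; ⿱ stacks each column.
    """
    return "⿲" + "".join("⿱" + t + b for t, b in zip(top, bottom))


rows = by_rows(TOP, BOTTOM)
cols = by_columns(TOP, BOTTOM)
print(rows, len(rows))  # ⿱⿲幺言幺⿲長馬長 9
print(cols, len(cols))  # ⿲⿱幺長⿱言馬⿱幺長 10
```

    Under these assumptions the row-first reading needs three description characters where the column-first reading needs four, which would account for one of the two code points of difference mentioned above. The column-first string is the one that appears inside the IDS quoted later in the thread.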


    One related question: do IDSs have to be composed according to semantics (i.e. grouping according to the relations between components), or according to the ideograph's glyph layout?


    > -----Original Message-----

    > From: [] On Behalf Of Marnen Laibow-Koser

    > Sent: Tuesday, September 25, 2007 00:05

    > To: John H. Jenkins; Unicode Mailing List

    > Subject: Re: Composition of not included Chinese characters


    > On Sep 24, 2007, at 4:27 PM, John H. Jenkins wrote:


    > >

    > > On Sep 24, 2007, at 12:55 PM, Gerrit Sangel wrote:

    > >

    > >> I want to write Biáng biáng (

    > >> in an

    > >> ordinary text. As far as I know, the character is not included in

    > >> Unicode, so it seems that there is no possibility to just write it.

    > >>

    > >

    > > It is not currently in Unicode, but the UTC has submitted it to the IRG

    > > for inclusion in a future version. FWIW, its IDS is

    > > ⿺辶⿱穴⿰月⿰⿲⿱幺長⿱言馬⿱幺長刂.


    > Unfortunately, this is not a legitimate IDS. On p. 427 ( http://

    > ), the Standard states that

    > no legitimate IDS can contain more than 16 Unicode code points, but yours

    > contains 18. A silly restriction, to be sure, but that's the way the

    > Standard is currently written. So I guess at the moment there's no way to

    > represent this character in legitimate Unicode?



    > Best,

    > --

    > Marnen Laibow-Koser




