Re: EA width, Latin punctuation and fonts

From: peter_constable@sil.org
Date: Thu Dec 09 1999 - 12:46:16 EST


       If you look on page 6-94 of Unicode 2.0, it says, "U+3000 ...
       is provided for compatibility." It is also mentioned on page
       6-130 in the description for the Halfwidth and Fullwidth Forms
       block:

       "Unifications. The fullwidth form of U+0020 SPACE is unified
       with U+3000 IDEOGRAPHIC SPACE."

       It's not at all clear to me what that is supposed to mean. I
       thought characters that were unified in Unicode ended up with a
       single encoding. It did suggest to me, however, a relationship
       between U+3000 and U+0020 comparable to that of U+FF01 and
       U+0021 - though perhaps I was reading too much into it. In
       terms of pure, plain-text semantics, it seems that U+3000
       relates to U+0020 in precisely the same way that U+FF01 relates
       to U+0021.
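
       The parallel drawn above can be checked against the character
       database. The following is only a minimal sketch, assuming
       Python's standard unicodedata module, which reflects whatever
       Unicode version the interpreter ships with rather than Unicode
       2.0. The variable names are illustrative.

```python
import unicodedata

# Print the compatibility mapping recorded for each of the two characters.
for ch in ("\u3000", "\uff01"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), "->",
          unicodedata.decomposition(ch))

# Both mappings carry the same <wide> tag and point at the ASCII counterpart.
ideographic_space = unicodedata.decomposition("\u3000")  # "<wide> 0020"
fullwidth_exclam = unicodedata.decomposition("\uff01")   # "<wide> 0021"
```

       In the data, at least, both characters carry the same kind of
       <wide> mapping to their ASCII counterparts.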

       Between those two references, I concluded that U+3000 should be
       treated the same way as characters in the range U+FF00-FF5E.

       The following passage, also from page 6-130, is relevant:

       <quote>
       The characters in this block consist of fullwidth forms of the
       ASCII block (except SPACE)... As with other compatibility
       characters, the preferred Unicode encoding is to use the
       nominal counterparts of these characters and use rich text font
       or style bindings to select the appropriate glyph size and
       width.
       </quote>

       This seems to require option 2 (and to denigrate option 1), but
       surely option 4 would also be considered acceptable - i.e. I
       think the main point is to avoid encoding text using
       compatibility characters if possible. Whether or not U+3000 is
       to be treated like characters in the compatibility area is
       still not clear to me.
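
       One way to picture "use the nominal counterparts" is to fold
       compatibility characters before storing text. The sketch below
       assumes Python's unicodedata module and the NFKC normalization
       form; the helper name to_nominal is hypothetical. Note that NFKC
       folds every compatibility mapping, not only the wide ones, so it
       is broader than what the quoted passage describes.

```python
import unicodedata

def to_nominal(text):
    """Replace characters that have compatibility mappings (NFKC folding)."""
    return unicodedata.normalize("NFKC", text)

# FULLWIDTH LATIN CAPITAL LETTER A, IDEOGRAPHIC SPACE,
# FULLWIDTH EXCLAMATION MARK
sample = "\uff21\u3000\uff01"
folded = to_nominal(sample)  # "A !"
```

       Under this folding U+3000 becomes U+0020, just as the fullwidth
       forms become their ASCII counterparts.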

       Peter

       From: <Marco.Cimarosti@icl.com> AT Internet on 12/09/99 12:47 PM
       Received on: 12/09/99
       To: Peter Constable/IntlAdmin/WCT, <unicode@unicode.org> AT Internet@Ccmail
       Subject: Re: EA width, Latin punctuation and fonts

       Peter Constable says:
>1. Include both wide and narrow glyphs in a single font and
>encode text using U+3000 etc. (i.e. encode using compatibility
>characters).

       Why do you say that U+3000 is a compatibility character? My
       understanding of a "compatibility character" in Unicode is a
       character that either:

       1) has the word "COMPATIBILITY" in its name (from
          ftp.unicode.org/Public/UNIDATA/UnicodeData-Latest.txt), or
       2) is in a block that has the word "Compatibility" in its name
          (from ftp.unicode.org/Public/UNIDATA/Blocks.txt).

       U+3000 seems to fall into neither of these cases: it just has a
       compatibility mapping ("<wide> 0020"), as many other characters
       do.
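
       The distinction between the two criteria above and merely having
       a compatibility mapping can be sketched as follows. This assumes
       Python's unicodedata module; since that module has no block
       lookup, BLOCKS is a hand-copied excerpt of Blocks.txt holding
       only the rows needed here, and all function names are
       hypothetical.

```python
import unicodedata

# Assumption: a small hand-copied excerpt of Blocks.txt, not the full file.
BLOCKS = [
    (0x3000, 0x303F, "CJK Symbols and Punctuation"),
    (0x3300, 0x33FF, "CJK Compatibility"),
    (0xFF00, 0xFFEF, "Halfwidth and Fullwidth Forms"),
]

def block_of(ch):
    """Return the block name for ch, or "" if it is not in the excerpt."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return ""

def is_compat_by_name_or_block(ch):
    """Criteria 1 and 2: 'compatibility' in the character or block name."""
    return ("COMPATIBILITY" in unicodedata.name(ch, "")
            or "compatibility" in block_of(ch).lower())

def has_compat_mapping(ch):
    """True if the decomposition field carries a <tag>, e.g. <wide>."""
    return unicodedata.decomposition(ch).startswith("<")
```

       With these definitions U+3000 fails both name-based criteria yet
       still has a compatibility mapping, and the same holds for U+FF01.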

       Marco


