From: Asmus Freytag (firstname.lastname@example.org)
Date: Mon Mar 13 2006 - 22:32:35 CST
On 3/13/2006 3:02 PM, Kenneth Whistler wrote:
>> U+FE45 SESAME DOT
>> U+FE46 WHITE SESAME DOT
> they were encoded for
> compatibility with JIS X 0213.
This is a great example where harping on the putative compatibility
character status is really confusing and not helpful. Yes, JIS X 0213 had
them before we did, but *compatibility* characters they are only if we
*would not* have added them as characters for reasons of our own, or if
they violate the character glyph model in some other way.
In my estimation, we might have, and they do not, at least not to the
degree that makes them special in any way. (I'll deal with their
similarity to the punctuation characters below).
> And they were encoded in the CJK
> Compatibility Forms block because much of that block consists of
> forms used in vertical CJK text, as are the sesame marks.
My recollection is, we picked up two empty slots that were handy, and
the BMP was getting full, and there were no better locations in existing
(non-compatibility) blocks. The 'related to vertical text' was a nice
bonus but, in fact, distracting, because the other characters violate
Unicode's writing direction model, whereas these don't.
(The other ones are among the "blackest" strain of black-sheep
compatibility characters there are ;-).
> But note that they have no compatibility decomposition mapping, and there
> is no indication whatsoever that their use is discouraged.
Therefore, it makes no sense to emphasize them as "compatibility
characters" which are implicitly second class citizens. Let's reserve
that label for the truly unwanted.
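
The point about decomposition mappings can be checked against the Unicode
Character Database. Here is a minimal sketch using Python's standard
`unicodedata` module; the helper name `compat_info` is hypothetical, and
U+FE30 is picked only as a comparison point from the same block (my choice
for illustration, not something named in this thread).

```python
import unicodedata

def compat_info(ch):
    """Return (name, decomposition mapping, whether NFKC changes the character)."""
    return (
        unicodedata.name(ch),
        unicodedata.decomposition(ch),
        unicodedata.normalize("NFKC", ch) != ch,
    )

# U+FE45 and U+FE46 are the sesame dots; U+FE30 is a vertical presentation
# form from the same block, included here only for contrast (an assumption
# of this sketch).
for ch in ("\uFE45", "\uFE46", "\uFE30"):
    name, decomposition, changed = compat_info(ch)
    print(f"U+{ord(ch):04X} {name}: "
          f"decomposition={decomposition!r}, changed by NFKC={changed}")
```

The sesame dots should report an empty decomposition and survive NFKC
unchanged, while the vertical presentation form carries a `<vertical>`
mapping and is replaced under NFKC.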
> If you have need of referring to a sesame dot in CJK text, by
> all means, *do* use U+FE45 SESAME DOT. That is what it is encoded
>> In the case of the sesame at least, the shape in printed materials closely
>> parallels U+3001 IDEOGRAPHIC COMMA, which is provided by the font.
> I would *not* suggest using that.
The committee consensus was to discourage precisely that *hack-o-rama*
by providing dedicated codes.
(The location of the comma and period in the character box is
potentially different for each font, but for use as an emphasis mark,
you need the 'ink' at a known location, usually centered; otherwise the
marks won't look right).
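
To make the 'ink' argument concrete, here is a rough sketch of how far a
glyph's ink would have to be shifted to sit centered in its character box.
All numbers are made up for illustration (a 1000-unit box and two invented
ink rectangles); `Box` and `centering_offset` are hypothetical names, not
part of any font API.

```python
from typing import NamedTuple, Tuple

class Box(NamedTuple):
    """Axis-aligned rectangle in font units."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

def centering_offset(ink: Box, cell: Box) -> Tuple[float, float]:
    """Translation (dx, dy) that moves the centre of `ink` onto the centre of `cell`."""
    dx = (cell.x_min + cell.x_max) / 2 - (ink.x_min + ink.x_max) / 2
    dy = (cell.y_min + cell.y_max) / 2 - (ink.y_min + ink.y_max) / 2
    return dx, dy

cell = Box(0, 0, 1000, 1000)          # hypothetical character box
comma_ink = Box(100, 50, 350, 350)    # hypothetical: comma ink in a lower corner
sesame_ink = Box(350, 300, 650, 700)  # hypothetical: dedicated mark drawn centered

print(centering_offset(comma_ink, cell))   # non-zero, and font-dependent
print(centering_offset(sesame_ink, cell))  # zero when the ink is already centered
```

With a comma, the required shift depends on where each font happens to put
the ink; with a dedicated, centered glyph no per-font correction is needed.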
Note that we might want to document the fact that, by convention,
software scales the glyphs for these characters down (just as if they
had been regular characters).
PS: From the last parenthetical remark, it should be clear that other
symbols, for which existing fonts have glyphs that are always centered,
would not require specific codes for use as emphasis marks.