Unicode CJK Language Myth

From: Hart, Edwin F. (HartEF1@bisdpo1.bisdnet.jhuapl.edu)
Date: Wed May 22 1996 - 12:47:30 EDT


I am sending this again after changing to the right word, "ideogramic"
instead of "ideogramic". I am sorry for any confusion this may have caused
you. (I hit the wrong button on the spelling checker and it changed every
occurrence.)

Ed Hart
__________________________

From the discussions, users of ideogramic characters have stated a strong
requirement to see the unified ideogramic characters in 10646/Unicode
displayed or printed using the shapes they prefer. If this is what
customers are demanding, developers need to listen to the users. I would
compare a speaker of English reading text in a fancy script font or Gothic
font to someone from Japan reading simplified Chinese ideographs: Although
the text can be read, the person reads much more slowly because the
characters are harder to discern in these fonts, which makes the
reader uncomfortable. This is why you see these fonts (script and Gothic)
used for printing diplomas, awards, greeting cards, etc., rather than
for business writing.

The above requirement is in addition to a second requirement to be able to
selectively specify other ideogramic shapes along with the expected
ideogramic shapes. Dictionaries are examples of this. Another example
would be a Japanese document that analyzes a poem written in Chinese.
However, this second requirement calls for formatting information, which
belongs to a higher-level protocol imposed on top of the 10646/Unicode
encoding of the characters.

Moving back to the first requirement, the question is how to specify the
expected shapes for printing or displaying the unified ideogramic characters
of 10646/Unicode. For the "correct" or "expected" display and printing of
unified ideogramic characters in 10646/Unicode, I think that the "locale"
needs an ideogramic shape parameter to set the default shapes: simplified
or classical Chinese, Japanese, Korean, or, with the addition of Vietnam to
the WG 2/IRG, Vietnamese. This parameter would apply to "plain text"
(coded text without formatting information) and would be in addition to
the parameters for the default language, country, etc. The language,
country, and ideogramic shape parameters need to be checked for consistency
in a locale. For example, if the locale specifies the language as Chinese,
the default ideogramic shape should be either simplified or classical
Chinese. Clearly, if the computer does not have access to a CJKV font, or
does not have an ideogramic font that corresponds to the ideogramic shape
parameter, the user is not going to see the expected shapes for CJK
characters.
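
To make the locale idea concrete, here is a minimal sketch in Python of a
locale record carrying a shape parameter, a consistency check, and a font
lookup. Everything in it is an assumption made for illustration: the names
Locale, SHAPES_BY_LANGUAGE, is_consistent and select_font, the shape
identifiers, and the rule that a language with no table entry places no
constraint on the shape. None of it comes from 10646/Unicode or from any
existing locale API.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical table: which default shapes are consistent with which language.
# The shape identifiers are invented for this sketch.
SHAPES_BY_LANGUAGE = {
    "zh": {"simplified-chinese", "classical-chinese"},
    "ja": {"japanese"},
    "ko": {"korean"},
    "vi": {"vietnamese"},
}


@dataclass(frozen=True)
class Locale:
    language: str         # e.g. "zh"
    country: str          # e.g. "CN"
    ideograph_shape: str  # default shapes for plain text


def is_consistent(locale: Locale) -> bool:
    """Check the shape parameter against the language parameter."""
    allowed = SHAPES_BY_LANGUAGE.get(locale.language)
    # Assumption: a language absent from the table imposes no constraint.
    return allowed is None or locale.ideograph_shape in allowed


def select_font(locale: Locale, installed: Mapping[str, str]) -> Optional[str]:
    """Return the installed font for the locale's shape, or None.

    `installed` maps a shape identifier to a font name. None means the
    user will not see the expected shapes.
    """
    return installed.get(locale.ideograph_shape)
```

For example, Locale("zh", "CN", "simplified-chinese") passes the check,
while Locale("zh", "TW", "japanese") does not. select_font returns None
when no installed font matches the shape parameter, which is the case in
which the user does not see the expected shapes.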

Ed Hart


