Re: combining/fullwidth support for xterm

From: Markus Kuhn (Markus.Kuhn@cl.cam.ac.uk)
Date: Tue Aug 17 1999 - 07:12:18 EDT


PILCH Hartmut wrote on 1999-08-17 09:53 UTC:
> I wrote
>
> > This is not merely a question of backwards compatibility. In many
> > cases, wide alphabetic characters are desirable in a CJK context.
> > There is a common Japanese habit of writing abbreviations like NATO
> > and even short words like FAX untranslated in wide capital letters
> > in the text. This habit predates the design of the EUC codings.
>
> One can of course cope with this by relegating it to the level of
> typesetting, e.g. by defining a pair of shift characters that will widen
> anything that is between them. Does Unicode contain such a definition?

No, it doesn't. And it shouldn't. The Japanese habit of abusing the
character encoding to determine presentation style is not something we
should carry over into Unicode. There is nothing inherently different between wide
ASCII text and bold/italic ASCII text, and Unicode has deliberately
decided that font style markup is outside its scope and is left to
higher-layer protocols such as ISO 6429 or HTML or any of many others.
Keeping the scope of Unicode limited this way makes it much more
versatile and useful.

ISO 6429 (= ECMA-48) reserves a range of ESC control sequences for
private use, and we could very well define such private ESC sequences
for switching between the following modes:

 - always use the narrow font
 - always use the wide font
 - use wide font for East Asian W/F characters and narrow for all others
 - [ perhaps also: use the narrow/wide font in a way compatible with
   some national standard (e.g., EUC).]

just as we already switch today between bold and inverse modes using
ISO 6429 ESC sequences. Maybe such codes should also be added in the
next ISO 6429 revision.
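
As a rough illustration of the first three modes listed above, the
per-character font choice could be modelled as follows. This is only a
sketch: the names WidthMode and cell_width are hypothetical, no actual
escape sequence is defined here, and combining characters are ignored.
The only real API used is Python's unicodedata.east_asian_width(), which
returns the East Asian Width property ("W", "F", "Na", "H", "A" or "N").

```python
import unicodedata
from enum import Enum


class WidthMode(Enum):
    """Hypothetical names for the font-selection modes discussed above."""
    NARROW = "narrow"          # always use the narrow font
    WIDE = "wide"              # always use the wide font
    EAST_ASIAN = "east_asian"  # wide for W/F characters, narrow otherwise


def cell_width(ch: str, mode: WidthMode) -> int:
    """Return how many terminal cells a single character occupies."""
    if mode is WidthMode.NARROW:
        return 1
    if mode is WidthMode.WIDE:
        return 2
    # EAST_ASIAN: consult the East Asian Width property of the character.
    return 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1


def line_width(text: str, mode: WidthMode) -> int:
    """Total cell count of a string under the given mode."""
    return sum(cell_width(ch, mode) for ch in text)
```

With such a mode switch, plain ASCII "NATO" would occupy four cells in
the narrow mode and eight cells in the wide mode, without any change to
the encoded text.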

Note that Unicode has, purely for backwards compatibility, fullwidth
versions of all printable ASCII characters (except space) in the range
U+FF01 - U+FF5E, so if you really insist, you could use these to write
"NATO" and "FAX" in wide characters. Note, however, that these
characters are only in Unicode to guarantee round-trip compatibility,
and using them in new applications without a legacy compatibility
requirement is really not recommended.
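
For readers who want to see the mapping concretely: the fullwidth forms
U+FF01 - U+FF5E sit at a constant offset from ASCII U+0021 - U+007E, so
a legacy converter can translate in both directions with simple
arithmetic. The sketch below uses hypothetical helper names
(to_fullwidth, to_ascii); it is meant for round-trip conversion only,
not as an encouragement to use these characters in new text.

```python
import unicodedata

# Constant distance between U+0021..U+007E and U+FF01..U+FF5E.
FULLWIDTH_OFFSET = 0xFF01 - 0x21  # 0xFEE0


def to_fullwidth(text: str) -> str:
    """Map printable ASCII (except space) to its fullwidth form."""
    return "".join(
        chr(ord(c) + FULLWIDTH_OFFSET) if 0x21 <= ord(c) <= 0x7E else c
        for c in text
    )


def to_ascii(text: str) -> str:
    """Map fullwidth forms U+FF01..U+FF5E back to plain ASCII."""
    return "".join(
        chr(ord(c) - FULLWIDTH_OFFSET) if 0xFF01 <= ord(c) <= 0xFF5E else c
        for c in text
    )


wide = to_fullwidth("NATO")
print(wide)
print(to_ascii(wide))
# NFKC normalization also folds the fullwidth forms back to ASCII.
print(unicodedata.normalize("NFKC", wide))
```

Because the fullwidth letters are compatibility characters, NFKC
normalization folds them back to ASCII, which is another sign that they
carry presentation information rather than a distinct identity.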

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>


