Kenneth Whistler wrote on 1999-08-16 22:51 UTC:
> > 2) ls must know that combining characters do not occupy their own
> > character cell
> Well, more correctly, that *non-spacing* characters do not. Those
> are a subset of all combining characters in Unicode--many of which
> are actually spacing characters.
> > 3) ls must know that characters with the East Asian Wide or FullWidth
> > property (see TR #7) occupy two character cells.
> That's TR #11, not TR #7.
Thanks for the corrections.
By the way, below are two C functions that test whether a Unicode
character is in one of these two classes (non-spacing or East Asian Wide/
FullWidth). With these functions, host applications should again be able
to predict how many cells a character occupies on a Unicode-enhanced
VT100 terminal, such as some future xterm, kermit, or Linux console.
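No code follows in this copy of the message, so here is a minimal sketch of how two such tests might look. It is not the original attached code. The names cp_range, in_table, is_nonspacing, is_wide, and cell_width are hypothetical, and the range tables are a deliberately small illustrative subset; a real implementation would generate complete tables from the Unicode Character Database and the East Asian Width data of TR #11, and would also need a policy for control characters.

```c
#include <stddef.h>
#include <stdint.h>

/* Inclusive range of Unicode code points (hypothetical helper type). */
struct cp_range { uint32_t first, last; };

/* Binary search in a sorted table of non-overlapping ranges. */
static int in_table(uint32_t ucs, const struct cp_range *t, size_t n)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (ucs > t[mid].last)
            lo = mid + 1;
        else if (ucs < t[mid].first)
            hi = mid;
        else
            return 1;
    }
    return 0;
}

/* ASSUMPTION: illustrative subset of non-spacing marks, NOT complete. */
static const struct cp_range nonspacing[] = {
    { 0x0300, 0x036F },  /* Combining Diacritical Marks */
    { 0x20D0, 0x20DC },  /* combining marks for symbols */
    { 0x3099, 0x309A },  /* combining kana voiced sound marks */
    { 0xFE20, 0xFE23 },  /* Combining Half Marks */
};

/* ASSUMPTION: illustrative subset of Wide/FullWidth ranges, NOT complete. */
static const struct cp_range wide[] = {
    { 0x1100, 0x115F },  /* Hangul Jamo leading consonants */
    { 0x4E00, 0x9FFF },  /* CJK Unified Ideographs */
    { 0xAC00, 0xD7A3 },  /* Hangul Syllables */
    { 0xFF01, 0xFF60 },  /* Fullwidth ASCII variants */
};

/* Returns 1 if ucs is in the (partial) non-spacing table, else 0. */
int is_nonspacing(uint32_t ucs)
{
    return in_table(ucs, nonspacing, sizeof nonspacing / sizeof nonspacing[0]);
}

/* Returns 1 if ucs is in the (partial) Wide/FullWidth table, else 0. */
int is_wide(uint32_t ucs)
{
    return in_table(ucs, wide, sizeof wide / sizeof wide[0]);
}

/* Cells occupied: 0 for non-spacing, 2 for wide, 1 otherwise.
   Control characters are not treated specially in this sketch. */
int cell_width(uint32_t ucs)
{
    if (is_nonspacing(ucs))
        return 0;
    return is_wide(ucs) ? 2 : 1;
}
```

With these tables, an application could sum cell_width() over the decoded characters of a file name to decide how far the cursor will have advanced, which is what ls needs for column alignment.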
It would be nice to have something like these in glibc and similar
libraries. They could also be the basis for implementing the column-width
functionality mentioned in section H.14 of ISO C (1990)
Amendment 1 (1995), that is, the "%#N" formatting code in printf
that causes "%n" to report character-cell counts and not character
P.S.: The attached code is in the public domain. Share, use, and enjoy.
-- Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>