RE: Non-ascii string processing?

From: Marco Cimarosti (marco.cimarosti@essetre.it)
Date: Mon Oct 06 2003 - 13:36:13 CST


Edward H. Trager wrote:
> > But I still don't see any use in knowing how many characters are
> > in a UTF-8 string, apart from the use that I already mentioned:
> > allocating a buffer for a UTF-8 to UTF-32 conversion.
>
> Well, I know a good use for it: a console or terminal-based
> application which displays information using fixed-width
> fonts in a tabular form, such as a subset of records from
> a database table. To calculate how wide to display each
> column, knowing the maximum number of characters in the
> strings for each column is a useful starting place.

Well, I am just about to start a time-consuming task: fixing an application
which was based on the assumption that the number of characters in a string
is a good "starting place" for formatting tabular text in a fixed-width font...

You have already explained why this can't work when CJK or other scripts pop
in.

What you really need for such a thing is a function which computes the
"width" of a string in terms of display units, rather than its length in
terms of characters.
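
As a rough illustration of such a function, here is a minimal sketch in
Python, assuming the standard unicodedata module and a terminal that draws
East Asian Wide and Fullwidth characters in two columns and non-spacing
marks in none. The names display_width and pad_cell are hypothetical, and
real terminals may disagree with these rules (tabs, emoji sequences and
ambiguous-width characters are not handled).

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate the number of fixed-width columns needed to show text.

    Assumption: wide/fullwidth characters take two columns, non-spacing
    marks and control/format characters take none, everything else one.
    """
    width = 0
    for ch in text:
        # Non-spacing marks, enclosing marks, format and control
        # characters are counted as occupying no column of their own.
        if unicodedata.category(ch) in ("Mn", "Me", "Cf", "Cc"):
            continue
        # East Asian Wide (W) and Fullwidth (F) characters, such as CJK
        # ideographs, are counted as two columns.
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


def pad_cell(text: str, columns: int) -> str:
    """Right-pad text with spaces up to the given display width."""
    return text + " " * max(columns - display_width(text), 0)
```

With this sketch, a string of three CJK ideographs has a length of 3
characters but a display width of 6, so padding a table cell by character
count would leave the column three positions too wide.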

_ Marco


