Re: Non-ascii string processing?

From: 'Stephane Bortzmeyer' (bortzmeyer@nic.fr)
Date: Mon Oct 06 2003 - 06:37:44 CST

Next message: John Delacour: "RE: Non-ascii string processing?"
Previous message: jon@spin.ie: "Re: Non-ascii string processing?"
In reply to: Marco Cimarosti: "RE: Non-ascii string processing?"
Next in thread: jon@spin.ie: "Re: Non-ascii string processing?"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] [ attachment ]
Mail actions: [ respond to this message ] [ mail a new topic ]

On Mon, Oct 06, 2003 at 01:52:26PM +0200,
Marco Cimarosti <marco.cimarosti@essetre.it> wrote
a message of 51 lines which said:

> a word like "élite" is always counted as five characters, regardless
> that it might be encoded as six Unicode "characters".

I assume that everybody on this list knows that you count characters
only after proper normalization (as with many operations on Unicode
text).
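
To make the point concrete, here is a minimal sketch in Python. It
assumes NFC is the normalization meant (the message above does not name
a form), and the helper name count_characters is purely illustrative.
It counts code points after normalization, which is enough for "élite"
but would not merge combining sequences that have no precomposed form.

```python
import unicodedata


def count_characters(text: str) -> int:
    """Count code points after NFC normalization (NFC is an assumption)."""
    return len(unicodedata.normalize("NFC", text))


# "élite" written with 'e' followed by U+0301 COMBINING ACUTE ACCENT
decomposed = "e\u0301lite"
# "élite" written with the precomposed U+00E9
precomposed = "\u00e9lite"

print(len(decomposed), len(precomposed))
print(count_characters(decomposed), count_characters(precomposed))
```

Without normalization the two spellings give different lengths; after
normalization both give the same count.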

> 3) That is a very silly count anyway. If you want to have an idea of the
> "size" of a document, lines or words are much more useful units.

Tell that to the editor (editors of paper publications still talk in
this unit: "3 000 characters, no more, for tomorrow morning").

> OK. But the length in "characters" of a string is not "character semantics":
> it's plain nonsense, IMHO.

I disagree.

This archive was generated by hypermail 2.1.5 : Thu Jan 18 2007 - 15:54:24 CST