Re: Handling UTF-8

From: Gaute B Strokkenes
Date: Thu Mar 01 2001 - 12:51:40 EST

On Thu, 1 Mar 2001, wrote:

> Apropos UTF-8:
> While waiting for software (Mac or Unix) that makes me able to
> handle UTF-8 (input, sort, wc, such things), I try to put up UTF-8
> web pages myself. I look at the algorithm of p. 47 in the Unicode
> Book, and convert any UxHHHH sequence to UTF-8 by breaking the hexes
> down to binaries.
> Now, I have a distinct feeling that there is a mathematical formula
> for doing this, e.g. on my hex calculator. To my irritation I cannot
> figure it out. Instead of trying to reinvent the wheel I go to the
> list. Can anyone help me: how do I calculate a UTF-8 value from a
> UCS value (except by the bit-counting paper and pencil way)?
> The fall-back would be a table for UCS>UTF-8, starting on Ux0080.

You're putting yourself through a lot of unnecessary pain. I'd
suggest that you look into the POSIX program iconv (if your Unix
supports it); otherwise you might want to have a look at GNU recode.
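
That said, for anyone who does want the arithmetic the question asks
for, it can be sketched with nothing but integer division and
remainder, which is what a hex calculator offers. The Python below is
only an illustration, not anything taken from the Unicode book: the
function name ucs_to_utf8 is made up for this example, and it assumes
the code space ends at U+10FFFF and that surrogate values are rejected.

```python
def ucs_to_utf8(cp):
    """Return the UTF-8 bytes for one UCS code point, using only arithmetic.

    Hypothetical helper for illustration. Assumes the code space ends at
    U+10FFFF and that surrogates (U+D800..U+DFFF) are not encodable.
    """
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not an encodable code point: %#x" % cp)
    if cp < 0x80:
        # One byte: the value itself.
        return bytes([cp])
    if cp < 0x800:
        # Two bytes: lead byte carries cp div 64, trail byte cp mod 64.
        return bytes([0xC0 + cp // 64,
                      0x80 + cp % 64])
    if cp < 0x10000:
        # Three bytes: split into 4 + 6 + 6 bits (64 * 64 = 4096).
        return bytes([0xE0 + cp // 4096,
                      0x80 + (cp // 64) % 64,
                      0x80 + cp % 64])
    # Four bytes: split into 3 + 6 + 6 + 6 bits (64 ** 3 = 262144).
    return bytes([0xF0 + cp // 262144,
                  0x80 + (cp // 4096) % 64,
                  0x80 + (cp // 64) % 64,
                  0x80 + cp % 64])


if __name__ == "__main__":
    for cp in (0x41, 0xE5, 0x20AC, 0x10348):
        encoded = ucs_to_utf8(cp)
        # Cross-check against Python's own encoder.
        assert encoded == chr(cp).encode("utf-8")
        print("U+%04X -> %s" % (cp, " ".join("%02X" % b for b in encoded)))
```

By this arithmetic U+00E5 comes out as C3 A5 and U+20AC as E2 82 AC.
For whole files, a conversion tool is still the less painful route.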

Big Gaute                     
MY income is ALL disposable!
