RE: Handling UTF-8

From: Trond Trosterud
Date: Thu Mar 01 2001

Apropos UTF-8:

While waiting for software (Mac or Unix) that lets me handle UTF-8
(input, sort, wc, and the like), I am trying to put up UTF-8 web pages
myself. I look at the algorithm on p. 47 of the Unicode Book and convert
each UxHHHH sequence to UTF-8 by breaking the hex digits down into binary.

Now, I have a distinct feeling that there is a mathematical formula for
doing this, e.g. one I could use on my hex calculator. To my irritation I
cannot figure it out. Rather than reinvent the wheel, I turn to the list.
Can anyone help me: how do I calculate a UTF-8 value from a UCS value
(other than by counting bits with paper and pencil)?
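
For illustration, the bit pattern can be written as plain integer
arithmetic: dividing by 0x40 (64) shifts six bits off, and the remainder
keeps the low six bits, so the same steps work on a hex calculator. What
follows is only a sketch under stated assumptions, not the algorithm from
the book: the function name ucs_to_utf8 is made up here, code points are
assumed to stop at 0x10FFFF, and surrogates are rejected.

```python
def ucs_to_utf8(cp):
    """Return the UTF-8 bytes for one UCS code point, using only arithmetic.

    Hypothetical helper for illustration; assumes 0 <= cp <= 0x10FFFF.
    """
    if cp < 0 or cp > 0x10FFFF:
        raise ValueError("code point out of range")
    if 0xD800 <= cp <= 0xDFFF:
        raise ValueError("surrogates have no UTF-8 form")
    if cp < 0x80:
        # One byte: the value itself.
        return bytes([cp])
    if cp < 0x800:
        # Two bytes: lead byte 0xC0 + top bits, then 0x80 + low six bits.
        return bytes([0xC0 + cp // 0x40,
                      0x80 + cp % 0x40])
    if cp < 0x10000:
        # Three bytes: lead byte 0xE0, then two continuation bytes.
        return bytes([0xE0 + cp // 0x1000,
                      0x80 + cp // 0x40 % 0x40,
                      0x80 + cp % 0x40])
    # Four bytes: lead byte 0xF0, then three continuation bytes.
    return bytes([0xF0 + cp // 0x40000,
                  0x80 + cp // 0x1000 % 0x40,
                  0x80 + cp // 0x40 % 0x40,
                  0x80 + cp % 0x40])
```

By hand, for Ux00F8: 0xF8 / 0x40 = 3 with remainder 0x38, so the two
bytes are 0xC0 + 3 = 0xC3 and 0x80 + 0x38 = 0xB8.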

The fallback would be a UCS-to-UTF-8 table, starting at Ux0080.
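
Such a table could presumably be generated rather than typed in. A short
sketch, relying on Python's built-in str.encode; the helper name
utf8_table and the column layout are arbitrary choices made here:

```python
def utf8_table(first=0x80, last=0xFF):
    """Yield one 'U+HHHH  XX XX' line per code point in [first, last].

    Hypothetical helper; the range must not include surrogates
    (0xD800-0xDFFF), which cannot be encoded.
    """
    for cp in range(first, last + 1):
        encoded = chr(cp).encode("utf-8")
        hex_bytes = " ".join("%02X" % b for b in encoded)
        yield "U+%04X  %s" % (cp, hex_bytes)


# Print the first sixteen rows of the table, starting at 0x0080.
for line in utf8_table(0x80, 0x8F):
    print(line)
```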


Trond Trosterud
Det humanistiske fakultet h +47 7767 3639
N-9037 Universitetet i Tromsø, Noreg f +47 7764 4239

