Re: The future of UTF-8

From: Markus Kuhn (Markus.Kuhn@cl.cam.ac.uk)
Date: Thu Jul 22 1999 - 06:34:20 EDT


"Addison Phillips" wrote on 1999-07-21 22:03 UTC:
> UTF-8 is a kludge.

Why is UTF-16 better? It just moves the kludge threshold up by 0xff80
(from U+0080 to U+10000) but otherwise makes nothing fundamentally
simpler.
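
The arithmetic behind that figure can be checked with a small sketch. It assumes the "kludge threshold" means the first code point that needs more than one code unit; the helper and constant names are hypothetical and chosen only for illustration.

```python
# Illustrative helpers; the names are assumptions, not from the original message.

def utf8_units(cp: int) -> int:
    """Number of 8-bit code units UTF-8 needs for code point cp."""
    return len(chr(cp).encode("utf-8"))

def utf16_units(cp: int) -> int:
    """Number of 16-bit code units UTF-16 needs for code point cp."""
    return len(chr(cp).encode("utf-16-be")) // 2

UTF8_THRESHOLD = 0x80      # first code point needing more than one UTF-8 unit
UTF16_THRESHOLD = 0x10000  # first code point needing more than one UTF-16 unit

if __name__ == "__main__":
    # Probe the code points on either side of each threshold.
    for cp in (0x7F, 0x80, 0xFFFF, 0x10000):
        print(f"U+{cp:04X}: utf-8 units={utf8_units(cp)}, "
              f"utf-16 units={utf16_units(cp)}")
    # Distance between the two thresholds.
    print(hex(UTF16_THRESHOLD - UTF8_THRESHOLD))
```

Both encodings switch from one code unit to several at some code point; only the position of that switch differs.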

> UTF-8 is merely a detour (albeit a very useful one).

I am not sure. What you refer to as a "Unicode plain text file" is in
essence UTF-16. UTF-16 also does not have fixed-width characters. Even
UCS-4 doesn't, considering that there are things such as combining
characters.
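
A minimal sketch of the combining-character point, assuming Python's "utf-32-be" codec as a stand-in for UCS-4 (the two coincide for the code points used here). The helper name ucs4_units is hypothetical.

```python
import unicodedata

def ucs4_units(text: str) -> int:
    """Number of 32-bit code units needed to store text."""
    return len(text.encode("utf-32-be")) // 4

# "e" followed by U+0301 COMBINING ACUTE ACCENT: one character to a reader,
# two code units even in a 32-bit encoding.
decomposed = "e\u0301"

# "x" followed by the same accent. To my knowledge Unicode has no
# precomposed form for this pair, so normalization cannot shorten it.
no_precomposed = "x\u0301"

if __name__ == "__main__":
    for s in (decomposed, no_precomposed):
        nfc = unicodedata.normalize("NFC", s)
        print(ucs4_units(s), ucs4_units(nfc))
```

Normalization to NFC folds the first sequence into the single code point U+00E9, but sequences without a precomposed form stay multi-unit, so a fixed-width code unit does not give fixed-width characters.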

Unicode is inherently a variable-length encoding of characters, and
assuming that UTF-8 is the only variable-length aspect of it might be a
bit naive.

I see UTF-8 not merely as a temporary encoding for ASCII legacy systems,
but as something that is here to stay for a very long time.

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>


