RE: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and email)

From: Kenneth Whistler
Date: Tue May 29 2001 - 18:47:08 EDT


> Ken,
> UTF-8S is essentially a way to ignore surrogate processing. It allows a
> company to encode UTF-16 with UCS-2 logic.
> The problem is that by not implementing surrogate support you can introduce
> subtle errors. For example, it is common to break buffers apart into
> segments. These segments may be reconcatenated, but they may also be
> processed individually.
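
To make the quoted concern concrete, here is a minimal Python sketch of
how segmenting a UTF-16 buffer with UCS-2 logic can cut a surrogate pair
in half. The helper names (to_utf16_units, split_naive,
split_surrogate_safe, decodes_cleanly) are hypothetical and chosen only
for illustration; they are not taken from any proposal or product.

```python
def to_utf16_units(text):
    """Return the UTF-16 code units of `text` as a list of ints."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def decodes_cleanly(units):
    """True if the code units form well-formed UTF-16 on their own."""
    data = b"".join(u.to_bytes(2, "big") for u in units)
    try:
        data.decode("utf-16-be")
        return True
    except UnicodeDecodeError:
        return False


def split_naive(units, size):
    """UCS-2-style split: cut every `size` code units, ignoring surrogates."""
    return [units[i:i + size] for i in range(0, len(units), size)]


def split_surrogate_safe(units, size):
    """Split into segments of at most `size` units without cutting a pair."""
    if size < 2:
        raise ValueError("size must be at least 2 to hold a surrogate pair")
    segments = []
    start = 0
    while start < len(units):
        end = min(start + size, len(units))
        # If the cut would fall right after a high surrogate, back up by one
        # unit so the high surrogate stays with its low surrogate.
        if end < len(units) and 0xD800 <= units[end - 1] <= 0xDBFF:
            end -= 1
        segments.append(units[start:end])
        start = end
    return segments


# U+1D11E (MUSICAL SYMBOL G CLEF) needs a surrogate pair in UTF-16.
units = to_utf16_units("ab\U0001D11Ecd")
naive = split_naive(units, 3)
safe = split_surrogate_safe(units, 3)
```

With a segment size of 3, the naive split leaves a lone high surrogate at
the end of the first segment and a lone low surrogate at the start of the
second. The segments are harmless if they are simply reconcatenated, but
neither is well-formed if processed on its own.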

You are preaching to the choir here. I didn't state that *I* was in
favor of UTF-8S -- only that we have to be careful not to assume that
UTC will obviously not support it. The proponents of UTF-8S are
vigorously and actively campaigning for their proposal. In
standardization committees, proposals with committed, active
proponents who can aim for the long haul often have a way of getting
adopted in one form or another, unless there are equally committed
and active opponents of the proposal. It is just the nature of
consensus politicking in these committees, whether corporate-based
or national-body-based.

Also, I consider the stated position of "near-universal agreement
among the database vendors" to be largely a rhetorical device by
the proponents. Oracle is clearly pushing the proposal. NCR has
stated it is not in favor of the proposal. The other big enterprise
database vendors are hedging their positions somewhat -- in
particular, the standards people in those companies may not be
entirely in agreement with some of their database engine developers, for
example. And the small database vendors are either not playing
in this space or are part of desktop systems that will just follow
the behavior of the platforms.

