I have heard a rumour (i.e. my source is not involved in the reported
activity) that:

SAP, PeopleSoft, Siebel, Oracle and others are actually
in the process of proposing a new UTF format in which each half of a
UTF-16 surrogate pair is encoded as its own 3-byte UTF-8 sequence, so that
UTF-8 will have the same behaviour as UTF-16, that is, a supplementary
character will occupy two 3-byte sequences instead of one 4-byte sequence.
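
To make the rumoured scheme concrete, here is a minimal Python sketch of how I understand it. It is only my reading of the rumour, not anything taken from an actual proposal, and the function names (utf16_surrogates, three_byte_form, rumoured_encoding) are made up for illustration. It compares standard UTF-8 with the "two 3-byte sequences" form for one supplementary character, U+10400.

```python
def utf16_surrogates(cp):
    """Split a supplementary code point (U+10000..U+10FFFF) into its
    UTF-16 high and low surrogate code units."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    v = cp - 0x10000
    return 0xD800 + (v >> 10), 0xDC00 + (v & 0x3FF)


def three_byte_form(unit):
    """Apply the ordinary 3-byte UTF-8 bit pattern to a 16-bit value
    in the range 0x0800..0xFFFF (here: a surrogate code unit)."""
    if not 0x0800 <= unit <= 0xFFFF:
        raise ValueError("value does not need the 3-byte form")
    return bytes([
        0xE0 | (unit >> 12),
        0x80 | ((unit >> 6) & 0x3F),
        0x80 | (unit & 0x3F),
    ])


def rumoured_encoding(cp):
    """Hypothetical: encode each surrogate separately, giving 6 bytes.
    This is an assumption about the rumoured format, not a spec."""
    high, low = utf16_surrogates(cp)
    return three_byte_form(high) + three_byte_form(low)


cp = 0x10400
standard = chr(cp).encode("utf-8")   # one 4-byte sequence
rumoured = rumoured_encoding(cp)     # two 3-byte sequences

print(standard.hex(" "))
print(rumoured.hex(" "))
```

Standard UTF-8 yields four bytes for this character, whereas the form sketched above yields six, and a conforming UTF-8 decoder would be expected to reject the six-byte form because it contains encoded surrogates.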

Can anyone corroborate this, and, if it's true, offer an opinion on it?

I may add that, as some of you already know, a small group in the UK (which
includes me) is working on a proposal intended to improve the SQL standard
specification with regard to the treatment of Unicode data by an

The competent bodies are ISO/IEC SC 32/WG 3, ANSI NCITS H2, BSI IST/40 and
other national bodies.

We expect that most of the interested parties, principally SQL
implementors, are already represented either directly or indirectly on one
or more of the competent bodies. But if anyone else is interested, please
feel free to download the current, incomplete, provisional draft of the
proposal from:

where the files containing two different versions are jms01v6 and jms01v7
each of which is in both w97.doc and .pdf format.

All comments will be seriously considered.

