Re: Support for non-BMP characters

From: Ken Whistler <kenw_at_sybase.com>
Date: Wed, 25 Apr 2012 12:09:06 -0700

On 4/25/2012 6:55 AM, Juanma Barranquero wrote:
> Ada 2012 is adding (quoting from the ARM):
>
> A.4.11 String Encoding
>
> [...]
>
> {AI05-0137-2} {AI05-0262-1} The type Encoding_Scheme defines encoding
> schemes. UTF_8 corresponds to the UTF-8 encoding scheme defined by
> Annex D of ISO/IEC 10646. UTF_16BE corresponds to the UTF-16 encoding
> scheme defined by Annex C of ISO/IEC 10646 in 8 bit, big-endian order;
> and UTF_16LE corresponds to the UTF-16 encoding scheme in 8 bit,
> little-endian order.

I would suggest that the folks working on Ada 2012 (presumably a new
edition of ISO/IEC 8652:1995) get themselves an updated copy of 10646,
specifically ISO/IEC 10646:2011:

http://www.iso.org/iso/iso_catalogue/catalogue_tc/catalogue_detail.htm?csnumber=51273

The second edition, which supersedes 10646:2003 and which incorporates
7 published amendments (and Amd 8, which was not separately published),
no longer defines UTF-8 in Annex D or UTF-16 in Annex C. The UTF-8
encoding form is now defined in Clause 9.1, and the UTF-16 encoding form
in Clause 9.2. The encoding schemes are defined in Clause 10.
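
To illustrate the form/scheme distinction for a non-BMP character, here
is a minimal Python sketch. It is not taken from either standard: the
helper names utf16_code_units and serialize are hypothetical, and the
sample character (U+1D11E) is just a convenient choice. The encoding
form maps a code point to 16-bit code units; the encoding scheme then
fixes the byte order in which those units are serialized.

```python
def utf16_code_units(ch):
    """UTF-16 encoding form: one code point -> list of 16-bit code units.

    Hypothetical helper for illustration only.
    """
    cp = ord(ch)
    if cp < 0x10000:
        # BMP code point: a single code unit.
        return [cp]
    # Non-BMP code point: split into a high/low surrogate pair.
    cp -= 0x10000
    return [0xD800 + (cp >> 10), 0xDC00 + (cp & 0x3FF)]


def serialize(units, byteorder):
    """UTF-16 encoding scheme: code units -> bytes in a fixed byte order.

    byteorder is "big" (UTF-16BE) or "little" (UTF-16LE).
    """
    return b"".join(u.to_bytes(2, byteorder) for u in units)


ch = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the BMP
units = utf16_code_units(ch)
utf16be = serialize(units, "big")
utf16le = serialize(units, "little")
utf8 = ch.encode("utf-8")

# Cross-check the hand-rolled results against Python's built-in codecs.
assert utf16be == ch.encode("utf-16-be")
assert utf16le == ch.encode("utf-16-le")

print([hex(u) for u in units])  # ['0xd834', '0xdd1e']
print(utf16be.hex(), utf16le.hex(), utf8.hex())
```

The same two code units come out of the encoding form either way; only
the serialized bytes differ between UTF-16BE and UTF-16LE.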

I know it takes a while for implementations to catch up with standards,
but it would be a shame if a 2012 revision of the Ada *standard* ends up
referring to a 9-year-old and outdated version of 10646 to get its
definitions of UTF-8 and UTF-16.

--Ken
Received on Wed Apr 25 2012 - 14:12:18 CDT
