From: Peter Kirk (firstname.lastname@example.org)
Date: Mon Apr 04 2005 - 11:04:20 CST
On 04/04/2005 16:33, Marcin 'Qrczak' Kowalczyk wrote:
>Peter Kirk <email@example.com> writes:
>>There is a serious danger of breaking existing implementations
>>(especially those which only fully support the BMP) by introducing a
>>BMP character which normalises to outside the BMP. For the BMP is now
>>no longer a closed subset of Unicode, under operations like
>>normalisation which existing implementations expected to find closed.
>I had to change my implementation of normalization because of this
>(my static tables of canonical decompositions use 16-bit entries for
>BMP blocks), but it was not a big deal.
Thank you for the confirmation that this has required a change to
existing code. It wasn't a big deal for you because you knew and
understood what was happening - and presumably because you already had
some support for non-BMP characters. It may be a big deal for others who
are less well informed, or for code which is in the field and not being
maintained properly, as well as for the many existing implementations
which do not support anything outside the BMP.
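For anyone who wants to check how their own environment is affected, here is a minimal sketch of the closure test under discussion. It assumes Python 3 and its standard `unicodedata` module; the helper names `bmp_escapes` and `fits_in_16_bits` are hypothetical, invented for this illustration, and the result depends on the Unicode data version bundled with the interpreter. It is not a description of how Marcin's tables, or anyone else's, are actually built.

```python
import unicodedata

BMP_LIMIT = 0x10000  # first code point beyond the Basic Multilingual Plane


def bmp_escapes(form="NFD"):
    """Return (char, normalized) pairs where a BMP character
    normalizes to a string containing a non-BMP code point."""
    found = []
    for cp in range(BMP_LIMIT):
        if 0xD800 <= cp <= 0xDFFF:
            continue  # surrogate code points are not characters
        ch = chr(cp)
        norm = unicodedata.normalize(form, ch)
        if any(ord(c) >= BMP_LIMIT for c in norm):
            found.append((ch, norm))
    return found


def fits_in_16_bits(text):
    """True if every code point in text could be stored in a
    16-bit table entry (an assumed table layout, for illustration)."""
    return all(ord(c) < BMP_LIMIT for c in text)


if __name__ == "__main__":
    escapes = bmp_escapes()
    print("Unicode data version:", unicodedata.unidata_version)
    print("BMP characters leaving the BMP under NFD:", len(escapes))
    for ch, norm in escapes[:5]:
        targets = " ".join("U+%04X" % ord(c) for c in norm)
        print("U+%04X -> %s" % (ord(ch), targets))
```

If the list comes back non-empty, a decomposition table with 16-bit entries for BMP blocks cannot represent those mappings without an escape mechanism or wider entries, which is the kind of change described above.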
>I'm more concerned with killing the myth that Unicode is a 16-bit
>encoding than with that minor inconvenience.
I agree that this myth needs killing, but is it really worth the risk of
killing with it the millions of computers which have been programmed
according to this myth?
--
Peter Kirk
firstname.lastname@example.org (personal)
email@example.com (work)
http://www.qaya.org/