UnicodeData-2.1.8 bug report

From: Kevin Bracey (kbracey@e-14.com)
Date: Wed Mar 17 1999 - 09:25:12 EST

Next message: Mark Davis: "Re: UnicodeData-2.1.8 bug report"
Previous message: John O'Conner: "NO-BREAK SPACE vs SPACE"
Next in thread: Mark Davis: "Re: UnicodeData-2.1.8 bug report"
Maybe reply: Mark Davis: "Re: UnicodeData-2.1.8 bug report"

The ReadMe file for version 2.1.8 boldly states:

  Note that as of the 2.1.8 update of the Unicode Character Database,
  the decompositions in the UnicodeData.txt file can be used to recursively
  derive the full decomposition in canonical order, without the need
  to separately apply canonical reordering.

I've just found a bunch of Vietnamese characters for which this doesn't
seem to be the case, e.g.:

      1EAC LATIN CAPITAL LETTER A WITH CIRCUMFLEX AND DOT BELOW

   == 00C2 LATIN CAPITAL LETTER A WITH CIRCUMFLEX
      0323 COMBINING DOT BELOW

   == 0041 LATIN CAPITAL LETTER A
      0302 COMBINING CIRCUMFLEX ACCENT
      0323 COMBINING DOT BELOW

But the canonical order is, of course (DOT BELOW has combining class 220,
CIRCUMFLEX ACCENT has 230, so the lower class sorts first):

      0041 LATIN CAPITAL LETTER A
      0323 COMBINING DOT BELOW
      0302 COMBINING CIRCUMFLEX ACCENT

This affects characters 1EAC,1EAD,1EB6,1EB7,1EC6,1EC7,1ED8,1ED9.
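
The ordering claimed above can be checked against the combining classes.
What follows is only a minimal sketch: it assumes Python's standard
unicodedata module, which carries a much later Unicode version than 2.1.8,
so it shows the intended canonical order rather than the contents of the
2.1.8 file. The constant names are made up for illustration.

```python
import unicodedata

# Illustrative names, not taken from any Unicode data file.
DOT_BELOW = "\u0323"    # COMBINING DOT BELOW
CIRCUMFLEX = "\u0302"   # COMBINING CIRCUMFLEX ACCENT

# Canonical ordering sorts adjacent combining marks by ascending class.
print(unicodedata.combining(DOT_BELOW))    # 220
print(unicodedata.combining(CIRCUMFLEX))   # 230

# The canonical decomposition of U+1EAC as computed by Python's own
# (later) Unicode data, shown as hex code points.
nfd = unicodedata.normalize("NFD", "\u1EAC")
print([f"{ord(c):04X}" for c in nfd])      # ['0041', '0323', '0302']
```

Since 220 is less than 230, the dot below has to come before the circumflex.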

Would it be worthwhile for me to knock up an algorithmic check that this
assertion doesn't fail elsewhere, or is someone else already looking at it?
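
Such a check might look roughly like the following Python sketch. It is
not a definitive implementation. All function names are hypothetical. The
QUOTED_2_1_8 table contains only the two mappings quoted in this message,
not the whole 2.1.8 file. Combining classes come from Python's unicodedata
module, which is assumed to agree with 2.1.8 for the marks involved. The
idea is to expand each mapping recursively with no reordering, apply
canonical reordering separately, and report any character where the two
results differ.

```python
import unicodedata

# Canonical mappings as quoted in the message for the 2.1.8 data.
QUOTED_2_1_8 = {
    0x1EAC: [0x00C2, 0x0323],
    0x00C2: [0x0041, 0x0302],
}

def full_decompose(cp, mappings):
    """Recursively expand cp using mappings, with no reordering step."""
    if cp not in mappings:
        return [cp]
    expanded = []
    for part in mappings[cp]:
        expanded.extend(full_decompose(part, mappings))
    return expanded

def canonical_reorder(cps):
    """Stable-sort every run of non-starters by combining class."""
    result = list(cps)
    start = 0
    while start < len(result):
        if unicodedata.combining(chr(result[start])) == 0:
            start += 1
            continue
        end = start
        while end < len(result) and unicodedata.combining(chr(result[end])) != 0:
            end += 1
        # sorted() is stable, so marks of equal class keep their order.
        result[start:end] = sorted(
            result[start:end], key=lambda c: unicodedata.combining(chr(c))
        )
        start = end
    return result

def needs_reordering(cp, mappings):
    """True if recursive expansion alone is not in canonical order."""
    seq = full_decompose(cp, mappings)
    return seq != canonical_reorder(seq)

def canonical_mappings(limit=0x110000):
    """Build a mapping table from Python's bundled (later) Unicode data."""
    table = {}
    for cp in range(limit):
        fields = unicodedata.decomposition(chr(cp)).split()
        # Compatibility mappings start with a tag such as <compat>; skip them.
        if fields and not fields[0].startswith("<"):
            table[cp] = [int(f, 16) for f in fields]
    return table

def failures(mappings):
    """List every mapped code point whose expansion needs reordering."""
    return [cp for cp in sorted(mappings) if needs_reordering(cp, mappings)]

print([f"{cp:04X}" for cp in failures(QUOTED_2_1_8)])  # ['1EAC']
```

To run the same check over a whole UnicodeData.txt, the table would have
to be built from that file's decomposition field instead. Calling
failures(canonical_mappings()) only checks whatever Unicode version the
Python build ships with.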

-- 
Kevin Bracey, Senior Software Engineer
Acorn Computers Ltd                           Tel: +44 (0) 1223 725228
Acorn House, 645 Newmarket Road               Fax: +44 (0) 1223 725328
Cambridge, CB5 8PB, United Kingdom            WWW: http://www.acorn.co.uk/

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:44 EDT