Problem with ConvertUTF.c?

From: Theodore H. Smith (delete@softhome.net)
Date: Tue Jul 16 2002 - 19:57:03 EDT


The file ConvertUTF.c contains this array:

static const char trailingBytesForUTF8[256] = {
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0, 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
        2,2,2,2,2,2,2,2,2,2,2,2,2,2,2,2, 3,3,3,3,3,3,3,3,4,4,4,4,5,5,5,5
};

Doesn't UTF-8 have a maximum of 4 bytes per character? A 4-byte
sequence has only 3 trailing bytes, so the entries above 3 (the
4s and 5s at the end of the last row) should not be there.

There could be similar mistakes elsewhere in the file involving
5- and 6-byte UTF-8 sequences. I think this file may have been
written before UTF-8 was tightened up. Perhaps this code should
be tightened up along with the standard now?
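
As a rough sketch of what a tightened-up lookup might look like, the
lead bytes that used to start 5- and 6-byte sequences (0xF8-0xFF)
could be reported as invalid instead of yielding 4 or 5. The function
name trailing_bytes_strict is made up for illustration and is not
part of ConvertUTF.c; everything below 0xF8 deliberately returns the
same values as the table above.

```c
/* Hypothetical helper, not part of ConvertUTF.c.
 * Returns the number of trailing bytes for a UTF-8 lead byte,
 * assuming a 4-byte maximum, or -1 for the lead bytes that the
 * table above maps to 4 or 5 trailing bytes. */
static int trailing_bytes_strict(unsigned char lead)
{
    if (lead < 0xC0) return 0;   /* 0x00-0xBF: same as the table (0)   */
    if (lead < 0xE0) return 1;   /* 0xC0-0xDF: 2-byte sequence         */
    if (lead < 0xF0) return 2;   /* 0xE0-0xEF: 3-byte sequence         */
    if (lead < 0xF8) return 3;   /* 0xF0-0xF7: 4-byte sequence         */
    return -1;                   /* 0xF8-0xFF: old 5- and 6-byte forms */
}
```

A caller would treat a negative result as an illegal sequence rather
than trying to read 4 or 5 more bytes. A stricter check could also
reject 0xC0, 0xC1 and 0xF5-0xF7, which cannot begin a valid sequence
once code points are limited to U+10FFFF, but that goes beyond the
table entries in question here.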

--
     Theodore H. Smith - Macintosh Consultant / Contractor.
     My website: <www.elfdata.com/>


