L2/06-225
From: Peter Constable
Date: 2006-06-17
Subject: comments on Indic block descriptions

Rick:

Mike gave me a copy of the draft for chapter 9, and I was just reviewing the details on representing independent vowels atomically rather than as V + matra combinations. (I'm reviewing which combinations Uniscribe should block to avoid spoofing.) I don't know whether you or someone else worked on this; you can forward as needed.

Looking at the cases mentioned, there are some additional combinations I think should be listed, and some listed combinations whose rationale isn't obvious to me. In the latter cases, maybe there's a historical basis for relating the combination to the atomic character; I'm not an expert on the histories of these scripts. I'll just raise the questions for you or whoever to review and consider.

For Devanagari, Table 9-1, it wasn't clear to me why 090A vs. < 0909, 0941 > was mentioned. If there's a historical basis for that analysis, perhaps that makes sense, though it wasn't at all obvious to me that one might analyze 090A that way, and I wouldn't expect any rendering implementation that happened to allow < 0909, 0941 > to present that combination like 090A. On the other hand, it surprised me that for 0911..0914 it wasn't mentioned that sequences using 0906 and 0945..0948 shouldn't be used.

For Bengali, it seems to me that 09E0 vs. < 098B, 09C3 > and 09E1 vs. < 098C, 09E2 > might also have been mentioned.

For Gurmukhi, it's not clear to me why 0A13 vs. < 0A73, 0A4B > would be mentioned. Again, maybe there's a historical basis, though it's certainly not obvious to me; I certainly wouldn't expect the combination to render like the atomic character.

For Gujarati, there is a similar issue to the corresponding Devanagari cases: I would have thought sequences using 0A86 and 0AC5..0AC8 would be mentioned in relation to 0A91, 0A93, 0A94.

For Oriya, I can understand why 0B10 and 0B14 might be mentioned, but there are variations in how those appear, and it is not obvious that the combinations listed are related. It seems possible to me that 0B10 might be historically related to ee + ya-phalaa + ai length mark (consider how cognates of this diphthong are written, e.g. in Thai: e + i + y), and similarly for 0B14. Because of the variation in appearance of those two independent vowels, I didn't feel I should block any sequences in Uniscribe to prevent spoofing of these two characters. (I concur with regard to 0B06, though.)

For Telugu, I might have included 0C60 vs. < 0C0B, 0C3E > and 0C61 vs. < 0C0C, 0C3E >.

For Kannada and Malayalam, I concur with the combinations mentioned.

Peter
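
As an illustration of the kind of check discussed above, here is a minimal Python sketch that scans text for letter + matra sequences and reports the atomic independent vowel each was compared against. It is not Uniscribe's implementation, and the names DISCUSSED_PAIRS and find_discussed_sequences are hypothetical. The table holds only the pairs spelled out explicitly in this memo, including ones whose rationale is questioned; the range-based Devanagari and Gujarati cases are left out because the memo does not give their pairings one by one.

```python
# Hypothetical table: code point sequence -> atomic independent vowel.
# Only pairs written out explicitly in the memo are included; this is an
# illustration, not an endorsed blocklist.
DISCUSSED_PAIRS = {
    "\u0909\u0941": "\u090A",  # Devanagari (rationale questioned in the memo)
    "\u098B\u09C3": "\u09E0",  # Bengali (suggested addition)
    "\u098C\u09E2": "\u09E1",  # Bengali (suggested addition)
    "\u0A73\u0A4B": "\u0A13",  # Gurmukhi (rationale questioned in the memo)
    "\u0C0B\u0C3E": "\u0C60",  # Telugu (suggested addition)
    "\u0C0C\u0C3E": "\u0C61",  # Telugu (suggested addition)
}


def find_discussed_sequences(text):
    """Return sorted (index, sequence, atomic) tuples for each table hit."""
    hits = []
    for seq, atomic in DISCUSSED_PAIRS.items():
        start = text.find(seq)
        while start != -1:
            hits.append((start, seq, atomic))
            start = text.find(seq, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    sample = "\u0909\u0941"  # < 0909, 0941 >
    for index, seq, atomic in find_discussed_sequences(sample):
        codes = ", ".join(f"{ord(ch):04X}" for ch in seq)
        print(f"offset {index}: < {codes} > vs. {ord(atomic):04X}")
```

A renderer or input filter could use such a lookup to reject a sequence or flag it for review; which pairs belong in the table is exactly the question raised in the comments above.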