L2/01-108

From: Peter_Constable@sil.org
Sent: Monday, February 26, 2001 5:34 PM
Subject: Re: comment on DTR27, article 3

Before the last meeting, I mentioned something in relation to revisions to the note following D32 (quoted below for convenience). The text in question is what's left of the note after D32:

"To make implementations simpler and faster, some transformation formats may allow irregular code value sequences..."

In particular, it is the statement that "some transformation formats" - parts of the standard - permit this. My suggestion was not adopted by the editorial committee in the latest revisions. I'm not sure why. I still think a change would be beneficial. Let me explain:

From the revised wording for D36, it's clear what an irregular UTF-8 sequence is (a pair of 3-byte sequences that correspond to a UTF-16 surrogate pair). D36(c) clearly states that "As a consequence of C12, these irregular UTF-8 sequences shall not be generated by a conformant process". But the wording "some transformation formats may allow irregular code value sequences" seems to contradict that: it suggests that a conformant process can generate such sequences.

If the note to D32 is intended to imply that it's OK for a conformant process to *interpret* an irregular sequence, then it should make that clear. If it is intended to imply that it's OK for a conformant process to generate and interpret irregular sequences in *internal* processing, then it should make that clear. I don't think it should be left at the current wording, however, which appears to contradict D36(c) and which could mislead someone who reads D32 but misses or forgets D36.

---------

- Article 3, UTF-8 Corrigendum:

Back on 11/21 of last year when a draft of this was being discussed, I made a suggestion, part of which Mark Davis supported, but was not incorporated.
Here's the relevant exchange (with reference only to the parts Mark agreed with me on):

PC> - Re deleted sentence from note under D32: I think the *first* sentence in
> that note is also problematic. It is not the transformation formats
> themselves that permit irregular code value sequences, but specific
> implementations. I might suggest the wording of this note could be changed
> as follows:
>
> To allow for simpler and faster implementations, some processes that
> implement a transformation format may allow irregular code value sequences
> without requiring error handling.

[snip]

> [MD] I agree that your wording is better for the first sentence. The rest goes far beyond what the committee decided.

I still think this would be improved wording. I presume this is something that can simply be handled by the editorial committee.

----------

- Peter

---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail:
---------------------------------------------------------------------------
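
The distinction drawn in this message, between generating an irregular UTF-8 sequence and merely interpreting one, can be sketched in code. What follows is a minimal illustration in Python, assuming the D36 definition quoted above (a pair of 3-byte sequences that correspond to a UTF-16 surrogate pair). The function names are hypothetical and are not taken from the standard or from the original exchange.

```python
# Illustrative sketch only; every function name here is hypothetical.

def is_irregular_pair(data: bytes) -> bool:
    """Return True if data starts with two 3-byte sequences that encode a
    high surrogate (U+D800..U+DBFF) followed by a low surrogate
    (U+DC00..U+DFFF)."""
    if len(data) < 6:
        return False
    b0, b1, b2, b3, b4, b5 = data[:6]
    # High surrogates encode as ED A0..AF 80..BF.
    high = b0 == 0xED and 0xA0 <= b1 <= 0xAF and 0x80 <= b2 <= 0xBF
    # Low surrogates encode as ED B0..BF 80..BF.
    low = b3 == 0xED and 0xB0 <= b4 <= 0xBF and 0x80 <= b5 <= 0xBF
    return high and low


def interpret_irregular_pair(data: bytes) -> int:
    """Lenient interpretation: map an irregular pair to the supplementary
    code point that the surrogate pair represents."""
    if not is_irregular_pair(data):
        raise ValueError("not an irregular UTF-8 surrogate pair")
    # Recover each 16-bit surrogate from its 3-byte form.
    hi = ((data[0] & 0x0F) << 12) | ((data[1] & 0x3F) << 6) | (data[2] & 0x3F)
    lo = ((data[3] & 0x0F) << 12) | ((data[4] & 0x3F) << 6) | (data[5] & 0x3F)
    # Standard UTF-16 surrogate-pair arithmetic.
    return 0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00)


def generate(code_point: int) -> bytes:
    """Strict generation: Python's built-in UTF-8 encoder emits the 4-byte
    form for supplementary code points and raises UnicodeEncodeError for a
    lone surrogate, so it never produces the irregular 6-byte form."""
    return chr(code_point).encode("utf-8")


irregular = bytes([0xED, 0xA0, 0x80, 0xED, 0xB0, 0x80])
code_point = interpret_irregular_pair(irregular)  # U+10000
regular = generate(code_point)                    # F0 90 80 80
```

In this sketch only the interpreting side tolerates the irregular form, while the generating side always produces the regular 4-byte sequence, which is one reading of what the suggested rewording of the note would permit.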