L2/01-108
From: Peter_Constable@sil.org
Sent: Monday, February 26, 2001 5:34 PM
Subject: Re: comment on DTR27, article 3
Before the last meeting, I mentioned something in relation to revisions to
the note following D32 (quoted below for convenience). The text in question
is what's left of the note after D32: "To make implementations simpler and
faster, some transformation formats may allow irregular code value
sequences..." In particular, my concern is the claim that "some
transformation formats" - which are parts of the standard - permit this. My
suggestion was not adopted by the editorial committee in the latest
revisions. I'm not sure why. I still think a change would be beneficial. Let
me explain:
From the revised wording for D36, it's clear what an irregular UTF-8
sequence is (a pair of 3-byte sequences that correspond to a UTF-16
surrogate pair). D36(c) clearly states that "As a consequence of C12, these
irregular UTF-8 sequences shall not be generated by a conformant process".
But the wording "some transformation formats may allow irregular code value
sequences" seems to contradict that: it suggests that a conformant process
can generate such sequences. If the note to D32 is intended to imply that
it's OK for a conformant process to *interpret* an irregular sequence, then
it should make that clear. If it is intended to imply that it's OK for a
conformant process to generate and interpret irregular sequences in
*internal* processing, then it should make that clear. I don't think it
should be left at the current wording, however, which appears to contradict
D36(c) and which could mislead someone who reads D32 but misses or forgets
D36.
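
To make the distinction concrete, here is a minimal Python sketch of what D36
calls an irregular UTF-8 sequence, and of the difference between generating
one and merely interpreting one. The function names (encode_irregular,
find_irregular, decode_lenient) are hypothetical and used only for
illustration; nothing here is taken from the standard's text or from any
particular implementation.

```python
def _three_byte(unit):
    """Encode a 16-bit value using the 3-byte UTF-8 bit pattern."""
    return bytes([0xE0 | (unit >> 12),
                  0x80 | ((unit >> 6) & 0x3F),
                  0x80 | (unit & 0x3F)])


def encode_irregular(cp):
    """Build the irregular form of a supplementary code point: its UTF-16
    surrogate pair, each surrogate written as a 3-byte sequence.
    Shown only to illustrate what must NOT be generated."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only supplementary code points have an irregular form")
    v = cp - 0x10000
    high = 0xD800 + (v >> 10)
    low = 0xDC00 + (v & 0x3FF)
    return _three_byte(high) + _three_byte(low)


def find_irregular(data):
    """Return the byte offsets at which a 6-byte irregular sequence starts."""
    offsets = []
    i = 0
    while i + 6 <= len(data):
        is_high = data[i] == 0xED and 0xA0 <= data[i + 1] <= 0xAF \
            and 0x80 <= data[i + 2] <= 0xBF
        is_low = data[i + 3] == 0xED and 0xB0 <= data[i + 4] <= 0xBF \
            and 0x80 <= data[i + 5] <= 0xBF
        if is_high and is_low:
            offsets.append(i)
            i += 6
        else:
            i += 1
    return offsets


def decode_lenient(data):
    """Interpret irregular sequences instead of rejecting them (an assumed
    policy for illustration): decode surrogates individually, then let a
    UTF-16 round trip recombine each pair into one code point."""
    text = data.decode("utf-8", "surrogatepass")
    return text.encode("utf-16-le", "surrogatepass").decode("utf-16-le")


regular = "\U00010000".encode("utf-8")    # F0 90 80 80
irregular = encode_irregular(0x10000)     # ED A0 80 ED B0 80
```

Python's built-in encoder only ever produces the 4-byte form, and its strict
decoder rejects the 6-byte form outright, whereas decode_lenient accepts it
on input. That is the generate/interpret split the note to D32 leaves
ambiguous.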
---------
- Article 3, UTF-8 Corrigendum:
Back on 11/21 of last year, when a draft of this was being discussed, I made
a suggestion, part of which Mark Davis supported, but which was not
incorporated. Here's the relevant exchange (with reference only to the parts
Mark agreed with me on):
PC> - Re deleted sentence from note under D32: I think the *first* sentence
> in that note is also problematic. It is not the transformation formats
> themselves that permit irregular code value sequences, but specific
> implementations. I might suggest the wording of this note could be
> changed as follows:
>
> To allow for simpler and faster implementations, some processes that
> implement a transformation format may allow irregular code value
> sequences without requiring error handling. [snip]
>
[MD] I agree that your wording is better for the first sentence. The rest
goes far beyond what the committee decided.
I still think this would be improved wording. I presume this is something
that can simply be handled by the editorial committee.
----------
- Peter
---------------------------------------------------------------------------
Peter Constable
Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: