John Cowan recalculated:
> Misha Wolf wrote:
> > S2. In both these places, amend the DESCSET accordingly: In the "SGML
> > Declaration for XML", change the "65376" to "2147483488". In the
> > box labelled "Scope Document", amend the "65536" to "2147483648".
> > The SGML Declaration for HTML 4.0 has "2147483486" rather than
> > "2147483488" (see the least significant digit), but I'll seek to
> > have that changed.
> By my reading of clause 7.1 of ISO/IEC 10646:1993, the correct
> number of codepoints is 2147483646 (from 0000 0000 to 7FFF FFFD inclusive).
> Deducting 160 for the C0, ASCII, and C1 ranges leaves 2147483486,
> in conformity with HTML 4.0.
> John Cowan http://www.ccil.org/~cowan firstname.lastname@example.org
The math I do varies slightly yet again. Based on the latest draft
of 10646, with corrigenda, Clause 7a specifies:
The values of P-, and R-, and C-octets used for representing graphic
characters shall be in the range 00 to FF. The values of G-octets
used for representation of graphic characters shall be in the range
of 00 to 7F. On any plane, code positions FFFE and FFFF shall not
That gives the basic size of the coding space as 2 gig, but minus
two characters per plane.
Clause 7b specifies:
Code positions to which a character is not allocated, except for
the positions reserved for private use character or for
transformation formats, are reserved for future standardisation
and shall not be used for any other purpose.
The important fact here is that U+D800..U+DFFF on the BMP do not
represent characters per se, but are reserved for the UTF-16
So I get:
7FFF planes    x FFFD chars/plane = 7FFD8003
   1 plane BMP x F7FD chars/plane =     F7FD
                                    --------
                                    7FFE7800 => 2,147,383,296
If you subtract 160 for C0, ASCII, and C1, you get 2,147,383,136.
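
The tally above can be checked mechanically. What follows is a minimal Python sketch, not part of the original post. The variable names are made up for illustration, and the per-plane figures (FFFD and F7FD) are taken exactly as written above rather than re-derived from the standard.

```python
# Per-plane figures exactly as written in the post (hexadecimal).
NON_BMP_PLANES = 0x7FFF      # every plane except the BMP
CHARS_PER_PLANE = 0xFFFD     # per-plane figure used in the post
BMP_CHARS = 0xF7FD           # BMP figure used in the post (surrogates excluded)
CONTROL_AND_ASCII = 160      # C0, ASCII, and C1 ranges

# Characters in all non-BMP planes, then the grand total.
non_bmp_total = NON_BMP_PLANES * CHARS_PER_PLANE
total = non_bmp_total + BMP_CHARS

# Candidate DESCSET figure after removing C0, ASCII, and C1.
descset_candidate = total - CONTROL_AND_ASCII

print(f"{non_bmp_total:08X}")
print(f"{total:08X} => {total:,}")
print(f"{descset_candidate:,}")
```

Note that the result depends on the per-plane figure. The inclusive range 0000 .. FFFD contains FFFE code positions, so a count that removes only FFFE and FFFF from each plane would use FFFE (and F7FE for the BMP) and arrive at a larger total.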
On the other hand, the range within which valid values are
expressible is 00000000 .. 7FFFFFFD (0 .. 2,147,483,645).
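
For comparison, the figures quoted earlier in the thread follow directly from this expressible range. Again, this is only an illustrative Python sketch with made-up names. It assumes nothing beyond the numbers already given in the quoted messages.

```python
LAST_VALID = 0x7FFFFFFD          # highest expressible value per the text above
CONTROL_AND_ASCII = 160          # C0, ASCII, and C1 ranges

# Number of values in 00000000 .. 7FFFFFFD inclusive (John Cowan's count).
count_inclusive = LAST_VALID + 1

# Deducting the 160 positions gives the figure in the HTML 4.0 declaration.
html4_figure = count_inclusive - CONTROL_AND_ASCII

# Starting from the full 2^31 space gives the figure proposed for XML.
full_space = 2 ** 31
proposed_xml_figure = full_space - CONTROL_AND_ASCII

print(f"{LAST_VALID:,}")
print(f"{count_inclusive:,}")
print(f"{html4_figure:,}")
print(f"{proposed_xml_figure:,}")
```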
It is not clear to me exactly which of these numbers is needed
for the DESCSET in XML.