Re: In UTF-16 no codepoints are assigned to D800 - DFFF ... is that range also reserved in UTF-8 and UTF-32?

From: Stephan Stiller <stephan.stiller_at_gmail.com>
Date: Sun, 27 Jan 2013 01:37:09 -0800

>> Thus, there are no codepoints assigned to the range D800 to DFFF in UTF-16.
>>
>> Does that mean there are no codepoints assigned to the range D800 to DFFF in UTF-8 and UTF-32? I assume that's the case, but just want to check to be sure.
> Code points are assigned in the Unicode code point space, not in the encodings.
What Martinho writes is correct. To add some complementary information:

It depends on what you (Roger) originally meant by "assigned". According
to Table 2-3 (p. 23) of the Standard, the code points D800 to DFFF are
"assigned" in official Unicode lingo. They are not "assigned to abstract
character", though; the last column of the table clarifies that
"assigned" (in the most general sense) is to be understood as
"designated". The opposite, "undesignated", means "up for acquiring a
semantics in a future version" or, equivalently, "not assigned but
assignable to an abstract character in the future".

The more important thing is that they are exactly the code points which
are /unmappable by any UTF/. Here "code point" can be understood as
"element of the set of contiguous integers (the 'codespace') containing
the domain of the UTF functions, each of which maps a code point
(sequence) to a code unit (sequence)". (Martinho correctly points out
that some UTFs can in principle map them, though they don't and
shouldn't.)
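
To make the "designated but unmappable" point concrete, here is a small
Python 3 sketch (not part of the Standard; the helper name is_encodable
is my own, and I am assuming CPython's strict codecs behave as a
conformant UTF encoder would). It checks that every code point in
D800..DFFF has General Category Cs, that none of them can be encoded by
the strict UTF-8, UTF-16 or UTF-32 codecs, and that the UTF-8 bit
pattern for D800 nevertheless exists if you force it with Python's
'surrogatepass' error handler.

```python
import unicodedata

SURROGATES = range(0xD800, 0xE000)  # D800..DFFF inclusive
UTFS = ("utf-8", "utf-16-le", "utf-32-le")


def is_encodable(cp: int, encoding: str) -> bool:
    """Return True if the strict codec maps this code point to code units."""
    try:
        chr(cp).encode(encoding)  # default error handling is 'strict'
        return True
    except UnicodeEncodeError:
        return False


# Designated: every code point in the range has General Category Cs.
all_cs = all(unicodedata.category(chr(cp)) == "Cs" for cp in SURROGATES)

# Unmappable: no strict UTF codec encodes any of them.
unmappable = {
    enc: not any(is_encodable(cp, enc) for cp in SURROGATES) for enc in UTFS
}

# The immediate neighbours of the range encode without complaint.
neighbours_ok = all(
    is_encodable(cp, enc) for cp in (0xD7FF, 0xE000) for enc in UTFS
)

# "In principle" mappable: the three-byte UTF-8 pattern exists, but Python
# only produces it through the non-standard 'surrogatepass' handler.
forced = chr(0xD800).encode("utf-8", "surrogatepass")

print(all_cs, unmappable, neighbours_ok, forced)
```

The same answer comes out for all three encodings, which is the point of
Martinho's remark: the restriction lives in the code point space, not in
any one encoding form.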

Stephan
Received on Sun Jan 27 2013 - 03:46:43 CST
