Re: Why Work at Encoding Level?

From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
Date: Mon, 19 Oct 2015 21:34:01 +0100

On Mon, 19 Oct 2015 21:35:16 +0200
Philippe Verdy <verdy_p_at_wanadoo.fr> wrote:

> 2015-10-19 20:53 GMT+02:00 Richard Wordingham <richard.wordingham_at_ntlworld.com>:

> > The word
> > 'codepoint' is even worse, as a supplementary plane codepoint is
> > represented by two BMP codepoints.

> No! The "supplementary code points" (or "supplementary characters"
> when they are assigned to characters) are represented in UTF-16 as
> two **code units**, NOT as two "code points" (even if their binary
> values are related).

A code point is 'any value in the Unicode codespace' (TUS Section 3.4
D10). The 'Unicode codespace' is a range of integers from 0 to
0x10FFFF (TUS Section 3.4 D9).

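This works fine so long as one thinks of a 'code point' as just a
number: the surrogates U+D800 to U+DFFF are then code points too, and
they lie in the BMP. The problem is that people rarely use the term
'scalar value', which is what excludes them.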
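
To make the distinction concrete, here is a rough Python sketch. The
function names are my own, made up for illustration; they are not taken
from TUS or from any library. It treats a code point as just an integer
in the codespace, a scalar value as a code point that is not a
surrogate, and shows that the two UTF-16 code units for a supplementary
code point are, taken as numbers, surrogate code points in the BMP.

```python
# Hypothetical helper names; a sketch of the definitions, not an official API.

def is_code_point(n):
    """D9/D10: any integer in the Unicode codespace, 0 to 0x10FFFF."""
    return 0 <= n <= 0x10FFFF


def is_scalar_value(n):
    """A code point that is not a surrogate (U+D800 to U+DFFF)."""
    return is_code_point(n) and not (0xD800 <= n <= 0xDFFF)


def utf16_code_units(cp):
    """Return the UTF-16 code units encoding the scalar value cp."""
    if not is_scalar_value(cp):
        raise ValueError("not a Unicode scalar value: %#x" % cp)
    if cp < 0x10000:
        # BMP: a single code unit with the same numeric value.
        return [cp]
    # Supplementary planes: split the 20-bit offset into two 10-bit halves.
    offset = cp - 0x10000
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]


units = utf16_code_units(0x1F600)
print([hex(u) for u in units])               # ['0xd83d', '0xde00']
print([is_code_point(u) for u in units])     # [True, True]
print([is_scalar_value(u) for u in units])   # [False, False]
```

Both code units pass the 'code point' test and fail the 'scalar value'
test, which is the whole of the disagreement.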

Richard.
Received on Mon Oct 19 2015 - 15:35:12 CDT
