Re: "A Programmer's Introduction to Unicode"

From: Richard Wordingham <richard.wordingham_at_ntlworld.com>
Date: Tue, 14 Mar 2017 02:03:56 +0000

On Mon, 13 Mar 2017 19:18:00 +0000
Alastair Houghton <alastair_at_alastairs-place.net> wrote:

> IMO, returning code points by index is a mistake. It over-emphasises
> the importance of the code point, which helps to continue the notion
> in some developers’ minds that code points are somehow “characters”.
> It also leads to people unnecessarily using UCS-4 as an internal
> representation, which seems to have very few advantages in practice
> over UTF-16.

The problem is that UTF-16-based code can very easily overlook the
handling of surrogate pairs, and one can very easily get confused over
what string lengths mean.
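
A minimal Python sketch of that confusion, assuming a two-character
string whose second character lies outside the BMP. The variable names
are illustrative only. It shows three different "lengths" for the same
string, and what happens when code truncates by UTF-16 code unit
without checking for surrogate pairs.

```python
# 'a' followed by U+1D11E MUSICAL SYMBOL G CLEF (outside the BMP)
text = "a\U0001D11E"

# Python's len() counts code points.
code_points = len(text)

# UTF-16 needs a surrogate pair (two code units) for U+1D11E.
utf16_bytes = text.encode("utf-16-le")
utf16_units = len(utf16_bytes) // 2

# UTF-8 needs one byte for 'a' and four for U+1D11E.
utf8_bytes = len(text.encode("utf-8"))

# Naively keeping the first 2 UTF-16 code units (4 bytes) cuts the
# surrogate pair in half, so the result is no longer valid UTF-16.
truncated = utf16_bytes[:4]
try:
    truncated.decode("utf-16-le")
    split_ok = True
except UnicodeDecodeError:
    split_ok = False

print(code_points, utf16_units, utf8_bytes, split_ok)
```

The same string has length 2, 3, or 5 depending on which unit is being
counted, and none of these is necessarily the number of characters a
user would perceive.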

Richard.
Received on Mon Mar 13 2017 - 21:04:18 CDT
