Re: How does Python Unicode treat surrogates?

From: Marcin 'Qrczak' Kowalczyk (qrczak@knm.org.pl)
Date: Mon Jun 25 2001 - 14:22:56 EDT


Mon, 25 Jun 2001 07:24:28 -0700, Mark Davis <mark@macchiato.com> wrote:

> In most people's experience, it is best to leave the low level interfaces
> with indices in terms of code units, then supply some utility routines that
> tell you information about code points.

It's better still to work on characters (code points) instead of code
units internally, i.e. to use UTF-whatever only for interaction with the
external world.
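
A minimal sketch of what I mean, assuming a Python 3 interpreter in which
str is indexed by code point (an assumption about the runtime, not a
statement about every Python build). The sample string and the variable
names are only for illustration:

```python
# Assumption: Python 3, where str holds code points, not UTF-16 code units.
text = "a\U00010400b"  # U+10400 lies outside the Basic Multilingual Plane

# Internally: one index per character (code point).
char_count = len(text)
middle = text[1]

# At the boundary: encode to UTF-16 only when talking to the external world.
wire = text.encode("utf-16-le")
utf16_units = len(wire) // 2  # two bytes per 16-bit code unit

# Coming back in: decode once, then work on code points again.
roundtrip = wire.decode("utf-16-le")
```

Here the string has three characters internally, while its UTF-16 form
needs four code units because U+10400 becomes a surrogate pair.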

Unfortunately, some languages made the mistake of using only 16 bits per
character, and working on characters is not easy in them.
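
To show the kind of utility routine such a design forces on you, here is
a rough sketch that turns a sequence of 16-bit code units into code
points by combining surrogate pairs. The function name code_points is
made up for this example, and passing unpaired surrogates through
unchanged is one possible policy, not the only one:

```python
def code_points(units):
    """Convert a list of UTF-16 code units (ints) to a list of code points.

    Hypothetical helper for illustration. Unpaired surrogates are passed
    through unchanged rather than rejected.
    """
    out = []
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        is_high = 0xD800 <= u <= 0xDBFF
        if is_high and i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
            # High surrogate followed by low surrogate: combine the pair.
            low = units[i + 1]
            out.append(0x10000 + ((u - 0xD800) << 10) + (low - 0xDC00))
            i += 2
        else:
            # BMP code unit, or a surrogate without a partner.
            out.append(u)
            i += 1
    return out


units = [0x61, 0xD801, 0xDC00, 0x62]
points = code_points(units)
```

Every index into units is a code unit index, so finding the n-th
character means scanning from the start, which is exactly the cost that
working on code points internally avoids.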

-- 
 __("<  Marcin Kowalczyk * qrczak@knm.org.pl http://qrczak.ids.net.pl/
 \__/
  ^^                      SYGNATURA ZASTĘPCZA
QRCZAK


