From: David Starner (starner@okstate.edu)
Date: Fri Nov 15 2002 - 13:29:11 EST
On Fri, Nov 15, 2002 at 01:11:39PM -0500, John Cowan wrote:
> David Starner scripsit:
>
> > Have you looked at the way Emacs 21 handles this? It's got something
> > similar going on.
>
> I confess I remain in blissful ignorance of Emacs and all its works. Do
> you have a pointer to this particular part of it?
It's not as extensive as I remembered, and this is pretty old:
Emacs-Unicode-990824
----------------------------------------------------------------------
Internal Character code:
00 0000 xxxxxxxx xxxxxxxx Unicode U+0000 - U+FFFF
00 xxxx xxxxxxxx xxxxxxxx Unicode 20bit (via surrogate pair)
01 0000 xxxxxxxx xxxxxxxx Unicode 20bit (via surrogate pair)
01 0ppp xxxxxxxx xxxxxxxx 7 64kByte planes reserved for Emacs
01 1ppp xxxxxxxx xxxxxxxx 8 64kByte planes for private use
1x xxxx xxxxxxxx xxxxxxxx for private use, CNS 3-16, and CCCII
Private area is 180000h - 3087FFh
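
As a reading aid (not part of the Emacs note), the 22-bit layout above can be restated as a range lookup. This is only a sketch: the `RANGES` table and the `classify` function are hypothetical names made up for illustration, and the labels simply paraphrase the lines above.

```python
# Hypothetical helper: the names and labels here are illustrative and do
# not come from Emacs. The boundaries are taken from the table above.
RANGES = [
    (0x000000, 0x00FFFF, "Unicode U+0000 - U+FFFF"),
    (0x010000, 0x10FFFF, "Unicode 20bit (via surrogate pair)"),
    (0x110000, 0x17FFFF, "reserved for Emacs"),
    (0x180000, 0x1FFFFF, "private use (8 planes)"),
    (0x200000, 0x3FFFFF, "private use, CNS 3-16, and CCCII"),
]


def classify(code):
    """Return the label of the range holding a 22-bit internal code."""
    if not 0 <= code <= 0x3FFFFF:
        raise ValueError("internal character codes are 22 bits wide")
    for low, high, label in RANGES:
        if low <= code <= high:
            return label
    raise AssertionError("ranges cover every 22-bit value")
```

For instance, `classify(0x110000)` falls in the first plane reserved for Emacs, and `classify(0x180000)` is the first code of the private area.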
----------------------------------------------------------------------
Multibyte sequence in buffer/string:
1 byte: xxxxxxxx
0xxxxxxx
ASCII
1xxxxxxx
not used
2 bytes: 110xxxxx 10xxxxxx where x... are:
00000 000000 - 00001 111111 (0h - 7Fh)
7 bits not used
(or we may be able to use this area for holding 8-bit raw data
in multibyte buffer/string)
00010 000000 - 11111 111111 (80h - 7FFh)
Unicode U+0080 - U+07FF
3 bytes: 1110xxxx 10xxxxxx 10xxxxxx where x... are:
0000 000000 000000 - 0000 011111 111111 (0h - 7FFh)
11 bits not used
0000 100000 000000 - 1111 111111 111111 (800h - FFFFh)
Unicode U+0800 - U+FFFF
4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx where x... are:
000 000000 000000 000000 - 000 001111 111111 111111 (0h - FFFFh)
16 bits not used
000 010000 000000 000000 - 100 001111 111111 111111 (10000h - 10FFFFh)
20 bits Unicode via surrogate pair
100 010000 000000 000000 - 101 111111 111111 111111 (110000h - 17FFFFh)
7 64kByte planes reserved for Emacs
We may map Japanese Han characters here.
110 000000 000000 000000 - 111 111111 111111 111111 (180000h - 1FFFFFh)
8 64kByte planes reserved for private use
5 bytes: 111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx where x... are:
00 000000 000000 000000 000000 - 00 000111 111111 111111 111111
0h - 1FFFFFh
21 bits not used
00 001000 000000 000000 000000 - 00 001100 001000 011111 111111
200000h - 3087FFh
1083392 (about 1M) character code points for private use
00 001100 001000 100000 000000 - 00 001100 100111 111111 111111
308800h - 327FFFh
CNS Planes 3 to 16 (96*96*14)
00 001100 101000 000000 000000 - 00 001111 111111 111111 111111
328000h - 3FFFFFh
CCCII (96*96*96)
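
To make the byte layout concrete, here is a minimal sketch of an encoder that follows the table above: the shortest form is always used, so the "not used" ranges are never produced. The function name `encode` is hypothetical, and this is my reading of the note, not code taken from Emacs. For codes up to 10FFFFh (outside the surrogate range) the result should agree with ordinary UTF-8.

```python
def encode(code):
    """Encode one 22-bit internal character code as bytes.

    Hypothetical sketch of the layout described above, not Emacs source.
    """
    if not 0 <= code <= 0x3FFFFF:
        raise ValueError("internal character codes are 22 bits wide")
    if code < 0x80:
        return bytes([code])          # 1 byte: 0xxxxxxx (ASCII)
    if code < 0x800:
        length, lead = 2, 0xC0        # 110xxxxx
    elif code < 0x10000:
        length, lead = 3, 0xE0        # 1110xxxx
    elif code < 0x200000:
        length, lead = 4, 0xF0        # 11110xxx
    else:
        length, lead = 5, 0xF8        # 111110xx
    tail = []
    for _ in range(length - 1):
        tail.append(0x80 | (code & 0x3F))   # 10xxxxxx, low 6 bits first
        code >>= 6
    # What remains of the code fits in the payload bits of the lead byte.
    return bytes([lead | code] + tail[::-1])


if __name__ == "__main__":
    for cp in (0x41, 0xE9, 0x20AC, 0x10000, 0x180000, 0x200000):
        print(f"{cp:06X}h -> {encode(cp).hex(' ')}")
```

The 4-byte form carries 21 bits and so reaches 1FFFFFh. That is why the private area splits in two: 180000h - 1FFFFFh fits in 4 bytes, while 200000h - 3087FFh needs the 5-byte form.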
--
David Starner - starner@okstate.edu
Great is the battle-god, great, and his kingdom--
A field where a thousand corpses lie.
  -- Stephen Crane, "War is Kind"