L2/00-391
Clarification of terms relating to surrogates and supplementary characters

From: Mark Davis/Cupertino/IBM [mark.davis@us.ibm.com]
Sent: Friday, November 03, 2000 1:51 PM

The editorial committee recommends the clarification of terms relating to surrogates and supplementary characters. The following is a proposal for appropriate changes to the glossary. (The committee discussed this via email. This document reflects the consensus, but not all members were able to participate.)

A. The current glossary contains inconsistent and incorrect usage of "code value" and "scalar value". This should be corrected to use the more consistent terminology of UTR #17: Character Encoding Model: "code point" and "code unit".

B. Add the following terms to help establish common terminology (especially as people are now moving quickly to support surrogates).

supplementary code point. A Unicode code point between U+10000 and U+10FFFF.

supplementary character. A Unicode encoded character having a supplementary code point.

basic code point. A Unicode code point between U+0000 and U+FFFF. Also known as a BMP code point.

basic character. A Unicode encoded character having a basic code point.

surrogate code point. A Unicode code point in the range U+D800 through U+DFFF. Reserved for use by UTF-16, where a pair of surrogate code units "stands in" for a supplementary code point. (See high surrogate and low surrogate.) Note: there is no such thing as a "surrogate" character. That would be an encoded character having a surrogate code point, which is impossible.

plane. A range of 65,536 (hex 10000) contiguous code points, where the first code point is an integer multiple of 65,536 (hex 10000). The plane number is the first code point divided by 65,536, so Plane 0 is U+0000..U+FFFF, Plane 1 is U+10000..U+1FFFF, and Plane 16 is U+100000..U+10FFFF. See Basic Multilingual Plane and Supplementary Planes.

Basic Multilingual Plane. Plane 0, consisting of the basic code points.

Supplementary Planes. Planes 1 through 16, consisting of the supplementary code points.
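
Illustration (not part of the proposed glossary text). The definitions above reduce to simple range checks and integer arithmetic on code points. The following Python sketch shows one way they might be expressed. All function names are hypothetical and chosen only for this example. The surrogate-pair arithmetic is the standard UTF-16 mapping, which this document refers to but does not itself spell out. The surrogate range is taken as U+D800 through U+DFFF.

```python
MAX_CODE_POINT = 0x10FFFF
PLANE_SIZE = 0x10000  # 65,536 code points per plane


def _check(cp: int) -> None:
    """Reject integers that are not Unicode code points."""
    if not 0 <= cp <= MAX_CODE_POINT:
        raise ValueError(f"not a Unicode code point: {cp:#x}")


def plane_of(cp: int) -> int:
    """Plane number: integer quotient of the code point by 65,536."""
    _check(cp)
    return cp // PLANE_SIZE


def is_basic(cp: int) -> bool:
    """Basic (BMP) code point: U+0000..U+FFFF, i.e. Plane 0."""
    _check(cp)
    return cp < PLANE_SIZE


def is_supplementary(cp: int) -> bool:
    """Supplementary code point: U+10000..U+10FFFF, i.e. Planes 1-16."""
    _check(cp)
    return cp >= PLANE_SIZE


def is_surrogate(cp: int) -> bool:
    """Surrogate code point: U+D800..U+DFFF (a subset of the basic range)."""
    _check(cp)
    return 0xD800 <= cp <= 0xDFFF


def to_surrogate_pair(cp: int) -> tuple[int, int]:
    """Return the (high, low) UTF-16 code units for a supplementary code point."""
    if not is_supplementary(cp):
        raise ValueError(f"not a supplementary code point: {cp:#x}")
    offset = cp - PLANE_SIZE             # 20-bit value, 0x00000..0xFFFFF
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


# U+1F600 lies in Plane 1 and is represented in UTF-16 as D83D DE00.
pair = to_surrogate_pair(0x1F600)  # (0xD83D, 0xDE00)
```

Note that under these definitions a surrogate code point is also a basic code point, so is_basic and is_surrogate are written as independent predicates rather than as mutually exclusive categories.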