L2/01-262

From: Markus Scherer [markus.scherer@jtcsv.com]
Sent: Wednesday, June 20, 2001 3:07 PM
Subject: UTC Agenda Item: Proposal to reserve d7c0..d7ff for internal use

Mark Davis wrote:
> Markus Scherer noticed that one could apply Formula 1 to certain BMP points,

Thanks, Mark.

I would like to expand on this and propose to reserve U+d7c0..U+d7ff permanently for internal use, in case such UTF-16 variants may be useful for someone.

To simplify your example and achieve both
- code point order = code unit order, and
- unambiguous encoding of all code points,
one could just encode all code points U+d7f5..U+10ffff as described, with U+d7f5 as the first "lead" surrogate.

In fact, the lower limit of this could be anywhere from U+0000 to U+d7f5, which would require up to 64 additional "lead surrogates" compared to UTF-16. It is not necessary to encode U+d7f8..U+d7ff with single code units.
(Your table had U+e000..U+ffff without the surrogate pair encoding, which yielded unambiguous encoding but not code point order.)

Clarification: I am not proposing such a "bastard" encoding form, not even an exact specification or name. I am only proposing to set aside certain 64 code points for internal use.

markus
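
The variant described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a specification: it takes "Formula 1" to be the usual UTF-16 surrogate computation applied directly to the code point (lead = 0xd7c0 + (c >> 10), trail = 0xdc00 + (c & 0x3ff)), and the names encode, decode, and FIRST_PAIRED are hypothetical.

```python
# Sketch of a UTF-16 variant in which code unit order equals code point order.
# Assumption: "Formula 1" is the standard surrogate computation, which can be
# written as lead = 0xD7C0 + (c >> 10), since 0xD800 - (0x10000 >> 10) = 0xD7C0.

LEAD_BASE = 0xD7C0      # lead units then span 0xD7C0..0xDBFF
TRAIL_BASE = 0xDC00     # trail units span 0xDC00..0xDFFF
FIRST_PAIRED = 0xD7F5   # hypothetical lower limit: lowest code point sent as a pair


def encode(cp):
    """Return the code units for one code point as a tuple of ints."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point out of range")
    if cp < FIRST_PAIRED:
        return (cp,)                      # single code unit, below every lead
    return (LEAD_BASE + (cp >> 10), TRAIL_BASE + (cp & 0x3FF))


def decode(units):
    """Return the list of code points for a sequence of code units."""
    out = []
    i = 0
    while i < len(units):
        unit = units[i]
        if unit < FIRST_PAIRED:
            out.append(unit)
            i += 1
        else:
            if i + 1 >= len(units):
                raise ValueError("truncated pair")
            trail = units[i + 1]
            out.append(((unit - LEAD_BASE) << 10) | (trail - TRAIL_BASE))
            i += 2
    return out


if __name__ == "__main__":
    for cp in (0xD7F4, 0xD7F5, 0xE000, 0xFFFF, 0x10000, 0x10FFFF):
        print(f"U+{cp:04X} ->", " ".join(f"{u:04X}" for u in encode(cp)))
```

With this lower limit, single units stay below 0xd7f5, lead units fall in 0xd7f5..0xdbff, and trail units in 0xdc00..0xdfff, so the three ranges do not overlap and comparing code unit sequences gives the same order as comparing code points. Code points from U+10000 up get the same pairs as in standard UTF-16.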