Date: 2000-08-09

Proposal to extend U+ notation

Unicode Technical Committee

Liaison

For consideration by JTC1/SC2/WG2

The Unicode Technical Committee has extended the U+ notation to allow 5 or 6 hexadecimal digits in addition to 4. For example, U-00012345 can then be written as U+12345. This gives a uniform notation that covers the full range of codepoints in Unicode. It will be used in future versions of the standard, in technical reports, and in other Unicode documents.

The consortium urges WG2 to allow this extended notation as well when referring to codepoints in 10646. This would involve the following changes to ISO 10646, Clause 6.5:

- change "four-digit form" and "4-digit form" to "four-to-six-digit form"
- change "It is not defined if the first four digits of the eight-digit form are not all zeros" to "It is not defined if the eight-digit form is greater than U-0010FFFF"
- change "{+}xxxx" in the BNF form to "{+}(xxxx | xxxxx | xxxxxx)"
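
As an illustration only, the extended notation described above could be recognized and produced along the following lines. This is a minimal sketch, not part of the proposal: the function names `parse_short` and `to_short` are hypothetical, and the decision to reject six-digit values above 10FFFF is an assumption drawn from the proposed "not defined" wording rather than from the BNF itself.

```python
import re

# Assumed reading of the proposed BNF: "U+" followed by 4, 5, or 6 hex digits.
SHORT_FORM = re.compile(r"U\+([0-9A-Fa-f]{4,6})")
# Eight-digit form: "U-" followed by exactly 8 hex digits.
LONG_FORM = re.compile(r"U-([0-9A-Fa-f]{8})")

# Upper limit named in the proposed wording (U-0010FFFF).
MAX_CODE_POINT = 0x10FFFF


def parse_short(text):
    """Return the integer value of a four-to-six-digit U+ identifier."""
    match = SHORT_FORM.fullmatch(text)
    if match is None:
        raise ValueError("not a four-to-six-digit U+ form: %r" % text)
    value = int(match.group(1), 16)
    if value > MAX_CODE_POINT:
        # Assumption: values above 10FFFF are treated as errors here.
        raise ValueError("value exceeds 10FFFF: %r" % text)
    return value


def to_short(text):
    """Convert an eight-digit U- identifier to the U+ form, if defined."""
    match = LONG_FORM.fullmatch(text)
    if match is None:
        raise ValueError("not an eight-digit U- form: %r" % text)
    value = int(match.group(1), 16)
    if value > MAX_CODE_POINT:
        # The short form is not defined above U-0010FFFF.
        raise ValueError("no U+ form defined for %r" % text)
    # Pad to at least four digits; larger values use five or six.
    return "U+{:04X}".format(value)
```

With these definitions, `to_short("U-00012345")` yields `"U+12345"`, matching the example given earlier, while `to_short("U-00110000")` raises an error because the value lies above U-0010FFFF.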