Re: How will software source code represent 21 bit unicode characters?

From: Marcin 'Qrczak' Kowalczyk (qrczak@knm.org.pl)
Date: Tue Apr 17 2001 - 13:08:47 EDT


Tue, 17 Apr 2001 07:33:16 +0100, William Overington <WOverington@ngo.globalnet.co.uk> wrote:

> In Java source code one may currently represent a 16 bit unicode character
> by using \uhhhh where each h is any hexadecimal character.
>
> How will Java, and maybe other languages, represent 21 bit unicode
> characters?

In Haskell the character U+FFFD can be written in any of these ways
(inside a character or string literal), in decimal, hexadecimal or octal:
    \65533
    \xFFFD
    \o177775
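Such escape sequences can have any number of digits, so code points
that need more than 16 bits are written the same way. The sequence \&
expands to the empty string and is used to separate an escape from
the following text when that text begins with a digit (or, after \x,
a hexadecimal digit).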
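
A minimal sketch of the above, assuming a compiler such as GHC whose
Char covers the full Unicode range. The names (decimalForm, beyondBmp,
protected and so on) and the choice of U+1D11E as a sample character
are mine, for illustration only:

```haskell
import Data.Char (ord)

-- The three spellings of U+FFFD: decimal, hexadecimal, octal.
decimalForm, hexForm, octalForm :: Char
decimalForm = '\65533'
hexForm     = '\xFFFD'
octalForm   = '\o177775'

-- A code point beyond 16 bits uses the same syntax with more digits.
-- U+1D11E (MUSICAL SYMBOL G CLEF) is just an example value.
beyondBmp :: Char
beyondBmp = '\x1D11E'

-- \& ends the escape, so the following '1' stays a separate character.
-- Without it, the digit is absorbed and "\x411" is the single
-- character U+0411.
protected, absorbed :: String
protected = "\x41\&1"
absorbed  = "\x411"

main :: IO ()
main = do
  print (map ord [decimalForm, hexForm, octalForm])  -- [65533,65533,65533]
  print (ord beyondBmp)                              -- 119070
  print (length protected, length absorbed)          -- (2,1)
```

Because the escape is terminated by the first non-digit rather than by
a fixed width, no separate syntax is needed for 4, 5 or 6 hex digits.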

> May I, with permission, start a discussion by suggesting that \uhhhh \vhhhhh
> and \whhhhhh would be good formats. Programmers could then enter unicode
> characters into software source code using \u and four hexadecimal characters
> or using \v and five hexadecimal characters or using \w and six hexadecimal
> characters, as convenient for any particular character.

This conflicts with the existing use of \v for vertical tab, as in C
and Haskell.
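
To illustrate the clash, here is a small sketch of how Haskell already
reads such text. The names verticalTab and sample are hypothetical,
and the string is only an example of what the proposed \vhhhhh form
would look like:

```haskell
import Data.Char (ord)

-- '\v' is already a complete escape: vertical tab, code point 11.
verticalTab :: Char
verticalTab = '\v'

-- Under the existing rules the characters after \v are ordinary text,
-- so this is a vertical tab followed by the five characters "1D11E".
sample :: String
sample = "\v1D11E"

main :: IO ()
main = do
  print (ord verticalTab)  -- 11
  print (length sample)    -- 6
  print (drop 1 sample)    -- "1D11E"
```

Giving \v followed by hex digits a new meaning would therefore change
the value of string literals that are valid today.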

-- 
 __("<  Marcin Kowalczyk * qrczak@knm.org.pl http://qrczak.ids.net.pl/
 \__/
  ^^                      SYGNATURA ZASTĘPCZA
QRCZAK


