Bidi/Hebrew Identifiers

Date: Tue Mar 13 2001 - 01:01:14 EST

The following are suggestions for modifying or elaborating the Unicode rules
on identifiers as they apply to Hebrew. These suggestions were discussed at a
technical meeting of the SII, and comments are requested.

Reference: TUS 3.0, section 5.16, page 133

Cantillation marks (0591 to 05AF, 05C4): These characters should not be allowed
in identifiers. They are used only in special circumstances, for Biblical texts.

Bidi formatting codes (202A to 202E, 200E, 200F): These characters should not be
allowed in identifiers (RLM and LRM require further thought). These codes are
problematic because they are invisible and can reorder the displayed text, so
different strings can appear identical. Identifiers are single words and do not
require these codes.
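
To make the two exclusions above concrete, here is a minimal sketch in Python.
The code point ranges are the ones listed in this message; the function names
are hypothetical, and treating RLM and LRM as disallowed is only an assumption
for the sake of the example, since that question is still open.

```python
# Code points are taken from the proposal text; all names are hypothetical.
CANTILLATION = set(range(0x0591, 0x05AF + 1)) | {0x05C4}
BIDI_EMBEDDING = set(range(0x202A, 0x202E + 1))   # LRE, RLE, PDF, LRO, RLO
BIDI_MARKS = {0x200E, 0x200F}                     # LRM, RLM (assumed disallowed)


def disallowed_code_points(identifier, allow_marks=False):
    """Return the code points in `identifier` that the proposal would exclude."""
    excluded = CANTILLATION | BIDI_EMBEDDING
    if not allow_marks:
        excluded = excluded | BIDI_MARKS
    return [ord(ch) for ch in identifier if ord(ch) in excluded]


def is_allowed(identifier, allow_marks=False):
    """True when the identifier contains none of the excluded code points."""
    return not disallowed_code_points(identifier, allow_marks)


# Two strings that display the same but compare as different:
plain = "\u05e9\u05dc\u05d5\u05dd"            # shin, lamed, vav, final mem
with_rlm = "\u05e9\u05dc\u200f\u05d5\u05dd"   # same letters plus an invisible RLM
```

Here is_allowed(plain) is true and is_allowed(with_rlm) is false, although the
two strings are indistinguishable on screen.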

Points (05B0 to 05BD, 05BF, 05C1, 05C2): These characters should be allowed.
In those situations where case is ignored for other languages (that is, where
identifiers are compared case-insensitively), these characters should be
ignored as well. Points are optional in Hebrew, and in normal use their
presence or absence should not be meaningful.
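
One possible reading of the rule for points, sketched in Python: when a
comparison ignores case, strip the points before comparing. The range of points
is the one listed above; the function names and the use of casefold() for the
case-insensitive part are assumptions, not part of the proposal.

```python
# Code points are taken from the proposal text; all names are hypothetical.
POINTS = set(range(0x05B0, 0x05BD + 1)) | {0x05BF, 0x05C1, 0x05C2}


def strip_points(identifier):
    """Remove Hebrew points, leaving letters and everything else untouched."""
    return "".join(ch for ch in identifier if ord(ch) not in POINTS)


def same_identifier(a, b, ignore_case=True):
    """Compare two identifiers; points are ignored only when case is ignored."""
    if ignore_case:
        a = strip_points(a).casefold()
        b = strip_points(b).casefold()
    return a == b


# The same word written with and without points:
pointed = "\u05e9\u05b8\u05c1\u05dc\u05d5\u05b9\u05dd"
unpointed = "\u05e9\u05dc\u05d5\u05dd"
```

With these definitions, same_identifier(pointed, unpointed) is true, while
same_identifier(pointed, unpointed, ignore_case=False) is false, which mirrors
how a case-sensitive language would still distinguish the two spellings.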


