RE: Property-Problems

From: Hart, Edwin F.
Date: Tue Dec 05 2000 - 20:37:16 EST

Questions 1 & 2

I would venture to say that Unicode specifies *no* properties for the
Private Use Areas. Everyone is free to use the Private Use Areas any way
they please. However, both the sender and receiver need to agree on how to
use them. As part of that agreement, they would agree on which characters
to encode there and what the character properties are.
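
To make the idea of such an agreement concrete, here is a minimal sketch in
Python. Everything in it is an assumption for illustration: the table
PRIVATE_AGREEMENT, its values, and the helpers is_private_use and lookup are
hypothetical names, not part of Unicode or of any library. It only shows
that the properties come from the agreement rather than from the standard.

```python
import unicodedata

# Hypothetical agreement between sender and receiver (illustrative values only):
# code point -> property values that both parties have agreed on.
PRIVATE_AGREEMENT = {
    0xF0000: {"east_asian_width": "W", "line_break": "ID"},
    0x100000: {"east_asian_width": "N", "line_break": "AL"},
}


def is_private_use(cp: int) -> bool:
    """True for the BMP Private Use Area and the Plane 15/16 Private Use Areas."""
    return (0xE000 <= cp <= 0xF8FF
            or 0xF0000 <= cp <= 0xFFFFD
            or 0x100000 <= cp <= 0x10FFFD)


def lookup(cp: int, prop: str, fallback: str = "unspecified") -> str:
    """Return the agreed value of `prop` for a private use code point."""
    if not is_private_use(cp):
        raise ValueError(f"U+{cp:04X} is not a private use code point")
    return PRIVATE_AGREEMENT.get(cp, {}).get(prop, fallback)


print(lookup(0xF0000, "east_asian_width"))  # value taken from the agreement
print(lookup(0xF0001, "line_break"))        # nothing agreed: the fallback
# A library still has to report something; Python's bundled database gives
# these code points a general category that says only "private use".
print(unicodedata.category(chr(0xF0000)))
```

A library could ship with an empty table and let applications register their
own agreement, so that queries for private use code points never pretend to
know more than the parties have actually agreed on.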

Ed Hart

Edwin F. Hart
The Johns Hopkins University Applied Physics Laboratory
11100 Johns Hopkins Road
Laurel, MD 20723-6099
+1-443-778-6926 (Baltimore area)
+1-240-228-6926 (Washington, DC area)
+1-443-778-1093 (fax)
+1-240-228-1093 (fax)

-----Original Message-----
From: Tobias Hunger
Sent: Tuesday, December 05, 2000 19:33
To: Unicode List
Subject: Property-Problems


I am trying to implement a library that is (more or less) compatible with
Unicode 3.0.1. In doing so I ran into some problems. Maybe someone on this
list can help me:

1.) What are the East Asian Width properties of the characters in the new
Private Use Areas (Planes 15 and 16)?

2.) What are the line breaking properties for those characters?

3.) How do you generate the PropList File? Some of the properties are quite
obvious (for example the Bidi-Properties), but others are a mystery to me.
Some examples:

  (upper|lower|title)case Properties:
  I thought it had something to do with the General Categories Lu, Ll and Lt,
  but that assumption was obviously wrong.
  For example, U+02B6 is obviously an uppercase character (looking at the
  drawing in the book), yet it has the Category Lm and the lowercase Property.

  Decimal Digit Value-Property:
  I expected that all characters that have a decimal digit value set in the
  Character Database had this property set. But that is not always the case.
  The same goes for the Numeric and Digit Property.

4.) Which characters are those in the Virama and Joining Character Classes
mentioned in Table 5-3? It would be great if there were a Virama and a
Joining Property in the Property List.
Looking for hints I found several VIRAMA-characters in the datafile. Do I
need to use those with VIRAMA in the 'character name'-field and/or in the
'unicode 1.0 name'-field of UnicodeData*.txt?
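
The fields asked about in questions 3 and 4 can be inspected side by side.
The following is only a sketch using Python's standard unicodedata module,
which reflects whatever Unicode version the interpreter ships with, not
3.0.1; the helper name describe is made up for this example. It shows that
the General Category, the decimal/digit/numeric values, and the canonical
combining class are independent fields. Filtering on combining class 9 (the
class named Virama) is one possible way to find virama characters without
relying on character names, though it may not match exactly what Table 5-3
intends.

```python
import unicodedata


def describe(ch: str) -> dict:
    """Collect a few character database fields for one character."""
    return {
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),
        "decimal": unicodedata.decimal(ch, None),   # decimal digit value
        "digit": unicodedata.digit(ch, None),       # digit value
        "numeric": unicodedata.numeric(ch, None),   # numeric value
        "combining_class": unicodedata.combining(ch),
    }


# U+02B6: the modifier letter from question 3
# "5", U+00B2, U+00BD: decimal, digit-only, and numeric-only examples
# U+094D: DEVANAGARI SIGN VIRAMA
for ch in ("\u02b6", "5", "\u00b2", "\u00bd", "\u094d"):
    print(f"U+{ord(ch):04X}", describe(ch))
```

In this output SUPERSCRIPT TWO has a digit value but no decimal digit value,
and VULGAR FRACTION ONE HALF has only a numeric value, so the three numeric
fields cannot be derived from one another.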

Many thanks in advance!


-------------------------------------------------------------------
Tobias Hunger
The box said: 'Windows 95 or better'
So I installed Linux.
-------------------------------------------------------------------
