Property-Problems

From: Tobias Hunger (tobias@berlin-consortium.org)
Date: Tue Dec 05 2000 - 19:49:19 EST


Hello!

I am trying to implement a library that is (more or less) compatible with
Unicode 3.0.1. In doing so I ran into some problems. Maybe someone on this
list can help me:

1.) What are the East Asian Width properties of the characters in the new
Private Use areas (Planes 15 and 16)?

2.) What are the Line Breaking properties of those characters?

3.) How do you generate the PropList file? Some of the properties are quite
obvious (for example the Bidi properties), but others are a mystery to me.
Some examples:

  (upper|lower|title)case properties:
  I thought they had something to do with the General Categories Lu, Ll and
  Lt, but that assumption was obviously wrong.
  For example, U+02B6 is obviously an uppercase character (looking at the
  drawing in the book), yet it has the category Lm and the lowercase property.

  Decimal Digit Value property:
  I expected that all characters that have a decimal digit value set in the
  Character Database would have this property set. But that is not always
  the case. The same goes for the Numeric and Digit properties.

4.) Which characters belong to the Virama and Joining character classes
mentioned in Table 5-3? It would be great if there were a Virama and a Joining
property in the Property List.
Looking for hints, I found several VIRAMA characters in the data file. Do I
need to use those with VIRAMA in the 'character name' field and/or in the
'Unicode 1.0 name' field of UnicodeData*.txt?
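
To make the last two points concrete, here is a minimal sketch of the kind of
check I have in mind. It assumes the semicolon-separated, 15-field layout of
UnicodeData*.txt (field 1 = character name, fields 6/7/8 = decimal digit,
digit and numeric value, field 10 = Unicode 1.0 name). The names Record,
parse_record and mentions_virama are my own and purely illustrative, and the
sample records are written in that layout for demonstration; they are not
quoted from the 3.0.1 data file.

```python
from typing import NamedTuple


class Record(NamedTuple):
    """The UnicodeData fields relevant to the questions above."""
    code: int
    name: str             # field 1: character name
    category: str         # field 2: general category
    combining_class: int  # field 3: canonical combining class
    decimal: str          # field 6: decimal digit value ('' if unset)
    digit: str            # field 7: digit value ('' if unset)
    numeric: str          # field 8: numeric value ('' if unset)
    old_name: str         # field 10: Unicode 1.0 name ('' if unset)


def parse_record(line: str) -> Record:
    """Split one UnicodeData line into the fields of interest."""
    fields = line.strip().split(";")
    if len(fields) != 15:
        raise ValueError("expected 15 fields, got %d" % len(fields))
    return Record(
        code=int(fields[0], 16),
        name=fields[1],
        category=fields[2],
        combining_class=int(fields[3]),
        decimal=fields[6],
        digit=fields[7],
        numeric=fields[8],
        old_name=fields[10],
    )


def mentions_virama(rec: Record) -> bool:
    """True if VIRAMA occurs in the character name or the Unicode 1.0 name."""
    return "VIRAMA" in rec.name or "VIRAMA" in rec.old_name


# Illustrative records only (assumption: same layout as the real file).
SAMPLE = [
    "0031;DIGIT ONE;Nd;0;EN;;1;1;1;N;;;;;",
    "00B2;SUPERSCRIPT TWO;No;0;EN;<super> 0032;;2;2;N;SUPERSCRIPT DIGIT TWO;;;;",
    "094D;DEVANAGARI SIGN VIRAMA;Mn;9;NSM;;;;;N;;;;;",
]

records = [parse_record(line) for line in SAMPLE]

# Candidates for a Virama property, found by name only.
virama_candidates = [rec for rec in records if mentions_virama(rec)]

# Characters whose decimal digit field is set; comparing this list with the
# PropList entries is what shows the mismatch described under 3.).
with_decimal = [rec for rec in records if rec.decimal != ""]

for rec in virama_candidates:
    print("U+%04X %s (combining class %d)" % (rec.code, rec.name, rec.combining_class))
```

Running the same loop over the full data file would give the complete list of
name-based candidates, but it cannot tell me whether matching on names is the
intended definition, which is really my question.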

Many thanks in advance!

-- 
Gruss,
Tobias

-------------------------------------------------------------------
Tobias Hunger                  The box said: 'Windows 95 or better'
tobias@berlin-consortium.org   So I installed Linux.
-------------------------------------------------------------------


