Re: Some questions about DOM(Core) Level 1 Draft 11-September-1997

From: John Cowan (cowan@drv.cbc.com)
Date: Wed Sep 17 1997 - 10:25:13 EDT


Yung-Fong Tang wrote:

> I just did a quick scan of the document
> http://www.w3.org/MarkUp/DOM/Group/drafts/level-one-core.html and have
> some questions:

Unfortunately, this document is not accessible to non-W3C members,
and individuals cannot become members.

> boolean isBase()

This means that the character does *not* have general category Mn
or Mc or Me.

> boolean isCombining()

This means that the character has general category Mn, Mc, or Me.
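
As a rough illustration of these two readings, here is a minimal Python
sketch. It assumes Python's standard unicodedata module as the source of
general categories (its data follows a much later Unicode version than
the one discussed here), and the names is_combining and is_base are
hypothetical stand-ins for the draft's isCombining() and isBase().

```python
import unicodedata

# General categories treated as "combining" in the reading above.
COMBINING_CATEGORIES = {"Mn", "Mc", "Me"}


def is_combining(ch: str) -> bool:
    """True if the character's general category is Mn, Mc, or Me."""
    return unicodedata.category(ch) in COMBINING_CATEGORIES


def is_base(ch: str) -> bool:
    """True if the character is not a combining mark (assumed reading)."""
    return not is_combining(ch)
```

Note that under this reading every non-mark character, including
controls and unassigned code points, counts as a base character.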

> boolean isComposite()

This means that the character has a decomposition.
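
A minimal sketch of that test in Python, assuming the standard
unicodedata module; is_composite is a hypothetical name. It counts both
canonical and compatibility decompositions, which is my assumption and
not something the draft states.

```python
import unicodedata


def is_composite(ch: str) -> bool:
    """True if the character database lists a decomposition mapping.

    unicodedata.decomposition() returns an empty string when the
    character has no mapping; compatibility mappings carry a
    "<tag>" prefix and are counted here as well (an assumption).
    """
    return unicodedata.decomposition(ch) != ""
```

For example, U+00E9 (e with acute) has a decomposition and plain "e"
does not.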

> boolean isCompatibility()

I'm not sure how to interpret this.

> boolean isNonSpacing()

Category Mn.

> boolean isSmall()

Category Ll.

> boolean isNormal()

I don't know.

> boolean isCapital()

Category Lu.

> boolean isFullwidth()
> boolean isHalfwidth()

These are defined as follows on p. 6-130 of the Unicode Standard:

isFullwidth is true of U+3000-9FFF, U+F900-FAFF, U+FE30-FE6F,
U+FF01-FF5E, U+FFE0-FFE6.

isHalfwidth is true of U+0020-007E, U+0A00-1FFF, U+FB00-FE2F,
U+FE70-FEFE, U+FF61-FFDC, U+FFE8-FFEE, U+FFF0-FFFD.

The status of U+1100-11FF and U+AC00-D7A3 is doubtful. Officially,
the first block (Hangul Jamo) is halfwidth and the second block
(Hangul Syllables) is neither, but they both look fullwidth to me.
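
The ranges above translate directly into a table lookup. The following
is only a sketch in Python: the range tables are copied from the lists
above, and the function names is_fullwidth and is_halfwidth are
hypothetical stand-ins for the draft's methods.

```python
# Inclusive code point ranges, copied from the lists above.
FULLWIDTH_RANGES = [
    (0x3000, 0x9FFF), (0xF900, 0xFAFF), (0xFE30, 0xFE6F),
    (0xFF01, 0xFF5E), (0xFFE0, 0xFFE6),
]
HALFWIDTH_RANGES = [
    (0x0020, 0x007E), (0x0A00, 0x1FFF), (0xFB00, 0xFE2F),
    (0xFE70, 0xFEFE), (0xFF61, 0xFFDC), (0xFFE8, 0xFFEE),
    (0xFFF0, 0xFFFD),
]


def _in_ranges(ch: str, ranges) -> bool:
    """True if the code point of ch falls in any inclusive range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


def is_fullwidth(ch: str) -> bool:
    return _in_ranges(ch, FULLWIDTH_RANGES)


def is_halfwidth(ch: str) -> bool:
    return _in_ranges(ch, HALFWIDTH_RANGES)
```

With these tables a Hangul Jamo such as U+1100 comes out halfwidth and
a Hangul Syllable such as U+AC00 comes out as neither, matching the
official status described above.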

> boolean isAlphabetic()
> Returns true if the character is an alphabetic character within some character set; false otherwise.

> We should include a specification or reference for the above "is"
> functions. Does ANYONE understand what the above functions mean?

Sure. Page 4-14 says that alphabetic characters are letter
characters which are not ideographic. General category Ll, Lu, Lm,
Lo, Lt *except* in the ranges U+4E00-9FFF and U+F900-FAFF.
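
That rule can be sketched in Python as follows, assuming the standard
unicodedata module for general categories (its data is from a later
Unicode version, so the results may differ at the edges); is_alphabetic
is a hypothetical name for the draft's isAlphabetic().

```python
import unicodedata

# Letter categories named above.
LETTER_CATEGORIES = {"Ll", "Lu", "Lm", "Lo", "Lt"}

# Ideographic ranges excluded by the rule above (inclusive).
IDEOGRAPHIC_RANGES = [(0x4E00, 0x9FFF), (0xF900, 0xFAFF)]


def is_alphabetic(ch: str) -> bool:
    """True for letters that lie outside the ideographic ranges."""
    cp = ord(ch)
    if any(lo <= cp <= hi for lo, hi in IDEOGRAPHIC_RANGES):
        return False
    return unicodedata.category(ch) in LETTER_CATEGORIES
```

So "a" and the titlecase letter U+01C5 are alphabetic, while the
ideograph U+4E2D and the digit "1" are not.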
 
> isFullwidth, isHalfwidth and isAlphabetic seem to be carried over from the
> old C interface, which does not quite fit into the Unicode-centric world
> (and proportional-font world ...). But what do the rest of the functions mean?

There is nothing inconsistent about fixed-font Unicode: see
the Everson Mono fonts.

Feel free to forward this to other mailing lists or individuals.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
			e'osai ko sarji la lojban


