Re: "uctype.h": a Unicode-based character classification API

From: Tom Garland SMI European Software Centre (Tom.Garland@Ireland.Sun.COM)
Date: Wed Feb 11 1998 - 04:14:00 EST


John,

> As I am dissatisfied with the constraints that the POSIX "ctype.h"
> classification API puts on characters, and its over-specificity
> compared with the Unicode model,

I'd be interested in seeing a critique of ctype.h if you have one or can point me to one.

thanks

tom

--------------------------------------------------------------------------
Tom Garland                      Direct:       +353 1 8199110
Technical Lead                   Internal ext: 19110
European Localization Centre     External:     +353 1 8199100
Sun Microsystems Ireland Ltd.    Fax:          +353 1 8199261
Hamilton House                   Email:        tom.garland@ireland.sun.com
East Point Business Park         Interoffice:  EDUB03
Dublin 3
Ireland
---------------------------------------------------------------------------

> To: Multiple Recipients of <unicode@unicode.org>
> From: John Cowan <cowan@locke.ccil.org>
> Date: Tue, 10 Feb 1998 15:00:45 -0800 (PST)
> Subject: "uctype.h": a Unicode-based character classification API
>
> As I am dissatisfied with the constraints that the POSIX "ctype.h"
> classification API puts on characters, and its over-specificity
> compared with the Unicode model, I have built and tested an
> alternative API known as "uctype.h", and I am now releasing the
> source code for Version 2.0. (Version 1.0 was differently
> conceived and never made it out the door.)
>
> The API still maintains the flavor of "ctype.h", but allows access
> to every property in the Unicode character database
> (UnicodeData-Latest.txt) except the name and decomposition properties.
> There are 39 uct_is* selectors, plus uct_getbidi, uct_getclass,
> uct_getnumber, uct_getdigit, uct_toupper, uct_tolower, and uct_totitle
> functions. Nonetheless, only about 6-7 kilobytes of data space
> are required.
>
> It may be reckoned an advantage or a disadvantage that this package
> is stand-alone, and not part of a more complex application framework.
>
> The code is written in ISO/IEC C and is highly portable, having
> essentially no dependencies on the environment. A suite of programs
> in C and Perl is provided for those who wish to add their own
> characters or change existing properties; the suite tests that all the
> properties are consistently provided.
>
> The code is released under an MIT/X-Consortium license: it is free
> for any use whatever, proprietary or not, and may be freely modified
> by anyone provided the copyright notice is preserved.
>
> O Sarasvati: I would love to see this code in a
> /Public/SOFTWARE/CONTRIB subdirectory. Is this possible?
> If so, tell me where and how to upload it. If not, I will post a
> URL in due course.
>
> --
> John Cowan http://www.ccil.org/~cowan cowan@ccil.org
> You tollerday donsk? N. You tolkatiff scowegian? Nn.
> You spigotty anglease? Nnn. You phonio saxo? Nnnn.
> Clear all so! 'Tis a Jute.... (FW 16.5)
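
For readers wondering what a "ctype.h"-flavored Unicode API might look
like in practice, here is a minimal sketch in ISO C. It is not the
uctype.h source: the message above gives function names but no
signatures, so every identifier below (demo_cp, demo_range,
demo_isupper, demo_totitle) is hypothetical, and the tables cover only a
handful of characters. It does show two things the standard "ctype.h"
cannot do: its functions accept only an int representable as unsigned
char (or EOF), and it has no titlecase mapping. Sorted range tables
searched by bisection are one common way to keep such data small; the
message does not say whether uctype.h works this way.

```c
#include <stddef.h>

/* All names here are hypothetical illustrations, not the uctype.h API. */
typedef unsigned long demo_cp;  /* unsigned long is at least 32 bits wide */

struct demo_range { demo_cp first, last; };

/* A small sample of uppercase-letter ranges, sorted by code point.
   Deliberately incomplete: a real table would be generated from the
   Unicode character database. */
static const struct demo_range demo_upper[] = {
    { 0x0041, 0x005A },  /* LATIN CAPITAL A..Z */
    { 0x00C0, 0x00D6 },  /* LATIN CAPITAL A WITH GRAVE..O WITH DIAERESIS */
    { 0x00D8, 0x00DE },  /* LATIN CAPITAL O WITH STROKE..THORN */
    { 0x0391, 0x03A1 },  /* GREEK CAPITAL ALPHA..RHO */
    { 0x03A3, 0x03A9 },  /* GREEK CAPITAL SIGMA..OMEGA */
    { 0x0410, 0x042F }   /* CYRILLIC CAPITAL A..YA */
};

/* Return 1 if cp falls in one of the sample ranges, else 0.
   Binary search over the sorted, non-overlapping ranges. */
int demo_isupper(demo_cp cp)
{
    size_t lo = 0;
    size_t hi = sizeof demo_upper / sizeof demo_upper[0];

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (cp < demo_upper[mid].first)
            hi = mid;
        else if (cp > demo_upper[mid].last)
            lo = mid + 1;
        else
            return 1;
    }
    return 0;
}

/* Map a code point to its titlecase form. Only ASCII letters and the
   four Latin digraph triples are handled; anything else is returned
   unchanged, which is a simplification of this sketch. */
demo_cp demo_totitle(demo_cp cp)
{
    if (cp >= 0x0061 && cp <= 0x007A)
        return cp - 0x20;                      /* a..z -> A..Z */
    switch (cp) {
    case 0x01C4: case 0x01C6: return 0x01C5;   /* DZ WITH CARON digraph */
    case 0x01C7: case 0x01C9: return 0x01C8;   /* LJ digraph */
    case 0x01CA: case 0x01CC: return 0x01CB;   /* NJ digraph */
    case 0x01F1: case 0x01F3: return 0x01F2;   /* DZ digraph */
    default:                  return cp;
    }
}
```

With these definitions, demo_isupper(0x0416) is true for CYRILLIC
CAPITAL LETTER ZHE, a value that cannot be passed to isupper() at all,
and demo_totitle(0x01C6) yields U+01C5, which is distinct from both the
uppercase and lowercase forms of that digraph.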


