Re: Library of Congress diacriticized character data

From: Mark Davis
Date: Mon Jul 28 1997 - 13:25:20 EDT

I would be quite surprised if only 312 have Unicode equivalents. Which
combining marks are missing, such that the remaining characters cannot
be composed?

(I was unable to access; Communicator said
that lacked a DNS entry.)
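
To make the distinction in the question above concrete, here is a minimal sketch in Python. It assumes that "Unicode equivalent" means a single precomposed code point, and it separates that from the weaker condition that a character can be spelled as a base plus combining marks. The helper names has_precomposed and is_base_plus_marks are hypothetical, the check relies on the Unicode tables shipped with the Python interpreter rather than the 1997 data, and apart from A WITH ACUTE WITH ACUTE the sample strings are illustrations, not entries from the L of C file.

```python
import unicodedata


def has_precomposed(sequence: str) -> bool:
    """Return True if NFC folds the sequence into one code point.

    Assumption: this is what "has a Unicode equivalent" means in the thread.
    """
    return len(unicodedata.normalize("NFC", sequence)) == 1


def is_base_plus_marks(sequence: str) -> bool:
    """Return True if the sequence is one non-mark character followed
    only by characters in a Mark category (Mn, Mc, or Me)."""
    if not sequence:
        return False
    base, marks = sequence[0], sequence[1:]
    base_ok = not unicodedata.category(base).startswith("M")
    marks_ok = all(unicodedata.category(ch).startswith("M") for ch in marks)
    return base_ok and marks_ok


# Hypothetical samples, written as base + combining marks.
samples = {
    "A WITH ACUTE": "A\u0301",
    "A WITH ACUTE WITH ACUTE": "A\u0301\u0301",
    "Q WITH TILDE": "q\u0303",
}

for name, seq in samples.items():
    print(name, has_precomposed(seq), is_base_plus_marks(seq))
```

Under that reading, a sequence can lack a precomposed form and still be fully representable, so a low count of precomposed equivalents would not by itself show that any combining mark is missing.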


John Cowan wrote:

> I have posted the L of C data on diacriticized characters to
> my Web page at .
> This is the result of the printouts that James Agenbroad sent me
> some months ago. I typed them in, massaged them to get USMARC
> and Unicode equivalents, et voila.
> The data illustrate the diverse base+combining characters that
> are needed for bibliographic purposes. There are 1152 diacriticized
> characters in the file, of which only 312 have Unicode equivalents.
> (I may have missed some, and some sequences are probably bogus
> encodings, like A WITH ACUTE WITH ACUTE, which is probably an error.)
> Enjoy!
> --
> John Cowan
> e'osai ko sarji la lojban.
