[unicode] Re: Malay (Latin) characters in Unicode?

From: Rick McGowan (rick@unicode.org)
Date: Fri Mar 23 2001 - 03:13:33 EST


David Starner wrote:

> I have a copy of Shellabear's Practical Malay Grammar that I'm preparing
> to transcribe for Project Gutenberg. Unfortunately, he represents the
> Malaysian alphabet in a Latin transliteration that includes ng as a
> single ligatured form, and I don't know how to transcribe in Unicode.

Could you perhaps post or point to a picture of what it looks like? I
suppose it's an "N" with a loopy tail of some type.

The character you are looking for is probably U+014B LATIN SMALL LETTER
ENG in lowercase or U+014A LATIN CAPITAL LETTER ENG in uppercase. I would
be rather surprised if that's not what you're looking for.
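
If you want to double-check those two code points against the Unicode
character database, something like the following would do. This is only a
small sketch in Python using the standard unicodedata module; the helper
name describe() is made up for illustration.

```python
import unicodedata

# The two candidate code points for the ligatured "ng".
ENG_SMALL = "\u014b"
ENG_CAPITAL = "\u014a"


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character (hypothetical helper)."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


if __name__ == "__main__":
    for ch in (ENG_SMALL, ENG_CAPITAL):
        # Show the code point, its formal name, and its UTF-8 bytes in hex.
        print(describe(ch), ch.encode("utf-8").hex())
```

Whether the glyph in the book really matches is still something only a
picture can settle.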

Another way to approach this would be to put a Perl script in the
Gutenberg Edition header info so that users who wanted to do so could
extract the script, run it, and transliterate the file into UTF-8. Then
put your edition out in pure ASCII with /ng/ for the ligatured form and
note that it's equivalent to U+014B.
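
To make that concrete, here is a rough sketch of what such a conversion
script might do, written in Python rather than Perl. It assumes the ASCII
edition marks the ligature with the literal string /ng/, and that /Ng/ or
/NG/ stands for the uppercase form; the uppercase convention and the
function name to_eng() are my own assumptions, not anything Project
Gutenberg has specified.

```python
import re

# Assumed ASCII markup: "/ng/" for the lowercase ligature,
# "/Ng/" or "/NG/" for the uppercase one (hypothetical convention).
MARKER = re.compile(r"/(ng|Ng|NG)/")


def to_eng(text: str) -> str:
    """Replace the assumed /ng/ markers with U+014B or U+014A."""
    def repl(match: re.Match) -> str:
        # An uppercase first letter selects the capital form.
        return "\u014a" if match.group(1)[0] == "N" else "\u014b"
    return MARKER.sub(repl, text)


def convert_file(src: str, dst: str) -> None:
    """Read an ASCII edition and write a UTF-8 edition."""
    with open(src, encoding="ascii") as fin:
        converted = to_eng(fin.read())
    with open(dst, "w", encoding="utf-8") as fout:
        fout.write(converted)
```

An ordinary "ng" without the slashes is left alone, so only text that was
deliberately marked up gets converted.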

BTW, a bit off topic here but: I think it's high time that Project
Gutenberg adopted some very clear character encoding guidelines now that
they're expanding so widely. Or have they already adopted them and I've
just missed the policy statement...? They're in for a real mess if they
don't specify character encodings in a very controlled way.

        Rick


