[unicode] Malay (Latin) characters in Unicode?

From: dvdeug@hushmail.com
Date: Fri Mar 23 2001 - 02:46:40 EST


[Feed another to the shubnet . . .]

I have a copy of Shellabear's Practical Malay Grammar that I'm preparing
to transcribe for Project Gutenberg. Unfortunately, he represents the
Malaysian alphabet in a Latin transliteration that includes ng as a
single ligatured form, and I don't know how to transcribe it in Unicode.
Some ideas:

(1) Use a private use character. Not feasible, because it needs to be
readable by the average person, not just someone who has the patience to
set up their computer for this one file.

(2) Use a ZWJ between n and g. If I'm not mistaken, most current systems
will show the ZWJ as a little black box, and very few systems any time
soon will actually display the ng ligature. Still, a good Unicode system
will render the ZWJ invisibly, displaying an acceptable plain ng while
the real information stays in the file.

(3) Petition Unicode for a new character. Right. I'd be arguing for a
character used in two books (that I know of), one that bears an annoying
similarity to the ng (non-ligatured) flame wars, and in the best of
cases I'd wait a couple of years for it to be accepted.

(4) Resort to ASCII trickery to distinguish between ng (ligatured) and
ng (non-ligatured). Marking the ng (ligatured) would be ugly; marking
the unligatured would also be ugly, although a lot rarer - I don't know
if Malay (in this transliteration) uses ng (non-ligatured).

(5) Just use ng. A simple, ASCII-only solution. I don't know if it's
information-preserving, though.
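
To make the trade-off between (2), (4) and (5) concrete, here is a rough
Python sketch, not a recommendation. It assumes the master file stores
the ligature as n + ZWJ (U+200D) + g, and the function names and the "^"
marker are hypothetical conventions invented for this illustration. It
only handles lowercase n and g.

```python
ZWJ = "\u200d"  # U+200D ZERO WIDTH JOINER
LIGATED_NG = "n" + ZWJ + "g"  # assumed master-file encoding of the ligature


def to_plain_ascii(text: str) -> str:
    """Option (5): drop every ZWJ, leaving a plain 'ng'. This is lossy."""
    return text.replace(ZWJ, "")


def to_ascii_marked(text: str, marker: str = "^") -> str:
    """Option (4): write the ligature as 'n^g' (hypothetical convention)."""
    if marker in text:
        raise ValueError("marker already occurs in the text")
    return text.replace(LIGATED_NG, "n" + marker + "g")


def from_ascii_marked(text: str, marker: str = "^") -> str:
    """Undo to_ascii_marked, restoring n + ZWJ + g."""
    return text.replace("n" + marker + "g", LIGATED_NG)


# Illustrative strings only: one with the ligature, one with a plain n + g.
ligated = "ora" + LIGATED_NG
unligated = "engage"
also_ligated = "e" + LIGATED_NG + "age"

# Option (4) round-trips, so no information is lost.
assert from_ascii_marked(to_ascii_marked(ligated)) == ligated

# Option (5) maps both spellings to the same ASCII, so it cannot be undone
# whenever the text contains both kinds of ng.
assert to_plain_ascii(also_ligated) == to_plain_ascii(unligated)
```

If the transliteration never uses an unligatured n followed by g, the
collision in the last check cannot arise and (5) loses nothing; that is
exactly the thing I don't know about the book.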

Any suggestions?

-- 
David Starner - dstarner98@aasaa.ofe.org
Gutenberg stuff - http://dvdeug.dhis.org/guten/ (down for the week)
