From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Wed Oct 06 2004 - 06:32:23 CST
From: "Chris Harvey" <chris@languagegeek.com>
> The users seem determined to put the entire alphabet into the PUA, thus
> making a single character for <ng>, <kw>, <ii> etc. I would like to be
> able to present them with something that works and avoid this kind of
> catastrophe.
A better alternative to PUAs, which would require specific fonts and offer no
interoperable solution, would be to use controls that make grapheme clusters
explicit, notably ZWJ, and to make sure that the editor handles each one
effectively as a single cluster, including for backspace.
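
To illustrate the backspace point, here is a minimal Python sketch of an
editor that treats letters joined by ZWJ (and any trailing combining marks)
as one editing unit. The function names are hypothetical, and binding letters
across ZWJ is an editor-level convention assumed here, not something a
standard library does for you.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER


def split_units(text):
    """Split text into editing units.

    Assumed convention: a ZWJ glues the following character to the
    current unit, and combining marks stay with their base letter.
    """
    units = []
    i = 0
    while i < len(text):
        j = i + 1
        while j < len(text):
            if text[j] == ZWJ and j + 1 < len(text):
                j += 2  # take the ZWJ and the character it joins
            elif unicodedata.combining(text[j]):
                j += 1  # keep combining marks with the base letter
            else:
                break
        units.append(text[i:j])
        i = j
    return units


def backspace(text):
    """Delete the last editing unit rather than the last code point."""
    return "".join(split_units(text)[:-1])
```

With this convention, a word ending in n + ZWJ + g loses the whole <ng> unit
on a single backspace, instead of leaving a dangling "n" and joiner behind.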
Or maybe use existing combining modifier letters, even if they look like
superscripts in existing fonts (if you are ready to go to PUAs, you would
need to develop a font for them). But as we don't know the full extent of
the "alphabet", it's hard to determine which solution is best.
I am assuming (I'm possibly wrong) that you'll need it to support some
African languages, and if so, there are existing proposals to increase their
support in Unicode with pending new Latin letters. Using PUAs could be an
interim solution, before new characters are introduced, notably if you need
combining modifier letters to act with the base letter as a single cluster.
If you need this to support the Latin transliteration of the Native North
American languages that you cover on your web site, both as a convenient tool
allowing reverse transliteration to the native script (which has
constraints on its syllabic structure) and as a convenient way to fix the
Latin orthography so that richer content can be transliterated
appropriately and automatically into the native script, then maybe you
really need a specific editor that can check and enforce the Latin orthography.
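
As a rough idea of what "check and enforce" could mean, the Python sketch
below splits a word into letters of a fixed inventory by greedy longest
match, and rejects any word that contains something outside the inventory.
The inventory is invented for illustration (it only borrows <ng>, <kw> and
<ii> from the quoted message), and greedy matching is just one possible
policy.

```python
# Hypothetical letter inventory, for illustration only.
ALPHABET = ["a", "i", "ii", "k", "kw", "n", "ng", "s"]


def tokenize(word):
    """Split a word into letters of ALPHABET, longest match first.

    Returns the list of letters, or None if some part of the word
    cannot be matched against the inventory.
    """
    letters = sorted(ALPHABET, key=len, reverse=True)
    result = []
    i = 0
    while i < len(word):
        for letter in letters:
            if word.startswith(letter, i):
                result.append(letter)
                i += len(letter)
                break
        else:
            return None  # no letter of the inventory matches here
    return result
```

An editor built on such a check can count <kw> as one letter for cursor
movement and sorting without needing a dedicated code point for it.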
For example, you cite the case of Pacific coast schwas, raised consonants and
ejectives (like ə kw q̉), or Hawaiian long vowels (with macrons, rarely
supported in fonts), which are difficult to enter with existing keyboards and
fonts. Using a more basic ASCII-based orthography looks like an input method
for such languages, and an intermediate step before the production of actual
existing Unicode characters using the proper combining or modifier letters.
In that case, Unicode itself is not the issue, and you may wonder how to
create an input method editor which can show a "simplified" ASCII-only
transliteration which can reliably be converted to the more exact
orthography.
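
A minimal Python sketch of such a conversion step follows. The ASCII
conventions in the table are hypothetical, chosen only to illustrate the
idea, and the choice of target characters (for instance which combining mark
writes an ejective) is an assumption that depends on the actual orthography.

```python
import unicodedata

# Hypothetical ASCII conventions, invented for illustration only.
ASCII_TO_UNICODE = {
    "@": "\u0259",    # LATIN SMALL LETTER SCHWA
    "kw": "k\u02b7",  # k + MODIFIER LETTER SMALL W
    "q'": "q\u0313",  # q + COMBINING COMMA ABOVE (assumed ejective mark)
    "a:": "a\u0304",  # a + COMBINING MACRON
}


def to_orthography(ascii_text):
    """Convert the ASCII-only form to the exact orthography.

    Longer keys are tried first so that a digraph such as "kw" wins
    over its individual letters. The result is normalized to NFC.
    """
    keys = sorted(ASCII_TO_UNICODE, key=len, reverse=True)
    out = []
    i = 0
    while i < len(ascii_text):
        for key in keys:
            if ascii_text.startswith(key, i):
                out.append(ASCII_TO_UNICODE[key])
                i += len(key)
                break
        else:
            out.append(ascii_text[i])  # plain letter, copied unchanged
            i += 1
    return unicodedata.normalize("NFC", "".join(out))
```

The conversion is only reliable if the ASCII conventions are unambiguous: if
a plain k followed by a plain w can also occur in the language, the scheme
needs an explicit way to keep them apart.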