Re: Text in composed normalized form is king, right? Does anyone generate text in decomposed normalized form?

From: Julian Bradfield <jcb+unicode_at_inf.ed.ac.uk>
Date: Sat, 2 Feb 2013 17:31:30 +0000 (GMT)

On 2013-02-02, Richard Wordingham <richard.wordingham_at_ntlworld.com> wrote:
> On Fri, 1 Feb 2013 23:51:34 +0000 (GMT)
> Julian Bradfield <jcb+unicode_at_inf.ed.ac.uk> wrote:
>> ...
> But if you use a member of the Keyman family of input methods (I've
> been using Keyman for Linux (KMFL)), you can set up a keyboard so you
> just enter that using XSAMPA keystrokes, e.g.

I never got round to learning SAMPA; I either use a standard input
method for the relevant language, or I use my own mnemonic system.
I only do non-trivial typing in Emacs, so I don't worry about X input
methods.

> you have to remember to type a_L_k to get the NFC form à̰ rather than
> a_k_L, which delivers the NFD form à̰, but do you not have to remember
> the order of diacritics anyway? Simple codepoint-sequence based
> searching only works if diacritics are in the correct order.

Well, as it happens, the diacritic-heavy orthography has been
displaced by one that only uses tone diacritics (and those are used
only in the dictionary). So ǂhèẽ and ǃn̥à̰ĩ would be written ǂhhèen and
nhǃàqin. I was working with older data. (And in truth, most of the
time I used a phonological markup (e.g. \DelAsp{ǂ}\Pln{è}\Nas{e} and
\VclNas{ǃ}\Phg{à}\Nas{i}) as I was working with four different
transcriptions with non-bijective mappings! But that's another matter.)
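
The ordering point in the quoted text can be checked directly. Here is a
minimal Python sketch, assuming the low tone mark is U+0300 COMBINING GRAVE
ACCENT and the creaky-voice mark is U+0330 COMBINING TILDE BELOW (the
variable names are only for illustration). The two typing orders are
canonically equivalent, so they compare equal once normalized, while a raw
codepoint comparison treats them as different strings.

```python
import unicodedata

# Assumed code points: U+0300 = grave (low tone), U+0330 = tilde below (creaky).
grave_then_tilde = "a\u0300\u0330"   # marks in the order typed as a_L_k
tilde_then_grave = "a\u0330\u0300"   # marks in the order typed as a_k_L

def codepoints(s):
    """Return the string as a list of U+XXXX labels."""
    return [f"U+{ord(ch):04X}" for ch in s]

# A raw comparison is sensitive to the order of the combining marks.
raw_equal = grave_then_tilde == tilde_then_grave

# Normalization puts the marks into canonical order before comparing.
pair = (grave_then_tilde, tilde_then_grave)
nfc = [unicodedata.normalize("NFC", s) for s in pair]
nfd = [unicodedata.normalize("NFD", s) for s in pair]

print("raw equal:", raw_equal)
print("NFC:", codepoints(nfc[0]), "equal:", nfc[0] == nfc[1])
print("NFD:", codepoints(nfd[0]), "equal:", nfd[0] == nfd[1])
```

So a search that normalizes both the pattern and the text to the same form
first does not depend on the order in which the marks were typed.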

> Having set up an NFC-delivering XSAMPA-based keyboard so that it had
> rules O => ɔ, O\ => ʘ, O\\ => O, I’ve found it would be a lot more
> useful if I’d been a lot less puristic and set it up so that I had O =>
> O, O\ => ɔ, O\\ => ʘ. I use multiple backslashes to get some additional
> characters and recover ASCII, an idea I got from Martin Hosken’s IPA
> keyboard. I’m currently pondering how to maintain puristic and
> ‘practical’ versions from the same source files. Ideally I’d also merge
> in the related Emacs keyboard definition.

Yes, I don't like switching, so I use compose sequences for phonetics.
E.g. multi-key $ c (turned c) for ɔ, multi-key p a (phonetic a) for
ɑ, and so on, with various mnemonic prefixes, shape-based rather than
function-based.
If I were doing lots of it, I'd probably use function keys as dead
keys to replace the multi-key prefix.
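
On keeping the puristic and 'practical' rule sets in one source: the two
sets in the quoted text differ only in which output each backslash count
selects, so one option is to store each key's outputs once and derive both
tables from that. A rough Python sketch follows. It is not Keyman or KMFL
syntax; `SOURCE`, `build_rules` and `transliterate` are hypothetical names,
and only the three O rules from the quote are included.

```python
# One shared source: for each base key, its non-ASCII outputs in order.
SOURCE = {"O": ["ɔ", "ʘ"]}

def build_rules(source, practical):
    """Derive a keystroke -> output table from the shared source.

    Each extra backslash selects the next output in the list. The puristic
    table puts the plain ASCII key last, the practical table puts it first.
    """
    rules = {}
    for key, outputs in source.items():
        ordered = [key] + outputs if practical else outputs + [key]
        for n, out in enumerate(ordered):
            rules[key + "\\" * n] = out
    return rules

def transliterate(text, rules):
    """Greedy longest-match replacement; unmatched characters pass through."""
    longest = max(map(len, rules))
    out, i = [], 0
    while i < len(text):
        for size in range(min(longest, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in rules:
                out.append(rules[chunk])
                i += size
                break
        else:
            # No rule starts here, so copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)

puristic = build_rules(SOURCE, practical=False)
practical = build_rules(SOURCE, practical=True)

typed = "O O\\ O\\\\"  # the keystrokes O, then O + one backslash, then O + two
print(transliterate(typed, puristic))
print(transliterate(typed, practical))
```

Adding a key then means one new entry in the shared table, and both
variants pick it up.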

-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
Received on Sat Feb 02 2013 - 11:37:40 CST
