From: Tom Emerson (tree@basistech.com)
Date: Tue Apr 20 2004 - 22:20:48 EDT
Unihan is designed, first and foremost, to be a _data_ file for
consumption by software. It doesn't matter at all how many spaces are
used for the tabs. The use of tabs makes it trivial to write scripts to
parse the file with grep, awk, Perl, or Python.
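
To illustrate the point about tab-delimited data, here is a minimal Python sketch. It assumes each data line has three tab-separated columns (code point, field name, value) and that lines beginning with '#' are comments. The function name parse_unihan and the sample lines are made up for illustration and are not copied from the real file.

```python
def parse_unihan(lines):
    """Yield (codepoint, field, value) tuples from tab-separated lines."""
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and comment lines beginning with '#'.
        if not line or line.startswith("#"):
            continue
        # Split on the first two tabs only, so the value keeps any spaces.
        codepoint, field, value = line.split("\t", 2)
        yield codepoint, field, value


# Illustrative sample input (assumed layout, not real file contents).
sample = [
    "# comment line",
    "U+4E00\tkMandarin\tYI1",
    "U+4E00\tkDefinition\tone; a, an; alone",
]
records = list(parse_unihan(sample))
for record in records:
    print(record)
```

Because the separator is a tab, spaces inside a value (as in the definition above) need no quoting or escaping.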
With regard to the Pinyin orthography: tone numbers make it easier to
process the readings into initial, final and tone. Replacing the
numbers with diacritics makes it more difficult to do this.
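
As a rough sketch of why the numbers help, the Python below splits a reading into initial, final, and tone. The function name split_reading and the INITIALS table are assumptions made for illustration; treating "y" and "w" as initials is a simplification, and no check is made that the result is a valid syllable.

```python
# Candidate initials, digraphs first so "zh", "ch", "sh" win over "z", "c", "s".
# Treating "y" and "w" as initials is a simplifying assumption.
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")


def split_reading(reading):
    """Split a reading such as 'ZHONG1' into (initial, final, tone)."""
    reading = reading.lower()
    tone = None
    # With tone numbers, the tone is simply the trailing digit.
    if reading and reading[-1] in "12345":
        tone = int(reading[-1])
        reading = reading[:-1]
    # The initial is the first matching prefix; the rest is the final.
    for initial in INITIALS:
        if reading.startswith(initial):
            return initial, reading[len(initial):], tone
    # No initial consonant, e.g. "an4".
    return "", reading, tone


print(split_reading("ZHONG1"))
print(split_reading("an4"))
```

With diacritics, the tone mark would first have to be located on a vowel and stripped (for example via Unicode decomposition) before the same prefix match could be applied.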
--
Tom Emerson
Software Architect
Basis Technology Corp.
http://www.basistech.com
"Beware the lollipop of mediocrity: lick it once and you suck forever"