From: Asmus Freytag (firstname.lastname@example.org)
Date: Mon Jan 25 2010 - 17:00:23 CST
On 1/25/2010 2:10 PM, Christoph Päper wrote:
> John H. Jenkins:
>> The readings fields are intended to be primarily "what you would see
>> if you looked this up in a dictionary," and secondarily "what you
>> would type to input this character by itself."
>> The fields for Mandarin, Cantonese, Korean, and Vietnamese, however,
>> all use the transcription system preferred by native speakers.
> Taking the close relationship of Unicode and ISO 10646 into account,
> one would expect, naively perhaps, those transcription systems to be
> selected from available ISO romanization standards.
That's the kind of reasoning that gives ISO standards a bad name. If an
ISO standard embodies a superior solution, it might make sense to use
it, but that would have to be the case. Solely going by "brand
preference" is probably not a good selection criterion.
I am sure there are many other ISO standards for information
presentation and data formats that the Unihan database could have
followed, but didn't. In some sense, that's not ideal, because the
format is rather ad-hoc. On the other hand, it freed the original
authors/editors to focus on the contents and data collection.
> In the case of Japanese this would mean 3602 (loose or strict
> variant). This standard and similar ones, on the other hand, do not
> employ 10646 either, but provide graphic character representations
> only. (Admittedly, most of the ones I read have not been updated since
> the turn of the century.)
What one does expect is that the Unihan data will not see a wholesale
replacement, such as replacing ASCII data with kana data. Even a
wholesale correction, replacing the romanization scheme from one version
of the database to the next, could present problems for anyone who has
written scripts or programs to utilize the data.
Adding a new kana-based set of readings, on the other hand, would not
cause compatibility problems. That should be the route to pursue.
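
To illustrate the compatibility point, here is a rough sketch of the kind of
script people write against the data. It assumes the usual tab-separated
layout (code point, field name, value). The sample values are illustrative,
and kJapaneseOnKana is a made-up name standing in for whatever a new
kana-based field might be called; it is not an existing field.

```python
# Sample lines in a tab-separated layout: code point, field name, value.
# kJapaneseOn is an existing field; kJapaneseOnKana is a HYPOTHETICAL name
# used only to stand in for a newly added kana-based field.
OLD_RELEASE = (
    "U+4E00\tkJapaneseOn\tICHI ITSU\n"
    "U+4E00\tkCantonese\tjat1\n"
)

# The "new release" keeps every old line and only adds one.
NEW_RELEASE = OLD_RELEASE + "U+4E00\tkJapaneseOnKana\tイチ イツ\n"


def read_field(text, field):
    """Return {code point: value} for one field, ignoring all other fields."""
    result = {}
    for line in text.splitlines():
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        code_point, name, value = line.split("\t", 2)
        if name == field:
            result[code_point] = value
    return result


# A script that only asks for kJapaneseOn sees the same data in both releases.
old_on = read_field(OLD_RELEASE, "kJapaneseOn")
new_on = read_field(NEW_RELEASE, "kJapaneseOn")
print(old_on == new_on)
```

Because the script selects lines by field name, the added field is simply
skipped. Had the romanized values in kJapaneseOn been replaced by kana
instead, the same script would silently start returning different data.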