Re: Status of Unihan Mandarin readings?

From: John H. Jenkins (jenkins@apple.com)
Date: Thu Dec 19 2002 - 11:06:13 EST

Next message: David J. Perry: "RE: h in Greek epigraphy"

Previous message: Michael Everson: "Re: h in Greek epigraphy"
In reply to: Andrew C. West: "Re: Status of Unihan Mandarin readings?"
Next in thread: Raymond Mercier: "Re: Status of Unihan Mandarin readings?"
Reply: Raymond Mercier: "Re: Status of Unihan Mandarin readings?"

On Thursday, December 19, 2002, at 06:05 AM, Andrew C. West wrote:

>
>> - Any estimates for when it will be possible to publish a fixed version?
>
> I'll let Mr. Jenkins answer that one.
>

Unicode 4.0 timeframe. We'll also try to get the preferred Mandarin
(and possibly Cantonese) readings marked by then.

>> - Any suggestion for interim work-arounds (e.g., an older version of
>> the file, an alternative source)?
>
> Use the Unihan database for Unicode 3.0 at
> http://www.unicode.org/Public/3.0-Update/Unihan-3.txt
>
> This is the latest uncorrupted version.
>

This is what's weird about this whole thing. I can't figure out how
the corruption took place between Unicode 3.0 and 3.1. At least having
an uncorrupted 3.0 version will make it easier to fix.
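
For anyone following the work-around above, here is a minimal sketch of
pulling one reading field out of a Unihan-format file. It assumes the
file consists of tab-separated lines of the form
U+XXXX<TAB>field<TAB>value, with comment lines starting with "#", and
that multiple readings for a character are separated by spaces. The
function name parse_readings and the sample lines are hypothetical
illustrations, not copied from the actual file.

```python
def parse_readings(lines, field="kMandarin"):
    """Map each code point (as an int) to its list of readings for one field.

    Assumes tab-separated "U+XXXX<TAB>field<TAB>value" lines; anything
    that does not fit that shape is skipped rather than guessed at.
    """
    readings = {}
    for raw in lines:
        line = raw.rstrip("\r\n")
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        parts = line.split("\t")
        if len(parts) != 3:
            continue
        codepoint, name, value = parts
        if name != field or not codepoint.startswith("U+"):
            continue
        # Several readings for one character are assumed space-separated.
        readings[int(codepoint[2:], 16)] = value.split()
    return readings


# Hypothetical sample lines, for illustration only.
sample = [
    "# illustrative data, not taken from the real file\n",
    "U+4E00\tkMandarin\tYI1\n",
    "U+4E00\tkCantonese\tJAT1\n",
    "U+884C\tkMandarin\tXING2 HANG2\n",
]

mandarin = parse_readings(sample)
```

To run this against a downloaded copy, pass an open file object instead
of the sample list, for example open(path, encoding="utf-8"), assuming
the file is UTF-8. Characters that come back with more than one reading
are the ones a "preferred reading" marker would need to disambiguate.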

Meanwhile, one caveat regarding the pronunciations supplied in the
Unihan database. While we do try to be accurate and careful and while
we do try to use reliable sources, we are not lexicographers ourselves,
and there's not much we can do when our sources don't agree. For
Mandarin this is a fairly minor problem, but it's a bit more extensive
for Cantonese. One cause of this is that languages are moving targets,
and the pronunciations themselves can change over time. Another is
that sometimes people extrapolate the pronunciation for one dialect
from that of another, or from the pronunciation given in
a classical dictionary such as the KangXi. And, for Cantonese in
particular, sometimes characters are new enough that we can't go to
dictionaries but have to rely on the "man in the street" for the
pronunciation (we had a case like this come up in the last IRG). And
sometimes our fingers just trip over each other while we type.

While I think the readings we provide are useful and an important
adjunct to the Unihan database, I'm not sure I'd want to use these
readings if I were developing a commercial-grade product or writing a
scholarly treatise.

==========
John H. Jenkins
jenkins@apple.com
jhjenkins@mac.com
http://www.tejat.net/

This archive was generated by hypermail 2.1.5 : Thu Dec 19 2002 - 12:09:16 EST