From: Uriah Eisenstein (firstname.lastname@example.org)
Date: Tue Aug 31 2010 - 10:32:33 CDT
Thanks for the answers (and sorry for the somewhat late reply),
My interest in this question is purely technical - as I've mentioned
elsewhere, I'm trying to load Unihan data into an SQL database*, so
occasionally I need more details about the contents of fields without
actually using them. In this case I guess I'll ignore non-UTCnnnnn values
since they are to be changed anyway in the next version of Unicode.
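
A minimal sketch of that filtering step, assuming the usual tab-separated Unihan line layout (code point, field name, value) and the "UTCnnnnn" value shape quoted in this thread. The table name `usource`, its columns, and the helper names are hypothetical, chosen only for illustration; SQLite stands in for whatever SQL database is actually used.

```python
import re
import sqlite3

# Assumed value shape, taken from the example in this thread ("UTC00915").
UTC_INDEX = re.compile(r"^UTC\d{5}$")


def parse_usource_lines(lines):
    """Yield (code_point, utc_index) for kIRG_USource entries holding a UTC index."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) != 3:
            continue  # not a "code point / field / value" record
        code_point, field, value = fields
        # Ignore self-mappings such as "U+FA0C" and any other non-UTC value.
        if field == "kIRG_USource" and UTC_INDEX.match(value):
            yield int(code_point[2:], 16), value


def load_usources(conn, lines):
    """Store the filtered entries in a (hypothetical) usource table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS usource ("
        "code_point INTEGER PRIMARY KEY, utc_index TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO usource VALUES (?, ?)",
        parse_usource_lines(lines),
    )
    conn.commit()


if __name__ == "__main__":
    # Illustrative input: the old self-mapping and the 6.0.0 value for U+FA0C.
    sample = [
        "# illustrative sample, not a real Unihan file",
        "U+FA0C\tkIRG_USource\tU+FA0C",
        "U+FA0C\tkIRG_USource\tUTC00915",
    ]
    conn = sqlite3.connect(":memory:")
    load_usources(conn, sample)
    print(conn.execute("SELECT code_point, utc_index FROM usource").fetchall())
```

Only the second sample record passes the filter; the self-mapped value is dropped without raising an error, so the same loader could run against both the current data and the 6.0.0 data.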
Regarding your question, mpsuzuki, I assume the data in Unihan should
represent the source of the ideograph as precisely as possible, which may be
considered "historical background info". But the mapping of ideographs to
themselves is unclear; ultimately, I guess sources would be better
associated with specific glyph variants (expressed as IVS), which I
understand is still some way off... Anyway, since I'm not directly using the
data, I can't say for sure.
* I'm aware of the existence of libUnihan, but I couldn't find its latest
versions which are supposed to support Windows, and anyway I'm doing
something somewhat different.
On Mon, Aug 30, 2010 at 9:02 PM, John H. Jenkins <email@example.com> wrote:
> On Aug 29, 2010, at 6:07 AM, Uriah Eisenstein wrote:
> > UAX #38 (Unihan) defines the kIRG_USource field as a reference into the
> > U-source ideograph database described in UTR #45, having the form "UTC
> > nnnnn". However, several CJK Compatibility Ideographs are mapped to their
> > own code point values, e.g. "U+FA0C kIRG_USource U+FA0C". The formal
> > syntax of kIRG_USource allows this, but I've found no explanation as to the
> > meaning of such a mapping; there is also no such mapping from a code point
> > to another code point.
> This is being changed with the 6.0.0 release. The U-source for all such
> ideographs has been turned into a UTR #45 index, e.g., the U-source for
> U+FA0C is now UTC00915.
> What it means is that the character is a unifiable variant derived from one
> of the industrial (and not national) sources used by Unicode during the
> development of the original URO.
> John H. Jenkins