Re: Hanzi trad-simp folding and z-variants

From: john knightley <john.knightley_at_gmail.com>
Date: Sat, 8 Jun 2013 19:09:20 +0800

On Sat, Jun 8, 2013 at 4:02 PM, Stephan Stiller
<stephan.stiller_at_gmail.com> wrote:

>
>> http://www.unicode.org/reports/tr38/ gives a good summary of the possibilities.
>>
> Which and where?
>
>
>
Section 3.7.1 Simplified and Traditional Chinese Variants talks about
converting between Simplified and Traditional Chinese.
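As a rough illustration of the data that section describes, here is a minimal
Python sketch that reads lines in the tab-separated layout of the Unihan
variants file and collects the kSimplifiedVariant and kTraditionalVariant
fields. The function name parse_variants and the few sample lines are my own
choices for the example, not something taken from TR38; a real run would read
the full data file instead.

```python
from collections import defaultdict

# A few sample lines in the tab-separated layout of the Unihan variants data
# (code point, field name, value). Illustrative subset only.
SAMPLE = """\
# code point<TAB>field<TAB>one or more space-separated code points
U+570B\tkSimplifiedVariant\tU+56FD
U+56FD\tkTraditionalVariant\tU+570B
U+767C\tkSimplifiedVariant\tU+53D1
U+9AEE\tkSimplifiedVariant\tU+53D1
U+53D1\tkTraditionalVariant\tU+767C U+9AEE
"""


def parse_variants(text):
    """Return {field: {char: [variant chars]}} for every field found."""
    table = defaultdict(lambda: defaultdict(list))
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        code, field, values = line.split("\t", 2)
        char = chr(int(code[2:], 16))  # "U+570B" -> "國"
        for value in values.split():
            # Some variant fields append "<source" tags; drop them if present.
            value = value.split("<")[0]
            table[field][char].append(chr(int(value[2:], 16)))
    return table


variants = parse_variants(SAMPLE)
print(variants["kSimplifiedVariant"]["國"])
print(variants["kTraditionalVariant"]["发"])
```

Note that 发 lists two traditional variants (發 and 髮), so simplified to
traditional is one-to-many and cannot be done by table lookup alone.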

>> Trying to "fold" from one locale to another, which is what folding from
>> traditional to simplified would be, is not a good idea; best practice is to
>> bear in mind the locale being used, and do information retrieval on a
>> locale by locale basis.
>>
> What do you mean?
>
> Put simply: Either you don't let someone search a TW database with
> simplified characters or you convert either the search terms or the
> searched documents internally for the duration of your search – or some
> combination of these options. It is not at all obvious to me what the
> fastest way in a big data context is. There's gotta be research about this.
>
> Stephan
>
>
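The options listed in the quoted text (convert the search terms, convert the
searched documents, or both) can be sketched in a few lines of Python. This is
only a toy under stated assumptions: the T2S table is a tiny hand-written
mapping, and fold, build_index and search are hypothetical names for the
example, not an existing library API. It folds the documents once at index
time and folds each query the same way at search time.

```python
# Tiny hand-written traditional-to-simplified table, for illustration only.
T2S = {"國": "国", "發": "发", "髮": "发", "語": "语"}


def fold(text, table=T2S):
    """Replace each character by its simplified form when the table has one."""
    return "".join(table.get(ch, ch) for ch in text)


def build_index(docs):
    """Index-time option: keep a folded copy next to each original document."""
    return [(doc, fold(doc)) for doc in docs]


def search(index, query):
    """Query-time step: fold the query, then match against the folded copies."""
    folded_query = fold(query)
    return [orig for orig, folded in index if folded_query in folded]


docs = ["中國語", "頭髮", "發展"]
index = build_index(docs)

print(search(index, "中国"))  # simplified query against traditional text
print(search(index, "发"))    # one simplified character, two traditional ones
```

The second query returns both 頭髮 and 發展, because 髮 and 發 fold to the same
simplified character. That loss of distinction is the cost of folding across
locales.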
Received on Sat Jun 08 2013 - 06:14:18 CDT
