CJK Parsing Techniques

From: David C. Brauer (dbrauer@worldpoint.com)
Date: Tue Jul 01 1997 - 16:17:47 EDT

Next message: Rick McGowan: "Re: CJK Parsing Techniques"
Previous message: Markus Kuhn: "Re: MES instead of ISO 8859-nn"
Next in thread: Rick McGowan: "Re: CJK Parsing Techniques"
Maybe reply: Rick McGowan: "Re: CJK Parsing Techniques"
Maybe reply: Roland Wang: "Re: CJK Parsing Techniques"
Maybe reply: Maobing Jin: "Re: CJK Parsing Techniques"
Maybe reply: Mark Leisher: "Re: CJK Parsing Techniques"
Maybe reply: Misha Wolf: "Re: CJK Parsing Techniques"

I know this question is a bit off the topic of Unicode, but this group
seems very aware of the latest in text processing.

Parsing CJK text into meaningful tokens (word equivalents) seems to be a
daunting problem because the text has no explicit word boundaries. Are
there any techniques, tools, or algorithms (free or licensable) that do a
good job of parsing "words" out of a CJK string?

Thanks,

Dave

This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:35 EDT