RE: Detecting encoding in Plain text

From: Chris Pratley (chrispr@Exchange.Microsoft.com)
Date: Thu Jan 08 2004 - 16:45:47 EST

Next message: Hausmann, Michael: "unsubscribe - mhausmann@bridgew.edu"

Previous message: Tex Texin: "Re: Detecting encoding in Plain text"
Maybe in reply to: Brijesh Sharma: "Detecting encoding in Plain text"
Next in thread: Doug Ewell: "Re: Detecting encoding in Plain text"

If you are on the Windows platform, look at mlang.dll and its
IMultiLanguage2 and IMultiLanguage3 interfaces, which provide this
service. As others have noted, you will get false detections with too
little or ambiguous data, but you may be surprised at how accurate the
detection is (sometimes with just one character outside the "ASCII"
repertoire), because it uses language frequency data as well as
encoding rules.
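
To illustrate the "encoding rules" half only, here is a minimal Python
sketch. It is not how mlang.dll works internally, and it does not call
it; plausible_encodings and the CANDIDATES list are hypothetical names
chosen for this example. It keeps the candidate encodings whose
byte-level rules accept the data, which also shows why rules alone leave
ambiguity: single-byte encodings accept almost any input, so frequency
data is needed to choose among the survivors.

```python
# Hypothetical helper, not part of mlang: keep only the candidate
# encodings whose byte-level rules accept the data without error.
CANDIDATES = ["ascii", "utf-8", "shift_jis", "cp1252", "mac_roman", "latin-1"]


def plausible_encodings(data: bytes, candidates=CANDIDATES) -> list:
    survivors = []
    for name in candidates:
        try:
            # Strict decoding raises on any byte sequence that is
            # illegal (or undefined) in this encoding.
            data.decode(name, errors="strict")
        except UnicodeDecodeError:
            continue
        survivors.append(name)
    return survivors


if __name__ == "__main__":
    # A lone 0xE9 byte is invalid UTF-8, so "utf-8" is eliminated,
    # but several single-byte encodings still remain as candidates.
    print(plausible_encodings(b"caf\xe9"))
```

Pure ASCII input survives every candidate in this list, which is the
"too little or ambiguous data" case mentioned above.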

Chris

-----Original Message-----
From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On
Behalf Of Brijesh Sharma
Sent: January 8, 2004 3:08 AM
To: Unicode Mailing List
Subject: Detecting encoding in Plain text

Hi All,
I am new to Unicode.
I am writing a small tool to load text from a txt file into an edit box.
This txt file could be in any encoding, e.g. UTF-8, UTF-16, Mac Roman,
Windows ANSI, Western (ISO-8859-1), JIS, Shift-JIS, etc.
I can distinguish between UTF-8 and UTF-16 using the BOM, but how do I
auto-detect the others?
Any kind of help will be appreciated.
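
For reference, the BOM check described above might look like the
following Python sketch. sniff_bom and the BOMS table are hypothetical
names, and the UTF-32 entries are an addition not mentioned in the
question; they are listed first only because the UTF-32 LE signature
begins with the same two bytes as the UTF-16 LE one.

```python
import codecs

# (signature, Python codec name) pairs. Longer signatures come first so
# that a UTF-32 LE BOM is not mistaken for a UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(data: bytes):
    """Return a codec name if data starts with a known BOM, else None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


if __name__ == "__main__":
    raw = "hello".encode("utf-16")  # Python prepends a BOM here
    name = sniff_bom(raw)
    print(name, raw.decode(name) if name else raw)
```

The codec names returned here ("utf-8-sig", "utf-16", "utf-32") consume
the BOM when decoding. A result of None means no BOM was found, which is
where the other encodings in the list have to be guessed by other means.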

Regards
Brijesh Sharma

"You're not obligated to win. You're obligated to keep trying to do the
best you can every day."

This archive was generated by hypermail 2.1.5 : Thu Jan 08 2004 - 17:27:52 EST