Re: Cp1256 (Windows Arabic) Characters not supported by UTF8

From: eflarup@yahoo.com
Date: Wed Aug 10 2005 - 11:56:17 CDT


    Maybe the new CharsetDetector in ICU 3.4 would be
    useful for this situation:

    http://icu.sourceforge.net/apiref/icu4j/com/ibm/icu/text/CharsetDetector.html

    --- Ritesh <ritesh.h.patel@gmail.com> wrote:

     
    > Now we have a few users who upload files that can
    > contain English and other-language characters
    > (here, Arabic).
    >
    > These files can come in different combinations, as below:
    > 1. File is UTF-8 and has English and Arabic characters.
    > 2. File is UTF-16 (LE) and has English and Arabic characters.
    > 3. File is UTF-8 and has only Arabic characters.
    > 4. File is UTF-8 and has only English characters.
    > 5. File is UTF-16 and has only Arabic characters.
    > 6. File is UTF-16 and has only English characters.
    > 7. File can be in ASCII format.
    >
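    For the combinations listed above, a rough first pass is possible
    with the JDK alone. The sketch below is not ICU's CharsetDetector
    and is much cruder than a statistical detector. It checks for a
    byte order mark, then for pure ASCII, then tries a strict UTF-8
    decode, and otherwise assumes the legacy code page from the
    subject line. The class name EncodingGuess and the method
    guessCharset are hypothetical names chosen for illustration, and
    the fallback to "windows-1256" is an assumption that only makes
    sense if Arabic is the sole legacy encoding expected. UTF-16
    without a byte order mark is not handled.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ICU or the JDK.
final class EncodingGuess {

    /** Returns a charset name guessed from the raw bytes of an upload. */
    static String guessCharset(byte[] b) {
        // 1. A byte order mark, when present, is the strongest signal.
        if (b.length >= 3 && (b[0] & 0xFF) == 0xEF
                && (b[1] & 0xFF) == 0xBB && (b[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFF && (b[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFE && (b[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }

        // 2. No byte has the high bit set: treat the data as ASCII.
        boolean ascii = true;
        for (byte x : b) {
            if (x < 0) {
                ascii = false;
                break;
            }
        }
        if (ascii) {
            return "US-ASCII";
        }

        // 3. Strict UTF-8 decode: malformed input raises an exception
        //    instead of being silently replaced.
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(b));
            return "UTF-8";
        } catch (CharacterCodingException e) {
            // 4. Assumption: anything else is the Arabic Windows code page.
            return "windows-1256";
        }
    }

    public static void main(String[] args) {
        byte[] sample = "\u0633\u0644\u0627\u0645".getBytes(StandardCharsets.UTF_8);
        System.out.println(guessCharset(sample));
    }
}
```

    The returned name can be passed to Charset.forName when opening
    the file. Whether "windows-1256" is available depends on the Java
    runtime, so that lookup should be guarded. Random Cp1256 text
    rarely forms valid UTF-8 sequences, which is why step 3 works
    reasonably well in practice, but it is still a heuristic.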


