Re: Utility to report and repair broken surrogate pairs in UTF-16 text

From: Martin J. Dürst (duerst@it.aoyama.ac.jp)
Date: Thu Nov 04 2010 - 05:16:06 CST


    There is charlint (http://www.w3.org/International/charlint/), which is
    based on UTF-8. It may be possible to adapt it to UTF-16/32.

    Regards, Martin.

    On 2010/11/04 4:37, Jim Monty wrote:
    > Is there a utility, preferably open source and written in C, that inspects
    > UTF-16/UTF-16BE/UTF-16LE text and identifies broken surrogate pairs and illegal
    > characters? Ideally, the utility can both report illegal code units and "repair"
    > them by replacing them with U+FFFD.
    >
    > Jim Monty
    >
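
    For illustration, the check described in the question can be sketched
    in C roughly as follows. This is only a minimal sketch and is not taken
    from charlint: the function name repair_utf16 is hypothetical, the input
    is assumed to be already decoded into host-order 16-bit code units (so
    BOM detection and UTF-16BE/LE byte swapping are left out), and
    noncharacters such as U+FFFE are not checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: replace every unpaired surrogate in buf[0..len)
 * with U+FFFD, in place. Returns the number of code units replaced.
 * Assumes buf holds UTF-16 code units in host byte order. */
size_t repair_utf16(uint16_t *buf, size_t len)
{
    size_t fixed = 0;

    assert(buf != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        uint16_t u = buf[i];

        if (u >= 0xD800 && u <= 0xDBFF) {
            /* High (lead) surrogate: must be followed by a low surrogate. */
            if (i + 1 < len && buf[i + 1] >= 0xDC00 && buf[i + 1] <= 0xDFFF) {
                i++;                /* valid pair, skip its second half */
            } else {
                buf[i] = 0xFFFD;    /* lone high surrogate */
                fixed++;
            }
        } else if (u >= 0xDC00 && u <= 0xDFFF) {
            /* Low (trail) surrogate with no preceding high surrogate. */
            buf[i] = 0xFFFD;
            fixed++;
        }
    }
    return fixed;
}
```

    A reporting mode could print the index i at each replacement instead
    of (or in addition to) overwriting the code unit; the return value
    already gives the number of broken code units found.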

    -- 
    #-# Martin J. Dürst, Professor, Aoyama Gakuin University
    #-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp
    

