Re: 32'nd bit & UTF-8

From: Martin Duerst
Date: Mon Jan 24 2005 - 03:28:48 CST

    At 21:51 05/01/20, Hans Aberg wrote:
    >On 2005/01/20 09:40, Arcane Jill wrote:

    >>> The problem is that UNIX software looks at the first bytes to determine if
    >>> it is a shell script.
    >> As noted above, so long as such software does not claim to be Unicode
    >> Conformant, who cares? Ah - but wait. What if there are users out there
    >> demanding Unicode Conformant software? Hmmm...
    >This is also another problem. For example, US Federal Agencies may be
    >required to only use software that is conformant to certain standards. Say
    >at the same time that it is almost impossible to adapt UNIX to be strictly
    >UTF-8 conformant. Then one cannot formally use UNIX anymore in US federal
    >government computers...

    The Perl script for getting rid of a BOM at the start of a UTF-8
    file is about 20 characters long. You can just run it on files that
    come from somewhere else. That would be fully in accordance with
    the Unicode Standard: you know it's a BOM, and you know you can
    remove it.
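
The Perl one-liner itself is not quoted in this message, so the following is only a rough Python sketch of the same idea, not the script referred to above. It assumes the BOM is the three-byte UTF-8 signature EF BB BF at the very start of the file; the function names `strip_utf8_bom` and `strip_bom_in_place` are hypothetical and chosen for illustration.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a single leading UTF-8 BOM (EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_place(path: str) -> bool:
    """Rewrite the file at path without its leading BOM.

    Returns True if a BOM was found and removed, False if the file
    was left untouched.
    """
    with open(path, "rb") as f:
        data = f.read()
    stripped = strip_utf8_bom(data)
    if len(stripped) == len(data):
        return False  # no BOM at the start; nothing to do
    with open(path, "wb") as f:
        f.write(stripped)
    return True
```

Run on a shell script received from elsewhere, this would leave `#!` as the first two bytes again, which is what the UNIX loader discussed earlier in the thread looks for. Only a BOM at offset 0 is removed; U+FEFF occurring later in the file is left alone.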

    Regards, Martin.

    This archive was generated by hypermail 2.1.5 : Mon Jan 24 2005 - 19:27:27 CST