Re: Subject: Re: 32'nd bit & UTF-8

From: Michael (michka) Kaplan (michka@trigeminal.com)
Date: Mon Jan 24 2005 - 11:48:36 CST


    The filesystem only supports simple casing that does not change the size of
    the buffer (it also uses a much older version of the casing table, last
    updated in the NT4 Beta 1 timeframe, though I am working on that...).

    This is not broken, and it is not wrong. And it is obvious that filesystems
    which have to span versions are cautious in how often they make changes.
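
To illustrate the difference between simple and full casing, here is a minimal Python sketch. It is not the NTFS casing table or any Windows API; `simple_upper`, `same_name_simple`, and `same_name_full` are hypothetical names, and the one-to-one mapping is only approximated by keeping any character whose full uppercase form would be longer than one code point.

```python
def simple_upper(name: str) -> str:
    """Approximate a simple (one-to-one, length-preserving) uppercase mapping.

    Assumption: a character whose full uppercase form expands to several
    code points is left unchanged, because a one-to-one table cannot
    express that mapping.
    """
    out = []
    for ch in name:
        up = ch.upper()  # Python applies the full Unicode case mapping
        out.append(up if len(up) == 1 else ch)
    return "".join(out)


def same_name_simple(a: str, b: str) -> bool:
    """Compare two names the way a simple-casing filesystem might."""
    return simple_upper(a) == simple_upper(b)


def same_name_full(a: str, b: str) -> bool:
    """Compare two names using full Unicode case folding."""
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    for a, b in [("readme.TXT", "README.txt"), ("ss", "\u00df")]:
        print(repr(a), repr(b), same_name_simple(a, b), same_name_full(a, b))
```

Under these assumptions "ss" and "ß" are distinct names for the length-preserving comparison but equal under full case folding, which is the situation described in the quoted message below.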

    MichKa [MS]
    NLS Collation/Locale/Keyboard Technical Lead
    Globalization Infrastructure, Fonts, and Tools
    Microsoft

    ----- Original Message -----
    From: "Arcane Jill" <arcanejill@ramonsky.com>
    To: "Unicode" <unicode@unicode.org>
    Sent: Monday, January 24, 2005 12:18 AM
    Subject: RE: Subject: Re: 32'nd bit & UTF-8

    > I, too, have also managed to create Windows filenames containing U+FFFF,
    > Windows filenames containing unpaired surrogates, etc.
    >
    > I have also managed to store two files in the same (Windows) directory, one
    > called "ss" and the other called "ß" (U+00DF). This violates the principle that
    > "ss" and "ß" are supposed to be equivalent in a case-insensitive system.
    >
    > Windows was quite happy to handle those files after the event, including
    > copying them from place to place. Even the directory containing "ss" and "ß"
    > could be dragged from a FAT16 filesystem to an NTFS filesystem without
    > complaint.
    >
    > However - I *do not* believe that this behavior on the part of Windows is
    > correct. It is broken, and should be fixed. Filenames /should/ be made of
    > characters. Not opaque octets. Not opaque 16-bit words. This behaviour is
    > broken. Period.
    >
    > Jill
    >
    >
    >
    > -----Original Message-----
    > From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On Behalf
    > Of Lars Kristan
    > Sent: 22 January 2005 10:54
    > To: unicode@unicode.org
    > Subject: RE: Subject: Re: 32'nd bit & UTF-8
    >
    >
    > Then Windows should NOT allow creation of such filenames. But, hell, it surely
    > allows unpaired surrogates (Windows is still pretty much UCS-2). And it also
    > allows U+FFFF. Well, it looks like filenames on Windows are not really text,
    > they are binary data. Not that I believe that, but I've been told to process
    > UNIX filenames as binary data. Guess the same is then true for Windows
    > filenames. Nice.
    >
    > Lars
    >
    >
    >
    >
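
The distinction discussed in this thread, between a filename as a sequence of 16-bit words and a filename as text, can be sketched in Python. This is only an illustration using the standard codecs, not a description of how Windows validates names; the helper names `utf16_units`, `is_well_formed_text`, and `is_noncharacter` are hypothetical.

```python
def utf16_units(name: str) -> list[int]:
    """Return the 16-bit code units of a name, allowing lone surrogates."""
    data = name.encode("utf-16-le", "surrogatepass")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]


def is_well_formed_text(name: str) -> bool:
    """True if the name can be encoded as strict UTF-8 (no lone surrogates)."""
    try:
        name.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return True


def is_noncharacter(ch: str) -> bool:
    """True for U+FDD0..U+FDEF and the last two code points of each plane."""
    cp = ord(ch)
    return 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE


if __name__ == "__main__":
    lone = "a\ud800"  # 'a' followed by an unpaired high surrogate
    print([hex(u) for u in utf16_units(lone)], is_well_formed_text(lone))
    print(is_well_formed_text("\uffff"), is_noncharacter("\uffff"))
```

In this sketch a name holding an unpaired surrogate is a valid sequence of 16-bit words but fails strict UTF-8 encoding, while U+FFFF encodes without error even though it is a noncharacter.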


