Re: Subject: Re: 32'nd bit & UTF-8

From: Michael (michka) Kaplan
Date: Mon Jan 24 2005 - 11:48:36 CST


    The filesystem only supports simple casing that does not change the size of
    the buffer (it also is a much older version of the casing table, last
    updated in the NT4 Beta 1 timeframe, though I am working on that....).
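
To illustrate the difference between simple (length-preserving) casing and full case folding, here is a minimal Python sketch. It is not the filesystem's actual casing table: `simple_upper`, `same_name_simple`, and `same_name_full` are hypothetical helpers, and because Python only exposes full case mappings, the simple mapping is approximated by leaving unchanged any character whose uppercase form is not exactly one code point.

```python
def simple_upper(name: str) -> str:
    """Approximate simple uppercasing: one code point in, one code point out.

    Assumption: characters whose full uppercase mapping expands
    (for example U+00DF -> "SS") are left as they are, so the
    result always has the same length as the input.
    """
    out = []
    for ch in name:
        up = ch.upper()
        out.append(up if len(up) == 1 else ch)
    return "".join(out)


def same_name_simple(a: str, b: str) -> bool:
    """Compare two names using the length-preserving approximation."""
    return simple_upper(a) == simple_upper(b)


def same_name_full(a: str, b: str) -> bool:
    """Compare two names using full Unicode case folding."""
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    print(same_name_simple("readme.TXT", "README.txt"))
    print(same_name_simple("ss", "\u00df"))
    print(same_name_full("ss", "\u00df"))
```

Under the length-preserving comparison, "ss" and "ß" remain distinct names, which matches the two files coexisting in one directory. Only full case folding, which can change the string length, treats them as equal.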

    This is not broken, and it is not wrong. And it is obvious that filesystems
    which have to span versions are cautious in how often they make changes.

    MichKa [MS]
    NLS Collation/Locale/Keyboard Technical Lead
    Globalization Infrastructure, Fonts, and Tools

    ----- Original Message -----
    From: "Arcane Jill"
    To: "Unicode"
    Sent: Monday, January 24, 2005 12:18 AM
    Subject: RE: Subject: Re: 32'nd bit & UTF-8

    > I, too, have managed to create Windows filenames containing U+FFFF,
    > Windows filenames containing unpaired surrogates, etc.
    > I have also managed to store two files in the same (Windows) directory,
    > one called "ss" and the other called "ß" (U+00DF). This violates the
    > principle that "ss" and "ß" are supposed to be equivalent in a
    > case-insensitive system.
    > Windows was quite happy to handle those files after the event, including
    > copying them from place to place. Even the directory containing "ss" and
    > "ß" could be dragged from a FAT16 filesystem to an NTFS filesystem without
    > complaint.
    > However - I *do not* believe that this behavior on the part of Windows is
    > correct. It is broken, and should be fixed. Filenames /should/ be made of
    > characters. Not opaque octets. Not opaque 16-bit words. This behaviour is
    > broken. Period.
    > Jill
    > -----Original Message-----
    > From: [] On Behalf Of Lars Kristan
    > Sent: 22 January 2005 10:54
    > To:
    > Subject: RE: Subject: Re: 32'nd bit & UTF-8
    > Then Windows should NOT allow creation of such filenames. But, hell, it
    > allows unpaired surrogates (Windows is still pretty much UCS-2). And it
    > allows U+FFFF. Well, it looks like filenames on Windows are not really
    > text; they are binary data. Not that I believe that, but I've been told to
    > treat UNIX filenames as binary data. Guess the same is then true for
    > Windows filenames. Nice.
    > Lars
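
The quoted messages mention filenames that contain unpaired surrogates and U+FFFF. As a rough sketch of what a stricter check could look for, the hypothetical function below scans a sequence of 16-bit code units and reports unpaired surrogates and noncharacters. It only illustrates the well-formedness rules of UTF-16; it does not describe any validation that Windows performs.

```python
def find_problems(units):
    """Return (index, description) pairs for suspicious UTF-16 code units.

    `units` is a sequence of integers in the range 0..0xFFFF.
    The function name and the report format are illustrative assumptions.
    """
    problems = []
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:
            # High surrogate: must be followed by a low surrogate.
            if i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
                cp = 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
                # The last two code points of every plane are noncharacters.
                if (cp & 0xFFFE) == 0xFFFE:
                    problems.append((i, "noncharacter"))
                i += 2
                continue
            problems.append((i, "unpaired high surrogate"))
        elif 0xDC00 <= u <= 0xDFFF:
            # Low surrogate with no preceding high surrogate.
            problems.append((i, "unpaired low surrogate"))
        elif 0xFDD0 <= u <= 0xFDEF or (u & 0xFFFE) == 0xFFFE:
            # BMP noncharacters: U+FDD0..U+FDEF, U+FFFE, U+FFFF.
            problems.append((i, "noncharacter"))
        i += 1
    return problems


if __name__ == "__main__":
    for name in ([0x0061, 0xFFFF], [0xD800, 0x0061], [0xD83D, 0xDE00]):
        print([hex(u) for u in name], find_problems(name))
```

A name stored as opaque 16-bit words passes through a filesystem regardless of what this check reports, which is the behaviour both quoted messages describe.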
