Re: Medievalist ligature character in the PUA

From: verdy_p (
Date: Mon Dec 14 2009 - 19:47:32 CST

    "John (Eljay) Love-Jensen" wrote:
    > Windows requires that filenames be normalized as NFC. This can cause all
    > sorts of havoc.
    > This Windows requirement is not enforced by the OS.

    Actually, it is enforced in some places: where long filenames (encoded in UTF-16) are associated with short
    filenames (using an 8-bit code page), the mapping is verified by CHKDSK. It detects the cases where the UTF-16
    strings do not match their mapping to the 8-bit code page, and corrects the short filenames. However, it does
    not enforce the normalization forms.

    Normalization to NFC is assumed only because the reverse mapping (from the OEM code page to UTF-16) is implicit
    within the Windows console and MS-DOS emulation layer, which only maps the OEM code page used in DOS file accesses to
    NFC form. But the mapping from UTF-16 to the OEM code page will of course fail in many cases, and DOS applications
    running in a console will not be able to reliably access all filenames stored on an NTFS or FAT32 filesystem
    unless those filenames already have unique short filenames stored on the filesystem itself.
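
    To illustrate the asymmetry described above, here is a minimal Python sketch. It uses Python's own "cp437"
    codec as a stand-in for an OEM code page; that choice, the sample name and the helper to_oem are assumptions
    for illustration only, and Windows' real conversion tables may behave differently (for example with best-fit
    mappings).

```python
import unicodedata

# Assumption: cp437 stands in for "the OEM code page"; a real system may use another one.
OEM_CODEC = "cp437"

nfc = unicodedata.normalize("NFC", "caf\u00e9")   # precomposed U+00E9
nfd = unicodedata.normalize("NFD", nfc)           # 'e' followed by combining U+0301


def to_oem(name):
    """Hypothetical helper: return the OEM bytes for name, or None if unmappable."""
    try:
        return name.encode(OEM_CODEC)
    except UnicodeEncodeError:
        return None


oem_from_nfc = to_oem(nfc)   # the precomposed letter exists in cp437
oem_from_nfd = to_oem(nfd)   # the combining accent has no cp437 equivalent

# The reverse mapping (OEM bytes back to Unicode) yields the precomposed, NFC form.
round_trip = oem_from_nfc.decode(OEM_CODEC)
```

    In this sketch the decomposed name simply cannot be expressed in the 8-bit code page, while the bytes that can
    be expressed always decode back to the NFC spelling.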

    It's the OS that "translates" (NOT just "transcodes") long filenames to short filenames (and reverses it for legacy
    console apps). But as long as the OS is used with applications that don't need short filenames, it will not do
    anything to enforce the normalization or help the conversion to short filenames.
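
    The difference between "translating" and "transcoding" can be sketched as follows. This is NOT the actual
    Windows algorithm, only a much simplified illustration of deriving an 8.3-style alias; the function name
    short_name_sketch, the set of kept characters and the limit of nine attempts are all assumptions.

```python
import re


def short_name_sketch(long_name, existing=()):
    """Simplified, hypothetical 8.3 alias generator (not the real Windows rules)."""
    stem, dot, ext = long_name.rpartition(".")
    if not dot:                      # no extension at all
        stem, ext = long_name, ""

    def clean(part):
        # Uppercase, then drop everything outside a small ASCII set (a simplification).
        return re.sub(r"[^A-Z0-9_\-]", "", part.upper())

    base, ext = clean(stem)[:6], clean(ext)[:3]
    taken = {name.upper() for name in existing}
    for i in range(1, 10):           # try NAME~1 .. NAME~9
        candidate = f"{base}~{i}" + (f".{ext}" if ext else "")
        if candidate not in taken:
            return candidate
    raise ValueError("no free alias in this simplified scheme")
```

    For example, short_name_sketch("Long File Name.text") gives "LONGFI~1.TEX", and the numeric suffix depends on
    which aliases already exist in the directory. The alias is therefore derived from the long name plus directory
    state, rather than being a character-by-character re-encoding of it.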

    And the automatic generation of short filenames from long filenames specified by Win32 apps can be disabled in the
    filesystem settings, for performance reasons. In that case legacy apps can still view files on those
    filesystems, but with degraded performance, because the mapping is not pre-stored and has to be generated and
    verified on each of their accesses to the filesystem. There is also no guarantee of consistency: the short
    filename for the same file, with the same underlying LFN, may vary in some circumstances.
