Re: problem - non-ASCII characters on Windows command line

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Thu Jan 29 2004 - 17:40:38 EST

    From: Mike Ayers
    To: Deepak Chand Rathore ; unicode
    Sent: Thursday, January 29, 2004 7:34 PM
    Subject: RE: problem - non-ASCII characters on Windows command line

    > From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]On
    > Behalf Of Markus Scherer
    > Sent: Thursday, January 29, 2004 8:51 AM
    > As I said in my earlier email, I would try the Windows
    > command line window (DOS prompt window) and
    > set it to Unicode mode via "chcp 10000".
    >
    > I just tried this on Windows 2000, and pasting Unicode
    > characters (that are not in the OEM codepage)
    > from the character map does not work. It appears to perform a
    > conversion from Unicode to the OEM
    > codepage (and then back out).
            I see a similar thing on Win2K server.
    > My other machine has Windows XP. There, the same experiment
    > works - I can paste non-Latin-1 accented
    > Latin characters, Greek, the Euro symbol, etc.

    It does not work in XP either: the default console codepage set by my
    French keyboard driver is CP-850. If I paste an "é" after changing to
    "CHCP 10000", what I see is an "Ä": the pasted code point U+00E9 (Latin
    small letter e with acute) ends up displayed as the CP-850 code 0x8E
    (U+00C4: Latin capital letter a with diaeresis).

    Note that even displaying the current codepage produces the wrong
    characters:

    C:\>MODE CON /STATUS

    âtat du périphérique CON:
    -------------------------
        Lignes?: 300
        Colonnes?: 80
        Vitesse clavier?: 31
        DÄlai clavier?: 1
        Page de codes?: 10000

    where "?" is the box-drawing character coded 0xCA in codepage 850 (i.e.
    U+2569, double-line box drawing joining West, North and East), which
    appears instead of the expected non-breaking space U+00A0 (if someone
    understands why this box-drawing character appears, please explain; I
    can't find the rationale). Note also the other wrong characters: "É" is
    incorrectly displayed as "â", and "é" as "Ä".
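
    The three substitutions above are at least consistent with one reading,
    offered as an assumption rather than a confirmed diagnosis: Windows code
    page 10000 is the Macintosh Roman code page, so the text may be converted
    to Mac Roman bytes and those bytes then used directly as indices into the
    CP-850 raster font. The sketch below uses Python's "mac_roman" and "cp850"
    codecs as stand-ins for the two tables; the function name is hypothetical
    and this says nothing about how the console is actually implemented.

```python
# Sketch only: assumes code page 10000 behaves like Python's "mac_roman" codec
# and that the raster font is indexed by CP-850 byte values ("cp850" codec).

def shown_by_cp850_font(text: str, active_codepage: str = "mac_roman") -> str:
    """Encode text in the active code page, then read the bytes as CP-850."""
    return text.encode(active_codepage).decode("cp850")

# The three characters discussed in the message: É, é and no-break space.
for ch in ("\u00c9", "\u00e9", "\u00a0"):
    byte = ch.encode("mac_roman")[0]
    seen = shown_by_cp850_font(ch)
    print(f"U+{ord(ch):04X} -> byte 0x{byte:02X} -> shown as U+{ord(seen):04X}")
```

    Under these assumptions U+00C9 maps to byte 0x83, U+00E9 to 0x8E and
    U+00A0 to 0xCA, which the CP-850 table reads as "â", "Ä" and U+2569.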

    Even stranger, I can select and copy what is displayed on screen, paste
    it into a Windows GUI app such as the email program I'm using to compose
    this message, and get the correct characters:

    État du périphérique CON:
    -------------------------
        Lignes : 300
        Colonnes : 80
        Vitesse clavier : 31
        Délai clavier : 1
        Page de codes : 10000

    So it seems that although the characters are not displayed correctly,
    they are stored correctly in the Console display buffer.

    This seems to be an effect of the currently selected font in the Console
    display: if this font is the default legacy raster font built for Console
    apps (built for CP-850 on my system), it will always incorrectly display
    Unicode characters stored in the display buffer.

    So I suppose that the console correctly stores the Unicode characters, but
    fails to convert them into font indices when the font is a legacy raster
    font for console apps (and I don't understand how it can produce such a
    bogus display, given that the raster font really does contain the correct
    characters, even if it requires a conversion from Unicode to the default
    OEM codepage for which the font was designed).
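
    For comparison, here is a minimal sketch of the conversion one would
    expect for a font built on CP-850: each Unicode character in the buffer
    is mapped to its CP-850 byte, and that byte is used as the glyph index.
    The helper name is hypothetical, and replacing unmappable characters with
    "?" is an assumption made for illustration, not documented console
    behaviour.

```python
# Sketch only: Python's "cp850" codec stands in for the OEM code page that the
# raster font was designed for; cp850_font_indices is a hypothetical helper.

def cp850_font_indices(text: str) -> list:
    """Return the CP-850 byte (glyph index) for each character of text.

    Characters with no CP-850 equivalent become 0x3F ("?").
    """
    return list(text.encode("cp850", errors="replace"))

# É, é and no-break space all exist in CP-850; the Euro sign does not.
for ch in ("\u00c9", "\u00e9", "\u00a0", "\u20ac"):
    print(f"U+{ord(ch):04X} -> glyph index 0x{cp850_font_indices(ch)[0]:02X}")
```

    With this mapping "É" would select glyph 0x90 and "é" glyph 0x82, not the
    0x83 and 0x8E glyphs ("â" and "Ä") that actually appear on screen.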

    The bug then remains with the display of the Windows console with legacy
    raster fonts.

    A workaround is to select a monospaced TrueType font (such as "Lucida
    Console", which is clean to read in Bold style at 12 point size) in the
    Console properties menu. Is Microsoft aware of this rendering bug with
    its own legacy raster fonts, which are selected by default for its own
    Windows console?


