RE: Pupil's question about Burmese

From: Shawn Steele (Shawn.Steele@microsoft.com)
Date: Wed Nov 10 2010 - 01:47:44 CST


    FWIW: The OS really likes Unicode, so much of the text input, etc., is really Unicode. ANSI apps (including non-Unicode web pages) get the data back from those controls in ANSI, so you can lose data that looked like it had been entered correctly.
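A rough sketch of that data loss, assuming the ANSI code page is Windows-1252 and using a sample Burmese word chosen only for illustration. Python's `errors="replace"` is a stand-in here for whatever substitution the OS conversion actually performs:

```python
# Sample text (an assumption for illustration): the Burmese word for "Myanmar".
text = "\u1019\u103c\u1014\u103a\u1019\u102c"

# Roughly what an ANSI app gets back: the text squeezed through a legacy
# code page that has no Burmese characters. Each one becomes "?".
ansi_bytes = text.encode("cp1252", errors="replace")
round_tripped = ansi_bytes.decode("cp1252")

# The same text through UTF-8 survives intact.
utf8_round_trip = text.encode("utf-8").decode("utf-8")

print(ansi_bytes)               # the Burmese characters are gone
print(utf8_round_trip == text)  # the UTF-8 round trip is lossless
```

The control showed the right characters on screen, but the bytes handed to the ANSI app no longer contain them.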

    As mentioned, the "solution" is to fix the app to use Unicode, especially for a language like this. In these cases, machines will be fairly inconsistent even if they do support some code page, but Unicode works almost everywhere.

    Usually it's not difficult for a web page to switch to UTF-8. If it's a form, it's even possible that overriding it on your end might get the data posted back in UTF-8 and succeed (if you're really lucky), but the real fix is to have the web server serve Unicode.
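For what it's worth, a minimal sketch of what "serve Unicode" could look like on the server side, using only the Python standard library. The function names `render_form` and `read_form` and the field name `q` are hypothetical, and a real server would sit behind some framework:

```python
from urllib.parse import parse_qs, urlencode

def render_form():
    """Return (headers, body) for a form page that is declared as UTF-8."""
    body = (
        '<!DOCTYPE html><html><head><meta charset="utf-8"></head>'
        '<body><form method="post" accept-charset="utf-8">'
        '<input name="q"><input type="submit"></form></body></html>'
    )
    # Declare the encoding in the HTTP header as well as in the page itself.
    headers = [("Content-Type", "text/html; charset=utf-8")]
    return headers, body.encode("utf-8")

def read_form(posted: bytes) -> dict:
    """Decode a posted urlencoded form body, treating the escapes as UTF-8."""
    fields = parse_qs(posted.decode("ascii"), encoding="utf-8", errors="strict")
    return {name: values[0] for name, values in fields.items()}

# Simulate what a browser would post back from a UTF-8 page.
sample = "\u1019\u103c\u1014\u103a\u1019\u102c"
posted = urlencode({"q": sample}, encoding="utf-8").encode("ascii")
print(read_form(posted)["q"] == sample)
```

With the page declared as UTF-8, the browser posts the form data back in UTF-8 and nothing has to pass through a legacy code page.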

    -Shawn

     
    http://blogs.msdn.com/shawnste

    ________________________________________
    From: unicode-bounce@unicode.org [unicode-bounce@unicode.org] on behalf of Peter Constable [petercon@microsoft.com]
    Sent: Tuesday, November 09, 2010 10:42 PM
    To: James Lin; Ed
    Cc: Unicode Mailing List
    Subject: RE: Pupil's question about Burmese

    A non-Unicode web page is like a non-Unicode app. Web pages, and apps, should use Unicode.

    Peter

    -----Original Message-----
    From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On Behalf Of James Lin
    Sent: Tuesday, November 09, 2010 11:24 AM
    To: Ed
    Cc: Unicode Mailing List
    Subject: RE: Pupil's question about Burmese

    Oh, don't get me wrong. Having Unicode is like wearing a crown and being a king. It's the best thing out there.

    What I am referring to is this: if a web page or an application does not support Unicode, then even on Windows 7 with an English locale (which natively supports UTF-16), it is not possible to copy/paste directly without the correct locale being set; otherwise you may damage the bytes of the characters, which shows up as corruption.

    Even though most modern APIs are, hopefully, Unicode calls, not all (legacy) applications are written for Unicode, so conversion is still necessary even to handle non-ASCII data.

    Let me know if I am still missing something here.

    -----Original Message-----
    From: Ed [mailto:ed.trager@gmail.com]
    Sent: Tuesday, November 09, 2010 11:02 AM
    To: James Lin
    Cc: Unicode Mailing List
    Subject: Re: Pupil's question about Burmese

    >
    > Yes, displaying is fine, but the original question is about copying
    > and pasting; without the correct locale settings, you can’t copy/paste
    > without corrupting the bytes. Copy/paste is generally handled by the
    > OS itself, not the application. Even if you have a Unicode-aware
    > application, you can display, but you can’t handle non-ASCII characters.

    Why not? Modern Win32 OSes use UTF-16. Presumably most modern applications are written using calls to the modern API, which should seamlessly support copy-and-paste of Unicode text, regardless of script or language, so long as the script or language is supported at the level of displaying the text correctly and you have a font that works for that script. Actually, even if the text displays imperfectly (i.e., one sees square boxes when lacking a proper font, or the OpenType GPOS and GSUB tables are not correct for a Complex Text Layout script like Burmese), copy-and-paste of the raw Unicode text should still work correctly.

    Is this not the case?
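A small sketch of that claim, assuming the clipboard carries the text as UTF-16 code units (as the Windows CF_UNICODETEXT format does). It does not touch the real clipboard; the byte string below only stands in for the clipboard payload, and the sample word is chosen for illustration:

```python
# Sample Burmese text; whether a font can render it is irrelevant here.
text = "\u1019\u103c\u1014\u103a\u1019\u102c"

# "Copy": the source app hands the OS UTF-16 code units.
clipboard_bytes = text.encode("utf-16-le")

# "Paste": the target app reads the same code units back.
pasted = clipboard_bytes.decode("utf-16-le")

print(len(clipboard_bytes))  # two bytes per character for this BMP-only text
print(pasted == text)        # the code points survive unchanged
```

No font, shaping table, or locale is consulted anywhere in the round trip, which is why square boxes on screen do not imply damaged text.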


