Re: suggestions for strategy on dealing with plain text in potentially any (unspecified) encoding?

From: Allen Haaheim (haaheima@interchange.ubc.ca)
Date: Sat May 10 2003 - 18:30:10 EDT


    > does a chinese user for example often get confronted with initially
    > completely garbled looking text until they tell the app to use chinese
    > encoding to display the text, then it becomes readable?

    I can't really answer your questions, but I offer you my opinion as an
    amateur. (If you have questions about Chinese languages, on the other
    hand, email me off-list.) I often run into garbled Chinese, and I think I
    still will even after I scrounge up the cash to upgrade, because some
    people are still processing their Chinese with old holdovers that don't
    seem to be convertible to a common standard. E.g., most recently I managed
    to convert most of an initially completely unreadable text with SC Unipad.
    Here's an example of a line with a character that didn't convert:

    宮 體 詩 � 究

    The penultimate character should be the yan of yanjiu, i.e. it should look
    like / be equivalent to 研 U+7814. In Unipad it comes out as U+FFFD (where
    it is located in the vendor's font, I assume). Before my brilliant
    conversion (by pasting as Big5) the above line looked like this:

    ®c Åé ¸Ö ã ¨s

    So while I'm pleased with this 80% success rate, the prospect of
    re-inputting the rest doesn't leave me overjoyed (it's not going to happen,
    of course). The original text was probably processed on an old version of
    Chinese Star, maybe 2.97 (for Win 3.x), because I loaded CStar 2001 in my
    EWin ME, and it didn't help. Any ideas that haven't already been posted?
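
    For anyone who wants to repeat the "paste as Big5" step outside Unipad,
    here is a minimal Python sketch. It assumes the garbled line is simply
    Big5 bytes that were displayed as Latin-1 (ISO 8859-1); the function name
    redecode is made up for illustration and is not part of any tool
    mentioned above.

```python
# Sketch only: assumes the garbled text is Big5 bytes shown as Latin-1.
GARBLED = "®c Åé ¸Ö ã ¨s"


def redecode(garbled: str, wrong: str = "latin-1", right: str = "big5") -> str:
    """Recover the original bytes, then decode them with the intended codec."""
    raw = garbled.encode(wrong)                # back to the bytes on disk
    return raw.decode(right, errors="replace")  # undecodable bytes -> U+FFFD


fixed = redecode(GARBLED)
print(fixed)

# Compare the stray byte left in the garbled line with the two bytes that
# Big5 uses for the character that should have been there.
print("stray byte :", "ã".encode("latin-1").hex())
print("Big5 for 研:", "研".encode("big5").hex())
```

    If the second byte printed for 研 matches the stray byte, the likely
    explanation is that the lead byte of that character was lost somewhere
    before the text reached me, in which case no choice of encoding will
    bring it back and only the surviving bytes can be decoded.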

    I think wherever a text depends on fonts created by a vendor that you
    either don't have or that won't work on your machine for whatever reason,
    you've got this problem. And there are quite a few of them, seeing as
    linguistic diversity combined with political rivalries resulted in lots of
    mutually incompatible Chinese encodings. Of course things have been
    improving lately, though.

    Thanks to all those who posted answers on the list; they are useful for me
    too.

    Regards,

    Allen Haaheim


