Re: charset parameter in Google Groups

From: Andreas Prilop (prilop4321@trashmail.net)
Date: Thu Jul 01 2010 - 12:00:37 CDT


    On Mon, 28 Jun 2010, Mark Davis wrote:

    > I'll overlook the lack of civility, since I can understand
    > that kind of frustration when something doesn't work.

    Well, I have been aware of this problem/bug for many years now:
     http://groups.google.co.uk/group/sci.lang/msg/eb55255e1925350f

    Over the years I have tried again and again and again to write
    to Google, for example through forms such as
    http://www.google.com/support/contact/bin/request.py?page=&contact_type=suggestion_t&master=suggestion_t&Action.Search=Continue
    But nothing happened.

    How does one file a bug report with Google?

    > This is the first I've heard of this as a problem with Google Groups.
    > I filed a bug against Groups for this issue; I'll see what they find out.

    Thank you!

    > BTW, does the same thing happen if you send your email in UTF-8?

    No. On the contrary, groups.google always tries UTF-8:
     http://groups.google.co.uk/group/pl.test/msg/359af83289a00e8e
    This message contains Latin-2 characters and is labelled charset=ISO-8859-2.
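
A minimal sketch of why trying UTF-8 regardless of the label goes wrong for such a message. The sample text is a hypothetical Polish pangram, not the content of the linked posting, and the variable names are only illustrative.

```python
# Hypothetical sample text; NOT taken from the linked pl.test message.
text = "Zażółć gęślą jaźń"

# The bytes as they would travel in a message labelled charset=ISO-8859-2.
raw = text.encode("iso-8859-2")

# Decoding with the declared charset recovers the text exactly.
declared = raw.decode("iso-8859-2")

# Decoding the same bytes as UTF-8 fails, because a lone Latin-2 byte
# such as 0xBF is not a valid UTF-8 sequence.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# A lenient decoder does not fail, but it substitutes U+FFFD for every
# byte it cannot interpret, so the accented letters are lost.
lossy = raw.decode("utf-8", errors="replace")

print(declared == text, utf8_ok, "\ufffd" in lossy)
```

Whether Google Groups behaves exactly like the lenient decoder is not something this sketch can show; it only illustrates that ISO-8859-2 text is generally not valid UTF-8.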

    > The problem with slavishly following the charset parameter is
    > that it is often incorrect.

    I wonder how you could draw such a conclusion. In order to make
    such a statement, there must be some other (god-given?) parameter,
    which is the "real charset".

    Each and every program (web browser, newsreader, e-mail client ...)
    reads the charset parameter and displays the document
    (webpage, e-mail message, news message) accordingly.
    Do you really think that the author of a webpage or message
    will not notice?
    Do you really think that all these programs (including
    the author's tools) render the document incorrectly
    and only Google knows better?

    How do you come to this conclusion?

    I admit that the situation is different with the LANG
    attribute in HTML. That's because writing, e.g.
      <HTML lang=fi>
    has almost no practical implications and the text
    might well be French.

    But when we have charset=ISO-8859-7, all of the author's
    programs (I believe) will display the document in Greek.

    How can you think that Google alone knows better
    and should "correct" this to charset=ISO-8859-1?
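
As an illustration of what such a "correction" does, here is a small sketch with a hypothetical Greek sample word. ISO-8859-1 assigns a character to every byte value, so the wrong decoding never raises an error; it simply produces Latin letters instead of Greek ones.

```python
# Hypothetical sample word; any Greek text would behave the same way.
text = "Ελληνικά"

# Bytes as sent in a message labelled charset=ISO-8859-7.
raw = text.encode("iso-8859-7")

# Honouring the label gives the author's text back.
as_labelled = raw.decode("iso-8859-7")

# Overriding the label with ISO-8859-1 succeeds silently, because every
# byte value is defined in Latin-1, but the result is accented Latin
# letters rather than Greek.
as_latin1 = raw.decode("iso-8859-1")

print(as_labelled)
print(as_latin1)
```

Because the wrong decoding is silent, nothing in the byte stream itself signals the mistake; only the charset parameter does.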

    You can still make your guesses when a charset parameter
    is completely missing.
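
The policy argued for here (trust the declared charset, guess only when none is given) could be sketched with Python's standard `email` package as follows. The function name `decode_body` and the fallback list are assumptions made for illustration, and the sketch handles only single-part messages.

```python
from email import message_from_bytes
from typing import Tuple


def decode_body(raw_message: bytes,
                fallbacks=("utf-8", "iso-8859-1")) -> Tuple[str, str]:
    """Return (text, charset used) for a single-part message.

    Hypothetical helper: the declared charset wins whenever it is
    present; guessing happens only when the parameter is missing.
    """
    msg = message_from_bytes(raw_message)
    payload = msg.get_payload(decode=True)   # raw body bytes
    declared = msg.get_content_charset()     # None if no charset parameter

    if declared is not None:
        try:
            return payload.decode(declared, errors="replace"), declared
        except LookupError:
            pass  # the label names a charset Python does not know

    # No usable label: only now fall back to guessing.
    for candidate in fallbacks:
        try:
            return payload.decode(candidate), candidate
        except UnicodeDecodeError:
            continue
    # Safety net in case the caller passed fallbacks that all failed.
    return payload.decode("iso-8859-1"), "iso-8859-1"


# Usage with a hypothetical Greek message labelled ISO-8859-7.
greek = (b"Content-Type: text/plain; charset=ISO-8859-7\r\n"
         b"Content-Transfer-Encoding: 8bit\r\n\r\n"
         + "Ελληνικά".encode("iso-8859-7"))
print(decode_body(greek))
```

The order of the fallbacks is a design choice: UTF-8 is tried first because arbitrary 8-bit text rarely happens to be valid UTF-8, while ISO-8859-1 accepts any byte sequence and so can only serve as a last resort.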


