a little more help understanding diacritical encoding

From: Steve Pruitt (SPruitt@exstream.com)
Date: Thu Sep 25 2003 - 12:02:47 EDT

    Thanks for the excellent responses. I now understand how C3 and 89 are derived. I tried setting everything up the way I interpreted the list responses. The scenario is:
    I have a page that displays some diacritical characters, an input text box, and a submit button. I copy and paste one of the displayed characters into the input box and then submit. What is submitted gets echoed back. The pages use style sheets, so I cut and pasted only the relevant tags, etc.

    I thought I had found the problem. My response had a character encoding of null. I read that null defaults to ISO 8859-1, which seemed consistent with my echoed page. So I explicitly set the response character encoding to UTF-8 via the setContentType method.

    I used a TCP tunneler to see what my request and responses look like. My browser is set to utf-8 also.

    From the tunneler, my request had the following posted data: v904=%C3%89. This is correct according to how the UTF-8 encoding algorithm was explained.
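As a cross-check on the posted value, here is a minimal Java sketch of how É (U+00C9) becomes %C3%89. The class and method names are hypothetical and not taken from the servlet discussed here, and the helper is deliberately simplified: it percent-encodes every byte, whereas a real form encoder leaves unreserved ASCII characters alone.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of the original servlet code.
class PercentEncodeDemo {
    // Percent-encode every UTF-8 byte of the input (simplified on purpose).
    static String percentEncodeUtf8(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            // Mask to 0..255 so negative bytes print as two hex digits.
            sb.append(String.format("%%%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "\u00C9" is É, written as an escape to avoid source-encoding issues.
        System.out.println("v904=" + percentEncodeUtf8("\u00C9")); // v904=%C3%89
    }
}
```

The two bytes C3 and 89 are the UTF-8 form of the single code point U+00C9, which matches what the tunneler shows.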

    The http response had the following:

    Content-Type: text/html; charset=UTF-8 (this is correct)

    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8"> (this is a child of the <head> tag)

    <span class="text29">&#201; &#234; &#235; &#237; &#238; &#239; &#240; &#241; &#243; &#244; &#245; &#246;</span> These are the characters listed on the previous page that I cut and pasted from. They are listed on this page just for reference (#201 = C9, which is É).

    <span class="text17">Accented Characters from&nbsp;&nbsp;previous form:&nbsp;&nbsp;&#195;&#137; </span>
    This is what is echoed back. #195 = C3 and #137 = 89. These, of course, are displayed as ?.
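To make the numbers above concrete, the following Java sketch shows how the same two bytes yield different numeric character references depending on the charset used to decode them. It is only an illustration of how values like 195 and 137 can arise, not a diagnosis of this particular servlet; the class and helper names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of the original servlet code.
class DecodeMismatchDemo {
    // Decode raw bytes with the given charset, then write each
    // resulting char as a decimal numeric character reference.
    static String asNumericRefs(byte[] raw, Charset cs) {
        String decoded = new String(raw, cs);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < decoded.length(); i++) {
            sb.append("&#").append((int) decoded.charAt(i)).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] posted = {(byte) 0xC3, (byte) 0x89}; // the bytes behind %C3%89

        // Each byte treated as its own character: two references.
        System.out.println(asNumericRefs(posted, StandardCharsets.ISO_8859_1)); // &#195;&#137;

        // Both bytes treated as one UTF-8 sequence: a single reference.
        System.out.println(asNumericRefs(posted, StandardCharsets.UTF_8)); // &#201;
    }
}
```

Note that numeric character references name code points directly, so how they display does not depend on the charset declared for the page.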

    I checked the browser again to be sure, and its encoding is still set to UTF-8. This is everything I know to check. What am I missing?
