Re: Unicode or specific language charset

From: Addison Phillips
Date: Mon Dec 18 2006 - 16:04:10 CST

    Use UTF-8.

    Reports of problems with Unicode in browsers are mostly stale. Virtually
    all browsers actually in use (unless you insist on supporting Netscape 4
    or IE 4) will display UTF-8 just fine.
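
    As a rough illustration of what "use UTF-8" means for the profile
    forms, here is a minimal Python sketch. The form markup, the field name
    "profile", and the "/profile" action are made up for the example, and
    the server framework is left out entirely. The idea is only to declare
    UTF-8 in the HTTP header and in the page, and then to decode whatever
    the form sends back as UTF-8.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical form page: one encoding for every language on the site.
FORM_PAGE = """<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<form method="post" action="/profile" accept-charset="utf-8">
<textarea name="profile"></textarea>
<input type="submit">
</form>
</body>
</html>"""

# The HTTP header should agree with the declaration inside the page.
RESPONSE_HEADERS = [("Content-Type", "text/html; charset=utf-8")]


def encode_page(page: str) -> bytes:
    """Serialize the page with the same encoding that it declares."""
    return page.encode("utf-8")


def decode_submission(body: bytes) -> dict:
    """Decode a URL-encoded form body, treating the escaped bytes as UTF-8."""
    fields = parse_qs(body.decode("ascii"), encoding="utf-8", errors="strict")
    return {name: values[0] for name, values in fields.items()}


# Simulate what a browser would post for a mixed-language profile.
submitted = urlencode({"profile": "翻訳者 / traductrice"}, encoding="utf-8").encode("ascii")
profile = decode_submission(submitted)["profile"]
```

    Because the page was served as UTF-8, the browser submits the form in
    UTF-8 too, so the same decoding step works for every translator
    regardless of language.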

    You may want to provide separate style sheets per language, since
    you will probably want to control how the browser selects fonts and
    otherwise displays things for (say) Japanese or Chinese (although you
    should always allow the browser to search for a font if the ones you've
    named aren't present).
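
    One way to picture the per-language styling is the sketch below. It
    builds a CSS rule for each language from a table of preferred fonts and
    always ends the list with a generic family, so the browser can still
    search for a font when none of the named ones is installed. The font
    names and the lang_rule helper are illustrative assumptions, not
    recommendations.

```python
# Illustrative font preferences per language tag (assumed, not prescriptive).
FONT_STACKS = {
    "ja": ["Meiryo", "MS PGothic"],
    "zh-CN": ["SimSun"],
}


def lang_rule(lang: str, generic: str = "sans-serif") -> str:
    """Build a CSS rule for one language that ends with a generic fallback."""
    named = FONT_STACKS.get(lang, [])
    families = ", ".join([f'"{name}"' for name in named] + [generic])
    return f":lang({lang}) {{ font-family: {families}; }}"


# One rule per language; unknown languages get only the generic family.
stylesheet = "\n".join(lang_rule(lang) for lang in ("ja", "zh-CN", "fr"))
```

    The rules only take effect if the pages mark their content with the
    matching lang attribute.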

    But consider the trouble of transcoding between multiple encodings, plus
    the horror of HTML-entitization of characters from outside the current
    page encoding, plus the nastiness of detecting later what encoding was
    used (or purported to be used): you will do well to avoid it all by
    using UTF-8.
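
    To make the entitization problem concrete, here is a small Python
    sketch using only the standard codecs. The sample string is invented.
    It encodes one mixed French/Japanese string three ways: in each legacy
    encoding, whatever falls outside that encoding's repertoire has to be
    turned into numeric character references, while UTF-8 carries all of it
    directly.

```python
# A profile fragment mixing a Latin accented letter with Japanese.
MIXED = "Café 翻訳者"

# Japanese page encoding: the accented Latin letter is out of repertoire.
as_sjis = MIXED.encode("shift_jis", errors="xmlcharrefreplace")

# Western European page encoding: every Japanese character is out of repertoire.
as_latin1 = MIXED.encode("latin-1", errors="xmlcharrefreplace")

# UTF-8: nothing needs to be escaped, and decoding restores the original.
as_utf8 = MIXED.encode("utf-8")

# Guessing the wrong encoding later does not fail loudly; it just yields garbage.
misread = as_utf8.decode("latin-1")
```

    Here as_sjis contains the reference "&#233;" in place of the accented
    letter, as_latin1 contains three references in place of the Japanese
    text, and misread is silently wrong rather than an error, which is why
    detecting the encoding after the fact is so unpleasant.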

    Some good material on this topic can be found on the W3C site:

    See especially:

    Hope that helps.


    Addison Phillips
    Globalization Architect -- Yahoo! Inc.
    Internationalization is an architecture.
    It is not a feature.

    Robert Kidd wrote:
    > I am the owner of and I am debating 
    > whether to use Unicode on the pages or not. There are over 500 
    > translators in the database (but not yet fully registered as the site is 
    > still in development). Each of these translators needs to create at 
    > least two "profile" web pages (one in their source language and one in 
    > their target language). This means I need to have the ability to allow 
    > the translators to enter their profile information in two separate forms 
    > when they register in their respective languages so that the form 
    > contents once submitted will display properly in the user's browser 
    > ("user" here means potential clients looking for a translator).
    > I have heard of some problems with Unicode and browser/computer 
    > configuration so I am not sure if that is the best solution. It also 
    > seems more complex and therefore more expensive and perhaps more prone 
    > to bugs. I am also not sure how to implement Unicode so that the 
    > translators can read/type their info into the form in their languages 
    > and the pages resulting from the form submission are displayed in Unicode.
    > The other solution is to set the form pages to be displayed 
    > automatically using the specific language charset for that page and to 
    > display the web pages using the charset of that page's language.
    > The problem is that I have not had the actual site pages translated yet 
    > (that is coming soon) and the common elements (menus etc.) are all in 
    > English. This will mean a translator's profile page with a language 
    > charset of Japanese (for example) will also have menu items in English. 
    > The combination of the two languages is troublesome and the only 
    > solution I can come up with is to use Unicode OR display these pages 
    > without menus using a target "_blank" to open a new browser window and a 
    > couple of images (in English) to close the window or whatever.
    > I have been thinking and wondering about this for some time and I could 
    > use any help or opinions to get me over the hump of indecision I am 
    > stuck at.

    This archive was generated by hypermail 2.1.5 : Mon Dec 18 2006 - 16:06:54 CST