RE: [A12n-policy] VOA- utf-8, lang="en" (Re: languages ...)

From: Don Osborn
Date: Thu Apr 16 2009 - 05:21:02 CDT


    Hi Dwayne, Thanks for this feedback. Responses in text...

    > -----Original Message-----
    > From: Dwayne Bailey
    > Sent: Thursday, April 16, 2009 3:02 AM
    > On Tue, 2009-04-14 at 22:23 +0800, Donald Z. Osborn wrote:
    > > Thanks to all for the feedback on this topic. It sounds like the
    > > choice of utf-8 or not is mainly one of policy (or lack of same) and
    > > not technical constraints?
    > I would argue that the choice of UTF-8 has more to do with the tools
    > used to push the content rather than anything specific about policy.

    In this instance I should have been clearer that I was referring to
    policies within organizations like BBC and VOA. It looks as if someone at
    some level in VOA decided that the site would be utf-8 across the board,
    while at BBC either a decision to let old encodings lie was made, or
    encoding was just left up to the webmasters (a non-policy). Similarly, it
    looks like someone at BBC (whether by rule or individual initiative) decided
    to pay attention to language tagging, whereas at VOA this was ignored
    (perhaps anticipating Google's treatment of this element).
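    To make the two choices concrete: both the encoding and the language tag
    are declarations a page either carries or omits, so they can be checked
    mechanically. The following is only a rough sketch using Python's standard
    html.parser module. The function name scan_declarations and the sample
    markup are made up for illustration and are not taken from the BBC or VOA
    pages.

```python
from html.parser import HTMLParser


class DeclarationScanner(HTMLParser):
    """Records the first lang attribute on <html> and the first declared charset."""

    def __init__(self):
        super().__init__()
        self.lang = None
        self.charset = None

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        attrs = dict(attrs)
        if tag == "html" and self.lang is None:
            self.lang = attrs.get("lang")
        elif tag == "meta" and self.charset is None:
            if "charset" in attrs:
                # Short form: <meta charset="utf-8">
                self.charset = attrs["charset"]
            elif (attrs.get("http-equiv") or "").lower() == "content-type":
                # Long form: content="text/html; charset=utf-8"
                for part in (attrs.get("content") or "").split(";"):
                    key, _, value = part.strip().partition("=")
                    if key.lower() == "charset":
                        self.charset = value.strip()


def scan_declarations(html_text):
    """Return (lang, charset) as declared in the markup; None where absent."""
    scanner = DeclarationScanner()
    scanner.feed(html_text)
    return scanner.lang, scanner.charset


# Hypothetical sample pages, one tagged and one untagged.
TAGGED = ('<html lang="en"><head><meta http-equiv="Content-Type" '
          'content="text/html; charset=utf-8"></head><body></body></html>')
UNTAGGED = '<html><head><meta charset="utf-8"></head><body></body></html>'

print(scan_declarations(TAGGED))
print(scan_declarations(UNTAGGED))
```

    This only reads what the markup declares. It does not look at HTTP
    headers, and it does not verify that the bytes actually match the declared
    encoding.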

    That said, I'd agree that the technology and the tools are also key. It
    seems that there is an interplay between the two, however: someone has to
    decide to use the tools, and sometimes that decision comes in the form of an
    organization-wide rule or policy.

    > There is generally large confusion about encodings and language on the
    > web. My guess would be that that is caused mostly by content developers
    > and sysadmins who don't know the subject. When a tool just works in
    > UTF-8 the problem disappears without them needing to think about it.

    I think you're right. It relates to one of the digital divides not often
    spoken of: people who manage computer systems and content often have little
    interest in or appreciation of issues related to language, and linguists and
    language experts don't get into the technical side of computing or web

    > I gave input to the South African government's MIOS (Minimal
    > Interoperability Standard). I'd consider that a policy document. I
    > haven't got it in front of me but I'm pretty sure we agreed that all
    > content must be in Unicode with UTF-8 encoding preferred. It's
    > important for South Africa as Venda works only in Unicode. It also
    > eliminates a huge number of problems when people need to translate
    > content.

    Thanks for this information. South Africa would be one of the very few
    countries in Africa to have any policy regarding Unicode and content.
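
    As a small illustration of the point about Venda: its orthography uses
    letters such as ḓ, ḽ, ṅ, ṋ and ṱ, which sit in Unicode's Latin Extended
    Additional block and have no code point in the common 8-bit Western
    encodings. The sketch below simply tries to encode those letters. The
    helper name encodable is made up for illustration, and Latin-1 and
    Windows-1252 were picked as typical legacy encodings rather than being
    named in MIOS.

```python
# The five Venda letters with diacritics, written as escapes so the example
# does not depend on the encoding of this file: ḓ ḽ ṅ ṋ ṱ
VENDA_LETTERS = "\u1e13\u1e3d\u1e45\u1e4b\u1e71"


def encodable(text, encoding):
    """Return True if every character in text can be represented in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


for encoding in ("utf-8", "latin-1", "cp1252"):
    print(encoding, encodable(VENDA_LETTERS, encoding))
```

    Only the UTF-8 attempt succeeds. The two 8-bit encodings fail on the first
    letter.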

    > Now how long this takes to filter in is unknown. Suppliers must
    > conform to MIOS. So while they must conform, the change will probably
    > have more to do with upgrading of software to versions that use UTF-8
    > by default more than it has to do with the policy document itself.

    And maybe also with how organizations decide to conform to MIOS?

    All the best.

