RE: Globalized lists

From: Addison Phillips (addison.phillips@quest.com)
Date: Tue Dec 13 2005 - 15:03:22 CST

    I was midway in replying to Mark's message when this one arrived...

    I tend to recommend against (and was going to point out that Mark's design neatly avoids) treating a list as:

    TOKEN[0] + SEPARATOR + TOKEN[1] + ... + LASTSEPARATOR + TOKEN[n]

    But rather to treat it as:

    TOKEN[0] + TOKEN[1] + ... + TOKEN[n]

    That is, to use a pattern string to allow the text on BOTH sides of the token to be varied. The localizable variables look more like:

    "nonePattern", "(none)"
    "oneItemPattern", "{0}"
    "twoItemPattern", "{0} and {1}" // only two items
    "startPattern", "{0}" // the other place the list might vary
                                      // is at the start!
    "itemPattern", "{0}, {1}" // used to accrete items to the list
    "lastPattern", "{0}, and {1}" // used at the end of the list only
                                      // note that this is where you punctuate
                                      // the entire list (enclosing in quotes
                                      // for example).
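
    As a rough illustration only, the patterns above could drive a formatter along these lines. This is a minimal Python sketch, not code from this thread: the function name format_list and the table PATTERNS_EN are made up for the example, and it assumes that "startPattern" wraps the first item, "itemPattern" accretes each middle item, and "lastPattern" attaches the final item.

```python
# Hypothetical en_US resource table; keys and values mirror the patterns above.
PATTERNS_EN = {
    "nonePattern": "(none)",
    "oneItemPattern": "{0}",
    "twoItemPattern": "{0} and {1}",
    "startPattern": "{0}",
    "itemPattern": "{0}, {1}",
    "lastPattern": "{0}, and {1}",
}


def format_list(items, patterns):
    """Build a list string using only localizable patterns (no hard-coded separators)."""
    if not items:
        return patterns["nonePattern"]
    if len(items) == 1:
        return patterns["oneItemPattern"].format(items[0])
    if len(items) == 2:
        return patterns["twoItemPattern"].format(items[0], items[1])
    # Three or more items: start, then accrete the middle items, then close.
    text = patterns["startPattern"].format(items[0])
    for item in items[1:-1]:
        # {0} is the list built so far, {1} is the next item.
        text = patterns["itemPattern"].format(text, item)
    return patterns["lastPattern"].format(text, items[-1])


print(format_list(["dog", "cat", "pig", "fish"], PATTERNS_EN))
```

    Because the text accumulated so far is passed in as {0}, a translator can put text before, between, or after the items, which a separator-only design cannot express.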

    There is the question of whether you really need the conjunction "and" at all. If you practice text dereferencing, users might be okay (context matters here, of course, as to whether this is acceptable) with seeing a more naked list format:

       Items Selected: dog; cat; fish; reptile; geese
       项目被选择: 大狗 猫 猪 鱼 // babelfish; don't complain :-)

    The more code complexity you want to avoid, the less like natural language the text should look. Also, I purposely used the plural for "Items" above to point out the count and word agreement issues that might arise (even though I know you are aware of them).

    Although I tend to think that Mark is pretty safe assuming that two is the likely upper limit for list variations in most languages, it is usually best not to hard-code magic numbers for this kind of special handling. Instead, resource the values that control which pattern gets selected, so that a language that handles three-item lists specially, for example, can be accommodated without a code change. Of course, a *perfectly* generic solution will be more cumbersome to code, test, and optimize.
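
    One way to keep the item-count thresholds out of the code, sketched here in Python purely as an assumption about how it might look: the resource bundle carries a table keyed by item count, and the code falls back to the general patterns for any count the table does not list. The names format_list_resourced, "countPatterns", and both resource tables are hypothetical, and the three-item pattern is invented for illustration rather than taken from any real locale.

```python
# Hypothetical resources: "countPatterns" maps an item count to a whole-list pattern.
RESOURCES_EN = {
    "countPatterns": {0: "(none)", 1: "{0}", 2: "{0} and {1}"},
    "startPattern": "{0}",
    "itemPattern": "{0}, {1}",
    "lastPattern": "{0}, and {1}",
}

# An invented locale that also treats three-item lists specially.
RESOURCES_SPECIAL_THREE = dict(
    RESOURCES_EN,
    countPatterns={**RESOURCES_EN["countPatterns"], 3: "{0}, {1} & {2}"},
)


def format_list_resourced(items, res):
    """Pick a count-specific pattern if the resources define one, else accrete."""
    special = res["countPatterns"].get(len(items))
    if special is not None:
        # The resources, not the code, decide which counts are special.
        return special.format(*items)
    if not items:
        return ""  # no pattern was supplied for an empty list
    text = res["startPattern"].format(items[0])
    for item in items[1:-1]:
        text = res["itemPattern"].format(text, item)
    if len(items) > 1:
        text = res["lastPattern"].format(text, items[-1])
    return text


print(format_list_resourced(["dog", "cat", "pig"], RESOURCES_EN))
print(format_list_resourced(["dog", "cat", "pig"], RESOURCES_SPECIAL_THREE))
```

    The cost is the one noted above: there are more strings to test, and the count table has to be explained to the localizers.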

    The other problem is avoiding resources that will be difficult to explain to the localizers (who are used to replacing one string with a different string, not to producing a different set of strings). A fully generic design is probably a lot of work for something you might never use, and you can use something like the above without running into many difficulties.

    Addison

    Addison P. Phillips
    Globalization Architect, Quest Software

    Internationalization is not a feature.
    It is an architecture.

    > -----Original Message-----
    > From: Mike Ayers [mailto:mayers@celequest.com]
    > Sent: December 13, 2005 12:17
    > To: Addison Phillips
    > Cc: Unicode Mailing List
    > Subject: Re: Globalized lists
    >
    >
    > Addison Phillips wrote:
    >
    > > Without knowing more specifics, it is difficult to advise you precisely.
    > > If you are trying to write general purpose code that can serve many
    > > languages, perhaps simultaneously, then avoiding the generation of long
    > > lists where possible might be a good idea. The more one fools around
    > > with count, gender, inter-word dependency and the like, the more likely
    > > one is to get it wrong somehow.
    >
    > The problem is that I am dealing with a situation involving repeated
    > use of the terms "generalized" and "arbitrary". We have no control over
    > the existence or content of the lists, but we must display them as best
    > we can when they arrive. What I'm thinking now:
    >
    > Localizable variables:
    >
    > LIST_CONCAT - String which is inserted between consecutive,
    > non-terminal list items.
    >
    > LIST_TERMCAT - String which is inserted between the second-last and
    > last items.
    >
    > LIST_PAIRCAT - String which is inserted between the items in a two
    > element list.
    >
    > Their descriptions should make the algorithm clear. Example:
    >
    > en_US:
    > LIST_CONCAT - ", "
    > LIST_TERMCAT - ", and "
    > LIST_PAIRCAT - " and "
    >
    > ["dog", "cat", "pig", "fish"] -> "dog, cat, pig, and fish"
    >
    > zh_CN:
    > LIST_CONCAT - ""
    > LIST_TERMCAT - "和"
    > LIST_PAIRCAT - "和"
    >
    > ["大狗", "猫猫", "猪", "鱼"] -> "大狗猫猫猪和鱼"
    >
    > Would this be sufficient?
    >
    >
    > Thanks,
    >
    > /|/|ike


