Re: how to create automated meaning preserving abbreviations?

From: verdy_p (
Date: Fri Jul 24 2009 - 05:41:09 CDT

    > From: "Asmus Freytag"
    > On 7/22/2009 3:48 PM, André Szabolcs Szelp wrote:
    > > Hello,
    > >
    > > While your point might be valid in a number of cases, the very example
    > > you bring seems somewhat strained. Is it indeed the term you want to
    > > use in your UI? It would probably confuse your users more than not.
    > I think the general problem, that of the need of shortening long words
    > or phrases of certain languages because of runtime changes in UI
    > layout is a real one.
    > As early as 1988 I used a piece of software that had a patented
    > algorithm to reduce arbitrary words and phrases to fit into narrow
    > columns, without being tripped up by the usual problem of lots of terms
    > with identical prefixes.
    > Therefore, arguing about the validity of an isolated example isn't going
    > to help.
    > P. Verdy had suggested the use of dual resources, both long and
    > abbreviated. That approach is more flexible in allowing abbreviations
    > that aren't strict substrings of given words. However, the dual
    > resources approach works only for precisely two screen widths.

    Yes, but this is exactly the kind of solution adopted in CLDR, for example. Look at the various ways to write date
    elements like months: there are not only abbreviations, but also alternate representations such as numeric ones
    instead of letters. The example is even more pertinent when abbreviating Chinese date elements, where abbreviated
    forms replace only a part of the romanized full name with numbers (because the abbreviation would have absolutely
    no meaning when the main information is contained within the first single syllable, but the suffix is still
    needed to disambiguate it from unrelated date elements, like year names or day names).
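
    As a rough sketch of this "several forms per element" idea: the table below is hand-written for illustration (it
    is not read from CLDR), and pick_form is a hypothetical helper that uses the character count as a crude stand-in
    for the rendered width.

```python
# Hand-written illustration data, NOT read from CLDR. Forms are listed
# from longest to shortest; the last entry is a numeric fallback.
MONTH_FORMS = {
    "en": {1: ["January", "Jan", "1"]},
    "fr": {1: ["janvier", "janv.", "1"]},
}


def pick_form(forms, max_chars):
    """Return the first (longest) form that fits, else the shortest one.

    len() is only a rough stand-in for rendered width; a real GUI would
    measure the string with its actual font.
    """
    for form in forms:
        if len(form) <= max_chars:
            return form
    return forms[-1]


if __name__ == "__main__":
    for width in (10, 5, 2):
        print(width, pick_form(MONTH_FORMS["en"][1], width))
```

    Nothing here limits the list to two entries: each element carries as many forms as the layouts need.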

    So the only question to ask is effectively: how many string widths do you need to build a usable GUI? For very
    narrow screens (like mobile phones), the same GUI will most often require a new design, with its own resources
    anyway, given that not all the information will fit the screen even when it is abbreviated. Abbreviating texts is
    only a part of the solution, when the more general solution will require a new representation and a new layout.

    So not only will you have to design a GUI, but you will also necessarily have to consider the minimum screen size
    in which the GUI will have to fit. Below some limit, you'll need another layout, and new separate resources which
    may even be presented in separate screens, each showing a distinct part of the same information.
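
    A minimal sketch of that threshold logic, under assumed values: the layout names, pixel limits and resource-set
    names below are all invented for illustration, and choose_layout is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    name: str
    min_width_px: int   # smallest screen width this layout is designed for
    resource_set: str   # name of the string resources that go with it


# Illustrative values only; a real application would derive these limits
# from its own design and usability tests.
LAYOUTS = [
    Layout("desktop", 800, "strings_full"),
    Layout("compact", 480, "strings_abbreviated"),
    Layout("phone", 0, "strings_phone"),
]


def choose_layout(screen_width_px, layouts=LAYOUTS):
    """Return the most spacious layout whose minimum width is satisfied."""
    for layout in sorted(layouts, key=lambda l: l.min_width_px, reverse=True):
        if screen_width_px >= layout.min_width_px:
            return layout
    raise ValueError(f"no layout fits a width of {screen_width_px}px")


if __name__ == "__main__":
    for width in (1024, 600, 320):
        print(width, choose_layout(width).name)
```

    Each layout names its own resource set, so dropping below a limit switches both the design and the strings.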

    You then need to provide your translators with a test layout, or with comments explaining the maximum widths
    that can fit in your existing GUI; but you also need to listen to your translators when they signal that the
    GUI cannot be made meaningful and usable without updating the layout design.
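
    One possible way to make those width limits checkable is sketched below. The keys, limits and translations are
    made up, find_overflows is a hypothetical helper, and the character count again only approximates the real
    rendered width.

```python
def find_overflows(translations, limits):
    """Return (key, text, limit) for every translation longer than its limit.

    Keys without a declared limit are treated as unconstrained.
    """
    return [
        (key, text, limits[key])
        for key, text in translations.items()
        if key in limits and len(text) > limits[key]
    ]


if __name__ == "__main__":
    # Made-up limits (in characters) that the GUI designer would publish.
    limits = {"save": 8, "cancel": 10}
    german = {"save": "Speichern", "cancel": "Abbrechen", "help": "Hilfe"}
    for key, text, limit in find_overflows(german, limits):
        print(f"{key!r}: {text!r} exceeds {limit} characters")
```

    An overflow report is only a prompt for discussion: the fix may be a shorter string or a changed layout.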

    Conversely, if you are creating an application where the text will be displayed across a full page width in a
    scrollable area, you will generally not have to explain the width/length constraints, and you probably won't
    want or mandate the use of abbreviations or of private notation conventions.

    For me the screen width is not a problem: I did not say that you needed dual resources just to limit yourself
    to two distinct layouts for two distinct screen widths. I only spoke about using as many resources as needed to
    fit in several possible display layouts. Maybe for some languages you will not use the same layouts at all
    (think about RTL and LTR layouts, which are generally more complex than a basic mirroring of the positions and
    directions of strings and margins). Translating software will frequently require more work than translating
    plain text resources: you need something more, which is localisation (and which should also include tests for
    usability).
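
    To illustrate "as many resources as needed", here is a sketch in which resources are keyed by both locale and
    layout, and a locale simply has no entry for a layout it cannot fit. The table contents and the helper names
    (layouts_for, lookup) are assumptions made up for this example.

```python
# Illustrative data: resources are keyed by (locale, layout). The German
# "narrow" entry is deliberately absent, standing for a layout that was
# judged unusable for that locale.
RESOURCES = {
    ("en", "wide"): {"title": "International settings"},
    ("en", "narrow"): {"title": "Intl. settings"},
    ("de", "wide"): {"title": "Internationale Einstellungen"},
}


def layouts_for(locale, preferred_layouts):
    """Return the preferred layouts, in order, that this locale supports."""
    return [name for name in preferred_layouts if (locale, name) in RESOURCES]


def lookup(locale, preferred_layouts, key):
    """Return (layout, string) using the first layout the locale supports."""
    usable = layouts_for(locale, preferred_layouts)
    if not usable:
        raise LookupError(f"no layout has resources for locale {locale!r}")
    layout = usable[0]
    return layout, RESOURCES[(locale, layout)][key]


if __name__ == "__main__":
    print(lookup("en", ["narrow", "wide"], "title"))
    print(lookup("de", ["narrow", "wide"], "title"))
```

    The application then has to honour the layout that comes back, rather than force the string into the one it asked for.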

    > An inline solution could have two modes.
    > a) A marker that identifies a place where some locale specific character
    > can be inserted to mark the end of a prefix-style abbreviation.
    > b) A set of paired markers, identifying a shortened form of the
    > preceding word.

    Yes, but why restrict such a conventional notation to abbreviations only? You could just as well use the same
    inline markers to delimit multiple forms of the same resource. But I'm not sure that inventing such an inline
    notation would simplify the problem. It will often be more logical to readapt the layout to the needs of a
    specific locale (more than just the language or its orthography in a given script), and to support this
    additional layout with its own separate resources, making some layouts inaccessible only for the languages that
    can't fit in them. The perfect example of a context calling for the multiple-layouts approach to the GUI is
    that of applications for mobile phones with very narrow column widths: you'll also have layouts made
    specifically for a given locale, not just plain text resources translated for a specific locale and usable
    within a common GUI layout.
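
    For comparison, an inline notation carrying multiple forms could look like the sketch below. The {long|short}
    syntax is invented here purely for illustration (nobody in this thread proposed it), expand and fit are
    hypothetical helpers, and the character count stands in for the rendered width.

```python
import re

# Hypothetical inline notation: "{form0|form1|...}" lists alternative forms
# of one word, longest first. Text outside braces is kept unchanged.
GROUP = re.compile(r"\{([^{}]*)\}")


def expand(template, level):
    """Replace each group with its level-th form (clamped to its last form)."""
    def pick(match):
        forms = match.group(1).split("|")
        return forms[min(level, len(forms) - 1)]
    return GROUP.sub(pick, template)


def fit(template, max_chars):
    """Return the longest expansion that fits, else the shortest expansion."""
    levels = max(
        (len(m.group(1).split("|")) for m in GROUP.finditer(template)),
        default=1,
    )
    for level in range(levels):
        text = expand(template, level)
        if len(text) <= max_chars:
            return text
    return expand(template, levels - 1)


if __name__ == "__main__":
    label = "{International|Intl.} {Settings|Set.}"
    for width in (30, 12, 5):
        print(width, fit(label, width))
```

    Note that even the shortest expansion may not fit, which is the case where only a different layout helps.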

    If you are developing a layout for the web, there are similar constraints due to the fact that HTML is also not
    very flexible (except at the price of very complex active scripts, working directly on the document's DOM and
    using the browser's properties). The least usable applications I can see today are those built to use fixed
    screen widths which can't be changed easily without sacrificing the pixel-perfect sizes of the unscalable bitmap
    images used to delimit the layout.

    Similar issues are found with Flash and Silverlight web components: you often have to predefine a minimum or fixed
    display size for the component, and you need to either ignore accesses from devices with small displays, or
    redevelop other versions for them (if those devices support these active components). Even in this case, you'll
    need a new set of resources if you want support for internationalization (but most web sites ignore the
    internationalisation of their contents, or "solve" the problem of display sizes varying by locale by using
    icons instead of translatable texts that are difficult to fit cleanly in the GUI layout).

    Once again, all those problems and solutions are completely independent of the text encoding, and the Unicode
    standard cannot help you, as those solutions will be application-specific and dependent on your specific GUI.
