Re: how to create automated meaning preserving abbreviations?

From: Asmus Freytag
Date: Wed Jul 22 2009 - 11:51:25 CDT


    On 7/22/2009 6:15 AM, Alexander Kempgen wrote:
    > I was wondering if there are format characters or other ways to mark
    > parts of a word or words, that can automatically be shortened, if the
    > text has to fit in a smaller space when displayed in a software
    > application.
    As far as I know, there is no format character that is designed for that
    purpose, and many would say that is a good thing - because the more
    format characters exist, the more difficult it would be for "simple"
    software to handle such text - having to step past all the irrelevant
    format characters.

    If you are talking about handling strings that are only ever meant to be
    handled by one application, then you could use one of the 66
    noncharacter code points for the purpose. They are U+FDD0..U+FDEF,
    plus the last two code points of each of the 17 planes:
    U+FFFE..U+FFFF, U+1FFFE..U+1FFFF, etc., up to U+10FFFE..U+10FFFF.

    These are for _internal use_, which would fit your proposed application.
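
    A minimal Python sketch of the noncharacter approach. The choice of
    U+FDD0 as the mark and the names ABBREV_MARK and strip_marks are
    assumptions made for illustration only; the one fixed fact is the
    set of 66 noncharacter code points itself.

```python
def is_noncharacter(cp: int) -> bool:
    """Return True if cp is one of the 66 Unicode noncharacter code points."""
    if 0xFDD0 <= cp <= 0xFDEF:
        return True
    # The last two code points of every plane (xxFFFE and xxFFFF).
    return 0 <= cp <= 0x10FFFF and (cp & 0xFFFE) == 0xFFFE


# Hypothetical choice: any noncharacter would do for private, internal use.
ABBREV_MARK = "\uFDD0"


def strip_marks(text: str) -> str:
    """Remove every noncharacter, e.g. before text leaves the application."""
    return "".join(ch for ch in text if not is_noncharacter(ord(ch)))


noncharacters = [cp for cp in range(0x110000) if is_noncharacter(cp)]
internal = "Wed" + ABBREV_MARK + "nesday"
external = strip_marks(internal)
```

    Stripping the marks before text is handed to other software keeps
    the convention private to the one application that understands it.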

    An alternative to noncharacter code points is "light-weight" markup.
    This is used in many user interfaces today. Windows uses the & character
    as a command to underline a character in a menu and turn it into a
    keyboard shortcut. You could use "$" and modify your software so that it
    treats a single $ as an abbreviation location, but treats $$ or \$ as
    a real dollar sign. For unabbreviated strings, it would filter out
    single $ signs, so they are not visible.
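
    The "$" convention could be sketched in Python roughly as follows.
    Only the escaping rules ($$ or \$ for a real dollar sign, single $
    filtered from unabbreviated strings) come from the description
    above. What an abbreviation location does when the string is
    shortened is an assumption here: the rest of the word is dropped
    and replaced by a period. The names render and fit are hypothetical.

```python
def render(marked: str, abbreviate: bool = False) -> str:
    """Expand a string that uses a single $ as an abbreviation mark."""
    out = []
    skipping = False  # True while dropping the tail of an abbreviated word
    i = 0
    while i < len(marked):
        if marked[i:i + 2] in ("$$", "\\$"):
            # Escaped dollar sign: emit a literal "$" (it also ends a word).
            skipping = False
            out.append("$")
            i += 2
            continue
        ch = marked[i]
        if ch == "$":
            # Abbreviation location: invisible unless we are abbreviating.
            if abbreviate and not skipping:
                out.append(".")  # assumption: a period replaces the tail
                skipping = True
            i += 1
            continue
        if skipping and not ch.isalnum():
            skipping = False  # the word has ended, resume copying
        if not skipping:
            out.append(ch)
        i += 1
    return "".join(out)


def fit(marked: str, max_chars: int) -> str:
    """Use the full text if it fits, else fall back to the abbreviated form."""
    full = render(marked)
    return full if len(full) <= max_chars else render(marked, abbreviate=True)


label = "Wed$nesday, 22 Jul$y"
full_text = render(label)
short_text = render(label, abbreviate=True)
```

    A real user interface would measure rendered width rather than
    count characters; fit uses a character count only to keep the
    sketch short.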

    So far, all approaches are valid and useful, if you are trying to solve
    a problem for *your* piece of software, and none of the strings are
    shipped around and have to be interpreted by other software that isn't
    in on the secret.

    If this turns out to become a more common, widespread problem, with the
    need to widely and safely share data that is marked up like that, then
    there are two options:
    a) create or extend a markup language to handle this information (go
    beyond plain text)
    b) convince the UTC that this is of similar importance to, say, the soft
    hyphen, and ask for a standardized code

    Because many user interface systems already use special, ad-hoc
    conventions, and because this has not caused problems, I rather think
    that going with an implementation- or platform-specific approach should
    be your first try.

