RE: (no subject)

From: Keutgen, Walter (
Date: Fri Mar 03 2006 - 13:04:06 CST


    In the character set standards one tends to say that a 'character set' is just the list of characters, without numbering or encoding them. Associating numbers or computer bit patterns with them makes it an 'encoded character set'. This distinction lets one say that the Unicode character set and the GB18030 character set are the same, or that ISO-8859-x is a subset of Unicode. With the 'codes' attached to the 'characters' this would be impossible. An old example of a coded character set is the 'Morse alphabet', which associates a combination of two durations with each Latin letter (SOS = ... --- ...).
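
    To make the distinction concrete, here is a minimal Python sketch. The sample string, the variable names and the deliberately partial Morse table are illustrative assumptions, not part of any standard. It shows the same characters receiving different byte patterns under three encodings, and checks that the ISO-8859-1 repertoire survives a round trip through GB18030.

```python
# Illustrative sketch: one repertoire (a set of characters), several encodings.
# The sample string and the tiny Morse table are assumptions for demonstration.

sample = "A\u00e9"  # "A" and "e with acute", both in the ISO-8859-1 repertoire

# The code point is the number Unicode assigns to each character.
code_points = [ord(ch) for ch in sample]

# The same characters get different byte patterns in each encoded character set.
encoded = {name: sample.encode(name) for name in ("latin-1", "utf-8", "gb18030")}

# Decoding each byte pattern with its own encoding recovers the same characters.
decoded = {name: data.decode(name) for name, data in encoded.items()}

# Repertoire subset check: all 256 ISO-8859-1 byte values map to Unicode
# characters, and those characters survive a round trip through GB18030.
latin1_repertoire = bytes(range(256)).decode("latin-1")
round_trip = latin1_repertoire.encode("gb18030").decode("gb18030")

# Morse as an older coded character set: letters mapped to dot/dash patterns.
MORSE = {"S": "...", "O": "---"}  # deliberately partial table


def to_morse(word: str) -> str:
    """Encode a word letter by letter, separating letters with a space."""
    return " ".join(MORSE[letter] for letter in word)


print(code_points)
print(encoded)
print(round_trip == latin1_repertoire)
print(to_morse("SOS"))
```

    The characters stay the same throughout; only the numbers or bit patterns attached to them change, which is why the repertoires can be compared at all.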

    'Letters' are only a minority of 'characters'. The world existed long before computers. Before computers invaded printing, 'printers' were people who made up matrices full of characters and printed books, newspapers and so on by that method. In an argument about Unicode, I read that fine printing in English needs about 400 characters. Maybe the English term used by printers for characters is 'types', cf. 'typewriter'. Maybe the 'printers' were 'typesetters'? Were there perhaps 'type sets' in the past?

    In everyday IT life one does not make that subtle distinction: characters must always be encoded, so one feels no need to add the adjective 'encoded'. IT folks also added 'control characters', e.g. #7 means 'ring the bell'.
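
    As a small illustration of such a control character, the following Python sketch (variable names are my own) looks at character number 7 using only the standard library.

```python
import unicodedata

BEL = chr(7)  # control character number 7, "ring the bell"

# Python's "\a" escape denotes the same character.
same_as_escape = BEL == "\a"

# Unicode classifies it as a control character (general category "Cc"),
# in contrast to a letter such as "A" (category "Lu").
bel_category = unicodedata.category(BEL)
letter_category = unicodedata.category("A")

# It occupies a single byte with value 7 in ASCII and in UTF-8 alike.
bel_bytes = BEL.encode("ascii")

print(same_as_escape, bel_category, letter_category, bel_bytes)
```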

    I fear that the 'Fachwörterliste', a glossary, is rather a guiding thread that the specialists themselves need in order to be sure that they understand each other, hence the reason to have such a list per project.

    A Google search restricted to French-language pages showed more than 6 times as many documents with the plural 'jeu de caractères' (including Wikipedia) than with the singular 'jeu de caractère'.


    -----Original Message-----
    From: [] On Behalf Of E. Keown
    Sent: Friday, 3 March 2006 18:11
    Subject: (no subject)

               March 2006


    Below, my first definition of a term one MUST know to
    understand character set work. Feel very free to
    critique this. This definition is for non-geeks or,
    at best, semi-geeks.

    Character set, a definition :
            A character set is a computerized version
            of any alphabet (or other writing system).
            Each letter, number, symbol, etc. of the
            computerized alphabet is assigned a unique
            number for the computer to use in software.

    There must be 15 core terms needed for a
    mini-dictionary for character set work. But which 15?

    Marc Kuester of DIN told me that German-language
    proposals include what he calls a "Fachwörterliste," a
    list of terminology to harmonize usage in all German
    technical documents. Great idea!

    Translations of the word character set:

    le jeu de caractère (is the plural preferred?)
    Codifica dei caratteri

    PLEASE send me more translations if you have them!

    As you know, the Hebrew language has been written for
    3,150 years, at least. There are four living languages
    which were written for over 2,900 years:

    Part of what happened with computerizing Hebrew is
    that no academic Semitist knew the phrase 'character
    set' until maybe 1999.

    There are at least a dozen scholarly societies which
    concern themselves with Hebrew. But only 1-2 of these
    societies have any computational work.

    Elaine Keown
    in white bread America

    This archive was generated by hypermail 2.1.5 : Fri Mar 03 2006 - 13:08:32 CST