Maths letters and digits (was: Is it true that Unicode is insufficient for Oriental languages?)

From: Philippe Verdy (
Date: Thu May 22 2003 - 18:33:00 EDT

    From: "Asmus Freytag" <>
    > Styled text uses markup. However, for specialized texts, such as
    > mathematics, where loss of style-markup can completely eradicate the
    > meaning of the text, several symbol sets have been added to Unicode, where
    > the symbols look like styled letters, but function very differently (i.e.
    > as mathematical symbols).

    This creates a conflict of interest: what semantics do the characters encoded in a font carry? Is it merely a style semantic when one wants to present Latin or Cyrillic text in a Gothic style?

    There are also exceptions in the new mathematical block: some letters were not encoded there because they were considered already available in other blocks, either as Letterlike Symbols or as plain characters.
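
    To illustrate these exceptions, here is a minimal Python sketch (mine, not something taken from the standard) that lists the unassigned code points inside the Mathematical Alphanumeric Symbols block and checks three of them against the Letterlike Symbols that stand in for them. The names `unassigned_in_block` and `EXAMPLES` are hypothetical, the three pairs are hand-picked, and the full list of holes depends on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

# Mathematical Alphanumeric Symbols block: U+1D400..U+1D7FF.
BLOCK_START, BLOCK_END = 0x1D400, 0x1D7FF


def unassigned_in_block():
    """Return the code points of the block that have no character name."""
    return [
        cp
        for cp in range(BLOCK_START, BLOCK_END + 1)
        if unicodedata.name(chr(cp), None) is None
    ]


# Hand-picked examples (an illustrative selection, not an exhaustive table):
# reserved position in the block -> character encoded elsewhere instead.
EXAMPLES = {
    0x1D455: 0x210E,  # italic small h  -> PLANCK CONSTANT
    0x1D49D: 0x212C,  # script capital B -> SCRIPT CAPITAL B
    0x1D506: 0x212D,  # Fraktur capital C -> BLACK-LETTER CAPITAL C
}

holes = unassigned_in_block()
for hole, substitute in EXAMPLES.items():
    print(
        f"U+{hole:04X} unassigned: {hole in holes}; "
        f"use U+{substitute:04X} {unicodedata.name(chr(substitute))}"
    )
```

    So a complete mathematical Fraktur or italic alphabet has to be assembled from two different blocks.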

    What would happen to a mathematical text rendered in a Gothic style? One could no longer make a semantic distinction between plain-character symbols and Gothic-style symbols. The current encoding assumes that mathematical text uses only fonts in a basic style, with only the "representative glyphs" shown in the charts. Depending on the fonts available to render a particular text style (independently of its abstract character semantics), such a distinction will be hard to make.

    This just proves that mathematical symbols also use the plain standard scripts, whose rendered style then suddenly becomes important. If accuracy in semantics were needed, we would clearly need to define separate mathematical characters for the basic style, but Unicode chose to unify them...
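
    As a hedged illustration of how thin the distinction is in the character data itself: the styled mathematical letters carry a "<font>" compatibility decomposition to the plain letter, so compatibility normalization (NFKC) folds them back into ordinary text. The sketch below is mine; `compat_fold` is a hypothetical helper name, and the two sample characters are hand-picked.

```python
import unicodedata


def compat_fold(text):
    """Apply NFKC, which replaces <font> compatibility variants by plain letters."""
    return unicodedata.normalize("NFKC", text)


fraktur_a = "\U0001D504"   # MATHEMATICAL FRAKTUR CAPITAL A
black_letter_c = "\u212D"  # BLACK-LETTER CAPITAL C (Letterlike Symbols)

for ch in (fraktur_a, black_letter_c):
    print(
        f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
        f"decomposition {unicodedata.decomposition(ch)!r}, "
        f"NFKC gives {compat_fold(ch)!r}"
    )
```

    After such folding, a Fraktur variable and a plain one can no longer be told apart, which is exactly the loss of meaning that the styled symbols were meant to avoid.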

    Conclusion: mathematical symbols are a separate script, but Unicode unifies this set incoherently, as it assumes a default style for all scripts. So can we say that general-purpose fonts for extended Latin in a Gothic style are Unicode-compliant?
    Also, it is not clear how serif and sans-serif variants of mathematical symbols will behave alongside other non-mathematical text, nor where we can say that the encoded text is mathematical and where it is not, and therefore where a required style MUST be applied.

    Maybe this would require defining new "BEGIN MATHS" and "END MATHS" (or "BEGIN TEXT") abstract characters and encoding them (as format control characters), for the same semantic reasons that Unicode defined and encoded the "Invisible Function Application", "Invisible Comma" and "Invisible Multiplication Operator" (I'm not sure these are their exact names, so look in the UCD if you need them).
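
    For anyone who wants to check the exact names without opening the UCD files, a small Python sketch like the following should do. It assumes that the three invisible operators meant above are the ones at U+2061..U+2063; `describe` is a hypothetical helper name of mine.

```python
import unicodedata

# Assumed code points of the invisible mathematical operators referred to above.
INVISIBLE_OPERATORS = (0x2061, 0x2062, 0x2063)


def describe(cp):
    """Return (code point label, character name, general category) for cp."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.name(ch), unicodedata.category(ch))


for cp in INVISIBLE_OPERATORS:
    label, name, category = describe(cp)
    print(f"{label} {name} (general category {category})")
```

    All three are reported with general category Cf (format), the same class that a "BEGIN MATHS" / "END MATHS" pair would presumably belong to.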

    This archive was generated by hypermail 2.1.5 : Thu May 22 2003 - 19:29:00 EDT