From: Philippe Verdy <verdy_p_at_wanadoo.fr>

Date: Sun, 29 Mar 2015 13:16:33 +0200

How would you write the numeric value property of the mathematical pi
symbol in the style of "0.5", assuming that it must be written as a single
decimal value without using any operator?

You can't, because pi has an infinite number of decimals, unless you
explicitly say that the numeric property is limited to the precision of an
IEEE 64-bit "double" floating-point value (or the 80-bit "long double"
supported natively by x86 processors).

So you have to imagine that the numeric value property is effectively a
mathematical expression using some conventional set of mathematical symbols
(in which case the numeric value property of the pi symbol should be the
symbol itself). In that case, writing "6/12" or "1/2" is fully equivalent,
mathematically, as this property is a mathematical expression.
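
A minimal Python sketch of that equivalence, assuming the property is stored
as a plain string such as "6/12" (the helper name parse_fraction is
hypothetical, not part of any Unicode data or library):

```python
from fractions import Fraction

def parse_fraction(text):
    """Split 'n/d' (or a bare 'n') into an unreduced (numerator, denominator) pair."""
    numerator, slash, denominator = text.partition("/")
    return int(numerator), (int(denominator) if slash else 1)

six_twelfths = parse_fraction("6/12")
one_half = parse_fraction("1/2")

# As written, the two fields keep different numerators and denominators ...
assert six_twelfths != one_half
# ... but evaluated as rational numbers they denote the same value.
assert Fraction(*six_twelfths) == Fraction(*one_half)
assert float(Fraction(*six_twelfths)) == 0.5
```

Keeping the pair unreduced until evaluation is what lets the same string
serve both readers who want the value and readers who want the numerator
and denominator.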

That property would then need a defined syntax. The problem is that there
are several mathematical notations for complex expressions; the one most
commonly used in plain text is TeX (except that it notes not just the
expression itself but also its presentation and layout).

Could Unicode define a basic plaintext syntax for a subset of mathematical
expressions that would be useful for parsing the "numeric value" field? It
would of course contain the syntax for numbers (using all decimal digits
from various scripts, but ignoring the localized conventions for decimal
separators, reduced to just the ASCII dot, and for grouping separators,
reduced to none; and disallowing unnecessary whitespace in that field,
unnecessary leading zeroes, and trailing zeroes in decimal parts). It would
contain the subset of symbolic constants encoded in Unicode as symbolic
constants (such as pi, e, i). It would not contain any symbolic constant
directly expressible with others. It could potentially contain superscript
digits used for exponents.
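
The number rules above could be sketched in Python roughly as follows. The
function name canonical_number is hypothetical, and the exact rules (one
optional ASCII dot, no grouping separators, no whitespace) are only my
reading of the proposal, not an existing specification:

```python
import unicodedata

def canonical_number(text):
    """Map decimal digits of any script to ASCII and trim redundant zeroes."""
    if text.count(".") > 1:
        raise ValueError("at most one decimal separator is allowed")
    digits = []
    for ch in text:
        if ch == ".":
            digits.append(ch)
        else:
            # unicodedata.decimal raises ValueError for anything that is not
            # a decimal digit (commas, spaces, superscript digits, ...).
            digits.append(str(unicodedata.decimal(ch)))
    integer, _, fraction = "".join(digits).partition(".")
    integer = integer.lstrip("0") or "0"   # drop leading zeroes, keep one
    fraction = fraction.rstrip("0")        # drop trailing zeroes
    return integer + ("." + fraction if fraction else "")

# ARABIC-INDIC DIGITS 0 1 2 . 5 0 normalize to the ASCII string "12.5"
assert canonical_number("\u0660\u0661\u0662.\u0665\u0660") == "12.5"
```

Localized separators such as "1,5" are rejected rather than guessed at,
since the proposal reduces the decimal separator to the ASCII dot only.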

And of course it would contain the common set of arithmetic operators (+,
the ASCII HYPHEN-MINUS or the mathematical MINUS SIGN, × or the ASCII
ASTERISK, / or ÷, ^ for noting exponentiation, and parentheses), and
algebraic operators (such as √). It would not include special operators
(such as ±) that can't be evaluated to a single number in a one-dimensional
numeric field (so do we limit ourselves to the field of complex numbers?).
Further extensions could include some common functions such as the core
trigonometric and hyperbolic functions (sine, cosine, tangent, cotangent)
and their inverses, and logarithms.
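
To make the idea concrete, here is a rough recursive-descent parser for such
a subset, written in Python. Everything in it is an assumption of mine: the
grammar, the operator precedence, the tuple form of the tree and the names
tokenize and parse. It only builds a tree; it does not evaluate anything.

```python
import re

# Numbers, the constants pi/e/i, and the operators listed above. No whitespace.
TOKEN = re.compile(r"\d+(?:\.\d+)?|[πei]|[-+−×*/÷^√()]")
ALIASES = {"−": "-", "×": "*", "÷": "/"}   # alternative spellings of one operator

def tokenize(text):
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        tokens.append(ALIASES.get(match.group(), match.group()))
        pos = match.end()
    return tokens

def parse(text):
    tokens, pos = tokenize(text), 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def expr():    # expr := term (('+' | '-') term)*
        node = term()
        while peek() in ("+", "-"):
            node = (take(), node, term())
        return node

    def term():    # term := unary (('*' | '/') unary)*
        node = unary()
        while peek() in ("*", "/"):
            node = (take(), node, unary())
        return node

    def unary():   # unary := ('+' | '-' | '√') unary | power
        if peek() in ("+", "-", "√"):
            return ({"+": "pos", "-": "neg", "√": "sqrt"}[take()], unary())
        return power()

    def power():   # power := atom ('^' unary)?   (right associative)
        node = atom()
        if peek() == "^":
            take()
            node = ("^", node, unary())
        return node

    def atom():    # atom := number | constant | '(' expr ')'
        tok = take()
        if tok == "(":
            node = expr()
            if take() != ")":
                raise ValueError("expected ')'")
            return node
        if tok in ("π", "e", "i"):
            return ("const", tok)
        if tok[0].isdigit():
            return ("num", tok)
        raise ValueError(f"unexpected token {tok!r}")

    tree = expr()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return tree

# "6/12" and "1/2" parse to different trees: nothing is reduced at this stage.
assert parse("6/12") == ("/", ("num", "6"), ("num", "12"))
assert parse("6/12") != parse("1/2")
```

Numbers stay as strings in the tree, so no precision or numeric domain is
chosen by the parser, and ± is simply not a token.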

That syntax would not specify whether expressions such as 0/0 can actually
be evaluated (it's up to implementations to check this according to their
own numeric domain), as the syntax does not specify the numeric domain
(field or ring?) in which it will be evaluated (for example 1/0 is valid in
some rings where all members are invertible, including zero). Nor would it
assume that "-1" is necessarily different from "+1" (they are equivalent in
Z/2Z, which contains just two members, 0 and 1, and where "2" and "4" are
also equal to "0"), or fix the precision of numbers ("1/100" could be equal
to "0" in an integer domain).
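
A small Python sketch of how one expression tree can give different answers
depending on the domain. The tuple trees are built by hand here, and the
three "domains" (plain dicts of operations) are illustrative choices of
mine, not anything defined by Unicode:

```python
from fractions import Fraction

def evaluate(tree, domain):
    """Evaluate a tuple tree using only the operations supplied by `domain`."""
    op, *args = tree
    if op == "num":
        return domain["num"](args[0])
    return domain[op](*(evaluate(arg, domain) for arg in args))

def mod2_div(a, b):
    if b == 0:
        raise ZeroDivisionError("0 has no inverse in Z/2Z")
    return a   # the only invertible member is 1, and a/1 == a

RATIONALS = {"num": Fraction, "neg": lambda x: -x, "/": lambda a, b: a / b}
INTEGERS = {"num": int, "neg": lambda x: -x, "/": lambda a, b: a // b}
MOD2 = {"num": lambda n: n % 2, "neg": lambda x: -x % 2, "/": mod2_div}

one = ("num", 1)
minus_one = ("neg", ("num", 1))
hundredth = ("/", ("num", 1), ("num", 100))

assert evaluate(minus_one, RATIONALS) != evaluate(one, RATIONALS)
assert evaluate(minus_one, MOD2) == evaluate(one, MOD2)    # -1 == +1 in Z/2Z
assert evaluate(("num", 4), MOD2) == 0                     # "4" equals "0" in Z/2Z
assert evaluate(hundredth, RATIONALS) == Fraction(1, 100)
assert evaluate(hundredth, INTEGERS) == 0                  # floor division in integers
```

Whether 0/0 is an error is likewise decided by the domain: with these
definitions all three raise ZeroDivisionError, but another domain could
choose otherwise without any change to the tree.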

This could be the basis for defining a basic set of expressions that many
programming languages could support in their syntax, using whatever
precision they want or can support (even if their native syntax uses other,
similar notations related by simple substitution rules).

For this reason, it seems more natural to avoid reducing fractions in the
numeric value property, and to keep them in their natural form: "6/12" NOT
reduced to "1/2", and "0/3" NOT reduced to "0" (because reducing may
incorrectly assume a subset of a linear numeric field). Let each
implementation define its own numeric domain and decide whether these
expressions can be evaluated in that domain: the parser will be the same,
only the evaluator will differ, as it depends entirely on the numeric
domain.

2015-03-29 11:41 GMT+02:00 Andrew West <andrewcwest_at_gmail.com>:

> On 28 March 2015 at 20:05, Karl Williamson <public_at_khwilliamson.com>
> wrote:
> >
> > Existing software that looks at the numeric values of characters is
> > written expecting that rational numbers will have been reduced to their
> > lowest form.
>
> That seems to be a rather rash statement. I have software (BabelPad)
> which parses the numeric values of characters for numeric sorting
> purposes, and it parses "6/12" for MEROITIC CURSIVE FRACTION SIX
> TWELFTHS as 0.5. Personally I find it hard to imagine how you could
> write software that accepts "6/12" as input and is unable to come up
> with the answer of a half.
>
> I would say that fractions should not be reduced to their lowest form
> in the Unicode data as some people may need to order fractions by
> numerator or denominator, and reducing to lowest form could break the
> expectations of some software. Having said that, I note that the
> numeric value of one character has been reduced in the Unicode data:
> U+2189 VULGAR FRACTION ZERO THIRDS is given the numeric value of "0"
> rather than "0/3".
>
> Andrew

_______________________________________________

Unicode mailing list

Unicode_at_unicode.org

http://unicode.org/mailman/listinfo/unicode

Received on Sun Mar 29 2015 - 06:18:26 CDT

This archive was generated by hypermail 2.2.0 : Sun Mar 29 2015 - 06:18:27 CDT