Re: (M) Little support for writing numbers?

From: Jeroen Hellingman (etmjehe@genesis.etm.ericsson.se)
Date: Mon Aug 31 1998 - 05:16:49 EDT

Next message: Markus Kuhn: "The Most Common IPA Characters"
Previous message: Marco Mussini: "(M) best way to handle Unicode data internally?"
Maybe in reply to: Marco Mussini: "(M) Little support for writing numbers?"
Next in thread: Kenneth Whistler: "Re: (M) Little support for writing numbers?"

> it seems that in Unicode there are satisfactory ways to write dates,
> currencies, and so on, the Japanese way (Imperial age) as well as our
> way, etc.
>
> But I think that there is little support to enable people to write
> numbers their own way (just think to Japanese/Chinese way, or (they told
> me) Indian.

Today, such numbers are no longer used in India or Sri Lanka.
There used to be such numbers in Tamil and Sinhalese. In Tamil, the
numbers can be used either way, that is, as a positional system or
in the ideographic way: 1980 is written either as '1' * '1000' +
'9' * '100' + '8' * '10', or simply as '1' '9' '8' '0'. Because both
uses are possible, the distinction can be made only by 'smart'
conversion software that knows about both conventions. It is fairly
easy to distinguish the two by the presence of the glyphs for '10',
'100', '1000', etc., and the system can use the second convention
without resorting to yet another digit type. I already coded such a
routine some time ago.
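
For illustration only (this is not the routine mentioned above), a
minimal Python sketch of the idea. The function names are made up for
this example. It assumes the Tamil digits at U+0BE6..U+0BEF and the
Tamil number signs for 10, 100 and 1000 at U+0BF0..U+0BF2, and it only
handles values below 10000 (compound multipliers such as 10 * 1000
are left out).

```python
# Tamil digits 0..9 (U+0BE6..U+0BEF) and the signs for 10, 100, 1000.
TAMIL_DIGITS = {chr(0x0BE6 + i): i for i in range(10)}
TAMIL_MULTIPLIERS = {"\u0bf0": 10, "\u0bf1": 100, "\u0bf2": 1000}


def parse_tamil_number(text):
    """Convert a Tamil numeral string written in either convention."""
    if not text:
        raise ValueError("empty numeral")
    # The presence of a glyph for 10, 100 or 1000 selects the algorithm.
    if any(ch in TAMIL_MULTIPLIERS for ch in text):
        return _parse_multiplicative(text)
    return _parse_positional(text)


def _parse_positional(text):
    # '1' '9' '8' '0' -> 1980
    value = 0
    for ch in text:
        if ch not in TAMIL_DIGITS:
            raise ValueError("unexpected character: %r" % ch)
        value = value * 10 + TAMIL_DIGITS[ch]
    return value


def _parse_multiplicative(text):
    # '1' '1000' '9' '100' '8' '10' -> 1*1000 + 9*100 + 8*10
    total = 0
    pending = None  # digit still waiting for its multiplier
    for ch in text:
        if ch in TAMIL_DIGITS:
            if pending is not None:
                raise ValueError("two digits in a row in multiplicative form")
            pending = TAMIL_DIGITS[ch]
        elif ch in TAMIL_MULTIPLIERS:
            # A bare multiplier counts once (assumption of this sketch).
            factor = 1 if pending is None else pending
            total += factor * TAMIL_MULTIPLIERS[ch]
            pending = None
        else:
            raise ValueError("unexpected character: %r" % ch)
    if pending is not None:
        total += pending  # trailing units digit
    return total
```

Both spellings of 1980 then give the same integer, so the caller
never needs to know which convention the user typed.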

Jeroen

> It's true, the glyphs for the Japanese kanjis for the digits, as well as
> 100, 1000 and 10000 are there, but the problem of input of numeric data
> in nonwestern contexts has several side problems:
>
> - must (should) distinguish between Japanese digits and western digits,
> and disallow the user to write a number by mixing the two sets. But all
> number-related symbols are marked as "is digit", without distinction
> about "is digit, more precisely, a Japanese digit"... so a routine that
> processes numeric input needs to check the symbols read to figure out if
> the number is being written in Japanese or western or whatsoever, and then
> apply the rules (e.g. 1 million is 100 x 10000 in Japanese
> ['hyaku-man']), so to assemble what the user types in and to get the
> internal binary representation of the number there is a lot of decisions
> to be taken and there is little support built into the Unicode set for
> taking these decisions.

All the logic needed is a check of what follows the first ideographic
(or Tamil or Sinhala) digit seen, to select the proper conversion
algorithm.
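
As a rough sketch of that check for the Japanese case, here is one
possible reading in Python. The names are hypothetical. For simplicity
it scans the whole string for a multiplier rather than only the
position after the first digit, it handles only the multipliers 10,
100, 1000 and 10000 (man), and it rejects any other character, which
also rules out mixing in western digits.

```python
KANJI_DIGITS = dict(zip("〇一二三四五六七八九", range(10)))
KANJI_SMALL = {"十": 10, "百": 100, "千": 1000}
KANJI_MAN = "万"  # 10000, the grouping unit


def uses_multipliers(text):
    """True if the numeral uses the multiplicative convention."""
    return any(ch in KANJI_SMALL or ch == KANJI_MAN for ch in text)


def parse_kanji_number(text):
    if not text:
        raise ValueError("empty numeral")
    for ch in text:
        if ch not in KANJI_DIGITS and ch not in KANJI_SMALL and ch != KANJI_MAN:
            # Western digits and anything else are refused outright.
            raise ValueError("unexpected character: %r" % ch)

    if not uses_multipliers(text):
        # Positional: one digit per decimal place.
        value = 0
        for ch in text:
            value = value * 10 + KANJI_DIGITS[ch]
        return value

    total = 0       # completed groups of 10000
    group = 0       # value accumulated below the next 'man'
    pending = None  # digit waiting for a multiplier
    for ch in text:
        if ch in KANJI_DIGITS:
            if pending is not None:
                raise ValueError("two digits in a row in multiplicative form")
            pending = KANJI_DIGITS[ch]
        elif ch in KANJI_SMALL:
            group += (1 if pending is None else pending) * KANJI_SMALL[ch]
            pending = None
        else:  # KANJI_MAN closes the current group
            group += pending or 0
            total += (group or 1) * 10000
            group = 0
            pending = None
    return total + group + (pending or 0)
```

With this, 'hyaku-man' (百万) comes out as 100 x 10000 = 1000000, and
一九八〇 and 千九百八十 both come out as 1980.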

> It seems that the routine dealing with number input has to know by
> itself a lot of stuff that could be (at least in part) conveniently
> placed in the information tags of the Unicode format.

> --Marco

Jeroen


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:41 EDT