# Re: Code Point -- What is the integer?

From: Sinnathurai Srivas (sisrivas@blueyonder.co.uk)
Date: Thu Apr 28 2005 - 17:19:39 CST

This URL
http://www.unicode.org/charts/PDF/U0B80.pdf
at
http://www.unicode.org/charts/
might help.

Think of the Roman way of counting.
Think of the Arabic/Indic way of counting.

You can translate the code point (i.e., convert it to decimal, the Arabic/Indic numerals as the world knows them).

Hope this makes sense.

ex. Tamil letter A = hexadecimal "0B85" = decimal "2949"
Hex is just another language in the counting system: the same number, written in base 16 instead of base 10.
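
The conversion can be sketched in a few lines of Python. This is only an illustration, not part of any Unicode tool: the function names `code_point_to_int` and `diagram_row` are made up here, and the 16-bit binary padding assumes a code point no larger than U+FFFF. The binary column is just the integer written in base 2, not a particular encoding form.

```python
def code_point_to_int(label):
    """Convert a label such as 'U+0041' (or bare hex digits) to an integer."""
    text = label.strip().upper()
    if text.startswith("U+"):
        text = text[2:]          # drop the "U+" prefix, keep the hex digits
    return int(text, 16)         # read the digits as base 16


def diagram_row(ch):
    """Build one 'char --> decimal --> U+hex --> binary' row for a character."""
    n = ord(ch)                  # integer position of the character
    bits = format(n, "016b")     # zero-padded to 16 bits (assumes n <= 0xFFFF)
    grouped = " ".join(bits[i:i + 4] for i in range(0, len(bits), 4))
    return "{} --> {} --> U+{:04X} --> {}".format(ch, n, n, grouped)


print(code_point_to_int("U+0041"))    # 65
print(code_point_to_int("U+0B85"))    # 2949
print(code_point_to_int("U+0BE6"))    # 3046
print(code_point_to_int("U+10FFFF"))  # 1114111
print(diagram_row("A"))               # A --> 65 --> U+0041 --> 0000 0000 0100 0001
```

So "doing the math" works the same way for every code point: the digits after "U+" are simply the integer written in hexadecimal.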

Sinnathurai

----- Original Message -----
From: "Sivakatirswami" <katir@hindu.org>
To: <unicode@unicode.org>
Sent: Thursday, April 28, 2005 5:43 AM
Subject: Code Point -- What is the integer?

> Namaskar and Aloha from the offices of Himalayan Academy Publications in
> Hawaii...
>
> Where we are just slowly learning about Unicode in our publications work..
>
> I'm writing a short article on Unicode in a "public" magazine (Hinduism
> Today) about Mac OSX Tiger (10.4) support for Tamil Unicode...
>
> I need to get down to a very layman's level and only have a very small
> space allotment.
>
> 4.0 standard book) I *still* have trouble getting my head around the
> difference between
>
> 1. The code points described as a simple series of integers from
>
> 1 to 1,123,000 (or whatever that last integer is that is equivalent to:
> U+10FFFF)
>
> This being the simplest way a layman can visualize it, albeit the latter
> number is big... it is still easy to describe and visualize (roughly of
> course) as in:
>
> "Unicode is just this long series from One to over One Million and
> there is a character in each place and the whole list includes all the
> characters of all the languages known to man, past and present."
>
> Which of course sounds at the very least "cool" for the glib-minded and
> incredibly ground breaking for those who can see it for what it is... (if
> true, which it seems to be...)
>
> 2. but then we move on to: " Unicode characters may be encoded at any
> code point from U+0000 to U+10FFFF" and now we begin to slide into the
> "nerd realm"
>
> I understand "004F" to be the hexadecimal representation for four
> separate, 4-bit sequences.
>
> for purposes of a diagram, I would like to translate any given such code
> point designation like A = U+0041 to its integer position in the series.
> (aside question: what do you call that kind of "label" for the code point:
> "U+****"?)
>
> e.g. expressed verbally, if one were writing an article for "mom and pop"
>
> The capital letter A is number "65" in the series... but computer geeks
> like to express it in hexadecimal form like this, "U+0041" and if you
> really need to describe it to the computer then it is "0000 0000 0100
> 0001"
>
> or in a diagram simply
>
> A --> 65 --> U+0041 --> 0000 0000 0100 0001
>
> And ditto for one Tamil Char and one Chinese character... but my problem
> is ascertaining the second, simple integer, segment...
>
> OK, so my questions are:
>
> 1) is the decimal expression for the capital letter A as 65 exactly
> correspondent to its integer code point position in the total unicode
> series expressed as a series of integers?
>
> 2) How can one ascertain the integer number for a code point outside-above
> base ANSI?
>
> e.g. in the diagram I want to put an English char, a Tamil char and a
> Chinese character...
>
> So we want to be able to say, for the layman:
>
> "The entire Tamil alphabet is contained between characters 2560 and 2843
> in the unicode series" But one needs to
>
> a) be able to find where those blocks are (where do you go to find the block
> beginnings and endings for different languages)
> b) be able to translate "U+0BE6" (which is a position in the Tamil set)
> back to a simple integer in the series. If I just "do the math" using the
> same correlation for the Letter A ["0041" = "65" therefore 0BE6 must equal
> **** ] ... will it be correct?
>
> I'm hoping I can go somewhere to find this info easily from some
> tables....
>
> TIA!
>
> Sannyasin Sivakatirswami
> at Kauai's Hindu Monastery
> katir@hindu.org
>