**From:** Sinnathurai Srivas (*sisrivas@blueyonder.co.uk*)

**Date:** Thu Apr 28 2005 - 17:19:39 CST

**Previous message:** Markus Scherer: "Re: Transliterator" **In reply to:** Sivakatirswami: "Code Point -- What is the integer?" **Next in thread:** Hans Aberg: "Re: Code Point -- What is the integer?"

This chart

http://www.unicode.org/charts/PDF/U0B80.pdf

listed at

http://www.unicode.org/charts/

might help.

Think of the Roman way of counting.

Think of the Arabic/Indic way of counting.

Think of hexadecimal counting.

This page uses hexadecimal: http://www.unicode.org/charts/PDF/U0B80.pdf

You can translate it (that is, convert it to decimal, the Arabic/Indic numerals the world knows).

Hope this makes sense.

e.g. Tamil letter A (அ) = hexadecimal "0B85" = decimal "2949"; Tamil Ka (க) = hexadecimal "0B95" = decimal "2965"

Hex is just another notation for writing the same numbers.
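If it helps, here is a minimal sketch of that conversion in Python. The function names `code_point_to_decimal` and `decimal_to_code_point` are made up for illustration and are not part of any Unicode library; the only real call relied on is Python's built-in `int(text, 16)`.

```python
def code_point_to_decimal(label: str) -> int:
    """Turn a label such as 'U+0B85' or '0b85' into its decimal integer."""
    text = label.strip().upper()
    if text.startswith("U+"):
        text = text[2:]          # drop the "U+" prefix, keep the hex digits
    return int(text, 16)         # read the digits as base 16


def decimal_to_code_point(number: int) -> str:
    """Turn a decimal integer back into the 'U+XXXX' notation."""
    return f"U+{number:04X}"     # at least four uppercase hex digits


if __name__ == "__main__":
    for label in ("U+0041", "U+0B85", "U+0BE6", "U+10FFFF"):
        print(label, "=", code_point_to_decimal(label))
    # U+0041 = 65
    # U+0B85 = 2949
    # U+0BE6 = 3046
    # U+10FFFF = 1114111
```

So the hexadecimal value in the chart and the plain integer are the same number written two ways, and "doing the math" by hand gives the same answer.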

Sinnathurai

----- Original Message -----

From: "Sivakatirswami" <katir@hindu.org>

To: <unicode@unicode.org>

Sent: Thursday, April 28, 2005 5:43 AM

Subject: Code Point -- What is the integer?

> Namaskar and Aloha from the offices of Himalayan Academy Publications in
> Hawaii...
>
> Where we are just slowly learning about Unicode in our publications work...
>
> I'm writing a short article on Unicode in a "public" magazine (Hinduism
> Today) about Mac OS X Tiger (10.4) support for Tamil Unicode...
>
> I need to get down to a very layman's level and only have a very small
> space allotment.
>
> Despite reading all the documents (I downloaded *all* the PDFs for the
> 4.0 standard book) I *still* have trouble getting my head around the
> difference between
>
> 1. The code points described as a simple series of integers from
>
> 1 to 1,123,000 (or whatever that last integer is that is equivalent to
> U+10FFFF)
>
> This being the simplest way a layman can visualize it, albeit the latter
> number is big... it is still easy to describe and visualize (roughly of
> course) as in:
>
> "Unicode is just a long series from One to over One Million and
> there is a character in each place and the whole list includes all the
> characters of all the languages known to man, past and present."
>
> Which of course sounds at the very least "cool" for the glib-minded and
> incredibly groundbreaking for those who can see it for what it is... (if
> true, which it seems to be...)
>
> 2. but then we move on to: "Unicode characters may be encoded at any
> code point from U+0000 to U+10FFFF" and now we begin to slide into the
> "nerd realm"
>
> I understand "004F" to be the hexadecimal representation for four
> separate, 4-bit sequences.
>
> For purposes of a diagram, I would like to translate any given such code
> point designation like A = U+0041 to its integer position in the series.
> (aside question: what do you call that kind of "label" for the code point:
> "U+****"?)
>
> e.g. expressed verbally, if one were writing an article for "mom and pop"
>
> The capital letter A is number "65" in the series... but computer geeks
> like to express it in hexadecimal form like this, "U+0041" and if you
> really need to describe it to the computer then it is "0000 0000 0100
> 0001"
>
> or in a diagram simply
>
> A --> 65 --> U+0041 --> 0000 0000 0100 0001
>
> And ditto for one Tamil char and one Chinese character... but my problem
> is ascertaining the second, simple integer, segment...
>
> OK, so my questions are:
>
> 1) is the decimal expression for the capital letter A as 65 exactly
> correspondent to its integer code point position in the total Unicode
> series expressed as a series of integers?
>
> 2) How can one ascertain the integer number for a code point outside-above
> base ANSI?
>
> e.g. in the diagram I want to put an English char, a Tamil char and a
> Chinese character...
>
> So we want to be able to say, for the layman:
>
> "The entire Tamil alphabet is contained between characters 2560 and 2843
> in the Unicode series" But one needs to
>
> a) be able to find where those blocks are (where do you go to find the
> blocks' beginnings and endings for different languages)
> b) be able to translate "U+0BE6" (which is a position in the Tamil set)
> back to a simple integer in the series. If I just "do the math" using the
> same correlation for the letter A ["0041" = "65" therefore 0BE6 must equal
> **** ] ... will it be correct?
>
> I'm hoping I can go somewhere to find this info easily from some
> tables....
>
> TIA!
>
> Sannyasin Sivakatirswami
> Himalayan Academy Publications
> at Kauai's Hindu Monastery
> katir@hindu.org
>
> www.HimalayanAcademy.com,
> www.HinduismToday.com
> www.Gurudeva.org
> www.Hindu.org


*This archive was generated by hypermail 2.1.5: Thu Apr 28 2005 - 17:22:22 CST*