Re: Hawaiian language site goes Unicode

From: keola@leoki.uhh.hawaii.edu
Date: Wed Jun 03 1998 - 04:04:29 EDT


Aloha kakou,

I've joined this list now, so I'll answer a few questions and comments I
received and maybe add a few of my own.

> I'm interested to know what process was used to assign U+02BB MODIFIER
> LETTER TURNED COMMA as the appropriate rendering of Hawaiian
> glottal stop. Although Lucida Sans Unicode has a glyph for this
> character, Bitstream Cyberbit does not, which is unfortunate.
> (Luckily for me, I have both fonts.)

I think that was my error. In my earlier exchange of messages with Ken, he
suggested using U+2018 (LEFT SINGLE QUOTATION MARK) instead of U+02BB,
and I got them mixed up. I'll fix that and use U+2018 instead. Mahalo.

> Many of the pages (except for the first) seem to be missing a
> "charset" declaration, which would make the browser use the correct
> character decoding right away.

My blunder: I forgot to add the charset declaration to all of my templates,
so only some had it. I will fix this today.
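
For anyone checking their own templates, here is a rough sketch of one way
to look for the declaration. It assumes the declaration is written as a
<meta> tag near the top of the page; the function name and the two sample
pages are made up for illustration and are not taken from the actual site.

```python
import re

# Matches a charset value inside a <meta ...> tag, for example
# <meta http-equiv="Content-Type" content="text/html; charset=utf-8">.
# This is a loose pattern for illustration, not a full HTML parser.
CHARSET_RE = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?\s*([\w-]+)', re.IGNORECASE
)

def declared_charset(html_bytes):
    """Return the declared charset in lowercase, or None if none is found."""
    match = CHARSET_RE.search(html_bytes[:2048])  # only look near the top
    return match.group(1).decode("ascii").lower() if match else None

# Hypothetical sample pages: one with the declaration, one without.
with_decl = (b'<html><head><meta http-equiv="Content-Type" '
             b'content="text/html; charset=utf-8"></head></html>')
without_decl = b'<html><head><title>Kualono</title></head></html>'

for label, page in (("with", with_decl), ("without", without_decl)):
    print(label, declared_charset(page))
```

Running something like this over every generated page would show which
templates still lack the declaration.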

> After visiting the page, it seems that there are some errors in
> the UTF-8 encoding. The second byte of a multibyte sequence is
> sometimes out of range, that is, above <BF>, which could result
> in undefined behaviour. The incorrect sequences are <C4><C1>,
> <C5><CD> and <C5><CC>.

Sorry, I'm really not sure what this means; I'm very new to this. I'm
converting numerically, using these byte values (in decimal):

A-macron 196,192
a-macron 196,193
E-macron 196,146
e-macron 196,147
I-macron 196,170
i-macron 196,171
O-macron 197,204
o-macron 197,205
U-macron 197,170
u-macron 197,171
glottal - 202,187 (I know this is wrong, gotta find the right combination)
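
A minimal sketch of how the pairs above could be checked, assuming the
standard code points for these letters (U+0100 for A-macron through U+016B
for u-macron, and U+02BB for the glottal). The byte values are copied from
the list as given; the names LISTED, utf8_bytes, is_valid_pair and check are
only illustrative.

```python
# Code points for the Hawaiian letters, paired with the decimal byte
# values from the list above (copied as given, not corrected).
LISTED = {
    "A-macron": (0x0100, (196, 192)),
    "a-macron": (0x0101, (196, 193)),
    "E-macron": (0x0112, (196, 146)),
    "e-macron": (0x0113, (196, 147)),
    "I-macron": (0x012A, (196, 170)),
    "i-macron": (0x012B, (196, 171)),
    "O-macron": (0x014C, (197, 204)),
    "o-macron": (0x014D, (197, 205)),
    "U-macron": (0x016A, (197, 170)),
    "u-macron": (0x016B, (197, 171)),
    "glottal (U+02BB)": (0x02BB, (202, 187)),
}

def utf8_bytes(code_point):
    """Return the UTF-8 encoding of a code point as a tuple of ints."""
    return tuple(chr(code_point).encode("utf-8"))

def is_valid_pair(pair):
    """True if (lead, trail) has the shape of a two-byte UTF-8 sequence."""
    lead, trail = pair
    # Lead byte 0xC2..0xDF, continuation byte 0x80..0xBF (128..191).
    return 0xC2 <= lead <= 0xDF and 0x80 <= trail <= 0xBF

def check(listed):
    """Return {name: correct bytes} for entries whose listed bytes differ."""
    return {
        name: utf8_bytes(cp)
        for name, (cp, pair) in listed.items()
        if pair != utf8_bytes(cp)
    }

for name, correct in check(LISTED).items():
    print(name, "listed", LISTED[name][1], "expected", correct)

# U+2018 falls outside the two-byte range, so it needs three bytes.
print("U+2018", utf8_bytes(0x2018))
```

Under these assumptions the check flags the A/a-macron and O/o-macron
entries, whose second bytes fall above 191 (<BF>); those are the sequences
reported as out of range. The pair 202,187 does encode U+02BB correctly,
while U+2018 would be the three bytes 226,128,152.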

> Interpreting the phrase "as it existed on June 6, 1998"

Guess I was getting a little ahead of myself :-) I worked on some of the
basic site design problems that would make it difficult to keep up both the
escape-sequence Hawaiian language site and a Unicode one. If I can clear
these up I'll be able to keep both up and updated simultaneously.

Thanks to you all for the feedback, I'll keep plugging away at this.

Is there anyone else on the list dealing with conversion of a moderate-sized
site to Unicode? I'm just curious what other tools are out there for doing
this. Frontier itself isn't Unicode-friendly, but can be modified to render
8-bit text as Unicode. Also, does anyone have insight as to when Mac Nav or
MSIE might support Unicode?

Mahalo a nui.

Keola

==================================================================
              Ho'ouna 'ia mai loko mai o ka Leoki
             papa lawelawe ho'olaha 'olelo Hawai'i
             -------------------------------------
             Kualono - http://www.olelo.hawaii.edu
               hale_kuamoo@leoki.uhh.hawaii.edu
==================================================================


