RE: RE: A basic question on encoding Latin characters

From: Marco.Cimarosti@icl.com
Date: Wed Sep 29 1999 - 15:09:00 EDT


I still do not understand where the problem lies with these terminals and
combining characters.

The point is that receiving (polling for) BINARY characters (encoded in
Unicode, ASCII, GB, or whatever) and displaying them should be two
completely separate and unrelated things.

The part that receives BINARY characters (from remote or local) should not
even imagine the existence of things like combining characters, or the bidi
algorithm, or canonical or heretical (de)composition.

And the part that handles display (graphics) should not care at all where
the characters came from, or how and when this occurred.

Unfortunately, English is only my second language, so I have trouble
expressing myself clearly. Please allow me to explain the same thing
in my mother tongue:

---------------------------------------------

#include <stdio.h>
#include <wchar.h>

/* Unicode modem library: handles serial communication, UTF conversions,
etc. */
extern void UnicodeConnection_Open(void);
extern void UnicodeConnection_Close(void);
extern int UnicodeConnection_IsEof(void);
extern int UnicodeConnection_IsCharReady(void);
extern wchar_t UnicodeConnection_ReadChar(void);
extern void UnicodeConnection_WriteChar(wchar_t c);

/* Unicode keyboard library: handles plain keys, dead keys, input methods,
etc. */
extern void UnicodeKeyboard_Open(void);
extern void UnicodeKeyboard_Close(void);
extern int UnicodeKeyboard_IsEof(void);
extern int UnicodeKeyboard_IsCharReady(void);
extern wchar_t UnicodeKeyboard_ReadChar(void);

/* Unicode display library: handles fonts, bidi, normalization, etc. */
extern void UnicodeWindow_Open(void);
extern void UnicodeWindow_Close(void);
extern void UnicodeWindow_DrawLine(wchar_t * text, int length);
extern void UnicodeWindow_FeedLine(void);

void MyEchoChar(wchar_t c)
{
#define MAX_LINE_LEN 256 /* or whatever; no trailing semicolon in a macro */

        static wchar_t Line [MAX_LINE_LEN];
        static int Len = 0;
        
        switch (c)
        {
                case L'\n':
                        Len = 0;
                        UnicodeWindow_FeedLine();
                        break;
                case L'\b':
                        if (Len > 0)
                                UnicodeWindow_DrawLine(Line, --Len);
                        break;
                default:
                        if (Len < MAX_LINE_LEN)
                        {
                                Line[Len++] = c; /* append to the line buffer */
                                UnicodeWindow_DrawLine(Line, Len);
                        }
                        break;
        }
}

int MyTerminal(void)
{
        wchar_t c;

        UnicodeWindow_Open();
        UnicodeKeyboard_Open();
        UnicodeConnection_Open();

        while (!UnicodeKeyboard_IsEof() && !UnicodeConnection_IsEof())
        {
                if (UnicodeConnection_IsCharReady())
                {
                        c = UnicodeConnection_ReadChar();
                        MyEchoChar(c);

                        /* You now have one character from the server! Do
whatever you want with it! */
                }

                if (UnicodeKeyboard_IsCharReady())
                {
                        c = UnicodeKeyboard_ReadChar();
                        UnicodeConnection_WriteChar(c);
                        MyEchoChar(c);

                        /* You now have one character from the human! Do
whatever you want with it! */
                }
        }

        UnicodeConnection_Close();
        UnicodeKeyboard_Close();
        UnicodeWindow_Close();

        return 0;
}

int main (void)
{
        MyTerminal();
        return 0;
}

---------------------------------------------

I have left out all the fine details. Is it clearer now what I mean?

Why should combining characters be a problem, when they only become an
issue inside UnicodeWindow_DrawLine()?
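
To make that last point a bit more concrete, here is a rough sketch of the
kind of bookkeeping a display routine such as UnicodeWindow_DrawLine() might
do internally. Everything in it is an assumption made for illustration: the
helper names IsCombiningMark() and CountDisplayCells() are hypothetical,
wchar_t is assumed to hold Unicode code points, and only the Combining
Diacritical Marks block (U+0300..U+036F) is recognised. A real display
library would use the full Unicode character data.

```c
#include <wchar.h>

/* Hypothetical helper: true for code points in the Combining Diacritical
   Marks block (U+0300..U+036F). This is a deliberate simplification. */
static int IsCombiningMark(wchar_t c)
{
        return c >= 0x0300 && c <= 0x036F;
}

/* Hypothetical helper: number of display cells a line needs, assuming that
   every base character takes one cell and that every combining mark is
   drawn over the base character that precedes it. A mark at the very start
   of the line has no base, so it is given a cell of its own. */
static int CountDisplayCells(const wchar_t *text, int length)
{
        int cells = 0;
        int i;

        for (i = 0; i < length; i++)
        {
                if (i == 0 || !IsCombiningMark(text[i]))
                        cells++;
        }
        return cells;
}
```

With these assumptions, the three characters <a> <combining ring above> <b>
occupy two cells. The receiving loop never needs to know this; it just
appends characters to the line and hands the whole line to the display part.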

Regards.
        Marco

> -----Original Message-----
> From: Christopher John Fynn [SMTP:cfynn@dircon.co.uk]
> Sent: 1999 September 29, Wednesday 19.51
> To: Unicode List
> Subject: Re: RE: A basic question on encoding Latin characters
>
> Robert Brady <robert@ents.susu.soton.ac.uk> wrote:
>
> > Saying "abandon terminals" is not really an adequate solution either,
> > and certainly not to UNIX users. Using decomposed characters in the
> > input stream breaks things that were possible before. Sure, our
> > metaphor couldn't cope with distinguishing "ch" from "c", but one thing
> > it could do was distinguish "a-with-ring-above" with "a". We can't do
> > that with things normalised to postfix combining characters.
>
> Why not send a control character before the pair indicating <the next two
> characters should be combined>? This effectively gives you the same
> features as a prefixed combining character, though you have to send
> one extra character, and it gives you a combined "ch".
>
> If the terminal has some intelligence you can still type the <combining
> ring above> before the <a>. User types combining character, combining
> character is held. User types base character, <base character> + <stored
> combining character> is sent (or <control: the next two characters are
> paired> + <base character> + <combining character> is sent).
>
> - Chris
>
>
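
The "hold the combining character" scheme described in the quoted message
could be sketched roughly as follows. This is only an illustration under
stated assumptions: the names IsCombiningKey() and ReorderTypedChar() are
hypothetical, only U+0300..U+036F are treated as combining, and just one
mark is held at a time (a second mark typed before the base character
replaces the first). The optional control-character prefix is not shown.

```c
#include <wchar.h>

/* Hypothetical helper: true for code points in the Combining Diacritical
   Marks block (U+0300..U+036F). A simplification for this sketch. */
static int IsCombiningKey(wchar_t c)
{
        return c >= 0x0300 && c <= 0x036F;
}

/* Hypothetical helper: takes one typed character and fills 'out' with the
   characters to send, returning how many there are (0, 1 or 2).
   '*held' keeps the pending combining mark between calls; 0 means none. */
static int ReorderTypedChar(wchar_t *held, wchar_t typed, wchar_t out[2])
{
        if (IsCombiningKey(typed))
        {
                *held = typed;          /* hold it until a base arrives */
                return 0;
        }

        out[0] = typed;                 /* the base character goes first */
        if (*held != 0)
        {
                out[1] = *held;         /* then the stored combining mark */
                *held = 0;
                return 2;
        }
        return 1;
}
```

Typing <combining ring above> and then <a> sends nothing for the first key
and <a> + <combining ring above> for the second, so the stream on the wire
stays in the usual postfix order.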


