HTML question

From: Chuck Wrobel (wrobel@geoplex.com)
Date: Wed Jun 11 1997 - 19:36:35 EDT


I've got a question about HTML that I hope someone can help me with.
When HTML contains text in a native encoding, the <tags> appear to
be in ASCII. How does a program (one that wants to translate the
natively encoded text to another encoding) find that text amidst
the tags? I.e., the HTML looked like:

        <b><anothertag>XXYYZZ

where XX, YY, and ZZ were 2-byte quantities, i.e. some wide character
encoding. Now the real question: how does the program know where the
encoded text starts? E.g., if the HTML contained:

        <b><anothertag> XXYYZZ

would the extra space after the end of the tag mean that the first
"wide" character in that input would be " X" (and the second "XY")?
It's not clear how a conversion program (one that needs to skip the
tags, which appear to always be in ASCII, and convert the text) can
find the text.
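
To make the concern concrete, here is a minimal Python sketch of the
naive approach described above. The function name naive_wide_units is
made up for illustration, and the fixed 2-byte pairing is only an
assumption about how such a converter might work, not a description
of any real encoding or browser.

```python
def naive_wide_units(html: bytes) -> list:
    """Skip ASCII <tags>, then pair the remaining bytes into 2-byte units.

    Assumption (hypothetical converter): every byte outside a tag belongs
    to a fixed-width 2-byte character, counted from the first such byte.
    """
    text = bytearray()
    in_tag = False
    for byte in html:
        if byte == ord("<"):
            in_tag = True          # start of an ASCII tag
        elif byte == ord(">"):
            in_tag = False         # end of the tag; the ">" itself is dropped
        elif not in_tag:
            text.append(byte)      # byte belongs to the text between tags
    # Group the collected text bytes two at a time.
    return [bytes(text[i:i + 2]) for i in range(0, len(text), 2)]


print(naive_wide_units(b"<b><anothertag>XXYYZZ"))
# [b'XX', b'YY', b'ZZ']
print(naive_wide_units(b"<b><anothertag> XXYYZZ"))
# [b' X', b'XY', b'YZ', b'Z']
```

With the extra space, every 2-byte unit is shifted by one byte, which
is exactly the misalignment I am asking about.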

Chuck Wrobel


