Re: A basic question on encoding Latin characters

From: Kevin Bracey (kevin.bracey@pacemicro.com)
Date: Mon Oct 04 1999 - 05:08:54 EDT


In message <37F4E501.5A1E89CC@ispchannel.com>
          "Mark E. Davis" <markdavis@ispchannel.com> wrote:

> It is still not a problem. XML requires every instance of '<' where it
> could be interpreted as the start of a tag to be quoted as "&lt;", so if
> you wanted to use the combining character sequence it would have to be as
> "&lt;&#x0338;". (actually, the second character doesn't need to be quoted
> if the character set can express it).
>

What I meant was that it might be supplied in form C, but the user-agent
might be decomposing everything on input internally, causing a problem.
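
To make that concrete, here is a rough sketch of what I have in mind. The function decompose_on_input is purely hypothetical; it stands in for whatever a user-agent might do internally, and I am assuming the character in question is U+226E NOT LESS-THAN, which is the precomposed equivalent of '<' followed by U+0338.

```python
import unicodedata

# Precomposed (form C) character: U+226E NOT LESS-THAN.
NOT_LESS_THAN = "\u226E"


def decompose_on_input(text: str) -> str:
    """Hypothetical user-agent input step: decompose everything (NFD)."""
    return unicodedata.normalize("NFD", text)


decomposed = decompose_on_input(NOT_LESS_THAN)

# List the code points the user-agent would see after decomposition.
code_points = [f"U+{ord(ch):04X}" for ch in decomposed]

# The decomposed text now begins with a bare '<', which a parser that
# runs after this step could mistake for the start of a tag.
starts_with_lt = decomposed.startswith("<")
```

If the parsing happens after such a step, the single form C character has turned into a '<' plus a combining mark, which is the problem I was getting at.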

On your last point: surely you couldn't say &lt;<U+0338>, because that would
be a semicolon with a slash through it in the source, no? It _would_ have
to be &lt;&#x0338;. Or are we again searching only for base characters in
the source, ignoring combining marks?
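
A small sketch of the two spellings, in case it helps. The helper base_of_combining_mark is my own invention for illustration, and html.unescape follows HTML rules rather than XML ones, so treat it only as a stand-in for an XML parser resolving the references.

```python
import html
import unicodedata

COMBINING_SOLIDUS = "\u0338"  # U+0338 COMBINING LONG SOLIDUS OVERLAY


def base_of_combining_mark(source: str) -> str:
    """Return the character immediately before the first U+0338 in source."""
    index = source.index(COMBINING_SOLIDUS)
    return source[index - 1]


# Spelling 1: entity followed by a raw combining mark in the source.
raw_mark_source = "&lt;" + COMBINING_SOLIDUS
# Spelling 2: both characters written as references.
escaped_source = "&lt;&#x0338;"

# In spelling 1 the mark sits directly after the semicolon in the source text.
raw_base = base_of_combining_mark(raw_mark_source)

# In spelling 2 the source contains no combining mark at all; it only
# appears once the references are resolved.
resolved = html.unescape(escaped_source)
recomposed = unicodedata.normalize("NFC", resolved)
```

With the first spelling the source text itself has the mark applied to the ';', whereas the second keeps the source free of combining marks until the references are expanded.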

-- 
Kevin Bracey, Senior Software Engineer
Pace Micro Technology plc                     Tel: +44 (0) 1223 518566
645 Newmarket Road                            Fax: +44 (0) 1223 518526
Cambridge, CB5 8PB, United Kingdom            WWW: http://www.acorn.co.uk/


