RE: Definitions

From: Philippe Verdy ([email protected])
Date: Wed Nov 26 2003 - 05:29:37 EST

Next message: Arcane Jill: "RE: Compression through normalization"

Previous message: Arcane Jill: "RE: numeric properties of Nl characters in the UCD"
In reply to: [email protected]: "RE: Definitions"
Next in thread: Peter Kirk: "Re: Definitions"
Reply: Peter Kirk: "Re: Definitions"

[email protected] wrote:
> Briefly, it's my opinion that applications which claim to support
> and comply with Unicode should not 'step on' Unicode text. Any
> loopholes in the 'letter of the law' which allow applications to
> mung or reject Unicode text should be plugged.

If this "plugging" must be done, it should also be done for HTML and XML.
At present, a combining character can be encoded directly after the quote
character (single or double) that opens an attribute value, or directly
after a tag-closing ">". HTML and XML parsers treat that quote or
greater-than sign as a delimiter on its own and ignore the combining
sequence it would otherwise start. This leaves a defective combining
sequence (one with no base character) at the start of the attribute value
or text, and that is a problem.

My opinion is that HTML and XML parsers should not take the quote or
greater-than sign in isolation, without considering the whole combining
sequence. Such occurrences should therefore be treated as syntax errors. If
one really wants to create a Unicode-compliant XML/HTML document containing
defective sequences, those sequences should be encoded with character
entities...
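
As a rough illustration of the check described above, here is a minimal
Python sketch. The function names are hypothetical, and the scan is
deliberately naive: it does not parse markup, so it treats every quote and
every ">" as a delimiter, which a real parser would not do. It assumes that
"combining character" means general category Mn, Mc or Me.

```python
import unicodedata


def is_combining(ch: str) -> bool:
    # Assumption: combining marks are the general categories Mn, Mc and Me.
    return unicodedata.category(ch) in ("Mn", "Mc", "Me")


def find_marks_after_delimiters(markup: str) -> list[int]:
    """Return offsets of combining marks that directly follow ", ' or >."""
    return [
        i
        for i in range(1, len(markup))
        if markup[i - 1] in "\"'>" and is_combining(markup[i])
    ]
```

A stricter parser could report each returned offset as a syntax error
instead of silently accepting the input.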

An XML/HTML code generator that serializes a document should therefore
know the list of combining characters, and encode them with numeric entities
when their use is defective (at the beginning of a CDATA section, of an
attribute value, or of a text element...). This would completely "plug the
hole".
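
On the generator side, the idea could be sketched as follows in Python.
This is only an illustration under stated assumptions: the function name is
hypothetical, "combining character" is taken to mean general category Mn,
Mc or Me, and the value is assumed to have already been through the usual
escaping of "&", "<" and quotes.

```python
import unicodedata


def escape_leading_marks(value: str) -> str:
    """Write any leading combining marks as numeric character references."""
    i = 0
    # Count the combining marks (Mn, Mc, Me) at the very start of the value.
    while i < len(value) and unicodedata.category(value[i]) in ("Mn", "Mc", "Me"):
        i += 1
    # Only the leading run is defective; the rest of the value is untouched.
    return "".join(f"&#x{ord(c):X};" for c in value[:i]) + value[i:]
```

For example, a value starting with U+0301 would be written starting with
"&#x301;", so the mark no longer follows the delimiter directly. Note that
XML does not recognize character references inside a "<![CDATA[ ... ]]>"
section, so this approach only works for attribute values and ordinary
text.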


This archive was generated by hypermail 2.1.5 : Wed Nov 26 2003 - 06:04:12 EST