At 16:55 04/07/97 -0400, Chris Lilley wrote:
>> >> I expect problems like this to be many orders of magnitude worse
>> >> once Unicode starts to get widely used on the Web. The above
>> >> problem is at least well-defined; the people using the
>> >> 0x80-0x9f characters in HTML are clearly wrong, and the HTML
>> >> specification leaves no doubt about this. The problem is just that the authors
>> >> of HTML export filters of one very popular word processor have been
>> >> ignorant about the problem (I won't mention names here).
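To make the 0x80-0x9f problem concrete, here is a small sketch (my own illustration, not from the original mail): those bytes are C1 control codes in ISO 8859-1, but Windows code page 1252 assigns them printable characters such as curly quotes, which is exactly what the word-processor export filters were emitting.

```python
# Hypothetical illustration: bytes 0x80-0x9F are C1 control codes in
# ISO 8859-1, but printable characters in Windows code page 1252.
raw = b"\x93quoted\x94"  # "smart quotes" as a word processor might write them

as_cp1252 = raw.decode("cp1252")   # what the author intended
as_latin1 = raw.decode("latin-1")  # what an 8859-1 label claims

print(as_cp1252)        # “quoted”
print(repr(as_latin1))  # '\x93quoted\x94' -- C1 controls, not quotes
```

A browser honoring the (wrong) 8859-1 label sees the second interpretation, which is why such documents render with missing or garbage characters.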
>> This would be allowed if the HTML charset were coded correctly, and
>> I expect authoring tools will gradually stop producing a misleading
>> 8859-1 specification, as many do now.
>> A note to authoring-tool producers: if you do not know for sure that it is
>> 8859-1, do not produce this charset specification. Either get the correct
>> data from the operating system or ask the user,
>or convert to character entities if available, or convert to utf-8
>> and if this is not possible, it is better to do nothing!
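One way to follow the "convert to character entities" suggestion, sketched here as an assumption of mine rather than anything the mail specifies: when the source encoding is known but no charset label can be trusted, rewrite every non-ASCII character as a numeric character reference, so the resulting file is safe under any ASCII-compatible label.

```python
def to_character_entities(text: str) -> str:
    # Replace every non-ASCII character with a numeric character
    # reference (&#NNNN;), leaving only ASCII in the output.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

print(to_character_entities("na\u00efve \u201cquote\u201d"))
# na&#239;ve &#8220;quote&#8221;
```

The document then displays identically regardless of which default charset the receiving browser assumes, at the cost of larger files.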
>Doing nothing is equivalent to labelling as 8859-1, according to HTTP.
>I don't see how the application can not know what character set it is using.
While 8859-1 is the default according to the specs, so that specifying it
seems equivalent to specifying nothing, the most commonly used browsers
implement it differently: when nothing is specified, the user (reader) can
supply a character set, but when one is specified, the user cannot. This is
why I suggest that the 8859-1 specification should be present only when the
authoring code has reason to believe it is correct, and not otherwise.
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:35 EDT