Re: character entities in UTF-8 files

From: Chris Jacobs (chris.jacobs@freeler.nl)
Date: Tue Jul 12 2005 - 14:40:43 CDT

Next message: Peter Constable: "RE: character entities in UTF-8 files"

Previous message: Kenneth Whistler: "Re: character entities in UTF-8 files"
In reply to: Avraham Shapiro: "character entities in UTF-8 files"
Next in thread: Peter Constable: "RE: character entities in UTF-8 files"

----- Original Message -----
From: "Avraham Shapiro" <asha@loc.gov>
To: <unicode@unicode.org>
Sent: Tuesday, July 12, 2005 7:44 PM
Subject: character entities in UTF-8 files

> ** Low Priority **
>
> We have an XML based application that specifies UTF-8 files as input.
> Occasionally users will
> include numeric character entities, for example &#233; for e acute instead
> of the UTF-8
> equivalent of C3 A9. My question is: Is this legal UTF-8?

Perfectly legal.

Only it does not stand for e acute. As far as Unicode is concerned, it just
stands for itself, for the characters &#233;.

Of course you are allowed to have agreements with your users about replacing
&#233; by e acute, or by whatever you want to replace it by,
just like you can agree with them to convert lower case to capitals.
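
To make the point concrete, here is a minimal Python sketch. It assumes the
decimal form &#233; as the numeric reference for e acute (the hex form would
behave the same way); the variable names are only illustrative. It shows that
the reference is six ordinary ASCII characters at the UTF-8 level, while the
real e acute is the two bytes C3 A9.

```python
# Illustrative only: the same "e acute" written two different ways.
reference = "&#233;"   # six ASCII characters: & # 2 3 3 ;
literal = "\u00e9"     # the single character U+00E9

ref_bytes = reference.encode("utf-8")
lit_bytes = literal.encode("utf-8")

print(ref_bytes.hex(" "))   # 26 23 32 33 33 3b
print(lit_bytes.hex(" "))   # c3 a9

# Both byte sequences are well-formed UTF-8, and decoding changes nothing:
# the reference comes back as the same six characters, not as U+00E9.
assert ref_bytes.decode("utf-8") == "&#233;"
assert lit_bytes.decode("utf-8") == "\u00e9"
```

Any replacement of the first form by the second has to happen in a layer
above UTF-8 decoding.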

> And are numeric or symbolic character
> entities valid for Ascii-7 characters such as "<"? My guess is the first
> one is not legal,
> and the second one is application defined, i.e. Unicode says nothing about
> it. Am I right?

No. For those the situation is just the same as for the chars above 7F.
Entities in a UTF-8 file stand for the chars they are composed of, not
for the chars they denote.
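
Since the application in question is XML based, a short sketch of the two
layers may help. This assumes Python's standard xml.etree.ElementTree as a
stand-in for whatever XML parser the application actually uses; the sample
document is made up. UTF-8 decoding leaves both kinds of reference untouched,
and it is the XML parser that resolves them.

```python
import xml.etree.ElementTree as ET

# Made-up sample: one numeric reference, one named entity, one literal e acute.
raw = b'<?xml version="1.0" encoding="UTF-8"?><t>caf&#233; &lt; caf\xc3\xa9</t>'

# Layer 1: UTF-8 decoding. The references are still just their own characters.
as_text = raw.decode("utf-8")
assert "&#233;" in as_text and "&lt;" in as_text

# Layer 2: XML parsing. Here the references are replaced by what they denote.
parsed = ET.fromstring(raw).text
print(parsed)
```

After parsing, the numeric reference and the literal C3 A9 bytes yield the
same character, and &lt; yields "<".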

This archive was generated by hypermail 2.1.5 : Tue Jul 12 2005 - 14:44:54 CDT