You can find information on how to get the Unicode Standard at
> unicode@Unicode.ORG writes:
> > >According to the HTML I18N spec, all that is needed in this case is to
> > >specify CHARSET=CP1251, and the text would be correctly converted to the
> > >equivalent Unicode characters.
> > The issue is not the coded content of the document, about which you are
> > correct. The issue is numeric character references of the form &#nnnn;.
> > Some HTML documents today use numeric references in the C1 range,
> > assuming they are the extra characters in cp1251. This is contrary to the
> > i18n spec, which states that all numeric character references refer to
> > Unicode. This means that all references in the C1 range are illegal
> > according to the spec.
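
To make the conflict above concrete, here is a minimal Python sketch of the two readings of a single reference such as &#151;. The function names are hypothetical and exist only for illustration, and the value 151 is an arbitrary example from the C1 range (128 to 159); neither comes from the thread. Read as a Unicode code point, the reference names a C1 control character; read as a cp1251 byte value, it names a printable punctuation mark.

```python
import unicodedata


def reference_as_unicode(n: int) -> str:
    """Read a numeric character reference as a Unicode code point,
    which is the reading the i18n spec requires."""
    return chr(n)


def reference_as_cp1251_byte(n: int) -> str:
    """Read the same number as a cp1251 byte value, which is what the
    non-conforming documents described above assume."""
    return bytes([n]).decode("cp1251")


n = 151  # illustrative value in the C1 range, as in &#151;
strict = reference_as_unicode(n)
legacy = reference_as_cp1251_byte(n)

# General category "Cc" marks a control character.
print(f"U+{ord(strict):04X}", unicodedata.category(strict))  # U+0097 Cc
print(f"U+{ord(legacy):04X}", unicodedata.name(legacy))      # U+2014 EM DASH
```

The sketch calls chr() directly rather than an HTML entity decoder, because a lenient decoder may deliberately remap C1 references for compatibility, which would hide the difference being shown.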
> A subtlety: the i18n spec refers to UCS, which has a consequence
> when going beyond the BMP. There UCS has well-defined numbers, while I
> do not know whether Unicode has this.
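
On the question of numbers beyond the BMP, one relevant fact is not stated in the thread and is added here as background: the UTF-16 encoding form reaches code points above U+FFFF through surrogate pairs, so each such code point still has a single number even though it takes two 16-bit units to encode. The following is a hedged sketch of that arithmetic; the function name and the example code point are illustrative choices, not anything the posters referred to.

```python
from typing import List


def utf16_code_units(code_point: int) -> List[int]:
    """Return the UTF-16 code units for one code point."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable code point")
    if code_point <= 0xFFFF:
        # Inside the BMP: one 16-bit unit, equal to the code point itself.
        return [code_point]
    # Outside the BMP: split the 20-bit offset into a high and a low surrogate.
    offset = code_point - 0x10000
    return [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)]


example = 0x1D11E  # arbitrary example code point outside the BMP
units = utf16_code_units(example)
print([f"{u:04X}" for u in units])  # ['D834', 'DD1E']
```

A numeric character reference to such a character would presumably carry the single code point number, not the two surrogate values, though the thread itself does not settle this.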
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:33 EDT