From: Deborah W. Anderson (dwanders@pacbell.net)
Date: Sun Mar 14 2004 - 15:25:36 EST
Two questions:
1. Is there a way to determine the prevalence of Unicode in electronic documents (as opposed to documents in other encodings)? For the Web at least, has anyone done a statistical sampling to determine the percentage of webpages encoded in Unicode?
2. A graduate student mentioned her impression that most Cyrillic webpages (at least Russian ones, which are her interest) are still not encoded in Unicode. (She is researching the use of certain words in Russian and wanted to know how best to run the search.)
Again: Has anyone looked into the situation with Cyrillic in terms of the percentage of Web documents in Unicode?
With thanks,
Debbie Anderson
Deborah Anderson
Researcher, Dept. of Linguistics
UC Berkeley
Email: dwanders@socrates.berkeley.edu
or dwanders@pacbell.net
Script Encoding Initiative: www.linguistics.berkeley.edu/~dwanders