Wanted: An Internet Unicode Meter

From: Daniel Yacob (unicode@geez.org)
Date: Wed Jul 26 2006 - 12:01:17 CDT


    I was asked twice within a week recently how many Amharic documents
    were on the internet and I could only guess at a figure. So it
    dawned on me that it would be a nice service if search engine
    companies could provide some statistics based on language (if
    identified) and script. Perhaps these stats are available and
    I just wasn't able to find them?

    Going a step further, stats on a per-character basis, or even a
    per-property basis, would be useful and not just academically interesting.
    The practical application that comes to mind would be as a survey
    of Unicode usage. Under-utilized blocks, even dead zones, could be
    identified which would indicate where community outreach was needed.
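
    To make the per-character and per-property idea concrete, here is a
    minimal sketch of what such a meter might tally for a single document.
    The function name unicode_meter is hypothetical, and the "script" label
    is only a rough guess taken from the first word of each character's
    Unicode name, since Python's standard library exposes neither the
    Script property nor block ranges. A real service would use the Unicode
    Character Database directly and aggregate over a whole index.

```python
import unicodedata
from collections import Counter


def unicode_meter(text):
    """Tally non-space characters by code point, general category,
    and an approximate script label (a heuristic, not the real
    Unicode Script property)."""
    by_char = Counter()
    by_category = Counter()
    by_script = Counter()
    for ch in text:
        if ch.isspace():
            continue
        by_char[ch] += 1
        # General category, e.g. "Lo" (other letter), "Ll" (lowercase letter).
        by_category[unicodedata.category(ch)] += 1
        # Assumption: the first word of the character name approximates
        # the script, e.g. "ETHIOPIC SYLLABLE SA" -> "ETHIOPIC".
        name = unicodedata.name(ch, "")
        by_script[name.split(" ")[0] if name else "UNKNOWN"] += 1
    return by_char, by_category, by_script


if __name__ == "__main__":
    chars, categories, scripts = unicode_meter("\u1230\u120b\u121d abc")
    print(scripts.most_common())
    print(categories.most_common())
```

    Summing such counters across crawled pages would give the kind of
    usage survey described above, and blocks whose counts stay near zero
    would be the candidates for outreach.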

    I think it would be in the Unicode Consortium's best interest to
    be aware of these stats (as well as related stats such as Unicode
    use vs other encoding systems and growth over time) to then know
    where to focus efforts in promoting adoption of the standard.

    So if the Unicode Consortium could work on a character meter with a
    major indexing/searching service, such as Google, that
    would be dandy. Do we know anyone at that intersection? ;-)


