minimizing size (was Re: allocation of Georgian letters)

From: William J Poser (wjposer@ldc.upenn.edu)
Date: Wed Feb 06 2008 - 20:22:37 CST

    The mention of the issue of whether Georgian encodes in UTF-8 as two
    bytes or three bytes is yet another instance of something that puzzles me.
    Why do some people on this list seem to care so much about text size?
    We now have such large storage devices, so much memory, and such high
    network bandwidth, that it strikes me as very odd that anyone would care
    very much about modest differences in the size of texts. Primary memory
    is a bit less plentiful and using less can improve performance, so
    the question of what representation to use for processing makes some
    sense, but why people would care about how large a text is in UTF-8,
    which is primarily intended for storage and transfer, mystifies me.
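
To make the two-versus-three-byte question concrete, here is a minimal Python sketch. The helper name utf8_len and the sample characters are chosen purely for illustration. UTF-8 uses two bytes for code points U+0080 through U+07FF and three bytes for U+0800 through U+FFFF; the Georgian block begins at U+10A0, so its letters fall in the three-byte range.

```python
def utf8_len(ch: str) -> int:
    """Return the number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


# Illustrative sample characters (hypothetical choices, one per byte length).
samples = {
    "Latin a (U+0061)": "\u0061",
    "Cyrillic a (U+0430)": "\u0430",
    "Georgian an (U+10D0)": "\u10d0",
}

for name, ch in samples.items():
    print(f"{name}: {utf8_len(ch)} byte(s)")

# The two-byte/three-byte boundary sits between U+07FF and U+0800.
print(utf8_len("\u07ff"), utf8_len("\u0800"))
```

For a text consisting mostly of Georgian letters, the difference under discussion is therefore roughly a factor of 3/2 in UTF-8 size compared with a script allocated below U+0800.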

    So, is this an essentially outdated obsession that some people have
    not been able to shake? Are there people here working on applications
    with so much text that modest differences in size are important? Are
    some of you working in very restrictive environments such as embedded
    systems or satellites? For whom does it really make a difference
    how many bytes the UTF-8 encoding of their script requires?

    Bill


