From: Douglas Davidson (firstname.lastname@example.org)
Date: Thu Feb 07 2008 - 11:31:41 CST
On Feb 7, 2008, at 3:22 AM, Michael S. Kaplan wrote:
> Having flown halfway around the world to talk to people who for
> whatever reasons, both valid and invalid (and not really
> distinguishing which is which on their list of concerns), are
> unhappy with a language encoding that in their view doubles or worse
> the amount of bytes used to store their language in Unicode, I can
> tell you that this is a very real concern on some people's minds.
> True or false, it is on their minds. They can all add and multiply,
> and it certainly looks like a 2x or 3x situation to them.
> And we get a lot further by acknowledging their concerns and then
> showing them that they have less to be concerned about than they
> think, in the end, than we ever would by telling them they are
> wrong, wrong, wrong.
One mitigating factor is that many document formats have at least an
option to employ some form of compression. For example, both OOXML
and ODF are zip-archived XML, which means that most text will usually
end up being compressed. If one is concerned about sending HTML over
the wire, then one can use HTTP compression. Obviously these are
general-purpose compression algorithms, not text-specific ones, but
they still should be able to help. Actually, in most XML and HTML
documents, a large proportion of the characters are ASCII markup
anyway, so the overall expansion is not going to be 2x or 3x in the
first place. Furthermore, in many cases the size of the text in any
form is less significant than the size of other data such as images.
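
To make the arithmetic concrete, here is a rough Python sketch. It is only an illustration under stated assumptions: the sample word, the markup wrapper, and the helper names are all made up for this example, and zlib's DEFLATE stands in for what zip archives and HTTP gzip compression typically use. The sample is highly repetitive, so the compression it shows is far better than real text would get.

```python
import zlib

def utf8_expansion(text: str) -> float:
    """UTF-8 bytes per character (code point) for the given text."""
    return len(text.encode("utf-8")) / len(text)

def deflated_size(text: str) -> int:
    """Size in bytes of the UTF-8 form after zlib (DEFLATE) compression."""
    return len(zlib.compress(text.encode("utf-8"), 9))

# Hypothetical sample: one Devanagari word, six code points,
# each of which takes 3 bytes in UTF-8.
word = "\u0939\u093f\u0928\u094d\u0926\u0940"

# Bare text: the word plus a space, repeated.
plain = (word + " ") * 200

# The same words wrapped in made-up ASCII markup, one per line.
html = "".join(
    f'<p class="body"><span lang="hi">{word}</span></p>\n' for _ in range(200)
)

for label, text in (("plain", plain), ("html", html)):
    raw = len(text.encode("utf-8"))
    print(
        f"{label}: {len(text)} chars, {raw} UTF-8 bytes, "
        f"{utf8_expansion(text):.2f} bytes/char, "
        f"{deflated_size(text)} bytes deflated"
    )
```

With this sample the bare text costs 19 bytes per 7 characters, a bit over 2.7 bytes per character, while the marked-up version costs 62 bytes per 50 characters, about 1.24. How much markup a real document carries will of course vary.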