Re: Effect on file size when using extended fonts in Word 2000

From: Michael (michka) Kaplan
Date: Tue Dec 19 2000 - 09:24:49 EST

Word 2000 uses Unicode, and a somewhat bloated format for RTF, as it always
has (extra tags around even the smallest pieces of text). To see more of
it, save your doc to HTML some time and look at the tags.... :-)
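
As a rough illustration of the RTF side (a sketch only, not what Word itself
emits): RTF 1.5 and later can write a character outside ASCII as a \uN
escape, where N is the signed 16-bit decimal value of the UTF-16 code unit,
followed by a fallback character. The function name rtf_escape is made up
for this example, and real Word output wraps text in far more markup than
this.

```python
def rtf_escape(text: str) -> str:
    """Escape text for RTF: ASCII passes through, everything else
    becomes a \\uN escape with '?' as the fallback character.
    (Illustrative helper, not Word's actual writer.)"""
    out = []
    for ch in text:
        if ch in "\\{}":
            # RTF control characters must be backslash-escaped.
            out.append("\\" + ch)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Work in UTF-16 code units so characters beyond the BMP
            # come out as two \uN escapes (a surrogate pair).
            raw = ch.encode("utf-16-le")
            for i in range(0, len(raw), 2):
                unit = int.from_bytes(raw[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536  # \uN takes a signed 16-bit value
                out.append(f"\\u{unit}?")
    return "".join(out)


# One linedraw character (U+2500) in an otherwise ASCII paragraph.
paragraph = "a" * 999 + "\u2500"
escaped = rtf_escape(paragraph)
print(len(paragraph), len(escaped))
```

Under these assumptions only the one linedraw character grows, which would
match the observation in the quoted message that the rest of the RTF text
stays plain ASCII.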

I believe the problem you are seeing has to do with the limitations of
compression techniques more than anything else (text that is not all in the
same range will not have the kinds of similarities that make compression
effective, perhaps?).
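
To make that guess concrete, here is a small sketch under a loudly stated
assumption: suppose a paragraph is stored at one byte per character when
every character fits an 8-bit code page, and at two bytes per character
otherwise. This rule, the stored_size helper, and the choice of cp1252 are
all hypothetical, picked only to show how one out-of-range character could
double the size of a 1000-character paragraph.

```python
def stored_size(paragraph: str, codepage: str = "cp1252") -> int:
    """Bytes needed under a HYPOTHETICAL rule: use the 8-bit code page
    if the whole paragraph fits, else UTF-16 for the whole paragraph.
    This is not Word's documented behavior."""
    try:
        return len(paragraph.encode(codepage))
    except UnicodeEncodeError:
        # At least one character is outside the code page, so the
        # entire paragraph falls back to two bytes per code unit.
        return len(paragraph.encode("utf-16-le"))


plain = "a" * 1000
mixed = "a" * 999 + "\u2500"  # U+2500 BOX DRAWINGS LIGHT HORIZONTAL

print(stored_size(plain))  # 1000
print(stored_size(mixed))  # 2000
```

If something like this is going on, the doubling would follow from the
all-or-nothing fallback rather than from the cost of the one extra
character.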

But the actual internal storage is not documented and of course is entirely
subject to change. :-)

No matter what, if space is your main concern, then Word is really not your
ideal tool. I mean, I love Word to death (and am writing another book, 100%
in Word 2000 SA edition!) but not for its ability to be small in memory or
on disk.


a new book on internationalization in VB at

----- Original Message -----
From: "Dembek, Raymond F" <>
To: "Unicode List" <>
Cc: "Winkler, Arnold F" <>
Sent: Tuesday, December 19, 2000 5:48 AM
Subject: Effect on file size when using extended fonts in Word 2000

> Does anyone know the implications on file size when you add characters
> above 255 to a Word 2000 document on Windows 95/98/ME? Will this double the
> size of the paragraphs that contain these characters?
> I am primarily concerned with adding linedraw characters to paragraphs
> in Courier New.
> We are getting some disproportionate increases in file size.
> For example, when one of the characters in each 1000-character paragraph is
> replaced with a character outside the lower 255, the file size doubles.
> Does Word 2000 store a paragraph as two-byte characters if one of the
> characters in it is a double-byte character?
> When I look at the RTF version of such a file it seems that only the
> characters that need two bytes get special coding and at least the lower
> ones are all coded as ASCII.
> Please forgive the imprecise terminology in the above. This is still a
> confusing area for me.
> Regards and thanks,
> Ray Dembek
> Raymond F. Dembek
> Unisys Corp. - Michigan
> Voice: 1-248-661-9302
> "We e-eat, e-sleep and e-drink this e-stuff."
