Re: What do you think about storing encoded text in db?

From: J.Schneider@epixtech.com
Date: Mon Jul 10 2000 - 16:26:53 EDT


I have spent many years working with SQL Server in a similar situation. We
did not chose to store the record as a BLOB because we needed to be able to
create adhoc SQL queries to search through the text. Our text is stored in
a character encoding called MARC8, which is used by libraries for
bibliographic information. The first 127 bytes are the same as ASCII so a
lot of queries will return 'readable' data.

The problems we have faced are numerous. One in particular that has been
annoying is that SQL Server attempts to translate CHAR data from the
codepage of the server to the codepage of the client. Since our data
really isn't in the codepage of the server, that just corrupts our data.

We have wondered many times if our lives would have been simpler with BLOB
storage and have even considered changing, but we always come back to the
decision to stay where we are at. We've overcome the problems and 3rd
party tools can be used with our databases. If we had chosen BLOB format,
the storage requirements would increase significantly, data transfer may be
affected (unsure which way), and adhoc SQL queries on the data could not be
used.

Good luck,
Wayne

                                                                                                                       
                    Leon Spencer
                    <Leon.Spencer@brigh To: "Unicode List" <unicode@unicode.org>
                    tware.com> cc:
                                               Subject: What do you think about storing encoded text in db?
                    07/10/00 05:56 AM
                                                                                                                       
                                                                                                                       

In an earlier question posted to this mailing,
I asked about the best way to handle i18n in
e-mail software.

I have several clients that contact to a database
to get their messages. I'm considering storing their
message as encoded text (i.e. an encoding support
by MIME spec) along with a column for the charset.

This way when a user reads their messages, the following
will occur:
1. The user's e-mail client can check the charset column
   in the db to see if the charset for that particular
   message is supported.

2. The user's e-mail client will decode (MIME decode) the
   message text before displaying it.

Pro: Since MIME encoded text is always represented as 7/8-bit
     ASCII, I'm always sure to have my text store in the database
     without worrying about charset support.

Con: I think there's a limit to the size of text stored in a column.
     Is it better to stick with a BLOB?

Has anyone tried this?

Thanks.
      Leon



This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:05 EDT