From: Timothy Partridge (email@example.com)
Date: Wed Nov 26 2003 - 16:17:14 EST
Peter Kirk wrote:
> As there hasn't been a rush of on-list responses to this one, and partly
> in reply to the one off-list response, let me clarify the issue I
> have in mind.
> Instance A of a program P, version X, writes a Unicode character string
> S, in a particular normalisation form, to a storage medium Z. Some time
> later (maybe seconds, maybe years) instance B of version Y of that same
> program P reads that string from the same storage medium. For the
> purposes of Unicode conformance, are instances A and B to be considered
> one process or separate processes?
I would say a process is something that carries out some sort of task on
data. Typically data both comes in and goes out. The data might come from,
or go to, the outside world or a data store.
> Conformance clause C9 states that "no process can assume that another
> process will make a distinction between two different, but
> canonical-equivalent character sequences", which implies that no process
> can assume that another process has correctly normalised any character
> sequence. So, if instances A and B are considered separate processes, B
> is not permitted to assume that the string S has been correctly
> normalised - even if in fact it is known that all strings on medium Z
> have been written by program P and that all versions of program P write
> strings in a particular normalisation form.
I would consider A and B to be different versions of the same process. I
read the word "assume" to mean making an assumption without definite knowledge.
If process B *knows* something is true, it can exploit that knowledge. If, on
the other hand, it is receiving data from a process outside its control
(owned by a third party perhaps), then it can't guess that the data have any
particular characteristics. It is common for a process to be composed of
sub-processes. If they can't exploit their knowledge of one another then you
have serious problems. To take an extreme case, how could you call a
normalisation process if you couldn't rely on it returning normalised data?
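
To make the distinction concrete, here is a minimal sketch in Python. It
assumes, purely for illustration, that every version of program P writes NFC.
The function names and the FORM constant are hypothetical and are not taken
from any real program; only unicodedata.normalize is a real library call.

```python
import unicodedata

# Assumption for illustration: every version of program P writes NFC.
FORM = "NFC"

def read_from_own_store(s: str) -> str:
    """Data known to have been written by P: exploit that knowledge."""
    return s

def read_from_third_party(s: str) -> str:
    """Data from a process outside our control: normalise before use."""
    return unicodedata.normalize(FORM, s)

composed = "\u00e9"      # e-acute as a single code point
decomposed = "e\u0301"   # e followed by combining acute accent

# The two sequences are canonically equivalent but not identical.
same_code_points = composed == decomposed
same_after_normalising = read_from_third_party(decomposed) == composed
```

The point is only that the re-normalisation step belongs at the boundary with
processes you do not control, not between sub-processes whose behaviour you
know.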
> Also, can the storage medium Z be considered a process?
No, it is a data store.
> Or can low-level
> transformations of the data, e.g. defragmentation, backup and
> compression, which are invisible to the program P be considered
> processes? If so, these processes are permitted to transform S into a
> canonically equivalent form; and so instance B of program P is not
> permitted to assume that the string it reads from Z is in the same
> normalisation form as the string written by instance A.
At some point your system will make use of a data store. It is entitled to
assume that what it gets out of the store is what was stored into it. The
operating system might make invisible compressions or duplications, but the
system using the data store is oblivious to them. If the operating system
doesn't return what was put in, then the change doesn't qualify as
*invisible*. I would expect the operating system documentation to make very
clear if the storage routines don't return what you gave them in the first place.
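
As a rough illustration of the difference, the following Python sketch
compares two made-up stores. ExactStore and RenormalisingStore are
hypothetical classes invented for this example, not any real operating system
interface. The first returns exactly what was stored; the second silently
rewrites strings to NFD, which is canonically equivalent but no longer an
invisible change.

```python
import unicodedata

class ExactStore:
    """Hypothetical store that returns exactly the text it was given."""
    def __init__(self):
        self._data = {}

    def put(self, key: str, text: str) -> None:
        self._data[key] = text.encode("utf-8")

    def get(self, key: str) -> str:
        return self._data[key].decode("utf-8")

class RenormalisingStore(ExactStore):
    """Hypothetical store that rewrites text to NFD on the way in."""
    def put(self, key: str, text: str) -> None:
        super().put(key, unicodedata.normalize("NFD", text))

def canonically_equivalent(a: str, b: str) -> bool:
    # Two strings are canonically equivalent if their NFD forms are equal.
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

s = "caf\u00e9"  # written by instance A in NFC

exact = ExactStore()
exact.put("k", s)
exact_round_trip = exact.get("k")

renorm = RenormalisingStore()
renorm.put("k", s)
renorm_round_trip = renorm.get("k")
```

With the first store, instance B gets back the same code points that instance
A wrote. With the second it gets an equivalent string in a different
normalisation form, which is exactly the behaviour I would expect the
documentation to spell out.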
-- Tim Partridge. Any opinions expressed are mine only and not those of my employer