Abstract:
This thesis is motivated by the latest work related to differential compression algorithms as it appears in the work of Ajtai, et al. - 2002. In particular, we pay special attention to delta encoding algorithms that achieve got compression in linear time and constant space. This is important because previous work in this area uses either quadratic time and constant space or linear time and linear space, which is unacceptable for large inputs. In delta encoding, the algorithm reads two different copies of the same file as input, termed the reference copy and the version copy. The output of the algorithm is a sequence of Add/copy commands that reconstructs the version copy in the presence of the reference copy. Such algorithms have been recommended to be integrated into the http protocol. The idea is to reduce the data transfer time for text and http objects in order to decrease the latency of loading updated web pages. Also, in a client server system, clients may perform delta encoding to exchange delta encodings with a server instead of exchanging whole files. This reduces the time needed to perform the backup and reduces the storage required at the backup server. In the literature, the evaluation of delta encoding algorithms depends on three metrics: the running time of the algorithm, the space it uses, and the compression results it achieves. In this thesis we build a test-bed for delta encoding algorithms that accommodates most of the previous work described in the literature. Through intensive experimentations we are able to recommend a hybrid algorithm that forms a good compromise among the existing methods.
Description:
M.S. -- Faculty of Natural and Applied Sciences, Notre Dame University, Louaize, 2004; "A thesis submitted in partial fulfillment of the requirements for the degree of Master of science in computer science, Department of Computer Science, Faculty of Natural and Applied Sciences"; Includes bibliographical references (leaves 42-43).