On Recurrence Time
LAURILA, KIRSTI (2005)
Matematiikka - Mathematics
Informaatiotieteiden tiedekunta - Faculty of Information Sciences
This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.
Date of approval
2005-04-19
The permanent address of the publication is
https://urn.fi/urn:nbn:fi:uta-1-14593
Abstract
The amount of data in the world grows continuously, so efficient methods of data compression are needed. Many different compression algorithms exist. One class is the Lempel-Ziv algorithms, which are closely connected to the recurrence time of a sequence. The recurrence time of a sequence is the number of characters between the start of the sequence and its next occurrence. Recurrence time has many mathematical properties, which are examined in the thesis. In particular, the Recurrence time theorem is proved. This theorem provides the basis for using recurrence time as an efficient aid in data compression.
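The definition of recurrence time above can be illustrated with a minimal sketch. It assumes the common formalisation in which the recurrence time of the first n characters is the smallest shift m >= 1 at which that block appears again; the function name `recurrence_time` and the example strings are hypothetical and are not taken from the thesis.

```python
def recurrence_time(seq: str, n: int):
    """Return the smallest shift m >= 1 such that seq[m:m+n] == seq[:n].

    Returns None when the initial block of length n does not occur again
    in the finite sequence. This follows an assumed, commonly used
    definition; the thesis may state it slightly differently.
    """
    if n <= 0 or n > len(seq):
        raise ValueError("n must satisfy 1 <= n <= len(seq)")
    block = seq[:n]
    # str.find returns the lowest index >= 1 where the block starts,
    # or -1 if there is none; overlapping occurrences are allowed.
    pos = seq.find(block, 1)
    return pos if pos != -1 else None


if __name__ == "__main__":
    print(recurrence_time("ACGTACGA", 3))
    print(recurrence_time("AAAA", 2))
    print(recurrence_time("ACGT", 2))
```

In this sketch the shift counts how many characters lie between the start of the sequence and the start of the next occurrence of the block, which matches the verbal definition given in the abstract.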
Data compression relies on codes, so the properties of codes and their use in different cases are also studied. Packings of intervals of integers are important tools for studying these properties. A packing of an interval of integers is a sufficiently large set of numbers inside the interval.
A special application field of data compression is biological sequences, including DNA sequences. The thesis therefore studies recurrence times of DNA sequences experimentally, using human chromosome 22 as the DNA sequence. The recurrence times of DNA sequences are also estimated on the basis of the theorems proved in the thesis. Finally, the experimental recurrence times are compared with the calculated ones, and good agreement is found in general.
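The comparison between experimental and calculated recurrence times can be sketched roughly as follows. This is an illustration under stated assumptions, not the procedure used in the thesis: it assumes the Recurrence time theorem takes its usual asymptotic form, in which log2 of the recurrence time of the first n characters grows like n times the entropy of the source, and it assumes a simple independent-letter model whose entropy is estimated from letter frequencies. The function names and the toy sequence are hypothetical.

```python
import math
from collections import Counter


def empirical_entropy_bits(seq: str) -> float:
    """Entropy in bits per character of the letter frequencies in seq.

    Assumption: letters are treated as independent draws, which is a
    simplification for real DNA.
    """
    counts = Counter(seq)
    total = len(seq)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def compare_recurrence(seq: str, n: int):
    """Return (observed, predicted) values of log2 of the recurrence time.

    observed  : log2 of the smallest shift m >= 1 at which seq[:n] recurs
    predicted : n * H, with H the empirical letter entropy (assumed model)
    Returns None when the initial block does not recur in seq.
    """
    r = seq.find(seq[:n], 1)
    if r == -1:
        return None
    return math.log2(r), n * empirical_entropy_bits(seq)


if __name__ == "__main__":
    # Toy periodic sequence, not real chromosome data.
    print(compare_recurrence("ACGTACGT", 2))
```

On a short or highly repetitive toy string the two values can differ substantially, since the asymptotic statement only concerns long blocks from long sequences.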