An Impact of Static Dictionary on Text Compression

Publication at Faculty of Mathematics and Physics | 2007

Abstract

When compressing a huge collection of small pieces of text (e-mails, news items, newspaper articles, etc.), standard compression techniques are less efficient than they are on larger documents. We looked for modifications or settings of existing text-compression algorithms that make them work better in environments managing small text files, e.g. WWW search engines. When properly initialized, even algorithms originally intended for large or medium-sized files, such as word- or syllable-based compression algorithms, can be used effectively on collections of small files.

Because dictionary initialization has only an insignificant effect on the compression ratio for large and very large files, the developers of applications compressing such files set the initial dictionary to empty.
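
The effect of a pre-initialized dictionary on a small file can be illustrated with a minimal sketch. It does not use the word- or syllable-based algorithms discussed in the abstract; it substitutes DEFLATE with a preset dictionary, as exposed by Python's standard `zlib` module. The dictionary contents, the sample message, and the function names are hypothetical and chosen only for illustration.

```python
import zlib

# Hypothetical static dictionary: phrases assumed to recur across the
# collection of small texts (here, e-mail boilerplate).
STATIC_DICT = (
    b"Dear colleagues, Best regards, meeting tomorrow at "
    b"please find attached the report"
)


def compress_plain(text: bytes) -> bytes:
    """Compress starting from an empty dictionary."""
    return zlib.compress(text, 9)


def compress_with_dict(text: bytes, zdict: bytes = STATIC_DICT) -> bytes:
    """Compress with the dictionary pre-initialized from zdict."""
    compressor = zlib.compressobj(level=9, zdict=zdict)
    return compressor.compress(text) + compressor.flush()


def decompress_with_dict(blob: bytes, zdict: bytes = STATIC_DICT) -> bytes:
    """Decompress; the same dictionary must be supplied again."""
    decompressor = zlib.decompressobj(zdict=zdict)
    return decompressor.decompress(blob) + decompressor.flush()


if __name__ == "__main__":
    message = b"Dear colleagues, please find attached the report. Best regards, Jan"
    print("original bytes:       ", len(message))
    print("empty dictionary:     ", len(compress_plain(message)))
    print("initialized dictionary:", len(compress_with_dict(message)))
```

On a short message that shares phrases with the dictionary, the initialized variant should produce noticeably fewer bytes than the empty-dictionary variant. The dictionary itself has to be stored once and shared by the compressor and the decompressor, so the benefit only pays off across a collection of many small files.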