CompSci 267 Homework set #4
- In LZSS (the version of LZ77 due to Storer and Szymanski),
a short match can be represented by either (F,P,L)
(flag,pointer,length) or by (F,C) (flag,character).
If the window length is W=4096 and the maximum match length is M=256,
what is the shortest match that one would represent as a match rather
than as uncompressed characters?
- Analyze the LZW compression of the string "aaaa..."
(a single repeated character) for an input length of 1 million characters.
- What is the longest string that can be retrieved from the LZW
dictionary during decoding when the input text has length 1 billion?
- Assume a two-symbol alphabet with the symbols {a, b}.
Show the first 15 dictionary entries for the LZW encoding
of the string: ababababababab...
- In BWT, the last column, L, of the sorted matrix contains
concentrations of identical characters, which is why L is easy
to compress. However, the first column, F, of the same matrix
is even easier to compress since it contains runs,
not just concentrations. Why select column L and not column F?
- Apply BWT to the string S="sssssssssh":
calculate the string L and its MTF (move-to-front) encoding.
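
To check hand-worked answers to the LZW, BWT, and MTF problems above, a small Python sketch along the following lines may help. It is not a reference implementation. The function names (lzw_encode, bwt_last_column, mtf_encode) are hypothetical, and the code rests on several assumptions not stated in the problems: the LZW dictionary starts with only the single symbols of the alphabet and grows without bound, the BWT sorts plain cyclic rotations with no end-of-string marker, and MTF starts from the alphabet in the order given.

```python
def lzw_encode(text, alphabet):
    """Return (codes, dictionary) for a textbook LZW encoder.

    Assumptions: the dictionary is initialized with the single symbols of
    `alphabet` (codes 0, 1, ...), it never fills up or resets, and every
    character of `text` occurs in `alphabet`.
    """
    dictionary = {symbol: code for code, symbol in enumerate(alphabet)}
    codes = []
    current = ""
    for ch in text:
        if current + ch in dictionary:
            # Keep extending the current phrase while it is still known.
            current += ch
        else:
            # Emit the longest known phrase, then learn phrase + next symbol.
            codes.append(dictionary[current])
            dictionary[current + ch] = len(dictionary)
            current = ch
    if current:
        codes.append(dictionary[current])
    return codes, dictionary


def bwt_last_column(s):
    """Return column L of the sorted matrix of all cyclic rotations of s.

    Assumption: no end-of-string sentinel is appended.
    """
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(row[-1] for row in rotations)


def mtf_encode(text, alphabet):
    """Emit each symbol's current list position, then move it to the front.

    Assumption: the list starts as `alphabet`, in the order given.
    """
    symbols = list(alphabet)
    output = []
    for ch in text:
        index = symbols.index(ch)
        output.append(index)
        symbols.insert(0, symbols.pop(index))
    return output
```

As a sanity check on an input that is not part of this set, bwt_last_column("banana") gives "nnbaaa", and mtf_encode("nnbaaa", "abn") gives [2, 0, 2, 2, 0, 0]. The second value returned by lzw_encode is the dictionary, so sorting its items by code lists the entries in the order they were added.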