CompSci 267 Homework set #3

  1. In lecture we saw that if X has pmf {.4, .2, .2, .1, .1}, then all three size-5 compact codes are optimal for X.
     
  2. Binary-coded decimal (BCD) encodes 0 as 0000, 1 as 0001, and so on up to 9, coded as 1001, with the remaining 4-bit patterns unused. Consider a source that emits equally likely digits in the range 0-9; it has entropy lg 10. Now consider encoding analogously in blocks, i.e., coding all k-blocks of decimal digits by binary m-blocks. Prove that for suitable k and m you can get arbitrarily near the lower bound lg 10 on codeword length per decimal digit.
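     A quick numerical check of the claim (not a proof): the smallest m that works for a given k is the number of bits needed to index all 10^k blocks, and m/k approaches lg 10 as k grows. The function name below is my own.

```python
def bits_per_digit(k):
    """Smallest m with 2**m >= 10**k (so every k-digit block gets a distinct
    m-bit codeword), divided by k. Equivalent to ceil(k * lg 10) / k, computed
    exactly with integer arithmetic via bit_length."""
    m = (10 ** k - 1).bit_length()
    assert 2 ** m >= 10 ** k  # the mapping from k-blocks to m-blocks is injective
    return m / k

# bits_per_digit(1) is 4.0 -- ordinary BCD -- and the rate falls toward
# lg 10 ~ 3.3219 as the block length k grows:
for k in (1, 2, 5, 10, 100, 1000):
    print(k, bits_per_digit(k))
```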
     
  3. Three information sources with alphabet {a,b,c,d,e,f,g,h} are characterized by the probabilities in the following table. For each source, find a binary, ternary, and quaternary (four code symbols) Huffman code, and calculate its average length.
                   a      b      c      d      e      f      g      h
     source 1     1/8    1/8    1/8    1/8    1/8    1/8    1/8    1/8
     source 2     .1     .2     .1     .3     .05    .1     .05    .1
     source 3     .15    .15    .15    .15    .1     .1     .1     .1
    

     
  4. Apply the adaptive Huffman method to the 11-symbol string "abracadabra". The encoder and decoder both know that the universe of allowable source symbols is the 26 lowercase alphabetic characters. For each input symbol, show the output, the tree after the symbol has been added to it, and the tree after rearrangement (if necessary).
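     The problem wants the incremental tree updates shown by hand. As a cross-check for your trees' codeword lengths, here is a brute-force adaptive sketch that simply rebuilds a Huffman tree from the running counts before each symbol; it is not the FGK/Vitter incremental update, and it assumes every one of the 26 symbols starts with count 1 (your lecture's variant may instead use a not-yet-transmitted escape node, which gives different output).

```python
import heapq
import itertools
import string

def huffman_code(freqs):
    """Binary Huffman code for a dict symbol -> weight; returns symbol -> bitstring."""
    tie = itertools.count()          # tie-breaker so dicts are never compared
    heap = [(w, next(tie), {s: ""}) for s, w in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, next(tie), merged))
    return heap[0][2]

def adaptive_encode(text, alphabet):
    counts = {s: 1 for s in alphabet}   # assumed initialization: all counts start at 1
    out = []
    for ch in text:
        out.append(huffman_code(counts)[ch])  # code ch with the current model...
        counts[ch] += 1                       # ...then update the model
    return "".join(out)

bits = adaptive_encode("abracadabra", string.ascii_lowercase)
print(len(bits), "bits for 11 symbols")
```

     Note how repeated symbols ("a" in particular) get shorter codewords as their counts grow; that is the effect the by-hand tree rearrangements should exhibit too.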
     
  5. Generate a binary sequence of length L with Prob(0) = 0.8, and use arithmetic coding to encode it. Plot the difference between the rate in bits/symbol and the entropy as a function of L. Comment on the effect of L on the rate.
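     One way to set up the experiment (plotting omitted; swap the prints for matplotlib calls). The sketch tracks the coding interval with exact rationals and uses the standard bound that ceil(-lg width) + 1 bits suffice to identify a point inside the final interval; your lecture's coder may use a different termination rule, which changes the rate only by O(1/L).

```python
import math
import random
from fractions import Fraction

def arithmetic_bits(seq, p0=Fraction(4, 5)):
    """Code length (in bits) for a 0/1 sequence under arithmetic coding with
    P(0) = p0: track [low, low + width) exactly, then spend ceil(-lg width) + 1
    bits to pin down a point in the final interval."""
    low, width = Fraction(0), Fraction(1)
    for b in seq:
        if b == 0:
            width *= p0                 # 0 takes the left part of the interval
        else:
            low += width * p0           # 1 takes the right part
            width *= 1 - p0
    # -lg(width) computed from the exact fraction, avoiding float underflow
    neg_log = math.log2(width.denominator) - math.log2(width.numerator)
    return math.ceil(neg_log) + 1

H = -(0.8 * math.log2(0.8) + 0.2 * math.log2(0.2))   # entropy, ~0.7219 bits/symbol
random.seed(1)
for L in (10, 100, 1000):
    seq = [0 if random.random() < 0.8 else 1 for _ in range(L)]
    rate = arithmetic_bits(seq) / L
    print(L, round(rate, 4), round(rate - H, 4))
```

     Two effects show up in the gap: the fixed few-bit termination overhead, which shrinks like O(1/L), and the random fluctuation of the empirical symbol frequencies, which shrinks like O(1/sqrt(L)).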
     
  6. Assume a 2-symbol input alphabet {a, b} with probabilities {.7, .3}, respectively.