Approximation Algorithms for Probabilistic Reasoning: Sampling and Iterative Inference
Bozhena Bidyuk
The complexity of the exact inference increases exponentially with size and complexity of the network. As a result, the exact inference methods become impractical for large networks and we seek to approximate the results. A variety of approximation methods exist. This research focuses on two approximation methods for finding posterior marginals P(xije) in Bayesian networks: iterative belief updating (defined by Pearl [Pearl 1988]) and sampling.
The belief updating is an exact inference method for singly-connected networks. It can be applied to loopy networks to obtain approximate answers. The algorithm is based on message passing: in some order, each node computes and sends messages to its neighbors incorporating the latest messages it recieved. In a singly-connected network, we can order nodes so that it will be sufficient for each node to pass one messages in each direction. In a loopy network, the nodes compute several iterations of messages to achieve convergence (or to demonstrate the lack of convergence). Thus, belief updating in loopy networks is often referred to as Iterative Belief Propagation or IBP. Although IBP generally computes only approximate answers, it is known to perform extremely well in several special classes of networks such as coding networks and noisy-or networks. At the same time, we know that in some instances IBP does not converge or generates approximate answers far from correct. Currently, we do not have any methodology that would allow us in general case to predict the convergence of IBP or provide some practical error bounds on the approximate marginals it computes. In this research work, we examine the influence of the -cutset criteria on the convergence and quality of approximate marginals computed by IBP. We conjecture the -cutset (defined as a cycle-cutset with extreme posterior marginals) has effect similar to an observed cycle-cutset which breaks the loops and leaves the network singly-connected. We prove that the conjecture is true for Bayesian networks without evidence and show that the error in the approximate marginals computed by IBP converges to 0 as  tends to 0. We provide empirical support for instances of Bayesian networks with evidence.
The idea behind the sampling methods for Bayesian networks is to generate a set of samples (where a sample in a vector space X = fX1; :::;XNg is just an assignment of values to the elements of vector X) and then estimate the posterior marginals of interest from samples. In general, the quality of the approximate answers depends primarily on the number of samples generated and the approximate values converge to the exact values as number of samples increases. However, the sampling variance increases with the size of the sampling space. In this research work, we focus on the the variance reduction techniques on the example of the Gibbs sampler for Bayesian networks. It is obvious that we can achieve the reduction in variance by sampling only a subset of variables. However, the implication is that we have to carry out a lot more analytical computations which may render the whole approach impractical. We demonstrate that we can reduce sampling space efficiently if we take into consideration the underlying network structure. The time/space complexity of the exact inference in Bayesian networks is exponential in the induced width of the graph. In our sampling scheme, called w-cutset sampling, we sample a subset of variables (called a cutset) that is carefully chosen to reduce the complexity of the graph bounded by the induced width w. We analyze the problem of finding an optimal w-cutset of a graph (NP-hard in general case) and provide a heuristic algorithm for finding w-cutset in practice. We show empirically that w-cutset sampling typically finds better approximate answers than standard Gibbs sampler for a range of w values although its performance eventually deteriorates as w increases.