indii.org / research
http://www.indii.org/
Lawrence Murray: software, research, photography.en-usSun, 16 Feb 2020 16:40:12 -0800Sun, 16 Feb 2020 16:40:12 -0800Lazy object copy as a platform for population-based probabilistic programming
http://www.indii.org/research/lazy-object-copy/index.html
Sat, 01 Feb 2020 00:00:00 -0800Lawrence Murrayhttp://www.indii.org/research/lazy-object-copy/index.htmlThis work considers dynamic memory management for population-based probabilistic programs, such as those using particle methods for inference. Such programs exhibit a pattern of allocating, copying, potentially mutating, and deallocating collections of similar objects through successive generations. These objects may assemble data structures such as stacks, queues, lists, ragged arrays, and trees, which may be of random, and possibly unbounded, size. For the simple case of N particles, T generations, D objects, and resampling at each generation, dense representation requires $O(DNT)$ memory, while sparse representation requires only $O(DT+DN \log DN)$ memory, based on existing theoretical results. This work describes an object copy-on-write platform to automate this saving for the programmer. The core idea is formalized using labeled directed multigraphs, where vertices represent objects, edges the pointers between them, and labels the necessary bookkeeping. A specific labeling scheme is proposed for high performance under the motivating pattern. The platform is implemented for the Birch probabilistic programming language, using smart pointers, hash tables, and reference-counting garbage collection. It is tested empirically on a number of realistic probabilistic programs, and shown to significantly reduce memory use and execution time in a manner consistent with theoretical expectations. This enables copy-on-write for the imperative programmer, lazy deep copies for the object-oriented programmer, and in-place write optimizations for the functional programmer.
Parameter elimination in particle Gibbs sampling
http://www.indii.org/research/parameter-elimination-in-particle-gibbs-sampling/index.html
Sat, 09 Nov 2019 00:00:00 -0800Lawrence Murrayhttp://www.indii.org/research/parameter-elimination-in-particle-gibbs-sampling/index.htmlBayesian inference in state-space models is challenging due to high-dimensional state trajectories. A viable approach is particle Markov chain Monte Carlo, combining MCMC and sequential Monte Carlo to form “exact approximations” to otherwise intractable MCMC methods. The performance of the approximation is limited to that of the exact method. We focus on particle Gibbs and particle Gibbs with ancestor sampling, improving their performance beyond that of the underlying Gibbs sampler (which they approximate) by marginalizing out one or more parameters. This is possible when the parameter prior is conjugate to the complete data likelihood. Marginalization yields a non-Markovian model for inference, but we show that, in contrast to the general case, this method still scales linearly in time. While marginalization can be cumbersome to implement, recent advances in probabilistic programming have enabled its automation. We demonstrate how the marginalized methods are viable as efficient inference backends in probabilistic programming, and demonstrate with examples in ecology and epidemiology.
Probabilistic programming for birth-death models of evolution using an alive particle filter with delayed sampling
http://www.indii.org/research/probabilistic-programming-for-birth-death-models-of-evolution/index.html
Sun, 14 Jul 2019 00:00:00 -0700Lawrence Murrayhttp://www.indii.org/research/probabilistic-programming-for-birth-death-models-of-evolution/index.htmlWe consider probabilistic programming for birth-death models of evolution and introduce a new widely-applicable inference method that combines an extension of the alive particle filter (APF) with automatic Rao-Blackwellization via delayed sampling. Birth-death models of evolution are an important family of phylogenetic models of the diversification processes that lead to evolutionary trees. Probabilistic programming languages (PPLs) give phylogeneticists a new and exciting tool: their models can be implemented as probabilistic programs with just a basic knowledge of programming. The general inference methods in PPLs reduce the need for external experts, allow quick prototyping and testing, and accelerate the development and deployment of new models. We show how these birth-death models can be implemented as simple programs in existing PPLs, and demonstrate the usefulness of the proposed inference method for such models. For the popular BiSSE model the method yields an increase of the effective sample size and the conditional acceptance rate by a factor of 30 in comparison with a standard bootstrap particle filter. Although concentrating on phylogenetics, the extended APF is a general inference method that shows its strength in situations where particles are often assigned zero weight. In the case when the weights are always positive, the extra cost of using the APF rather than the bootstrap particle filter is negligible, making our method a suitable drop-in replacement for the bootstrap particle filter in probabilistic programming inference.
Automated learning with a probabilistic programming language: Birch
http://www.indii.org/research/automated-learning-with-a-probabilistic-programming-language-birch/index.html
Sat, 13 Oct 2018 00:00:00 -0700Lawrence Murrayhttp://www.indii.org/research/automated-learning-with-a-probabilistic-programming-language-birch/index.htmlThis work offers a broad perspective on probabilistic modeling and inference in light of recent advances in probabilistic programming, in which models are formally expressed in Turing-complete programming languages. We consider a typical workflow and how probabilistic programming languages can help to automate this workflow, especially in the matching of models with inference methods. We focus on two properties of a model that are critical in this matching: its structure—the conditional dependencies between random variables—and its form—the precise mathematical definition of those dependencies. While the structure and form of a probabilistic model are often fixed a priori, it is a curiosity of probabilistic programming that they need not be, and may instead vary according to random choices made during program execution. We introduce a formal description of models expressed as programs, and discuss some of the ways in which probabilistic programming languages can reveal the structure and form of these, in order to tailor inference methods. We demonstrate the ideas with a new probabilistic programming language called Birch, with a multiple object tracking example.
Birch
http://www.indii.org/http:/www.birch-lang.org
Sat, 24 Mar 2018 00:00:00 -0700Lawrence Murrayhttp://www.indii.orghttp://www.birch-lang.orgAn object-oriented, universal probabilistic programming language.Improving the particle filter for high-dimensional problems using artificial process noise
http://www.indii.org/research/improving-the-particle-filter-for-high-dimensional-problems/index.html
Thu, 01 Mar 2018 00:00:00 -0800Lawrence Murrayhttp://www.indii.org/research/improving-the-particle-filter-for-high-dimensional-problems/index.htmlThe particle filter is one of the most successful methods for state inference and identification of general non-linear and non-Gaussian models. However, standard particle filters suffer from degeneracy of the particle weights for high-dimensional problems. We propose a method for improving the performance of the particle filter for certain challenging state space models, with implications for high-dimensional inference. First we approximate the model by adding artificial process noise in an additional state update, then we design a proposal that combines the standard and the locally optimal proposal. This results in a bias-variance trade-off, where adding more noise reduces the variance of the estimate but increases the model bias. The performance of the proposed method is evaluated on a linear Gaussian state space model and on the non-linear Lorenz’96 model. For both models we observe a significant improvement in performance over the standard particle filter.
Delayed Sampling and Automatic Rao-Blackwellization of Probabilistic Programs
http://www.indii.org/research/delayed-sampling-and-automatic-rao-blackwellization-of-probabilistic-programs/index.html
Fri, 22 Dec 2017 00:00:00 -0800Lawrence Murrayhttp://www.indii.org/research/delayed-sampling-and-automatic-rao-blackwellization-of-probabilistic-programs/index.htmlWe introduce a dynamic mechanism for the solution of analytically-tractable substructure in probabilistic programs, to reduce variance in Monte Carlo estimators. For inference with Sequential Monte Carlo, it yields improvements such as locally-optimal proposals and Rao-Blackwellization, with little modification to model code necessary. A directed graph is maintained alongside the running program, evolving dynamically as the program triggers operations upon it. Nodes of the graph represent random variables, and edges the analytically-tractable relationships between them (e.g. conjugate priors and affine transformations). Each random variable is held in the graph for as long as possible, sampled only when used by the program in a context that cannot be resolved analytically. This allows it to be analytically conditioned on as many observations as possible before sampling. We have implemented the approach in both Anglican and a new probabilistic programming language named Birch. We demonstrate it on a number of small examples, and a larger mixed linear-nonlinear state-space model.
Better together? Statistical learning in models made of modules
http://www.indii.org/research/better-together-statistical-learning-in-models-made-of-modules/index.html
Tue, 29 Aug 2017 00:00:00 -0700Lawrence Murrayhttp://www.indii.org/research/better-together-statistical-learning-in-models-made-of-modules/index.htmlIn modern applications, statisticians are faced with integrating heterogeneous data modalities relevant for an inference, prediction, or decision problem. In such circumstances, it is convenient to use a graphical model to represent the statistical dependencies, via a set of connected “modules”, each relating to a specific data modality, and drawing on specific domain expertise in their development. In principle, given data, the conventional statistical update then allows for coherent uncertainty quantification and information propagation through and across the modules. However, misspecification of any module can contaminate the estimate and update of others, often in unpredictable ways. In various settings, particularly when certain modules are trusted more than others, practitioners have preferred to avoid learning with the full model in favor of approaches that restrict the information propagation between modules, for example by restricting propagation to only particular directions along the edges of the graph. In this article, we investigate why these modular approaches might be preferable to the full model in misspecified settings. We propose principled criteria to choose between modular and full-model approaches. The question arises in many applied settings, including large stochastic dynamical systems, meta-analysis, epidemiological models, air pollution models, pharmacokinetics-pharmacodynamics, and causal inference with propensity scores.
Probabilistic learning of nonlinear dynamical systems using sequential Monte Carlo
http://www.indii.org/research/probabilistic-learning-of-nonlinear-dynamical-systems-using-sequential-monte-carlo/index.html
Tue, 07 Mar 2017 00:00:00 -0800Lawrence Murrayhttp://www.indii.org/research/probabilistic-learning-of-nonlinear-dynamical-systems-using-sequential-monte-carlo/index.htmlProbabilistic modeling provides the capability to represent and manipulate uncertainty in data, models, predictions and decisions. We are concerned with the problem of learning probabilistic models of dynamical systems from measured data. Specifically, we consider learning of probabilistic nonlinear state-space models. There is no closed-form solution available for this problem, implying that we are forced to use approximations. In this tutorial we will provide a self-contained introduction to one of the state-of-the-art methods—the particle Metropolis–Hastings algorithm—which has proven to offer a practical approximation. This is a Monte Carlo based method, where the particle filter is used to guide a Markov chain Monte Carlo method through the parameter space. One of the key merits of the particle Metropolis–Hastings algorithm is that it is guaranteed to converge to the “true solution” under mild assumptions, despite being based on a particle filter with only a finite number of particles. We will also provide a motivating numerical example illustrating the method using a modeling language tailored for sequential Monte Carlo methods. The intention of modeling languages of this kind is to open up the power of sophisticated Monte Carlo methods—including particle Metropolis–Hastings—to a large group of users without requiring them to know all the underlying mathematical details.
Anytime Monte Carlo
http://www.indii.org/research/anytime-monte-carlo/index.html
Sat, 10 Dec 2016 00:00:00 -0800Lawrence Murrayhttp://www.indii.org/research/anytime-monte-carlo/index.htmlA Monte Carlo algorithm typically simulates some prescribed number of samples, taking some random real time to complete the computations necessary. This work considers the converse: to impose a real-time budget on the computation, so that the number of samples simulated is random. To complicate matters, the real time taken for each simulation may depend on the sample produced, so that the samples themselves are not independent of their number, and a length bias with respect to compute time is apparent. This is especially problematic when a Markov chain Monte Carlo (MCMC) algorithm is used and the final state of the Markov chain—rather than an average over all states—is required. The length bias does not diminish with the compute budget in this case. It occurs, for example, in sequential Monte Carlo (SMC) algorithms. We propose an anytime framework to address the concern, using a continuous-time Markov jump process to study the progress of the computation in real time. We show that the length bias can be eliminated for any MCMC algorithm by using a multiple chain construction. The utility of this construction is demonstrated on a large-scale SMC$^2$ implementation, using four billion particles distributed across a cluster of 128 graphics processing units on the Amazon EC2 service. The anytime framework imposes a real-time budget on the MCMC move steps within SMC$^2$, ensuring that all processors are simultaneously ready for the resampling step, demonstrably reducing wait times and providing substantial control over the total compute budget.