indii.org / research
http://www.indii.org/
Lawrence Murray: software, research, photography.
Last updated: Wed, 21 Jun 2017 18:51:46 +0200

Anytime Monte Carlo
http://www.indii.org/research/anytime-monte-carlo/index.html
Sat, 10 Dec 2016 · Lawrence Murray

A Monte Carlo algorithm typically simulates some prescribed number of samples, taking some random real time to complete the computations necessary. This work considers the converse: to impose a real-time budget on the computation, so that the number of samples simulated is random. To complicate matters, the real time taken for each simulation may depend on the sample produced, so that the samples themselves are not independent of their number, and a length bias with respect to compute time is apparent. This is especially problematic when a Markov chain Monte Carlo (MCMC) algorithm is used and the final state of the Markov chain—rather than an average over all states—is required. The length bias does not diminish with the compute budget in this case. It occurs, for example, in sequential Monte Carlo (SMC) algorithms. We propose an anytime framework to address the concern, using a continuous-time Markov jump process to study the progress of the computation in real time. We show that the length bias can be eliminated for any MCMC algorithm by using a multiple chain construction. The utility of this construction is demonstrated on a large-scale SMC$^2$ implementation, using four billion particles distributed across a cluster of 128 graphics processing units on the Amazon EC2 service. The anytime framework imposes a real-time budget on the MCMC move steps within SMC$^2$, ensuring that all processors are simultaneously ready for the resampling step, demonstrably reducing wait times and providing substantial control over the total compute budget.
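The multiple-chain construction can be sketched in a few lines. The Python below is an illustrative assumption, not the paper's implementation: it runs several chains round-robin under a wall-clock budget with a simple random-walk Metropolis kernel, and on expiry sets aside the chain whose move would be in progress, since that is the state carrying the length bias.

```python
import math
import random
import time

def anytime_chains(log_target, inits, propose, budget_s):
    """Round-robin Metropolis chains under a real-time budget.

    Hedged sketch of the multiple-chain idea: when the budget expires,
    the chain about to move is excluded and the remaining chains'
    states are returned, approximately free of length bias.
    """
    states = list(inits)
    k = 0
    deadline = time.monotonic() + budget_s
    while True:
        if time.monotonic() >= deadline:
            # Exclude the length-biased chain.
            return states[:k] + states[k + 1:]
        x = states[k]
        y = propose(x)
        # Metropolis accept/reject (exp of a non-positive number is safe).
        a = min(0.0, log_target(y) - log_target(x))
        if random.random() < math.exp(a):
            states[k] = y
        k = (k + 1) % len(states)
```

The number of iterations completed is random, as in the anytime setting: only the wall-clock budget is fixed.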
Sequential Monte Carlo with Highly Informative Observations
http://www.indii.org/research/sequential-monte-carlo-with-highly-informative-observations/index.html
Sun, 01 Mar 2015 · Lawrence Murray

We propose sequential Monte Carlo (SMC) methods for sampling the posterior distribution of state-space models under highly informative observation regimes, a situation in which standard SMC methods can perform poorly. A special case is simulating bridges between given initial and final values. The basic idea is to introduce a schedule of intermediate weighting and resampling times between observation times, which guide particles towards the final state. This can always be done for continuous-time models, and may be done for discrete-time models under sparse observation regimes; our main focus is on continuous-time diffusion processes. The methods are broadly applicable in that they support multivariate models with partial observation, do not require simulation of the backward transition (which is often unavailable), and, where possible, avoid pointwise evaluation of the forward transition. When simulating bridges, the last characteristic cannot be avoided entirely without concessions, and we suggest an $\epsilon$-ball approach (reminiscent of approximate Bayesian computation) as a workaround. Compared to the bootstrap particle filter, the new methods deliver substantially reduced mean squared error in normalizing constant estimates, even after accounting for execution time. The methods are demonstrated for state estimation with two toy examples, and for parameter estimation (within a particle marginal Metropolis–Hastings sampler) with three applied examples in econometrics, epidemiology, and marine biogeochemistry.
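The idea of intermediate weighting and resampling between observations can be illustrated on a toy one-dimensional random-walk bridge. Everything here is an illustrative assumption rather than the paper's construction: in particular the Gaussian guiding potential and its widening-with-time-remaining schedule.

```python
import math
import random

def bridge_smc(x0, y_T, n_steps, n_particles, sigma, guide_scale):
    """Guided SMC bridge for a 1-D Gaussian random walk (hedged sketch):
    intermediate weighting steps pull particles toward the known
    end-point y_T, and multinomial resampling keeps the cloud viable."""
    xs = [x0] * n_particles
    for k in range(1, n_steps + 1):
        # Propagate under the forward dynamics.
        xs = [x + random.gauss(0.0, sigma) for x in xs]
        # Intermediate weight: Gaussian potential around the end-point,
        # tightening as the final time approaches (an assumed schedule).
        remaining = n_steps - k + 1
        scale = guide_scale * math.sqrt(remaining)
        logw = [-0.5 * ((x - y_T) / scale) ** 2 for x in xs]
        # Normalise and resample (multinomial, for simplicity).
        m = max(logw)
        w = [math.exp(lw - m) for lw in logw]
        tot = sum(w)
        w = [wi / tot for wi in w]
        xs = random.choices(xs, weights=w, k=n_particles)
    return xs
```

A bootstrap filter would only weight at the final time and suffer severe degeneracy for a distant end-point; the intermediate schedule spreads that selection pressure over the whole interval.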
Bayesian State-Space Modelling on High-Performance Hardware Using LibBi
http://www.indii.org/research/bayesian-state-space-modelling-on-high-performance-hardware-using-libbi/index.html
Sun, 01 Feb 2015 · Lawrence Murray

LibBi is a software package for state-space modelling and Bayesian inference on modern computer hardware, including multi-core central processing units (CPUs), many-core graphics processing units (GPUs) and distributed-memory clusters of such devices. The software parses a domain-specific language for model specification, then optimises, generates, compiles and runs code for the given model, inference method and hardware platform. In presenting the software, this work serves as an introduction to state-space models and the specialised methods developed for Bayesian inference with them. The focus is on sequential Monte Carlo (SMC) methods such as the particle filter for state estimation, and the particle Markov chain Monte Carlo (PMCMC) and SMC$^2$ methods for parameter estimation. All are well-suited to current computer hardware. Two examples are given and developed throughout, one a linear three-element windkessel model of the human arterial system, the other a nonlinear Lorenz ’96 model. These are specified in the prescribed modelling language, and LibBi is demonstrated by performing inference with them. Empirical results are presented, including a performance comparison of the software with different hardware configurations.
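LibBi models are written in its own modelling language; as a language-neutral illustration of the state-estimation task described above, the sketch below is a plain-Python bootstrap particle filter for a stochastically forced Lorenz '96 model. The dimension, noise scales and observation model (noisy observation of the first component) are assumptions made for the example, not values from the paper.

```python
import math
import random

def lorenz96_step(x, F, dt, sq):
    """One Euler–Maruyama step of a stochastically forced Lorenz '96 model:
    dx_i = ((x_{i+1} - x_{i-2}) x_{i-1} - x_i + F) dt + noise."""
    n = len(x)
    return [x[i] + dt * ((x[(i + 1) % n] - x[(i - 2) % n]) * x[(i - 1) % n]
                         - x[i] + F) + sq * random.gauss(0.0, 1.0)
            for i in range(n)]

def bootstrap_filter(ys, n_particles, F=8.0, dt=0.01, sq=0.05,
                     obs_sd=0.5, dim=8):
    """Bootstrap particle filter (hedged sketch): propagate, weight against
    each observation of component 0, resample. Returns filtered means."""
    parts = [[random.gauss(0.0, 1.0) for _ in range(dim)]
             for _ in range(n_particles)]
    means = []
    for y in ys:
        parts = [lorenz96_step(p, F, dt, sq) for p in parts]
        logw = [-0.5 * ((p[0] - y) / obs_sd) ** 2 for p in parts]
        m = max(logw)
        w = [math.exp(lw - m) for lw in logw]
        tot = sum(w)
        means.append(sum(wi * p[0] for wi, p in zip(w, parts)) / tot)
        parts = random.choices(parts, weights=w, k=n_particles)
    return means
```

In LibBi itself the same structure (transition, observation, filter) is declared in the modelling language and the code generation targets CPU or GPU automatically.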
Parallel Resampling in the Particle Filter
http://www.indii.org/research/parallel-resampling-in-the-particle-filter/index.html
Thu, 01 Jan 2015 · Lawrence Murray

Modern parallel computing devices, such as the graphics processing unit (GPU), have gained significant traction in scientific and statistical computing. They are particularly well-suited to data-parallel algorithms such as the particle filter, or more generally Sequential Monte Carlo (SMC), which are increasingly used in statistical inference. SMC methods carry a set of weighted particles through repeated propagation, weighting and resampling steps. The propagation and weighting steps are straightforward to parallelise, as they require only independent operations on each particle. The resampling step is more difficult, as standard schemes require a collective operation, such as a sum, across particle weights. Focusing on this resampling step, we analyse two alternative schemes that do not involve a collective operation (Metropolis and rejection resamplers), and compare them to standard schemes (multinomial, stratified and systematic resamplers). We find that, in certain circumstances, the alternative resamplers can perform significantly faster on a GPU, and to a lesser extent on a CPU, than the standard approaches. Moreover, in single precision, the standard approaches are numerically biased for upwards of hundreds of thousands of particles, while the alternatives are not. This is particularly important given greater single- than double-precision throughput on modern devices, and the consequent temptation to use single precision with a greater number of particles. Finally, we provide auxiliary functions useful for implementation, such as for the permutation of ancestry vectors to enable in-place propagation. Supplementary materials are available online.
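Of the two collective-free schemes, the Metropolis resampler is the simplest to sketch. In the hedged Python below, each ancestor index is selected by a short Metropolis chain over particle indices, so only pairwise weight ratios are needed, the weights may remain unnormalised, and no sum over all weights is required; the iteration count B trades bias against speed.

```python
import random

def metropolis_resample(weights, B, rng=random):
    """Metropolis resampler (hedged sketch): for each of the n ancestors,
    run B Metropolis iterations over particle indices, accepting a
    uniformly proposed index j from the current index k with probability
    min(1, weights[j] / weights[k]). No collective operation over the
    weights is needed, which suits GPU execution."""
    n = len(weights)
    ancestors = []
    for i in range(n):
        k = i
        for _ in range(B):
            j = rng.randrange(n)
            # u < w_j / w_k, rearranged to avoid division.
            if rng.random() * weights[k] <= weights[j]:
                k = j
        ancestors.append(k)
    return ancestors
```

For finite B the ancestor distribution is only approximately proportional to the weights; the bias vanishes as B grows, which is the trade-off analysed in the paper.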
Path Storage in the Particle Filter
http://www.indii.org/research/path-storage-in-the-particle-filter/index.html
Mon, 01 Dec 2014 · Lawrence Murray

This article considers the problem of storing the paths generated by a particle filter and more generally by a sequential Monte Carlo algorithm. It provides a theoretical result bounding the expected memory cost by $T + CN\log N$ where $T$ is the time horizon, $N$ is the number of particles and $C$ is a constant, as well as an efficient algorithm to realise this. The theoretical result and the algorithm are illustrated with numerical experiments.
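A minimal sketch of such a storage scheme, assuming a reference-counted ancestral tree (the data structures and method names are illustrative, not the paper's): states whose whole line of descent has died out are released immediately, so memory tracks the surviving tree rather than the full $T \times N$ history.

```python
class PathStore:
    """Reference-counted path storage for a particle filter (hedged sketch).
    Each node keeps (value, parent); when a node has no surviving
    offspring it is pruned, walking up the tree as counts reach zero."""

    def __init__(self):
        self.parent = {}    # node id -> parent id (or None)
        self.value = {}     # node id -> stored state
        self.children = {}  # node id -> surviving-offspring count
        self.next_id = 0

    def insert(self, values, ancestors, leaves):
        """Add one generation: values[i] descends from leaves[ancestors[i]].
        Pass leaves=None for the first generation. Returns new leaf ids."""
        new = []
        for i, v in enumerate(values):
            p = None if leaves is None else leaves[ancestors[i]]
            nid = self.next_id
            self.next_id += 1
            self.parent[nid] = p
            self.value[nid] = v
            self.children[nid] = 0
            if p is not None:
                self.children[p] += 1
            new.append(nid)
        # Prune old leaves that gained no offspring, walking upwards.
        if leaves is not None:
            for leaf in set(leaves):
                node = leaf
                while node is not None and self.children[node] == 0:
                    p = self.parent.pop(node)
                    del self.value[node], self.children[node]
                    if p is not None:
                        self.children[p] -= 1
                    node = p
        return new

    def path(self, leaf):
        """Trace the stored path from the root down to a leaf."""
        out = []
        while leaf is not None:
            out.append(self.value[leaf])
            leaf = self.parent[leaf]
        return out[::-1]
```

After each resampling step, only states with surviving descendants remain stored, which is what keeps the expected cost near the $T + CN\log N$ bound rather than $TN$.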
Rethinking soil carbon modelling: a stochastic approach to quantify uncertainties
http://www.indii.org/research/rethinking-soil-carbon-modelling/index.html
Sat, 01 Feb 2014 · Lawrence Murray

The benefits of sequestering carbon are many, including improved crop productivity, reductions in greenhouse gases, and financial gains through the sale of carbon credits. Achieving better understanding of the sequestration process has motivated many deterministic models of soil carbon dynamics, but none of these models address uncertainty in a comprehensive manner. Uncertainty arises in many ways—around the model inputs, parameters, and dynamics, and subsequently model predictions. In this paper, these uncertainties are addressed in concert by incorporating a physical-statistical model for carbon dynamics within a Bayesian hierarchical modelling framework. This comprehensive approach to accounting for uncertainty in soil carbon modelling has not been attempted previously. This paper demonstrates proof-of-concept based on a one-pool model and identifies requirements for extension to multi-pool carbon modelling. Our model is based on the soil carbon dynamics in Tarlee, South Australia. We specify the model conditionally through its parameters, soil carbon input and decay processes and observations of those processes. We use a particle marginal Metropolis–Hastings approach specified using the LibBi modelling language. We highlight how samples from the posterior distribution can be used to summarise our knowledge about model parameters, to estimate the probabilities of sequestering carbon and to forecast changes in carbon stocks under crop rotations not represented explicitly in the original field trials.
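The one-pool dynamics can be sketched as a discrete-time stochastic recursion: the stock decays at a fixed rate, receives periodic inputs, and is perturbed by process noise. All parameter values here are illustrative, not taken from the Tarlee field trials.

```python
import random

def simulate_one_pool(c0, inputs, k, sigma, rng=random):
    """One-pool soil carbon model (hedged sketch):
    C_{t+1} = C_t + input_t - k * C_t + noise, truncated at zero.
    Parameters k (decay rate) and sigma (process noise) are the kind of
    quantities given prior distributions in the hierarchical model."""
    c = c0
    traj = [c]
    for u in inputs:
        c = max(0.0, c + u - k * c + sigma * rng.gauss(0.0, 1.0))
        traj.append(c)
    return traj
```

With constant inputs $u$ the deterministic steady state is $u / k$; the Bayesian treatment places distributions over $k$, the inputs, and the noise, and propagates all three through to forecasts.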
Bayesian Learning and Predictability in a Stochastic Nonlinear Dynamical Model
http://www.indii.org/research/bayesian-learning-and-predictability-in-a-stochastic-nonlinear-dynamical-model/index.html
Sun, 01 Dec 2013 · Lawrence Murray

Bayesian inference methods are applied within a Bayesian hierarchical modelling framework to the problems of joint state and parameter estimation, and of state forecasting. We explore and demonstrate the ideas in the context of a simple nonlinear marine biogeochemical model. A novel approach is proposed to the formulation of the stochastic process model, in which ecophysiological properties of plankton communities are represented by autoregressive stochastic processes. This approach captures the effects of changes in plankton communities over time, and it allows the incorporation of literature metadata on individual species into prior distributions for process model parameters. The approach is applied to a case study at Ocean Station Papa, using Particle Markov chain Monte Carlo computational techniques. The results suggest that, by drawing on objective prior information, it is possible to extract useful information about model state and a subset of parameters, and even to make useful long-term forecasts, based on sparse and noisy observations.
LibBi
http://www.indii.org/research/libbi
Sat, 01 Jun 2013 · Lawrence Murray

Software package for Sequential Monte Carlo on high-performance computer systems.

On Disturbance State-Space Models and the Particle Marginal Metropolis–Hastings Sampler
http://www.indii.org/research/on-disturbance-state-space-models/index.html
Fri, 01 Mar 2013 · Lawrence Murray

We investigate nonlinear state-space models without a closed-form transition density, and propose re-expressing such models over their latent noise variables rather than their latent state variables. In doing so the tractable noise density emerges in place of the intractable transition density. For importance sampling methods such as the auxiliary particle filter (APF), this enables importance weights to be computed where they could not otherwise be. As case studies we take two multivariate marine biogeochemical models and perform state and parameter estimation using the particle marginal Metropolis–Hastings (PMMH) sampler, within which is an APF component. For that APF, we compare several proposal strategies over noise variables, all based on lookaheads with the unscented Kalman filter. These strategies are compared using Markov chain convergence and acceptance rates, along with a newly introduced conditional acceptance rate that is useful for assessing the impact of using an estimated, rather than exact, likelihood. Results indicate the utility of designing proposals over the noise variables, particularly for fast-mixing process models.
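The noise-variable formulation can be sketched for a single particle-filter step. In the hedged Python below, the transition is only available as a simulator $x_t = f(x_{t-1}, u_t)$ with $u_t \sim N(0, 1)$, and a simple Gaussian proposal over $u_t$ stands in for the paper's UKF-based lookahead proposals (an illustrative assumption). The weight uses the tractable noise density, even though no closed-form transition density $p(x_t \mid x_{t-1})$ is assumed.

```python
import math
import random

def norm_logpdf(x, mu, sd):
    return -0.5 * math.log(2 * math.pi * sd * sd) - 0.5 * ((x - mu) / sd) ** 2

def apf_step_over_noise(xs, y, f, obs_sd, prop_mu, prop_sd):
    """One particle-filter step expressed over the noise variable u_t
    rather than the state (hedged sketch). Propose u ~ q = N(prop_mu,
    prop_sd), simulate x_t = f(x_prev, u), and weight with
    p(u) * p(y | x_t) / q(u), where p(u) is the tractable N(0, 1)
    noise density."""
    new_xs, logw = [], []
    for x in xs:
        u = random.gauss(prop_mu, prop_sd)
        xn = f(x, u)
        lw = (norm_logpdf(u, 0.0, 1.0)             # prior noise density
              + norm_logpdf(y, xn, obs_sd)         # likelihood
              - norm_logpdf(u, prop_mu, prop_sd))  # proposal correction
        new_xs.append(xn)
        logw.append(lw)
    # Normalise and resample.
    m = max(logw)
    w = [math.exp(lw - m) for lw in logw]
    tot = sum(w)
    w = [wi / tot for wi in w]
    return random.choices(new_xs, weights=w, k=len(xs))
```

A bootstrap proposal would set q to the prior N(0, 1); an informed q (here centred towards the observation) plays the role of the lookahead proposals compared in the paper.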
Feynman-Kac Particle Integration with Geometric Interacting Jumps
http://www.indii.org/research/feynman-kac-particle-integration-with-geometric-interacting-jumps/index.html
Fri, 01 Mar 2013 · Lawrence Murray

This article is concerned with the design and analysis of discrete time Feynman-Kac particle integration models with geometric interacting jump processes. We analyze two general types of model, corresponding to whether the reference process is in continuous or discrete time. For the former, we consider discrete generation particle models defined by arbitrarily fine time mesh approximations of the Feynman-Kac models with continuous time path integrals. For the latter, we assume that the discrete process is observed at integer times and we design new approximation models with geometric interacting jumps in terms of a sequence of intermediate time steps between the integers. In both situations, we provide non-asymptotic bias and variance theorems w.r.t. the time step and the size of the system, yielding what appear to be the first results of this type for this class of Feynman-Kac particle integration models. We also discuss uniform convergence estimates w.r.t. the time horizon. Our approach is based on an original semigroup analysis with first order decompositions of the fluctuation errors.