The geometric foundations of Hamiltonian Monte Carlo
Authors
Michael Betancourt, Simon Byrne, Sam Livingstone, Mark Girolami
Abstract
Although Hamiltonian Monte Carlo has proven an empirical success, the lack of a rigorous theoretical understanding of the algorithm has in many ways impeded both principled developments of the method and use of the algorithm in practice. In this paper we develop the formal foundations of the algorithm through the construction of measures on smooth manifolds, and demonstrate how the theory naturally identifies efficient implementations and motivates promising generalizations.
Citation
- Journal: Bernoulli
- Year: 2017
- Volume: 23
- Issue: 4A
- Pages:
- Publisher: Bernoulli Society for Mathematical Statistics and Probability
- DOI: 10.3150/16-bej810
BibTeX
@article{Betancourt_2017,
title={{The geometric foundations of Hamiltonian Monte Carlo}},
volume={23},
ISSN={1350-7265},
DOI={10.3150/16-bej810},
number={4A},
journal={Bernoulli},
publisher={Bernoulli Society for Mathematical Statistics and Probability},
author={Betancourt, Michael and Byrne, Simon and Livingstone, Sam and Girolami, Mark},
year={2017}
}
References
- Baez, J. & Muniain, J. P. Gauge Fields, Knots and Gravity. Series on Knots and Everything (1994) doi:10.1142/2324 – 10.1142/2324
- Beskos, A., Pillai, N., Roberts, G., Sanz-Serna, J.-M. & Stuart, A. Optimal tuning of the hybrid Monte Carlo algorithm. Bernoulli vol. 19 (2013) – 10.3150/12-bej414
- Beskos, A., Pinski, F. J., Sanz-Serna, J. M. & Stuart, A. M. Hybrid Monte Carlo on Hilbert spaces. Stochastic Processes and their Applications vol. 121 2201–2230 (2011) – 10.1016/j.spa.2011.06.003
- Burrage, K., Lenane, I. & Lythe, G. Numerical Methods for Second‐Order Stochastic Differential Equations. SIAM Journal on Scientific Computing vol. 29 245–264 (2007) – 10.1137/050646032
- Byrne, S. & Girolami, M. Geodesic Monte Carlo on Embedded Manifolds. Scandinavian Journal of Statistics vol. 40 825–845 (2013) – 10.1111/sjos.12036
- Caflisch, R. E. Monte Carlo and quasi-Monte Carlo methods. Acta Numerica vol. 7 1–49 (1998) – 10.1017/s0962492900002804
- Censor, A. & Grandini, D. Borel and continuous systems of measures. Rocky Mountain Journal of Mathematics vol. 44 (2014) – 10.1216/rmj-2014-44-4-1073
- Chang, J. T. & Pollard, D. Conditioning as disintegration. Statistica Neerlandica vol. 51 287–317 (1997) – 10.1111/1467-9574.00056
- Cotter, S. L., Roberts, G. O., Stuart, A. M. & White, D. MCMC Methods for Functions: Modifying Old Algorithms to Make Them Faster. Statistical Science vol. 28 (2013) – 10.1214/13-sts421
- Dawid, A. P., Stone, M. & Zidek, J. V. Marginalization Paradoxes in Bayesian and Structural Inference. Journal of the Royal Statistical Society Series B: Statistical Methodology vol. 35 189–213 (1973) – 10.1111/j.2517-6161.1973.tb00952.x
- Diaconis, P. & Freedman, D. Iterated Random Functions. SIAM Review vol. 41 45–76 (1999) – 10.1137/s0036144598338446
- Duane, S., Kennedy, A. D., Pendleton, B. J. & Roweth, D. Hybrid Monte Carlo. Physics Letters B vol. 195 216–222 (1987) – 10.1016/0370-2693(87)91197-x
- Fang, Y., Sanz-Serna, J. M. & Skeel, R. D. Compressible generalized hybrid Monte Carlo. The Journal of Chemical Physics vol. 140 (2014) – 10.1063/1.4874000
- Gelfand, A. E. & Smith, A. F. M. Sampling-Based Approaches to Calculating Marginal Densities. Journal of the American Statistical Association vol. 85 398–409 (1990) – 10.1080/01621459.1990.10476213
- Geman, S. & Geman, D. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images. IEEE Transactions on Pattern Analysis and Machine Intelligence vol. PAMI-6 721–741 (1984) – 10.1109/tpami.1984.4767596
- Girolami, M. & Calderhead, B. Riemann Manifold Langevin and Hamiltonian Monte Carlo Methods. Journal of the Royal Statistical Society Series B: Statistical Methodology vol. 73 123–214 (2011) – 10.1111/j.1467-9868.2010.00765.x
- Haario, H., Saksman, E. & Tamminen, J. An Adaptive Metropolis Algorithm. Bernoulli vol. 7 223 (2001) – 10.2307/3318737
- Horowitz, A. M. A generalized guided Monte Carlo algorithm. Physics Letters B vol. 268 247–252 (1991) – 10.1016/0370-2693(91)90812-5
- Husain, S., Vasishth, S. & Srinivasan, N. Strong Expectations Cancel Locality Effects: Evidence from Hindi. PLoS ONE vol. 9 e100986 (2014) – 10.1371/journal.pone.0100986
- Izaguirre, J. A. & Hampton, S. S. Shadow hybrid Monte Carlo: an efficient propagator in phase space of macromolecules. Journal of Computational Physics vol. 200 581–604 (2004) – 10.1016/j.jcp.2004.04.016
- Jasche, J., Kitaura, F. S., Li, C. & Enßlin, T. A. Bayesian non-linear large-scale structure inference of the Sloan Digital Sky Survey Data Release 7. Monthly Notices of the Royal Astronomical Society vol. 409 355–370 (2010) – 10.1111/j.1365-2966.2010.17313.x
- Joulin, A. & Ollivier, Y. Curvature, concentration and error estimates for Markov chain Monte Carlo. The Annals of Probability vol. 38 (2010) – 10.1214/10-aop541
- Lee, J. M. Introduction to Topological Manifolds. Graduate Texts in Mathematics (Springer New York, 2011). doi:10.1007/978-1-4419-7940-7 – 10.1007/978-1-4419-7940-7
- Livingstone, S. & Girolami, M. Information-Geometric Markov Chain Monte Carlo Methods Using Diffusions. Entropy vol. 16 3074–3102 (2014) – 10.3390/e16063074
- Ollivier, Y. Ricci curvature of Markov chains on metric spaces. Journal of Functional Analysis vol. 256 810–864 (2009) – 10.1016/j.jfa.2008.11.001
- Polettini, M. Generally covariant state-dependent diffusion. Journal of Statistical Mechanics: Theory and Experiment vol. 2013 P07005 (2013) – 10.1088/1742-5468/2013/07/p07005
- Porter, E. K. & Carré, J. A Hamiltonian Monte–Carlo method for Bayesian inference of supermassive black hole binaries. Classical and Quantum Gravity vol. 31 145004 (2014) – 10.1088/0264-9381/31/14/145004
- Gelman, A., Gilks, W. R. & Roberts, G. O. Weak convergence and optimal scaling of random walk Metropolis algorithms. The Annals of Applied Probability vol. 7 (1997) – 10.1214/aoap/1034625254
- Roberts, G. O. & Rosenthal, J. S. General state space Markov chains and MCMC algorithms. Probability Surveys vol. 1 (2004) – 10.1214/154957804100000024
- Simmons, D. Conditional measures and conditional expectation; Rohlin’s Disintegration Theorem. Discrete & Continuous Dynamical Systems - A vol. 32 2565–2582 (2012) – 10.3934/dcds.2012.32.2565
- Tierney, L. A note on Metropolis-Hastings kernels for general state spaces. The Annals of Applied Probability vol. 8 (1998) – 10.1214/aoap/1027961031
- Weinstein, A. The local structure of Poisson manifolds. Journal of Differential Geometry vol. 18 (1983) – 10.4310/jdg/1214437787
- Amari, S. & Nagaoka, H. Methods of Information Geometry. Translations of Mathematica Monographs (2007) doi:10.1090/mmono/191 – 10.1090/mmono/191
- Halmos, P. R. Measure Theory. Graduate Texts in Mathematics (Springer New York, 1950). doi:10.1007/978-1-4684-9440-2 – 10.1007/978-1-4684-9440-2
- José, J. V. & Saletan, E. J. Classical Dynamics. (1998) doi:10.1017/cbo9780511803772 – 10.1017/cbo9780511803772
- Kardar, M. Statistical Physics of Particles. (2007) doi:10.1017/cbo9780511815898 – 10.1017/cbo9780511815898
- Marx, D. & Hutter, J. Ab Initio Molecular Dynamics. (2009) doi:10.1017/cbo9780511609633 – 10.1017/cbo9780511609633
- Meyn, S., Tweedie, R. L. & Glynn, P. W. Markov Chains and Stochastic Stability. (2009) doi:10.1017/cbo9780511626630 – 10.1017/cbo9780511626630
- Øksendal, B. Stochastic Differential Equations. Universitext (Springer Berlin Heidelberg, 2003). doi:10.1007/978-3-642-14394-6 – 10.1007/978-3-642-14394-6
- Robert, C. P. & Casella, G. Monte Carlo Statistical Methods. Springer Texts in Statistics (Springer New York, 1999). doi:10.1007/978-1-4757-3071-5 – 10.1007/978-1-4757-3071-5
- Schutz, B. F. Geometrical Methods of Mathematical Physics. (1980) doi:10.1017/cbo9781139171540 – 10.1017/cbo9781139171540
- Structure of Dynamical Systems. (Springer International Publishing, 1997). doi:10.1007/978-1-4612-0281-3 – 10.1007/978-1-4612-0281-3