This website serves as a repository of links and information about probabilistic programming languages, covering academic research (theory, algorithms, modeling, and systems) as well as implementations, evaluations, and applications. If you would like to contribute to this site, please contact Daniel Roy. The site is still under construction: please help us link to relevant projects and research!
Join the Probabilistic-Programming mailing list
The probabilistic-programming mailing list, hosted at CSAIL/MIT, aims to support discussion among researchers working in the area of probabilistic programming and to provide a means to announce new results, software, workshops, etc. The mailing list is modeled after the popular "uai" mailing list.
The probabilistic programming approach
Probabilistic graphical models provide a formal lingua franca for modeling and a common target for efficient inference algorithms. Their introduction gave rise to an extensive body of work in machine learning, statistics, robotics, vision, biology, neuroscience, artificial intelligence (AI) and cognitive science. However, many of the most innovative and useful probabilistic models published by the AI, machine learning, and statistics communities far outstrip the representational capacity of graphical models and associated inference techniques. Such models are communicated using a mix of natural language, pseudocode, and mathematical formulae, and are solved using special-purpose, one-off inference methods. Rather than serving as precise specifications suitable for automatic inference, graphical models typically serve as coarse, high-level descriptions, eliding critical aspects such as fine-grained independence, abstraction and recursion.
PROBABILISTIC PROGRAMMING LANGUAGES aim to close this representational gap, unifying general purpose programming with probabilistic modeling; literally, users specify a probabilistic model in its entirety (e.g., by writing code that generates a sample from the joint distribution) and inference follows automatically given the specification. These languages provide the full power of modern programming languages for describing complex distributions, and can enable reuse of libraries of models, support interactive modeling and formal verification, and provide a much-needed abstraction barrier to foster generic, efficient inference in universal model classes.
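To make the idea of "writing code that generates a sample from the joint distribution" concrete, here is a minimal sketch in plain Python. It is not the API of any system listed on this page: the names `sprinkler_model` and `rejection_query`, and all of the probabilities, are made up for illustration. The model is an ordinary function that samples from the joint distribution, and a generic (and deliberately naive) rejection sampler answers a conditional query against it without knowing anything about the model's internals.

```python
import random


def sprinkler_model(rng):
    """Generative program: each call returns one sample from the joint distribution.

    All probabilities here are illustrative assumptions, not taken from any source.
    """
    rain = rng.random() < 0.2
    sprinkler = rng.random() < (0.01 if rain else 0.4)
    if rain and sprinkler:
        p_wet = 0.99
    elif rain:
        p_wet = 0.8
    elif sprinkler:
        p_wet = 0.9
    else:
        p_wet = 0.0
    wet = rng.random() < p_wet
    return {"rain": rain, "sprinkler": sprinkler, "wet": wet}


def rejection_query(model, condition, query, n_samples=20000, seed=0):
    """Estimate E[query | condition] by rejection sampling.

    The sampler only runs the model and filters its output, so it works
    for any model written as a function of a random number generator.
    """
    rng = random.Random(seed)
    kept = []
    for _ in range(n_samples):
        sample = model(rng)
        if condition(sample):
            kept.append(query(sample))
    if not kept:
        raise ValueError("no sample satisfied the condition")
    return sum(kept) / len(kept)


if __name__ == "__main__":
    estimate = rejection_query(
        sprinkler_model,
        condition=lambda s: s["wet"],
        query=lambda s: s["rain"],
    )
    print(f"P(rain | wet) is approximately {estimate:.3f}")
```

For these made-up numbers the exact answer is 0.16038 / 0.44838, roughly 0.358, so the estimate should land close to that. Rejection sampling is used here only because it is the simplest inference method that is fully generic; it becomes impractical when the conditioning event is rare, which is one reason real systems rely on more sophisticated algorithms.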
We believe that the probabilistic programming language approach within AI has the potential to fundamentally change the way we understand, design, build, test and deploy probabilistic systems. This approach has seen growing interest within AI over the last 10 years, yet the endeavor builds on over 40 years of work in a range of diverse fields, including mathematical logic, theoretical computer science, formal methods, and programming languages, as well as machine learning, computational statistics, systems biology, and probabilistic AI.
Please see our collection of research articles on probabilistic programming.
A growing body of literature studies probabilistic programming from an array of perspectives. The individual project pages linked below often contain lists of publications, although we aim to collect these in our own master list as well. A related but distinct body of work is that of Approximate Bayesian Computation (ABC), which focuses on likelihood-free methods, developed originally to tackle statistical queries in population genetics but now applied more broadly. The website for the i-Like research programme links to a number of very interesting articles. Another related area of research is Statistical Relational Learning, which is in general interested in distributions on structured spaces (e.g., models of first order languages) where there may be uncertainty in the number and types of objects.
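The likelihood-free idea behind ABC can be sketched in a few lines of plain Python. This is an illustrative toy under stated assumptions, not code from any ABC package: the simulator, the function names `simulate_heads` and `abc_rejection`, the uniform prior, and the tolerance are all hypothetical. The sampler never evaluates a likelihood; it only runs the simulator and keeps parameter draws whose simulated data fall close to the observation.

```python
import random


def simulate_heads(theta, n_flips, rng):
    """Simulator: number of heads in n_flips tosses of a coin with bias theta."""
    return sum(rng.random() < theta for _ in range(n_flips))


def abc_rejection(observed_heads, n_flips, n_draws=20000, tolerance=1, seed=0):
    """Basic ABC rejection sampler with a Uniform(0, 1) prior on theta.

    A draw is accepted when the simulated summary statistic (the head count)
    is within `tolerance` of the observed one.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.random()  # draw a candidate parameter from the prior
        simulated = simulate_heads(theta, n_flips, rng)
        if abs(simulated - observed_heads) <= tolerance:
            accepted.append(theta)
    return accepted


if __name__ == "__main__":
    draws = abc_rejection(observed_heads=14, n_flips=20)
    mean = sum(draws) / len(draws)
    print(f"accepted {len(draws)} draws, approximate posterior mean {mean:.3f}")
```

With a tolerance of zero the accepted draws would be exact samples from the posterior, here Beta(15, 7) with mean 15/22; a larger tolerance trades accuracy for a higher acceptance rate.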
Existing probabilistic programming systems
Below we have compiled a list of probabilistic programming systems including languages, implementations/compilers, as well as software libraries for constructing probabilistic models and toolkits for building probabilistic inference algorithms.
- Alchemy is a software package providing a series of algorithms for statistical relational learning and probabilistic logic inference, based on the Markov logic representation.
- Anglican is a portable Turing-complete research probabilistic programming language that includes particle MCMC inference.
- BLOG, or Bayesian logic, is a probabilistic programming language with elements of first-order logic, as well as an MCMC-based inference algorithm. BLOG makes it relatively easy to represent uncertainty about the number of underlying objects explaining observed data.
- BUGS is a language for specifying finite graphical models and accompanying software for performing B(ayesian) I(nference) U(sing) G(ibbs) S(ampling), although modern implementations (such as WinBUGS, JAGS, and OpenBUGS) are based on Metropolis-Hastings. BiiPS is an implementation based on interacting particle systems methods like Sequential Monte Carlo.
- Church is a universal probabilistic programming language, extending Scheme with probabilistic semantics, and is well suited for describing infinite-dimensional stochastic processes and other recursively-defined generative processes (Goodman, Mansinghka, Roy, Bonawitz and Tenenbaum, 2008). The active implementation of Church is webchurch. Older implementations include MIT-Church, Cosh, Bher, and JSChurch. See also Venture below.
- Dimple is a software tool that performs inference and learning on probabilistic graphical models via belief propagation algorithms or sampling based algorithms.
- FACTORIE is a Scala library for creating relational factor graphs, estimating parameters and performing inference.
- Figaro is a Scala library for constructing probabilistic models that also provides a number of built-in reasoning algorithms that can be applied automatically to any constructed models.
- HANSEI is a domain-specific language embedded in OCaml, which allows one to express discrete-distribution models with potentially infinite support, perform exact inference as well as importance sampling-based inference, and model inference over inference.
- Hakaru is a simply-typed probabilistic programming language which offers composable and pluggable inference engines as well as computer-algebra guided program optimization.
- Hierarchical Bayesian Compiler (HBC) is a language for expressing, and a compiler for implementing, hierarchical Bayesian models, with a focus on large-dimension discrete models and support for a number of non-parametric process priors.
- LibBi is a library for Bayesian inference, based largely on the sequential Monte Carlo framework.
- Markov the Beast is a software package for statistical relational learning and structured prediction based on Markov logic.
- PRISM is a general programming language intended for symbolic-statistical modeling, and the PRISM programming system is a tool that can be used to learn the parameters of a PRISM program from data, e.g., by expectation-maximization.
- Infer.NET is a software library developed by Microsoft for expressing graphical models and implementing Bayesian inference using a variety of algorithms.
- Probabilistic-C is a C-language probabilistic programming system that, using standard compilation tools, automatically produces a compiled parallel inference executable from C-language generative model code.
- ProbLog is a probabilistic extension of Prolog based on Sato's distribution semantics. While ProbLog1 focuses on calculating the success probability of a query, ProbLog2 can calculate both conditional probabilities and MPE states.
- PyMC is a Python module that implements a suite of MCMC algorithms as Python classes, and is extremely flexible and applicable to a large suite of problems. PyMC includes methods for summarizing output, plotting, goodness-of-fit and convergence diagnostics.
- PyMLNs is a toolbox and software library for learning and inference in Markov logic networks.
- R2 is a probabilistic programming system that employs powerful techniques from programming language design, program analysis and verification for scalable and efficient inference.
- Stan exposes a language for defining probability density functions for probabilistic models. Stan includes a compiler, which produces C++ code that performs Bayesian inference via a method similar to Hamiltonian Monte Carlo sampling.
- Tuffy is a highly scalable inference engine for Markov logic networks using a database backend.
- Venture is an interactive, Turing-complete, higher-order probabilistic programming platform that aims to be sufficiently expressive, extensible and efficient for general-purpose use. Its virtual machine supports multiple scalable, reprogrammable inference strategies, plus two front-end languages: VenChurch and VentureScript.
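Several of the systems above, Church among them, emphasize recursively-defined generative processes. As a rough illustration of what that means, the following plain-Python sketch (not Church syntax, and with the hypothetical function name `geometric`) defines a geometric distribution by recursion: the program has no fixed bound on the number of random choices it makes, which is the kind of model a finite graphical model cannot express directly.

```python
import random


def geometric(p, rng):
    """Number of failures before the first success, defined recursively.

    Each call flips a coin with success probability p; on failure the
    process recurses, so the set of random choices is unbounded.
    """
    if rng.random() < p:
        return 0
    return 1 + geometric(p, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    samples = [geometric(0.5, rng) for _ in range(20000)]
    print(f"sample mean: {sum(samples) / len(samples):.3f}")
```

The theoretical mean is (1 - p) / p, which equals 1 for p = 0.5, so the sample mean should be close to 1.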
Tutorials and Books
- The Design and Implementation of Probabilistic Programming Languages
An introduction to PPLs and lightweight implementation techniques for sequential Monte Carlo and Metropolis-Hastings. Uses WebPPL.
- Probabilistic Models of Cognition Tutorial
A Web-based book using Church to introduce probabilistic Cognitive Science and AI.
- Towards common-sense reasoning via conditional simulation: legacies of Turing in Artificial Intelligence
Freer, Roy, and Tenenbaum relate Turing's legacy to probabilistic programming approaches in Artificial Intelligence in this book chapter, appearing in a volume edited by Rod Downey, entitled Turing's Legacy and being published by Cambridge University Press in their ASL Lecture Notes in Logic series.
- Practical Probabilistic Programming
This book provides an introduction to probabilistic programming focusing on practical examples and applications. No prior experience in machine learning or probabilistic reasoning is required. The book uses Figaro to present the examples but the principles are applicable to many probabilistic programming systems.
- NIPS*2008 Workshop on Probabilistic Programming
- NIPS*2012 Workshop on Probabilistic Programming
- NIPS*2014 Workshop on Probabilistic Programming