From Probabilistic Programming
This website serves as a repository of links and information about probabilistic programming languages, including both academic research spanning theory, algorithms, modeling, and systems, as well as implementations, evaluations, and applications. If you would like to contribute to this site, please contact Daniel Roy. The site is still under construction: please help us link to relevant projects and research!
Probabilistic Programming in the News
DARPA has recently announced a new "Probabilistic Programming for Advanced Machine Learning" (PPAML) program. The program seeks to "greatly increase the number of people who can successfully build machine learning applications and make machine learning experts radically more effective." “We want to do for machine learning what the advent of high-level program languages 50 years ago did for the software development community as a whole,” said Kathleen Fisher, DARPA program manager. More information can be found in the DARPA press release and in the broad agency announcement.
The "UAI-style" Probabilistic-Programming Mailing List
The probabilistic-programming mailing list hosted at CSAIL/MIT hopes to support discussion between researchers working in the area of probabilistic programming, but also to provide a means to announce new results, software, workshops, etc.
NIPS Workshop on Probabilistic Programming
It is our pleasure to report that the NIPS*2012 Workshop on Probabilistic Programming, which took place in Lake Tahoe, Nevada on December 7 and 8, was a great success. Every talk had a full room, often with standing room only. Visit the workshop page to see the list of talks, as well as to find links to extended abstracts.
The probabilistic programming approach
Probabilistic graphical models provide a formal lingua franca for modeling and a common target for efficient inference algorithms. Their introduction gave rise to an extensive body of work in machine learning, statistics, robotics, vision, biology, neuroscience, artificial intelligence (AI) and cognitive science. However, many of the most innovative and useful probabilistic models published by the AI, machine learning, and statistics community far outstrip the representational capacity of graphical models and associated inference techniques. Models are communicated using a mix of natural language, pseudo code, and mathematical formulae and solved using special purpose, one-off inference methods. Rather than precise specifications suitable for automatic inference, graphical models typically serve as coarse, high-level descriptions, eliding critical aspects such as fine-grained independence, abstraction and recursion.
PROBABILISTIC PROGRAMMING LANGUAGES aim to close this representational gap, unifying general purpose programming with probabilistic modeling; literally, users specify a probabilistic model in its entirety (e.g., by writing code that generates a sample from the joint distribution) and inference follows automatically given the specification. These languages provide the full power of modern programming languages for describing complex distributions, and can enable reuse of libraries of models, support interactive modeling and formal verification, and provide a much-needed abstraction barrier to foster generic, efficient inference in universal model classes.
We believe that the probabilistic programming language approach within AI has the potential to fundamentally change the way we understand, design, build, test and deploy probabilistic systems. This approach has seen growing interest within AI over the last 10 years, yet the endeavor builds on over 40 years of work in range of diverse fields including mathematical logic, theoretical computer science, formal methods, programming languages, as well as machine learning, computational statistics, systems biology, probabilistic AI.
Please see our collection of research articles on probabilistic programming.
A growing body of literature studies probabilistic programming from an array of perspectives. The individual project pages linked below often contain lists of publications, although we aim to collect these in our own master list as well. A related but distinct body of work is that of Approximate Bayesian Computation (ABC), which focuses on likelihood-free methods, developed originally to tackle statistical queries in population genetics but now applied more broadly. The website for the i-Like research programme links to a number of very interesting articles. Another related area of research is Statistical Relational Learning, which is in general interested in distributions on structured spaces (e.g., models of first order languages) where there may be uncertainty in the number and types of objects.
Existing probabilistic programming systems
Below we have compiled a list of probabilistic programming systems including languages, implementations/compilers, as well as software libraries for constructing probabilistic models and toolkits for building probabilistic inference algorithms.
- BLOG, or Bayesian logic, is a probabilistic programming language with elements of first-order logic, as well as an MCMC-based inference algorithm. BLOG makes it relatively easy to represent uncertainty about the number of underlying objects explaining observed data.
- BUGS is a language for specifying finite graphical models and accompanying software for performing B(ayesian) I(nference) U(sing) G(ibbs) S(ampling).
- Church is a universal probabilistic programming language, extending Scheme with probabilistic semantics, and is well suited for describing infinite-dimensional stochastic processes and other recursively-defined generative processes (Goodman, Mansinghka, Roy, Bonawitz and Tenenbaum, 2008). Implementations of Church include MIT-Church, Cosh, Bher, and JSChurch.
- Dimple is a software tool that performs inference and learning on probabilistic graphical models via belief propagation algorithms or sampling based algorithms.
- FACTORIE is a Scala library for creating relational factor graphs, estimating parameters and performing inference.
- Figaro is a Scala library for constructing probabilistic models that also provides a number of built-in reasoning algorithms that can be applied automatically to any constructed models.
- HANSEI is a domain-specific language embedded in OCaml, which allows one to express discrete-distribution models with potentially infinite support, perform exact inference as well as importance sampling-based inference, and model inference over inference.
- Hierarchical Bayesian Compiler (HBC) is a language for expressing and compiler for implementing hierarchical Bayesian models, with a focus on large-dimension discrete models and support for a number of non-parametric process priors.
- Infer.NET is a software library developed by Microsoft for expressing graphical models and implementing Bayesian inference using a variety of algorithms.
- ProbLog is a probabilistic extension of Prolog based on Sato's distribution semantics. While ProbLog1 focuses on calculating the success probability of a query, ProbLog2 can calculate both conditional probabilities and MPE states.
- PyMC is a python module that implements a suite of MCMC algorithms as python classes, and is extremely flexible and applicable to a large suite of problems. PyMC includes methods for summarizing output, plotting, goodness-of-fit and convergence diagnostics.
- Stan exposes a language for defining probability density functions for probabilistic models. Stan includes a compiler, which produces C++ code that performs Bayesian inference via a method similar to Hamiltonian Monte Carlo sampling.
- Probabilistic Models of Cognition Tutorial
A tutorial in Church using examples from Cognitive Science and AI.