NIPS*2008 Workshop/Schedule
SATURDAY, DEC. 13, 2008
The first day of the workshop will be held at the Westin in the Alpine CD conference room.
SUNDAY, DEC. 14, 2008
The goal of the second day is to provide more unstructured time where participants can discuss specific issues, brainstorm, and start collaborations. We have five talks planned in the morning, although we may spread these out. We will be finalizing this schedule in the next week.
8:00am | Constraint-based Probabilistic Modeling (Taisuke Sato, Tokyo Institute of Technology)
8:30am | Probabilistic Programming for Cognitive Science (Noah Goodman, MIT)
9:00am | Break
9:15am | Syntactic Independence in Stochastic Functional Languages (David McAllester, Toyota Technological Institute at Chicago)
9:45am | Towards Digesting the Alphabet-Soup of Statistical Relational Learning (Luc De Raedt, Katholieke Universiteit Leuven)
10:15am | The Programming Languages Perspective Panel
10:45am | Invited Talk (Gregor Kiczales, University of British Columbia)
12:00pm | Group Lunch
2:00pm | Brainstorming Session: Next steps
Talks and Posters
The open universe
Stuart Russell, University of California, Berkeley
Recent advances in knowledge representation for probability models have allowed for uncertainty about the properties of objects and the relations that might hold among them. Such models, however, typically assume exact knowledge of which objects exist and of which object is which---that is, they assume domain closure and unique names. These assumptions simplify the sample space for probability models, but are inappropriate for many real-world situations in which the universe of discourse is open. These include most instances of perception and language understanding. This talk describes the evolution of open-universe probability models, our current efforts associated with the BLOG language, and some future directions. Joint work with Brian Milch, Bhaskara Marthi, Rodrigo Braz, Hanna Pasula, David Sontag, Andrey Kolobov, and Daniel Ong.
What We Can Do with Markov Logic Today
Pedro Domingos, University of Washington
A program in Markov logic is a set of weighted formulas in first-order logic, interpreted as templates for features of a Markov random field. Most widely used graphical models can be specified very compactly in Markov logic. In turn, this makes it easy to define large, complex models involving combinations of these components. For example, a complete information extraction system, performing joint segmentation and entity resolution, can be specified with just seven formulas. Current implementations of Markov logic enable routine learning and inference on graphical models with millions of variables and billions of features. State-of-the-art solutions to a wide variety of problems have been developed using Markov logic, including information extraction, robot mapping, unsupervised coreference resolution, semantic role labeling, prediction of protein beta-partners, parts of the CALO personal assistant, probabilistic extensions of the Cyc knowledge base, and others. This talk will survey the state of the art in Markov logic: language features, inference and learning algorithms, applications to date and research challenges. Joint work with Stanley Kok, Daniel Lowd, Hoifung Poon, Matt Richardson, Parag Singla, Marc Sumner, and Jue Wang.
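The core semantics described above can be sketched in a few lines of Python: a world's unnormalized weight is the exponential of the summed weights of the formulas it satisfies, and normalizing over all worlds yields the Markov random field. The atoms and weight below are invented for illustration; this is not Alchemy's input syntax.

```python
import itertools
import math

# One invented weighted formula over ground atoms Smokes(A), Cancer(A):
#     1.5   Smokes(A) => Cancer(A)
# A world's unnormalized weight is exp(sum of weights of satisfied
# ground formulas); normalizing over all worlds yields the MRF.

WEIGHT = 1.5

def implication(smokes, cancer):
    return (not smokes) or cancer  # material implication

def unnormalized(world):
    smokes, cancer = world
    return math.exp(WEIGHT if implication(smokes, cancer) else 0.0)

worlds = list(itertools.product([False, True], repeat=2))
Z = sum(unnormalized(w) for w in worlds)
prob = {w: unnormalized(w) / Z for w in worlds}
# The violating world (Smokes=True, Cancer=False) has weight
# exp(0) = 1; the other three have weight exp(1.5), so each of
# them is more probable than the violating world.
```

Real systems avoid this brute-force enumeration, of course; the point is only the log-linear semantics of weighted formulas.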
Church: a universal language for generative models
Vikash Mansinghka, MIT
To engineer robust, adaptive, autonomous systems and explain human intelligence in computational terms, we must develop formal languages for the structured representation of uncertain knowledge and machines capable of efficiently solving the resulting inference problems. In this talk, I will present the first of these two layers, Church, a probabilistic programming language which provides a unified procedural notation for stochastic generative processes and uncertain beliefs. Church recovers the pure subset of Scheme in its deterministic limit, recovers McCarthy's amb in its deductive limit, and generalizes Lisp evaluation to conditional stochastic simulation for universal Bayesian inference. I will show how to use Church to compactly specify a range of problems, including nonparametric Bayesian models, planning as inference, and Bayesian learning of programs from data. Joint work with Noah Goodman, Daniel Roy, and Joshua Tenenbaum.
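The conditional-simulation idea behind a Church query can be illustrated with a minimal rejection-sampling sketch in Python (an invented toy model and function names, not Church's Scheme-based syntax): run the generative program repeatedly and keep only the runs whose condition holds.

```python
import random

# Hypothetical toy model: flip two fair coins, condition on at least
# one being heads, and query the first coin.

def model(rng):
    a = rng.random() < 0.5
    b = rng.random() < 0.5
    return (a, b)

def rejection_query(model, condition, query, n, seed=0):
    """Run the generative model; keep runs satisfying the condition."""
    rng = random.Random(seed)
    results = []
    while len(results) < n:
        trace = model(rng)
        if condition(trace):
            results.append(query(trace))
    return results

samples = rejection_query(model,
                          condition=lambda t: t[0] or t[1],
                          query=lambda t: t[0],
                          n=10000)
p_first_heads = sum(samples) / len(samples)  # near 2/3 analytically
```

Rejection sampling is only the simplest conditional-simulation strategy, but it makes the "condition a generative program" semantics concrete.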
Automatic Differentiation for Probabilistic Programming
Jeffrey Mark Siskind, School of Electrical and Computer Engineering, Purdue University
We have developed powerful and efficient techniques for taking gradients of functional programs. Our methods extend the efficiency of prior approaches, which are limited to imperative numeric programs, to the realm of functional programs that mix symbolic and numeric computation. We demonstrate the relevance and significance of this effort to the enterprise of probabilistic programming by constructing evaluators for two different probabilistic programming languages and using our methods to take the gradients of these evaluators in order to perform gradient-based maximum-likelihood parameter estimation.
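To give a flavor of what taking gradients of programs involves, here is a minimal forward-mode automatic differentiation in Python using dual numbers. This is a textbook illustration of the general technique only, not Siskind's implementation, which handles full functional languages far more efficiently.

```python
# Dual numbers carry a value and a derivative; arithmetic propagates
# both, so running an ordinary function on a Dual yields the
# function's derivative as a by-product.

class Dual:
    def __init__(self, val, eps=0.0):
        self.val, self.eps = val, eps

    def _lift(self, other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = self._lift(other)
        return Dual(self.val + other.val, self.eps + other.eps)
    __radd__ = __add__

    def __mul__(self, other):
        other = self._lift(other)
        return Dual(self.val * other.val,
                    self.val * other.eps + self.eps * other.val)
    __rmul__ = __mul__

def derivative(f, x):
    """Forward-mode derivative of f at x."""
    return f(Dual(x, 1.0)).eps

d = derivative(lambda x: x * x + 3 * x, 2.0)  # d/dx = 2x + 3, so 7 at x = 2
```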
Poster: Distributional Logic Programming: a brief overview
Nikolaos Angelopoulos, Edinburgh University
Logic programming (LP) is an attractive formalism for representing crisp knowledge. In recent years a number of probabilistic and quantitative extensions to LP have been proposed. Stochastic Logic Programs are one such formalism, previously proposed for representing Bayesian priors. Here, we present an extension to the probabilistic aspects of Stochastic Logic Programs based on guards. We describe the main constructs by which the language combines probabilities with logical statements, along with examples, and illustrate their use in defining a Bayesian prior over classification trees.
Poster: Observational Languages and Ontologies
David Poole, University of British Columbia
Clinton Smyth, Georeference Online Ltd.
Over the last two decades, there have been a number of proposed and implemented languages for representing rich probabilistic models. However, it is not always easy to condition in these models. For rich models that persist or use heterogeneous sets of observations, it is usually impossible to tell the system everything that is observed unless there are implicit or explicit ontologies. Even when we have the language to describe the observations, when a model specifies a role, it is not obvious which object in the world, if any, fills that role. This poster presents these issues in the context of applications of what we call "semantic science" to landslide susceptibility and mineral exploration.
Poster: Multi-agent Markov Logic
Miroslav Dudík, Carnegie Mellon University
Geoff Gordon, Carnegie Mellon University
Austin McDonald, Carnegie Mellon University
Successful artificial agents should be able to exploit both deductive and probabilistic reasoning while incorporating knowledge about the incentives of other agents. To model multi-agent environments, we propose a language called multi-agent Markov logic (MAML). MAML subsumes first-order logic, Bayes nets, and extensive-form games without any sacrifice in representation size. We also propose a modular inference procedure, based on boosting, which calculates the maximum-entropy equilibrium of a MAML game. We demonstrate the utility of our approach on a small three-player poker variant.
Poster: Program inference by importance sampling
Georges Harik, HS Labs
Noam Shazeer, HS Labs
Poster: Open-Universe State Estimation with DBLOG
Rodrigo de Salvo Braz, UC Berkeley
Nimar Arora, UC Berkeley
Erik Sudderth, UC Berkeley
Stuart Russell, UC Berkeley
The paper considers the problem of state estimation, or filtering, in the open-universe context, where the set of objects comprising the state is initially unknown and may change over time, and objects are not uniquely identified. Although a general first-order open-universe modeling and inference system such as BLOG can handle this problem, generic BLOG inference cannot take advantage of the temporal structure and is therefore inefficient for extended state-estimation episodes. DBLOG, a specialization of BLOG for temporal applications, is described. Examining the inference behavior of DBLOG, we find an interesting phenomenon called retro-instantiation, whereby past histories of newly detected objects must be generated for correct inference to occur. This violates the standard "memoryless" recursive-update framework for state estimation. We examine cases where retro-instantiation can be correctly avoided, and find that they do not quite match the cases in the literature (especially for data association) where retro-instantiation is avoided. This seems to identify a bias arising from a tendency to treat absence of observations as absence of information.
Poster: Automatic Inference in PyBLOG
Nimar Arora, UC Berkeley
Rodrigo de Salvo Braz, UC Berkeley
Erik Sudderth, UC Berkeley
Stuart Russell, UC Berkeley
PyBLOG is a language built on top of Python for specifying Bayes Nets of arbitrary complexity. Its key features are automatic inference and extensibility. It can automatically identify efficient inference algorithms for a large class of models by analyzing the relationships between the random variables. Also, the user can extend the efficient inference to models with new distributions, as long as a few properties of these distributions and associated likelihood functions are provided.
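To make the setting concrete, here is a toy Bayes net queried by exhaustive enumeration in plain Python. The model and names are invented for illustration; this is not PyBLOG's actual API, which selects far more efficient inference algorithms automatically.

```python
# Toy two-node Bayes net, Rain -> WetGrass, with the posterior
# P(Rain | WetGrass) computed by enumerating the joint distribution.

P_RAIN = 0.2
P_WET_GIVEN_RAIN = {True: 0.9, False: 0.1}

def joint(rain, wet):
    p_r = P_RAIN if rain else 1.0 - P_RAIN
    p_w = P_WET_GIVEN_RAIN[rain]
    return p_r * (p_w if wet else 1.0 - p_w)

# P(Rain = True | WetGrass = True) by Bayes' rule over the enumeration
numerator = joint(True, True)
evidence = sum(joint(r, True) for r in (True, False))
posterior = numerator / evidence  # 0.18 / 0.26, about 0.692
```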
Poster: Dependency Diagrams for Probabilistic Models and their Sampling and Inference Algorithms
Todd Johnson, UC Irvine
Eric Mjolsness, UC Irvine
Poster: CP-Logic Theory Inference with Contextual Variable Elimination
Wannes Meert, Katholieke Universiteit Leuven
Jan Struyf, Katholieke Universiteit Leuven
Hendrik Blockeel, Katholieke Universiteit Leuven
We propose an approach to do inference for CP-logic that is based on Poole and Zhang's contextual variable elimination algorithm. Since there is a lot of contextual independence in a CP-theory, this allows for an inference strategy that is better suited for CP-logic. The new method outperforms a traditional variable elimination based approach with regard to both speed and memory consumption.
Poster: Bach: Probabilistic Declarative Programming
John Lloyd, Australian National University
Kee Siong Ng, National ICT Australia
Will Uther, National ICT Australia
This poster gives an introduction to the Bach functional logic programming language. Bach programs are equational theories in higher-order logic. At its core, Bach is the functional language Haskell, extended to admit logic programming idioms and to support probabilistic reasoning.
Higher-order functions allow straightforward modelling of probabilistic concepts such as joint and conditional probability densities. Efficient probabilistic inference is supported in part by several equations that capture important logical transformations. With these equations we can neatly and rigorously formalise both continuous Bayesian filtering (e.g. the Kalman filter) and (FOVE-style) lifted inference in our system. We have also implemented Monte Carlo sampling techniques in Bach for cases where exact computation is too expensive.
We compare Bach to other functional systems such as IBAL and Church, as well as other SRL systems. Finally, we note our approach to learning Bach models.
Poster: Embedded Probabilistic Programming
Chung-chieh Shan, Rutgers University
Oleg Kiselyov, FNMOC
Poster: Bayesian Programming formalism and ProBT API
Pierre Bessiere, CNRS - INRIA
J-M Ahuactzin, ProBayes, Inc.
E. Mazer, ProBayes, Inc.
K. Mekhnacha, ProBayes, Inc.
We present the Bayesian Programming formalism and the associated ProBT API and inference engine to specify and compute probabilistic models. ProBT is a commercial product but is available for free for academic purposes.
A Probabilistic Language for Biological Processes
Andrew Phillips, Microsoft Research Cambridge
This talk presents a programming language for designing and simulating computer models of biological processes. The language is based on a mathematical formalism known as the pi-calculus, and the simulation algorithm is based on standard kinetic theory of physical chemistry. The language is first presented using a simple graphical notation, which is subsequently used to model and simulate an immune system pathway relating to the detection of pathogens inside a cell. One of the benefits of the language is its ability to model large systems incrementally, by directly composing simpler models of subsystems.
Dyna: A Non-Probabilistic Programming Language for Probabilistic AI
Jason Eisner, Johns Hopkins University
The Dyna programming language is intended to provide a declarative abstraction layer for building systems in ML and AI. It extends logic programming with weights in a way that resembles functional programming. The weights are often probabilities. Yet Dyna does not enforce a probabilistic semantics, since many AI and ML methods work with inexact probabilities (e.g., bounds) and other numeric and non-numeric quantities. Instead, Dyna aims to provide a flexible abstraction layer that is "one level lower," and whose efficient implementation will be able to serve as infrastructure for building a variety of toolkits, languages, and specific systems.
PMTK: probabilistic modeling toolkit for Matlab
Matt Dunham, University of British Columbia
Kevin Murphy, University of British Columbia
PMTK is a new open-source Matlab toolkit for probabilistic data modeling. It emphasizes simplicity, uniformity, and efficiency. Simplicity and uniformity are achieved by posing all learning and prediction problems as Bayesian inference. Efficiency is obtained by allowing the user to specify different kinds of posterior approximations, such as point estimates, as well as supporting many different algorithms for computing these approximate posteriors, such as fast convex optimization methods. The basic data type is the "probability distribution"; distributions can be combined in various ways. Graphical models are treated like any other (multivariate) distribution. Non-parametric (conditional) distributions, such as Gaussian processes, are also supported.
FACTORIE: Efficient Probabilistic Programming for Relational Factor Graphs via Imperative Declarations of Structure, Inference and Learning
Andrew McCallum, University of Massachusetts Amherst
Discriminatively trained undirected graphical models, or conditional random fields, have had wide empirical success, and there has been increasing interest in toolkits that ease their application to complex relational data. Although there has been much historic interest in the combination of logic and probability, we argue that in this mixture "logic" is largely a red herring. The power in relational models is in their repeated structure and tied parameters; logic is not necessarily the best way to define these structures. Rather than using a declarative language, such as SQL or first-order logic, we advocate using an object-oriented imperative language to express various aspects of model structure, inference and learning. By combining the traditional, declarative, statistical semantics of factor graphs with imperative definitions of their construction and operation, we allow the user to mix declarative and procedural domain knowledge, and also gain significant efficiencies. We have implemented our ideas in a system we call FACTORIE, a software library written in Scala, an object-oriented, strongly-typed, functional JVM language. Joint work with Khashayar Rohanemanesh, Michael Wick, Karl Schultz, and Sameer Singh.
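The imperative style advocated here can be caricatured in a few lines of Python (all names invented; FACTORIE's real API is a Scala library): variables and factors are ordinary objects and functions, and the factor graph is assembled by running code rather than by writing a declarative specification.

```python
# Sketch of imperative factor-graph construction: factors are plain
# functions mapping the current variable values to log-linear scores.

class Var:
    def __init__(self, name, value):
        self.name = name
        self.value = value

class FactorGraph:
    def __init__(self):
        self.factors = []  # list of (variables, score_fn) pairs

    def add_factor(self, variables, score_fn):
        self.factors.append((variables, score_fn))

    def score(self):
        """Log-linear score of the current variable assignment."""
        return sum(fn(*(v.value for v in vs)) for vs, fn in self.factors)

g = FactorGraph()
a, b = Var("a", 1), Var("b", 1)
g.add_factor([a, b], lambda x, y: 2.0 if x == y else 0.0)  # agreement factor
score_agree = g.score()   # a == b, so the factor fires: 2.0
b.value = 0
score_differ = g.score()  # a != b, so the factor is silent: 0.0
```

Because structure is built by code, repeated structure and tied parameters fall out of ordinary loops and shared function objects.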
Infer.NET and Csoft
John Winn, Microsoft Research Cambridge
Infer.NET is a framework for performing Bayesian inference by message passing in graphical models. The first version of Infer.NET constructed in-memory factor graphs and performed message passing by traversing these graphs. This commonly-used architecture was found to have two major drawbacks: that it added significant overhead to the inference procedure and that the message passing code rapidly became complex and unmaintainable as additional algorithm features were added. In this talk, I will describe how we overcame these problems in the second version of Infer.NET by using a programming language called Csoft as our internal representation and architecting the framework as a Csoft compiler. I will show how the new architecture has allowed us to provide a rich set of inference features whilst still being efficient, flexible and easy to maintain.
Constraint-based Probabilistic Modeling
Taisuke Sato, Tokyo Institute of Technology
We propose constraint-based probabilistic models, which define a joint distribution over possible worlds X consisting of random Boolean variables that are independent but constrained to satisfy a knowledge base KB (a set of clauses) acting as logical constraints. These models cover (discrete) log-linear models and Bayesian networks. We also propose an EM algorithm for parameter learning in constraint-based probabilistic models.
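A minimal sketch of this semantics, with invented probabilities and clauses: draw each Boolean variable independently, then condition on the KB by discarding the worlds that violate it and renormalizing.

```python
import itertools

# Invented example: independent Boolean variables x, y with the
# marginal probabilities below, conditioned on the clausal KB
#   (x or y) and (not x or not y)   -- "exactly one of x, y"

P = {"x": 0.6, "y": 0.3}

def prior(world):
    p = 1.0
    for name, value in world.items():
        p *= P[name] if value else 1.0 - P[name]
    return p

def satisfies_kb(world):
    return (world["x"] or world["y"]) and not (world["x"] and world["y"])

worlds = [dict(zip(P, bits))
          for bits in itertools.product([False, True], repeat=len(P))]
Z = sum(prior(w) for w in worlds if satisfies_kb(w))   # 0.6*0.7 + 0.4*0.3
posterior = {tuple(sorted(w.items())): prior(w) / Z
             for w in worlds if satisfies_kb(w)}
```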
Probabilistic Programming for Cognitive Science
Noah Goodman, MIT
Syntactic Independence in Stochastic Functional Languages
David McAllester, Toyota Technological Institute at Chicago
This talk considers stochastic functional languages in general as probabilistic modeling languages. For models defined by stochastic functional programs, independence plays a central role in both exact inference and in MCMC sampling. This talk presents a nontrivial syntactic independence criterion and gives examples where the criterion greatly improves the efficiency of both exact inference and MCMC sampling.
Towards Digesting the Alphabet-Soup of Statistical Relational Learning
Luc De Raedt, Katholieke Universiteit Leuven
I will report on our work towards the development of a probabilistic logic programming environment intended as a target language in which other probabilistic languages can be compiled, thereby contributing to the digestion of the "alphabet soup" of statistical relational learning. The language combines principles of the probabilistic Prolog environments, ProbLog and PRISM, and is fully integrated in YAP-Prolog. Providing such a low-level efficient probabilistic programming language should facilitate making comparisons and experimental evaluations of different statistical relational learning approaches. Joint work with Bart Demoen; Daan Fierens; Bernd Gutmann; Gerda Janssens; Angelika Kimmig; Niels Landwehr; Theofrastos Mantadelis; Wannes Meert; Ricardo Rocha; Vitor Santos Costa; Ingo Thon; Joost Vennekens.
Invited Talk
Gregor Kiczales, University of British Columbia
Demos
Alchemy (http://alchemy.cs.washington.edu/)
Pedro Domingos, University of Washington
Alchemy is an open-source implementation of Markov logic. It includes a suite of inference and learning algorithms, standard programming language features, a library of existing Markov logic networks, APIs for plugging in new algorithms, datasets, tutorial materials, etc. Joint work with Stanley Kok, Daniel Lowd, Hoifung Poon, Matt Richardson, Parag Singla, Marc Sumner, and Jue Wang.
Infer.NET (http://research.microsoft.com/infernet)
John Winn, Microsoft Research Cambridge
Infer.NET is a free .NET framework for probabilistic inference, with a straightforward API based on random variables. Our goal is to make it easy to perform Bayesian inference in a broad range of models without having to hand code the algorithm each time. You can think of it as a rapid prototyping tool for machine learning applications. Compared to existing inference tools like MSBN, Infer.NET is unique in that it can perform Expectation Propagation and Variational Message Passing on models with both discrete and continuous variables. It is also uniquely structured as a compiler, taking a .NET program with random variables as input and producing a specially-optimized inference program on output. We will show demonstrations of using Infer.NET for classification with Bayes Point Machines, learning ranking functions for web search, and inferring document relevance from click-through data.
PMTK (http://cs.ubc.ca/~murphyk/pmtk/)
Matt Dunham, University of British Columbia
Kevin Murphy, University of British Columbia
PRISM (http://sato-www.cs.titech.ac.jp/prism/)
Taisuke Sato, Tokyo Institute of Technology
PRISM is a logic-based, Turing-complete language for generative modeling, covering Bayesian networks, HMMs, PCFGs, etc., with a unified semantics (the distribution semantics) and a unified computation mechanism (dynamic programming on explanation graphs). It also supports (hyper-)parameter learning, Viterbi inference, and Bayesian inference by variational Bayes.
FACTORIE
Andrew McCallum, University of Massachusetts Amherst
Khashayar Rohanemanesh, University of Massachusetts Amherst
Michael Wick, University of Massachusetts Amherst
Karl Schultz, University of Massachusetts Amherst
Sameer Singh, University of Massachusetts Amherst
FACTORIE is a system for building relational factor graphs, designed with an emphasis on MCMC inference and discriminative parameter estimation. Rather than using a declarative language, FACTORIE is characterized by its users' ability to leverage imperative constructs to specify variable-factor structure, to coordinate among variable values, to map variables values to factors' sufficient statistics, to define constraint-preserving proposal distributions, and to provide efficient learning via sample-rank. It is implemented as a software library in Scala, an object-oriented, strongly-typed, functional programming language that runs in the JVM.
Church
Noah Goodman, MIT
Vikash Mansinghka, MIT
Daniel Roy, MIT
Josh Tenenbaum, MIT
PyBLOG
Nimar Arora, UC Berkeley
Rodrigo de Salvo Braz, UC Berkeley
Erik Sudderth, UC Berkeley
Stuart Russell, UC Berkeley
PyBLOG is a language built on top of Python for specifying Bayes Nets of arbitrary complexity. Its key features are automatic inference and extensibility. It can automatically identify efficient inference algorithms for a large class of models by analyzing the relationships between the random variables. Also, the user can extend the efficient inference to models with new distributions, as long as a few properties of these distributions and associated likelihood functions are provided.
CILog2 (http://cs.ubc.ca/spider/poole/aibook/code/cilog/CILog2.html)
David Poole, University of British Columbia
This implements the Independent Choice Logic (logic programs, which can include function symbols and negation as failure, extended with probabilistic choices). Note that this is mature code that we use for teaching, with sophisticated debugging/explanation tools. It has been used by hundreds of students and researchers over the last decade to explore relational, Turing-complete probabilistic models. There are also examples of models from other people's papers on that web page.
HazardMatch (http://georeferenceonline.com) and MineMatch (http://georeferenceonline.com/MineMatch)
Clinton Smyth, Georeference Online Ltd.
David Poole, University of British Columbia
HazardMatch and MineMatch are fielded systems. Note that these are designed for vertical domains (landslide prediction and mineral exploration) and contain hundreds of models and tens of thousands of observations of the world. Representing arbitrary examples in these systems is harder (building complex probabilistic models is difficult!), as the examples need to refer to the ontology of the data. But we can add new landslide models or mineral-occurrence models easily!
ProBT API (http://probayes.com)
Pierre Bessiere, CNRS - INRIA
J-M Ahuactzin, ProBayes, Inc.
E. Mazer, ProBayes, Inc.
K. Mekhnacha, ProBayes, Inc.
We present the Bayesian Programming formalism and the associated ProBT API and inference engine to specify and compute probabilistic models. ProBT is a commercial product but is available for free for academic purposes.