From Probabilistic Programming
universal languages and inference; systems; and applications
December 13th, 2008
Organizers: Daniel Roy (MIT), Vikash Mansinghka (MIT), John Winn (MSR-Cambridge), David McAllester (TTI-Chicago), Joshua Tenenbaum (MIT)
If you want to join our workshop mailing list and receive important announcements directly, please visit the following link and sign up: http://lists.csail.mit.edu/mailman/listinfo/probabilistic-programming
- OCT 21 2008 Submissions Due
- OCT 28 2008 Notifications Sent
- DEC 13 2008 Workshop, first day
- DEC 14 2008 Workshop, second day (confirmed)
SATURDAY, DEC. 13, 2008
The first day of the workshop will be held at the Westin in the Alpine CD conference room.
SUNDAY, DEC. 14, 2008
The goal of the second day is to provide more unstructured time where participants can discuss specific issues, brainstorm, and start collaborations. We have five talks planned in the morning, although we may spread these out. We will be finalizing this schedule in the next week.
|8:00am|| Constraint-based Probabilistic Modeling|
Taisuke Sato Tokyo Institute of Technology
|8:30am|| Probabilistic Programming for Cognitive Science|
Noah Goodman MIT
|9:15am|| Syntactic Independence in Stochastic Functional Languages|
David McAllester Toyota Technological Institute at Chicago
|9:45am|| Towards Digesting the Alphabet-Soup of Statistical Relational Learning|
Luc de Raedt Katholieke Universiteit Leuven
|10:15am||The Programming Languages Prospective Panel|
|10:45am||Invited Talk by Gregor Kiczales University of British Columbia|
|2:00pm||Brainstorming Session: Next steps|
Probabilistic graphical models provide a formal lingua franca for modeling and a common target for efficient inference algorithms. Their introduction gave rise to an extensive body of work in machine learning, statistics, robotics, vision, biology, neuroscience, artificial intelligence (AI) and cognitive science. However, many of the most innovative and useful probabilistic models published by the NIPS community far outstrip the representational capacity of graphical models and associated inference techniques. Models are communicated using a mix of natural language, pseudo code, and mathematical formulae and solved using special purpose, one-off inference methods. Rather than precise specifications suitable for automatic inference, graphical models typically serve as coarse, high-level descriptions, eliding critical aspects such as fine-grained independence, abstraction and recursion.
PROBABILISTIC PROGRAMMING LANGUAGES in AI (e.g., IBAL [14,15,30], Csoft/Infer.NET , Church , PTP , PFP/Haskell ; and logical approaches including BLOG , iBLOG , PRiSM , BLP , SLP , Markov Logic ) aim to close this representational gap, unifying general purpose programming with probabilistic modeling; literally, users specify a probabilistic model in its entirety (e.g., by writing code that generates a sample from the joint distribution) and inference follows automatically given the specification. These languages provide the full power of modern programming languages for describing complex distributions, and can enable reuse of libraries of models, support interactive modeling and formal verification, and provide a much-needed abstraction barrier to foster generic, efficient inference in universal model classes.
We believe that the probabilistic programming language approach within AI, which has been emerging over the last 10 years yet builds on over 40 years of work in range of diverse fields including mathematical logic, theoretical computer science, formal methods, programming languages, as well as machine learning, computational statistics, systems biology, probabilistic AI, has the potential to fundamentally change the way we understand, design, build, test and deploy probabilistic systems. A NIPS workshop will be a unique opportunity for this diverse community to meet, share ideas, collaborate, and help plot the course of this exciting research area.
One of the central goals of the workshop is to bridge the NIPS, UAI and programming languages communities, introducing PL researchers to a wealth of critical statistical machine learning problems and yielding an influx of new mathematical ideas, compilation techniques, and application problems to NIPS.
Topics of the workshop will include (but are not limited to):
- languages, in particular programming languages, for specifying joint and conditional distributions (i.e., probabilistic models)
- inference algorithms for executing probabilistic programs, exact and approximate, deterministic and randomized
- killer applications (what can we do now that we could not before)
- learning and using structured representations of uncertain beliefs; large-scale probabilistic knowledge engineering
- relationships between logical and procedural representations
- how to build useful systems and environments for different user groups
- trade-offs between the various existing languages and systems for probabilistic programming
- how programming language research can help us, and vice versa:
automatic differentiation; contracts; partial evaluation; identifying mathematical relationships between fundamental statistical notions like exchangeability, conditional independence, and random measures; and programming language notions like referential transparency, concurrency, and higher-order procedures; framing key conjectures and open problems
- computable analysis and probability theory, e.g. finding exact, effective representations of continuous distributions
- analysis of probabilistic programs and recursive stochastic processes and mechanical verification of probabilistic models and algorithms
Several recent probabilistic programming languages support modeling idioms and exploit inference methods pioneered at NIPS, and are direct outgrowths of the NIPS research community. For example, Csoft is a probabilistic language based on C# used within the Infer.NET inference engine (Winn and Minka ) to apply variational message passing or expectation propagation to large models. It has been effectively applied in large-scale problems in information retrieval and computational biology. In addition to being Turing-universal, Church (Goodman, Mansinghka, Roy and Tenenbaum ) naturally handles recursive stochastic processes, nonparametric Bayesian models (e.g. the Dirichlet process, adaptor grammars , and the HDP-PCFG ) and planning-as-inference ; exploits Monte Carlo inference methods widely used in many probabilistic modeling papers at NIPS; and has drawn interest from NIPS-oriented cognitive scientists. Other languages are outgrowths of overlapping communities, especially probabilistic AI: IBAL (Pfeffer ; Pfeffer, McAllester, Koller ; Ramsey and Pfeffer ) emphasizes efficient dynamic programs for exact inference; PTP (Park, Pfenning, Thrun ) has been applied to simulation problems in robotics.
Research into probabilistic programming languages extends far beyond the NIPS community. There is a large body of work within theoretical computer science that studies the mathematical foundations of lambda-calculus with probabilistic choice, much of it geared towards the analysis of randomized algorithms [28, 29, 33, 34, 35]. On the applied/modeling side, programming language ideas are filtering into systems biology; Phillips and Cardelli [31, 32] have modeled the kinetics of thousands of interacting chemical networks in cells using their SPiM language (a stochastic pi-calculus) to capture the modular structure and recurrent concurrency inherent in the physical chemistry and molecular biology of these systems. Compared to languages coming from the machine learning and probabilistic AI communities, recent languages introduced by members of the PL community generally have not addressed the use of such languages for statistical inference; notable exceptions are PTP (mentioned above) and PFP/Haskell , which has been used to build computational models of micro-RNA formation, but is limited to finite domains where enumeration is possible. Interestingly, preliminary evidence suggests that providing state-of-the-art approximate inference in models specified by probabilistic programs will require generalizations of program analysis and compilation; bringing PL expertise in these area in contact with NIPS expertise on inference seems especially fruitful to both communities.
Finally, machine learning researchers interested in specifying or learning logical representations of data have developed a host of probabilistic logics and approximate inference algorithms (e.g., Markov Logic , BLOG , iBLOG , npBLOG , BLP , PRiSM , SLP ) which approach the expressiveness gap from a fundamentally different mathematical perspective, in some sense analogous to the difference between undirected (logics) and directed (programs) graphical models. Characterizing the relationships between deductive/logical and procedural/probabilistic-programming approaches (and exchanging techniques and applications) is a central challenge ideally suited for NIPS.
This one day workshop will consist of invited and contributed talks, spotlights, poster sessions, and round-table discussions. Due to the considerable interest we have already received and a widely stated preference for a highly collaborative workshop environment, we anticipate spending the bulk of the workshop on poster sessions and open discussion, with very few talks.
We are also soliciting feedback on whether a second day focused more on work and fostering collaboration would be of interest to our attendees, especially those only attending the workshops. If you would be interested in an extra day (timed so as to allow late afternoon/evening flights out of Vancouver), please contact us.
CALL FOR CONTRIBUTIONS
Authors interested in presenting their work and ideas at the workshop should send an email with subject "Workshop Submission" to email@example.com and include:
- a title
- authors and emails
- abstract (2 pages, NIPS style, PDF or PostScript)
Organizer Mailing List: firstname.lastname@example.org
We are grateful to Microsoft Research Cambridge for generously offering to help fund the workshop.
 H. Abelson and G. Sussman. Structure and Interpretation of Computer Programs. MIT Press, 1996.
 J. Winn and C. M. Bishop Variational Message Passing Journal of Machine Learning Research , Volume 6, pp. 661-694, 2005.
 K. A. Bonawitz. Composable Probabilistic Inference with Blaise. PhD thesis, MIT, 2008.
 A. Church. A Set of Postulates for the Foundation of Logic. The Annals of Mathematics, 33(2):346–366, 1932.
 M. Johnson, T. Griffiths, and S. Goldwater. Adaptor grammars: A framework for specifying compositional nonparametric Bayesian models. NIPS 19, 2007.
 R. Kelsey, W. Clinger, and J. Rees (eds.). Revised5 Report on the Algorithmic Language Scheme. Higher-Order and Symbolic Computation, 11(1):7–105, 1998.
 N. D. Goodman, V. K. Mansighka, D. Roy, K. Bonawitz, J. B. Tenenbaum. Church: a language for generative models. In Uncertainty in Artificial Intelligence 2008.
 J. Winn and T. Minka. Infer.NET/Csoft website: http://research.microsoft.com/mlp/ml/Infer/Csoft.htm. Aug 1, 2008.
 D.J. Lunn, A. Thomas, N. Best, and D. Spiegelhalter. WinBUGS-A Bayesian modelling framework: Concepts, structure, and extensibility. Statistics and Computing, 10(4):325–337, 2000.
 D. McAllester, B. Milch, and N. D. Goodman. Random-world semantics and syntactic independence for expressive languages. Technical Report MIT-CSAIL-TR-2008-025, Massachusetts Institute of Technology, 2008.
 J. McCarthy. A Basis for a Mathematical Theory of Computation. In Computer Programming and Formal Systems, pages 33–70, 1963.
 B. Milch, B. Marthi, S. Russell, D. Sontag, D.L. Ong, and A. Kolobov. BLOG: Probabilistic models with unknown objects. Proc. IJCAI, 2005.
 S. Muggleton. Stochastic logic programs. In L. de Raedt, editor, Advances in Inductive Logic Programming, pages 254–264. IOS Press, 1996.
 A. Pfeffer. IBAL: A probabilistic rational programming language. Proc. IJCAI, 2001.
 D. Koller, D. McAllester, and A. Pfeffer (1997). "Effective Bayesian Inference for Stochastic Programs." Proceedings of the 14th National Conference on Artificial Intelligence (AAAI) (pp. 740-747).
 A. Radul. Report on the probabilistic language scheme. Technical Report MIT-CSAIL-TR-2007-059, Massachusetts Institute of Technology, 2007.
 J. Reynolds. Deﬁnitional interpreters for higher-order programming. ACM Annual Conference, pages 717–740, 1972.
 M. Richardson and P. Domingos. Markov logic networks. Machine Learning, 62(1):107–136, 2006.
 T. Sato and Y. Kameya. PRISM: A symbolic-statistical modeling language. In International Joint Conference on Artiﬁcial Intelligence, 1997.
 P. Carbonetto, J. Kisynski, N. de Freitas and D. Poole. Nonparametric Bayesian Logic . UAI 2005.
 M. Toussaint, S. Harmeling, and A. Storkey. Probabilistic inference for solving (PO)MDPs. Technical Report EDI-INF-RR-0934, University of Edinburgh, 2006.
 K. Kersting and L. De Raedt. Bayesian Logic Programming: Theory and Tool. Chapter to appear in L. Getoor and B. Taskar, editors, An Introduction to Statistical Relational Learning, MIT Press.
 Y.W. Teh, H. Daume III and D.M. Roy. Bayesian Agglomerative Clustering with Coalescents. NIPS 2007.
 P. Liang, S. Petrov, M. Jordan, D. Klein. The infinite PCFG using hierarchical Dirichlet processes. EMNLP/CoNLL, 2007.
 S. Park, F. Pfenning and S. Thrun. A Probabilistic Language Based Upon Sampling Functions. POPL 2004 (pgs 171–182)
 M. Erwig and S. Kollmansberger. Probabilistic Functional Programming in Haskell, Journal of Functional Programming, Vol. 16, No. 1, 21-34, 2006
 E. Allen, J. Carrington, M. Erwig, K. Kasschau and S. Kollmansberger. Computational Modeling of microRNA Formation and Target Differentiation in Plants.
 E. Hehner. Probabilistic Predicative Programming. Pages 169–185 of: 7th Int. Conf. on Mathematics of Program Construction. 2004. LNCS, vol. 3125.
 C. Morgan, A. McIver and K. Seidel. Probabilistic Predicate Transformers. ACM Transactions on Programming Languages and Systems, 1996. 18(3), 325–353.
 N. Ramsey and A. Pfeffer. Stochastic Lambda Calculus and Monads of Probability Distributions. POPL 2002.
 A. Phillips and L. Cardelli. A Correct Abstract Machine for the Stochastic Pi-calculus. In Proceedings of Concurrent Models in Molecular Biology. August 2004.
 R. Blossey, L. Cardelli and A. Phillips. Compositionality, Stochasticity and Cooperativity in Dynamic Models of Gene Regulation. HFSP Journal, 2(1):17–28 February 2008.
 J. Hurd. Formal Verification of Probabilistic Algorithms. PhD thesis, University of Cambridge, 2002.
 J. Hurd, A. McIver, and C. Morgan. Probabilistic guarded commands mechanized in HOL. Theoretical Computer Science, 346:96–112, November 2005.
 M. Schroeder: Admissible Representations of Probability Measures. Electr. Notes Theor. Comput. Sci. 167: 61-78 (2007).
 D. McAllester, B. Milch, N. D. Goodman. Random-World Semantics and Syntactic Independence for Expressive Languages. Technical Report MIT-CSAIL-TR-2008-025, Massachusetts Institute of Technology (2008).
 D. Kozen. Semantics of probabilistic programs. Journal of Computer and System Sciences, 22(3):328--350, 1981.
 Sungwoo Park. A calculus for probabilistic languages. ACM SIGPLAN Workshop on Types in Language Design and Implementation, 2003.