CS 7480: Topics in Programming Languages: Probabilistic Programming

The course web page for the Fall 2021 offering of CS 7480, Topics in Programming Languages: Probabilistic Programming.

  • Instructor: Steven Holtzen, s.holtzen@northeastern.edu.
    Feel free to email me with any questions, but please include CS7480 in the subject line.
  • Session: Fall 2021
  • Student Office Hours: Tuesday 11AM – 12PM EST (Also available by appointment)
  • Meeting Time: Tues/Fri 3:25PM – 5:05PM
  • Location: International Village 022

Description

Probabilistic programming languages (PPLs) use the syntax and semantics of programming languages to define probabilistic models. PPLs enable a diverse audience – data scientists, systems designers, medical doctors, etc. – to design and reason about probabilistic systems. PPLs are becoming a central topic in artificial intelligence and programming languages research, with increasing interest from industry and academia in designing and applying PPLs.

The goal of this course is to introduce the core ideas of probabilistic programming languages: probabilistic inference, semantics, program analysis, language design, and applications. The course will consist of formal lectures as well as research presentations by students surveying the modern landscape of developments in probabilistic programming. There will be a minor project involving implementing a probabilistic programming language, and a self-directed term project that aims to deeply explore some of the core ideas of probabilistic programming.

Prerequisites

The target audience for this course is advanced master’s students and PhD students in programming languages, artificial intelligence, and machine learning. There are no formal prerequisites, but students are expected to be comfortable programming and to have mathematical maturity. In particular, students should be familiar with the notion of mathematical proof and comfortable programming in at least one major programming language.

Organization & Grading

The course will consist in part of (1) lectures delivered by the instructor; (2) student presentations on research projects and research papers; (3) a minor project; and (4) a self-directed term project. Grades will follow a standard letter-grade curve, and points will be apportioned as follows:

  • 20%: Course Participation and Presentations
    • Participation is an important part of this class. Please inform the instructor if you need to miss class; no excuse is necessary.
  • 30%: Minor Project (Link to syllabus)
  • 50%: Final Project (Link to syllabus)

Each assignment will have an accompanying syllabus with a description of how the grade is apportioned within it.

Schedule

This schedule is tentative and subject to change, including reordering of topics.

| # | Date | Topic | Information |
|---|---|---|---|
| 1 | Friday Sept. 10 | Introduction and Course Overview | Lecture 1 Slides |
| 2 | Tuesday Sept. 14 | Logic and Propositional Reasoning | Lecture 2 Slides |
| 3 | Friday Sept. 17 | Foundations of Probability | Lecture 3 Slides |
| 4 | Tuesday Sept. 21 | SimPPL Introduction | Lecture 4 Slides |
| 5 | Friday Sept. 24 | Inference I: SimPPL Inference | Lecture 5 Slides |
| 6 | Tuesday Sept. 28 | Inference II: Exact Inference | Lecture 6 Slides |
| 7 | Friday Oct. 1 | Inference III: Circuit Compilation | Lecture 7 Slides |
| 8 | Tuesday Oct. 5 | Inference IV: Approximate Inference | Lecture 8 Slides |
| 9 | Friday Oct. 8 | Paper Discussion | Semantics of Probabilistic Programming: A Gentle Introduction link |
| 10 | Tuesday Oct. 12 | Paper Discussion | A Language for Counterfactual Generative Models link |
| 11 | Friday Oct. 15 | Paper Discussion (Discussant: Sam Stites) | Automatic Reparameterisation of Probabilistic Programs link |
| 12 | Tuesday Oct. 19 | Paper Discussion (Discussant: Abdelrahman Madkour) | Neurally-Guided Procedural Models: Amortized Inference for Procedural Graphics Programs using Neural Networks link |
| 13 | Friday Oct. 22 | Paper Discussion (Discussant: John Guower) | Conditional Independence by Typing link |
| 14 | Tuesday Oct. 26 | Intro to Separation Logic (Discussant: John Li) | A Primer on Separation Logic link. Optional: Separation Logic: A Logic for Shared Mutable Data Structures link |
| 15 | Friday Oct. 29 | Paper Discussion (Discussant: John Li) | A Probabilistic Separation Logic link |
| | | Minor Project Due, Term Project Proposal & Scoping | |
| 16 | Tuesday Nov. 2 | Term Project Proposal Discussion | Email Steven 4 slides on your chosen topic (Overview, Motivation, Outcomes, Challenges) |
| 17 | Friday Nov. 5 | Paper Discussion (Discussant: Sai Joseph) | Deep Structural Causal Models for Tractable Counterfactual Inference link |
| | | Probabilistic Logic Programming | |
| 18 | Tuesday Nov. 9 | Paper Discussion | Introduction to Logic Programming (Chapters 1, 2, and 3) link. Download SWI-Prolog or use this website, try Exercises 1.1, 1.3, 2.1 |
| 19 | Friday Nov. 12 | Paper Discussion | PRISM: A Language for Symbolic-Statistical Modeling link (Supplemental reading: link) |
| 20 | Tuesday Nov. 16 | Paper Discussion (Discussant: Ritwik Anand) | Inference in probabilistic logic programs using weighted CNF's link |
| 21 | Friday Nov. 19 | Paper Discussion (Discussant: Luke Van Der Male) | DeepProbLog: Neural probabilistic logic programming link |
| 22 | Tuesday Nov. 23 | Paper Discussion | Hinge-Loss Markov Random Fields and Probabilistic Soft Logic link |
| | | Term Project Check-In | |
| 23 | Friday Nov. 26 | No Class | |
| | | Generating Probabilistic Programs | |
| 24 | Tuesday Nov. 30 | Paper Discussion | Bayesian Synthesis of Probabilistic Programs for Automatic Data Modeling link |
| 25 | Friday Dec. 3 | Paper Discussion | Neural Sketch Learning for Conditional Program Generation link |
| | | Probability-Guided Program Analysis | |
| 26 | Tuesday Dec. 7 | Paper Discussion | Predicting Program Properties from "Big Code" link |
| 27 | Friday Dec. 10 | Paper Discussion | Evaluating Large Language Models Trained on Code link |
| | | Term Project Discussions | |
| 28 | Tuesday Dec. 14 | Term Project Discussion #1 | |
| 29 | Friday Dec. 17 | Term Project Discussion #2 | Final Term Project Deadline |

Paper Presentations

An important part of the class is becoming familiar with the modern literature in probabilistic programming. The reading will be chosen at least one week in advance and posted on the course website. Before each session, everyone is expected to fill out the following reading questionnaire:

  • What is the main motivation for the paper? What problem is it solving?
  • What is the paper’s main technical contribution? What tools does it use?
  • Does the paper live up to its claims? Why or why not?
  • How well does the paper situate itself in the broader literature? Did it miss anything important?
  • Was the paper easy to understand? Which parts were confusing or well-explained?
  • Did you have any questions while reading?

Please make sure your questionnaire is accessible to you during class. Each paper will have a discussant whose job it is to choose which paper to read. Everyone is expected to be a discussant at least once. In addition to the usual reading questionnaire, discussants should fill out the following discussant questionnaire:

  • Why did you choose this paper?
  • Where does this paper sit in the research landscape? Is it a significant milestone, or more of an incremental improvement? Justify your answer; this can be subjective!
  • What is one thing you would do to improve or extend upon the ideas in this paper? This can be an extension, a technical clarification, etc.
  • Is there any follow-up work to the paper, or very closely related work that we should know about?

Please be prepared to speak in front of the class for approximately 15-20 minutes about your answers to these questions. If you have something else interesting that you want to bring up, please feel free to do so: this is a chance for you to be creative.

Selecting readings. A list of suggested papers can be found at this link.

Additional Materials

This course does not have a required textbook, but it does by necessity rely on a number of topics including programming languages, artificial intelligence, computational complexity, and machine learning. Some useful materials include:

Related courses taught at other universities: