177 Huntington Avenue
Boston, MA 02115
ATTN: Jan-Willem van de Meent, 202 WVH
360 Huntington Avenue
Boston, MA 02115
- PhD in Theoretical Physics, Leiden University
Jan-Willem van de Meent is an assistant professor of Computer Science at Northeastern University’s Khoury College of Computer Sciences. Prior to joining Northeastern, Professor van de Meent held positions as a postdoctoral researcher at the University of Oxford and Columbia University. Professor van de Meent holds a PhD in Theoretical Physics from Leiden University.
Professor van de Meent’s interests lie at the interface of programming languages and machine learning research. He is one of the creators of Anglican, a probabilistic programming system integrated with the Clojure language. His research aims to understand how probabilistic programs can be used to define structured and composable models in machine learning and artificial intelligence. His past contributions span granular physics, biological fluid mechanics, and machine learning for single-molecule biophysics.
NCS-FO: Leveraging Deep Probabilistic Models to Understand the Neural Bases of Subjective Experience
Different individuals experience the same events in vastly different ways, owing to their unique histories and psychological dispositions. For someone with social fears and anxieties, the mere thought of leaving the home can induce a feeling of panic. Conversely, an experienced mountaineer may feel quite comfortable balancing on the edge of a cliff. This variation of perspectives is captured by the term subjective experience. Despite its centrality and ubiquity in human cognition, it remains unclear how to model the neural bases of subjective experience. The proposed work will develop new techniques for statistical modeling of individual variation, and apply these techniques to a neuroimaging study of the subjective experience of fear. Together, these two lines of research will yield fundamental insights into the neural bases of fear experience. More generally, the developed computational framework will provide a means of comparing different mathematical hypotheses about the relationship between neural activity and individual differences. This will enable investigation of a broad range of phenomena in psychology and cognitive neuroscience.
The proposed work will develop a new computational framework for modeling individual variation in neuroimaging data, and use this framework to investigate the neural bases of one powerful and societally meaningful subjective experience, namely, of fear. Fear is a particularly useful assay because it involves variation across situational contexts (spiders, heights, and social situations), and dispositions (arachnophobia, acrophobia, and agoraphobia) that combine to create subjective experience. In the proposed neuroimaging study, participants will be scanned while watching videos that induce varying levels of arousal. To characterize individual variation in this neuroimaging data, the investigators will leverage advances in deep probabilistic programming to develop probabilistic variants of factor analysis models. These models infer a low-dimensional feature vector, also known as an embedding, for each participant and stimulus. A simple neural network models the relationship between embeddings and the neural response. This network can be trained in a data-driven manner and can be parameterized in a variety of ways, depending on the experimental design, or the neurocognitive hypotheses that are to be incorporated into the model. This provides the necessary infrastructure to test different neural models of fear. Concretely, the investigators will compare a model in which fear has its own unique circuit (i.e. neural signature or biomarker) to subject- or situation-specific neural architectures. More generally, the developed framework can be adapted to model individual variation in neuroimaging studies in other experimental settings.
Babak Esmaeili, Hao Wu, Sarthak Jain, Alican Bozkurt, N. Siddharth, Brooks Paige, Dana H. Brooks, Jennifer Dy, Jan-Willem van de Meent. "Structured Disentangled Representations." The 22nd International Conference on Artificial Intelligence and Statistics.
Deep latent-variable models learn representations of high-dimensional data in an unsupervised manner. A number of recent efforts have focused on learning representations that disentangle statistically independent axes of variation by introducing modifications to the standard objective function. These approaches generally assume a simple diagonal Gaussian prior and as a result are not able to reliably disentangle discrete factors of variation. We propose a two-level hierarchical objective to control the relative degree of statistical independence between blocks of variables and individual variables within blocks. We derive this objective as a generalization of the evidence lower bound, which allows us to explicitly represent the trade-offs between mutual information between data and representation, KL divergence between representation and prior, and coverage of the support of the empirical data distribution. Experiments on a variety of datasets demonstrate that our objective can not only disentangle discrete variables, but that doing so also improves disentanglement of other variables and, importantly, generalization even to unseen combinations of factors.
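For reference, the standard evidence lower bound that this objective is said to generalize can be written as below. This is the textbook form only, not the paper's two-level objective, and the symbols (decoder parameters \theta, encoder parameters \phi, latent variable z) are notation assumed here rather than taken from the abstract.

```latex
\log p_\theta(x) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\bigl[\log p_\theta(x \mid z)\bigr]
  \;-\; \mathrm{KL}\bigl(q_\phi(z \mid x) \,\|\, p(z)\bigr)
```

The first term rewards reconstruction of the data from the representation; the second penalizes divergence of the encoder distribution from the prior p(z).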
Esmaeili, Babak, Hongyi Huang, Byron C. Wallace and Jan-Willem van de Meent. “Structured Neural Topic Models for Reviews.” AISTATS (2018).
We present Variational Aspect-based Latent Topic Allocation (VALTA), a family of autoencoding topic models that learn aspect-based representations of reviews. VALTA defines a user-item encoder that maps bag-of-words vectors for combined reviews associated with each paired user and item onto structured embeddings, which in turn define per-aspect topic weights. We model individual reviews in a structured manner by inferring an aspect assignment for each sentence in a given review, where the per-aspect topic weights obtained by the user-item encoder serve to define a mixture over topics, conditioned on the aspect. The result is an autoencoding neural topic model for reviews, which can be trained in a fully unsupervised manner to learn topics that are structured into aspects. Experimental evaluation on a large number of datasets demonstrates that aspects are interpretable, yield higher coherence scores than non-structured autoencoding topic model variants, and can be utilized to perform aspect-based comparison and genre discovery.
Jain, Sarthak, Edward Banner, Jan-Willem van de Meent, Iain James Marshall and Byron C. Wallace. “Learning Disentangled Representations of Texts with Application to Biomedical Abstracts.” EMNLP (2018).
We propose a method for learning disentangled representations of texts that code for distinct and complementary aspects, with the aim of affording efficient model transfer and interpretability. To induce disentangled embeddings, we propose an adversarial objective based on the (dis)similarity between triplets of documents with respect to specific aspects. Our motivating application is embedding biomedical abstracts describing clinical trials in a manner that disentangles the populations, interventions, and outcomes in a given trial. We show that our method learns representations that encode these clinically salient aspects, and that these can be effectively used to perform aspect-specific retrieval. We demonstrate that the approach generalizes beyond our motivating application in experiments on two multi-aspect review corpora.
Narayanaswamy, Siddharth, Brooks Paige, Jan-Willem van de Meent, Alban Desmaison, Noah D. Goodman, Pushmeet Kohli, Frank D. Wood and Philip H. S. Torr. “Learning Disentangled Representations with Semi-Supervised Deep Generative Models.” NIPS (2017).
Variational autoencoders (VAEs) learn representations of data by jointly training a probabilistic encoder and decoder network. Typically these models encode all features of the data into a single variable. Here we are interested in learning disentangled representations that encode distinct aspects of the data into separate variables. We propose to learn such representations using model architectures that generalise from standard VAEs, employing a general graphical model structure in the encoder and decoder. This allows us to train partially-specified models that make relatively strong assumptions about a subset of interpretable variables and rely on the flexibility of neural networks to learn representations for the remaining variables. We further define a general objective for semi-supervised learning in this model class, which can be approximated using an importance sampling procedure. We evaluate our framework’s ability to learn disentangled representations, both by qualitative exploration of its generative capacity, and quantitative evaluation of its discriminative ability on a variety of models and datasets.
Rainforth, T., Le, T. A., van de Meent, J.-W., Osborne, M. A., & Wood, F. (2016). Bayesian Optimization for Probabilistic Programs. In Advances in Neural Information Processing Systems (pp. 280–288).
We present the first general purpose framework for marginal maximum a posteriori estimation of probabilistic program variables. By using a series of code transformations, the evidence of any probabilistic program, and therefore of any graphical model, can be optimized with respect to an arbitrary subset of its sampled variables. To carry out this optimization, we develop the first Bayesian optimization package to directly exploit the source code of its target, leading to innovations in problem-independent hyperpriors, unbounded optimization, and implicit constraint satisfaction; delivering significant performance improvements over prominent existing packages. We present applications of our method to a number of tasks including engineering design and parameter optimization.
Rainforth, T., Naesseth, C. A., Lindsten, F., Paige, B., van de Meent, J.-W., Doucet, A., & Wood, F. (2016). Interacting Particle Markov Chain Monte Carlo. In Proceedings of The 33rd International Conference on Machine Learning (pp. 2616–2625).
We introduce interacting particle Markov chain Monte Carlo (iPMCMC), a PMCMC method based on an interacting pool of standard and conditional sequential Monte Carlo samplers. Like related methods, iPMCMC is a Markov chain Monte Carlo sampler on an extended space. We present empirical results that show significant improvements in mixing rates relative to both noninteracting PMCMC samplers and a single PMCMC sampler with an equivalent memory and computational budget. An additional advantage of the iPMCMC method is that it is suitable for distributed and multi-core architectures.
Tolpin, D., van de Meent, J.-W., Paige, B., & Wood, F. (2015). Output-Sensitive Adaptive Metropolis-Hastings for Probabilistic Programs. In A. Appice, P. P. Rodrigues, V. Santos Costa, J. Gama, A. Jorge, & C. Soares (Eds.), Machine Learning and Knowledge Discovery in Databases.
We introduce an adaptive output-sensitive Metropolis-Hastings algorithm for probabilistic models expressed as programs, Adaptive Lightweight Metropolis-Hastings (AdLMH). This algorithm extends Lightweight Metropolis-Hastings (LMH) by adjusting the probabilities of proposing random variables for modification to improve convergence of the program output. We show that AdLMH converges to the correct equilibrium distribution and compare convergence of AdLMH to that of LMH on several test problems to highlight different aspects of the adaptation scheme. We observe consistent improvement in convergence on the test problems.
van de Meent, J.-W., Yang, H., Mansinghka, V., & Wood, F. (2015). Particle Gibbs with Ancestor Sampling for Probabilistic Programs. In Artificial Intelligence and Statistics.
Particle Markov chain Monte Carlo techniques rank among current state-of-the-art methods for probabilistic program inference. A drawback of these techniques is that they rely on importance resampling, which results in degenerate particle trajectories and a low effective sample size for variables sampled early in a program. We here develop a formalism to adapt ancestor resampling, a technique that mitigates particle degeneracy, to the probabilistic programming setting. We present empirical results that demonstrate nontrivial performance gains.
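The notion of a low effective sample size mentioned above can be made concrete with a minimal sketch. It computes the commonly used Kish estimate, (sum of weights)^2 / (sum of squared weights), from unnormalized log importance weights. The function name `effective_sample_size` is hypothetical and is not taken from the paper or from Anglican.

```python
import math


def effective_sample_size(log_weights):
    """Kish effective sample size from unnormalized log importance weights.

    Illustrative helper only; the name and interface are assumptions.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_weights)
    w = [math.exp(lw - m) for lw in log_weights]
    total = sum(w)
    # The estimate is invariant to rescaling the weights.
    return total * total / sum(x * x for x in w)


# Equal weights: every particle contributes, so the ESS equals the particle count.
uniform = effective_sample_size([0.0, 0.0, 0.0, 0.0])

# One dominant weight: the particle set is degenerate and the ESS collapses toward 1.
degenerate = effective_sample_size([0.0, -1000.0, -1000.0])
```

With equal weights the estimate equals the number of particles, while a single dominant weight drives it toward one, which is the degeneracy the abstract describes for variables sampled early in a program.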
Tolpin, D., van de Meent, J.-W., Yang, H., & Wood, F. (2016). Design and Implementation of Probabilistic Programming Language Anglican. In Proceedings of the 28th Symposium on the Implementation and Application of Functional Programming Languages.
Anglican is a probabilistic programming system designed to interoperate with Clojure and other JVM languages. We introduce the programming language Anglican, outline our design choices, and discuss in depth the implementation of the Anglican language and runtime, including macro-based compilation, extended CPS-based evaluation model, and functional representations for probabilistic paradigms, such as a distribution, a random process, and an inference algorithm.
We show that a probabilistic functional language can be implemented efficiently and integrated tightly with a conventional functional language with only moderate computational overhead. We also demonstrate how advanced probabilistic modelling concepts are mapped naturally to the functional foundation.
van de Meent, J.-W., Paige, B., Tolpin, D., & Wood, F. (2016). An Interface for Black Box Learning in Probabilistic Programs. In POPL Workshop on Probabilistic Programming Semantics.
In this work, we explore how probabilistic programs can be used to represent policies in sequential decision problems. In this formulation, a probabilistic program is a black-box stochastic simulator for both the problem domain and the agent. We relate classic policy gradient techniques to recently introduced black-box variational methods which generalize to probabilistic program inference. We present case studies in the Canadian traveler problem, Rock Sample, and a benchmark for optimal diagnosis inspired by Guess Who. Each study illustrates how programs can efficiently represent policies using moderate numbers of parameters.
Wood, F., van de Meent, J.-W., & Mansinghka, V. (2014). A new approach to probabilistic programming inference. In Artificial Intelligence and Statistics (pp. 1024–1032).
We introduce and demonstrate a new approach to inference in expressive probabilistic programming languages based on particle Markov chain Monte Carlo. Our approach is simple to implement and easy to parallelize. It applies to Turing-complete probabilistic programming languages and supports accurate inference in models that make use of complex control flow, including stochastic recursion. It also includes primitives from Bayesian nonparametric statistics. Our experiments show that this approach can be more efficient than previously introduced single-site Metropolis-Hastings methods.
Emmett, K. J., Rosenstein, J. K., van de Meent, J.-W., Shepard, K. L., & Wiggins, C. H. (2015). Statistical Inference for Nanopore Sequencing with a Biased Random Walk Model. Biophysical Journal, 108(April), 1852–1855.
Nanopore sequencing promises long read-lengths and single-molecule resolution, but the stochastic motion of the DNA molecule inside the pore is, as of this writing, a barrier to high accuracy reads. We develop a method of statistical inference that explicitly accounts for this error, and demonstrate that high accuracy (>99%) sequence inference is feasible even under highly diffusive motion by using a hidden Markov model to jointly analyze multiple stochastic reads. Using this model, we place bounds on achievable inference accuracy under a range of experimental parameters.
Johnson, S., van de Meent, J.-W., Phillips, R., Wiggins, C. H., & Linden, M. (2014). Multiple LacI-mediated loops revealed by Bayesian statistics and tethered particle motion. Nucleic Acids Research, gku563.
The bacterial transcription factor LacI loops DNA by binding to two separate locations on the DNA simultaneously. Despite being one of the best-studied model systems for transcriptional regulation, the number and conformations of loop structures accessible to LacI remain unclear, though the importance of multiple coexisting loops has been implicated in interactions between LacI and other cellular regulators of gene expression. To probe this issue, we have developed a new analysis method for tethered particle motion, a versatile and commonly used in vitro single-molecule technique. Our method, vbTPM, performs variational Bayesian inference in hidden Markov models. It learns the number of distinct states (i.e. DNA–protein conformations) directly from tethered particle motion data with better resolution than existing methods, while easily correcting for common experimental artifacts. Studying short (roughly 100 bp) LacI-mediated loops, we provide evidence for three distinct loop structures, more than previously reported in single-molecule studies. Moreover, our results confirm that changes in LacI conformation and DNA-binding topology both contribute to the repertoire of LacI-mediated loops formed in vitro, and provide qualitatively new input for models of looping and transcriptional regulation. We expect vbTPM to be broadly useful for probing complex protein–nucleic acid interactions.
Gansell, A. R., van de Meent, J.-W., Zairis, S., & Wiggins, C. H. (2014). Stylistic clusters and the Syrian/South Syrian tradition of first-millennium BCE Levantine ivory carving: A machine learning approach. Journal of Archaeological Science, 44, 194–205.
Thousands of first-millennium BCE ivory carvings have been excavated from Neo-Assyrian sites in Mesopotamia (primarily Nimrud, Khorsabad, and Arslan Tash) hundreds of miles from their Levantine production contexts. At present, their specific manufacture dates and workshop localities are unknown. Relying on subjective, visual methods, scholars have grappled with their classification and regional attribution for over a century. This study combines visual approaches with machine-learning techniques to offer data-driven perspectives on the classification and attribution of this early Iron Age corpus. The study sample consisted of 162 sculptures of female figures. We have developed an algorithm that clusters the ivories based on a combination of descriptive and anthropometric data. The resulting categories, which are based on purely statistical criteria, show good agreement with conventional art historical classifications, while revealing new perspectives, especially with regard to the contested Syrian/South Syrian/Intermediate tradition. Specifically, we have identified that objects of the Syrian/South Syrian/Intermediate tradition may be more closely related to Phoenician objects than to North Syrian objects; we offer a reconsideration of a subset of Phoenician objects, and we confirm Syrian/South Syrian/Intermediate stylistic subgroups that might distinguish networks of acquisition among the sites of Nimrud, Khorsabad, Arslan Tash and the Levant. We have also identified which features are most significant in our cluster assignments and might thereby be most diagnostic of regional carving traditions. In short, our study both corroborates traditional visual classification methods and demonstrates how machine-learning techniques may be employed to reveal complementary information not accessible through the exclusively visual analysis of an archaeological corpus.