Ness R.O., Sachs K., Mallick P., Vitek O. (2017) A Bayesian Active Learning Experimental Design for Inferring Signaling Networks. In: Sahinalp S. (eds) Research in Computational Molecular Biology. RECOMB 2017. Lecture Notes in Computer Science, vol 10229. Springer, Cham
Machine learning methods for learning network structure, applied to quantitative proteomics experiments, reverse-engineer intracellular signal transduction networks. They provide insight into the rewiring of signaling within the context of a disease or a phenotype. To learn the causal patterns of influence between proteins in the network, the methods require experiments that include targeted interventions that fix the activity of specific proteins. However, the interventions are costly and add experimental complexity.
We describe an active learning strategy for selecting optimal interventions. Our approach takes as inputs pathway databases and historic datasets, expresses them in the form of prior probability distributions on network structures, and selects interventions that maximize their expected contribution to structure learning. Evaluations on simulated and real data show that the strategy reduces the detection error of validated edges as compared to an unguided choice of interventions, and avoids redundant interventions, thereby increasing the effectiveness of the experiment.
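The selection idea can be illustrated with a toy sketch. This is not the method from the paper: it assumes a hypothetical two-protein network with three candidate structures, a made-up prior, deterministic intervention outcomes, and expected information gain (reduction in entropy over structures) as a stand-in for "expected contribution to structure learning". All names and numbers below are illustrative assumptions.

```python
import math

def entropy(dist):
    """Shannon entropy (in bits) of a dict mapping item -> probability."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def expected_information_gain(prior, likelihood):
    """Expected entropy reduction over structures from one intervention.

    prior:      dict structure -> prior probability
    likelihood: dict structure -> dict outcome -> P(outcome | structure)
    """
    outcomes = {o for dist in likelihood.values() for o in dist}
    gain = entropy(prior)
    for outcome in outcomes:
        # Marginal probability of observing this outcome under the prior.
        p_outcome = sum(prior[s] * likelihood[s].get(outcome, 0.0) for s in prior)
        if p_outcome == 0:
            continue
        # Bayes update of the structure distribution given the outcome.
        posterior = {
            s: prior[s] * likelihood[s].get(outcome, 0.0) / p_outcome
            for s in prior
        }
        gain -= p_outcome * entropy(posterior)
    return gain

# Hypothetical prior over structures (e.g. as if derived from a pathway database).
prior = {"A->B": 0.6, "B->A": 0.2, "no edge": 0.2}

# Assumed deterministic outcomes: fixing a protein changes the other one
# only if there is a causal edge out of the fixed protein.
interventions = {
    "fix A": {
        "A->B": {"B responds": 1.0},
        "B->A": {"no response": 1.0},
        "no edge": {"no response": 1.0},
    },
    "fix B": {
        "A->B": {"no response": 1.0},
        "B->A": {"A responds": 1.0},
        "no edge": {"no response": 1.0},
    },
}

gains = {
    name: expected_information_gain(prior, lik)
    for name, lik in interventions.items()
}
best = max(gains, key=gains.get)
print(best, gains)
```

Under this assumed prior, fixing A is preferred because its outcome separates the most probable structure from the other two. A realistic setting would involve many more candidate structures, noisy measurements, and sequential selection across several interventions.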
From Correlation to Causality: Statistical Approaches to Learning Regulatory Relationships in Large-Scale Biomolecular Investigations
R. Ness, K. Sachs, O. Vitek. “From correlation to causality: statistical approaches to learning regulatory relationships in large-scale biomolecular investigations”. Journal of Proteome Research, in press, 2016.
Causal inference, the task of uncovering regulatory relationships between components of biomolecular pathways and networks, is a primary goal of many high-throughput investigations. Statistical associations between observed protein concentrations can suggest an enticing number of hypotheses regarding the underlying causal interactions, but when do such associations reflect the underlying causal biomolecular mechanisms? The goal of this perspective is to provide suggestions for causal inference in large-scale experiments, which utilize high-throughput technologies such as mass-spectrometry-based proteomics. We describe in nontechnical terms the pitfalls of inference in large data sets and suggest methods to overcome these pitfalls and reliably find regulatory associations.