Professor Contributes to Science and Health through Data

December 17, 2014

A section of tissue on a glass slide holds biological matter. It also contains something else that matters to Associate Professor Olga Vitek: data, up to five gigabytes on a single slide.

Trained in statistics, the new Northeastern University faculty member with a joint appointment in the College of Computer and Information Science (CCIS) and the College of Science focuses on computational biology and statistical bioinformatics. Her wide-ranging research areas include mass spectrometry, particularly proteomics, or the analysis of the structure and function of proteins. She is also interested in integrating and interpreting other large-scale genomic, metabolomic, and ionomic datasets to study complex biological systems. Vitek’s research has earned her a National Science Foundation CAREER Award and recognition as a University Faculty Scholar at Purdue University, where she led the Laboratory for Statistical Proteomics and Biostatistics before coming to Northeastern.

Vitek develops statistical methods and algorithms to optimize the design of experiments and to ensure that the large-scale and complex datasets that result from these experiments can be interpreted in an accurate, objective, and reproducible manner.

“I’m like a detective,” says Vitek, who first became interested in bioinformatics during her PhD program at Purdue and then held a postdoctoral position at the Institute for Systems Biology in Seattle. “Biological processes leave evidence in the form of experimental data. My goal is to use this evidence, extract useful information, and uncover facts and events that are hidden from the human eye.”

For example, Vitek’s work helps scientists better understand and compare the chemical composition of biological tissue and infer how the spatial distribution of molecules varies between healthy and unhealthy tissue segments.

“We have to distinguish the systematic and the nonsystematic variation in the measurements,” Vitek explains. “Besides changes in chemical composition caused by disease, there are also biological differences from one person to another and between different tissue types. The mass spectrometric measurement technology itself also has variability and noise. We need to distinguish the clinically relevant changes from nonsystematic variation and artifacts.”

In addition to developing statistical methods, Vitek and her research group share them with others through open-source statistical software and adapt the algorithms and the implementation to make it possible to analyze large datasets. Vitek is also an active educator who participates in numerous short courses worldwide to train life scientists in statistical practice.

Now that she’s at Northeastern, Vitek will train CCIS students as well. This spring, she’ll teach a course called Statistics for Big Data, designed for MS and PhD students in computer science.

Meanwhile, Vitek is focused on continuing to expand her research group’s expertise in statistical inference and interpretation of large and complex scientific data. She says, “I want to help life scientists design and conduct informative and cost-effective experiments, and this can only be done with statistical considerations in mind. Providing practical and objective solutions to real-life problems and knowing that this research benefits human health is the main driver of my work.”