By Shandana Mufti
The internet is home to a staggering amount of information, much of which is accessible through web searches. Matthew Ekstrand-Abueg, a PhD candidate in Computer Science specializing in information retrieval, studies and evaluates summarization systems that condense large numbers of webpages, with a focus on temporal summarization: summaries that track how information emerges over time.
“Say some major event comes out and you want to look at how information is released over time and summarize that,” he explains. “How do you do that and how do you evaluate how good systems are at doing that?”
Those are the questions at the heart of his research, and ones he’s been digging into during his six years at Northeastern. In addition to his PhD work, Matthew has completed two internships, one at Microsoft Research and one at Google Research, where he worked on information retrieval projects.
His first internship, at Microsoft Research in 2012, was spent evaluating how to translate webpages from desktop to mobile, and how to make that shift automatic. “We take a bunch of pages that are rendered both for desktop and mobile and see if we can find patterns and find machine learning algorithms that could translate those pages automatically to make them look more appropriate for a mobile environment,” Matthew explains. This includes evaluating what modifications need to be made and what parts of the page can be kept or removed in order to adapt a webpage to a mobile browser.
Last year, Matthew was an information retrieval intern at Google Research. The project he worked on focused on automatically identifying and extracting medical symptoms from forms filled in by users writing about the symptoms they experience. Previous information retrieval work on medical text has mined clinical texts that a doctor or med student might use, but patients and web users describe symptoms in plainer, less technical language. “We looked at how the average person writes information about symptoms and learned some information about that linguistic style,” Matthew says.
The internships Matthew completed helped shape and further his PhD work. For example, his work at Google Research involved a text learning method similar to the one he uses for his temporal summarization work. As part of his doctoral research, Matthew looks at how people write and how to determine whether multiple pieces of content contain similar information, even when written by different people.
“It’s all about learning to understand how people write things and how you can figure out what sort of information is similar even if it’s written in different ways – if it’s paraphrased or written by different people,” he says. “Those technologies that were used were very similar for the two applications.”
Matthew says he hopes to earn his PhD this year and then pursue postdoc positions or jobs in industry research. His internship employers are on his list: “I could see myself ending up somewhere like one of the research groups like Google Research or maybe Microsoft or Yahoo,” he says.