Northeastern mechanistic interpretability workshop aims to make sense of AI systems
At a largely student-organized Northeastern workshop in late August, more than 250 researchers gathered to unlock the mysteries of the AI technology that's taken the world by storm.
If your car breaks down, you can call a mechanic, someone who understands how the parts fit together and function. When something goes wrong with an AI model, who do you call for help?
This question is at the heart of mechanistic interpretability, or “mech interp,” a growing research field that aims to open up the “black box” of AI systems and understand them internally, at a structural level.
At the second New England Mechanistic Interpretability (NEMI) workshop, a one-day event held at Northeastern University, academic and industry researchers, students, and professionals gathered to explore this very challenge and share progress toward making AI systems more understandable.
“Usually, people look at AI by giving it a different input and seeing how the output changes,” said Koyena Pal, a doctoral student at Khoury College and the lead organizer of the 2024 and 2025 NEMI workshops. “But there’s a whole process inside that this field tries to decouple.”
The event, held on August 22 in Northeastern’s Curry Student Center Ballroom, drew more than 200 in-person attendees, some of whom traveled from abroad, while a YouTube livestream accommodated around 50 virtual participants.
“We filled up the room,” said David Bau, an assistant professor at Khoury College, senior organizer of the NEMI workshop, and director of Northeastern’s National Deep Inference Fabric (NDIF) project, which strives to unpack the mysteries of large AI systems. “There are a lot of people in the Boston area who are interested in not just using AI, but trying to explain how the mystery of AI works.”
To bring this vision to life, Pal and the team leaned on community input and the creative use of technology. Working alongside Pal were organizers Alex Loftus and Aruna Sankaranarayanan, logistics supporter Heather Sciacca, and senior program committee members Jacob Andreas, Himabindu Lakkaraju, and Najoung Kim.
After polling participants, Pal and her co-organizers reviewed papers and noted attendees’ requests to meet specific people at the workshop. They then used a large language model — an AI program trained on large quantities of text — to identify overarching topics of interest. With 24 tables available, the organizers generated 24 topics and recruited three or four participants to moderate each table.
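The pipeline Pal describes can be sketched in a few lines of Python. The version below is purely illustrative and not the organizers’ actual tooling; the model name, prompt wording, and sample survey responses are all assumptions.

```python
# Hypothetical sketch of the matchmaking step: hand participants'
# stated interests to an LLM and ask it for 24 table topics.
# The model name, prompt, and sample data are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

interests = [
    "sparse autoencoders for feature discovery",
    "circuit analysis in vision transformers",
    "how generative music models represent structure",
    # ... one entry per survey response
]

prompt = (
    "Group the following research interests into 24 overarching "
    "discussion-table topics. Return one topic per line.\n\n"
    + "\n".join(f"- {s}" for s in interests)
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
)
topics = response.choices[0].message.content.splitlines()
print(topics)
```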
“I kind of felt like a matchmaker,” Pal said. “I felt like I had full access to know what people were interested in and what they were doing.”
The organizers’ efforts paid off.
“At the conference itself, it was nice to know that it was very noisy,” Pal said. “That means people were actually talking! There were hallway discussions, discussions across all the tables, and in the poster rooms.”
Hands-on demo sessions followed, showcasing new tools and initiatives from the mech interp research community.
“We demoed our main software, which is called NNsight. We also demoed some new features that are coming out,” said Emma Bortz, NDIF’s technical community outreach and education manager and co-organizer of the NEMI workshop. “We host models for researchers to run these experiments on. We’re kind of an infrastructure for mechanistic interpretability, and we will soon be releasing many new models that researchers can access for free on our platform.”
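NNsight is NDIF’s open-source Python library for exactly this kind of work: tracing a model’s internal computations while it runs, rather than treating it as a black box. Here is a minimal sketch based on the library’s published tracing interface; the model and layer choice are arbitrary.

```python
# Minimal NNsight sketch: inspect a hidden state inside GPT-2 while it
# processes a prompt, instead of only observing the model's output.
# Follows nnsight's documented tracing API; the layer index is arbitrary.
from nnsight import LanguageModel

model = LanguageModel("openai-community/gpt2", device_map="auto")

with model.trace("The Eiffel Tower is in the city of"):
    # Save the residual-stream activations leaving transformer block 8.
    hidden = model.transformer.h[8].output[0].save()

# After the trace exits, the saved proxy holds a real tensor
# (older nnsight versions require hidden.value instead).
print(hidden.shape)  # (batch, sequence length, hidden size)
```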
Among the new releases is the Logit Lens Workbench UI, a no-code visualization and analysis tool that allows users to interact with machine learning models without writing a single line of code.
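The workbench takes its name from the “logit lens,” a standard interpretability technique: project each layer’s intermediate hidden state through the model’s final layer norm and unembedding matrix to read off what the model would predict at that depth. For readers who want to see the idea in code, here is a minimal sketch using GPT-2 via Hugging Face’s transformers library; the workbench itself exposes the same idea with no code at all.

```python
# Minimal logit-lens sketch: decode each layer's intermediate hidden
# state as if it were the model's final one, showing how the prediction
# takes shape layer by layer. Illustration only; the workbench UI
# exposes the same technique without code.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

inputs = tok("The Eiffel Tower is in the city of", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

for layer, h in enumerate(out.hidden_states):
    # Apply the final layer norm and the unembedding matrix to the
    # hidden state at the last token position.
    logits = model.lm_head(model.transformer.ln_f(h[0, -1]))
    top_token = tok.decode(logits.argmax().item())
    print(f"layer {layer:2d} -> {top_token!r}")
```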
“Getting us speaking the same language is, I think, the first step toward this interdisciplinary work,” Bortz said. “This is why we’re developing these no-code user interfaces.”
With its emphasis on interdisciplinary research, the workshop reflected the fact that AI is no longer confined to computer science.
“You want to involve the philosophers, you want to involve the doctors, you want to involve the lawyers, because they’re the ones who really understand the concepts underneath the field,” Bau said.
Bau sees involving other subject-matter experts as crucial to ensuring the technology is applied responsibly.
“A legal expert might be able to explain to me what it is that the AI is thinking, and we could crack open its neurons together and figure out how it all works,” Bau said. “It’s really hard to do that on your own. These workshops are a chance to bring people together and share all these different perspectives.”
One example of such interdisciplinary research was “Discovering Interpretable Concepts in Large Generative Music Models,” in which Dartmouth professor Nikhil Singh and his colleagues combined machine learning and music theory to explore how generative models understand and represent musical structure.
“How do we bridge the gap between the raw statistical horsepower of these models and the structured conceptual vocabulary we humans use?” the paper asks.
Ultimately, the workshop organizers hoped to reveal not only how AI models think, but also how much we have yet to uncover.
“Science is usually a difficult, painstaking, gradual process, but in AI we’re seeing pretty rapid advances,” Bau said. “One of the takeaways that I hope everybody brings with them is what an amazing time it is to be studying this stuff.”