“Only a partial representation of society”: Meet the researcher addressing bias in AI
For many of those crusading against bias in AI models, the name of the game is to remove the bias. For Lucy Havens, who blends computing with data visualization and archival experience, the goal isn't removal — it's transparency.
Millions of people interact with AI systems like ChatGPT and Gemini every day. However, these models rely on massive datasets that can embed bias, producing skewed outputs for users and potential feedback loops of data bias. This puts historically marginalized groups and the wider public at risk, and while developers have worked to combat bias in their models, Khoury College postdoctoral research fellow Lucy Havens sees that as a losing battle.
In Havens’s view, AI comes with inherent biases. Rather than rely on limited mitigation approaches that might catch obvious hate speech or derogatory language but leave contextual bias in place, she advocates for a different approach.
“I don’t think there’s a fix to bias,” Havens said. “We need to work on transparency — more about acknowledging what has been measured and what has not been measured, what data has been collected and what has not been collected.”
In addition to her computer science work, Havens has experience in libraries and archives, and as research libraries grapple with how to compare the numerous AI tools being marketed to them, she sees significant gaps in AI evaluation practices.
“The world is a much more dynamic environment than the lab,” she noted. “That complexity is overlooked when innovations and achievements in AI are communicated to the public. The typical metrics and benchmarks used to evaluate AI models to promote their capabilities have little relevance to many real-world settings.”
Beyond shortcomings in the large language models themselves, users bring their own cognitive biases, with lived experiences shaping how they think about and use AI. As part of the Human-Data Interaction team at the Roux Institute, Havens works with Mahsan Nourani, an expert in responsible AI who focuses on how cognitive biases and users’ backgrounds affect human–AI interaction.
Day to day, Havens conducts literature reviews, designs and runs research projects and user studies, and meets with the team and its industry partners. She sees AI literacy as key to an AI-shaped society.
“We don’t just want people to understand how AI works on a technical level; we also want to help people become aware of common but mistaken assumptions — for example, that technology will be more balanced or fair than humans,” she said. “AI is trained on human data to make decisions, and humans are often biased.”
From her own studies, Havens gives the example of gender bias in training data: women are often described in relation to men rather than in relation to their own work, interests, or accomplishments. A female artist’s first mention might be her marriage to a male artist or the label of “woman artist,” while a man is described simply as an “artist.”
“People use AI for summarization, so the same groups that have been overlooked will continue to be overlooked,” she continued. “AI-generated summaries will perpetuate those biases.”
With AI-generated summaries incorporated into search engines, Havens is also wary of complex topics being oversimplified, especially when there are different sides to an argument or when distinct disciplines are blended in research.

“I think some friction is actually a good thing,” she explained. “The idea that everything should be simpler and easier — the world doesn’t work that way.”
According to the United Nations, only 68% of the world’s population uses the internet, which leaves billions of people unrepresented or underrepresented in datasets.
“It’s important to avoid generalizing the knowledge or intelligence of these AI systems,” Havens added. “Data can only ever be a partial representation of global society.”
Additionally, much of the world’s knowledge has not been digitized or can’t be easily represented as data. For instance, less than five percent of the holdings of the United States National Archives and Records Administration, which include everything from manuscripts to photographs, has been digitized. Each object’s metadata must be created manually, and most archives have troves of material that aren’t represented in their catalogs or online databases, let alone digitized.
“The world can’t be perfectly replicated in digital spaces — there will always be something that’s lost,” Havens said. “We’re using methods designed for a narrow lab context to gauge the progress of AI models. But the technology is being deployed in so many different contexts.”
Havens knew for a long time that she wanted to work on how information is accessed and presented to people. During her time in libraries, archives, and museums, she found inspiration in data visualization, seeing how information could be presented beyond a simple list of search results.
“I got interested in those possibilities, especially when you’re not sure what you’re searching for,” she said.
Havens has presented her work at a wide array of conferences, most recently in Japan last spring at ACM CHI. Her paper there was awarded an Honorable Mention, placing it in the top 5% of submissions.
Havens moved to Maine to join Northeastern’s Roux Institute this past summer.
“What’s cool about the Roux Institute are the industry partnerships it’s developing with local companies and startups,” she said. “I was very interested in the end users of AI, and it’s inspiring to hear about all these different companies finding innovative and socially impactful ways to leverage AI and data-driven technologies in general.”