Khoury master’s researchers showcase their fall 2025 work in Boston
In December, after a semester of dedication and investigation, Khoury College's master's student researchers displayed their findings on a broad array of technical and practical topics.
On December 2, Khoury College master’s student researchers gathered in the Boston campus’s Alumni Center to present the fruits of their labors, with topics ranging from smarter fitness monitors to tutor matching to robot tracking. To learn more about their research, click any of the linked names below, or simply read on:
Ankith Seethesh Vaidya
Comparison of Open-Ended Sensemaking Models for Passive Sensing Data: Analyzing GLOSS and SensorChat Towards a Hybrid Architecture

Our smartwatches and fitness trackers passively collect our health data, but making sense of that data can be challenging. Typically, these wearable devices track one metric at a time, showing a user their step count or heart rate instead of combining these metrics to make higher-level assessments.
That’s where models like GLOSS and SensorChat come in. They use large language models and machine learning to help users and clinicians make sense of multi-modal data. You could ask these models, “Based on my step count, what will my heart rate be tomorrow?”
Vaidya, advised by Varun Mishra, is researching the differences between GLOSS and SensorChat, hoping to pave the way for a more sophisticated model that combines the positives of both to improve everyday health monitoring.
“SensorChat gives output in a faster way. GLOSS is a little bit slower. GLOSS is completely open-ended. SensorChat is not completely open-ended. So how about we take the good things from SensorChat and implement them in GLOSS?” Vaidya says.
Barath Balamurugan
Virtual Reality for Effective Collaboration Between Remote Outposts

Balamurugan, advised by Wallace Lages, is researching the effectiveness of virtual reality tools across remote spaces. He is comparing team collaboration over video calls (e.g. Zoom) with collaboration in 3D VR environments using Apple Vision Pro and Slime VR Trackers — small devices that track the movements of the body and create a 3D virtual avatar.
This VR environment can enable coworkers in different parts of the globe to see and explore a 3D space together. Balamurugan anticipates seeing higher cohesion, more balanced participation, and stronger nonverbal synchrony among the group using VR tools. This research could be used by teams operating in remote or extreme environments, such as space analog missions, remote scientific labs, or distributed medical teams.
“You could be on Earth, I could be on the moon, but we will both be in a shared environment in which we can communicate face to face,” Balamurugan says.
Bhavana Rajan Nair
Improving LLM Annotation Accuracy for Medical Entity Extraction

Medical information annotation is important for prescribing medications, especially those with many side effects. It is a time-consuming and expensive task that requires skilled annotators.
That’s why Nair is exploring the ability of large language models (LLMs) to annotate medical information, knowing that the LLMs must be as reliable as human annotators, if not more so.
Nair trained Meta’s Llama-2 model using 1,500 Reddit posts about the weight-loss drugs Ozempic and Wegovy — two medications that often prompt online discourse about unexpected side effects. Annotated by Nair and evaluated by her advisor, Deahan Yu, the posts mix complex medical language with informal phrasing, reflecting patients’ inconsistent descriptions of symptoms and adverse drug events.
Llama-2 scored 20% accuracy on the annotation test on its first try; after fine-tuning with low-rank adaptation (LoRA), its score jumped to 75%. Because the 1,500-entry dataset required heavy computational processing, Nair fine-tuned the parameters further, which improved the accuracy score again.
With a more robust dataset, LLMs have the potential to achieve human-level performance in medical annotation.
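For the curious, low-rank adaptation itself is a standard technique: rather than updating all of a model’s billions of weights, it trains small adapter matrices layered on top of them. Below is a minimal, illustrative sketch using Hugging Face’s peft library; the model name, hyperparameters, and data file are assumptions, not Nair’s actual configuration.

```python
# Minimal LoRA fine-tuning sketch with Hugging Face peft (illustrative only).
# Model ID, hyperparameters, and the dataset file are assumptions, not Nair's setup.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                          TrainingArguments, DataCollatorForLanguageModeling)

base = "meta-llama/Llama-2-7b-hf"            # gated checkpoint; requires access approval
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA: train small low-rank adapter matrices instead of all 7B parameters.
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()            # typically well under 1% of the full model

# Hypothetical JSONL file of annotated posts: {"text": "<post plus its annotation>"}
data = load_dataset("json", data_files="annotated_posts.jsonl")["train"]
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama2-lora-ade", num_train_epochs=3,
                           per_device_train_batch_size=2, learning_rate=2e-4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```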
Hrishika Rakesh Samani and Dalton Burkhart
Improving Students’ Performance Through Active Learning and Automation

Active learning is used in many educational settings to engage students in class activities, including assignments, collaborations, and Q&As. However, Samani and Burkhart note, the practice has limitations — namely the time-intensive, draining work of rapid grading, continuous feedback, and personalized routines that teachers must carry out.
Advised by Chieh Wu, the duo automated this active learning style by creating an AI-powered application that generates questions about lessons, grades student answers immediately to promote critical thinking, analyzes student performance, and identifies those in need of support so instructors can assist them efficiently.
Samani and Burkhart plan to examine AI-driven, real-time practice questions for students based on course context, as well as automatic grading of different forms of questions. They also identified the need to integrate their system into existing learning platforms like Canvas and Blackboard, and to add natural language processing features for questions and explanations.
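As one concrete, purely illustrative way automatic grading of short answers can work, a system might compare each student answer to a reference answer using sentence embeddings. The sketch below uses the sentence-transformers library and is not the duo’s implementation; the model, threshold, and answers are assumptions.

```python
# Illustrative automatic short-answer grading via embedding similarity.
# Generic sketch, not Samani and Burkhart's application.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

reference = "Active learning engages students through in-class activities and feedback."
answers = {
    "alice": "It means students participate in activities and get feedback during class.",
    "bob": "It is when the teacher lectures for the whole period.",
}

ref_emb = model.encode(reference, convert_to_tensor=True)
for student, answer in answers.items():
    score = util.cos_sim(ref_emb, model.encode(answer, convert_to_tensor=True)).item()
    flag = "ok" if score >= 0.6 else "needs instructor review"   # assumed cutoff
    print(f"{student}: similarity={score:.2f} -> {flag}")
```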
Grace Chong
Are Current University Policies on Artificial Intelligence Adequate for Interdisciplinary Programs?

Is it OK for students to use generative AI in coursework? Depends on who you ask. There is disagreement not only among students and professors, but among professors and department heads across the country, too.
Chong, advised by Youna Jung, is evaluating the 80 top-ranked computer science colleges and universities to assess trends and identify gaps in their AI policies. While academic integrity is a consistent concern, policies are often light on details, leaving professors in the lurch and students held to inconsistent expectations.
Chong’s research specifically looks at computer science courses open to non-CS students, revealing a widening gap between how and why CS students use AI compared to non-CS students.
“CS students are much less fearful of generative AI tools than their non-STEM counterparts, who are very fearful of AI taking over,” Chong says.
Chong’s research aims to help educators form more consistent and effective policies surrounding classroom AI use.
Manas Aggrawal
Typed Conversational Interfaces

Holding a natural-language conversation with a computer — through ChatGPT or a Snapchat bot, say — is a daily occurrence for many. But many chatbots lose accuracy and conversational robustness when they can’t reliably interpret the casual, free-form messages human users send.
To improve users’ experience with chatbots, Aggrawal, advised by Chris Martens, broke down conversational rules to create more reliable two-way dialogues. He applied typed domain-specific languages — small, purpose-built languages whose commands carry types that can be checked — to conversational interfaces. Drawing on programming language theory, Aggrawal designs, analyzes, and classifies the languages used to instruct computers.
Through these interfaces, Aggrawal aims to have chatbots interpret requests through a typed domain-specific language, resulting in fewer mistakes and making the bots more trustworthy and flexible for various fields of work, like finance or health care. He looks forward to using Agda, a dependently typed proof assistant, to cross-check the accuracy of information within conversations.
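To give a flavor of what “typed” means here, below is a toy example: a chatbot command whose slots carry types, so ill-formed requests are rejected before anything is executed. The command, its fields, and the checks are hypothetical illustrations, not Aggrawal’s language.

```python
# A toy typed command language for a chatbot, illustrating the general idea of
# typed conversational interfaces. Hypothetical sketch, not Aggrawal's design.
from dataclasses import dataclass
from datetime import date

@dataclass
class TransferMoney:                 # each command declares the types of its slots
    amount_usd: float
    to_account: str
    on: date

def parse_command(fields: dict) -> TransferMoney:
    """Check slot types before the bot acts; reject anything ill-typed."""
    try:
        return TransferMoney(
            amount_usd=float(fields["amount_usd"]),
            to_account=str(fields["to_account"]),
            on=date.fromisoformat(fields["on"]),
        )
    except (KeyError, ValueError) as err:
        raise ValueError(f"Ill-typed request, ask the user to clarify: {err}")

print(parse_command({"amount_usd": "250", "to_account": "savings", "on": "2025-12-02"}))
```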
Nihar Purshottam Sanda
eGoT: An Enhanced Graph-of-Thoughts Framework for Adaptive, Multi-Hop Knowledge Retrieval

There are more than 37 million published biomedical works, with 5,000 new papers published daily. Sifting through this massive knowledge base means researchers often spend more time searching for existing information than actually conducting research.
Nihar Sanda, advised by Ayan Paul, Ben Gyori, and Auroop Ganguly, is developing a better framework for information retrieval by large language models (LLMs) to speed up this process. His method is called eGoT, or Enhanced Graph-of-Thoughts, which refers to a technique where LLMs make logical associations via graph structures instead of linear or tree structures. This allows for more flexible and humanlike reasoning, where the AI explores multiple ideas simultaneously and connects the dots between them.
Current graph-based models underperform when asked to connect pieces of information from different sources across diverse domains. Sanda’s eGoT model addresses this by assessing its results and adjusting its search decisions as it goes, helping it to decide which related areas to explore.
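As a loose illustration of the graph-of-thoughts idea — not the eGoT system itself — here is a tiny best-first search over a graph of “thoughts,” where the expand and score functions are hypothetical stand-ins for an LLM and a literature index.

```python
# Illustrative best-first expansion over a graph of "thoughts": a simplified
# stand-in for graph-of-thoughts-style retrieval, not the eGoT implementation.
import heapq

def expand(thought):            # hypothetical: propose follow-up thoughts or queries
    return [f"{thought} -> follow-up {i}" for i in range(2)]

def score(thought):             # hypothetical: relevance to the original question
    return 1.0 / (1 + thought.count("->"))

def graph_of_thoughts_search(question, budget=6):
    frontier = [(-score(question), question)]
    visited, graph = set(), {}
    while frontier and len(visited) < budget:
        _, thought = heapq.heappop(frontier)
        if thought in visited:
            continue
        visited.add(thought)
        children = expand(thought)
        graph[thought] = children            # edges record how ideas connect
        for child in children:               # re-score as we go: adaptive expansion
            heapq.heappush(frontier, (-score(child), child))
    return graph

print(graph_of_thoughts_search("How does drug X interact with pathway Y?"))
```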
Sanda’s testing showed eGoT performed 3–7% better than state-of-the-art methods on multiple datasets, which translates to finding the right answer far more often in real-world use. This framework provides a more efficient method for information retrieval in LLMs, enabling researchers to spend more time researching and discovering.
Pratyusha Jaitly
Resisting Through Migration: Exploring Empowerment and Community-Building Among Queer Women Platform Migrants

Jaitly, advised by Michael Ann DeVito, is exploring the effects on vulnerable groups — particularly queer women — of migrating from one social media app to another. Her inspiration came from the proposed TikTok ban in early 2025, which sent many users to Red Note, a Chinese-owned social media app, in search of community and political alignment.
“Especially for queer people and marginalized people, online spaces can be an important infrastructural need,” Jaitly says, noting that at-risk groups, which often do not find acceptance in their immediate surroundings, often seek out safe, welcoming, and joyful online communities where they can truly be themselves.
Jaitly is conducting semi-structured qualitative interviews across four groups: new Red Note migrants, original Red Note users, cross-cultural users, and multi-migration veterans. By learning from their experiences, Jaitly hopes to empower app makers to better facilitate influxes of platform migrants, and to treat app migration as a form of resistance and empowerment rather than merely loss.
Ritika Kumar
ICD Code Extraction from Clinical Notes using Large Language Models

ICD-10-CM is a set of standardized diagnostic codes used for billing, reimbursement, and decision support in the US health care system. Assigning these codes is time-consuming, and researchers often face difficulties labeling diagnoses and symptom descriptions.
Kumar used BioMistral 7B, a generative LLM, to process ICD codes of de-identified electronic health records obtained from a public critical care database. Using discharge summaries, she paired this information with each patient’s ICD-10 codes, then mapped the data onto the Clinical Classifications Software Refined (CCSR), a system that organizes ICD codes and reflects relationships between diagnoses that the ICD codes alone can’t show.
From the ICD-10 codes and the CCSR codes, the LLM generates a predicted set of ICD codes that account for more in-depth clinical narratives. Advised by Deahan Yu, Kumar found that “BioMistral 7B successfully produces valid ICD-10-CM and CCSR codes from discharge summaries.”
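For a rough sense of what such extraction can look like, here is a minimal prompting sketch using the publicly released BioMistral-7B checkpoint on Hugging Face. The prompt, example summary, and decoding settings are assumptions; Kumar’s pipeline also maps outputs onto CCSR and works with real de-identified records.

```python
# Minimal sketch: prompt a biomedical LLM for ICD-10-CM codes from a discharge
# summary. Illustrative only; not Kumar's actual pipeline.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BioMistral/BioMistral-7B"           # public biomedical Mistral variant
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

summary = ("Patient admitted with community-acquired pneumonia and "
           "type 2 diabetes mellitus; treated with IV antibiotics.")
prompt = (f"Discharge summary:\n{summary}\n\n"
          "List the ICD-10-CM codes for the diagnoses above, one per line:")

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```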
By capturing both the data’s hierarchical structure and the graph of relationships among diagnoses, researchers can reduce noise in the predictions and better ground rare clinical codes. Kumar’s research lays a foundation for accurate, enriched processing of clinical data.
Rupert Simpson
Tutor Learner Database

Matching tutors’ and learners’ schedules using Google Sheets can be a tedious, time-consuming task. For instance, library staff in Plymouth, Massachusetts had been doing this manually to organize classes and tutoring sessions for the High School Equivalency Test (HiSET) and the GED.
Advised by Albert Lionelle, Simpson focused on creating a web application that can automate the process, organize data more efficiently, and minimize errors, thus making the app more convenient for users and more affordable for nonprofits. So far, he’s switched the platform from Heroku to Firebase to reduce cost, as well as switched from PostgreSQL — a relational database with fixed columns and rows — to a NoSQL database, a more flexible, document-based model.
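For a sense of why a document store suits this kind of scheduling data, here is a small, hypothetical example of writing a tutor record with the firebase_admin SDK; the collection, document ID, and fields are illustrative, not Simpson’s actual schema.

```python
# Illustrative: the same tutor record as a flexible Firestore document.
# Collection and field names here are hypothetical, not Simpson's schema.
import firebase_admin
from firebase_admin import credentials, firestore

firebase_admin.initialize_app(credentials.ApplicationDefault())
db = firestore.client()

# No fixed columns: each document can carry only the fields it needs.
db.collection("tutors").document("tutor_001").set({
    "name": "Jane Doe",
    "subjects": ["math", "reading"],
    "availability": {"tuesday": ["17:00-19:00"], "saturday": ["10:00-12:00"]},
})
```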
“I learned to think from the perspective of a future developer who might work on the project,” Simpson says.
Saichandu Juluri and Kashif Imteyaz
CollabPlore: Co-Design Toolkit for Rapid Prototyping with Generative AI

With the goal of using AI to assist creativity and collaboration — and to preserve the creative agency of users — Juluri and Imteyaz developed CollabPlore, a toolkit that transforms conversations into text.
Going beyond simple voice-to-text functionality, CollabPlore feeds the conversation into an AI-powered user-interface mockup engine. As participants talk over a video call, the toolkit transcribes their interactions onto the page, freeing them to focus on creativity, and the engine refines the resulting text for accuracy, coherence, and comprehensibility. By surfacing the conversation transcriptions, the AI also preserves transparency in the conversation-to-text chain.
Advised by Saiph Savage, Juluri and Imteyaz plan to continue testing the mockup to see the effectiveness of AI-assisted sessions. They also plan to assess potential risks by observing AI-skeptical participants as they interact with the app.
Shane Ferrante, Amir Azarmehr, and Mohammad Saneian
Optimal Approximation of Maximum Directed Cut in the Streaming Setting

Ferrante, Azarmehr, and Saneian, advised by Soheil Behnezhad, are trying to resolve a problem that has plagued computer scientists for a decade: how to process massive datasets efficiently as a stream — in a single pass, without needing to store the data first.
“The streaming setting is an algorithmic environment where instead of seeing the entire input and answering a question about that, the input is assumed to be very large and you can’t fit it on a single machine, so you have to stream in the input bit by bit, in one pass,” Ferrante says.
The maximum directed cut, or Max-DICUT, problem asks how to split a directed graph’s vertices into two groups so that as many edges as possible point from the first group to the second — each such edge is a satisfied constraint. The open question is how well this can be approximated in the streaming setting.
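To make the setting concrete, here is a toy one-pass heuristic — not the team’s algorithm, and with no approximation guarantee: it streams the edges once and keeps only a small counter per vertex, which is still far more memory than the strongest streaming results are allowed.

```python
# Toy one-pass Max-DICUT heuristic: stream the edges once, keep one counter per
# vertex, never store the edge list. Illustrates the streaming setting only.
from collections import defaultdict

def streaming_dicut_assignment(edge_stream):
    bias = defaultdict(int)               # out-degree minus in-degree per vertex
    for u, v in edge_stream:              # single pass over the (huge) edge stream
        bias[u] += 1
        bias[v] -= 1
    # Vertices with mostly outgoing edges go on the "source" side of the cut.
    return {vertex: (b > 0) for vertex, b in bias.items()}

edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]
side = streaming_dicut_assignment(iter(edges))
satisfied = sum(side.get(u, False) and not side.get(v, False) for u, v in edges)
print(side, f"edges satisfied: {satisfied}/{len(edges)}")
```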
Algorithmic theory and a decade of trying have proven there’s a fundamental limit; with just one pass through the data and limited memory, you can’t satisfy more than half the optimal number of constraints, with the best efforts reaching 48.5%. This research team has achieved this limit, but with a catch: getting closer to it requires more memory, eventually approaching the point where you’d need to store everything anyway — defeating the purpose of processing data in this way.
By resolving an open question about whether the theoretical upper bound is achievable in practice, their research advances our understanding of streaming algorithms and provides techniques applicable to related optimization problems.
Siddhant Maheshkumar Narode
Mixed Reality and LLMs for Situated Language Learning

Learning a new language can be frustrating, and practicing fledgling skills with native speakers can be intimidating. But by using mixed-reality technologies and large language models to access more engaging and lifelike content, learners can reduce their anxiety.
Narode, advised by Mirjana Prpa, aims to build on data showing that merging educational content with the learner’s real-world surroundings (e.g. labeling physical objects with the object’s name in the target language) improves vocabulary retention. By combining mixed reality technology, large language models, and spatial intelligence, he hopes to enhance the language learning experience and overcome obstacles to it.
“There are some indications that using a mixed reality headset helps reduce your anxiety because it’s always speaking to you; it’s always readily available to translate,” Narode says.
Srijan Dokania
Zero-Splat TeleAssist: A Zero Shot Pose Estimation Framework for Semantic Teleoperation

When working with robots, many operators face obstacles that limit their ability to track the robots’ activity, including narrow camera views, visual obstructions, and hard-to-follow movement traces. Many fixes require costly calibration and physical markers to identify robots on camera.
Dokania, advised by Hanumant Singh, aims to reduce operator burden by improving the cameras’ situational awareness. The cameras are set up to be zero-shot (able to understand the environment and detect objects without pre-training) and marker-free (requiring no physical markers to map the space).
After using infrastructure cameras to capture the robots in different environments, Dokania processes the images through a CLIP-based architecture — a framework developed by OpenAI to align images and text. The information is then mapped onto a 3D Gaussian splat using MiDaS, a monocular depth estimation model.
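For readers who want a feel for those building blocks, the sketch below scores a camera frame against language prompts with CLIP and estimates depth with MiDaS. The models, prompts, and image file are illustrative assumptions; Dokania’s pipeline adds the pose estimation and Gaussian-splat mapping on top.

```python
# Illustrative only: score a camera frame against text descriptions with CLIP,
# then estimate per-pixel depth with MiDaS.
import numpy as np
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

frame = Image.open("warehouse_cam_frame.jpg").convert("RGB")    # hypothetical frame

# CLIP: how well does the frame match a text description of the robot?
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
inputs = proc(text=["a wheeled warehouse robot", "an empty aisle"],
              images=frame, return_tensors="pt", padding=True)
probs = clip(**inputs).logits_per_image.softmax(dim=-1)
print("robot probability:", round(probs[0, 0].item(), 3))

# MiDaS: monocular depth for the same frame, later lifted into a 3D map.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
transform = torch.hub.load("intel-isl/MiDaS", "transforms").small_transform
with torch.no_grad():
    depth = midas(transform(np.array(frame)))
print("depth map shape:", tuple(depth.shape))
```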
This way, operators can observe robots in the real world without having to move between environments. And even if the robots’ routes are accidentally interrupted in a big space like a warehouse, the operators can easily localize them.
“We can also have future extensions like an AR-augmented tele-operation, where the robots can be seen in the VR headset of the operators … They can see the collision envelopes on video feeds,” Dokania says about possible further research.
Swadeep
Side Channel Analysis on Microcontrollers to Understand Effects of Temperature to Break Encryption

While Internet of Things (IoT) devices like fitness trackers, doorbell cameras, and self-driving cars can make our lives easier, they offer hackers new ways to invade our privacy.
Swadeep, advised by Guevara Noubir, is testing the security strength of a popular IoT microcontroller (a small single-chip computer) called ESP32. When microcontrollers perform operations, they consume energy. Swadeep has found that by closely monitoring fluctuations in power consumption in the microcontroller, information about secret operations can be exposed to malicious actors.
“For each byte of data, we developed a model to see how much power the byte will consume. We correlate and see which models match the byte the best. We found out that you can easily break an encrypted byte,” Swadeep says.
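The attack Swadeep describes is a classic correlation power analysis. The sketch below runs the idea on synthetic traces: model each key-byte guess’s power draw as a Hamming weight, then keep the guess whose model correlates best with the measurements. It is illustrative only, not his measurement setup on the ESP32.

```python
# Minimal correlation power analysis (CPA) sketch on synthetic data: leakage is
# modeled as the Hamming weight of (plaintext XOR key byte) plus noise.
import numpy as np

rng = np.random.default_rng(0)
SECRET_KEY_BYTE = 0x3A

hw = np.array([bin(x).count("1") for x in range(256)])       # Hamming weight table

# Simulate 2,000 encryptions worth of power measurements.
plaintexts = rng.integers(0, 256, size=2000)
traces = hw[plaintexts ^ SECRET_KEY_BYTE] + rng.normal(0, 1.0, size=2000)

def correlation(guess):
    model = hw[plaintexts ^ guess]                            # predicted power draw
    return abs(np.corrcoef(model, traces)[0, 1])

best_guess = max(range(256), key=correlation)                 # best-matching model wins
print(f"recovered key byte: {best_guess:#04x} (true: {SECRET_KEY_BYTE:#04x})")
```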
Security concerns for these types of devices often lie at the software level, leaving the information given off by hardware neglected. Swadeep hopes this research will help manufacturers harden their devices against such leakages.
Thanya Mysore Santhosh
Understanding the Drivers of Depression: An LLM-Based Cause Classification Framework Using Twitter Data
You can tell a lot about a person from their social media posts.
Santhosh, advised by Deahan Yu, is using natural language processing to interpret X data and understand how social factors contribute to mental health issues like depression.
Santhosh is using sentiment analysis — a method for determining the emotional tone of digital text — on geotagged tweets from Greater Boston to uncover disparities between different communities. By exploring why some people express their mental health issues on X and others don’t, she aims to identify underlying causes of depression by considering factors such as financial stress, relationship issues, health problems, academic or work pressure, loneliness, and trauma.
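As a small, illustrative piece of that pipeline, off-the-shelf models can score the sentiment of individual posts. The example below uses a pretrained Hugging Face model on invented text and leaves out the geotagging and cause classification that Santhosh’s framework adds.

```python
# Illustrative sentiment scoring of synthetic posts with a pretrained model.
# Not Santhosh's pipeline; model choice and example texts are assumptions.
from transformers import pipeline

sentiment = pipeline("sentiment-analysis",
                     model="distilbert-base-uncased-finetuned-sst-2-english")

posts = [
    "Can't sleep again, rent is due and I still haven't heard back about the job.",
    "Finally got outside for a run by the harbor today, feeling a bit better.",
]
for post, result in zip(posts, sentiment(posts)):
    print(f"{result['label']:>8} ({result['score']:.2f}): {post}")
```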
By mapping these causes to geographic and demographic data, Santhosh hopes to inform mental health strategies that cater to the specific needs of a community.
“This will be helpful for clinicians and policymakers, as well as digital health systems to target interventions and to track mental health trends over time,” says Santhosh.
Wenqing Zeng
Human Activity Recognition (HAR) Models: Investigation and Generalization

People tend to behave differently when they’re being observed. This poses a problem for researchers trying to evaluate human activities in a lab setting using human activity recognition (HAR), a technique that combines machine learning algorithms and sensor data to identify and track human behaviors like running or walking.
Zeng’s project explores the gap between HAR laboratory data and self-reported data from participants’ daily lives.
“In the real world, when people are running, they’re trying to catch a bus, or running with a dog. In the lab setting, they’re on a treadmill, they want to perform well for the researchers, so there’s a bias,” Zeng says.
Zeng, advised by Stephen Intille, seeks to evaluate the generalizability of existing HAR algorithms by investigating datasets spanning laboratory and real-world environments. She is assessing established models trained on lab data and testing them on in-the-wild datasets, quantifying performance degradation to determine where these models’ applicability is limited.
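The evaluation pattern itself is simple to sketch: train on lab data, test on free-living data, and report the gap. The example below is a generic stand-in with hypothetical data files and features, not Zeng’s models or datasets.

```python
# Sketch of the evaluation pattern: train an activity classifier on lab data,
# then measure how much accuracy drops on in-the-wild data. The loader and file
# names are hypothetical stand-ins for real HAR datasets.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score

def load_windows(path):
    """Hypothetical loader: accelerometer windows -> (features, activity labels)."""
    data = np.load(path)
    return data["features"], data["labels"]

X_lab, y_lab = load_windows("lab_sessions.npz")
X_wild, y_wild = load_windows("free_living.npz")

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_lab, y_lab)

acc_lab = accuracy_score(y_lab, clf.predict(X_lab))
acc_wild = accuracy_score(y_wild, clf.predict(X_wild))
print(f"lab accuracy: {acc_lab:.2f}, in-the-wild accuracy: {acc_wild:.2f}, "
      f"degradation: {acc_lab - acc_wild:.2f}")
```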
Zeng’s work is essential for translating wearable sensor technology into trustworthy tools that can inform evidence-based health decisions for clinicians and public health officials.
Yash Mahesh Burshe
ELASTIC: Evaluation of Large Language Models and Abstract Syntax Parsing for Identification of Code Smells
Did you know that computer code can smell?
Software developer Martin Fowler popularized the term “code smell” to refer to issues in source code that hint at deeper problems. In 1999, he identified 23 cases; since then, many tools have been developed to identify Fowler’s code smells.
Burshe, advised by Joydeep Mitra, aims to evaluate the effectiveness of using LLMs in static analysis to help identify code smells. His work takes SmeLLM, a tool designed to evaluate LLMs for code smell identification, as a precursor for his tool ELASTIC, which uses a hybrid approach. First, it uses a form of static analysis called abstract syntax tree parsing to filter out obvious cases of code smells; this technique is fast and cheap but lacks contextual understanding. That’s where the LLM takes over; it’s slower and more expensive, but it uses its nuanced understanding of the code to identify ambiguous cases.
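To illustrate just that cheap first stage, the sketch below walks a Python abstract syntax tree and flags two textbook smells; the thresholds and smell choices are illustrative, not ELASTIC’s actual rules.

```python
# A cheap static pass in the spirit of a first-stage smell filter: walk the
# abstract syntax tree and flag long methods and long parameter lists.
import ast

SOURCE = """
def report(a, b, c, d, e, f, g):
    total = a + b + c + d + e + f + g
    return total
"""

def flag_smells(source, max_params=5, max_lines=30):
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            n_params = len(node.args.args)
            n_lines = (node.end_lineno or node.lineno) - node.lineno + 1
            if n_params > max_params:
                findings.append(f"{node.name}: long parameter list ({n_params} params)")
            if n_lines > max_lines:
                findings.append(f"{node.name}: long method ({n_lines} lines)")
    return findings          # ambiguous cases would then go to the LLM stage

print(flag_smells(SOURCE))
```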
By utilizing the strengths of both techniques, Burshe is developing a tool that can maximize our ability to detect code smells and thus raise our collective standard for source code.
Zeyu Li and Xiaogang Peng
Bridging Industry and Research: A Unified Pipeline for Human Motion Datasets

Research into human motion generation, like how a person’s arms or legs move, currently lacks robust datasets, leading to a gap between industrial and academic research. As such, Li and Peng proposed a three-step pipeline to transform datasets used for industrial purposes into research-ready samples.
Li and Peng, advised by Huaizu Jiang, first convert FBX files — which store 3D model data including geometry, textures, materials, and animations — into the BVH format, filtering out unnecessary data while keeping the important motion information. The motion is then retargeted from the source skeleton — the rig that holds the original movements — to a target skeleton with more diverse body proportions.
Finally, the duo maps the BVH files onto SMPL-X parameters that control the body’s shape and motion. During training, the model compares the 3D human body it generates from SMPL-X parameters to the real motion-capture skeleton data in the BVH files, then adjusts body shape and movement to better match the real motion.
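A rough sketch of that final fitting step might look like the following, using the smplx library with placeholder targets; the real pipeline works over full motion sequences and handles retargeting details this toy version omits.

```python
# Sketch of the fitting idea: nudge SMPL-X shape and pose parameters so the
# model's joints match motion-capture joints taken from a BVH clip.
# The target joints and model path are placeholders, not the authors' data.
import torch
import smplx

model = smplx.create("models/", model_type="smplx")      # path to SMPL-X model files

# Stand-in for joint positions parsed from a BVH file (pelvis + 21 body joints).
target_joints = torch.zeros(22, 3)

betas = torch.zeros(1, 10, requires_grad=True)            # body shape parameters
body_pose = torch.zeros(1, 63, requires_grad=True)        # 21 joints x 3 axis-angle

optimizer = torch.optim.Adam([betas, body_pose], lr=0.05)
for step in range(200):
    output = model(betas=betas, body_pose=body_pose)
    loss = ((output.joints[0, :22] - target_joints) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
print("final joint error:", loss.item())
```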
With their work, Li and Peng note, they hope to bridge the industrial–academic research gap and “empower future research in human motion synthesis, retargeting, and generative modeling.”