Capstone Research Projects

Looking for in-depth research support?

A capstone is an end-of-program applied research project where students spend 20 hours per week, for 15 weeks, investigating a research problem alongside an industry stakeholder. Most capstone research projects are related to machine learning, computer vision, networking, HCI, cloud computing, AI, NLP, speech recognition, or DevOps.

Depending on the problem, the project scope will include a literature review of related work, identification of methodologies to solve the problem, an exploratory set of experiments with results, a final analysis, and future work. Students will work in teams of three or four. Stakeholders can meet with students virtually or in person, typically every two weeks.

Recent capstone research projects across Northeastern campuses

Explore the projects below to see the breadth of our students’ research — from machine learning and AI to networking, robotics, and data science — in close collaboration with industry and academic partners.

Oakland

Applying Machine Learning to Analyze Neutron Reflectometry Data of Polymers for Sustainable Applications

Neutron reflectometry (NR) is a powerful tool for probing the structure of materials. In previous work, we developed a neural-network-based workflow to accelerate the analysis and interpretation of NR spectra. However, this approach was found to be sensitive to noise. In this project, the students demonstrated that dropout can be used effectively to handle noisy data, and they developed a user-friendly pipeline, pyreflect, that makes the framework accessible to non-experts.

Poster will be added when available.

Seattle

Beyond LLMs: Cognitive-Emotional Trajectories of Human vs. Machine Authorship

Large language models (LLMs) are reshaping text creation, prompting questions about whether human writing trajectories remain distinct. Unlike traditional authorship detection that analyzes isolated texts, we model writing as longitudinal cognitive–emotional trajectories. Using 6,086 human texts and 74,241 LLM generations across academic, blog, and news domains, we capture temporal variability in a 75-dimensional feature space. Humans exhibit richer, more irregular trajectories than LLMs, and our trajectory-based classifier distinguishes them with high accuracy, showing that temporal cognitive–emotional patterns are a robust marker of human authorship.

  • Students: YeoJin Jenny Kim, Zhanwei Max Cao 
  • Faculty: Shanu Sushmita 
  • Subject: Natural language processing 
  • Date: December 2025 

Towards Shadow-Invariant Perception: Reflectance Estimation for Robotic Navigation

Robotic navigation is highly sensitive to lighting variation, where shadows and specular highlights distort object appearance and terrain boundaries. We introduce a spectral ratio estimation framework for illuminant-invariant imaging that improves perception robustness under variable lighting. The method builds on classical reflection models while leveraging spectral ratio maps to suppress illumination effects and preserve reflectance. We evaluate performance on raw and benchmark datasets using angular error, shadow detection accuracy, and reproduction error, and further assess downstream impact on object detection and navigation success. Results show that spectral ratio–based representations significantly enhance illumination-invariant perception, offering a practical and physics-grounded solution for reliable robot navigation.

  • Student: Yuhan Li 
  • Faculty: Bruce Maxwell 
  • Subject: Computer vision 
  • Date: December 2025 

In the Eyes of LLMs: Aligning AI with Human Perception

This project explores whether AI models can match human task performance on graph visualization. We collected ~8,000 responses across 4,000 layouts from eight algorithms, measuring accuracy and time on tasks like shortest-path finding, connectivity, and neighbor counting. Vision-based ML models (ResNet, DinoV2) closely match human performance and can serve as proxies for evaluating readability, while current multimodal LLMs show systematic gaps and hallucinations, failing to perceive layouts as humans do. The work investigates whether LLMs can evaluate, select, or generate graph layouts aligned with human understanding.

  • Students: Yinghui Yang, Kefan Zhou
  • Faculty: Yifan Hu
  • Subjects: Artificial intelligence; data visualization; machine learning; computer vision 
  • Date: December 2025 

Silicon Valley

Drone Ranger

Urban wildlife management in the Bay Area presents growing challenges as animals damage vegetation, attract predators, and create safety risks. This work explores a noninvasive approach using a lightweight autonomous aerial platform to deter wildlife through motion, light, and short acoustic cues. We present a unified obstacle detection and avoidance architecture that integrates perception and control using PX4, ROS2, and Gazebo/Unity, with validation on a Holybro X500 V2 drone. Results demonstrate reliable simulation-to-reality transfer for trajectory planning and obstacle avoidance, establishing a scalable and ecologically sensitive framework for urban wildlife deterrence.

  • Students: Zhipeng Ling, Renxiang Yin, Chunzhang Liu, Xiaoman Zou 
  • Faculty: Ilmi Yoon 
  • Subject: Robotics 
  • Date: December 2025

SimPath 2.0: An Emotion-Aware Virtual Patient System for Therapist Training Using Multimodal AI

SimPath 2 is a clinically grounded AI-based patient simulation system designed to support therapist training through realistic, adaptive conversational practice. Building on structured patient personas, DSM-5-aligned clinical knowledge, and persistent interaction memory, SimPath 2 enables trainees to engage in safe, low-stakes therapeutic conversations that evolve in response to their strategies and communication style. The system integrates in-session, rubric-guided feedback to scaffold reflection on key clinical skills such as rapport building, questioning quality, diagnostic reasoning, and empathy. Through this design, SimPath 2 advances human-AI collaboration for professional education by supporting deliberate practice while preserving human judgment and agency.

  • Students: Minghong Xia, Zixuan Yin 
  • Faculty: Akram Bayat
  • Subjects: Human-centered computing; artificial intelligence 
  • Date: December 2025

Poster will be added when available.

Learning to Play: Game Generation Model with Action and Memory

Recent advances in diffusion models have demonstrated strong robustness in long-sequence generation, motivating their use in interactive game video synthesis. However, existing approaches still struggle with low frame fidelity, inconsistent action-conditioned rendering, and high computational cost. This project develops a lightweight diffusion-forcing framework capable of generating playable Super Mario gameplay directly from data under tight resource constraints. Our method combines a compact 4× VAE with a 32-block DiT diffusion model to preserve essential scene details while maintaining smooth frame-to-frame transitions. Trained on 522k frames, the model achieves competitive visual quality, including PSNR above 26 dB at short horizons and FID as low as 13.39 at 32-frame rollouts, while sustaining real-time performance at 17 FPS on a single T4 GPU. These results show that carefully designed latent diffusion architectures can deliver interactive, action-responsive gameplay without heavy computation. The study provides a concrete case for adapting generative models to interactive environments with limited data, with implications for research, education, and lightweight world-model design.

  • Student: Feiyan Zhou 
  • Faculty: Tehmina Amjad, Smruthi Mukund
  • Industry Partner: Peize Sun
  • Subjects: Artificial intelligence; games; machine learning
  • Date: December 2025

A Geo-Meta Ensemble Framework for Robust Cross-Well Pore Pressure Prediction

Pore pressure prediction (PPP) is vital for ensuring safe drilling in oil and gas operations. Machine learning (ML) methods are attracting growing attention from researchers in this domain. However, existing ML methods for estimating pore pressure often generalize poorly across wells due to domain shift, particularly in regions with pressure imbalance. This study proposes a GeoMeta Ensemble (GME) that addresses these limitations through strategic, well-based data partitioning, physics-informed feature engineering, and a stacking-based meta-learner that integrates predictions from diverse architectures, including Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Transformers, Deep Feedforward Neural Networks (DFNN), Random Forest, and XGBoost. It employs a Ridge stacking strategy that automatically selects optimal model weights, leveraging the base models' complementary strengths. This ensemble combines, within one model, the spatial pattern recognition of CNNs, the temporal dependencies captured by RNNs, the attention mechanisms of Transformers, and the nonlinear relationships learned by tree-based models to produce robust predictions under domain shift. Processing 267,168 samples from 20 wells across 11 geological regions, the framework sustains an inference latency below 50 ms per prediction, supporting real-time drilling applications. The proposed method achieves an R² of 0.9176 on a blind well with a significant pressure regime mismatch (16.2% underpressure samples in the test set vs. 13.8% in the training set), showing robust generalization under domain shift. The proposed GME outperforms the best individual model (Transformer) by 0.49%, while the Transformer model exhibits minimal overfitting (2.7% generalization gap) despite temporal sequence complexities. The study establishes reproducible benchmarks for assessing cross-well generalization.

  • Students: Rohan Benjamin Varghese, Pranav Patel 
  • Faculty: Tehmina Amjad
  • Subjects: Artificial intelligence; machine learning
  • Date: December 2025
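
As a rough illustration of the Ridge stacking idea described in the abstract above, the sketch below fits ridge-regression weights over the predictions of several base models using NumPy. The synthetic data, the three stand-in base models, and the names `fit_ridge_stacker`, `predict_stacked`, and `alpha` are assumptions made for illustration; this is not the project's pipeline.

```python
import numpy as np


def fit_ridge_stacker(base_preds: np.ndarray, y: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Return ridge weights for stacking; the last entry is the intercept.

    base_preds has shape (n_samples, n_models), one column per base model.
    """
    n_samples, n_models = base_preds.shape
    X = np.hstack([base_preds, np.ones((n_samples, 1))])
    penalty = alpha * np.eye(n_models + 1)
    penalty[-1, -1] = 0.0  # do not shrink the intercept
    # Closed-form ridge solution: (X^T X + penalty)^-1 X^T y
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)


def predict_stacked(base_preds: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Combine base-model predictions with the learned stacking weights."""
    return base_preds @ weights[:-1] + weights[-1]


def mse(pred: np.ndarray, target: np.ndarray) -> float:
    return float(np.mean((pred - target) ** 2))


# Synthetic target and three hypothetical base models with different noise levels.
rng = np.random.default_rng(0)
y_true = rng.normal(size=500)
base = np.column_stack(
    [y_true + rng.normal(scale=s, size=500) for s in (0.3, 0.6, 1.0)]
)

weights = fit_ridge_stacker(base, y_true, alpha=1.0)
stacked = predict_stacked(base, weights)
print("stacking weights:", weights[:-1])
print("stacked MSE:", mse(stacked, y_true))
```

In this toy setup the less noisy base models receive larger weights. In practice the meta-learner would be fit on held-out base-model predictions, for example from wells not used to train the base models, rather than on the training data itself.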

Vancouver

KN-RAG: A Novel Approach to Retrieval-Augmented Generation with Knowledge Graphs and Self-Contextualization

Large language models (LLMs) have significantly advanced natural language processing (NLP), driving improvements in content generation and question answering. However, these models encounter challenges such as generating non-factual content, high resource demands for updates, and difficulties in effectively processing large inputs. Retrieval-Augmented Generation (RAG) systems offer a solution by integrating factual retrieval to reduce hallucinations. Despite this, standard RAG systems struggle with complex multi-document datasets due to information loss about connections and relationships in the text during vector embedding. To address these limitations, we introduce Knowledge-Nexus RAG (KN-RAG), a novel approach that integrates knowledge graphs with a self-contextualization mechanism within the RAG framework. KN-RAG aims to improve the accuracy, completeness, and diversity of LLM-generated responses, particularly for multi-document question answering.

  • Students: Zhixiao Wang, Haoning Wang, Wenqian Xie, Mingyi Qiu 
  • Faculty: Ryan Rad 
  • Industry partner: Deloitte 
  • Subjects: Artificial intelligence; data science; natural language processing 
  • Date: Summer 2024 

Can AI Be Trusted? Evaluating Reliability in Retrieval-Augmented Generation Responses

Royal British Columbia Museum (RBCM) AI Avatar Project: The RBCM is using AI avatars powered by retrieval-augmented generation (RAG) to enhance visitor interactions. This project evaluates the quality and accuracy of the AI responses, with the aim of improving the visitor experience, sharing cultural heritage, providing personalized tours, and advancing educational goals.

  • Students: Zheng Gu, Hejia Li, Mulei Ni 
  • Faculty: Yvonne Coady 
  • Industry partner: Royal British Columbia Museum 
  • Subjects: Artificial intelligence; human-centered computing 
  • Date: Fall 2024 

Past projects

Spring 2023

Strata Fee Management in Condominiums via Smart Contracts

Condominiums and similar properties use a strata corporation to manage daily operations, and owners fund it through strata fees. While existing strata fee management systems can handle such funds, they could be more inherently transparent. It is possible to leverage the digital ledger from blockchain networks and smart contracts to build a fully transparent strata fee management system. This paper proposes a strata fee management system based on a smart contract in the Ethereum network. Both strata corporations and homeowners can interact with the smart contract to execute common procedures such as paying strata fees and handling expenses. Using smart contracts for strata fee management is expected to lower the chance of fraud by strata corporations compared to other systems.

Students: Liam Scholte, Rui Wang, Kwok Keung Chung
Professor: Michal Aibin
Industry Partner: The Jervis BC Strata
Subjects: Blockchain, smart contracts, security, decentralization
Date: January 2023

Fall 2022

Emergency Surgical Scheduling Model Based on Moth-flame Optimization Algorithm

In this paper, we propose an optimization approach based on an improved Moth-Flame Optimization (MFO) algorithm for solving emergency operating room scheduling problems. The purpose of the MFO is to minimize the maximum span of operations, ensuring patients receive their surgeries in a timely manner. This nature-inspired algorithm simulates the moth’s special method of navigating at night, called transverse orientation. The moth maintains a fixed angle to the moon, which guarantees a straight flight path. However, a nearby light source can pull moths into a useless or deadly spiral flight path. The results show that MFO has advantages over Grey Wolf Optimization (GWO) and Genetic Algorithm (GA), particularly when comparing the performance of the algorithms under different spiral curves, and when considering the unrestricted use of surgical beds between different procedures and the optimization of algorithm speed.

Students: Cuiting Huang, Sicong Ye, Shi Shuai, Mengdi Wei, Yehong Zhou
Professor: Michal Aibin
Industry Partner: Healthcare provider
Subjects: Algorithms and theory, cloud computing
Date: September 2022

Q-Learning Based Routing in Optical Networks

The rapid increase in bandwidth demand has driven the development of flexible, efficient, and scalable optical networks. One technology that allows much more flexible resource utilization is the Elastic Optical Network. However, using it requires solving the Routing, Modulation and Spectrum Assignment (RMSA) problem. In this paper, we use reinforcement learning to improve the efficiency of the routing algorithm. More specifically, we implement off-policy Q-learning and compare it with state-of-the-art algorithms. The results confirm that Q-learning is highly effective when optimal results need to be found in a large search space.

Students: Nolen B. Bryant; Kwok K. Chung; Jie Feng; Sommer Harris; Kristine N. Umeh
Professor: Michal Aibin
Industry Partner: Internet service provider
Subjects: Networking, artificial intelligence
Date: September 2022
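
For readers unfamiliar with off-policy Q-learning, the sketch below applies the standard tabular update to a toy routing problem. The five-node topology, link costs, hyperparameters, and the names `train` and `greedy_route` are hypothetical; the sketch covers only path selection and does not model modulation or spectrum assignment.

```python
import random

# Hypothetical topology: node -> {neighbor: link cost}.
GRAPH = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 1, "D": 5},
    "C": {"A": 4, "B": 1, "D": 1},
    "D": {"B": 5, "C": 1, "E": 1},
    "E": {},
}
DEST = "E"


def train(episodes=2000, alpha=0.5, gamma=1.0, epsilon=0.2, seed=0):
    """Learn Q(node, next_hop) with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {node: {nbr: 0.0 for nbr in nbrs} for node, nbrs in GRAPH.items()}
    sources = [node for node in GRAPH if node != DEST]
    for _ in range(episodes):
        state = rng.choice(sources)
        for _ in range(50):  # cap the episode length
            if state == DEST:
                break
            actions = list(q[state])
            if rng.random() < epsilon:
                action = rng.choice(actions)  # explore
            else:
                action = max(actions, key=q[state].get)  # exploit
            reward = -GRAPH[state][action]  # cheaper links give higher reward
            # Off-policy target: value of the best next hop, not the one taken.
            next_best = max(q[action].values(), default=0.0)
            q[state][action] += alpha * (reward + gamma * next_best - q[state][action])
            state = action  # the chosen neighbor becomes the next state
    return q


def greedy_route(q, source):
    """Follow the highest-valued next hop from source until DEST."""
    route = [source]
    while route[-1] != DEST and len(route) <= len(GRAPH):
        current = route[-1]
        route.append(max(q[current], key=q[current].get))
    return route


q_table = train()
print(greedy_route(q_table, "A"))
```

The update is off-policy because the target uses the maximum Q-value at the next node, regardless of which hop the exploring agent actually takes next.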

Survey of RPAS Autonomous Control Systems Using Artificial Intelligence

In this survey, we look at the overall idea of Remotely Piloted Aircraft Systems (RPAS) and autonomous control, as well as RPAS infrastructure, levels of autonomy, and current benefits and difficulties in the field when utilizing Artificial Intelligence. While current remotely piloted aircraft systems have a manual pilot operator to provide double-layer security and safety, studies show that having RPAS with a fully autonomous vehicle at its centre could significantly improve decision-making and overall mission precision, accuracy, safety, and efficiency.

Students: Ruchi Bhavsar; Mino Reyes
Professor: Michal Aibin
Industry Partners: InDro Robotics, Aerometrix
Subjects: Robotics, artificial intelligence
Date: September 2022

Spring 2022

Study on High Availability and Fault Tolerance

With the growing demand for e-Commerce and remote working applications, it has become more important than ever to design applications with high availability and fault tolerance. This research proposes a push-based mechanism with persistent connection to reduce the “time to detect” such that the overall service level agreement for applications can be improved.

Student: Norman Kong Koon Kit
Professor: Michal Aibin
Industry Partner: Amazon
Subjects: Systems and networking, cloud computing
Date: May 2022

Real-Time Search and Rescue using Remotely Piloted Aircraft System with Frame Dropping

Using Artificial Intelligence (AI) to aid a Remotely Piloted Aircraft System (RPAS) helps capture accurate imagery along with vital ground details, which in turn supports Search and Rescue operations. Since the search must be done quickly, real-time video processing is essential for survival. Our solution integrates image processing, specifically the You Only Look Once (YOLO) algorithm, to detect humans in all environmental conditions. Traditional AI methods typically run on Graphics Processing Units (GPUs) rather than Central Processing Units (CPUs). We addressed the low frames-per-second processing rate on the CPU with a newly designed frame-skipping algorithm. This improved method results in accurate and quick detection of humans and allows real-time detection.

Student: Rohan Sharma
Professor: Michal Aibin
Industry Partner: InDro Robotics
Subjects: Robotics, artificial intelligence
Date: January 2022
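
The abstract above does not describe how its frame-skipping algorithm works. The sketch below shows only the simplest variant of the general idea, a fixed stride in which the detector runs on every Nth frame and the most recent detections are reused in between. The function name, the `stride` parameter, and the stub detector are hypothetical stand-ins, not the project's method.

```python
from typing import Callable, Iterable, Iterator, List, Tuple, TypeVar

Frame = TypeVar("Frame")
Box = Tuple[int, int, int, int]  # hypothetical (x, y, width, height) box


def detect_with_frame_skipping(
    frames: Iterable[Frame],
    detector: Callable[[Frame], List[Box]],
    stride: int = 3,
) -> Iterator[Tuple[int, List[Box]]]:
    """Yield (frame_index, detections), running the detector every `stride` frames."""
    if stride < 1:
        raise ValueError("stride must be at least 1")
    last_detections: List[Box] = []
    for index, frame in enumerate(frames):
        if index % stride == 0:
            last_detections = detector(frame)  # expensive call, e.g. a YOLO model
        # Skipped frames reuse the most recent detections.
        yield index, last_detections


# Stub detector that records which frames it was asked to process.
processed = []


def stub_detector(frame: int) -> List[Box]:
    processed.append(frame)
    return [(0, 0, 10, 10)]


results = list(detect_with_frame_skipping(range(10), stub_detector, stride=3))
print("frames sent to the detector:", processed)
```

A larger stride lowers the per-frame compute cost on a CPU, at the price of staler detections on the skipped frames.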

Fall 2021

Aerial Footage Analysis Using Computer Vision for Efficient Detection of Points of Interest Near Railway Tracks

Object detection is a fundamental part of computer vision, with a wide range of real-world applications. It involves the detection of various objects in digital images or video. In this paper, we propose a proof-of-concept use of computer vision algorithms to improve the maintenance of railway tracks operated by Via Rail Canada. Via Rail operates about 500 trains running on 12,500 km of tracks. These tracks pass through long stretches of sparsely populated land. Maintaining these tracks is challenging because of the resources required to identify points of interest (POI), such as growing vegetation, missing or broken ties, and water pooling around the tracks. We aim to use the YOLO algorithm to identify these points of interest with the help of aerial footage. The solution shows promising results in detecting the POI based on unmanned aerial vehicle (UAV) images. Overall, we achieved a precision of 74% across all POI and a mean average precision @ 0.5 (mAP @ 0.5) of 70.7%. Detection was most successful for missing ties, vegetation, and water pooling, with an average accuracy of 85% across all three POI.

Students: Rohan Sharma, Kishan Patel, Sanyami Shah
Professor: Michal Aibin
Industry Partner: Via Rail Canada/spexiGeo
Subjects: Computer vision, machine learning
Date: September 2021