805 Columbus Avenue
Interdisciplinary Science & Engineering Complex
Boston, MA 02121
Jeremiah Onaolapo is a Postdoctoral Research Associate at Northeastern’s Khoury College of Computer Sciences. He conducts research on applied machine learning, honeypots, malicious activity in online accounts, social network security, online scams, and botnet activity under Professor Dr. Engin Kirda at the Interdisciplinary Science and Engineering Complex. He is also affiliated with the iDrama Lab.
Prior to coming to Northeastern, he studied at University College London (UCL), where he assisted Drs. Gianluca Stringhini and Emiliano De Cristofaro in developing infrastructure and administering lab sessions for Computer Security II, a module in the UCL MSc in Information Security degree. He also assisted Dr. Nicolas Courtois in Applied Cryptography lab sessions.
Emeric Bernard-Jones, Jeremiah Onaolapo, Gianluca Stringhini. TheWebConf (WWW) CyberSafety Workshop, 2018, Lyon, France.
We set out to understand the effects of differing language on the ability of cybercriminals to navigate webmail accounts and locate sensitive information in them. To this end, we configured thirty Gmail honeypot accounts with English, Romanian, and Greek language settings. We populated the accounts with email messages in those languages by subscribing them to selected online newsletters. We also hid email messages about fake bank accounts in fifteen of the accounts to mimic real-world webmail users that sometimes store sensitive information in their accounts. We then leaked credentials to the honey accounts via paste sites on the Surface Web and the Dark Web, and collected data for fifteen days. Our statistical analyses on the data show that cybercriminals are more likely to discover sensitive information (bank account information) in the Greek accounts than the remaining accounts, contrary to the expectation that Greek ought to constitute a barrier to the understanding of non-Greek visitors to the Greek accounts. We also extracted the important words among the emails that cybercriminals accessed (as an approximation of the keywords that they possibly searched for within the honey accounts), and found that financial terms featured among the top words. In summary, we show that language plays a significant role in the ability of cybercriminals to access sensitive information hidden in compromised webmail accounts.
Under and over the surface: a comparison of the use of leaked account credentials in the Dark and Surface Web.
Adrian Bermudez Villalva, Jeremiah Onaolapo, Gianluca Stringhini, Mirco Musolesi. Crime Science (Journal), 2018.
The world has seen a dramatic increase in cybercrime, in both the Surface Web, which is the portion of content on the World Wide Web that may be indexed by popular search engines, and lately in the Dark Web, a portion that is not indexed by conventional search engines and is accessed through network overlays such as the Tor network. For instance, theft of online service credentials is an emerging problem, especially in the Dark Web, where the average price for someone’s online identity is £820. Previous research studied the modus operandi of criminals that obtain stolen account credentials through Surface Web outlets. As part of an effort to understand how the same crime unfolds in the Surface Web and the Dark Web, this study seeks to compare the modus operandi of criminals acting on both by leaking Gmail honey accounts in Dark Web outlets. The results are compared to a previous similar experiment performed in the Surface Web. Simulating the operating activity of criminals, we posted 100 Gmail account credentials on hidden services on the Dark Web and monitored the activity that they attracted using a honeypot infrastructure. More specifically, we analysed the data generated by the two experiments to find differences in the activity observed, with the aim of understanding how leaked credentials are used in both Web environments. We observed that different types of malicious activity happen on honey accounts depending on the Web environment they are released on. Our results can provide the research community with insights into how stolen accounts are being manipulated in the wild for different Web environments.
Jeremiah Onaolapo, Martin Lazarov, Gianluca Stringhini. EuroS&P WACCO 2019, Stockholm, Sweden.
As of 2014, a fifth of EU citizens relied on cloud accounts to store their documents according to a Eurostat report. Although useful, there are downsides to the use of cloud documents. They often accumulate sensitive information over time, including financial information. This makes them attractive targets to cybercriminals. To understand what happens to compromised cloud documents that contain financial information, we set up 100 fake payroll sheets comprising 1000 fake records of fictional individuals. We populated the sheets with traditional bank payment information, cryptocurrency details, and payment URLs. To lure cybercriminals and other visitors into visiting the sheets, we leaked links pointing to the sheets via paste sites. We collected data from the sheets for a month, during which we observed 235 accesses across 98 sheets. Two sheets were not opened. We also recorded 38 modifications in 7 sheets. We present detailed measurements and analysis of accesses, modifications, edits, and devices that visited payment URLs in the sheets. Contrary to our expectations, bank payment URLs received many more clicks than cryptocurrency payment URLs despite the popularity of cryptocurrencies and emerging blockchain technologies. On the other hand, sheets that contained cryptocurrency details recorded more modifications than sheets that contained traditional banking information. In summary, we present a comprehensive picture of what happens to compromised cloud spreadsheets.
Andreas Haslebacher, Jeremiah Onaolapo, Gianluca Stringhini. eCrime 2017, Scottsdale, USA.
Underground online forums are platforms that enable trades of illicit services and stolen goods. Carding forums, in particular, are known for being focused on trading financial information. However, little evidence exists about the sellers that are present on active carding forums, the precise types of products they advertise, and the prices that buyers pay. Existing literature focuses mainly on the organisation and structure of the forums. Furthermore, studies on carding forums are usually based on literature review, expert interviews, or data from forums that have already been shut down. This paper provides first-of-its-kind empirical evidence on active forums where stolen financial data is traded. We monitored five out of 25 discovered forums, collected posts from the forums over a three-month period, and analysed them quantitatively and qualitatively. We focused our analyses on products, prices, seller prolificacy, seller specialisation, and seller reputation, and present a detailed discussion on our findings.
Enrico Mariconti, Jeremiah Onaolapo, Gordon Ross, Gianluca Stringhini. USENIX CSET 2017, Vancouver, Canada.
Malware samples are created at a pace that makes it difficult for analysis to keep up. When analyzing an unknown malware sample, it is important to assess its capabilities to determine how much damage it can do to its victims, and to decide which threats should be dealt with first. In a corporate environment, for example, a malware infection that is able to steal financial information is much more critical than one that is sending email spam, and should be dealt with at the highest priority. In this paper we present a statistical approach able to determine causality relations between a specific trigger action (e.g., a user visiting a certain website in the browser) and a malware sample. We show that we can learn the typology of a malware sample by presenting it with a number of trigger actions commonly performed by users, and studying to which events the malware reacts. We show that our approach is able to correctly infer causality relations between information stealing malware and login events on websites, as well as between adware and websites containing advertisements.
Kek, Cucks, and God Emperor Trump: A Measurement Study of 4chan’s Politically Incorrect Forum and Its Effect on the Web
Gabriel Emile Hine, Jeremiah Onaolapo, Emiliano De Cristofaro, Nicolas Kourtellis, Ilias Leontiadis, Riginos Samaras, Gianluca Stringhini, Jeremy Blackburn. ICWSM 2017, Montreal, Canada.
The discussion-board site 4chan has been part of the Internet’s dark underbelly since its inception, and recent political events have put it increasingly in the spotlight. In particular, /pol/, the “Politically Incorrect” board, has been a central figure in the outlandish 2016 US election season, as it has often been linked to the alt-right movement and its rhetoric of hate and racism. However, 4chan remains relatively unstudied by the scientific community: little is known about its user base, the content it generates, and how it affects other parts of the Web. In this paper, we start addressing this gap by analyzing /pol/ along several axes, using a dataset of over 8M posts we collected over two and a half months. First, we perform a general characterization, showing that /pol/ users are well distributed around the world and that 4chan’s unique features encourage fresh discussions. We also analyze content, finding, for instance, that YouTube links and hate speech are predominant on /pol/. Overall, our analysis not only provides the first measurement study of /pol/, but also insight into online harassment and hate speech trends in social media.
Enrico Mariconti, Jeremiah Onaolapo, Sharique Ahmad, Nicolas Nikiforou, Manuel Egele, Nick Nikiforakis, Gianluca Stringhini. TheWebConf (WWW) 2017, Perth, Australia.
Users on Twitter are commonly identified by their profile names. These names are used when directly addressing users on Twitter, are part of their profile page URLs, and can become a trademark for popular accounts, with people referring to celebrities by their real name and their profile name, interchangeably. Twitter, however, has chosen to not permanently link profile names to their corresponding user accounts. In fact, Twitter allows users to change their profile name, and afterwards makes the old profile names available for other users to take.
In this paper, we provide a large-scale study of the phenomenon of profile name reuse on Twitter. We show that this phenomenon is not uncommon, investigate the dynamics of profile name reuse, and characterize the accounts that are involved in it. We find that many of these accounts adopt abandoned profile names for questionable purposes, such as spreading malicious content, and using the profile name’s popularity for search engine optimization. Finally, we show that this problem is not unique to Twitter (as other popular online social networks also release profile names) and argue that the risks involved with profile-name reuse outnumber the advantages provided by this feature.
Gibson Mba, Jeremiah Onaolapo, Gianluca Stringhini, Lorenzo Cavallaro. TheWebConf (WWW) CyberSafety Workshop 2017, Perth, Australia.
Most cyberscam-related studies focus on threats perpetrated against Western society, with particular attention to the USA and Europe. Regrettably, no research has been done on scams targeting African countries, especially Nigeria, where the notorious 419 advance-fee scam, targeted towards other countries, originated. However, as we know, cybercrime is a global problem affecting all parties. In this study, we investigate a form of advance-fee fraud unique to Nigeria and targeted at Nigerians, but unknown to the Western world. For the study, we rely substantially on almost two years’ worth of data harvested from an online discussion forum used by criminals. We complement this dataset with recent data from three other active forums to consolidate and generalize the research. We apply machine learning to the data to understand the criminals’ modus operandi. We show that the criminals exploit the socio-political and economic problems prevalent in the country to craft various fraud schemes to defraud vulnerable groups such as secondary school students and unemployed graduates. The results of our research can help potential victims and policy makers to develop measures to counter the activities of these criminal groups.
Jeremiah Onaolapo, Enrico Mariconti, Gianluca Stringhini. IMC 2016, Santa Monica, USA.
Cybercriminals steal access credentials to webmail accounts and then misuse them for their own profit, release them publicly, or sell them on the underground market. Despite the importance of this problem, the research community still lacks a comprehensive understanding of what these stolen accounts are used for. In this paper, we aim to shed light on the modus operandi of miscreants accessing stolen Gmail accounts. We developed an infrastructure that is able to monitor the activity performed by users on Gmail accounts, and leaked credentials to 100 accounts under our control through various means, such as having information-stealing malware capture them, leaking them on public paste sites, and posting them on underground forums. We then monitored the activity recorded on these accounts over a period of 7 months. Our observations allowed us to devise a taxonomy of malicious activity performed on stolen Gmail accounts, to identify differences in the behavior of cybercriminals that get access to stolen accounts through different means, and to identify systematic attempts to evade the protection systems in place at Gmail and blend in with the legitimate user activity. This paper gives the research community a better understanding of a so far understudied, yet critical aspect of the cybercrime economy.
What’s your major threat? On the differences between the network behavior of targeted and commodity malware
Enrico Mariconti, Jeremiah Onaolapo, Gordon Ross, Gianluca Stringhini. ARES WMA 2016, Salzburg, Austria.
This work uses statistical classification techniques to learn about the different network behavior patterns demonstrated by targeted malware and generic malware. Targeted malware is a recent type of threat, involving bespoke software that has been created to target a specific victim. It is considered a more dangerous threat than generic malware, because a targeted attack can cause more serious damage to the victim. Our work aims to automatically distinguish between the network activity generated by the two types of malware, which then allows samples of malware to be classified as being either targeted or generic. For a network administrator, such knowledge can be important because it helps them understand which threats require particular attention. Because a network administrator usually manages more than one alarm simultaneously, the aim of the work is particularly relevant. We set up a sandbox and infected virtual machines with malware, recording all resulting malware activity on the network. Using the network packets produced by the malware samples, we extract features to classify their behavior. Before performing classification, we carefully analyze the features and the dataset to study all their details and gain a deeper understanding of the malware under study. Our use of statistical classifiers is shown to give excellent results in some cases, where we achieved an accuracy of almost 96% in distinguishing between the two types of malware. We can conclude that the network behaviors of the two types of malicious code are very different.
Martin Lazarov, Jeremiah Onaolapo, Gianluca Stringhini. USENIX CSET 2016, Austin, USA.
Cloud-based documents are inherently valuable, due to the volume and nature of sensitive personal and business content stored in them. Despite the importance of such documents to Internet users, there are still large gaps in the understanding of what cybercriminals do when they illicitly get access to them by, for example, compromising the account credentials they are associated with. In this paper, we present a system able to monitor user activity on Google spreadsheets. We populated 5 Google spreadsheets with fake bank account details and fake funds transfer links. Each spreadsheet was configured to report details of accesses and clicks on links back to us. To study how people interact with these spreadsheets in case they are leaked, we posted unique links pointing to the spreadsheets on a popular paste site. We then monitored activity in the accounts for 72 days, and observed 165 accesses in total. We were able to observe interesting modifications to these spreadsheets performed by illicit accesses. For instance, we observed deletion of some fake bank account information, in addition to insults and warnings that some visitors entered in some of the spreadsheets. Our preliminary results show that our system can be used to shed light on cybercriminal behavior with regard to leaked online documents.
Enrico Mariconti, Jeremiah Onaolapo, Syed Sharique Ahmad, Nicolas Nikiforou, Manuel Egele, Nick Nikiforakis, Gianluca Stringhini. EUROSEC 2016, London, UK.
Twitter allows its users to change their profile names at their discretion. Unfortunately, this design decision can be used by attackers to effortlessly hijack user names of popular accounts. We call this practice profile name squatting. In this paper, we investigate this name squatting phenomenon, and show how this can be used to mount impersonation attacks and attract a larger number of victims to potentially malicious content. We observe that malicious users are already performing this attack on Twitter and measure its prevalence. We provide insights into the characteristics of such malicious users, and argue that these problems could be solved if the social network never released old user names for others to use.
Jeremiah Onaolapo, Enrico Mariconti, Gianluca Stringhini. ESSoSDS 2016, London, UK.
Cybercriminals steal access credentials to online accounts in a bid to derive profit from the valuable content of such accounts. The research community lacks a comprehensive understanding of what these stolen accounts are used for. This is largely because it is hard for researchers to collect data on compromised online accounts. To bridge this gap, we present an infrastructure that is able to monitor accesses and activity of cybercriminals in compromised Gmail accounts. We leaked credentials to 100 accounts under our control through information-stealing malware, public paste sites, and underground forums. We then monitored accesses and activity observed in these accounts over a period of 7 months.
Enrico Mariconti, Jeremiah Onaolapo, Gordon Ross, Gianluca Stringhini. ESSoSDS 2016, London, UK.
Current malware analysis methods cannot keep pace with the creation of new malware samples. When analyzing an unknown malware sample, it is important to determine its capacity to damage its victims. In a company, for example, a malware infection from an information stealer sample is much more critical than one from a spambot sample, and has to be dealt with at the highest priority. In this paper, we present a methodology and some initial results about learning the typology of a malware sample by presenting it with a number of user trigger actions, and studying whether the sample reacts to the events. We present a statistical approach able to determine causality relations between a specific trigger action (e.g., a user visiting a certain website) and a malware sample. The initial results show that our approach can correctly infer the causality relations between malware types and trigger events.