805 Columbus Avenue
617 Interdisciplinary Science and Engineering Complex (ISEC)
Boston, MA 02120
ATTN: William Robertson, 435 ISEC
360 Huntington Avenue
Boston, MA 02115
- Systems security
- Web security
- Mobile security
- PhD in computer science, University of California, Santa Barbara
- BS in computer science, University of California, Santa Barbara
William Robertson is an associate professor of computer science at Northeastern University and co-directs the Northeastern Systems Security Lab. Robertson’s research revolves around improving the security of operating systems, mobile devices, and the web, as well as making use of techniques such as security by design, program analysis, and anomaly detection. Prior to joining Northeastern in 2011, he was a postdoctoral researcher at UC Berkeley (2009-2011).
Robertson was involved in both the California Top-to-Bottom-Review and the Ohio EVEREST projects as a Red Team member. In this capacity, he demonstrated that electronic voting systems were susceptible to large-scale attacks that could exploit numerous vulnerabilities in the firmware and physical security of the components of the voting system. His work led to significant changes in public policy in both states with respect to electronic voting. He has extensive experience in organizing and participating in Capture-the-Flag exercises. With Shellphish, a team composed of UCSB-affiliated members, he won the 2005 edition of the DEFCON CTF competition. He was also instrumental in helping to organize the UCSB iCTF, the largest distributed CTF competition.
Additionally, he is the program co-chair of the Annual Computer Security Applications Conference for 2015-2016, was the co-chair of the 2013 USENIX Workshop on Offensive Technologies, co-located with USENIX Security, and was the chair of the 2012 Conference on the Detection of Intrusions and Malware & Vulnerability Assessment. He has participated on the program committees of a number of top-tier systems security venues, including IEEE Security and Privacy, USENIX Security, NDSS, ACSAC, and RAID. He has authored more than thirty peer-reviewed journal and conference papers in the area of systems and network security.
Automated Reverse Engineering of Commodity Software
While prior academic work has examined how to automatically discover vulnerabilities in binary software, and even how to automatically craft exploits for these vulnerabilities, the ability to answer basic security-relevant questions about closed-source software remains elusive. This project aims to provide algorithms and tools for answering these questions.
Software, including common examples such as commercial applications or embedded device firmware, is often delivered as closed-source binaries. While prior academic work has examined how to automatically discover vulnerabilities in binary software, and even how to automatically craft exploits for these vulnerabilities, the ability to answer basic security-relevant questions about closed-source software remains elusive.
This project aims to provide algorithms and tools for answering these questions. Leveraging prior work on emulator-based dynamic analyses, we propose techniques for scaling this high-fidelity analysis to capture and extract whole-system behavior in the context of embedded device firmware and closed-source applications. Using a combination of dynamic execution traces collected from this analysis platform and binary code analysis techniques, we propose techniques for automated structural analysis of binary program artifacts, decomposing system and user-level programs into logical modules through inference of high-level semantic behavior. This decomposition provides as output an automatically learned description of the interfaces and information flows between each module at a sub-program granularity. Specific activities include: (a) developing a software-guided whole-system emulator for supporting sophisticated dynamic analyses for real embedded systems; (b) developing advanced, automated techniques for structurally decomposing closed-source software into its constituent modules; (c) developing automated techniques for producing high-level summaries of whole-system executions and software components; and (d) developing techniques for automating the reverse engineering and fuzz testing of encrypted network protocols. The research proposed herein will have a significant impact outside of the security research community. We will incorporate the research findings of our program into our undergraduate and graduate teaching curricula, as well as in extracurricular educational efforts such as Capture-the-Flag that have broad outreach in the greater Boston and Atlanta metropolitan areas.
The close ties to industry that the collective PIs possess will facilitate transitioning the research into practical defensive tools that can be deployed into real-world systems and networks.
Tobias Lauinger, Abdelberi Chaabane, Ahmet Salih Buyukkayhan, Kaan Onarlioglu, William Robertson. Game of Registrars: An Empirical Analysis of Post-Expiration Domain Name Takeovers. USENIX Security Symposium. August 2017.
Every day, hundreds of thousands of Internet domain names are abandoned by their owners and become available for re-registration. Yet, there appears to be enough residual value and demand from domain speculators to give rise to a highly competitive ecosystem of drop-catch services that race to be the first to re-register potentially desirable domain names in the very instant the old registration is deleted. To pre-empt the competitive (and uncertain) race to re-registration, some registrars sell their own customers’ expired domains pre-release, that is, even before the names are returned to general availability.
These practices are not without controversy, and can have serious security consequences. In this paper, we present an empirical analysis of these two kinds of post-expiration domain ownership changes. We find that 10% of all .com domains are re-registered on the same day as their old registration is deleted. In the case of .org, over 50% of re-registrations on the deletion day occur during only 30s. Furthermore, drop-catch services control over 75% of accredited domain registrars and cause more than 80% of domain creation attempts, but represent only between 9% and 17% of successful domain creations. These findings highlight a significant demand for expired domains, and hint at highly competitive re-registrations.
Our work sheds light on various questionable practices in an opaque ecosystem. The implications go beyond the annoyance of websites turned into “Internet graffiti”, as domain ownership changes have the potential to circumvent established security protocols.
Semi-automated Discovery of Server-Based Information Oversharing Vulnerabilities in Android Applications
Wil Koch, Abdelberi Chaabane, Manuel Egele, William Robertson, Engin Kirda. In Proceedings of the International Symposium on Software Testing and Analysis (ISSTA), Santa Barbara, CA, USA, July 2017.
Modern applications are often split into separate client and server tiers that communicate via message passing over the network. One well-understood threat to privacy for such applications is the leakage of sensitive user information either in transit or at the server. In response, an array of defensive techniques have been developed to identify or block unintended or malicious information leakage. However, prior work has primarily considered privacy leaks originating at the client directed at the server, while leakage in the reverse direction – from the server to the client – is comparatively under-studied. The question of whether and to what degree this leakage constitutes a threat remains an open question. We answer this question in the affirmative with Hush, a technique for semi-automatically identifying Server-based InFormation OvershariNg (SIFON) vulnerabilities in multi-tier applications. In particular, the technique detects SIFON vulnerabilities using a heuristic that over-shared sensitive information from server-side APIs will not be displayed by the application’s user interface. The technique first performs a scalable static program analysis to screen applications for potential vulnerabilities, and then attempts to confirm these candidates as true vulnerabilities with a partially-automated dynamic analysis. Our evaluation over a large corpus of Android applications demonstrates the effectiveness of the technique by discovering several previously-unknown SIFON vulnerabilities in eight applications.
Ahmet Salih Buyukkayhan, Alina Oprea, Zhou Li, William Robertson. Lens on the endpoint: Hunting for malicious software through endpoint data analysis. International Symposium on Research in Attacks, Intrusions and Defenses (RAID). September 2017.
Organizations are facing an increasing number of criminal threats ranging from opportunistic malware to more advanced targeted attacks. While various security technologies are available to protect organizations’ perimeters, still many breaches lead to undesired consequences such as loss of proprietary information, financial burden, and reputation defacing. Recently, endpoint monitoring agents that inspect system-level activities on user machines started to gain traction and be deployed in the industry as an additional defense layer. Their application, though, in most cases is only for forensic investigation to determine the root cause of an incident.
In this paper, we demonstrate how endpoint monitoring can be proactively used for detecting and prioritizing suspicious software modules overlooked by other defenses. Compared to other environments in which host-based detection proved successful, our setting of a large enterprise introduces unique challenges, including the heterogeneous environment (users installing software of their choice), limited ground truth (small number of malicious software available for training), and coarse-grained data collection (strict requirements are imposed on agents’ performance overhead). Through applications of clustering and outlier detection algorithms, we develop techniques to identify modules with known malicious behavior, as well as modules impersonating popular benign applications. We leverage a large number of static, behavioral and contextual features in our algorithms, and new feature weighting methods that are resilient against missing attributes. The large majority of our findings are confirmed as malicious by anti-virus tools and manual investigation by experienced security analysts.
Muhammad Ahmad Bashir, Sajjad Arshad, William Robertson, and Christo Wilson. In Proceedings of USENIX Security. Austin, TX, August 2016.
Numerous surveys have shown that Web users are concerned about the loss of privacy associated with online tracking. Alarmingly, these surveys also reveal that people are unaware of the amount of data sharing that occurs between ad exchanges, and thus underestimate the privacy risks associated with online tracking.
In reality, the modern ad ecosystem is fueled by a flow of user data between trackers and ad exchanges. Although recent work has shown that ad exchanges routinely perform cookie matching with other exchanges, these studies are based on brittle heuristics that cannot detect all forms of information sharing, especially under adversarial conditions.
In this study, we develop a methodology that is able to detect client- and server-side flows of information between arbitrary ad exchanges. Our key insight is to leverage retargeted ads as a tool for identifying information flows. Intuitively, our methodology works because it relies on the semantics of how exchanges serve ads, rather than focusing on specific cookie matching mechanisms. Using crawled data on 35,448 ad impressions, we show that our methodology can successfully categorize four different kinds of information sharing behavior between ad exchanges, including cases where existing heuristic methods fail.
We conclude with a discussion of how our findings and methodologies can be leveraged to give users more control over what kind of ads they see and how their information is shared between ad exchanges.
UNVEIL: A Large-Scale, Automated Approach to Detecting Ransomware. A. Kharraz, S. Arshad, C. Mulliner, W. Robertson, E. Kirda. In USENIX Security Symposium, Austin, TX, US, Aug 2016.
Although the concept of ransomware is not new (i.e., such attacks date back at least as far as the 1980s), this type of malware has recently experienced a resurgence in popularity. In fact, in the last few years, a number of high-profile ransomware attacks were reported, such as the large-scale attack against Sony that prompted the company to delay the release of the film “The Interview.” Ransomware typically operates by locking the desktop of the victim to render the system inaccessible to the user, or by encrypting, overwriting, or deleting the user’s files. However, while many generic malware detection systems have been proposed, none of these systems have attempted to specifically address the ransomware detection problem.
In this paper, we present a novel dynamic analysis system called UNVEIL that is specifically designed to detect ransomware. The key insight of the analysis is that in order to mount a successful attack, ransomware must tamper with a user’s files or desktop. UNVEIL automatically generates an artificial user environment, and detects when ransomware interacts with user data. In parallel, the approach tracks changes to the system’s desktop that indicate ransomware-like behavior. Our evaluation shows that UNVEIL significantly improves the state of the art, and is able to identify previously unknown evasive ransomware that was not detected by the antimalware industry.
Ozcan, Ahmet Talha, et al. "BabelCrypt: The Universal Encryption Layer for Mobile Messaging Applications." Financial Cryptography and Data Security. Springer Berlin Heidelberg, 2015. 355-369.
Internet-based mobile messaging applications have become a ubiquitous means of communication, and have quickly gained popularity over cellular short messages (SMS). Unfortunately, from a security point of view, free messaging services do not guarantee the privacy of users. For example, free messaging providers can record and store exchanged messages indefinitely to collect information about specific users. Moreover, these messages can be accessed by criminals who gain access to social media accounts. In this paper, we introduce BabelCrypt, a system that addresses the problem of automatically retrofitting arbitrary mobile chat applications with end-to-end encryption. Our system works by transparently interfacing with the original client applications supplied by the respective service providers. It does not require any modification to the individual applications, nor does it require any knowledge or customization for specific chat applications. BabelCrypt is able to automatically inject control messages in-band, using the underlying application’s message exchange mechanism, and thus supports running arbitrarily complex encryption protocols such as OTR. We successfully used BabelCrypt with a number of popular messaging applications including Facebook Messenger, WhatsApp, and Skype. Our evaluation shows that BabelCrypt provides end-to-end security for arbitrary messaging applications while satisfactorily preserving the original user experience of the messaging application.
"TrueClick: automatically distinguishing trick banners from genuine download links." S Duman, K Onarlioglu, AO Ulusoy, W Robertson, E Kirda. Proceedings of the 30th Annual Computer Security, 2014
The ubiquity of Internet advertising has made it a popular target for attackers. One well-known instance of these attacks is the widespread use of trick banners that use social engineering techniques to lure victims into clicking on deceptive fake links, potentially leading to a malicious domain or malware. A recent and pervasive trend by attackers is to imitate the “download” or “play” buttons in popular file sharing sites (e.g., one-click hosters, video-streaming sites, bittorrent sites) in an attempt to trick users into clicking on these fake banners instead of the genuine link.
In this paper, we explore the problem of automatically assisting Internet users in detecting malicious trick banners and helping them identify the correct link. We present a set of features to characterize trick banners based on their visual properties such as image size, color, placement on the enclosing webpage, whether they contain animation effects, and whether they consistently appear with the same visual properties on consecutive loads of the same webpage. We have implemented a tool called TrueClick, which uses image processing and machine learning techniques to build a classifier based on these features to automatically detect the trick banners on a webpage. Our approach automatically classifies trick banners, and requires no manual effort to compile blacklists as current approaches do. Our experiments show that TrueClick results in a 3.55 factor improvement in correct link selection in the absence of other ad blocking software, and that it can detect trick banners missed by a popular ad detection tool, Adblock Plus.