CS 6535: Engineering Reliable Software

Spring 2019



The story. Software bugs have been compared to death and taxes: they seem unavoidable. While some contest that statement, much of the software development cycle today is spent on finding (and getting rid of) bugs in programs — in large projects sometimes more than 50% of the time. Research and industry have come up with a wild variety of techniques to help engineers identify flaws in their software. Every programmer in industry must be aware of, and be able to use, at least some of these techniques in his or her career.

The course. In this course you will learn about a broad set of software reliability methods used in industry, ranging from human-centered code analysis methods, to systematic program testing, and high-precision formal software analysis approaches. As a side effect, you will also become a better programmer: chances are that programming bugs you discover during exercises in this course (in your own code or someone else's) result from mistakes you will never make in the future to begin with.

Software bug

Synopsis


Course topics

  1. Property specifications: how to state precisely what your program is supposed to do, using formal English or property specification languages. Formal property specifications serve two purposes: (i) they force you to be precise about the intended purpose and properties of the program (for your benefit, and that of others), and (ii) they allow automated techniques to test or otherwise analyze your program as to whether it meets those properties. “Writing specs” is a critical skill of software designers.

  2. Trial-based approaches to software reliability: Program testing is the most widely used form of vetting a program's reliability. Historically ad-hoc, large-scale program testing today is a very systematic activity. We will study it

  3. Symbolic code analysis: treating programs as mathematical formulas. This powerful paradigm, going back to landmark work from the 1970s, encodes a program's execution as a set of constraints over the program variables at any point in time. Such an encoding gives you access to modern constraint solver technology (which didn't exist in the 1970s in comparable form) and is extremely instrumental in various software reliability techniques, from test case generation to formal program verification.

  4. Formal approaches to software reliability: such approaches emphasize the fact that program source code has (mostly) well-defined semantics and can thus be analyzed with mathematical and logical techniques. We will study the most widely used such technique: model checking, a method to systematically search programs for bugs, in a way that can either expose them, or in fact guarantee their absence.

  5. Other techniques: (if time permits) code monitoring and runtime analysis; theorem proving
We will study these topics hands-on, using tools and languages that you have already seen in your prior CS education, or will see in your professional career as a software programmer. Candidates for such tools include ScalaTest, the Java Pathfinder, the KLEE symbolic execution engine, the ACL2s programming environment, the CBMC model checker, and many more.

Methodology

Lecture time will be spent on highly-interactive instructor presentations, discussions, practical exercises, and presentations by students. Outside the lecture, there will be homework exercises (some shallow, some deeper) to bring across the techniques, and a more comprehensive course project that should span several aspects of the course, or go into depth on one of them. Projects are chosen in negotiation with the instructor.

Prerequisites

I. Formal prerequisites: The course is intended for upper-division undergraduates and Master-level students. Details:
  1. Undergraduates: you should be a Junior or Senior. You will be able to benefit from this course most if you have had exposure to programming in some imperative language such as Java or C++ and are curious about the power of automated reasoning (i.e., not using pencil and paper) for such languages (rather than the teaching languages used in Fundies I or Logic and Computation). Examples of supporting courses you might have taken include Programming Languages (CS 4400) and Software Development (CS 4500). An example of a related course is Computer-Aided Reasoning (CS 4820). In comparison to CS 4820, this course will discuss reliability broadly (covering systematic software testing as well as formal techniques), emphasize imperative languages as used in industry, and emphasize model checking systems rather than theorem provers.

  2. MS students: you can take this course at any stage of your degree program, but it will be most beneficial after you have taken CS 5010 "Programming Design Paradigms". CS 6535 is not about program design, but about program investigation, in the broadest sense of the word. If you are planning to work in serious software development jobs (or you already do), techniques learned in this course will be a plus and make you stand out among fellow job applicants.

II. Content prerequisites: (listed roughly in order of decreasing significance) You should also not be afraid of basic (e.g. propositional) logic, and of basic algorithmic concepts such as graph search.

Course components and grading

Your final grade will depend on the following contributions to the class:

Textbooks

All material required in class will be presented during lecture discussions, so we will not require any textbook. Here some recommended books for your own perusal. Some of the course material will be taken from these books, others will not. We will discuss these books on the first class day.