CS5007: Computer Systems Course Charter

Introduction

This course is intended to provide Align students with a common base of knowledge in systems programming, including memory management, multi-threaded programming, and introductions to related topics.

CS5007 is part of the second semester of the Align bridge. Before taking this course, students should have completed CS5001 and CS5002.

CS5006 and 5007 are each half-semester courses that can, in theory, be taught in either order. Because C is taught in one and used in both, the courses should ideally be taught by the same instructor, and students taking one should register for both.

The current preferred order is CS5007 followed by 5006.

Example Course Page (PDF)

Goals

The goals of this class are to provide students with the following:

Experience incorporating resource management into software design. Lessons in resource management will be reinforced through practice.

An introduction to concepts from computer architecture, programming languages and software engineering. This course is heavily project based and gives students hands-on experience with selected topics.

CS5007 uses the C programming language because C readily exposes resources (allocated memory, file handles, sockets), making it simple to gain experience writing programs that manage those resources. This can also be a challenge for students who are new to computer science and whose prior experience is with other languages.

As a means of encouraging experimentation, students have used Ubuntu running in a virtual machine as their development environment. Using Linux requires most students to learn a new OS and to develop software with a new set of tools. While there is a ramp-up cost, this does seem to build students’ confidence and it certainly broadens their toolkit.

High-Level Learning Objectives

At the end of CS5007, a student should be able to:

  1. Configure and develop inside a virtual machine. Also, explain what a virtual machine is and how it works.
  2. Write a script to automate a process. For example, an instructor could choose to have students do this using bash.
  3. Using a compiled language, create an executable binary that utilizes a shared library. For example, using C with a Makefile.
  4. Make use of input & output in a computer program. This includes:
    1. Command line IO
    2. File IO
    3. Network IO
  5. Manually manage memory used in a program. Explain the difference between stack memory and heap memory.
  6. Author a program that explicitly makes use of concurrency. For example, using pthreads in C.
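
As an illustration of objective 5, the following is a minimal sketch of the stack vs. heap distinction in C. The function names (copy_to_heap, sum_of_squares) and the fixed scratch-array size are hypothetical and chosen only for this example; they are not part of any course assignment.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of src.
 * The memory outlives this function call, so the caller owns it
 * and must release it with free(). Returns NULL if malloc fails. */
char *copy_to_heap(const char *src) {
    size_t len = strlen(src) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(len);       /* heap: lives until free() is called */
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, src, len);
    return copy;
}

/* Hypothetical helper: sums the first n squares (1*1 + 2*2 + ...)
 * using a scratch array on the stack. The array is released
 * automatically when the function returns, so no free() is needed.
 * Returns -1 if n does not fit in the scratch array. */
int sum_of_squares(int n) {
    int squares[16];                /* stack: automatic storage */
    if (n < 0 || n > 16) {
        return -1;
    }
    int total = 0;
    for (int i = 0; i < n; i++) {
        squares[i] = (i + 1) * (i + 1);
        total += squares[i];
    }
    return total;
}
```

A caller would pair every successful copy_to_heap call with exactly one free, while sum_of_squares needs no cleanup. Returning a pointer to the squares array, by contrast, would be a bug because that storage disappears on return.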

Implementation Details

Students submit assignments via the school’s git server, and class discussions are conducted via Piazza. The TAs grade the homework (using guidelines provided by the instructor), though it is not uncommon for the instructor to field student questions pertaining to grades.

For a given topic, a typical cadence for the course is:

  1. assign reading on the topic
  2. cover the topic in lecture
  3. assign a homework based on the topic that is due 8 days after that lecture
  4. set aside time at the start of the lecture the following week to answer any questions from the homework. This is in addition to providing timely responses on Piazza, the online forum software used for the class.

Class time is a mix of lectures, code review and live coding. This class is broad in that it touches on a wide range of topics outside of the core systems concepts, including: working at the command line, Linux, scripting, build systems (e.g., make), non-graphical editors (vim, emacs), computer architecture and its advancement over time, the history of programming languages, and virtualization. The two main challenges of addressing this wide an array of topics are 1) tying them together and 2) not having the time to go deeply into any one topic.

This class admittedly covers a more constrained set of systems programming topics than a typical undergraduate “introduction to systems” course. This is necessitated by the students’ newness to programming and limited knowledge of computer hardware. Instructional staff should be aware of the cognitive load generated by picking up this many new tools this quickly.

An example of a per-week course schedule is as follows:

  • Week 1: Overview of Computer Systems, Linux crash course
  • Week 2: C programming language; data structures
  • Week 3: Assembly and Machine Representation, CPU Architecture, and Operating Systems
  • Week 4: Compilers, Linkers, and Code Generation
  • Week 5: Processes and The Memory Hierarchy
  • Week 6: Concurrency
  • Week 7: Networking with Sockets

Topics and Learning Outcomes

Note that the topics and learning outcomes are the minimum to be achieved in CS5007. Faculty may choose to present additional material or cover a topic in greater depth. The order of topics here does not imply an ordering for the course itself. Faculty are encouraged to develop syllabi that complement their teaching style. Faculty are also encouraged to share syllabi and other materials with each other.

A note on levels of mastery, which are taken from CS2013 (see cs2013.org): For levels of mastery, CS2013 takes inspiration from (but does not directly follow) Bloom’s Taxonomy. CS2013 defines three levels:

  • Familiarity: The student understands what a concept is or what it means. This level of mastery concerns a basic awareness of a concept as opposed to expecting real facility with its application. It provides an answer to the question “What do you know about this?”
  • Usage: The student is able to use or apply a concept in a concrete way. Using a concept may include, for example, appropriately using a specific concept in a program, using a particular proof technique, or performing a particular analysis. It provides an answer to the question “What do you know how to do?”
  • Assessment: The student is able to consider a concept from multiple viewpoints and/or justify the selection of a particular approach to solve a problem. This level of mastery implies more than using a concept; it involves the ability to select an appropriate approach from understood alternatives. It provides an answer to the question “Why would you do that?”

Note that course topics appear in regular font. Learning outcomes appear in italics.

Also note that some learning outcomes apply to many course topics. More specific learning outcomes appear with their respective topics.

Networking and Communication Introduction

  • Organization of the Internet (Internet Service Providers, Content Providers, etc.)
  • Switching techniques (e.g., circuit, packet)
  • Physical pieces of a network, including hosts, routers, switches, ISPs, wireless, LAN, access point, and firewalls

Articulate the organization of the Internet. [Familiarity]
List and define the appropriate network terminology. [Familiarity]
Describe the layered structure of a typical networked architecture. [Familiarity]

Course Topics Learning Outcomes
Computational Paradigms
  • Basic building blocks and components of a computer
  • Application-level sequential processing: single thread
  • List commonly encountered patterns of how computations are organized. [Familiarity]
  • Describe the basic building blocks of computers and their role in the historical development of computer architecture. [Familiarity]
  • Articulate the differences between single thread vs multiple thread, single server vs. multiple server models. [Familiarity]
  • Articulate the concept of strong vs. weak scaling, i.e., how performance is affected by scale of problem vs. scale of resources to solve the problem. [Familiarity]
  • Write application-level sequential programs. [Usage]
Parallelism
  • Sequential vs. parallel processing
  • Demonstrate on an execution time line that events and operations can take place simultaneously. Explain how work can be performed in less elapsed time if this can be exploited. [Familiarity]
Networked Applications
  • Naming and address schemes (DNS, IP addresses, Uniform Resource Identifiers, etc.)
  • HTTP as an application layer protocol
  • List the differences and the relations between names and addresses in a network. [Familiarity]
  • Define the principles behind naming schemes and resource location. [Familiarity]
  • Implement a simple client-server socket-based application. [Usage]
Overview of Operating Systems
  • Role and purpose of the operating system
  • Functionality of a typical operating system
  • Mechanisms to support client-server models, hand-held devices
  • Design issues (efficiency, robustness, flexibility, portability, security, compatibility)
  • Influences of security, networking, multimedia, windowing systems
  • Explain the objectives and functions of modern operating systems. [Familiarity]
  • Analyze the tradeoffs inherent in operating system design. [Usage]
  • Describe the functions of a contemporary operating system with respect to convenience, efficiency, and the ability to evolve. [Familiarity]
  • Discuss networked, client-server, distributed operating systems and how they differ from single user operating systems. [Familiarity]
  • Identify potential threats to operating systems and the security features designed to guard against them. [Familiarity]
Cross-Layer Communications
  • Programming abstractions, interfaces, use of libraries
  • Describe how computing systems are constructed of layers upon layers, based on separation of concerns, with well-defined interfaces, hiding details of low layers from the higher layers. [Familiarity]
  • Describe how hardware, VM, OS, and applications are additional layers of interpretation/processing. [Familiarity]
  • Describe the mechanisms of how errors are detected, signaled back, and handled through the layers. [Familiarity]
  • Construct a simple program using methods of layering, error detection and recovery, and reflection of error status across layers. [Usage]
  • Find bugs in a layered program by using tools for program tracing, single stepping, and debugging. [Usage]
Defensive Programming
  • Input validation and data sanitization
  • Race conditions
  • Correct handling of exceptions and unexpected behaviors
  • Explain why input validation and data sanitization are necessary in the face of adversarial control of the input channel. [Familiarity]
  • Explain why you might choose to develop a program in a type-safe language like Java, in contrast to an unsafe programming language like C/C++. [Familiarity]
  • Classify common input validation errors, and write correct input validation code. [Usage]
  • Demonstrate using a high-level programming language how to prevent a race condition from occurring and how to handle an exception. [Usage]
  • Demonstrate the identification and graceful handling of error conditions. [Usage]
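
To make the race-condition outcome above concrete, here is a minimal sketch, assuming POSIX threads (the library named in the high-level learning objectives). Several threads increment a shared counter, and a mutex makes each increment atomic with respect to the others. The names (increment_worker, run_counter_demo) and the thread and iteration counts are hypothetical, chosen only for illustration.

```c
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4
#define INCREMENTS_PER_THREAD 100000

/* Shared state: every access to counter is guarded by counter_lock. */
static long counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread body: without the lock, counter++ (a read, an add, and a write)
 * could interleave between threads and lose updates. */
static void *increment_worker(void *arg) {
    (void)arg;  /* unused */
    for (int i = 0; i < INCREMENTS_PER_THREAD; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Starts the workers, waits for them, and returns the final count.
 * Returns -1 if any thread could not be created. */
long run_counter_demo(void) {
    pthread_t threads[NUM_THREADS];
    int started = 0;

    counter = 0;
    for (int i = 0; i < NUM_THREADS; i++) {
        if (pthread_create(&threads[i], NULL, increment_worker, NULL) != 0) {
            break;  /* error path: stop starting new threads */
        }
        started++;
    }
    /* Always join the threads that did start, even on the error path. */
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
    }
    return (started == NUM_THREADS) ? counter : -1;
}
```

With the lock in place, run_counter_demo returns NUM_THREADS * INCREMENTS_PER_THREAD. Removing the lock and unlock calls typically yields a smaller, varying total, which is one way to show students the race. On many systems the program must be compiled with the -pthread flag.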

Though not a Data Structures course, CS5007 introduces several data structures in the context of introducing C and various C implementations. They appear in other courses as well; the various presentations reinforce and complement each other.

Abstract data types: Usage and Implementation

  • Stacks [Use, Implement] Also in 5001
  • Queues [Use] Also in 5001
  • Maps [Use] Some in 5001, 04, 06

Data Structures: Use and Implementation

  • Arrays [Use] Also in 5004, 06
  • Linked Lists [Use, Implement] Also in 5004
  • Discuss appropriate use of data structures above. [Familiarity]
  • Write programs that use the above. [Usage]
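
As one possible illustration of the "Use, Implement" level for stacks and linked lists, the sketch below implements a stack of integers on top of a singly linked list. The type and function names (Node, Stack, stack_push, and so on) are hypothetical and do not come from any course material.

```c
#include <stdbool.h>
#include <stdlib.h>

/* One linked-list node; each push allocates a node on the heap. */
typedef struct Node {
    int value;
    struct Node *next;
} Node;

/* The stack is a pointer to the head of the list (the top element). */
typedef struct {
    Node *top;
} Stack;

void stack_init(Stack *s) {
    s->top = NULL;
}

bool stack_is_empty(const Stack *s) {
    return s->top == NULL;
}

/* Pushes value onto the stack. Returns false if allocation fails. */
bool stack_push(Stack *s, int value) {
    Node *node = malloc(sizeof *node);
    if (node == NULL) {
        return false;
    }
    node->value = value;
    node->next = s->top;   /* new node points at the old top */
    s->top = node;
    return true;
}

/* Pops the top value into *out. Returns false if the stack is empty. */
bool stack_pop(Stack *s, int *out) {
    if (s->top == NULL) {
        return false;
    }
    Node *node = s->top;
    *out = node->value;
    s->top = node->next;
    free(node);            /* every malloc in push is matched by a free */
    return true;
}

/* Frees any remaining nodes so that no memory is leaked. */
void stack_destroy(Stack *s) {
    int ignored;
    while (stack_pop(s, &ignored)) {
        /* keep popping until empty */
    }
}
```

Push and pop both run in constant time because they only touch the head of the list. Reporting failure through the boolean return value, rather than aborting, also connects to the error-handling outcomes under Defensive Programming.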