To appear in Computer Science Education:

What We Swept Under the Rug:
Radically Rethinking CS1

Lynn Andrea Stein*

Massachusetts Institute of Technology

 

Abstract

Introductory computer science education is entrenched in an outdated computational model. Although it corresponds neither to our computing environments nor our work, we teach our students a single-thread-of-control static problem-solving view of the role of the computer program: computation as calculation. In this model, the job of a computer program is to start with a problem, calculate its answer, return that answer, and stop. This program-as-an-island bears little resemblance to most of today's software.

We can dramatically improve this situation--and, as a corollary, all of undergraduate computer science--by teaching our students from the very beginning to conceptualize computation with a model of computer programs as simultaneous ongoing entities embedded in and interacting with a dynamic environment: computation as interaction; computation as it occurs in spreadsheets and video games, web applications and robots.

Contents

  1. Introduction
  2. Motivation
    1. The Problem
    2. An Alternative Approach
  3. Course Overview
  4. Implementation
    1. Modes of Teaching
    2. Why Now?
    3. Curricular Implications
  5. Conclusion
  6. Course Syllabus
  7. Acknowledgements

1. Introduction

Many schools are considering a switch to Java in CS1. If we do this, we can continue to teach approximately our existing courses, sweeping a few nasty details (e.g., the enforced concurrency of AWT) under the rug. Alternately, we can use this switch to make a fundamental--and perhaps long overdue--change to the way that we teach introductory programming.

This paper describes a rather unusual CS1 course. It uses Java and it teaches the basic ideas of program decomposition and design. In most other ways, however, this course is not much like a traditional introductory programming course. Its main focus is on the idea of COMPUTATION AS INTERACTION. In this view, programming is about constituting a community of interacting processes, and the fundamental questions one asks have to do what goes inside each of these community members and how they interact. (Java facilitates, but is not necessary for, this course.) This paper presents the motivations behind this approach and gives an overview of the course.

Perhaps the most fundamental idea in modern computer science is that of interactive processes. Computation is embedded in a (physical or virtual) world; its role is to interact with that world to produce desired behavior. While von Neumann serial programming has it that computation-as-calculation uses inputs--at the beginning--to produce outputs--at the end--computation-as-interaction treats inputs as things that are monitored and outputs as actions that are taken over the lifetime of an ongoing process. By beginning with a decomposition in terms of interacting computational processes, we can teach our students a model of the world much closer to the one that underlies the thinking of most CS professionals today.

2. Motivation

Typically, an introductory computer science course teaches students how to write computer programs. The aim of this approach is twofold: on the one hand, students learn an important skill, a tool that will serve them well within this discipline. The more fundamental reason why so many schools teach programming first, however, is that programming teaches students the basic vocabulary of computer science, including what computation is, how it works, and what questions one ought to ask about it.

2.1. The Problem

Introductory computer science is where students learn the basic vocabulary of computation, the fundamental language that we as a field use. Over the decades, this language has shifted, first to include procedures, more recently to incorporate objects. This work explores the next transition, one that is increasingly crucial to the way that we as practitioners think about computation.

When computation was first performed mechanically, its basic vocabulary was the sequencing of primitive steps. The addition of procedural abstraction allowed a more abstract conceptualization of computation, roughly in terms of input-output functionality. Rather than manipulating data in single primitive steps, programs with procedural abstraction allow a collection of these steps to be treated as a new primitive. The basic vocabulary of this kind of computation is that of functional behavior. This is the approach at the heart of most traditional introductory computer science courses. A more recent variant packages functionally-defined procedures together with their associated data, producing an objects-first approach.

Current practice changes the fundamental vocabulary of computation further: not only are functions replaced by interaction patterns, but these interaction patterns take place concurrently and asynchronously. Programming no longer (necessarily) involves designing THE flow of control in a system; instead, it is fundamentally about constituting a community of autonomously interacting entities, deciding what goes inside and what goes between them.

Consider the contrast between a traditional view of word processing and a current version. An old-style word-processing session consists of text creation, followed by spell-checking, then text formatting (e.g., with LaTeX or n/troff), then perhaps format conversion (e.g. dvips), and finally previewing or printing. Today, as I type this paper, the word processor is simultaneously accepting text and laying it out on the page. As I mis-type--e.g., "hte" instead of "the"--the program automatically corrects my spelling. One could imagine that it also critiques my writing style or even searches the web for relevant on-line references as well. All of this happens concurrently and interactively. My word processor is apparently a community.

Today's undergraduates express a certain dissatisfaction with the coverage provided by traditional curricula. These complaints are echoed by their industrial employers. While some of this criticism stems from a fundamental confusion between tool-specific training and a general conceptual education, a second and more legitimate source of this concern is the discrepancy between the basics of traditional computation-as-calculation and the fundamentals of this new kind of computation. Interaction is a the heart of today's reality, but it has made only minimal inroads into the computer science curriculum.

2.2. An Alternative Approach

Our course, like many introductory programming courses, teaches its students to think in terms of the fundamental vocabulary of computation. The novelty of our approach is in what we take to be that fundamental vocabulary. First, the atomic unit of computation is NOT

print("Hello, world!");

--the paradigmatic ``step'' or function of computation. Instead, the atomic unit of computation is

while(true){ echo(); }1

--an autonomous interactive control loop which constantly monitors and responds to its environment. Second, the questions this leads one to ask about computation are different: not ``What value does it produce?'' or ``How long does it take?'', but ``What are the entities constituting the computational community?'', ``What resources does each control?'', and ``How do the members of this community interact?''

Although this approach corresponds more closely to computation as it is practiced today, there has been to date no curricular material supporting its introduction in the early computer science curriculum. There has also been a certain amount of skepticism as to the feasibility of teaching topics such as concurrency to first semester freshmen. Finally, until recently the computational infrastructure needed to support such an approach was simply not available.

Today, these factors have converged to make this new approach not only feasible but necessary. Systems today are increasingly difficult to describe in the old paradigm. Networks, distributed computing, and concurrency are integral parts of the computational world. And languages such as Java force us to confront these issues in our curriculum or else to recognize that we are deliberately teaching our students a view of computation that no longer corresponds to the majority of our practice.

Students trained in this new curriculum are likely to experience their subsequent curriculum in a new light. For example, an operating systems course no longer needs to teach both the idea of concurrency and the mechanisms by which it is implemented; students will have been living in a concurrent computational world from the very beginning. Similarly, an algorithms course can introduce topics like parallel search and sort--natural topics, but ones which in the current curriculum require too much intellectual baggage to be included in a sophomore-level class.

Further, a whole host of issues that now fit into our curriculum only poorly if at all now become sensible parts of the model of computation that we teach our students. For example, the traditional curriculum has a tremendously difficult time handling the topic of user interfaces. In our new vision, accounting for the role of the user becomes straightforward: the user is another member of the community of interacting processes that together constitute our computation.

3. Course Overview

A syllabus for our course is included at the end of this paper. All of our course materials, including course notes (a textbook-in-progress) and assignments, are available from our course web site, http://www-cs101.ai.mit.edu/. Further information about the project is available at http://www.ai.mit.edu/projects/cs101/.

In the first several weeks of the course, we focus on teaching the students the most basic elements of Java programming: expressions and statements, objects and classes. In this way, our course might seem like a traditional introduction to programming. However, at the same time that students are learning to construct program fragments, we are teaching them about the relationships between aggregates and the entities of which they are made. For example, an early laboratory involves writing an entity to control one end of a seesaw. 2 Each iteration of the student's code senses the relative slope of the seesaw (up or down), then pushes the end (down or up) as appropriate. We supply a GUI that enables the student to visualize the virtual seesaw. Students run their code at both ends of the seesaw or at one end with code supplied by course staff or classmates at the other. This laboratory teaches simple mechanics and syntax of Java. At the same time, students observe the relationships between the code they write and the behavior of the system as a whole. They observe variation in overall behavior as they modify the strategy used by their code or as their code interacts with various other entities.

The student's code has available the relative slope of the seesaw; the student's code returns an integer indicating the force of an upwards or downwards push. It is run in concert with a GUI that allows the student to observe the interactions between their own code and the code controlling the other end of the seesaw.3 We provide the students with a GUI that allows them to visualize the behavior of their code. The program is designed to be insensitive to which side of the GUI it controls.

By the end of the first month of the term, students have learned to write what might in other contexts be called ``stand-alone'' programs: complete programs including class definitions and a main routine. They have also learned that every program is a part of a system of interacting entities--including the user, libraries and other software, hardware, etc.--and that no program truly stands alone. The remainder of the course addresses issues and alternatives that arise in the design of software communities.

In the second segment of the course, we focus on the internal structure of a computational entity. We begin with a central control loop and look at how it can respond to a variety of inputs. [For example, a calculator.] This leads to a discussion of dispatch control structures and an exploration of procedural abstraction. [Handle number, handle operator.] As these procedures become increasingly weighty, the central control loop's importance recedes until it is merely the manager/dispatcher that ensures that a particular procedure is invoked. This leads naturally to an investigation of event-driven programming, in which the dispatch process is minimized, often implicit or provided by the underlying system, and the focus of the program is on what to do when a particular event is invoked. A third model--smart objects that handle their own behavior [*-button, =-button]--is also explored. Comparing and contrasting these three approaches to system design gives the students an appreciation for issues that they will continue to confront throughout their careers.

The third segment addresses the issue of how entities are tied together. A recurring theme--throughout the course, but emphasized here--concerns the design of interface. This refers both to the Java construct--a signature specification--and to the more general concept. What services does a particular entity provide? What does it promise, and under what circumstances? Students design, in pairs, a simple networked video game. A part of the structure is shared and so built as a team. Another part, concerning what is sent across the network, when, and by whom, is designed together but implemented separately. Afterwards, the groups discuss their different choices for this interface--including, for example, a client push vs. server pull--and learn that though each choice has advantages, both sides of the same application must fit a single pattern. You cannot simply pair a server push server with a client pull client.

[This video game also illustrates a community of entities--the players' programs--that are themselves composed of communities of entities--e.g., an event-driven one to handle the GUI and a control loop/server to monitor the network connection.]

In addition to learning how to specify an interface, students learn what the interface does not specify. For example, a typical server push may rely on the client to accept every message that is sent; if the server outpaces the client, the system may fail even though the specified interface has been rigidly adhered to. Students learn, too, about streams and messages and shared memory, about connecting to objects in the same name space and to those running under different processes or on different machines, and about how to communicate with them. Finally, they learn about safety and liveness, that shared mutable state can lead to program failures, and some simple mechanisms for coping with them. They do not, of course, learn to build arbitrarily complex designs that avoid deadlock under all circumstances; this is a topic that will be revisited later in the computer science curriculum. Instead, they learn to recognize the general preconditions for the possibility of safety failures and the kinds of solutions that might be possible. The goal, throughout this course, is to give our students the basic conceptual vocabulary that will allow them to ask the right questions later in their education.

The final segment of the curriculum is intended to introduce students to a variety of canonical architectures for distributed systems. The idea is that these patterns enable them to constitute communities of interacting entities, each of which is itself a community, and so on. Rather than specifying all of the interactions at each of these levels, students learn to name certain interaction patterns and to use these to isolate the details of one level from another. For example, we discuss a variety of client-server models--including broadcast and a hierarchical model such as the domain name service--peer architectures such as round robin, blackboards, explicit arbitration, subsumption, etc. Several of the earlier assignments are revisited; in addition, students complete a final project. A typical final project from the 1996 course was a networked client-server chat program that sent both text and colored line drawings to individuals or groups spread across the internet.

Not what you would generally expect from first semester freshmen!

4. Implementation

The course as described has been taught at MIT several times in a variety of contexts. (These include regular term as well as summer session/professional institute.) The course has been apparently quite successful; the current term was significantly oversubscribed. Students going on to subsequent course work in the department report that their experience with this material was extremely helpful. Course evaluations have been positive. Nonetheless, more experience with the material in a wider variety of settings is obviously desirable.

This section discusses a few of the implications of the course, e.g., for teaching style and for later curriculum.

4.1. Modes of Teaching

The traditional approach to introductory programming gives students few opportunities to observe the behavior of their code in a context other than debugging. In this course, design and implementation work is supplemented with observational laboratory assignments, inviting students to consider not only how to build a program, but how to anticipate its behavior and how to modify that behavior. While such sessions are common in the laboratory science, this in-class hands-on observational experience is largely missing from computer science. Understanding the interactions between program and behavior is critical in many modern applications.

A major component of this class is a weekly three hour in-class laboratory. Students prepare for the laboratory in advance, including both design of code and experimental predictions (but no actual coding). In the lab, they not only implement and debug code (typical closed lab activities) but also experiment with and observe the behavior of their code. The nature of the programs used in this course is such that their observable interactions are often complex. Unexpected interactions often arise. Students thus learn the relationships between code they write and system behavior, as well as the relationships between system composition and system behavior. After lab, students are responsible for a post-lab write-up which includes explicit comparison of their pre-lab predictions and their empirical observations.

It is a thesis of this course that computation-as-interaction is best taught through interaction with computation.

4.2. Why Now?

If this is such a reasonable idea, it might seem surprising that introductory computer science is not already taught this way.4 One explanation is simply the lag between the time that an idea hits the research frontier and the time at which it is ready to be integrated into the lowest levels of our curriculum. An equally significant issue, however, has been the non-availability of simple models of embedded embodied computation.

Actually, conventional computer science curricula do currently teach this model of computation: in an upper level course, typically one on operating systems. Unfortunately, an operating system, while a wonderful example of this type of computation, is also an extremely complicated program and notoriously difficult for students to understand. One reason for this may be their prior training in serial computation; but the inherent complexity of an operating system, which is a software system whose behavior is largely unobservable to the eye, is in itself sufficient explanation of its difficulty. In any case, an operating system is obviously inappropriate for a first-term programming course; it is simply beyond the reach of most introductory computer science students.

An operating system is only the most commonly taught example of embedded computation. Many other consequences and implications of this approach are also treated in existing curricula, though disjointedly: distributed algorithms in a theoretical computer science class, transaction processing in databases, parallel processing in operating systems, interprocess communication models in programming languages, event-driven computation in a class on interfaces, multi-agent interaction in distributed AI. One implication of this course is to tie together all of these ideas and to present their simpler manifestations early in the curriculum. Students who begin here are likely to see all of these issues as pieces of a larger whole, and to have better appreciation for their complexities when revisiting them in these upper level courses. The difficulty is in finding a way to package up these issues and make them tangible for students who have never programmed a computer before.

4.3. Curricular Implications

Although this approach to introductory CS seems intriguing, it is hardly likely to see widespread acceptance if it require a reworking of the entire undergraduate curriculum. Fortunately, it turns out that the new introductory course slides nicely into place without requiring much change at all; the upper level courses can continue as they are, but are likely to find their task simplified somewhat by the new perspective that students bring to them.

After all, this course is really still an introductory programming course. Its thematic lesson concerns a model of computation as interaction, rather than calculation. But its pragmatic goals include most of the skills that are learned in existing versions of introductory CS. As a concrete example, sorting algorithms absolutely have a place in the revised course; it is now, however, equally appropriate to discuss parallel quicksort as bubble sort. The fundamental lesson of this course remains how to take a description and construct a program whose behavior implements that description; the differences are in the underlying assumptions, the kinds of descriptions that can therefore be considered, and the corresponding conceptualizations used to build the program. The computational constructs and modeling tools have changed; the problem remains the programming.

The remainder of the curriculum which begins with an introduction to computation on these terms may thus look much like the existing computer science undergraduate curriculum. Nonetheless, there are subtle but significant difference. Several important topics that are currently covered only in advanced undergraduate or graduate level classes can be introduced earlier in the curriculum. For example, topics in distributed algorithms and parallel complexity--such as the time/processor tradeoff--can be taught in the first course in computer science theory if the model of parallel computation is already familiar. Since modern algorithms increasingly make use of such approaches, it seems only natural to expose our undergraduates to the fundamental ideas in these areas.

Other topics, already present at the undergraduate level, become much easier to explain when students come equipped with this world view. Much of operating systems becomes an exploration of different methods for implementing and ensuring appropriate behavior of multi-processing, rather than focusing on the concept of parallel execution itself. Students seeing these ideas for the second time, now in depth, are more likely to appreciate some of the subtleties of the problem rather than being confused by the many levels at which operating system code must operate. Synchronization and interprocess communication can be introduced along with scheduling. Transaction-safety, remote procedure call, and shared memory models similarly follow smoothly from this approach.

Yet other important topics have never fit well into the curriculum under the old model. As one concrete example, consider the role of the user. Traditional programming courses--traditional computer science curricula, even--have always had difficulty explaining the role of the user in a software system. In many cases, this ``special case'' is tacked on to a curriculum as an afterthought (or altogether ignored), largely because it just doesn't fit. In contrast, our model of computation has no trouble accounting for the role of the user. The user is simply another member of the community of interactions; our job is to develop, with the user, an acceptable interface that gives each participant--program or person--an appropriate set of responsibilities and services. Of course, a human has different skills and needs from a computer program; but this, too, is a natural part of our larger way of thinking--and teaching--- about computational systems.

5. Conclusion

As we have moved from Pascal to C to C++ to Java without changing our fundamental notion of computation, we have swept more and more details of how computation really works under the metaphorical rug. It is time to come clean, to rethink our fundamental notion of computation in a way that is much closer to current practice. This reconceptualization allows a radical shift that runs throughout the curriculum without requiring a significant restructuring of our course sequence.

 

Course Syllabus

Class Sessions

Laboratories

Introduction to Interactive Programming

Expressions and Statements

Spirograph (Expressions and Statements)

Objects and Classes

Interfaces and Exceptions

Nodes and Channels (Interactions)

Self-Animating Objects

Inheritance

Design Project

Student Holiday

Object-Oriented Programming

Balance (Classes)

Dispatch Mechanisms

Procedural Abstraction

Calculator (Procedures)

In-Class Examination

Events Driven Programming and java.awt

(Documentation Project)

Columbus Day

Event Delegation (and more java.awt)

Scribble (Events)

Safety, Livenenss, and Synchronization

Interfaces and Protocols: Composing Systems

No Laboratory

Push and Pull

Cat and Mouse (Systems of Systems)

In-Class Examination

Explicit Communication: java.io, java.net

Veteran's Day

Servers

Final Project (Networked Interactions)

Arbitration or RMI

Design Architectures

On Presentations

Thanksgiving Holiday

Group Project Presentations

In-Class Examination

Interactive Programming as Program Design

Course materials corresponding to the lectures and laboratories in this syllabus are available on the course web site, http://www-cs101.ai.mit.edu/

Acknowledgements

Too many people have contributed to this project to acknowledge individually here; that will have to wait for the book. The earliest inklings of these ideas probably took root while I was teaching--and learning to teach--at Harvard. There is no doubt that that they were nourished in the dynamic environment of Brown's CS Department, which supported many parallel growths. My more recent colleagues, at MIT and elsewhere, have been a source of inspiration and challenge, both of which have strengthened the work immeasurably. Feedback, especially concerning industrial relevance and especially from the networked community, has been invaluable.

I have had tremendous assistance from the teaching staff who have helped me with the development of the material reported on here. The students who have participated in the several versions of this course--in the regular session at MIT, in MIT's Professional Institute, and elsewhere--also have left their mark in untold ways upon this material. I am indebted to all of them for their insights as well as their patience.

This work would not have been possible without the generous support of the National Science Foundation under Young Investigator Award IRI-9357761 to Professor Stein. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation. Additional support has been received from MIT's Class of 1951 Fund for Excellence in Education and Class of 1955 Fund for Excellence in Teaching and from the Office of Naval Research through the Science Scholars Program at the Mary Ingraham Bunting Institute of Radcliffe College.


Footnotes:

* Correspondence to 545 Technology Square, Cambridge, MA, 02139, USA. Voice: +1 617 253 2663. Fax: +1 617 253 5060. Electronic mail: las@ mit.edu. WWW: http://www.ai.mit.edu/people/las.

1. By echo() we intend to evoke a procedure that reads a line of input and writes the same thing back out. For a more implicit, apparently passive statement of this program, consider a mirror.

2. A seesaw, or teeter-totter, is a toy that was once common on playgrounds. It is built out of a board placed over a raised pivot. One child sits at each end of the board; as one jumps up in the air, the other returns to the ground.

3. In one implementation, the students actually write only the statement(s) required to return the appropriate value. We then embed this code in the loop, procedure, and object necessary to connect it to the rest of the system. This version is used as the first laboratory, and emphasizes the syntax of Java statements. Another version, intended for use later in the first month of the course, has the students writing the complete class file to implement a prespecified interface.

4. To be perfectly fair, many existing introductory courses do make an effort to include some indication of the multi-threaded nature of the universe. When done, this is generally accomplished through the inclusion of a unit&emdash;on user interface, simulation, or explicit parallelism&emdash;which extends the course's central sequentialist metaphor.


This paper is a part of Lynn Andrea Stein's Rethinking CS101 project at the MIT AI Lab and the Department of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology.

Questions or comments:
<cs101-webmaster@ai.mit.edu>