6.170 Laboratory in Software Engineering
Spring 2004
Problem Set 4: Abstract Data Types
Due: Tuesday, March 9, 2004 at 9pm

Handout P4

Quick links:

Contents:

Warning: This is a long assignment, so we suggest starting early. We recommend that you read the entire problem set before you begin work.


Important notice

When you complete the assignment, make sure to log onto athena and run:

athena% validate6170 ps4

Read the output message carefully!


Purpose

In this assignment you will practice designing, implementing, and testing several abstract data types.

Preamble

To review, Problem Sets 3, 4, 5, and 6 address different aspects of the design, implementation, documentation, and testing of Traffick, a traffic simulator for Boston and Cambridge. In Problem Set 3, you built basic data abstractions such as Car, Route, RoadSegment, and GeoPoint.  This problem set involves designing and implementing a graph abstraction, and then using the graph to implement a generic pipe-and-filter simulator.  This simulator will be used in the traffic simulator you will build in the next problem set.

This problem set requires you to do a fair amount of design. Writing code to a careful specification is fairly easy; writing code to an incomplete specification is harder; and writing a specification is hardest of all. However, it can also be much more interesting. When you select a design, you should first sketch out the design space. Brainstorm all the different ways you might imagine satisfying the specification. (Some of these might seem silly, but don't reject them out of hand.) Almost every set of requirements has multiple designs with different properties, and the ones we give you are no exception. When you write up your design, talk about different designs, tradeoffs they make, in what circumstances each is good, and why you chose the one you did. There is no one correct answer, no one "perfect" design that the 6.170 staff is waiting for you to discover. Rather, we want you to think about the problem, consider the alternatives, and make a cogent case for the one that you choose.

Getting Started

You are provided specifications in two different formats: HTML and starter Java files. The HTML specifications are good for viewing or printing with a Web browser, and they contains special formatting and hyperlinks, just like the Java API Specification. You can edit the starter Java files to complete your assignment without having to retype the declarations.

Make sure you have followed the instructions on the tools handout relevant to directory setup before you begin development. Furthermore, be sure you have read the problem set procedure handout before you begin.

You can get the source files for problem set 4 by using CVS. Follow the problem set checkout instructions and checkout the ps4 module from your repository.

Your code MUST work on Athena, so if you choose to work on a non-Athena machine, please take a few minutes before submitting your solution to ensure that it runs correctly on Athena. Once everything is in order, read the Problem Set Submission instructions for how to run a final check of your code and turn it in.



BipartiteGraph

In this part, you will design and implement an abstraction for directed bipartite graphs with labeled edges.  Let's take this mathematical object apart piece by piece.

A graph consists of a set of nodes, some or all of which may be connected by edges.

In a directed graph, the edges have direction: node B might have an edge to node C, but node C is not necessarily connected to node B. For our purposes there cannot be more than one edge from a given node to another given node, but there can be an edge from A to B and another edge from B to A.  Note that no two nodes may have the same label.

In a graph with labeled edges, each edge has an Object associated with it, which acts as its label.  For our purposes, all the edges leaving a node should have distinct labels, and all the edges entering a node also have distinct labels.  However, two different edges may have the same label if they don't both come from the same node or both go to the same node.  For example, the picture below shows two edges labeled x, which is legal.  But it wouldn't be legal for both edges leaving B to be labeled x.  This lets us uniquely identify any edge in the graph by providing just its label and one of the nodes it connects.

Finally, a bipartite graph is a graph in which there are two different types of nodes -- black nodes and white nodes -- such that the edges only connect nodes of different colors: from a black node to a white node, or from a white node to a black node. No edges connect a black node to another black node, or a white node to another white node.

Here's an example of a directed bipartite graph with labeled edges.

A simple directed bipartite graph
A simple directed bipartite graph with labeled edges.

The children of node B are the nodes to which there is an edge from B. In the example above the children of B are A and D. Similarly, the parents of B are the nodes from which there is an edge to B. In the example above, B only has one parent, A.

The simulator you will build in the last part of this problem set will use a bipartite graph to represent the simulated system. For this part, however, you should implement a generic bipartite graph representation that can be used for various purposes.

Problem 1: Writing a Specification for BipartiteGraph [20 points]

For Problems 1-3, you will have to make a number of design decisions.  Problem 4 will require you to document these decisions.  Start keeping notes right now about the alternatives you consider and the decisions you make.

You should begin by considering what sorts of methods and operations you feel you should include in your graph abstraction. The test driver specification may be used as a guide to a choice of which operations to support, but keep in mind several things:
You will find other suggestions on how to go about choosing operations for your BipartiteGraph abstract data type in the Liskov book in sections 5.1 and 5.1.1 (pp. 79-83).

Write a specification for methods to perform these operations, including requires/effects/modifies clauses. Write your specification as comments in BipartiteGraph.java

Your specification should also include a class overview, a few paragraphs long, that describes what a BipartiteGraph represents.

Finally, you should generate Javadocs for your specifications, using the ant script. Using Eclipse, right-click on the build.xml file and select "Run Ant....". A dialog box will appear. Check the "javadoc" box and click "Run". From the command line, type "cd ~/6.170/workspace/ps4/; ant javadoc;". The files will be automatically placed in /doc/api/; you should "refresh" and check that they are correct. Make sure that all the methods have one-line summaries and that the specs for each method correspond to the correct method. If you find a problem, edit the Java file and regenerate the Javadoc. (Do not edit the generated HTML files.)

To view your javadocs, open them locally using a browser. They will be in ~/6.170/workspace/ps4/doc/api/. You can also run /mit/6.170/bin/publish-doc ps4 and view them in your web space -- you will need certificates. If you prefer a text browser, run add sipb; links doc/api/index.html.

Problem 2: Testing BipartiteGraph [15 points]

First, in order to allow us to verify your implementation, you will have to provide a test driver, BipartiteGraphTestDriver, that allows the staff tests to construct and use your graph abstraction. We have provided a specification and a skeleton implementation of the test driver. You should fill in this file with your own code that constructs and uses instances of your BipartiteGraph class.

Second, write test cases that fully exercise your specification for BipartiteGraph.  Use JUnit for your unit tests, and put them in a class called BipartiteGraphTest.java. A skeletal file is provided.

Problem 3: Implementing BipartiteGraph [15 points]

Write an implementation of your graph data type. Consider when doing so that your BipartiteGraph abstraction will be used by the traffic simulator. Since you will eventually be creating and operating on very large sized graphs it is important that your graph implementation be scalable. The simulator must frequently look up the children for a given node, so this operation should be performed quickly even for large graph sizes. Similarly, the simulator will frequently look up the parents for a node.  Both these operations should be able to be performed in constant time. Your graph building operations should also be reasonably efficient. However, strive first for a good design, then worry about performance.

Make sure you write a proper abstraction function and representation invariant as a comment in your code. You should also implement a checkRep() method. This will help in finding errors in your implementation. Eventually, the checkRep() for BipartiteGraph will have to be disabled since it is too computationally intensive for the traffic simulator.

Problem 4: Documentation for BipartiteGraph [15 points]

Document your design decisions in a file called problem4.txt and place it in the doc/ directory.  Your documentation must be in .txt format, and should have the following sections:

Requirements

The purpose of this section is for you to explain all of the assumptions your program makes. If there is anything which you found unclear in the description of BipartiteGraph, please state what assumptions you had to make, and clarify the problem. If you found nothing unclear, then you may simply write "No clarification necessary."

Design

Write a brief textual description of the representation you chose for BipartiteGraph in problem 3, and explain why you chose it. Compare it to at least one other representation that you considered.

Testing

Describe the testing strategy you used to generate test cases in problem 2.  Explain why you feel that your test cases are sufficient, but not excessive.  Indicate whether you passed all of your own and the staff tests. If your code fails any of these tests, indicate which ones and why they are failing.

Analysis

In a paragraph or two explain what you regarded as the successes and failures of the development: unresolved design problems, performance problems, etc. Also describe what lessons you learned from the experience, how you might do it differently a second time around and how the faults of the design and implementation may be corrected.

Finally, please include, as a comment in the file, a brief description of why you included the operations you did and why you feel they are a sufficient interface to a graph.



Pipe and Filter Simulator

Now, you will use your graph abstraction to implement a generic simulator of a system of pipes and filters.  Conceptually, a pipe and filter system has three kinds of objects:
Pipes and filters can be used to model various different domains, such as:
The topology of a pipe-and-filter system is modeled by a bipartite graph in which pipes are black nodes and filters are white nodes.  An edge from a pipe to a filter indicates that the pipe provides input to the filter, and an edge from the filter to another pipe indicates that the filter sends output to the pipe.  (Notice that pipes are nodes in the graph, not edges!  We require this so that multiple filters can connect to the same pipe, which wouldn't be possible if pipes were represented by graph edges.)  The bipartite nature of the graph guarantees that pipes are only connected to filters, and filters are connected only to pipes.  Labels on the edges allow filters to differentiate between their input and output pipes, if that is important to the operation of the filter.  In the traffic simulator, for example, a traffic light filter will need to distinguish the incoming roads that currently have a green light from the roads that have a red light.

Time is an important feature of the simulation.  Work objects do not propagate instantaneously through the pipe-and-filter system.  Instead, each pipe and filter may have some delay for propagating a work object through it.  A pipe may also have an optional capacity, a maximum number of work objects it can hold.  If work objects are not drained from the pipe by a downstream filter as fast as they are being poured in by an upstream filter, then the pipe will eventually fill up, refusing any more input until at least one work object has been removed.  As a result, fast upstream filters will be forced to slow down to the rate of the slowest downstream filter.

The passage of time is handled by the simulator using round-robin timeslicing.  Each pipe and filter is told to simulate itself for one time slice (an arbitrary short unit of time).  Once all the pipes and filters are done, the simulator advances its clock by one time slice, and then repeats.  This approach has a serious pitfall, however: the behavior of the simulator strongly depends on the order in which the pipes and filters are simulated.  Consider this simple example:

Two pipes and a filter
Suppose the pipes and filters have a delay of 1 time slice, and a work object starts at the beginning of pipe A. If the simulator simulates the pipes and filter in the order A, B, C, then the work object will be through the entire system after only 1 time slice. On the other hand, if it simulates the system in the order C, B, A, then the work object will take 3 time slices to propagate through the system.

We can avoid this problem by simulating pipes and filters in separate phases: first all the pipes, and then all the filters.  Since no pipes are connected to other pipes, it doesn't matter what order we call the pipes with respect to other pipes, since any work objects that leave a pipe will have to enter a filter.  Likewise for the filters.

Here is pseudocode for simulating one time slice of the entire pipe-and-filter system:
       for each pipe p {
simulate p for one time slice
}
for each filter f {
simulate f for one time slice
}
increment simulator clock by one time slice
 

Problem 5: Designing and Implementing Simulator [15 points]

Now create a specification for a Simulator class that simulates a pipe-and-filter system.  Your class should use your BipartiteGraph abstraction to store the graph of pipes and filters.  It should provide a simulate() method that simulates one time slice of the system using the algorithm above.

Your Simulator should support simulating BipartiteGraphs with arbitrary node types. It should assume only that the pipes and filters implement the Simulatable interface , which we provide for you.

Implement the specification of your Simulator class.


Integer Arithmetic Simulator

To test your simulator, you will simulate a very simple integer arithmetic domain. In this domain, the work objects are integers, and the filters are integer functions. For testing purposes, you will implement one kind of pipe and two kinds of filters.

An IntPipe pipe stores a flow of integers, with a delay of one time slice and infinite capacity.  An easy way to implement this pipe is with two lists, one for the integers that have just entered the pipe, and another for the integers that are ready to leave.  (Note that in order to store an integer in a List, you will have to wrap it in an Integer object.)  When an IntPipe is simulated for one time slice, it should take all the integers from its input list and deposit them at the end of its output list.

simulating an IntPipe for one time slice
A PlusFilter has zero or more input pipes and exactly one output pipe.  It computes the sum of the integers on its input pipes.  When it is simulated for one time slice, it should take an integer from each of its input pipes, add them together, and deposit the sum on its output pipe.  Any input pipes that are empty should be treated as the integer 0.  A PlusFilter with no input pipes should always output 0.

a PlusFilter with 3 inputs
A GCDFilter computes one step of Euclid's greatest-common-divisor algorithm.  It has two input pipes, whose edges are labeled "a" and "b", and three output pipes labeled "a", "b", and "gcd".  When it is simulated for one time slice, it executes the following algorithm, shown in pseudocode:
	a = integer from input pipe a
b = integer from input pipe b
if b = 0 then
// GCD computation is finished: a is the gcd
send a to output pipe gcd
else if a < b then
// swap a and b so that a >= b
send b to output pipe a
send a to output pipe b
else
// let (a,b)=(b, remainder of a/b)
send b to output pipe a
send a % b to output pipe b
end if
One way to use the GCDFilter is by hooking up its a and b outputs to its a and b inputs, as shown below.  With this configuration, each pair of numbers will cycle around the filter until b reaches 0, at which point the GCDFilter will output the final GCD onto its gcd output.

GCDFilter

Note that the behavior of these pipes and filters are described in terms of time slices, not in seconds. So it doesn't matter how small or large a time slice value (in seconds) you pass to simulate(). Each call to simulate() should always execute exactly one time slice. In the next problem set, you will start building the traffic simulation, and the behavior of pipes and filters will be described in terms of seconds. In that problem set, the size of the time slice will determine how much work you can get done in a single call to simulate().

Problem 6: Testing Simulator [20 points]

Design and implement IntPipe, PlusFilter, and GCDFilter.  In order to be usable in your Simulator, each of these classes must implement the Simulatable interface.

You will have to provide a test driver, SimulatorTestDriver, that allows the staff tests to construct and use arithmetic simulations. We have provided a specification and a skeleton implementation of the test driver. You should fill in this file with your own code that constructs and uses instances of your Simulator, IntPipe, PlusFilter, and GCDFilter classes.

We have also provided a skeleton JUnit test suite, SimulatorTest, that uses SimulatorTestDriver to construct a few simple simulations. You should add more tests to this test suite to exercise your Simulator class thoroughly.  (Your new tests may call on Simulator and your other classes directly, without using SimulatorTestDriver as a middleman; the staff tests require SimulatorTestDriver only because we don't know what interfaces you've designed for your classes.)

What to Turnin

Remember to commit all your files to CVS. Your electronic submission should include at least:
BipartiteGraph.java
BipartiteGraphTestDriver.java
BipartiteGraphTest.java
problem4.txt

Simulatable.java
Simulator.java
IntPipe.java
PlusFilter.java
GCDFilter.java
SimulatorTestDriver.java
SimulatorTest.java



Hints

While the BipartiteGraph testing driver requires listing nodes in alphabetical order, it is not necessary for your graph implementation to store or return values in alphabetical order. You can have your testing driver sort the values it receives from your graph before returning them. You may find java.util.Collections.sort to be useful.


Errata


Q & A


Back to the Problem sets page.
For problems or questions regarding this page, contact: 6.170-webmaster@mit.edu.
$$