| 6.170 |
Laboratory in Software Engineering
Spring 2004
Problem Set 4: Abstract Data Types
Due: Tuesday, March 9, 2004 at 9pm |
Handout P4
Quick links:
Contents:
Warning: This is a long
assignment, so we suggest starting early. We recommend that you read
the entire problem set before you begin work.
Important notice
When you complete the assignment, make sure to log onto athena and
run:
athena% validate6170 ps4
Read the output message carefully!
In this assignment you will practice designing, implementing, and
testing several abstract data types.
To review, Problem Sets 3, 4, 5, and 6 address different aspects of the
design,
implementation, documentation, and testing of Traffick, a traffic
simulator for Boston and Cambridge. In Problem Set 3, you built basic
data abstractions such as Car,
Route, RoadSegment, and GeoPoint.
This problem set involves designing and implementing a graph
abstraction, and then using the graph to implement a generic
pipe-and-filter simulator. This simulator will be used in the
traffic simulator you will build in the next problem set.
This problem set requires you to do a fair amount of design. Writing
code
to a careful specification is fairly easy; writing code to an
incomplete
specification is harder; and writing a specification is hardest of all.
However, it can also be much more interesting. When you select a
design,
you should first sketch out the design space. Brainstorm all the
different
ways you might imagine satisfying the specification. (Some of these
might
seem silly, but don't reject them out of hand.) Almost every set of
requirements has multiple designs with different properties, and the
ones
we give you are no exception. When you write up your design, talk about
different designs, tradeoffs they make, in what circumstances each is
good,
and why you chose the one you did. There is no one correct answer, no
one
"perfect" design that the 6.170 staff is waiting for you to discover.
Rather, we want you to think about the problem, consider the
alternatives,
and make a cogent case for the one that you choose.
You are provided specifications in two different formats: HTML and starter Java files.
The HTML specifications are good for viewing or printing with a Web
browser, and they contains special formatting and hyperlinks, just like
the Java
API Specification. You can edit the starter Java files to complete
your assignment without having to retype the declarations.
Make sure you have followed the instructions on the tools handout
relevant to directory
setup before you begin development. Furthermore, be sure you
have read the problem
set
procedure handout before you begin.
You can get the source files for problem set 4 by using CVS. Follow
the problem set checkout
instructions and checkout the ps4 module from your
repository.
Your code MUST work on Athena, so if you choose to work on a
non-Athena machine, please take a few minutes before submitting your
solution to ensure that it runs correctly on Athena. Once everything
is in order, read the Problem
Set
Submission instructions for how to run a final check of your code
and turn it in.
In this part, you will design and implement an
abstraction for directed bipartite
graphs with labeled edges. Let's take this mathematical
object apart piece by piece.
A graph consists of a set
of nodes, some or all of which may be
connected by edges.
In a directed graph, the
edges have direction:
node B might have an edge to node C, but node C is not necessarily
connected to node B. For our purposes there cannot be more than one
edge from a given node to another given node, but there can be an edge
from A to B and another edge from B to A. Note that no two nodes
may have the same label.
In a graph with labeled edges,
each edge has an Object associated with it, which acts as its
label. For our purposes, all the edges leaving a node should have
distinct labels, and all the edges entering a node also have distinct
labels. However, two different edges may have the same label if they
don't both come from the same node or both go to the same node.
For example, the picture below shows two edges labeled x, which is legal. But it
wouldn't be legal for both edges leaving B to be labeled x. This lets us uniquely
identify any edge in the graph by providing just its label and one of
the nodes it connects.
Finally, a bipartite graph
is a graph in which there are two different types of nodes -- black
nodes and white nodes -- such that the edges only connect nodes of
different colors: from a black node to a white node, or from a white
node to a black node. No edges connect a black node to another black
node, or a white node to another white node.
Here's an example of a directed bipartite graph with labeled edges.

A simple directed bipartite graph with labeled edges.
The children of node B are
the nodes to which there is an edge
from B. In the example above the children of B are A and D.
Similarly, the parents of B
are the nodes from which there is an edge
to B. In the example above, B only has one parent, A.
The simulator you will build in the last part of this problem set
will use a bipartite graph to represent the simulated system. For this
part, however, you should implement a generic
bipartite graph representation that can be used for various purposes.
For Problems 1-3, you will have to make a number of design
decisions. Problem 4 will require you to document these
decisions.
Start keeping notes right now about the alternatives you consider and
the decisions you make.
You should begin by considering what sorts of methods and operations
you
feel you should include in your graph abstraction. The test driver
specification may be used as a
guide to a choice of which operations to support, but keep in mind
several things:
- Our test driver only uses Strings as node and edge labels.
Your graph abstraction must be generic,
supporting arbitrary Objects as nodes and edge labels. You should
even allow mutable objects as node and edge labels, but it's okay to
require that such mutable objects must implement behavioral equivalence.
- Our test driver's methods have lots of preconditions -- e.g., it
relies on the caller not to add the same node twice. Your graph's
specification should be better designed than this.
- Our test driver is minimal.
An adequate abstract data type should offer more operations than the
test driver actually tests.
You will find other suggestions
on how to go about choosing operations for
your BipartiteGraph abstract data type in the Liskov book in sections
5.1 and
5.1.1 (pp. 79-83).
Write a
specification
for methods to perform these operations, including
requires/effects/modifies
clauses. Write your specification as comments in BipartiteGraph.java
Your specification should also include a class overview, a few
paragraphs long, that describes what a BipartiteGraph represents.
Finally, you should generate Javadocs for your specifications, using the ant
script. Using Eclipse, right-click on the build.xml file and select
"Run Ant....". A dialog box will appear. Check the "javadoc" box and click "Run".
From the command line, type "cd ~/6.170/workspace/ps4/; ant javadoc;". The files
will be automatically placed in /doc/api/; you should "refresh" and check that
they are correct. Make sure that all the methods have one-line summaries and that
the specs for each method correspond to the correct method. If you find a problem,
edit the Java file and regenerate the Javadoc. (Do not edit the generated HTML files.)
To view your javadocs, open them locally using a browser. They will be in ~/6.170/workspace/ps4/doc/api/. You can also run /mit/6.170/bin/publish-doc ps4 and view them in your web space -- you will need certificates. If you prefer a text browser, run add sipb; links doc/api/index.html.
First, in order to allow us to verify your implementation, you will
have to
provide a test driver, BipartiteGraphTestDriver,
that allows the staff tests to construct and use
your graph abstraction. We
have provided a specification
and a
skeleton implementation
of the test driver.
You should fill in this file with your own code that constructs and
uses instances of your BipartiteGraph class.
Second, write test cases that fully exercise your specification for
BipartiteGraph. Use JUnit for your unit tests, and put them in a
class called BipartiteGraphTest.java.
A skeletal
file is provided.
Write an implementation of your graph data type. Consider when
doing so that your BipartiteGraph abstraction will be used by the
traffic simulator. Since you will eventually be creating and operating
on very large
sized graphs it is important that your graph implementation be
scalable. The simulator must frequently look up the children
for a
given node, so this operation should be performed quickly even for
large
graph
sizes. Similarly, the simulator will frequently look up the parents for
a node. Both these operations should be able to be performed in
constant time.
Your graph building operations should also be reasonably efficient.
However,
strive first for a good design, then worry about performance.
Make sure you write a proper abstraction function and
representation invariant as a comment in your code. You should also
implement a checkRep() method. This will help in finding
errors in your implementation. Eventually, the checkRep()
for BipartiteGraph will have to be disabled since it is too
computationally
intensive for the traffic simulator.
Document your design decisions in a file called problem4.txt and place it in
the doc/ directory. Your
documentation
must be in .txt format, and should have the following sections:
The purpose of this section is for you to explain all of the
assumptions
your program makes. If there is
anything which you found unclear in the description of BipartiteGraph,
please state
what
assumptions you had to make, and clarify the problem. If you found
nothing unclear, then you
may simply write "No clarification necessary."
Write a brief textual description of the representation you chose for
BipartiteGraph in problem 3, and explain why you chose it. Compare it
to
at least one other representation that you considered.
Describe the testing strategy you used to generate test cases in
problem 2. Explain why you feel that your test cases are
sufficient, but not
excessive. Indicate whether you passed all of your own and the
staff
tests. If your code fails any of these tests, indicate which ones and
why
they
are failing.
In a paragraph or two explain what you regarded as the successes and
failures of the development: unresolved design problems, performance
problems, etc. Also describe what lessons you learned from the
experience,
how you might do it differently a second time around and how the faults
of
the design and implementation may be corrected.
Finally, please include, as a comment in the file, a brief description of why
you
included
the operations you did and why you feel they are a sufficient
interface to a graph.
Now, you will use your graph
abstraction to implement a generic simulator of a system of pipes and
filters. Conceptually, a pipe and filter system has three kinds
of objects:
- work objects flowing
around the system;
- pipes that convey the
work objects from one place to another;
- filters that process
work objects from their incoming pipes and send them onward to their
outgoing pipes.
Pipes and filters can be used to model various different domains, such
as:
- An automotive assembly line.
In this domain, the work objects are cars under construction, the pipes
are conveyor belts, and the filters are work stations where something
is done to the car, such as painting the body or installing the engine.
- An image processing system.
Here, the work objects might be frames of video, the pipes are
communication channels, and the filters are image processing
algorithms, such as compression, decompression, changing color balance,
or resizing.
- Traffic in a road system.
In this domain, which will be the subject of Problem Set 5, the work
objects are moving cars, the pipes are road segments, and the filters
are intersections that cars must negotiate.
The topology of a pipe-and-filter system is modeled by a bipartite
graph in which pipes are black nodes and filters are white nodes.
An edge from a pipe to a filter indicates that the pipe provides input
to the filter, and an edge from the filter to another pipe indicates
that the filter sends output to the pipe. (Notice that pipes are nodes in the graph, not
edges! We require this so that multiple filters can connect to
the same pipe, which wouldn't be possible if pipes were represented by
graph edges.) The bipartite nature of the graph guarantees that
pipes are only connected to filters, and filters are connected only to
pipes. Labels on the edges allow filters to differentiate between
their input and output pipes, if that is important to the operation of
the filter. In the traffic simulator, for example, a traffic
light filter will need to distinguish the incoming roads that currently
have a green light from the roads that have a red light.
Time is an important feature of the simulation. Work objects do
not propagate instantaneously through the pipe-and-filter system.
Instead, each pipe and filter may have some delay for propagating a work object
through it. A pipe may also have an optional capacity, a maximum number of work
objects it can hold. If work objects are not drained from the
pipe by a downstream filter as fast as they are being poured in by an
upstream filter, then the pipe will eventually fill up, refusing any
more input until at least one work object has been removed. As a
result, fast upstream filters will be forced to slow down to the rate
of the slowest downstream filter.
The passage of time is handled by the simulator using round-robin timeslicing. Each
pipe and filter is told to simulate itself for one time slice (an
arbitrary short unit of time). Once all the pipes and filters are
done, the simulator advances its clock by one time slice, and then
repeats. This approach has a serious pitfall, however: the
behavior of the simulator strongly depends on the order in which the
pipes and filters are simulated. Consider this simple example:
Suppose the pipes and filters have a delay of 1 time slice, and a work
object starts at the beginning of pipe A. If the simulator simulates
the pipes and filter in the order A, B, C, then the work object will be
through the entire system after only 1 time slice. On the other hand,
if it simulates the system in the order C, B, A, then the work object
will take 3 time slices to propagate through the system.
We can avoid this problem by simulating pipes and filters in separate
phases: first all the pipes, and then all the filters. Since no
pipes are connected to other pipes, it doesn't matter what order we
call the pipes with respect to other pipes, since any work objects that
leave a pipe will have to enter a filter. Likewise for the
filters.
Here is pseudocode for simulating one time slice of the entire
pipe-and-filter system:
for each pipe p {
simulate p for one time slice
}
for each filter f {
simulate f for one time slice
}
increment simulator clock by one time slice
Now create a specification for a Simulator class that simulates a
pipe-and-filter system. Your class should use your BipartiteGraph
abstraction to store the graph of pipes and filters. It should
provide a simulate()
method that simulates one time slice of the system using the
algorithm above.
Your Simulator should support simulating BipartiteGraphs with
arbitrary node types. It should assume only that the pipes and filters
implement the Simulatable interface
, which we provide for you.
Implement the specification of your Simulator class.
To test your simulator, you will simulate a very simple integer
arithmetic domain. In this domain, the work objects are integers, and
the filters are integer functions. For testing purposes, you will
implement one kind of pipe and two kinds of filters.
An IntPipe pipe stores a
flow of integers,
with a delay of one time slice and infinite capacity. An easy way
to implement this pipe is with two lists, one for the integers that
have
just entered the pipe, and another for the integers that are ready to
leave. (Note that in order to store an integer in a List, you will have to wrap it
in an Integer
object.) When an IntPipe
is simulated for one time slice, it should take all the integers from
its input list and deposit them at the end of its output list.
A PlusFilter has zero or
more input pipes and exactly one output pipe. It computes the
sum of the integers on its input pipes. When it is simulated for
one time slice, it should take an integer from each of its input pipes,
add them together, and deposit the sum on its output pipe. Any
input pipes that are empty should be treated as the integer 0. A PlusFilter with no input pipes should always output
0.
A GCDFilter computes one
step of Euclid's greatest-common-divisor algorithm. It has two
input pipes, whose edges are labeled "a" and "b", and three output
pipes labeled "a", "b", and "gcd". When it is simulated for one
time slice, it executes the following algorithm, shown in pseudocode:
a = integer from input pipe a
b = integer from input pipe b
if b = 0 then
// GCD computation is finished: a is the gcd
send a to output pipe gcd
else if a < b then
// swap a and b so that a >= b
send b to output pipe a
send a to output pipe b
else
// let (a,b)=(b, remainder of a/b)
send b to output pipe a
send a % b to output pipe b
end if
One way to use the GCDFilter
is by hooking up its a and b outputs to its a and b inputs, as shown below.
With this configuration, each pair of numbers will cycle around the
filter until b reaches 0, at
which point the GCDFilter
will output the final GCD onto its gcd
output.
Note that the behavior of these pipes and filters are described in
terms of time slices, not in seconds. So it doesn't matter how
small or large a time slice value (in seconds) you pass to simulate().
Each call to simulate() should always execute exactly one time slice.
In the next problem set, you will start building the traffic
simulation, and the behavior of pipes and filters will be described in
terms of seconds. In that problem set, the size of the time
slice will determine how much work you can get done in a single call
to simulate().
Design and implement IntPipe,
PlusFilter, and GCDFilter. In order to be
usable in your Simulator,
each of these classes must implement the Simulatable interface.
You will have to
provide a test driver, SimulatorTestDriver,
that allows the staff tests to construct and use
arithmetic simulations. We
have provided a specification and a skeleton implementation of the test
driver.
You should fill in this file with your own code that constructs and
uses instances of your Simulator,
IntPipe, PlusFilter, and GCDFilter classes.
We have also provided a skeleton JUnit test suite, SimulatorTest, that uses SimulatorTestDriver to
construct a few simple simulations. You should add more tests to this
test suite to exercise your Simulator
class thoroughly. (Your new tests may call on Simulator and your other
classes directly, without using SimulatorTestDriver
as a middleman; the staff tests require SimulatorTestDriver only
because we don't know what interfaces you've designed for your classes.)
Remember to commit all your files to CVS. Your electronic submission
should include at least:
BipartiteGraph.java
BipartiteGraphTestDriver.java
BipartiteGraphTest.java
problem4.txt
Simulatable.java
Simulator.java
IntPipe.java
PlusFilter.java
GCDFilter.java
SimulatorTestDriver.java
SimulatorTest.java
While the BipartiteGraph testing driver requires listing nodes in
alphabetical
order, it is not necessary for your graph implementation to store or
return values in alphabetical order. You can have your testing driver
sort the values it receives from your graph before returning them.
You may find java.util.Collections.sort
to be useful.
Errata
Q & A
Back to the Problem sets page.
For problems or questions regarding this page, contact:
6.170-webmaster@mit.edu.
$$