| 6.170 |
Laboratory in Software Engineering
Fall 2002
Problem Set 3: Design an Abstract Data Type
Problem 1 Due: Tuesday, September 24, 2002 at 4pm
Problems 2 & 3 Due: Tuesday, October 1, 2002 at 4pm |
Handout P3
Quick links:
We recommend that you read the entire problem set before you begin work.
Introduction
In this problem set, you'll design and implement an abstract type. Designing
an abstract type involves determining what services it should provide and
what their behavior should be -- that is, writing a specification. Implementing
an abstract type involves choosing a representation and algorithms, and embodying
them in code. The focus of this exercise is the design, so you should expect
to spend a considerable amount of time on it. Note the distribution of points
among each problem. We advise you to distribute your time accordingly.
You have two weeks to complete the entire problem set. However, in
order that you can get feedback about your specifications before you begin
to implement them, Problem 1 should be handed in after one week, on Tuesday,
September 24. Your specifications will be reviewed in the next recitation.
You will then have the opportunity to revise your specifications while
you finish the rest of the problem set, which is due on Tuesday, October
1.
Graphs
The abstract type that you'll design is a directed labelled multi-graph.
A graph is a collection of nodes with edges between them. Every edge connects
one node (the source) to one other node (the target). There can be nodes
without edges but no edges without nodes. A node may be connected to itself.
In a multi-graph, there can be zero, one or more edges between a pair
of nodes. Every edge has a label. Distinct edges may have the same label.
Some examples of directed labelled multi-graphs are depicted below.

|

|

|
a graph with 1 node
|
a graph with 2 nodes
and 1 edge
|
a graph with 3 nodes
and 5 edges
|
Here are some examples of applications of directed labelled multi-graphs.
- A compiler may represent the control flow of a program as a graph
whose nodes are points in the program, and whose edges are program statements.
The graph would be used for analyses, such as propagating the values of
constants, and for transformations, such as hoisting a statement out of a
loop.
- A website design tool may represent a website
as a graph whose nodes are documents and whose edges are links. The tool
may examine the site for connectivity, find broken links, update all documents
when a document is moved, and so on.
- A curriculum design tool may use a graph to show prerequisite
relationships between courses, to find inconsistencies and determine feasible
programs of study.
- A program for generating driving directions may use
a graph to represent a street map, and compute a shortest path to find
directions from one point to another.
- A Java compiler may use a graph to represent dependences
between source code files, in order to determine a reasonable order of
compilation.
Your type will be used to find directions on the Boston subway, so in
its use, the nodes will represent stations and the edges will represent
track segments. But your design and implementation should be polymorphic.
This means that the node type should be generic; it should be possible
to use your graph design and implementation in different applications, where
nodes and edges represent other things.
Your graph implementation must be efficient. This means it should
perform reasonably for medium sized graphs -- those consisting of thousands
(but not millions) of nodes and edges. You should not assume that
the graph will be sparse (that is, containing very few edges compared
to nodes) or dense (that is, with most node pairs connected).
Ideas
To give you some sense of the kinds of issues you should be considering
in your design, here are some questions you might want to consider. These
don't in general have simple answers. You'll need to exercise careful judgment,
and think carefully about how decisions you make interfere with each other.
- will the graph be mutable or immutable? will it be
possible to change the label of an edge?
- will the graph allow edges without labels?
- will edge labels be strings or generic objects? if
objects, will it be OK to use a node or an edge as a label? or even a graph?
- will nodes be required only to satisfy the interface
of java.lang.Object? or will you design a Java interface for nodes?
- will the graph be implemented as a single class, or
will there be a separate Java interface for the Graph specification, and
a class for the implementation?
- will edges be objects in their own right? will they
be visible to a client of the abstract type?
- will it be possible to find the successor of a node
from the node alone, or will the graph be needed too? can a node belong
to multiple graphs?
- what kind of iterators will the type provide?
- should path-finding operations be included as methods
of the graph, or should they be implemented in client code on top of the
graph?
- will the type provide any views, like the set view
returned by the entrySet method of java.util.Map?
- will the type implement any standard Java collection
interfaces?
- will the type use any standard Java collections in
its implementation?
Problem 1: Graph Specification (60 points)
Design an abstract data type for a directed labelled multi-graph.
You should hand in the following elements:
- A brief design overview listing the public classes and interfaces
that are visible to clients of the graph, and what role they play.
- A specification for each public class or interface in your design,
containing:
- an overview paragraph explaining in abstract terms what objects
are represented by that class or interface, and whether they are mutable
or not;
- a specification of each method.
- A comment on the computational complexity of the graph: say which
methods you expect to be computationally intensive (i.e., taking more than
constant time), give a bound on their asymptotic cost, and give a bound on
the space requirements. For example, you may say (rather implausibly) that
addNode will be the most costly method, that, as the graph
grows, its cost should increase at most quadratically with the size of the
graph, and that the memory required to represent the graph will grow exponentially
with the number of edges.
- A brief design rationale (no more than half a page long) listing,
as separate points, a variety of alternative design features that you considered
but rejected, with explanations of why they were rejected.
The specifications should be in skeletal source code files. You
will fill in these skeletons yourself in problem 2.
The other pieces of your design (the design overview, complexity note, and
design rationale) should be included in one or more files distinct from the
source code itself.
Hand in all these elements by the first due date, Tuesday, September 24.
You will then have the opportunity to revise your design before the
final due date (October 1).
Problem 2: Graph Implementation (25 points)
Implement the graph and test your implementation using JUnit.
You should hand in:
- A brief overview of the implementation
you chose (as distinguished from your design), along with a brief rationale
justifying your implementation in comparison to another, different (but still
plausible) implementation.
- Tastefully commented source code.
- Code for your test cases.
- A statement indicating whether or not your code
passes all the tests. We will assume that your code fails if you say nothing.
For this problem, a small collection of plausible test cases will suffice.
In the next problem set, you'll return to the code and examine it more carefully.
Here, the purpose of the testing is to make you comfortable with JUnit,
and to help you weed out egregious errors that would prevent you from completing
this problem.
Problem 3: Using the Graph (15 points)
Build a program that generates directions for the Boston subway system.
It should accept two station names as input on the command line, and output
a list of steps to be taken.
A simple breadth-first search is adequate. Don't worry about handling
the kinds of complications a real program would have to handle (such as
selecting a Green Line train at Park Street, and amalgamating steps into
journeys on individual lines).
We have written some code that parses the file describing the Boston
subway system. This file bostonmetro.txt should
be an input to the program. The code in question is found in MetroMapParser.java. You should make
modifications to this file, by inserting calls to your Graph ADT methods.
Depending on how you have designed your ADT, these calls will be different.
For example, if you have built an immutable Graph, you might have to
collect all the information from the file before you build your Graph.
However, if you have built a mutable Graph, you might be able to build
it up incrementally as the file is parsed. You may even find it easier
to make two passes through the file. Feel free to modify the
code in MetroMapParser any way you see fit. In any case, you should
read through the parser code carefully, and provide clear documentation
on the modifications you are making.
You should hand in:
- Tastefully commented source code.
- Sample output for some test cases.
Provided Code
Getting started
Make sure you have followed the instructions on the tools handout relevant
to directory setup and using Java before you begin development.
Create a ps3 directory and copy the provided files to it:
mkdir ~/6.170/ps3
cp -p /mit/6.170/src/ps3-src/*.java ~/6.170/ps3
Hints
See the problem set guidelines
and advice.
If you are unclear about how to read/write specifications, read Handout S3 (A Guideline to Specifications).
You may find the Integer.parseInt(String s) method to be of use
in converting String's into int's.
Although it is generally a bad idea to start coding before you have
thought deeply, it often makes sense to work incrementally, interleaving
design and coding. Once you have a sketch of your specification, you may
want to write some experimental code. This should give you some concrete
feedback on how easy it is to implement the methods you've specified. You
may even want to start at the end, and write the code that uses your type,
so that you can be confident that the methods you provide will be sufficient.
This strategy can backfire and degenerate into mindless hacking, leaving
you with a pile of low-quality code and an incoherent specification. To
avoid that, bear two things in mind. First, you must be willing to start
again: experimental code isn't experimental if you're not prepared to throw
it away. Second, whenever you start coding, you must have a firm idea of
what you're trying to implement. There's no point starting to code to a
specification that is vague and missing crucial details. That doesn't mean
that your specification must be complete and polished, but it does mean
that you shouldn't start coding a method until at least you have its own
specification written. Third, you must write down the specification
of a method and not just imagine it; it's too easy to delude yourself. Try
to write it on paper and mull it over before you start any coding. It's tempting
to sit in front of an editor, write some specification as comments, and then
start coding around them, but this tends not to be nearly so effective.
Make good use of your TA. Take your specification to office hours before
going to the lab and get some feedback on your design and style. This is
likely to save you a lot of time!
Errata
September 18, 2002
Problem 1 originally asked you to hand in a list of the changes you made
in your design between the first due date and the final due date. That
change list will not be required.
Q & A
Q: In directed, labelled multi-graph, what does
the term directed mean?
A: It means that each edge has a direction, indicated in the pictures
above by an arrowhead. A directed edge goes from one node to
another node. A path through a directed graph must follow each edge
in the correct direction.
Q: What is a successor of a node?
A: A node y is a successor of node x if and only if
there is an edge pointing from x to y. Finding the successors
of a node is a common operation on a graph. Finding the predecessors
of a node (nodes with an edge pointing to it) is far less common. Graph
applications that need to find predecessors often do it by adding new edges
pointing in the other direction.
Q: In bostonmetro.txt, there appear
to be two T stops with the same name (St. Paul Street). What to do?
A: Think about it, do something reasonable, and document your design
decision.
Q: How can I search the graph for Problem 3?
A: You can use Breadth-first search.
This section will list clarifications and answers to common questions.
We'll try to keep it as up-to-date as possible, so this should be the first
place to look (after carefully rereading the handout) when you have a problem.
Back to the Problem Sets .
For problems or questions regarding this page, contact: 6.170-webmaster@mit.edu
.
$Id: ps3.html,v 1.5 2002/09/26 19:39:39 decouto Exp $