| 6.170 | Laboratory in Software Engineering
Spring 2001 Problem Set 3: Designing and Implementing Graph Due: Thursday, March 1, 2001 at 5pm |
Quick links:
We recommend that you read the entire problem set before you begin work.
To do so, MapQuick will need a graph abstraction to represent connectivity, a path abstraction to describe the cost of traveling a particular course, and a path-finding algorithm to search for the shortest possible path.
While we do not specify the names, method signatures, or
specifications of your graph or path-finding abstractions, we do
specify Path and provide you with a file format for your
testing driver. The testing driver will execute a script of commands
and output results to standard out.
A directed graph consists of a set of nodes some of which may be connected by edges. In a directed graph, the edges have direction: node B might have an edge to node C, but node C is not necessarily connected to node B. For our purposes there cannot be more than one edge from a given node to another given node, but there can be an edge from A to B and an edge from B to A.

The children of node B are the nodes to which there is an edge from B. In the example above the children of B are A and C. Similarly, the parents of B are the nodes from which there is an edge to B. In the example above B only has one parent, A.
Path interface will
provide an abstraction for both representing a given path and for
obtaining the cost of that path.
Path is a Java interface which only specifies a set of
required methods, but cannot be directly instantiated and provides no
code to classes which implement it. This allows for searching graphs
containing any type of node so long as an appropriate implementation
of Path is available.
Since the graphs we are dealing with are node-weighted, we provide a shortest path algorithm for you to use which can handle node weighted graphs. Here is a modified version of Dijkstra's algorithm that finds the shortest path from node start to node goal in a node-weighted graph. This algorithm is called a "greedy" algorithm because it never needs to recompute or reconsider information: it simply uses the first path that it finds from start to a particular node.
The following modified version of Dijkstra's algorithm has been
written up in a pseudocode fashion which, while not executable, should
be easy to read, understand, and translate to Java. The notation
[a,b,c] stands for the three-element sequence consisting
of a, b, and c; here, we use it
to represent a path. When applied to sequences, "+" means
concatenation; for instance, [a,b] + [c,d] = [a,b,c,d].
Using parentheses on a map looks up the value of the argument in the
map. For instance somemap(a) evaluates to the value that
a is mapped to in somemap. This value could
be set as somemap(a) = foo.
// Return a shortest path from any element of starts to any
// element of goals in a node-weighted graph.
Algorithm node-weighted-shortest-path(starts, goals) {
for each start in starts
if start in goals
return [start]
// maps nodes -> paths
Map paths = { forall start in starts | (start, [start]) }
// The priority queue contains nodes with priority equal to the cost
// of the shortest path to reach that node. Initially it contains
// the start nodes
PriorityQueue active = starts
Set finished = { }
while active is non-empty do {
// y is the element of active with shortest path
y = active.extractMin()
ypath = paths(y)
// iterate over edges (y, z) in y.edges
for each child z of y {
zpath = ypath + [z]
if (z in goals) {
return zpath
}
if (z not in finished) and (z not in active) {
paths(z) = zpath
insert z in active
}
}
insert y in finished
}
// execution reaches this point only if active becomes empty
return No Path Exists
}
You should read the note about a bug in this algorithm.
This algorithm requires a priority queue, a collection data structure that supports inserting elements and extracting the highest-priority element. (The highest-priority element has the numerically smallest priority; here, we use path costs as priorities, so shorter paths have higher priority.) We provide the class PriorityQueue which efficiently implements this data structure.
A key property of this algorithm is that if x is a finished node, and y is not a finished node (either it is active, or we don't know the shortest path yet), then the length of the shortest path from start to x is no more than the length of the shortest path from start to y.
The algorithm starts with a set of active nodes starts, each element of which is associated with a path consisting of just that node. When any of the nodes contained in goals is reached, the algorithm terminates and returns the associated path with that element of goals.
The basic step of the algorithm (which is repeated over and over) is to find, in the set of active nodes, the one for which the associated shortest path has lowest total weight. Call this node y. Now consider every edge that leaves y; say that edge goes to z. If z is finished or active, ignore the (y, z) edge, because we already know of a shorter path to z than the one that goes through y. Otherwise, add z to the collection of active nodes, and associate with it the path zpath which extends the path associated to y by adding z at the end. zpath is the shortest path from start to z. After all the edges leaving y have been examined, move y from the active collection to the finished collection.
The nodes a, b, c, d, and e have weights 2, 1, 3, 1, and 4, respectively. We want to find the shortest path from a to e.
Initially:
paths = { (a, [a]) }
active = { a }
finished = { }
Remove a from active; its path has cost 2.
Consider each neighbor of a:
add b to active
set paths(b) = [a,b]
add c to active
set paths(c) = [a,c]
Add a to finished.
Now:
paths = { (a, [a]), (b, [a,b]), (c, [a,c]) }
active = { b, c }
finished = { a }
Remove b from active; its path has cost 3, which is the lowest in active.
Consider each neighbor of b:
do not add c to active, as c already appears in active
add d to active
set paths(d) = [a,b,d]
Add b to finished.
Now:
paths = { (a, [a]), (b, [a,b]), (c, [a,c]), (d, [a,b,d]) }
active = { c, d }
finished = { a, b }
Remove d from active; its path has cost 4, which is the lowest in active.
Consider each neighbor of d:
e is the goal node, so we are finished; return [a,b,d,e]
The testing script manipulates graphs of WeightedNode elements.
Each WeightedNode has a name and a cost. After a
WeightedNode is created, it is referred to by name later on in the
testing file. We have also provided a WeightedNodePath which
implements the Path interface for sequences of
WeightedNodes. When the testing driver outputs a node,
it should do so by displaying that node's name. The name for each
Graph is also a String. The testing driver must manage the mapping of
names to graphs in addition to the mappings from names to nodes. This
could be accomplished by maintaining a HashMap in the testing driver
mapping Strings to Graphs and another for mapping names to
WeightedNodes. To simplify the parsing of the file, names of nodes and
graphs must contain only alphanumeric characters.
The following is a list of the valid commands, and the meaning of their arguments. Each command has an associated output which should be printed to standard out when the command is exectued.
CreateGraph graphName
Creates a new graph which shall be named
graphName. The graph is initially empty (has no nodes
and no edges). The command's output is
created graph graphName
CreateNode nodeName cost
Creates a new node named by the string nodeName with the cost
specified by the non-negative integer cost. The node can be referred to by name for the
rest of the file.
The command's output is
created node nodeName with cost cost
AddNode graphName nodeName
Adds a node represented by the string nodeName to the graph named
graphName. The command's output is
added node nodeName to graphName
AddEdge graphName parentNode childNode
Creates an edge in graph graphName from
parentNode to childNode.
The command's output is
added edge from parentNode to childNode in graphNode
ListNodes graphName
This command has no effect on the graph. Its output starts with
graphName contains:and is followed, on the same line, by a space-separated list of the names of all nodes in the graph. The nodes should appear in alphabetical order. There is a single space between the colon and the first node name.
ListChildren graphName parentNode
This command has no effect on the graph. Its output starts with
the children of parentNode in graphName are:and is followed, on the same line, by a space-separated list of the names of nodes to which there is an edge from parentNode. The nodes should appear in alphabetical order. There is a single space between the colon and the first node name.
FindPath graphName from1 [from2 [from3 ... ] ] -> to1 [ to2 [ to3 ... ] ]
The -> symbol is separated from node names by whitespace on both sides.
This command has no effect on the graph. It finds and outputs the shortest
path amongst all of the possible paths from any of the from nodes to any of
the to nodes. The output starts with
shortest path in graphName:and is followed, on the same line, by a space-separated list of nodes in the order of traversal from fromX -> toX If no path exists the output should be
shortest path in graphName:no path found
Given the following as input for the testing driver:
CreateNode n1 5 CreateNode n3 1 CreateGraph A AddNode A n1 CreateNode n2 10 AddNode A n2 CreateGraph B ListNodes B AddNode A n3 AddEdge A n3 n1 AddNode B n1 AddNode B n2 AddEdge B n2 n1 AddEdge A n1 n3 AddEdge A n1 n2 ListNodes A ListChildren A n1 AddEdge A n3 n3 ListChildren A n3 FindPath A n3 -> n2a correct implementation would output the following:
created node n1 with cost 5 created node n3 with cost 1 created graph A added node n1 to A created node n2 with cost 10 added node n2 to A created graph B B contains: added node n3 to A added edge from n3 to n1 in A added node n1 to B added node n2 to B added edge from n2 to n1 in B added edge from n1 to n3 in A added edge from n1 to n2 in A A contains: n1 n2 n3 the children of n1 in A are: n2 n3 added edge from n3 to n3 in A the children of n3 in A are: n1 n3 shortest path in A: n3 n1 n2The behavior of the testing driver on malformed input files is not defined; you may assume the input files are well-formed.
Since it is possible that your specification throws exceptions for
certain inputs, the testing file format provides for one additional
feature. If the command in the input file results in an exception,
the command, rather than outputting its normal string, should output
"Exception: <toString value of exception>". This
exception-output behavior is already implemented in the
executeCommand method of the provided
PS3TestDriver starter file.
A sample testing file is provided in /mit/6.170/www/psets/ps3/example1.test and the example above has been placed in /mit/6.170/www/psets/ps3/example2.test. The expected output in each case is in /mit/6.170/www/psets/ps3/example1.expected and /mit/6.170/www/psets/ps3/example2.expected respectively.
Graph.java. Your graph
object itself should not manage, nor be aware of costs or weights of
nodes; for this problem set the nodes themselves will store costs.
You will find suggestions on how to go about choosing operations for your Graph abstract data type in the course text in sections 5.1 and 5.1.1 (pp. 79-83).
Please include in your turnin a brief description of why you included the operations you did and why you feel they are a sufficient interface to a graph. This should be placed in problem1.txt.
Also include a brief textual description of your representation and explain why you chose it. Compare it to at least one other representation that you considered. You should place this discussion in a file called problem2.txt.
Remember that the Graph data type should be an abstract general
purpose abstraction of a directed graph. It should not directly store
cost information within it. To represent a directed graph with costs
associated with each node, it is the node's responsibility to store
the cost. The Graph abstraction should not depend upon the node
objects being of any particular runtime type and as such should
therefore only use operations defined in Object.
WeightedNodes as the elements of the
Graph. You will also write your test cases in this file
format. The testing driver should read input from standard in using the format described under Testing Script Format and print its results to standard out. In order to help you parsing the input file format, we have provided a skeleton implementation of the parser. You should fill in this file with your own code. Please be sure to use the output field in the tester to print out the desired output.
Your test cases should be written in the specified machine readable format (with comments). You may place them in multiple testing files but each file should begin with the letters "p3" and have the extension ".test" The expected output for your test cases should be a file with the name name, but with the extension ".expected". For example you may have a file name "p3foo.test" with expected output in "p3foo.expected".
Include with your test cases documentation explaining your testing
strategy and an argument for why you feel that they are sufficient,
but not excessive. Place this writeup in problem3.txt. Be sure to
include all of your tests and your testing driver in your Manifest
file.
Note: For the test for this problem your driver does need not yet
implement the FindPath command.
Your path finder abstraction should support searching Graphs with
arbitrary node types. It should obtain a cost metric through the
Path interface.
In order to keep your PathFinder
implementation from being dependent on a particular node type you may
find it useful to pass in source Paths rather than source
nodes.
ps3pathfinder.test and provide there expected output in
ps3pathfinder.expected. Also include an explanation your testing
strategy and an argument for why you feel that your tests are
sufficient, but not excessive. Place this writeup in problem5.txt.
WeightedNodePath. In
a file named problem6.txt, describe why one might prefer the
representation that was chosen over possible alternatives. You should
describe at least one alternate representation. You should consider
the representation within the context of its expected use in
PathFinder.
Graph.java PathFinder.java [.java source for Test Driver] problem1.txt problem2.txt problem3.txt problem5.txt problem6.txt [test cases for p3] [expected output for p3 tests] [test cases for p5] [expected output for p5 tests]
You should also turnin a hardcopy of each of these files.
The following classes are all provided for you, in compiled form, by the staff:
While the testing driver requires listing nodes in alphabetical
order, it is not necessary for your graph implementation to store or
return values in alphabetical order. You can have your testing driver
sort the values it receives from your graph before outputting them.
You may find java.util.Collections.sort
to be useful.
In order to compare the output of your testing driver to either the expected output we provide, or to your own expected output program, you may find the unix diff command to be useful. The diff command can compare two input files and report what is different between them. To use diff you should first save the output of running your test to a file. You can do this by adding, after your normal command line, a greater than (>) character followed by the name of a file to save the output to. You can then run diff by typing the command (substituting the appropriate file names):
diff expected-output-file actual-output-fileIf diff exits and prints nothing to the screen, then the two files are identical. If they are different, however, diff will print to the screen a summary of the changes. If you have a wide terminal you might find:
diff -y expected-output-file actual-output-filewhich displays the two files side by side to be more readable.
Just like you can use the greater than symbol to send the output of a program to a file you can use the less than symbol (<) to redirect the contents of a file to standard in. Therefore if you wanted to run the provided testing skeleton reading input from a file name mytest.test and writing it to mytest.out you could run:
java ps3.PS3TestDriver < mytest.test > mytest.out
The algorithm given above will work under most circumstances, but will fail under certain conditions. If you solve this problem set using the algorithm as stated above, you will not have points deducted for the cases which fail. That is, we won't test your implementation under conditions which cause the algorithm to fail.
However, we provide a fix to the algorithm below, for students who wish to have an implementation which will be correct for all inputs. Again, students need not make this change for ps3; we are providing the fix for purposes of later reference, thoroughness, and correctness.
The algorithm is the same as before, except for the changes marked in bold. The change is that the test for reaching a goal is performed after extraction from the queue, instead of while iterating over child nodes.
// Return a shortest path from any element of starts to any
// element of goals in a node-weighted graph.
Algorithm node-weighted-shortest-path(starts, goals) {
// [ test for start nodes in goals removed ]
// maps nodes -> paths
Map paths = { forall start in starts | (start, [start]) }
// The priority queue contains nodes with priority equal to the cost
// of the shortest path to reach that node. Initially it contains
// the start nodes
PriorityQueue active = starts
Set finished = { }
while active is non-empty do {
// y is the element of active with shortest path
y = active.extractMin()
ypath = paths(y)
// [goal test added here]
if (y in goals) {
return ypath
}
// iterate over edges (y, z) in y.edges
for each child z of y {
zpath = ypath + [z]
// [goal test removed from here]
if (z not in finished) and (z not in active) {
paths(z) = zpath
...
This section details the changes made since the handout was posted to the website. You may use the version number at the bottom of your hardcopy to determine what has changed since you printed this handout.
| revision 1.68: | Added due date for returnin. |
| revision 1.67: | Add a section describing an optional fix to the path-finding algorithm. |
| revision 1.65: |
Example command line in final
hint now accounts for PS3TestDriver being in package
ps3.
|
| revision 1.64: | Added this changelog. |
| revision 1.62: | Move hint #2 about passing in Path instances to PathFinder into the actual problem statement. |
| revision 1.61: | Emphasize that Graph is to be a generic directed graph. |
| revision 1.60: | Don't require that the PathFinder method is static. |
| revision 1.58: | Remove the mention of NoSuchPathException from the comment at the top of the path finding pseudo code. Students are not required to use an exception. |
| revision 1.57: | Add section detailing what classes are provided by the staff. |
| revision 1.55: | Clarify problem 4 to remind students to implement PathFinder. |
| revision 1.53: | Clarify the handling of failing to find a path between two nodes. |
| revision 1.52: | Fix iterator bug in WeightedNodePath implementation. |