Thursday, February 12, 2004
Contents:
Last week you learned semantics of Java and its application to object-oriented programming, including the concepts of mutability, equality, and the basic structure of a Java program. This week in lecture you learned about subclassing and dynamic dispatch, procedural abstractions and specifications, and testing specifications. In this recitation we will receive this material in preparation for problem set 2.
As usual, no need to cover all the material below; if students feel comfortable with a topic, you may want to skip directly to the questions just to make sure they understand it.
To illustrate the concepts above, we will use the example of a type hierarchy of geometrical figures, or curves. Suppose that you are writing software to control an ink plotter drawing on a piece of paper, or the head of a lasercutter/waterjet drilling into wood or metal. We would like the head controlling the ink/laser/water to trace a contiguous toolpath of straight lines or curves from a starting point to an end point to minimize the number of times the head must stop and start.
The different curves can be classified on various levels. For simplicity, we will divide them into the following types:
Note that these types are related because they share some properties or because we might want to perform the same kinds of operations on them. For example, to draw any of these shapes, we would want to know where to place our head at the start of drawing, where to stop drawing, and how much the head has traveled as a rough measure of how much ink or power we've used up while drawing. These operations are encapsulated in the most abstract type, a curve.
A curve may follow a straight line for a certain length with different starting and ending points (a line segment). On the other hand, it may have the same starting and ending points (a closed curve), and we might want to measure the area it encloses. Different kinds of closed curves can be conveniently represented with a few points and distances. For example, one made up entirely of line segments (a polygon) can be defined as a list of vertices, or common endpoints of line segments. A closed curve that maintains an equal distance around a center point (a circle) can be defined just by that center point and the distance (radius).
Describing a toolpath is an example of a problem which deals with a related family of types. It requires
In both of these cases, we would like to save work and increase maintainability by reusing existing, tested code. We would also like a way to express the relationship that some types inherit behavior from a parent but may specialize it to handle new problems. This is known as subtyping and there are two ways to do this in Java: subclasses, which inherit some implementation details from a superclass, and interfaces, whose implementing subtypes only inherit method signatures (partial spec). We will concentrate on subclassing in this recitation and leave interfaces for later, although you have already seen interfaces such as List and SortedSet in problem set 1.
Subclassing is extending
a previous class (the superclass)
by adding new methods (operations) or
overriding old methods to behave differently. For example,
LineSegments and ClosedCurves are subclasses
of Curve. LineSegment behaves the same way as
Curve except it cannot have an arbitrary length; it overrides
the getLength method to always return the straight line distance
between its starting and ending point. Overriding methods must have the
same signature (name, number/type/order of parameters, return type) in both
the subclass and superclass.
ClosedCurve can have arbitrary length just like Curve
but it adds a new operation getArea to
handle the special case of curves that enclose an area, getArea.
Likewise, Polygons and Circles are even more
concrete kinds of ClosedCurves and have corresponding
new operations specific to their subclass. Every class is more abstract
than its subclasses but more concrete than its superclass, thus forming
a type hierarchy shown below.
Subclasses can access their superclasses
using the keyword super. Subclasses can (but do not have to)
call their superclass constructor by calling the method super()
with the appropriate arguments. Recall that the private
access modifier prevents a field or method from being access outside
its immediate class, but protected fields and methods
can be access from a subclass and, unfortunately, anywhere else in the
same package.
Problem Source Files
Solved Source Files
Questions
Q.
startingPoint and endingPoint are declared as
protected in Curve but they can also be
private and still meet the specs. Why is this, and what is the
advantage of this second situation?
A.
Starting and ending points are only referenced from within the
superclass Curve and are general enough to have the same meaning
for all subclasses (contrast with area or length.
Making these fields private would shield subclasses from changes in their
superclass and isolates the superclass representation from subclasses.
Q.
Why do Polygon and Circle have to override
getArea from ClosedCurve?
(e.g. what's wrong with having a big switch statement inside
ClosedCurve.getArea that calls instanceof for Polgyon
or Circle and does the correct area calculation?
A.
This procedure-based approach would make it hard to add new kinds of
closed curves because you would have to keep changing,
recompiling, and testing ClosedCurve for every new subclass.
This would negate the benefits of inheritance/subclassing and
object-oriented programming. If the implementation of Circle
or Polygon changed, these changes would no longer be
localized to that class.
An assignment in Java is an expression with a location (a variable) of a declared type T and an expression which, when evaluated, as a runtime type and is stored in the location.
T <variable> = <expression>
Java is a type-safe language, meaning it checks (both at compile-time and at run-time) that all expressions are assigned to variables with valid types. Java type-checking ensures that the type of the expression is always a subtype of T. (recall that a type is always a subtype of itself).
In Java, variables can be of two types.
The declared type of a variable is what it appears to be at compile-time. The runtime type of a variable is its actual type when code is executed. Because primitive types cannot be subclassed and do not have a type-hierarchy, the declared and runtime types are always the same and type-checking is easy.
For object types, the runtime type must either be the same as the declared
type T, one of its subclasses if T is a class, or an implementing class
if T is an interface.
For example, in overriding the equals method in Problem Set 1,
the declared type of the argument is always a
(java.lang.)Object. However, in order to determine
equality, you usually have to (down)cast the object to another type to
compare specific fields.
public boolean equals(Object o) {
...
Card c = (Card)o;
...
}
Questions
Q. If an assignment is type-safe at compile-time, is it always type-safe at run-time? What about the reverse?
A.
No, assignments that are type-safe at compile-time but not at run-time
are what causes ClassCastExceptions. No, the compiler
can fail on assignments requiring explicit (down)casting which would
have been otherwise type-safe at run-time.
Q.
Assuming the following statements are run in sequence.
Which of the numbered assignments are
A.
Because of the difference between declared and runtime type,
overridden methods,
and inherited methods, the compiler
cannot always determine which code to run when a method is called. This is
determined dynamically at runtime via dispatching. All subclasses
inherit the method specifications of their superclasses; however, a
subclass may override a method implementation In the diagram
below, you can see that some subclasses of Curve inherit
some methods from their superclass but override others or add new methods.
In the Curve.appendCurve() method, the overall algorithm
is independent of the different kinds of curves.
It delegates the computation of the second curve's length to its
getLength() method, which is meant to be overridden in
subclasses. This use of an overridden method is called a template.
Questions
Q. Assume the following code is run in sequence. For each numbered line, tell which method (type and method name) is called at runtime.
A.
Abstraction by parameterization abstracts away the identity of the
data being used (in this case, the types of the arguments). Implementations
of a method can treat arguments according to their declared type without
having to know their runtime type, called polymorphism.
Allows everything outside
method to be treated like a black box.
EXAMPLE: Curve.appendCurve can combine any two curves into a new
curve in the same way, even if the two curves are of different subtypes.
Abstraction by specification abstracts away the implementation of
a method from its callers. A method with
a given specification (signature and behavior) should have the same effect
given the same arguments even if its implementation is replaced
(modifiability). It also means that all the code dealing with a particular
operation can be found in one place (locality).
Everything inside the method can be treated as a black box.
EXAMPLE: Polygon.getArea calculates the area of a polygon by
dividing it into triangles and using Heron's formula. If we replaced its
implementation with a more efficient algorithm (commented out), it should
still return the same result as before, and the calling code should not have
to be changed.
Specifications make abstractions precise so that implementors know what requirements their code must meet and also so that they gain the benefit of locality and modifiability (above). Specifications are written using abstract spec fields rather than runtime Java fields. Also, specs allow us to verify that programs actually meet their requirements through tests.
requires preconditions that should be true upon method entry.
If absent, assumed to be true.modifies lists which inputs are modified (called side-effects)
during the method call. If absent, assumed nothing is modified.effects postconditions that should be true upon method exit, given that requires clause is met. It also technically includes
exceptions thrown and return values although these have separate tags in
Javadoc. It should only refer to spec fields mentioned in modifies.
Note that effects clause says nothing about what happens if
a method caller fails to meet the requires clause, although
it is good practise to fail gracefully or make a best-effort to fulfill
the client code's request.
Questions
Q.
What's wrong with the specification for the spec field length
in Curve? (Hint: consider the physical system that this
class models, and the desired behavior of an arbitrary curve with
respect to, for example, a line segment.) How can you fix this (in the
spec and in the implementation)?
A. The length of a curve cannot be shorter than the length of a line segment between the same starting and ending points. We can fix this by adding this to the spec and by adding a check to the constructor.
Q.
Polygon.getLength() is not implemented. What code would
implement the given specs?
A. See solutions.
The ability to replace one method with another, one of the chief benefits of procedural abstraction, depends on comparing their specifications. One specification can always replace an equal or weaker specification. Specification A is at least as strong (no weaker) than a specification B if A is satisfied anywhere (by any implementation) where B is satisfied. The strength of preconditions and postconditions in A and B above are related to their effect on their specification's strength.
Intuitively:
A postcondition that is weak enough to accept more than one output as valid for a single input is called underdetermined or non-deterministic. In designing procedural specifications, you want to make sure its preconditions are weak enough (generality) to handle all the inputs you are interested in but that its postconditions are strong enough (restrictiveness) to produce the outputs you are interested in.
Questions
Q. Should a subclass's specification be weaker or stronger than its superclass's specification?
A. Stronger. In a sense, if the class promises to do a task for a client, it can also "subcontract" this task to a subclass. The subclass can do more than its parent class originally promised and still meet the contract, but it cannot do less.
Q.
Curve.append()'s requires clause prevents a curve from
being appended to itself. This is impossible to do without lifting the
head at the ending point and moving it back to the starting point, except
for a ClosedCurve. Write a weaker precondition
so that this is possible. Is the specification
for Curve.append() weaker or stronger now?
A. Remove the clause (second != this). The precondition is now weaker b/c is makes fewer demands, so the specification is stronger, as we would expect for a subclass.
Touch on staff expectations for the test suites. In the first two problem sets, students use complete, staff-provided test suites called by the validate6170 script. However, eventually they will have to write the complete test suite themselves, like for the final project. Since they have heard about ways to test from the lectures and readings, we expect them to develop their own methods for testing their code.
There are two testing strategies, black box (which depends only on the specifications) and glass box (which takes into account implementation details). Both have their advantages and disadvantages, and a thorough test suite should include both kinds. We have only covered specification testing (black box) in lecture, but we will cover implementation testing (glass box) in a few weeks.
Black-box testing involves testing a module through its specifications alone. One way of doing this is by checking to see if the input-output relationship embodied by the specifications holds for all possible inputs to the procedure. However, it is usually infeasible to test on the set of all possible inputs and as a result, we need to try and focus our tests on the most likely problem spots. We do this by:
You may have heard of extreme programming (XP), a software methodology that stresses flexibility, iterative design, and test-driven development. Although 6.170 does not advocate all aspects of XP, we definitely teach test-driven development. By writing tests first, you are demonstrating knowledge of the specs and can catch any incorrect assumptions and high-level errors early on. By writing the code after the tests, you are able to focus on a specific goal: writing the simplest implementation possible that will pass the tests.
Test-driven development also includes unit testing (bottom-up testing of modules in isolation) and automated regression testing (having a structured way to catch new bugs and make sure old bugs aren't reintroduced). 6.170 relies heavily on these two testing practices and the JUnit tool itself, all of which come from XP.
Questions
Q. Why is it a good idea for a program to be tested on inputs outside of its expected input space (not meeting its requires clause)?
A. A number of bugs involve callers accidentally (or intentionally, in the case of hackers) failing to obey a "requires" clause and thus causing an error. For this reason, it is important to test the behavior of the code in response to incorrect inputs. The worst way to deal with such inputs is to do nothing and return a wrong answer. Ideally, when tested, the program should check the "requires" clause and signal some kind of an error.
Q.
What sort of boundary cases can you think of for testing the
specification only (not the implementation) of
Curve.appendCurve?
A.
Some possible answers are a null object, passing this
into itself, passing in two non-contiguous curves, and passing in two
contiguous curves.