6.170 Recitation 2: Subclassing, Procedural Abstraction, and Testing Specs

Thursday, February 12, 2004

Contents:


Announcements


Introduction

Last week you learned semantics of Java and its application to object-oriented programming, including the concepts of mutability, equality, and the basic structure of a Java program. This week in lecture you learned about subclassing and dynamic dispatch, procedural abstractions and specifications, and testing specifications. In this recitation we will receive this material in preparation for problem set 2.

As usual, no need to cover all the material below; if students feel comfortable with a topic, you may want to skip directly to the questions just to make sure they understand it.


Shape Example

To illustrate the concepts above, we will use the example of a type hierarchy of geometrical figures, or curves. Suppose that you are writing software to control an ink plotter drawing on a piece of paper, or the head of a lasercutter/waterjet drilling into wood or metal. We would like the head controlling the ink/laser/water to trace a contiguous toolpath of straight lines or curves from a starting point to an end point to minimize the number of times the head must stop and start.

The different curves can be classified on various levels. For simplicity, we will divide them into the following types:

Note that these types are related because they share some properties or because we might want to perform the same kinds of operations on them. For example, to draw any of these shapes, we would want to know where to place our head at the start of drawing, where to stop drawing, and how much the head has traveled as a rough measure of how much ink or power we've used up while drawing. These operations are encapsulated in the most abstract type, a curve.

A curve may follow a straight line for a certain length with different starting and ending points (a line segment). On the other hand, it may have the same starting and ending points (a closed curve), and we might want to measure the area it encloses. Different kinds of closed curves can be conveniently represented with a few points and distances. For example, one made up entirely of line segments (a polygon) can be defined as a list of vertices, or common endpoints of line segments. A closed curve that maintains an equal distance around a center point (a circle) can be defined just by that center point and the distance (radius).


Subclassing and Type Hierarchies

Describing a toolpath is an example of a problem which deals with a related family of types. It requires

In both of these cases, we would like to save work and increase maintainability by reusing existing, tested code. We would also like a way to express the relationship that some types inherit behavior from a parent but may specialize it to handle new problems. This is known as subtyping and there are two ways to do this in Java: subclasses, which inherit some implementation details from a superclass, and interfaces, whose implementing subtypes only inherit method signatures (partial spec). We will concentrate on subclassing in this recitation and leave interfaces for later, although you have already seen interfaces such as List and SortedSet in problem set 1.

Subclassing is extending a previous class (the superclass) by adding new methods (operations) or overriding old methods to behave differently. For example, LineSegments and ClosedCurves are subclasses of Curve. LineSegment behaves the same way as Curve except it cannot have an arbitrary length; it overrides the getLength method to always return the straight line distance between its starting and ending point. Overriding methods must have the same signature (name, number/type/order of parameters, return type) in both the subclass and superclass.

ClosedCurve can have arbitrary length just like Curve but it adds a new operation getArea to handle the special case of curves that enclose an area, getArea. Likewise, Polygons and Circles are even more concrete kinds of ClosedCurves and have corresponding new operations specific to their subclass. Every class is more abstract than its subclasses but more concrete than its superclass, thus forming a type hierarchy shown below.

Subclasses can access their superclasses using the keyword super. Subclasses can (but do not have to) call their superclass constructor by calling the method super() with the appropriate arguments. Recall that the private access modifier prevents a field or method from being access outside its immediate class, but protected fields and methods can be access from a subclass and, unfortunately, anywhere else in the same package.

HTML documentation

Problem Source Files

Solved Source Files

Questions

Q. startingPoint and endingPoint are declared as protected in Curve but they can also be private and still meet the specs. Why is this, and what is the advantage of this second situation?

A. Starting and ending points are only referenced from within the superclass Curve and are general enough to have the same meaning for all subclasses (contrast with area or length. Making these fields private would shield subclasses from changes in their superclass and isolates the superclass representation from subclasses.

Q. Why do Polygon and Circle have to override getArea from ClosedCurve? (e.g. what's wrong with having a big switch statement inside ClosedCurve.getArea that calls instanceof for Polgyon or Circle and does the correct area calculation?

A. This procedure-based approach would make it hard to add new kinds of closed curves because you would have to keep changing, recompiling, and testing ClosedCurve for every new subclass. This would negate the benefits of inheritance/subclassing and object-oriented programming. If the implementation of Circle or Polygon changed, these changes would no longer be localized to that class.


Variable Assignment

An assignment in Java is an expression with a location (a variable) of a declared type T and an expression which, when evaluated, as a runtime type and is stored in the location.

T <variable> = <expression>

Java is a type-safe language, meaning it checks (both at compile-time and at run-time) that all expressions are assigned to variables with valid types. Java type-checking ensures that the type of the expression is always a subtype of T. (recall that a type is always a subtype of itself).

In Java, variables can be of two types.

The declared type of a variable is what it appears to be at compile-time. The runtime type of a variable is its actual type when code is executed. Because primitive types cannot be subclassed and do not have a type-hierarchy, the declared and runtime types are always the same and type-checking is easy.

For object types, the runtime type must either be the same as the declared type T, one of its subclasses if T is a class, or an implementing class if T is an interface. For example, in overriding the equals method in Problem Set 1, the declared type of the argument is always a (java.lang.)Object. However, in order to determine equality, you usually have to (down)cast the object to another type to compare specific fields.

public boolean equals(Object o) { ... Card c = (Card)o; ... }

Questions

Q. If an assignment is type-safe at compile-time, is it always type-safe at run-time? What about the reverse?

A. No, assignments that are type-safe at compile-time but not at run-time are what causes ClassCastExceptions. No, the compiler can fail on assignments requiring explicit (down)casting which would have been otherwise type-safe at run-time.

Q. Assuming the following statements are run in sequence. Which of the numbered assignments are

  1. safe based on both declared and runtime types
  2. safe based on declared but not on runtime types
  3. safe based on runtime but not on declared types
  4. unsafe based on both declared and runtime types
  1. Object a = new Point(1,2);
    Point b = new Point(3,4);
  2. Polygon c = new ClosedCurve(a, 22.0);
  3. ClosedCurve d = new Polygon(new Point[] {b, new Point(5,6), new Point(7,8)});
  4. Circle e = (Circle)d;
  5. Polygon f = (Polygon)d;
    Vector vector = new Vector();
    vector.add(new Circle(b, 22));
  6. LineSegment h = vector.get(0);
  7. Circle i = vector.get(0);
  8. LineSegment j = (LineSegment)vector.get(0);
  9. Circle k = new Circle(a, 22);

A.

  1. (a)
  2. (d)
  3. (a)
  4. (b)
  5. (a)
  6. (d)
  7. (c)
  8. (b)
  9. (c)


Dispatching

Because of the difference between declared and runtime type, overridden methods, and inherited methods, the compiler cannot always determine which code to run when a method is called. This is determined dynamically at runtime via dispatching. All subclasses inherit the method specifications of their superclasses; however, a subclass may override a method implementation In the diagram below, you can see that some subclasses of Curve inherit some methods from their superclass but override others or add new methods.

In the Curve.appendCurve() method, the overall algorithm is independent of the different kinds of curves. It delegates the computation of the second curve's length to its getLength() method, which is meant to be overridden in subclasses. This use of an overridden method is called a template.

Questions

Q. Assume the following code is run in sequence. For each numbered line, tell which method (type and method name) is called at runtime.

    Point a = new Point(1,2);
    Point b = new Point(3,4);
    Curve c = new ClosedCurve(a, 22.0);
  1. double d = c.getLength();
  2. double e = ((ClosedCurve)c).getArea();
  3. Circle f = new Circle(a, 22.0);
  4. double g = f.getLength();
  5. Point h = f.getStartingPoint();
    LineSegment h = new LineSegment(a,b);
  6. double i = h.getLength();

A.

  1. Curve.getLength()
  2. ClosedCurve.getArea()
  3. Circle.getLength()
  4. Curve.getStartingPoint()
  5. LineSegment.getLength()


Procedural Abstraction

Abstraction Mechanisms

Applied to procedural abstraction (methods in a class).

Abstraction by parameterization abstracts away the identity of the data being used (in this case, the types of the arguments). Implementations of a method can treat arguments according to their declared type without having to know their runtime type, called polymorphism. Allows everything outside method to be treated like a black box.
EXAMPLE: Curve.appendCurve can combine any two curves into a new curve in the same way, even if the two curves are of different subtypes.

Abstraction by specification abstracts away the implementation of a method from its callers. A method with a given specification (signature and behavior) should have the same effect given the same arguments even if its implementation is replaced (modifiability). It also means that all the code dealing with a particular operation can be found in one place (locality). Everything inside the method can be treated as a black box.
EXAMPLE: Polygon.getArea calculates the area of a polygon by dividing it into triangles and using Heron's formula. If we replaced its implementation with a more efficient algorithm (commented out), it should still return the same result as before, and the calling code should not have to be changed.

Procedural Specifications

Specifications make abstractions precise so that implementors know what requirements their code must meet and also so that they gain the benefit of locality and modifiability (above). Specifications are written using abstract spec fields rather than runtime Java fields. Also, specs allow us to verify that programs actually meet their requirements through tests.

Note that effects clause says nothing about what happens if a method caller fails to meet the requires clause, although it is good practise to fail gracefully or make a best-effort to fulfill the client code's request.

Questions

Q. What's wrong with the specification for the spec field length in Curve? (Hint: consider the physical system that this class models, and the desired behavior of an arbitrary curve with respect to, for example, a line segment.) How can you fix this (in the spec and in the implementation)?

A. The length of a curve cannot be shorter than the length of a line segment between the same starting and ending points. We can fix this by adding this to the spec and by adding a check to the constructor.

Q. Polygon.getLength() is not implemented. What code would implement the given specs?

A. See solutions.

Specification Ordering

The ability to replace one method with another, one of the chief benefits of procedural abstraction, depends on comparing their specifications. One specification can always replace an equal or weaker specification. Specification A is at least as strong (no weaker) than a specification B if A is satisfied anywhere (by any implementation) where B is satisfied. The strength of preconditions and postconditions in A and B above are related to their effect on their specification's strength.

Intuitively:

A postcondition that is weak enough to accept more than one output as valid for a single input is called underdetermined or non-deterministic. In designing procedural specifications, you want to make sure its preconditions are weak enough (generality) to handle all the inputs you are interested in but that its postconditions are strong enough (restrictiveness) to produce the outputs you are interested in.

Questions

Q. Should a subclass's specification be weaker or stronger than its superclass's specification?

A. Stronger. In a sense, if the class promises to do a task for a client, it can also "subcontract" this task to a subclass. The subclass can do more than its parent class originally promised and still meet the contract, but it cannot do less.

Q. Curve.append()'s requires clause prevents a curve from being appended to itself. This is impossible to do without lifting the head at the ending point and moving it back to the starting point, except for a ClosedCurve. Write a weaker precondition so that this is possible. Is the specification for Curve.append() weaker or stronger now?

A. Remove the clause (second != this). The precondition is now weaker b/c is makes fewer demands, so the specification is stronger, as we would expect for a subclass.


Testing Specifications

Touch on staff expectations for the test suites. In the first two problem sets, students use complete, staff-provided test suites called by the validate6170 script. However, eventually they will have to write the complete test suite themselves, like for the final project. Since they have heard about ways to test from the lectures and readings, we expect them to develop their own methods for testing their code.

There are two testing strategies, black box (which depends only on the specifications) and glass box (which takes into account implementation details). Both have their advantages and disadvantages, and a thorough test suite should include both kinds. We have only covered specification testing (black box) in lecture, but we will cover implementation testing (glass box) in a few weeks.

Black Box

Black-box testing involves testing a module through its specifications alone. One way of doing this is by checking to see if the input-output relationship embodied by the specifications holds for all possible inputs to the procedure. However, it is usually infeasible to test on the set of all possible inputs and as a result, we need to try and focus our tests on the most likely problem spots. We do this by:

Test-Driven Development

You may have heard of extreme programming (XP), a software methodology that stresses flexibility, iterative design, and test-driven development. Although 6.170 does not advocate all aspects of XP, we definitely teach test-driven development. By writing tests first, you are demonstrating knowledge of the specs and can catch any incorrect assumptions and high-level errors early on. By writing the code after the tests, you are able to focus on a specific goal: writing the simplest implementation possible that will pass the tests.

Test-driven development also includes unit testing (bottom-up testing of modules in isolation) and automated regression testing (having a structured way to catch new bugs and make sure old bugs aren't reintroduced). 6.170 relies heavily on these two testing practices and the JUnit tool itself, all of which come from XP.

Questions

Q. Why is it a good idea for a program to be tested on inputs outside of its expected input space (not meeting its requires clause)?

A. A number of bugs involve callers accidentally (or intentionally, in the case of hackers) failing to obey a "requires" clause and thus causing an error. For this reason, it is important to test the behavior of the code in response to incorrect inputs. The worst way to deal with such inputs is to do nothing and return a wrong answer. Ideally, when tested, the program should check the "requires" clause and signal some kind of an error.

Q. What sort of boundary cases can you think of for testing the specification only (not the implementation) of Curve.appendCurve?

A. Some possible answers are a null object, passing this into itself, passing in two non-contiguous curves, and passing in two contiguous curves.