Thursday, March 4, 2004
Contents:
You could easily spend a lot of time on any of the relevant topics: testing, debugging, exceptions, confusions from Quiz 1, confusions from ps3, approaching ps4. Consider allocating time based on which of these the students want to have covered.
If a class A overrides Object's toString(), then there is no way for a different class to call Object's toString() on instances of A.
Talk about why equals is not symmetric for TolerantFloat
hashCode question.
Behavioral equivalence, both from ps3 and from the quiz, isn't quite crystal clear yet. First of all, do you all understand observational equivalence?
Try thinking about behavioral equivalence like this: let's say objects A and B are behaviorally equivalent. That means that if you perform any sequence of mutators on either one of them, then A and B will still be observationally equivalent.
So two Cars from ps3 with the same attributes are not behaviorally equivalent...you can drive one of them, and then it will no longer be observationally equivalent with the other.
Car.drive method. The spec for public double drive(double timeSlice) tells us that timeSlice is in seconds. And the spec for public void setSpeed() says that speed is passed in in miles per hour! So you're almost certainly storing your speed field in mph, and in drive you'll then have to convert units. The fact that speed is stored in mph is a great thing to put into Car's abstraction function.
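A minimal sketch of that unit conversion, assuming a hypothetical CarSketch class. The field names, the odometer, and the choice to return the distance in miles are assumptions for illustration; the real ps3 spec may differ.

```java
// Hypothetical sketch, not the ps3 Car class. All names here are assumptions.
class CarSketch {
    private static final double SECONDS_PER_HOUR = 3600.0;

    // Abstraction function (partial): speedMph is the car's current speed in
    // miles per hour; odometerMiles is the total distance driven, in miles.
    private double speedMph;
    private double odometerMiles;

    /** Sets the current speed; mph is in miles per hour. */
    public void setSpeed(double mph) {
        this.speedMph = mph;
    }

    /**
     * Drives at the current speed for timeSlice seconds.
     * Returns the distance covered in miles (an assumption about the spec).
     */
    public double drive(double timeSlice) {
        double hours = timeSlice / SECONDS_PER_HOUR; // seconds -> hours
        double miles = speedMph * hours;             // mph * hours -> miles
        odometerMiles += miles;
        return miles;
    }

    public double getOdometerMiles() {
        return odometerMiles;
    }
}
```

Writing the units into the field names and the abstraction function keeps the conversion in one obvious place.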
As is often the case in the real world, these crucial bits of information are buried in the specification (if they're even present at all!). These are the sorts of details you should be paying special attention to while reading a specification!
Good choice of variable names goes a long way toward making the code readable. It's often more useful than comments alongside the code. For example, ...
When you are evaluating the advantages of one design over another, think about them in terms of the problem domain at hand. When analyzing what would make a good representation for Route, talk about it in terms of Route's methods and the functionality that they might require: for example, a representation that makes it possible to implement a certain method in constant time.
The fact that ArrayList allows constant time access to any element can't really be called an advantage, because that functionality isn't necessary for any of Route's methods. But the fact that you can get an iterator from the ArrayList quickly, as used for getSegments(), can be called an advantage.
Ideally, you'd want to be able to perform Route's methods in constant time...
Catching the wrong exception. For example...
Did you feel that writing your test suites first helped your implementation?
One thing to keep in mind is that your black box test suites should be testing the specification, not your particular implementation. Furthermore, any correct implementation of the specification should pass your test cases. If anyone else is later maintaining your code, and changes the internal implementation somewhere, then your test suites should still be valid!
What is the point of testing, anyway?
You should touch on our expectations for the test suites. We held the students' hands in the first two problem sets by making validate6170 use our complete staff test suite. However, eventually they will have to write the complete test suite themselves, like for the final project. Since they have heard about ways to test from the lectures and readings, we expect them to develop their own methods for testing their code.
Black-box testing involves testing a module through its specifications alone. One way of doing this is by checking to see if the input-output relationship embodied by the specifications holds for all possible inputs to the procedure. However, it is usually infeasible to test on the set of all possible inputs and as a result, we need to try and focus our tests on the most likely problem spots. We do this by:
static void appendVector(ArrayList v1, ArrayList v2) {
    if (v1 == null || v2 == null)
        throw new NullPointerException("input arraylists cannot be null");
    while (v2.size() > 0) {
        v1.add(v2.get(v2.size() - 1));
        v2.remove(v2.size() - 1);
    }
}
Q: What happens when this procedure is called with v1 and v2 both referring to the same vector?
A: The procedure enters an infinite loop.
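One way to see the aliasing problem without hanging a test run is to mirror the loop with a step budget. The sketch below is an illustration only: AppendAliasDemo, appendWithBudget, and maxSteps are hypothetical names, not part of the original procedure.

```java
import java.util.List;

// Hypothetical helper for illustration; it copies appendVector's loop body
// and adds a step counter so that a non-terminating call can be detected.
class AppendAliasDemo {
    /**
     * Runs the same loop as appendVector, but gives up after maxSteps
     * iterations. Returns true if the loop finished on its own, false if
     * the budget ran out while v2 was still non-empty.
     */
    static <T> boolean appendWithBudget(List<T> v1, List<T> v2, int maxSteps) {
        int steps = 0;
        while (v2.size() > 0) {
            if (steps++ >= maxSteps) {
                return false; // v2 never shrinks: the loop makes no progress
            }
            v1.add(v2.get(v2.size() - 1));
            v2.remove(v2.size() - 1);
        }
        return true;
    }
}
```

With two distinct lists the call returns true. With the same list passed twice, each iteration adds one element and removes one, so the size never reaches zero and the call returns false.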
Glass-box testing involves testing a module's internal program structure exhaustively. Black-box testing may not exercise all lines of code by virtue of the fact that it ignores the internals of procedures and focuses only on the specifications. Glass-box testing remedies this by using knowledge of the program structure to provide maximum coverage of the code. Ideally, glass-box testing should be path-complete, i.e., it should run all possible paths through a program. To help determine what the paths are through a program, we make use of basic-block diagrams where basic-blocks are sequences of statements that do not contain any branches (conditionals or loops). Basic-block diagrams depict the flow of control in a program from block to block. Since there are often an infinite number of paths through a program, we must settle for testing path boundary conditions:
Q: A program should also be tested on inputs outside its expected input space. Why is this a good idea?
A: A number of bugs involve callers accidentally (or intentionally, in the case of hackers) failing to obey a "requires" clause and thus causing an error. For this reason, it is important to test the behaviour of the code in response to incorrect inputs. The worst way to deal with such inputs is to do nothing and return a wrong answer. Ideally, when tested, the program should check the "requires" clause and signal some kind of an error.
Q: What are the advantages/disadvantages of top-down testing?
A: Good for identifying design problems early on, but can be difficult to implement.
Q: What are the advantages/disadvantages of bottom-up testing?
A: As mentioned, it is easier to implement bottom-up testing. However, this approach does not reveal high-level design flaws until much work on submodules has been done.
In order to test the type hierarchy for any given subtype, we must test:
It is important to note that although the supertype should have its own glass-box test suite for testing its correctness, this is not needed to test the type hierarchy for its subtypes.
Q: How do you test an abstract supertype?
A: To test the supertype class itself, one should provide a stub implementation of a subtype: one that provides trivial implementations of the supertype's abstract methods. This stub can then be used to test the supertype's non-abstract methods. Full subtype implementations should also test their behavior through the supertype's spec.
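A minimal sketch of the stub technique. ShapeSketch and StubShape are hypothetical classes invented for illustration; they do not come from any problem set.

```java
// Hypothetical abstract supertype with one abstract and one concrete method.
abstract class ShapeSketch {
    /** Returns the area of this shape; each subtype defines it. */
    abstract double area();

    /** Non-abstract method under test: compares two shapes by area. */
    boolean isLargerThan(ShapeSketch other) {
        return this.area() > other.area();
    }
}

// Stub subtype: its only job is to return a canned value, so that a test of
// isLargerThan() depends on nothing but the supertype's own code.
class StubShape extends ShapeSketch {
    private final double cannedArea;

    StubShape(double cannedArea) {
        this.cannedArea = cannedArea;
    }

    @Override
    double area() {
        return cannedArea;
    }
}
```

A test can then check, for example, that new StubShape(2.0).isLargerThan(new StubShape(1.0)) is true and that the reverse comparison is false.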
Unit tests serve to test a single module in isolation. While testing large programs, unit tests should be written for each class and every static procedure in the system. Initially, black-box tests should be written as soon as the specification for a module exists. Once the implementation of the module is complete, additional glass-box tests should be written to test its implementation-specific behaviour.
Q: Why should the black-box tests be written before the implementation of the module is complete?
A: Black-box tests depend only on the specifications for a procedure and are completely independent of implementation-level details. For this reason it is a good idea to write these tests before the implementation is concrete so as to avoid any bias arising from knowledge of the internal workings of the procedure.
In practice, writing unit tests before a module is implemented can be frustrating, especially since changes to the spec may cause a set of tests to be wasted. A good habit is to write new tests each time new behavior is implemented. Don't wait until after development to write tests, since it's likely that a test suite written afterwards will be less complete than one written during development.
Integration tests serve to test a group of modules interfacing to one another.
Q: Why do we need to test, as a group, modules that each appear to be bug-free on their own?
A: Integration tests help isolate problems related to the interconnection of various modules. These problems primarily arise from vague specifications leading to assumptions between modules that are not shared universally.
Integration tests can be done recursively upwards, as larger modules make use of more and more smaller ones. At the top-level, this is called a system test and should be run automatically and regularly in any software engineering environment.
Regression tests involve testing every component after each build or bug-fix.
Q: What is the need for regression tests?
A: These are necessary to ensure changes made to one module do not cause something that was previously working to break down. Regression tests are especially useful since whatever changes were made are fresh in the engineer's mind, so bugs can be tracked down more quickly.
Automatic testing systems are vital to make regression testing easy and fast. When a bug is found, tests should be written immediately that fail with the bug and pass once the bug is fixed.
A number of classes have been provided: the Fib interface, a test suite for that interface, a recursive implementation and its test suite, a linear implementation and its test suite, and a caching implementation and its test suite. Things to notice:
Q: What happens to RecursiveFib.fib() if it's called with n < 1?
A: Infinite recursion: StackOverflowError. This could be fixed by checking that the requires clause is satisfied.
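A sketch of that fix, assuming fib(1) == fib(2) == 1 as the base cases. The provided RecursiveFib is not shown in these notes, so the class name and the choice of IllegalArgumentException are assumptions.

```java
// Hypothetical variant of a recursive Fibonacci that checks its precondition.
class CheckedRecursiveFib {
    /**
     * requires: n >= 1 (checked here rather than assumed)
     * returns: the nth Fibonacci number, where fib(1) == fib(2) == 1
     */
    static int fib(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1, was " + n);
        }
        if (n <= 2) {
            return 1;
        }
        return fib(n - 1) + fib(n - 2);
    }
}
```

A call such as fib(0) now fails immediately with a descriptive exception instead of recursing until the stack overflows.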
Q: RecursiveFib fails testFibThirty and testFibFortySeven with this message: java.lang.InterruptedException: Test time exceeded 100ms. Why?
A: RecursiveFib's running time grows exponentially with increasing n. The number of recursive calls is bounded above by 2^n (about a billion at n == 30) and in practice grows roughly as 1.6^n, which is already more than a million calls at n == 30. That takes quite some time.
Q: LinearFib fails testFibFortySeven with this message: fib(47) expected:<2971215073> but was:<-1323752223>. Why?
A: Integer overflow. fib(47) is larger than 2^31 - 1, the largest positive int (since an int has 32 bits). The return value looks like a negative number because of two's-complement integer representation.
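The overflow can be reproduced with a short iterative sketch. FibOverflowDemo and its two methods are hypothetical; the provided LinearFib is not shown in these notes and may be written differently.

```java
// Hypothetical demo: the same loop computed in int and in long arithmetic.
class FibOverflowDemo {
    /** requires: n >= 1. Uses int arithmetic, which wraps around silently. */
    static int fibInt(int n) {
        int prev = 0;
        int curr = 1;
        for (int i = 1; i < n; i++) {
            int next = prev + curr; // wraps past 2^31 - 1 without any error
            prev = curr;
            curr = next;
        }
        return curr;
    }

    /** requires: n >= 1. Uses long arithmetic, which is wide enough for n == 47. */
    static long fibLong(int n) {
        long prev = 0;
        long curr = 1;
        for (int i = 1; i < n; i++) {
            long next = prev + curr;
            prev = curr;
            curr = next;
        }
        return curr;
    }
}
```

Here fibInt(47) returns -1323752223, the value in the failure message, while fibLong(47) returns 2971215073. The two differ by exactly 2^32.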
Q: CachingFib fails testFibTenTwice with this message: 2nd fib(10) expected:<55> but was:<10>. Why? Why didn't testFibOneTwice() also fail?
A: There's a bug in CachingFib: it caches the argument 'n' instead of the result 'fib(n)' (testFibOneTwice() passes because fib(1) == 1). Notice that this error was not caught by any black-box tests. The glass-box tests make sure that both branches of the conditional in CachingFib.fib() are explored.
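For comparison, here is a sketch of a cache that stores the result rather than the argument. CachingFibSketch is a hypothetical class; the provided CachingFib is not shown in these notes, so its structure is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical caching implementation with the two branches a glass-box
// test suite should cover: cache hit and cache miss.
class CachingFibSketch {
    private final Map<Integer, Long> cache = new HashMap<>();

    /** requires: n >= 1. Returns the nth Fibonacci number. */
    long fib(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1, was " + n);
        }
        Long cached = cache.get(n);
        if (cached != null) {
            return cached; // cache-hit branch
        }
        long result = (n <= 2) ? 1 : fib(n - 1) + fib(n - 2); // cache-miss branch
        cache.put(n, result); // store the result, not the argument n
        return result;
    }
}
```

Calling fib(10) twice on the same instance exercises the miss branch and then the hit branch, and both calls should return 55.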
Q: It's difficult to automate GUI testing since usually a user has to do the point-and-clicking. How might we automate testing a system that has a GUI?
A: One way is to separate the GUI and the functional part of the system into separate modules. The functional part can then be tested with an automated test driver, while the GUI functionality can be tested by hand. However, the two parts still must be tested together (integration testing).
Another way is to use a GUI scripting tool that reads in mouse clicks and causes the system to send its results to some output analyzer. For example, one could script the process of creating an address book through a GUI, dump that data to a file, and compare that file against an expected result.
Q: An alternative to testing is verification: a formal or informal argument that a program works on all inputs. Why don't we usually use verification instead of testing?
A: For non-trivial programs, arguing correctness becomes difficult and very time-consuming. Furthermore, unless the argument refers directly to the program text, bugs can cause the program to violate the argument. Also, most forms of formal verification require a formal specification, which is often as hard to write as the implementation itself, or harder!
Q: Why is regression testing necessary?
A: Changes to one part of a program can break behavior in another part of a program because of mistakes in implementation or bad specifications. Regression testing reveals these errors immediately after they are introduced, allowing the engineer to fix them while the change is fresh.
Q: Creating a large (perhaps even exhaustive) test suite requires generating a lot of test cases with the correct output. How can we create this data and be sure that it's correct?
A: One way is to implement a simple stub that generates correct input and output pairs. Such a stub can be checked by hand, and its output can be used to test a more complex, optimized implementation.
Because it takes a lot of time to generate good test suites by hand, we often try to minimize the number of partitions in a program's input space that we need to test. Automatic tools make it easier to generate large test suites and run them, allowing us to cover the program input space better.
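The idea of checking an optimized implementation against a simple reference can be sketched as follows. The problem (summing 1..n) and all names are hypothetical, chosen only because the simple version is easy to verify by hand.

```java
// Hypothetical example: a simple reference implementation serves as the
// source of expected outputs for a faster implementation.
class OracleSketch {
    /** Simple reference: slow, but easy to check by hand. requires: n >= 0. */
    static long sumSimple(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    /** Optimized implementation under test (closed form). requires: n >= 0. */
    static long sumFast(int n) {
        return (long) n * (n + 1) / 2;
    }

    /**
     * Compares the two implementations on every n in [0, maxN].
     * Returns the first n where they disagree, or -1 if they always agree.
     */
    static int firstMismatch(int maxN) {
        for (int n = 0; n <= maxN; n++) {
            if (sumSimple(n) != sumFast(n)) {
                return n;
            }
        }
        return -1;
    }
}
```

The comparison is only as trustworthy as the reference, so the reference should stay simple enough to inspect by hand.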
Q: Should engineers write the tests for their own programs?
A: Yes and no. Yes, because the engineer understands the code and the spec and can quickly write a number of the required tests. No, because an engineer will make assumptions about a program's behavior and either forget to test certain behavior or write tests that assume certain behavior. The best solution is to have the author of the code, other engineers, and customer representatives write tests.
Imagine you are flying in an aircraft to visit a friend. The pilot sets it on autopilot and goes to sleep. Now something goes wrong with the engine. What can we do?
Exercise: Can you think of any examples of when you might want an exception?
Exercise: How might you handle these error conditions in a different way? Why is the use of exceptions better?
Essentially, exceptions allow methods to terminate gracefully while providing feedback about the error condition.
Exercise: Why use an exception if you have a checkRep() function already?
Answer:
Exercise: When would you use checked exceptions and when would you use unchecked exceptions?
Answer: As a rule of thumb, if you are throwing an exception for an abnormal condition that you feel that client programmers should consciously decide how to handle, throw a checked exception. In general, exceptions that indicate an improper use of a class should be unchecked.
The StringIndexOutOfBoundsException thrown by String's charAt() method is an unchecked exception. The designers of the String class didn't want to force client programmers to deal with the possibility of an invalid index parameter every time they called charAt(int index).
The read() method of class java.io.FileInputStream, on the other hand, throws IOException, which is a checked exception. This exception indicates some kind of error occurred while attempting to read from the file. It doesn't indicate that the client has used the FileInputStream class improperly. It just signals that the method itself is unable to fulfill its contractual responsibility of reading in the next byte from the file. The designers of the FileInputStream class considered this abnormal condition to be common enough, and important enough, to force client programmers to deal with it.
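The difference shows up in what the compiler demands of the caller. In the sketch below, CheckedVsUncheckedDemo and its two helper methods are hypothetical; charAt() and FileInputStream.read() are the standard library calls discussed above.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Path;

// Hypothetical helpers contrasting an unchecked and a checked exception.
class CheckedVsUncheckedDemo {
    /**
     * charAt() may throw StringIndexOutOfBoundsException (unchecked), so this
     * method compiles with neither a try/catch nor a throws clause.
     */
    static char lastChar(String s) {
        return s.charAt(s.length() - 1);
    }

    /**
     * read() may throw IOException (checked), so this method must either
     * catch it or declare it. Here it is declared and left to the caller.
     */
    static int firstByte(Path file) throws IOException {
        try (FileInputStream in = new FileInputStream(file.toFile())) {
            return in.read();
        }
    }
}
```

Removing the throws clause from firstByte produces a compile-time error, while lastChar("") compiles fine and fails only at run time.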
public class NegativeException extends Exception {
    public NegativeException() {
        super(); // explicitly call the constructor of the superclass (Exception)
    }

    public NegativeException(String s) {
        super(s); // explicitly call the constructor of the superclass (Exception)
    }
}
Checked exceptions are created by extending Exception, and unchecked exceptions are created by extending RuntimeException. As you can see from this example, a new exception type only has to define constructors. All of the other methods of the exception can be inherited from its superclass.
A method can terminate by throwing an exception, using the throw statement. For example,
// effects: if n < 0 throws a NegativeException; else returns n!
public static int fact(int n) throws NegativeException {
    if (n < 0) {
        throw new NegativeException("fact");
    }
    ...
}
This example shows how to throw an exception, as well as how to declare that a method may throw a checked exception. The string argument that is passed to the exception's constructor can be retrieved using the getMessage() method of the exception, and it also appears in the result of toString(). This allows a user to get an English description of what went wrong if the program cannot handle the exception.
try {
    x = Num.fact(y);
}
catch (NegativeException e) {
    // code that handles exception, which can use e
}
If the call to fact throws a NegativeException, the catch clause is executed. The exception that is thrown by fact is bound to the variable e, which can be used by the code in the catch clause.
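Putting the pieces together, a self-contained sketch might look like this. NumSketch, the loop inside fact, and the describe method are assumptions added for illustration; only the exception class and the fact signature follow the fragments above.

```java
// Checked exception, as defined earlier in these notes (one constructor kept).
class NegativeException extends Exception {
    public NegativeException(String s) {
        super(s);
    }
}

// Hypothetical stand-in for Num. The factorial loop is an assumed
// implementation; the notes elide the real body.
class NumSketch {
    // effects: if n < 0 throws a NegativeException; else returns n!
    static int fact(int n) throws NegativeException {
        if (n < 0) {
            throw new NegativeException("fact");
        }
        int result = 1;
        for (int i = 2; i <= n; i++) {
            result *= i;
        }
        return result;
    }

    // Hypothetical caller showing the try/catch pattern.
    static String describe(int y) {
        try {
            return y + "! = " + fact(y);
        } catch (NegativeException e) {
            // e is bound to the thrown exception; getMessage() returns "fact"
            return "cannot compute: " + e.getMessage();
        }
    }
}
```

With this sketch, describe(5) returns "5! = 120" and describe(-1) returns "cannot compute: fact".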
Several catch clauses can be attached to a try statement, so that different exceptions can be handled differently. The syntax for this is:
try {
    // code that may throw a SomeException or a SomeOtherException
}
catch (SomeException e) {
    // code fragment a
}
catch (SomeOtherException e) {
    // code fragment b
}
In this case, if the code in the try block throws a SomeException, then code fragment a is executed; if it throws a SomeOtherException, then code fragment b is executed. Otherwise, execution continues normally after the catch blocks.
Since exceptions are arranged in a class hierarchy, you can use a try statement to catch any exception that is a subclass of some type of exception. For example, the java.io package contains numerous exceptions that are all subclasses of IOException, such as FileNotFoundException, InterruptedIOException, and EOFException. One way to handle all of these different types of IOExceptions is to write code like:
try {
    // code that could throw an IOException
}
catch (FileNotFoundException e) {
    ...
}
catch (InterruptedIOException e) {
    ...
}
catch (EOFException e) {
    ...
}
Exercise: What would happen in the following code? Why would you do this?
try {
    // code that throws an EOFException or FileNotFoundException
}
catch (IOException e) {
    ...
}
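A small sketch to experiment with while discussing this exercise. CatchSuperclassDemo and its handle method are hypothetical; the exception classes are the standard java.io ones.

```java
import java.io.EOFException;
import java.io.FileNotFoundException;
import java.io.IOException;

// Hypothetical demo: a single catch clause for the common superclass.
class CatchSuperclassDemo {
    /** Throws one of two IOException subclasses and reports which was caught. */
    static String handle(boolean missingFile) {
        try {
            if (missingFile) {
                throw new FileNotFoundException("no such file");
            }
            throw new EOFException("unexpected end of file");
        } catch (IOException e) {
            // Both exceptions land here because both are subclasses of
            // IOException; e still refers to the specific object thrown.
            return e.getClass().getSimpleName();
        }
    }
}
```

The handler runs for either subclass, and the run-time class of e still identifies which one was actually thrown.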
You can also handle exceptions by propagating them. If a method m() calls code that is declared to throw a checked exception, the method may declare the exception or a superclass of the exception in its throws clause and delegate the responsibility to catch the exception to the code that calls m(). For the sake of code clarity, it is actually preferable to catch the exception in m(), and then rethrow the exception. For example (in the following bad code):
void m() throws NegativeException {
    try {
        int x = Num.fact(y);
    }
    catch (NegativeException e) {
        throw e;
    }
    ...
}
The catch clause above rethrows the exception e. This makes it clearer to a reader of the code where the NegativeException declared in the throws clause may be thrown from.
Contrast these two pieces of code:
try {
    Object o = iterator.next();
}
catch (NoSuchElementException e) {
    ...
}

if (iterator.hasNext()) {
    Object o = iterator.next();
}
This is on page 73 in Liskov, and also in Bloch.
As their name implies, exceptions should only be used in exceptional circumstances. You should not use exceptions for normal control flow, for example to replace a while loop. Among other reasons, exceptions are generally expensive to create, throw, and catch, because JVM implementations tend not to optimize their performance.
Favor the use of standard exceptions. It makes your API easier to learn and use, because it matches established conventions that people are already familiar with. It also makes programs that use your API easier to read, because they won't be littered with unfamiliar exceptions.
Throw exceptions appropriate to the abstraction. You want the exceptions you throw to be helpful to your clients. You don't want to 'expose' the representation in an exception message, since your clients do not know what rep you're using. Higher layers should catch lower-level exceptions and, in their place, throw exceptions in terms of the higher-level abstraction.
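A minimal sketch of this kind of exception translation. TaskQueueSketch is a hypothetical abstraction whose rep happens to be an ArrayList; the choice of IllegalStateException is an assumption for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical queue abstraction. Clients know about queues, not about the
// list that represents one.
class TaskQueueSketch {
    private final List<String> tasks = new ArrayList<>();

    void add(String task) {
        tasks.add(task);
    }

    /**
     * Removes and returns the oldest task.
     * Throws IllegalStateException if the queue is empty.
     */
    String removeFirst() {
        try {
            return tasks.remove(0);
        } catch (IndexOutOfBoundsException e) {
            // Translate the rep-level failure into the client's terms and
            // keep the original as the cause for debugging.
            throw new IllegalStateException("queue is empty", e);
        }
    }
}
```

Checking tasks.isEmpty() before removing would avoid the inner exception altogether; the catch-and-translate form is shown here only because it is the pattern the paragraph describes.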
Document all exceptions thrown by each method. Always declare checked exceptions individually, and document the conditions under which each one is thrown, using the @throws tag.
Don't ignore exceptions. Whenever you have an empty catch block, you're ignoring the purpose of exceptions. There is a reason why the exception is being thrown, and you shouldn't ignore that.
// Empty catch block ignores exception - Highly suspect!
try {
    ...
} catch (SomeException e) {
}
From Bloch: It is the responsibility of any class overriding the Object contract methods (equals, hashCode, toString, clone, finalize) to obey their general contracts; failure to do so will prevent other classes that depend on these contracts from functioning properly in conjunction with the class.[3]