Testing

As mentioned in lecture, testing increases our confidence that our code behaves "correctly" (i.e., meets the specifications provided). Testing can only show that certain specific errors are absent; it cannot prove that a program is free of all errors.

Black-Box Testing vs. Glass-Box Testing

Black-Box Testing

Black-box testing involves testing a module through its specifications alone. One way of doing this is by checking to see if the input-output relationship embodied by the specifications holds for all possible inputs to the procedure. However, it is usually infeasible to test on the set of all possible inputs and as a result, we need to try and focus our tests on the most likely problem spots. We do this by:

Glass-Box Testing

Glass-box testing involves testing a module's internal program structure exhaustively. Black-box testing may not exercise all lines of code by virtue of the fact that it ignores the internals of procedures and focuses only on the specifications. Glass-box testing remedies this by using knowledge of the program structure to provide maximum coverage of the code. Ideally, glass-box testing should be path-complete, i.e., it should run all possible paths through a program. To help determine what the paths are through a program, we make use of basic-block diagrams where basic-blocks are sequences of statements that do not contain any branches (conditionals or loops). Basic-block diagrams depict the flow of control in a program from block to block. Since there are often an infinite number of paths through a program, we must settle for testing path boundary conditions:

SpecificationTests vs. ImplementationTests

As a brief aside, notice that black-box testing and glass-box testing DO NOT correspond to the SpecificationTest and ImplementationTest suites that we have you build in 6.170. Instead, our rule of thumb is that specification tests are tests that should pass on everybody's code base, while implementation tests are tests that may only work on your code. Thus, the implementation tests and specification tests are completely orthogonal to the concepts of black-box and glass-box testing.

Q: Why would you build a black-box implementation test?

A: Because you will want to test your internal specifications. In PS3 and beyond you are given the freedom to write some of your own class specifications, including defining their public interfaces. Therefore, any black-box tests you write based on your specifications will only work on classes you write, and so by the rule of thumb given above those tests should go in the ImplementationTests suite.

Top-Down Testing vs. Bottom-Up Testing

Top-Down Testing

Bottom-Up Testing

Testing Type Hierarchies

In order to test the type hierarchy for any given subtype, we must test:

It is important to note that although the supertype should have its own glass-box test suite for testing its correctness, this is not needed to test the type hierarchy for its subtypes.

Q: How do you test an abstract supertype?

A: To test the supertype class itself, one should provide a stub implementation of a subtype: one that provides trivial implementations of the supertype's abstract methods. This stub can then be used to test the supertype's non-abstract methods. Full subtype implementations should also test their behavior through the supertype's spec.
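As a minimal sketch of the stub approach, assume a hypothetical abstract class Shape with one abstract method, area(), and one concrete method, isLargerThan(), defined in terms of it. Neither name comes from the problem sets; they are only for illustration. The stub returns a fixed value from the abstract method so that the concrete method can be exercised on its own. Plain Java checks are used instead of JUnit so the sketch runs standalone.

```java
// Hypothetical abstract supertype: one abstract method, one concrete method built on it.
abstract class Shape {
    /** returns: the area of this shape; always >= 0 */
    abstract double area();

    /** returns: true iff this shape's area is strictly greater than other's */
    boolean isLargerThan(Shape other) {
        return area() > other.area();
    }
}

// Stub subtype: the abstract method is implemented trivially by returning a fixed value.
class StubShape extends Shape {
    private final double fixedArea;

    StubShape(double fixedArea) {
        this.fixedArea = fixedArea;
    }

    @Override
    double area() {
        return fixedArea;
    }
}

class ShapeStubDemo {
    public static void main(String[] args) {
        Shape small = new StubShape(1.0);
        Shape big = new StubShape(2.0);
        // Only the supertype's non-abstract method is under test here.
        System.out.println(big.isLargerThan(small));   // true
        System.out.println(small.isLargerThan(big));   // false
        System.out.println(small.isLargerThan(small)); // false
    }
}
```

Because the stub's behavior is trivial, a failure in these checks points at the supertype's own code rather than at a subtype.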

Different Types of Tests

Unit Tests vs. Integration Tests

Unit Tests

Unit tests serve to test a single module in isolation. When testing large programs, unit tests should be written for each class and every static procedure in the system. Initially, black-box tests should be written as soon as the specification for a module exists. Once the implementation of the module is complete, additional glass-box tests should be written to test its implementation-specific behavior.

Q: Why should the black-box tests be written before the implementation of the module is complete?

A: Black-box tests depend only on the specifications for a procedure and are completely independent of implementation-level details. For this reason it is a good idea to write these tests before the implementation is concrete, so as to avoid any bias arising from knowledge of the internal workings of the procedure.

In practice, writing unit tests before a module is implemented can be frustrating, especially since changes to the spec may cause a set of tests to be wasted. A good habit is to write new tests each time new behavior is implemented. Don't wait until after development to write tests, since it's likely that a test suite written afterwards will be less complete than one written during development.
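To illustrate black-box tests that depend only on a specification, here is a sketch that assumes a Fib interface with a single method int fib(int n), requiring n >= 1 and defining fib(1) == fib(2) == 1. The actual interface is not reproduced in this handout, so that signature is an assumption. The checks are written against the interface alone and could exist before any implementation does; the loop-based implementation in main is only there to make the sketch runnable.

```java
// Assumed shape of the Fib interface; the real signature is not shown in this handout.
interface Fib {
    /** requires: n >= 1.  returns: the nth Fibonacci number, where fib(1) == fib(2) == 1. */
    int fib(int n);
}

class FibBlackBoxChecks {
    /** Checks spec-level facts only, so it can be written before any implementation exists. */
    static void check(Fib f) {
        expect(f.fib(1), 1, "fib(1)");    // smallest legal input
        expect(f.fib(2), 1, "fib(2)");    // second base value
        expect(f.fib(3), 2, "fib(3)");    // first value defined by the recurrence
        expect(f.fib(10), 55, "fib(10)"); // a typical input
    }

    private static void expect(int actual, int expected, String what) {
        if (actual != expected) {
            throw new AssertionError(what + " expected:<" + expected + "> but was:<" + actual + ">");
        }
    }

    public static void main(String[] args) {
        // Any implementation can be plugged in later; a simple loop is used here.
        Fib loopFib = n -> {
            int prev = 0, cur = 1;
            for (int i = 1; i < n; i++) {
                int next = prev + cur;
                prev = cur;
                cur = next;
            }
            return cur;
        };
        check(loopFib);
        System.out.println("all black-box checks passed");
    }
}
```

Since check() takes any Fib, the same checks can be rerun unchanged against every implementation of the interface.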

Integration Tests

Integration tests serve to test a group of modules interfacing to one another.

Q: Why test a group of modules together when each one appears to be bug-free on its own?

A: Integration tests help isolate problems related to the interconnection of various modules. These problems primarily arise from vague specifications, which lead modules to make assumptions about each other that are not shared by all of them.

Integration tests can be done recursively upwards, as larger modules make use of more and more smaller ones. At the top level, this is called a system test and should be run automatically and regularly in any software engineering environment.

Regression Tests

Regression tests involve testing every component after each build or bug-fix.

Q: What is the need for regression tests?

A: These are necessary to ensure that changes made to one module do not break something that was previously working. Regression tests are especially useful because the changes are still fresh in the engineer's mind, so bugs can be tracked down more quickly.

Automatic testing systems are vital to make regression testing easy and fast. The Eclipse Continuous Testing plugin that we introduced this term is one example of such a tool. When a bug is found, tests should be written immediately that fail with the bug and pass once the bug is fixed.

Testing Tools

Example

A number of classes have been provided: the Fib interface, a test suite for that interface, a recursive implementation and its test suite, a linear implementation and its test suite, and a caching implementation and its test suite. Things to notice:

Q: What happens to RecursiveFib.fib() if it's called with n < 1?

A: Infinite recursion: StackOverflowError. This could be fixed by checking that the requires clause is satisfied.
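One possible way to check the requires clause is sketched below. It assumes the precondition is n >= 1 and that the base cases are fib(1) == fib(2) == 1; the provided RecursiveFib may be written differently, and the class name here is hypothetical.

```java
class CheckedRecursiveFib {
    /** requires: n >= 1 (assumed). Throws instead of recursing forever when violated. */
    static int fib(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("fib requires n >= 1, but n was " + n);
        }
        if (n <= 2) {
            return 1;                   // base cases: fib(1) == fib(2) == 1
        }
        return fib(n - 1) + fib(n - 2); // both arguments stay >= 1 because n >= 3 here
    }

    public static void main(String[] args) {
        System.out.println(fib(10));    // 55
        try {
            fib(0);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With the check in place, a bad argument fails immediately with a descriptive message rather than with a StackOverflowError deep in the recursion.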

Q: RecursiveFib fails testFibThirty and testFibFortySeven with this message: java.lang.InterruptedException: Test time exceeded 100ms. Why?

A: The number of recursive calls RecursiveFib makes grows exponentially with increasing n (roughly as 1.6^n, and never more than 2^n). At n == 30 that is already well over a million recursive calls, which takes quite some time.
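The growth can be measured directly by counting calls. The sketch below assumes the naive recursion with base cases fib(1) == fib(2) == 1, which may differ from the provided RecursiveFib; the class and counter names are hypothetical. Under that assumption a top-level call fib(n) makes 2*fib(n) - 1 calls in total.

```java
class FibCallCounter {
    static long calls = 0;   // number of times fib has been entered

    static int fib(int n) {
        calls++;
        if (n <= 2) {
            return 1;        // assumed base cases: fib(1) == fib(2) == 1
        }
        return fib(n - 1) + fib(n - 2);
    }

    /** Returns how many calls a single top-level fib(n) makes, including itself. */
    static long countCalls(int n) {
        calls = 0;
        fib(n);
        return calls;
    }

    public static void main(String[] args) {
        for (int n = 10; n <= 30; n += 10) {
            System.out.println("fib(" + n + ") made " + countCalls(n) + " calls");
        }
        // fib(10) made 109 calls
        // fib(20) made 13529 calls
        // fib(30) made 1664079 calls
    }
}
```

Each increase of 10 in n multiplies the number of calls by more than 100, which is why a test with a fixed time limit passes for small n and fails for large n.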

Q: LinearFib fails testFibFortySeven with this message: fib(47) expected:<2971215073> but was:<-1323752223>. Why?

A: Integer overflow. fib(47) is larger than 2^31 - 1 (2147483647), the largest positive int (since an int has 32 bits, one of which encodes the sign). The return value looks like a negative number because of two's-complement integer representation: the true result wraps around by 2^32.
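The wraparound can be reproduced in a few lines. This sketch is not the provided LinearFib; it is a hypothetical loop-based version written three ways: with int arithmetic, with long arithmetic, and with Math.addExact, which throws ArithmeticException on overflow instead of wrapping silently.

```java
class FibOverflowDemo {
    /** int arithmetic: silently wraps once the true result exceeds Integer.MAX_VALUE. */
    static int fibInt(int n) {
        int prev = 0, cur = 1;
        for (int i = 1; i < n; i++) {
            int next = prev + cur;
            prev = cur;
            cur = next;
        }
        return cur;
    }

    /** long arithmetic: 64 bits are enough for fib(47). */
    static long fibLong(int n) {
        long prev = 0, cur = 1;
        for (int i = 1; i < n; i++) {
            long next = prev + cur;
            prev = cur;
            cur = next;
        }
        return cur;
    }

    /** int arithmetic with an overflow check: throws ArithmeticException instead of wrapping. */
    static int fibExact(int n) {
        int prev = 0, cur = 1;
        for (int i = 1; i < n; i++) {
            int next = Math.addExact(prev, cur);
            prev = cur;
            cur = next;
        }
        return cur;
    }

    public static void main(String[] args) {
        System.out.println(Integer.MAX_VALUE); // 2147483647
        System.out.println(fibInt(46));        // 1836311903 (still fits in an int)
        System.out.println(fibInt(47));        // -1323752223 (wrapped around)
        System.out.println(fibLong(47));       // 2971215073
    }
}
```

Note that 2971215073 - 2^32 == -1323752223, which is exactly the value reported in the failure message.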

Q: CachingFib fails testFibTenTwice with this message: 2nd fib(10) expected:<55> but was:<10>. Why? Why didn't testFibOneTwice() also fail?

A: There's a bug in CachingFib: it caches the argument 'n' instead of the result 'fib(n)' (testFibOneTwice() passes because fib(1) == 1). Notice that this error was not caught by any black-box tests. The glass-box tests make sure that both branches of the conditional in CachingFib.fib() are explored.
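The following is a hypothetical reconstruction of this kind of bug, not the provided CachingFib source. A flag switches between the buggy behavior (storing the argument) and the fixed behavior (storing the result) so that the same test can be shown failing before the fix and passing after it.

```java
import java.util.HashMap;
import java.util.Map;

class CachingFibSketch {
    private final Map<Integer, Integer> cache = new HashMap<>();
    private final boolean buggy; // true reproduces the bug, false is the fix

    CachingFibSketch(boolean buggy) {
        this.buggy = buggy;
    }

    int fib(int n) {
        Integer cached = cache.get(n);
        if (cached != null) {
            return cached;                // cache-hit branch
        }
        int result = compute(n);          // cache-miss branch
        cache.put(n, buggy ? n : result); // the bug stores the argument, not the result
        return result;
    }

    private static int compute(int n) {
        int prev = 0, cur = 1;
        for (int i = 1; i < n; i++) {
            int next = prev + cur;
            prev = cur;
            cur = next;
        }
        return cur;
    }

    public static void main(String[] args) {
        CachingFibSketch broken = new CachingFibSketch(true);
        System.out.println(broken.fib(10)); // 55 (cache miss)
        System.out.println(broken.fib(10)); // 10 (cache hit returns the stored argument)

        CachingFibSketch fixed = new CachingFibSketch(false);
        System.out.println(fixed.fib(10));  // 55
        System.out.println(fixed.fib(10));  // 55
    }
}
```

A test that calls fib only once per argument never reaches the cache-hit branch, so it cannot see the bug. Calling fib twice with the same argument covers both branches, and choosing an argument where fib(n) != n is what makes the failure visible.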

Exercises

Q: It's difficult to automate GUI testing since usually a user has to do the point-and-clicking. How might we automate testing a system that has a GUI?

A: One way is to separate the GUI and the functional part of the system into separate modules. The functional part can then be tested with an automated test driver, while the GUI functionality can be tested by hand. However, the two parts still must be tested together (integration testing).

Another way is to use a GUI scripting tool that reads in mouse clicks and causes the system to send its results to some output analyzer. For example, one could script the process of creating an address book through a GUI, dump that data to a file, and compare that file against an expected result.
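A rough sketch of the first approach follows, using a hypothetical AddressBookModel class that contains no GUI code. The driver makes the same calls that a GUI's event handlers would make, then compares a text dump against an expected result. All names and the dump format are assumptions made for illustration.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical functional core: no GUI classes are referenced here.
class AddressBookModel {
    private final Map<String, String> phoneByName = new TreeMap<>();

    void add(String name, String phone) {
        phoneByName.put(name, phone);
    }

    /** Dumps entries as "name=phone" lines, sorted by name, for comparison against expected output. */
    String dump() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : phoneByName.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }
}

class AddressBookDriver {
    public static void main(String[] args) {
        // The driver stands in for the GUI: it issues the calls a button handler would issue.
        AddressBookModel book = new AddressBookModel();
        book.add("Bob", "555-0199");
        book.add("Alice", "555-0100");

        String expected = "Alice=555-0100\nBob=555-0199\n";
        System.out.println(expected.equals(book.dump()) ? "PASS" : "FAIL");
    }
}
```

The GUI layer then only needs to forward events to the model, which keeps the part that must be tested by hand small.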

Q: An alternative to testing is verification: a formal or informal argument that a program works on all inputs. Why don't we usually use verification instead of testing?

A: For non-trivial programs, arguing correctness becomes difficult and very time-consuming. Furthermore, unless the argument refers directly to the program text, bugs can cause the program to violate the argument. Also, most forms of formal verification require a formal specification, which is often as hard to write as the implementation itself, or harder!

Q: Why is regression testing necessary?

A: Changes to one part of a program can break behavior in another part of a program because of mistakes in implementation or bad specifications. Regression testing reveals these errors immediately after they are introduced, allowing the engineer to fix them while the change is fresh.

Q: Creating a large (perhaps even exhaustive) test suite requires generating a lot of test cases with the correct output. How can we create this data and be sure that it's correct?

A: One way is to implement a simple stub that generates correct input and output pairs. Such a stub can be checked by hand, and its output can be used to test a more complex, optimized implementation.
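As a sketch of this idea, the code below uses a direct recursive definition of Fibonacci as the simple, hand-checkable reference and compares a loop-based implementation against it over a range of inputs. The class and method names are hypothetical, and the base cases fib(1) == fib(2) == 1 are assumed.

```java
import java.util.function.IntToLongFunction;

class FibOracleCheck {
    /** Simple reference: follows the definition directly, so it is easy to check by hand. */
    static long referenceFib(int n) {
        return n <= 2 ? 1 : referenceFib(n - 1) + referenceFib(n - 2);
    }

    /** Optimized implementation under test: a linear-time loop. */
    static long fastFib(int n) {
        long prev = 0, cur = 1;
        for (int i = 1; i < n; i++) {
            long next = prev + cur;
            prev = cur;
            cur = next;
        }
        return cur;
    }

    /** Returns the first n in [1, maxN] where impl disagrees with the reference, or -1 if none. */
    static int firstMismatch(IntToLongFunction impl, int maxN) {
        for (int n = 1; n <= maxN; n++) {
            if (impl.applyAsLong(n) != referenceFib(n)) {
                return n;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstMismatch(FibOracleCheck::fastFib, 25)); // -1
    }
}
```

The reference is too slow for large inputs, so it only covers the range where it finishes quickly; values outside that range still need expected outputs from another source.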

Because it takes a lot of time to generate good test suites by hand, we often try to minimize the number of partitions in a program's input space that we need to test. Automatic tools make it easier to generate large test suites and run them, allowing us to cover the program input space better.

Q: Should engineers write the tests for their own programs?

A: Yes and no. Yes, because the engineer understands the code and the spec and can quickly write a number of the required tests. No, because an engineer will make assumptions about a program's behavior and either forget to test certain behavior or write tests that assume certain behavior. The best solution is to have the author of the code, other engineers, and customer representatives write tests.
