6.170 Recitation 3: Abstractions

Thursday, February 18, 2003

Javadoc Tags

this part is copied verbatim from tools.html

Some extended tags belong in the overview for a class; they are used to formally define what a given Abstract Data Type represents.
@specfield name : T // text Indicates that name is a abstract specification field of type T for the class, adding text as a comment if present
@derivedfield name : T // text Same as specfield, except that this also adds the property "derived" to the output information

Derived fields can be viewed as functions on preexisting state; thus if a class had a specfield @specfield n : integer we could define a derived field

@derivedfield pos : boolean // pos = true iff n > 0

Derived fields are not allowed to hold any information that could not be already calculated from the already existing state in the object. Thus, you use specfields to introduce new state variables and derived fields to introduce functions on those state variables.

Derived fields are not strictly needed in specifications, but they may reduce complexity and redundancy.

this part deviates slightly from tools.html

Here is a review of the Javadoc tags that you should use to document the methods in your Java classes:

@param tag is followed by the name of the parameter and should describe what the parameter represents
@requires list preconditions for the method
@modifies list of the parameters that may be modified by this method
@effects explains how the parameters in modifies may be changed
@throws List a separate throws tag for any Exception that may be thrown by this method and explain under what conditions the Exception will be thrown
@return explains what value (if any) can be returned by this method

Here is an example of all of the tags in action:

/**
 * This method takes a list, sorts it, and then performs
 * a search on it to find the required object,
 * returning the first index of the object in the sorted
 * list such that (obj == null ? e == null : obj.equals(e)).
 * If no such object exists in the list, method returns -1.
 *
 * @param list   the list to be sorted and searched
 * @param obj    the object to be found in the list
 * @requires list is not null
 * @modifies list
 * @effects sorts list using its natural order
 * @throws NullPointerException if list is null
 * @throws ClassCastException if the type of the specified element
 * is incompatible with this list 
 * @return the first index of the obj in the sorted list,
 * or -1 if it is not in the list
 */
public static int sortAndFind(ArrayList list, Object obj);
Some notes:

Abstract Data Types, Representation Invariants, and Abstract Functions

We are talking about the implementation of an Abstract Data Type (ADT). Data abstraction is a type of abstraction used to abstract away the details of data representation. Data abstraction is also a specification mechanism -- a way to think about programs and designs.

To implement a data abstraction we select a representation for the instances of the type, the "rep", and then implement operations in terms of that rep.

A rep is a not just a data structure, but also includes a set of conventions about that structure. The Rep Invariant describes the set of allowed values of the data structure, and the Abstraction Function shows how the structure is to be interpreted.

NOTE: Keep in mind, both notions are for use by the implementer of the class, not the users, and so should not be included in any of the public specifications (the javadoc).

Example: RatNum

    /**
     * @effects If d = 0, constructs a new RatNum = NaN.  Else
     * constructs a new RatNum = (n / d).
     */
    public RatNum(int n, int d) {
	// special case for zero denominator; gcd(n,d) requires d != 0
	if (d == 0) {
	    numer = n;
	    denom = 0;
	    checkRep();
	    return;
	}
	
	// reduce ratio to lowest terms
	int g = gcd(n,d);
	numer = n / g;
	denom = d / g;
	
	if (denom < 0) {
	    numer = -numer;
	    denom = -denom;
	    checkRep();
	}
    }

Recall that this implementation of rational numbers requires that the numbers are stored in reduced form. Therefore, creating RatNum(4,2) and RatNum(8,4) will both store 2 in the numer field and 1 in the demon field.

This lowest terms requirement should be communicated through the Representation Invariant for the RatNum class. Rep Invariants are placed at the beginning of the class, usually just following the fields of the class.

DOCUMENTATION NOTE: The Rep Invariant is usually placed just after the private fields because the Rep Invariant refers to the specific fields of the class

Possible Answer:

AF(r) = r is NaN if r.denom = 0, (r.numer / r.denom) otherwise

RI(r) = (r.denom >= 0) && (r.denom > 0 ==> there does not exist integer i > 1 such that r.numer mod i = 0 and r.denom mod i = 0)

What abstract values do the following concrete values represent?

Lessons:

Use of AF/RI

What we really want to do is be able to reason about the implementation of a type. These are techniques for "getting it right" which can, and will save you effort down the road.

The rep invariant, together with the abstraction function, can help you perform modular reasoning. That is, you can consider the correctness of one method independent of the other methods. This helps you narrow the scope of what you must consider, and thus can make the task much simpler.

The rep invariant can also help you identify errant behavior earlier, which makes debugging problems easier.

The abstraction function helps you understand the value you are representing. If you are going to write a method which performs some operation, it will usually be easier to think about what abstract value you want as the result, and then to consider how your implementation can create that value - a question which necessarily uses the abstraction function to map a concrete state to an abstract value.

Is the mapping from concrete to abstract values 1:1? No! In the RatNum example above it is not 1:1 since any concrete instance with r.denom = 0 will map to the abstract NaN. As another example, a Set implemented via a Vector can have many concrete representations map to the same abstract value (e.g. [1 2] and [2 1] are both {1 2}). However, note that every concrete value (which satisfies the rep invariant) must map to exactly one abstract value.

Preserving the Rep Invariant via Structural Induction

  1. Show that RI holds after all constructors
  2. Assume the RI holds on entry to all public methods (often we will include protected or package-private methods, too)
  3. Then, show that the method preserves the RI

Does the RI have to hold at private methods? Not necessarily; we only care that the RI is true upon return to calling code. For example, a private helper method may be used to operate on the rep when the invariant is false.

Do you have to consider observer methods in this induction? Yes, since some obeservers may have invisible side effects (a.k.a. benevolent side effects).

Question:

Does this approach guarantee that the RI will always hold?
Answer:
It's close, but not perfect, due to the chance of rep exposure.

Rep Exposure

Rep exposure is dangerous, and can happen in subtle ways. Consider the source code for the Period class.

Question:

What is a possible Abstraction Function and Representation Invariant for this class?
Answer:
RI(r) = (r.end.getTime() >= r.start.getTime()) &&
        ((r.end.getTime() - r.start.getTime()) == r.duration)

AF(r) = A Period r represents the time interval from r.start to r.end
         with duration r.duration.

Question:

What will the following code print out?

    Date start = new Date();
    Date end   = new Date(start.getTime() + 1000);
    Period p   = new Period(start, end);
    end.setTime(start.getTime() - 1000);
    System.out.prinln("Period p has duration of " + p.duration() + " ms.");

Answer:

The checkRep() throws a RuntimeException. The problem here is that the Period constructor fails to make defensive copies of its parameters.
Question:
How can you fix the constructor?

Answer:

//repaired constructor - make defensive copies of parameters
public Period(Date start, Date end) {
  this.start = new Date(start.getTime());
  this.end   = new Date(end.getTime());
  
  if (this.start.compareTo(this.end) > 0)
    throw new IllegalArgumentException(start + " after " + end);
  
  checkRep();
}

Question:

Consider the following code; what will be printed out?

//second attack on the internals of a Period instance
Date start = new Date();
Date end   = new Date(start.getTime() + 1000);
Period p   = new Period(start, end);
p.end().setTime(start.getTime() - 1000);
System.out.prinln("Period p has duration of " + p.duration() + " ms.");

Answer:

The checkRep() fails and throws a RuntimeException. The Period accessors fail to make defensive copies of mutable internal fields before returning them.

Question:

How can you fix this problem?

Answer:

//Repaired accessors - make defensive copies of external fields
public Date start() {
  return new Date(start.getTime());
}

public Date end() {
  return new Date(end.getTime());
}

LESSONS

One solution to prevent returned objects from being modified would be to use methods such as unmodifiableCollection or unmodifiableList from the java.util.Collections class. These methods return a new List which is backed by the given argument, but which does not allow mutations (not even though the Iterator).

Question:

Why can't you use these wrappers to "protect" objects which you receive as arguments and then store?
Answer:
Even if you have a wrapped ("safe") version of an object, another piece of code might have a reference to the object itself (with no wrapper), and thus could modify the value.

Note that not all cases of returning or accepting mutable objects will lead to rep exposure. For example, if you were to write an ImmVector class, you could very well specify that the meaning of an element of the vector is its "reference" and not its "abstract value". Then, even if the contained objects were mutated, the abstract value of your ImmVector would still be the same.

Checking the Rep Invariant

You should of course mind the rep invariant while writing your code. Additionally, it's often a great idea to check the rep invariant whevener you can at runtime. You should do this especially when it's inexpensive, and especially if you're debugging.

How do we check the rep invariant?

The book differs from current 6.170 practice. Here is what we suggest:

1) Write a method with this signature: public void checkRep(). In the body of checkRep, test all of the rep invariants. If all invariants hold, do nothing. If an invariant is broken, throw some unchecked exception to indicate failure. The junit.framework.Assert class can make this method especially easy to write.

2) Place calls to checkRep (i) at the exit of every public constructor and (ii) at the entry and exit point of every public method.

In this way, you are asserting that the rep invariant holds whenever control enters any method of the ADT, and also whenever any method is about to return to the user of the class. If some bit of code violates the rep invariant, the error will pop up immediately in a predictable way, instead of at some undetermined point in the future.

Additionally, test drivers may want to call checkRep while performing tests. That way, even if the implementation is not calling checkRep at every point (e.g. for performance reasons), during testing we always want to verify the RI.


Exercises

Matrix

Consider how you would implement this Matrix:
/** Mutable type */
public class Matrix {

  /** @effects creates a new (m x n) matrix where all the cells are null */
  public Matrix(int m, int n);

  /** @effects sets the (i,j)'th element of this to be o **/
  public void set(int i, int j, Object o);

  /** @effects returns the (i,j)'th element of this **/
  public Object get(int i, int j);

  // ...
}
What representations might you use? What if you knew about certain usage properties, such as that most of the matrix contains nulls?

Example answers:

You could do the obvious thing:
private Object[][] elements;

This would be fairly easy to understand. The RI would include the statement that the second-level arrays are all the same size.

Getting a bit more complecated, you could use:
private Object[] elements;
private int columns;

Here, to get to (i, j) you lookup (columns * j + i). The RI would include the statement that elements.length is a multiple of columns.

Both of the previous solutions take up O(m*n) memory. However, if you know that the matrix was sparse, you can do better. Also, the second example makes adding additional colums or rows more difficult (ask students to think about how that method would be implemented using this rep).

One example would be:
private HashMap elements;

The keys of the map would be some immutable type which represented the pair (i,j). This could even be java.lang.Integer, if you map a pair to an int as in the previous example (columns * j + i). This would use memory only as cells are filled, but would take up more space per cell. The get and set methods are still constant-time.

Question:

The specification now includes a getColumn(int) procedure. Which implementations above perform well and which perform poorly. How might you augment each data type to allow for efficient access of rows and columns.
Answer:
The Object[][] implementation can return an Object[] array representing a column in constant-time. Some spare matrix implementations will have a master linked list by both column and by row, which would allow for efficient iteration in any direction. However, this augmentation makes modifying the contents of the matrix less efficient. Use this example to discuss trade-offs.

Common Mistakes from Problem Set 1

Documenting Unit Tests
Comment the module that passed the tests, NOT the test that the modules passed. The test should be an unchanging piece of code, so it should not get edited every time a module that passes the test is written. The module depends on the test, not the other way around. Thus, the module should contain the comment. (Further, sometimes you may not have access to the source code for the test, in which case this point is moot.)
Put the comment about which tests were passed at the top of the class that passes the test suite instead of buried within your code. This makes it easier for other developers to find this information.
The comment should specify which tests were passed (and ideally, when the test was run and by whom, if it is a team project). The comment "passed all tests" is not as helpful to other developers as the comment: "ps1.cards.Deck passed all tests in ps1.cards.DeckTest on 2/12/04."
Formatting Your Code
Java is not a whitespace-delimited language, so please do not treat it like one, even though such syntax may compile. Thus, instead of this:
if (valComp < 0)
  return true;
else if (valComp > 0)
  return false;
else
  return suitComp;
Do this:
if (valComp < 0) {
  return true;
} else if (valComp > 0) {
  return false;
} else {
  return suitComp;
}
In practice, the former is more error-prone, so please avoid it.
Some of you have indentation issues – just remember that every new block should be indented. Thus, instead of:
if (!cards.contains(c)) {
cards.add(c);
}
Do this:
if (!cards.contains(c)) {
  cards.add(c);
}
You can format your code in Eclipse by pressing Ctrl-Shift-F. Please make sure that your code is formatted before submitting it.
Restrict the width of every line to 80 characters. If you use Ctrl-Shift-F in Eclipse, it should take care of this for you. This is an old standard, dating back to times when terminals were only 80 characters wide; however, it is still honored today for various reasons. One of which is that it makes your code easier to read when I print it out!
Performance Issues
Use an Iterator to loop over the elements in a Collection instead of a for loop that uses get(i). When your Collection is a LinkedList, looping over your Collection using get(i) will be quadratic in performance whereas looping over it using an Iterator will be linear in performance.
Minimize the number of casts and calls to instanceof in your code. Both of these operations are relatively expensive, so try to avoid them as much as possible. Instead of this:
public int compareTo(Object o) {
  int valComp = this.getValue().compareTo(((Card)o).getValue());
  int suitComp = this.getSuit().compareTo(((Card)o).getSuit());
  // other stuff...
}
Do this:
public int compareTo(Object o) {
  Card c = (Card)o;
  int valComp = this.getValue().compareTo(c.getValue());
  int suitComp = this.getSuit().compareTo(c.getSuit());
  // other stuff...
}
This improves performance as well as readability.
Avoid redundant code. In the table below, verbose code that I saw in problem sets is on the left with its simplified equivalent on the right:

if (containsObj) {
  return true;
} else {
  return false;
}
return containsObj;
if (!(obj instanceof Card)) {
  throw new ClassCastException();
}
Card c = (Card)obj;
Card c = (Card)obj; // may throw ClassCastException
if (this.compareTo(c) == 0) return 0;
else if (this.compareTo(c) < 0) return -1;
else if (this.compareTo(c) > 0) return 1;
else throw new Exception();
return this.compareTo(c);

Writing an equals() Method for Card
Using compareTo to help implement equals is not the best design decision, although the reason is very subtle. The equals method is special in Java – its specification is explicit and somewhat tricky, but it is extremely important that it is met. Thus, think of implementing equals as being the first priority when implementing Card and implementing Comparable as being second priority. If Card no longer implements Comparable, it must still meet the contract for equals, so equals should be developed independently of compareTo. A good equals method for Card is shown below:
public void equals(Object obj) {
  if (!(obj instanceof Card)) return false;
  Card c = (Card)obj;
  return getValue().equals(c.getValue()) &&
         getSuit().equals(c.getSuit());
}
What the final Keyword Means
Declaring a class final does not mean that it is immutable; instead, it means that no class can subclass that class. Classes are often declared final for security reasons (so they are not overridden maliciously), such as String and Integer. There is no keyword in Java to declare a class as immutable – the best you can do is implement your class so that it is immutable and include a note in the class's Javadoc comment that says it is immutable.

If you declare a method final, it means that it cannot be overwritten by a subclass.

Also, you can declare fields of a class as final, but this does not mean that they are immutable. For example, if you have:

private final HashMap map = new HashMap();
That does not mean that map is immutable. Instead, it means that map cannot be reassigned to another HashMap. However, you can still mutate map with a call to map.put().
Avoiding Rep Exposure
Some of you tried to check if a Card was an ace by checking if its name was "Ace"; however, that would break if its name were changed to "ACE". Instead, test if a card is an ace by doing:
boolean isAce = (card.getValue() == CardValue.ACE);
How to Do listCardsAcesLow() Efficiently
Many of you agonized over how to implement listCardsAcesLow() efficiently. Here is what is likely the most succinct implementation:
public Iterator listCardsAcesLow() {
  CardSuit lowSuit = 
   (CardSuit)CardSuit.SUITS.get(CardSuit.SUITS.size() - 1);
  Card lowAce = new Card(CardValue.ACE, lowSuit);
  SortedSet aces = cards.tailSet(lowAce);
  SortedSet nonAces = cards.headSet(lowAce);
  List acesLow = new ArrayList(aces);
  acesLow.addAll(nonAces);
  return acesLow.iterator();
}
Note that lowSuit is created in a way that it will still work if the order of the suits changes.

$Id: recitation3.html,v 1.13 2004/02/18 19:12:34 mbolin Exp $