6.170 / Fall 2002 / Documenting a Software System
Handout S9
Objective
A poorly documented system is not worth much, however well it once
worked. For small, unimportant programs that are only used for a short
period of time, a few comments in the code may be enough. But for most
programs, if the only documentation is the code itself, the program
rapidly becomes obsolete and unmaintainable. A surprise to most
novices is that a small amount of effort on documentation is rewarded
even within the confines of a small project.
Unless you are infallible and live in a world where nothing ever changes,
you will find yourself returning to code
that you have already written, and you'll question decisions you made
earlier in the development. If you don't document your decisions,
you'll find yourself repeating the same mistakes or puzzling out what
you once could have easily described. Not only does lack of
documentation create extra work, but it also tends to hurt the
quality of the code. If you don't have a clear characterization of the
problem, for example, you're unlikely to develop a clean solution.
Learning how to document software is hard, and it requires mature
engineering judgment. Documenting too little is a common mistake, but
the other extreme can be just as bad: If you document too much, the
documentation will be overwhelming to a reader, and a burden to
maintain. It's vital to document only the right things. The
documentation is no help to anyone if its length discourages people
from actually reading it.
Novices are often tempted to focus their efforts on the easy
issues, since these take the least thought to document. But that's a waste of
time; you don't learn anything from the effort, and you end up with
documentation that is worse than useless. Novices also tend to be
reluctant to document problems. This is short-sighted: if you know
that some aspect of your design is not quite right, that some part of
the problem has not been clarified, or that some code is likely to be
buggy, then say so! You'll spare the reader time puzzling over
something that appears to be wrong, you'll remember where to look
yourself if you run into problems, and you'll end up with a more
honest and useful document.
Another issue is when to document. Although it sometimes
makes sense to postpone documentation while performing experiments,
experienced developers tend to document systematically even temporary
code, initial problem analyses, and draft designs. They find that this
makes experimentation more productive. Furthermore, since they've
established habits of documentation, it feels natural to document as
they go along.
This handout gives you guidelines on how to document a software system
such as a 6.170
project. It gives an outline structure and some required elements, but
it leaves much leeway in the details for your own judgment. It is
crucial that you don't treat documentation as a dull, rote affair; if
you do, your documentation will be useless, painful to read, and
painful to write. So document consciously: ask yourself as you do it
why you're doing it, and whether you're spending your time most
effectively.
You should feel free to cut and paste text from any handouts we
have given you into your documentation. In particular, you may want to
use parts of the problem set handout in describing the
requirements. Make sure to indicate clearly, however, any changes you
make, so that your TA will not need to emulate Unix diff by
hand!
Outline
Your document should have the following structure. Rough sizes in pages
are provided for a typical 6.170 project; these are guides, not
requirements.
1. Requirements
The requirements section describes the problem being solved, as well as the
solution. This section of the document is of interest to users as well as
implementors; it should not contain details about the particular
implementation strategy. Other parts of the system documentation will not
be of interest to users, only to implementors, maintainers, and the like.
- Overview (up to 1 page). An explanation of the purpose of
the system and the functionality it provides.
- Revised Specification. If you were given detailed
specifications for the behavior of the system, you may find
certain portions of the system underspecified or unclear. In
this section you should make clear any assumptions you made
about the meaning of the requirements as well as making clear
any extensions or modifications you made to the
requirements.
- User Manual (1 - 5 pages). A detailed description of how the
user can use the system, what operations the user can perform,
what the command line arguments are, etc. Detailed
specifications of formats should be relegated to the
Appendix. Any environmental assumptions should be made explicit
here: For instance, note if the program only runs on certain
platforms, assumes a certain directory hierarchy is present,
assumes certain other applications are present, etc. Along with
the overview, this manual should provide all the information
needed by a user of the system.
- Performance (0.5 page). What resources does the system
require for normal operation, and what space and time can it be
expected to consume?
- Problem Analysis (2 - 10 pages). A clear description of the
underlying problem. This includes the conceptual model behind the
design (and possibly the user interface), if that has not already
been discussed. The problem analysis typically includes one or more
problem object models, definitions of their sets and relations, and a
discussion of any tricky issues. The objects in problem object
models come from the problem domain, not from the code. Object
models should include both diagrams and any essential textual
constraints, and they should be neatly laid out for readability.
This part should also describe alternatives considered but rejected,
with reasons, unresolved issues, or aspects not fully clarified and
to be resolved later.
You may find use cases helpful in writing the revised
specification and/or the user manual. A use case is a specific goal and a
list of the actions that a user performs in order to achieve the goal.
Among other things, a client can examine the list of actions to decide
whether the user interface is reasonable. If the collection of use cases
covers all desired user goals, then the client can have some confidence
that the system will fulfill its objective.
2. Design
The design section of your documentation gives a high-level picture of your
implementation strategy.
- Overview (0.5 - 3 pages). An overview of the design: top-level
organization, particularly interesting design issues, use of libraries
and other third party modules, and pointers to any aspects that
are unsettled or likely to change. Also include problems with
the design: decisions that may turn out to be wrong and
tradeoffs between flexibility and performance that may turn out
to be ill-judged.
- Runtime Structure (1 - 5 pages). A description of the state
structure of the running program, expressed as a code object
model. This model should hide the representations of abstract
data types; its purpose is to show the relationships amongst
objects. Object models should include both diagrams and any
essential textual constraints, and they should be neatly laid
out for readability. Representations of data types should be
explained (along with their abstraction functions and
rep invariants) if those representations are unusual,
particularly complex, or crucial to the overall design. Note
that abstraction functions and rep invariants should still
appear in their natural place in the code itself.
- Module Structure (1 - 5 pages). A description of the
syntactic structure of the program text, expressed as a module
dependency diagram. It should include package structure and should
show Java interfaces as well as classes. It is not necessary to
show dependences on Java API classes. Your MDD should be neatly
laid out for readability. Explain why the particular syntactic
structure was chosen (e.g., introduction of interfaces for
decoupling -- what they decouple and why), and how particular
design patterns were used.
To explain the decomposition and other design decisions, argue that they
contribute to simplicity, extensibility (ease of adding new features),
partitionability (different team members can work on different parts of the
design without communicating constantly), or similar software engineering
goals.
3. Testing
The testing section of your documentation indicates the approach you have
taken to verifying and validating your system. (For a real system, this
might include user tests to determine the system's suitability as a
solution to the problem described in the requirements section, as well as
running test suites to verify the algorithmic correctness of the code.)
Just as you should not convey the design of your system by presenting the
code or even listing the classes, you should not merely list the tests
performed. Rather, discuss how tests were selected, why they are
sufficient, why a reader should believe that no important tests were
omitted, and why the reader should believe that the system will really
operate as desired when fielded.
- Strategy (1 - 2 pages). An explanation of the overall strategy
for testing: Black box and/or glass box, top down and/or bottom up,
kinds of test beds or test drivers used, sources of test data, test
suites, coverage metrics, compile-time checks vs. run-time
assertions, reasoning about your code, etc. You might want to use
different techniques (or combinations of techniques) in different
parts of the program. In each case, justify your decisions.
Explain what classes of errors you expect to find (and not to find!)
with your strategy. Discuss what aspects of the design make it hard
or easy to validate.
- Test results (0.5 - 2 pages). Summary of what testing has been
accomplished and what if any remains: Which modules have been
tested, and how thoroughly? Indicate degree of confidence in
the code: What kinds of fault have been eliminated? What kinds
might remain?
4. Reflection
The reflection (more commonly called "post mortem") section of the document
is where you can generalize from specific failures or successes to rules
that you or others can use in future software development. What surprised
you most? What do you wish you knew when you started? How could you have
avoided problems that you encountered during development?
- Evaluation (0.5 - 1 pages). What you regard as the successes
and failures of the development: unresolved design problems,
performance problems, etc.
Identify which features of your design are the important ones.
Point out design or implementation techniques that you are
particularly proud of.
Discuss what mistakes you made in your design, and the problems that
they caused.
- Lessons (0.2 - 1 pages). What lessons you learned from the
experience: how you might do it differently a second time round, and
how the faults of the design and implementation may be corrected.
Describe factors that caused problems such as missed milestones, or that
led to the known bugs and limitations.
- Known Bugs and Limitations.
In what ways does your implementation fall short of the specification?
Be precise. Although you will lose points for bugs and missing features,
you will receive partial credit for accurately identifying those errors,
and the source of the problem.
5. Appendix
The appendix contains low-level details about the system that are not
necessary in order to understand it at a high level, but are required in
order to use it in practice or verify claims made elsewhere in the
document.
- Formats. A description of all formats assumed or
guaranteed by the program: for file I/O, command line
arguments, user dialogs, message formats for network
communications, etc. These should be broken down into user-visible
formats, which are conceptually part of the user-visible requirements
and user manual, and internal formats that are conceptually part of
other components of your documentation.
- Module Specifications. You should extract the
specifications from your code and present them separately
here. If you write your comments in the style accepted by
Javadoc with the 6.170 doclet,
you'll be able to generate the specification documents
automatically from the code. The specification of an abstract
type should include its overview, specification fields, and
abstract invariants (specification constraints). The
abstraction function and rep invariant are not part of a
type's specification.
- Test cases. Ideally, your testbed reads tests from a file of
test cases in a format that is convenient to read and write. You need
not include very large test cases; for example, you might just note
the size of a random input generated for stress testing, and provide
the program that generated the tests. Indicate for each group of
tests what they are for (e.g., "stress tests, huge inputs",
"partition tests, all combinations of +/-/0 for integer args").
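As an illustration of the "all combinations of +/-/0" group mentioned above, here is a minimal sketch of a partition test. The class PartitionTests and the method under test, signOfProduct, are hypothetical names invented for this example; your own testbed will look different, and reading the cases from a file is left out to keep the sketch short.

```java
/** A hypothetical partition-test sketch: every sign combination of two integer arguments. */
final class PartitionTests {

    /**
     * Method under test (hypothetical).
     * returns: -1, 0, or 1 according to the sign of a * b, computed without overflow
     */
    public static int signOfProduct(int a, int b) {
        return Integer.signum(a) * Integer.signum(b);
    }

    /**
     * Runs signOfProduct on one representative from each of the 3 x 3 sign partitions.
     * returns: the number of partitions in which the result disagreed with the oracle
     */
    public static int runPartitionTests() {
        int[] representatives = { -7, 0, 5 };  // one negative, zero, one positive
        int failures = 0;
        for (int a : representatives) {
            for (int b : representatives) {
                // Oracle: the exact product in 64 bits cannot overflow for int inputs.
                int expected = Long.signum((long) a * b);
                if (signOfProduct(a, b) != expected) {
                    failures++;
                }
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        System.out.println("partition test failures: " + runPartitionTests());
    }
}
```

In the appendix, a one-line note such as "partition tests, all combinations of +/-/0 for both arguments" is enough to tell the reader what this group is for.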
Documenting Code
Abstract data types. Every abstract data type (class
or interface) should have:
- An overview section that gives a one or two line
explanation of what objects of the type represent and
whether they are mutable.
- A list of specification fields. There might be only
one; for example, a set may have the field elems
representing the set of elements. Each field should have a
name, a type, and a short explanation. You may find it
useful to define extra derived fields that make it
easier to write the specifications of methods; for each of
these, you should indicate that it is derived and say how
it is obtained from the other fields. There may be
specification invariants that constrain the possible
values of the specification fields; if so, you should
specify them.
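The following sketch shows what such a type-level comment might look like. The class BoundedIntSet, its fields, and the comment layout are assumptions made for illustration; they are not taken from the specifications handout, and the exact tags expected by the 6.170 doclet may differ.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Overview: a BoundedIntSet is a mutable set of integers that holds at most
 * a fixed number of elements.
 *
 * Specification fields:
 *   elems    : set of int   // the integers currently in the set
 *   capacity : int          // the maximum number of elements allowed
 *
 * Derived specification field:
 *   size : int              // derived: size = |elems|
 *
 * Specification invariant:
 *   0 <= size <= capacity
 */
final class BoundedIntSet {
    private final List<Integer> items = new ArrayList<>();
    private final int limit;

    /**
     * requires: capacity >= 0
     * effects: creates an empty set with the given capacity
     */
    public BoundedIntSet(int capacity) {
        this.limit = capacity;
    }

    /**
     * modifies: this
     * effects: adds x to elems if x is already present or size < capacity
     * returns: true iff x is in elems afterwards
     */
    public boolean insert(int x) {
        if (items.contains(x)) {
            return true;
        }
        if (items.size() >= limit) {
            return false;
        }
        items.add(x);
        return true;
    }

    /** returns: true iff x is in elems */
    public boolean contains(int x) {
        return items.contains(x);
    }

    /** returns: size */
    public int size() {
        return items.size();
    }
}
```

Note that the comment talks only about elems, capacity, and size; the private fields items and limit belong to the representation and are not mentioned in the specification.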
Method Specifications. All public methods of classes
should have specifications; tricky private methods should also be
specified. Method specifications should follow the requires,
modifies, throws, effects, returns structure described in the
specifications
handout and in class. Note that for 6.170, you may
assume arguments are non-null unless otherwise specified.
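As a rough illustration of the requires, modifies, throws, effects, returns structure, consider the sketch below. The class ListOps and the method removeMax are hypothetical, and the plain-text clause labels are an assumption; use whatever notation the specifications handout prescribes.

```java
import java.util.List;
import java.util.NoSuchElementException;

/** Hypothetical utility class used only to illustrate a method specification. */
final class ListOps {

    /**
     * requires: list contains no null elements
     * modifies: list
     * throws: NoSuchElementException if list is empty
     * effects: removes one occurrence of the largest element from list
     * returns: the element that was removed
     */
    public static int removeMax(List<Integer> list) {
        if (list.isEmpty()) {
            throw new NoSuchElementException("list is empty");
        }
        int maxIndex = 0;
        for (int i = 1; i < list.size(); i++) {
            if (list.get(i) > list.get(maxIndex)) {
                maxIndex = i;
            }
        }
        // remove(int) removes by index and returns the removed element.
        return list.remove(maxIndex);
    }
}
```

The specification says nothing about which occurrence is removed when the maximum appears more than once, so the implementation is free to choose; this one happens to remove the first.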
Implementation notes. Class comments should include the
following elements (which, for notable classes, appear also in
the Runtime Structure section of the
design documentation):
- An abstraction function that defines each
specification field in terms of the representation fields.
Abstraction functions are only required for classes which
are abstract data types, and not for classes like
exceptions or some GUI widgets.
- A representation invariant. RIs are required for
any class that has a representation (e.g., not most
exceptions). We strongly recommend that you test invariants
in a checkRep() method where feasible. Take care
to include in your invariants assumptions about what can
and cannot be null.
- For classes with complex representations, a note explaining
the choice of representation (also called the
representation rationale): what tradeoffs were
made and what alternatives were considered and rejected (and why).
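A minimal sketch of these implementation notes follows, assuming a hypothetical IntSet backed by an unsorted list. The field name elts, the rationale text, and the choice of IllegalStateException in checkRep() are illustrative assumptions, not requirements of the course.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

/**
 * Overview: an IntSet is a mutable set of integers.
 *
 * Specification field:
 *   elems : set of int   // the members of the set
 */
final class IntSet {
    // Abstraction function:
    //   elems = { elts.get(i) | 0 <= i < elts.size() }
    //
    // Representation invariant:
    //   elts != null
    //   elts contains no null elements
    //   elts contains no duplicates
    //
    // Representation rationale (hypothetical): an unsorted list keeps insert
    // simple. A sorted array would allow binary search in contains, but
    // insert would then have to shift elements.
    private final List<Integer> elts = new ArrayList<>();

    /**
     * modifies: this
     * effects: adds x to elems
     */
    public void insert(int x) {
        checkRep();
        if (!elts.contains(x)) {
            elts.add(x);
        }
        checkRep();
    }

    /** returns: true iff x is in elems */
    public boolean contains(int x) {
        return elts.contains(x);
    }

    /** returns: the number of elements in elems */
    public int size() {
        return elts.size();
    }

    /** Throws IllegalStateException if the representation invariant does not hold. */
    private void checkRep() {
        if (elts == null) {
            throw new IllegalStateException("elts is null");
        }
        if (elts.contains(null)) {
            throw new IllegalStateException("elts contains a null element");
        }
        if (new HashSet<>(elts).size() != elts.size()) {
            throw new IllegalStateException("elts contains duplicates");
        }
    }
}
```

Calling checkRep() at the start and end of each mutator catches a broken invariant close to the code that broke it. This version takes time linear in the size of the set, so for large representations you may want to check less often.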
Runtime assertions. These should be used judiciously,
as explained in lecture. For a longer discussion of how
runtime assertions can improve the quality of your code, see
Writing Solid Code by Steve Maguire, Microsoft Press,
1995.
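One possible form of runtime assertion is sketched below using the Java assert statement, which the JVM evaluates only when started with the -ea flag. The class IntMath and the method isqrt are hypothetical, and the choice to assert both the precondition and the postcondition is an assumption about what "judicious" means here, not a rule from lecture.

```java
/** Hypothetical example of runtime assertions guarding a small computation. */
final class IntMath {

    /**
     * requires: n >= 0
     * returns: the largest integer r such that r * r <= n
     */
    public static int isqrt(int n) {
        // Checked only when assertions are enabled (java -ea).
        assert n >= 0 : "precondition violated: n = " + n;

        int r = 0;
        // Widen to long so that (r + 1) * (r + 1) cannot overflow.
        while ((long) (r + 1) * (r + 1) <= n) {
            r++;
        }

        assert (long) r * r <= n && (long) (r + 1) * (r + 1) > n
                : "postcondition violated: r = " + r;
        return r;
    }

    public static void main(String[] args) {
        System.out.println(isqrt(17));
    }
}
```

Because disabled assertions cost nothing at run time, they suit checks that should never fail in correct code. Conditions that a caller can legitimately trigger are better reported with an exception named in the throws clause.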
Comments. Your code should be commented carefully and
tastefully. Stylistic guidelines are available in the handout Java Style Guide. For an excellent
discussion of commenting style and for much good advice in
general about programming, see The Practice of Programming
by Brian W. Kernighan and Rob Pike, Addison-Wesley, Inc., 1999.
For problems or questions regarding this page, contact: 6.170-webmaster@mit.edu.
$Id: documentation.html,v 1.2 2002/08/27 15:10:47 decouto Exp $