#### Example device: A Buffer



Static Discipline requires that we avoid the shaded regions (aka "forbidden zones"), which correspond to *valid* inputs but *invalid* outputs. Net result: combinational devices must have GAIN > 1 and be NONLINEAR.

#### Due to unavoidable delays...

Propagation delay (t<sub>PD</sub>):
An UPPER BOUND on the delay from valid inputs to valid outputs.



#### Contamination Delay

an optional, additional timing spec

INVALID inputs take time to propagate, too...



Do we really need t<sub>CD</sub>?

Usually not... it'll be important when we design circuits with registers (coming soon!)

If  $t_{CD}$  is not specified, safe to assume it's 0.

CONTAMINATION DELAY, t<sub>CD</sub>

A LOWER BOUND on the delay from any invalid input to an invalid output

#### The Combinational Contract

Inverter (input A, output B):

| A | B |
|---|---|
| 0 | 1 |
| 1 | 0 |

t<sub>PD</sub>: propagation delay; t<sub>CD</sub>: contamination delay



1. No Promises during XXXXX

2. Default (conservative) spec:  $t_{CD} = 0$ 

## Example: Timing Analysis

If NAND gates have $t_{PD} = 4\,\text{ns}$ and $t_{CD} = 1\,\text{ns}$:

t<sub>CD</sub> is the *minimum* cumulative contamination delay over all paths from inputs to outputs

$$t_{CD} = 2\,\text{ns}$$

t<sub>PD</sub> is the *maximum* cumulative propagation delay over all paths from inputs to outputs

$$t_{PD} = 12\,\text{ns}$$
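
The max/min rule can be sketched in a few lines of Python. The slide's schematic is not reproduced here, so the netlist below is a hypothetical three-NAND circuit, chosen so that its longest path has three gates and its shortest has two, which reproduces the 12 ns and 2 ns figures. The names `NETLIST` and `path_delay` are illustrative assumptions.

```python
# Hypothetical netlist (assumption: not the slide's actual circuit).
# Each gate maps to the signals feeding it; names that are not keys
# of the dict are primary inputs.
NETLIST = {
    "g1": ["A", "B"],
    "g2": ["g1", "C"],
    "out": ["g2", "g1"],
}
T_PD_NAND = 4  # ns, from the example
T_CD_NAND = 1  # ns, from the example


def path_delay(node, gate_delay, pick):
    """Cumulative delay from the primary inputs to `node`.

    `pick` is max for propagation delay and min for contamination delay.
    """
    if node not in NETLIST:  # primary input: no gate delay yet
        return 0
    return gate_delay + pick(
        path_delay(src, gate_delay, pick) for src in NETLIST[node]
    )


t_pd = path_delay("out", T_PD_NAND, max)  # longest path: 3 gates
t_cd = path_delay("out", T_CD_NAND, min)  # shortest path: 2 gates
print(t_pd, t_cd)  # 12 2
```

Passing `max` or `min` as the `pick` argument keeps the two specs symmetric: the traversal is identical, and only the way sibling paths are combined differs.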

## The "perfect" logic family

- Good noise margins (want a "step" VTC)
- Implement useful selection of (binary) logic
  - INVERTER, NAND, NOR with modest fan-in (4? Inputs)
  - More complex logic in a single step? (minimize delay)
- Small physical size
  - Shorter signal transmission distances (faster)
  - Cost proportional to size (cheaper)
- Inexpensive to manufacture
  - "print" technology (lithographic masks, deposition, etching)
  - Large-scale integration
- Minimal power consumption
  - Portable
  - Massive processing without meltdown

6.111 Fall 2006

#### Transistor-transistor Logic (TTL)



NPN BJT  $I_{CE} = \beta I_{BE}$ 







TTL w/ totem pole outputs
("on" threshold = 2 diode drops)





74LS04 (courtesy TI)

#### TTL Signaling

- Typical TTL signaling spec
  - $I_{OL} = 16\,\text{mA}$, $I_{OH} = -0.4\,\text{mA}$ ($V_{OL} = 0.4\,\text{V}$, $V_{OH} = 2.7\,\text{V}$, $V_{CC} = 5\,\text{V}$)
  - $I_{IL} = -1.6\,\text{mA}$, $I_{IH} = 0.04\,\text{mA}$ ($V_{IL} = 0.8\,\text{V}$, $V_{IH} = 2.0\,\text{V}$)
  - Switching threshold = 1.3V
- Each input requires current flow ($I_{IL}, I_{IH}$) and each output can only source/sink a certain amount of current ($I_{OL}, I_{OH}$), so the max number of inputs that can be driven by a single output is $\min(-I_{OL}/I_{IL},\ -I_{OH}/I_{IH}) \approx 10$.
- Current-based logic $\rightarrow$ power dissipation even in steady state, limitations on fanout
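
As a quick check of the fanout figure, here is a minimal sketch using the currents from the spec above. Only magnitudes are used, since the signs just indicate the direction of current flow. The constant and function names are illustrative, and values are kept in integer microamps to avoid floating-point rounding.

```python
# Current magnitudes in microamps, taken from the typical TTL spec above.
I_OL_UA = 16_000  # a low output can sink 16 mA
I_OH_UA = 400     # a high output can source 0.4 mA
I_IL_UA = 1_600   # each input driven low sources 1.6 mA
I_IH_UA = 40      # each input driven high draws 0.04 mA


def max_fanout(i_ol, i_il, i_oh, i_ih):
    """Largest number of inputs one output can drive in both logic states."""
    return min(i_ol // i_il, i_oh // i_ih)


fanout = max_fanout(I_OL_UA, I_IL_UA, I_OH_UA, I_IH_UA)
print(fanout)  # 10
```

Both the low-state and high-state limits come out to 10 here, so neither state is the sole bottleneck for this spec.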

#### Complementary MOS Logic



MOSFET: I<sub>DS</sub> = f(V<sub>GS</sub>, V<sub>DS</sub>)





#### CMOS Inverter VTC



## CMOS Signaling

- Typical CMOS signaling specifications:
  - $V_{OL} \approx 0, V_{OH} \approx V_{DD}$  ( $V_{DD}$  is the power supply voltage)
  - $V_{IL} \approx \text{just under } V_{DD}/2, V_{IH} \approx \text{just over } V_{DD}/2$
  - Great noise margins! ~V<sub>DD</sub>/2
- Inputs electrically isolated from outputs:
  - An output can drive many, many inputs without violating signaling spec (but transitions will get slower)
- In the steady state, signals are either "0" or "1"
  - When $V_{OUT} = 0\,\text{V}$, $I_{PD} = 0$ (and $I_{PU} = 0$ since pullup is off)
  - When $V_{OUT} = V_{DD}$, $I_{PU} = 0$ (and $I_{PD} = 0$ since pulldown is off)
  - No power dissipated in steady state!
  - Power dissipated only when signals change (i.e., power proportional to operating frequency).


## Multiple interconnect layers

IBM photomicrograph ( $SiO_2$  has been removed!)



MOSFET (under polysilicon gate)

#### Big Issue 1: Wires



- Today (i.e., 100nm): $\tau_{RC} \approx 50\,\text{ps/mm}$
  - Implies > 1 ns to traverse a 20mm $\times$ 20mm chip
  - This is a long time in a 2GHz processor
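
The arithmetic behind that claim, as a rough sketch. It assumes delay scales linearly with distance, as the per-mm figure suggests, and considers only a single chip edge; a corner-to-corner route would be longer. Variable names are illustrative.

```python
TAU_RC_PS_PER_MM = 50     # wire delay per mm, from the slide
CHIP_EDGE_MM = 20         # chip edge length, from the slide
CLOCK_HZ = 2_000_000_000  # 2 GHz, from the slide

edge_delay_ps = TAU_RC_PS_PER_MM * CHIP_EDGE_MM  # delay along one chip edge
clock_period_ps = 10**12 // CLOCK_HZ             # one clock period in ps
cycles_per_edge = edge_delay_ps / clock_period_ps

print(edge_delay_ps, clock_period_ps, cycles_per_edge)  # 1000 500 2.0
```

Under these assumptions a signal needs about two full clock cycles just to cross one edge of the chip.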

#### Big Issue 2: Power



- Energy dissipated = $C V_{DD}^2$ per gate
- Power consumed = $f n C V_{DD}^2$ per chip, where
  - f = frequency of charge/discharge
  - n = number of gates per chip
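
A small sketch of the power formula, showing why lowering $V_{DD}$ matters so much. The numeric values for f, n and C are hypothetical, chosen only for illustration, and `switching_power` is an illustrative name.

```python
def switching_power(f_hz, n_gates, c_farads, vdd):
    """Dynamic power P = f * n * C * Vdd^2, in watts."""
    return f_hz * n_gates * c_farads * vdd ** 2


# Hypothetical operating point (assumption, not from the slides).
base = switching_power(f_hz=1e9, n_gates=1e7, c_farads=1e-14, vdd=1.2)
half_vdd = switching_power(f_hz=1e9, n_gates=1e7, c_farads=1e-14, vdd=0.6)

print(round(base, 3))             # watts at Vdd = 1.2 V
print(round(half_vdd / base, 3))  # fraction of power left at half the supply
```

Because power depends on the square of the supply voltage, halving $V_{DD}$ cuts power to a quarter, while halving f or n only halves it.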

#### Unfortunately...



**32 Amps** (@220v)

- Modern chips (UltraSparc III, Power4, Itanium 2) dissipate from 80W to 150W with a Vdd ≈ 1.2V (power supply current is ≈ 100 Amps)
- Cooling challenge is like making the filament of a 100W incandescent lamp cool to the touch!
- Worse yet...
  - Little room left to reduce Vdd
  - nC and f continue to grow

MIT Computation Center and Pizzeria

I've got the solution!



Hey: could we somehow recycle the charge?



#### CMOS Gate Recipe: Think Switches



## Beyond Inverters: Complementary pullups and pulldowns

Now you know what the "C" in CMOS stands for!

We want *complementary* pullup and pulldown logic, i.e., the pulldown should be "on" when the pullup is "off" and vice versa.

| pullup | pulldown | $F(A_1, \ldots, A_n)$ |
|--------|----------|---------------|
| on     | off      | driven "1"    |
| off    | on       | driven "0"    |
| on     | on       | driven "X"    |
| off    | off      | no connection |

Since there's plenty of capacitance on the output node, when the output becomes disconnected it "remembers" its previous voltage - at least for a while. The "memory" is the load capacitor's charge. Leakage currents will cause eventual decay of the charge (that's why DRAMs need to be refreshed!).

What a nice  $V_{OH}$  you have...

#### CMOS complements



Thanks. It runs in the family...



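| nfet network | complementary pfet network |
|--------------|----------------------------|
| single nfet: conducts when $V_{GS}$ is high | single pfet: conducts when $V_{GS}$ is low |
| series: conducts when A is high and B is high: $A \cdot B$ | parallel: conducts when A is low or B is low: $\overline{A} + \overline{B} = \overline{A \cdot B}$ |
| parallel: conducts when A is high or B is high: $A + B$ | series: conducts when A is low and B is low: $\overline{A} \cdot \overline{B} = \overline{A + B}$ |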
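
The two identities are De Morgan's laws, and with only two inputs they can be checked exhaustively. This is a minimal sketch that models each network as a Boolean "conducts" condition; the function name is illustrative.

```python
from itertools import product


def networks_are_complementary():
    """Check both series/parallel pairs over every input combination."""
    for a, b in product([False, True], repeat=2):
        series_n = a and b               # series nfets: A high AND B high
        parallel_p = (not a) or (not b)  # parallel pfets: A low OR B low
        parallel_n = a or b              # parallel nfets: A high OR B high
        series_p = (not a) and (not b)   # series pfets: A low AND B low
        if parallel_p == series_n or series_p == parallel_n:
            return False                 # both on or both off for some input
    return True


print(networks_are_complementary())  # True
```

For every input, exactly one network of each pair conducts, so the output is always driven and never shorted.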

## A pop quiz!



What function does this gate compute?

| A | B | C |
|---|---|---|
| 0 | 0 | 1 |
| 0 | 1 | 1 |
| 1 | 0 | 1 |
| 1 | 1 | 0 |

Answer: NAND




#### Here's another...





What function does this gate compute?

| A | B | C |
|---|---|---|
| 0 | 0 | 1 |
| 0 | 1 | 0 |
| 1 | 0 | 0 |
| 1 | 1 | 0 |

Answer: NOR



## General CMOS gate recipe

Step 1. Figure out pulldown network that does what you want, e.g., $F = \overline{A \cdot (B+C)}$ (What combination of inputs generates a low output?)

Step 2. Build the complementary pullup network: replace nfets with pfets, series subnets with parallel subnets, and parallel subnets with series subnets.





Step 3. Combine pfet pullup network from Step 2 with nfet pulldown network from Step 1 to form fully-complementary CMOS gate.
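
The recipe can be sketched at the switch level for the example function, reading it as "the output is low exactly when $A \cdot (B+C)$ is true". The pulldown is A in series with (B parallel C), and the pullup assumed here is its series/parallel dual built from pfets. Function names are illustrative.

```python
from itertools import product


def pulldown(a, b, c):
    """nfet network: A in series with (B in parallel with C)."""
    return bool(a and (b or c))


def pullup(a, b, c):
    """pfet network: A in parallel with (B in series with C).

    pfets conduct when their input is low, hence the negations.
    """
    return bool((not a) or ((not b) and (not c)))


def gate_output(a, b, c):
    up, down = pullup(a, b, c), pulldown(a, b, c)
    # Fully complementary: exactly one network conducts for every input.
    assert up != down, "pullup and pulldown must never agree"
    return 1 if up else 0


table = {bits: gate_output(*bits) for bits in product([0, 1], repeat=3)}
for (a, b, c), f in table.items():
    print(a, b, c, "->", f)
```

The internal assertion never fires, which confirms that the two networks are complementary for all eight input combinations.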



Looks pretty easy to do!



#### Basic Gate Repertoire

Are we sure we have all the gates we need? Just how many two-input gates are there?

| AND |   | OR |   | NAND |   | NOR |   |
|----|---|----|---|----|---|----|---|
| AB | Y | AB | Y | AB | Y | AB | Y |
| 00 | 0 | 00 | 0 | 00 | 1 | 00 | 1 |
| 01 | 0 | 01 | 1 | 01 | 1 | 01 | 0 |
| 10 | 0 | 10 | 1 | 10 | 1 | 10 | 0 |
| 11 | 1 | 11 | 1 | 11 | 0 | 11 | 0 |



Hmmmm... all of these have 2-inputs (no surprise)
... each with 4 combinations, giving 2<sup>2</sup> output cases

How many ways are there of assigning 4 outputs? $2^{2^2} = 2^4 = 16$
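
The count can be confirmed by brute force. This sketch enumerates every way to fill the output column of a 2-input truth table; variable names are illustrative.

```python
from itertools import product

input_rows = list(product([0, 1], repeat=2))  # AB = 00, 01, 10, 11
# One candidate gate per assignment of an output bit to each input row.
truth_tables = list(product([0, 1], repeat=len(input_rows)))

print(len(input_rows), len(truth_tables))  # 4 16

# A few familiar gates, written as outputs for AB = 00, 01, 10, 11.
AND, OR, NAND, NOR = (0, 0, 0, 1), (0, 1, 1, 1), (1, 1, 1, 0), (1, 0, 0, 0)
print(all(g in truth_tables for g in (AND, OR, NAND, NOR)))  # True
```

The same enumeration with `repeat=3` for the inputs yields $2^{2^3} = 256$ possible 3-input gates, so the count grows very quickly with fan-in.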

#### There are only so many gates

There are only 16 possible 2-input gates ... some we know already, others are just silly

| INPUT AB | ZERO | AND | A>B | A | B>A | B | XOR | OR | NOR | XNOR | NOT 'B' | B<=A | NOT 'A' | A<=B | NAND | ONE |
|----|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 00 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 01 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 |
| 10 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | 1 |
| 11 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 1 |

How many of these gates can be implemented using a single CMOS gate?



CMOS gates are inverting; we can always respond positively to positive transitions by cascaded gates. But suppose our logic yielded cheap *positive* functions, while inverters were expensive...


Fortunately, we can get by with a few basic gates...

AND, OR, and NOT are sufficient... (cf Boolean Expressions):

How many different gates do we really need?

#### One will do!

#### NANDs and NORs are universal:
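
One way to see the NAND half of this claim is to build NOT, AND and OR out of nothing but a 2-input NAND and check them against their definitions. This is a sketch of one standard construction, not necessarily the one drawn on the slide; the NOR case is symmetric.

```python
from itertools import product


def nand(a, b):
    return 1 - (a & b)


def not_(a):
    return nand(a, a)              # tie both inputs together


def and_(a, b):
    return not_(nand(a, b))        # invert the NAND


def or_(a, b):
    return nand(not_(a), not_(b))  # De Morgan: A + B = NOT(NOT A * NOT B)


for a, b in product([0, 1], repeat=2):
    assert not_(a) == 1 - a
    assert and_(a, b) == (a & b)
    assert or_(a, b) == (a | b)
print("NOT, AND and OR all built from NAND")
```

Since AND, OR and NOT are sufficient to express any Boolean function, being able to build all three from NAND makes NAND sufficient on its own.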



Ah!, but what if we want more than 2 inputs?

# I think that I shall never see a circuit lovely as...



N-input TREE has  $O(\log N)$  levels... Signal propagation takes  $O(\log N)$  gate delays.
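
The logarithmic depth can be illustrated for an associative function such as AND. This sketch reduces N inputs pairwise with 2-input ANDs and counts the levels. It does not address the question of whether every N-input function can be built this way, and `tree_and` is an illustrative name.

```python
def tree_and(bits):
    """AND together `bits` using only 2-input ANDs.

    Returns (value, levels), where `levels` is the depth of the tree.
    """
    level, depth = list(bits), 0
    while len(level) > 1:
        # Pair up neighbours; each pair costs one 2-input gate.
        nxt = [level[i] & level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])  # odd input passes through to the next level
        level, depth = nxt, depth + 1
    return level[0], depth


print(tree_and([1] * 8))         # (1, 3)
print(tree_and([1, 1, 0, 1, 1])) # (0, 3)
```

Each level roughly halves the number of signals, so the depth is $\lceil \log_2 N \rceil$, compared with N - 1 levels for a simple chain of gates.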

Question: Can EVERY N-Input Boolean function be implemented as a tree of 2-input gates?

## Here's a Design Approach

#### Truth Table

| C | B | A | Y |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 0 | 0 | 1 | 1 |
| 0 | 1 | 0 | 0 |
| 0 | 1 | 1 | 1 |
| 1 | 0 | 0 | 0 |
| 1 | 0 | 1 | 0 |
| 1 | 1 | 0 | 1 |
| 1 | 1 | 1 | 1 |

1) Write out our functional spec as a truth table
2) Write down a Boolean expression for every '1' in the output

$$Y = \overline{C}\,\overline{B}A + \overline{C}BA + CB\overline{A} + CBA$$

3) Wire up the gates, call it a day, and declare success!
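
The steps above can be sketched mechanically: collect the rows where Y is 1, turn each into a product term, and OR the terms together. The truth table is copied from above. The helper names and the `~` / `*` text notation for NOT / AND are illustrative choices.

```python
# Y for each (C, B, A), copied from the truth table above.
TRUTH = {
    (0, 0, 0): 0, (0, 0, 1): 1, (0, 1, 0): 0, (0, 1, 1): 1,
    (1, 0, 0): 0, (1, 0, 1): 0, (1, 1, 0): 1, (1, 1, 1): 1,
}
NAMES = ("C", "B", "A")


def minterms(truth):
    """Input rows for which the output is 1."""
    return [row for row, y in truth.items() if y == 1]


def sop_text(truth):
    """Render the sum-of-products expression as text."""
    terms = []
    for row in minterms(truth):
        literals = [n if bit else f"~{n}" for n, bit in zip(NAMES, row)]
        terms.append("*".join(literals))
    return " + ".join(terms)


def sop_eval(truth, c, b, a):
    """Evaluate the expression: OR over the ANDed product terms."""
    return int(any(
        all(x == bit for x, bit in zip((c, b, a), row))
        for row in minterms(truth)
    ))


print(sop_text(TRUTH))  # ~C*~B*A + ~C*B*A + C*B*~A + C*B*A
```

Evaluating `sop_eval` on all eight rows reproduces the Y column, which is the sense in which this approach always works, even though the resulting expression is usually not the smallest one.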

This approach will always give us Boolean expressions in a particular form: SUM-OF-PRODUCTS.

## Straightforward Synthesis

We can implement SUM-OF-PRODUCTS with just three levels of logic: INVERTERS/AND/OR



Propagation delay: no more than "3" gate delays (well, it's actually O(log N) gate delays)

6.111 Fall 2006 Lecture 2, Slide 28