# Question # 1: (25 points)

#### Part (a)

Show or list all of the dependencies in this program. For each dependency, indicate which instructions and register are involved. (5 points)



#### Part (b)

The pipelined datapath on the next page shows the fifth cycle of executing this program, including values for several of the stages. Fill in the ten remaining values, marked with a ? symbol, in the EX and MEM stages. (20 points; 2 points each)

- N.B.: Assume that registers initially contain their number plus 200: \$2 contains 202, \$8 contains 208, etc.
- Write your answers directly on the diagram, but write clearly.
- Show decimal values, and write 'X' for any numbers that cannot be determined.



| IPG | FET | ROT | EXP | REN | WLD | REG | EXE | DET | WRB |
|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| 1   | 2   | 3   | 4   | 5   | 6   | 7   | 8   | 9   | 10  |

One CPU manufacturer has proposed the 10-stage pipeline above for a 500MHz (2ns clock cycle) machine. Here are the correspondences between this and the MIPS pipeline:

- Instructions are fetched in the FET stage.
- Register reading is performed in the REG stage.
- ALU operations and memory accesses are both done in the EXE stage.
- Branches are resolved in the DET stage.
- WRB is the writeback stage.

How much time is required to execute one million instructions on this processor, assuming there are no dependencies or branches in the code? (5 points)

Time = (10 + 1000000 - 1) cycles × 2 ns $= 2.000018 \times 10^{6}$ ns $\approx 2$ ms
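As a quick check on this part, the timing of an ideal pipeline (no stalls, no flushes) can be sketched in Python. The function and parameter names are illustrative and not part of the exam; the only inputs are the 10 stages, the 2 ns cycle, and the one million instructions stated in the question.

```python
def pipeline_time_ns(n_instructions: int, n_stages: int, cycle_ns: int) -> int:
    """Ideal pipeline timing: the first instruction needs n_stages cycles,
    and every later instruction completes one cycle after its predecessor."""
    total_cycles = n_stages + (n_instructions - 1)
    return total_cycles * cycle_ns


time_ns = pipeline_time_ns(1_000_000, n_stages=10, cycle_ns=2)
print(time_ns)  # 2000018 ns, i.e. roughly 2 ms
```

The pipeline depth only adds a one-time fill cost of nine cycles, so for long programs the result is dominated by one cycle per instruction.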

#### Part (b)

Without forwarding, how many stall cycles are needed for the following code fragment? (5 points)

lw \$t0, 0(\$a0)

add \$v1, \$t0, \$t0

2 stall cycles
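The count of two stalls can be reproduced from the stage order alone. The sketch below is a hedged illustration, not the exam's official method: it assumes the stage list IPG through WRB given for this pipeline, and it assumes the register file is written in the first half of a cycle and read in the second half, so the dependent instruction's REG stage may share a cycle with the producer's WRB stage. All names are hypothetical.

```python
# Stage order of the proposed 10-stage pipeline (WRB is the writeback stage).
STAGES = ["IPG", "FET", "ROT", "EXP", "REN", "WLD", "REG", "EXE", "DET", "WRB"]


def stalls_without_forwarding(stages, read_stage="REG", write_stage="WRB"):
    """Stall cycles for an instruction that reads a register written by the
    instruction immediately before it, with no forwarding.

    Assumption: a write and a read of the same register can happen in the
    same cycle (write first, then read)."""
    read_pos = stages.index(read_stage)
    write_pos = stages.index(write_stage)
    # The consumer already trails the producer by one cycle, hence the -1.
    return max(0, write_pos - read_pos - 1)


print(stalls_without_forwarding(STAGES))  # 2
```

If the register file could not be read in the same cycle it is written, the same reasoning would give one more stall.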

#### Part (c)

If a branch is mispredicted, how many instructions would have to be flushed from the pipeline? (5 points)

\_\_\_ instructions would have to be flushed from the pipeline.
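One way to reason about the flush count is to count the stages that sit ahead of the stage where branches are resolved. The sketch below assumes the stage order IPG through WRB, one instruction per stage, and that every instruction fetched after a mispredicted branch is on the wrong path; the function name is hypothetical.

```python
# Stage order of the proposed 10-stage pipeline; branches are resolved in DET.
STAGES = ["IPG", "FET", "ROT", "EXP", "REN", "WLD", "REG", "EXE", "DET", "WRB"]


def flushed_on_mispredict(stages, resolve_stage="DET"):
    """Number of younger instructions in flight when a branch reaches the
    stage that resolves it. Each earlier stage holds one such instruction."""
    return stages.index(resolve_stage)


print(flushed_on_mispredict(STAGES))  # 8 (IPG through EXE)
```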

#### Part (d)

Assume that a program executes one million instructions. Of these, 15% are load instructions which stall, and 10% of the instructions are branches. The CPU predicts branches correctly 75% of the time. How much time will it take to execute this program? (10 points)
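A rough model for this part adds stall and flush cycles to the ideal pipeline time. The sketch below is only an illustration under loudly stated assumptions that the question itself does not fix: each stalling load costs 2 cycles (the no-forwarding figure from part (b)), and each mispredicted branch costs 8 cycles (the stages ahead of DET). Both penalties are parameters, so other assumptions can be substituted. Percentages are kept as integers to avoid floating-point rounding.

```python
def program_time_ns(n_instr, n_stages, cycle_ns,
                    load_pct, load_stall_cycles,
                    branch_pct, mispredict_pct, flush_cycles):
    """Total time = ideal pipeline cycles + load stalls + misprediction flushes.

    load_stall_cycles and flush_cycles are ASSUMED penalties, not values
    given in the question."""
    ideal_cycles = n_stages + (n_instr - 1)
    stalling_loads = n_instr * load_pct // 100
    mispredicted = (n_instr * branch_pct // 100) * mispredict_pct // 100
    total_cycles = (ideal_cycles
                    + stalling_loads * load_stall_cycles
                    + mispredicted * flush_cycles)
    return total_cycles * cycle_ns


t = program_time_ns(1_000_000, n_stages=10, cycle_ns=2,
                    load_pct=15, load_stall_cycles=2,   # assumed: 2 stalls per load
                    branch_pct=10, mispredict_pct=25,   # 75% predicted correctly
                    flush_cycles=8)                     # assumed: 8 cycles per flush
print(t)  # 3000018 ns, about 3 ms under these assumptions
```

With these numbers the 150,000 stalling loads add 300,000 cycles and the 25,000 mispredicted branches add 200,000 cycles to the 1,000,009 ideal cycles.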



## Memory Hierarchy

Question #3: (15 points)

Assume a processor has a byte-addressable memory of 512 Kbytes and a cache block size of 4 bytes. If the student architect were to design a direct-mapped cache of size 4 Kbytes, what should the following be:



b) How is the address partitioned for the cache design? Clearly show the fields and summarize the function of each field. (4 points)
$$32 - (10 + 2) = 20$$
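The field widths follow mechanically from the memory size, cache size, and block size. The Python sketch below is a hedged illustration with hypothetical helper names. It takes the address width as a parameter because the result depends on it: the 512-Kbyte memory in the question implies a 19-bit address, while a 32-bit address (as in the 32 - 12 term used in part (d)) gives a wider tag. The storage estimate assumes each cache entry holds the data block, the tag, and one valid bit.

```python
def log2_exact(n: int) -> int:
    """Return log2(n) for a power of two; reject anything else."""
    if n <= 0 or n & (n - 1):
        raise ValueError(f"{n} is not a power of two")
    return n.bit_length() - 1


def cache_fields(address_bits: int, cache_bytes: int, block_bytes: int):
    """Split an address for a direct-mapped cache into (tag, index, offset) bits."""
    offset_bits = log2_exact(block_bytes)                # selects a byte in the block
    index_bits = log2_exact(cache_bytes // block_bytes)  # selects the cache entry
    tag_bits = address_bits - index_bits - offset_bits   # identifies which block is cached
    return tag_bits, index_bits, offset_bits


def storage_bits(address_bits: int, cache_bytes: int, block_bytes: int) -> int:
    """Total bits stored: per entry, data + tag + 1 valid bit (assumed layout)."""
    tag_bits, index_bits, _ = cache_fields(address_bits, cache_bytes, block_bytes)
    return (1 << index_bits) * (8 * block_bytes + tag_bits + 1)


addr_bits = log2_exact(512 * 1024)                # 19 bits for 512 Kbytes
print(cache_fields(addr_bits, 4096, 4))           # (7, 10, 2)
print(cache_fields(32, 4096, 4))                  # (20, 10, 2)
print(storage_bits(addr_bits, 4096, 4) // 8)      # 5120 bytes
print(storage_bits(32, 4096, 4) // 8)             # 6784 bytes
```

In both cases the index is 10 bits (1024 entries) and the byte offset is 2 bits; only the tag width changes with the assumed address size.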

c) Which one of these fields is also stored in cache and why? (3 points)

The tag is stored in the cache alongside each block. On an access, it is compared with the tag bits of the requested address to check that the cached data belongs to that address.

d) What is the total size in bytes for each cache? (4 points)

Total cache size $= 2^{10} \times (m \times 32 + (32 - m - n - 2) + 1)$

$= 2^{10} \times ((32 \times 2) + (32 - 12) + 1)$

$= 85$ KB

Question # 4: (10 points)

A cache currently has an average memory access time of 12 ns, a hit time of 8 ns, and a hit rate of 90%. If doubling the size of the cache increases the hit time by 25% and increases the hit rate to 96%, what will the new average memory access time (AMAT) be?

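AMAT = hit time × hit rate + (miss rate × miss penalty)

miss rate × miss penalty = AMAT - (hit time × hit rate)

$= 12 \times 10^{-9} - (8 \times 10^{-9} \times 90 \times 10^{-2}) = 4.8$ ns

With a 10% miss rate, miss penalty $= 4.8 / 0.10 = 48$ ns.

New hit time factor: 100% + 25% = 1.25, so the new hit time is $1.25 \times 8 = 10$ ns.

new AMAT $= (1.25 \times 8 \times 0.96) + (0.04 \times 48) = 9.6 + 1.92$

$= 11.52$ ns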
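The arithmetic can be checked with a short Python sketch. It follows the weighted form written above (hit time × hit rate + miss rate × miss penalty) and, for comparison, the more common textbook form AMAT = hit time + miss rate × miss penalty, which gives a slightly different result. Function names are illustrative only.

```python
def amat_weighted(hit_time, hit_rate, miss_penalty):
    """AMAT = hit time x hit rate + miss rate x miss penalty (form used above)."""
    return hit_time * hit_rate + (1 - hit_rate) * miss_penalty


def amat_standard(hit_time, hit_rate, miss_penalty):
    """AMAT = hit time + miss rate x miss penalty (common textbook form)."""
    return hit_time + (1 - hit_rate) * miss_penalty


old_amat, old_hit, old_rate = 12.0, 8.0, 0.90
new_hit, new_rate = 1.25 * old_hit, 0.96

# Solve each form for the miss penalty using the original cache's numbers.
penalty_weighted = (old_amat - old_hit * old_rate) / (1 - old_rate)  # about 48 ns
penalty_standard = (old_amat - old_hit) / (1 - old_rate)             # about 40 ns

print(round(amat_weighted(new_hit, new_rate, penalty_weighted), 2))  # 11.52
print(round(amat_standard(new_hit, new_rate, penalty_standard), 2))  # 11.6
```

Either way the larger cache lowers the AMAT below 12 ns, because the drop in miss rate outweighs the slower hit time.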