- Exam results
  - mean 40.4 (67%)
  - std dev 10.3 (17%)
  - 25% point 33.5 (56%)
  - median 44.3 (74%)
  - 75% point 48 (80%)

- Assignment this week
  - Add a shift instruction: `srl rd, rt, shamt`

| opcode | rs | rt | rd | shamt | funct |
|---|---|---|---|---|---|
| 000000 | rs | rt | rd | shamt (5 bits) | 000010 |
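
The field layout above can be packed into a 32-bit word with shifts and ORs. The following is a minimal sketch, not part of the assignment; the function name `encode_srl` is hypothetical, and setting the `rs` field to zero is an assumption (the instruction does not read it).

```python
def encode_srl(rd: int, rt: int, shamt: int) -> int:
    """Pack `srl rd, rt, shamt` into a 32-bit R-format word."""
    for name, value in (("rd", rd), ("rt", rt), ("shamt", shamt)):
        if not 0 <= value < 32:
            raise ValueError(f"{name} must fit in 5 bits, got {value}")
    opcode = 0b000000  # R-format
    rs = 0             # assumption: rs is unused by srl, so leave it zero
    funct = 0b000010   # function code from the encoding above
    return (
        (opcode << 26) | (rs << 21) | (rt << 16)
        | (rd << 11) | (shamt << 6) | funct
    )


# Example: shift register 9 right by 4 bits into register 8.
word = encode_srl(rd=8, rt=9, shamt=4)
print(f"{word:032b}")
```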

Seems pretty ad hoc.

Different cycles take different amounts of time

- The clock can run no faster than the slowest of them allows

Phases of instruction execution

- Instruction fetch & decode
- Operand values
- Result calculation
- Result writeback

We would like to get these separated from one another,

- which means storing state between phases

The most popular processor architecture in the world is a single cycle one

No condition codes

- What are condition codes
  - N - negative: bit 31 of most recent result
  - Z - zero: NOR of all bits of result
  - C - carry:
    - addition: carry out of MSB (unsigned overflow)
    - subtraction: borrow in MSB (unsigned underflow)
    - shift: last bit shifted out
    - otherwise unchanged
  - V - overflow:
    - addition/subtraction: overflow in TCI (two's complement integer) arithmetic
    - otherwise unchanged
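
To make the flag definitions concrete, here is a minimal sketch that computes N, Z, C and V for 32-bit addition and subtraction. The function names `add_flags` and `sub_flags` are hypothetical, and C after subtraction follows the borrow convention listed above (some architectures store the inverse).

```python
MASK = 0xFFFFFFFF  # 32-bit word


def add_flags(a: int, b: int) -> dict:
    """Flags after the 32-bit addition a + b."""
    a &= MASK
    b &= MASK
    full = a + b
    result = full & MASK
    return {
        "result": result,
        "N": (result >> 31) & 1,   # bit 31 of the result
        "Z": int(result == 0),     # 1 only when every result bit is 0
        "C": (full >> 32) & 1,     # carry out of the MSB
        # Signed overflow: operands share a sign, result has the other sign.
        "V": ((~(a ^ b) & (a ^ result)) >> 31) & 1,
    }


def sub_flags(a: int, b: int) -> dict:
    """Flags after the 32-bit subtraction a - b (C means borrow)."""
    a &= MASK
    b &= MASK
    result = (a - b) & MASK
    return {
        "result": result,
        "N": (result >> 31) & 1,
        "Z": int(result == 0),
        "C": int(a < b),           # borrow: unsigned underflow
        # Signed overflow: operands differ in sign, result sign differs from a.
        "V": (((a ^ b) & (a ^ result)) >> 31) & 1,
    }


print(add_flags(0x7FFFFFFF, 1))  # largest positive value plus one
print(sub_flags(0, 1))           # unsigned underflow
```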

MIPS instead uses a set instruction (such as `slt`) followed by a conditional branch

Separating the phases

Store instruction in register

- You can now put a different address on the address lines
- The only restriction is that you can't latch a new instruction into the instruction register until you are finished with the contents of the instruction register
- You can also change the program counter
  - probably using the ALU, which is not yet being used by any of your arguments
- Latch the new program counter into the PC register at the same time you latch the instruction into the instruction register.
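
The simultaneous latching described above can be modelled in software by computing both next values first and committing them together. This is a minimal sketch only; the class `FetchStage`, the dictionary memory model and the increment of 4 (one 32-bit instruction) are assumptions made for illustration.

```python
class FetchStage:
    """Toy model of the fetch phase: PC and instruction register (IR)."""

    def __init__(self, memory: dict, pc: int = 0):
        self.memory = memory  # assumed model: address -> 32-bit word
        self.pc = pc
        self.ir = 0

    def clock_edge(self) -> None:
        # Combinational part: both next values depend on the OLD pc.
        next_ir = self.memory[self.pc]
        next_pc = self.pc + 4  # assumption: 4-byte instructions
        # Sequential part: both registers latch on the same edge.
        self.ir, self.pc = next_ir, next_pc


stage = FetchStage({0: 0x00094102, 4: 0x8C000000})
stage.clock_edge()
print(hex(stage.ir), stage.pc)
```

Computing `next_ir` and `next_pc` before assigning either register mirrors the hardware constraint that neither latch sees the other's new value during the same edge.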

Increment PC using ALU

Put operands into registers, which are input to

- ALU
- Memory

Highest six bits of the instruction are the opcode

- For opcode 0, lowest six bits select function

Control logic needs to accept opcode (26:31) and function (0:5)

- and output all the control signals
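
Pulling the two fields out of an instruction word is a shift and a mask. A minimal sketch follows; the helper names `opcode_of` and `funct_of` are hypothetical.

```python
def opcode_of(instr: int) -> int:
    """Bits 31..26 of a 32-bit instruction word."""
    return (instr >> 26) & 0x3F


def funct_of(instr: int) -> int:
    """Bits 5..0 of a 32-bit instruction word."""
    return instr & 0x3F


# An R-format word: opcode 000000, function field 000010.
instr = 0x00094102
print(f"{opcode_of(instr):06b} {funct_of(instr):06b}")
```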

Split into two stages

- From opcode only generate all control signals except ALUctl, plus ALUop
- From ALUop plus function generate ALUctl

| Signal | 0 | 1 |
|---|---|---|
| RegDst | rt | rd |
| RegWrite | n/a | write register |
| ALUSrc | register | instruction |
| Branch | no branch | branch |
| MemRead | n/a | read memory |
| MemWrite | n/a | write memory |
| MemToReg | write register from ALU | write register from memory |
| ALUOp0 | not branch | branch |
| ALUOp1 | not R-type | R-type |

| Opcode | Instruction | Assert |
|---|---|---|
| 100011 | lw | ALUSrc, MemToReg, RegWrite, MemRead |
| 101011 | sw | ALUSrc, MemWrite |
| 000100 | beq | Branch, ALUOp0 |
| 000000 | R-format | RegDst, RegWrite, ALUOp1 |
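
The first control stage is a pure lookup from opcode to signals, so it can be sketched as a table. This is an illustrative model only; the names `SIGNALS`, `ASSERTED` and `main_control` are hypothetical, and every signal not listed for an opcode is taken to be 0.

```python
SIGNALS = (
    "RegDst", "RegWrite", "ALUSrc", "Branch",
    "MemRead", "MemWrite", "MemToReg", "ALUOp1", "ALUOp0",
)

# Opcode -> signals asserted, copied from the table above.
ASSERTED = {
    0b100011: {"ALUSrc", "MemToReg", "RegWrite", "MemRead"},  # lw
    0b101011: {"ALUSrc", "MemWrite"},                         # sw
    0b000100: {"Branch", "ALUOp0"},                           # beq
    0b000000: {"RegDst", "RegWrite", "ALUOp1"},               # R-format
}


def main_control(opcode: int) -> dict:
    """Return every control signal as 0 or 1 for a supported opcode."""
    if opcode not in ASSERTED:
        raise ValueError(f"unsupported opcode {opcode:06b}")
    on = ASSERTED[opcode]
    return {name: int(name in on) for name in SIGNALS}


print(main_control(0b100011))  # signals for lw
```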

| Operation | ALUOp | Funct | Action | ALUCtl |
|---|---|---|---|---|
| beq | 01 | XXXXXX | subtract | 110 |
| add | 10 | 100000 | add | 010 |
| sub | 10 | 100010 | subtract | 110 |
| and | 10 | 100100 | AND | 000 |
| or | 10 | 100101 | OR | 001 |
| slt | 10 | 101010 | set on less than | 111 |
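
The second control stage can be sketched the same way. The names `R_TYPE_ALUCTL` and `alu_control` are hypothetical. The ALUOp 00 case is not in the table above; it is an assumption here that loads and stores (which assert neither ALUOp bit) select add for the address calculation.

```python
# Function field -> ALUCtl for ALUOp = 10, copied from the table above.
R_TYPE_ALUCTL = {
    0b100000: 0b010,  # add
    0b100010: 0b110,  # sub
    0b100100: 0b000,  # and
    0b100101: 0b001,  # or
    0b101010: 0b111,  # slt
}


def alu_control(alu_op: int, funct: int) -> int:
    """Map the 2-bit ALUOp plus the function field to the 3-bit ALUCtl."""
    if alu_op == 0b01:   # beq: function field is a don't-care
        return 0b110     # subtract
    if alu_op == 0b10:   # R-type: the function field decides
        if funct not in R_TYPE_ALUCTL:
            raise ValueError(f"unsupported function {funct:06b}")
        return R_TYPE_ALUCTL[funct]
    if alu_op == 0b00:   # assumption: lw/sw add base and offset
        return 0b010
    raise ValueError(f"unsupported ALUOp {alu_op:02b}")


print(f"{alu_control(0b10, 0b101010):03b}")  # slt
```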

Suppose

- memory units 200 picoseconds
- ALUs 100 ps
- register files 50 ps

Then for R-type instructions

- Longest path (?)
  - inst fetch : 200 ps (really?)
  - argument values : 50 ps
  - ALU : 100 ps
  - result write : 50 ps
  - total : 400 ps (200 ps ?)
- Other paths

But for loads

- Longest path (?)
  - argument values : 50 ps
  - ALU : 100 ps
  - result read : 200 ps
  - result write : 50 ps
  - inst fetch : 200 ps (really ?)
  - total : 600 ps (400 ps ?)

And for conditional branches

- Longest path
  - argument values : 50 ps
  - ALU : 100 ps
  - inst fetch : 200 ps
  - total : 350 ps
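
The three totals above are sums over the units each instruction class passes through, and a single-cycle clock has to accommodate the largest of them. A minimal sketch of that arithmetic, using the delays supposed above (the names `PATHS`, `totals` and `clock_period` are hypothetical):

```python
MEM, ALU, REG = 200, 100, 50  # delays in picoseconds

# Units on the longest path of each instruction class, as listed above.
PATHS = {
    "R-type": [MEM, REG, ALU, REG],       # fetch, read regs, ALU, write reg
    "load":   [MEM, REG, ALU, MEM, REG],  # fetch, read regs, ALU, read mem, write reg
    "branch": [MEM, REG, ALU],            # fetch, read regs, ALU
}

totals = {name: sum(path) for name, path in PATHS.items()}
clock_period = max(totals.values())  # the slowest class sets the clock

for name, total in totals.items():
    print(f"{name}: {total} ps")
print(f"single-cycle clock period: {clock_period} ps")
```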

Possible competitor

- inst fetch : 200ps
- everything else done before start of clock

**The CPU plus memory is just a finite state machine**,
albeit a complex one.
