# Lecture 22 - Introduction to Multiple Cycle Designs

## Practical Details

1. Exam results
• mean 40.4 (67%)
• std dev 10.3 (17%)
• 25% point 33.5 (56%)
• median 44.3 (74%)
• 75% point 48 (80%)
2. Assignment this week
• Add a shift instruction: `srl rd, rt, shamt` -- | 000000 | rs | rt | rd | shamt (5 bits) | 000010 |
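
As a rough illustration of the field layout above, the sketch below packs an `srl` into a 32-bit word. The function name `encode_srl` is hypothetical, and it assumes the unused `rs` field is encoded as zero.

```python
def encode_srl(rd: int, rt: int, shamt: int) -> int:
    """Pack `srl rd, rt, shamt` into a 32-bit R-format word."""
    for name, value in (("rd", rd), ("rt", rt), ("shamt", shamt)):
        if not 0 <= value < 32:
            raise ValueError(f"{name} must fit in 5 bits")
    opcode = 0b000000  # R-format opcode
    rs = 0             # assumption: srl does not use rs, so it is left as 0
    funct = 0b000010   # function code for srl
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


word = encode_srl(rd=2, rt=3, shamt=4)
print(f"{word:032b}")  # fields read left to right: opcode, rs, rt, rd, shamt, funct
```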

# Single Cycle Weaknesses

• See below

See above

#### Timing

Different instructions take different amounts of time

• The clock can be no faster than the slowest instruction allows

#### Pipelining

Phases of instruction execution

1. Instruction fetch & decode
2. Operand values
3. Result calculation
4. Result writeback

We would like to get these separated from one another,

• which means storing state between phases

## And yet,

The most popular processor architecture in the world is a single cycle one

#### Which brings up a MIPS oddity

No condition codes

• What are condition codes
1. N - negative: bit 31 of most recent result
2. Z - zero: NOR of all bits of result (1 only when every bit is 0)
3. C - carry:
• addition: carry out of MSB (unsigned overflow)
• subtraction: borrow in MSB (unsigned underflow)
• shift: last bit shifted out
• otherwise unchanged
4. V - overflow:
• addition, subtraction: signed (two's complement) overflow
• otherwise unchanged
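
A minimal sketch of how the four flags could be computed for a 32-bit addition. The function name `add_with_flags` is hypothetical, and V is computed with the usual signed-overflow rule: the operands share a sign and the result's sign differs.

```python
MASK = 0xFFFFFFFF  # 32-bit word


def add_with_flags(a: int, b: int) -> dict:
    """Add two 32-bit values and derive N, Z, C, V from the result."""
    a &= MASK
    b &= MASK
    total = a + b            # may need 33 bits
    result = total & MASK    # what the 32-bit register would hold
    n = (result >> 31) & 1               # N: bit 31 of the result
    z = 1 if result == 0 else 0          # Z: every bit of the result is 0
    c = (total >> 32) & 1                # C: carry out of the MSB
    # V: operands have the same sign, result has the opposite sign
    v = ((~(a ^ b) & (a ^ result)) >> 31) & 1
    return {"result": result, "N": n, "Z": z, "C": c, "V": v}


print(add_with_flags(0x7FFFFFFF, 1))  # largest positive value plus one
print(add_with_flags(0xFFFFFFFF, 1))  # wraps around to zero
```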

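Instead, MIPS uses a set instruction (such as `slt`) to put the comparison result in an ordinary register, followed by a conditional branch that tests that register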
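
The set-then-branch pattern can be modelled roughly as below. The helper names are hypothetical, and the branch target is simplified to an absolute address rather than a PC-relative offset.

```python
def slt(rs_value: int, rt_value: int) -> int:
    """Set on less than: the comparison result lands in an ordinary register."""
    return 1 if rs_value < rt_value else 0


def bne_next_pc(pc: int, a: int, b: int, target: int) -> int:
    """Branch if not equal; otherwise fall through to the next instruction."""
    return target if a != b else pc + 4


# "branch to 0x200 if 3 < 7" takes two instructions and no condition codes
flag = slt(3, 7)
next_pc = bne_next_pc(pc=0x100, a=flag, b=0, target=0x200)
print(hex(next_pc))
```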

# Multicycle Execution

Separating the phases

#### Instruction Fetch

Store instruction in register

• You can now put a different address on the address lines
• The only restriction is that you can't latch it into the instruction register until you are finished with the contents of the instruction register
• You can also change the program counter
• probably using the ALU, which is not yet being used by any of your arguments
• Latch the new program counter into the PC-register at the same time you latch the instruction into the instruction register.

Increment PC using ALU
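
The fetch step described above might be sketched as follows. The dictionary-based `state` and `memory` are illustrative stand-ins, and the increment of 4 assumes 32-bit instructions in byte-addressed memory.

```python
def fetch(state: dict, memory: dict) -> dict:
    """One fetch step: compute both new values, then latch them together."""
    next_ir = memory[state["pc"]]  # memory read at the address held in the PC
    next_pc = state["pc"] + 4      # the otherwise idle ALU computes PC + 4
    # Both registers change on the same clock edge, so return a new state
    # rather than updating one register before the other.
    return {**state, "ir": next_ir, "pc": next_pc}


memory = {0: 0x00031102, 4: 0x8C000000}  # hypothetical instruction words
state = {"pc": 0, "ir": None}
state_after = fetch(state, memory)
print(hex(state_after["ir"]), state_after["pc"])
```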

#### Evaluating Operands

Put operands into registers, which are input to

• ALU
• Memory

## Control Logic

The highest six bits of the instruction are the opcode

• For opcode 0, lowest six bits select function

Control logic needs to accept opcode (26:31) and function (0:5)

• and output all the control signals
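
Pulling the two fields out of an instruction word is a shift and a mask. A small sketch follows; the function names and the example words are illustrative only.

```python
def opcode_of(word: int) -> int:
    """Bits 26..31 of the instruction word."""
    return (word >> 26) & 0x3F


def funct_of(word: int) -> int:
    """Bits 0..5 of the instruction word (meaningful when the opcode is 0)."""
    return word & 0x3F


print(f"{opcode_of(0x8C000000):06b}")  # a word whose top six bits are 100011
print(f"{funct_of(0x00031102):06b}")   # an R-format word ending in 000010
```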

Split into two stages

1. From opcode only generate all control signals except ALUctl, plus ALUop
2. From ALUop plus function generate ALUctl

#### Signals to generate

| Signal | 0 | 1 |
|---|---|---|
| RegDst | rt | rd |
| RegWrite | n/a | write register |
| ALUSrc | register | instruction |
| Branch | no branch | branch |
| MemRead | n/a | read memory |
| MemWrite | n/a | write memory |
| MemToReg | write register from ALU | write register from memory |
| ALUOp0 | not branch | branch |
| ALUOp1 | not R-type | R-type |

#### Opcodes

| Opcode | Instruction | Assert |
|---|---|---|
| 100011 | lw | ALUSrc, MemToReg, RegWrite, MemRead |
| 101011 | sw | ALUSrc, MemWrite |
| 000100 | beq | Branch, ALUOp0 |
| 000000 | R-format | RegDst, RegWrite, ALUOp1 |
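
The first control stage is essentially a lookup from opcode to the set of asserted signals. A sketch, with hypothetical names `MAIN_CONTROL` and `main_control`, covering only the four opcodes listed here:

```python
# Signals asserted (set to 1) for each opcode; every other signal is 0.
MAIN_CONTROL = {
    0b100011: {"ALUSrc", "MemToReg", "RegWrite", "MemRead"},  # lw
    0b101011: {"ALUSrc", "MemWrite"},                         # sw
    0b000100: {"Branch", "ALUOp0"},                           # beq
    0b000000: {"RegDst", "RegWrite", "ALUOp1"},               # R-format
}

SIGNALS = ("RegDst", "RegWrite", "ALUSrc", "Branch", "MemRead",
           "MemWrite", "MemToReg", "ALUOp0", "ALUOp1")


def main_control(opcode: int) -> dict:
    """Return every control signal as 0 or 1 for a supported opcode."""
    asserted = MAIN_CONTROL[opcode]  # KeyError for opcodes not in the table
    return {name: int(name in asserted) for name in SIGNALS}


print(main_control(0b100011))
```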

#### ALUCtl

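| Operation | ALUOp | Funct | Action | ALUCtl |
|---|---|---|---|---|
| beq | 01 | XXXXXX | subtract | 110 |
| add | 10 | 100000 | add | 010 |
| sub | 10 | 100010 | subtract | 110 |
| and | 10 | 100100 | AND | 000 |
| or | 10 | 100101 | OR | 001 |
| slt | 10 | 101010 | set on less than | 111 |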
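
The second stage can be sketched the same way: ALUOp decides whether the function field matters at all. The names below are hypothetical, ALUOp is written as the two bits ALUOp1 ALUOp0, and only the rows in this table are handled.

```python
# Function field -> ALUCtl, used only when ALUOp is 10 (R-type).
FUNCT_TO_ALUCTL = {
    0b100000: 0b010,  # add
    0b100010: 0b110,  # sub
    0b100100: 0b000,  # and
    0b100101: 0b001,  # or
    0b101010: 0b111,  # slt
}


def alu_control(alu_op: int, funct: int) -> int:
    """Map (ALUOp, funct) to the 3-bit ALUCtl value."""
    if alu_op == 0b01:
        return 0b110                   # beq: subtract, funct is a don't-care
    if alu_op == 0b10:
        return FUNCT_TO_ALUCTL[funct]  # R-type: funct selects the operation
    raise ValueError("ALUOp value not covered by this table")


print(f"{alu_control(0b10, 0b101010):03b}")
```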

### Timing

Suppose

• memory units 200 picoseconds
• ALUs 100 ps
• register files 50 ps

Then for R-type instructions

• Longest path (?)
1. inst fetch : 200 ps (really?)
2. argument values : 50 ps
3. ALU : 100 ps
4. result write : 50 ps
5. total : 400 ps (200 ps ?)
• Other paths

And for load instructions

• Longest path (?)
1. argument values : 50 ps
2. ALU : 100 ps
3. result read : 200 ps
4. result write : 50 ps
5. inst fetch : 200 ps (really ?)
6. total : 600 ps (400 ps ?)

And for conditional branches

• Longest path
1. argument values : 50 ps
2. ALU: 100 ps
3. inst fetch : 200 ps
4. total : 350 ps

Possible competitor

1. inst fetch : 200 ps
2. everything else done before start of clock
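
The path totals above can be checked by adding up the stage delays. This sketch uses the first total given for each path (not the parenthesised alternatives), and labels the middle path, the one with the 200 ps result read, as a load. In a single-cycle design the clock period must cover the slowest of these paths.

```python
# Stage delays in picoseconds, in the order listed in the notes.
PATHS_PS = {
    "R-type": [200, 50, 100, 50],
    "load (memory read)": [50, 100, 200, 50, 200],
    "conditional branch": [50, 100, 200],
}

totals = {name: sum(steps) for name, steps in PATHS_PS.items()}
clock_ps = max(totals.values())  # the slowest path sets the clock period

for name, total in totals.items():
    print(f"{name}: {total} ps")
print(f"single-cycle clock period: {clock_ps} ps")
```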

### Climax

The CPU plus memory is just a finite state machine, albeit a complex one.
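
A deliberately tiny illustration of the finite-state-machine view: here the state is only the current execution phase, with phase names taken from the four phases listed earlier. A real machine's state would also include every register and memory cell, and not every instruction need visit every phase; both are simplifications in this sketch.

```python
PHASES = ("fetch_decode", "operands", "calculate", "writeback")


def next_phase(phase: str) -> str:
    """Transition function: the next state depends only on the current state."""
    return PHASES[(PHASES.index(phase) + 1) % len(PHASES)]


phase = "fetch_decode"
trace = [phase]
for _ in range(len(PHASES)):
    phase = next_phase(phase)
    trace.append(phase)
print(" -> ".join(trace))  # after four steps the machine is back at fetch
```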