CS251 - Computer Organization and Design - Spring 2008
Lecture 19 - Execution Control
Practical Details
- Assignment 5
- Mid-term
Instruction Execution Components
Put It All Together
Instruction fetch
- PC
- Instruction memory
- Adder: result no longer goes straight to the PC; it is an output, because a branch may replace it
Control: none
R-type instruction
- Registers
- read1 -> ALU
- read2 -> ALU
- write: needs a MUX because the destination sometimes comes from the second register field (load)
- ALU
Control signals:
- Clock
- RegWrite
- ALU operation select (3 bits)
I-type instructions
Load/store
- Registers
- read1 -> ALU for address calculation
- read2 -> memory with data to be written
- write -> needs a MUX, might be
- third register field (R-inst)
- second register field (I-inst)
- Sign extender
- ALU: combines read1 and sign-extended immediate
- Data memory: needs a MUX on output
- could be load
- could be ALU output
Control signals
- RegWrite
- ALUCtl (3 bits)
- MemRead
- MemWrite
- MemToReg
Conditional branch
- Registers
- read1 -> ALU
- read2 -> ALU
- ALU:
- zero output
- needs a MUX on input
- could be the sign-extended immediate from the instruction (load/store)
- could be read2 (R-type)
- Sign extender:
- Shifter
- Adder: MUX needed on output
Control signals
- ALUCtl
- PCSrc
- ALUSrc
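The branch datapath above can be sketched in a few lines of Python. This is only an illustration: the function names `sign_extend16` and `next_pc` are invented here, and deriving PCSrc as Branch AND Zero is an assumption taken from the usual textbook single-cycle design, not something these notes state.

```python
def sign_extend16(value: int) -> int:
    """Sign extender: interpret the low 16 bits as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def next_pc(pc: int, imm16: int, branch: bool, zero: bool) -> int:
    """Model the MUX on the adder output that chooses the next PC.

    Assumption: PCSrc = Branch AND Zero (hypothetical wiring for this sketch).
    """
    pc_plus_4 = (pc + 4) & 0xFFFFFFFF
    # Shifter: the word offset is shifted left by 2 to get a byte offset.
    branch_target = (pc_plus_4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF
    pc_src = branch and zero
    return branch_target if pc_src else pc_plus_4


taken = next_pc(0x1000, 3, branch=True, zero=True)        # forward branch, taken
not_taken = next_pc(0x1000, 3, branch=True, zero=False)   # registers differ
backward = next_pc(0x1000, 0xFFFF, branch=True, zero=True)  # offset of -1 word
```

With an offset of -1 the branch lands back on the branch instruction itself, which is a quick check that the sign extension and shift are in the right order.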
Control Logic
The highest six bits of the instruction are the opcode
- For opcode 0, the lowest six bits select the function
Control logic needs to accept the opcode (bits 31:26) and the function (bits 5:0)
- and output all the control signals
Split into two stages
- From the opcode alone, generate ALUOp plus every control signal except ALUCtl
- From ALUOp plus the function field, generate ALUCtl
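As a small illustration of the two inputs to the control logic, the following Python sketch pulls the opcode and function fields out of a 32-bit instruction word. The helper name `control_inputs` is hypothetical, and the two sample words are hand-assembled encodings of `add $t0, $t1, $t2` and `lw $t0, 4($t1)`.

```python
def control_inputs(instruction: int) -> tuple[int, int]:
    """Return (opcode, funct) for a 32-bit instruction word.

    opcode is bits 31:26, funct is bits 5:0. The funct value is only
    meaningful when the opcode is 0 (R-format).
    """
    opcode = (instruction >> 26) & 0x3F
    funct = instruction & 0x3F
    return opcode, funct


ADD_T0_T1_T2 = 0x012A4020  # add $t0, $t1, $t2
LW_T0_4_T1 = 0x8D280004    # lw $t0, 4($t1)

add_fields = control_inputs(ADD_T0_T1_T2)
lw_opcode, _ = control_inputs(LW_T0_4_T1)
```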
Signals to generate
| Signal | When 0 | When 1 |
| RegDst | rt | rd |
| RegWrite | n/a | write register |
| ALUSrc | register | instruction |
| Branch | no branch | branch |
| MemRead | n/a | read memory |
| MemWrite | n/a | write memory |
| MemToReg | write register from ALU | write register from memory |
| ALUOp0 | not branch | branch |
| ALUOp1 | not R-type | R-type |
Opcodes
| Opcode | Instruction | Assert |
| 100011 | lw | ALUSrc, MemToReg, RegWrite, MemRead |
| 101011 | sw | ALUSrc, MemWrite |
| 000100 | beq | Branch, ALUOp0 |
| 000000 | R-format | RegDst, RegWrite, ALUOp1 |
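The first stage of the control logic is a pure lookup on the opcode, so the opcode table can be written down directly as data. The Python sketch below does that; `CONTROL_SIGNALS`, `ASSERTED` and `main_control` are names made up for this example, and treating every signal not listed under "Assert" as 0 is an assumption (in hardware some of them would be don't-cares).

```python
CONTROL_SIGNALS = (
    "RegDst", "RegWrite", "ALUSrc", "Branch",
    "MemRead", "MemWrite", "MemToReg", "ALUOp1", "ALUOp0",
)

# Signals asserted for each opcode, copied from the opcode table.
ASSERTED = {
    0b100011: {"ALUSrc", "MemToReg", "RegWrite", "MemRead"},  # lw
    0b101011: {"ALUSrc", "MemWrite"},                          # sw
    0b000100: {"Branch", "ALUOp0"},                            # beq
    0b000000: {"RegDst", "RegWrite", "ALUOp1"},                # R-format
}


def main_control(opcode: int) -> dict[str, int]:
    """Map a 6-bit opcode to a dict of control signal values (0 or 1)."""
    if opcode not in ASSERTED:
        raise ValueError(f"opcode {opcode:06b} is not in the table")
    asserted = ASSERTED[opcode]
    return {name: int(name in asserted) for name in CONTROL_SIGNALS}


lw_signals = main_control(0b100011)
beq_signals = main_control(0b000100)
# The two-bit ALUOp passed to the second stage is ALUOp1 followed by ALUOp0.
beq_alu_op = (beq_signals["ALUOp1"] << 1) | beq_signals["ALUOp0"]
```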
ALUCtl
| Operation | ALUOp | Funct | Action | ALUCtl |
| beq | 01 | XXXXXX | subtract | 110 |
| add | 10 | 100000 | add | 010 |
| sub | 10 | 100010 | subtract | 110 |
| and | 10 | 100100 | AND | 000 |
| or | 10 | 100101 | OR | 001 |
| slt | 10 | 101010 | set on less than | 111 |
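The second stage can be sketched the same way: ALUOp selects either a fixed operation or a lookup on the function field. The names `FUNCT_TO_ALUCTL` and `alu_control` are hypothetical. The table has no row for ALUOp 00 (lw and sw assert neither ALUOp bit); mapping it to add (010) is an assumption based on the address calculation needing an addition.

```python
# R-type rows of the ALUCtl table: funct field -> 3-bit ALU control.
FUNCT_TO_ALUCTL = {
    0b100000: 0b010,  # add
    0b100010: 0b110,  # sub
    0b100100: 0b000,  # and
    0b100101: 0b001,  # or
    0b101010: 0b111,  # slt
}


def alu_control(alu_op: int, funct: int) -> int:
    """Second-stage control: derive ALUCtl from ALUOp and the funct field."""
    if alu_op == 0b01:
        return 0b110  # beq: subtract, funct is a don't-care
    if alu_op == 0b10:
        return FUNCT_TO_ALUCTL[funct]  # R-type: funct selects the operation
    if alu_op == 0b00:
        return 0b010  # ASSUMPTION: lw/sw add base and offset (not in the table)
    raise ValueError(f"unexpected ALUOp {alu_op:02b}")


beq_ctl = alu_control(0b01, 0b111111)
slt_ctl = alu_control(0b10, 0b101010)
```

Note that beq and sub produce the same ALUCtl value, which is why the branch path can reuse the ALU's zero output as an equality test.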
Timing
Suppose
- memory units take 200 ps
- ALUs take 100 ps
- register files take 50 ps
Then for R-type instructions
- Longest path (?)
- inst fetch : 200 ps (really?)
- argument values : 50 ps
- ALU : 100 ps
- result write : 50 ps
- total : 400 ps (200 ps ?)
- Other paths
But for loads
- Longest path (?)
- argument values : 50 ps
- ALU : 100 ps
- result read : 200 ps
- result write : 50 ps
- inst fetch : 200 ps (really ?)
- total : 600 ps (400 ps ?)
And for conditional branches
- Longest path
- argument values : 50 ps
- ALU: 100 ps
- inst fetch : 200 ps
- total : 350 ps
Possible competitor
- inst fetch : 200ps
- everything else done before start of clock
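The path totals above are sums of unit latencies, which the following Python sketch reproduces. The unit names and the `PATHS` lists are this example's own encoding of the bullet lists, and it counts instruction fetch on every path (the larger of the two totals the notes offer). Taking the clock period as the slowest path assumes a single-cycle design in which every instruction gets the same cycle length.

```python
LATENCY_PS = {"memory": 200, "alu": 100, "regfile": 50}

# Units visited in order along each instruction class's longest path.
PATHS = {
    "R-type": ["memory", "regfile", "alu", "regfile"],
    "load": ["memory", "regfile", "alu", "memory", "regfile"],
    "branch": ["memory", "regfile", "alu"],
}


def path_delay(units: list[str]) -> int:
    """Total delay in picoseconds along one path through the datapath."""
    return sum(LATENCY_PS[unit] for unit in units)


delays = {name: path_delay(units) for name, units in PATHS.items()}

# ASSUMPTION: single-cycle design, so the cycle must fit the slowest path.
min_clock_period_ps = max(delays.values())
```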
Climax
The CPU plus memory is just a finite state machine,
albeit a complex one.