CS251 - Computer Organization and Design - Spring 2008

Lecture 31 - Cache Memory

Practical Details

  1. Assignment 7
  2. Finished pipelined execution

Memory Hardware

Hardware Performance

                          Access time (ns)   Cost per Gbyte  Typical size  Notes
Registers                 0.1                -               128 bytes     Can't separate cost from remainder of processor
Cache (SRAM)              0.1 - 1.0          $4000.00        ~1 MByte      Big on-chip, off-chip differences
Main memory (DRAM)        50 - 70            $100.00         ~2 Gbyte      SDRAM is faster in bursts
Ramdisk (based on NVRam)  read: 10.0         -               -             Inadequate capacity for demand.
                          write: 10,000.0                                  Many competing technologies.
                                                                           Limited number of memory cycles.
Disk                      seek: 5,000,000    $0.50           ~100 Gbyte
                          continuous: ~100
Internet                  infinity           free            ~10 Tbyte

Memory Concepts





The Big Picture

  1. We have a little fast expensive memory, and a lot of slow inexpensive memory
  2. We replicate parts of the inexpensive memory in the expensive memory

Moral of the story

It is better not to miss very often, because every miss carries a huge performance penalty.
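A rough way to see the size of the penalty is the standard average access time model: hit time plus miss rate times miss penalty. The sketch below is not from the lecture; it assumes about 1 ns for a cache hit and about 60 ns to reach main memory, in line with the hardware performance figures above. The function name and the miss rates are illustrative only.

```python
def average_access_time(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average time per memory reference: every access pays the hit time,
    and the fraction that misses also pays the miss penalty."""
    return hit_time_ns + miss_rate * miss_penalty_ns


# Assumed figures: 1 ns cache hit, 60 ns main memory access.
for miss_rate in (0.01, 0.05, 0.20):
    t = average_access_time(1.0, miss_rate, 60.0)
    print(f"miss rate {miss_rate:.0%}: {t:.1f} ns per access")
```

Under these assumptions a 5% miss rate already makes the average access about four times slower than a pure hit.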

How the big picture is implemented

Break the address into pieces

The effect is that the same low order bits are shared by many locations in memory.

The cache normally contains an integral number of lines, usually a power of 2.
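Breaking up the address can be sketched with shifts and masks. This is a minimal illustration, assuming a hypothetical cache of 256 lines of 16 bytes each; the function name and both sizes are made up for the example and must be powers of 2.

```python
def split_address(address, line_bytes=16, num_lines=256):
    """Split a byte address into (tag, line_index, offset).

    line_bytes and num_lines are hypothetical sizes; both must be powers of 2.
    """
    offset_bits = line_bytes.bit_length() - 1   # log2(line_bytes)
    index_bits = num_lines.bit_length() - 1     # log2(num_lines)
    offset = address & (line_bytes - 1)                       # address in the line
    line_index = (address >> offset_bits) & (num_lines - 1)   # address of the line
    tag = address >> (offset_bits + index_bits)               # high order bits of the block number
    return tag, line_index, offset


tag, line_index, offset = split_address(0x12345678)
print(hex(tag), hex(line_index), hex(offset))   # prints 0x12345 0x67 0x8

# Two addresses that differ only in their high order bits share a line.
print(split_address(0x00670)[1] == split_address(0x10670)[1])   # prints True
```

The second print shows the effect described above: 0x00670 and 0x10670 have the same low order bits, so they compete for the same cache line and are told apart only by their high order bits.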

When a memory reference occurs

  1. Break the address into three pieces
    1. address in the line
    2. address of the line
    3. high order bits of the block number
  2. Using the line address, retrieve the high order bits of the block number stored with that cache line.
  3. Compare them to the high order bits of the address.
  4. If they match
    1. read from or write to the address in the cache
    2. On a write, the hardware usually also writes main memory asynchronously using a write buffer. This is called write-through.
    3. It is also possible to re-write memory only when the cache line is replaced. This is called write-back.
  5. If they do not match
    1. Stall the processor
    2. Use the block number to get the relevant block from memory
    3. When it is installed, rerun the instruction

This way of doing things is called direct mapping.
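The steps above can be sketched as a toy simulation. It tracks only the stored high order bits, not the data, and counts hits, misses and writes to main memory. The class name, the sizes (4 lines of 16 bytes) and the choice to fetch the block on a write miss are assumptions made for the example, not details from the lecture.

```python
class DirectMappedCache:
    """Toy direct-mapped cache that tracks tags only, not data."""

    def __init__(self, line_bytes=16, num_lines=4, write_back=False):
        self.line_bytes = line_bytes      # hypothetical line size
        self.num_lines = num_lines        # hypothetical number of lines
        self.write_back = write_back      # False means write-through
        self.tags = [None] * num_lines    # None marks an empty line
        self.dirty = [False] * num_lines  # used only by write-back
        self.hits = self.misses = self.memory_writes = 0

    def access(self, address, write=False):
        block = address // self.line_bytes   # block number
        index = block % self.num_lines       # address of the line
        tag = block // self.num_lines        # high order bits of the block number
        hit = self.tags[index] == tag
        if hit:
            self.hits += 1
        else:
            # Miss: the processor would stall here while the block is fetched.
            self.misses += 1
            if self.write_back and self.dirty[index]:
                self.memory_writes += 1      # re-write the line being replaced
            self.tags[index] = tag
            self.dirty[index] = False
        if write:
            if self.write_back:
                self.dirty[index] = True     # defer the memory write
            else:
                self.memory_writes += 1      # write-through: update memory now
        return hit


for write_back in (False, True):
    cache = DirectMappedCache(write_back=write_back)
    for _ in range(10):
        cache.access(0x100, write=True)   # ten writes to one location
    cache.access(0x140)                   # same line, different block: replaces it
    print(write_back, cache.hits, cache.misses, cache.memory_writes)
# prints: False 9 2 10
#         True 9 2 1
```

With write-through every one of the ten writes goes to main memory; with write-back memory is written once, when the line is replaced.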
