CS251 - Computer Organization and Design - Spring 2008

Lecture 34 - Virtual Memory


Practical Details

  1. Assignment 8
  2. Assignment 9

Virtual Memory

Like a cache, but between main memory and disk: main memory acts as a cache for pages whose home is on disk

Disk Hardware

Spinning disk & read/write head

Disk speed

Seek time can be minimized by smart algorithms, rotational delay cannot. But predictive caching helps a lot in some disks.
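To see how the pieces add up, here is a rough back-of-the-envelope calculation in C. The numbers (9 ms average seek, 7200 RPM spindle, 100 MB/s transfer) are illustrative assumptions, not the specs of any particular drive; the point is that the mechanical delays dwarf the transfer time.

    #include <stdio.h>

    /* Rough access time for one 4 KB disk read, with assumed parameters. */
    int main(void) {
        double seek_ms     = 9.0;                      /* assumed average seek   */
        double rotation_ms = 0.5 * 60000.0 / 7200.0;   /* half a revolution      */
        double transfer_ms = 4096.0 / 100e6 * 1000.0;  /* 4 KB at 100 MB/s       */

        printf("seek %.1f ms + rotation %.2f ms + transfer %.3f ms = %.1f ms\n",
               seek_ms, rotation_ms, transfer_ms,
               seek_ms + rotation_ms + transfer_ms);
        return 0;
    }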

Sizes

Surely, virtual memory is an obsolete concept.

Example

Virtual memory is indeed obsolete as a way of faking a main memory larger than the one you have, but the same translation machinery is still needed for relocating programs and for protecting processes from one another.

Procedure

Terminology

Address translation (64-bit virtual address, 4 Kbyte pages, 1 terabyte = 2^40 bytes of physical memory)

Virtual address

Bits 63 to 12         Bits 11 to 0
Virtual page number   Offset within page

maps to

Physical address

Bits 39 to 12          Bits 11 to 0
Physical page number   Offset within page

Address translation uses a page table

Physical pages can be resident in main memory or out on disk; the page table's Valid bit records which.

Page table

Virtual page number   Valid   Physical page number
bits 63 to 12         0/1     bits 39 to 12

Algorithm

This is pretty gross, which means `not pretty at all'.
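A minimal sketch of the translation in C, assuming a flat page_table array indexed by virtual page number; the names are made up for illustration, but the bit positions match the fields above. (With 64-bit addresses and 4 KB pages a flat table would need 2^52 entries, which is one reason real systems use multi-level page tables.)

    #include <stdbool.h>
    #include <stdint.h>

    /* One page table entry, matching the table above: a Valid bit and the
     * physical page number (bits 39..12 of the physical address). */
    typedef struct {
        bool     valid;
        uint32_t physical_page_number;  /* 28 bits suffice for 1 TB / 4 KB pages */
    } pte_t;

    /* Illustrative: a flat array indexed by virtual page number. */
    extern pte_t page_table[];

    /* Translate a virtual address; returns false (a page fault) when the
     * page is not resident in physical memory. */
    bool translate(uint64_t vaddr, uint64_t *paddr) {
        uint64_t vpn    = vaddr >> 12;    /* bits 63..12: virtual page number */
        uint64_t offset = vaddr & 0xFFF;  /* bits 11..0:  offset within page  */

        pte_t pte = page_table[vpn];
        if (!pte.valid)
            return false;                 /* page fault: OS must bring the page in */

        *paddr = ((uint64_t)pte.physical_page_number << 12) | offset;
        return true;
    }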

Integration of virtual memory with the cache

Table Lookaside Buffer (TLB)

Valid   Dirty   Ref   Read   Write   Tag (Virtual Page Number)   Physical Page Number
0/1     0/1     0/1   0/1    0/1     bits 63 to 12               bits 39 to 12

Algorithm

  1. Divide virtual address
    63 -- Virtual page number -- 12 | 11 -- page offset -- 0
  2. If (virtual page number matches tag) AND Valid AND (Read OR Write) then
    1. Increment Ref
    2. If (access is Write) then
      1. Set Dirty
    3. Form physical address
      39 -- Physical page number -- 12 | 11 -- page offset -- 0
    4. Divide physical address
      39 -- line number in memory -- 14 | 13 -- line number in cache -- 6 | 5 -- offset in line -- 2 | 1 -- access size -- 0

      and access cache.

  3. Else if NOT (Read OR Write) then
    1. Raise memory protection exception
  4. Else if NOT (virtual page number matches tag AND Valid) then
    1. Raise page fault exception
    2. OS chooses page to replace.
    3. If (Dirty on that page) OS writes that page back to disk
    4. OS reads in new page from disk

Comment. The OS usually swaps out any process that incurs a page fault and uses the processor for something else while the missing page is read in from disk, which takes milliseconds (millions of processor cycles).
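A sketch of the TLB lookup above in C, with the numbered steps marked in comments. The tlb_entry_t layout and the single-entry simplification are assumptions made for illustration; a real TLB holds many entries searched in parallel.

    #include <stdbool.h>
    #include <stdint.h>

    /* One TLB entry, matching the table above. */
    typedef struct {
        bool     valid, dirty, ref, read, write;
        uint64_t tag;                    /* virtual page number, bits 63..12 */
        uint64_t physical_page_number;   /* bits 39..12 */
    } tlb_entry_t;

    typedef enum { TLB_HIT, PROTECTION_EXCEPTION, PAGE_FAULT } tlb_result_t;

    tlb_result_t tlb_access(tlb_entry_t *e, uint64_t vaddr, bool is_write,
                            uint64_t *paddr) {
        uint64_t vpn    = vaddr >> 12;   /* step 1: split the virtual address */
        uint64_t offset = vaddr & 0xFFF;

        if (e->valid && e->tag == vpn && (e->read || e->write)) {  /* step 2   */
            e->ref = true;                                         /* step 2.1 */
            if (is_write)
                e->dirty = true;                                   /* step 2.2 */
            *paddr = (e->physical_page_number << 12) | offset;     /* step 2.3 */
            /* step 2.4: the physical address is then split into line number
             * in memory (39..14), line number in cache (13..6), offset in
             * line (5..2) and access size (1..0) before the cache access.  */
            return TLB_HIT;
        }
        if (e->valid && e->tag == vpn)   /* step 3: entry matches, but neither */
            return PROTECTION_EXCEPTION; /*         Read nor Write is allowed  */
        return PAGE_FAULT;               /* step 4: handled by the OS          */
    }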


Input/Output

The key concept is the `system bus', which is also known as the memory bus or the I/O bus, though these two might be separate entities. Combined memory-I/O buses are used in systems where low cost matters more than I/O bandwidth.

On the system bus you will find three types of devices:

  1. Parts of the user interface, which interact with humans

    Note. The data rates that follow are nominal, which does not always match up with useful. E.g. the keyboard's 0.0001 Mbit/sec is only about 100 bits/sec, roughly 12 characters per second, but to sustain even that rate most of us would need to type the same phrase over and over without pausing, which is not very useful.

    Device                 Input/Output    I/O data rate (Mbit/sec)
    Keyboard               Input           0.0001
    Mouse                  Input/output    0.0038
    Voice                  Input/output    0.264
    Printer                Output          3.2
    Bit-mapped graphics    Output          100
  2. Memory
    Device                  Data rate (Mbit/sec)    Access delay (microsec)
    Magnetic tape           30                      1,000,000,000 (human limited)
    Optical disk (CDROM)    80                      100,000,000 (human limited)
    Magnetic disk           1,000                   10,000
    SDRAM                   20,000                  0.05

    Built into a `seamless' memory hierarchy, but if you don't know where the seams are your programs won't run very well.

  3. Network interfaces
    Device          Data rate (Mbit/sec)    Access delay (microsec)
    Modem           0.06                    15,000,000
    Wireless LAN    50                      1,000,000
    Wired LAN       1,000                   < 1,000

Buses

Processor to Cache

Bandwidth: 1-5 Gwords/sec

Cache to High Bandwidth Devices (North Bridge)

Devices

  1. Main memory
  2. Bit-mapped graphics
  3. Network

Bandwidth: 200-500 Mwords/sec

High Bandwidth Devices to Low Bandwidth Devices (South Bridge)

Devices

  1. Disks
  2. Audio
  3. USB for keyboard, mouse, etc
  4. Slow ethernet
  5. CDROM
  6. Legacy devices

Bus Transactions

Concepts

  1. Master/slave
  2. Bus arbitration
  3. Synchronous/asynchronous
  4. Block transfer
  5. Multiplexed/non-multiplexed

Typical Bus Transaction (Asynchronous, multiplexed)

  1. Master requests bus
  2. Master receives bus from bus arbitration hardware
  3. Master asserts Address, then Read.
  4. Slave sees Read, latches Address, then asserts Acknowledge
  5. Master sees Acknowledge, releases Read and Address
  6. Slave sees Read released, releases Acknowledge
  7. Slave asserts Data, then DataReady
  8. Master sees DataReady, latches Data, then asserts Acknowledge
  9. Slave sees Acknowledge, releases Data and DataReady
  10. Master sees DataReady released, releases Acknowledge
  11. Master releases bus

On a block transfer, steps 7 to 10 are repeated until all the data has been transferred; then the master releases the bus.

On a synchronous bus, assertion of Read, Acknowledge and DataReady are timed by a bus clock

On a non-multiplexed bus there are separate address and data lines, and separate acknowledge lines

DMA (direct memory access) is possible if a device can become bus master.
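The asynchronous, multiplexed read transaction above can be sketched as a single-threaded simulation in C, with the master's and slave's actions simply interleaved in program order and the numbered steps marked in comments. The struct bus fields and function names are invented for illustration, not real bus signals.

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Shared bus lines for the handshake sketched above. */
    struct bus {
        bool     read, ack, data_ready;
        uint64_t addr_data;           /* multiplexed: carries address, then data */
    };

    static uint64_t master_read(struct bus *b, uint64_t addr,
                                const uint64_t *slave_mem) {
        /* steps 1-2: bus request and grant omitted; assume the master owns the bus */
        b->addr_data = addr;  b->read = true;   /* step 3: address, then Read     */
        uint64_t latched = b->addr_data;        /* step 4: slave latches address, */
        b->ack = true;                          /*         asserts Acknowledge    */
        b->read = false;                        /* step 5: master releases        */
        b->ack = false;                         /* step 6: slave releases         */
        b->addr_data = slave_mem[latched];      /* step 7: slave drives Data,     */
        b->data_ready = true;                   /*         asserts DataReady      */
        uint64_t data = b->addr_data;           /* step 8: master latches Data    */
        b->ack = true;
        b->data_ready = false;                  /* step 9                         */
        b->ack = false;                         /* step 10                        */
        return data;                            /* step 11: release the bus       */
    }

    int main(void) {
        uint64_t mem[4] = { 10, 20, 30, 40 };
        struct bus b = { false, false, false, 0 };
        printf("read mem[2] = %llu\n",
               (unsigned long long)master_read(&b, 2, mem));
        return 0;
    }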

Interrupts

Devices can assert interrupt lines on the bus to request service from the processor.

Interrupt Processing

  1. Device asserts its interrupt output
  2. Interrupt control unit (ICU) receives interrupt signal on its input, asserts its interrupt output
  3. Before each instruction fetch the processor checks its interrupt input.
  4. If it sees the input asserted
    1. It reads a register of the ICU, during which the pipeline drains
    2. The read returns a program counter (called an interrupt vector)
    3. The program counter is used to fetch the first instruction of the interrupt service routine (ISR). We saw this briefly earlier when we were discussing control signals in the processor.
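A sketch in C of how the fetch loop folds in the interrupt check from step 3. The helpers interrupt_line_asserted(), read_icu_vector(), fetch(), and execute() are hypothetical stand-ins for hardware, not a real API.

    #include <stdbool.h>
    #include <stdint.h>

    extern bool     interrupt_line_asserted(void);  /* the processor's interrupt input */
    extern uint64_t read_icu_vector(void);          /* read ICU register; pipeline
                                                       drains, returns the vector      */
    extern uint32_t fetch(uint64_t pc);
    extern uint64_t execute(uint64_t pc, uint32_t instruction);  /* returns next pc */

    void cpu_loop(uint64_t pc) {
        for (;;) {
            /* step 3: before each instruction fetch, check the interrupt input */
            if (interrupt_line_asserted()) {
                /* steps 4.1-4.3: the value read from the ICU is a program counter
                 * (the interrupt vector), used to fetch the first ISR instruction */
                pc = read_icu_vector();
            }
            uint32_t instruction = fetch(pc);
            pc = execute(pc, instruction);
        }
    }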
