CS251 - Computer Organization and Design - Spring 2008
Lecture 34 - Virtual Memory
Practical Details
- Assignment 8
- Assignment 9
Examples of Cache Memories
Set Associative
Effectively a set of direct-mapped caches
- associative mapping between sets
Cache
Cache address | Valid | Tag          | Data | Valid | Tag          | Data | Valid | Tag          | Data | Valid | Tag          | Data
000...000     |       | 31 - n3 bits |      |       | 31 - n3 bits |      |       | 31 - n3 bits |      |       | 31 - n3 bits |
000...001     |       |              |      |       |              |      |       |              |      |       |              |
000...010     |       |              |      |       |              |      |       |              |      |       |              |
...           |       |              |      |       |              |      |       |              |      |       |              |
111...110     |       |              |      |       |              |      |       |              |      |       |              |
111...111     |       |              |      |       |              |      |       |              |      |       |              |
Address
Line number in memory | Line number in set | Word number in line | Access Size
31 to n3+1            | n3 to n2+1         | n2 to n1+1          | n1 to 0
Tag                   | Cache address      |                     | Ignored
- n1+1: normally 2 (bits 1 to 0 select a byte within a 4-byte word, so n1 = 1)
- n2-n1: log_2 (number of words in a line)
- n3-n2: log_2 (number of lines in a set)
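The field boundaries can be checked with a few shifts and masks. A minimal sketch, assuming n1, n2 and n3 are the top bit positions of the access-size, word-number and set-line-number fields, as the bit ranges in the Address table suggest. The function name `split_address` and the default values (n1 = 1, n2 = 5, n3 = 13, i.e. 16 words per line and 256 lines per set) are illustrative assumptions, not values from the text book.

```python
def split_address(addr, n1=1, n2=5, n3=13):
    """Split a 32-bit byte address into the four fields of the Address table.

    n1, n2, n3 are assumed to be the highest bit index of the access-size,
    word-number and set-line-number fields (defaults are illustrative only).
    """
    access_size = addr & ((1 << (n1 + 1)) - 1)                   # bits n1 .. 0
    word_in_line = (addr >> (n1 + 1)) & ((1 << (n2 - n1)) - 1)   # bits n2 .. n1+1
    line_in_set = (addr >> (n2 + 1)) & ((1 << (n3 - n2)) - 1)    # bits n3 .. n2+1
    tag = addr >> (n3 + 1)                                       # bits 31 .. n3+1
    return tag, line_in_set, word_in_line, access_size


tag, line_in_set, word_in_line, access_size = split_address(0x12345678)
print(hex(tag), hex(line_in_set), hex(word_in_line), access_size)
```

With these defaults the tag is 32 - 14 = 18 bits wide, which is the "31 - n3 bits" shown in the Tag column of the cache table.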
Circuit (Figure 7.17, page 503)
- Use line number in set to choose a horizontal set of lines
- In parallel match Line number in memory to each tag in the set
  - AND with Valid to get SetHit
- In parallel use Word number in line to activate one word in each set
- All SetHits ORed to get Hit
- If (Hit) then
  - Multiplexer addressed by true SetHit selects word to return
- Else
  - Processor is stalled
  - Line is chosen to remove from cache
  - Line is retrieved from next level of memory
  - Instruction is rerun
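The circuit steps above can be mimicked in software. This is only a sketch: the hardware does the tag comparisons in parallel, while the loop here does them one after another. The class name `SetAssociativeCache`, its methods, and the sizes are all hypothetical, and each of the side-by-side direct-mapped caches is called a "way" in the code.

```python
class SetAssociativeCache:
    """Toy model: several direct-mapped caches ("ways") searched together."""

    def __init__(self, num_ways=4, lines_per_way=256, words_per_line=16):
        self.lines_per_way = lines_per_way
        self.words_per_line = words_per_line
        # One (valid, tag, data) entry per line in each direct-mapped cache.
        self.ways = [
            [{"valid": False, "tag": None, "data": None} for _ in range(lines_per_way)]
            for _ in range(num_ways)
        ]

    def split(self, addr):
        word_addr = addr >> 2                      # drop the access-size bits
        word = word_addr % self.words_per_line     # word number in line
        line_addr = word_addr // self.words_per_line
        index = line_addr % self.lines_per_way     # line number in set
        tag = line_addr // self.lines_per_way      # line number in memory
        return tag, index, word

    def read(self, addr):
        tag, index, word = self.split(addr)
        # SetHit for each way: tag match ANDed with Valid.
        set_hits = [w[index]["valid"] and w[index]["tag"] == tag for w in self.ways]
        if any(set_hits):                              # Hit = OR of all SetHits
            chosen = self.ways[set_hits.index(True)]   # multiplexer select
            return True, chosen[index]["data"][word]
        return False, None     # miss: stall, replace a line, refill, rerun

    def fill(self, way_number, addr, line_data):
        """Load a whole line into the given way (what a miss handler would do)."""
        tag, index, _ = self.split(addr)
        self.ways[way_number][index] = {"valid": True, "tag": tag, "data": list(line_data)}


cache = SetAssociativeCache(num_ways=2, lines_per_way=4, words_per_line=4)
print(cache.read(0x44))                 # miss on an empty cache
cache.fill(1, 0x40, [10, 11, 12, 13])   # bring in the line holding 0x40 .. 0x4F
print(cache.read(0x44))                 # hit in way 1
```

Choosing which way to fill on a miss is left to the caller here; that choice is the replacement algorithm.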
New Issues
Replacement Algorithm
Usually least recently used (LRU), i.e. the most dusty
Others are possible
- Least Frequently Used (likely to be ties)
- Oldest, First In First Out
- Random
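LRU is easy to sketch in software with an ordered dictionary. This is an illustration, not how hardware does it (real caches typically keep a few status bits per line). The class name `LRUSet` and its `access` method are hypothetical.

```python
from collections import OrderedDict


class LRUSet:
    """Holds up to `capacity` lines for one cache index and evicts the LRU one."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()   # tag -> data, least recently used first

    def access(self, tag, load=lambda tag: None):
        """Return (hit, evicted_tag). `load` stands in for the next memory level."""
        if tag in self.lines:
            self.lines.move_to_end(tag)                  # now the most recently used
            return True, None
        victim = None
        if len(self.lines) >= self.capacity:
            victim, _ = self.lines.popitem(last=False)   # evict the dustiest line
        self.lines[tag] = load(tag)
        return False, victim


s = LRUSet(capacity=2)
for tag in ["A", "B", "A", "C"]:
    print(tag, s.access(tag))
```

In the run above, touching A again before C arrives makes B the least recently used line, so B is the one evicted.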
Virtual Memory
Like cache, but between main memory and disk
Disk Hardware
Spinning disk & read/write head
Disk speed
- Seek time
  - about 5 milliseconds
- Rotational delay: 9,000 revolutions per minute (RPM)
  - 150 revolutions per second (RPS)
  - about 7 milliseconds per revolution
  - worst case is 7 milliseconds = 7,000,000 nanoseconds of rotational delay
  - average case is 3.5 milliseconds
- 125 Megabytes per second transfer rate
  - Direct memory access (DMA)
  - about 30 microseconds = 30,000 nanoseconds to transfer a page
Seek time can be minimized by smart algorithms; rotational delay cannot.
But predictive caching helps a lot in some disks.
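The disk numbers can be reproduced with a little arithmetic. A sketch, assuming a 4 Kbyte page (the page size used in the address translation example later in these notes); the variable names are made up for the illustration. The unrounded figures come out slightly below the rounded ones quoted in the list.

```python
RPM = 9000                    # rotation speed from the notes
SEEK_MS = 5.0                 # typical seek time from the notes
TRANSFER_BYTES_PER_S = 125e6  # 125 Megabytes per second
PAGE_BYTES = 4096             # assumption: 4 Kbyte page

rev_per_s = RPM / 60                    # revolutions per second
ms_per_rev = 1000 / rev_per_s           # worst-case rotational delay
avg_rot_ms = ms_per_rev / 2             # average rotational delay
transfer_us = PAGE_BYTES / TRANSFER_BYTES_PER_S * 1e6
total_ms = SEEK_MS + avg_rot_ms + transfer_us / 1000

print(f"{rev_per_s:.0f} RPS, {ms_per_rev:.2f} ms per revolution")
print(f"average rotational delay {avg_rot_ms:.2f} ms")
print(f"page transfer {transfer_us:.1f} microseconds")
print(f"typical page read {total_ms:.2f} ms")
```

The mechanical delays (seek plus rotation) dominate: the transfer itself is a fraction of a percent of the total.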
Sizes
- Disk: 100 Gbytes
- DRAM: 1-10 GBytes
Surely, virtual memory is an obsolete concept.
- It should be, but
- application and operating system sizes just keep getting bigger
- and so does the number of programs that users have open
simultaneously
Example
- 4 Gbyte real memory
- 32 bit addresses, which allow a 4 Gbyte virtual memory
Virtual memory is indeed obsolete, but
- 64 bit addresses allow a 16 exabyte virtual memory, so
- you have to learn it anyway.
Procedure
Terminology
- page = block
- page fault = cache miss
- copyback = write-back
Address translation (64 bit address, 4Kbyte pages, 1 terabyte physical
memory)
Virtual address
63 to 12            | 11 to 0
Virtual page number | Offset within page
maps to
Physical address
39 to 12             | 11 to 0
Physical page number | Offset within page
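In code, the two address formats differ only in the width of the page number; the 12-bit offset passes through unchanged. A minimal sketch with hypothetical helper names `split_virtual` and `make_physical`:

```python
PAGE_OFFSET_BITS = 12        # 4 Kbyte pages
PHYSICAL_ADDRESS_BITS = 40   # 1 terabyte of physical memory


def split_virtual(vaddr):
    """Return (virtual page number, offset within page) of a 64-bit address."""
    vpn = vaddr >> PAGE_OFFSET_BITS                   # bits 63 .. 12
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)    # bits 11 .. 0
    return vpn, offset


def make_physical(ppn, offset):
    """Concatenate a physical page number (bits 39 .. 12) with the page offset."""
    assert ppn < (1 << (PHYSICAL_ADDRESS_BITS - PAGE_OFFSET_BITS))
    return (ppn << PAGE_OFFSET_BITS) | offset


vpn, offset = split_virtual(0x00007FFF12345ABC)
print(hex(vpn), hex(offset))
print(hex(make_physical(0x42, offset)))
```

The physical page number 0x42 is just an example value; in practice it comes from the page table.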
Address translation uses a page table
- table very large, various algorithmic techniques to lighten the
load
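To see why the table is very large, count the entries a flat (one entry per virtual page) table would need for this example. The 4-byte entry size is an assumption made for the arithmetic, not a figure from the notes.

```python
VIRTUAL_ADDRESS_BITS = 64
PHYSICAL_ADDRESS_BITS = 40
PAGE_OFFSET_BITS = 12
ENTRY_BYTES = 4   # assumption: valid bit + 28-bit physical page number fit in 4 bytes

virtual_pages = 2 ** (VIRTUAL_ADDRESS_BITS - PAGE_OFFSET_BITS)    # 2^52 entries
physical_pages = 2 ** (PHYSICAL_ADDRESS_BITS - PAGE_OFFSET_BITS)  # 2^28 page frames
flat_table_bytes = virtual_pages * ENTRY_BYTES

print(f"virtual pages:  2^52 = {virtual_pages:,}")
print(f"physical pages: 2^28 = {physical_pages:,}")
print(f"flat table size: {flat_table_bytes // 2**50} pebibytes")
```

Under these assumptions a flat table would be far larger than the 1 terabyte of physical memory it describes, which is why the table cannot simply be stored as one array.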
Physical pages can be
Page table
Virtual page number | Valid | Physical page number
63 to 12            | 0/1   | 39 to 12
                    |       |
                    |       |
Algorithm
- Look up virtual page number in page table
- if (Valid) // page is in memory
  - form physical address by concatenating page offset to physical page number
  - return the requested data
  - (What does this mean for alignment of cache lines with respect to pages?)
- else
  - raise a `page fault' interrupt and let the operating system handle getting the page
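The algorithm can be sketched with a dictionary standing in for the page table. The names `PageFault` and `translate` and the entry layout are hypothetical; a real page fault is a hardware interrupt handled by the operating system, modelled here as a Python exception.

```python
class PageFault(Exception):
    """Stands in for the page fault interrupt; carries the virtual page number."""


def translate(page_table, vaddr):
    """Translate a virtual address using a dict: vpn -> {"valid", "ppn"}."""
    vpn, offset = vaddr >> 12, vaddr & 0xFFF
    entry = page_table.get(vpn)
    if entry is None or not entry["valid"]:   # page is not in memory
        raise PageFault(vpn)                  # the OS would fetch the page
    return (entry["ppn"] << 12) | offset      # concatenate ppn and page offset


page_table = {
    0x5: {"valid": True, "ppn": 0x9},
    0x6: {"valid": False, "ppn": 0x0},
}
print(hex(translate(page_table, 0x5123)))
try:
    translate(page_table, 0x6000)
except PageFault as fault:
    print("page fault on virtual page", hex(fault.args[0]))
```

Treating a missing dictionary key the same as Valid = 0 is a simplification of this sketch.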
This is pretty gross, which means `not pretty at all'.
Integration of virtual memory with the cache
Translation Lookaside Buffer (TLB)
- subset of page table containing recently used pages
Valid | Dirty | Ref | Read | Write | Tag (Virtual Page Number) | Physical Page Number
0/1   | 0/1   |     | 0/1  | 0/1   | 63 to 12                  | 39 to 12
      |       |     |      |       |                           |
      |       |     |      |       |                           |
      |       |     |      |       |                           |
      |       |     |      |       |                           |
Algorithm
- Divide virtual address
  63 -- Virtual page number -- 12 | 11 -- page offset -- 0
- If virtual page number matches tag AND Valid AND (Read OR Write) then
  - Increment Ref
  - If (access is Write) then
    - Set Dirty
  - Form physical address
    39 -- Physical page number -- 12 | 11 -- page offset -- 0
  - Divide physical address
    39 -- Line number in memory -- 14 | 13 -- Line number in cache -- 6 | 5 -- Offset in line -- 2 | 1 -- Access size -- 0
    and access cache.
- Else if NOT (Read OR Write) then
  - Raise memory protection exception
- Else if NOT (virtual page number matches tag AND Valid) then
  - Raise page fault exception
  - OS chooses page to replace.
  - If (Dirty on that page) OS writes that page back to disk
  - OS reads in new page from disk
Comment. OS usually swaps out any process that incurs a page fault so as
to use the processor for something else.
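A software sketch of the TLB algorithm, under some stated assumptions. The names `TLBEntry`, `tlb_translate` and `split_physical` are hypothetical, and a dictionary stands in for the parallel tag match. Two choices differ from the steps as written: the code checks the permission bit that matches the kind of access (Read for a read, Write for a write), and it tests for a matching valid entry first, since the permission bits cannot be read without one.

```python
class ProtectionError(Exception):
    """Stands in for the memory protection exception."""


class PageFault(Exception):
    """Stands in for the page fault exception."""


class TLBEntry:
    def __init__(self, vpn, ppn, read=True, write=False):
        self.valid, self.dirty, self.ref = True, False, 0
        self.read, self.write = read, write
        self.vpn, self.ppn = vpn, ppn


def tlb_translate(tlb, vaddr, is_write=False):
    """tlb is a dict vpn -> TLBEntry; returns the 40-bit physical address."""
    vpn, offset = vaddr >> 12, vaddr & 0xFFF
    entry = tlb.get(vpn)
    if entry is None or not entry.valid:
        raise PageFault(vpn)              # OS replaces a page and reads the new one
    allowed = entry.write if is_write else entry.read
    if not allowed:
        raise ProtectionError(vpn)
    entry.ref += 1                        # Increment Ref
    if is_write:
        entry.dirty = True                # page must be written back before eviction
    return (entry.ppn << 12) | offset


def split_physical(paddr):
    """Fields used to access the cache: bits 39-14, 13-6, 5-2, 1-0."""
    return paddr >> 14, (paddr >> 6) & 0xFF, (paddr >> 2) & 0xF, paddr & 0x3


tlb = {0x5: TLBEntry(0x5, 0x9, read=True, write=False)}
paddr = tlb_translate(tlb, 0x5123)
print(hex(paddr), split_physical(paddr))
```

In this sketch a write to virtual page 0x5 raises `ProtectionError`, and any access to a page with no entry raises `PageFault`.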