CS444 class 19

CS444 Class 19 Paging

From last time:

Parts of the CPU, pg. 189 pic

All addresses we normally use as programmers are virtual addresses. The virtual machine that the OS provides to a process has memory accessible by VA. The addresses in CPU registers are VAs, for example, the ESP.

The PC (x86 EIP) is a VA, and specifies which next instruction to pull in from memory. A memory read request by VA is generated in the upper CPU and passed to the MMU, which maps it to a PA (with the help of a page table), and the PA specifies the exact physical location.

Recall that paging divides memory into fixed-sized blocks—we’ll assume of size 4KB bytes, the x86 page size. 4K = 0x1000, so it is easy to analyze hex addresses for what page they involve:

Example

VA = 0x12345, = 0x 12 |345

vpn = 0x12, in-page offset = 0x345

This address lies in virtual page 0x12, at offset 0x345 within that page.

It’s the virtual page number (vpn) that is mapped to the pfn (page frame number)

For example, this page could map to pfn 0x456

Then PA = 0x456|345 (same in-page offset, vpn replaced by pfn)

The MMU does this in nanoseconds once set up with the mapping information in the page table. It maps millions of addresses using the same page table setup, then the OS might change the page table to remap something.

Also note that the x86 PT is in (kernel data) memory, not usually physically inside the MMU as might be thought from pic on pg. 192. The OS can edit the PT by memory access to it.

Next: look at examples on pg. 190 and 191.

Look at Tan’s page table example on p. 190, using 4K pages, so could be an x86 example. Boils down to page number mapping:

page 0 -> pfn 2 VA 0K-4K (really 4K-1), PAs 8K-12K

page 1 -> pfn 1

page 2 -> pfn 6

...

page 6 -> nothing VA 24K-28K

Examples on pg. 191

MOV REG, 8192 (movl 8192, %eax, say)

Here page 0 is 0K-4K in the VA space (4K-1 to be exact), page 1 is 4K-8K, etc., as on pg. 190.

So 8192 = 8K is at the start of page 2. It’s mapped to page 6 in physical memory, page frame 6, with PA= 6*4K = 24K. So the PA access here is to PA 24K. The physical memory is accessed and the value there copied into the REG.

MOV REG, 32780 is accessing VA 32K, which is not mapped here (see “X” in its spot). When the MMU tries to map this address, it fails and causes an exception called a “page fault”. Assuming this process should be able to use this memory, the OS page fault handler gets a page of memory and adds it to the process image.

What if the address is not at the start of a page? If it’s at 8K+20, the PA is 24K+20, and so on. Look at this in hex: 4K = 0x1000 (how handy!) 20 = 0x14, so 8K+20 = 0x2014, and the cooresponding PA is 24K+20 = 0x6014.

VA 0x2014 (virtual page 2) à PA 0x6014 (page frame 6)

In binary: See lowest 12 bits the same, see page numbers in higher bits:

VA 0x2014 = 0010 0000 0001 0100 à

PA 0x6014 = 0110 0000 0001 0100

We see that the last 12 bits are the same in both addresses. This can be called the in-page offset. The page numbers are the parts that are mapped. The page table holds the vpn-pfn mapping, plus other status bits per page.

Page 192 Page Table (first half)

Vpn pfn present bit

7 0 0

6 0 0

5 3 1

4 4 1

3 0 1

2 6 1

1 1 1

0 2 1

The vpns are just the array indexes, so the contents of the array, the actual page table, has pfn’s and present bits in their entries. As discussed on the next page, the present bit is the most important bit, but there are others.

The page table entry (PTE)

Page tables entries (PTEs) are just the array entries of the page table.The general case is shown on pg. 193. Each page is tracked as present/absent, readable or read/writable, accessed (referenced) or not, dirty (modified) or not, and something called “protection”. You can ignore the caching bit.

This pic on pg. 193 is not showing another important bit, the U/S bit, that helps with user-kernel separation. The MMU needs to know which pages are kernel pages to prevent user code from accessing them.

The x86 PTE looks like this, as shown in Tan., pg. 887, important bits:

Page frame number: bits 12-31

U/S

R/W

P: present bit, bit 0 (AKA V bit, for valid)

R/W: writable if 1, bit 1 (a “protection” bit)

U/S: supervisor if 0, user if 1 ßfor user/kernel separation

A: accessed

D: dirty

Also on pg. 887: AMD x64, to show similar bits for 64-bit PTEs, and a much longer pfn to support huge physical memory

Translation Lookaside Buffers, to speed up MMU work: read in book, Sec. 3.3.3

Multi-level page tables—needed for efficient 4-GB address spaces. Otherwise, if PTE = 4 bytes, for 4KB, monolithic PT would be 4GB/1K= 4MB, expensive in memory.

The x86 32-bit paging system, a two-level paging system

A nice system, well adapted to use for any 32-bit OS you might wish to write, and in use by all the versions of UNIX on x86, plus Windows, to implement a “flat address space.” The x86 also offers segmentation, but this is not in use by these OS’s, except in a very trivial way, so we will try to ignore it.

Note: good memory management for 64-bit OS’s is more challenging, and this system is not ready for that. We’ll stick to 32-bit systems.

Uses 4K pages, and a group of 1024 (1K) pages uses one page table (or PT.) 1K*4K = 4M, so each page table specifies VA to PA mappings for 4M of VA space. The SAPC has 4MB of usable memory because it has only one PT set up, plus a "page directory" to point to it. (In fact our SAPC systems have more than 4MB of memory, but it's not usable because of lack of PTs.)

Since it takes 1K blocks of 4M to make 4G, we see that there are (up to) 1K PTs for a full 32-bit address space. Like pic on pg. 199.

The 1K PTs are located with the help of the page directory, the top-level page table. Each lower-level PT has an entry in the page directory with the pfn of the PT to locate it. These page directory entries are 4 bytes long, so the whole page directory fits in 4KB, or one page.

Similarly, each PT has one entry for each page, called a page table entry (PTE), also 4 bytes long. The 1K entries of a PT also fit exactly in one 4KB page.

This system is much better than a monolithic single PT, because the page directory entries can be marked as "not present", meaning there is no PT needed for that area of virtual memory. Thus it can efficiently handle a small program in a process, with maybe one PT for code and data and one for the DLL and one for the stack.

Page Faults (an intro)

Book coverage: pg. 191 (quick intro), pg 228-229 (details)

What about page 6, with a PTE having P=0, not present? An access to page 6 (addresses 0x6000-6fff) will cause a page fault. The MMU is saying “help! I don’t know what to do, please fill in this pfn!!” The page fault is a trap to the OS. The OS can often determine what should be in that page, say by reading the executable file for another page of code. It finishes by filling in a good pfn and setting P=1. Then it causes the CPU to re-execute the instruction, which succeeds this time because P=1 in the PTE.

Note that most instruction executions don’t cause page faults and proceed without a bit of help from the OS. The OS gets involved only occasionally, when a new page is referenced for the first time (or a system call or interrupt comes along.)

Page faults are crucial to demand paging, the core of virtual memory support. A program image starts out with only a few pages of memory, and page faults notify the OS of the needs of the program. This way, memory is provided “just in time.” This concept is called demand paging.

Page Replacement Algorithms: deciding which page to throw out of memory

Tan, Sec. 3.4

Page replacement algorithms make decisions in the most important spot where actual decisions can be made, that is, where the OS has some discretion in management of the memory pages. On the other hand, when pages are brought into memory under demand paging, there is very little choice about it—the program needs the page and cannot progress without it.

The OS maintains a free list of pages to use as needed. It gets a page from the free list at page fault time, and puts pages back on the free list by a kind of garbage collection scheme with the fancy name of the page replacement algorithm. Garbage collection is a real technical term used by Java VM people and compiler writers as well as OS folk.

Demand Paging and the circulation of memory pages.

With demand paging, a program is not “loaded into memory”, but rather only mapped into memory. PTEs with P=0 are set up for it, except for say one page of stack and possibly one page of code, which are given actual physical memory.

When another page is referenced by the program, the P=0 in the PTE causes the MMU to initiate a page fault. In the page fault handler of the kernel, the kernel fills in the page, using a page from its “free list.” (This often requires reading a page from an executable file while the process is blocked.) After the page itself is filled in, the kernel fills in the pfn in the PTE and sets P=1. Then it gets the process to re-execute the instruction that caused the page fault. More on this later.

We see that the memory assigned to a process grows incrementally. The pages of a process “age”. Page “age” is the length of time since the last reference to the page.

Old pages of a process are probably not as important to the program as the newer one, and can be removed, i.e., garbage-collected. That’s the underlying idea of “LRU” least-recently-used page replacement algorithm. The removed pages are put on the free list. If they are needed later, they can always be paged back in.

We see that the pages go round and round in the system, spending time on the free list and then in some process image (or possibly in a page table or other kernel object—the kernel uses paging for some of its own data too.)

Tan. mentions the free list on pg. 769, as maintained by the page daemon.

Add this discussion to the reading list.