CS444 Class 19 Paging (no class Wed., 11/27, Thanksgiving eve)

 

From last time:

Parts of the CPU, pg. 189 pic

All addresses we normally use as programmers are virtual addresses. The virtual machine that the OS provides to a process has memory accessible by VA. The addresses in CPU registers are VAs, for example, the ESP.

The PC (x86 EIP) is a VA, and specifies which next instruction to pull in from memory.  A memory read request by VA is generated in the upper CPU and passed to the MMU, which maps it to a PA (with the help of a page table), and the PA specifies the exact physical location.

 

Recall that paging divides memory into fixed-size blocks; we'll assume they are 4KB, the x86 page size.  4K = 0x1000, so it is easy to analyze a hex address to see which page it involves:

 

Example

VA = 0x12345 = 0x12|345

                                vpn = 0x12, in-page offset = 0x345

This address lies in virtual page 0x12, at offset 0x345 within that page.

 

It’s the virtual page number (vpn) that is mapped to the pfn (page frame number)

For example, this page could map to pfn 0x456

Then PA = 0x456|345 = 0x456345  (same in-page offset, vpn replaced by pfn)
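
A quick C sketch of this split-and-recombine arithmetic (PAGE_SHIFT and the variable names are just for this sketch):

#include <stdio.h>

#define PAGE_SHIFT 12                 /* 4K pages: low 12 bits are the offset */
#define PAGE_SIZE  (1 << PAGE_SHIFT)

int main(void)
{
    unsigned long va = 0x12345;
    unsigned long vpn    = va >> PAGE_SHIFT;            /* 0x12  */
    unsigned long offset = va & (PAGE_SIZE - 1);        /* 0x345 */

    unsigned long pfn = 0x456;                          /* would come from the page table */
    unsigned long pa  = (pfn << PAGE_SHIFT) | offset;   /* 0x456345 */

    printf("va=0x%lx vpn=0x%lx offset=0x%lx pa=0x%lx\n", va, vpn, offset, pa);
    return 0;
}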

 

The MMU does this in nanoseconds once set up with the mapping information in the page table.  It maps millions of addresses using the same page table setup, then the OS might change the page table to remap something.

 

Also note that the x86 PT is in (kernel data) memory, not usually physically inside the MMU as one might think from the pic on pg. 192.  The OS can edit the PT by ordinary memory accesses to it.

 

Next: look at examples on pg. 190 and 191. 

 

Look at Tan.'s page table example on pg. 190, using 4K pages, so it could be an x86 example.  It boils down to this page-number mapping:

page 0 -> pfn 2   VA 0K-4K (really 4K-1), PAs 8K-12K

page 1 -> pfn 1

page 2 -> pfn 6

...

page 6 -> nothing   VA 24K-28K

Examples on pg. 191

MOV REG, 8192  (movl 8192, %eax, say)

Here page 0 is 0K-4K in the VA space  (4K-1 to be exact), page 1 is 4K-8K, etc., as on pg. 190.

So  8192 = 8K is at the start of page 2.  It’s mapped to page 6 in physical memory, page frame 6, with PA= 6*4K = 24K.  So the PA access here is to PA 24K. The physical memory is accessed and the value there copied into the REG.

MOV REG, 32780  is accessing VA 32K+12, in virtual page 8, which is not mapped here (see "X" in its spot). When the MMU tries to map this address, it fails and causes an exception called a "page fault".  Assuming this process should be able to use this memory, the OS page fault handler gets a page of memory and adds it to the process image.

What if the address is not at the start of a page?  If it's at 8K+20, the PA is 24K+20, and so on.  Look at this in hex:  4K = 0x1000 (how handy!)  20 = 0x14, so 8K+20 = 0x2014, and the corresponding PA is 24K+20 = 0x6014.

VA 0x2014 (virtual page 2)  ->  PA 0x6014  (page frame 6)

In binary, the lowest 12 bits are the same, and the page numbers are in the higher bits:

VA 0x2014 = 0010 0000 0001 0100  ->

PA 0x6014 = 0110 0000 0001 0100

We see that the last 12 bits are the same in both addresses. This can be called the in-page offset.  The page numbers are the parts that are mapped.  The page table holds the vpn-pfn mapping, plus other status bits per page.
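
Here is a small C sketch that mimics the MMU for the pg. 190 mapping (the array covers only the first 8 vpns listed in these notes; -1 stands in for "not present"):

#include <stdio.h>

/* pg. 190 mapping: index = vpn, value = pfn, -1 = not mapped */
static int page_table[8] = { 2, 1, 6, 0, 4, 3, -1, -1 };

/* Return the PA, or -1 to stand in for a page fault */
long translate(unsigned long va)
{
    unsigned long vpn    = va >> 12;
    unsigned long offset = va & 0xfff;

    if (vpn >= 8 || page_table[vpn] < 0)
        return -1;                     /* MMU would raise a page fault here */
    return ((long)page_table[vpn] << 12) | offset;
}

int main(void)
{
    printf("VA 0x2014 -> PA 0x%lx\n", translate(0x2014));  /* 0x6014 */
    printf("VA 0x800c -> %ld\n", translate(0x800c));       /* -1: the 32780 case, page fault */
    return 0;
}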

Page 192 Page Table (first half)

vpn   pfn   present bit
 7     0      0
 6     0      0
 5     3      1
 4     4      1
 3     0      1
 2     6      1
 1     1      1
 0     2      1

 

The vpns are just the array indexes, so the array contents, the actual page table entries, hold the pfns and present bits.  As discussed on the next page, the present bit is the most important bit, but there are others, helping with user-kernel memory separation, supporting read-only pages, etc.
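
As a sketch (not any real kernel's declaration), the pg. 192 table could be written in C as an array indexed by vpn:

/* One page table entry: pfn plus a present bit (other status bits omitted) */
struct pte {
    unsigned int pfn     : 20;
    unsigned int present : 1;
};

/* pg. 192 table, first half; the array index is the vpn */
struct pte page_table[8] = {
    [0] = { .pfn = 2, .present = 1 },
    [1] = { .pfn = 1, .present = 1 },
    [2] = { .pfn = 6, .present = 1 },
    [3] = { .pfn = 0, .present = 1 },
    [4] = { .pfn = 4, .present = 1 },
    [5] = { .pfn = 3, .present = 1 },
    [6] = { .present = 0 },             /* not present; pfn is meaningless */
    [7] = { .present = 0 },             /* not present; pfn is meaningless */
};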

 

The page table entry (PTE)

Page table entries (PTEs) are just the array entries of the page table. The general case is shown on pg. 193.  Each page is tracked as present/absent, read-only or read/write, accessed (referenced) or not, dirty (modified) or not, plus something called "protection". You can ignore the caching bit.

The pic on pg. 193 does not show another important bit, the U/S bit, which helps with user-kernel separation.  The MMU needs to know which pages are kernel pages so it can prevent user code from accessing them.

 

The x86 PTE looks like this, as shown in Tan., pg. 887 (important bits only):

 

 bit:  31 .................. 12   ...   6   5   ...   2     1     0
       [ page frame number      ]       D   A        U/S   R/W    P

P: present bit, bit 0 (AKA V bit, for valid)

R/W: writable if 1, bit 1  (a “protection” bit)

U/S: supervisor if 0, user if 1, bit 2  (for user/kernel separation)

A: accessed

D: dirty

 

For example, consider a PTE of value 0x00003007. Since there are 12 bits of flags, we can separate the hex digits like this   00003|007, and expand the flag bits to

0x007 =  binary 0000 0000 0111. We see that the lowest 3 bits are on, so this page is a user page, writable, and present.  Its A bit is 0, so the page has never been accessed (referenced). Its D bit is also 0, so the page has never been written to. Since the P bit is on, the pfn is valid.  The pfn is 0x00003, i.e. page frame 3.  This PTE maps some virtual page to physical page frame 3. From the given info, we don't know what virtual page is being mapped.
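
A quick C sketch of pulling these fields out of a 32-bit PTE (the mask names are mine, not from any particular kernel, but the bit positions are the standard x86 ones):

#include <stdio.h>

#define PTE_P   0x001   /* present  (bit 0) */
#define PTE_RW  0x002   /* writable (bit 1) */
#define PTE_US  0x004   /* user     (bit 2) */
#define PTE_A   0x020   /* accessed (bit 5) */
#define PTE_D   0x040   /* dirty    (bit 6) */

int main(void)
{
    unsigned int pte = 0x00003007;

    printf("pfn      = 0x%x\n", pte >> 12);       /* 0x3 */
    printf("present  = %d\n", !!(pte & PTE_P));   /* 1 */
    printf("writable = %d\n", !!(pte & PTE_RW));  /* 1 */
    printf("user     = %d\n", !!(pte & PTE_US));  /* 1 */
    printf("accessed = %d\n", !!(pte & PTE_A));   /* 0 */
    printf("dirty    = %d\n", !!(pte & PTE_D));   /* 0 */
    return 0;
}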

 

With the additional info that this is PTE 3 of a page table (table index 3), we can say that this PTE maps vpn 3 to pfn 3, the identity map.  This PTE is from the SAPC’s single page table.  More on this soon.

 

Also on pg. 887: AMD x64, to show similar bits for 64-bit PTEs, and a much longer pfn to support huge physical memory

Translation Lookaside Buffers, to speed up MMU work: read in book, Sec. 3.3.3

Multi-level page tables: needed for efficient handling of 4-GB address spaces. Otherwise, with 4-byte PTEs and 4KB pages, a monolithic PT would need 4GB/4KB = 1M entries, i.e. 4MB of page table per process, expensive in memory.

The x86 32-bit paging system, a two-level paging system

 

A nice system, well adapted to use for any 32-bit OS you might wish to write, and in use by all the versions of UNIX on x86, plus Windows, to implement a “flat address space.”  The x86 also offers segmentation, but this is not in use by these OS’s, except in a very trivial way, so we will try to ignore it.

 

Note: good memory management for 64-bit OS's is more challenging, and this two-level system does not scale to that.  We'll stick to 32-bit systems.

 

Uses 4K pages, and a group of 1024 (1K) pages uses one page table (or PT.)  1K*4K = 4M, so each page table specifies VA to PA mappings for 4M of VA space.  The SAPC has 4MB of usable memory because it has only one PT set up, plus a "page directory" to point to it. (In fact our SAPC systems have more than 4MB of memory, but it's not usable because of lack of PTs.)

 

Since it takes 1K blocks of 4M to make 4G, we see that there are (up to) 1K PTs for a full 32-bit address space.  Like pic on pg. 199.

 

The 1K PTs are located with the help of the page directory, the top-level page table.  Each lower-level PT has an entry in the page directory with the pfn of the PT to locate it.  These page directory entries are 4 bytes long, so the whole page directory fits in 4KB, or one page.

 

Similarly, each PT has one entry for each page, called a page table entry (PTE), also 4 bytes long.  The 1K entries of a PT also fit exactly in one 4KB page.
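
To see how a VA picks out a page directory entry and then a PTE, here is a minimal C sketch of the address split (helper names are mine; a real MMU walk also reads the page directory entry and PTE from memory, which this skips):

/* x86 32-bit VA layout for two-level paging:
 *   bits 22-31: index into the page directory (selects one PT)
 *   bits 12-21: index into that PT            (selects one page)
 *   bits  0-11: offset within the 4K page
 */
unsigned int dir_index(unsigned int va) { return va >> 22; }
unsigned int pt_index(unsigned int va)  { return (va >> 12) & 0x3ff; }
unsigned int pg_offset(unsigned int va) { return va & 0xfff; }

/* Example: va = 0x00403014
 *   dir_index = 1     (second PT, covering VAs 4MB-8MB)
 *   pt_index  = 3     (4th page within that 4MB region)
 *   pg_offset = 0x14
 */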

 

This system is much better than a monolithic single PT, because the page directory entries can be marked as "not present", meaning there is no PT needed for that area of virtual memory.  Thus it can efficiently handle a small program in a process, with maybe one PT for code and data and one for the DLL and one for the stack.

 

<picture of one page directory, pointing to 3 PTs, for code/data, DLL, stack, and the PTs pointing to some pages.>

 

In this case, the memory overhead of paging is only 4 pages, one for the page dir, 3 for the PTs.

 

Page Faults (an intro)

Book coverage: pg. 191 (quick intro), pg 228-229 (details)

 

What about page 6, with a PTE having P=0, not present?  An access to page 6 (addresses 0x6000-6fff) will cause a page fault.  The MMU is saying “help!  I don’t know what to do, please fill in this pfn!!”  The page fault is a trap to the OS.  The OS can often determine what should be in that page, say by reading the executable file for another page of code.  It finishes by filling in a good pfn and setting P=1.  Then it causes the CPU to re-execute the instruction, which succeeds this time because P=1 in the PTE.
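
In rough C (helper names like alloc_frame and set_pte are invented for this sketch, not a real kernel API), the fault-time flow looks something like this:

/* Invented helpers, declared so the sketch is complete */
int  va_is_legal_for_process(unsigned int va);
void kill_process(void);
unsigned int alloc_frame(void);                        /* page from the free list */
void fill_frame(unsigned int pfn, unsigned int vpn);   /* e.g. read from the exe  */
void set_pte(unsigned int vpn, unsigned int pfn, int present);

void page_fault_handler(unsigned int faulting_va)
{
    unsigned int vpn = faulting_va >> 12;

    if (!va_is_legal_for_process(faulting_va)) {
        kill_process();                /* bad address: e.g. segfault */
        return;
    }
    unsigned int pfn = alloc_frame();  /* get a free page frame            */
    fill_frame(pfn, vpn);              /* e.g. read a page of code from    */
                                       /* the executable, or zero-fill     */
    set_pte(vpn, pfn, 1);              /* fill in the pfn and set P = 1    */
    /* return from the trap: the CPU re-executes the faulting instruction */
}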

 

Note that most instruction executions don’t cause page faults and proceed without a bit of help from the OS.  The OS gets involved only occasionally, when a new page is referenced for the first time (or a system call or interrupt comes along.)

 

Page faults are crucial to demand paging, the core of virtual memory support.  A program image starts out with only a few pages of memory, and page faults notify the OS of the program's needs as it runs.  This way, memory is provided "just in time"; that is the idea of demand paging.

 

Page Replacement Algorithms: deciding which page to throw out of memory

Tan, Sec. 3.4

 

Page replacement algorithms work at the one spot where the OS actually has some discretion in managing the memory pages.  By contrast, when pages are brought into memory under demand paging, there is very little choice about it: the program needs the page and cannot progress without it.

 

The OS maintains a free list of pages to use as needed.  It gets a page from the free list at page fault time, and puts pages back on the free list by a kind of garbage collection scheme with the fancy name of the page replacement algorithm.  Garbage collection is a real technical term used by Java VM people and compiler writers as well as OS folk.
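
A minimal sketch of the free-list idea in C, treating it as a simple stack of free pfns (a real kernel's version is fancier):

#define NFRAMES 1024

static unsigned int free_stack[NFRAMES];   /* pfns available for use */
static int nfree = 0;                      /* how many are free      */

/* Called at page fault time to get a fresh page frame */
unsigned int alloc_frame(void)
{
    /* if nfree == 0 we would have to run page replacement first */
    return free_stack[--nfree];
}

/* Called by the page replacement code ("garbage collection")
 * after it evicts a page and frees its frame */
void free_frame(unsigned int pfn)
{
    free_stack[nfree++] = pfn;
}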

 

 

Tan. mentions the free list on pg. 769, as maintained by the page daemon.

Add this discussion to the reading list.