Thurs, Nov. 9: Chap. 4, Memory Management

Memory Management Reading: Chap 4 to p. 199, 202-212, 214-232, UNIX case: 714-723, Win32 case: 811-823

Primitive Memory Management, and how to do better

Examples: SAPC, MSDOS

 

Both the user code and OS code sit in the same memory space, and each can see the other.  The data also sits in this space, mutually accessible, and must be writable (the code might be in ROM, and thus protected from writes.)  The user code can, accidentally or deliberately, corrupt the OS data.  Thus the system is seriously insecure.

 

What do we need in order to get the needed user-OS separation?  We need help from the hardware, plus software that uses it properly.

 

More specifically, we need help from memory management hardware (the MMU) to restrict memory access based on user vs kernel execution.  That means we need to have the CPU keep track of whether it’s executing user or kernel code, and make sure the transition between the two is very carefully handled.

 

User vs. Kernel Mode Execution

All CPUs used for serious OS’s have at least two modes of execution, for user and kernel.  In fact, the x86 has 4 “rings” for this, but only two of them are in use by UNIX or Win2K.  Thus we will stick to the simple picture of two levels, user and kernel.  In the x86, the current execution mode (ring number) is stored in the CS (code segment descriptor) register of the CPU.
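As a concrete illustration: since the ring number sits in the low two bits of the CS selector, extracting it is just a mask.  A minimal sketch (the selector values in the usage note below are examples for illustration, not guaranteed on any particular OS):

```c
#include <stdint.h>

/* On the x86, the low 2 bits of a code segment selector hold the
 * current privilege level (ring number): 0 = kernel, 3 = user. */
unsigned ring(uint16_t selector) {
    return selector & 3;
}
```

For example, a selector value of 0x10 decodes to ring 0 (kernel), while 0x23 decodes to ring 3 (user).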

 

User-Kernel Transitions

We have already studied the mechanism of system calls, and seen how careful they are to transfer only a minimal amount of information from the user execution environment to the kernel: just a syscall number and the arguments.  The whole syscall execution looks, from the user code's point of view, like the execution of a single instruction, the syscall trap instruction.

 

Interrupts during user execution also make the user-kernel mode transition.  We have seen that they utilize the same basic CPU trap cycle as the system calls, with the additional trick of turning off interrupts, and some way of getting a device identifier.  They can’t even be detected by user code unless it does high-precision timekeeping.

 

The MMU (part of the CPU)

The MMU needs to know which addresses are user and which are kernel, as well as the current CPU mode, user or kernel.  Then, when it is handling a certain address, it can detect a user-mode access to a kernel address and make it fail.  We will see how this works.

 

The MMU is another helper to the CPU proper, but it is not like the other helper, the PIC, which we called the “secretary to the CPU” because it remembered things and organized work for the CPU.  The MMU just takes orders, more like a telephone for the boss: type in a number, and the phone places that particular call.  Similarly, the CPU asks the MMU for the data at a certain address.

 

Tan, p. 193: i/o overlap:  the point here is that we need multiple processes in memory at one time, to fully utilize the CPU using timesharing.  One process can block, and another can run while the first waits.  That’s the kind of system we need to have.  If the second process had to be brought into memory when the first blocked, that would take i/o itself, so we wouldn’t be gaining anything.

 

Swapping: We had studied the earlier picture on pg. 141, and put a couple of processes out on the disk, swapped out.  In current systems, this normally happens only to idle processes (idle for multiple minutes.)  If a system is running too many processes to fit the non-idle ones all in memory (at least their most important pages), it is in trouble (overloaded).  The process swap rate can be used as a danger signal.

 

Paging: This is the crucial method of memory management for systems operating within their normal range (not overloaded.)  If a process can be provided with memory under its most commonly-used pages, and extra memory as needed for less-commonly used pages, it can run well and share the memory efficiently with other processes.

 

The memory is divided up into equal-sized blocks called pages, typically 4K (x86) or 8K (Sparc) in size.  The VA space for each process is the memory it sees and works with.  It is quietly divided into pages; the page division marks are not apparent to programmers.  If the program extends even one byte into a page, the whole page is in use (in theory, but if no reference is ever made to that byte, the page may never be given real memory.)  Thus we don’t want pages to be too big, but not too small either, because each page needs to be kept track of, an overhead.

 

The physical memory is divided up into the same size blocks, called page frames.  All the page frames are exactly the same in terms of memory resources.  The idea of paging is that any page frame can be used as memory under any (virtual) page.  A page table (PT for short) tells which page frame is assigned to each page of (some part of) a VA space.
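The page/offset split can be sketched in C: with 4K pages, the low 12 bits of a VA are the offset within the page, and the remaining high bits are the page number.  A toy sketch, not any real MMU's code:

```c
#include <stdint.h>

#define PAGE_SHIFT 12                 /* 4K pages: 2^12 = 4096 */
#define PAGE_SIZE  (1u << PAGE_SHIFT)

/* Split a VA into its page number and its offset within the page. */
uint32_t page_number(uint32_t va) { return va >> PAGE_SHIFT; }
uint32_t page_offset(uint32_t va) { return va & (PAGE_SIZE - 1); }
```

So VA 8196 (= 8K + 4) is offset 4 within page 2; the PT swaps the page number for a page frame number, and the offset carries over unchanged.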

 

Parts of the CPU, pg. 203 pic

Here we see that the MMU is part of the CPU, specifically the part closer to the bus, which makes sense since the bus carries the addresses out to memory to specify which RAM chip is accessed (the physical location.)  The upper part of the CPU contains the ALU, the arithmetic-logical unit, i.e. the brains of the CPU.  This part works in VAs.  You could add the registers to the upper box: these all contain VAs if they contain addresses.

The PC (x86 EIP) is a VA, and specifies which next instruction to pull in from memory.  A read request by VA is generated in the upper CPU and passed to the MMU, which maps it to a PA (with the help of a page table), and the PA specifies the exact physical location.

The x86 paging system

 

A nice system, well adapted to use for any 32-bit OS you might wish to write, and in use by all the versions of UNIX on x86, plus Win2K, to implement a “flat address space.”  The x86 also offers segmentation, but this is not in use by these OS’s, except in a very trivial way, so we will try to ignore it.

 

Note: good memory management for 64-bit OS’s is more challenging, and this system is not ready for that.  We’ll stick to 32-bit systems.

 

x86 paging uses 4K pages, and a group of 1024 (1K) pages is covered by one page table (or PT.)  1K*4K = 4M, so each page table specifies VA-to-PA mappings for 4M of VA space.  The SAPC has 4MB of usable memory because it has only one PT set up, plus a "page directory" to point to it. (In fact our SAPC systems have more than 4MB of memory, but it's not usable for lack of PTs.)

Since it takes 1K blocks of 4M to make 4G, we see that there are (up to) 1K PTs for a full 32-bit address space.


If there were only one PT, it would have to be huge, a waste of memory for tiny programs.  By using a two-level PT system, like Figure 4-12, pg. 208, the setup can be customized to the actual process configuration of memory.

 


The 1K PTs are located with the help of the page directory, the top-level page table.  Each lower-level PT has an entry in the page directory holding the page frame number (pfn) of that PT, to locate it.  These page directory entries are 4 bytes long, so the whole page directory fits in 4KB, or one page.

Similarly, each PT has one entry for each page, called a page table entry (PTE), also 4 bytes long.  The 1K entries of a PT also fit exactly in one 4KB page.
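In C-like terms, the two-level lookup just slices the 32-bit VA into three fields: 10 bits of page directory index, 10 bits of PT index, and 12 bits of offset (10 + 10 + 12 = 32).  A sketch of the field extraction only; the actual walk through the directory and PT is done by the MMU hardware:

```c
#include <stdint.h>

/* x86 32-bit VA layout: [ dir index: 10 | table index: 10 | offset: 12 ].
 * 1024 4-byte entries per directory or PT = 4096 bytes = exactly one page. */
uint32_t dir_index(uint32_t va)   { return va >> 22; }           /* top 10 bits */
uint32_t table_index(uint32_t va) { return (va >> 12) & 0x3ff; } /* middle 10 bits */
uint32_t va_offset(uint32_t va)   { return va & 0xfff; }         /* low 12 bits */
```

For example, VA 0x00402123 has dir index 1, table index 2, and offset 0x123.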

 

This system is much better than a monolithic single PT, because the page directory entries can be marked as "not present", meaning there is no PT needed for that area of virtual memory.

 

Look at Tan’s page table example on p. 204.  It uses 4K pages, so it could be an x86 example.  It boils down to this page-number mapping:

page 0 -> pfn 2   VA 0K-4K (really 4K-1), PAs 8K-12K

page 1 -> pfn 1

page 2 -> pfn 6

...

page 6 -> nothing (P=0)  VA 24K-28K (really 28K-1)  <-- on x86, V bit = 0 here

 

Tan’s instruction example:  MOV REG, 8192

VA = 8192 = 8K = 0x2000, start of page 2, so pfn = 6, PA = 6*4K = 24K = 0x6000.  Here offset = 0.

For VA = 8196 = 8K + 4 = 0x2004: same page, pfn = 6, offset = 4, PA = 0x6004.
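Tan's example can be checked with a toy one-level translator.  The table below fills in only the mappings the notes list (pages 0, 1, 2, and 6); the entries for pages 3, 4, 5, and 7 are made-up placeholders for this sketch:

```c
#include <stdint.h>

#define PAGE_SIZE 4096u
#define NO_FRAME  0xffffffffu   /* stands in for P=0 (page not present) */

/* Mappings from the notes: page 0 -> pfn 2, 1 -> 1, 2 -> 6, 6 -> absent.
 * Pages 3, 4, 5, and 7 hold arbitrary filler values. */
static const uint32_t pt[8] = {2, 1, 6, 0, 4, 3, NO_FRAME, 0};

/* One-level translate: VA -> PA, or NO_FRAME for a page fault. */
uint32_t translate(uint32_t va) {
    uint32_t pfn = pt[(va / PAGE_SIZE) % 8];
    if (pfn == NO_FRAME)
        return NO_FRAME;        /* real hardware would trap to the OS here */
    return pfn * PAGE_SIZE + va % PAGE_SIZE;
}
```

This reproduces the numbers above: VA 8192 maps to PA 0x6000, VA 8196 to 0x6004, VA 0 to 8K, and any VA in page 6 faults.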