Intro to hw5—go over handout
Challenges of the two options:
preemptive scheduler—get ticks working in kernel, call schedule from tick handler as appropriate
Instrumentation: more on this later.
semaphores—new syscalls, but that should be no problem: new code in ulib.s, sysup, sysdown, etc. in kernel
See Tanenbaum pg. 110 for definitions of up and down.
Need sem struct in kernel, with sem count and a list of waiting processes or equivalent
—can use intqueue, supplied queue of ints (pids), or use proctab[i].waitcode values (with a unique waitcode for each semaphore).
sysdown(int semid): decrement semid’s count if it’s positive, sleep if appropriate, but as usual, may be wakened inappropriately, so loop over sleeps is needed.
sysup(int semid): increment semid’s count as appropriate, call wakeup, which wakes up all waiters on semid’s waitcode
The test for “do I need to go back to sleep?” in sysdown is quite tricky.
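A minimal sketch of the up/down pair, assuming a small semtab in the kernel data area. The sleep_on stand-in is hypothetical: it only marks the spot where the real kernel would block the process and call the scheduler, so this single-threaded version can exercise only the non-blocking path.

```c
#include <assert.h>

#define NSEM 4

/* Hypothetical kernel table: each entry holds the count; the real kernel
   also keeps a list of waiting pids (intqueue) or uses a per-semaphore
   waitcode in proctab. */
struct sem {
    int count;
};

static struct sem semtab[NSEM];

/* Stand-in for the kernel's sleep: the real one puts the caller on the
   wait list, marks it BLOCKED, and calls schedule(); control returns
   only after a sysup() wakeup.  This sketch cannot actually block. */
static void sleep_on(int semid) {
    (void)semid;
    assert(0 && "would block in the real kernel");
}

/* down(): decrement if positive; otherwise sleep and RE-TEST on wakeup,
   because another wakened waiter may grab the count first. */
void sysdown(int semid) {
    while (semtab[semid].count <= 0)
        sleep_on(semid);     /* the loop over sleeps is the tricky part */
    semtab[semid].count--;
}

/* up(): increment and wake ALL waiters on this semaphore's waitcode;
   each wakened process re-runs the test in sysdown. */
void sysup(int semid) {
    semtab[semid].count++;
    /* real kernel: wakeup(waitcode_for(semid)); */
}
```

The while loop (rather than a simple if) is the point emphasized above: wakeup wakes every waiter, but only one of them can win the count.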
For testing with testmutex, you can be sloppy because there is at most one waiter. Prodcons is a better test once testmutex is working. You could add another waiter to testmutex by extending main1 with code similar to main2 and main3, that is, contesting for the mutex.
Timer Device: needed for hw5’s preemptive scheduler (review from CS341)
The timer device of the PC is a chip (or anyway digital logic) that contains a 16-bit counter and some logic controlling it. A counter is a register (array of bits) plus some logic to increment and/or decrement the number in the counter. In this case the counter decrements, so we talk of “downcounts.”
In addition to the counter register, the timer has a “holding register”, also 16 bits, which holds an initial count for the counter. See Tan, pg. 328.
The timer device is connected to a crystal oscillator that puts out a simple square wave at 1.193 MHz, i.e., one cycle every microsecond (a little under). On every cycle of the oscillator, the counter decrements by one. Eventually the count reaches zero. At this point, it reloads the initial count from the holding register and starts downcounting again. In addition, at each zero it sends an IRQ signal out. This happens over and over.
The timer generates a sequence of periodic interrupts, or “ticks”. Its IRQ line is connected to IR0 of the PIC, so the PIC treats the timer as the highest-priority device in the system, with IRQ 0. Using the Linux/SAPC initialization of the PIC, the PIC sends in an interrupt with nn = 0x20, and so we need to put the tick handler address in IDT[0x20].
The highest possible initial count is 64K (2^16). 64K cyc / 1.193x10^6 cyc/sec ≈ 55 ms, the longest possible period between ticks. For shorter periods we use smaller initial counts. For example, for 20ms ticks, we need
20x10^-3 sec * 1.193x10^6 cyc/sec = 23860 cyc
i.e., 23860 is the needed initial count in the holding register. Converting to hex, this is 0x5d34, needing two bytes. This 16-bit number is output to an 8-bit port in two parts. Here’s the protocol, at the lowest level:
Timer Initialization for 20ms ticks
outb 00 11 010 0 = 0x34 to i/o port 0x43 (control port)
outb 0x34 to i/o port 0x40 (low byte of inicount to count port)
outb 0x5d to i/o port 0x40 (high byte of inicount to count port)
See $pcinc/timer.h for details on the control register fields. See timetest.c in $pcex for decent C code to borrow. Note the two outb accesses to the same port. The timer detects the write of its control port and expects the two follow-up writes to its count port. It catches each 8-bit value and puts it in the right place in its holding register. The counter keeps downcounting all the while.
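The three-write protocol can be sketched in C along the lines of timetest.c. Here outb is a stub that just logs the port writes so the sequence can be checked; in the kernel it would be the real I/O instruction, and the macro names are illustrative rather than copied from $pcinc/timer.h.

```c
#include <stdint.h>

/* Port numbers from the protocol above.  The control word 0x34 is the
   bit pattern 00 11 010 0: counter 0, access low-then-high byte,
   mode 2 (rate generator), binary counting. */
#define TIMER_CTL   0x43
#define TIMER_CNT0  0x40
#define CTL_WORD    0x34
#define CLOCK_HZ    1193000   /* approx. 1.193 MHz input clock */

/* Stub outb: log each (port, value) write for inspection. */
static struct { uint16_t port; uint8_t val; } iolog[8];
static int nio;
static void outb(uint16_t port, uint8_t val) {
    iolog[nio].port = port;
    iolog[nio].val = val;
    nio++;
}

/* Program counter 0 for periodic ticks every ms_per_tick milliseconds. */
void timer_init(int ms_per_tick) {
    uint16_t count = (uint16_t)((CLOCK_HZ / 1000) * ms_per_tick);
    outb(TIMER_CTL, CTL_WORD);        /* announce: two count bytes follow */
    outb(TIMER_CNT0, count & 0xff);   /* low byte of inicount */
    outb(TIMER_CNT0, count >> 8);     /* high byte of inicount */
}
```

For 20ms ticks the computed count is 23860 = 0x5d34, so the two count-port writes are 0x34 then 0x5d, matching the protocol above (the low byte happens to equal the control word by coincidence).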
You can’t turn the timer off, only change its period. You can’t turn off its interrupts either. Luckily, we can mask off its interrupts at the PIC, and that’s how peace is attained normally on the SAPC. To get interrupts working, we need to set a tick handler address in IDT[0x20] and then unmask the 0 bit in the PIC’s mask register.
Back to Memory Management to finish up
Quick summary of page replacement algorithms
LRU pretty good but too expensive (LRU-2 even better, even more expensive); FIFO no good; some others.
--OSes use clock algorithms (approximating LRU) and/or working-set algorithms (for inter-process fairness)
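The basic one-handed clock (second chance) can be sketched in a few lines; this is a simplified cousin of the two-handed version on pg. 718, with illustrative names rather than any particular kernel's.

```c
/* One-handed clock: each physical frame has a reference bit, set by the
   hardware on access.  The hand sweeps the frames, clearing set bits
   (the "second chance"), and evicts the first frame whose bit is
   already clear. */
#define NFRAMES 4

static int refbit[NFRAMES];   /* 1 = referenced since last sweep */
static int hand;              /* current clock-hand position */

/* Returns the frame number to evict. */
int clock_evict(void) {
    for (;;) {
        if (refbit[hand] == 0) {
            int victim = hand;
            hand = (hand + 1) % NFRAMES;
            return victim;
        }
        refbit[hand] = 0;               /* give it a second chance */
        hand = (hand + 1) % NFRAMES;
    }
}
```

A frame referenced on every sweep is never evicted, which is how the clock approximates LRU at a tiny fraction of the bookkeeping cost.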
Memory Management Reading: Chap 4 to pg. 232 as mentioned before, skip page 233 for sure (or fix it.) You can skip 234-240, Pick up again at 4.6.5 and read to pg. 249. Also read 10.4 to pg. 719 on UNIX MM algorithms. Also 11.5 to pg 820 on Win2K algorithms.
Tan, Sec. 4.7 pp 242-243. MMU actions at process switch:
At a process switch, the CR3 register is loaded with the PA of the scheduled process’s page directory, and that causes a whole new process image to be mapped in. The caches need to be flushed, both the instruction/data caches and the TLBs. Just after a process switch, cache misses are frequent until the caches again hold the important data. This cache flush is a big performance cost attached to a process switch.
Note that switching between threads of one process does not involve reloading CR3 or flushing caches, and thus is significantly more lightweight.
p. 241: Note copy-on-write (COW) trick to speed up fork.
MMU and system security
The MMU is very important to the job of keeping a user process "bottled up" in its virtual machine. Each address in a user program is checked on every instruction, so the process can't see anything it shouldn't, in particular, the kernel code and data.
The MMU causes a page fault or general protection exception for addresses that fail its checks, and this causes the kernel to execute and figure out what to do. Each page is marked U (user) or S (supervisor); the kernel can access S pages freely, while a user-level access to them generates an exception.
Naturally, the instruction to change the CR3 is privileged. The page tables and page directory are hidden away from the user program in the kernel data area.
Page Fault Handling in x86.
What’s a linear address (LA)? Gets us into segmentation...
A little bit about segmentation (optional, not covered in lecture)
The full x86 memory architecture involves segmentation as well as paging, so the address translation goes through two mappings, first by segmentation (VA -> LA) and then by paging (LA -> PA): VA -> LA -> PA. We have been trying to ignore the middle step and say VA -> PA by paging. This is defensible in UNIX or Win2K because both of these systems set up flat address spaces, with segments covering the whole 32-bit range, so segmentation changes nothing.
With a flat address space, the addressing in user memory is simplified to a sequence of 32-bit addresses from some lower limit to some upper limit, holding both code and data. Addressing in the kernel is similarly simplified. (Full VAs under real segmentation look like 0018:00010234 for a code address and 0024:00001234 for a data address, etc., code and data live in different segments with no useful relative position.)
In the SAPC, the 4M of VA space is mapped by segmentation to the beginning of the last 1G of LA space, starting at 3G = 0xc0000000. Then paging maps that 4M of LAs to the first 4M of PA space. Address 0x1234, a VA, is mapped to LA 0xc0001234 and then to PA 0x1234. When you see an LA, just drop the leading 0xc0 (i.e., subtract 0xc0000000), and you’ll see the VA or PA.
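The SAPC arithmetic is easy to check; a two-line sketch of the two mapping steps:

```c
#include <stdint.h>

/* SAPC mapping as described above: segmentation adds the 3G base,
   then paging maps that LA back down to the identical PA. */
#define SEG_BASE 0xc0000000u

uint32_t va_to_la(uint32_t va) { return va + SEG_BASE; }
uint32_t la_to_pa(uint32_t la) { return la - SEG_BASE; }
```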
F06: the following was covered in the next lecture, but was meant to be here--
Back to the page fault handler: look at Tan., pg. 243 for steps.
Tan., pg. 244. Instruction backup not such a problem in x86 or Sparc, because there is no auto-incrementation done along with memory access, and there is at most one operand memory address per instruction.
Examples of PFs
Note that a PF is more like a system call than an interrupt. They are both exceptions or “traps.” When the kernel is executing after a trap, it is executing on behalf of the current process, so the process entry and process image are relevant and usable. No problem in blocking. An interrupt is quite different. Interrupt handlers execute as guests of a “random” process. They normally don’t access process data, only kernel global data relevant to their device.
Tan, pg. 710 UNIX memory management.
Discussion of process image regions—don’t forget DLLs too now.
pg. 711 pic. showing 2 processes in memory sharing their code pages. But note that only one of these is “current” in use of the CPU (unless there are multiple CPUs.) The other one is in memory but not scheduled at the moment. On the x86, the current one has the CPU register CR3 pointing at its page directory, which makes its whole process image mapped in. Other CPU architectures have similar master registers for the top level of their paging support.
Memory-mapped files.
A region of a file can be mapped into a process image. Then writes to that part of VA space cause corresponding writes to the file pages. If two processes map the same region of a file, they end up with shared memory with data that persists in the filesystem. However, this is not commonly used in applications, partly because the solution is not portable, and partly because there are subtle issues such as exactly when the file writes occur.
Often, the memory-mapping mechanism is used for shared memory, ignoring the file itself. You can use /dev/zero, the OS-supplied effectively infinite file of zeroes, instead of a real file. See mmap_nofile.c.
Daemons: swapper, page daemon (Solaris pageout, actually a kernel thread)
Core map—array with one entry for each physical page of memory used for paging. (Some memory is used without paging, notably the kernel code under UNIX, and the core kernel code under Win2K) This is used to implement the free list and keep track of where on swap space a page should be written, and other things.
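A hypothetical core-map sketch showing the free-list idea: free frames are threaded through the entries themselves, and each entry remembers its swap location. Field names are illustrative, not from any particular UNIX.

```c
/* One coremap entry per pageable physical page frame. */
#define NFRAMES 4
#define NOFRAME (-1)

struct coremap_entry {
    int next_free;     /* next frame on the free list, or NOFRAME */
    int swap_block;    /* where this page's contents live on swap space */
    int in_use;
};

static struct coremap_entry coremap[NFRAMES];
static int freelist = NOFRAME;

/* Put every frame on the free list, lowest-numbered first. */
void coremap_init(void) {
    for (int i = NFRAMES - 1; i >= 0; i--) {
        coremap[i].next_free = freelist;
        coremap[i].in_use = 0;
        freelist = i;
    }
}

int frame_alloc(void) {
    if (freelist == NOFRAME)
        return NOFRAME;            /* the page daemon's cue to run */
    int f = freelist;
    freelist = coremap[f].next_free;
    coremap[f].in_use = 1;
    return f;
}

void frame_free(int f) {
    coremap[f].in_use = 0;
    coremap[f].next_free = freelist;
    freelist = f;
}
```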
pg. 718: two-handed clock algorithm--we already covered this.
Don’t worry about Linux memory management.
pg. 812 Win2K memory management. Read to pg. 820.