CS444 Class 25: Final Review, teacher evals. 

Book Coverage: the final is open book, open notes, open solutions (yours or mine or both).

Note: the midterm review is in class 15.

 

Tan., Ch. 1, but light on history

Ch. 2 to pg. 144, plus pp. 145-149, where the preemptive scheduler is defined, and pp. 162-163 for thread scheduling.

 

From class 20:

Read Chap. 3 Memory Management to pg. 194, esp. pp. 189-194, then 3.3.3 (TLBs), 3.3.4 (multilevel page tables),

3.4 on page replacement algorithms, skipping 3.4.8,

3.6 through pg. 232, Linux pp. 758-762

Sec 3.4.6: only need to read the first two paragraphs

Skip 3.4.7

Add pg. 887 x86 PTE bits

Sec. 10.4.4 Paging in Linux, to end of pg. 770, so you know PFRA is a clock-like algorithm that uses the page reference bit.

 

Ch. 4 File Systems: skipping (see the note below).

 

From class 21 and class 23:

Chap 5 I/O and Devices to pg. 332, 339-347, skip DMA, 348-360, Chap 10, Sec. 10.5: 771-773, 775-776

--also Sec. 5.5.2 Clock Software (pp. 390-392)

 

 

From class 24:

Chap 6 Deadlocks through 6.6, pg. 457

 

We have covered all the topics in Tanenbaum related to the concept of the virtual machine that the OS creates for each process, and the various threads that can run in it.  Since access to file systems is always through system calls at runtime, we don’t need to understand file systems in detail to fully appreciate the virtual machine concept—that’s the excuse for dropping this coverage. Also, it is relatively straightforward to learn this material on your own.

 

File Systems: it would be good to read Ch. 4 (it was on the syllabus).

 

Security: it would be good to read Ch. 9 after Ch. 4, plus the UNIX and Windows sections, when you have time.  You need an OS background, including an understanding of file systems, to understand system security properly.  As we’ve discussed, the idea of user-kernel separation is fundamental to system security.  The same separation mechanisms keep the user programs apart too, each in its own virtual machine.

 

Multiprocessor Systems (intro was on syllabus): Ch. 8.  We have touched on some of its issues, such as spinlocks.

 

Review of Homework features

 

hw1—not an OS, just an interrupt-driven i/o package.  Ideas: interrupt-driven i/o, device-independent i/o via dispatch through devtab, typeahead and write-behind, and how to handle shutdown and restart of UART output interrupts.  See the list of hw1 ideas in the midterm review.
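
For example, here is a minimal sketch of the devtab dispatch idea (the struct and field names here are illustrative, not necessarily the exact hw1 declarations):

    /* Sketch of device-independent i/o dispatch; names are illustrative. */
    struct device {
        int (*dvinit)(int dev);                    /* device-specific setup */
        int (*dvread)(int dev, char *buf, int n);  /* device-specific read  */
        int (*dvwrite)(int dev, char *buf, int n); /* device-specific write */
    };

    extern struct device devtab[];   /* one entry per device */

    /* Generic read: the caller names a device number; devtab supplies the driver. */
    int read(int dev, char *buf, int n)
    {
        return devtab[dev].dvread(dev, buf, n);
    }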

 

hw2—hw1 plus system calls for read, write, and exit: our first OS, with separate user and kernel code and transitions between them by system calls.  Points to understand:

·         mechanism of system calls: put the args and the syscall# in registers, execute a special trap instruction, and enter the kernel through the IDT
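
A sketch of the user-side stub, assuming a trap gate; the vector number 0x30 and the exact register conventions are assumptions here, so check the hw2 code for the real ones:

    /* User-level wrapper: load the syscall number and args into registers,
       then trap into the kernel.  The IDT entry for the chosen vector points
       at the kernel's syscall handler. */
    int syscall3(int num, int arg1, int arg2, int arg3)
    {
        int ret;
        asm volatile("int $0x30"                /* assumed trap vector */
                     : "=a"(ret)                /* kernel's return value in EAX */
                     : "a"(num), "b"(arg1), "c"(arg2), "d"(arg3)
                     : "memory");
        return ret;
    }

    /* e.g. a write() call might boil down to (syscall number assumed): */
    int write(int dev, char *buf, int nchars)
    {
        return syscall3(4 /* assumed WRITE# */, dev, (int)buf, nchars);
    }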

 

hw3—multitasking kernel with non-preemptive scheduler.  Each process has a process image, consisting of startup module, program code, data, and stack.  In the kernel, it has a PEntry, with “saved registers”, saved at process-switch time.  Now processes can block, so when one process needs to wait for i/o, another process can run. 
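
A rough sketch of the PEntry and the heart of the process switch; the field names, constants, and the asmswtch helper are assumptions based on this description, not the exact hw3 declarations:

    #define NPROC 5      /* number of process slots (illustrative) */
    #define NREGS 9      /* number of saved registers (illustrative) */

    struct PEntry {
        int p_status;               /* RUN, BLOCKED, ZOMBIE, ... */
        int p_waitcode;             /* what it is blocked on, if BLOCKED */
        int p_savedregs[NREGS];     /* CPU state saved at process-switch time */
    };
    struct PEntry proctab[NPROC];   /* the process table, in kernel data */
    struct PEntry *curproc;         /* the currently running process */

    extern void asmswtch(int *oldregs, int *newregs);  /* assembler helper */

    /* Called by the scheduler: save the old CPU state into the old PEntry,
       then reload the CPU state from the newly chosen process's PEntry. */
    void switchto(struct PEntry *newproc)
    {
        struct PEntry *oldproc = curproc;
        curproc = newproc;
        asmswtch(oldproc->p_savedregs, newproc->p_savedregs);
    }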

 

·         There are no busy loops left in ttyread or ttywrite.  Instead, there are calls to tsleep(event) at points where the caller needs to wait (see the sketch after this list).

·         If you didn’t finish hw3, be sure to check the solution’s ttyread coding.

 

·         There are calls to twakeup in the interrupt handlers, signaling the arrival of input data or of output buffer space to the blocked processes (if any) by unblocking them.

 

·         An interrupt handler is not allowed to block (i.e., call tsleep()), but it often unblocks waiting processes by calling twakeup().

 

·         twakeup() can be called by syscall code or by interrupt-handler code.  An example of syscall code calling twakeup() is in sysup() of the semaphore service.

 

·         When none of the user processes can run, process 0 runs, calling the scheduler over and over (and it can also be interrupted at some point on each pass).

·         Where’s the process’s kernel stack? It is built on top of the user stack during a system call (in both hw2 and hw3). 

·         When a process is running at user level, its kernel stack is empty, both here and on a full OS.

·         When a system call starts for a process, in hw3 the stack grows on top of the user stack and later shrinks again to nothing as the system call returns, leaving the user stack right back where it was before.

·         In a real OS, the kernel stack is separate, in the kernel data area, and grows from empty to non-empty and then back to empty at system call return.  Each process has a separate kernel stack.

·         In hw3, we see the mechanism of process switch, with its saving of the old CPU state to the saved registers in the process’s PEntry, and then the restoration of the CPU state from the PEntry of the newly-chosen process.

·         In hw3, we are using a non-preemptive scheduler.  A CPU-bound process “hogs” the CPU until it finally does some i/o or exits.  It can tie up the whole system.

·         As in hw2, we are still using IF=0 as the kernel mutex mechanism.
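
To make the tsleep/twakeup pattern above concrete, here is a compressed sketch of a ttyread-style loop and the matching receiver-interrupt handler; the queue helpers, event numbering, and cli/sti placement are assumptions, so compare with the hw3 solution:

    extern void tsleep(int event);        /* block the current process on event */
    extern void twakeup(int event);       /* unblock all processes waiting on event */
    extern void cli(void), sti(void);     /* IF=0 / IF=1: the kernel mutex */

    extern int  inqueue_empty(int dev);   /* assumed typeahead-queue helpers */
    extern int  inqueue_get(int dev);
    extern void inqueue_put(int dev, int c);
    extern int  uart_getchar(int dev);    /* read the received char (assumed) */

    #define INPUT_EVENT(dev) (dev)        /* illustrative event numbering */

    /* Syscall level: wait for input without busy looping. */
    int ttyread(int dev, char *buf, int nchars)
    {
        int i = 0;
        cli();                            /* IF=0 protects the queue and the sleep decision */
        while (i < nchars) {
            while (inqueue_empty(dev))
                tsleep(INPUT_EVENT(dev)); /* block; another process can run */
            buf[i++] = inqueue_get(dev);
        }
        sti();
        return i;
    }

    /* Interrupt level: never blocks, but unblocks waiters. */
    void rx_irqhandler(int dev)
    {
        inqueue_put(dev, uart_getchar(dev));
        twakeup(INPUT_EVENT(dev));        /* wake any reader waiting for input */
    }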

 

Note that the system is almost always executing in some process, or in an interrupt handler that has interrupted some process and borrowed its execution environment, including its stack.  The only exceptions are the tiny moments the scheduler is actually switching processes, plus startup and shutdown.  

 

Also note the difference between the mode switch (user mode to kernel mode) done in a system call and a process switch, which is a much bigger deal.  When a process does a syscall, it enters the kernel in the same process, quite a fast operation.  When a process switch happens, the CPU registers are saved and others loaded, and in a real OS, the whole VA space is switched.

 

Note that our kernel mutex mechanism (setting IF=0) only works for uniprocessors. Multiprocessors need to use spinlocks.
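
For contrast, here is a generic x86 spinlock sketch of the kind a multiprocessor kernel needs (not code from our homework):

    /* Spin until we atomically change the lock word from 0 to 1.  xchg with a
       memory operand is atomic on x86, so two CPUs cannot both see 0 and proceed. */
    void spin_lock(volatile int *lock)
    {
        int old;
        do {
            asm volatile("xchg %0, %1"
                         : "=r"(old), "+m"(*lock)
                         : "0"(1)              /* value to swap in */
                         : "memory");
        } while (old != 0);                    /* someone else held it: keep spinning */
    }

    void spin_unlock(volatile int *lock)
    {
        asm volatile("" ::: "memory");         /* compiler barrier */
        *lock = 0;                             /* release the lock */
    }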

 

 

hw4 Implementation of semaphores

·         Semaphore system calls provide mutex (and process synchronization) at user level, and are built on the kernel mutex (still IF=0 for us) in their kernel implementation.

·         We need an array of structs, one for each semaphore, as kernel data.

·         Each semaphore needs a count, a used-flag, and either a queue of pids of processes waiting on it or a search through proctab as needed.

·         In addition, proc.h needs a new waitcode value and possibly a data member in PEntry to record the semid that a process is waiting for.

·         sysdown() uses tsleep to block the process when needed.

·         sysup() uses the twakeup_one scheduler call instead of the classic UNIX “broadcast” twakeup of everything waiting on an event.
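
A compressed sketch of the kernel side, with the same caveat that the names and details are assumptions, not the exact solution code:

    #define NSEMS 10                     /* number of semaphore slots (illustrative) */

    struct sem {
        int count;                       /* the semaphore count */
        int inuse;                       /* used-flag: is this slot allocated? */
    };
    struct sem semtab[NSEMS];            /* kernel data: one struct per semaphore */

    extern void tsleep(int event);
    extern void twakeup_one(int event);  /* wake just one waiter */
    extern void cli(void), sti(void);

    #define SEM_EVENT(s) (0x100 + (s))   /* assumed event numbering */

    void sysdown(int semid)
    {
        cli();                           /* IF=0 is still the kernel mutex */
        while (semtab[semid].count == 0) {
            /* here, record the new waitcode and the semid in curproc's PEntry,
               per the proc.h additions described above */
            tsleep(SEM_EVENT(semid));    /* block until some sysup() */
        }
        semtab[semid].count--;
        sti();
    }

    void sysup(int semid)
    {
        cli();
        semtab[semid].count++;
        twakeup_one(SEM_EVENT(semid));   /* wake one waiter, not all of them */
        sti();
    }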

 

hw4 preemptive scheduler:

·         preemption is implemented by a call to schedule() from the tick handler when the quantum has counted down to zero (see the sketch after this list).

·         Our standard settings: tick every 10ms, character-transmission time 1ms (9600 baud), QUANTUM=4, so 40ms to run before preemption.  Also, the output buffer holds 6 chars.
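
A sketch of the preemption mechanism in the tick handler (the variable and function names are assumed):

    #define QUANTUM 4                /* ticks: 4 * 10ms = 40ms of CPU before preemption */

    static int quantum_left = QUANTUM;

    extern void schedule(void);      /* choose another runnable process and switch to it */

    /* Called from the timer interrupt handler every 10ms, after the usual tick work. */
    void tick_preempt(void)
    {
        if (--quantum_left <= 0) {
            quantum_left = QUANTUM;
            schedule();              /* preempt: the current process goes back in line */
        }
    }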

 

You should be able to predict behavior.  For ex., suppose process A is CPU-bound and process B is i/o-bound, constantly outputting b’s.  How do they run?

 

Suppose A starts running.  It takes 4 ticks to preempt A; then process B runs, quickly fills up the output queue, and blocks, so A runs again.  For the first 6 ms of that time, output drains at interrupt level while A is running; after that, a tick happens, and then three more before A is preempted again, etc.

 

Note that here we are seeing some overlapping of work by A and B: A is computing while B is outputting (via interrupts).

 

 

x86 paging demo, in lieu of another hw.  The one and only page table for the SAPC is at VA 0x52000, and consists of 1024 4-byte PTEs, each in the format given on pg. 887.

 

Thus the PTE for page 0 is at 0x52000,
        the PTE for page 1 is at 0x52004,
        the PTE for page 2 is at 0x52008,
               ...
        the PTE for page 0x50 is at 0x52000 + 4*0x50 = 0x52140,
and this PTE maps VAs 0x50xxx, i.e., 0x50000 through 0x50fff.

In the PTE, you should be able to work with the P (present), A (accessed), and D (dirty) bits.
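
For example, a small program run on the SAPC could look at that PTE and test the bits (the bit positions are the standard x86 ones: P = bit 0, A = bit 5, D = bit 6; the page-table address is from the demo above):

    #include <stdio.h>

    #define PAGETABLE 0x52000u       /* the SAPC's one page table */
    #define PTE_P 0x01               /* present  (bit 0) */
    #define PTE_A 0x20               /* accessed (bit 5) */
    #define PTE_D 0x40               /* dirty    (bit 6) */

    int main(void)
    {
        unsigned int *pt  = (unsigned int *)PAGETABLE;   /* 1024 4-byte PTEs */
        unsigned int  pte = pt[0x50];                    /* PTE for VAs 0x50000-0x50fff */

        printf("PTE at 0x%x = 0x%08x\n", PAGETABLE + 4 * 0x50, pte);
        printf("P=%d A=%d D=%d\n",
               !!(pte & PTE_P), !!(pte & PTE_A), !!(pte & PTE_D));
        printf("page frame at PA 0x%08x\n", pte & 0xfffff000);  /* top 20 bits */
        return 0;
    }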

 

The whole paging system involves the page directory pointing to 1024 page tables, each pointing to 1024 pages, for a total of 1M pages of 4KB each, so 4GB, the full 32-bit address space.
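
The 32-bit VA splits 10/10/12: the top 10 bits index the page directory, the next 10 bits index the chosen page table, and the low 12 bits are the offset within the 4K page.  In C:

    unsigned int dirindex(unsigned int va)   { return va >> 22; }           /* top 10 bits  */
    unsigned int tableindex(unsigned int va) { return (va >> 12) & 0x3ff; } /* next 10 bits */
    unsigned int pageoffset(unsigned int va) { return va & 0xfff; }         /* low 12 bits  */

    /* Example: VA 0x50123 has dirindex 0, tableindex 0x50, and offset 0x123,
       so its PTE is entry 0x50 of page table 0, matching the demo above. */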

 

Process switch: reload CR3, the CPU register holding the page directory’s location.  This causes a TLB flush, which is its biggest cost.
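
In code, that part of a real OS’s process switch amounts to one privileged register write (illustrative; our SAPC demo never actually changes CR3):

    /* Load a new page directory: switches the whole VA space.  Writing CR3
       also flushes the TLB, which is the biggest cost of the switch. */
    void load_pagedir(unsigned int pagedir_pa)
    {
        asm volatile("mov %0, %%cr3" : : "r"(pagedir_pa) : "memory");
    }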