Thurs, Sept. 29

Tan. Chap. 2: Processes and Threads

process = program in execution

sequential process = process, with emphasis on its single program counter showing where the CPU is in the code—the CPU just “follows its nose” through the program code

multiprogramming = multitasking = OS method of switching one CPU from process to process, so that each process seems to have its own (slower) CPU. Almost synonymous with “timesharing”, except that it is more specifically single-CPU-oriented. Of course a system could have several CPUs each doing multiprogramming.

Look at Fig. 2-1(c). Shows 4 processes executing in “round-robin” fashion—over and over in some pattern. Of course the pattern is usually more irregular. Each little interval of running is typically about 50 ms long, the “quantum” of CPU for the process.

Processes vs. programs. Programs are algorithms embodied in machine code and usually stored in an executable file. Processes are alive on the system and are using memory, CPU, etc.

Example of program that forks. End up with two processes each running the same program. For Win2K, could run the same program twice and get the same situation.
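
A minimal sketch of such a forking program in C, assuming a POSIX system. The function name fork_and_report is made up for illustration; fork, getpid, and waitpid are the standard calls. After fork returns, two processes are executing the same code, and only fork's return value tells them apart.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork once and let each process identify itself.
 * Returns the child's pid in the parent, or -1 if fork fails. */
pid_t fork_and_report(void)
{
    fflush(stdout);          /* keep buffered output from being duplicated in the child */
    pid_t pid = fork();
    if (pid < 0)
        return -1;           /* no child was created */
    if (pid == 0) {          /* child: same program, new process */
        printf("child:  pid %d\n", (int)getpid());
        fflush(stdout);
        _exit(0);            /* leave without running the parent's cleanup */
    }
    printf("parent: pid %d, child pid %d\n", (int)getpid(), (int)pid);
    waitpid(pid, NULL, 0);   /* pick up the child's exit status */
    return pid;
}
```

Calling fork_and_report() from main prints one line from each process; the order of the two lines is up to the scheduler.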

Process Creation and Termination

All processes are born by the process-create syscall (fork, CreateProcess), except the very first. On bootup, the kernel is running as a standalone program, and morphs itself into a proper process.

Process Termination: usually by (exit, ExitProcess). These both have a parameter for an error/success code. (In UNIX, this code is reported to the parent in waitpid, and then forgotten by the kernel. In Win2K, it is reported via the process handle—I’m not sure how it is ever garbage-collected)
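
For the UNIX case, the hand-off of the exit code can be sketched as follows. The helper name child_exit_code is an assumption for illustration; _exit, waitpid, WIFEXITED, and WEXITSTATUS are the standard POSIX pieces.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run a child that exits with `code` and return
 * the code the parent sees, or -1 on any failure. */
int child_exit_code(int code)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);                      /* child terminates with the given code */

    int status;
    if (waitpid(pid, &status, 0) != pid)  /* parent blocks until the child has exited */
        return -1;
    if (!WIFEXITED(status))               /* child did not end via exit */
        return -1;
    return WEXITSTATUS(status);           /* only the low 8 bits of the code survive */
}
```

Once waitpid has returned the status, the kernel keeps no further record of the child.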

Other ways processes terminate:

fatal error: bad address, bad instruction, div by 0, etc., causes process termination unless the process is specially set up for exception handling, in UNIX or Win2K.
by action of another process: kill/TerminateProcess
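
A sketch of the second case on UNIX, assuming POSIX signals. The helper name kill_child is hypothetical. SIGKILL is used here because it cannot be caught or ignored, which keeps the example deterministic; the kill command itself sends SIGTERM by default.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start a child, terminate it from the parent, and
 * return the number of the signal that ended it (or -1 on failure). */
int kill_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();                  /* child just waits for a signal */
    }
    if (kill(pid, SIGKILL) < 0)       /* terminate the child from outside */
        return -1;

    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFSIGNALED(status))         /* expected: ended by a signal, not by exit */
        return -1;
    return WTERMSIG(status);
}
```

The parent can tell this case from a normal exit: WIFSIGNALED is true and WTERMSIG gives the signal number.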

Zombies. If a process exits in UNIX, and its parent does not pick up its exit status, it stays as a “zombie” in the system. It no longer has a virtual machine, so it uses almost no resources (just its process-table entry), but you can see it in “ps -a” output with a Z in the “S” (process state) column. It stays there until its parent picks up the status, or until the parent itself exits, at which point init adopts the zombie and reaps it. Like a proper zombie, you can’t kill it by normal methods. To avoid zombies, make your parent processes do a wait or waitpid to pick up the status.

Process Hierarchies

UNIX example: shell runs mtip. mtip forks, and parent-mtip runs the “keymon” loop, handling user input, while the child-mtip runs the “linemon” loop, shuttling chars from the SAPC to the user. Separate processes for separate types of input work nicely in a case like this, where the actions for the two types are completely independent: all we have to do for the chars from the SAPC is put them on the screen, regardless of user input. We want two processes so that the hanging read from one input source doesn’t prevent us from reading from the other source.

Resulting process group: typical state:

shell --waiting for child termination, in waitpid

mtip running in keymon function --waiting in read from user (on stdin)

mtip running in linemon function --waiting in read from line to SAPC
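
The two-process layout can be sketched roughly as below. This is not mtip's actual code: the names relay and two_way, and the use of plain file descriptors for keyboard, screen, and line, are assumptions for illustration. The point is that each process sits in its own blocking read.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Copy bytes from one descriptor to another until end of input.
 * Returns the number of bytes copied, or -1 on error. */
long relay(int from, int to)
{
    char buf[256];
    long total = 0;
    ssize_t n;
    while ((n = read(from, buf, sizeof buf)) > 0) {  /* blocks here while idle */
        ssize_t done = 0;
        while (done < n) {
            ssize_t w = write(to, buf + done, (size_t)(n - done));
            if (w < 0)
                return -1;
            done += w;
        }
        total += n;
    }
    return n < 0 ? -1 : total;
}

/* Hypothetical layout: the child relays line -> screen (like linemon),
 * the parent relays keyboard -> line (like keymon).
 * Returns 0 once user input ends, or -1 if fork fails. */
int two_way(int keyboard, int screen, int line_in, int line_out)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        relay(line_in, screen);
        _exit(0);
    }
    relay(keyboard, line_out);
    kill(pid, SIGTERM);        /* tell the other process to stop too */
    waitpid(pid, NULL, 0);     /* and pick up its status */
    return 0;
}
```

With a single process, a read hanging on the keyboard would keep chars from the line off the screen until the user typed something.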

User control-C sends SIGINT signal to all processes of the process group. Each process either terminates or has been set up to do exception handling, so the multiple user processes get cleaned up (unless there are bugs in exception handling code)

Actually, mtip uses a system call to put stdin into “raw mode” in order to get each user-typed char as soon as possible, and this means it gets control-C as an ordinary data character. If it sees two control-C’s in a row, it exits (and signals the other process to exit too.)
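
One common way to get this behavior on a POSIX system is the termios interface; whether mtip uses exactly this call is an assumption. The helper names raw_settings and enter_raw_mode are made up for illustration. Clearing ISIG is what makes control-C arrive as an ordinary data character.

```c
#include <termios.h>
#include <unistd.h>

/* Return a copy of `t` adjusted so that input is delivered one char at a
 * time, is not echoed, and control-C no longer generates SIGINT. */
struct termios raw_settings(struct termios t)
{
    t.c_lflag &= ~(tcflag_t)(ICANON | ECHO | ISIG);
    t.c_cc[VMIN] = 1;      /* read returns as soon as one char is available */
    t.c_cc[VTIME] = 0;     /* no timeout */
    return t;
}

/* Switch terminal `fd` to those settings, saving the old ones in *saved
 * so they can be restored later with tcsetattr. Returns 0 or -1. */
int enter_raw_mode(int fd, struct termios *saved)
{
    if (tcgetattr(fd, saved) < 0)
        return -1;         /* e.g. fd is not a terminal */
    struct termios raw = raw_settings(*saved);
    return tcsetattr(fd, TCSAFLUSH, &raw);
}
```

A program using this should restore the saved settings before it exits, or the shell is left with a terminal in raw mode.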

Win2K: no process groups or parent-child relationships, but one process can control another via its process handle. There is special provision for control-C handling for console apps, so a multi-process app can be terminated by control-C much like the UNIX case.

Process States

Fig. 2-2 is the classic process-state transition diagram for multiprogramming systems. We can name each arrow with a verb:

Running->Blocked: block (start waiting)

Blocked->Ready: unblock (stop waiting)

Ready->Running: schedule (scheduler chooses this process to run)

Running->Ready: preempt (scheduler chooses another process to run, even though this one could use the CPU more)
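
The four arrows can be written down as a small transition function. This is only an illustration of the diagram, not kernel code; the enum and function names are made up.

```c
enum pstate { RUNNING, READY, BLOCKED };
enum pevent { EV_BLOCK, EV_UNBLOCK, EV_SCHEDULE, EV_PREEMPT };

/* Returns the new state, or -1 if the diagram has no such arrow
 * out of state `s`. */
int next_state(enum pstate s, enum pevent e)
{
    switch (e) {
    case EV_BLOCK:    return s == RUNNING ? BLOCKED : -1;
    case EV_UNBLOCK:  return s == BLOCKED ? READY   : -1;
    case EV_SCHEDULE: return s == READY   ? RUNNING : -1;
    case EV_PREEMPT:  return s == RUNNING ? READY   : -1;
    }
    return -1;
}
```

Note there is no arrow from Blocked to Running: an unblocked process must first become Ready and then be scheduled.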

For example, suppose a program has a read(...) from the user, and when it’s executed, the user has not yet entered anything. The read blocks on input, that is, the code in the kernel for read does a block action on the process, putting it into a wait, or blocked state. Then the kernel finds another process to run among the Ready processes, and schedules it, making it Running.

Later, the user finishes the requested input, and an interrupt handler for the input device runs, and does an unblock action on the process. The process then enters the set of Ready processes, and sometime later will be chosen to run.

Preemption occurs when the CPU is taken away from a process that could continue using it.

Example: Back to Fig. 2-1 we looked at earlier: four processes want to use the CPU constantly, i.e., they are “CPU bound”. Each gets to run for a while in turn, for a time known as the “CPU quantum”, typically 50ms. The point at which one process loses the CPU in this case is a preemption and causes the process to go from Running to Ready, while the other goes from Ready to Running.
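
As a worked illustration of the idealized case, the function below says which process holds the CPU at a given time. It assumes strict round robin among CPU-bound processes, process 0 starting at time 0, and no switching overhead. The name running_at is hypothetical.

```c
/* Which process (0 .. nprocs-1) is Running at time t_ms, under strict
 * round robin with the given quantum. Idealized: every process always
 * uses its full quantum and switching takes no time. */
int running_at(long t_ms, int nprocs, long quantum_ms)
{
    return (int)((t_ms / quantum_ms) % nprocs);
}
```

With 4 processes and a 50 ms quantum, process 0 runs during 0-49 ms, process 1 during 50-99 ms, and process 0 gets the CPU back at 200 ms, so each process sees a CPU about one quarter as fast.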

Question: where are the interrupts here?

Answer: an interrupt can occur between any two instructions of a process’s code (user or kernel code), and the resulting execution of the interrupt handler is kernel-code execution that is not part of any process. All interrupt handlers are kernel code in a modern OS. No special execution environment (like a process’s virtual machine) is set up for the interrupt handler. Instead, it “borrows” the current memory set-up from the process that it interrupts, for just the few moments that the handler executes.

Example of changing process states:

Process A: CPU-bound the whole time

Process B: about to read from user, block, eventually unblock

Process C: about to do a large write to file, blocks on output, eventually unblocks

Timeline showing process lifetimes and also i/o interrupts.

Key: _____ running   ----- ready   ..... blocked   V interrupt

                                 char           \n
                                 input   disk   input      disk
                                 Int     Int    Int        Int
A:     _____----___----__________V_______V______V___----___V__---
B:     -----____.................................---____.........
C:     ------------____.....................................--___
times:      a   b  c   d         e       f      g   h   i  j  k

a: preempt of A, schedule of B

b: block of B, schedule of A

c: preempt of A, schedule of C

d: block of C, schedule of A

e: interrupt for char input, buffered, not yet given to process, so no effect on B

f: interrupt for disk-done for C, but not finished yet with output, so no effect on C

g: interrupt for char input, buffered, and end of line, so provided to process, B unblocked

h: preempt of A, schedule of B, so it reads input line, computes for a little while

i: B blocks on input again, A scheduled

j: interrupt for disk-done for C, output done, C made ready

k: preempt of A, schedule of C
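
The event list above can be replayed mechanically as a consistency check. In this sketch (all names hypothetical) process 0 is A, 1 is B, and 2 is C; each step is checked against the state-transition diagram, and events e and f are left out because they change no process state.

```c
enum pstate { READY, RUNNING, BLOCKED };
enum action { PREEMPT, SCHEDULE, BLOCK, UNBLOCK };

struct step { char time; int proc; enum action act; };

static const struct step steps[] = {
    {'a', 0, PREEMPT}, {'a', 1, SCHEDULE},
    {'b', 1, BLOCK},   {'b', 0, SCHEDULE},
    {'c', 0, PREEMPT}, {'c', 2, SCHEDULE},
    {'d', 2, BLOCK},   {'d', 0, SCHEDULE},
    {'g', 1, UNBLOCK},                      /* end of line arrives for B */
    {'h', 0, PREEMPT}, {'h', 1, SCHEDULE},
    {'i', 1, BLOCK},   {'i', 0, SCHEDULE},
    {'j', 2, UNBLOCK},                      /* disk output done for C */
    {'k', 0, PREEMPT}, {'k', 2, SCHEDULE},
};

/* Apply one arrow of the diagram; fails if the process is not in the
 * state the arrow starts from. Arrays are indexed by enum action. */
static int apply(enum pstate *s, enum action a)
{
    static const enum pstate from[] = { RUNNING, READY, RUNNING, BLOCKED };
    static const enum pstate to[]   = { READY, RUNNING, BLOCKED, READY };
    if (*s != from[a])
        return -1;
    *s = to[a];
    return 0;
}

/* Replay times a..k starting from A running, B and C ready.
 * Leaves the final states in st[]; returns 0 if every step was legal. */
int replay_timeline(enum pstate st[3])
{
    st[0] = RUNNING;
    st[1] = READY;
    st[2] = READY;
    for (unsigned i = 0; i < sizeof steps / sizeof steps[0]; i++)
        if (apply(&st[steps[i].proc], steps[i].act) < 0)
            return -1;
    return 0;
}
```

After time k the replay leaves A ready, B blocked, and C running, which matches the right-hand end of the timeline.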

Note how interrupts ride on the currently-running process: the interrupt handler executes between two instructions of that process. When process A is interrupted, the interrupt handler runs with all of process A still present in (user) memory. The char that B is waiting for arrives, causing an interrupt, while A is running. This is typical: each process is bombarded with interrupts for other processes’ data, and is usually blocked when interrupts for its own data come in.

The interrupt handler is kernel code and uses only kernel data, and purposely ignores the current process image that it is “borrowing”.

hw2: System Calls

Look at details of system call mechanism, like Tan. pg. 46 but using the Linux syscall linkage we will use in hw2:

the syscall # is put in eax
the syscall args are put in ebx (first), ecx (second), and edx (third).
int $0x80 is the syscall instruction, executed with eax, ebx, ecx, and edx set up as above
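
The register setup above can be sketched with GCC-style inline assembly. This is an illustration, not the hw2 code: the name linux_syscall3 is made up, the int $0x80 path is compiled only for 32-bit x86, and on any other target the sketch falls back to the C library's syscall() wrapper so that it still builds and runs.

```c
#include <sys/syscall.h>   /* SYS_getpid, SYS_write, ...: syscall numbers */
#include <unistd.h>

/* Hypothetical helper: make a system call with up to three arguments. */
long linux_syscall3(long num, long a1, long a2, long a3)
{
#if defined(__i386__)
    long ret;
    __asm__ volatile ("int $0x80"          /* trap into the kernel */
                      : "=a"(ret)          /* result comes back in eax */
                      : "0"(num),          /* syscall # in eax */
                        "b"(a1),           /* first arg in ebx */
                        "c"(a2),           /* second arg in ecx */
                        "d"(a3)            /* third arg in edx */
                      : "memory");
    return ret;            /* raw linkage: a negative value is -errno */
#else
    return syscall(num, a1, a2, a3);       /* libc wrapper: -1 and errno on error */
#endif
}
```

For example, linux_syscall3(SYS_getpid, 0, 0, 0) returns the same value as getpid(); the unused argument registers are simply ignored by that call.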

The system call instruction is a certain instruction in the CPU instruction set, designed for use as a system call by the CPU designer, i.e. Intel for x86, Sun for Sparc. On x86, it’s the int instruction, on Sparc, it’s the ta (trap always) instruction. Its job is to cause execution to jump out of user execution into the kernel in a safe way, much like an interrupt causes kernel execution of the interrupt handler between two instructions in user code (interrupts can also happen in kernel code.)