All the files are compiled and loaded together into one downloaded .lnx file:
Kernel: startup0.s, startup1.c, sysentry.s, tunix.c (with some new code), sched.c (new), asmswtch.s (provided), io.c, ioconf.c, tty.c
User program1: crt01.s, uprog1.c (with main1)
User program2: crt02.s, uprog2.c (with main2)
User program3: crt03.s, uprog3.c (with main3)
User-level library: ulib.s, shared between user programs
Plus debugging support: debuglog.c, which is neither user nor kernel, usable in both places.

Scheduling. Tan., pg. 132-146
We’ve already discussed CPU-bound (= compute-bound) processes, and similarly i/o-bound processes, as well as preemption and preemptive schedulers. Scheduling is about sharing resources appropriately among processes. There are 3 major resources to focus on: CPU, memory, and i/o.
Each process wants some fraction of the total resource in each direction, and if these fractions (in some direction) add up to more than 100%, then things slow down, even with good scheduling.
Study the picture on pg. 141. The number of processes in memory is called the “degree of multiprogramming”, and it is an important number for scheduling. If it is too high, each process gets only a small slice of the CPU. But that’s not the only problem: if the OS has squeezed a lot of runnable processes into memory by reducing their memory footprints (by paging), then they are all paging like crazy trying to do their work.
A batch system has the “admissions scheduler” to help control the degree of multiprogramming, but UNIX and Win2K don’t have this facility, and are expected to try to run whatever is started.
The only way out for an overloaded timesharing system (UNIX or Win2K) is to swap out some processes, that is, copy their entire memory image out to disk and back again a little while later. This is a desperate action that costs a lot of i/o, and is rarely seen today on our big-memory systems. Processes are swapped out on today’s systems (usually) only when they go idle for a good part of a minute.
Round-robin—see Tan. Quantum = CPU time a process/thread gets before it is preempted. Often a thread blocks before it reaches this point. The quantum is set at about 50 ms, as a compromise between throughput and delay in responsiveness.
A preemption causes a “process switch” or “context switch”. The second term is too vague, since context just means state, and sometimes it is used for just a mode switch, which happens in a system call. A system call causes execution to go from user mode to kernel mode in the same process. So it’s not a process switch, and in fact is a much lighter-weight action than a process switch. That’s a good thing, since system calls (and mode switches in interrupts) happen much more often.
Priority scheduling. Note the UNIX “nice” command, which allows you to run a user process at a lowered priority.
Real-time systems—just know what they are.
Thread scheduling: In both Solaris and Win2K, threads are the primary scheduling unit. For each thread, the process it belongs to is known, of course.
Solaris has different scheduling classes, with different policies: SYS, TS (timesharing), RT (real-time), IA (interactive). In a multi-CPU system, you can bind a CPU to a thread. In the RT class, you can even turn off interrupts on that CPU.
Scheduler setup, kernel global variables:
proctab[], curproc: global, used by tunix.c and sched.c
Recall our discussion of PEntry structs, one for each process in the proctab array. There are 4 spots in the array, one for process 0 and 3 for the user processes 1, 2, and 3.
process 0 derives from the startup execution, i.e., the kernel initialization in tunix.c morphs itself into a process by filling in proctab[0] and curproc. It is always runnable, so the scheduler can always find a process to run.
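As a concrete picture, the globals might be declared like this. This is only a sketch: the field names and the register-save layout are assumptions, not the assigned declarations.

```c
#include <assert.h>

/* Sketch of the scheduler's globals; field names are assumptions. */
#define NPROC 4                       /* process 0 + user processes 1-3 */

typedef enum { RUN, BLOCKED, ZOMBIE } ProcState;

typedef struct {
    ProcState p_status;               /* RUN, BLOCKED, or ZOMBIE */
    int p_waitcode;                   /* event a BLOCKED process sleeps on */
    int p_savedregs[9];               /* register save area for asmswtch */
} PEntry;

PEntry proctab[NPROC];                /* one entry per process */
PEntry *curproc = &proctab[0];        /* tunix.c morphs into process 0 */
```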
Scheduler API (not encapsulated: uses global proctab[], as do other functions in the kernel)
sched.c will contain the scheduler code—here are its major functions.
Where are these called from?
sleep—called from ttywrite (to replace the busy wait)
wakeup—called from the tty output interrupt handler, when a new spot in the output queue becomes available.
schedule—called from process 0 code, to get started, and as an “idle loop” to try to hand off the CPU to a user process. Also from sysexit.
These are all critical code, so they should run with interrupts off. If you call wakeup only from interrupt handlers, it will of course get IF=0 automatically, since we never turn interrupts back on in interrupt handlers. You could put in checks to make sure IF=0 ((get_eflags()>>9)&1 gives IF).
Note that wakeup is a “broadcast” action, so multiple processes may wake up for just one char-spot in the tty output queue, for example. Also note that wakeup doesn’t call schedule, so it doesn’t actually get a process running, it just marks it runnable. Sometime later, another process calls sleep, and that calls schedule, and chooses this one to run. Or process 0 calls schedule, and that does the trick.
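Under those rules, sleep and wakeup might be sketched as below. The broadcast loop in wakeup is the point: it marks every matching waiter RUN but never calls schedule. The schedule stub and the field names are assumptions for this sketch; the real schedule hands off the CPU via asmswtch.

```c
#include <assert.h>

#define NPROC 4
#define OUTPUT 1                      /* example event code (assumption) */

typedef enum { RUN, BLOCKED, ZOMBIE } ProcState;
typedef struct { ProcState p_status; int p_waitcode; } PEntry;

PEntry proctab[NPROC];
PEntry *curproc = &proctab[0];

/* stub: the real schedule picks another RUN process and calls asmswtch */
static void schedule(void) { }

/* Block the current process on an event; assumes interrupts already off. */
void sleep(int event)
{
    curproc->p_status = BLOCKED;
    curproc->p_waitcode = event;
    schedule();          /* give up the CPU; returns when woken AND chosen */
}

/* Broadcast: mark ALL processes waiting on this event runnable.
   Does not call schedule, so nothing actually runs yet. */
void wakeup(int event)
{
    for (int i = 0; i < NPROC; i++)
        if (proctab[i].p_status == BLOCKED && proctab[i].p_waitcode == event)
            proctab[i].p_status = RUN;
}
```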
The only tricky coding in sched.c is schedule(), because it calls asmswtch, a completely weird function. I’d code it like this:
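One possible sketch of that schedule(): a round-robin scan forward from the current process for the next RUN process, then a call to asmswtch. Here asmswtch is stubbed out so the selection logic can be exercised in user space, and the field names are assumptions.

```c
#include <assert.h>

#define NPROC 4
typedef enum { RUN, BLOCKED, ZOMBIE } ProcState;
typedef struct { ProcState p_status; int p_savedregs[9]; } PEntry;

PEntry proctab[NPROC];
PEntry *curproc = &proctab[0];

/* stub: the real asmswtch saves this process's registers and loads the
   new process's, "returning" only when the old process is rescheduled */
static void asmswtch(PEntry *oldp, PEntry *newp) { (void)oldp; (void)newp; }

/* Round-robin: scan forward from the current process, wrapping around,
   for the next RUN process.  Process 0 is always RUN, so the scan
   always terminates. */
void schedule(void)
{
    int cur = (int)(curproc - proctab);
    int i = cur;
    do {
        i = (i + 1) % NPROC;
    } while (proctab[i].p_status != RUN);
    if (i != cur) {
        PEntry *oldp = curproc;
        curproc = &proctab[i];
        asmswtch(oldp, curproc);
        /* asmswtch RETURNS here, later, when this process is chosen
           again; just fall out and go on with normal execution */
    }
}
```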
You don’t have to do it this way, but be aware that asmswtch does return after the given process is rescheduled, and then it should return from schedule and go on with its business (unless the process is exiting). Don’t let it fall into inappropriate code because you think it never returns!
The calls into the scheduler are straightforward to code except for the sleep call from ttywrite. There we had a busy loop of attempted enqueues into the output queue, and we need to replace it by blocking the process.
Might try:

if (enqueue(...) == FULLQ)
    sleep(OUTPUT);    /* can we enqueue now? Not for sure! */
Suppose two processes, A and B, were unblocked by wakeup. If A gets to run first, it will use up the space in the output queue, but B will still be marked RUN, and will be chosen to run a little while later. It will return from sleep, but there will be no space in the output queue for it to use. It needs to call sleep again. So we really need something like:
while (enqueue(...) == FULLQ)
    sleep(OUTPUT);
But what about mutex? We know from hw2 that enqueue needs mutex (cli()) to protect the queue data structure against concurrent modification by itself and the dequeue in the interrupt handler, which can run in the middle of an enqueue unless we turn off interrupts.
Is that enough? Suppose we write guarded_enqueue(), which is just enqueue’s body surrounded by cli and sti (or set_eflags back to the original value), and similarly guarded_sleep(). Then we would have:
while (guarded_enqueue(...) == FULLQ)
    guarded_sleep(OUTPUT);
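A guarded_enqueue along those lines might look like this sketch. The x86 operations (cli, get_eflags, set_eflags) are stubbed with a variable so the logic can run in user space, and the queue layout is an assumption; note the wrapper restores the caller’s EFLAGS rather than blindly doing sti.

```c
#include <assert.h>

#define QMAX 6
#define FULLQ (-1)
typedef struct { char buf[QMAX]; int head, tail, count; } Queue;

/* stubs standing in for the real x86 operations in this sketch */
static int eflags = (1 << 9);                  /* IF bit set initially */
static int get_eflags(void) { return eflags; }
static void set_eflags(int f) { eflags = f; }
static void cli(void) { eflags &= ~(1 << 9); }

/* plain ring-buffer enqueue, as in hw2: fails with FULLQ if no room */
static int enqueue(Queue *q, char c)
{
    if (q->count == QMAX)
        return FULLQ;
    q->buf[q->tail] = c;
    q->tail = (q->tail + 1) % QMAX;
    q->count++;
    return 0;
}

/* enqueue's body protected by interrupts-off, restoring caller's IF */
int guarded_enqueue(Queue *q, char c)
{
    int saved = get_eflags();   /* remember caller's IF */
    cli();                      /* no tty interrupt mid-enqueue */
    int r = enqueue(q, c);
    set_eflags(saved);          /* restore, don't assume IF was 1 */
    return r;
}
```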
But there’s a sneaky problem here called the lost wakeup problem. It was discussed in Tan, pg. 110 in the context of user-level sleep and wakeup calls. Luckily we are working inside the kernel here, so we are allowed to turn off interrupts as needed.
The problem is this. Suppose that just after guarded_enqueue returns with FULLQ, an interrupt happens that causes a dequeue and a call to wakeup. But this process is not yet blocked, so it is not unblocked. Instead, the wakeup does nothing at all to proctab[] (that’s OK in itself, maybe there’s no more output.) Then this code resumes and calls guarded_sleep, and this process may sleep forever (if it’s the last non-zombie process, for example.)
The solution is to put the cli()—sti() mutex region across more code:
cli();
while (enqueue(...) ==
FULLQ)
sleep(OUTPUT);
sti();
It’s a little scary to call sleep with interrupts off, but in fact they won’t be off long: sleep immediately calls schedule, which finds another process to run, and asmswtch swaps in that process’s EFLAGS (with its IF bit), so the other process resumes execution, typically with IF=0 inside asmswtch, but it soon restores IF to 1 as it returns to user code. So the mutex (IF=0) holds only for execution in this process. The important thing for us here is that an interrupt can no longer occur in the bad spot described above. So even though there is a “mutex hole” at the call to sleep, we have what we need here.
More on the mutex hole: Suppose we have a global kernel variable x and we put:

cli();
x = 1;
while (enqueue(...) == FULLQ)
    sleep(OUTPUT);
/* is x still = 1 here? Not necessarily, because of the mutex hole at sleep() */
sti();
You can see that the mutex hole allows another process to run in the middle of this sequence, and it could set x to some other value. So the use of cli—sti mutex along with sleep has to be thought out carefully. Basically there are two mutex regions, one before and the other after the call to sleep.