Class 14

Note: midterm, Wednesday Oct. 30

Reading, continued

Sec 2.4 Scheduling to just before Categories of Scheduling Algorithms

 

Hw2 discussion.

How the stack gets used in hw2.

Recall the discussion of kernel stacks in a real OS: when the user code for a thread is running, the thread's kernel stack is empty, but while a system call is in progress, the kernel stack grows, and it shrinks back to empty at the return to user mode (the rti/iret) that ends the system call.

In hw2, the kernel stack just grows on top of the user stack, and later shrinks back down to the old user stack at the end of the system call, so this stack use is never noticed by the user code (assuming no bugs, of course). There is only one user program to execute here, i.e., only one process.

In hw2.html, step 1 says “write a tiny tunix.c init function that calls ioinit, then calls main (cheating for now--later it should call ustart)”.

Why is this “cheating”? It’s building the user stack on top of the initial kernel stack, so the user code can “see” part of a kernel stack. But in fact it works fine, because the user stack can grow as it wants, and then at a system call it grows further, then shrinks back, and so on.

The final transition from user code back to the kernel is via the system call exit. We discussed the details of this in class 11. Thus we never reuse the part of the stack left over from the kernel’s initialization, i.e., the stack contents that existed at the call to main.

Adding crt0.s in step 4.

Since we never need the stack that exists at the call to main, we can let the user code initialize a new stack, and never let the user code “see” any kernel stack contents. This change means the kernel should not call main directly, but instead should call some assembly language code (in crt0.s) that can set the stack pointer to 0x400000 (memory limit on the SAPC) or a little less if you want. Then that assembly code calls _main.  To keep the gdb debugger happy, it’s good to set %ebp to 0 as well.
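
To make this concrete, crt0.s can be just a few instructions. Here is a minimal sketch (GNU as, 32-bit x86); the entry label (_ustart, matching the ustart mentioned in step 1) and the details are assumptions that may differ in your setup:

            .globl  _ustart
    _ustart:
            movl    $0x400000, %esp    # fresh user stack at the SAPC memory limit
            movl    $0, %ebp           # zero %ebp so gdb backtraces stop here
            call    _main              # run the user program on the new stack
            # no return expected: the program finishes via the exit system call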

Back to Semaphores and Monitors

Semaphores have all the power you need to get your threads to wait at the right moment if you manage to use them right. However, bugs in code using semaphores are hard to track down, partly because when you use them, you’re doing real multithreading, which is always hard to debug.

Our example of a monitor (making insert and remove for a buffer synchronized) did not show the full power of monitors. Tanenbaum goes on to show how you can get a thread to block inside a monitor safely, without deadlock, using condition variables. We are skipping this topic. Semaphores have the power we need. Note that Tanenbaum uses semaphores to solve the classical IPC problems in Sec. 2.5 (which we are skipping).

If you want more information on coding semaphores and monitors, see this excellent slide set from Harvard:

http://www.eecs.harvard.edu/~mdw/course/cs61/mediawiki/images/7/7e/Lectures-semaphores.pdf

As Tanenbaum points out on pg. 140, semaphores and monitors only solve the synchronization problems between threads on the same shared-memory machine. To synchronize work between machines, we need message passing, usually by network connections, but here we study it without the networking setup.

Message Passing (Sec. 2.3.8)

 

Tan, pg. 140:

 

     send(dest, &message);

     receive(source, &message);

 

To start, it isn’t really clear what kind of IDs dest and source are—thread IDs? The code in Fig. 2-36 uses thread names in the send and receive calls. This simple naming assumes only one thread runs the named function, and doesn’t work for multiple programs, but let’s go along with it as we did with the pg. 130 code. Clearly better names can be cooked up as needed.

 

Receive can block if no message is waiting for it, but another version might just return an error code, according to the text on pg. 140. Let’s assume it blocks in this case.

 

Send’s blocking behavior is not discussed. Many real systems do block the sender if it sends too much data in a short time, or return an error, because otherwise the system would have to store all this data until the receiver accepted it.

 

Producer-consumer using messages. The idea here is to set up a flow of data messages from producer to consumer. Normally a receiver waits (blocked) if there are no messages to receive. How do we prevent the sender from getting way ahead of the receiver? Recall that the producer is supposed to block when the queue is “full” in some bounded way controlled by the program.

 

To prevent run-away production, we require the producer to receive an “empty” message from the consumer before it can send a full one. We prime the system by sending N empty messages to start with. After that, there are always nearly N messages in the system, some empty, some full. (A runnable C sketch of this scheme follows the pseudocode below.)

 

Initialization: do N times: send(producer, &m);  // permission to send N messages

(Tan’s code has this done in the consumer, but it’s part of initialization)

Create Producer thread

Create Consumer thread

Wait for thread exit, i.e., forever, but this avoids doing an exit and destroying everything.

 

Producer Loop:

Produce item

Receive(consumer,&m);  // wait for empty

Put item in message m

Send(consumer,&m);

 

Consumer Loop:

Receive( producer, &m);

Get item from m

Send(producer, &m);   // send empty

Consume item
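
Here is a runnable C sketch of this scheme, using POSIX message queues as the send/receive primitives, with one queue per receiver playing the role of its mailbox. The queue names, N, and message sizes are assumptions for illustration; build on Linux with gcc pc.c -lrt -lpthread.

    /* pc.c: producer-consumer by message passing, "empty" messages for flow control */
    #include <fcntl.h>
    #include <mqueue.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>

    #define N 4                             /* number of messages in circulation */
    #define MSGSZ 32

    static mqd_t to_producer, to_consumer;  /* one mailbox per receiver */

    static void *producer(void *arg) {
        char m[MSGSZ];
        for (int item = 0; ; item++) {
            mq_receive(to_producer, m, sizeof m, NULL); /* wait for an empty */
            snprintf(m, sizeof m, "item %d", item);     /* put item in message */
            mq_send(to_consumer, m, strlen(m) + 1, 0);  /* send full message */
        }
        return NULL;
    }

    static void *consumer(void *arg) {
        char m[MSGSZ];
        for (;;) {
            mq_receive(to_consumer, m, sizeof m, NULL); /* wait for a full one */
            printf("consumed: %s\n", m);                /* consume item */
            mq_send(to_producer, m, 1, 0);              /* send back an empty */
        }
        return NULL;
    }

    int main(void) {
        struct mq_attr a = { .mq_maxmsg = N, .mq_msgsize = MSGSZ };
        to_producer = mq_open("/to_prod", O_CREAT | O_RDWR, 0600, &a);
        to_consumer = mq_open("/to_cons", O_CREAT | O_RDWR, 0600, &a);

        char m[MSGSZ] = "";                 /* prime: N empties to the producer */
        for (int i = 0; i < N; i++)
            mq_send(to_producer, m, 1, 0);

        pthread_t p, c;
        pthread_create(&p, NULL, producer, NULL);
        pthread_create(&c, NULL, consumer, NULL);
        pthread_join(p, NULL);              /* wait forever */
        return 0;
    }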

 

Pg. 143—another somewhat magic program, using the empty messages for flow control as outlined above. 

Mutex by message passing: use a mutex server thread

Using Tan’s system: have one thread as the mutex server, and other “worker” threads that need mutex service. This pattern is useful for a network lock server too. (A C sketch follows the pseudocode below.)

 

Initialization:

Create mutex server thread, other worker threads

A message can be a lock request or a lock release, going to the server

A message can be an OK from the server to a worker.

 

mutex server: has locked flag variable and waiter_q queue of thread names

 

Loop:

Receive(ANY, &m);  //must be able to determine sender name

If m is a lock-request

   If locked is false

       Set locked = true

       send OK to sender

  else enqueue sender name on waiter_q

else /* m is a release */

     set locked to false

     if (waiter_q is not empty)

        Dequeue waiter name from waiter_q

        Set locked to true

        Send OK to waiter name  // waiter is  no longer waiting

 

in worker thread:

send req to mutex_server

receive(mutex_server, &m); // receive OK, maybe after blocking a while

< critical code>

Send(mutex_server, &m);  // release mutex
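
Here is a C sketch of the server side over POSIX message queues, where a worker's "name" is the name of its private reply queue. The queue names, op codes, and sizes are assumptions for illustration; build on Linux with gcc lockserver.c -lrt.

    /* lockserver.c: the mutex server, following the pseudocode above */
    #include <fcntl.h>
    #include <mqueue.h>
    #include <string.h>

    #define MAXW 16
    struct msg { char op;              /* 'L' = lock request, 'R' = release */
                 char reply[32]; };    /* sender's mailbox name */

    static void send_ok(const char *name) {    /* assumes the worker made its */
        mqd_t w = mq_open(name, O_WRONLY);     /* queue with mq_msgsize >= 3  */
        mq_send(w, "OK", 3, 0);
        mq_close(w);
    }

    int main(void) {
        struct mq_attr a = { .mq_maxmsg = 8, .mq_msgsize = sizeof(struct msg) };
        mqd_t srv = mq_open("/lock_srv", O_CREAT | O_RDONLY, 0600, &a);

        int locked = 0;                            /* the locked flag */
        char waiter_q[MAXW][32];                   /* queue of waiter names */
        int head = 0, tail = 0;

        struct msg m;
        for (;;) {
            mq_receive(srv, (char *)&m, sizeof m, NULL);  /* Receive(ANY, &m) */
            if (m.op == 'L') {                     /* lock request */
                if (!locked) {
                    locked = 1;
                    send_ok(m.reply);              /* send OK to sender */
                } else {
                    strcpy(waiter_q[tail], m.reply);   /* enqueue sender name */
                    tail = (tail + 1) % MAXW;
                }
            } else {                               /* release */
                locked = 0;
                if (head != tail) {                /* someone is waiting */
                    locked = 1;
                    send_ok(waiter_q[head]);       /* waiter now holds the lock */
                    head = (head + 1) % MAXW;
                }
            }
        }
    }

A worker, matching the worker pseudocode, would create its own reply queue, send {op='L', reply=its queue name} to /lock_srv, block in mq_receive on its reply queue until OK arrives, run the critical section, and then send {op='R'}.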

 

For semaphore_server: have count variable and waiter_q

Loop:

Receive(ANY, &m);  //must be able to determine sender name

If m is a down-request

   If count > 0

       Count--

       send OK to sender

  else enqueue sender name on waiter_q

else /* m is an up-request */

     count++;

     if (waiter_q is not empty)

        Dequeue waiter name from waiter_q

        Count--;   /* finish down action */

        Send OK to waiter name  // waiter is  no longer waiting

 

in worker thread:

send down-req or up-req to semaphore_server

receive(semaphore_server, &m); // after a down-req: receive OK, maybe after blocking a while (the server sends no reply for an up)
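
The same framework carries over, with a count replacing the locked flag. A minimal C sketch of the semaphore server (again the queue names, op codes, and the initial count are assumptions):

    /* semserver.c: the semaphore server, following the pseudocode above */
    #include <fcntl.h>
    #include <mqueue.h>
    #include <string.h>

    #define MAXW 16
    struct msg { char op;              /* 'D' = down request, 'U' = up request */
                 char reply[32]; };    /* sender's mailbox name */

    static void send_ok(const char *name) {
        mqd_t w = mq_open(name, O_WRONLY);
        mq_send(w, "OK", 3, 0);
        mq_close(w);
    }

    int main(void) {
        struct mq_attr a = { .mq_maxmsg = 8, .mq_msgsize = sizeof(struct msg) };
        mqd_t srv = mq_open("/sem_srv", O_CREAT | O_RDONLY, 0600, &a);

        int count = 1;                     /* initial semaphore value */
        char waiter_q[MAXW][32];
        int head = 0, tail = 0;

        struct msg m;
        for (;;) {
            mq_receive(srv, (char *)&m, sizeof m, NULL);
            if (m.op == 'D') {             /* down request */
                if (count > 0) {
                    count--;
                    send_ok(m.reply);      /* grant the down */
                } else {
                    strcpy(waiter_q[tail], m.reply);
                    tail = (tail + 1) % MAXW;
                }
            } else {                       /* up request: no reply to the sender */
                count++;
                if (head != tail) {
                    count--;               /* finish the waiter's down action */
                    send_ok(waiter_q[head]);
                    head = (head + 1) % MAXW;
                }
            }
        }
    }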

 

Message passing is more powerful than Semaphores

Thus we have shown that message passing has all the power of semaphores on a single system, and it also works across systems, so it is more powerful than semaphores.

 

In Java, with threads in one process, we don’t typically use messages because it’s so easy to share data in memory; the code we have already studied shows how to do this. When we communicate between systems, we usually use the networking support, i.e., TCP, which means we need to talk about network host addresses and then, within hosts, the ports that define communication endpoints. There is also a part of JEE called JMS, for reliable and transactional messaging. Both TCP and JMS are “heavyweight” services; Java seems to lack a “lightweight” message mechanism for activities on the same system.

 

In Android, however, the architecture favors messages for communication, because messages work just as well between processes, and lots of Android functionality is implemented in servers, i.e., processes separate from the app processes.

 

So for Android, Google has implemented a new form of Linux IPC, having apparently rejected TCP and System V IPC as too slow, and certainly JMS.

 

The new service is called “Android Binder”. It sends messages from one process to another by copying them from the sender into kernel memory and then out to the receiver. Since this is not a data stream, it doesn’t properly fit the read/write Linux syscalls, so it is actually implemented as an “ioctl”, the miscellaneous file syscall. This service underlies broadcasts as well as calls to servers, including the Service Manager that keeps track of all the servers and services.

See Academic article (thesis) on Android Binder

 

 

Semaphores vs. Message Passing

 

Message passing can be done across systems, whereas the semaphore systems offered by OSs are for synchronization of processes or threads all on one system. On a single system (which may have multiple CPUs, but has a common memory system), semaphores and message queues have the same synchronization power. Semaphores can be implemented by message queues plus shared memory, and vice versa, on a single system. Semaphores are often lighter-weight than message passing, partly because they can depend on being between threads/processes on a single system.

Message passing by semaphores alone? No: semaphores give no way to move the data (hence the shared memory above).

 

 

Actual message services

POSIX message queues, etc.: available on most UNIX/Linux systems.  Provides message passing between processes on one system. (On Linux, you may have to rebuild the kernel to get message queues)

Pipes, network I/O: stream-oriented, but you can subdivide the stream into messages.

Win32 (Windows): pipes, network I/O.

Something portable between Windows and UNIX? Answer: TCP network I/O, which works between processes on one machine as well as over the network.

 

Rest of IPC coverage in Tanenbaum:

 

pg. 123: barriers—can skip.

 

Dining philosophers: has an easy solution with a mutex: get the mutex, pick up both forks, eat, release the mutex. But this allows only one eater at a time. The challenge is to safely allow 2 eaters.
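
Here is a minimal pthreads sketch of the one-mutex version (the loop count and printout are arbitrary):

    /* phil.c: dining philosophers with one global mutex -- one eater at a time */
    #include <pthread.h>
    #include <stdio.h>

    #define N 5
    static pthread_mutex_t table = PTHREAD_MUTEX_INITIALIZER;

    static void *philosopher(void *arg) {
        int i = (int)(long)arg;
        for (int round = 0; round < 3; round++) {
            /* think... */
            pthread_mutex_lock(&table);          /* get mutex */
            printf("philosopher %d eats with forks %d and %d\n",
                   i, i, (i + 1) % N);           /* picks up both forks, safely */
            pthread_mutex_unlock(&table);        /* release mutex */
        }
        return NULL;
    }

    int main(void) {
        pthread_t t[N];
        for (long i = 0; i < N; i++)
            pthread_create(&t[i], NULL, philosopher, (void *)i);
        for (int i = 0; i < N; i++)
            pthread_join(t[i], NULL);
        return 0;
    }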

 

Readers and Writers: a more realistic problem. Allow N readers XOR 1 writer for each data item. The solution uses two semaphores: one (db) makes sure the system is EITHER reading OR writing, the other (mutex) guards rc, the reader count. Only the first reader does the down on db, and while it waits for db, subsequent readers get blocked at the down on mutex, since the first reader still holds mutex until it gets db. But that’s OK, since they need to block, and they get unblocked at the right time too.
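
A C sketch of this solution with POSIX semaphores (db, mutex, and rc are from the text; the thread mix in main and the printfs standing in for database access are arbitrary):

    /* rw.c: readers-writers with two semaphores */
    #include <pthread.h>
    #include <semaphore.h>
    #include <stdio.h>

    static sem_t db, mutex;            /* both initialized to 1 */
    static int rc = 0;                 /* count of readers inside */

    static void *reader(void *arg) {
        sem_wait(&mutex);              /* guard rc */
        if (++rc == 1)                 /* first reader locks out writers */
            sem_wait(&db);
        sem_post(&mutex);
        printf("reading\n");           /* many readers may be here at once */
        sem_wait(&mutex);
        if (--rc == 0)                 /* last reader lets writers back in */
            sem_post(&db);
        sem_post(&mutex);
        return NULL;
    }

    static void *writer(void *arg) {
        sem_wait(&db);                 /* exclusive access */
        printf("writing\n");
        sem_post(&db);
        return NULL;
    }

    int main(void) {
        sem_init(&db, 0, 1);
        sem_init(&mutex, 0, 1);
        pthread_t t[4];
        pthread_create(&t[0], NULL, reader, NULL);
        pthread_create(&t[1], NULL, writer, NULL);
        pthread_create(&t[2], NULL, reader, NULL);
        pthread_create(&t[3], NULL, reader, NULL);
        for (int i = 0; i < 4; i++)
            pthread_join(t[i], NULL);
        return 0;
    }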

 

Sleeping barber—skip.

Scheduling

We’ve already discussed CPU-bound (= compute-bound) processes, and similarly I/O-bound ones, as well as preemption and preemptive schedulers.

Scheduling is about sharing resources appropriately among processes. There are 3 major resources to focus on: CPU time, memory, and I/O.

Each process wants some fraction of the total of each resource, and if these fractions (for some resource) add up to more than 100%, then things slow down, even with good scheduling.

Tanenbaum concentrates on CPU scheduling here.  Let’s downplay batch and real-time systems and concentrate on Interactive scheduling.

Interactive systems have users inputting commands and expecting fast response. Response Time, pg. 151.

Round-robin—simplest scheme for interactive systems. We’ll use this in hw3.

Quantum = the CPU time a process/thread gets before it is preempted (pg. 155: 20-50 ms).

Often a thread blocks before its quantum expires.

The quantum is set at about 50 ms as a compromise between throughput and responsiveness: a very short quantum wastes CPU time on process switches, while a very long one makes interactive response sluggish.
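
To fix ideas, here is a tiny round-robin simulation in C (not hw3 code; the thread count, quantum, and tick counts are arbitrary). The thread at the head of the ready queue runs each tick, and when its quantum expires it rotates to the tail:

    /* rr.c: simulate round-robin scheduling of 3 threads */
    #include <stdio.h>

    #define NTHREADS 3
    #define QUANTUM 5                       /* ticks per quantum */

    int main(void) {
        int ready[NTHREADS] = {0, 1, 2};    /* ready queue, head at index 0 */
        for (int tick = 0; tick < 30; tick++) {
            printf("tick %2d: thread %d runs\n", tick, ready[0]);
            if ((tick + 1) % QUANTUM == 0) {        /* quantum expired: preempt */
                int cur = ready[0];                 /* rotate head to tail */
                for (int i = 0; i + 1 < NTHREADS; i++)
                    ready[i] = ready[i + 1];
                ready[NTHREADS - 1] = cur;
            }
        }
        return 0;
    }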

A preemption causes a “process switch” or “context switch”. The second term is too vague, since context just means state, and sometimes it is used for just a mode switch, which happens in a system call. A system call sends execution from user mode to kernel mode in the same process, so it’s not a process switch, and in fact it is a much lighter-weight action than a process switch. That’s a good thing, since system calls (and the mode switches in interrupts) happen much more often.

Process switch is a mysterious operation. Recall that the CPU is handed over to a process for its 50 ms, then an interrupt occurs (or a system call), and the kernel decides to hand the CPU over to another process. But the kernel is using the CPU to make this decision!

Priority scheduling. Note the UNIX “nice” command, which lets you run a user process at a lowered priority; for example, “nice -n 10 bigjob” runs bigjob at lower-than-normal priority.

Real-time systems—just know what they are.

Thread scheduling: In both Solaris and Win2K, threads are the primary scheduling unit.  For each thread, the process it belongs to is known, of course.

Solaris has different scheduling classes, with different policies:  SYS, TS (timesharing), RT (real-time), IA (interactive).  In a multi-CPU system, you can bind a CPU to a thread.  In the RT class, you can even turn off interrupts on that CPU.