CS444 Class 12

The errno problem—C’s one and only visible global variable

 

From last time:

Luckily, the C library mostly provides its services through a pure API rather than through global memory variables.  (A “pure” API can still have thread-safety problems, as discussed above: ctime is “pure” in this sense, since it does all its work via a function call, yet it is still not thread-safe.)  errno is one of the very few exceptions.  errno is an int variable (as originally defined) that the C library fills in at each system call with the error number.  So you can consult errno to find out the error code for your program’s last system call.

 

Obviously two threads doing concurrent system calls would both try to fill in the same variable if errno stayed a simple global variable.  The trick that’s actually in use on Solaris is to define a macro like this in errno.h:

 

#ifdef BLAHBLAHBLAH
      #define errno (*(___errno()))   /* thread-safe way */
#else
      extern int errno;               /* old single-threaded way */
#endif

 

Then when you write  if (errno == EINTR) ..., you’re really calling this ___errno function to get the errno for your thread.  It works even when you write

 

            errno = 0;   /* reinitialize errno */

 

That’s why the underlying function ___errno() returns a pointer to the thread-private cell for errno, not the value.  This function has to determine the thread id (via a system call) and then find the errno slot for that thread.
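Here is a minimal sketch of how such a function might be implemented.  This is not the actual Solaris source: it uses POSIX thread-specific data rather than a thread-id system call, and errno_key and make_key are names invented for the sketch.

#include <pthread.h>
#include <stdlib.h>

static pthread_key_t errno_key;
static pthread_once_t errno_once = PTHREAD_ONCE_INIT;

static void make_key(void)
{
    pthread_key_create(&errno_key, free);    /* one key, shared by all threads */
}

int *___errno(void)            /* errno expands to (*(___errno())) */
{
    int *cell;
    pthread_once(&errno_once, make_key);     /* create the key exactly once */
    cell = pthread_getspecific(errno_key);   /* this thread's private cell */
    if (cell == NULL) {                      /* first use by this thread */
        cell = calloc(1, sizeof *cell);
        pthread_setspecific(errno_key, cell);
    }
    return cell;
}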

 

Note that Tan. discusses errno on pg. 114, but doesn’t give this practical solution.

 

The C library fills in errno by having appropriate code just after the trap returns from the kernel:

 

            int $0x80
            movl %ebx, errnos[tid]   # pseudo-assembler: save the errno from %ebx
                                     # into an array indexed by the thread id “tid”

 

 

We’ll look a little at thread methods in Java a bit later.


Mutex for kernel code and (under more restrictive rules) for user code

 

Read about critical sections, pg. 119 (Tan.).

Terminology:

Race condition: a situation where two or more independent processes (or threads or activities) are reading or writing shared data and the final result depends on exactly how they execute.

 

Mutual exclusion (or mutex): way of making access to shared data behave as one process (activity) at a time.

 

Critical region or critical section: code that accesses shared data, causing a race condition unless appropriate mutex is used.

 

We’ve already seen critical sections in hw1, since the program-level and interrupt-handler executions can interleave and they both access shared data, the input queue and the output queues.  So we needed a mutex to guard the critical sections; we used cli/sti, and depended on the fact that IF=0 in the interrupt handler.

 

Here, in hw1 (and hw2) the critical code is the access to shared queue pointers done by enqueue and dequeue, a total of 6 critical sections (2 for the input queue, 2 for the output queue, and 2 for the echo queue).
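For concreteness, here is a sketch of one such critical section, assuming cli() and sti() wrapper functions for the cli and sti instructions (as in the hw1 support environment) and a simplified queue; hw1’s real queue module differs in detail.

extern void cli(void), sti(void);   /* assumed wrappers: disable/enable interrupts */

#define QMAX 64
struct queue { char buf[QMAX]; int head, tail; };

void enqueue(struct queue *q, char ch)
{
    cli();                          /* IF=0: the interrupt handler can't interleave */
    q->buf[q->tail] = ch;           /* the critical section: shared queue pointers */
    q->tail = (q->tail + 1) % QMAX;
    sti();                          /* IF=1: interrupts allowed again */
}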

 

However, this simple and effective mutex (disabling interrupts) is only available to kernel code, and only works for uniprocessors, so we need additional mutex mechanisms.

 

Mutex Mechanisms other than disabling interrupts.

 

Lock Variables:  if we have several critical sections all accessing some shared data D, then we could set up a lock variable for D, say “lockforD”.  If lockforD is 0, the lock is free, and if 1, some thread of execution has the lock and can access D. 

 

(For example, an additional memory variable inQlock for the tty input queue.)

 

So far so good, but when we try to actually write code to obtain the lock, we run into the test-and-set problem:

 

   /* WARNING: this doesn’t work!! */
   while (lockforD == 1)   /* testing the lock */
       ;
   lockforD = 1;           /* setting the lock */

 

If this code is preempted (and all user code is preemptible) just after seeing lockforD go to 0, another process can run, also see it 0, and set it to 1.  Then when the first thread runs again, it acts on its stale observation that lockforD was 0, and both threads end up executing in the critical code.

 

The problem is the gap in control between the last test of the lock and the set of the lock.  This gap can be closed by having a special CPU instruction with both test and set actions in the same instruction, generically the TSL instruction.  Since interrupts logically come only between instructions (at least for x86), the TSL provides an atomic test-and-set action for a uniprocessor.  For multiprocessors (with shared memory), we need an “interlocked test-and-set”, interlocked between the processors.  This is usually provided by locking the bus shared by the processors for the duration of the instruction.

 

Locks based on TSL are called spin locks.  They can be used in both kernel code and user code, since TSL is not a privileged instruction.  It doesn’t need to be, because it only delivers up user-accessible data.  Preemption works during the spin, as always, to share the CPU.  Tan., pp. 122-124.

 

For x86, the BTS (bit test and set) instruction does the TSL action.  For multiprocessors, use the “lock” prefix on this instruction. 

 

enter_region:
            btsl $1, lockforD   # atomic test-and-set of bit 1
            jc enter_region     # if bit was 1 (lock busy), try again
            ret                 # if was 0, is now 1, i.e. our lock

leave_region:
            movl $0, lockforD   # clear the lock
            ret

 

 

How “btsl” works:

  1. read the target bit of the memory word into the CF (carry) bit of the EFLAGS register: this is the TEST part
  2. set the target bit of the memory word to 1: this is the SET part

How “jc addr” works:  if the CF bit of EFLAGS is on, branch to addr, else go on to the next instruction.

 

The shared-memory multiprocessor case requires the interlocked version, “lock btsl”.  More on this later.

 

Unlocking the lock:  This is easy. We just write a 0 into the lock variable, which can be done by a single instruction and thus is atomic on a uniprocessor.  It is still atomic on a shared-memory multiprocessor (helped by the fact that there is only one bit of memory that matters.)
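For comparison, here is the same lock written portably, a sketch assuming a C11 compiler: atomic_flag_test_and_set is the TSL action, and the compiler emits the target machine’s interlocked instruction for it.

#include <stdatomic.h>

static atomic_flag lockforD = ATOMIC_FLAG_INIT;   /* flag clear = lock free */

void enter_region(void)
{
    while (atomic_flag_test_and_set(&lockforD))   /* TSL: set the flag, return its old value */
        ;                                         /* spin while another thread holds it */
}

void leave_region(void)
{
    atomic_flag_clear(&lockforD);                 /* write 0: release the lock */
}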

 

Aside on Sparc, ulab’s CPU: the TSL instruction is ldstub (load-store-unsigned-byte).  The Solaris kernel uses “adaptive” locks, where the waiter checks to see if the lock-holder is currently running: if so, it spins; if not, it blocks.  Since both methods are effective, there’s no failure if things change between the decision and the action.

Ref: Solaris Internals, Mauro and McDougall, Sun Microsystems Press/Prentice Hall, 2001.

 

Advantages of spin locks:

No system calls, instead uses a CPU capability directly  (system calls have overhead)

Can be used in both user and kernel code

Can be used for both uniprocessor and shared-memory multiprocessor systems (with “interlocked btsl”).

 

Disadvantages

Wastes CPU by spinning

Needs assembler coding

 

Note: we could generalize our assembler code to take a pointer to the lock variable.  Then several different lock variables could be managed by one implementation.

 

Example. Suppose we want to port hw1/hw2 to a multiprocessor. We could no longer use cli/sti for mutex, so we’ll switch to using a spin lock.  One spin lock could do the whole job guarding the 6 critical sections, or we could use 3 spin locks, one for each queue. That would allow simultaneous enqueue for input and enqueue for output, for example, while still ensuring the health of each queue.  We could even use 6 spin locks, one for each queue for a single COM port, i.e. 3 for COM1 and 3 for COM2.  That’s the most we could use.
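A sketch of the per-queue locking, reusing the C11 atomic_flag spin lock shown earlier; struct queue and QMAX are the invented names from the earlier enqueue sketch, with the cli/sti pair replaced by the queue’s own lock.

#include <stdatomic.h>

#define QMAX 64
struct queue {
    atomic_flag lock;               /* this queue's own spin lock */
    char buf[QMAX];
    int head, tail;
};                                  /* file-scope queues start zeroed: flag clear */

void enqueue(struct queue *q, char ch)
{
    while (atomic_flag_test_and_set(&q->lock))
        ;                           /* spin: another CPU holds q's lock */
    q->buf[q->tail] = ch;
    q->tail = (q->tail + 1) % QMAX;
    atomic_flag_clear(&q->lock);    /* release q's lock */
}

Since each queue carries its own lock variable, this one implementation manages all six locks, as suggested in the note above.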

 

For longer expected waits, we should be using blocking, not spinning.  Recall that blocking means waiting without using CPU, because the kernel has switched the CPU to another activity. Clearly to block, the process must enter the kernel, i.e., execute a system call. Thus we need additional system calls for mutex that can block.

 

This brings us to semaphores, which (for user level) have system calls that can block the user program until something happens, via the same semaphore being used by another thread or process.

 

Summary on mutex mechanisms, so far:

Uniprocessor kernel code can turn off interrupts for mutex, but user code can’t.

Uniprocessor or multiprocessor kernel code can use spin locks for short waits, but kernels avoid spinning if possible.

User code needs appropriate system calls to help it, or it can use spin locks if expected waits are short enough.

 

Semaphores

 

Pg. 126: At this point, Tan. explores a bare-bones sleep/wakeup facility for user-level programming, and finds it doesn’t really solve the problem.  The problem boils down to the fact that a wakeup sent to a process that is just about to sleep gets lost.  We need a facility that remembers the pending wakeup and applies it to a sleep.  Well, that’s what semaphores do.

 

Semaphores were invented in 1965 by Dijkstra.  A really useful construct.  A semaphore is a system object with some sort of identifier, here “sem”.  The sem object has a count associated with it, where count >= 0.  The two important operations on sem (after its creation) are:

down(sem): if sem’s count is greater than 0, decrement the count and continue; otherwise block until another thread does up(sem), then decrement the count and continue.

up(sem): if one or more threads are blocked in down(sem), unblock one of them; otherwise increment sem’s count.

Note: Tanenbaum’s description on pg. 128 does not make it clear that down, after blocking, and then unblocking, does decrement the semaphore’s count.
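In pseudo-C the semantics look like this.  A sketch only: the kernel makes each call atomic, and sleep_on and wakeup_one are invented names for the kernel’s block and unblock operations.

struct sem { int count; };              /* plus an implicit queue of blocked threads */

extern void sleep_on(struct sem *);     /* assumed: block the caller on s */
extern void wakeup_one(struct sem *);   /* assumed: unblock one waiter, if any */

/* each call runs atomically, under some kernel-internal mutex */
void down(struct sem *s)
{
    while (s->count == 0)
        sleep_on(s);                /* block until some up() on s */
    s->count--;                     /* note: the decrement happens after unblocking */
}

void up(struct sem *s)
{
    s->count++;
    wakeup_one(s);                  /* a waiter, once awake, retries its test */
}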

 

To make a real system, we also need to be able to create a semaphore with a certain initial count: sem = createsem(initcount).  POSIX semaphores: sem_init(…)
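For reference, the POSIX counterparts, a sketch: sem_wait is down and sem_post is up.

#include <semaphore.h>

int main(void)
{
    sem_t sem;
    sem_init(&sem, 0, 2);   /* like sem = createsem(2); the 0 means shared among threads, not processes */
    sem_wait(&sem);         /* down: count 2 -> 1 */
    sem_post(&sem);         /* up:   count 1 -> 2 */
    sem_destroy(&sem);
    return 0;
}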

 

Clearly we need multiple threads/processes to utilize semaphores.

Stupid single-thread user-level program, using semaphores via system calls:

 

int sem = createsem(2);

down(sem);  /* sem count goes 2->1 */

down(sem);  /* 1 -> 0 */

down(sem);  /* and then this thread waits forever! */

 

Now if another thread knew about "sem", it could do an up(sem) and unhang this one.  You can see that semaphores need multiple threads to really be useful.  The first use we'll look at is providing mutex.  After all, mutex is a matter of making other threads wait while one proceeds.  It turns out to be quite easy to create a mutex from a semaphore:

 

/* initial count of one represents the one process allowed to run in the critical code */
int mutex = createsem(1);

void enter_region(int mutex)
{
    down(mutex);
}

void leave_region(int mutex)
{
    up(mutex);
}

 

When process A enters the critical code, mutex’s count goes from 1 to 0.   When B tries to enter, the count of 0 puts it to sleep.  When C tries to enter, same thing.  When A finishes, it increments count to 1, but the existence of waiters means that one (say B) is chosen to run and it decrements the count back to 0.  When B finishes, same story.  When C finishes, it increments the count to 1, and since there are no waiters now, it stays at 1 for a while.
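The same construction with real POSIX semaphore calls might look like this (a sketch; init_mutex is an invented setup function, and in practice POSIX also offers pthread_mutex_t directly):

#include <semaphore.h>

static sem_t mutex;                 /* a semaphore used as a mutex */

void init_mutex(void)   { sem_init(&mutex, 0, 1); }   /* count 1: one thread allowed in */
void enter_region(void) { sem_wait(&mutex); }         /* down: 1->0, or block */
void leave_region(void) { sem_post(&mutex); }         /* up: wake a waiter, or 0->1 */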

 

We have been talking about user code with semaphore system calls.  A kernel can also have a semaphore service for its own internal use, in which case the up and down calls are just ordinary function calls.  The code being called (by system calls or kernel semaphore calls) needs to implement semaphores using a more primitive form of kernel mutex, say disabling interrupts.  It’s a kind of bootstrapping idea.
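Concretely, for a uniprocessor kernel, the atomicity can come from disabling interrupts.  Again a sketch: cli()/sti() as before, and block_current and unblock_one are invented names for the scheduler’s block and unblock operations.

struct ksem { int count; };                 /* plus an implicit queue of blocked processes */

extern void cli(void), sti(void);           /* assumed wrappers, as before */
extern void block_current(struct ksem *);   /* assumed: sleep until unblocked */
extern void unblock_one(struct ksem *);     /* assumed: make one waiter runnable, if any */

void kdown(struct ksem *s)
{
    cli();                          /* the primitive mutex: no interrupts */
    while (s->count == 0) {
        block_current(s);           /* sleep; other processes run meanwhile */
        cli();                      /* back awake: re-disable before re-testing */
    }
    s->count--;
    sti();
}

void kup(struct ksem *s)
{
    cli();
    s->count++;
    unblock_one(s);
    sti();
}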

 

Next: producer-consumer, a classic problem solved neatly with semaphores.