CS444 Class 2

Tuesday, Sept. System Calls, SAPC Programming Environment

Notes on last class:

Virtual machine terminology

“Virtual machine” is used for several similar concepts. See the Wikipedia page on this.

1. A system virtual machine: the whole computer system is virtualized. This is what Tanenbaum describes in Sec. 1.7.4.

2. A process virtual machine: the environment in which a program runs is a virtual machine, having a virtualized CPU, virtual memory, and ability to do i/o. This is what I was describing last class.

3. The Java virtual machine: the environment in which Java programs run. See Tan., pg. 71.

CPU kernel mode vs. user mode

CPUs that can run modern OS’s must have kernel mode execution and user mode execution, where kernel mode has privileges denied to user mode, such as turning off the interrupt system. The kernel code runs in kernel mode, and the user code runs in user mode. See Tan., pg. 2. The system call instruction is a special instruction in the CPU instruction set that changes the current execution mode from user to kernel. There must also be an instruction that returns from kernel mode to user mode, iret for x86.

Continuing from last time…

User Memory Layout, flat address space. I like to draw it horizontally, because I think of it as the floor on which things are built, but the text draws it vertically (pg. 51 for example). Each byte of memory usable by the program running in the virtual machine has a unique address. We can think of all these addresses as one sequence, and there is more structure within it: for a simple C program, code uses the lower addresses and static data somewhat higher addresses:

                 code      data         C lib DLL stack
                 ---       ---            ----    <----
        |--------|--------|------------ … -------------------|

Address 0        A1       A2                              Amax

0 < A1 < A2 < Amax

First consider a 32-bit system, one with 32-bit addresses, UNIX/Linux/Windows.

0xf = 1111 binary, so 4 binary 1’s for each f in an address. 0xffffffff has 8 f’s, so has 32 bits of 1s. This is the highest possible 32-bit address.

Thus for a 32-bit system, Amax <= 0xffffffff. 64-bit systems can have higher Amax. More on this case later.

Important powers of 2: 1G = 2³⁰, 1M = 2²⁰, 1K = 2¹⁰

So 0xffffffff = 2³² - 1 = 4G - 1. Thus the maximum possible 32-bit user address space is 4G bytes in size, the full 32-bit address space size. Of our example OS’s, only Sun Solaris UNIX provides this maximum possible size.

There can be holes in the available memory for a program, stretches of addresses that cause segmentation faults when referenced. We still call the memory "flat," because one sequence of memory addresses still can describe the whole thing, and every byte of usable memory has its own unique address.

The OS code, the kernel, is not in this space but off somewhere else—shown in a cloud on the board. The system call causes execution to jump right out of this user space into the kernel. In the kernel, the system call implementation code executes to do the service, and then returns to the next user instruction after the system call instruction.

Solaris 32-bit UNIX gives user space the entire 32-bit address space. Thus the Solaris user address space is 4G bytes in size. Other UNIX implementations provide 3-4G. 32-bit Linux provides 3G. 32-bit Windows provides 2G by default, or 3G by special boot command for Advanced Servers. The size of the user memory space (above 1G) is only relevant for the largest apps, notably huge database systems. Nice diagrams for Linux and Windows 32 bit systems.

DLL: dynamic-link library, or just dynamic library in UNIX parlance: code that can be called by a program but is not stored in the program’s executable file. Instead, it is brought into user memory at runtime. Functions are located in the DLL via “dynamic linkage” at runtime. Once this linkage is done, calls are direct, since the DLL is in user memory.

User Memory Layout for Solaris UNIX (32 bit): 4G user address space (the first 0x10000 bytes are purposely made unavailable to trap null pointer accesses)

                 code      data              C lib DLL    stack
                 ---       ---                -----      <----
        |--------|--------|------------ … -------------------|

Address 0        0x10000  0x20000                   0xffffffff

User Memory Layout for Win32 (32 bit): 2G user address space. Amax = 0x7fffffff, which has the leading bit = 0 and the rest 1s, so the user space is only half of the full 32-bit address space.

                 code      data              C lib DLL    stack
                 ---       ---                -----      <----
        |--------|--------|------------ … -------------------|

Address 0                                           0x7fffffff

32-bit Linux: Amax = 0xbfffffff, so 3GB of user space. (sf08.cs.umb.edu for example)

64-bit systems: much bigger user space, no longer bottled up in the 32-bit address space. But not really “64 bit” addresses, more like 48 bit.

Example: linux1.cs.umb.edu, a 64-bit Linux system you have access to.

From this, we see that the stack grows down from 0x7fff ffff ffff, the code starts at 0x400000, and data starts at 0x600000, and the C DLL is around 0x7ffff7a8d6d0, below the stack but at the high end of user memory.

0x7fff ffff ffff has 15 bits of 1s from 7fff, plus 32 bits of 1s from ffff ffff, for a total of 47 bits in use in user space addresses. The 32 bits provide 4GB of user space, and the additional 15 bits a factor of 32K (16 bits would give 64K, and this is half of that), so the total user address space size is 32K * 4G = 128 TB. That should be enough for anything we might need! At least for the next 20 years...

Of course this is just user space, not allocated memory. The OS does a “shell game” to put memory where it’s needed under the user space. We’ll study that in more detail under memory management.

Next time: what about Android?

We looked at the nice large (multi-GB) flat user address spaces for 32-bit UNIX/Linux/Windows and huge flat address spaces for 64-bit systems.

This user address space is part of the “virtual machine” that the OS provides for a running program.

What about Android?

Android runs Linux on 32-bit processors (ARM or x86), so at the OS level it has 3GB of user address space. But smart phones don’t have much physical memory, so this is just in theory.

                                        user stack
                                           <--
        Startup, Dalvik or C program libraries (DLLs)
|---------------------------------|--------------|
0                          0x80000000 = 2G   0xc0000000 = 3G

However, most Android apps run in the Java environment, built on top of Linux. Within this environment (Dalvik), user address space is artificially restricted to (say, for Android 4) 48MB to force app developers to use memory stingily. You can run C on the Linux environment (native code), however, to avoid this limit.

Each app execution runs in its own virtual machine, on Android and on UNIX/Windows, so one app program can’t look into another app’s memory.

Note that in general, the Java “virtual machine” is built on top of the OS virtual machine. So when you use println, the Java runtime has to do a system call to do the output. The Java environment uses the flat address space. Each object ref is an address, and by the flatness, this is unique, so when you compare refs and see them the same, you know you have two refs to the same object. Of course it would be possible to support Java on a non-flat address space, but it would be more difficult. The object ref would have to have more than an address in it.

Finding UNIX system calls

Example: simple UNIX program, with a handout on a debugging session on linux1.cs.umb.edu (64-bit Intel-compatible AMD processor) to see how a program uses a system call.

Demo: followed script pretty closely. Also looked at some other addresses to see how huge the user address space is on a 64-bit Linux system.

What the handout shows:

You can use a system call from an ordinary C program.
The code for main contains a call instruction, “call write”. This is an ordinary call instruction.
After startup (when execution reaches main), write is at a very high address, in the C library DLL. This is the little routine that contains the system call instruction.
We followed the code inside write and saw the system call instruction, syscall (called TRAP in Tanenbaum).
An “si” in gdb (si = step instruction) executes the whole write system call in the kernel—it all appears as a single instruction execution at user level.
After the system call executes, the rest of the write function executes, then it returns to main.
There is a startup module, where execution starts and where the C library and DLLs are initialized.
The startup module calls main with an ordinary call instruction (not actually shown), because main is just an ordinary C function with a special name.
After main finishes, it returns to the startup code (with an ordinary function return instruction).
Then the startup code calls back into the C library, and eventually executes the exit system call, collapsing the whole user virtual machine.

Each system call has a little envelope routine in the C library. We saw the one for write, and another one for exit. These are written in assembler. This allows us to make a normal function call to write to do a system call from C. It’s not possible to execute the “syscall” instruction directly from C code (except with embedded assembler code.)

SAPC Programming Environment

Ref: SAPC Programming Environment

4M flat memory, addresses 0 to 0x400000 – 1 = 0x3fffff.

(from SAPC Programming Environment:)

--increasing memory addresses--->

|-------------|---------------------------------|---- ... ---------------|

0             0x00100000 = 1M                   0x00400000 = 4M 0xffffffff

<----------4M RAM=read/write memory-------------><----No usable memory--->

<--sys area-->|<-----------3M user memory------>|------------

The first 1M is the “system area”. We could use parts of it, but it is cluttered with reserved areas—BIOS, video memory, Tutor, etc.

So we just use the upper 3M from 0x100000 to 0x3fffff for downloaded programs. Code is downloaded to start at 0x100100, a convenient address above 0x100000. The data follows it immediately. There is a startup module that calls main. The startup code sets the stack pointer to the top-of-memory address to locate the stack there.

What’s in that first megabyte of SAPC memory labeled “sys area” above?

--ordinary memory from 0 to 0x9ffff, with Tutor residing in 0x50000-0x60000 or so.

--video memory in 0xa0000-0xb0000 or so.

--BIOS in 0xf0000-0xfffff, the upper end of the sys area.

This means that we can use the ordinary memory between 0 and 0x50000 for experiments if we want to. It is used temporarily during bootup by the “bootstrap”, but this use is long over when the system is booted up.

User program layout on the PC:

                 code  data                 <--stack
|----------------|----------------------------------|---- ... ----------|
0            0x100000        user memory        0x400000       0xffffffff

There is no C library DLL here. Instead, code for the C library functions in use by the program is part of the code area shown here. The i/o functions call into Tutor to do the i/o. This saves some downloading time and allows remote gdb to know about the i/o. However, this doesn’t qualify Tutor as an OS; it’s just serving as an i/o library, like the system you’ll write for hw1.

SAPC hardware:

Pentium CPU (or 486, even 386 would be compatible, “x86” for short)
4M usable memory
(no hard disk, floppy is for booting only)
(no keyboard or monitor or mouse)
COM2 serial port: used for console i/o
COM1 serial port: used for remote gdb protocol
(LPT1 parallel port: but nothing is connected to it, so not much use)
timer device: PIT, for programmable interval timer, used for periodic interrupts, “ticks”
reset circuitry: used to reboot via ~r in mtip

SAPC software:

BIOS: at reboot (or “reset”) this code initializes the hardware, loads and starts the Tutor bootstrap. It is in ROM, “burned in”, thus always available at power-up.
Tutor (and its bootstrap) – this debugger is loaded from floppy disk into RAM, i.e., ordinary read/write memory. It does more hardware initialization and switches from 16-bit real mode to 32-bit protected mode, while staying in kernel mode. This code derives from Linux, remote gdb, plus some local code.