Thursday, Sept. 8

Finding UNIX system calls

Example: a simple UNIX program, with a handout showing a debugging session on Sun Solaris (Sparc processor) that reveals how the program uses a system call.

What the handout shows (in Sparc assembly language, but we can figure it out):

You can use a system call from an ordinary C program.
The code for main contains a call instruction, “call write”. This is an ordinary call instruction.
After startup (when execution reaches main), write is at a very high address, in the C library DLL. This is the little routine that contains the system call instruction.
We followed the code inside write and saw the system call instruction, ta 8.
An “si” in gdb (si = step instruction) executes the whole write system call in the kernel—it all appears as a single instruction execution at user level.
After the system call executes, the rest of the write function executes, and then it returns to main.
There is a startup module, where execution starts and the C library and DLLs are initialized.
The startup module calls main with an ordinary call instruction, because main is just an ordinary C function with a special name.
After main finishes, it returns to the startup code (with an ordinary function return instruction).
Then the startup code calls back into the C library, and eventually executes the _exit system call, collapsing the whole user virtual machine.

SAPC Programming Environment

Ref: SAPC Programming Environment

4M flat memory, addresses 0 to 0x400000 – 1 = 0x3fffff.

(from SAPC Programming Environment:)

--increasing memory addresses--->
|-------------|---------------------------------|---- ... ---------------|
0             0x00100000 = 1M                   0x00400000 = 4M 0xffffffff
<----------4M RAM=read/write memory-------------><----No usable memory--->
<--sys area-->|<-----------3M user memory------>|------------

The first 1M is the “system area”. We could use parts of it, but it is cluttered with reserved areas—BIOS, video memory, Tutor, etc.

So we just use the upper 3M from 0x100000 to 0x3fffff for downloaded programs. Code is downloaded to start at 0x100100, a convenient address above 0x100000. The data follows it immediately. There is a startup module that calls main. The startup code sets the stack pointer to the top-of-memory address to locate the stack there.

What’s in that first megabyte of SAPC memory labeled “sys area” above?

--ordinary memory from 0 to 0x9ffff, with Tutor residing in 0x50000-0x60000 or so.

--video memory in 0xa0000-0xb0000 or so.

--BIOS in 0xf0000-0xfffff, the upper end of the sys area.

This means that we can use the ordinary memory between 0 and 0x50000 for experiments if we want to. It is used temporarily during bootup by the “bootstrap”, but this use is long over when the system is booted up.

User program layout on the PC:

                  code data                 <--stack
|----------------|----------------------------------|---- ... ----------|
0                0x100000     user memory           0x400000   0xffffffff

There is no C library DLL here. Instead, code for the C library functions in use by the program is part of the code area shown here. The i/o functions call into Tutor to do the i/o. This saves some downloading time and allows remote gdb to know about the i/o. However, this doesn’t qualify Tutor as an OS; it’s just serving as an i/o library, like the system you’ll write for hw1.

SAPC hardware:

Pentium CPU (or 486, even 386 would be compatible, “x86” for short)
4M usable memory
(no hard disk, floppy is for booting only)
(no keyboard or monitor or mouse)
COM2 serial port: used for console i/o
COM1 serial port: used for remote gdb protocol
(LPT1 parallel port: but nothing is connected to it, so not much use)
timer device: PIT, for programmable interval timer, used for periodic interrupts, “ticks”
reset circuitry: used to reboot via ~r in mtip

SAPC software:

BIOS: at reboot (or “reset”) this code initializes the hardware, loads and starts the Tutor bootstrap. It is in ROM, “burned in”, thus always available at power-up.
Tutor (and its bootstrap) – this debugger is loaded from floppy disk into RAM, i.e., ordinary read/write memory. It does more hardware initialization and switches from 16-bit real mode to 32-bit protected mode, while staying in kernel mode. This code derives from Linux, remote gdb, plus some local code.

Thus after boot-up, the SAPC’s x86 CPU is running in 32-bit protected mode and in kernel mode. 32-bit protected mode means that we can use 32-bit addresses, addressing up to 4G locations. 16-bit mode would only allow 64K different addresses in one sequence, a terrible handicap for today’s programs. Even our measly 3M user memory would constitute 0x30 = 48 different sequences of 64K (or 0x10000). In 32-bit mode, an address of 0x300000 (3M) is just an ordinary address, in fact seen to be on the small side if you write all 32 bits out like this: 0x00300000.

Warning: many texts on Pentium architecture treat only the real mode architecture. To tell, look at the register names in use. AX = 16-bit register, whereas EAX = 32-bit register. Similarly SP vs. ESP. We can use AX in 32-bit mode, but we wouldn’t use it a lot (it’s just the lower 16 bits of EAX.)

Building and downloading programs for the SAPC

See www.cs.umb.edu/ulab for first steps: get the ulab module load into your .cshrc, then build and run test.c. Note that the ulab module adds environment variables to your UNIX process to make SAPC work easier. $pcex is the examples directory path. Try “echo $pcex” to check it, and “env | grep pc” to see more.

Note that we use a cross-compiler i386-gcc and other cross-tools; they run on Sparc but generate or work with x86 machine code. The makefile in $pcex can build from any single C source xyz.c by “make C=xyz”. This will make xyz.lnx, an SAPC executable stored in a UNIX file.

We can download a .lnx file by using mtip, a Sparc UNIX program usable on ulab.cs.umb.edu. The reason it has to be ulab is that the serial lines to the 14 SAPCs are connected to ulab. mtip finds an SAPC not already in use and assigns it to you, and then provides a conduit between your keyboard and monitor and the console of your SAPC. It watches the characters as they go by, and if you type “~r”, it springs into action and arranges a reboot for you, or if you type “~d”, a download.

With the help of mtip, you will see the Tutor prompt “Tutor>” coming from your SAPC. Tutor was listed above as SAPC software—it is running on the SAPC.

System setup for SAPCs

Each SAPC is (effectively) connected to the UNIX host “ulab” by a serial line from COM2 on the SAPC to a serial port on ulab. We call this the console line. When Tutor prints a prompt, those bytes are going out COM2 and are handled by the UNIX program “mtip” that we run on ulab.

mtip shuttles bytes back and forth between your stdin/stdout user i/o setup and the console line. When you type a character during an mtip session, it goes via stdin to the mtip program, and from there out to COM2 of the connected SAPC. If you type “~”, the mtip escape character, mtip takes notice and then waits for the next character to see what to do, for example ~d for download.

Each SAPC numbered 5 or more has its COM1 separately connected to another serial port on ulab. This connection is used by remote gdb to provide debugging.

When you type “~r” in mtip, the connected SAPC gets rebooted. mtip interprets the “~r” itself and then runs another program that sends a command via a serial port on ulab to the “reset server”, a little machine that John Lentz built for this purpose. The reset server interprets the command and then generates a signal on a wire into the reset circuitry of the right SAPC, that is, a signal equivalent to pressing the reset button on the SAPC box.