Reading: Tanenbaum
Previously assigned: Chap. 1, specifically Sec. 1.1, [1.2 optional history], 1.4, 1.5, 1.6, 1.8
Now add Chap. 5, 5.1, 5.2, 5.3 I/O systems
News: 64-bit Android coming! Thanks to Nick Rosato for noting this.
C “Objects”, a quick intro/review.
Here the Queue implementation in C is provided in the queue subdirectory, along with a unit test, testqueue.c. queue.h defines the Queue API and the Queue type.
You know how to do objects in Java; what about in C? Recall the Chip object in CS341. Also the Cmd object for tutor.
Example: Rectangle objects
struct rect
{
    int x1, y1, x2, y2;
};
struct rect *recp;
recp->x1 = 10;
Watch out! There’s no (validly allocated) memory pointed to by recp! This is a garbage pointer, a pointer that points nowhere good. Using it usually causes segmentation violations on UNIX and similar exceptions on Windows. But on the SAPC, all user memory is writable, so the store may quietly succeed and damage some memory.
Normally under UNIX or Windows, we could malloc space for the Rectangle, but we have no malloc in the SAPC library. Anyway, malloc is quite expensive in CPU time, and we can easily avoid using it. Just set up whole objects like this:
struct rect rect;
Now we can write
rect.x1 = 10;
As a Java programmer, you’re used to “Rectangle rect;” meaning rect is a ref to a Rectangle, but in C, rect is the whole object, more like int x; in C or Java.
To make a rectangle pointer, we have to use * in the type:
“struct rect *rp;” defines rp to be a struct rect pointer.
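As a minimal sketch of the safe alternative to the garbage pointer above: give the struct its own memory first, then point rp at it. The names rect_storage, set_corners and valid_rect_pointer are hypothetical, made up for this illustration only.

```c
#include <assert.h>
#include <stddef.h>

struct rect {
    int x1, y1, x2, y2;
};

/* A whole object with its own memory (static storage, so no malloc needed). */
static struct rect rect_storage;

/* Set a rectangle's corners through a pointer, much as a Java ref would be used.
 * Hypothetical helper, not part of any course library. */
void set_corners(struct rect *rp, int x1, int y1, int x2, int y2)
{
    assert(rp != NULL);   /* catches a null pointer, but not an arbitrary garbage one */
    rp->x1 = x1;
    rp->y1 = y1;
    rp->x2 = x2;
    rp->y2 = y2;
}

/* Return a pointer that is valid because it refers to rect_storage. */
struct rect *valid_rect_pointer(void)
{
    struct rect *rp = &rect_storage;   /* rp now points at real memory */
    set_corners(rp, 10, 20, 30, 40);
    return rp;
}
```

The only difference from the broken example is the initialization rp = &rect_storage; after that, rp->x1 and rect_storage.x1 name the same int.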
Living without malloc: use C’s ability to place objects in memory directly
struct rect {
    int x1, y1, x2, y2;
};
struct rect r;    /* this sets up memory for r “right here” */
r.x1 = 10;
Similarly, we can set up larger objects that contain rectangles:
struct view {
    struct rect first, last;
};
struct view v;
v.first.x1 = 10;    /* set coord of v’s first rect */
v.first = r;        /* set whole rect by struct copy in C (r is the struct rect set up above) */
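To see that struct assignment really copies the whole object (and does not share it, as a Java reference assignment would), here is a small sketch. make_view is a hypothetical helper invented for this example.

```c
#include <assert.h>
#include <stddef.h>

struct rect {
    int x1, y1, x2, y2;
};

struct view {
    struct rect first, last;
};

/* Build a view whose two rectangles both start as copies of *src.
 * Hypothetical helper for illustration. */
struct view make_view(const struct rect *src)
{
    struct view v;
    assert(src != NULL);
    v.first = *src;   /* struct copy: all four ints are copied */
    v.last = *src;    /* a second, independent copy */
    return v;         /* returning a struct copies it too */
}
```

After the call, changing the original rect or v.first has no effect on v.last, because each one occupies its own memory.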
With typedef we can drop “struct” in the type name:
typedef struct rect {
    int x1, y1, x2, y2;
} Rectangle;
struct rect r;    /* one way */
Rectangle r1;     /* Nice looking, but different from Java! */
struct view {
    Rectangle first, last;
};
Note that this typedef syntax is used for the supplied Queue in hw1.
From testqueue.c:
Queue q1obj, q2obj;     /* gives memory to Queue objects */
…
Queue *q1 = &q1obj;     /* pointers to queue mem objects */
Queue *q2 = &q2obj;     /* Now use ptrs like Java refs */
…
enqueue(q1, c);         /* enqueue a char c */
Setting up whole objects like this may seem strange after using Java a long time, but it’s good to really know C too!
There is more we should cover on this topic. We have no encapsulation here, so calling this an “object” is hard to defend to object-oriented people. It’s an object in the practical sense that it brings together related data and has an API that describes what can be done with the data; see queue.h.
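To make the “data plus API” idea concrete, here is a minimal sketch of a fixed-size char queue written in this style. It is NOT the supplied queue.h/queue.c: the type DemoQueue, the functions demo_init, demo_enqueue and demo_dequeue, and the capacity are all assumptions invented for the example, and the real Queue API may differ.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_QSIZE 6   /* capacity chosen only for this example */

typedef struct demo_queue {
    char buf[DEMO_QSIZE];
    int head;    /* index of the oldest char */
    int count;   /* number of chars currently stored */
} DemoQueue;

void demo_init(DemoQueue *q)
{
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
}

/* Returns 0 on success, -1 if the queue is full. */
int demo_enqueue(DemoQueue *q, char c)
{
    assert(q != NULL);
    if (q->count == DEMO_QSIZE)
        return -1;
    q->buf[(q->head + q->count) % DEMO_QSIZE] = c;   /* wrap around the array */
    q->count++;
    return 0;
}

/* Returns the oldest char (as an int), or -1 if the queue is empty. */
int demo_dequeue(DemoQueue *q)
{
    int c;
    assert(q != NULL);
    if (q->count == 0)
        return -1;
    c = (unsigned char)q->buf[q->head];
    q->head = (q->head + 1) % DEMO_QSIZE;
    q->count--;
    return c;
}
```

As in testqueue.c, the caller supplies the memory (DemoQueue qobj;) and passes a pointer (&qobj) to every call, so no malloc is needed.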
Hw1: Dataflow to be handled by Queues
Then testio.c does a read(ldev, buf, 10), requesting 10 bytes from the user. It is implemented with interrupts already, but you need to switch it from direct use of rbuf and tbuf to use of Queues from queue.c. Also, it doesn’t wait properly for all the chars requested. Note the busy loop before the read: this wastes time so you have a chance to input a few characters before read executes.
Input: chars arrive in the interrupt handler, where they are each enqueued into the input queue. This capability is called “typeahead”, and allows the user to type input early, before the read call happens. ttyread dequeues each char and puts it in the user buffer.
What’s the user buffer? That’s described by the parameters of ttyread, or read itself: “char *buf, int nchars”. The app code (here testio.c) is requesting a bufferful of chars by calling read. The code as provided already delivers some chars to the user buffer.
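The input dataflow can be sketched on an ordinary machine with no interrupts at all, by calling a stand-in for the handler directly. Everything here is hypothetical: sim_rx_interrupt plays the part of the interrupt handler, sim_ttyread plays the part of ttyread, and the little array queue stands in for the hw1 Queue. This sketch returns early when the queue runs dry rather than waiting for more input.

```c
#include <assert.h>
#include <stddef.h>

#define INQ_SIZE 16              /* capacity chosen only for this example */

static char inq[INQ_SIZE];       /* input queue storage */
static int inq_head;             /* index of the oldest char */
static int inq_count;            /* chars waiting to be read */

/* Stand-in for the receive interrupt handler: one arriving char is enqueued.
 * If the queue is full the char is dropped. */
void sim_rx_interrupt(char c)
{
    if (inq_count < INQ_SIZE) {
        inq[(inq_head + inq_count) % INQ_SIZE] = c;
        inq_count++;
    }
}

/* Stand-in for ttyread: move up to nchars queued chars into the user buffer.
 * Returns the number of chars delivered; does not wait for more input. */
int sim_ttyread(char *buf, int nchars)
{
    int n = 0;
    assert(buf != NULL && nchars >= 0);
    while (n < nchars && inq_count > 0) {
        buf[n++] = inq[inq_head];
        inq_head = (inq_head + 1) % INQ_SIZE;
        inq_count--;
    }
    return n;
}
```

Calling sim_rx_interrupt a few times before sim_ttyread models typeahead: the chars sit in the queue until the read call asks for them.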
Output: chars arrive from the user, in the user buffer, i.e. buf and nchars as arguments. The code in ttywrite enqueues them in the output queue.
Eventually, your new interrupt code will dequeue them in the interrupt handler. As provided, the code just outputs them from ttywrite. Don’t worry about this part yet; we will cover interrupts soon.
First note why we want to do this for an OS course: recall that the OS receives work to do via system calls from the running app program, and then does that work by controlling all the hardware. So to write an OS, we need to do hardware programming.
Ref: Notes on using C to access hardware registers; Tanenbaum, Sec. 1.3.5, 5.1.5
The PC uses I/O ports for most I/O devices, not memory-mapped I/O. It is capable of memory-mapped I/O, however.
First look at $pcex/echo.c. You can build it the same basic way as test.c. Just have $pcex/makefile as well as echo.c in a directory and run “make C=echo”. You’ll get echo.lnx to download.
The (32-bit) x86 architecture specifies 32-bit memory addresses and, separately, 16-bit i/o port numbers. A device is assigned a little block of i/o port numbers for communication over the bus. For example, COM2 has 0x2f8-0x2ff, and COM1 has 0x3f8-0x3ff. This is very standard across all PC models and vendors.
x86 (32-bit) CPU Registers
EAX, 32 bits; AX, its low 16 bits; AL, its low 8 bits; and similarly
EBX, BX, BL,
ECX, CX, CL,
EDX, DX, DL,
a few other general registers
ESP, 32-bit stack pointer
EIP, 32-bit instruction pointer, also known as the program counter or PC for short
EFLAGS, control and status
64-bit: RAX, RBX, etc.
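The EAX/AX/AL relationship is just “the same bits, viewed at different widths”. A small sketch in C, using an ordinary 32-bit variable in place of the register (low16, low8, set_low8 and check_register_views are hypothetical names for this illustration):

```c
#include <assert.h>
#include <stdint.h>

/* Low 16 bits of a 32-bit value, as AX is to EAX. */
uint16_t low16(uint32_t eax)
{
    return (uint16_t)(eax & 0xFFFFu);
}

/* Low 8 bits of a 32-bit value, as AL is to EAX. */
uint8_t low8(uint32_t eax)
{
    return (uint8_t)(eax & 0xFFu);
}

/* Replace only the low 8 bits and keep the upper 24,
 * which is the effect a write to AL has on EAX. */
uint32_t set_low8(uint32_t eax, uint8_t al)
{
    return (eax & ~(uint32_t)0xFFu) | al;
}

/* Self-check with a sample value; aborts if any view is wrong. */
void check_register_views(void)
{
    uint32_t eax = 0x12345678u;
    assert(low16(eax) == 0x5678u);
    assert(low8(eax) == 0x78u);
    assert(set_low8(eax, 0xABu) == 0x123456ABu);
}
```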
Gnu assembler uses %eax for EAX, %ax, %al, etc. It turns the order of operands around from the original Intel syntax, which is weird, but we’ll use it anyway since it’s the Linux assembler that goes with our software.
For example, the MOV instruction, in Gnu syntax: “movb (%edx), %al” moves a byte from the address given in %edx to AL, i.e. from the first operand to the second. This same instruction is written in Intel syntax as “mov al,[edx]”, i.e., from the second operand to the first.
IN and OUT instructions, the x86 i/o instructions.
In Gnu assembler, for 8-bit i/o, what we’ll be using:
outb %al, %dx
Beforehand, put the 8-bit data to output in %al and the i/o port number in %dx (both CPU registers).
Execution puts the 8 bits of data in the device register specified by the i/o port number, by sending it over the bus.
inb %dx, %al
Beforehand, put the i/o port number in %dx.
Execution gets the 8 bits of data from the device register specified by the i/o port number, by sending it over the bus, into the CPU register %al.
PC Serial port device (COM1 or COM2, a “UART”)
Each has 8 I/O ports, but luckily we only need to use a few of them.
COM2’s “base port” is 0x2f8. Its 8 ports are 0x2f8, 0x2f9, 0x2fa, ..., 0x2ff. Other devices have other I/O port assignments.
See $pcinc/serial.h for the definition:
#define COM2_BASE 0x2f8
Each UART has several registers accessible over the bus by various i/o ports:
TX--Transmit register
RX--Receiver register
IER--interrupt enable register
LSR--line status register
others we don’t need to use…
BTW, what’s a register? It’s an array of hardware bits, usually in flip-flops. You can have a register chip on a breadboard holding bits completely separately from any computer. The important idea is that the UART can have registers and hold “state”, i.e. data, on its own, separate from the CPU.
<picture of CPU and UART connected by a bus, communicating over it>
The UART’s base port, 0x2f8, is used for both input and output using the in and out instructions: an in reads the receiver register, and an out writes the transmit register. So two registers in the UART are in use via the one i/o port. This is a common trick to save on i/o ports. The UART is fully aware of the difference between a read and a write access over the bus (using the R/W bus line), so no confusion arises.
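The “one port, two registers” trick can be modeled in plain C. This is only a simulation that runs on any machine: sim_port_write and sim_port_read stand in for what the device does on a bus write (OUT) and a bus read (IN), and all the sim_ names are hypothetical.

```c
#include <assert.h>

#define SIM_COM2_BASE 0x2f8   /* same value as COM2's base port in the notes */

/* Two separate 8-bit registers inside the simulated UART. */
static unsigned char sim_tx_reg;   /* written by the CPU */
static unsigned char sim_rx_reg;   /* read by the CPU */

/* Simulated bus write (what an OUT causes): the device routes a write
 * to the base port into its transmit register. */
void sim_port_write(unsigned int port, unsigned char data)
{
    assert(port == SIM_COM2_BASE);   /* this sketch models only the base port */
    sim_tx_reg = data;
}

/* Simulated bus read (what an IN causes): the same port number now
 * selects the receiver register instead. */
unsigned char sim_port_read(unsigned int port)
{
    assert(port == SIM_COM2_BASE);
    return sim_rx_reg;
}

/* Pretend a character arrived on the serial line. */
void sim_char_arrives(unsigned char c)
{
    sim_rx_reg = c;
}
```

Writing 'A' to the port and then reading the port does not give back 'A'; the read returns whatever last arrived in the receiver register.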
Luckily we can avoid programming in assembler by writing a C-callable assembler function once and for all to do each of these instructions. These are in the SAPC support library, prototyped in $pcinc/cpu.h:
ch = inpt(port)
outpt(port, ch)
Here ch is an 8-bit quantity, usually an unsigned char.
For example, to output 'A' to COM2,
outpt(0x2f8, 'A');
But we don’t want to use wired-in numbers like this in real programs. We could write:
outpt(COM2_BASE, 'A');
This is better but still not perfect: base addresses are usually accompanied by offsets to say which port of the set is being used:
From serial.h: offsets for a PC serial port device
#define UART_RX   0   /* receiver reg */
#define UART_TX   0   /* transmit reg */
#define UART_IER  1   /* interrupt enable reg */
#define UART_LSR  5   /* line status reg */
So we end up with
outpt(COM2_BASE + UART_TX, 'A');   /* output 'A' to COM2, using TX reg */
                                   /* using i/o port 0x2f8 + 0 = 0x2f8 */
Similarly, if we know there’s a character ready to be read in from COM2:
ch = inpt(COM2_BASE + UART_RX);    /* input char from COM2, using RX */
                                   /* using i/o port 0x2f8 + 0 = 0x2f8 */
Note that this involves communication with two different hardware registers in the serial device. The byte being read or written travels over the bus between the CPU and the device. The device knows when it is selected (by logic sensing the i/o port # on the address bus) and whether it is a read or write over the bus. Thus there is no ambiguity caused by using the same i/o port for both actions.
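The base-plus-offset arithmetic is easy to check by hand or in code. In this sketch the #defines repeat the values quoted in these notes; COM1_BASE = 0x3f8 is taken from COM1’s port range given earlier and is assumed to match serial.h. uart_port is a hypothetical helper (the notes simply write base + offset inline).

```c
#include <assert.h>

#define COM1_BASE 0x3f8   /* assumed from COM1's port range 0x3f8-0x3ff */
#define COM2_BASE 0x2f8
#define UART_RX   0       /* receiver reg */
#define UART_TX   0       /* transmit reg */
#define UART_IER  1       /* interrupt enable reg */
#define UART_LSR  5       /* line status reg */

/* Port number for one register of a serial device. */
unsigned int uart_port(unsigned int base, unsigned int offset)
{
    assert(offset < 8);   /* each UART owns 8 consecutive port numbers */
    return base + offset;
}
```

So COM2’s line status register is at 0x2f8 + 5 = 0x2fd, and COM1’s is at 0x3f8 + 5 = 0x3fd; only the base changes between the two devices.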
/* get the current value of the line status register (8 bits) */
stat = inpt(COM2_BASE + UART_LSR);
We usually want a particular bit from this byte, most commonly the DR bit for data-ready (receiver has a char) or THRE (transmitter can take another char). The mask for DR is #defined in serial.h with name UART_LSR_DR, and the mask for THRE is UART_LSR_THRE. These names come from the Linux serial driver sources.
Numbering bits: we count the rightmost bit as bit 0. The highest bit number in a byte is 7, in a 32-bit int it's 31.
Idea of mask for one bit: just that one bit is on. Here are some examples.
                    hex     binary
Mask for bit 0:     0x01    0000 0001
Mask for bit 1:     0x02    0000 0010
Mask for bit 2:     0x04    0000 0100
...
Mask for bit 7:     0x80    1000 0000
In general, the mask for bit i can be written:
(1 << i)
This takes 1, a mask for bit 0, and left-shifts it i bits, producing a mask for bit i.
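The mask table above can be reproduced with a one-line function. A minimal sketch (bit_mask is a hypothetical name; the 32-bit limit is a choice made for this example):

```c
#include <assert.h>
#include <stdint.h>

/* Mask with only bit i set; bit 0 is the rightmost bit. */
uint32_t bit_mask(unsigned int i)
{
    assert(i < 32);             /* shifting a 32-bit value by 32 or more is undefined in C */
    return (uint32_t)1 << i;    /* start from the bit-0 mask, shift left i places */
}
```

For example, bit_mask(5) is 0x20, the same value as the THRE mask in serial.h, which says THRE is bit 5 of the line status register.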
From serial.h: masks for status bits:
#define UART_LSR_THRE 0x20   /* Transmit-hold-register empty */
…
#define UART_LSR_DR   0x01   /* Receiver data ready */
We can test the one bit we want in stat by bitwise-anding it with the mask for the bit we want. Thus for the DR bit, we form
stat & UART_LSR_DR
and this is a quantity that is either 0 or has the one bit on, i.e., equals UART_LSR_DR, the one-bit mask.
True and false in C: non-0 and 0
Recall that classic C has no Boolean type (C99 added _Bool, but conditions still just test for non-0), and instead we use ints with 0 representing FALSE, and any non-0 value representing TRUE.
Thus the expression (stat & UART_LSR_DR) is TRUE if the DR bit is on and FALSE otherwise.
The upshot is that we can write these basic tests on COM2's status:
inpt(COM2_BASE + UART_LSR) & UART_LSR_DR is TRUE or FALSE depending on whether the receiver is ready
inpt(COM2_BASE + UART_LSR) & UART_LSR_THRE is TRUE or FALSE depending on whether the transmitter is ready
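Here is a small sketch of these tests applied to a status byte that has already been read, so it runs without any hardware. The two masks are the values quoted from serial.h above; receiver_has_char, transmitter_ready and check_status_helpers are hypothetical names.

```c
#include <assert.h>

#define UART_LSR_THRE 0x20   /* Transmit-hold-register empty */
#define UART_LSR_DR   0x01   /* Receiver data ready */

/* Both helpers take a status byte already read from the line status register. */
int receiver_has_char(unsigned char stat)
{
    return (stat & UART_LSR_DR) != 0;     /* normalize to exactly 0 or 1 */
}

int transmitter_ready(unsigned char stat)
{
    return (stat & UART_LSR_THRE) != 0;
}

/* Self-check on sample status bytes; aborts if a helper misbehaves. */
void check_status_helpers(void)
{
    /* 0x61 = 0110 0001: bit 0 (DR) and bit 5 (THRE) are both on */
    assert(receiver_has_char(0x61) == 1);
    assert(transmitter_ready(0x61) == 1);
    /* 0x60 = 0110 0000: THRE on, DR off */
    assert(receiver_has_char(0x60) == 0);
    assert(transmitter_ready(0x60) == 1);
    /* The raw AND yields the mask value, not 1 */
    assert((0x60 & UART_LSR_THRE) == 0x20);
}
```

One pitfall this shows: comparing (stat & UART_LSR_THRE) == 1 is always false, because the AND gives 0x20 or 0, never 1. Compare against 0, or use the value directly as a truth value.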
We can make a loop of inpt's testing the DR bit and thus wait for the UART to get a new character. Or a loop of the THRE inpt's to wait for the transmitter to be ready for another byte. For THRE:
while ((inpt(COM2_BASE + UART_LSR) & UART_LSR_THRE) == 0)
    ;   /* not ready yet, keep trying */
/* here when transmitter is ready, OK to output another character */
outpt(COM2_BASE + UART_TX, ch);
This is like the code of Figure 5-8 in Tanenbaum, pg. 346, except that we use the IN instruction to get the device's status register value instead of memory-mapped i/o.
Idea of “programmed I/O”
When the CPU loops testing a device ready bit, waiting for the device to produce or accept data, we call it “programmed I/O”, or “polling for data”. The loop is called a “busy loop” or a “polling loop.” The alternative is interrupt-driven I/O. Polling I/O is simpler and is commonly used in programs that don’t utilize an OS for I/O, for example, all ordinary programs on the SAPC. When you write printf("hi") in the SAPC environment, printf calls putc in Tutor, which does a polling loop on COM2 (or whatever the console device is). OS drivers usually use interrupt-driven I/O, so as not to waste CPU in busy loops.
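To watch a polling loop at work without an SAPC, here is a sketch that runs against a simulated transmitter. sim_inpt and sim_outpt are hypothetical stand-ins for inpt and outpt, and the rule that the transmitter stays busy for 3 status reads after each char is an arbitrary assumption of the simulation. polled_putc is the same loop as in the THRE example, and the polls counter shows how many status reads the busy loop costs.

```c
#include <assert.h>

#define COM2_BASE     0x2f8
#define UART_TX       0
#define UART_LSR      5
#define UART_LSR_THRE 0x20

/* --- simulated device, standing in for the real UART and inpt/outpt --- */
static int busy_reads;     /* status reads left before THRE turns on again */
static char sent[32];      /* chars the simulated UART has "transmitted" */
static int nsent;
static long polls;         /* number of status reads made so far */

unsigned char sim_inpt(unsigned int port)
{
    assert(port == COM2_BASE + UART_LSR);   /* sketch only models the LSR */
    polls++;
    if (busy_reads > 0) {
        busy_reads--;
        return 0;                           /* THRE off: transmitter busy */
    }
    return UART_LSR_THRE;                   /* THRE on: ready for a char */
}

void sim_outpt(unsigned int port, unsigned char ch)
{
    assert(port == COM2_BASE + UART_TX);
    assert(nsent < (int)sizeof sent);
    sent[nsent++] = (char)ch;
    busy_reads = 3;        /* assumption: each char takes 3 polls to send */
}

/* --- programmed I/O: poll THRE, then output one char --- */
void polled_putc(char ch)
{
    while ((sim_inpt(COM2_BASE + UART_LSR) & UART_LSR_THRE) == 0)
        ;                  /* busy loop: not ready yet, keep trying */
    sim_outpt(COM2_BASE + UART_TX, (unsigned char)ch);
}

void polled_puts(const char *s)
{
    while (*s)
        polled_putc(*s++);
}
```

On real hardware the busy count would be far larger than 3, since a serial character takes a long time by CPU standards; that wasted spinning is what interrupt-driven I/O avoids.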
Look at echo.c. There you see a polling loop for input. Surprisingly, there is no polling loop for output, but this is a special case where the output is slowed down so much by the input that the transmitter will always be ready when it’s used.
We have looked at COM2 here. COM1 is exactly the same device, so just use COM1_BASE instead of COM2_BASE in the examples above. Also read about LPT1 in the linked document above.
Next time: x86 interrupts