cs240.NewNotes.17

Daily Course Notes for CS240

Computer Architecture 1

Patrick E. O'Neil
Class 19. Administer Quiz 2

Starting Chapter 7. We will be defining the standard library for ANSI C, available on any UNIX system, portable to other OS's (NT).

You will be responsible for having Glass UNIX in class; questions from Glass on Quizzes and Exams.

Full library function specifications are available in K&R Appendix B, Section B.1. You are responsible for all specifications in Appendix B.

Note: all header files for libraries (stdio.h, string.h, ctype.h, etc.) are online in /usr/include. You should cd to this directory and nose around.

Here's an important command you should know: grep. Read about it in the Glass UNIX book (look it up in the index; the bold page number shows the definition).

%grep -hilnvw pattern {fileName}*

This is a command to search for a pattern in a list of files. It's like the program we wrote to illustrate command line argument use. (The example for that was: find -xn pattern. But use -v instead of -x in grep.)

grep looks for and prints out lines in a file or list of files that contain a match to the pattern. Each file encountered is named in the output unless the -h option is given (in which case the file name is not listed in the output!).

A list of file names delimited by spaces is what is meant by the argument {fileName}*. We could write:

%grep US studentfile staffile facultyfile (3 files to search, pattern: US)

In the command: grep printf cs240/hw5/*.c, the *.c gets expanded to a list of files by the shell and grep looks at them one by one.

Options: -n means give line numbers, -i means ignore case ("And" matches "and"), -l displays only the names of files that contain the pattern, -v displays lines that DON'T match the pattern, -w means only matches of whole words count.

Use grep to search for function names in header files.

%grep isupper /usr/include/*

Note, if no filenames are specified, grep will search stdin. Common idea in C, useful for "pipes", explained below.

We can use a wildcard specification of a pattern in grep, called "Regular Expressions". See pg 606 of Glass (2nd Edition, see index in 1st). Note that:

%grep U.S. studentfile staffile facultyfile

would use wildcards, since the character period, ".", stands for any single character, so we need to "quote" each period with a "\" sign.

But the "\" sign itself will be interpreted by the shell so the "." will be passed, and "." will continue to be interpreted by grep as "any character".

To get the "\" sign seen by grep, you would need to write:

%grep U\\.S\\. studentfile staffile facultyfile

You need to be VERY defensive when using grep with regular expressions. (Read about it in Glass. More on this later.)

AND YOU NEED TO TEST IDEAS. In a fresh directory, create one file with "U.S." and another with "U_S_", and a third with "US" and try:

%grep U.S. * (* here stands for all files in current directory.) Next try:

%grep U\.S\. * And finally, try

%grep U\\.S\\. * (This should work to retrieve "U.S." lines only.)

Note how we would perform grep for either US or U.S. Note on page 606 "*" following a character denotes zero or more occurrences of that character.

The following doesn't work:

%grep U.*S.* * (The * at the end stands for all files, of course.)

Because "." is interpreted as any character. This doesn't work either:

%grep U\\.*S\\.* *

Because * after \\. is interpreted by the shell. And this doesn't work:

%grep U\\.\\*S\\.\\* *

because grep sees "U\.\*S\.\*" and will now be looking for literal characters "U.*S.*" But this works:

%grep U\\.\*S\\.\* *

Because grep sees "U\.*S\.*". See what I mean about being "defensive" in using regular expressions?

You should read Glass to understand why this works. You are responsible for this on the Quiz! Note: you should type "man grep" when you're online and forget some feature.

Chapter 7. Now start going through Chapter 7 of K&R. Follow along. Also, look for MORE DETAILS of everything we cover in Appendix B!

You should now learn and use EVERYTHING YOU CAN in the C Library Function coverage of Appendix B! You'll be expected to know almost ALL.

Section 7.1 Standard Input (stdin) and Standard Output (stdout.) Already know a lot of this. Standard I/O redirection:

%prog <infile >outfile

Idea of a pipe is new: prog1 | prog2. This will execute prog1 and prog2, send stdout from prog1 into stdin for prog2. For example:

%grep S *|grep U

Will perform "grep S *" on all files, pass lines through the interface as stdin, then grep U will find all lines with both U and S in them somewhere!

Note, specification in Glass, pg. 33: more -f [+lineNumber] {fileName}*

The option -f means don't fold long lines (useful if piping stdout to something else that deals with lines, e.g., wc command); can start at given line number. If no file name specified, will use stdin. See how this is useful?

For example: ls -lt|more This gives a long listing of files in the current directory, most recent first (-t option), & pipes through "more"; more uses stdin, so most recent files won't go off screen. Also: finger name|more

7.2. Formatted output: Printf. Formats and prints out internal values.

int printf(char *format, arg1, arg2, . . .);

Note printf has a VARIABLE LENGTH ARGUMENT LIST (as many as there are % conversions in the format string). We will learn how to do this shortly.

The return from printf is the number of characters printed (haven't used this up to now, but useful if there is error or some limit truncation).

Between the % and the conversion character, there are a number of other characters which may exist. In the order they must be placed, they are:

- (minus sign) left adjust printing of argument

m (number m) minimum field width

. (dot) separates min field width from precision

p (integer p) precision: max chars for string, min digits for int

h or l (letter h or l) h for short int, l for long int

(ORDER of options for %d is: %[-][m][.][p][h|l]d, Note: no embedded spaces!)

E.g., figure out what these would do: %10d, %-10d, %hd, %ed, %10ld, %10.p

Try spending 10 minutes with a program, using different formats.

Learn these with examples of string precision given on pg. 154, bottom.

Also, to print at most max characters from string s (max is int type var or const), use * after % and include the int max as an argument before s:

printf("%.*s", max, s);
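A minimal sketch of the width, left-adjust, and "*" precision forms just described. It uses sprintf so the result can be inspected as a string; the helper name format_demo and the bracket characters are assumptions made for illustration only.

```c
#include <stdio.h>

/* Hypothetical helper: format value and s into out using three of the
   conversions discussed above. out must have room for the result.
   Returns the total number of chars written, as reported by sprintf. */
int format_demo(char *out, int value, const char *s, int max)
{
    int n;

    n = sprintf(out, "[%5d]", value);         /* right-adjusted, min width 5 */
    n += sprintf(out + n, "[%-5d]", value);   /* left-adjusted, min width 5 */
    n += sprintf(out + n, "[%.*s]", max, s);  /* at most max chars of s */
    return n;
}
```

Calling format_demo(buf, 42, "hello, world", 5) should leave "[   42][42   ][hello]" in buf; the return value is the same count printf would have returned.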

Note, it is possible to print out a character string as a format string with no % variables: what we do with printf("hello, world!\n"); could write:

char s[ ] = "hello, world";

printf(s);

But if s might get a % character in it, this is unsafe, since format will then require another argument after s. Better to write out s[ ] as:

printf("%s", s);

Finally, the function sprintf will work same as printf, but write to string.

int sprintf(char *string, char *format, arg1, arg2, . . .);

See sprintf in Appendix B, pg 245 K&R. Note how useful sprintf is!

Recall how we wrote itoa() and itox() functions. No functions like this in C library! Instead use sprintf() to print int into a string, using %d or %x.

Might now want to strcpy the string into a malloc'd area you create. You can start reading ahead now, using any C library function. See Appendix B.3 and B5 for malloc.
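One way this might look: a sketch that prints an int into a local buffer with sprintf, then copies it into a malloc'd area. The name int_to_string, the hex flag, and the buffer size of 32 are assumptions, not anything from K&R.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd string holding n printed in
   decimal (hex == 0) or hexadecimal (hex != 0). The caller must
   free() the result. Returns NULL if malloc fails. */
char *int_to_string(int n, int hex)
{
    char buf[32];                     /* assumed ample for any int */
    char *copy;

    if (hex)
        sprintf(buf, "%x", (unsigned int) n);   /* %x wants unsigned */
    else
        sprintf(buf, "%d", n);
    copy = malloc(strlen(buf) + 1);   /* +1 for the terminating '\0' */
    if (copy != NULL)
        strcpy(copy, buf);
    return copy;
}
```

For example, int_to_string(255, 0) gives "255" and int_to_string(255, 1) gives "ff".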

7.3 Variable-length argument lists. How to declare a variable argument list such as printf( ) has in a functional prototype with an ellipsis (. . .).

void minprintf(char *fmt, . . .); /* three dots in a row: var arg list */

Here is how this works. Recall that when a function is called, a new stack frame is created for the function execution.

The stack frame holds a memory location to return to from this func, the arguments of the function, and local variables of the function. Leave Up!

Return location

Argument 1

Argument 2

. . .

Argument k

Local variable 1

Local variable 2

. . .

Local variable n

This representation is not guaranteed for all machines. The va_ package we cover here hides actual representation.

The function can pick off one argument after another, and in the case of variable length argument lists, the k is not set in advance.

In a function with a var-arg list, we start by declaring a variable (say, ap for arg pointer), of type va_list, to refer to the arguments in the list.

void func(int n, . . .) /* Note the "ellipsis", i.e. (. . .) */

{

va_list ap;

Now use macros va_start, va_arg, and va_end to retrieve argument values. These macros exist in stdarg.h library: B7, pg 254 of K&R. Look at it.

Start by initializing ap, using va_start:

va_start(ap, n);

The variable "n" named in va_start is the last named argument before the ellipsis (. . .) in the var-arg list function. There must be at least one such argument: use it to tell how many arguments are in the list.

E.g., in printf( ), first argument gives string with % conversions for all later arguments.

Now ap points just BEFORE first unnamed arg. Each call to va_arg will advance ap one argument, return value; va_arg must name type of arg:

ival = va_arg(ap, int); /* use this where int argument */

dval = va_arg(ap, double); /* use where double argument */

sval = va_arg(ap, char *); /* use where string argument (ptr) */

But think about how you are going to KNOW the type of each arg!!! This is why format string is normally passed for printf( ): % conversions tell you.

Of course, you could assume that all the arguments for a specific var-arg function are of the same type, say string.

But you MUST know HOW MANY arguments. If overestimate, parsing past Argument k in stack frame, get garbage. To terminate, write: va_end(ap);

Example minprintf, page 156 (PUT ON BOARD). Will have homework on this.
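Before tackling minprintf, a smaller sketch of the same va_ macros may help. Here all unnamed arguments are assumed to be ints, and the first (named) argument says how many there are. The function name sum_ints is made up for this illustration.

```c
#include <stdarg.h>

/* Hypothetical example: add up the count int arguments that follow
   the named argument count. */
int sum_ints(int count, ...)
{
    va_list ap;                 /* refers to each unnamed arg in turn */
    int total = 0;
    int i;

    va_start(ap, count);        /* count is the last named argument */
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);   /* caller promised these are ints */
    va_end(ap);                 /* clean up before returning */
    return total;
}
```

So sum_ints(3, 10, 20, 12) returns 42. If the caller passes fewer than count arguments, va_arg reads past the real ones and the result is garbage, which is exactly the danger noted above.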

7.4 Formatted input: scanf. This is the opposite of printf. Reads in variables from stdin using conversion format string. See pg. 246.

int scanf(char *format, . . .);

The value returned from scanf( ) is the number of successfully scanned tokens: not successful if can't parse the value brought in from stdin.

In calling scanf, call with any number of arguments, but each must be a POINTER to a variable so that the variable values can be set by scanf!

int age, weight, cnt;

char lname[100];

while(some condition) {

printf("Input your last name, age, and weight, separated by spaces: ");

cnt = scanf("%s %d %d", lname, &age, &weight);

. . .

}

Note: lname is an array, and is already like a pointer to the char string.

Scanf is useful to allow you to read in int or double value AS A NUMBER, instead of a character string, where you have to do your own conversion.

(Of course, scanf() will always see a character sequence in stdin: just does its own conversion to int or double.)

However, scanf is FLAWED, because it ignores '\n' characters. Can get very confusing if user puts too few arguments on some line.

(Prompt) Input your last name, age, and weight, separated by spaces:

(User input) Clinton 50

(No response after carriage return. User tries again, remembers to include weight this time.)

(User input) Clinton 50 300

(scanf will see: Clinton 50 Clinton, since '\n' character from user carriage return is seen as white space separator; so thinks weight is weird value, and will return 2 as the number of successfully scanned tokens.)

Worst part is we're out of synch: the unread text "Clinton 50 300" is still waiting in stdin, so the next prompt loop will consume it before the user types anything. Can't code defensively with scanf( ): can't count number of tokens parsed ON A LINE, because scanf doesn't care about input lines.

The best approach is to read a line into an array s[ ] and use "sscanf( )" function to pick apart the arguments in the line just input. This also allows you to try to interpret things in more than one way.

Know that sscanf works on a string and returns the number of tokens it scanned successfully, so you can check that it scanned all tokens in the string.

But instead of getline( ) to bring in a user line, use library function fgets(), K&R pg. 247. Cover a bit later, Section 7.7. Prefer fgets( ) to gets() since they behave differently, and we must use fgets() for files.
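A minimal sketch of the line-at-a-time approach, assuming the same three items as the scanf example above. The helper name parse_person is an assumption; the "%99s" width is added here so a long name cannot overflow a 100-char array.

```c
#include <stdio.h>

/* Hypothetical helper: pick last name, age, and weight out of ONE
   input line. lname must have room for 100 chars. Returns 1 if all
   three items were found on this line, 0 otherwise. */
int parse_person(const char *line, char *lname, int *age, int *weight)
{
    int cnt = sscanf(line, "%99s %d %d", lname, age, weight);

    return cnt == 3;            /* fewer than 3: the line was incomplete */
}
```

A caller would bring in each line with fgets(s, sizeof s, stdin) and prompt again whenever parse_person returns 0. A short line such as "Clinton 50" is rejected on the spot instead of silently waiting for more input.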

Note that in both scanf and sscanf, if you put special characters in the format string, we MUST see exactly those special characters in user input.

cnt = sscanf(s, "%d/%d/%d", &month, &day, &year); /* s has string */

will expect input like: 07/23/96. If not, cnt returned by sscanf will be less than 3.

RECALL how we wrote function atoi, axtoi to convert character string s to integer i. There is a function atoi in C library, but no axtoi. Question: How would we do this?

Use sscanf(s, "%d", &i) for atoi, or sscanf(s, "%x", &i) for axtoi.
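A sketch of the hex case, with the return value of sscanf checked. The name hex_to_int is hypothetical. Note one deliberate difference from the one-liner above: %x formally expects a pointer to unsigned int, so the sketch scans into an unsigned and then converts.

```c
#include <stdio.h>

/* Hypothetical helper: convert a hex digit string such as "1f" to an
   int stored in *result. Returns 1 on success, 0 if s does not start
   with a hex number. */
int hex_to_int(const char *s, int *result)
{
    unsigned int u;             /* %x requires an unsigned int pointer */

    if (sscanf(s, "%x", &u) != 1)
        return 0;               /* no hex digits found */
    *result = (int) u;
    return 1;
}
```

With this, hex_to_int("1f", &i) sets i to 31, while a string such as "zz" is reported as a failure rather than quietly giving a meaningless value.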

RULE: Use scanf only for programs needing only ONE input item, usually "quick and dirty" programs with no input checking.

Class 20.

7.5 File Access.

We have had practice reading from stdin and writing to stdout. We can redirect stdin from a file and stdout to a file at the command level.

In this Section, we learn to get program control over reading and writing named files. An example of an application of this is the command "cat".

cat fname1 fname2

This reads from file fname1, then from file fname2, and puts all characters it reads to standard output. Can catenate two files to a third, thus:

cat fname1 fname2 >fname3

Any number of files can be catenated; the Glass UNIX syntax is:

cat -n {FileName}*

The option -n gives line numbers to the output. What do you think happens if no files are named? (Yes. Read from stdin.)

Dealing with named files is surprisingly similar to dealing with stdin and stdout. Start by declaring a special named object, a "file pointer":

FILE *fp; (See Appendix B1.1, pg. 242)

The <stdio.h> header contains a structure definition with typedef name FILE, which contains component variables (buffer, etc.) used in file I/O.

You don't need to know the details of structs to use simple file I/O. Just use primitive functions, such as fopen():

fp = fopen(name, mode)

(Functional prototype: FILE * fopen(char *name, char * mode);)

Here fp is the return value, set to NULL if fopen fails! Now fopen is asked to open a named file (character string "name") in a particular "use mode".

Legal mode values include "r" for read, "w" for write, and "a" for append.

These modes cause the open file to have different behaviors. We can make calls to get a char out of an "r" file with getc(): (Appendix B1.4, pg 247)

c = getc(fp); /* like getchar(): an "r" mode file acts like stdin */

(Functional prototype: int getc(FILE *stream); Return EOF if fails.)

or put a char to an "a" or "w" file:

status = putc(c, fp); /* like putchar: "w" or "a" mode files like stdout */

(Func. prototype: int putc(int c, FILE * stream); Return EOF if error.)

When we fopen a file in "w" or "a" mode, if the file does not already exist, it will be created (as the vi editor creates a file it has never heard of).

If the file does already exist, then "w" mode fopen will destroy the old contents (like the command mv) and "a" mode will append new material to the end of the existing file (like the "save" command in mail).

More modes are given in Appendix B, pg. 242. (Will come to update later.)

When you have finished reading from a file or writing to a file, you should call fclose to close the file.

status = fclose(fp);

(Functional prototype: int fclose(FILE *stream);)

The function fclose( ) returns EOF if any error occurs, and zero otherwise.
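To see the "w", "a", and "r" modes side by side, here is a rough sketch built only from fopen, putc, getc, and fclose. The helper names write_text and count_chars are assumptions for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: write text to the named file in the given
   mode ("w" or "a"). Returns 0 on success, EOF on any failure. */
int write_text(const char *name, const char *mode, const char *text)
{
    FILE *fp = fopen(name, mode);

    if (fp == NULL)
        return EOF;             /* could not open or create the file */
    while (*text != '\0')
        if (putc(*text++, fp) == EOF) {
            fclose(fp);
            return EOF;         /* write error */
        }
    return fclose(fp);          /* zero if the close succeeded */
}

/* Hypothetical helper: count the chars in the named file, or return
   -1 if it cannot be opened for reading. */
long count_chars(const char *name)
{
    FILE *fp = fopen(name, "r");
    long n = 0;

    if (fp == NULL)
        return -1;
    while (getc(fp) != EOF)
        n++;
    fclose(fp);
    return n;
}
```

Writing "abc\n" with mode "w" and then "de\n" with mode "a" should leave 7 chars in the file; writing again with "w" throws the old contents away.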

Reading from stdin or writing to stdout, you sit at a particular character in a virtual file, called a "stream", and move only to the right to the next character.

But as we will see, in named files it is possible to "go to the left" to read characters over again (using the function fseek(), App. B1.6, pg 248).
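A small sketch of "going to the left": remember the current position with ftell, read a char, seek back, and read the same char again. The function name reread_next_char is made up here; ftell and fseek with SEEK_SET are the standard library calls.

```c
#include <stdio.h>

/* Hypothetical helper: read the next char of fp twice by seeking back
   in between. Returns 1 if both reads gave the same char, 0 if they
   differed, -1 on error or end of file. */
int reread_next_char(FILE *fp)
{
    long pos;
    int first, again;

    if ((pos = ftell(fp)) == -1L)       /* remember where we are */
        return -1;
    if ((first = getc(fp)) == EOF)
        return -1;
    if (fseek(fp, pos, SEEK_SET) != 0)  /* move back "to the left" */
        return -1;
    again = getc(fp);
    return first == again;
}
```

This only works on a stream that supports seeking, such as a named disk file; it should not be expected to work on stdin connected to a terminal or a pipe.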

When a C program is started, the operating system opens three files and provides file pointers (FILE *) to them: stdin, stdout, and stderr.

We can now define our old friends getchar and putchar as macros:

#define getchar( ) getc(stdin)

#define putchar(c) putc((c), stdout) <-- see why (c) is in parens?

Other file oriented analogs to input and output functions we've known are:

int fscanf(FILE *fp, char *format, . . .); /* mode of fp must be "r" */

int fprintf(FILE *fp, char *format, . . .); /* mode of fp is "w" or "a" */

Typically, we will use "f" version of primitives, fgets rather than gets.

(But use getc and putc)

(Note, we would still use fgets and sscanf in preference to fscanf.) OK, Now here's the cat program (LEAVE UP).

#include <stdio.h>

/* program to be compiled as "cat" executable (gcc cat.c -o cat) */

main(int argc, char *argv[ ])

{

FILE *fp;

void filecopy(FILE *, FILE *); /* functional prototype */

if (argc == 1) /* no args: copy standard input */

filecopy(stdin, stdout); /* from on left, to on right is common */

else

while (--argc > 0)

if ((fp = fopen(*++argv, "r")) == NULL) {

printf("cat: can't open %s\n", *argv);

return 1;

} else {

filecopy(fp, stdout); /* copy this file to stdout */

fclose(fp);

} /* loop through all files named */

return 0;

}

/* filecopy: copy file ifp to ofp */

void filecopy(FILE *ifp, FILE *ofp)

{

int c;

while ((c = getc(ifp)) != EOF)

putc(c, ofp);

}

Every file open requires resources, and there is a limit on files open at once; good idea to close fp when done (all close at program termination).

FILE structure has buffer for disk data in memory; when putc is called, the char may not get written out to the file right away. THUS THE DATA IS NOT SAFELY ON DISK.

This is important to a database, say. Functions fclose() (and fflush()) will flush buffer to disk file.

Look at Appendix B, pg. 241. This Appendix describes the standard library (ALWAYS LOOK AT the standard headers listed there). B1 contains <stdio.h> stuff.

Class 21.

QUIZ 3 Next class. BRING GLASS UNIX.

7.6 Error Handling

Trying to fopen a file that does not exist is an error, and there are other errors as well: reading or writing a file without appropriate permission.

With the "cat" program just covered (put up again), an error performing fopen will write something to stdout; maybe this was redirected to a file.

%cat fname1 fname2 >fname3

But recall there are three streams opened by the operating system when a program begins execution, stdin, stdout, and stderr.

And stderr usually goes to the screen even if stdout is redirected to a file.

prog . . . >outfile (redirect stdout to outfile; destroy old outfile)

prog . . . >&outfile (redirect stdout and stderr to outfile, destroy old)

prog . . . >>outfile (redirect stdout to outfile; append on end)

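Note it is reasonably common to use both > and >& in a single command. In csh, put the stdout redirection inside parentheses, since two output redirections on one command are reported as ambiguous:

(prog . . . >outfile1) >&outfile2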

Then you can grep for error in outfile2, but in any case outfile1 has no error msgs.

How do we rewrite the "cat" program so that it writes error msgs to stderr? Done on pg 163 of K&R (below). Rewrite fopen() in loop of that program as:

if ((fp = fopen(*++argv, "r")) == NULL) {

fprintf(stderr, "%s: can't open %s\n", prog, *argv);

exit(1);

}

Here, the error msg goes to stderr; the variable prog printed out under %s is a pointer to the string containing the name of this compiled program, initialized:

char *prog = argv[0]; /* name invoked for this program */

The exit(int) function terminates program execution when called and returns argument to invoking process (debugger, shell program, fork parent)

Of course, a "return value" from a main program would do this as well, but exit() will terminate execution as if we executed a return from main(), and can be called from any nested function!

A zero returned by a program means no error; non-zero values mean exceptional condition. You can set up conventions as to what values mean, but it's best to keep the values positive.

At the end of the main program on pg. 163, have new statement:

if (ferror(stdout)) {

fprintf(stderr, "%s: error writing stdout\n", prog);

exit(2);

}

The function ferror returns non-zero if an error occurred on the stream fp:

int ferror(FILE *fp); /* See B1.7, pg. 248 */

But this doesn't give the actual error number to allow us to tell the user or programmer what problem has occurred. Need Error handling functions.

Error handling functions covered in Appendix B, Section B1.7. (See pg 248) These are errors that can arise from any library calls, not just I/O.

Note there is a problem with program on pg. 163. To handle errors, we should #include <errno.h> (See B1.7). The function ferror() tells us if there is an error indication for a stream (the last one that occurred).

More generally, errno.h contains a macro expression "errno" which can be tested; it is zero if there is no problem and non-zero otherwise.

(Text in B1.7 says errno "may" contain an error number; it will contain one if there has been an error–any error, not just in a stream–unless the error is so serious it has corrupted the error structs.)

We can use the function perror to write out the error msg associated with errno, but we have to test for error right after it occurs to get right one.

Note too that the most recent error that has occurred ON A STREAM may not be the most recent error that occurred ON THE SYSTEM.

Since perror will print out last error on the system, this might be an error that occurred on a different file!

So do test:

if (errno != 0) {

perror(s); exit(2);

}

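after each system call. (Successful calls do not reset errno to zero, so set errno = 0 before the call, or test it only when the call reports failure.) perror will print out an error msg corresponding to the integer in errno, as if by: fprintf(stderr, "%s: %s\n", s, "error message");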
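A hedged sketch of this pattern applied to fopen. The name try_open is an assumption. The sketch uses strerror (declared in string.h) to get the same message text perror would print, and saves errno immediately, before any other library call can change it. ANSI C does not promise that fopen sets errno (UNIX systems do), so a fallback value is returned if it was left at zero.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: try to open name for reading. On failure,
   report the cause on stderr and return a non-zero error value;
   on success, close the file again and return 0. */
int try_open(const char *prog, const char *name)
{
    FILE *fp;
    int saved;

    errno = 0;                  /* successful calls never clear errno */
    fp = fopen(name, "r");
    if (fp == NULL) {
        saved = errno;          /* save before any other library call */
        fprintf(stderr, "%s: can't open %s: %s\n",
                prog, name, strerror(saved));
        return saved != 0 ? saved : -1;
    }
    fclose(fp);
    return 0;
}
```

Testing errno only after fopen has returned NULL avoids reporting a stale error left over from some earlier call on a different file.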

7.7 Line input and output. The standard C library equivalents to getline and putline are fgets and fputs; fgets is only slightly different from getline().

char *fgets(char *line, int maxline, FILE *fp); (like getline from file)

Reads the next input line (including '\n' at the end) from file fp into the char array line; at most maxline-1 chars will be read, then there will be a terminal '\0' added. Returns ptr to line or NULL (means EOF).

For output, the function fputs writes out line to fp. Usually printed out if end with '\n'.

int fputs(char *line, FILE *fp);

It returns EOF if error occurs (disk fills up?), and zero otherwise. Recall can use perror to print out exact error cause (to stderr, not user screen).

Don't use gets() (stdin) and puts() (stdout). They are confusing in inclusion of newline char. (No maxline in gets.) Always use fgets and fputs.

K&R shows (pg 165) how fgets is written in terms of getc in the standard library that was present on their system. Very simple.

/* fgets: get at most n chars from iop into char array s */

char *fgets(char *s, int n, FILE *iop)

{

register int c;

register char *cs;

cs = s;

while (--n > 0 && (c = getc(iop)) != EOF)

if ((*cs++ =c) == '\n')

break;

*cs = '\0';

return (c == EOF && cs == s) ? NULL : s; /* error return if nothing new */

}
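As a usage sketch, fgets and fputs combine into a line-by-line version of filecopy. The name copy_lines and the MAXLINE value are assumptions made for this illustration.

```c
#include <stdio.h>

#define MAXLINE 1000            /* assumed maximum line length */

/* Hypothetical helper: copy ifp to ofp one line at a time. Returns
   the number of fgets reads copied (one per line, as long as no line
   exceeds MAXLINE-1 chars), or -1 if a write fails. */
long copy_lines(FILE *ifp, FILE *ofp)
{
    char line[MAXLINE];
    long count = 0;

    while (fgets(line, MAXLINE, ifp) != NULL) {
        if (fputs(line, ofp) == EOF)
            return -1;          /* write error, e.g. disk full */
        count++;
    }
    return count;
}
```

Because fgets keeps the '\n' and fputs does not add one, the copy comes out identical to the input.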

7.8. Miscellaneous functions (pg 166).

7.8.1 String operations. Talk through. Note strncpy variant, etc. Anything important missing? See Appendix B3, pg 249-250. Several of interest.

One valuable one: char * strstr(cs, ct). Find first example of char string ct in cs. (Look for string ct "01/05/97" in cs named "movie_times".)

Note you can look for ALL occurrences of ct in cs: After you find a match (ptr to char), advance pointer cs to that ptr + 1.
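The "advance to ptr + 1" idea can be sketched as a loop. The function name count_occurrences is hypothetical; note that stepping forward by only one char means overlapping matches are counted too.

```c
#include <string.h>

/* Hypothetical helper: count every occurrence of ct in cs, including
   overlapping ones. An empty ct counts as zero occurrences. */
int count_occurrences(const char *cs, const char *ct)
{
    int count = 0;

    if (*ct == '\0')
        return 0;               /* strstr would match at every position */
    while ((cs = strstr(cs, ct)) != NULL) {
        count++;
        cs++;                   /* resume one char past the match start */
    }
    return count;
}
```

For example, count_occurrences("aaaa", "aa") returns 3, since matches start at positions 0, 1, and 2.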

Function char *strtok(s, ct), pg 250, very commonly used indeed. Does the same thing suggested for strstr(), but does it automatically. For example:

char *tok[30]; /* handle up to 30 tokens */

char s[ ] = " , Clinton, 50, 300.25"; /* normally would use fgets */

char ct[ ] = " ,"; /* space and comma are two delimiters */

int count = 1;

tok[count-1] = strtok (s, ct);

if (tok[count-1] == NULL) /* would indicate no tokens in line */

. . .; /* take appropriate action */

/* now calls for subsequent tokens */

while ((tok[count] = strtok(NULL, ct)) != NULL)

count++; /* count is right when fall through */

Following this, we can use sscanf to convert various arguments, e.g., second argument tok[1] under conversion %d.
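Putting strtok and sscanf together might look like the sketch below. The name parse_record, the three-field layout (name, age, weight), and the inclusion of '\n' among the delimiters (for lines read by fgets) are assumptions for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: split line (which strtok modifies in place)
   into name, age, and weight tokens separated by spaces or commas.
   Returns 1 if all three were found and converted, 0 otherwise. */
int parse_record(char *line, char *name, size_t namesize,
                 int *age, double *weight)
{
    char *tok[3];
    char *t;
    int count = 0;

    for (t = strtok(line, " ,\n"); t != NULL && count < 3;
         t = strtok(NULL, " ,\n"))
        tok[count++] = t;       /* each token is now '\0'-terminated */
    if (count < 3 || strlen(tok[0]) >= namesize)
        return 0;
    strcpy(name, tok[0]);
    return sscanf(tok[1], "%d", age) == 1 &&
           sscanf(tok[2], "%lf", weight) == 1;
}
```

Remember that strtok writes '\0' chars into the line, so it must be called on a modifiable array, never on a string constant.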

The mem... functions are very much like the str... functions, except there is no null terminator. A good C library has very efficient mem... functions.

Note difference between memcpy and memmove (overlap objects check). Clearly the size n can be given by a sizeof(struct . . .) reference.
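An illustration of where the overlap matters: deleting chars from the middle of a string slides the tail left over itself, so memmove is the safe choice. The function name delete_chars is an assumption.

```c
#include <string.h>

/* Hypothetical helper: delete count chars starting at index pos from
   the string s by sliding the tail left. Source and destination
   overlap, so memmove (not memcpy) is required. The caller must
   ensure pos + count <= strlen(s). */
void delete_chars(char *s, size_t pos, size_t count)
{
    size_t tail = strlen(s + pos + count) + 1;  /* include the '\0' */

    memmove(s + pos, s + pos + count, tail);
}
```

Starting from "hello, world", delete_chars(s, 5, 2) leaves "helloworld". With memcpy the behavior on overlapping areas is undefined.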

Section 7.8.2. Char Class testing. isalnum(). See App. B2, list pg 249. Less common but interesting: isxdigit(), iscntrl(), isspace(), ispunct().
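A brief sketch using one of the less common tests, isxdigit(). The function name is_hex_string is hypothetical. The cast to unsigned char is there because the ctype functions are only defined for values in the unsigned char range (or EOF).

```c
#include <ctype.h>

/* Hypothetical helper: return 1 if s is a non-empty string made up
   only of hexadecimal digits, 0 otherwise. */
int is_hex_string(const char *s)
{
    if (*s == '\0')
        return 0;               /* empty string: no digits at all */
    for (; *s != '\0'; s++)
        if (!isxdigit((unsigned char) *s))
            return 0;           /* found a non-hex char */
    return 1;
}
```

A check like this could be run on a token before handing it to sscanf with %x.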

7.8.3 Ungetc. Had something like this in Chapter 4, ungetch(). Only one char of pushback is guaranteed, but that is usually enough.
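The typical use is reading one char too many and putting it back. A sketch, with the made-up name read_number: it collects decimal digits and pushes back the first non-digit so the caller sees it on the next read.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: read an unsigned decimal number from fp and
   push back the first char that is not a digit. Returns the value,
   or -1 if no digit was found. */
long read_number(FILE *fp)
{
    long value = 0;
    int c;
    int ndigits = 0;

    while ((c = getc(fp)) != EOF && isdigit(c)) {
        value = 10 * value + (c - '0');
        ndigits++;
    }
    if (c != EOF)
        ungetc(c, fp);          /* one char of pushback is guaranteed */
    return ndigits > 0 ? value : -1;
}
```

On input "123+45", the first call returns 123, the next getc returns '+', and a second call returns 45.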

7.8.4 Command Execution. The function system(). See pg 253.

int system(const char *s);

The string s has a system command (pwd, or date, or ls, or ls >fname, or could run a program: prog). Return depends on system. Learn from man.

A very common use is to run a program with parameters in another shell that will do some work I want.

int a, b;

char command[MAXCMD];

sprintf(command, "prog %d %d > prog.out", a, b);

system(command);

In program compiled as prog, get a and b values through argc and argv[ ].

The system call doesn't return string returned from the command, but by writing > prog.out, will create file with this output.

The calling function can then open this file, input the result and parse it (tokenize it).
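The whole round trip (build the command, run it, read the output file back) could be sketched as follows. The name run_and_capture and the MAXCMD value are assumptions, and the sketch assumes a UNIX-style shell that understands "> file" and returns zero from system() on success.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAXCMD 512              /* assumed limit on command length */

/* Hypothetical helper: run cmd with stdout redirected to outfile,
   then read the first line of that output into result (size chars).
   Returns 0 on success, -1 on any failure. */
int run_and_capture(const char *cmd, const char *outfile,
                    char *result, int size)
{
    char command[MAXCMD];
    FILE *fp;
    int ok;

    if (strlen(cmd) + strlen(outfile) + 4 > sizeof command)
        return -1;              /* command would not fit */
    sprintf(command, "%s > %s", cmd, outfile);
    if (system(command) != 0)
        return -1;              /* shell reported failure */
    if ((fp = fopen(outfile, "r")) == NULL)
        return -1;
    ok = fgets(result, size, fp) != NULL;
    fclose(fp);
    return ok ? 0 : -1;
}
```

The line read back can then be tokenized with strtok or picked apart with sscanf.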

7.8.5 Have talked about malloc()/free() before. You need to review it.

7.8.6 Math functions. Need to have #include <math.h>, and use flag for gcc:

gcc source.c -lm

7.8.7 Random functions: rand(), srand(); Look at pg. 252 where covered.

Also see bsearch() and qsort() on pg. 253. There is homework on this.
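As a warm-up for the homework, a sketch of bsearch on an array that is already sorted in ascending order. The names compare_ints and find_int are assumptions; the comparison function has the form both bsearch and qsort expect.

```c
#include <stdlib.h>

/* Comparison function in the form bsearch and qsort expect. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);   /* -1, 0, or 1 without overflow */
}

/* Hypothetical helper: return the index of key in the ascending
   array a of n ints, or -1 if key is absent. */
int find_int(const int *a, size_t n, int key)
{
    const int *hit = bsearch(&key, a, n, sizeof a[0], compare_ints);

    return hit == NULL ? -1 : (int) (hit - a);
}
```

bsearch returns a pointer to the matching element, so subtracting the base address gives its index.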

Here is the beginning of the man page on the C Library function qsort. (You should use the man command to get specifications of commands & fns.)

-------------------

man qsort

C Library Functions qsort(3C)

NAME

qsort - quick sort

SYNOPSIS

#include <stdlib.h>

void qsort(void *base, size_t nel, size_t width,

int (*compar) (const void *, const void *));

DESCRIPTION

The qsort() function is an implementation of the quick-sort

algorithm. It sorts a table of data in place. The contents

of the table are sorted in ascending order according to the

user-supplied comparison function.

The base argument points to the element at the base of the

table. The nel argument is the number of elements in the

table. The width argument specifies the size of each element

in bytes. The compar argument is the name of the comparison

function, which is called with two arguments that

point to the elements being compared.

The function must return an integer less than, equal to, or

greater than zero to indicate if the first argument is to be

considered less than, equal to, or greater than the second

argument.

The contents of the table are sorted in ascending order

according to the user supplied comparison function.

EXAMPLES

The following program sorts a simple array:

static int intcompare(int *i, int *j)

{

if (*i > *j)

return (1);

if (*i < *j)

return (-1);

return (0);

}

main()

{

int a[10];

int i;

a[0] = 9; a[1] = 8; a[2] = 7; a[3] = 6; a[4] = 5;

a[5] = 4; a[6] = 3; a[7] = 2; a[8] = 1; a[9] = 0;

qsort((char *) a, 10, sizeof(int), intcompare);

for (i=0; i<10; i++) printf(" %d",a[i]);

printf("\n");

}

ATTRIBUTES

See attributes(5) for descriptions of the following attributes:

 ___________________________________
| ATTRIBUTE TYPE  | ATTRIBUTE VALUE |
|_________________|_________________|
| MT-Level        | MT-Safe         |
|_________________|_________________|
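The man page example above predates the ANSI prototype: its intcompare takes int * arguments, while the SYNOPSIS asks for const void *. A sketch of the same sort written to match the prototype follows; the names int_compare and sort_ints are assumptions made for this illustration.

```c
#include <stdlib.h>

/* Comparison function matching the prototype in the SYNOPSIS:
   int (*compar)(const void *, const void *). */
static int int_compare(const void *p, const void *q)
{
    int i = *(const int *) p;   /* convert back to the real type */
    int j = *(const int *) q;

    if (i > j)
        return 1;
    if (i < j)
        return -1;
    return 0;
}

/* Hypothetical helper: sort n ints in ascending order, in place. */
void sort_ints(int *a, size_t n)
{
    qsort(a, n, sizeof a[0], int_compare);
}
```

Called on the array 9, 8, ..., 0 from the man page example, sort_ints(a, 10) leaves 0 through 9 in order, with no casts needed at the call to qsort.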