## IT 117: Intermediate Scripting Class 10

### Homework 5

I have posted homework 5 here.

It is due this coming Sunday at 11:59 PM.

### Mid-term

The mid-term exam will be given on Tuesday, March 20th.

This is the first Tuesday after the Spring break.

It will consist of questions like those on the quizzes along with questions asking you to write short segments of Python code.

60% of the points on this exam will consist of questions from the Ungraded Class Quizzes.

The last class before the exam, Thursday, March 8th, will be a review session.

You will only be responsible for the material in the Class Notes for that class on the exam.

The Mid-term is a closed book exam.

### Review

#### The Size of a Set

• One way to compare sets is to compare the number of their elements
• Mathematicians call the size of a set its cardinality
• The `len` function gives the size of a set
```>>> set_1 = {1, 2, 3}
>>> len(set_1)
3
>>> set_2 = {3, 2, 1}
>>> len(set_2)
3
>>> set_3 = {'one', 'two', 'three', 'four'}
>>> len(set_3)
4
```

#### When Are Sets Equal?

• If two sets have the same elements ...
• they are equal
```>>> set_1 = {1, 2, 3}
>>> set_2 = {3, 2, 1}
>>> set_1 == set_2
True```

#### Elements in A Python Set

• Only immutable values ...
• can be elements of a set in Python
• So you can create a set of tuples
```>>> tuple_set = {(1,2), (3,4), (5,6)}
>>> tuple_set
{(5, 6), (1, 2), (3, 4)}```
• but you cannot create a set of lists
```>>> list_set = {[1,2], [3,4], [5,6]}
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unhashable type: 'list'```

#### `for` Loops with Sets

• Sets are iterable
• This means that they can be used in a `for` loop
• The general format of a `for` loop looks like this
```for LOOP_VARIABLE in ITERABLE_OBJECT:
    STATEMENT
    ...```
• If you use a set in a `for` loop ...
• you will get each element in the set
```>>> set_1 = {1, 2, 3, 4, 5}
>>> for number in set_1:
...     print(number)
...
1
2
3
4
5
>>> set_2 = {'one', 'two', 'three', 'four', 'five'}
>>> for number in set_2:
...     print(number)
...
one
two
five
four
three```
• Notice that the order in which the elements appear ...
• when we define the set ...
• is not necessarily the order in which they appear ...
• in the loop

#### Testing for Set Membership

• You can test whether a set contains a value ...
• by using the `in` operator
```>>> set_1
{1, 2, 3, 4, 5}
>>> 7 in set_1
False
>>> 8 in set_1
False
>>> 3 in set_1
True```
• To test whether a value is not in a set ...
• we can use the `not in` operator
```>>> 8 not in set_1
True
>>> 3 not in set_1
False```

#### Union of Sets in Python

• The union of two sets ...
• is a new set having all the elements of both sets ...
• with no duplicates
• We can form the union of two sets in Python ...
• by using the union method
```>>> A = {1, 4, 8, 12}
>>> B = {1, 2, 6, 8}
>>> A.union(B)
{1, 2, 4, 6, 8, 12}```
• The union operation is symmetrical
• This means that
`A ∪ B`
• is the same as
`B ∪ A`
• In addition to the union method ...
• of the set object ...
• there is also a union operator
• The union operator gives the same results as the union method
```>>> A | B
{1, 2, 4, 6, 8, 12}```

#### Intersection of Sets in Python

• The intersection of two sets ...
• is a new set consisting of all the elements ...
• that are present in both sets
• Sets in Python have an intersection method
```>>> A
{8, 1, 12, 4}
>>> B
{8, 1, 2, 6}
>>> A.intersection(B)
{8, 1}```
• Intersection is also symmetrical ...
• so
`A ∩ B = B ∩ A`
• So we can get the same results by running the intersection method ...
• on either object
```>>> B.intersection(A)
{8, 1}```
• Python also has an intersection operator, &
```>>> A & B
{8, 1}```

#### Difference between Sets in Python

• If we have two sets, A and B ...
• the difference between A and B ...
• is a new set consisting of all the elements in `A` ...
• that are not in B
• This is written
`A - B`
• In Python, we can use the set difference method
```>>> A
{8, 1, 12, 4}
>>> B
{8, 1, 2, 6}
>>> A.difference(B)
{12, 4}```
• Set difference is not a symmetric operation
`A - B ≠ B - A`
• So the difference method is not symmetric
```>>> B.difference(A)
{2, 6}```
• Python also has a set difference operator, -
```>>> A - B
{12, 4}```

#### Symmetric Difference between Sets in Python

• The symmetric difference between two sets A and B ...
• consists of all the elements of A that are not in B ...
• and all the elements of B that are not in A
• In mathematics, this is written
`A Δ B`
• We can take the symmetric difference between two sets in Python ...
• by using the symmetric_difference method
```>>> A
{8, 1, 12, 4}
>>> B
{8, 1, 2, 6}
>>> A.symmetric_difference(B)
{2, 4, 6, 12}```
• The symmetric difference operation is symmetric
`A Δ B = B Δ A`
• Python also has a symmetric difference operator, ^
```>>> A ^ B
{2, 4, 6, 12}```

#### Subsets and Supersets

• If all the elements of set A ...
• are also contained in set B ...
• then A is a subset of B
• We can tell if one set is a subset of another ...
• by using the issubset method
• If we have two sets
```>>> A = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
>>> B = {1, 3, 5, 7, 9}```
• We can ask if one set is the subset of another like this
```>>> A.issubset(B)
False
>>> B.issubset(A)
True```
• Python also provides the subset operator, <=
```>>> A <= B
False
>>> B <= A
True```
• If all the elements of the set B are contained in A ...
• then A is a superset of B
• We can ask if one set is a superset of another using the issuperset method
```>>> A.issuperset(B)
True
>>> B.issuperset(A)
False
```
• The superset operator is >=
```>>> A >= B
True
>>> B >= A
False```

#### Disjoint

• If two sets have no element in common ...
• they are said to be disjoint
• The isdisjoint method of a set object ...
• will tell you if two sets are disjoint
```>>> B = {1, 3, 5, 7, 9}
>>> C = {2, 4, 6, 8, 10}
>>> B.isdisjoint(C)
True
```
• Since this condition is symmetric ...
• we can run the method on either set object
```>>> C.isdisjoint(B)
True```

#### The clear Method

• The clear method removes all elements from a set
```>>> D = {1, 2, 3, 4, 5}
>>> D
{1, 2, 3, 4, 5}
>>> D.clear()
>>> D
set()```

#### `min` And `max` with Sets

• To find the set element with the maximum value ...
• you can use the `max` built-in function
```>>> B = {1, 3, 5, 7, 9}
>>> max(B)
9
```
• To find the set element with the minimum value ...
• use the `min` function
```>>> min(B)
1
```

#### Sets More Efficient Than Lists

• Why use sets if we can use lists?
• Lists can do everything sets can do ...
• and they also have an order associated with them
• But the way that sets are implemented in Python ...
• means that checking whether a value is in a set is very fast
• So sets are more efficient than lists for membership tests
• If the number of elements in a problem is small ...
• this efficiency doesn't make much of a difference
• But what if we were doing something ...
• that involved a large number of elements?
• Here sets could make things significantly faster
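• The difference can be measured with a rough timing sketch like the one below
• The size of 100,000 elements and the 100 repetitions are arbitrary assumptions, and the exact times will vary from machine to machine

```python
import timeit

# Assumption: 100,000 integers is "large" enough to show the difference
SIZE = 100_000
number_list = list(range(SIZE))
number_set = set(number_list)

# A value near the end is the slow case for a list,
# because a list is searched one element at a time
target = SIZE - 1

# Time 100 membership tests against each collection
list_time = timeit.timeit(lambda: target in number_list, number=100)
set_time = timeit.timeit(lambda: target in number_set, number=100)

print("list lookup:", list_time, "seconds")
print("set lookup: ", set_time, "seconds")
```

• On most machines the set lookup should be far faster, but treat the numbers as an illustration rather than a benchmark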

### New Material

#### Working with the Operating System

• Certain operations can only be performed by the operating system
• For example
• Creating files
• Renaming files
• Deleting files
• Creating directories
• Many of the things you can do at the command line ...
• can be done within Python
• The Python interpreter can ask the operating system ...
• to perform these tasks for you

#### The os Module

• When you need the operating system to do something for you ...
• you need to use Python's os module
• Of course you must import it first
`>>> import os`
• Whenever you need to do something with a file ...
• other than read it ...
• or write to it ...
• you will need to use the os module

#### os.getcwd()

• The current directory is the starting point ...
• for all relative paths
• os.getcwd() returns a string ...
• that gives the pathname of your current directory
```>>> os.getcwd()
'/home/ghoffmn'```
• The name means get current working directory

#### os.listdir(path)

• os.listdir(path) returns a list ...
• of the contents of the directory ...
• specified by its argument
```>>> course_dir = os.listdir('/courses/it117/s14/ghoffmn')
>>> for entry in course_dir :
...     print(entry)
...
GROUP
MAIL
cmanuel1
jpinto
fortinsy
ebeazer
...```
• os.listdir does not return the special entries . and ..
• The list is not in any particular order ...
• but you can use the built-in function sorted() to change that
• To see the contents of your current directory ...
• run os.listdir with no argument
```>>> os.listdir()
['News', 'mail', 'it114', '.ssh', '.bash_history', '.bashrc', ...```
• Notice that os.listdir() includes the "invisible files" ...
• in the list it returns
• You can use an absolute path
```>>> os.listdir('/home/ghoffmn/assignments_submitted')
['homework_submitted', 'code_entry_submitted']```
• or a relative path
```>>> os.listdir('assignments_submitted')
['homework_submitted', 'code_entry_submitted']```
• as the argument to os.listdir()
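• As a small sketch of the points above, the following code sorts a listing and then skips the "invisible" entries
• It builds a temporary directory with made-up file names, so it does not depend on any real files; os.path.join and the tempfile module are used here only to set up the example

```python
import os
import tempfile

# Build a throwaway directory so the example does not depend on real files
# (the file names below are made up for illustration)
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ['notes.txt', '.hidden', 'homework.py']:
        open(os.path.join(demo_dir, name), 'w').close()

    # sorted() returns a new list in alphabetical order
    all_entries = sorted(os.listdir(demo_dir))

    # Keep only the entries whose names do not start with a dot
    visible_entries = []
    for entry in all_entries:
        if not entry.startswith('.'):
            visible_entries.append(entry)

print(all_entries)
print(visible_entries)
```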

#### os.chdir(path)

• To move to another directory ...
• inside a Python script ...
• you need to use os.chdir(path)
• You can use either an absolute path
```>>> os.chdir('/home/ghoffman')
>>> os.getcwd()
'/home/ghoffman'```
• or a relative path
```>>> os.chdir('..')
>>> os.getcwd()
'/home'```
• Relative paths are more convenient than absolute paths...
• because they are shorter
• But to use a relative path in your script ...
• you must know where the new directory is located ...
• with respect to your current directory
• You don't have this problem when you use an absolute path
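• One way to see the difference is the sketch below, which moves into a temporary directory with an absolute path and then into a subdirectory with a relative path
• The directory name sub is made up for the example, and the script returns to its starting directory at the end

```python
import os
import tempfile

start_dir = os.getcwd()          # remember where we started

with tempfile.TemporaryDirectory() as demo_dir:
    os.chdir(demo_dir)           # absolute path: works from anywhere
    os.mkdir('sub')              # create a subdirectory to move into
    os.chdir('sub')              # relative path: only works because we are in demo_dir
    inside = os.path.basename(os.getcwd())
    os.chdir(start_dir)          # go back before the temporary directory is deleted

print(inside)
```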

#### os.rename(old_name, new_name)

• You can change the name of a file ...
• with os.rename
```>>> os.chdir('/home/ghoffmn/tmp')
>>> os.listdir('.')
['test.txt', 'dir1']
>>> os.rename('test.txt', 'file.txt')
>>> os.listdir('.')
['dir1', 'file.txt']```
• os.rename() also works on directories
```>>> os.rename('dir1', 'test_dir')
>>> os.listdir('.')
['test_dir', 'file.txt']```

#### os.remove(path)

• To delete a file use os.remove()
```>>> os.remove('file.txt')
>>> os.listdir('.')
['test_dir']```
• os.remove() does not work on directories
```>>> os.remove('test_dir')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OSError: [Errno 21] Is a directory: 'test_dir'```

#### os.rmdir(path)

• To remove a directory use os.rmdir()
```>>> os.rmdir('test_dir')
>>> os.listdir('.')
[]```
• You cannot remove a directory using os.rmdir() ...
• unless the directory is empty

#### os.mkdir(path)

• To create a directory, use os.mkdir()
```>>> os.mkdir('dir1')
>>> os.listdir('.')
['dir1']```
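• The directory and file functions above can be combined in a short sketch that shows why os.rmdir() needs an empty directory
• It works inside a temporary directory, and the names test_dir and file.txt are only examples

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as work_dir:
    target = os.path.join(work_dir, 'test_dir')
    os.mkdir(target)
    file_path = os.path.join(target, 'file.txt')
    open(file_path, 'w').close()     # put one empty file in the directory

    try:
        os.rmdir(target)             # fails: test_dir still contains file.txt
        removed_first_try = True
    except OSError:
        removed_first_try = False

    os.remove(file_path)             # empty the directory first
    os.rmdir(target)                 # now the removal succeeds
    still_exists = os.path.isdir(target)

print(removed_first_try, still_exists)
```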

#### Running Unix Commands within Python

• You can run a Unix command from within Python ...
• using os.system()
• The argument to os.system() is a string ...
• which contains a Unix command
• os.system() runs the command ...
• and returns the exit status
```>>> result = os.system('touch foo.txt')
>>> result
0
>>> os.listdir('.')
['dir1', 'foo.txt']```
• When the exit status is 0 ...
• the command ran without error
• If the number is greater than 0 ...
• the command did not work
• os.system() also works on Windows, but the string must then contain a Windows command
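• A script can test the value returned by os.system() to decide what to do next
• The sketch below assumes a Unix-like system where the commands true and ls exist; the path /no/such/directory is made up so that ls fails

```python
import os

# Assumption: a Unix-like system where the commands true and ls exist
ok_status = os.system('true')

# The redirection hides the error message that ls would print
bad_status = os.system('ls /no/such/directory > /dev/null 2>&1')

if ok_status == 0:
    print("first command ran without error")
if bad_status != 0:
    print("second command failed")
```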

#### os.environ

• One of the ways we customize a Unix environment ...
• is with shell variables
• os.environ is not a function
• It is a module variable ...
• which holds a dictionary
• The keys in the dictionary are the names of the environment variables
• The values in the dictionary are the values of those variables
• To get the value of a shell variable use the [ ] operator ...
• with the name of the variable
```>>> os.environ['HOME']
'/home/ghoffmn'
>>> os.environ['SHELL']
'/bin/bash'
>>> os.environ['PATH']
'/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games'```
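• The [ ] operator raises a KeyError if the variable does not exist
• The sketch below shows one way around this, using the dictionary get method with a default value; the variable names DEMO_SETTING and NO_SUCH_VARIABLE_12345 are made up for the example

```python
import os

# DEMO_SETTING is a made-up variable name used only for this example
os.environ['DEMO_SETTING'] = 'on'

# get() returns the second argument when the variable is missing
present = os.environ.get('DEMO_SETTING', 'not set')
missing = os.environ.get('NO_SUCH_VARIABLE_12345', 'not set')

print(present)
print(missing)

try:
    os.environ['NO_SUCH_VARIABLE_12345']
except KeyError:
    print("the [ ] operator raises KeyError for a missing variable")
```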

#### The os.path Module

• The os.path module contains some functions ...
• which operate on pathnames
• You can import it directly
`>>> import os.path`
• os.path can also be reached through the os module ...
• so `import os` by itself is enough to call the os.path functions
• An explicit import is still allowed, and it makes clear that the script uses os.path

#### os.path.isfile(path) and os.path.isdir(path)

• os.path.isfile() and os.path.isdir() are boolean functions
• os.path.isfile() returns True ...
• if its argument is a file
```>>> os.path.isfile('foo.txt')
True
>>> os.path.isfile('dir1')
False```
• os.path.isdir() returns True ...
• if its argument is a directory
```>>> os.path.isdir('dir1')
True
>>> os.path.isdir('foo.txt')
False```
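• These two functions are often used together with os.listdir() to separate files from directories
• The function classify below is a hypothetical helper written for this example, and the temporary directory with dir1 and foo.txt is created only to have something to test

```python
import os
import os.path
import tempfile

def classify(directory):
    """Return two sorted lists: the files and the subdirectories in directory."""
    files = []
    dirs = []
    for entry in os.listdir(directory):
        # listdir returns bare names, so build the full path before testing
        full_path = os.path.join(directory, entry)
        if os.path.isfile(full_path):
            files.append(entry)
        elif os.path.isdir(full_path):
            dirs.append(entry)
    return sorted(files), sorted(dirs)

with tempfile.TemporaryDirectory() as demo_dir:
    os.mkdir(os.path.join(demo_dir, 'dir1'))
    open(os.path.join(demo_dir, 'foo.txt'), 'w').close()
    files, dirs = classify(demo_dir)

print(files, dirs)
```

• Note that the test is made on the full path; testing the bare name would only work if the directory being listed were the current directory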

#### os.path.basename(path)

• os.path.basename() returns the filename part ...
• of a pathname
```>>> os.getcwd()
'/home/ghoffmn/tmp'
>>> os.path.basename(os.getcwd())
'tmp' ```
• It should be used in usage messages ...
• which we'll discuss in a few minutes
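• A few more calls show how os.path.basename() treats different kinds of pathnames
• The pathnames below are only examples; the function works on the string and does not check whether the file exists

```python
import os.path

# basename() returns whatever follows the last / in the pathname
results = [
    os.path.basename('/home/ghoffmn/tmp'),     # absolute path
    os.path.basename('./print_args.py'),       # relative path
    os.path.basename('print_args.py'),         # no directory part at all
    os.path.basename('/home/ghoffmn/tmp/'),    # nothing follows the last /
]

for result in results:
    print(repr(result))
```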

#### The sys Module

• Python scripts run inside two environments
• The operating system
• The Python interpreter
• The sys module contains variables and functions ...
• that let you interact with the Python interpreter
• You must import the sys module ...
• before you can use it
`>>> import sys`

#### Getting Values from the Command Line

• You have seen two ways a Python script can get information ...
• from outside the script
• A file
• The user
• But there is a third way to get user data
• The script can get the values ...
• from the command line
• The sys module contains the variable argv
• sys.argv is a list variable ...
• that contains all the command line arguments ...
• as well as the pathname used to run the program
• Here is a script that demonstrates this
```$ cat print_args.py
#! /usr/bin/python3

import sys

print("The command line arguments:")
for index in range(len(sys.argv)) :
    print("Argument ", index, ":", sys.argv[index])

$ ./print_args.py foo bar bletch
The command line arguments:
Argument  0 : ./print_args.py
Argument  1 : foo
Argument  2 : bar
Argument  3 : bletch```
• Notice that the first element in sys.argv is the pathname ...
• that was used to run the script
• The name argv comes from the C language ...
• where it stands for argument vector

#### Leaving a Running Script

• When the interpreter gets to the end of a script ...
• the script stops running
• But what if you wanted to leave before this?
• You can leave a script before the end of the code ...
• by using the sys.exit() function
• Why would you want to do this?
• There are many reasons
• The most common is when you encounter an error ...
• that prevents the script from proceeding
• For example, consider the following script ...
• that prints the contents of a file
```$ cat print_file.py
#! /usr/bin/python3

import sys

file_name = input("Please enter the name of a file: ")
try :
    file = open(file_name, 'r')
except :
    print("Could not open file", file_name)
    sys.exit()
for line in file :
    print(line.strip())

$ ./print_file.py
Please enter the name of a file: xxxxxxxxxxxxxxx
Could not open file xxxxxxxxxxxxxxx```
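• sys.exit() can also take an argument, which becomes the exit status of the script
• The sketch below is one possible variation, not the script above: open_or_exit is a hypothetical helper, the path /no/such/file.txt is made up, and the SystemExit exception is caught only so the example can show the status

```python
import sys

def open_or_exit(file_name):
    """Open file_name for reading, or leave the script with exit status 1."""
    try:
        return open(file_name, 'r')
    except OSError:
        print("Could not open file", file_name)
        sys.exit(1)              # non-zero status tells the shell something went wrong

# sys.exit() works by raising SystemExit, so we can catch it to show the status
try:
    open_or_exit('/no/such/file.txt')
except SystemExit as leaving:
    exit_status = leaving.code

print("exit status:", exit_status)
```

• A normal script would not catch SystemExit; it would simply let the interpreter stop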

#### Usage Messages

• Most scripts need input from the user to do their work
• Though a script can get this using `input`...
• this is not the best solution for a utility program ...
• because it requires an extra step
• You have to type the command ...
• and then wait to be prompted for the values
• It is more convenient to supply the values on the command line ...
• and not wait to be prompted
• But this raises a problem
• How do you tell the user what values are needed?
• The best way to do this is through a usage message
• A usage message is a message that a program prints ...
• when it does not get the right number of command line arguments
• In this class, usage messages must have the form
`Usage: SCRIPT_NAME ARGUMENT_1 ARGUMENT_2 ...`
• For example, let's say the script list_dir.py needs the name of a directory ...
• from the command line
• If it does not get it ...
• it should print a usage message that indicates the argument it needs
```$ ./list_dir.py
Usage: list_dir.py DIRECTORY_NAME```
• The message is printed by the following code fragment
```if len(sys.argv) < 2:
    print('Usage:', os.path.basename(sys.argv[0]), 'DIRECTORY_NAME')
    sys.exit()```
• Let's examine this code
• The first line checks the number of tokens on the command line
• You might have thought that the value should be 1, not 2
• But sys.argv is a list containing all command line strings ...
• including the pathname of the script
• So the name of the script can be obtained from the expression
`sys.argv[0]`
• The second line prints the message ...
• and uses the os.path module function basename() ...
• to strip away everything except the name of the script
• If I had not done this the usage message would read
```$ ./list_dir.py
Usage: ./list_dir.py DIRECTORY_NAME```
• The third line ends the running of the script
• You must print a usage message for any script that takes command line arguments
• If you don't, you will lose points
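• Putting the pieces together, a complete list_dir.py might look something like the sketch below
• This is only one possible arrangement: the functions usage_message and main are hypothetical, and main is called with a hand-made list instead of sys.argv so the usage message can be shown

```python
import os
import os.path
import sys

def usage_message(argv):
    """Build the usage message from the pathname used to run the script."""
    return 'Usage: ' + os.path.basename(argv[0]) + ' DIRECTORY_NAME'

def main(argv):
    """Print the sorted contents of the directory named in argv[1]."""
    if len(argv) < 2:
        print(usage_message(argv))
        sys.exit()
    for entry in sorted(os.listdir(argv[1])):
        print(entry)

# In a real script the call would be main(sys.argv);
# here we pass a list with no directory name to trigger the usage message
try:
    main(['./list_dir.py'])
    usage_shown = False
except SystemExit:
    usage_shown = True
```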