There is one deliverable for this assignment, the script hw5.py.
The script must have 5 functions.
This function must have the following header:
def open_file_read(filename):
It must try to create a file object for reading on the file whose name is given by the parameter filename.
If it is successful in creating the file object, it should return the object.
If it cannot create the object, it should print an error message and return None.
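The behavior described above could be sketched as follows. This is only one possible approach: it assumes a try/except around the built-in open is acceptable, and the wording of the error message is an assumption, since the assignment does not specify it.

```python
def open_file_read(filename):
    """Return a file object opened for reading, or None if it cannot be created."""
    try:
        return open(filename, "r")
    except OSError:
        # OSError covers a missing file, a directory, or a permissions problem.
        # The message text below is an illustrative assumption.
        print("Cannot open", filename, "for reading")
        return None
```

Because the function returns None on failure, the caller can check the result before trying to read from it.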
This function must have the following header:
def count_words(file):
It must read in a file and count the words in the file.
It must return the word count.
The function should use the following algorithm.
initialize word_count to 0
for line in file:
    create the list words by calling split on line
    add the length of the list words to word_count
return word_count
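The algorithm above translates almost line for line into Python. The sketch below follows it directly; the io.StringIO object is a hypothetical stand-in for a real file, used only so the example runs without gettysburg.txt.

```python
import io


def count_words(file):
    """Count the whitespace-separated words in an open file object."""
    word_count = 0
    for line in file:
        words = line.split()      # split on any run of whitespace
        word_count += len(words)  # a blank line contributes 0
    return word_count


# Hypothetical in-memory "file" for a quick check.
sample = io.StringIO("Four score and seven\nyears ago\n")
print(count_words(sample))  # 6
```

Calling split with no argument discards leading and trailing whitespace, so the newline at the end of each line never becomes a word.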
This function must have the following header:
def word_set_create(file):
It must read in a file and create a set of the words in that file.
All words added to the set must be lowercase.
It must return that set.
The function should use the following algorithm.
create the empty set word_set
for line in file:
    create the list words by calling split on line
    for word in words:
        make word lowercase
        add word to word_set
return word_set
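A minimal sketch of this algorithm is shown below. As before, the io.StringIO object is a hypothetical stand-in for a real file so the example is self-contained.

```python
import io


def word_set_create(file):
    """Return the set of lowercase words found in an open file object."""
    word_set = set()
    for line in file:
        words = line.split()
        for word in words:
            word_set.add(word.lower())  # duplicates are ignored by the set
    return word_set


# Hypothetical in-memory "file" for a quick check.
sample = io.StringIO("The people the People\n")
print(sorted(word_set_create(sample)))  # ['people', 'the']
```

Note that split leaves punctuation attached, so under this algorithm "earth." and "earth" count as different words.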
This function must have the following header:
def different_words(set_1, set_2):
It must return a set of all the words that are in one set, but not the other.
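The words that are in one set but not the other form what Python calls the symmetric difference. A sketch, assuming the built-in set method symmetric_difference is an acceptable way to compute it (the assignment text here does not name a method):

```python
def different_words(set_1, set_2):
    """Return the words that appear in exactly one of the two sets."""
    return set_1.symmetric_difference(set_2)


# Small made-up sets for illustration.
print(sorted(different_words({"a", "b", "c"}, {"b", "c", "d"})))  # ['a', 'd']
```

The operator form set_1 ^ set_2 gives the same result.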
This function must have the following header:
def common_words(set_1, set_2):
It must return a set of all the words that are found in both sets.
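The words found in both sets form the intersection of the two sets. A sketch, assuming the built-in set method intersection is an acceptable way to compute it (the assignment text here does not name a method):

```python
def common_words(set_1, set_2):
    """Return the words that appear in both sets."""
    return set_1.intersection(set_2)


# Small made-up sets for illustration.
print(sorted(common_words({"a", "b", "c"}, {"b", "c", "d"})))  # ['b', 'c']
```

The operator form set_1 & set_2 gives the same result.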
Open a text editor and create the file hw5.py.
You can use the editor built into IDLE or a program like Sublime.
Your hw5.py file must contain the following test code at the bottom of the file:
filename_1 = "gettysburg.txt"
filename_2 = "gettysburg_hay.txt"
file_1 = open_file_read(filename_1)
file_2 = open_file_read(filename_2)
count_1 = count_words(file_1)
print(count_1)
count_2 = count_words(file_2)
print(count_2)
print()
file_1 = open_file_read(filename_1)
file_2 = open_file_read(filename_2)
word_set_1 = word_set_create(file_1)
word_set_2 = word_set_create(file_2)
print("Filename Words Unique Words")
print("---------------------------------------")
print(filename_1 + " " + str(count_1) + " " + str(len(word_set_1)))
print(filename_2 + " " + str(count_2) + " " + str(len(word_set_2)))
print()
different_word_set = different_words(word_set_1, word_set_2)
print("The two files have", len(different_word_set), "words in one file, but not the other")
for word in sorted(different_word_set):
    print(word)
print()
common_words_set = common_words(word_set_1, word_set_2)
print("The two files have", len(common_words_set), "words in common")
For this test code to work, you must copy gettysburg.txt and gettysburg_hay.txt to your machine.
To do this, use FileZilla to copy the files from /home/ghoffman/course_files/it117_files into the directory that holds your hw5.py script.
Write this program in a step-by-step fashion using the technique of incremental development.
In other words, write a bit of code, test it, make whatever changes you need to get it working, and go on to the next step.
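One possible starting point for incremental development is a skeleton in which every required function exists but does nothing yet. The sketch below assumes you begin with the five headers given above and a pass statement as each body, then fill in one function at a time.

```python
# Skeleton for hw5.py: each function is a stub that will be filled in later.
# A function whose body is only pass returns None when called.

def open_file_read(filename):
    pass


def count_words(file):
    pass


def word_set_create(file):
    pass


def different_words(set_1, set_2):
    pass


def common_words(set_1, set_2):
    pass
```

A script like this runs without errors, which confirms the headers are typed correctly before any real logic is added.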
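Give each function a body consisting of the statement pass.
Replace the pass statement in open_file_read with the body of the code from your hw4.py script.
Replace the pass statement in count_words with a statement that assigns 0 to the variable count.
Add a for loop that loops over the lines in the file.
Test the function with the test code line count_1 = count_words(file_1).
Run the script.
Use a print statement to check each step, then remove the print statement once the step works.
Inside the for loop, split each line into a list of words and add the length of the list to count, as in the algorithm above.
After the for loop, return count.
Run the script; the two word counts should be:
272
268
Fix any errors you find.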
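Remove the pass statement from word_set_create.
Replace it with a statement that creates the empty set word_set.
Add a for loop that loops over the lines in the file.
Test the function with the test code that reopens the files, beginning with file_1 = open_file_read(filename_1).
Run the script.
Use a print statement to check each step, then remove the print statement once the step works.
Inside that loop, add a for loop that loops over the words in word_list.
Make each word lowercase and add it to word_set, then return word_set after the loops, as in the algorithm above.
Test the function with the test code that prints the table, beginning with print("Filename Words Unique Words").
Run the script; you should see:
Filename Words Unique Words
---------------------------------------
gettysburg.txt 272 132
gettysburg_hay.txt 268 135
Fix any errors you find.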
Remove the pass statement from different_words.
Replace it with a return statement which calls, on the parameters set_1 and set_2, the set method that returns the elements found in one set but not the other.
Test the function with the test code line different_word_set = different_words(word_set_1, word_set_2).
Run the script; you should see:
The two files have 9 words in one file, but not the other
advanced
battle
battlefield
carried
field
fought
god
under
upon
Fix any errors you find.
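Remove the pass statement from common_words.
Replace it with a return statement which calls, on the parameters set_1 and set_2, the set method that returns the elements found in both sets.
Run the script; the last line of output should be:
The two files have 132 words in common
Fix any errors you find.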
The complete output of the script should look like this:
272
268

Filename Words Unique Words
---------------------------------------
gettysburg.txt 272 132
gettysburg_hay.txt 268 135

The two files have 9 words in one file, but not the other
advanced
battle
battlefield
carried
field
fought
god
under
upon

The two files have 132 words in common
cd 117
cd hw
cd hw5
chmod 755 hw5.py
Copyright © 2021 Glenn Hoffman. All rights reserved. May not be reproduced without permission.