CS210 Project 3

Bob Wilson

 

Big-O Analysis and Experimentation for Sorting

 

In this project, you will experiment with and observe the performance of various sorting algorithms applied to data in different states of organization.  Two of the data files will be organized randomly (two different sizes), one will be already sorted, and one will be sorted in exactly in the opposite order of the desired order.  You will learn that different sort algorithms have different performance in these different situations

 

Assignment Instructions:

Download and build a Dr Java project file for the code in Project3.zip.  It contains the L&C example code for the class “SortingAndSearching” which has multiple sorting methods each of which uses a different algorithm.  It contains code for a SortCounter class with a main method that reads a file and sets up its data in an array for sorting by each of the five methods.  There are a number of data files as samples for sorting – the names should be self-explanatory except for “mystery.txt”.

 

SortCounter class:

 

NOTE:  DO NOT CHANGE THE OPERATOR INTERFACE IMPLEMENTED IN THIS CLASS.  IT MUST BE KEPT AS-IS SO THAT THE GRADER CAN USE A COMMON SCRIPT FOR TESTING EACH STUDENT’S CODE.

 

Modifications to SortingAndSearching class:

There are attributes and methods to maintain an integer attribute compareCount for the number of comparisons that the sort algorithm makes on the data in each file while sorting it.  It also has a memoryCount and maxMemoryCount attributes for the amount of extra memory used by mergeSort.  A constructor initializes all attributes to zero.  A getCompareCount() method reads the compare counter value (after performing a sort experiment on a file).  A resetCompareCount()method resets the compare count to zero prior to doing the next sort experiment.  A getMemoryCount method reads the maximum memory counter after that merge sort method is called.  None of the code needed to generate data in these attributes is provided.

 

You must “instrument” the code of the “SortingAndSearching” class to generate the correct data.  Study the code for each sorting method.  Identify all places where the code calls compareTo on the data values that are being sorted and add code to increment the compareCount value at those places.  Also, add code to the mergeSort method to generate the memoryCount and maxMemoryCount data.  Now, run experiments using the supplied data files to characterize the performance of each sort algorithm.  Copy and paste the results of your experiments from the DrJava interactions pane to your report to record the data as a baseline. 

 

Study the code for the bubble sort method further.  You should have already run experiments that show how well the bubble sort code performs.  Now, implement an improvement in the code so that the inner loop keeps track of if a swap is made and the outer loop terminates if the inner loop makes no swaps.  That would imply that the data is already sorted.  Run experiments that show how well it performs after you make the change.  Explain the observed differences from the baseline in your report.

 

Study the code for the quick sort method further.  You should have already run experiments that show how well the quick sort code performs.  Now, implement the suggestion made in the lecture notes to select the middle element instead of the first element as the partition element.  Hint:  Swap the middle element and the first element before doing the rest of the sort.  Run experiments that show how well it performs after you make the change.  Explain the observed differences from the baseline in your report.

 

Study the code for the merge sort method further.  You should have already run experiments that show how well the merge sort code performs and display the amount of extra memory it used.  Now, implement the suggestion made in the lecture notes to use less memory by allocating memory for the merge process after the recursive calls instead of before the recursive calls.  Don’t forget to move the code for measuring the memory use data.  Explain the observed differences from the baseline in your report.

 

Experimenting and Reporting:

The zip file also contains some test data files that present different situations for sorting.  Any file named random contains data that is not in any sorted order.  Any file named sorted contains data that has previously been sorted in the same order that the algorithms will be trying to sort the data.  Any file named reverse contains data that has previously been sorted in exactly the opposite order from the order that the algorithms will be trying to sort the data.  Sort the file mystery.txt and try to figure out what problem it causes for one of the algorithms. Include your conclusion in your report.

 

Gather your experimental test results and discuss your observations/conclusions in your memo.txt report file.  You must discuss your observations relative to the expected theoretical behavior of each algorithm based on Big-O notation.  Include copy and paste results of your runs from the DrJava interactions window into memo.txt to provide supporting evidence for your discussion and your conclusions. 

 

Report (memo.txt):

 

For this project and all projects in this course, the memo.txt file that you upload to the turn-in system MUST BE A PLAIN TEXT FILE - NOT A WORD (.DOC or .RTF) FILE.  On a Windows PC, I recommend that you use Notepad to create this file.  On a MAC, you must use a suitable editor to create a plain text file.

 

Report on the results of your experiments as explained above.

 

Turning in your Assignment:

 

Use the secure file transfer program to upload your Project3 folder containing your .java files and memo.txt file to your UNIX file directory: “account/cs210/Project3” where “account” is your account ID.

 

You must upload the files to our UNIX system before the posted deadline.  You will get no credit for late submissions.  Plan your time carefully so that you have enough time to upload your files.  If you have not finished, upload your files on time anyway - even if the program does not work correctly yet.  That way you are eligible for part credit on the assignment.  If you upload your files more than once, only the most recent copy will be saved.