CS210 Thurs, Feb. 23

PA02 Intro, continued.

Last time: tokenization with Scanner.  Note that TestScanner1.java and TestScanner2.java are now included with the supplied files for pa02.  These have working sentence and word delimiters ready for pa02.

Random Numbers for random word in a Set.


In the last part of PA02, we want to choose a random first word for our sentence.  How can we do that?
We can get the Set of words (all of them non-stop words) from our Map.
We can number them by putting them in an array:

Object[] wordArray = words.toArray();  // array holds refs to the Strings in the Set
Random r = new Random();     //  java.util.Random
int index = r.nextInt(words.size());  // index will be random from 0 to N-1 where N = words.size();
// now wordArray[index] is a random String from the Set
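
A minimal runnable sketch of the idea above.  The class name RandomWordDemo, the method name pickRandomWord, and the three sample words are made up for illustration; pa02's own Set would come from the Map's keys.  It uses the typed form toArray(new String[0]) so no cast from Object is needed.

```java
import java.util.Random;
import java.util.Set;
import java.util.TreeSet;

class RandomWordDemo {
    // Returns a randomly chosen element of a non-empty Set of words.
    static String pickRandomWord(Set<String> words, Random r) {
        String[] wordArray = words.toArray(new String[0]); // copies refs, not the Strings
        int index = r.nextInt(wordArray.length);           // random from 0 to N-1
        return wordArray[index];
    }

    public static void main(String[] args) {
        Set<String> words = new TreeSet<>();   // sample data, not from text1.txt
        words.add("spot");
        words.add("run");
        words.add("fast");
        System.out.println(pickRandomWord(words, new Random()));
    }
}
```

Passing the Random in as a parameter lets a caller supply a seeded Random (new Random(42), say) to get repeatable choices while testing.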

Inventory Example Expanded, to show data structure like pa02's.

We had a single bin for each item.  Suppose some items need several bins.  We can expand ItemInfo to hold a Map from bin number to the quantity in that bin.

public class ItemInfo {
    private String name;
    private Map<Integer, Integer> binQuantities;
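    public ItemInfo(String name, int binNum, int quantity) {
        // sets up new HashMap (java.util.HashMap), one bin in Map
        this.name = name;
        binQuantities = new HashMap<Integer, Integer>();
        binQuantities.put(binNum, quantity);
    }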
    public void addBin(int binNo) {
        binQuantities.put(binNo, 0);
    }

    public int findFullestBin() {
        Set<Integer> binNumbers = binQuantities.keySet();   //<--not "getKeys" Corrected from lecture!
        int max = 0;
        int maxBinNo = -1;
        for (Integer b: binNumbers) {
            if (binQuantities.get(b) > max) {
                max = binQuantities.get(b);
                maxBinNo = b;
            }
        }
        return maxBinNo;
    }
}

The whole inventory is held in a Map as before:

Picture of ItemInfo:  object with name="pencil", Map of bin 44 --> 60 pencils, bin 33 --> 42 pencils

Map<String, ItemInfo> inventory;

Picture of whole inventory object:  two Strings as keys: "pencil" --> its ItemInfo object, "tape" --> its ItemInfo object, where each of these ItemInfo objects has its own name plus a Map of bin# --> quantity in bin.

Example: Given inventory object, find fullest bin for "pencil":
ItemInfo pencilInfo = inventory.get("pencil");  // first get the ItemInfo from the inventory Map
int binNo = pencilInfo.findFullestBin();     // then get the desired info from the ItemInfo
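
To try the two-step lookup end to end, here is a self-contained sketch that builds the inventory from the picture above and asks for the fullest pencil bin.  The helper setQuantity, the class InventoryDemo, and the bin number and count for "tape" are assumptions added for this example; they are not part of the class shown in lecture.

```java
import java.util.HashMap;
import java.util.Map;

class ItemInfo {
    private String name;
    private Map<Integer, Integer> binQuantities;

    ItemInfo(String name, int binNum, int quantity) {
        this.name = name;
        binQuantities = new HashMap<>();
        binQuantities.put(binNum, quantity);
    }

    // Hypothetical helper (not in the notes): record the quantity held in a bin.
    void setQuantity(int binNo, int quantity) {
        binQuantities.put(binNo, quantity);
    }

    int findFullestBin() {
        int max = 0;
        int maxBinNo = -1;   // stays -1 if every bin is empty
        for (Integer b : binQuantities.keySet()) {
            if (binQuantities.get(b) > max) {
                max = binQuantities.get(b);
                maxBinNo = b;
            }
        }
        return maxBinNo;
    }
}

class InventoryDemo {
    // Builds a two-item inventory: pencil has bins 44 and 33, tape has one bin.
    static Map<String, ItemInfo> buildInventory() {
        Map<String, ItemInfo> inventory = new HashMap<>();
        ItemInfo pencil = new ItemInfo("pencil", 44, 60);
        pencil.setQuantity(33, 42);
        inventory.put("pencil", pencil);
        inventory.put("tape", new ItemInfo("tape", 12, 7)); // made-up bin and count
        return inventory;
    }

    public static void main(String[] args) {
        Map<String, ItemInfo> inventory = buildInventory();
        ItemInfo pencilInfo = inventory.get("pencil");    // outer Map lookup
        System.out.println(pencilInfo.findFullestBin());  // prints 44 (60 pencils beats 42)
    }
}
```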

Similarly for pa02:

public class WordStats {
    private String word;
    private int frequency;
    private Map<String, Integer> followers;  // constructed from a TreeMap
        ... constructor, methods...
    public String getMostFrequentFollower();
}

Example WordStats for "run" of text1.txt:

    word = "run", frequency = 5, Map followers has ("spot", 1) and ("fast", 1).

In TextAnalyzer, the whole text is represented:

Map<String, WordStats> textStats;

Here "run" maps to the WordStats object described above.

Example: given textStats,  get the frequency of "run":

WordStats stats = textStats.get("run");   // first get the relevant WordStats object
int freq = stats.getFrequency();     // then get the desired info from the WordStats object

Similarly, we can get the most frequent follower:
String follower = stats.getMostFrequentFollower();
which can be returned by TextAnalyzer's getMostFrequentFollower for this word.
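
One possible shape for this, as a sketch only and not the required pa02 design.  The helpers countOccurrence and addFollower, the class TextStatsDemo, and the sample counts are assumptions for illustration (they do not come from text1.txt).  The follower search is the same loop as findFullestBin in the inventory example.

```java
import java.util.Map;
import java.util.TreeMap;

class WordStats {
    private String word;
    private int frequency;
    private Map<String, Integer> followers = new TreeMap<>();

    WordStats(String word) { this.word = word; }

    // Hypothetical helper: count one more occurrence of this word.
    void countOccurrence() { frequency++; }

    // Hypothetical helper: record that `next` followed this word once more.
    void addFollower(String next) {
        Integer old = followers.get(next);
        followers.put(next, old == null ? 1 : old + 1);
    }

    int getFrequency() { return frequency; }

    // Returns the follower with the highest count, or null if there are none.
    // On a tie the alphabetically first follower wins: TreeMap iterates keys
    // in sorted order and we replace only on a strictly larger count.
    String getMostFrequentFollower() {
        String best = null;
        int max = 0;
        for (String f : followers.keySet()) {
            if (followers.get(f) > max) {
                max = followers.get(f);
                best = f;
            }
        }
        return best;
    }
}

class TextStatsDemo {
    // Sample data: "run" seen three times, followed by "fast" twice and "spot" once.
    static Map<String, WordStats> buildTextStats() {
        Map<String, WordStats> textStats = new TreeMap<>();
        WordStats run = new WordStats("run");
        run.countOccurrence(); run.addFollower("fast");
        run.countOccurrence(); run.addFollower("spot");
        run.countOccurrence(); run.addFollower("fast");
        textStats.put("run", run);
        return textStats;
    }

    public static void main(String[] args) {
        WordStats stats = buildTextStats().get("run");
        System.out.println(stats.getFrequency());             // prints 3
        System.out.println(stats.getMostFrequentFollower());  // prints fast
    }
}
```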

What have we covered so far in Chap. 6?
Sections 6.1-6.3, 6.5, 6.7 (except "implementing equals and hashCode"), 6.8

1.  The power of the Collection interface (pg. 211), parametrized by the element type. It gives us a basic API good for Lists (ArrayList, LinkedList) and Sets (TreeSet, HashSet).  Most useful of these are size(), add(), iterator(), but be sure to know them all.

2. The List interface (pg. 220) has additional important methods: get(), set(), listIterator(). Two important concrete classes, ArrayList and LinkedList.  ArrayList is a "vanilla" List implementation (no important extra methods).  LinkedList does have extra methods--see pg. 224.

3. The Set interface (for cs210) is just the Collection interface.  Two important concrete classes, TreeSet and HashSet.  Neither has any important extra methods for us.

4.  The Map interface (pg. 237) is independent of the Collection interface and has two parametrized types, one for keys, one for values. The most important methods are size(), get(), put(), and Set<K> keySet().

5. So far, we have been concentrating on element types and key types that are objects supplied by the language: Integers, Doubles, Strings, ...  That way, we don't have to worry about writing equals and hashCode for them.  We can use any type for a Map value type, since a Map places no equals or hashCode requirements on its values.

Map<String, ItemInfo>  : OK because String is fully supported with "perfect" equals, hashCode methods

Map<ItemInfo, Integer>  : not good so far--we would first need to write equals and hashCode for ItemInfo
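
A small sketch of what goes wrong with a key type that lacks its own equals and hashCode.  PlainItem and KeyDemo are made-up classes for this demonstration; PlainItem stands in for an ItemInfo-like class that inherits both methods from Object.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key class that does NOT override equals or hashCode.
class PlainItem {
    private String name;
    PlainItem(String name) { this.name = name; }
}

class KeyDemo {
    static Integer lookupWithStringKey() {
        Map<String, Integer> m = new HashMap<>();
        m.put(new String("pencil"), 60);
        // A different String object with the same characters still matches,
        // because String.equals and String.hashCode look at the characters.
        return m.get(new String("pencil"));
    }

    static Integer lookupWithPlainItemKey() {
        Map<PlainItem, Integer> m = new HashMap<>();
        m.put(new PlainItem("pencil"), 60);
        // A second PlainItem with the same name is a different key:
        // Object.equals compares references, so the lookup finds nothing.
        return m.get(new PlainItem("pencil"));
    }

    public static void main(String[] args) {
        System.out.println(lookupWithStringKey());     // prints 60
        System.out.println(lookupWithPlainItemKey());  // prints null
    }
}
```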

The missing sections--

Sorting and Searching Arrays—Weiss Sec. 6.4--Intro

If you have a Collection, then it’s easy to put its contents into an array, and then use Arrays.sort() to sort it.  See the Collection interface again on pg. 211, and its toArray() method.  Thus we know that any Collection can be called upon to deliver an array holding its contents.  The element objects are not copied, just their refs.
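
A short sketch of that recipe, assuming a Collection of Strings.  SortDemo, sortedCopy, and the sample words are names and data invented for the example.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

class SortDemo {
    // Puts the Collection's element refs into a new array and sorts the array.
    // The Collection itself is left unchanged.
    static String[] sortedCopy(Collection<String> c) {
        String[] a = c.toArray(new String[0]);  // refs are copied, not the Strings
        Arrays.sort(a);                         // natural (alphabetical) order for Strings
        return a;
    }

    public static void main(String[] args) {
        Set<String> words = new HashSet<>();    // HashSet has no useful order of its own
        words.add("spot");
        words.add("run");
        words.add("fast");
        System.out.println(Arrays.toString(sortedCopy(words)));  // prints [fast, run, spot]
    }
}
```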