CS210 Thurs., Feb. 16

Set Implementations in the JDK

There are two major concrete classes implementing Set in the JDK, TreeSet and HashSet.  You can read about them, pp. 210-216.  We will not study their (actual) implementation in cs210, but will in cs310.  In cs210, we will study binary search trees, which can be used to implement TreeSet, although not as well as the more advanced trees.  The HashSet implementation requires “hashing”, a topic of cs310.

However, we can understand how to use these powerful JDK classes. It’s not difficult.  For TreeSet, you need a way to compare elements, either by having them implement Comparable or by supplying a Comparator for them.  For HashSet, you need element “equals” implemented, and also element hashCode, and the two must agree (equal elements must have equal hash codes).
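
Here is a minimal sketch of what an element class needs for both kinds of Set.  The class Student and its fields are made up for illustration and are not from the text; the point is only where equals, hashCode and compareTo go.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical element class, invented for this sketch.
class Student implements Comparable<Student> {
    private final String name;
    private final int id;

    Student(String name, int id) {
        this.name = name;
        this.id = id;
    }

    // HashSet needs equals and hashCode, and they must agree.
    @Override
    public boolean equals(Object other) {
        if (!(other instanceof Student)) return false;
        Student s = (Student) other;
        return id == s.id && name.equals(s.name);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, id);
    }

    // TreeSet needs an ordering: here by id, then by name.
    @Override
    public int compareTo(Student other) {
        if (id != other.id) return Integer.compare(id, other.id);
        return name.compareTo(other.name);
    }
}

class StudentSetDemo {
    // Adds the same student twice and returns the resulting set size.
    static int sizeAfterDuplicateAdd(Set<Student> set) {
        set.add(new Student("Ann", 17));
        set.add(new Student("Ann", 17)); // equal to the first, so not added again
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterDuplicateAdd(new HashSet<Student>())); // 1
        System.out.println(sizeAfterDuplicateAdd(new TreeSet<Student>())); // 1
    }
}
```

Without equals and hashCode, the HashSet would treat the two "Ann" objects as different elements; without Comparable (or a Comparator), the TreeSet add would fail at run time.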

The easiest cases are Set of String, or Set of Integer, or Set of Double, or …: these come with comparison, equals, and hashCode already set up, so there’s hardly any work at all to using them.  See the sample programs of Figures 6.26 and 6.27.  This will be on hw2.  You can simplify Fig. 6.26 by taking out the "reverseOrder" spec.

Which to use, TreeSet or HashSet?  For large problems, hashing is faster than tree search, O(1) on average vs O(log n), for n elements in the set.  TreeSet keeps its elements in sorted order, which can be a useful thing for the app.
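
A small sketch of the difference, using Set of String so no extra setup is needed.  The class and method names here are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetOrderDemo {
    // A TreeSet iterates (and prints) its elements in sorted order.
    static String sortedView(List<String> words) {
        Set<String> tree = new TreeSet<String>(words);
        return tree.toString();
    }

    // A HashSet gives fast membership tests but promises no order.
    static boolean hashContains(List<String> words, String target) {
        Set<String> hash = new HashSet<String>(words);
        return hash.contains(target);
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "apple", "fig", "apple");
        System.out.println(sortedView(words));           // [apple, fig, pear]
        System.out.println(hashContains(words, "fig"));  // true
        System.out.println(hashContains(words, "kiwi")); // false
    }
}
```

Note that the duplicate "apple" appears only once either way, since both are Sets.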

Toy example: vowels, think of as set

Consider counting vowels.  The pseudocode is

  Loop through c in text
    if (c is a vowel) count it

How do we code the condition here?  The ordinary programmer says

  if (c == 'a' || c == 'e' || c == 'i' || c == 'o' || c == 'u')
or maybe:
  String vowelStr = "aeiou";
  if (vowelStr.indexOf(c) >= 0)  // search string for c

but with sets, we can avoid this search or repetitive comparison, once we build a Set of vowels:

   String vowelStr = "aeiou";
   Set<Character> vowels = new HashSet<Character>();
   for (int i=0; i<vowelStr.length(); i++)
       vowels.add(vowelStr.charAt(i));  // char is autoboxed to Character
   System.out.println(vowels); // check set of vowels:
                               // prints e.g. [o, i, u, e, a]
                               // note funny order because of hashing;
                               // the exact order can vary with the JDK version
 

in main loop over input string--

   Character c = s.charAt(i);
   if (vowels.contains(c))   // assuming c a Character: O(1) lookup, even for large sets

Of course for 5 items this is not important, but for large sets, it really matters.  We have gone from an O(N) search to an O(1) lookup, a big savings.
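
Putting the pieces together, a complete vowel counter might look like the sketch below.  This is not the course's Vowels.java; the class name VowelCounter and the choice to lowercase each character first are assumptions made for this example.

```java
import java.util.HashSet;
import java.util.Set;

class VowelCounter {
    // Build the set of vowels once, up front.
    private static final Set<Character> VOWELS = new HashSet<Character>();
    static {
        for (char c : "aeiou".toCharArray()) {
            VOWELS.add(c); // char is autoboxed to Character
        }
    }

    // Counts vowels in text with one set lookup per character.
    // Lowercasing is an assumption of this sketch, so 'E' counts too.
    static int countVowels(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = Character.toLowerCase(text.charAt(i));
            if (VOWELS.contains(c)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countVowels("Education")); // 5
        System.out.println(countVowels("sets"));      // 1
    }
}
```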

So when you see a loop of tests, think “could I use a Set here?”

See Vowels.java to try it yourself.

In pa2, we will use a Set of Strings to check for “stop words”, words like “the” and “at” that are not used in searches.
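
A rough sketch of the stop-word idea follows.  The stop-word list here is a tiny made-up one and the class and method names are hypothetical; the actual pa2 setup is not described in these notes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class StopWordFilter {
    // Made-up stop-word list, for illustration only.
    private static final Set<String> STOP_WORDS =
        new HashSet<String>(Arrays.asList("the", "at", "a", "of"));

    // Returns the words of line, lowercased, with stop words dropped.
    static List<String> removeStopWords(String line) {
        List<String> kept = new ArrayList<String>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty() && !STOP_WORDS.contains(word)) {
                kept.add(word);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(removeStopWords("The cat sat at the door"));
        // [cat, sat, door]
    }
}
```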

Maps 

Maps, starting with language-neutral info

 

The picture here has domain and range sets, with arrows from individual domain elements (keys) to the corresponding range elements (values).  Like a math function, each domain element has exactly one range element associated with it.

 

In CS we most often think of following the arrow as a "lookup" action from keys to values.  For example, employee records are looked up by social security number.  This would need a map with Integers for social security numbers as keys and Employee objects as values.

 

Let's look at a trivial example: reactions to grades:

 

'A'  ->   "yeah!"

'B'  ->    "hohum"

'C'  ->    "ak"

 

Here the keys are Characters and the values are Strings.  Each of these lines can be called a "key/value pair", or just “pair” or “association”.  ('A', "yeah!") is a pair of the grade 'A' (the key) and the phrase "yeah!" (the value).  The whole mapping is the set of these 3 pairs.

 

M = { ('A', "yeah!"), ('B', "hohum"), ('C', "ak") }          --a map as a set of pairs

 

That's how a mapping turns out to be a collection, in the same league as the other collections we are studying. However, in Java a Map has its own interface separate from Collection, in recognition of the importance of the domain-range view of Maps that is not captured completely by the “set of pairs” model.  For one thing, not every collection of pairs makes a proper map: M qualifies as a map only if the collection of keys has no duplicates, i.e., constitutes a set.

 

Actions on Maps

We can add a key/value pair to a Map, and this operation is called put in Java (pg. 217), and we can look up the associated range element (value) of any given domain element (key), and this action is called get.  Put is more pushy than “set” for Lists – it can put a new pair into the collection rather than just changing one that’s there.
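
A short sketch of put and get on the grade example.  The class name is made up; the behavior shown (put on an existing key replaces the value and returns the old one, get on a missing key returns null) is how java.util.Map is specified.

```java
import java.util.Map;
import java.util.TreeMap;

class PutGetDemo {
    // Builds the three-pair grade map from the example above.
    static Map<Character, String> buildReactions() {
        Map<Character, String> reactions = new TreeMap<Character, String>();
        reactions.put('A', "yeah!");
        reactions.put('B', "hohum");
        reactions.put('C', "ak");
        return reactions;
    }

    public static void main(String[] args) {
        Map<Character, String> reactions = buildReactions();
        System.out.println(reactions.get('B'));    // hohum

        // Key 'B' is already there, so put replaces its value
        // and hands back the old one; the key set stays duplicate-free.
        String old = reactions.put('B', "fine");
        System.out.println(old);                   // hohum
        System.out.println(reactions.size());      // 3

        System.out.println(reactions.get('D'));    // null (no such key)
    }
}
```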

 

The model of maps generalizes the model of arrays.  In the special case of integers (not too large) in the domain we can use arrays to hold maps.  For example, M = { (2, 'C'), (3, 'B'), (4, 'A') } maps integer codes to grades.  We could do this:

 

Character m[] = { ' ', ' ', 'C', 'B', 'A' };

 

Then we look up (get) the grade for 3 by evaluating m[3] and getting 'B' as a Character, which is compatible with the char 'B'.  We can put a grade for code 1 by assigning m[1] = 'D'.  We can do if (m[1] == 'D') after that if we want.
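
The array-as-map idea, as a small runnable sketch (the class and method names are made up for illustration):

```java
class ArrayAsMap {
    // Index = integer code, element = grade; codes 0 and 1 start out blank.
    static Character[] buildCodeToGrade() {
        Character[] m = { ' ', ' ', 'C', 'B', 'A' };
        return m;
    }

    public static void main(String[] args) {
        Character[] m = buildCodeToGrade();
        System.out.println(m[3]);        // B   ("get" for key 3)
        m[1] = 'D';                      // "put" for key 1
        System.out.println(m[1] == 'D'); // true (Character unboxed to char)
    }
}
```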

 

Now suppose we want A- handled too, so we switch to Strings for grades:

 

“A”  ->   "yeah!"

“A-”  ->   "yeah!"

“B”  ->    "hohum"

“C”  ->    "ak"

 

Note that we can’t do this with arrays:

         Arr["A"] = "yeah!"      //  Impossible: array subscripts must be integers!

 

But we can do this with Maps:

 

Map<String, String> gradeReaction = new TreeMap<String,String>();
gradeReaction.put("A", "yeah!");
gradeReaction.put("A-", "yeah!");
gradeReaction.put("B", "hohum");
gradeReaction.put("C", "ak");

 

Now we can get a reaction out by grade, any time:

 

String reaction = gradeReaction.get("A-");

 

Gets “yeah!” in reaction.

 

We can also iterate through the Map, as we will see in other examples.

For another example of a Map from String to String, see pg. 240
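
As a preview of iterating through the grade map, here is one possible sketch using entrySet, which is part of the standard Map interface.  The class and method names are made up, and the course examples may go about it differently.

```java
import java.util.Map;
import java.util.TreeMap;

class GradeReactionListing {
    // Builds the grade map and returns one "key -> value" line per pair.
    static String listing() {
        Map<String, String> gradeReaction = new TreeMap<String, String>();
        gradeReaction.put("B", "hohum");
        gradeReaction.put("A-", "yeah!");
        gradeReaction.put("C", "ak");
        gradeReaction.put("A", "yeah!");

        StringBuilder out = new StringBuilder();
        // A TreeMap visits its pairs in key order, whatever the put order was.
        for (Map.Entry<String, String> pair : gradeReaction.entrySet()) {
            out.append(pair.getKey()).append(" -> ")
               .append(pair.getValue()).append("\n");
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(listing());
        // A -> yeah!
        // A- -> yeah!
        // B -> hohum
        // C -> ak
    }
}
```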

 

Various ways of thinking about maps:

As holding conversions, like codes to grades, social security number to name

As generalized arrays

As math functions: y = f(x) is a map.

As a “database” with key lookup: SSno to employee record, ISBN to book record, name to inventory record.

 

Often in applications we use maps of strings to some object type, so let's next look at a little example of that sort.  Suppose we have a simple inventory system where part names map to inventory objects holding (name, QOH, bin#), where QOH = quantity on hand.  For example,

 

"pencil"  ->  ("pencil", 120, 42)          120 pencils in bin 42

"tape" -> ("tape", 44, 11)                  44 rolls of tape in bin 11

 

Let’s set up the inventory example.

 

public class ItemInfo {
   private String name;
   private int quantity;
   private int binNumber;
   // constructor, getters, setters…
}

 

Map<String,ItemInfo> inventory = new TreeMap<String, ItemInfo>();

// note: ItemInfo doesn’t need equals or compareTo; only the keys get compared, and String is all set up for us

 

ItemInfo item = new ItemInfo("pencil", 120, 42);
inventory.put("pencil", item);

later:

ItemInfo item = inventory.get("pencil");
System.out.println("we have " + item.getQuantity() + " " + item.getName() + "s");

 

Can deduct 10 pencils from inventory:

int quantity = item.getQuantity();
item.setQuantity(quantity-10);

 

Now here’s the question: are we done, or do we need to check this back into inventory?

Answer: we’re done, because item is a reference to the very object held in the Map, not a copy of it.
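
To check this claim, here is a self-contained sketch.  The constructor, getQuantity, setQuantity and getName follow the way ItemInfo is used above; their bodies and getBinNumber are assumptions filled in for illustration, and InventoryDemo is a made-up name.

```java
import java.util.Map;
import java.util.TreeMap;

// Filled-in version of ItemInfo; the method bodies are assumed.
class ItemInfo {
    private String name;
    private int quantity;
    private int binNumber;

    ItemInfo(String name, int quantity, int binNumber) {
        this.name = name;
        this.quantity = quantity;
        this.binNumber = binNumber;
    }

    String getName() { return name; }
    int getQuantity() { return quantity; }
    int getBinNumber() { return binNumber; }
    void setQuantity(int quantity) { this.quantity = quantity; }
}

class InventoryDemo {
    // Deducts amount from the pencil count, then looks the item up
    // again from the map, without any second put.
    static int quantityAfterDeduct(int amount) {
        Map<String, ItemInfo> inventory = new TreeMap<String, ItemInfo>();
        inventory.put("pencil", new ItemInfo("pencil", 120, 42));

        ItemInfo item = inventory.get("pencil"); // a reference, not a copy
        item.setQuantity(item.getQuantity() - amount);

        return inventory.get("pencil").getQuantity();
    }

    public static void main(String[] args) {
        System.out.println(quantityAfterDeduct(10)); // 110
    }
}
```

The fresh lookup sees 110, so the change made through item is already "in" the Map.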

 

We can find all the items in the inventory… next time