CS680 Object Identity and equals/hashCode
The idea of object identity is crucial to object-oriented programming
and design. Each object has a unique identifier that follows it around
through its lifetime, even through changes to its attributes.
In memory-based objects, the identifier can be its reference value,
which is conceptually its address in the JVM address space (the JVM
may relocate an object, but its reference still identifies the same
object), a very lightweight unique id that can live as long as the
enclosing program execution. There can be multiple program variables
with the same reference value (aliasing). If x and y both refer to the
same object, then x == y.
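As a minimal sketch of aliasing, the following uses a hypothetical Account class (not part of these notes) to show two variables holding the same reference value, and a third variable holding a different object with the same attribute value.

```java
class AliasingDemo {
    // Hypothetical class, used only for illustration.
    static class Account {
        int balance;
        Account(int balance) { this.balance = balance; }
    }

    // True when both variables hold the same reference value.
    static boolean sameObject(Object x, Object y) {
        return x == y;
    }

    public static void main(String[] args) {
        Account x = new Account(100);
        Account y = x;                 // y is an alias of x
        Account z = new Account(100);  // different object, same attribute value

        y.balance = 250;               // a change made through y is seen through x
        System.out.println(sameObject(x, y));  // true
        System.out.println(sameObject(x, z));  // false
        System.out.println(x.balance);         // 250
    }
}
```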
Later, when we study persistent objects, we will switch to unique ids
that are held in fields (key fields), but it's the same basic idea.
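equals and hashCode have default implementations in Object based on
reference values. For o and x of type Object, o.equals(x) is true if
and only if o == x. o.hashCode() returns an identity-based hash value,
often described as a hash of the object's address, although the JVM is
free to compute it another way. It does not depend on the object's
attributes.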
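The default behavior can be checked directly. This is a small sketch using a hypothetical Customer class that overrides neither method. System.identityHashCode is the JDK call that returns what Object's own hashCode would return.

```java
class DefaultEqualsDemo {
    // Hypothetical class that inherits equals and hashCode from Object.
    static class Customer {
        String name;
        Customer(String name) { this.name = name; }
    }

    public static void main(String[] args) {
        Customer a = new Customer("Ann");
        Customer b = new Customer("Ann");
        Customer alias = a;

        System.out.println(a.equals(alias));  // true: same reference value
        System.out.println(a.equals(b));      // false: different objects

        int before = a.hashCode();
        a.name = "Anne";                      // attributes play no part in the hash
        System.out.println(before == a.hashCode());                      // true
        System.out.println(a.hashCode() == System.identityHashCode(a));  // true
    }
}
```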
For example, if a class Customer does not implement equals and
hashCode, its objects will be compared by reference value and Object's
hash function applied to its reference will be used for its hashCode.
We can implement Set<Customer> with a HashSet, and the resulting
set will have all-different ref values, that is, all different Customer
objects. If we change an attribute of one of these Customers, the set
is still the same: it just contains the modified Customer object.
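Here is a sketch of that setup, again with a hypothetical Customer class that relies on Object's equals and hashCode. It shows that changing an attribute leaves the set intact.

```java
import java.util.HashSet;
import java.util.Set;

class RefBasedSetDemo {
    // Hypothetical Customer that does not override equals or hashCode.
    static class Customer {
        String name;
        Customer(String name) { this.name = name; }
    }

    public static void main(String[] args) {
        Set<Customer> customers = new HashSet<>();
        Customer ann = new Customer("Ann");
        Customer bob = new Customer("Bob");
        customers.add(ann);
        customers.add(bob);
        customers.add(ann);                  // same reference: not added again
        System.out.println(customers.size());         // 2

        ann.name = "Ann Smith";              // modify an element already in the set
        System.out.println(customers.contains(ann));  // true: still found
        System.out.println(customers.size());         // 2
    }
}
```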
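This simple setup is used for most app objects. Note that
TreeSet<X> is not usually used for such objects because they have no
natural ordering: a TreeSet needs X to implement Comparable, or a
Comparator to be supplied, and the only ordering available from
reference identity alone would be by hashCode, an arbitrary ordering.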
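One way to see why TreeSet is awkward here: a class with no ordering of its own cannot be placed in a TreeSet unless a Comparator is supplied. The sketch below uses a hypothetical Customer class that does not implement Comparable, and catches the resulting exception.

```java
import java.util.Set;
import java.util.TreeSet;

class TreeSetOrderingDemo {
    // Hypothetical ref-based class: no Comparable, and no Comparator is given.
    static class Customer {
        final String name;
        Customer(String name) { this.name = name; }
    }

    // Tries to put two Customers into a TreeSet that has no Comparator.
    // Returns true if the TreeSet rejected them with a ClassCastException.
    static boolean rejectedByTreeSet() {
        Set<Customer> sorted = new TreeSet<>();
        try {
            sorted.add(new Customer("Ann"));
            sorted.add(new Customer("Bob"));
            return false;
        } catch (ClassCastException e) {
            return true;   // Customer cannot be treated as Comparable
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectedByTreeSet());  // true
    }
}
```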
Value Objects
There are also value objects in common use. These have equals based on
equality of all attributes. For example, p and q of type Point2D.Double
are equal (p.equals(q) is true) if their x and y coordinates are equal.
hashCode is likewise computed from both x and y. A set of points,
HashSet<Point2D>, therefore holds points that all have different
positions in 2-space.
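A short sketch of value-based equality with the JDK's own Point2D.Double (the coordinate values are arbitrary):

```java
import java.awt.geom.Point2D;
import java.util.HashSet;
import java.util.Set;

class PointValueDemo {
    public static void main(String[] args) {
        Point2D p = new Point2D.Double(1.0, 2.0);
        Point2D q = new Point2D.Double(1.0, 2.0);

        System.out.println(p == q);                        // false: two objects
        System.out.println(p.equals(q));                   // true: same coordinates
        System.out.println(p.hashCode() == q.hashCode());  // true

        Set<Point2D> points = new HashSet<>();
        points.add(p);
        points.add(q);                             // equal to p, so not added
        points.add(new Point2D.Double(3.0, 4.0));
        System.out.println(points.size());                 // 2
    }
}
```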
The JDK has lots of classes with value-based equality: Integer,
Double, etc., String, Date, Set (even though it's an interface),
Point2D, Rectangle2D.Double, and so on. It also has lots of classes
with ref-based equality: FileWriter, etc., Swing Components, Swing
Events, etc. The Set equality by value means that for sets a and b,
a.equals(b) is true iff a and b have the same number of elements and
each element of a is contained in b (and hence vice versa). See the
Javadoc for Set.equals for the official definition.
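Set equality by value holds across implementations, as this small sketch shows with a HashSet and a TreeSet holding the same elements:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class SetEqualityDemo {
    public static void main(String[] args) {
        Set<Integer> a = new HashSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> b = new TreeSet<>(Arrays.asList(3, 2, 1));
        Set<Integer> c = new HashSet<>(Arrays.asList(1, 2));

        System.out.println(a == b);                        // false: two objects
        System.out.println(a.equals(b));                   // true: same elements
        System.out.println(a.hashCode() == b.hashCode());  // true
        System.out.println(a.equals(c));                   // false: different sizes
    }
}
```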
App classes can be given value-based equality, but you need to be careful:
1. You need to be sure equals and hashCode are consistent, that is, if x.equals(y), then x.hashCode() == y.hashCode(). (The converse is not required: unequal objects may share a hash code.)
2. If the classes are in an inheritance hierarchy, implementing equals properly is tricky. See Core Java v 7 pg. 171.
3. If elements of a Set are value-based, and you change an element that
is in a set, you can break the set. The Set is not notified of changes
to an element, and has placed the element in its own data structure
using the original attribute values.
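As one possible sketch of points 1 and 2, here is a hypothetical Money class (its name and fields are assumptions for illustration). It is final, which sidesteps the inheritance problem rather than solving it, and its hashCode uses exactly the fields that equals compares.

```java
import java.util.Objects;

class ValueEqualityDemo {
    // Hypothetical immutable value class. Declared final so that no subclass
    // can complicate the meaning of equals.
    static final class Money {
        private final String currency;
        private final long cents;

        Money(String currency, long cents) {
            this.currency = currency;
            this.cents = cents;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) return true;
            if (!(other instanceof Money)) return false;  // also handles null
            Money m = (Money) other;
            return cents == m.cents && Objects.equals(currency, m.currency);
        }

        // Built from the same fields as equals, so equal objects
        // always have equal hash codes.
        @Override
        public int hashCode() {
            return Objects.hash(currency, cents);
        }
    }

    public static void main(String[] args) {
        Money a = new Money("USD", 1999);
        Money b = new Money("USD", 1999);
        Money c = new Money("USD", 2000);
        System.out.println(a.equals(b));                   // true
        System.out.println(a.hashCode() == b.hashCode());  // true
        System.out.println(a.equals(c));                   // false
    }
}
```

Because the fields are final, objects of this class also avoid the problem in point 3.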
Note that point 3 is not relevant to immutable objects. Integer,
Double, and String are immutable, so Sets of these are completely
sturdy. Point2D.Double and Date and Set are mutable, so you have to be
a little careful with Sets of these.
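The breakage is easy to reproduce with a mutable Point2D.Double. In this sketch (coordinate values chosen arbitrarily), the point is changed after it has been added, and the HashSet can no longer find it even though it still holds it.

```java
import java.awt.geom.Point2D;
import java.util.HashSet;
import java.util.Set;

class MutatedElementDemo {
    public static void main(String[] args) {
        Set<Point2D> points = new HashSet<>();
        Point2D.Double p = new Point2D.Double(1.0, 2.0);
        points.add(p);              // stored using the hash of (1.0, 2.0)

        p.setLocation(5.0, 6.0);    // the set is not told; p's hashCode changes

        System.out.println(points.size());                  // 1: p is still in there
        System.out.println(points.contains(p));             // false: lookup uses the new hash
        // The old coordinates do not find it either, since the stored
        // element no longer equals (1.0, 2.0).
        System.out.println(points.contains(new Point2D.Double(1.0, 2.0)));  // false
        System.out.println(points.iterator().next() == p);  // true: iteration still sees it
    }
}
```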
The upshot of all this is that we usually use ref-based identity and
equality for app objects, unless they are like points or numbers or
enums or other JDK value-based classes. (Later, when using database
persistence, we'll use key-based identity.)
Two ref-based objects (different ref values) can have the same
attributes, and we say they are different objects. This happens quite
often because we often model real world objects without enough
information to discriminate between them. For example, we could have a
Person object with firstName and lastName fields, and end up with two
different people in the system with two different Person objects but
the same attributes.
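A sketch of this situation, using a hypothetical Person class with ref-based equality (the names are made up):

```java
import java.util.HashSet;
import java.util.Set;

class SameAttributesDemo {
    // Hypothetical Person with ref-based equality (nothing overridden).
    static class Person {
        final String firstName;
        final String lastName;
        Person(String firstName, String lastName) {
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    public static void main(String[] args) {
        // Two different people who happen to share a name.
        Person p1 = new Person("Pat", "Lee");
        Person p2 = new Person("Pat", "Lee");

        System.out.println(p1.equals(p2));  // false: different objects

        Set<Person> people = new HashSet<>();
        people.add(p1);
        people.add(p2);
        System.out.println(people.size());  // 2: both are kept
    }
}
```

With value-based equality on firstName and lastName, the second person would have been silently dropped from the set.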
Also, although we usually keep only one object instance around, there
are cases when we duplicate an object for some particular purpose so as
to keep the original one untouched. Suppose we are doing what-if
analysis. We can duplicate the original object and do the what-if
processing on the copy, then throw it away.
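One simple way to do this is with a copy constructor. The sketch below assumes a hypothetical Loan class with amounts in cents and a rate in basis points. The class and its fields are illustrative only.

```java
class WhatIfDemo {
    // Hypothetical app class with ref-based identity and a copy constructor.
    static class Loan {
        long principalCents;
        int rateBasisPoints;   // 500 basis points = 5%

        Loan(long principalCents, int rateBasisPoints) {
            this.principalCents = principalCents;
            this.rateBasisPoints = rateBasisPoints;
        }

        // Copy constructor: a new object with the same attribute values.
        Loan(Loan other) {
            this(other.principalCents, other.rateBasisPoints);
        }

        long yearlyInterestCents() {
            return principalCents * rateBasisPoints / 10_000;
        }
    }

    public static void main(String[] args) {
        Loan original = new Loan(1_000_000, 500);
        Loan scenario = new Loan(original);   // work on the copy
        scenario.rateBasisPoints = 700;       // what if the rate rises to 7%?

        System.out.println(scenario.yearlyInterestCents());  // 70000
        System.out.println(original.yearlyInterestCents());  // 50000: untouched
        System.out.println(original == scenario);            // false: separate objects
        // scenario is simply dropped when the analysis is done.
    }
}
```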