Class 4.

 

Homework is due soon—questions?

 

 

Order of precedence of operators (Fig. 2.7, pg 53)

 

Precedence         Operator                              Symbols

Highest                 PROJECT                            R[ ]

                               SELECT                               R where C                       

                               TIMES                                   X                                        

                               JOIN, DIVIDEBY                 , ¸                                  

                               INTERSECTION                 Ç                                        

Lowest                   UNION, MINUS                   È, -                                     

                                                                          

You should be able to interpret expressions that use these rules, but you can use parentheses yourself if you're in doubt.

 

Examples:

CUSTOMERS where cname = ‘TipTop’[cid]  not good, should be

 

(CUSTOMERS where cname = ‘TipTop’)[cid] to make [cid] operate on a relation

 

CUSTOMERS X PRODUCTS[pname]

This is OK, has rows like (cid,cname,city,discnt,pname), not (pname)

 

CUSTOMERS X PRODUCTS where pname < ‘New’

 

CUSTOMERS where cname <’B’ and discnt < 10

 

precedence in conditions...

 

Display on board R ´ S (Cartesian Product) (remember qualified names) for R & S below.

 

                                    R                                 S

A

B

 

B

C

a1

b1

 

b1

c1

a1

b2

 

b2

c2

a2

b1

 

b3

c3

 

But now show what it is for R JOIN S, R  S.  Leave following on board.

 

(1) JOIN.  Start by taking R ´ S.

(2) Cross out rows where equivalently named attributes with qualifiers R or S in R ´ S do not have equal values (do it above, rewrite smaller table).

(3) Do away with one column with duplicate name;  drop qualification names (don't need, since now all columns have different names)

 

Draw lines between matching rows in R & S. Note b3 in table S has no matching row in table R, does not appear in join.  Write out R  S.

 

Example:  Look at: CUSTOMERS  ORDERS (pg. 55).  For each row in orders, exactly one in customers with same cid value (only common name).  Draw one row.  extends orders row with more information about customer. 

 

Most Relops defined now.  Things can get kind of complex in forming rela­tional algebra expressions to answer queries; take things one step at a time.

 

Example:  Give the aid values of agents who take orders in dollar volume greater than 500 for customers living in Kyoto.

 

In following, assume have:  C := CUSTOMERS, A := AGENTS, P := PRODUCTS,

O := ORDERS.

 

Note will need info from ORDERS and from CUSTOMERS, since don't know where customers live in ORDERS, don't know anything about orders in CUS­TOMERS.  Join ORDERS and CUSTOMERS, and have necessary info.

 

    ORDERS  CUSTOMERS  or O  C

 

But now need to select, then project.

 

    ((O  C) where city = 'Kyoto' and dollars > 500.00) [aid]

 

Would the following be OK?

 

    O  C where city = 'Kyoto' and dollars > 500.00 [aid]

 

See precedence of operations table, pg. 53.  Strongest binding is projec­tion: [aid] would try to bind to C (no such column).  What about

 

    (O  C) where city = 'Kyoto' and dollars > 500.00 [aid]

 

If projection happened first, then where clause wouldn't work. Instead:

 

    ((O  C) where city = 'Kyoto' and dollars > 500.00) [aid]

 

(Slow.)  Note could have done this differently)

 

  ((C where city = 'Kyoto')  (O where dollars > 500.00))[aid]

 

Example.  Give all (cname, aname) pairs where the customer places an or­der through the agent.  Need to involve three tables.  Does this work?

 

    (C  O  A) [cname, aname]

 

Look at pg. 28.  No, but why not? (Slow:  look at definition of Join).  OK that haven't defined double join:  turns out that (R  S)  T = R  (S  T), so can define R  S  T as either one of these. Also COMMUTES, so

(R  S)  T = (S  R)  T

 

Problem is:  build up attribute names going left to right.  Results in all (cname, aname) pairs where the customer places an order through the agent and the customer and agent are in the same city.  To leave that out, need:

 

    ((C[cid, cname]  O)  A) [cname, aname]

 

Cuts out city in first join.  Why need cid AND cname attributes for C?

 

Example:  give (cid, cid') pairs of customers who live in the same city.

Look at pg. 28.  What do we want?  (c001, c004) and (c002, c003).  Will the following do this? (Assume K := CUSTOMERS)

 

    (C  K) [C.cid, K.cid]

 

No.  No qualified names are defined to be in join result (although allowed in Relational product). In fact C  K (all same attribute names) looks like what?  (By def of Join, all cols C & K equal , so C Ç K.)  Don't always use Join, try:

 

    ((C ´ K)  where C.city = K.city)[C.cid, K.cid]

 

Look at pg. 28.  Draw multiplication table, 5 rows (C cid values), 5 columns (K cid values), put X in when equal city.  Problem, will get (c002, c002).  Will also get both (c001, c004) and (c004, c001). To solve this, write:

 

    ((C ´ K)  where C.city = K.city and C.cid < K.cid)[C.cid, K.cid]

 

Like choosing lower triangle below the diagonal. You'e responsible for tricks like these on homework, exams.

 

Now, Relational Division.  Define it in terms of Cartesian product.  The idea is, as with integers, given two integers X and Y, define Z = X/Y, so that Z*Y = X.  Can't always quite do this, even with integers, so instead define Z to be the largest integer such that Z*Y ≤ X:  e.g., if X = 17 and        Y = 5, then 3.4 = 17/5 and Z = 3.  Same with table division.

 

 

Definition 2.7.5.  Division.  Consider two tables R and S, where the schema of S is a subset of the schema of R.  Tables R and S:

 

    Head(R) = A1 . . .An B1 . . .Bm,  and  Head(S) = B1 . . .Bm.

(No common attribute names between Ai and Bj.)

 

The table T is the result of the division   R ÷ S (which is read as "R DI­VIDEBY S") if Head(T) = A1 . . .An  and T consists  of those rows t such that for every  row s in S, the row resulting from concatenating t and s can be found in table R, and there is no larger set of rows for which this is true.ð

 

Example 2.7.15.  (In book) Start with the table R given by:

                                                                R

A

B

C

a1

b1

c1

a2

b1

c1

a1

b2

c1

a1

b2

c2

a2

b1

c2

a1

b2

c3

a1

b2

c4

a1

b1

c5

 

We list a number of possible tables S and the resulting table T := R ÷ S.

                                                         S                    T

C

 

 

A

B

c1

 

 

a1

b1

 

 

 

a2

b1

 

 

 

a1

b2

 

Note that all of the rows in S ´ T are in R, and there couldn't be any larger a set of rows in T for which that is true because there are only three rows in R with a c1 value in column C.

                                                        S                      T

C

 

 

A

B

c1

 

 

a1

b2

c2

 

 

a2

b1

 

All of the rows in S ´ T are in R, and there couldn't be any larger a set of rows in T with this true:  look at rows in R with C values c2.