Class 7.

 

Example 3.3.7.  Find pids of products ordered by at least two customers.

 

NOT EASY!  How to do?  Think of idea of two range variables alive at same time in one table.  Picture in orders tables.  Say pid the same and cid dif­ferent for two.

 

    select [distinct?]  x1.pid from orders x1, orders x2

         where x1.pid = x2.pid and x1.cid <> x2.cid;

 

Need distinct?  Yes.  Picture Cartesian product, pid p01 once with three different cid values, c001, c004, c006.  What happens?  There are three pairs of rows that x1, x2 can fall on:  (c001, c004), (c001, c006), and (c004, c006), AND the REVERSE!

 

It's a better idea to write x1.cid < x2.cid above, instead of x1.cid <> x2.cid, but would STILL NEED to use DISTINCT keyword for the Select list.

 

Example 3.3.8.  Get cids of customers ordering a product for which an or­der is placed by agent a06.  Reminiscent of an example we had in rel alg.  Note that a cid retrieved doesn't necessarily order any product through a06.

                                  /-----  Customers                     CREATE PICTURE OF RANGE

         /-- Products       ---  who                                  VARIABLES AND HOW TO

    a06 ---ordered by  ---  place orders                    SOLVE THIS THINKING IN

         \--agent a06      ---  for those              TERMS OF QBE;  THAT

                                 \------  products             TRANSLATES DIRECTLY

                                                                                                TO SQL STATEMENT

TWO connections through orders relation, so need two aliases.  First prod­ucts ordered by a06 (first cloud above):

 

    select [distinct?] x.pid from orders x where x.aid = 'a06';

 

Would need distinct if wanted unique x.pid values.  Next, customers who order those products:

 

    select distinct y.cid from orders x, orders y        NESTED LOOP OF

         where x.aid = 'a06' and y.pid = x.pid;  (SLOW)   RANGE VARIABLES

 

(PICTURE) x.pid reports back the first cloud above, and set y.pid = x.pid to get second cloud y.cid.

 

But an even MORE straightforward way to do that is to use a subquery (Section 3.4):

 

    select distinct y.cid from orders y where y.pid in

         (select x.pid from orders x where x.aid = 'a06');

 

The "subquery" retrieves the first cloud, and then the outer select re­trieves all y.cid where y.pid is "in" the first cloud.

 

The condition "in" is an operator expecting a single value on the left and a "set" of values on the right (from a subquery ot list);  it's true when the value is one of the set of values on the right.

 

Of course the outer Select retrieves all rows where the condition is true.

 

For the first time, we are using some recursion based on the idea that a select retrieves a table, and now can operate on that table as a set of rows (or in the case just covered, a set of values).

 

NOTE:  For any type of query we have learned so far, if we can do it with join we can do with subquery and vice-versa.  Subquerys are supposed to give a slightly more natural approach to some queries.  BUT you should be able to translate Subqueries to joins in most cases.

 

Example 3.4.1.  Retrieve cids of customers who place orders with agents in Los Angeles or Dallas.  First find all agents in LA or Dallas, and make that a subquery:

 

    select distinct cid from orders where aid in (select aid from agents

         where city = 'Los Angeles' or city = 'Dallas');

 

Note don't need to qualify aid in subquery use or outside subquery:  in each case there is only one table that is natural.

 

Conceptually, innermost subquery operates first, returns a set of values, then outer select runs through customer rows and tests whether cid is in this set.  In fact that might NOT be the way it is done by the system (after query optimization).  Equivalent to:

 

    select [distinct?] x.cid from orders x, agents a where

         x.aid = a.aid and (a.city = 'Los Angeles' or a.city = 'Dallas');

 

Why ( )s?  Because of order of operations:  p and q or r is (p and q) or r,

but that isn't what we want;  want p and (q or r).

 

Otherwise, if r were true for any p, q, whole statement true;  then if agents has some agent in Dallas, will get all cids reported (won't need x.aid = a.aid).

 

Skip a few simple examples.  In particular, can construct your own set using a condition such as:  . . . where . . . city in ('Duluth', 'Dallas').  Of course we could also do this using or,

 

All these variant ways of doing things don't necessarily help — just make user uncomfortable about having a complete grasp of the SQL syntax.

 

OK, now idea of correlated subquery.  Recall in 3.4.1 didn't need qualified names because rows had natural home.  But can refer to OUTER range vari­able from INNER subquery.

 

Example 3.4.4.  (Variant)  Find names of customers with discnt >= 10 who order product p05.  To illustrate a point, using subquery:

 

    select cname from customers where discnt >= 10 and

         'p05' in (select pid from orders where cid = customers.cid);

 

Yes, unusual to say where <const> in subquery (looks like a typo), but per­fectly valid and as long as condition is true will retrieve cname.

 

Note that in subquery, the cid without a qualifier has home qualifier of or­ders, while customers.cid harks back to outer select.  As if we wrote:

 

    FOR c FROM ROW 1 TO LAST OF customers

         discard c if c.discnt < 10

         create L as empty list

         FOR x FROM ROW 1 TO LAST OF orders

             Add x.pid to list L if x.cid = c.cid   /* value set outside loop */

       END FOR x

       discard c if 'p05' not in list L

       otherwise place c.cname in ANSWER list

    END FOR c

    print out ANSWER list

 

Note can't precalculate inner loop;  need outer loop to set customers.cid. (This is called a correlated subquery).  Of course, we could change query to:

 

    select cname from customers where discnt >= 10 and

         cid in (select cid from orders where pid = 'p05');

 

Can evaluate this with precalculation of subquery.

 

 

 

 

    create L as empty list

    FOR x FROM ROWS 1 TO LAST OF orders

         Add x.cid to list L if x.pid = 'p05'

    END FOR x

    FOR c FROM ROWS 1 TO LAST OF customers

         if (c.discnt < 10 or if c.cid not in list L)

             discard c and go to next in loop

       else place c.cname in ANSWER list

    END FOR c

    print out ANSWER list

 

OK, can refer to outer variable from inner loop.  But can't do the REVERSE!! 

 

Example 3.4.5.  To get names of customers who order product p07 from agent a03, can't use query:

 

    select cname from customers where orders.aid = 'a03' and

         'p07' in (select pid from orders where cid = customers.cid);

    ***ILLEGAL***

 

The condition orders.aid = 'a03' can't be used outside subquery (doesn't exist yet -- scoping rule)

 

(** new **) Example.   Find cids of customers who do not place any order through agent 'a03'.  Can use not in, true exactly when in is false:

 

    select cid from customers where cid not in (select cid from

         orders where orders.aid = 'a03'); /* example of minus operator */

 

Example 3.4.6.  In older products, IN predicate only works for single val­ues.  Can't have condition:

 

    . . . where (cid, aid) in (select cid, aid from orders . . .

 

This is Basic SQL standard (Note need (,)), and Oracle permits it.

 

Quantified Comparison Predicates.  List:

 

=some          <>some      <=some      >some     <some     >=some

=any              <>any          <=any          >any         <any         >=any

=all <>all      <=all            >all               <all            >=all

 

         See Figure 3.7.  Some, Any, or All predicates

 

If comparison operator µ stands for operator in the set {<, <=. =, <>, >, >=},

       **IMPORTANT -- MEMORIZE**

 

    v µsome (subquery)  is TRUE iff  v µ s for some s in the subquery

    v µany (subquery)  is TRUE iff  v µ s for some s in the subquery

    v µall (subquery) is TRUE iff  v µ s  is true for all s in the subquery.

 

µsome and µany mean same thing:  µsome is easier to remember.

 

Example 3.4.7.  Find aid values of agents with a minimum percent com­mission.  This was HARD in relational algebra.  Now easy.

 

    select aid from agents

       where percent <= all(select percent from agents);

 

Alternatively

select aid from agents

       where percent = (select max(percent) from agents);

 

but we haven’t yet covered the set function “max”, so this is premature.