Class 16

hw2 solutions are online

 

Note additions to class 15 notes for Java case:

How to get SQLSTATE and sqlcode from a SQLException object

How to update a column value to null using a ResultSet: rset.updateNull(colname);

 

Midterm Review

Mostly about queries:

·   Relational Algebra queries

·   Basic SQL queries

·   Advanced SQL queries

 

--writing queries given English description

--understanding given queries, expressing them in English

--determining output of queries

--rewriting queries sometimes

 

Other topics: insert/update/delete, null handling, understanding syntax elements, proofs for CS630 only, anything on homework.

 

Syntax elements example:

 

In the query

select cid, cname from customers

   where 2* discnt < (select max(discnt) from customers) and upper(cname) like ‘F%’;

 

1.    give the select list(s):    cid, cname for outer query, max(discnt) for subquery

2.    give the predicate(s), and their classification (Fig. 3.19):   one quantified predicate “2*discnt < (select max(discnt) from customers)”, and one comparison predicate upper(cname) like ‘F%’

3.    give the search_condition(s):  “2* discnt < (select max(discnt) from customers) and cname like ‘F%’ “            

Note that a search_condition can also show up in a HAVING clause.                                                                            

4.    give the subquery/ies:  “select max(discnt) from customers”, a scalar subquery

5.    give the expressions: 2*discnt, upper(cname), ‘F%’, max(discnt), cid, cname

 

Is this Basic SQL?

Yes. Note that Figs. 3.16a and 3.16b for expression syntax has 7 forms of Basic SQL and then 3 Advanced forms.  (scalar_subquery) as an expression is advanced, but we are separately allowed to use scalar subqueries in predicate forms of Fig. 3.19, and this predicate fits the form for quantified predicate there.

 

Fix to Fig.3.19: quantified predicate should have [ALL|SOME|ANY]

 

If we rewrite the first predicate as

 

discnt < 0.5 *(select max(discnt) from customers),

 

it’s now an Advanced comparison predicate, not a quantified predicate, i.e., it fits the form

     expr op expr

with the second expr using (subquery) by form 8 of Fig. 3.16b.

 

Also if we turn the predicate around, (subq) > 2*discnt, it becomes an Advanced comparison predicate.

 

Consider

select cid, discnt, (select avg(discnt) from customers where cid <> c.cid)

    from customers c;

 

Here we have a scalar subquery in the select list—is that OK for Basic or Advanced SQL?

See pg. 135 for select list syntax: it’s a list of expr’s.  As an expr, the scalar subquery qualifies only as Advanced SQL.

 

 

Queries

Example. Get names of agents ordering products priced over $1.00.

 

Need join of orders to agents and also to products.

RA: watch out, need to avoid join on city here.

    (A[aid, aname] JOIN O JOIN P where price > 1.00)[aname]

 

Basic SQL:

    select aname from agents a, orders o, products p

        where a.aid = o.aid and o.pid = p.pid and price > 1.00

 

Advanced SQL:

select aname from agents a join orders o on a.aid = o.aid

    join products p on o.pid = p.pid where price > 1.00

 

Example. Get names of agents ordering all products priced over 1.00.

RA: ((O[aid,pid] DIVIDEBY (P where price > 1.00))[pid] JOIN A)[aname]

 

SQL: a FORALL query...  Fix on one agent to be output, with certain a.aid.

Counterexample: a product priced over 1.00 with no order by a.aid

all counterexamples for given a.aid:

          select * from products p where price > 1.00 and not exists

                   (select * from orders x where x.pid = p.pid and x.aid = a.aid)

 

lack of counterex’s:

            select aid from agents a where not exists (...above)

 

Is there any way to do this without 2 levels of subquery?

Yes, by counting cases in having:

# products priced over 1.00: (select count(*) from products where price > 1.00)

# products ordered by a.aid priced over 1.00: group orders (where price > 1.00) by aid, look at count(distinct pid)

 

         select aname from orders o, products p, agents a

             where o.aid = a.aid and o.pid = p.pid and p.price > 1.00

         group by a.aid, a.aname

         having count(distinct o.pid) = (select count(*) from products where price > 1.00)

     

 

Example Find cids of customers who have put in orders for 2 or more products.

RA: can’t count, but can reason with pairs of (cid, pid) in orders:

(cid1, pid1) and (cid2, pid2): want cid1=cid2 but pid1 <> pid2

 

Need to rename orders and the pid columns to do a cross product or join:

O1[cid1, pid1] := O[cid,pid]

O2[cid2,pid2] := O[cid,pid]

(O1 x O2 where cid1 = cid2 and pid1 <> pid2) [cid1]

or keep same names on cid’s and do a join

 

SQL: can follow RA idea, or use pid counts of cid-grouped orders:

      select cid from orders

        group by cid

        having count(distinct pid) > 1;

 

Queries with max/min: no set functions in RA, so concentrate on SQL.

 

Remember that quantified predicates can do a loop of tests, so a query with “max” or “min” in it may not end up with a max() in the SQL.

 

Ex 3.4.7, pg. 101: find aids of agents with min commission.

   select aid from agents

     where percent <= all (select percent from agents)

 

Example. Find orders with max qty for its product:

max(qty) being considered:

   select pid, max(qty) from orders

     group by pid

 

select ordno from orders

   where (pid, qty) in (select pid, max(qty) from orders group by pid)

 

or

select ordno from orders o

  where qty = (select max(qty) from orders x

                           where x.pid = o.pid)                 (a correlated subquery)

 

or even

select ordno from orders o

  where not exists (select * from orders x

                               where x.pid = o.pid and x.qty > o.qty)

                a FORALL condition

 

gives us a hint that to do this in RA, need division—if interested, see last example in class 5 notes.

 

 

Example. Find customer(s) dealing with the most agents.

think most<--> max count, watch out, two set functions! Quantified predicate to the rescue...

Here the counts involved: aids for certain cids:

select count(distinct aid) from orders

  group by cid

 

we want cids with the max such count. The counts are available to the HAVING clause

 

select cid from orders o

group by cid

having count(distinct aid) >= all (select count(distinct aid) from orders group by cid)    

 

Note that this is an uncorrelated subquery.

 

Given Advanced SQL, we can use max: replace >= all with

= (select max(ct) from (select count(distinct aid) ct from orders group by cid)  t )