Class 16
hw2 solutions are online
Note additions to class 15 notes for Java case:
How to get SQLSTATE and sqlcode from a SQLException object
How to update a column value to null using a ResultSet: rset.updateNull(colname);
Midterm Review
Mostly about queries:
· Relational Algebra queries
· Basic SQL queries
· Advanced SQL queries
--writing queries given English description
--understanding given queries, expressing them in English
--determining output of queries
--rewriting queries sometimes
Other topics: insert/update/delete, null handling, understanding syntax elements, proofs for CS630 only, anything on homework.
Syntax elements example:
In the query
select cid, cname from customers
where 2* discnt < (select max(discnt) from customers) and upper(cname) like ‘F%’;
1. give the select list(s): cid, cname for outer query, max(discnt) for subquery
2. give the predicate(s), and their classification (Fig. 3.19): one quantified predicate “2*discnt < (select max(discnt) from customers)”, and one comparison predicate upper(cname) like ‘F%’
3. give the search_condition(s): “2* discnt < (select max(discnt) from customers) and cname like ‘F%’ “
Note that a search_condition can also show up in a HAVING clause.
4. give the subquery/ies: “select max(discnt) from customers”, a scalar subquery
5. give the expressions: 2*discnt, upper(cname), ‘F%’, max(discnt), cid, cname
Is this Basic SQL?
Yes. Note that Figs. 3.16a and 3.16b for expression syntax has 7 forms of Basic SQL and then 3 Advanced forms. (scalar_subquery) as an expression is advanced, but we are separately allowed to use scalar subqueries in predicate forms of Fig. 3.19, and this predicate fits the form for quantified predicate there.
Fix to Fig.3.19: quantified predicate should have [ALL|SOME|ANY]
If we rewrite the first predicate as
discnt < 0.5 *(select max(discnt) from customers),
it’s now an Advanced comparison predicate, not a quantified predicate, i.e., it fits the form
expr op expr
with the second expr using (subquery) by form 8 of Fig. 3.16b.
Also if we turn the predicate around, (subq) > 2*discnt, it becomes an Advanced comparison predicate.
Consider
select cid, discnt, (select avg(discnt) from customers where cid <> c.cid)
from customers c;
Here we have a scalar subquery in the select list—is that OK for Basic or Advanced SQL?
See pg. 135 for select list syntax: it’s a list of expr’s. As an expr, the scalar subquery qualifies only as Advanced SQL.
Queries
Example. Get names of agents ordering products priced over $1.00.
Need join of orders to agents and also to products.
RA: watch out, need to avoid join on city here.
(A[aid, aname] JOIN O JOIN P where price > 1.00)[aname]
Basic SQL:
select aname from agents a, orders o, products p
where a.aid = o.aid and o.pid = p.pid and price > 1.00
Advanced SQL:
select aname from agents a join orders o on a.aid = o.aid
join products p on o.pid = p.pid where price > 1.00
Example. Get names of agents ordering all products priced over 1.00.
RA: ((O[aid,pid] DIVIDEBY (P where price > 1.00))[pid] JOIN A)[aname]
SQL: a FORALL query... Fix on one agent to be output, with certain a.aid.
Counterexample: a product priced over 1.00 with no order by a.aid
all counterexamples for given a.aid:
select * from products p where price > 1.00 and not exists
(select * from orders x where x.pid = p.pid and x.aid = a.aid)
lack of counterex’s:
select aid from agents a where not exists (...above)
Is there any way to do this without 2 levels of subquery?
Yes, by counting cases in having:
# products priced over 1.00: (select count(*) from products where price > 1.00)
# products ordered by a.aid priced over 1.00: group orders (where price > 1.00) by aid, look at count(distinct pid)
select aname from orders o, products p, agents a
where o.aid = a.aid and o.pid = p.pid and p.price > 1.00
group by a.aid, a.aname
having count(distinct o.pid) = (select count(*) from products where price > 1.00)
Example Find cids of customers who have put in orders for 2 or more products.
RA: can’t count, but can reason with pairs of (cid, pid) in orders:
(cid1, pid1) and (cid2, pid2): want cid1=cid2 but pid1 <> pid2
Need to rename orders and the pid columns to do a cross product or join:
O1[cid1, pid1] := O[cid,pid]
O2[cid2,pid2] := O[cid,pid]
(O1 x O2 where cid1 = cid2 and pid1 <> pid2) [cid1]
or keep same names on cid’s and do a join
SQL: can follow RA idea, or use pid counts of cid-grouped orders:
select cid from orders
group by cid
having count(distinct pid) > 1;
Queries with max/min: no set functions in RA, so concentrate on SQL.
Remember that quantified predicates can do a loop of tests, so a query with “max” or “min” in it may not end up with a max() in the SQL.
Ex 3.4.7, pg. 101: find aids of agents with min commission.
select aid from agents
where percent <= all (select percent from agents)
Example. Find orders with max qty for its product:
max(qty) being considered:
select pid, max(qty) from orders
group by pid
select ordno from orders
where (pid, qty) in (select pid, max(qty) from orders group by pid)
or
select ordno from orders o
where qty = (select max(qty) from orders x
where x.pid = o.pid) (a correlated subquery)
or even
select ordno from orders o
where not exists (select * from orders x
where x.pid = o.pid and x.qty > o.qty)
a FORALL condition
gives us a hint that to do this in RA, need division—if interested, see last example in class 5 notes.
Example. Find customer(s) dealing with the most agents.
think most<--> max count, watch out, two set functions! Quantified predicate to the rescue...
Here the counts involved: aids for certain cids:
select count(distinct aid) from orders
group by cid
we want cids with the max such count. The counts are available to the HAVING clause
select cid from orders o
group by cid
having count(distinct aid) >= all (select count(distinct aid) from orders group by cid)
Note that this is an uncorrelated subquery.
Given Advanced SQL, we can use max: replace >= all with
= (select max(ct) from (select count(distinct aid) ct from orders group by cid) t )