Class 12. hw2 due Sunday. hw3 following soon.

Check on accounts, etc.

Expressions, predicates and search_conditions.

Start with expres­sions: numeric value expressions, string value expression, [won't cover: datetime ex­pressions, interval value expressions, and conditional expressions].

 

   expr = numexpr | strvexpr

 

numexpr, arithmetic stuff, Figure 3.16a, pg 137.  Defined very much like a programming language expression, except allow columnname (like vari­able) and set_function(numexpr).

 

A numexpr can be a constant, or colname, or qualifier.colname or (recursively):  numexpr arith_op numexpr, (numexpr), function(numexpr), set_function(numexpr)

 

Similarly for strvexpr, Figure 3.16b, pg 139.  Concatenate two strings with a ||.

 

Arith functions in ORACLE, DB2 UDB Figure 3.17, pg. 139.  abs(n), mod(n,b), sqrt(n); NOT standardized.  String handling functions ARE standardized, Figure 3.18, pg 140: See them there.

 

Note that companies sometimes have unusual capitalization:  is it TipTop or Tiptop?  It is common to store only uppercase, retrieve in that form.

 

If sometimes mixed case is important, store column cname with Upper/lower, re­trieve asking for upper(cname) = 'TIPTOP' or lower(cname) = ‘tiptop’.

 

Now, Predicates, Fig 3.19, pg 141 (used in Where clause and some in Having clause):  Evaluate to TRUE, FALSE, or UNKNOWN

 

    comparison predicate:    expr1 Q expr2 | expr1 Q (Subquery)

    between predicate:                      c.discnt [not] between 10 and 12

    in predicate:                                  expr [not] in (subquery)

    Quantified predicate:                  expr Q[all|any|some] (subquery)

    exists predicate;               [not] exists(subquery)

    is null predicate:                          column is [not] null

    like predicate                                columname [not] like 'pattern'

 

A Scalar Subquery in Advanced SQL is a subquery giving a single value. See pg. 141, bottom half. Can use in target list, comparison in Where clause.

 

Now a search_condition (occurs in where and HAVING clauses) is a simple predicate or recursively:

 

    (search_condition)                      (o.pid = 'p01')

    not search_condition                  not exists(select * . . .)

    s_c and s_c                                  not(o.pid = 'p01') and o.cid = 'c001'

    s_c or s_c

 

 

We are going through a list of Predicates allowed for SQL.  Subqueries next.  Read these carefully, will ask Exam Questions about this.

 

Comparison predicate:   e.g., expr1  <  (expr2 | subquery)

 

Subquery must return at most ONE value (maybe ZERO); otherwise must use quan­tified predicate.  If empty set, result of comparison is U, UNKNOWN. If the one value itself is null, the result of the comparison is U. Only if the one value is non-null can the predicate evaluate to T or F.

 

Question.  In comparison predicate with subquery is it always necessary that the expression value be on the left and subquery on the right?  Yes, to qualify as Basic SQL. On the left is allowed in Advanced SQL.

 

Can we have more than one subquery predicate in a search condition for a where clause?   For example “x not in (subquery) and y <some (subquery)”.  Answer:  No problem, as long as the individual predicates qualify.

 

Note that for not equal, you should use (<>) — this is the most standard; others might use (!=) or (^=) on some database systems.

 

TRUTH VALUES:  T, F, and U

 

A row is ruled out in a WHERE clause (pg 144) if the search_condition is not T, i.e., if it is F or U.  However, U is not equivalent to F.

 

Look at AND, OR, NOT operator definitions on pg 151, Fig. 3.18.

   

AND

T

F

U

 

OR

T

F

U

 

NOT

 

T

T

F

U

 

T

T

T

T

 

T

F

F

F

F

F

 

F

T

F

U

 

F

T

U

U

F

U

 

U

T

U

U

 

U

U

 

U just acts like there's doubt, might be T or F, and remains U in result if doubt remains.  But of course F  AND  U is not in doubt:  it results in F, since this is the case whether U is T or U is F.

 

But now why do we need U?  Why can't we get along with T and F? 

 

Recall our previous example with a customers table with a null-discnt row added.

 

    select cname from customers where discnt > 8;

 

retrieves only rows with non-null discnts, to make the predicate T, not U or F.

 

   select cname from customers where not(discnt > 8);

 

also retrieves only rows with non-null discnts, to make the predicate T, not U or F.

 

So the row with null discnt is output in neither of these queries.

 

We can retrieve from such rows with the “is null” predicate:

 

   select cname from customers where discnt is null;

 

There is no problem displaying nulls:

   select * from customers;

 

displays the null-discnt row along with the others. The behavior of ignoring a row happens only where the null value is involved in a search condition in a where clause or having clause.

 

We can replace discnt in discnt > 8 by a subquery to check out subquery behavior with nulls in search conditions (subquery on right of < to qualify as Basic SQL):

 

    select cname from customers c

       where 8 < (select discnt from customers where cid = c.cid);

 

Clearly this is a stupid query, but we are just exercising subqueries here.

 

·   Here, for the rows with non-null discnts, the discnt is retrieved and compared to 8 the same way as in the earlier simpler version.

 

·   For the row with null discnt, a null is returned by the subquery, so the predicate has value U, and the row is not output, as in the simple query.

 

To see the case that the subquery returns no rows, let’s add “discnt is not null” to the subquery:

 

    select cname from customers c

       where 8 < (select discnt from customers where cid = c.cid and discnt is not null);

 

·   Here, for the rows with non-null discnts, the discnt is retrieved and compared to 8 the same way as in the earlier simpler version.

 

·   For the row with null discnt, no rows are returned by the subquery, so the predicate has value U, and the row is not output, as in the simple query.

 

The Between Predicate: expr1 [not] between expr2 and expr 3

 

Conceptually equivalent to (without not): 

 

    expr1 >= expr2 and expr1 <= expr3

 

No good reason for existing;  originally just more efficient to evaluate (didn't trust query optimizer to notice that two comparison predicates on a single (col >= c1 and col <= c2), possibly separated by other predicates, implied a range between).

 

 

Quantified Predicate.

 

    expr  q[some | all | any] (subquery) [Don't use "any": deprecated!]

 

Seen before.  Only tricky part is that expr: qall (subquery) is TRUE if sub­query is empty, and expr qsome (subquery) (equivalently qany (subquery)) is FALSE if subquery is empty.

 

Idea is that qall is TRUE if there is no COUNTER-EXAMPLE, but for qsome to be true, need at least one EXAMPLE.

 

Ex. 3.9.3.  Retrieve maximum discount of all customers.

 

    select discnt from customers c where discnt >=all

         (select discnt from customers d where d.cid <> c.cid);

 

Never mind if you'd do it that way.  It's valid in most cases.  But if there were only a single customer discnt?  Empty set.  Yes, still OK.  But only if >=all is TRUE for empty set in subquery, so that's why we need this rule!

 

Note too, if used <some instead of >=all, the reverse of the above, it would come out FALSE for single (maximum) customer discnt; reverse of above.

 

Skip In predicate (identical to =any), Exists predicate.

 

Is null Predicate:     columname is [not] null

 

NOT valid to say:  columname = null or <> null.  SQL won't get it.

 

Like predicate: columname [not] like 'pattern'

 

Build up patterns using normal characters and wildcards.  Like UNIX file wildcards:   ls  *.c  (* stands for any string). In DOS, “dir *.java” works similarly.

 

    Underscore (_)                 any single character

    Percent (%)                       zero or more characters of any form

    All other chars                  represent themselves

 

e.g.:  select * from customers c where city like 'N_w%k';

 

'New York' and 'Newark' are selected, but not 'Nome' or 'Novow Sibirsk'.

 

To search for strings that contain % or _, we can use the longer form of LIKE:

 

Long-form Like predicate: columname [not] like 'pattern' escape ‘escape-char’

 

Older forms, predating the ESCAPE clause:

 

In Oracle, if use % or _ twice in a row, it means REALLY '%' or "_'.  Thus,

 

    select * from customers where city like 'New_ _%';

 

Will retrieve New_Rochelle, or New_Caledonia, but not New York.

 

From Oracle 10.1 SQL manual:

You can include the actual characters % or _ in the pattern by using the ESCAPE clause, which identifies the escape character. If the escape character precedes the character % or _ in the pattern, then Oracle interprets this character literally in the pattern rather than as a special pattern-matching character. You can also search for the escape character itself by repeating it. For example, if @ is the escape character, then you can use @@ to search for @.

In DB2, '+%' means REALLY the character %.  To type REALLY + , need to type '++'.