CS210 Class 13

Tues. Mar 7 Expression Evaluation and pa03

Note: midterm Mar 28

The precedence table from pa3, agreeing with Java and C except for ^, is:

Very high:	^
High:	*, /, %
Medium:	+, -
Low:	&
Lower:	|

Here & is bitwise AND and | is bitwise OR. ^ is exponentiation.

Precedence example: 10+5|3 = (10+5)|3, because + has higher precedence than |.

All these operators are left-to-right associative, except ^, which is right-to-left associative. That means that 3/4/5 means (3/4)/5, but 2^3^4 means 2^(3^4).
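To see the same grouping rules in Java itself, here is a small sketch. The class and method names are made up for illustration. Since Java's ^ is bitwise XOR rather than exponentiation, the right-to-left power example is written with nested Math.pow calls.

```java
// Sketch: how Java groups the operators from the table above.
class PrecedenceDemo {
    // + binds tighter than |, so Java reads this as (10 + 5) | 3.
    static int plusThenOr() {
        return 10 + 5 | 3;
    }

    // / is left-to-right associative, so Java reads this as (12 / 3) / 2.
    static int chainedDivide() {
        return 12 / 3 / 2;
    }

    // Java has no exponent operator (its ^ is XOR), so the calculator's
    // right-to-left 2^3^2 = 2^(3^2) is spelled out with Math.pow.
    static long rightAssocPower() {
        return (long) Math.pow(2, Math.pow(3, 2));
    }

    public static void main(String[] args) {
        System.out.println(plusThenOr());       // 15
        System.out.println(chainedDivide());    // 2
        System.out.println(rightAssocPower());  // 512
    }
}
```

Grouping the other way would give different answers: 10+(5|3) is 17, 12/(3/2) is 12 in integer arithmetic, and (2^3)^2 is 64.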

Infix calculators: expression evaluation like C/Java code

The algorithm to process the input symbol by symbol for an infix calculator is quite a bit harder than the one for the postfix calculator, and needs two stacks: the postfixStack for operands (written prefixStack in the traces below) and the opStack for operators.

Last time we considered 1 + 2 * 3:

Here the numbers are read 1, 2, 3, but the 1 is used last in the calculation. Because of the precedence, we must multiply 2*3 before adding 1. We can save the 1 on the postfixStack until it's ready for calculation, as determined by the operators.

Similarly, we save the + on the opStack until it gets the go-ahead.
On the other hand 2*3 + 1 starts off similarly, but when the + is encountered, the earlier multiplication is given the go-ahead.
For 12/3/2, we want to OK the 12/3 calculation when we see the second /, to get the associativity right.

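So we save things on the stacks until they are given the go-ahead. The go-ahead comes from an incoming operator of lower precedence, from an incoming operator of equal precedence when the operators are left-to-right associative, from a ), or from the end of input.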

The infix calculator algorithm, the pseudocode from pa03:

The shunting algorithm works by scanning through the expression and considering each token in turn:

· If the token is an integer, push it onto postfixStack.

· If the token is a left parenthesis (the ‘(’ symbol), push it onto opStack.

· If the token is a right parenthesis (the ‘)’ symbol), then you are evaluating a sub-expression. Repeat evaluating the top (pop an operator from opStack, pop its two operands from postfixStack, apply the operator, and push the result on postfixStack) until the top of opStack is a left parenthesis. Finally, pop the left parenthesis from the top of opStack.

· If the token is an operator, do the following. While there is an operator on the top of opStack that has higher precedence* than the encountered operator (or has equal precedence and is left-to-right associative), repeat evaluating the top. Finally, push the encountered operator on the opStack.

· When the entire expression has been scanned, repeat evaluating the top until opStack is empty. The value on top of postfixStack is the value of the expression.

* i.e., the input operator has lower precedence than the operator at top of stack (this way of saying it is closer to the code in Weiss.)
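The pseudocode above can be sketched in Java as follows. This is only an illustration, not the pa03 solution and not Weiss's code: the class name InfixCalc and the helpers prec, isLeftAssoc, apply, and evalTop are made up here. It assumes well-formed input with non-negative integer literals, no unary minus, and non-negative exponents.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A sketch of the two-stack pseudocode; all names here are hypothetical.
class InfixCalc {
    // Precedence levels from the table: a larger number binds tighter.
    static int prec(char op) {
        switch (op) {
            case '^': return 5;
            case '*': case '/': case '%': return 4;
            case '+': case '-': return 3;
            case '&': return 2;
            case '|': return 1;
            default: throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    // Every operator in the table is left-to-right associative except ^.
    static boolean isLeftAssoc(char op) {
        return op != '^';
    }

    static long apply(char op, long lhs, long rhs) {
        switch (op) {
            case '^': {
                long result = 1;                 // assumes a non-negative exponent
                for (long k = 0; k < rhs; k++) result *= lhs;
                return result;
            }
            case '*': return lhs * rhs;
            case '/': return lhs / rhs;
            case '%': return lhs % rhs;
            case '+': return lhs + rhs;
            case '-': return lhs - rhs;
            case '&': return lhs & rhs;
            case '|': return lhs | rhs;
            default: throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    // "Evaluating the top": pop one operator and two operands, push the result.
    static void evalTop(Deque<Long> postfixStack, Deque<Character> opStack) {
        char op = opStack.pop();
        long rhs = postfixStack.pop();           // the right operand was pushed last
        long lhs = postfixStack.pop();
        postfixStack.push(apply(op, lhs, rhs));
    }

    static long evaluate(String s) {
        Deque<Long> postfixStack = new ArrayDeque<>();
        Deque<Character> opStack = new ArrayDeque<>();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isDigit(c)) {          // integer token: read all its digits
                long value = 0;
                while (i < s.length() && Character.isDigit(s.charAt(i))) {
                    value = value * 10 + (s.charAt(i) - '0');
                    i++;
                }
                postfixStack.push(value);
                continue;
            }
            if (c == '(') {
                opStack.push(c);
            } else if (c == ')') {
                while (opStack.peek() != '(') evalTop(postfixStack, opStack);
                opStack.pop();                   // discard the matching (
            } else if (c != ' ') {               // an operator
                while (!opStack.isEmpty() && opStack.peek() != '('
                        && (prec(opStack.peek()) > prec(c)
                            || (prec(opStack.peek()) == prec(c) && isLeftAssoc(c)))) {
                    evalTop(postfixStack, opStack);
                }
                opStack.push(c);
            }
            i++;
        }
        while (!opStack.isEmpty()) evalTop(postfixStack, opStack);
        return postfixStack.pop();
    }

    public static void main(String[] args) {
        System.out.println(evaluate("1 + (2 * 3)"));  // 7
        System.out.println(evaluate("2^3^2"));        // 512
    }
}
```

Note that evalTop pops the right-hand operand first, because it was pushed last. Getting that order wrong does not show up with + or *, but it does with -, /, and ^.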

First we consider expressions that don’t need associativity rules, that is, expressions with at most one operator from each precedence level.

2 * 3 + 4

read 2: push 2 on prefixStack prefixStack: [2] opStack: []

read *: push * on opStack prefixStack: [2] opStack: [*]

read 3: push on prefixStack prefixStack: [2,3] opStack: [*]

read +: there is an operator at top of opStack and it’s of higher precedence, so eval it now: pop *, pop 2 and 3, calc 2*3=6 and push 6 on prefixStack. push + on opStack prefixStack: [6] opStack: [+]

read 4, push on prefixStack prefixStack: [6, 4] opStack: [+]

read EOS, eval top until opStack is empty: pop +, pop 4 as rhs and 6 as lhs, 6+4 = 10, push 10. Done

prefixStack: [10] opStack: []

If we use parentheses, we’ll see more of the pseudocode in use:

1 + (2 * 3):

read 1: push 1 on prefixStack prefixStack: [1] opStack: []

read +: nothing on opStack, push + on opStack prefixStack: [1] opStack: [+]

read ‘(’: push on opStack prefixStack: [1] opStack: [+, (]

read 2: push on prefixStack prefixStack: [1, 2] opStack: [+, (]

read *: no real operator on top of opStack (it’s the ‘(’), so push * on opStack

prefixStack: [1, 2] opStack: [+, (, *]

read 3: push on prefixStack prefixStack: [1, 2, 3] opStack: [+, (, *]

read ): eval top until opStack has ( on top: pop *, pop 3 as rhs, 2 as lhs, calc 2*3 = 6, push 6 on prefixStack. Now ( is on top of opStack, pop it off as last step of this clause.

prefixStack: [1, 6] opStack: [+]

read EOS: eval top until opStack is empty: pop +, pop 6 as rhs and 1 as lhs, 1+6 = 7, push 7. Done

prefixStack: [7] opStack: []

We have seen the basic mechanism of the infix calculator and its arithmetic expression evaluation.

I skipped this example in class:

Let’s do one more expression with no equal-precedence operators, using the pseudocode from pa3.

6|5&3, which as we covered earlier = 6|(5&3) = 6|1 = 7

Using square brackets to contain stacks, to avoid confusion with parentheses on the stack:

input s = “6|5&3”

                                              prefixStack    opStack

read 6, push on prefixStack                   [6]            [ ]
read |, see nothing at top of opS, push |     [6]            [ | ]
read 5, push on prefixStack                   [6, 5]         [ |, ]
read &, topop = |, lower precedence, push &   [6, 5]         [ |, & ]
read 3, push on prefixStack                   [6, 5, 3]      [ |, & ]
read EOS, eval stack until empty:
  pop &, 3, 5, calc 5&3 = 1, push 1           [6, 1]         [ | ]
  pop |, 1, 6, calc 6|1 = 7, push 7           [7]            [ ]

done, result = 7

Handling Associativity

Consider 12/2/3, which we know from earlier coverage = (12/2)/3 = 6/3 = 2, using the left-to-right associativity rule of C/Java.

Let’s try using the same pseudocode, except reconsidering the one point where precedence is used to decide whether or not to eval the stack.

input s = “12/2/3”

                                              prefixStack    opStack

read 12, push on prefixStack                  [12]           [ ]
read /, see nothing at top, push /            [12]           [ / ]
read 2, push on prefixStack                   [12, 2]        [ / ]
read /, see / as topop, same precedence. What to do?
  If we eval stack, will do 12/2 = 6: YES, that’s right (and according to the algorithm),
  then push new /                             [6]            [ / ]
read 3, push on prefixStack                   [6, 3]         [ / ]
read EOS, finish up: eval stack               [2]            [ ]

done, result = 2

Now consider 2^3^2, which we know from earlier coverage = 2^(3^2) = 2^9 = 512, using right-to-left associativity as specified for this calculator.

input s = “2^3^2”

                                              prefixStack    opStack

read 2                                        [2]            [ ]
read ^, see no topop, push ^                  [2]            [ ^ ]
read 3                                        [2, 3]         [ ^ ]
read ^, see topop = ^, same precedence. What to do?
  If we eval stack, will do 2^3 = 8: NO, wrong, leave it on stack, according to the algorithm.
  Just push second ^                          [2, 3]         [ ^, ^ ]
read 2                                        [2, 3, 2]      [ ^, ^ ]
read EOS, eval stack until empty:
  pop ^, 2, 3, 3^2 = 9, push                  [2, 9]         [ ^ ]
  pop ^, 9, 2, 2^9 = 512, push                [512]          [ ]

done, result = 512

How Weiss handles precedence and associativity using a scoring system.

Example of a Resume Scoring System

First consider a real-life example of a scoring system. You are in charge of hiring, and sifting through a pile of resumes. You or your boss decide that the most important measure of a candidate is number of years of experience, but among candidates of equal experience, the number of years of college should count.

How can you fold both measures into a single score for each candidate? Note that #years of college = 0, 1, 2, 3, or 4.

Scheme 1, supplied by class member: use low digit for #years of college, higher digits for #years experience.

This works fine. For example, 3 years experience and 2 years of college mean score = 32, higher than 3yr exp, 1yr col, score 31, but below all scores of people with 4 years experience, who would have scores between 40 and 44.

We can draw a number line with points at 0, 1, 2, 3, and 4 for 0 yrs experience, various college lengths, then points at 10, 11, 12, 13, and 14 for those with 1 year experience, and so on. We see groups of points for each level of experience, and each group is disjoint from other groups, to implement the idea that the number of years of experience is the first criterion, and the other is just a tie-breaker.

Scheme 2: We can bring these scores closer together on the number line, leaving out most of the gaps of unused scores such as 6, 7, 8, and 9. Consider this formula

score = 6*#years_exp + #years_col

Then 0, 1, 2, 3, and 4 mean the same as before, but 1 year of experience means scores of 6, 7, 8, 9, and 10, 2 years of experience means scores of 12 .. 16, and so on. (Since there are only five possible values for #years_col, a multiplier of 5 would close the gaps completely.)

Both these schemes work exactly the same way to compare resumes, that is, the top scoring resume is the same, the next one down, and so on.
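As a quick check of that claim, here is a sketch that compares every pair of candidates under both schemes. The class and method names are made up for illustration, and the multipliers 10 and 6 are the ones from Scheme 1 and Scheme 2.

```java
// Sketch: both scoring schemes rank any two resumes the same way.
class ResumeScore {
    // Scheme 1: years of college in the low decimal digit.
    static int scheme1(int yearsExp, int yearsCol) {
        return 10 * yearsExp + yearsCol;
    }

    // Scheme 2: the more tightly packed formula.
    static int scheme2(int yearsExp, int yearsCol) {
        return 6 * yearsExp + yearsCol;
    }

    // True if the two schemes order every pair of candidates identically,
    // for experience 0..maxExp and college 0..4.
    static boolean schemesAgree(int maxExp) {
        for (int e1 = 0; e1 <= maxExp; e1++) {
            for (int c1 = 0; c1 <= 4; c1++) {
                for (int e2 = 0; e2 <= maxExp; e2++) {
                    for (int c2 = 0; c2 <= 4; c2++) {
                        int by1 = Integer.signum(
                                Integer.compare(scheme1(e1, c1), scheme1(e2, c2)));
                        int by2 = Integer.signum(
                                Integer.compare(scheme2(e1, c1), scheme2(e2, c2)));
                        if (by1 != by2) return false;
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(scheme1(3, 2));     // 32
        System.out.println(scheme2(3, 2));     // 20
        System.out.println(schemesAgree(40));  // true
    }
}
```

Any multiplier larger than the biggest tie-breaker value (here 4) keeps the groups from overlapping, which is all the comparison needs.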

Note that scoring systems are widely used in programming, so it’s good to see an example. And, like many in use, Weiss’s is not well commented in the code.

Weiss’s Scoring System for Precedence/Associativity

The basic idea is to score each operator, and then compare scores at the decision point discussed above. However, like the resume scoring case, there are two inputs into the decision, one that’s a tie-breaker for the other. The tie-breaking input is the associativity status of the top-of-stack operator. That status determines the do-stack-eval decision for equal-precedence operators (top-of stack vs. input operators.)

Note that an “input” operator is just another name for the most recently encountered operator in the string we are processing.

Weiss makes the decision this way: evaluate the top of the stack when input_op_score <= top_op_score

(actually written in terms of array lookups, on pg. 413, lines 27-28)

We need to understand how to score operators, input ops and top-of-stack ops, by Weiss’s system.

Without associativity considerations, it would be easy: just score + and - as score=1, * and / as score = 2, and ^ as score = 3.

Or equivalently, use scores 1, 3, and 5, or 10, 20, and 30. All that matters is the numerical order.

Weiss’s basic scores are: + and - have score=1, * and / have score = 3, and ^ has score = 6. These are the scores given to input operators. The space between these scores is room for the secondary adjustments.

We made a number line showing the scores 1, 3, and 6, and the space between these scores.

Then associativity comes in as adjustments to these base scores. The resulting adjusted scores are used for top_op’s to resolve equal-precedence cases. A left-associative operator gets an increment of 1 to the base score, whereas a right-associative operator gets a decrement of 1. The resulting scores for top_ops are 2 for + and -, 4 for * and /, and 5 for ^.

Now the number line has scores 1, 2, 3, 4, 5, and 6. Here 1 and 2 belong to + and -, 3 and 4 to * and /, and 5 and 6 to ^.

Just like the resume-scoring case, we see there are groups of scores that don’t overlap, so that operators of different groups compare by precedence alone, as they should. Only equal-precedence operators are affected by the adjustments.

Note that for pa03 you'll need to add & and |, with lower precedence than + and -, to this scoring system. You can see that it's OK to move all the numbers over to the right to accommodate the new low values, or you could try negative scores.

See pg. 412 for precedence table, precTable[], a private field of Evaluator, and an array of Precedence objects, each a "struct-like" object containing two scores, the input-op score and the top-stack-op score. The way they are formatted lines up all the input-op scores in a column and similarly all the top-stack op scores. Label the columns of numbers "input_op_score" and "top_op_score".

By precTable, we see that the minus operator has input_op_score = 1 and top_op_score = 2.
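Here is a sketch of the two-score idea for the real operators. It is not the book's code: the class PrecScores and the methods scoresFor and shouldEvalTop are hypothetical names, and the lookup is a switch on the operator character rather than an array index. The scores themselves are the ones given above.

```java
// Sketch: two scores per operator, compared at the decision point.
class PrecScores {
    // Struct-like pair of scores, in the spirit of the Precedence objects.
    static final class Precedence {
        final int inputOpScore;
        final int topOpScore;

        Precedence(int inputOpScore, int topOpScore) {
            this.inputOpScore = inputOpScore;
            this.topOpScore = topOpScore;
        }
    }

    // Hypothetical lookup by operator character.
    static Precedence scoresFor(char op) {
        switch (op) {
            case '+': case '-': return new Precedence(1, 2);  // left-assoc: base + 1
            case '*': case '/': return new Precedence(3, 4);  // left-assoc: base + 1
            case '^':           return new Precedence(6, 5);  // right-assoc: base - 1
            default: throw new IllegalArgumentException("unknown operator: " + op);
        }
    }

    // The decision: evaluate the top of stack before pushing the input operator?
    static boolean shouldEvalTop(char inputOp, char topOp) {
        return scoresFor(inputOp).inputOpScore <= scoresFor(topOp).topOpScore;
    }

    public static void main(String[] args) {
        System.out.println(shouldEvalTop('/', '/'));  // true:  12/2/3 evaluates 12/2 first
        System.out.println(shouldEvalTop('^', '^'));  // false: 2^3^2 leaves 2^3 on the stack
        System.out.println(shouldEvalTop('+', '*'));  // true:  2*3+4 evaluates 2*3 first
        System.out.println(shouldEvalTop('*', '+'));  // false: 1+2*3 leaves + on the stack
    }
}
```

A single <= comparison covers both precedence and associativity, which is why the scores come in pairs.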

Note (not covered in class): Weiss uses scores to process ‘(’ and EOL.

Our pseudocode breaks out the case of processing an input ‘(’ symbol as a separate clause, but if you look at the code on pg. 394 you see no such case. This is because Weiss is handling ‘(’ through his scoring system, even though it is not a real operator.

On the other hand, an input symbol of ‘)’ does have its own code.

‘(’ as an input symbol scores 100, but at top of stack scores 0. So an input ‘(’ always scores higher than whatever is on top of the stack and gets pushed, while a ‘(’ sitting on top of the stack scores lower than any real input operator and is left alone.

Another peculiarity of Weiss’s implementation is that he pushes a special EOL symbol on the opStack at the very start of an evaluation.

Also note that EOL gets a score of 0 as an input symbol and -1 at top of stack, which comes into play at the end of the processing.
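To see how those scores for ‘(’ and EOL play out, here is a sketch of a score-driven loop. It is not Weiss's Evaluator: the class name, the use of '$' as the EOL marker, and the switch-based score lookups are all assumptions made for illustration. It handles only +, -, *, /, ^ and parentheses, and assumes well-formed input with non-negative integer literals and non-negative exponents.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A sketch of score-driven processing; hypothetical names, not the book's code.
class ScoreDrivenCalc {
    static final char EOL = '$';   // hypothetical marker standing in for EOL

    static int inputScore(char sym) {
        switch (sym) {
            case EOL: return 0;
            case '(': return 100;
            case '+': case '-': return 1;
            case '*': case '/': return 3;
            case '^': return 6;
            default: throw new IllegalArgumentException("bad symbol: " + sym);
        }
    }

    static int topScore(char sym) {
        switch (sym) {
            case EOL: return -1;
            case '(': return 0;
            case '+': case '-': return 2;
            case '*': case '/': return 4;
            case '^': return 5;
            default: throw new IllegalArgumentException("bad symbol: " + sym);
        }
    }

    // Pop one operator and two operands, push the result.
    static void evalTop(Deque<Long> values, Deque<Character> ops) {
        char op = ops.pop();
        long rhs = values.pop();
        long lhs = values.pop();
        switch (op) {
            case '+': values.push(lhs + rhs); break;
            case '-': values.push(lhs - rhs); break;
            case '*': values.push(lhs * rhs); break;
            case '/': values.push(lhs / rhs); break;
            case '^': values.push((long) Math.pow(lhs, rhs)); break;
            default: throw new IllegalStateException("cannot evaluate: " + op);
        }
    }

    static long evaluate(String s) {
        Deque<Long> values = new ArrayDeque<>();
        Deque<Character> ops = new ArrayDeque<>();
        ops.push(EOL);                  // EOL goes on the stack at the very start
        String input = s + EOL;         // and also arrives as the last input symbol
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isDigit(c)) {
                long v = 0;
                while (Character.isDigit(input.charAt(i))) {
                    v = v * 10 + (input.charAt(i) - '0');
                    i++;
                }
                values.push(v);
                continue;
            }
            if (c == ')') {             // ) has its own code
                while (ops.peek() != '(') evalTop(values, ops);
                ops.pop();
            } else if (c != ' ') {      // operators, ( and EOL all go through the scores
                while (inputScore(c) <= topScore(ops.peek())) evalTop(values, ops);
                if (c != EOL) ops.push(c);
            }
            i++;
        }
        return values.pop();
    }

    public static void main(String[] args) {
        System.out.println(evaluate("(1+2)*3"));  // 9
        System.out.println(evaluate("2^3^2"));    // 512
    }
}
```

The EOL at the bottom of the stack, with its top score of -1, is what stops the final run of evaluations: the input EOL scores 0, which is <= every real operator's top score but not <= -1.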

Next time we’ll look at the code, so bring Weiss or at least a printout of Evaluator.java.