Homework 6
Math 114Q, Section 10

Due in class Tuesday, October 23.

  1. Remember that the description of your term paper project is due on Thursday, October 25.

  2. Spam is junk email. Most mail systems have a spam filter that tries to decide whether each piece of email you get is spam. When the spam filter finds something it thinks is spam, it may throw it away, or put it in a junk mail folder so that you can decide whether to throw it away without reading it.

    There's lots of interesting, useful stuff for you to learn in this question, even though there's only a little quantitative reasoning at the end.

    1. Describe clearly how you deal with spam. Here are the kinds of questions you should consider. You may need to do some research on your email provider's web site to find the answers.
      • Who provides your email service? (UMB, Google, Yahoo, ... ?)
      • How does your email provider filter for spam?
      • Do you have any say in how your email provider filters spam for you? If so, what do you tell it?
      • How much spam do you receive: how many messages per day, on average? What percentage of your email is spam? What kinds of spam do you tend to get? If your provider does all the filtering for you, you may not even know it's throwing things away before you see them. Try to find out.
      • Does the spam filter on your system ever call something spam when it's in fact legitimate email?

    2. A spam filter can make two kinds of errors. It can mark legitimate mail as spam, and it can mark spam as legitimate. Generally, the more errors of one kind it makes, the fewer it makes of the other. The balance depends on how you tell the filter to behave.

      Discuss the costs to you of each kind of error.

    3. What is the original meaning of the word "spam"? Does the company that sells (the real) spam object to the new meaning?

    4. I get about 250 emails each day. My spam filter traps about 175 of them. Of those, about five are legitimate and should have been delivered directly to me. My regular mailbox, which should contain just the emails that aren't spam, is usually about half spam. So (in words) my spam filter is pretty good (but not perfect) at recognizing legitimate email but not very good at calling spam spam. Make those words quantitative: what percentage of my emails are legitimate? What percentage are spam? What is the percentage of false negatives (spam that's called legitimate)? What is the percentage of false positives (legitimate email that's called spam)?

    5. For the curious, here is what my spam filter collected in about four hours today. Several of these are legitimate.

      Here are some web sites to look at if you want to find out more.

  3. Betting on uncommon numbers. In many state lotteries the customer picks the numbers she thinks will win. The prize is then divided among all the people who happened to pick the winning numbers. Much as we try to analyze only real situations, the real Massachusetts lottery is too complicated for this class. (Many people find it too complicated to choose the numbers they want to bet on, and elect "quick picks" instead.) So this question is about an imaginary lottery.

    Here is how our lottery works. Tickets cost $1. Each person buying a ticket chooses the number between 1 and 100 that she thinks will win. When all the tickets have been sold, the state picks a number at random between 1 and 100. All the people who have chosen that number divide 70% of the total collected among themselves. (The other 30% the state uses for overhead and local aid.) So the fair price for a $1 ticket is $0.70 or 70 cents.

    Of course the winners collect much more than the fair price (since the losers collect nothing). For example, if 1000 people bought tickets, 39 was the winning number, and 8 people chose 39, each would get $1000 * 0.7 / 8 = $87.50.

    If everyone buying tickets used "quick pick", then the 1000 tickets would (more or less) consist of 10 for each of the 100 numbers, ten people would have the winning number, and the typical payoff would be $1000 * 0.7 / 10 = $70.
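
    The two payoffs worked out above follow the same pattern: take 70% of the money collected and split it evenly among the winners. Here is a minimal Python sketch of that arithmetic. The function name `payout_per_winner` and its parameter names are made up for this illustration; only the $1 ticket price and the 70% share come from the rules of our imaginary lottery.

```python
def payout_per_winner(tickets_sold, winners, ticket_price=1.00, prize_share=0.70):
    """Return each winner's share of the prize pool, in dollars.

    Assumes the rules of the imaginary lottery described above:
    70% of the total collected is split evenly among the winners.
    """
    if winners <= 0:
        raise ValueError("at least one winner is needed to split the prize")
    prize_pool = tickets_sold * ticket_price * prize_share
    return prize_pool / winners


# 1000 tickets sold, 8 people chose the winning number.
print(f"${payout_per_winner(1000, 8):.2f}")   # $87.50

# 1000 tickets sold, 10 people chose the winning number.
print(f"${payout_per_winner(1000, 10):.2f}")  # $70.00
```

    Changing the number of winners shows how strongly the payoff depends on how many other people picked the same number.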

    Now that you've read this far and understood the game, we can ask an interesting question. Suppose you know that people are so afraid of the number 13 that no one ever picks it. You think (correctly) "If I buy a ticket and choose 13, I'm probably not going to win. But if I do win, I will win big because I won't have to share the prize." So every day you buy one of the 1000 tickets, and choose 13, knowing that no one else will. You lose with probability 0.99 and win with probability 0.01. In the long run, how much money do you win (on the average) each day?

    (You might find it easiest to answer this question by imagining that you played the lottery 100 days in a row.)

  4. Complete the second graph on Boston population from today's class by finishing the Excel tutorial we worked on. Your graph should clearly show the percent change in the Boston population each decade using the data from the Census Bureau. Print that graph and the first graph that you made and include them with this assignment. Use them to answer the following questions. Make sure to write complete, grammatically correct sentences.

    1. When was the population of Boston increasing? Describe this using a range of years ("year x to year z", for example).

    2. When was the population of Boston decreasing? Again, use a range of years to describe this.

    3. Use the information in the first graph to write a short (3-4 sentence) description (mostly words, perhaps a few numbers) of how the population of the city of Boston has changed in the years since the first census in 1790.

    4. During what 10-year time period did the city of Boston see the largest percentage change in population? Did this correspond to an increase or a decrease?

    5. Use both of the graphs you've drawn to rewrite your answer to part 3, making it more informative. This part of your homework should be typed, revised, and proofread. Don't forget the spell checker. Your answer should be about half a page long.

  5. Download the spreadsheet BeverageConsumption.xls from the course website.

    1. What does "per capita consumption" mean? Write a sentence that explains this for a particular piece of data (for example, what would you say about: 1980, milk, 27.1 gallons?).

    2. Use Excel to make a scatter plot of this data. To do this, select all 4 columns of data and go through the steps that we followed in class. After you have constructed your graph, go back to the Source Data and label each Data Series correctly. Print your graph and include it with this homework.

    3. Using your graph and the data in the table, write a short paragraph describing the changes you see in per capita consumption of selected beverages in the US during this time period. Some suggestions for your paragraph: start with a topic sentence that captures your main point. Back that up with some specific references to the data. You may want to calculate some percentage changes to further support your argument. Type this, proofread it, revise it, and attach it to this homework. It should be at least one-half page in length.
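
    Both the Boston population graph and the beverage paragraph rely on percentage change: the difference between the new value and the old value, divided by the old value. Here is a minimal Python sketch of that calculation, in case it helps you check your spreadsheet. The function name `percent_change` and the numbers are hypothetical; they are not taken from the census data or from BeverageConsumption.xls.

```python
def percent_change(old, new):
    """Return the percentage change from old to new.

    A positive result is an increase, a negative result is a decrease.
    """
    if old == 0:
        raise ValueError("percentage change from zero is undefined")
    return (new - old) / old * 100


# Hypothetical values, for illustration only.
print(f"{percent_change(20.0, 25.0):+.1f}%")  # +25.0%
print(f"{percent_change(25.0, 20.0):+.1f}%")  # -20.0%
```

    Notice that going from 20 to 25 is a 25% increase, but going from 25 back to 20 is only a 20% decrease, because the percentage is always measured against the starting value.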