Question 1) For each of the following studies or surveys, identify at least one source of potential bias.
(a) You want to know people’s opinions about Hillary Clinton running for president. You randomly generate 1000 phone numbers, call them between the hours of 12pm and 2pm, and ask whether they like the idea of Hillary Clinton running for president.
(b) You are interested in studying college students’ views on gun ownership in America. You go to a college campus and ask every fifth person who walks out of the math building the following question: “Do you believe people should be allowed to own very dangerous, life-threatening guns that often kill or harm innocent people?”
(c) A city council wants to know how many people don’t pay for parking around their city. They have a police officer randomly interview people on the street about how often they don’t pay the meter at their parking space.
(d) A person wants to know what genre of music is most popular in America. They take a poll by asking all their friends to tell them their favorite genre of music.
Question 2) IQ scores are known to be normally distributed with μ = 100 and σ = 15.
(a) Find the probability that a randomly selected person has an IQ between 80 and 120.
(b) Find the probability that a randomly selected person has an IQ above 110.
(c) Find the probability that in a group of 4 randomly selected people all of them have an IQ above 110.
(d) Find the probability that in a group of 8 randomly selected people the average IQ is above 110.
(e) Find the probability that in a group of 20 randomly selected people the average IQ is above 110.
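One way to check answers to Question 2 numerically is sketched below. It assumes Python's standard library `statistics.NormalDist`; the helper names (`prob_between`, `prob_above`, `prob_mean_above`) are illustrative and not part of the question. Part (c) treats the four people as independent, and parts (d) and (e) use the fact that the mean of n independent normal values has standard deviation σ/√n.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA = 100, 15          # values given in Question 2
iq = NormalDist(MU, SIGMA)

def prob_between(lo, hi, dist=iq):
    """P(lo < X < hi) for a normal distribution."""
    return dist.cdf(hi) - dist.cdf(lo)

def prob_above(x, dist=iq):
    """P(X > x) for a normal distribution."""
    return 1 - dist.cdf(x)

def prob_mean_above(x, n):
    """P(sample mean of n independent people > x); sd shrinks to sigma/sqrt(n)."""
    return prob_above(x, NormalDist(MU, SIGMA / sqrt(n)))

p_between = prob_between(80, 120)     # part (a)
p_above = prob_above(110)             # part (b)
p_all_four = p_above ** 4             # part (c), assuming independence
p_mean_8 = prob_mean_above(110, 8)    # part (d)
p_mean_20 = prob_mean_above(110, 20)  # part (e)

for label, value in [("(a)", p_between), ("(b)", p_above), ("(c)", p_all_four),
                     ("(d)", p_mean_8), ("(e)", p_mean_20)]:
    print(label, round(value, 4))
```

Comparing (b), (d), and (e) shows how the chance of an average above 110 shrinks as the group grows.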
Question 3) Consider the following probability experiment.
I get two fair six-sided dice. I roll both of them. If I roll two even numbers I record the larger one. If I roll an even number and an odd number I record the even number. If I roll two odd numbers I record the larger one.
(a) Write down a probability model for this experiment. That means write down all the possible outcomes and their associated probabilities. (Remember: outcomes here would be what I actually record)
(b) Calculate the probability you record an odd number.
(c) Calculate the probability you record a number greater than 3.
(d) Calculate the probability you record an odd number and a number greater than 3.
(e) Calculate the probability you record an odd number or a number greater than 3.
(f) Calculate the probability you record a 4 given that you rolled a 4 on the first die.
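A brute-force check for Question 3 can enumerate all 36 equally likely rolls. The sketch below is one possible approach; the function names `recorded` and `prob` are hypothetical helpers. It assumes that when both dice show the same number, that number is recorded.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def recorded(a, b):
    """Value written down for a roll (a, b) under the rules of Question 3."""
    if a % 2 == b % 2:              # both even or both odd: record the larger
        return max(a, b)
    return a if a % 2 == 0 else b   # one of each: record the even number

rolls = list(product(range(1, 7), repeat=2))   # 36 equally likely rolls

# Part (a): probability model for the recorded value
counts = Counter(recorded(a, b) for a, b in rolls)
model = {value: Fraction(n, len(rolls)) for value, n in sorted(counts.items())}

def prob(event):
    """Probability that event(a, b) is true over all 36 rolls."""
    return Fraction(sum(1 for a, b in rolls if event(a, b)), len(rolls))

p_odd = prob(lambda a, b: recorded(a, b) % 2 == 1)                       # (b)
p_gt3 = prob(lambda a, b: recorded(a, b) > 3)                            # (c)
p_both = prob(lambda a, b: recorded(a, b) % 2 == 1 and recorded(a, b) > 3)   # (d)
p_either = prob(lambda a, b: recorded(a, b) % 2 == 1 or recorded(a, b) > 3)  # (e)

# Part (f): condition on the first die showing 4
p_first_4 = prob(lambda a, b: a == 4)
p_rec4_given_first4 = prob(lambda a, b: a == 4 and recorded(a, b) == 4) / p_first_4

print(model)
print(p_odd, p_gt3, p_both, p_either, p_rec4_given_first4)
```

Using `Fraction` keeps every probability exact, which makes it easy to confirm that (e) equals (b) + (c) − (d).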
Question 4) I take a random sample of 12 people at a large company and record their salaries in thousands of dollars per year. Here are my results.
43, 48, 53, 54, 57, 65, 72, 75, 81, 88, 92, 175
(a) Give the 5 number summary for this data.
(b) Draw a boxplot for this data. Describe the shape of the distribution.
(c) Are there any outliers in this data set? Why or why not?
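The summary and outlier check in Question 4 can be sketched in a few lines of Python. Quartile conventions differ between textbooks and software; this sketch assumes the "median of each half" convention and the common 1.5 × IQR fence rule, so results may differ slightly under another convention. The function names are illustrative.

```python
from statistics import median

salaries = [43, 48, 53, 54, 57, 65, 72, 75, 81, 88, 92, 175]  # thousands of dollars

def five_number_summary(values):
    """(min, Q1, median, Q3, max), with quartiles as medians of each half."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    upper = data[half + len(data) % 2:]   # skip the middle value when n is odd
    return data[0], median(lower), median(data), median(upper), data[-1]

def outliers(values):
    """Values outside the 1.5 * IQR fences (an assumed, commonly used rule)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

print(five_number_summary(salaries))
print(outliers(salaries))
```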
Question 5) I take a cooler full of soda out to the beach. Inside the cooler I have 14 cans of Pepsi, 10 cans of Sprite, and 9 cans of Mountain Dew. Suppose I randomly pull out 5 cans from the cooler without looking.
(a) What is the probability that I select all Pepsis?
(b) What is the probability that I select 2 cans of Pepsi, 2 cans of Sprite and 1 can of Mountain Dew?
(c) What is the probability that I select exactly 2 cans of Pepsi?
(d) What is the probability that I don’t select any cans of Mountain Dew?
(e) What is the probability that I select at least one can of Mountain Dew?
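Since the cans in Question 5 are drawn without replacement, each part reduces to counting combinations. The following is a minimal sketch using `math.comb`; the helper `choose_prob` is a hypothetical name, and "exactly 2 Pepsi" is read as 2 Pepsi plus 3 cans of any other kind.

```python
from fractions import Fraction
from math import comb

PEPSI, SPRITE, DEW = 14, 10, 9
TOTAL = PEPSI + SPRITE + DEW      # 33 cans in the cooler
DRAWN = 5

def choose_prob(ways):
    """Probability of an event with `ways` favourable 5-can selections."""
    return Fraction(ways, comb(TOTAL, DRAWN))

p_all_pepsi = choose_prob(comb(PEPSI, 5))                                   # (a)
p_2_2_1 = choose_prob(comb(PEPSI, 2) * comb(SPRITE, 2) * comb(DEW, 1))      # (b)
p_exactly_2_pepsi = choose_prob(comb(PEPSI, 2) * comb(TOTAL - PEPSI, 3))    # (c)
p_no_dew = choose_prob(comb(TOTAL - DEW, 5))                                # (d)
p_at_least_one_dew = 1 - p_no_dew                                           # (e)

for label, p in [("(a)", p_all_pepsi), ("(b)", p_2_2_1), ("(c)", p_exactly_2_pepsi),
                 ("(d)", p_no_dew), ("(e)", p_at_least_one_dew)]:
    print(label, p, round(float(p), 4))
```

Part (e) is computed as the complement of part (d), which avoids summing five separate cases.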
Question 6) During a zombie apocalypse every person has a 15% chance of survival. Suppose there is a group of 20 people.
(a) Find the chance that no one survives.
(b) Find the chance that exactly 3 of the 20 people survive.
(c) Find the chance that at least 2 of the 20 people survive.
(d) Suppose instead you had a group of 1000 people. Estimate the chance that more than 170 of them survive.
Question 7) I am studying how GPA in college affects starting salaries at a large business firm. I take a sample of 12 new employees, all of whom had GPAs between 2.7 and 4.0 during college, and fit the following regression line, where y represents starting salary in thousands and x represents GPA on a 4.0 scale:
ŷ = 13.5x + 30, r = 0.6
(a) What is the value of the slope? Interpret this value in context of the problem.
(b) What is the value of the intercept? Interpret this value in context of the problem.
(c) Is there enough evidence for a linear association? Why or why not?
(d) Calculate the coefficient of determination. Interpret this value in context of the problem.
(e) If someone graduated with a 3.3 GPA what is their predicted initial salary? Would you trust this estimate? Why or why not?
(f) If someone graduated with a 2.8 GPA and was offered 70 thousand as a starting salary should they take this offer? Why or why not?
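The arithmetic behind parts (d) through (f) of Question 7 can be sketched as follows. The function names are hypothetical; the only inputs are the slope, intercept, r, and GPA range stated in the question. The interpretation of each number is still left to the reader.

```python
SLOPE, INTERCEPT, R = 13.5, 30.0, 0.6
GPA_MIN, GPA_MAX = 2.7, 4.0          # range of GPAs in the sample

def predicted_salary(gpa):
    """Predicted starting salary (in thousands) from the fitted line."""
    return SLOPE * gpa + INTERCEPT

def in_sampled_range(gpa):
    """True when the GPA lies inside the range the line was fitted on."""
    return GPA_MIN <= gpa <= GPA_MAX

r_squared = R ** 2                           # (d) coefficient of determination
pred_33 = predicted_salary(3.3)              # (e) prediction for a 3.3 GPA
residual_28 = 70 - predicted_salary(2.8)     # (f) offer minus prediction

print(round(r_squared, 2), round(pred_33, 2), in_sampled_range(3.3))
print(round(residual_28, 2), in_sampled_range(2.8))
```

A positive residual in part (f) means the offer is above what the line predicts for that GPA.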
Question 8) Suppose you are interested in the relationship between daily temperature and daily sales at an ice cream stand. You track both for a week and get the following data:
Day         Temperature   Sales (in dollars)
Monday      73            230
Tuesday     70            210
Wednesday   75            240
Thursday    84            310
Friday      80            350
Saturday    82            400
Sunday      85            420
(a) Create a scatterplot for this data.
(b) Calculate the correlation coefficient for this data. Is there enough evidence for an association?
(c) Construct the least squares regression line.
(d) Construct a residual plot for this data. Is a linear model appropriate here? Why or why not?
(e) Describe at least one lurking variable in this study. What could you do to remove this lurking variable?
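For Question 8, the correlation coefficient and least squares line can be computed directly from the sums of squares. The sketch below uses only plain Python and takes temperature as x and sales as y; the variable names are illustrative. It also lists the residuals, which can help when judging whether a linear model is appropriate.

```python
from math import sqrt

temps = [73, 70, 75, 84, 80, 82, 85]            # Monday through Sunday
sales = [230, 210, 240, 310, 350, 400, 420]     # dollars

n = len(temps)
mean_x = sum(temps) / n
mean_y = sum(sales) / n

# Sums of squares and cross-products about the means
sxx = sum((x - mean_x) ** 2 for x in temps)
syy = sum((y - mean_y) ** 2 for y in sales)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(temps, sales))

r = sxy / sqrt(sxx * syy)            # correlation coefficient
slope = sxy / sxx                    # least squares slope
intercept = mean_y - slope * mean_x  # least squares intercept

# Residual = observed sales minus predicted sales
residuals = [y - (slope * x + intercept) for x, y in zip(temps, sales)]

print(round(r, 3), round(slope, 2), round(intercept, 1))
print([round(e, 1) for e in residuals])
```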
Question 9) Suppose there is a slot machine that has four wheels. The first three wheels each have five images: a heart, a club, a spade, a diamond, and a dollar sign. The last wheel has five images as well: 4 X’s and a Jackpot. The machine has the following pay out system.
(a) What is the probability that you win money on any one play of this machine?
(b) Calculate the expected value of this slot machine. Interpret this value. Is this a good or
(c) Interpret the expected value for this game if you played the game 1000 times.
(d) What should be the pay out for 3 Dollar Signs and a Jackpot for the game to be fair?
Question 10) You work for your favorite drink company. You are in charge of redesigning their logo. After making some changes you want to check that a majority of people like the new logo. You create a focus group of 2000 people and ask them if they like the redesign. 1040 say they like the redesign.
(a) Carry out a hypothesis test at α = .05 to decide if the majority of consumers will like the redesign. Make sure to confirm all necessary conditions.
(b) Regardless of your answer to (a) build a 95% confidence interval for the true proportion of people who like the redesigned logo.
(c) Using your interval, would you want to invest significant money in the redesign process? Why or why not?
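A one-proportion z-test and confidence interval for Question 10 might be set up as in the sketch below. It assumes a one-sided alternative (p > 0.5, since "majority" is the claim), the usual large-sample conditions n·p₀ ≥ 10 and n·(1 − p₀) ≥ 10, and that the focus group can be treated as a random sample; none of the variable names come from the question.

```python
from math import sqrt
from statistics import NormalDist

n, liked = 2000, 1040
p0 = 0.5          # null value: no majority
alpha = 0.05

p_hat = liked / n
std_normal = NormalDist()

# Large-sample condition for the normal approximation
conditions_ok = n * p0 >= 10 and n * (1 - p0) >= 10

# (a) Test H0: p = 0.5 against Ha: p > 0.5
se_null = sqrt(p0 * (1 - p0) / n)        # standard error under H0
z = (p_hat - p0) / se_null
p_value = 1 - std_normal.cdf(z)          # one-sided p-value
reject_null = p_value < alpha

# (b) 95% confidence interval, using the sample proportion in the standard error
z_star = std_normal.inv_cdf(1 - alpha / 2)
se_hat = sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - z_star * se_hat, p_hat + z_star * se_hat)

print(conditions_ok, round(z, 3), round(p_value, 4), reject_null)
print(tuple(round(bound, 4) for bound in ci))
```

Note that the test is one-sided while the interval is two-sided, so the two do not have to lead to the same conclusion about 0.5.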