ChiSquare and Quiz3



 

Computing Chi-Square statistics and Contingency tables

The chi-square statistic tests how well data fit a theoretical model. There will always be some differences between the data and the model. The question is whether these differences are due to random error or arise because the model does not fit the data. As an example, I will work through Illowsky, Chapter 11, Problem 91.

 

  1. A major food manufacturer is concerned that the sales for its skinny French fries have been decreasing. As a part of a feasibility study, the company conducts research into the types of fries sold across the country to determine if the type of fries sold is independent of the area of the country. The results of the study are shown in Table 11.48. Conduct a test of independence.

| Type of fries | Northeast | South | Central | West |
|---------------|-----------|-------|---------|------|
| Skinny        | 70        | 50    | 20      | 25   |
| Curly         | 100       | 60    | 15      | 30   |
| Steak         | 20        | 40    | 10      | 10   |
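The test of independence on the Table 11.48 counts can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the textbook's or any calculator's method. The function names `independence_test` and `chi2_right_tail_even_df` are made up for this example, and the p-value uses a closed form that holds only when the degrees of freedom are even (here there are 6).

```python
import math

# Observed counts from Table 11.48 (rows: fry type, columns: region).
observed = {
    "Skinny": [70, 50, 20, 25],
    "Curly":  [100, 60, 15, 30],
    "Steak":  [20, 40, 10, 10],
}

def chi2_right_tail_even_df(x, df):
    """Right-tail chi-square probability; this closed form is valid only for even df."""
    if df % 2 != 0:
        raise ValueError("this closed form needs an even number of degrees of freedom")
    half = x / 2
    return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))

def independence_test(table):
    """Return (chi-square statistic, degrees of freedom, p-value)."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    total = sum(row_totals)

    stat = 0.0
    for r, row in enumerate(rows):
        for c, obs in enumerate(row):
            # Expected count if fry type and region were independent.
            expected = row_totals[r] * col_totals[c] / total
            stat += (obs - expected) ** 2 / expected

    df = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, df, chi2_right_tail_even_df(stat, df)

stat, df, p = independence_test(observed)
print(f"chi-square = {stat:.4f}, df = {df}, p = {p:.4f}")
```

For these counts the statistic comes out near 18.84 with 6 degrees of freedom, and the p-value is roughly 0.004.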

 

  1. (5 points) What is the null hypothesis?

  2. (5 points) What is the alternative hypothesis?

  3. (5 points) What is a type I error? Be sure to explain α, the significance level, and (1-α) the confidence level.

  4. (5 points) What is a type II error? Be sure to explain β, which has no standard name, and (1-β), the power.

  5. (10 points) The distribution of weights for a large sack of rice is approximately normal, with a mean of 100 lbs and a standard deviation of 2 lbs.

  6. Use the z-table calculator to find the weight of a sack that is at the 99th percentile. Use the "Value from an area" option. (No screen shot necessary.)


http://davidmlane.com/hyperstat/z_table.html

  1. What is the z-score for this weight? (z-score = (x-µ)/σ)

  2. Suppose you want to try to save money. You will weigh each sack first and then only buy the sack if it weighs 101 lbs or more. If you weigh 100 sacks, how many of these sacks will you end up buying on average? (Show a screen shot of your answer.)
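As a cross-check on the z-table calculator, the single-sack questions above can be sketched with Python's standard-library `statistics.NormalDist`. This is only an illustration and assumes the weights are exactly normal with mean 100 lbs and standard deviation 2 lbs. The variable names are made up for this example.

```python
from statistics import NormalDist

# Sack weights from the problem: mean 100 lbs, standard deviation 2 lbs.
weights = NormalDist(mu=100, sigma=2)

# Weight at the 99th percentile (the "value from an area" direction).
w99 = weights.inv_cdf(0.99)

# z-score for that weight: z = (x - mu) / sigma.
z99 = (w99 - weights.mean) / weights.stdev

# Fraction of sacks weighing 101 lbs or more, and the expected count out of 100.
frac_heavy = 1 - weights.cdf(101)
expected_bought = 100 * frac_heavy

print(f"99th percentile weight: {w99:.2f} lbs (z = {z99:.3f})")
print(f"Expected sacks bought out of 100: {expected_bought:.1f}")
```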


Use the central limit theorem to answer the following questions.
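A minimal sketch of the central limit theorem step for the sack example (mean 100 lbs, standard deviation 2 lbs, groups of 10), again using only the standard library. The variable names are made up for this example, and the sack weights are assumed to be normal.

```python
import math
from statistics import NormalDist

mu, sigma, n = 100, 2, 10   # values from the sack problem

# The mean of n sacks keeps the same mean mu,
# but its standard deviation shrinks to sigma / sqrt(n).
standard_error = sigma / math.sqrt(n)
group_means = NormalDist(mu=mu, sigma=standard_error)

# Probability that a group of 10 averages 101 lbs or more.
p_buy_group = 1 - group_means.cdf(101)

print(f"standard deviation of the group mean: {standard_error:.4f}")
print(f"share of groups bought: {100 * p_buy_group:.2f}%")
```

Averaging narrows the distribution, so a group mean of 101 lbs is much rarer than a single sack of 101 lbs.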

  1. Suppose instead to save time, you decide to weigh 10 sacks at a time and take the average value. What is the average weight of a sack? (µ)

  2. For a group of 10 sacks, what would be the standard deviation of the group? (i.e. σ/√n)

  3. If you only buy the group of 10 sacks when its average weight is 101 lbs or more, what is the percentage of the groups of 10 that you will end up buying? Use the z-table calculator. (No screen shot necessary)

  4. (10 points) Download the data for the Quiz 1 Excel data file. Use Column A.


A certain drug is said to produce at least 3.4 micrograms of its product in the bloodstream. We want to test whether the data taken from 10 test patients shows this effect (µ >= 3.4). If the effect is less than its advertised level, then we will report the company for false advertising.
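The one-sample t-statistic for a test like this can be sketched as follows. The sample values are HYPOTHETICAL placeholders, not the Column A data from the Quiz 1 Excel file, so the numbers printed here are not the quiz answers. The p-value is left to the t-distribution calculator, since the Python standard library has no t-distribution.

```python
import math
from statistics import mean, stdev

# HYPOTHETICAL sample: placeholder values, not the Quiz 1 Column A data.
sample = [3.1, 3.6, 3.3, 3.5, 3.2, 3.4, 3.0, 3.7, 3.3, 3.2]
mu_0 = 3.4  # advertised level under the null hypothesis (mu >= 3.4)

n = len(sample)
xbar = mean(sample)
s = stdev(sample)   # sample standard deviation (n - 1 in the denominator)
df = n - 1

# t = (xbar - mu_0) / (s / sqrt(n))
t_stat = (xbar - mu_0) / (s / math.sqrt(n))

print(f"n = {n}, xbar = {xbar:.3f}, s = {s:.3f}, df = {df}, t = {t_stat:.3f}")
```

A negative t-statistic means the sample mean falls below the advertised level; whether it falls far enough below is what the p-value decides.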

  1. What is the null hypothesis Ho for our test?

  2. What is the alternative hypothesis Ha?

  3. What type of tail test will we use? (left tail, right tail, or two tails)?

  4. What is the mean of the sample xbar?

  5. What is the standard deviation of the sample s?

  6. What is the size of the sample n?

  7. We are going to use a t-statistic. Explain why we are not going to use a z-statistic.

  8. Calculate the t-statistic using xbar, µ, n, and s.

  9. How many degrees of freedom does this data set have?

  10. Use the t-distribution calculator to compute a p-value. Include a screen shot of your answer.

  11. Typically, we want a 95% confidence level. Based on your value of p, should we accept or reject the null hypothesis?

  12. What if we want a 99% confidence level? Based on your value of p, should we accept or reject the null hypothesis?

  13. (10 points) Use Columns C and D for this question.


You are a manager and are testing two methods of production – Method C and Method D. You want to know whether there is any difference in the output depending on which method is used. Column C gives the output for 9 workers using the first method. Column D gives the output for the SAME 9 workers using the second method. Now there are two ways to do the problem. We could test if µ(C) is equal to µ(D). Or since there is data for the same workers using different methods, we could test whether µ(C-D) = 0.

The easier way to do this problem is to test whether µ(C-D) equals 0, or is not equal to 0.
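The paired-difference approach can be sketched as below. Both data lists are HYPOTHETICAL placeholders, not the Column C and Column D values, so the output is only an illustration of the steps: form the differences, then run a one-sample t calculation on them against a mean of 0.

```python
import math
from statistics import mean, stdev

# HYPOTHETICAL outputs for the same 9 workers; not the Columns C and D data.
method_c = [52, 48, 55, 60, 47, 51, 58, 49, 53]
method_d = [50, 47, 56, 57, 45, 52, 55, 48, 50]

# Paired design: analyse the per-worker differences E = C - D.
diffs = [c - d for c, d in zip(method_c, method_d)]

n = len(diffs)
dbar = mean(diffs)
s_d = stdev(diffs)   # sample standard deviation of the differences
df = n - 1

# Under H0 the mean difference is 0, so t = dbar / (s_d / sqrt(n)).
t_stat = dbar / (s_d / math.sqrt(n))

print(f"differences: {diffs}")
print(f"n = {n}, mean = {dbar:.3f}, s = {s_d:.3f}, df = {df}, t = {t_stat:.3f}")
```

Working with the differences removes worker-to-worker variation, which is why the paired test is preferred when the same workers use both methods.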

  1. Make a new series of data samples by letting E = C – D. List your new series of 9 numbers.

  2. What is the null hypothesis H0 ?

  3. What is the alternative hypothesis Ha ?

  4. What type of tail test are we going to use? (left tail, right tail, two tail)

  5. What is the mean xbar of this new sample?

  6. What is the standard deviation of the sample s?

  7. What is the size of the sample n?

  8. How many degrees of freedom does this data set have?

  9. What is the t-statistic for this sample?

  10. Use the t-distribution calculator to compute a p-value. Show a screen shot of your answer.

  11. Based on this value of p and using a 95% confidence level, is there a difference between the two methods in the production? Should we accept or reject the null hypothesis?

  12. (10 points) Use columns F and G for the Least-Squares line.

  13. Use Excel to make a scatter plot of the data.

  14. Adjust the values of the x and y axes so that the data is centered in the plot.

  15. Put the trendline on your plot.

  16. Put the equation of the trendline on your plot.

  17. Put the R² value on your plot.

  18. The R value is a measure of how well the data fits a line. What is R?

  19. Make a screen shot of your final plot. How well do you think the data fits the line? (good fit, moderate fit, marginal fit, no fit)

  20. (10 points) You are working for the Centers for Disease Control and Prevention (CDC). It is flu season and you suspect that flu is affecting the very young and the very old more than other age groups. The hypothesis you will test is that the number of sick people in each age group should be about the same. You are going to use an X-squared test on this hypothesis. The average number of people in each age group who were reported to have the flu last month was:


0 – 10 years – 185

11 – 20 years – 175

21 – 30 years – 140

31 – 40 years – 145

41 – 50 years – 150

51 – 60 years – 140

61 – 70 years – 165

71 and over – 180
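A goodness-of-fit calculation on the counts above can be sketched in plain Python as an alternative to the applet. The helper `chi2_right_tail` is a made-up name for this example. It uses the standard closed-form series for the chi-square right tail with a whole number of degrees of freedom, and it assumes the test statistic is positive.

```python
import math

# Reported flu cases by age group, as listed above.
observed = [185, 175, 140, 145, 150, 140, 165, 180]

total = sum(observed)
expected = total / len(observed)   # equal counts under the null hypothesis
df = len(observed) - 1

chi_sq = sum((o - expected) ** 2 / expected for o in observed)

def chi2_right_tail(x, df):
    """Right-tail chi-square probability for integer df (closed-form series)."""
    if df % 2 == 0:
        half = x / 2
        return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(df // 2))
    root = math.sqrt(x)
    term, series = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        series += term              # root**(2r-1) / (1*3*...*(2r-1))
        term *= x / (2 * r + 1)
    return math.erfc(root / math.sqrt(2)) + math.sqrt(2 / math.pi) * math.exp(-x / 2) * series

p_value = chi2_right_tail(chi_sq, df)
print(f"total = {total}, expected per group = {expected:.0f}")
print(f"chi-square = {chi_sq:.2f}, df = {df}, p = {p_value:.4f}")
```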

  1. What is the null hypothesis?

  2. What is the alternative hypothesis?

  3. What is the total number of people who were reported sick last month?

  4. Your model is that the number of people who have the flu should be the same in each age group. Therefore, what is the expected number of people who should be sick in each age group?

  5. Enter the observed number of people who have the flu and the expected number of people who have the flu into the X-squared goodness of fit applet.

  6. What is the number of degrees of freedom?

  7. What is the p-value? Provide a screen shot of your answer.

  8. Using a 95% confidence level, should you accept or reject the null hypothesis?

  9. (10 points) This problem is the check to see whether you understand the X-squared test. There are only 2 test columns, so you cannot use the X-squared Goodness of Fit applet from the previous problem as it requires 3 or more test intervals.


You are flipping coins and wonder if the coin is “fair” or “weighted”. After 100 flips, you get 60 heads and 40 tails. Determine if the coin is “fair” or “weighted” using a X-squared test.

  1. What is the null hypothesis for this test?

  2. What is the alternative hypothesis?

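  3. Fill in the following table.

| Flip  | Observed | Expected | (O – E) | (O – E)² | (O – E)²/E |
|-------|----------|----------|---------|----------|------------|
| Heads | 60       |          |         |          |            |
| Tails | 40       |          |         |          |            |
| Sum   | 100      |          |         | n/a      |            |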


  1. What is the value of X² for this data?

  2. What is the number of degrees of freedom?

  3. Use the X² calculator to compute p (use the right tail option). Provide a screen shot of your calculation.

  4. Does this value of p support the null hypothesis strongly, moderately, weakly? Or does the value of p support the alternative hypothesis? Explain why.
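The coin-flip table and the right-tail probability can be checked by hand with a short sketch like the one below. It is standard-library Python only, and the variable names are made up for this example. With a single degree of freedom, the right-tail chi-square probability reduces to `erfc(sqrt(x / 2))`, which is what the last step uses.

```python
import math

observed = {"Heads": 60, "Tails": 40}
n_flips = sum(observed.values())
expected = n_flips / len(observed)   # 50 each if the coin is fair

# One (O - E)^2 / E term per outcome; their sum is the X-squared statistic.
terms = {face: (o - expected) ** 2 / expected for face, o in observed.items()}
chi_sq = sum(terms.values())
df = len(observed) - 1

# With 1 degree of freedom the right-tail probability is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_sq / 2))

for face, term in terms.items():
    print(f"{face}: O = {observed[face]}, E = {expected:.0f}, (O-E)^2/E = {term:.2f}")
print(f"X-squared = {chi_sq:.2f}, df = {df}, p = {p_value:.4f}")
```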