Statistics 203
Mid-term 1

Questions 1-4: Multiple choice: circle only one answer per question (1 mark per question).

1. The probability of passing a Statistics 203 final exam is 0.80. Which of the following statements gives a valid interpretation of this probability?
   a. Out of every 10 students, 8 will pass the final exam.
   b. In the long run, the proportion of students passing the final exam is 0.80.
   c. For any group of 10 students, at least 8 students will pass the final exam.
   d. In the long run, the proportion of students passing the final exam is 0.50.

2. Suppose you have a large random sample from a population. Furthermore, suppose that the population distribution of measurements does not follow a normal distribution. What does the central limit theorem tell us?
   a. For large samples, the distribution of the data is approximately normal.
   b. For large samples, the distribution of the population mean, $\mu$, is approximately normal.
   c. For large samples, the distribution of the sample mean, $\bar{X}$, is approximately normal.
   d. For large samples, the sample mean, $\bar{X}$, is very close to $\mu$.

3. Hypothesis: Individuals who listen to music whilst studying for exams will achieve significantly higher exam grades than will individuals who study in silence. A research study is conducted to see if there is evidence in favour of the hypothesis. Thinking about this research hypothesis, which of the following would be an appropriate summary of a statistically significant difference in this setting?
   a. The observed average exam grades for students who listen to music are about the same as the average exam grades for students who study in silence.
   b. The observed difference in average exam grade for students who listen to music from the average exam grade for students who study in silence can be attributed to chance.
   c. The observed average exam grade for students who listen to music is larger than the average exam grade for students who study in silence.
   d.
      The observed difference between the average exam grade for students who listen to music and the average exam grade for students who study in silence is so large as to be unlikely to have occurred by chance.

4. According to the US Census Bureau, the average number of children per American family is 2.2. Which of the following most adequately describes this mean for the American population?
   a. The mean of 2.2 children makes no sense because a family cannot have 0.2 children.
   b. The mean of 2.2 is the long-term average number of children based on repeatedly sampling families from the American population.
   c. The mean of 2.2 children implies that American families have 1, 2 or 3 children.
   d. American families have between 2 and 3 children.

5. (10 marks) Motivated students from across Canada can participate in an annual mathematics competition. A random sample of 1,000 students is taken from each of three regions (Maritimes and Newfoundland, Central Canada and Western Canada) to compare student performance on the competition. The test was out of 60 marks. A boxplot of the 2013 results is shown below. Use this plot to answer the following questions.
   a. Use the above plot to compare the distributions of student scores by region.
      To get full marks, must correctly compare centres and spread. Some examples:
      Centre: The centres of the 3 distributions appear to be different. Western Canada has the largest median, followed by the Maritimes and then Central Canada. There is little overlap in the boxes (i.e. the middle 50% of each distribution).
      Spread: The scores in the Maritimes and Newfoundland are much less spread out than in the other regions, while the range and IQR for Western Canada are much larger than for the other two regions.
      Outliers: There are potential outliers shown in each region.
   b. What is the interval covering the middle 50% of test scores observed in Western Canada (explain how you determined this interval)?
      [50, 54)
   c.
      What percentage of students from Western Canada scored higher than 50 marks?
      Q1 = 50, so 75% of scores were larger than 50 marks.
   d. Roughly, what is the score for the worst performing student from Central Canada in this sample of students?
      About 45 marks.
   e. Can you tell from this plot whether any of the distributions are uni-modal (explain)?
      No, a boxplot does not display information about the modes (peaks). A histogram would display this (no requirement to mention histogram).

6. (8 marks) Metabolic rate is important in studies of dieting and exercise. The lean body mass (kg) and resting metabolic rate (cal./24hrs) for 100 men participating in a study on dieting were recorded. The histogram of the recorded resting metabolic rates is plotted below. A scatter-plot of the resting metabolic rates versus the lean body mass is also shown.
   a. What percentage of men in the study had a resting metabolic rate larger than 1350?
      There are 5 men in the [1350-1450) interval and 1 in each of the next three intervals, so 8 of the 100 men: 8%.
   b. In which bin would you expect the median resting metabolic rate for this study?
      There are 40 men in the first bin and 20 in the second. So, the median is in the second bin, [950-1050).
   c. Suppose that the correlation between resting metabolic rate and lean body mass is computed. What does correlation attempt to measure in this setting?
      It is attempting to measure the strength and direction of the linear association between resting metabolic rate and lean body mass.
   d. Is it appropriate to use the correlation to describe the relationship between resting metabolic rate and lean body mass (why or why not)?
      No, since the association is not linear.

7. (4 marks) An advice columnist asked divorced readers, via her advice column, whether they regretted their decision to divorce. About 30,000 responses were received, of which about 23,000 were from women. Nearly 75% of respondents said that they were glad that they divorced.
   a. What type of survey is this?
      Voluntary response survey.
   b.
      Briefly explain why this survey is likely to be biased.
      There are many good answers.
      For example, people who are strongly motivated are the ones who typically reply to voluntary surveys. Thus those who are really happy to be divorced may have been more likely to write in a response.
      Could also say that roughly three-quarters of respondents were women, and men may be more or less happy than women.

8. (2 marks) A university has 10,000 undergraduate and 5,000 graduate students. A survey of the students' opinions is conducted by first randomly selecting 100 of the 10,000 undergraduate students and then 50 of the 5,000 graduate students. Very briefly explain why this is not a simple random sample.
   A simple random sample requires that each sample of size n have the same probability of being selected. In this question, it is impossible to get a sample of, say, 150 undergraduates only.
   1 mark if only the 1st sentence is given.
   1 mark if the answer only states that it is a stratified random sample.

9. (2 marks) Average before-tax income in the City of Burnaby in 2005 for female single parent households was $46,228. This statistic was reported in the 2006 City of Burnaby Neighborhood Profile. Briefly explain why reporting the median income is likely to be a better measure of the centre for the distribution of female single parent household incomes in 2005 than reporting the average.
   Incomes typically follow a right-skewed distribution (do not have to mention this). In this case, a few well-paid single mothers will cause the average to be higher and not reflect the centre of the distribution.

10. (8 marks) The distribution of moisture content per pound of dehydrated protein concentrate is normally distributed with a mean of 3.5% and standard deviation of 0.6%.
   a. Interpret the meaning of the standard deviation of 0.6% in this setting.
      Based on repeated samples from this distribution, we would expect the average distance of observations from the population mean (3.5%) to be roughly 0.60%.
      Lose ½ mark for each missing bolded idea.
   b.
      A random sample of 36 one-pound specimens is taken and the moisture content of each is measured. What is the distribution of the sample mean moisture content?
      Mean of the distribution of the sample mean is 3.5%.
      Standard deviation of the sample mean is $\sigma/\sqrt{n} = 0.6/\sqrt{36} = 0.6/6 = 0.1$.
      So, the distribution of the sample mean is Normal with a mean of 3.5% and standard deviation of 0.1%, or N(3.5, 0.1).
   c. What is the probability that the sample mean of the 36 specimens in part b is larger than 3.8%?
      The sample mean follows a N(3.5, 0.1) distribution.
      $P(\bar{X} > 3.8) = 1 - P(\bar{X} \le 3.8)$
      $= 1 - P\left(Z \le \frac{3.8 - 3.5}{0.1}\right)$
      $= 1 - P\left(Z \le \frac{0.3}{0.1}\right)$
      $= 1 - P(Z \le 3) = 1 - 0.9987 = 0.0013$
   d. Find the 99th percentile of the distribution of sample mean moisture content based on a sample of 36 specimens as in part b.
      From Table A, the 99th percentile of the standard normal distribution is z = 2.33.
      To get the 99th percentile, we set the standardized value to the 99th percentile of the standard normal and solve for x.
      $2.33 = \frac{x - \mu}{\sigma/\sqrt{n}} = \frac{x - 3.5}{0.1}$
      $\Rightarrow 0.233 = x - 3.5$
      $\Rightarrow x = 3.5 + 0.233 = 3.733$
      So, the 99th percentile is 3.733%.

11. (6 marks) A simple game-of-chance at a high school fund-raising day used a single six-sided die (Note: die is the singular of dice). It costs $2 to play the game. If after rolling the die the numbers 1 or 6 are showing, the player is given a brand new $5 bill. If the numbers 2-5 are showing the player loses and has to do a silly dance.
   a. What is the probability distribution for the monetary return of this game from the player's point of view?
      X is the random variable denoting the player's return.
      X       -2     3
      P(X)    4/6    2/6
      To get full marks, must list outcomes and associated probabilities.
   b. What is the expected monetary return of this game from the player's point of view?
      $\mu_X = E(X) = \sum_{i=1}^{k} x_i \, p(x_i)$
      E(X) = -2(4/6) + 3(2/6) = -2/6 dollars or -0.33 dollars or -33 cents
   c. What is the minimum amount that the high school should charge to play the game if it is to expect to make a profit?
      Currently, they charge $2.
      Let W be the random variable denoting the school's profit.
      Denote the amount the school charges as y.
      W       y      y-5
      P(W)    4/6    2/6
      For the school to make a profit, E(W) must be greater than $0.
      E(W) = y(4/6) + (y-5)(2/6) > 0
      4y/6 + 2y/6 - 10/6 > 0
      6y/6 - 10/6 > 0
      6y - 10 > 0
      y > 10/6
      y > 1.66666
      So, the minimum amount they can charge and make a profit is $1.67 (you cannot charge a fraction of a cent).

12.
    (3 marks) Volunteers were given a 5×5 square puzzle to solve and the time it took them to solve it was measured in seconds. The data recorded are listed below:
    132, 141, 142, 143, 143, 147, 148, 149, 150, 158, 163
    Find quartiles for these data.
    1 mark each.
    There are n = 11 observations.
    Q1: To get the 25th percentile, compute np = 11(0.25) = 2.75. Rounding up, the position of the 25th percentile in the sorted sample is 3. So, Q1 = 142.
    Q2: The position of the median in the ordered sample is (n+1)/2 = 6. So, Q2 = 147.
    Q3: To get the 75th percentile, compute np = 11(0.75) = 8.25. Rounding up, the position of the 75th percentile in the sorted sample is 9. So, Q3 = 150.

Formula Sheet
Descriptive Statistics
Sample Mean: