Testing for a Linear Correlation

Question

In Exercises 13–28, construct a scatterplot, and find the value of the linear correlation coefficient r. Also find the P-value or the critical values of r from Table A-6. Use a significance level of α = 0.05. Determine whether there is sufficient evidence to support a claim of a linear correlation between the two variables. (Save your work because the same data sets will be used in Section 10-2 exercises.)

Powerball Jackpots and Tickets Sold Listed below are the same data from Table 10-1 in the Chapter Problem, but an additional pair of values has been added in the last column. Is there sufficient evidence to conclude that there is a linear correlation between lottery jackpot amounts and numbers of tickets sold? Comment on the effect of the added pair of values in the last column. Compare the results to those obtained in Example 4.

[IMAGE]

Accepted Answer

All right. Hello, everyone. So this question says: a researcher is investigating whether there is a linear correlation between the number of hours studied and exam scores among a group of students. The data collected and the corresponding scatterplot are as follows. Calculate the value of the linear correlation coefficient r and determine the critical values of r at a significance level of alpha equals 0.05. Is there sufficient evidence to support the claim that there is a linear correlation between hours studied and exam scores?

All right, so first, you can see here on the screen that I went ahead and pre-wrote the data that we're given. In this case, hours studied goes on the x-axis because that is the independent variable. Exam scores are therefore the y values because that's the dependent variable. And the reason I bring that up has to do with the formula itself for the linear correlation coefficient.

So the formula: r is equal to n multiplied by the sum of xy, minus the sum of all x values multiplied by the sum of all y values. And then this is all divided by the product of two square roots. The first is the square root of n multiplied by the sum of all the squared x values, minus the sum of all x values, squared, and we'll discuss that difference in a little bit. The second is the square root of n multiplied by the sum of all the squared y values, minus the sum of all y values, squared.

All right, so first, let's go back to this table that I drew earlier. There are some calculations that need to be made for every xy pair. First, you take each x value and square it. That's one extra column of information. Then you take each y value and square that, which gives you another column. And then for each xy pair, you multiply the two values to find xy. Once you repeat these calculations for all the data values given, you find the sum of each column.
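Written out in symbols, the formula described above is the standard computational form of the linear correlation coefficient. The notation here is an assumption, since the transcript only gives the formula in words: n is the number of data pairs, and each sum runs over all of them.

```latex
r = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
         {\sqrt{n\sum x^{2} - \left(\sum x\right)^{2}}\;
          \sqrt{n\sum y^{2} - \left(\sum y\right)^{2}}}
```

Here \sum x^{2} means square each x value first and then add, while \left(\sum x\right)^{2} means add the x values first and then square the total.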
So, for example, the sum of all x values is equal to 325. The sum of all y values is 834. The sum of all the squared x values is 12,625. The sum of all the squared y values is 70,544. And the sum of all xy values is 28,495. This is the information that you're going to be plugging into your formula for r. Now, keep in mind that when the power of 2 is inside the parentheses, the number you're plugging in is the sum of the x values, or the sum of the y values, after each one has already been squared. By contrast, if the power of 2 is outside the parentheses, what you're plugging into the equation is the sum of all x values without squaring them first, and then you square that total. The same is true for the y values.

All right, so now let's plug the information into our expression. For the record, n is our sample size, which in this case is 10. So that's 10 multiplied by 28,495, minus 325 multiplied by 834. This is divided by the square root of 10 multiplied by 12,625, minus 325 squared, multiplied by the square root of 10 multiplied by 70,544, minus 834 squared. Evaluating this expression gives you approximately 0.974 as your correlation coefficient.

OK, so now let's check our requirements using the scatterplot. Recall that for this problem, the assumption that needs to be made is that the data are a simple random sample, or SRS, as I'm writing here. What's also worth noting is that the scatterplot approximately follows a straight line, and there are no outliers present.

So now let's find the critical values. Here we can use the table of critical values for the correlation coefficient r. We refer to the column with alpha equals 0.05 for a two-tailed test, and then look for the corresponding degrees of freedom. Recall that in this case, the degrees of freedom is equal to the sample size minus 2, so 10 minus 2 gives you 8. The critical values of r according to the table are positive and negative 0.632.

When the absolute value of our calculated r is greater than or equal to the critical value, we can conclude that there is sufficient evidence to support the claim of a linear correlation. And in this case, our value for r, 0.974, is indeed greater than our critical value. That leads us to our final answer, which reads as follows: r is equal to approximately 0.974, and there is sufficient evidence to support the claim of a linear correlation between hours studied and exam scores. And there you have it. So with that being said, thank you so very much for watching, and I hope you found this helpful.
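The arithmetic in this answer can be checked with a short Python sketch. It uses only the column sums quoted above, since the raw data pairs are not reproduced in the text. The function names are hypothetical, chosen for illustration. The value 2.306 is the standard two-tailed t critical value for alpha = 0.05 with 8 degrees of freedom, and converting it to a critical r is one common way to reproduce the table entry, not necessarily how the table in the video was built.

```python
import math


def correlation_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Linear correlation coefficient r computed from column sums."""
    numerator = n * sum_xy - sum_x * sum_y
    # sum_x2 is the sum of squared x values; sum_x ** 2 is the squared sum.
    denominator = (math.sqrt(n * sum_x2 - sum_x ** 2)
                   * math.sqrt(n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def critical_r(t_crit, df):
    """Convert a t critical value into the matching critical value of r."""
    return t_crit / math.sqrt(t_crit ** 2 + df)


n = 10  # sample size stated in the answer
r = correlation_from_sums(n, 325, 834, 12_625, 70_544, 28_495)

# Assumption: 2.306 is taken from a standard t table
# (two-tailed, alpha = 0.05, df = n - 2 = 8).
T_CRIT = 2.306
r_crit = critical_r(T_CRIT, n - 2)

# Decision rule from the answer: |r| >= critical value supports
# the claim of a linear correlation.
significant = abs(r) >= r_crit

print(f"r = {r:.3f}, critical r = +/-{r_crit:.3f}, significant: {significant}")
```

Rounded to three decimal places, this gives r = 0.974 and a critical value of 0.632, matching the figures in the answer.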


Key Concepts

Linear Correlation Coefficient (r)

P-value

Scatterplot
