Problem 10.4.10
Textbook Question
Garbage: Finding the Best Multiple Regression Equation
In Exercises 9–12, refer to the accompanying table, which was obtained by using the data from 62 households listed in Data Set 42 “Garbage Weight” in Appendix B. The response (y) variable is PLAS (weight of discarded plastic in pounds). The predictor (x) variables are METAL (weight of discarded metals in pounds), PAPER (weight of discarded paper in pounds), and GLASS (weight of discarded glass in pounds).
[IMAGE]
If exactly two predictor (x) variables are to be used to predict the weight of discarded plastic, which two variables should be chosen? Why?

Step 1: Understand the problem. The goal is to decide which two of the three predictor variables (METAL, PAPER, GLASS) should be used to predict the response variable (PLAS) in a multiple regression model. The decision rests on the statistical measures provided in the accompanying table, such as correlation coefficients, p-values, or adjusted R-squared values.
Step 2: Review the accompanying table. Look for the metrics that indicate how strongly each predictor, or each combination of predictors, is related to the response variable. For example, if correlation coefficients are given, check which predictors are most strongly correlated with PLAS.
Step 3: Evaluate statistical significance. If p-values are provided in the table, favor the predictors or models with small p-values (commonly below 0.05), since these indicate a statistically significant relationship with PLAS. A small p-value shows significance, not the size of the effect, so it should not be the only criterion.
Step 4: Compare adjusted R-squared values. If the table includes adjusted R-squared values for different combinations of predictor variables, identify the combination of two predictors with the highest adjusted R-squared value. Adjusted R-squared measures fit while accounting for the number of predictors. Among models with the same number of predictors, the one with the highest adjusted R-squared also has the highest R-squared.
Step 5: Make a decision. Based on the correlation coefficients, p-values, and adjusted R-squared values, select the two predictor variables that give the best-fitting, statistically significant model for PLAS. Justify the choice by citing the specific values from the table.
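The comparison in Steps 4 and 5 can be sketched in code. This is a minimal illustration, not the textbook's method or data: the table for this problem is not reproduced here, so the values below are synthetic stand-ins generated at random, and the helper name `adjusted_r2` is hypothetical. The sketch fits every two-predictor model by least squares and ranks the pairs by adjusted R-squared.

```python
from itertools import combinations

import numpy as np


def adjusted_r2(X, y):
    """Fit y on the columns of X plus an intercept by least squares.

    Returns (R-squared, adjusted R-squared).
    """
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])  # prepend the intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    return r2, adj


# Synthetic stand-in data (assumption): NOT the Garbage Weight data set.
rng = np.random.default_rng(0)
n = 62  # same number of households as in the problem
data = {
    "METAL": rng.normal(2.0, 1.0, n),
    "PAPER": rng.normal(9.0, 4.0, n),
    "GLASS": rng.normal(3.5, 2.5, n),
}
# Hypothetical relationship chosen only for illustration:
# the fake PLAS values depend on PAPER and GLASS, not on METAL.
plas = 0.5 + 0.12 * data["PAPER"] + 0.10 * data["GLASS"] + rng.normal(0, 0.3, n)

# Fit each of the three possible two-predictor models.
results = {}
for pair in combinations(data, 2):
    X = np.column_stack([data[name] for name in pair])
    results[pair] = adjusted_r2(X, plas)

best_pair = max(results, key=lambda p: results[p][1])
for pair, (r2, adj) in results.items():
    print(pair, f"R2={r2:.3f}", f"adjusted R2={adj:.3f}")
print("highest adjusted R2:", best_pair)
```

With the real table, no fitting is needed: the same ranking is read directly from the adjusted R-squared column for the three two-predictor rows.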

Key Concepts
Here are the essential concepts you must grasp in order to answer the question correctly.
Multiple Regression Analysis
Multiple regression analysis is a statistical technique used to model the relationship between a dependent variable and two or more independent variables. It helps in understanding how the independent variables collectively influence the dependent variable, allowing for predictions based on their values. In this context, the dependent variable is the weight of discarded plastic, while the independent variables are the weights of discarded metals, paper, and glass.
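Written out for this problem, the full three-predictor model has the general form below. The coefficient symbols b_0 through b_3 are notation assumed here for illustration; their values come from the table or from software, not from this sketch.

```latex
\hat{y}_{\text{PLAS}} = b_0 + b_1\,\text{METAL} + b_2\,\text{PAPER} + b_3\,\text{GLASS}
```

A two-predictor model drops one of the three predictor terms, so there are three candidate models to compare.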
Variable Selection
Variable selection is the process of identifying which independent variables should be included in a regression model to optimize its predictive power. This involves evaluating the significance and contribution of each variable to the model's performance. In the given question, selecting the two most relevant predictor variables is crucial for accurately predicting the weight of discarded plastic.
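A common criterion for comparing candidate models is adjusted R-squared, which can be computed from R-squared, the sample size n, and the number of predictors k. The formula below is the standard definition; the substitution uses n = 62 households from the problem statement and k = 2 for a two-predictor model.

```latex
R^2_{\text{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - k - 1}
\qquad\text{so for } n = 62,\ k = 2:\qquad
R^2_{\text{adj}} = 1 - \left(1 - R^2\right)\frac{61}{59}
```

Because the factor 61/59 is the same for every two-predictor model, ranking those models by adjusted R-squared gives the same order as ranking them by R-squared.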
Correlation and Multicollinearity
Correlation measures the strength and direction of a linear relationship between two variables. In multiple regression, it's important to assess the correlation between predictor variables to avoid multicollinearity, which occurs when independent variables are highly correlated. This can distort the regression results and make it difficult to determine the individual effect of each predictor on the dependent variable.
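One simple way to check for overlapping predictors is to look at their pairwise correlation coefficients. The sketch below assumes synthetic values, not the Garbage Weight data, and deliberately makes the fake GLASS values track the fake PAPER values so that one pair shows a high correlation.

```python
import numpy as np

# Synthetic stand-in data (assumption): NOT the Garbage Weight data set.
rng = np.random.default_rng(1)
n = 62
paper = rng.normal(9.0, 4.0, n)
# Hypothetical: GLASS partly tracks PAPER, so these two predictors overlap.
glass = 0.4 * paper + rng.normal(0.0, 1.0, n)
metal = rng.normal(2.0, 1.0, n)

names = ["METAL", "PAPER", "GLASS"]
# np.corrcoef treats each row as one variable and returns a 3 x 3 matrix.
corr = np.corrcoef(np.vstack([metal, paper, glass]))

# Print each distinct pair of predictors once.
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        print(f"r({names[i]}, {names[j]}) = {corr[i, j]:.3f}")
```

A pair of predictors with a correlation close to 1 or -1 carries largely the same information, which is a warning sign when both are placed in the same model.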