Download this R markdown as: R, Rmd.
Outline of this lab session:
anova, drop1, add1,
vif.library("mffSM") # Cars2004nh
library("colorspace") # rainbow_hcl() colors
library("psych") # pairs.panels()
library("car") # vif()
We will work again with the dataset peat and
Cars2004nh from previous exercises.
peat <- read.csv("http://www.karlin.mff.cuni.cz/~vavraj/nmfm334/data/peat.csv",
header = TRUE, stringsAsFactors = TRUE)
data(Cars2004nh, package = "mffSM")
Given response \(\mathbf{Y}\) and available set of covariates \(\mathbb{Z}\) we consider two different model matrices \(\mathbb{X}^{0}\) and \(\mathbb{X}^1\) of ranks \(r_0\) and \(r_1\). These correspond to two different models: \[ \mathrm{M}_0: \mathbb{Y} \mid \mathbb{Z} \sim \left(\mathbb{X}^0 \boldsymbol{\beta}^0, \, \sigma_0^2 \mathbb{I}_n\right), \qquad \mathrm{M}_1: \mathbb{Y} \mid \mathbb{Z} \sim \left(\mathbb{X}^1 \boldsymbol{\beta}^1, \, \sigma_1^2 \mathbb{I}_n\right) \] Model \(\mathrm{M}_0\) is considered to be a submodel of model \(\mathrm{M}_1\) if the linear subspace of \(\mathbb{R}^n\) generated by the columns of \(\mathbb{X}^0\) (\(\mathcal{M}(\mathbb{X}^0)\)) is a subspace of the space generated by the columns of \(\mathbb{X}^1\). In particular, \(\mathcal{M}(\mathbb{X}^0) \subset \mathcal{M}(\mathbb{X}^1)\).
Let us label the quantities computed from models \(\mathrm{M}_0\) and \(\mathrm{M}_1\) with a subscript or superscript \(0\) or \(1\), respectively. If submodel \(\mathrm{M}_0\) holds, then
If additionally, a normal linear model is assumed, \(\mathbb{Y} \mid \mathbb{Z} \sim \mathsf{N}\left(\mathbb{X}^0 \boldsymbol{\beta}^0, \, \sigma_0^2 \mathbb{I}_n\right)\), then
\[ F = \frac{\frac{\mathsf{RSS}_0 - \mathsf{RSS}_1}{r_1 - r_0}}{\frac{\mathsf{RSS}_1}{n - r_1}} \sim \mathsf{F}_{r_1 - r_0, n - r_1}. \]
This result can used for testing the following statistical hypotheses: \[\begin{align*} \mathrm{H}_0:& \text{ model } \mathrm{M}_0 \text{ holds, i.e. } \mathsf{E} \left[\mathbf{Y} \mid \mathbb{Z}\right] \in \mathcal{M}(\mathbb{X}^0), \\ \mathrm{H}_1:& \text{ model } \mathrm{M}_1 \text{ holds, i.e. } \mathsf{E} \left[\mathbf{Y} \mid \mathbb{Z}\right] \in \mathcal{M}(\mathbb{X}^1) \setminus \mathcal{M}(\mathbb{X}^0). \end{align*}\] This aims at answering the following questions:
\(F\)-test on submodel on significance level \(\alpha\):
Consider the peat data from previous exercises.

We will compute the \(F\)-test for comparison of the additive (parallel lines) vs. interaction (group-specific lines) model. Clearly, additive model is a submodel of the interaction model.
fit_add <- lm(N ~ depth + group, data = peat)
fit_int <- lm(N ~ depth * group, data = peat)
(RSS0 <- sum(residuals(fit_add)^2))
## [1] 5.566937
(RSS1 <- sum(residuals(fit_int)^2))
## [1] 2.568338
(D2 <- RSS0 - RSS1)
## [1] 2.998599
(df_dif <- fit_add$df.residual - fit_int$df.residual)
## [1] 3
(Fstat <- D2/df_dif / (RSS1/fit_int$df.residual))
## [1] 54.4845
(Fcritical <- qf(0.95, df1 = df_dif, df2 = fit_int$df.residual))
## [1] 2.669256
(rejectH0 <- (Fstat > Fcritical))
## [1] TRUE
(pval <- pf(Fstat, df1 = df_dif, df2 = fit_int$df.residual, lower.tail = FALSE))
## [1] 2.128525e-23
Beyond any doubt we proved that the interaction model significantly
improves the additive model. The submodel \(F\)-test is already implemented by
anova function.
anova(fit_add, fit_int)
## Analysis of Variance Table
##
## Model 1: N ~ depth + group
## Model 2: N ~ depth * group
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 143 5.5669
## 2 140 2.5683 3 2.9986 54.484 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This function also excepts only a single model and provides intermediate results for model construction. In this case it sequentially compares the following models:
N ~ 1 (intercept-only model),N ~ depth (adds one coefficient),N ~ depth + group (adds three coefficients),N ~ depth + group + depth:group (adds three
coefficients).anova(fit_int)
## Analysis of Variance Table
##
## Response: N
## Df Sum Sq Mean Sq F value Pr(>F)
## depth 1 0.6263 0.62628 34.138 3.44e-08 ***
## group 3 3.3386 1.11287 60.663 < 2.2e-16 ***
## depth:group 3 2.9986 0.99953 54.484 < 2.2e-16 ***
## Residuals 140 2.5683 0.01835
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Notice that the tests provided by anova this way depend
on the order of the terms.
anova(lm(N ~ group * depth, data = peat))
## Analysis of Variance Table
##
## Response: N
## Df Sum Sq Mean Sq F value Pr(>F)
## group 3 3.2724 1.09079 59.459 < 2.2e-16 ***
## depth 1 0.6925 0.69251 37.749 7.877e-09 ***
## group:depth 3 2.9986 0.99953 54.484 < 2.2e-16 ***
## Residuals 140 2.5683 0.01835
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Also note that when compared to the intercept-only model we get the
\(F\)-test provided in the basic
summary of the model:
summary(fit_int)
##
## Call:
## lm(formula = N ~ depth * group, data = peat)
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.36829 -0.08469 -0.00789 0.06112 0.37713
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.846956 0.051193 16.544 < 2e-16 ***
## depth 0.005048 0.005724 0.882 0.379
## groupCB-VJJ 0.322113 0.064417 5.000 1.68e-06 ***
## groupVJJ -0.064041 0.075374 -0.850 0.397
## groupVJJ-CB 0.075906 0.064417 1.178 0.241
## depth:groupCB-VJJ -0.032600 0.007389 -4.412 2.03e-05 ***
## depth:groupVJJ 0.043942 0.008312 5.287 4.68e-07 ***
## depth:groupVJJ-CB 0.041591 0.007389 5.629 9.56e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.1354 on 140 degrees of freedom
## Multiple R-squared: 0.7306, Adjusted R-squared: 0.7171
## F-statistic: 54.23 on 7 and 140 DF, p-value: < 2.2e-16
anova(lm(N ~ 1, data = peat), fit_int)
## Analysis of Variance Table
##
## Model 1: N ~ 1
## Model 2: N ~ depth * group
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 147 9.5318
## 2 140 2.5683 7 6.9635 54.226 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
In the rest of the exercise we will work with Cars2004nh
dataset from mffSM package. It contains information about
\(n=\) 425 new cars of the year 2004.
We will work with only a subset of variables suitable for predicting the
consumption of the vehicle:
y <- "consumption"
x <- c("ftype", "fdrive", "engine.size", "ncylinder", "horsepower", "weight", "length", "width", "wheel.base")
head(Cars2004nh[, c(y, x)])
## consumption ftype fdrive engine.size ncylinder horsepower weight length width wheel.base
## 1 7.65 personal front 1.6 4 103 1075 424 168 249
## 2 7.65 personal front 1.6 4 103 1065 389 168 249
## 3 7.70 personal front 2.2 4 140 1187 465 175 264
## 4 7.70 personal front 2.2 4 140 1214 465 173 264
## 5 7.70 personal front 2.2 4 140 1187 465 175 264
## 6 7.30 personal front 2.0 4 132 1171 442 170 267
summary(Cars2004nh[, c(y, x)])
## consumption ftype fdrive engine.size ncylinder horsepower weight
## Min. : 5.65 personal:242 front:223 Min. :1.300 Min. :-1.000 Min. :100.0 Min. : 923
## 1st Qu.: 9.65 wagon : 30 rear :110 1st Qu.:2.400 1st Qu.: 4.000 1st Qu.:165.0 1st Qu.:1412
## Median :10.70 SUV : 60 4x4 : 92 Median :3.000 Median : 6.000 Median :210.0 Median :1577
## Mean :10.75 pickup : 24 Mean :3.208 Mean : 5.791 Mean :216.8 Mean :1626
## 3rd Qu.:11.65 sport : 49 3rd Qu.:3.900 3rd Qu.: 6.000 3rd Qu.:255.0 3rd Qu.:1804
## Max. :21.55 minivan : 20 Max. :8.300 Max. :12.000 Max. :500.0 Max. :3261
## NA's :14 NA's :2
## length width wheel.base
## Min. :363.0 Min. :163.0 Min. :226.0
## 1st Qu.:450.0 1st Qu.:175.0 1st Qu.:262.0
## Median :472.0 Median :180.0 Median :272.0
## Mean :470.6 Mean :181.1 Mean :274.9
## 3rd Qu.:490.0 3rd Qu.:185.0 3rd Qu.:284.0
## Max. :577.0 Max. :206.0 Max. :366.0
## NA's :26 NA's :28 NA's :2
# psych::pairs.panels(Cars2004nh[, c(y,"weight", "iweight", "lweight")])
# keep only
data <- na.omit(Cars2004nh[, c(y, x)])
# exploration
table(data$ncylinder)
##
## 4 5 6 8 12
## 121 7 179 75 2
data$fcyl <- cut(data$ncylinder, breaks = c(4, 6, 8, 12), right = FALSE, include.lowest = TRUE)
summary(data$fcyl)
## [4,6) [6,8) [8,12]
## 128 179 77
data$ftype <- factor(data$ftype, levels = setdiff(levels(data$ftype), "pickup")) # pickups not represented in data
# consumption vs categorical variables
xcat <- c("ftype", "fdrive", "fcyl")
par(mfrow = c(1, length(xcat)), mar = c(4,4,0.5,0.5))
for(x in xcat){
plot(as.formula(paste("consumption ~ ", x)), data = data)
}

# consumption vs numeric variables
xnum <- c("engine.size", "horsepower", "weight", "length", "width", "wheel.base")
psych::pairs.panels(data[, c(y,xnum)])

The ultimate goal is to find as simple as possible (for
interpretation) model for explaining consumption of a
vehicle given other characteristics. First, we fit an additive model
with 3 categorical and 6 numerical regressors.
fit_add <- lm(consumption ~ ftype + fdrive + fcyl + engine.size + horsepower + weight + length + width + wheel.base, data = data)
summary(fit_add)
##
## Call:
## lm(formula = consumption ~ ftype + fdrive + fcyl + engine.size +
## horsepower + weight + length + width + wheel.base, data = data)
##
## Residuals:
## Min 1Q Median 3Q Max
## -3.0291 -0.4651 -0.0348 0.4332 3.5840
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 9.8605120 1.3260663 7.436 7.33e-13 ***
## ftypewagon 0.0048205 0.1455164 0.033 0.97359
## ftypeSUV 0.8448182 0.1640069 5.151 4.22e-07 ***
## ftypesport 0.1126023 0.1756491 0.641 0.52188
## ftypeminivan 0.2003173 0.2146830 0.933 0.35139
## fdriverear 0.1774620 0.1170848 1.516 0.13046
## fdrive4x4 0.2395297 0.1180205 2.030 0.04312 *
## fcyl[6,8) 0.2816061 0.1418941 1.985 0.04793 *
## fcyl[8,12] -0.0040983 0.2747498 -0.015 0.98811
## engine.size 0.4702588 0.1179395 3.987 8.06e-05 ***
## horsepower 0.0043882 0.0010376 4.229 2.96e-05 ***
## weight 0.0045723 0.0003484 13.123 < 2e-16 ***
## length -0.0050020 0.0027741 -1.803 0.07219 .
## width -0.0119624 0.0094784 -1.262 0.20772
## wheel.base -0.0178717 0.0054620 -3.272 0.00117 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.716 on 369 degrees of freedom
## Multiple R-squared: 0.8845, Adjusted R-squared: 0.8801
## F-statistic: 201.8 on 14 and 369 DF, p-value: < 2.2e-16
The model could be sequentially built by adding one variable at a
time, which can be monitored by anova:
anova(fit_add)
## Analysis of Variance Table
##
## Response: consumption
## Df Sum Sq Mean Sq F value Pr(>F)
## ftype 4 605.26 151.314 295.1589 < 2.2e-16 ***
## fdrive 2 225.46 112.728 219.8913 < 2.2e-16 ***
## fcyl 2 411.26 205.631 401.1115 < 2.2e-16 ***
## engine.size 1 72.13 72.132 140.7035 < 2.2e-16 ***
## horsepower 1 41.67 41.666 81.2757 < 2.2e-16 ***
## weight 1 67.92 67.925 132.4966 < 2.2e-16 ***
## length 1 18.06 18.060 35.2279 6.766e-09 ***
## width 1 1.04 1.039 2.0266 0.155412
## wheel.base 1 5.49 5.488 10.7059 0.001169 **
## Residuals 369 189.17 0.513
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
However, as mentioned above the individual tests are heavily dependent on the order of the variables despite reaching the same model in the end:
anova(lm(consumption ~ width + wheel.base + length + ftype + engine.size + fcyl + horsepower + weight + fdrive, data = data))
## Analysis of Variance Table
##
## Response: consumption
## Df Sum Sq Mean Sq F value Pr(>F)
## width 1 702.18 702.18 1369.7021 < 2.2e-16 ***
## wheel.base 1 1.51 1.51 2.9481 0.0868198 .
## length 1 47.62 47.62 92.8803 < 2.2e-16 ***
## ftype 4 272.77 68.19 133.0208 < 2.2e-16 ***
## engine.size 1 248.00 248.00 483.7547 < 2.2e-16 ***
## fcyl 2 8.83 4.41 8.6086 0.0002218 ***
## horsepower 1 50.54 50.54 98.5763 < 2.2e-16 ***
## weight 1 114.20 114.20 222.7604 < 2.2e-16 ***
## fdrive 2 2.64 1.32 2.5773 0.0773435 .
## Residuals 369 189.17 0.51
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The submodel test compares two models where both or just the larger ones need to hold. So when an important variable is missing the conclusions may be invalid.
Hence, we need a different kind of test to determine the significance
of a variable. The function drop1 compares the original
large model with many different submodels, which are created by
dropping one term without breaking the hierarchy:
drop1(fit_add, test = "F")
## Single term deletions
##
## Model:
## consumption ~ ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base
## Df Sum of Sq RSS AIC F value Pr(>F)
## <none> 189.17 -241.873
## ftype 4 15.091 204.26 -220.400 7.3591 1.036e-05 ***
## fdrive 2 2.643 191.81 -240.545 2.5773 0.077344 .
## fcyl 2 7.115 196.28 -231.694 6.9395 0.001100 **
## engine.size 1 8.150 197.32 -227.674 15.8985 8.059e-05 ***
## horsepower 1 9.169 198.34 -225.697 17.8855 2.962e-05 ***
## weight 1 88.285 277.45 -96.795 172.2114 < 2.2e-16 ***
## length 1 1.667 190.84 -240.504 3.2511 0.072192 .
## width 1 0.817 189.99 -242.219 1.5928 0.207721
## wheel.base 1 5.488 194.66 -232.890 10.7059 0.001169 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Now we know which variables are significant while adjusting for the
effect of other remaining variables. The insignificant ones seem to be
redundant, thus, we can remove them from the model by
update function. Now all remaining variables are
significant:
fit1 <- update(fit_add, .~. - fdrive - length - width)
anova(fit1, fit_add)
## Analysis of Variance Table
##
## Model 1: consumption ~ ftype + fcyl + engine.size + horsepower + weight +
## wheel.base
## Model 2: consumption ~ ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 373 198.03
## 2 369 189.17 4 8.8561 4.3188 0.001996 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
drop1(fit1, test = "F")
## Single term deletions
##
## Model:
## consumption ~ ftype + fcyl + engine.size + horsepower + weight +
## wheel.base
## Df Sum of Sq RSS AIC F value Pr(>F)
## <none> 198.03 -232.303
## ftype 4 21.843 219.87 -200.124 10.2859 6.542e-08 ***
## fcyl 2 6.949 204.97 -223.059 6.5447 0.001608 **
## engine.size 1 4.850 202.88 -225.012 9.1348 0.002681 **
## horsepower 1 13.679 211.70 -208.655 25.7649 6.099e-07 ***
## weight 1 114.666 312.69 -58.884 215.9842 < 2.2e-16 ***
## wheel.base 1 26.324 224.35 -186.376 49.5842 9.197e-12 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
However, by removing multiple variables at once we might have
eliminated too many variables. It can happen that one of the eliminated
variables may still significantly improve a model. As we can see in
anova output above fit_add is a significant
improvement of fit1 despite all eliminated variables being
insignificant in fit_add. Similarly as we can check
dropping one variable, we can also inspect, how model improves upon
adding one more term with add1 function.
The scope parameter should be a vector of terms that we
wish to add to the current model. Or alternatively set
scope as a larger model, where only terms that do not
appear in fit1 will be added - one at a time:
# add1(fit1, scope = c("fdrive", "length", "width"), test = "F")
add1(fit1, scope = fit_add, test = "F")
## Single term additions
##
## Model:
## consumption ~ ftype + fcyl + engine.size + horsepower + weight +
## wheel.base
## Df Sum of Sq RSS AIC F value Pr(>F)
## <none> 198.03 -232.30
## fdrive 2 4.8187 193.21 -237.76 4.6265 0.010361 *
## length 1 5.3414 192.68 -240.80 10.3123 0.001437 **
## width 1 3.2180 194.81 -236.59 6.1450 0.013621 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
It seems like adding length back could improve the
model:
fit2 <- update(fit1, .~. + length)
anova(fit2, fit_add) # fit2 comparable to fit_add
## Analysis of Variance Table
##
## Model 1: consumption ~ ftype + fcyl + engine.size + horsepower + weight +
## wheel.base + length
## Model 2: consumption ~ ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 372 192.68
## 2 369 189.17 3 3.5147 2.2853 0.07848 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
drop1(fit2, test = "F") # all variables are significant
## Single term deletions
##
## Model:
## consumption ~ ftype + fcyl + engine.size + horsepower + weight +
## wheel.base + length
## Df Sum of Sq RSS AIC F value Pr(>F)
## <none> 192.68 -240.803
## ftype 4 14.670 207.35 -220.627 7.0805 1.674e-05 ***
## fcyl 2 7.466 200.15 -230.206 7.2066 0.0008498 ***
## engine.size 1 7.392 200.07 -228.348 14.2704 0.0001843 ***
## horsepower 1 10.572 203.25 -222.293 20.4100 8.406e-06 ***
## weight 1 118.015 310.70 -59.339 227.8421 < 2.2e-16 ***
## wheel.base 1 6.139 198.82 -230.760 11.8523 0.0006414 ***
## length 1 5.341 198.03 -232.303 10.3123 0.0014365 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
add1(fit2, scope = fit_add, test = "F") # none other can improve the model
## Single term additions
##
## Model:
## consumption ~ ftype + fcyl + engine.size + horsepower + weight +
## wheel.base + length
## Df Sum of Sq RSS AIC F value Pr(>F)
## <none> 192.68 -240.80
## fdrive 2 2.69812 189.99 -242.22 2.6273 0.07362 .
## width 1 0.87213 191.81 -240.54 1.6869 0.19482
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
After adding just length we end with model
fit2 that
drop1 output),fdrive and width) do not
improve the model (\(p > 0.05\) in
add1 output).So far we have neglected an option for the effects of variables to be modified by other variables: different lines for different groups, etc. We will start with a model with all possible variables and eliminate the insignificant terms on our way to final model. Here, we will limit ourselves to two-way interactions that are usually sufficient for modelling:
fit_int <- lm(consumption ~ (ftype + fdrive + fcyl + engine.size + horsepower + weight + length + width + wheel.base)^2, data = data)
anova(fit_add, fit_int) # two-way interaction improve additive model
## Analysis of Variance Table
##
## Model 1: consumption ~ ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base
## Model 2: consumption ~ (ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base)^2
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 369 189.17
## 2 288 113.02 81 76.153 2.3958 5.722e-08 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(d1fit_int <- drop1(fit_int, test = "F")) # however, majority seem irrelevant in the full model
## Single term deletions
##
## Model:
## consumption ~ (ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base)^2
## Df Sum of Sq RSS AIC F value Pr(>F)
## <none> 113.02 -277.68
## ftype:fdrive 7 0.7716 113.79 -289.06 0.2809 0.96113
## ftype:fcyl 7 1.9191 114.94 -285.21 0.6986 0.67322
## ftype:engine.size 4 3.2190 116.23 -274.89 2.0508 0.08740 .
## ftype:horsepower 4 1.1508 114.17 -281.79 0.7331 0.56998
## ftype:weight 4 1.3244 114.34 -281.20 0.8438 0.49838
## ftype:length 4 0.4079 113.42 -284.29 0.2599 0.90348
## ftype:width 4 3.7659 116.78 -273.09 2.3992 0.05031 .
## ftype:wheel.base 4 1.5952 114.61 -280.29 1.0163 0.39919
## fdrive:fcyl 4 1.9176 114.93 -279.21 1.2217 0.30170
## fdrive:engine.size 2 0.3558 113.37 -280.47 0.4534 0.63591
## fdrive:horsepower 2 0.3370 113.35 -280.53 0.4294 0.65130
## fdrive:weight 2 0.0911 113.11 -281.37 0.1161 0.89042
## fdrive:length 2 1.1987 114.22 -277.62 1.5273 0.21887
## fdrive:width 2 1.0536 114.07 -278.11 1.3424 0.26284
## fdrive:wheel.base 2 1.8076 114.82 -275.58 2.3031 0.10178
## fcyl:engine.size 2 1.8858 114.90 -275.32 2.4028 0.09228 .
## fcyl:horsepower 2 2.2727 115.29 -274.03 2.8958 0.05687 .
## fcyl:weight 2 0.7753 113.79 -279.05 0.9879 0.37362
## fcyl:length 2 0.9034 113.92 -278.62 1.1511 0.31773
## fcyl:width 2 2.0903 115.11 -274.64 2.6634 0.07143 .
## fcyl:wheel.base 2 0.2770 113.29 -280.74 0.3529 0.70295
## engine.size:horsepower 1 0.6172 113.63 -277.58 1.5727 0.21083
## engine.size:weight 1 1.7137 114.73 -273.90 4.3669 0.03752 *
## engine.size:length 1 0.0448 113.06 -279.52 0.1142 0.73562
## engine.size:width 1 0.0038 113.02 -279.66 0.0096 0.92191
## engine.size:wheel.base 1 0.0825 113.10 -279.39 0.2102 0.64692
## horsepower:weight 1 0.0590 113.08 -279.48 0.1503 0.69856
## horsepower:length 1 0.0126 113.03 -279.63 0.0320 0.85810
## horsepower:width 1 0.6883 113.70 -277.34 1.7541 0.18641
## horsepower:wheel.base 1 0.6519 113.67 -277.47 1.6612 0.19847
## weight:length 1 1.1320 114.15 -275.85 2.8846 0.09051 .
## weight:width 1 0.0624 113.08 -279.46 0.1590 0.69034
## weight:wheel.base 1 0.0012 113.02 -279.67 0.0030 0.95629
## length:width 1 0.0044 113.02 -279.66 0.0113 0.91533
## length:wheel.base 1 0.1389 113.16 -279.20 0.3539 0.55239
## width:wheel.base 1 0.6650 113.68 -277.42 1.6945 0.19405
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(a1fit_int <- add1(fit_add, scope = fit_int, test = "F")) # a dozen improves the additive model
## Single term additions
##
## Model:
## consumption ~ ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base
## Df Sum of Sq RSS AIC F value Pr(>F)
## <none> 189.17 -241.87
## ftype:fdrive 7 3.7211 185.45 -235.50 1.0377 0.4042049
## ftype:fcyl 7 12.0978 177.07 -253.25 3.5332 0.0010994 **
## ftype:engine.size 4 11.5556 177.61 -258.08 5.9368 0.0001227 ***
## ftype:horsepower 4 3.3282 185.84 -240.69 1.6342 0.1650099
## ftype:weight 4 5.4219 183.75 -245.04 2.6926 0.0308789 *
## ftype:length 4 4.0795 185.09 -242.24 2.0112 0.0923456 .
## ftype:width 4 3.0427 186.13 -240.10 1.4917 0.2040791
## ftype:wheel.base 4 2.3947 186.77 -238.76 1.1699 0.3236844
## fdrive:fcyl 4 5.8866 183.28 -246.01 2.9307 0.0208522 *
## fdrive:engine.size 2 3.7778 185.39 -245.62 3.7393 0.0246823 *
## fdrive:horsepower 2 1.5298 187.64 -240.99 1.4960 0.2253856
## fdrive:weight 2 3.1855 185.98 -244.39 3.1430 0.0443178 *
## fdrive:length 2 2.7738 186.40 -243.54 2.7308 0.0664945 .
## fdrive:width 2 1.9720 187.20 -241.90 1.9330 0.1461778
## fdrive:wheel.base 2 1.5470 187.62 -241.03 1.5130 0.2216173
## fcyl:engine.size 2 5.1166 184.05 -248.40 5.1013 0.0065280 **
## fcyl:horsepower 2 7.0358 182.13 -252.43 7.0886 0.0009537 ***
## fcyl:weight 2 6.4373 182.73 -251.17 6.4644 0.0017413 **
## fcyl:length 2 0.7900 188.38 -239.48 0.7695 0.4639793
## fcyl:width 2 5.6239 183.54 -249.46 5.6225 0.0039340 **
## fcyl:wheel.base 2 0.2132 188.96 -238.31 0.2070 0.8130828
## engine.size:horsepower 1 0.6185 188.55 -241.13 1.2072 0.2726124
## engine.size:weight 1 3.4547 185.71 -246.95 6.8456 0.0092521 **
## engine.size:length 1 0.3215 188.85 -240.53 0.6265 0.4291480
## engine.size:width 1 1.3281 187.84 -242.58 2.6018 0.1075978
## engine.size:wheel.base 1 0.5624 188.61 -241.02 1.0973 0.2955463
## horsepower:weight 1 0.0167 189.15 -239.91 0.0325 0.8569586
## horsepower:length 1 2.7018 186.47 -245.40 5.3322 0.0214888 *
## horsepower:width 1 0.7577 188.41 -241.41 1.4799 0.2245767
## horsepower:wheel.base 1 0.0169 189.15 -239.91 0.0328 0.8563657
## weight:length 1 0.5910 188.58 -241.07 1.1532 0.2835803
## weight:width 1 1.1161 188.05 -242.15 2.1841 0.1402996
## weight:wheel.base 1 0.3088 188.86 -240.50 0.6018 0.4384019
## length:width 1 0.0317 189.14 -239.94 0.0616 0.8040616
## length:wheel.base 1 0.0682 189.10 -240.01 0.1328 0.7157639
## width:wheel.base 1 0.5523 188.62 -241.00 1.0776 0.2999184
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We start slowly by eliminating those interaction terms that have
\(p\)-value in drop1 lower
than \(0.5\), which is not a mistake,
we do not want to eliminate too much variables at the same time.
(insignif_d1 <- rownames(d1fit_int)[-1][d1fit_int$`Pr(>F)`[-1] > 0.5])
## [1] "ftype:fdrive" "ftype:fcyl" "ftype:horsepower" "ftype:length"
## [5] "fdrive:engine.size" "fdrive:horsepower" "fdrive:weight" "fcyl:wheel.base"
## [9] "engine.size:length" "engine.size:width" "engine.size:wheel.base" "horsepower:weight"
## [13] "horsepower:length" "weight:width" "weight:wheel.base" "length:width"
## [17] "length:wheel.base"
fit3 <- update(fit_int, as.formula(paste(".~. -", paste(insignif_d1, collapse = " - "))))
(d1fit3 <- drop1(fit3, test = "F")) # many remaining are significant
## Single term deletions
##
## Model:
## consumption ~ ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base + ftype:engine.size +
## ftype:weight + ftype:width + ftype:wheel.base + fdrive:fcyl +
## fdrive:length + fdrive:width + fdrive:wheel.base + fcyl:engine.size +
## fcyl:horsepower + fcyl:weight + fcyl:length + fcyl:width +
## engine.size:horsepower + engine.size:weight + horsepower:width +
## horsepower:wheel.base + weight:length + width:wheel.base
## Df Sum of Sq RSS AIC F value Pr(>F)
## <none> 124.69 -317.93
## ftype:engine.size 4 7.0802 131.77 -304.72 4.6420 0.0011719 **
## ftype:weight 4 4.5891 129.28 -312.05 3.0088 0.0184637 *
## ftype:width 4 9.0813 133.77 -298.93 5.9540 0.0001233 ***
## ftype:wheel.base 4 5.4174 130.11 -309.60 3.5518 0.0074576 **
## fdrive:fcyl 4 2.2165 126.91 -319.16 1.4532 0.2162380
## fdrive:length 2 4.1071 128.80 -309.48 5.3854 0.0049985 **
## fdrive:width 2 0.5814 125.27 -320.14 0.7624 0.4673780
## fdrive:wheel.base 2 5.7513 130.44 -304.61 7.5414 0.0006282 ***
## fcyl:engine.size 2 3.2488 127.94 -312.05 4.2600 0.0149141 *
## fcyl:horsepower 2 2.1875 126.88 -315.25 2.8683 0.0582247 .
## fcyl:weight 2 2.4051 127.09 -314.59 3.1537 0.0439944 *
## fcyl:length 2 4.2328 128.92 -309.11 5.5502 0.0042616 **
## fcyl:width 2 7.4819 132.17 -299.55 9.8107 7.281e-05 ***
## engine.size:horsepower 1 2.7908 127.48 -311.43 7.3188 0.0071812 **
## engine.size:weight 1 3.1177 127.81 -310.45 8.1762 0.0045169 **
## horsepower:width 1 4.0397 128.73 -307.68 10.5942 0.0012532 **
## horsepower:wheel.base 1 0.0310 124.72 -319.83 0.0812 0.7759010
## weight:length 1 4.3449 129.03 -306.78 11.3945 0.0008253 ***
## width:wheel.base 1 5.5246 130.21 -303.28 14.4883 0.0001684 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
add1(fit3, scope = fit_int, test = "F") # none should be added back
## Single term additions
##
## Model:
## consumption ~ ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base + ftype:engine.size +
## ftype:weight + ftype:width + ftype:wheel.base + fdrive:fcyl +
## fdrive:length + fdrive:width + fdrive:wheel.base + fcyl:engine.size +
## fcyl:horsepower + fcyl:weight + fcyl:length + fcyl:width +
## engine.size:horsepower + engine.size:weight + horsepower:width +
## horsepower:wheel.base + weight:length + width:wheel.base
## Df Sum of Sq RSS AIC F value Pr(>F)
## <none> 124.69 -317.93
## ftype:fdrive 7 2.4399 122.25 -311.52 0.9124 0.49695
## ftype:fcyl 7 2.5637 122.13 -311.91 0.9596 0.46081
## ftype:horsepower 4 3.2015 121.49 -319.92 2.1280 0.07712 .
## ftype:length 4 0.6264 124.06 -311.86 0.4077 0.80305
## fdrive:engine.size 2 1.0774 123.61 -317.26 1.4163 0.24410
## fdrive:horsepower 2 2.1975 122.49 -320.76 2.9153 0.05561 .
## fdrive:weight 2 0.1095 124.58 -314.27 0.1428 0.86694
## fcyl:wheel.base 2 0.9652 123.72 -316.91 1.2677 0.28286
## engine.size:length 1 0.0946 124.59 -316.22 0.2476 0.61914
## engine.size:width 1 0.0701 124.62 -316.14 0.1833 0.66884
## engine.size:wheel.base 1 0.3601 124.33 -317.04 0.9443 0.33190
## horsepower:weight 1 0.6940 124.00 -318.07 1.8247 0.17770
## horsepower:length 1 0.0222 124.67 -316.00 0.0580 0.80989
## weight:width 1 0.1181 124.57 -316.29 0.3091 0.57861
## weight:wheel.base 1 0.0001 124.69 -315.93 0.0004 0.98430
## length:width 1 0.0376 124.65 -316.04 0.0984 0.75395
## length:wheel.base 1 0.7308 123.96 -318.19 1.9220 0.16659
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(fit3, fit_int) # comparable to all two-way interaction model
## Analysis of Variance Table
##
## Model 1: consumption ~ ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base + ftype:engine.size +
## ftype:weight + ftype:width + ftype:wheel.base + fdrive:fcyl +
## fdrive:length + fdrive:width + fdrive:wheel.base + fcyl:engine.size +
## fcyl:horsepower + fcyl:weight + fcyl:length + fcyl:width +
## engine.size:horsepower + engine.size:weight + horsepower:width +
## horsepower:wheel.base + weight:length + width:wheel.base
## Model 2: consumption ~ (ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base)^2
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 327 124.69
## 2 288 113.02 39 11.674 0.7628 0.8462
anova(fit_add, fit3) # significantly improves additive model
## Analysis of Variance Table
##
## Model 1: consumption ~ ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base
## Model 2: consumption ~ ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base + ftype:engine.size +
## ftype:weight + ftype:width + ftype:wheel.base + fdrive:fcyl +
## fdrive:length + fdrive:width + fdrive:wheel.base + fcyl:engine.size +
## fcyl:horsepower + fcyl:weight + fcyl:length + fcyl:width +
## engine.size:horsepower + engine.size:weight + horsepower:width +
## horsepower:wheel.base + weight:length + width:wheel.base
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 369 189.17
## 2 327 124.69 42 64.479 4.0261 3.882e-13 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Now we eliminate at \(0.05\) level
for \(p\)-value in
drop1:
(insignif_d1 <- rownames(d1fit3)[-1][d1fit3$`Pr(>F)`[-1] > 0.05])
## [1] "fdrive:fcyl" "fdrive:width" "fcyl:horsepower" "horsepower:wheel.base"
fit4 <- update(fit3, as.formula(paste(".~. -", paste(insignif_d1, collapse = " - "))))
drop1(fit4, test = "F") # all remaining are significant
## Single term deletions
##
## Model:
## consumption ~ ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base + ftype:engine.size +
## ftype:weight + ftype:width + ftype:wheel.base + fdrive:length +
## fdrive:wheel.base + fcyl:engine.size + fcyl:weight + fcyl:length +
## fcyl:width + engine.size:horsepower + engine.size:weight +
## horsepower:width + weight:length + width:wheel.base
## Df Sum of Sq RSS AIC F value Pr(>F)
## <none> 129.25 -322.12
## ftype:engine.size 4 8.7170 137.97 -305.06 5.6651 0.0002010 ***
## ftype:weight 4 4.9527 134.21 -315.69 3.2187 0.0129974 *
## ftype:width 4 10.6892 139.94 -299.61 6.9468 2.206e-05 ***
## ftype:wheel.base 4 5.4773 134.73 -314.19 3.5596 0.0073391 **
## fdrive:length 2 4.0676 133.32 -314.23 5.2869 0.0054866 **
## fdrive:wheel.base 2 5.3194 134.57 -310.64 6.9139 0.0011415 **
## fcyl:engine.size 2 2.6454 131.90 -318.34 3.4384 0.0332502 *
## fcyl:weight 2 3.5838 132.84 -315.62 4.6581 0.0101055 *
## fcyl:length 2 4.9192 134.17 -311.78 6.3938 0.0018825 **
## fcyl:width 2 8.7594 138.01 -300.94 11.3852 1.644e-05 ***
## engine.size:horsepower 1 1.5465 130.80 -319.56 4.0202 0.0457595 *
## engine.size:weight 1 3.4876 132.74 -313.90 9.0661 0.0028012 **
## horsepower:width 1 6.5634 135.82 -305.10 17.0617 4.572e-05 ***
## weight:length 1 3.4297 132.68 -314.07 8.9157 0.0030349 **
## width:wheel.base 1 5.7406 134.99 -307.44 14.9230 0.0001344 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
add1(fit4, scope = fit_int, test = "F") # none interaction should be added back
## Single term additions
##
## Model:
## consumption ~ ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base + ftype:engine.size +
## ftype:weight + ftype:width + ftype:wheel.base + fdrive:length +
## fdrive:wheel.base + fcyl:engine.size + fcyl:weight + fcyl:length +
## fcyl:width + engine.size:horsepower + engine.size:weight +
## horsepower:width + weight:length + width:wheel.base
## Df Sum of Sq RSS AIC F value Pr(>F)
## <none> 129.25 -322.12
## ftype:fdrive 7 1.3933 127.86 -312.29 0.5122 0.8253
## ftype:fcyl 7 3.2647 125.99 -317.95 1.2179 0.2922
## ftype:horsepower 4 1.9373 127.32 -319.92 1.2630 0.2843
## ftype:length 4 0.7814 128.47 -316.45 0.5049 0.7322
## fdrive:fcyl 4 1.9058 127.35 -319.83 1.2421 0.2928
## fdrive:engine.size 2 0.1635 129.09 -318.61 0.2115 0.8095
## fdrive:horsepower 2 0.3257 128.93 -319.09 0.4218 0.6562
## fdrive:weight 2 0.1742 129.08 -318.64 0.2254 0.7984
## fdrive:width 2 0.5533 128.70 -319.77 0.7179 0.4885
## fcyl:horsepower 2 1.6217 127.63 -322.97 2.1219 0.1214
## fcyl:wheel.base 2 1.1893 128.06 -321.67 1.5509 0.2136
## engine.size:length 1 0.0000 129.25 -320.12 0.0000 0.9952
## engine.size:width 1 0.0022 129.25 -320.13 0.0058 0.9393
## engine.size:wheel.base 1 0.3319 128.92 -321.11 0.8625 0.3537
## horsepower:weight 1 0.2748 128.98 -320.94 0.7138 0.3988
## horsepower:length 1 0.0719 129.18 -320.34 0.1866 0.6661
## horsepower:wheel.base 1 0.0014 129.25 -320.13 0.0035 0.9526
## weight:width 1 0.1055 129.15 -320.44 0.2735 0.6013
## weight:wheel.base 1 0.0189 129.24 -320.18 0.0489 0.8252
## length:width 1 0.0552 129.20 -320.29 0.1431 0.7055
## length:wheel.base 1 0.9640 128.29 -323.00 2.5173 0.1135
anova(fit4, fit_int) # comparable to all two-way interaction model, spares 48 coefs
## Analysis of Variance Table
##
## Model 1: consumption ~ ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base + ftype:engine.size +
## ftype:weight + ftype:width + ftype:wheel.base + fdrive:length +
## fdrive:wheel.base + fcyl:engine.size + fcyl:weight + fcyl:length +
## fcyl:width + engine.size:horsepower + engine.size:weight +
## horsepower:width + weight:length + width:wheel.base
## Model 2: consumption ~ (ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base)^2
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 336 129.25
## 2 288 113.02 48 16.238 0.8621 0.7282
anova(fit_add, fit4) # significantly improves additive model
## Analysis of Variance Table
##
## Model 1: consumption ~ ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base
## Model 2: consumption ~ ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base + ftype:engine.size +
## ftype:weight + ftype:width + ftype:wheel.base + fdrive:length +
## fdrive:wheel.base + fcyl:engine.size + fcyl:weight + fcyl:length +
## fcyl:width + engine.size:horsepower + engine.size:weight +
## horsepower:width + weight:length + width:wheel.base
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 369 189.17
## 2 336 129.25 33 59.915 4.7198 4.612e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(fit4, fit3) # comparable to fit3, spares 9 coefs
## Analysis of Variance Table
##
## Model 1: consumption ~ ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base + ftype:engine.size +
## ftype:weight + ftype:width + ftype:wheel.base + fdrive:length +
## fdrive:wheel.base + fcyl:engine.size + fcyl:weight + fcyl:length +
## fcyl:width + engine.size:horsepower + engine.size:weight +
## horsepower:width + weight:length + width:wheel.base
## Model 2: consumption ~ ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base + ftype:engine.size +
## ftype:weight + ftype:width + ftype:wheel.base + fdrive:fcyl +
## fdrive:length + fdrive:width + fdrive:wheel.base + fcyl:engine.size +
## fcyl:horsepower + fcyl:weight + fcyl:length + fcyl:width +
## engine.size:horsepower + engine.size:weight + horsepower:width +
## horsepower:wheel.base + weight:length + width:wheel.base
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 336 129.25
## 2 327 124.69 9 4.564 1.3299 0.2201
The final model seems to be fit4 where every variable
appears in some interaction, even width and
fdrive which were excluded in model fit2 when
no interactions were considered:
anova(fit2, fit4)
## Analysis of Variance Table
##
## Model 1: consumption ~ ftype + fcyl + engine.size + horsepower + weight +
## wheel.base + length
## Model 2: consumption ~ ftype + fdrive + fcyl + engine.size + horsepower +
## weight + length + width + wheel.base + ftype:engine.size +
## ftype:weight + ftype:width + ftype:wheel.base + fdrive:length +
## fdrive:wheel.base + fcyl:engine.size + fcyl:weight + fcyl:length +
## fcyl:width + engine.size:horsepower + engine.size:weight +
## horsepower:width + weight:length + width:wheel.base
## Res.Df RSS Df Sum of Sq F Pr(>F)
## 1 372 192.68
## 2 336 129.25 36 63.43 4.5802 2.658e-14 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
General exceptions from eliminating variables:
Elimination despite significance:
Our model was fitted on many highly correlated variables, see the exploration above. We compute the so called variance inflation factor (VIF) that reveals which variables may cause inflation in the estimation of variance of coefficients. It compares the current estimates of variance with the variability of estimates under orthogonal (uncorrelated) regressors.
car::vif(fit4, type = "predictor")
## GVIF Df GVIF^(1/(2*Df)) Interacts With
## ftype 1.232958e+11 26 1.634133 engine.size, weight, width, wheel.base
## fdrive 1.058330e+07 8 2.748140 length, wheel.base
## fcyl 1.808539e+09 16 1.946666 engine.size, weight, length, width
## engine.size 6.213474e+27 23 4.019786 ftype, fcyl, horsepower, weight
## horsepower 1.679671e+07 5 5.278645 engine.size, width
## weight 6.364075e+24 25 3.133825 ftype, fcyl, engine.size, length
## length 1.789088e+18 13 5.035288 fdrive, fcyl, weight
## width 1.708979e+20 21 3.032018 ftype, fcyl, horsepower, wheel.base
## wheel.base 4.004078e+19 19 3.279860 ftype, fdrive, width
## Other Predictors
## ftype fdrive, fcyl, horsepower, length
## fdrive ftype, fcyl, engine.size, horsepower, weight, width
## fcyl ftype, fdrive, horsepower, wheel.base
## engine.size fdrive, length, width, wheel.base
## horsepower ftype, fdrive, fcyl, weight, length, wheel.base
## weight fdrive, horsepower, width, wheel.base
## length ftype, engine.size, horsepower, width, wheel.base
## width fdrive, engine.size, weight, length
## wheel.base fcyl, engine.size, horsepower, weight, length
car::vif(fit_add)
## GVIF Df GVIF^(1/(2*Df))
## ftype 8.283051 4 1.302488
## fdrive 2.737493 2 1.286288
## fcyl 7.636180 2 1.662337
## engine.size 10.586511 1 3.253692
## horsepower 3.905445 1 1.976220
## weight 9.183397 1 3.030412
## length 6.447956 1 2.539283
## width 4.879748 1 2.209015
## wheel.base 7.199453 1 2.683180