Use of the lm function
Everything in this tutorial will be illustrated using pseudodata that we now create.
y <- c(-9, -11, 1, 19)
x <- c(-3, -1, 1, 3)
x2 <- x^2
Data <- data.frame(y = y, x = x, x2 = x2)
print(Data)
## y x x2
## 1 -9 -3 9
## 2 -11 -1 1
## 3 1 1 1
## 4 19 3 9
We will fit the following linear model: \mathsf{E}\,Y = \beta_0 + \beta_1 x + \beta_2 x^2.
m <- lm(y ~ x + x2)
print(m)
##
## Call:
## lm(formula = y ~ x + x2)
##
## Coefficients:
## (Intercept) x x2
## -6.25 4.80 1.25
class(m)
## [1] "lm"
The R function lm uses the QR decomposition of the model matrix \mathbf{X} to calculate the LSE and related quantities, which are now stored inside the object m.
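As a quick illustration (a sketch using base R's qr() and qr.coef(), not necessarily the exact internal code path of lm), the same least squares estimates can be obtained from the QR decomposition directly:
X <- cbind("(Intercept)" = 1, x = x, x2 = x2)  ## model matrix built by hand
QR <- qr(X)                                    ## QR decomposition of X
qr.coef(QR, y)                                 ## LSE, should match coef(m)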
names(m)
## [1] "coefficients" "residuals" "effects" "rank" "fitted.values" "assign"
## [7] "qr" "df.residual" "xlevels" "call" "terms" "model"
The meaning of many elements of m should be quite clear from their names.
m$coefficients
## (Intercept) x x2
## -6.25 4.80 1.25
m$residuals
## 1 2 3 4
## 0.4 -1.2 1.2 -0.4
m$rank
## [1] 3
m$fitted.values
## 1 2 3 4
## -9.4 -9.8 -0.2 19.4
m$df.residual
## [1] 1
The components of the object m listed below are useful only in some special circumstances. There is no need to explore them now in much detail.
m$assign
## [1] 0 1 2
m$xlevels
## named list()
m$call
## lm(formula = y ~ x + x2)
m$terms
## y ~ x + x2
## attr(,"variables")
## list(y, x, x2)
## attr(,"factors")
## x x2
## y 0 0
## x 1 0
## x2 0 1
## attr(,"term.labels")
## [1] "x" "x2"
## attr(,"order")
## [1] 1 1
## attr(,"intercept")
## [1] 1
## attr(,"response")
## [1] 1
## attr(,".Environment")
## <environment: 0xecf5b10>
## attr(,"predvars")
## list(y, x, x2)
## attr(,"dataClasses")
## y x x2
## "numeric" "numeric" "numeric"
Nevertheless, there are still some components of the object m that deserve a more detailed exploration.
m$model
## y x x2
## 1 -9 -3 9
## 2 -11 -1 1
## 3 1 1 1
## 4 19 3 9
Note that m$model is the model frame; in more complex models, the model matrix itself does not appear here directly.
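If the model matrix is needed, it can be recovered from the fitted object; a minimal sketch:
model.matrix(m)   ## rebuilds the model matrix X from the stored terms and model frame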
The QR decomposition of the model matrix is stored in a compact form:
m$qr
## $qr
## (Intercept) x x2
## 1 -2.0 0.0000000 -10.0000000
## 2 0.5 -4.4721360 0.0000000
## 3 0.5 0.4472136 8.0000000
## 4 0.5 0.8944272 -0.9296181
## attr(,"assign")
## [1] 0 1 2
##
## $qraux
## [1] 1.500000 1.000000 1.368524
##
## $pivot
## [1] 1 2 3
##
## $tol
## [1] 1e-07
##
## $rank
## [1] 3
##
## attr(,"class")
## [1] "qr"
qr.Q(m$qr)
## [,1] [,2] [,3]
## [1,] -0.5 0.6708204 0.5
## [2,] -0.5 0.2236068 -0.5
## [3,] -0.5 -0.2236068 -0.5
## [4,] -0.5 -0.6708204 0.5
qr.R(m$qr)
## (Intercept) x x2
## 1 -2 0.000000 -10
## 2 0 -4.472136 0
## 3 0 0.000000 8
qr.Q(m$qr, complete = TRUE)
## [,1] [,2] [,3] [,4]
## [1,] -0.5 0.6708204 0.5 0.2236068
## [2,] -0.5 0.2236068 -0.5 -0.6708204
## [3,] -0.5 -0.2236068 -0.5 0.6708204
## [4,] -0.5 -0.6708204 0.5 -0.2236068
We now store all of the above matrices:
(Q <- qr.Q(m$qr))
## [,1] [,2] [,3]
## [1,] -0.5 0.6708204 0.5
## [2,] -0.5 0.2236068 -0.5
## [3,] -0.5 -0.2236068 -0.5
## [4,] -0.5 -0.6708204 0.5
(P <- qr.Q(m$qr, complete = TRUE))
## [,1] [,2] [,3] [,4]
## [1,] -0.5 0.6708204 0.5 0.2236068
## [2,] -0.5 0.2236068 -0.5 -0.6708204
## [3,] -0.5 -0.2236068 -0.5 0.6708204
## [4,] -0.5 -0.6708204 0.5 -0.2236068
(N <- matrix(P[, (m$rank + 1):(m$rank + m$df.residual)], ncol = m$df.residual))
## [,1]
## [1,] 0.2236068
## [2,] -0.6708204
## [3,] 0.6708204
## [4,] -0.2236068
(R <- qr.R(m$qr))
## (Intercept) x x2
## 1 -2 0.000000 -10
## 2 0 -4.472136 0
## 3 0 0.000000 8
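As a quick check (a sketch; X is taken from model.matrix(m)), the matrix P is orthonormal and Q and R reproduce the model matrix:
X <- model.matrix(m)                   ## model matrix with columns (Intercept), x, x2
max(abs(crossprod(P) - diag(nrow(P)))) ## P^T P = I (up to rounding error)
max(abs(Q %*% R - X))                  ## X = Q R (up to rounding error)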
The effects component:
m$effects
## (Intercept) x x2
## 0.000000 -21.466253 10.000000 1.788854
Compare it with:
crossprod(P, y)
## [,1]
## [1,] 0.000000
## [2,] -21.466253
## [3,] 10.000000
## [4,] 1.788854
It is seen that effects = \mathbf{P}^\top\mathbf{Y}. This component is useful for several purposes, as will now be illustrated.
The first r elements of effects equal \mathbf{Q}^\top\mathbf{Y} and can be used to calculate the fitted values easily: \widehat{\mathbf{Y}} = \mathbf{H}\mathbf{Y} = \mathbf{Q}\mathbf{Q}^\top\mathbf{Y}.
Q %*% m$effects[1:m$rank]
## [,1]
## [1,] -9.4
## [2,] -9.8
## [3,] -0.2
## [4,] 19.4
m$fitted ## the same
## 1 2 3 4
## -9.4 -9.8 -0.2 19.4
The remaining n - r elements of effects equal \mathbf{N}^\top\mathbf{Y} and can be used to calculate the residuals easily: \mathbf{U} = \mathbf{M}\mathbf{Y} = \mathbf{N}\mathbf{N}^\top\mathbf{Y}.
N %*% m$effects[(m$rank + 1):(m$rank + m$df.residual)]
## [,1]
## [1,] 0.4
## [2,] -1.2
## [3,] 1.2
## [4,] -0.4
m$residuals
## 1 2 3 4
## 0.4 -1.2 1.2 -0.4
The solution to the normal equations \mathbf{X}^\top\mathbf{X}\,\mathbf{b} = \mathbf{X}^\top\mathbf{Y} is obtained from \mathbf{R}\,\mathbf{b} = \mathbf{Q}^\top\mathbf{Y}, i.e., \mathbf{R}\,\mathbf{b} = \mbox{effects}[1{:}r], which is solved by backward substitution:
(b_from_QR_again <- backsolve(R, m$effects[1:m$rank]))
## [1] -6.25 4.80 1.25
print(m$coefficients) ## previously calculated b
## (Intercept) x x2
## -6.25 4.80 1.25
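The same estimates also solve the normal equations directly (a sketch; here solve() is used instead of the numerically preferable QR route):
X <- model.matrix(m)
solve(crossprod(X), crossprod(X, y))   ## solves X^T X b = X^T y, should match coef(m)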
The residual sum of squares is \mbox{SS}_e = \mathbf{U}^\top\mathbf{U} = \mathbf{Y}^\top\mathbf{M}^\top\mathbf{M}\mathbf{Y} = \mathbf{Y}^\top\mathbf{N}\mathbf{N}^\top\mathbf{N}\mathbf{N}^\top\mathbf{Y} = \mathbf{Y}^\top\mathbf{N}\mathbf{N}^\top\mathbf{Y} = \bigl\|\mathbf{N}^\top\mathbf{Y}\bigr\|^2, i.e., the squared norm of the last n - r elements of effects:
(SSe_from_QR <- crossprod(m$effects[(m$rank + 1):(m$rank + m$df.residual)]))
## [,1]
## [1,] 3.2
as.numeric(crossprod(m$residuals)) ## also sum of squares
## [1] 3.2
Many methods have been implemented for objects of class lm that return the important quantities.
methods(class = "lm")
## [1] add1 alias anova Anova
## [5] avPlot Boot bootCase boxCox
## [9] case.names ceresPlot coerce confidenceEllipse
## [13] confint cooks.distance crPlot deltaMethod
## [17] deviance dfbeta dfbetaPlots dfbetas
## [21] dfbetasPlots drop1 dummy.coef durbinWatsonTest
## [25] effects ellipse3d extractAIC family
## [29] formula hatvalues hccm infIndexPlot
## [33] influence influencePlot initialize inverseResponsePlot
## [37] kappa labels leveneTest leveragePlot
## [41] linearHypothesis logLik mcPlot mmp
## [45] model.frame model.matrix ncvTest nextBoot
## [49] nobs outlierTest plot powerTransform
## [53] predict print proj qqnorm
## [57] qqPlot qr residualPlot residualPlots
## [61] residuals rstandard rstudent show
## [65] sigmaHat simulate slotsFromS3 spreadLevelPlot
## [69] summary variable.names vcov
## see '?methods' for accessing help and source code
coefficients(m)
## (Intercept) x x2
## -6.25 4.80 1.25
coef(m) ## the same
## (Intercept) x x2
## -6.25 4.80 1.25
m$coefficients ## the same
## (Intercept) x x2
## -6.25 4.80 1.25
residuals(m)
## 1 2 3 4
## 0.4 -1.2 1.2 -0.4
m$residuals ## the same
## 1 2 3 4
## 0.4 -1.2 1.2 -0.4
fitted(m)
## 1 2 3 4
## -9.4 -9.8 -0.2 19.4
m$fitted.values ## the same
## 1 2 3 4
## -9.4 -9.8 -0.2 19.4
deviance(m)
## [1] 3.2
In full rank models, the estimated covariance matrix of \widehat{\beta} is \widehat{\mbox{var}}\bigl(\widehat{\beta}\bigr) = \frac{\mbox{SS}_e}{n-r}\,\bigl(\mathbf{X}^\top\mathbf{X}\bigr)^{-1}:
vcov(m)
## (Intercept) x x2
## (Intercept) 2.05 0.00 -0.25
## x 0.00 0.16 0.00
## x2 -0.25 0.00 0.05
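For this model, the matrix above can be reproduced by hand (a sketch; \mbox{SS}_e is taken from deviance(m)):
sigma2 <- deviance(m) / df.residual(m)   ## SSe / (n - r)
X <- model.matrix(m)
sigma2 * solve(crossprod(X))             ## should match vcov(m)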
In full rank models, for given j = 0,\ldots, k-1, a confidence interval for \beta_j with coverage 1 - \alpha is given by \widehat{\beta}_j \;\pm\; \mbox{S.E.}\bigl(\widehat{\beta}_j\bigr)\,\mbox{t}_{n-r}\bigl(1 - \alpha/2\bigr), where \mbox{S.E.}\bigl(\widehat{\beta}_j\bigr) is the standard error of \widehat{\beta}_j (the square root of the appropriate diagonal element of the matrix \widehat{\mbox{var}}\bigl(\widehat{\beta}\bigr)).
confint(m)
## 2.5 % 97.5 %
## (Intercept) -24.4425166 11.942517
## x -0.2824819 9.882482
## x2 -1.5911938 4.091194
confint(m, level = 0.99)
## 0.5 % 99.5 %
## (Intercept) -97.39258 84.89258
## x -20.66270 30.26270
## x2 -12.98408 15.48408
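The same intervals can be computed by hand from the formula above (a sketch for the default 95% level):
alpha <- 0.05
se <- sqrt(diag(vcov(m)))                     ## standard errors of the estimates
tq <- qt(1 - alpha / 2, df = df.residual(m))  ## t quantile with n - r degrees of freedom
cbind(coef(m) - tq * se, coef(m) + tq * se)   ## should match confint(m)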
summary(m)
##
## Call:
## lm(formula = y ~ x + x2)
##
## Residuals:
## 1 2 3 4
## 0.4 -1.2 1.2 -0.4
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -6.2500 1.4318 -4.365 0.1434
## x 4.8000 0.4000 12.000 0.0529 .
## x2 1.2500 0.2236 5.590 0.1127
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.789 on 1 degrees of freedom
## Multiple R-squared: 0.9943, Adjusted R-squared: 0.983
## F-statistic: 87.62 on 2 and 1 DF, p-value: 0.07532
Note that the result of summary is an object of its own class:
sm <- summary(m)
class(sm)
## [1] "summary.lm"
names(sm)
## [1] "call" "terms" "residuals" "coefficients" "aliased" "sigma"
## [7] "df" "r.squared" "adj.r.squared" "fstatistic" "cov.unscaled"
It is then possible to extract various important elements (model statistics).
sm$residuals
## 1 2 3 4
## 0.4 -1.2 1.2 -0.4
sm$coefficients
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -6.25 1.4317821 -4.365189 0.14336634
## x 4.80 0.4000000 12.000000 0.05292935
## x2 1.25 0.2236068 5.590170 0.11269007
\widehat{\sigma} = \sqrt{\frac{1}{n-r}\mbox{SS}_e}.
sm$sigma
## [1] 1.788854
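Equivalently (a sketch), it can be computed from the residual sum of squares:
sqrt(deviance(m) / df.residual(m))   ## residual standard error, should match sm$sigma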
F = \frac{\;\;\frac{\mbox{SS}_T - \mbox{SS}_e}{r - 1}\;\;}{\;\;\frac{\mbox{SS}_e}{n - r}\;\;}.
sm$fstatistic
## value numdf dendf
## 87.625 2.000 1.000
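The value can be reproduced by hand from the sums of squares (a sketch; \mbox{SS}_T is the total sum of squares around the mean of y):
SSe <- deviance(m)                          ## residual sum of squares
SST <- sum((y - mean(y))^2)                 ## total sum of squares
r <- m$rank
n <- length(y)
((SST - SSe) / (r - 1)) / (SSe / (n - r))   ## should match sm$fstatistic["value"]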
R^2 = 1 - \frac{\mbox{SS}_e}{\mbox{SS}_T}, \qquad R^2_{adj} = 1 - \frac{n-1}{n-r}\;\frac{\mbox{SS}_e}{\mbox{SS}_T}.
sm$r.squared
## [1] 0.9943262
sm$adj.r.squared
## [1] 0.9829787
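These too can be computed by hand from the sums of squares (a minimal sketch):
SSe <- deviance(m)                                  ## residual sum of squares
SST <- sum((y - mean(y))^2)                         ## total sum of squares
1 - SSe / SST                                       ## R^2, should match sm$r.squared
1 - (length(y) - 1) / df.residual(m) * SSe / SST    ## adjusted R^2, should match sm$adj.r.squared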