# NMSA407 Linear Regression: Tutorial

Use of the lm function

## Dataset

Everything in this tutorial will be illustrated using pseudodata that we now create.

y <- c(-9, -11, 1, 19)
x <- c(-3, -1, 1, 3)
x2 <- x^2
Data <- data.frame(y = y, x = x, x2 = x2)
print(Data)

##     y  x x2
## 1  -9 -3  9
## 2 -11 -1  1
## 3   1  1  1
## 4  19  3  9


## Linear model and LSE using the lm function

### Model

We will be fitting the following linear model: $\mathsf{E}Y = \beta_0 + \beta_1\,x + \beta_2\,x^2.$

### Least squares solution and calculation of related quantities

m <- lm(y ~ x + x2)

print(m)

##
## Call:
## lm(formula = y ~ x + x2)
##
## Coefficients:
## (Intercept)            x           x2
##       -6.25         4.80         1.25

class(m)

## [1] "lm"


The R function lm uses the QR decomposition of the model matrix $$\mathbf{X}$$ to calculate the LSE and related quantities, which are then stored inside the object m.
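As a sketch of this computation (using the same small dataset defined above and the base R functions qr and qr.coef), the QR-based LSE can be reproduced by hand:

```r
## Reproduce the QR-based least squares fit by hand (a sketch of what lm does)
y <- c(-9, -11, 1, 19)
x <- c(-3, -1, 1, 3)
X <- cbind("(Intercept)" = 1, x = x, x2 = x^2)  # model matrix

qrX <- qr(X)            # compact QR decomposition, as used internally by lm
b   <- qr.coef(qrX, y)  # LSE computed from the QR factors
print(b)                # -6.25  4.80  1.25, matching coef(m)
```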

### Content of the object m

names(m)

##  [1] "coefficients"  "residuals"     "effects"       "rank"          "fitted.values" "assign"
##  [7] "qr"            "df.residual"   "xlevels"       "call"          "terms"         "model"


The meaning of many elements of m should be quite clear from their names.

m$coefficients

## (Intercept)           x          x2
##       -6.25        4.80        1.25


### Residuals $$\mathbf{U}$$

m$residuals

##    1    2    3    4
##  0.4 -1.2  1.2 -0.4


m$rank

## [1] 3


### Fitted values $$\widehat{\mathbf{Y}}$$

m$fitted.values

##    1    2    3    4
## -9.4 -9.8 -0.2 19.4


m$df.residual

## [1] 1


### Components of m useful only in special circumstances

Some components of the object m are useful only in special circumstances; there is no need to explore them in much detail now.

m$assign

## [1] 0 1 2

m$xlevels

## named list()

m$call

## lm(formula = y ~ x + x2)

m$terms

## y ~ x + x2
## attr(,"variables")
## list(y, x, x2)
## attr(,"factors")
##    x x2
## y  0  0
## x  1  0
## x2 0  1
## attr(,"term.labels")
## [1] "x"  "x2"
## attr(,"order")
## [1] 1 1
## attr(,"intercept")
## [1] 1
## attr(,"response")
## [1] 1
## attr(,".Environment")
## <environment: 0xecf5b10>
## attr(,"predvars")
## list(y, x, x2)
## attr(,"dataClasses")
##         y         x        x2
## "numeric" "numeric" "numeric"


Nevertheless, there are still some components of the m object that deserve a detailed exploration.

### Response vector and vectors needed to create a model matrix

m$model

##     y  x x2
## 1  -9 -3  9
## 2 -11 -1  1
## 3   1  1  1
## 4  19  3  9


Note that in more complex models, the model matrix does not directly appear here.
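For instance, the model matrix itself can always be reconstructed from the stored model frame and terms via model.matrix (a small illustration):

```r
## m$model holds the model frame; model.matrix() rebuilds the design matrix X
y <- c(-9, -11, 1, 19); x <- c(-3, -1, 1, 3); x2 <- x^2
m <- lm(y ~ x + x2)

X <- model.matrix(m)   # 4 x 3 matrix with columns (Intercept), x, x2
print(X)
```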

### QR decomposition of the model matrix $$\mathbf{X}$$

The QR decomposition of the model matrix is stored in a compact form:

m$qr

## $qr
##   (Intercept)          x          x2
## 1        -2.0  0.0000000 -10.0000000
## 2         0.5 -4.4721360   0.0000000
## 3         0.5  0.4472136   8.0000000
## 4         0.5  0.8944272  -0.9296181
## attr(,"assign")
## [1] 0 1 2
##
## $qraux
## [1] 1.500000 1.000000 1.368524
##
## $pivot
## [1] 1 2 3
##
## $tol
## [1] 1e-07
##
## $rank
## [1] 3
##
## attr(,"class")
## [1] "qr"


#### 1. $$\mathbf{Q}$$ matrix:

qr.Q(m$qr)

##      [,1]       [,2] [,3]
## [1,] -0.5  0.6708204  0.5
## [2,] -0.5  0.2236068 -0.5
## [3,] -0.5 -0.2236068 -0.5
## [4,] -0.5 -0.6708204  0.5


#### 2. $$\mathbf{R}$$ matrix:

qr.R(m$qr)

##   (Intercept)         x  x2
## 1          -2  0.000000 -10
## 2           0 -4.472136   0
## 3           0  0.000000   8


#### 3. $$\mathbf{P}$$ matrix (matrix $$\mathbf{Q}$$ supplemented by additional columns to form an orthonormal basis of $$R^n$$):

qr.Q(m$qr, complete = TRUE)

##      [,1]       [,2] [,3]       [,4]
## [1,] -0.5  0.6708204  0.5  0.2236068
## [2,] -0.5  0.2236068 -0.5 -0.6708204
## [3,] -0.5 -0.2236068 -0.5  0.6708204
## [4,] -0.5 -0.6708204  0.5 -0.2236068


We keep all the above matrices:

(Q <- qr.Q(m$qr))

##      [,1]       [,2] [,3]
## [1,] -0.5  0.6708204  0.5
## [2,] -0.5  0.2236068 -0.5
## [3,] -0.5 -0.2236068 -0.5
## [4,] -0.5 -0.6708204  0.5

(P <- qr.Q(m$qr, complete = TRUE))

##      [,1]       [,2] [,3]       [,4]
## [1,] -0.5  0.6708204  0.5  0.2236068
## [2,] -0.5  0.2236068 -0.5 -0.6708204
## [3,] -0.5 -0.2236068 -0.5  0.6708204
## [4,] -0.5 -0.6708204  0.5 -0.2236068

(N <- matrix(P[, (m$rank + 1):(m$rank + m$df.residual)], ncol = m$df.residual))

##            [,1]
## [1,]  0.2236068
## [2,] -0.6708204
## [3,]  0.6708204
## [4,] -0.2236068

(R <- qr.R(m$qr))

##   (Intercept)         x  x2
## 1          -2  0.000000 -10
## 2           0 -4.472136   0
## 3           0  0.000000   8
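The matrices kept above can be checked numerically against the properties used in the sequel (a sketch: Q has orthonormal columns, P is orthonormal, N spans the orthogonal complement of the column space, and X = QR):

```r
y <- c(-9, -11, 1, 19); x <- c(-3, -1, 1, 3); x2 <- x^2
m <- lm(y ~ x + x2)
Q <- qr.Q(m$qr)
P <- qr.Q(m$qr, complete = TRUE)
N <- matrix(P[, (m$rank + 1):ncol(P)], ncol = m$df.residual)
R <- qr.R(m$qr)

stopifnot(
  all.equal(crossprod(Q), diag(3)),             # Q^T Q = I_3
  all.equal(crossprod(P), diag(4)),             # P^T P = I_4
  all.equal(crossprod(N, Q), matrix(0, 1, 3)),  # N^T Q = 0
  all.equal(Q %*% R, model.matrix(m), check.attributes = FALSE)  # X = Q R
)
```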


### Component effects

m$effects

## (Intercept)           x          x2
##    0.000000  -21.466253   10.000000    1.788854


Compare it with:

crossprod(P, y)

##            [,1]
## [1,]   0.000000
## [2,] -21.466253
## [3,]  10.000000
## [4,]   1.788854


It is seen that effects $$= \mathbf{P}^\top\mathbf{Y}$$. This vector is useful for several purposes, as will now be illustrated.

#### 1. Fitted values $$\widehat{\mathbf{Y}}$$ again

The first $$r$$ elements of effects equal $$\mathbf{Q}^\top\mathbf{Y}$$ and can be used to calculate the fitted values easily: $$\widehat{\mathbf{Y}} = \mathbf{H}\mathbf{Y} = \mathbf{Q}\mathbf{Q}^\top\mathbf{Y}$$.

Q %*% m$effects[1:m$rank]

##      [,1]
## [1,] -9.4
## [2,] -9.8
## [3,] -0.2
## [4,] 19.4

m$fitted                             ## the same

##    1    2    3    4
## -9.4 -9.8 -0.2 19.4


#### 2. Residuals $$\mathbf{U}$$ again

The remaining elements of effects equal $$\mathbf{N}^\top\mathbf{Y}$$ and can be used to calculate the residuals easily: $$\mathbf{U} = \mathbf{M}\mathbf{Y} = \mathbf{N}\mathbf{N}^\top\mathbf{Y}$$.

N %*% m$effects[(m$rank + 1):(m$rank + m$df.residual)]

##      [,1]
## [1,]  0.4
## [2,] -1.2
## [3,]  1.2
## [4,] -0.4
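Equivalently, $$\mathbf{N}\mathbf{N}^\top$$ is the residual projection matrix $$\mathbf{M} = \mathbf{I} - \mathbf{H}$$; in the full-rank case this can be verified directly (a numerical sketch):

```r
y <- c(-9, -11, 1, 19); x <- c(-3, -1, 1, 3); x2 <- x^2
m <- lm(y ~ x + x2)
P <- qr.Q(m$qr, complete = TRUE)
N <- matrix(P[, (m$rank + 1):ncol(P)], ncol = m$df.residual)
X <- model.matrix(m)

H <- X %*% solve(crossprod(X)) %*% t(X)  # hat matrix H = X (X^T X)^{-1} X^T
M <- diag(nrow(X)) - H                   # residual projection matrix M = I - H
stopifnot(all.equal(M, N %*% t(N), check.attributes = FALSE))
```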

m$residuals

##    1    2    3    4
##  0.4 -1.2  1.2 -0.4


#### 3. Solution $$\widehat{\beta}$$ to normal equations again

The solution to the normal equations $\mathbf{X}^\top\mathbf{X}\,\mathbf{b} = \mathbf{X}^\top\mathbf{Y}$ satisfies, after substituting $$\mathbf{X} = \mathbf{Q}\mathbf{R}$$, $\mathbf{R}\,\mathbf{b} = \mathbf{Q}^\top\mathbf{Y} = effects[1:r],$ which is solved by backward substitution:

(b_from_QR_again <- backsolve(R, m$effects[1:m$rank]))

## [1] -6.25  4.80  1.25

print(m$coefficients)                 ## previously calculated b

## (Intercept)           x          x2
##       -6.25        4.80        1.25


#### 4. Residual sum of squares

The residual sum of squares is (using $$\mathbf{N}^\top\mathbf{N} = \mathbf{I}$$ in the penultimate step) $\mbox{SS}_e = \mathbf{U}^\top\mathbf{U} = \mathbf{Y}^\top\mathbf{M}^\top\mathbf{M}\mathbf{Y} = \mathbf{Y}^\top\mathbf{N}\mathbf{N}^\top\mathbf{N}\mathbf{N}^\top\mathbf{Y} = \mathbf{Y}^\top\mathbf{N}\mathbf{N}^\top\mathbf{Y} = \bigl\|\mathbf{N}^\top\mathbf{Y}\bigr\|^2 = \bigl\|effects[(r+1):n]\bigr\|^2.$

(SSe_from_QR <- crossprod(m$effects[(m$rank + 1):(m$rank + m$df.residual)]))

##      [,1]
## [1,]  3.2

as.numeric(crossprod(m$residuals))    ## also the sum of squares

## [1] 3.2


## Methods for objects of class lm

Many methods have been implemented for objects of class lm; they return the important quantities.

### Available methods

methods(class = "lm")

##  [1] add1                 alias                anova                Anova
##  [5] avPlot               Boot                 bootCase             boxCox
##  [9] case.names           ceresPlot            coerce               confidenceEllipse
## [13] confint              cooks.distance       crPlot               deltaMethod
## [17] deviance             dfbeta               dfbetaPlots          dfbetas
## [21] dfbetasPlots         drop1                dummy.coef           durbinWatsonTest
## [25] effects              ellipse3d            extractAIC           family
## [29] formula              hatvalues            hccm                 infIndexPlot
## [33] influence            influencePlot        initialize           inverseResponsePlot
## [37] kappa                labels               leveneTest           leveragePlot
## [41] linearHypothesis     logLik               mcPlot               mmp
## [45] model.frame          model.matrix         ncvTest              nextBoot
## [49] nobs                 outlierTest          plot                 powerTransform
## [53] predict              print                proj                 qqnorm
## [57] qqPlot               qr                   residualPlot         residualPlots
## [61] residuals            rstandard            rstudent             show
## [65] sigmaHat             simulate             slotsFromS3          spreadLevelPlot
## [69] summary              variable.names       vcov
## see '?methods' for accessing help and source code


### Estimated regression coefficients $$\widehat{\beta}$$ (solution to normal equations)

coefficients(m)

## (Intercept)           x          x2
##       -6.25        4.80        1.25

coef(m)                   ## the same

## (Intercept)           x          x2
##       -6.25        4.80        1.25

m$coefficients            ## the same

## (Intercept)           x          x2
##       -6.25        4.80        1.25


### Vector $$\mathbf{U}$$ of residuals

residuals(m)

##    1    2    3    4
##  0.4 -1.2  1.2 -0.4

m$residuals               ## the same

##    1    2    3    4
##  0.4 -1.2  1.2 -0.4


### Vector $$\widehat{\mathbf{Y}}$$ of fitted values

fitted(m)

##    1    2    3    4
## -9.4 -9.8 -0.2 19.4

m$fitted.values           ## the same

##    1    2    3    4
## -9.4 -9.8 -0.2 19.4


### Residual sum of squares $$\mbox{SS}_e$$

deviance(m)

## [1] 3.2


### Covariance matrix of the estimated regression coefficients

In full rank models, the estimated covariance matrix is $\mbox{var}\bigl(\widehat{\beta}\bigr) = \frac{\mbox{SS}_e}{n-r}\,\bigl(\mathbf{X}^\top\mathbf{X}\bigr)^{-1}:$

vcov(m)

##             (Intercept)    x    x2
## (Intercept)        2.05 0.00 -0.25
## x                  0.00 0.16  0.00
## x2                -0.25 0.00  0.05
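The same matrix can be assembled by hand from $$\widehat{\sigma}^2 = \mbox{SS}_e/(n-r)$$ and $$(\mathbf{X}^\top\mathbf{X})^{-1}$$ (a sketch for the full-rank model above):

```r
y <- c(-9, -11, 1, 19); x <- c(-3, -1, 1, 3); x2 <- x^2
m <- lm(y ~ x + x2)
X <- model.matrix(m)

sigma2 <- deviance(m) / m$df.residual   # SS_e / (n - r) = 3.2
V <- sigma2 * solve(crossprod(X))       # estimated var(beta-hat)
print(V)                                # matches vcov(m)
```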


### Confidence intervals for the regression coefficients based on the t-statistic

In full rank models, for given $$j = 0,\ldots, k-1$$, a confidence interval for $$\beta_j$$ with coverage $$1 - \alpha$$ is given by $\widehat{\beta}_j \;\pm\; \mbox{S.E.}\bigl(\widehat{\beta}_j\bigr)\,\mbox{t}_{n-r}\bigl(1 - \alpha/2\bigr),$ where $$\mbox{S.E.}\bigl(\widehat{\beta}_j\bigr)$$ is the standard error of $$\widehat{\beta}_j$$ (the square root of the appropriate diagonal element of the matrix $$\mbox{var}\bigl(\widehat{\beta}\bigr)$$).

confint(m)

##                   2.5 %    97.5 %
## (Intercept) -24.4425166 11.942517
## x            -0.2824819  9.882482
## x2           -1.5911938  4.091194

confint(m, level = 0.99)

##                 0.5 %   99.5 %
## (Intercept) -97.39258 84.89258
## x           -20.66270 30.26270
## x2          -12.98408 15.48408
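These intervals can be recomputed directly from the formula above (a sketch using coef, vcov and qt):

```r
y <- c(-9, -11, 1, 19); x <- c(-3, -1, 1, 3); x2 <- x^2
m <- lm(y ~ x + x2)

alpha <- 0.05
se <- sqrt(diag(vcov(m)))                    # S.E.(beta-hat_j)
tq <- qt(1 - alpha / 2, df = m$df.residual)  # t_{n-r}(1 - alpha/2)
ci <- cbind(coef(m) - tq * se, coef(m) + tq * se)
print(ci)                                    # matches confint(m)
```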


### Summary of the most important results

summary(m)

##
## Call:
## lm(formula = y ~ x + x2)
##
## Residuals:
##    1    2    3    4
##  0.4 -1.2  1.2 -0.4
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  -6.2500     1.4318  -4.365   0.1434
## x             4.8000     0.4000  12.000   0.0529 .
## x2            1.2500     0.2236   5.590   0.1127
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 1.789 on 1 degrees of freedom
## Multiple R-squared:  0.9943, Adjusted R-squared:  0.983
## F-statistic: 87.62 on 2 and 1 DF,  p-value: 0.07532


Note that the result of summary is an object of its own class:

sm <- summary(m)
class(sm)

## [1] "summary.lm"

names(sm)

##  [1] "call"          "terms"         "residuals"     "coefficients"  "aliased"       "sigma"
##  [7] "df"            "r.squared"     "adj.r.squared" "fstatistic"    "cov.unscaled"


It is then possible to extract different important elements (model statistics).

#### 1. Residuals $$\mathbf{U}$$

sm$residuals

##    1    2    3    4
##  0.4 -1.2  1.2 -0.4


#### 2. Table (matrix with four columns) with the estimated regression coefficients, their standard errors, t-statistics and p-values

sm$coefficients

##             Estimate Std. Error   t value   Pr(>|t|)
## (Intercept)    -6.25  1.4317821 -4.365189 0.14336634
## x               4.80  0.4000000 12.000000 0.05292935
## x2              1.25  0.2236068  5.590170 0.11269007


#### 3. Estimated residual standard deviation

$\widehat{\sigma} = \sqrt{\frac{1}{n-r}\mbox{SS}_e}.$

sm$sigma

## [1] 1.788854


#### 4. F-statistic to test the null hypothesis that all regression coefficients except the intercept are equal to zero

$F = \frac{\;\;\frac{\mbox{SS}_T - \mbox{SS}_e}{r - 1}\;\;}{\;\;\frac{\mbox{SS}_e}{n - r}\;\;},$ where $$\mbox{SS}_T$$ is the total sum of squares.

sm$fstatistic

##  value  numdf  dendf
## 87.625  2.000  1.000
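With $$\mbox{SS}_T$$ the total sum of squares around the mean, the F-statistic and its p-value can be recomputed by hand (a sketch):

```r
y <- c(-9, -11, 1, 19); x <- c(-3, -1, 1, 3); x2 <- x^2
m <- lm(y ~ x + x2)

SSe <- deviance(m)            # residual sum of squares, 3.2
SST <- sum((y - mean(y))^2)   # total sum of squares, 564
r <- m$rank; n <- length(y)

Fstat <- ((SST - SSe) / (r - 1)) / (SSe / (n - r))
pval  <- pf(Fstat, df1 = r - 1, df2 = n - r, lower.tail = FALSE)
print(Fstat)                  # 87.625, as reported by summary(m)
```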


#### 5. Coefficient of determination $$R^2$$ and adjusted coefficient of determination $$R^2_{adj}$$

$R^2 = 1 - \frac{\mbox{SS}_e}{\mbox{SS}_T}, \qquad R^2_{adj} = 1 - \frac{n-1}{n-r}\;\frac{\mbox{SS}_e}{\mbox{SS}_T}.$

sm$r.squared

## [1] 0.9943262

sm$adj.r.squared

## [1] 0.9829787
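Both coefficients of determination follow directly from $$\mbox{SS}_e$$ and $$\mbox{SS}_T$$ (a sketch):

```r
y <- c(-9, -11, 1, 19); x <- c(-3, -1, 1, 3); x2 <- x^2
m <- lm(y ~ x + x2)
sm <- summary(m)

SSe <- deviance(m)
SST <- sum((y - mean(y))^2)
n <- length(y); r <- m$rank

R2    <- 1 - SSe / SST                      # 0.9943262
R2adj <- 1 - (n - 1) / (n - r) * SSe / SST  # 0.9829787
```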