# NMSA407 Linear Regression: Tutorial

Confidence interval for the model based mean, prediction interval

Data Hosi0

## Introduction

### Load used data and calculate basic summaries

``````data(Hosi0, package = "mffSM")
``````
``````##   bweight blength weight12 length12 mage fage
## 1    3000      50     8900       78   23   30
## 2    3700      51    11550       77   26   28
## 3    3410      49    12000       80   21   24
## 4    3250      49     9600       73   19   22
## 5    3000      48     9600       74   16   22
## 6    3370      50    11100       73   19   18
``````
``````dim(Hosi0)
``````
``````## [1] 4838    6
``````
``````summary(Hosi0)
``````
``````##     bweight        blength         weight12        length12         mage            fage
##  Min.   :1750   Min.   :46.00   Min.   : 6500   Min.   :63.0   Min.   :16.00   Min.   :17.00
##  1st Qu.:3140   1st Qu.:49.00   1st Qu.: 9700   1st Qu.:74.0   1st Qu.:22.00   1st Qu.:24.00
##  Median :3450   Median :50.00   Median :10350   Median :76.0   Median :25.00   Median :28.00
##  Mean   :3429   Mean   :50.28   Mean   :10433   Mean   :76.6   Mean   :25.52   Mean   :28.49
##  3rd Qu.:3710   3rd Qu.:51.00   3rd Qu.:11100   3rd Qu.:79.0   3rd Qu.:29.00   3rd Qu.:32.00
##  Max.   :5100   Max.   :54.00   Max.   :16000   Max.   :94.0   Max.   :45.00   Max.   :69.00
``````

### Create a jittered version of `blength`

• It will be used on some plots.
``````set.seed(221913282)
Hosi0 <- transform(Hosi0, blength.jitter = blength + runif(nrow(Hosi0), -0.1, 0.1))
``````

## Linear model (regression line) for dependence of birth weight on birth length

``````summary(mwl <- lm(bweight ~ blength, data = Hosi0))
``````
``````##
## Call:
## lm(formula = bweight ~ blength, data = Hosi0)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -1520.33  -188.20   -10.33   189.67  1531.80
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept) -6230.146    121.095  -51.45   <2e-16 ***
## blength       192.124      2.407   79.81   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 291.7 on 4836 degrees of freedom
## Multiple R-squared:  0.5685, Adjusted R-squared:  0.5684
## F-statistic:  6370 on 1 and 4836 DF,  p-value: < 2.2e-16
``````

## Confidence intervals for the model based means

### LSE of the model based mean (and prediction) for `blength` equal to 47.5, 50, 52.5 cm

``````pdata <- data.frame(blength = c(47.5, 50, 52.5))
predict(mwl, newdata = pdata)
``````
``````##        1        2        3
## 2895.766 3376.077 3856.388
``````

### Calculate also standard errors

``````predict(mwl, newdata = pdata, se.fit = TRUE)
``````
``````## \$fit
##        1        2        3
## 2895.766 3376.077 3856.388
##
## \$se.fit
##        1        2        3
## 7.889673 4.245714 6.799569
##
## \$df
## [1] 4836
##
## \$residual.scale
## [1] 291.6664
``````

### 95% confidence intervals for the model based means

``````predict(mwl, newdata = pdata, interval = "confidence", level = 0.95)
``````
``````##        fit      lwr      upr
## 1 2895.766 2880.298 2911.233
## 2 3376.077 3367.753 3384.400
## 3 3856.388 3843.058 3869.718
``````

## Prediction intervals

### 95% prediction intervals

``````predict(mwl, newdata = pdata, interval = "prediction", level = 0.95)
``````
``````##        fit      lwr      upr
## 1 2895.766 2323.758 3467.774
## 2 3376.077 2804.217 3947.936
## 3 3856.388 3284.434 4428.342
``````

## Confidence and prediction bands

### Calculation

``````blengthNew <- seq(45.8, 54.2, length = 500)
pdata <- data.frame(blength = blengthNew)
pwlmean <- predict(mwl, newdata = pdata, interval = "confidence", level = 0.95)
pwlpred <- predict(mwl, newdata = pdata, interval = "prediction", level = 0.95)
``````

### Plot

``````par(mfrow = c(1, 1), bty = BTY, mar = c(5, 4, 1, 1)+0.1)
plot(bweight ~ blength.jitter, data = Hosi0, col = COL2, bg = BGC2, ylim = c(1750, 5100),
pch = PCH, xlab = "Birth length [cm]", ylab = "Birth weight [g]")

### Estimated means/predictions
lines(blengthNew, pwlmean[, "fit"], col = "red4", lwd = 2)

### Confidence band around the regression function
lines(blengthNew, pwlmean[, "lwr"], col = "red2", lwd = 2, lty = 2)
lines(blengthNew, pwlmean[, "upr"], col = "red2", lwd = 2, lty = 2)

### Prediction band
lines(blengthNew, pwlpred[, "lwr"], col = "blue3", lwd = 2, lty = 5)
lines(blengthNew, pwlpred[, "upr"], col = "blue3", lwd = 2, lty = 5)

legend(46, 5000, legend = c("Confidence band", "Prediction band"), col = c("red2", "blue3"), lwd = 2, lty = c(2, 5))
``````