Linear regression (NMSA407)

Arnošt Komárek



Winter semester 2016–17

SIS pages of the course:    ENG    CZE


Lectures: Wednesday 10:40 in K1   
Thursday 12:20 in K1   
Exercise class (MM1): Monday 15:40 in K4    (RNDr. Matúš Maciak, Ph.D.)
Exercise class (MM2): Tuesday 9:00 in K11    (RNDr. Matúš Maciak, Ph.D.)
Exercise class (MO): Tuesday 14:00 in K4    (Ing. Marek Omelka, Ph.D.)
  • The language of both the lectures and all exercise classes is English.
  • Personal communication with the lecturer and the exercise class instructors can also be conducted in Czech or Slovak.


  • The exam grade will be based on three parts:
    1. Take-home project.
    2. Written part.
    3. Oral part.
    For details, see this document (pdf).

  • A sample exam assignment is available here (pdf).

The following exam dates are open for enrollment in SIS:
  • Thursday 19/01
  • Thursday 26/01
  • Thursday 02/02
  • Wednesday 15/02
  • Thursday 16/02
The capacity of each exam term is 40. The oral part of the exam may take place on the following day (this will always be specified at the end of the written part). The above set of exam dates is final. No other exam dates will be available in the academic year 2016–17.


The course notes will be updated gradually. They provide a record of the lectures, including notes, comments, etc. that may have been mentioned only orally. In many cases, the course notes do not include proofs or derivations, especially those that are shown in full on the blackboard during the lecture.

The lectures will follow the notes quite closely and more or less in a linear way. Students are advised to bring printed course notes to the lectures and supplement them with their own hand-written notes. Not everything that is said will be written on the blackboard (especially various remarks etc.). Likewise, statements of definitions and theorems will not be written on the blackboard in full.

Notes (pdf); the latest update, which also includes proofs of most theorems/lemmas, was published on 20170104


Course slides will be projected during the lecture. They mainly contain

  • the structure of the lecture;
  • statements of definitions and theorems;
  • some illustrative plots/computer output.
The course slides alone are rather incomplete as study material. They should be used only as supplementary material to the course notes.

Main lecture parts I – IV published 20160929
 parts V – VII added 20161006
 part VIII added 20161020
 part IX added 20161103
 part X added 20161117
 part XI added 20161128
 part XII added 20161203
 parts XIII and XIV added 20161208
pdf (1 slide/page)    pdf (4 slides/page)
Appendices, published 20160929
pdf (1 slide/page)    pdf (4 slides/page)


The R tutorials show R analyses that are based on the theory given during the lectures. They also provide the code used to prepare the majority of the output/plots that are used as illustrations during the lectures. The R tutorials may serve as a reference for the assignments performed during the exercise classes or required as homework.

The R tutorials will be published gradually during the semester, in step with the topics covered in the lectures. All tutorials from the academic year 2015–16 are available here.

The R scripts provided below assume that the content of the .Rprofile is sourced at start.

I. Linear Model
  1. Simple illustration of a linear model (data Hosi0)    html    R code
II. Least Squares Estimation
  1. Matrix algebra background of linear regression    html    R code
  2. R function lm    html    R code
III. Normal Linear Model
  1. Inference in a model with the regression line (data Cars2004nh)    html    R code
  2. Joint inference on a vector of estimable parameters (data Cars2004nh)    html    R code
  3. Confidence interval for the model based mean, prediction interval (data Hosi0)    html    R code
  4. Confidence interval for the model based mean, prediction interval (data Kojeni)    html    R code
IV. Basic Regression Diagnostics
  1. Basic Regression Diagnostics (data Cars2004nh)    html    R code
VI. General Linear Model
  1. Weighted least squares (data Kojeni and wKojeni)    html    R code
VII. Parameterizations of Covariates
  1. Numeric covariate: simple transformation, polynomial regression, regression splines (data Houses1987)    html    R code
  2. Numeric covariate: regression splines (data Motorcycle)    html    R code
  3. Categorical nominal covariate (data Cars2004nh)    html    R code
  4. Categorical ordinal covariate (data Cars2004nh)    html    R code
VIII. Additivity and Interactions
  1. Two numeric covariates (data Cars2004nh)    html    R code
  2. Numeric and categorical covariate (data Cars2004nh)    html    R code
  3. ANOVA tables of type I, II and III (data Cars2004nh)    html    R code
IX. Analysis of Variance
  1. Two-way Analysis of Variance (data Howells)    html    R code
X. Simultaneous Inference in a Linear Model
  1. Multiple comparison procedures (Tukey, Hothorn–Bretz–Westfall) (data Howells)    html    R code
  2. Multiple comparison procedures (Hothorn–Bretz–Westfall) (data Cars2004nh)    html    R code
  3. Confidence band around and for the regression function (data Kojeni)    html    R code
XI. Checking Model Assumptions
  1. Partial residuals, Simpson's paradox (data Policie)    html    R code
  2. Partial residuals (data Cars2004nh)    html    R code
  3. Residual plots and tests on assumptions (data Cars2004nh)    html    R code
  4. Checking homoscedasticity (data Draha)    html    R code
  5. Checking uncorrelated errors (data Olympic)    html    R code
  6. Transformation of response: ANOVA with log-transformed response to get normality and homoscedasticity (data Houses1987)    html    R code
  7. Transformation of response: Regression with log-transformed response to stabilize the variance, Box–Cox transformation (data Cars2004nh)    html    R code
XII. Problematic Regression Space
  1. Multicollinearity (data IQ)    html    R code
  2. Multicollinearity (data Cars2004nh)    html    R code
XIV. Unusual Observations
  1. Unusual observations (data Cars2004)    html    R code


All information related to the exercise classes is available at the central exercise classes webpage.

The exercise classes are synchronized: the content of classes held in the same week is approximately the same.

