Linear regression (NMSA407)

Arnošt Komárek



Winter semester 2015–16


Lectures:             Thursday 9:00 in K1
                      Thursday 17:20 in K1
Exercise class (MM1): Monday 19:00 in K11    (RNDr. Matúš Maciak, Ph.D.)
Exercise class (MM2): Tuesday 7:20 in K11    (RNDr. Matúš Maciak, Ph.D.)
Exercise class (MO):  Thursday 15:40 in K4   (Ing. Marek Omelka, Ph.D.)
                      Friday 7:20 in K11
  • Both lectures and all exercise classes are taught in English.
  • Personal communication with the lecturer and the exercise class instructors can also be conducted in Czech or Slovak.


  • The exam grade will be based on three parts:
    1. Take-home project.
    2. Written part.
    3. Oral part.
    For details, see this document.
  • Exam dates are as follows:
    Thursday 21/01
    Monday 01/02
    Thursday 04/02
    Monday 15/02
  • Course credit is required to take the exam. Provisionally, it is possible to register in SIS for the first three exam dates without having obtained the course credit (since nobody will have it until approximately the first week of January).
  • All exam dates start at 7:45 in K1 with the written part. The oral part takes place either in the afternoon of the same day or on the following day (to be specified at the end of the written part).
  • There will be no other exam dates. The capacity of the exam date on 15/02 will be kept high enough for everyone interested to take the exam.


  • Assignments and detailed instructions were sent via e-mail on 20151231.




Course notes will be gradually updated, mostly on Thursday evening or on Friday. They provide a record of the lecture including notes, comments etc. mentioned perhaps only orally during the lecture. In most cases, the course notes do not include proofs or derivations, especially those that were fully shown on the blackboard during the lecture.

As of 20160104: in many cases, the course notes also include proofs or derivations. Nevertheless, some proofs or derivations are still not included in the course notes, and knowledge of them is still expected for the exam.

Course notes     pdf     (last update 20160114)


The course handouts provided below are a version of the lecture slides suitable for printing on A4 paper. The handouts/slides should mainly be considered a skeleton of the lectures. They contain most definitions and theorems, illustrative plots, and some remarks. They only exceptionally include calculations, proofs, etc. Those mostly either appear on the blackboard or will be assigned as homework.

Students are expected to bring the printed handouts to the lecture and to supplement them with handwritten notes, calculations, proofs, etc. Definitions and statements of theorems provided on the slides/handouts will, in most cases, not be repeated on the blackboard.

Handouts for each Thursday lecture will always be available by Monday of the same week at the latest (but often earlier).

Typos or other errors in the handouts discovered during the lecture or on other occasions are not corrected in the files available below. All corrections are available only in the course notes.

Practical Issues     pdf     (published 20150928)
I. Linear Model     pdf     (published 20150928)
II. Least Squares Estimation     pdf     (published 20150928)
III. Normal Linear Model     pdf     (published 20150928)
IV. Basic Regression Diagnostics     pdf     (published 20151008)
V. Submodels     pdf     (published 20151008)
VI. General Linear Model     pdf     (published 20151009)
VII. Parameterizations of Covariates     pdf     (published 20151008)
VIII. Additivity and Interactions     pdf     (Sections 1–4 published 20151019)
(Sections 5–6 added on 20151030)
(Sections 7–8 added on 20151105)
IX. Analysis of Variance     pdf     (published 20151117)
X. Checking Model Assumptions     pdf     (published 20151122)
XI. Consequences of a Problematic Regression Space     pdf     (published 20151128)
XII. Simultaneous Inference in a Linear Model     pdf     (published 20151128)
XIII. Asymptotic Properties of the LSE and Sandwich Estimator     pdf     (published 20151208; page 516 modified on 20151211)
XIV. Unusual Observations     pdf     (NEW: published 20160101)
Appendix A: Matrices     pdf     (published 20150928)
Appendix B: Distributions     pdf     (published 20150928)
Appendix C: Asymptotic Theorems     pdf     (published 20151208, correction 20160114)


The R tutorials show R analyses based on the theory presented during the lectures. They also provide the code used to prepare the majority of the output and plots used as illustrations during the lectures. The R tutorials may serve as a reference for the assignments performed during the exercise classes or required as homework.

The R scripts provided below assume that the content of the .Rprofile is sourced at startup.

I. Linear Model
  1. Simple illustration of a linear model (data Hosi0)    html    R code
II. Least Squares Estimation
  1. Matrix algebra background of linear regression    html    R code
  2. R function lm    html    R code
III. Normal Linear Model
  1. Inference in a model with the regression line (data Cars2004nh)    html    R code
  2. Joint inference on a vector of estimable parameters (data Cars2004nh)    html    R code
  3. Confidence interval for the model based mean, prediction interval (data Hosi0)    html    R code
  4. Confidence interval for the model based mean, prediction interval (data Kojeni)    html    R code
IV. Basic Regression Diagnostics
  1. Basic Regression Diagnostics (data Cars2004nh)    html    R code
VI. General Linear Model
  1. Weighted least squares (data Kojeni and wKojeni)    html    R code
VII. Parameterizations of Covariates
  1. Numeric covariate: simple transformation, polynomial regression, regression splines (data Houses1987)    html    R code
  2. Numeric covariate: regression splines (data Motorcycle)    html    R code
  3. Categorical nominal covariate (data Cars2004nh)    html    R code
  4. Categorical ordinal covariate (data Cars2004nh)    html    R code
VIII. Additivity and Interactions
  1. Two numeric covariates (data Cars2004nh)    html    R code
  2. Numeric and categorical covariate (data Cars2004nh)    html    R code
  3. Two categorical covariates (data Howells)    html    R code
  4. ANOVA tables of type I, II and III (data Cars2004nh)    html    R code
X. Checking Model Assumptions
  1. Partial residuals, Simpson's paradox (data Policie)    html    R code
  2. Partial residuals (data Cars2004nh)    html    R code
  3. Residual plots and tests on assumptions (data Cars2004nh)    html    R code
  4. Checking homoscedasticity (data Draha)    html    R code
  5. Checking uncorrelated errors (data Olympic)    html    R code
  6. Transformation of response: ANOVA with log-transformed response to get normality and homoscedasticity (data Houses1987)    html    R code
  7. Transformation of response: Regression with log-transformed response to stabilize the variance, Box–Cox transformation (data Cars2004nh)    html    R code
XI. Problematic Regression Space
  1. Multicollinearity (data IQ)    html    R code
  2. Multicollinearity (data Cars2004nh)    html    R code
XII. Simultaneous Inference in a Linear Model
  1. Multiple comparison procedures (Tukey, Hothorn–Bretz–Westfall) (data Howells)    html    R code
  2. Multiple comparison procedures (Hothorn–Bretz–Westfall) (data Cars2004nh)    html    R code
  3. Confidence band around and for the regression function (data Kojeni)    html    R code
XIV. Unusual Observations
  1. Unusual observations (data Cars2004)    html    R code


All information related to the exercise classes is available at the central exercise classes webpage.

To synchronize the exercise classes of the three groups, the exercise classes take place according to the following timetable (the leading group of each exercise class is shown in blue):

              Monday       Tuesday      Thursday
Class #01:    Common to all groups on Tu 06/10 (7:20) in K1
Class #02:    Mo 12/10     Tu 13/10     Th 15/10
Class #03:    Mo 19/10     Tu 20/10     Th 22/10
Class #04:    Mo 26/10     Tu 27/10     Th 29/10
Class #05:    Mo 02/11     Tu 03/11     Th 05/11
Class #06:    Mo 16/11     Tu 10/11     Th 12/11
Class #07:    Mo 23/11     Tu 24/11     Th 19/11
Class #08:    Mo 30/11     Tu 01/12     Th 26/11
Class #09:    Mo 07/12     Tu 08/12     Th 03/12
Class #10:    Mo 14/12     Tu 15/12     Th 10/12
Class #11:    Mo 21/12     Tu 22/12     Th 17/12
Class #12:    Mo 04/01     Tu 05/01     Th 07/01
Class #13:    Mo 11/01     Tu 12/01     Th 14/01
  • The content of equally numbered exercise classes will be approximately the same.
  • Note that Monday 09/11 is the Dean's sport day and Tuesday 17/11 is a public holiday. The Christmas break runs from We 23/12 till Su 03/01.

