Linear regression (NMSA407)

Arnošt Komárek



Winter semester 2014–15


Lectures: Thursday 12:20 in K1   
Thursday 17:20 in K1   
Exercise class (KZ): Tuesday 7:20 in K11    (doc. RNDr. Karel Zvára, CSc.)
Exercise class (MO): Tuesday 12:20 in K11    (Ing. Marek Omelka, Ph.D.)
Exercise class (AK): Thursday 19:00 in K11   
  • Lectures are taught in English. Exercise classes AK and MO are taught in English if at least one student requires it; exercise class KZ is taught in Czech.


  • The exam grade will be based on three parts:
    1. Take-home project (practical analysis), with results delivered in the form of a written report. Assignments will be published by December 19, 2014 at the latest.
    2. Written part composed of theoretical and semi-practical assignments (no computer analysis).
    3. Oral part.
  • To be admitted to the written part of the exam, the project must be delivered on time (the deadline will be given together with the assignment) and must meet a minimum quality standard (to be defined when the project assignments are issued). Non-admission to the written part of the exam results in a grade of "Fail" (4).
  • To be admitted to the oral part of the exam, the written part must earn a minimum number of points (to be defined with the assignment). Non-admission to the oral part of the exam results in a grade of "Fail" (4).
  • The oral part of the exam takes place either in the afternoon of the same day as the written part or on the day after the written part.
  • The exam dates for the written part will be announced in due time via SIS. All exam dates will fall in the period January 5 – February 13, 2015. There will be no exam dates during the summer term or later!

A summary of the exam requirements is available here (pdf).

A sample exam assignment (from the previous academic year), including hand-written solutions (as detailed as is expected to earn the maximum number of points), is available here (pdf). Read also the disclaimer below.

DISCLAIMER: Exam assignments during this academic year will be slightly different! More explanation was given during the evening lecture on 18/12/2014. Additional explanation concerning the exam can also be obtained during the evening (from 17:20) lecture on 08/01/2015.


  • Lectures will combine a slide presentation with blackboard writing. A PDF version of the slides suitable for printing will gradually become available for download in the "LECTURE MATERIALS" section of this page. Students are advised to bring the printed slides to the lectures and to supplement them (either directly or on separate sheets of paper) with the additional derivations, notes etc. that will be shown only on the blackboard.

  • To synchronize the exercise classes of the Tuesday and Thursday groups (there are two "missing" Tuesdays during the semester but no "missing" Thursday), the exercise classes take place according to the following timetable:
    Class #1:    Th 02/10, Tu 07/10
    Class #2:    Th 09/10, Tu 14/10
    Class #3:    Th 16/10, Tu 21/10
    Class #4:    Th 30/10, Tu 04/11
    Class #5:    Th 06/11, Tu 11/11
    Class #6:    Th 13/11, Tu 18/11
    Class #7:    Th 20/11, Tu 25/11
    Class #8:    Th 27/11, Tu 02/12
    Class #9:    Th 04/12, Tu 09/12
    Class #10:   Th 11/12, Tu 16/12
    Class #11:   Th 08/01, Tu 06/01
    • The content of equally numbered exercise classes will be approximately the same.
    • There was a special semi-exercise class on Thursday 23/10 (19:00 – 20:30) taking place in K1 which students from all three groups were advised to attend.
    • There will be no exercise class on Thursday 18/12.


All slides     Presentation (pdf)     (published 20150105)
Practical Issues     Presentation (pdf)     (updated 20140923)
I. Linear Model     Presentation (pdf)     (updated 20150101, minor corrections)
II. Least Squares Estimation     Presentation (pdf)     (updated 20150101, minor corrections)
III. Principal Interpretation of a Linear Model     Presentation (pdf)     (updated 20150101, minor corrections)
IV. Quantitative Covariates     Presentation (pdf)     (updated 20150101, minor corrections)
V. Categorical Covariates     Presentation (pdf)     (updated 20150101, minor corrections)
VI. Normal Linear Model     Presentation (pdf)     (updated 20150101, minor corrections)
VII. Submodel     Presentation (pdf)     (updated 20150101, minor corrections)
VIII. Generalized Least Squares     Presentation (pdf)     (updated 20150101, minor corrections)
Complete proof of Theorem VIII.1 (pdf)     (published 20141112)
IX. Regression Diagnostics     Presentation (pdf)     (updated 20150101, minor corrections)
X. Identification in less-than-full-rank linear model     Presentation (pdf)     (updated 20150101, minor corrections)
XI. Two-Way Analysis of Variance     Presentation (pdf)     (updated 20150101, minor corrections)
XII. Simultaneous Inference in a Linear Model     Presentation (pdf)     (updated 20150101, minor corrections)
XIII. Consequences of a Problematic Regression Space     Presentation (pdf)     (updated 20150101, minor corrections)
XIV. Model Building     Presentation (pdf)     (updated 20150101, minor corrections)
XV. Maximum Likelihood Estimation in a Full-Rank Normal Linear Model     Presentation (pdf)     (updated 20150101, minor corrections)
XVI. Asymptotic Properties of the Least Squares Estimators and Sandwich Estimator of the Covariance Matrix     Presentation (pdf)     (updated 20150101, Section 4 extended)
Appendix: Matrices     Presentation (pdf)     (updated 20150101, minor corrections)
Appendix: Distributions     Presentation (pdf)     (updated 20150101, minor corrections)
Appendix: Maximum-Likelihood Theory     Presentation (pdf)     (updated 20150105, minor corrections)


During the lectures, many things will be illustrated using output from R analyses. The corresponding R scripts, which will be mentioned or only partially shown during the lectures, are made available here. It is assumed that, as part of their preparation for subsequent lectures and exercise classes, students go through those scripts individually, trying to understand the output and to link it to the content of the lectures. It is also assumed that students will be able to use the provided R scripts (e.g., by copy-pasting and modifying parts of them) to solve problems given during the exercise classes.

Understanding of the material provided in those R scripts is also assumed for the exam.

Code that creates some of the plots is mostly wrapped by a PDF() function call at the beginning and a closing function call at the end. If you want to see the plots in a regular graphics window, do not run those two commands. If you do run them, the plot is stored as a pdf file in your working directory and does not appear in a graphics window.
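The wrapping pattern can be sketched with base R only. This is a minimal illustration and not the course's own code: it assumes that the PDF() function used in the scripts behaves like the base R pdf() graphics device, which is closed with dev.off(). The flag save_to_pdf and the output file name are hypothetical, and the built-in cars dataset stands in for the course data.

```r
# Hypothetical switch: set to FALSE to draw into a regular graphics window.
save_to_pdf <- TRUE

# Illustrative output file (a temporary directory is used here; the course
# scripts write into the working directory instead).
out_file <- file.path(tempdir(), "example-plot.pdf")

# Open the PDF device: everything plotted from now on goes to the file.
if (save_to_pdf) pdf(out_file, width = 7, height = 5)

# 'cars' ships with base R (package datasets).
plot(dist ~ speed, data = cars, main = "Stopping distance vs. speed")
abline(lm(dist ~ speed, data = cars), col = "red")

# Close the device: only now is the file completely written.
if (save_to_pdf) dev.off()
```

Skipping the two device lines (or setting save_to_pdf to FALSE in this sketch) sends the same plot to the active on-screen device in an interactive session.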

.Rprofile    Script containing some settings and smaller functions
(to be sourced at the beginning of most of the scripts below)
(updated 20141028)
II. Least Squares Estimation
LinRegr-02-01.R    Matrix algebra background of linear regression
IV. Quantitative Covariates
LinRegr-04-01.R    Regression line (Cars2004nh)
LinRegr-04-02.R    Two quantitative covariates (Cars2004nh)
LinRegr-04-03.R    Simpson's paradox (Policie)
V. Categorical Covariates
LinRegr-05-01.R    Regression with one categorical covariate (Cars2004nh)
LinRegr-05-02.R    Regression with continuous as well as categorical covariate (Cars2004nh)
VI. Normal Linear Model
LinRegr-06-01.R    Joint inference on a vector of estimable parameters (Cars2004nh)
LinRegr-06-02.R    Confidence interval for the model based mean, prediction interval (Hosi0)
LinRegr-06-03.R    Confidence interval for the model based mean, prediction interval (Kojeni)
VII. Submodel
LinRegr-07-01.R    Submodels testing (Cars2004nh)
VIII. Generalized Least Squares
LinRegr-08-01.R    Weighted least squares (Kojeni and wKojeni)
IX. Regression Diagnostics
LinRegr-09-01.R    Quantities of regression diagnostics (Cars2004)
LinRegr-09-02.R    Partial residuals (Cars2004nh)
LinRegr-09-03.R    Partial residuals (Policie)
LinRegr-09-04.R    Residual plots and tests on assumptions (Cars2004nh)
LinRegr-09-05.R    Checking homoscedasticity (Draha)
LinRegr-09-06.R    Checking uncorrelated errors (Olympic)
XI. Two-Way Analysis of Variance
LinRegr-11-01.R    Two-Way Analysis of Variance (Howells)
XII. Simultaneous Inference in a Linear Model
LinRegr-12-01.R    Multiple comparison procedures in two-way ANOVA (Howells)
LinRegr-12-02.R    Illustration of the Tukey distribution of the studentized range
LinRegr-12-03.R    Multiple comparison procedures in ANCOVA context (Cars2004nh)
LinRegr-12-04.R    Confidence band around and for the regression function (Kojeni)
XIII. Consequences of a Problematic Regression Space
LinRegr-13-01.R    Multicollinearity (IQ)
LinRegr-13-02.R    Multicollinearity (Cars2004nh)
XIV. Model Building
LinRegr-14-01.R    ANOVA with log-transformed response to get normality and homoscedasticity (Houses1987)
LinRegr-14-02.R    Regression with log-transformed response to stabilize the variance, Box-Cox transformation (Cars2004nh)
LinRegr-14-03.R    Simple regression after transformation of the regressor, polynomial regression, regression splines (Houses1987)
LinRegr-14-04.R    Illustration of the B-spline basis
LinRegr-14-05.R    Regression splines (Motorcycle)
LinRegr-14-06.R    Model comparison and selection (Cars2004nh)


Those who are not very familiar with the R software are advised to go through, before the first exercise class, the R-related materials available on the webpage of the NMSA230: Software for Mathematics and Stochastics course.

Most datasets we will work with during the exercise classes, plus a few smaller R functions, are available in the form of an extension R package, mffSM, which can be installed "from a local repository" after downloading it from the appropriate link below. The Windows binary file is intended for MS Windows users (as the title suggests); the source code is intended for users who are used to compiling their software from source (mostly Linux, Mac etc. users). The mffSM package depends on the packages colorspace, lattice and car, which are available in the standard way from CRAN. All those dependency packages should normally be installed automatically if the mffSM package is installed directly from the R console on an Internet-connected computer using the following command (or an appropriately modified version of it):

install.packages("C:/WHERE_DOWNLOADED/", repos = NULL)
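If the dependencies are not picked up automatically, a slightly more defensive variant may help. The R documentation of install.packages() states that its dependencies argument is not used when repos = NULL, so the sketch below first installs any missing CRAN dependencies and only then installs the downloaded file. The helper names missing_packages() and install_mffSM() are hypothetical and are not part of mffSM; the path in the usage comment must be adapted.

```r
# Return those packages from 'pkgs' that are not installed yet.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: install the CRAN dependencies first, then the
# downloaded mffSM file itself.
install_mffSM <- function(pkg_file,
                          deps = c("colorspace", "lattice", "car")) {
  stopifnot(file.exists(pkg_file))
  to_install <- missing_packages(deps)
  if (length(to_install) > 0) install.packages(to_install)  # from CRAN
  install.packages(pkg_file, repos = NULL)                   # from local file
}

# Usage (adapt the path; the file name is the source tarball listed below):
# install_mffSM("C:/WHERE_DOWNLOADED/mffSM_1.0.tar.gz")
```

Nothing is installed until install_mffSM() is called explicitly, so the definitions can be sourced safely.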

The mffSM package might be updated during the semester. If this happens, the version number (like 1.0) will be increased and new files will be available.

Windows binary:    (published 20140923)
Source code:    mffSM_1.0.tar.gz    (published 20140923)

R scripts and assignments for exercise classes

Class #1 (02/10, 07/10)    Assignment (R script)    (published 20140924)
Class #2 (09/10, 14/10)    Assignment (R script)    (published 20141008)
    Data (csv)   
Class #3 (16/10, 21/10)    Assignment (R script)    (published 20141014)
Class #4 (30/10, 04/11)    Assignment (R script)    (published 20141029)
    Data (csv)   
Class #5 (06/11, 11/11)    Assignment (R script)    (published 20141105)
    Data (csv)    4 slides (pdf)
Class #6 and #7 (13 and 20/11, 18 and 25/11)    Assignment (R script)    (published 20141112)
Class #8 (27/11, 02/12)    Assignment (R script)    (published 20141125)
Class #9 (04/12, 09/12)    Assignment (R script)    (published 20141202)
    Data (RData)    Precalculated results (RData)
Class #10 (11/12, 16/12)    Assignment (R script)    (published 20141202)
Class #11 (06/01, 08/01)    Assignment (R script)    (published 20150101)
    Data (csv)   

Homework assignments

Homework 1 (Deadline Th: 16/10, Deadline Tu: 21/10)    Assignment (R script)    Data (RData)    (published 20140923)
General comments on the quality of the delivered reports (by Karel Zvára): pdf.
Homework 2 (Deadline 25/11)    Assignment (R script)    Data (RData)    (published 20141105)
Homework 3 (Deadline 25/12, 05:59)    Assignment (R script)    Data (csv)    (published 20141202)


The final requirements to get a course credit were published on 23/09/2014 and are available inside this document (pages 7–15).

Sample assignments of the final test towards the course credit are available here (pdf, updated 20141205).

A summary table of the points obtained for the delivered homework and the test is available here (pdf).

