Linear regression (NMSA407)

Arnošt Komárek

Winter semester 2017–18

SIS pages of the course:    ENG    CZE

TIMETABLE

Lectures: Thursday 9:00 in K1   
Thursday 14:00 in K1   
Exercise class (MM1): Tuesday 10:40 in K4    (RNDr. Matúš Maciak, Ph.D.)
Exercise class (MM2): Tuesday 12:20 in K4    (RNDr. Matúš Maciak, Ph.D.)
Exercise class (SN): Friday 9:00 in K11    (Mgr. Stanislav Nagy, Ph.D.)
  • Both lectures and all exercise classes are taught in English.
  • Personal communication with the lecturer and the exercise class instructors can also be conducted in Czech or Slovak.

ANNOUNCEMENTS

24/09/2017:   Shift of three lectures from January to October/November
There will be only one lecture in January 2018. The remaining three January lectures are moved to October/November 2017. All replacement lectures take place on three Wednesdays, in a time slot that does not clash with other lectures or exercise classes for first-year students of the PMSE and FPM study branches:
11/10/2017 from 15:40 in K1.
25/10/2017 from 15:40 in K1.
22/11/2017 from 15:40 in K1.
 

COURSE NOTES

Course notes will be gradually updated. They provide a record of the lecture, including notes and comments that may have been mentioned only orally. In many cases, the course notes do not include proofs or derivations, especially those that are fully shown on the blackboard during the lecture.

The lecture will follow the notes quite closely and more or less linearly. Students are advised to bring printed course notes to the lecture and supplement them with their own hand-written notes. Not everything that is said will be written on the blackboard (especially various remarks). Statements of definitions and theorems will also not be fully written on the blackboard.

Notes (pdf), chapters 1 – 8, appendices published 20170924.
  chapters 9 and 10 added 20171019
  chapter 11 added 20171112
  chapter 12 added 20171123

In addition to the gradually updated course notes, the full course notes from the academic year 2016–17 are available here. Those may differ from the 2017–18 version in places. Moreover, errors found in the 2016–17 version are corrected only in the 2017–18 update.

COURSE SLIDES

Course slides will be projected during the lecture. They mainly contain

  • the structure of the lecture;
  • statements of definitions and theorems;
  • some illustrative plots/computer output.
Course slides alone are rather incomplete as study material. In principle, it is not necessary to print the slides. The information they contain is just a subset of the information included in the notes, only in a different format (suitable for projection).

Main lecture (pdf), chapters 1 – 8 published 20170924.
 chapters 9 – 10 added 20171019
 chapter 11 added 20171112
 chapter 12 added 20171123
 
Appendices (pdf) published 20170924

The course slides used in the academic year 2016–17 are available here. Note that they may differ in places from the slides used in 2017–18.

SUPPLEMENTARY R PACKAGE

The course is supplemented by the R package mffSM, which contains example datasets used throughout the course and a few additional small functions related to processing of the linear model fit. After download (from the link below, not from CRAN), the package can be installed in R in the standard way ("from a local repository"). The Windows binary file is intended for MS Windows users (as the title suggests); the source code is intended for users of other (mostly more reliable) operating systems where it is standard to compile the package from its source (Linux, Mac etc.). The mffSM package depends on the packages colorspace, lattice and car, which are available in the standard way from CRAN. All those dependency packages should normally be installed automatically if the mffSM package is installed directly from the R console on an Internet-connected computer using the following command (appropriately modified):

install.packages("PATH_WHERE_DOWNLOADED/mffSM_1.1.[tar.gz,zip]", repos = NULL)

Source code:   mffSM_1.1.tar.gz
Windows binary:   mffSM_1.1.zip
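
The installation described above might look as follows in practice. This is only a sketch under stated assumptions: the directory PATH_WHERE_DOWNLOADED is the same placeholder as in the command above, the object names pkg_file, deps and missing_deps are illustrative, and the CRAN mirror URL is one possible choice. Because installation from a local file with repos = NULL may not fetch dependencies in every setup, the sketch first installs any missing dependency packages from CRAN.

```r
# Placeholder location of the downloaded file; adjust to your system.
# Use "mffSM_1.1.zip" instead of the tar.gz on MS Windows.
pkg_file <- file.path("PATH_WHERE_DOWNLOADED", "mffSM_1.1.tar.gz")

# Dependencies named in the text above; install only those not yet present.
deps <- c("colorspace", "lattice", "car")
missing_deps <- setdiff(deps, rownames(installed.packages()))
if (length(missing_deps) > 0) {
  # The mirror URL is an assumption; any CRAN mirror should work.
  install.packages(missing_deps, repos = "https://cloud.r-project.org")
}

# Install mffSM from the local file and load it.
if (file.exists(pkg_file)) {
  install.packages(pkg_file, repos = NULL)
  library("mffSM")
} else {
  message("File not found: ", pkg_file)
}
```

If the call succeeds, library("mffSM") loads the package; data(package = "mffSM") should then list the example datasets it provides.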


R TUTORIALS

R tutorials show R analyses that are based on the theory given during the lectures. They also provide the code used to prepare most of the output and plots used as illustrations during the lectures. The R tutorials may serve as a reference for the assignments performed during the exercise classes or required as homework.

R tutorials will be gradually published during the semester, in step with the topics covered by the lecture. All tutorials from the academic year 2016–17 are available here.

The R scripts provided below assume that the content of the .Rprofile is sourced at start.
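
One way to meet this assumption is sketched below. R reads a file named .Rprofile from the current working directory (or, failing that, from the user's home directory) at startup, so placing the course .Rprofile there is usually enough. Alternatively, the file can be sourced by hand at the start of a session. The path below is a placeholder and the object name rprofile is illustrative.

```r
# Placeholder path to the downloaded course .Rprofile; adjust as needed.
rprofile <- file.path("PATH_WHERE_DOWNLOADED", ".Rprofile")

if (file.exists(rprofile)) {
  # Run the settings in the current session before any tutorial script.
  source(rprofile)
} else {
  message("No .Rprofile found at: ", rprofile)
}
```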

1. Linear Model
  1. Simple illustration of a linear model (data Hosi0)    html    R code
 
2. Least Squares Estimation
  1. Matrix algebra background of linear regression    html    R code
  2. R function lm    html    R code
 
3. Normal Linear Model
  1. Inference in a model with the regression line (data Cars2004nh)    html    R code
  2. Joint inference on a vector of estimable parameters (data Cars2004nh)    html    R code
  3. Confidence interval for the model based mean, prediction interval (data Hosi0)    html    R code
  4. Confidence interval for the model based mean, prediction interval (data Kojeni)    html    R code
 
4. Basic Regression Diagnostics
  1. Basic Regression Diagnostics (data Cars2004nh)    html    R code
 
7. General Linear Model
  1. Weighted least squares (data Kojeni and wKojeni)    html    R code
 
8. Parameterizations of Covariates
  1. Numeric covariate: simple transformation, polynomial regression, regression splines (data Houses1987)    html    R code
  2. Numeric covariate: regression splines (data Motorcycle)    html    R code
  3. Categorical nominal covariate (data Cars2004nh)    html    R code
  4. Categorical ordinal covariate (data Cars2004nh)    html    R code
 
9. Additivity and Interactions
  1. Two numeric covariates (data Cars2004nh)    html    R code
  2. Numeric and categorical covariate (data Cars2004nh)    html    R code
  3. ANOVA tables of type I, II and III (data Cars2004nh)    html    R code
 
10. Analysis of Variance
  1. Two-way Analysis of Variance (data Howells)    html    R code
 
11. Simultaneous Inference in a Linear Model
  1. Multiple comparison procedures (Tukey, Hothorn–Bretz–Westfall) (data Howells)    html    R code
  2. Multiple comparison procedures (Hothorn–Bretz–Westfall) (data Cars2004nh)    html    R code
  3. Confidence band around and for the regression function (data Kojeni)    html    R code
 
12. Checking Model Assumptions
  1. Partial residuals, Simpson's paradox (data Policie)    html    R code
  2. Partial residuals (data Cars2004nh)    html    R code
  3. Residual plots and tests on assumptions (data Cars2004nh)    html    R code
  4. Checking homoscedasticity (data Draha)    html    R code
  5. Checking uncorrelated errors (data Olympic)    html    R code
  6. Transformation of response: ANOVA with log-transformed response to get normality and homoscedasticity (data Houses1987)    html    R code
  7. Transformation of response: Regression with log-transformed response to stabilize the variance, Box–Cox transformation (data Cars2004nh)    html    R code
 

EXERCISE CLASSES

All information related to the exercise classes is available at the central exercise classes webpage.

Requirements to get the course credit (zápočet) are described here (published on 26 September 2017).

Exercise classes are synchronized. Content of the classes held in the same week is approximately the same.

EXAM

  • A course credit (zápočet) is required to take the exam.
  • The exam grade will be based on three parts:
    1. Take-home project (practical analysis), with results delivered as a written report by the prescribed deadline. Assignments will be published on January 2, 2018 at the latest.
    2. Written part composed of theoretical and semi-practical assignments (no computer analysis).
    3. Oral part.

All exams take place between January 15 and February 16, 2018. There will be four to five opportunities to take the exam spread over this period. No exam dates will be offered after this period.
