NMST539 | Lab Session 2
Multivariate Normal Distribution
LS 2017 | Monday 26/02/18
Rmd file (UTF-8 encoding)

Outline of the second NMST539 lab session:
For the second lab session we use some useful tools for working with the multivariate normal distribution, available within the standard R software installation and in the R package ‘mvtnorm’. Some additional options and practical R functions targeting multivariate distributions will be discussed too.

The R software is available for download (GNU licence, free of charge) from the website: https://www.r-project.org

A user-friendly interface (one of many), which is however not necessary for the proper functioning of the R software: RStudio.

Some (rather brief) manuals and introductory tutorials for R (in Czech or English):
1. R package ‘mvtnorm’

First, we need to make sure that the library for the multivariate normal distribution (the R package ‘mvtnorm’) is installed and loaded.
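A minimal sketch of how the package can be installed (only if it is missing) and loaded. The CRAN mirror URL used below is our own assumption; any other mirror works as well.

```r
# Install 'mvtnorm' from CRAN only if it is not available yet.
# The mirror below is an assumed choice; any CRAN mirror can be used.
if (!requireNamespace("mvtnorm", quietly = TRUE)) {
  install.packages("mvtnorm", repos = "https://cloud.r-project.org")
}

# Attach the package so that its functions can be called directly.
library(mvtnorm)
```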
We can now use the loaded library to easily generate random values from a multivariate normal distribution with a prespecified variance-covariance matrix \(\Sigma\) (a symmetric and positive definite matrix). For more details about the library (including a PDF manual) see the library website on CRAN:

Comment
It is useful to keep in mind that every time we are about to use a random generator in R (or in any other software), it is good practice to set the initial values for the generator beforehand (R provides a command for this). For some specific purposes (master's thesis, scientific paper, simulations, etc.) this is even a strictly required step. For instance, try the following command and compare the outputs with each other:
and compare it with the same command, this time with the initial state of the generator set beforehand.
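The comparison can be sketched as follows. We assume here that the generator is initialised with the base R function `set.seed()`; the command `rnorm(5)` and the seed value 1234 are our own illustrative choices.

```r
# Two consecutive calls without resetting the generator:
# the draws differ from call to call (and from person to person).
x1 <- rnorm(5)
x2 <- rnorm(5)
print(rbind(x1, x2))

# Setting the same initial state before each call reproduces the draws exactly.
set.seed(1234)   # arbitrary seed value, chosen for illustration
y1 <- rnorm(5)
set.seed(1234)
y2 <- rnorm(5)
print(identical(y1, y2))   # TRUE
```

With a fixed seed, everyone running the code obtains the same "random" values, which is what makes a simulation reproducible.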
Is it clear what the difference between these two outputs is?

Example
Let us start with a simple (two-dimensional) example: we are going to generate a random sample of size \(n \in \mathbb{N}\) from the two-dimensional normal distribution for \(\boldsymbol{X}_{i} = (X_{i 1}, X_{i 2})^\top\). Remember that in the case of the one-dimensional random generator for the normal distribution in R (command
The sample of size \(n=1000\) from the given two-dimensional normal distribution can be generated by the following command:
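A minimal sketch of such a command, using `rmvnorm()` from the ‘mvtnorm’ package. The mean vector, the variance-covariance matrix, the seed and the object names below are our own illustrative assumptions, not values taken from the lab.

```r
library(mvtnorm)

set.seed(1234)                          # arbitrary seed, for reproducibility
n     <- 1000
mu    <- c(2, 3)                        # assumed mean vector
Sigma <- matrix(c(10, 3, 3, 2), 2, 2)   # assumed covariance matrix (symmetric, positive definite)

# Each row of X is one observation (X_i1, X_i2).
X <- rmvnorm(n, mean = mu, sigma = Sigma)
print(dim(X))
print(head(X))
```

Note that `rmvnorm()` takes the variance-covariance matrix itself in its `sigma` argument.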
We can plot the generated random sample in a scatterplot using, for instance, one of the standard R plotting commands.
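One possible way to draw the scatterplot, assuming the base R function `plot()` is used. The sample is regenerated here so that the code runs on its own; the distribution parameters, axis labels and colours are illustrative assumptions.

```r
library(mvtnorm)

set.seed(1234)   # arbitrary seed
# Assumed parameters, chosen only for illustration.
X <- rmvnorm(1000, mean = c(2, 3), sigma = matrix(c(10, 3, 3, 2), 2, 2))

# First coordinate on the horizontal axis, second on the vertical axis.
plot(X[, 1], X[, 2],
     pch = 21, bg = "lightblue",
     xlab = "Marginal No. 1", ylab = "Marginal No. 2",
     main = "Sample from a two-dimensional normal distribution")
```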
Using the mean estimates and the sample variance-covariance matrix, we can easily draw some ‘depth’ contours in the figure. Another option, of course, is to use the theoretical expression for the multivariate normal density, as we know the distribution we generated from (which is usually not the case in real-life situations). In the following, however, we will rather use the generated sample and calculate the sample mean vector and the sample variance-covariance matrix.
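The two estimates can be obtained, for instance, with the base R functions `colMeans()` and `cov()`. The sample below is regenerated with assumed parameters so that the sketch is self-contained; the object names `mu_hat` and `Sigma_hat` are our own.

```r
library(mvtnorm)

set.seed(1234)   # arbitrary seed
# Assumed parameters, chosen only for illustration.
X <- rmvnorm(1000, mean = c(2, 3), sigma = matrix(c(10, 3, 3, 2), 2, 2))

mu_hat    <- colMeans(X)   # sample mean vector (one mean per column)
Sigma_hat <- cov(X)        # sample variance-covariance matrix (divisor n - 1)

print(mu_hat)
print(Sigma_hat)
```

With \(n = 1000\) observations, both estimates should lie reasonably close to the parameters used for the simulation.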
The two-dimensional normal density function is defined as
\[
f(\boldsymbol{x}) = \frac{1}{2\pi |\Sigma|^{1/2}} \exp\left\{ -\frac{1}{2} (\boldsymbol{x} - \boldsymbol{\mu})^\top \Sigma^{-1} (\boldsymbol{x} - \boldsymbol{\mu}) \right\},
\]
where \(\boldsymbol{x} = (x_{1}, x_{2})^\top \in \mathbb{R}^2\), \(\boldsymbol{\mu} = (\mu_{1}, \mu_{2})^\top\) is the mean vector, and \(\Sigma\) is a given variance-covariance matrix (which is symmetric and positive definite). Thus, we also need the inverse of the sample variance-covariance matrix \(\widehat{\Sigma}\) to be able to draw the corresponding contours. The R function
This can now be applied in the following way:
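A minimal sketch, assuming the inverse is computed with the base R function `solve()`. The helper `dens2()` is a hypothetical function of our own that evaluates the two-dimensional normal density with the estimated parameters plugged in; the simulation parameters are illustrative assumptions.

```r
library(mvtnorm)

set.seed(1234)   # arbitrary seed
# Assumed parameters, chosen only for illustration.
X <- rmvnorm(1000, mean = c(2, 3), sigma = matrix(c(10, 3, 3, 2), 2, 2))

mu_hat    <- colMeans(X)
Sigma_hat <- cov(X)
Sigma_inv <- solve(Sigma_hat)   # inverse of the sample variance-covariance matrix

# Hypothetical helper: two-dimensional normal density at a point x.
# Uses |Sigma|^(-1/2) = |Sigma^(-1)|^(1/2).
dens2 <- function(x, mu, Sigma_inv) {
  d <- x - mu
  q <- drop(t(d) %*% Sigma_inv %*% d)   # quadratic form in the exponent
  sqrt(det(Sigma_inv)) / (2 * pi) * exp(-q / 2)
}

# The value should agree with dmvnorm() from the 'mvtnorm' package.
x0 <- c(1, 2)
print(dens2(x0, mu_hat, Sigma_inv))
print(dmvnorm(x0, mean = mu_hat, sigma = Sigma_hat))
```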
The final plot is produced by the following couple of commands:
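One way to produce such a figure is sketched below: the estimated density is evaluated on a regular grid and its level curves are added to the scatterplot with the base R function `contour()`. The grid size, colours and simulation parameters are our own illustrative assumptions, and `dmvnorm()` from ‘mvtnorm’ is used to evaluate the density.

```r
library(mvtnorm)

set.seed(1234)   # arbitrary seed
# Assumed parameters, chosen only for illustration.
X <- rmvnorm(1000, mean = c(2, 3), sigma = matrix(c(10, 3, 3, 2), 2, 2))

mu_hat    <- colMeans(X)
Sigma_hat <- cov(X)

# Regular grid covering the range of the sample.
x1 <- seq(min(X[, 1]), max(X[, 1]), length.out = 100)
x2 <- seq(min(X[, 2]), max(X[, 2]), length.out = 100)
grid <- as.matrix(expand.grid(x1, x2))   # first coordinate varies fastest

# Density values arranged so that z[i, j] corresponds to (x1[i], x2[j]).
z <- matrix(dmvnorm(grid, mean = mu_hat, sigma = Sigma_hat), nrow = length(x1))

plot(X[, 1], X[, 2], pch = 21, bg = "lightblue",
     xlab = "Marginal No. 1", ylab = "Marginal No. 2")
contour(x1, x2, z, add = TRUE, col = "red", lwd = 2)    # density contours
points(mu_hat[1], mu_hat[2], pch = 19, col = "red")     # sample mean
```

The contours are ellipses centred at the sample mean, with shape and orientation determined by the sample variance-covariance matrix.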
