# NMST539 | Lab Session 2

## Multivariate Normal Distribution

### LS 2017 | Thursday 02/03/17

###### Rmd file (UTF8 coding)

For the second NMST539 lab session we will explore some tools available in R for working with the multivariate normal distribution.

A user-friendly interface (one of many): RStudio.

Manuals and introductions to R (in Czech or English):

• Bína, V., Komárek, A. and Komárková, L.: Jak na jazyk R. (PDF file)
• Komárek, A.: Základy práce s R. (PDF file)
• Kulich, M.: Velmi stručný úvod do R. (PDF file)
• De Vries, A. and Meys, J.: R for Dummies. (ISBN-13: 978-1119055808)

#### R package ‘mvtnorm’

To begin, we need to make sure that the library for the multivariate normal distribution (mvtnorm) is loaded in the R working environment (use the command library("mvtnorm")). If the library cannot be loaded, we need to install it first by running the command install.packages("mvtnorm").

library("mvtnorm")
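The load-or-install step described above can be sketched as follows. This is a minimal sketch that assumes a working internet connection and a configured CRAN mirror; requireNamespace() is used here only to check whether the package is available without attaching it.

```r
# Install mvtnorm only if it is not yet available, then attach it.
if (!requireNamespace("mvtnorm", quietly = TRUE)) {
  install.packages("mvtnorm")  # assumes a CRAN mirror is reachable
}
library("mvtnorm")
```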

We can now use the loaded library to easily generate random values from a multivariate normal distribution with a pre-specified variance-covariance matrix $$\Sigma$$.

For more details about the library (including the reference manual), see the library website on CRAN:
https://cran.r-project.org/web/packages/mvtnorm/index.html

#### Comment

It is useful to keep in mind that whenever we use a random number generator in R (or any other software), it is good practice to fix the initial state of the generator beforehand (in R, the command set.seed() does the job). This ensures that the obtained results are reproducible and can be easily verified.

For instance, try the following command:

(sample <- rnorm(10))
##  [1] -0.7465914  1.5746143 -0.0272531  0.4167468  1.2033093  0.7232020
##  [7]  0.4029413  0.7172862  2.1043320  0.9213117

and compare it with the same command, this time preceded by setting the initial state of the generator with set.seed(1234) (any integer value can be used as the argument):

set.seed(1234)
(sample <- rnorm(10))
##  [1] -1.2070657  0.2774292  1.0844412 -2.3456977  0.4291247  0.5060559
##  [7] -0.5747400 -0.5466319 -0.5644520 -0.8900378
Is it clear what the difference between these two outputs is?
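The difference can also be checked directly. The sketch below (the object names are chosen for illustration only) draws two samples after resetting the same seed and a third one without resetting it; only the first two coincide.

```r
set.seed(1234)
first <- rnorm(10)

set.seed(1234)            # reset the generator to the same initial state
second <- rnorm(10)

third <- rnorm(10)        # no reset here: the generator state has moved on

identical(first, second)  # TRUE: same seed, same values
identical(first, third)   # FALSE: different generator state
```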

Let us start with a simple (two-dimensional) example: we will generate a random sample of size $$n \in \mathbb{N}$$ from the two-dimensional normal distribution

$$N_{2}\Big(\boldsymbol{\mu} = \Big(\begin{array}{c}2\\3\end{array}\Big), \Sigma = \Big(\begin{array}{cc}10^2 & 6^2\\ 6^2 & 6^2\end{array}\Big) \Big)$$.

Remember that for the one-dimensional normal random generator in R (command rnorm()) one needs to specify the standard deviation rather than the variance, whereas rmvnorm() expects the variance-covariance matrix (compare the help pages ?rnorm and ?rmvnorm).
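To illustrate the parametrization of rnorm(), here is a small sketch using base R only (the object name and the sample size are chosen for this illustration): a variance of $$10^2$$ is passed to rnorm() as sd = 10, and the sample variance should then be close to 100.

```r
set.seed(1234)
x <- rnorm(100000, mean = 2, sd = 10)  # sd is the standard deviation, not the variance

var(x)  # should be close to 10^2 = 100
sd(x)   # should be close to 10
```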

The sample of size $$n=1000$$ from the given two-dimensional normal distribution can be generated by the following command:

set.seed(1234)
s1 <- rmvnorm(1000, mean = c(2, 3), sigma = matrix(c(10^2, 6^2, 6^2, 6^2), nrow = 2, ncol = 2))
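As a quick sanity check, the empirical moments of the generated sample can be compared with the specified parameters. The following is a sketch only: the names mu and Sigma are introduced here for illustration, and the sample is regenerated so that the block runs on its own.

```r
library("mvtnorm")

mu <- c(2, 3)
Sigma <- matrix(c(10^2, 6^2, 6^2, 6^2), nrow = 2, ncol = 2)

set.seed(1234)
s1 <- rmvnorm(1000, mean = mu, sigma = Sigma)

dim(s1)         # 1000 rows (observations), 2 columns (marginals)
colMeans(s1)    # should be close to mu
cov(s1)         # should be close to Sigma
cov2cor(Sigma)  # theoretical correlation matrix; off-diagonal is 6^2 / (10 * 6) = 0.6
```

With only 1000 observations the estimates will not match the parameters exactly; the agreement improves as the sample size grows.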

We can display the generated random sample in a scatterplot using, for instance, the command

plot(s1, pch=21, xlim=c(-35, 35), ylim=c(-35,35), xlab="Marginal No. 1", ylab="Marginal No. 2", bg = "lightblue")