Základy regrese | NMFM 334

Letný semester 2024 | Cvičenie 9 | 23.04.2024
HTML Markdown – Rmd source file (UTF encoding)



Outline of the lab session no.9:



The idea is to consider a general version of the linear regression model where \[ \boldsymbol{Y} \sim N(\mathbb{X}\boldsymbol{\beta}, \sigma^2 \mathbb{W}^{-1}), \] where \(\boldsymbol{Y} = (Y_1, \dots, Y_n)^\top\) is the response vector for \(n \in \mathbb{N}\) independent subjects, \(\mathbb{X}\) is the corresponding model matrix (of a full rank \(p \in \mathbb{N}\)) and \(\boldsymbol{\beta} \in \mathbb{R}^p\) is the vector of the unknown parameters.

Note, that the normality assumption above is only used as a simplification but it is not needed to fit the model (without normality the results and the inference will hold asymptoticaly).

Considering the variance structure, reflected in the matrix \(\mathbb{W} \in \mathbb{R}^{n \times n}\) (a diagonal matrix because the subjets are mutually independent) can can distinguish two cases:

  1. the variance matrix \(\mathbb{W}\) is a known matrix
  2. the variance matrix \(\mathbb{W}\) is an unknown matrix

The first case leads to the so-called general linear model (not a generalized linear model, which is a different class of models) and the second case leads to a regression model under heteroscedastic errors. Both cases will be briefly addressed below.



1. General linear model

We start with the dataset (heads) which respresents head sizes of fetuses (from an ultrasound monitoring) in different periods of pregnancy. Each data value (row in the dataframe) provides an average over \(n \in \mathbb{N}\) measurements of \(n\) independent subjects (mothers).

The information provided in the data stand for the: \(i\) serial number; \(t\) week of the pregnancy; \(n\) number of measurements used to calculate the average head size; and \(y\) the average head size obtained from the particular number of measurements

 library("mffSM")
 
 data(Hlavy, package = "mffSM")
 
 ### Exploratory analysis
 head(Hlavy)
 ##   i  t  n     y
 ## 1 1 16  2 5.100
 ## 2 2 18  3 5.233
 ## 3 3 19  9 4.744
 ## 4 4 20 10 5.110
 ## 5 5 21 11 5.236
 ## 6 6 22 20 5.740
 dim(Hlavy)
 ## [1] 26  4
 summary(Hlavy)
 ##        i               t               n               y        
 ##  Min.   : 1.00   Min.   :16.00   Min.   : 2.00   Min.   :4.744  
 ##  1st Qu.: 7.25   1st Qu.:23.25   1st Qu.:20.25   1st Qu.:5.871  
 ##  Median :13.50   Median :29.50   Median :47.00   Median :7.776  
 ##  Mean   :13.50   Mean   :29.46   Mean   :39.85   Mean   :7.461  
 ##  3rd Qu.:19.75   3rd Qu.:35.75   3rd Qu.:60.00   3rd Qu.:8.980  
 ##  Max.   :26.00   Max.   :42.00   Max.   :85.00   Max.   :9.633
 plot(y ~ t, data = Hlavy, pch = 23, col = "red3", bg = "orange")

The simple scatterplot above can be, however, substantially improved by accounting also for reliability of each observation (higher number of measurements used to obtain the average also means higher reliability). This can be obtain, for instance, by the following code:

 with(Hlavy, quantile(n, prob = seq(0, 1, by = 0.2)))
 ##   0%  20%  40%  60%  80% 100% 
 ##    2   11   31   53   60   85
 Hlavy <- transform(Hlavy, fn = cut(n, breaks = c(0, 10, 30, 50, 60, 90), 
                          labels = c("<=10", "11-30", "31-50", "51-60", ">60")))
 with(Hlavy, summary(fn))
 ##  <=10 11-30 31-50 51-60   >60 
 ##     5     5     5     6     5
 plotData <- function(){
  PAL <- rev(heat_hcl(length(levels(Hlavy[, "fn"])), c.=c(80, 30), l=c(30, 90), 
                      power=c(1/5, 1.3)))
  names(PAL) <- levels(Hlavy[, "fn"])
  plot(y ~ t, data = Hlavy, pch = 23, col = "black", bg = PAL[fn])
  legend(17, 9, legend = levels(Hlavy[, "fn"]), title = "n", fill = PAL)
  abline(v = 27, col = "lightblue", lty = 2)
 }    
 
 plotData()

It is reasonable to assume, that for each measurement we have some model of the form \[ \widetilde{Y}_i | \boldsymbol{X}_i \sim N(\boldsymbol{X}_i^\top\boldsymbol{\beta}, \sigma^2) \] which is a homoscedastic model. Unfortunattely, we do not observe \(\widetilde{Y}_i\) directly, but, instead, there a a set of independent measurements \(\widetilde{Y}_{ij}\) for \(j = 1, \dots, n_i\) (all measured conditionally on \(\boldsymbol{X}_i\)) and we only observe the overall average \(Y_i = \frac{1}{n_i}\sum_{j = 1}^{n_i} \widetilde{Y}_{ij}\).

It also holds, that \[ Y_i \sim N(\boldsymbol{X}_i^\top\boldsymbol{\beta}, \sigma^2/{n_i}), \] or, considering all data together, we have \[ \boldsymbol{Y} \sim N_n(\mathbb{X}\boldsymbol{\beta}, \sigma^2 \mathbb{W}^{-1}), \] where \(\mathbb{W}^{-1}\) is a diagonal matrix \(\mathbb{W}^{-1} = diag(1/n_1, \dots, 1/n_n)\) (thus, the variance structure is known from the design of the experiment).

In addition, practical motivation behind the model is the following: Practitioners say that the relationship \(y ~ t\) could be described by a piecewise quadratic function with \(t = 27\) being a point where the quadratic $relationship changes its shape.

From the practical point of view, the fitted curve should be continuous at \(t = 27\) and perhaps also smooth (i.e., with continuous at least the first derivative) as it is biologically not defensible to claim that at \(t = 27\) the growth has a jump in the speed (rate) of the growth.

As far as different observations have different credibility, we will use the general linear model with the known matrix \(\mathbb{W}\). The model can be fitted as follows:

 Hlavy <- transform(Hlavy, t27 = t - 27)
 
 ### Grid to calculate the fitted regression function
 tgrid <- seq(16, 42, length = 100)
 pdata <- data.frame(t27 = tgrid - 27)
 
 
 summary(mCont <- lm(y ~ t27 + I(t27^2) + (t27 > 0):t27 + (t27 > 0):I(t27^2), 
                    weights = n, data = Hlavy))
 ## 
 ## Call:
 ## lm(formula = y ~ t27 + I(t27^2) + (t27 > 0):t27 + (t27 > 0):I(t27^2), 
 ##     data = Hlavy, weights = n)
 ## 
 ## Weighted Residuals:
 ##      Min       1Q   Median       3Q      Max 
 ## -0.88212 -0.35216  0.01232  0.37917  0.85320 
 ## 
 ## Coefficients:
 ##                       Estimate Std. Error t value Pr(>|t|)    
 ## (Intercept)           7.049749   0.042425 166.170  < 2e-16 ***
 ## t27                   0.381177   0.030288  12.585 3.01e-11 ***
 ## I(t27^2)              0.016214   0.003943   4.112 0.000497 ***
 ## t27:t27 > 0TRUE      -0.063531   0.039426  -1.611 0.122017    
 ## I(t27^2):t27 > 0TRUE -0.026786   0.003776  -7.094 5.35e-07 ***
 ## ---
 ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
 ## 
 ## Residual standard error: 0.477 on 21 degrees of freedom
 ## Multiple R-squared:  0.9968,    Adjusted R-squared:  0.9962 
 ## F-statistic:  1652 on 4 and 21 DF,  p-value: < 2.2e-16



Indidividual work

  • What is estimated by the model above? Interpret the estimated parameters:
  • Can you manually reconstruct the estimated values for the mean structure and, also, the estimated values for the variance parameters?

    ### Some naive calculation 
    W <- diag(Hlavy$n);
    Y <- Hlavy$y;
    X <- model.matrix(mCont)
    
    ### Estimated regression coefficients 
    (betahat <- solve(t(X)%*%W%*%X)%*%(t(X)%*%W%*%Y));
    ##                             [,1]
    ## (Intercept)           7.04974918
    ## t27                   0.38117667
    ## I(t27^2)              0.01621412
    ## t27:t27 > 0TRUE      -0.06353122
    ## I(t27^2):t27 > 0TRUE -0.02678567
    coef(mCont)
    ##          (Intercept)                  t27             I(t27^2) 
    ##           7.04974918           0.38117667           0.01621412 
    ##      t27:t27 > 0TRUE I(t27^2):t27 > 0TRUE 
    ##          -0.06353122          -0.02678567
    ### Generalized fitted values 
    (Yg <- X%*%betahat);
    ##        [,1]
    ## 1  4.818714
    ## 2  4.932503
    ## 3  5.038040
    ## 4  5.176004
    ## 5  5.346398
    ## 6  5.549219
    ## 7  5.784468
    ## 8  6.052146
    ## 9  6.352252
    ## 10 6.684787
    ## 11 7.049749
    ## 12 7.356823
    ## 13 7.642754
    ## 14 7.907542
    ## 15 8.151186
    ## 16 8.373688
    ## 17 8.575046
    ## 18 8.755262
    ## 19 8.914334
    ## 20 9.052263
    ## 21 9.169049
    ## 22 9.264692
    ## 23 9.339192
    ## 24 9.392549
    ## 25 9.424762
    ## 26 9.435833
    fitted(mCont)
    ##        1        2        3        4        5        6        7        8 
    ## 4.818714 4.932503 5.038040 5.176004 5.346398 5.549219 5.784468 6.052146 
    ##        9       10       11       12       13       14       15       16 
    ## 6.352252 6.684787 7.049749 7.356823 7.642754 7.907542 8.151186 8.373688 
    ##       17       18       19       20       21       22       23       24 
    ## 8.575046 8.755262 8.914334 9.052263 9.169049 9.264692 9.339192 9.392549 
    ##       25       26 
    ## 9.424762 9.435833
    ### Generalized residual sum of squares 
    (RSS <- t(Y - Yg)%*%W%*%(Y - Yg));
    ##          [,1]
    ## [1,] 4.777559
    ### Generalized mean square 
    RSS/(length(Y)-ncol(X))
    ##           [,1]
    ## [1,] 0.2275028
    (summary(mCont)$sigma)^2
    ## [1] 0.2275028
    ### Be careful that Multiple R-squared:  0.9968 reported in the output 
    ### "summary(mCont)" is not equal to 
    sum((Yg-mean(Yg))^2)/sum((Y-mean(Y))^2)
    ## [1] 1.005601
    ### Instead, it equals the following quantity that is more difficult 
    ### to interpret 
    barYw <- sum(Y*Hlavy$n)/sum(Hlavy$n);   
    ### weigthed mean of the original observations 
    
    ### something as "generalized" regression sum of squares 
    SSR <- sum((Yg - barYw)^2 * Hlavy$n)
    
    ### something as "generalized" total variance 
    SST <- sum((Y - barYw)^2 * Hlavy$n); 
    
    ### "generalized" R squared
    SSR/SST
    ## [1] 0.9968318
    summary(mCont)$r.squared
    ## [1] 0.9968318
  • Compare the general linear model above with the simple model below. What are the main differences between these two models? Can you explain them?

    iii <- rep(1:nrow(Hlavy), Hlavy$n)
    Hlavy2 <- Hlavy[iii,]; # new dataset
    
    summary(mCont <- lm(y ~ t27 + I(t27^2) + (t27 > 0):t27 + (t27 > 0):I(t27^2), data = Hlavy2))
    ## 
    ## Call:
    ## lm(formula = y ~ t27 + I(t27^2) + (t27 > 0):t27 + (t27 > 0):I(t27^2), 
    ##     data = Hlavy2)
    ## 
    ## Residuals:
    ##      Min       1Q   Median       3Q      Max 
    ## -0.29404 -0.04505 -0.01469  0.04125  0.30050 
    ## 
    ## Coefficients:
    ##                        Estimate Std. Error t value Pr(>|t|)    
    ## (Intercept)           7.0497492  0.0060548 1164.32   <2e-16 ***
    ## t27                   0.3811767  0.0043227   88.18   <2e-16 ***
    ## I(t27^2)              0.0162141  0.0005628   28.81   <2e-16 ***
    ## t27:t27 > 0TRUE      -0.0635312  0.0056268  -11.29   <2e-16 ***
    ## I(t27^2):t27 > 0TRUE -0.0267857  0.0005389  -49.70   <2e-16 ***
    ## ---
    ## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
    ## 
    ## Residual standard error: 0.06807 on 1031 degrees of freedom
    ## Multiple R-squared:  0.9968, Adjusted R-squared:  0.9968 
    ## F-statistic: 8.11e+04 on 4 and 1031 DF,  p-value: < 2.2e-16





2. Model with heteroscedastic errors (sandwich estimate)

For a general heteroscedastic model it is usually assumed that
\[ Y_i | \boldsymbol{X}_i \sim N(\boldsymbol{X}_i^\top\boldsymbol{\beta}, \sigma^2(\boldsymbol{X}_i)) \] where \(\sigma^2(\cdot)\) is some positive (unknown) function of the response variables \(\boldsymbol{X}_i \in \mathbb{R}^p\). Note, that the normality assumption is here only given for simplification but it is not required.

In the following, we will consider the datasat that comes from the project MANA (Maturita na necisto) launched in the Czech Republic in 2005. This project was a part of a bigger project whose goal was to prepare a new form of graduation exam (“statni maturita”).
This dataset contains the results in English language at grammar schools.

load(url("http://msekce.karlin.mff.cuni.cz/~maciak/NMSA407/mana.RData"))



The variables in the data are: region (1 = Praha; 2 = Stredocesky; 3 = Jihocesky;4 = Plzensky; =5 = Karlovarsky; 6 = Ustecky; 7 = Liberecky; 8 = Kralovehradecky; 9 = Pardubicky; 10 = Vysocina; 11 = Jihomoravsky; 12 = Olomoucky; 13 = Zlinsky; 14 = Moravskoslezsky), specialization (1 = education; 2 = social science; 3 = languages; 4 = law; 5 = math-physics; 6 = engineering; 7 = science; 8 = medicine; 9 = economics; 10 = agriculture; 11 = art), gender (0 = female; 1 = male), graduation (0 = no; 1 = yes), entry.exam(0 = no; 1 = yes), score (score in English language in MANA test), avg.mark (aaverage mark (grade) from all subjects at the last report card), and mark.eng (average mark (grade) from the English language at the last 5 report cards).

For better intuition, we can transform the data as follows:

mana <- transform(mana, fregion = factor(region, levels = 1:14, 
                         labels = c("A", "S", "C", "P", "K", "U", "L", "H", "E",
                                    "J", "B", "M", "Z", "T")),
                        fspec   = factor(specialization, levels = 1:11, 
                                         labels = c("Educ", "Social", "Lang", 
                                                    "Law", "Mat-Phys", "Engin", 
                                                    "Science", "Medic", 
                                                    "Econom", "Agric", "Art")),
                        fgender = factor(gender, levels = 0:1, 
                                        labels = c("Female", "Male")),
                        fgrad   = factor(graduation, levels = 0:1,
                                         labels = c("No", "Yes")),
                        fentry  = factor(entry.exam, levels = 0:1, 
                                         labels = c("No", "Yes")))

summary(mana)
##      region       specialization       gender        graduation    
##  Min.   : 1.000   Min.   : 1.000   Min.   :0.000   Min.   :0.0000  
##  1st Qu.: 3.000   1st Qu.: 2.000   1st Qu.:0.000   1st Qu.:1.0000  
##  Median : 7.000   Median : 6.000   Median :0.000   Median :1.0000  
##  Mean   : 7.462   Mean   : 5.496   Mean   :0.415   Mean   :0.9585  
##  3rd Qu.:11.000   3rd Qu.: 8.000   3rd Qu.:1.000   3rd Qu.:1.0000  
##  Max.   :14.000   Max.   :11.000   Max.   :1.000   Max.   :1.0000  
##                                                                    
##    entry.exam         score          avg.mark        mark.eng    
##  Min.   :0.0000   Min.   : 5.00   Min.   :1.000   Min.   :1.000  
##  1st Qu.:0.0000   1st Qu.:40.00   1st Qu.:1.480   1st Qu.:1.400  
##  Median :0.0000   Median :44.00   Median :1.640   Median :2.000  
##  Mean   :0.4536   Mean   :43.02   Mean   :1.645   Mean   :2.176  
##  3rd Qu.:1.0000   3rd Qu.:48.00   3rd Qu.:1.810   3rd Qu.:2.800  
##  Max.   :1.0000   Max.   :52.00   Max.   :2.470   Max.   :4.200  
##                                                                  
##     fregion         fspec        fgender     fgrad      fentry    
##  T      : 923   Social : 855   Female:3044   No : 216   No :2843  
##  S      : 665   Econom : 846   Male  :2159   Yes:4987   Yes:2360  
##  B      : 634   Engin  : 638                                      
##  A      : 599   Medic  : 564                                      
##  L      : 534   Science: 557                                      
##  C      : 474   Educ   : 513                                      
##  (Other):1374   (Other):1230

Indidividual work

  • Consider the transformed data above and perform a simple exploratory analaysis (in terms of some simple sample characteristics and plots).
  • <Focus, in particular on the region in the Czech republic and the specialiation./li>



For illustration purposes we will fit a simple (additive) model to explain the average mark given the region and the specialization.

mAddit <- lm(avg.mark ~ fregion + fspec, data = mana, x = T)
summary(mAddit)
## 
## Call:
## lm(formula = avg.mark ~ fregion + fspec, data = mana, x = T)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.68845 -0.16441 -0.00517  0.15618  0.84024 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    1.77616    0.01432 124.065  < 2e-16 ***
## fregionS       0.00887    0.01334   0.665   0.5063    
## fregionC      -0.01336    0.01455  -0.918   0.3585    
## fregionP       0.02008    0.02197   0.914   0.3607    
## fregionK      -0.02198    0.02098  -1.048   0.2948    
## fregionU       0.05678    0.02049   2.772   0.0056 ** 
## fregionL       0.02388    0.01411   1.692   0.0907 .  
## fregionH       0.04861    0.02104   2.310   0.0209 *  
## fregionE       0.07980    0.02049   3.894 9.98e-05 ***
## fregionJ      -0.03138    0.01942  -1.616   0.1062    
## fregionB       0.03236    0.01348   2.400   0.0164 *  
## fregionM       0.02588    0.02159   1.199   0.2306    
## fregionZ       0.01653    0.01880   0.879   0.3794    
## fregionT       0.00846    0.01250   0.677   0.4984    
## fspecSocial   -0.15080    0.01325 -11.382  < 2e-16 ***
## fspecLang     -0.26387    0.01700 -15.520  < 2e-16 ***
## fspecLaw      -0.15020    0.01534  -9.789  < 2e-16 ***
## fspecMat-Phys -0.22192    0.01926 -11.525  < 2e-16 ***
## fspecEngin    -0.17752    0.01403 -12.655  < 2e-16 ***
## fspecScience  -0.10411    0.01448  -7.191 7.36e-13 ***
## fspecMedic    -0.14369    0.01446  -9.938  < 2e-16 ***
## fspecEconom   -0.16751    0.01330 -12.591  < 2e-16 ***
## fspecAgric    -0.07115    0.03120  -2.281   0.0226 *  
## fspecArt      -0.15527    0.02030  -7.650 2.38e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.236 on 5179 degrees of freedom
## Multiple R-squared:  0.07124,    Adjusted R-squared:  0.06712 
## F-statistic: 17.27 on 23 and 5179 DF,  p-value: < 2.2e-16
modelMatrix <- mAddit$x

plotLM(mAddit)

In the diagnostic plots, there are some obvious issues with heteroscedasticity. We will use the sandwich estimate of the variance matrix to account for this heteroscedasticty.

library(sandwich)


### Estimate of the variance matrix of the regression coefficients 
(VHC <- vcovHC(mAddit, type = "HC3"))
##                 (Intercept)      fregionS      fregionC      fregionP
## (Intercept)    0.0002035881 -1.045279e-04 -1.065979e-04 -1.052227e-04
## fregionS      -0.0001045279  1.942935e-04  1.020741e-04  1.023074e-04
## fregionC      -0.0001065979  1.020741e-04  2.077926e-04  1.022329e-04
## fregionP      -0.0001052227  1.023074e-04  1.022329e-04  4.862487e-04
## fregionK      -0.0001034781  1.011661e-04  1.006027e-04  1.010982e-04
## fregionU      -0.0001151940  1.022038e-04  1.021223e-04  1.022390e-04
## fregionL      -0.0001054516  1.022997e-04  1.022098e-04  1.026960e-04
## fregionH      -0.0001069183  1.017729e-04  1.019779e-04  1.022026e-04
## fregionE      -0.0001119843  1.034123e-04  1.036049e-04  1.040351e-04
## fregionJ      -0.0001020705  1.011792e-04  1.015031e-04  1.019322e-04
## fregionB      -0.0001056148  1.015463e-04  1.015702e-04  1.016549e-04
## fregionM      -0.0001238900  1.022109e-04  1.023576e-04  1.023937e-04
## fregionZ      -0.0001021461  1.021408e-04  1.021596e-04  1.025831e-04
## fregionT      -0.0001119894  1.021415e-04  1.022646e-04  1.022399e-04
## fspecSocial   -0.0001095004  5.186032e-06  4.850734e-06  5.542872e-06
## fspecLang     -0.0001018653 -4.794218e-06 -1.832688e-08 -3.318818e-06
## fspecLaw      -0.0001158160  1.136511e-05  1.754327e-05  1.720455e-05
## fspecMat-Phys -0.0001075772 -1.297224e-07  5.170287e-06 -1.690925e-07
## fspecEngin    -0.0001039184  3.516317e-07  2.732096e-06 -6.940539e-06
## fspecScience  -0.0001035725 -2.615603e-07  7.836906e-07  5.838505e-06
## fspecMedic    -0.0001053234 -1.747645e-06  7.730410e-07 -7.512734e-07
## fspecEconom   -0.0001117251  5.388386e-06  7.172107e-06  4.979715e-06
## fspecAgric    -0.0001261814  2.243536e-05  2.673643e-05  2.654674e-05
## fspecArt      -0.0001120803  5.012924e-06  5.279016e-06  1.006513e-06
##                    fregionK      fregionU      fregionL      fregionH
## (Intercept)   -1.034781e-04 -1.151940e-04 -1.054516e-04 -1.069183e-04
## fregionS       1.011661e-04  1.022038e-04  1.022997e-04  1.017729e-04
## fregionC       1.006027e-04  1.021223e-04  1.022098e-04  1.019779e-04
## fregionP       1.010982e-04  1.022390e-04  1.026960e-04  1.022026e-04
## fregionK       4.261994e-04  1.006884e-04  1.008589e-04  1.004968e-04
## fregionU       1.006884e-04  4.576749e-04  1.024748e-04  1.026034e-04
## fregionL       1.008589e-04  1.024748e-04  1.988494e-04  1.020482e-04
## fregionH       1.004968e-04  1.026034e-04  1.020482e-04  4.092964e-04
## fregionE       1.014683e-04  1.040431e-04  1.037727e-04  1.032234e-04
## fregionJ       1.018529e-04  1.014398e-04  1.018628e-04  1.006620e-04
## fregionB       1.004206e-04  1.020086e-04  1.018307e-04  1.018632e-04
## fregionM       1.016880e-04  1.042711e-04  1.025255e-04  1.021591e-04
## fregionZ       1.015116e-04  1.016879e-04  1.025097e-04  1.020868e-04
## fregionT       1.005988e-04  1.031143e-04  1.022234e-04  1.022581e-04
## fspecSocial    6.123929e-06  1.533598e-05  8.484251e-06  7.011846e-06
## fspecLang     -6.607376e-06  7.452497e-06 -7.832722e-06 -3.764169e-06
## fspecLaw       8.453615e-07  2.295827e-05  1.590225e-05  1.080191e-05
## fspecMat-Phys  4.253874e-06  1.057390e-05 -6.122380e-07  1.458788e-05
## fspecEngin     4.393692e-06  5.525088e-06 -1.484394e-06 -9.184423e-08
## fspecScience   4.344921e-07  1.130438e-05 -2.864310e-07  4.852612e-06
## fspecMedic     7.851661e-07  1.821728e-05  2.995977e-07  5.228142e-06
## fspecEconom    8.542314e-06  1.594485e-05  5.500519e-06  5.011322e-06
## fspecAgric     2.221538e-05  2.828049e-05  1.965951e-05  3.473386e-05
## fspecArt      -1.858998e-05  2.214965e-05  3.824366e-06  1.167659e-05
##                    fregionE      fregionJ      fregionB      fregionM
## (Intercept)   -1.119843e-04 -1.020705e-04 -1.056148e-04 -1.238900e-04
## fregionS       1.034123e-04  1.011792e-04  1.015463e-04  1.022109e-04
## fregionC       1.036049e-04  1.015031e-04  1.015702e-04  1.023576e-04
## fregionP       1.040351e-04  1.019322e-04  1.016549e-04  1.023937e-04
## fregionK       1.014683e-04  1.018529e-04  1.004206e-04  1.016880e-04
## fregionU       1.040431e-04  1.014398e-04  1.020086e-04  1.042711e-04
## fregionL       1.037727e-04  1.018628e-04  1.018307e-04  1.025255e-04
## fregionH       1.032234e-04  1.006620e-04  1.018632e-04  1.021591e-04
## fregionE       4.707723e-04  1.025186e-04  1.029284e-04  1.054622e-04
## fregionJ       1.025186e-04  3.853793e-04  1.009154e-04  1.013285e-04
## fregionB       1.029284e-04  1.009154e-04  1.946619e-04  1.021131e-04
## fregionM       1.054622e-04  1.013285e-04  1.021131e-04  5.880047e-04
## fregionZ       1.034574e-04  1.016605e-04  1.016424e-04  1.012773e-04
## fregionT       1.048254e-04  1.008371e-04  1.020891e-04  1.033797e-04
## fspecSocial    9.663411e-06  7.831133e-07  4.647422e-06  2.423658e-05
## fspecLang      4.964611e-07 -7.758178e-06 -4.279974e-06  1.646022e-05
## fspecLaw       2.230938e-05  6.582371e-06  1.292575e-05  3.297904e-05
## fspecMat-Phys  1.127995e-05 -9.293771e-06  8.187614e-06  2.023267e-05
## fspecEngin    -3.150392e-06  8.207013e-07  7.844985e-07  1.964833e-05
## fspecScience  -9.814639e-07  3.705295e-06  5.278403e-07  1.788432e-05
## fspecMedic     5.092648e-06 -2.162083e-06  2.990129e-06  2.586856e-05
## fspecEconom    2.132032e-05  5.159279e-06  6.527332e-06  2.983524e-05
## fspecAgric     3.732227e-05  1.594359e-05  1.920988e-05  3.202834e-05
## fspecArt       1.461463e-05 -1.306452e-05  1.087469e-05  7.813656e-06
##                    fregionZ      fregionT   fspecSocial     fspecLang
## (Intercept)   -1.021461e-04 -1.119894e-04 -1.095004e-04 -1.018653e-04
## fregionS       1.021408e-04  1.021415e-04  5.186032e-06 -4.794218e-06
## fregionC       1.021596e-04  1.022646e-04  4.850734e-06 -1.832688e-08
## fregionP       1.025831e-04  1.022399e-04  5.542872e-06 -3.318818e-06
## fregionK       1.015116e-04  1.005988e-04  6.123929e-06 -6.607376e-06
## fregionU       1.016879e-04  1.031143e-04  1.533598e-05  7.452497e-06
## fregionL       1.025097e-04  1.022234e-04  8.484251e-06 -7.832722e-06
## fregionH       1.020868e-04  1.022581e-04  7.011846e-06 -3.764169e-06
## fregionE       1.034574e-04  1.048254e-04  9.663411e-06  4.964611e-07
## fregionJ       1.016605e-04  1.008371e-04  7.831133e-07 -7.758178e-06
## fregionB       1.016424e-04  1.020891e-04  4.647422e-06 -4.279974e-06
## fregionM       1.012773e-04  1.033797e-04  2.423658e-05  1.646022e-05
## fregionZ       3.365638e-04  1.019333e-04  5.623728e-06 -1.101705e-05
## fregionT       1.019333e-04  1.590440e-04  1.156647e-05  7.722615e-06
## fspecSocial    5.623728e-06  1.156647e-05  1.657535e-04  1.027061e-04
## fspecLang     -1.101705e-05  7.722615e-06  1.027061e-04  2.849076e-04
## fspecLaw       7.272372e-06  1.555445e-05  1.033034e-04  1.025057e-04
## fspecMat-Phys -2.082997e-06  1.186696e-05  1.026083e-04  1.034841e-04
## fspecEngin    -2.203131e-06  5.951689e-06  1.020894e-04  1.028708e-04
## fspecScience  -2.909723e-06  2.070255e-06  1.020857e-04  1.021834e-04
## fspecMedic    -5.812652e-06  6.077983e-06  1.027525e-04  1.032205e-04
## fspecEconom    1.287623e-06  1.600694e-05  1.034019e-04  1.036670e-04
## fspecAgric     2.868416e-05  2.981068e-05  1.040363e-04  1.028865e-04
## fspecArt      -1.308125e-06  2.532588e-05  1.039312e-04  1.050192e-04
##                    fspecLaw fspecMat-Phys    fspecEngin  fspecScience
## (Intercept)   -1.158160e-04 -1.075772e-04 -1.039184e-04 -1.035725e-04
## fregionS       1.136511e-05 -1.297224e-07  3.516317e-07 -2.615603e-07
## fregionC       1.754327e-05  5.170287e-06  2.732096e-06  7.836906e-07
## fregionP       1.720455e-05 -1.690925e-07 -6.940539e-06  5.838505e-06
## fregionK       8.453615e-07  4.253874e-06  4.393692e-06  4.344921e-07
## fregionU       2.295827e-05  1.057390e-05  5.525088e-06  1.130438e-05
## fregionL       1.590225e-05 -6.122380e-07 -1.484394e-06 -2.864310e-07
## fregionH       1.080191e-05  1.458788e-05 -9.184423e-08  4.852612e-06
## fregionE       2.230938e-05  1.127995e-05 -3.150392e-06 -9.814639e-07
## fregionJ       6.582371e-06 -9.293771e-06  8.207013e-07  3.705295e-06
## fregionB       1.292575e-05  8.187614e-06  7.844985e-07  5.278403e-07
## fregionM       3.297904e-05  2.023267e-05  1.964833e-05  1.788432e-05
## fregionZ       7.272372e-06 -2.082997e-06 -2.203131e-06 -2.909723e-06
## fregionT       1.555445e-05  1.186696e-05  5.951689e-06  2.070255e-06
## fspecSocial    1.033034e-04  1.026083e-04  1.020894e-04  1.020857e-04
## fspecLang      1.025057e-04  1.034841e-04  1.028708e-04  1.021834e-04
## fspecLaw       2.336240e-04  1.030371e-04  1.019179e-04  1.023498e-04
## fspecMat-Phys  1.030371e-04  3.565494e-04  1.023021e-04  1.018275e-04
## fspecEngin     1.019179e-04  1.023021e-04  2.016054e-04  1.015482e-04
## fspecScience   1.023498e-04  1.018275e-04  1.015482e-04  2.065269e-04
## fspecMedic     1.030240e-04  1.032130e-04  1.025095e-04  1.025562e-04
## fspecEconom    1.039564e-04  1.035211e-04  1.027397e-04  1.024010e-04
## fspecAgric     1.051990e-04  1.040546e-04  1.024381e-04  1.022359e-04
## fspecArt       1.049214e-04  1.056042e-04  1.029795e-04  1.020804e-04
##                  fspecMedic   fspecEconom    fspecAgric      fspecArt
## (Intercept)   -1.053234e-04 -1.117251e-04 -1.261814e-04 -1.120803e-04
## fregionS      -1.747645e-06  5.388386e-06  2.243536e-05  5.012924e-06
## fregionC       7.730410e-07  7.172107e-06  2.673643e-05  5.279016e-06
## fregionP      -7.512734e-07  4.979715e-06  2.654674e-05  1.006513e-06
## fregionK       7.851661e-07  8.542314e-06  2.221538e-05 -1.858998e-05
## fregionU       1.821728e-05  1.594485e-05  2.828049e-05  2.214965e-05
## fregionL       2.995977e-07  5.500519e-06  1.965951e-05  3.824366e-06
## fregionH       5.228142e-06  5.011322e-06  3.473386e-05  1.167659e-05
## fregionE       5.092648e-06  2.132032e-05  3.732227e-05  1.461463e-05
## fregionJ      -2.162083e-06  5.159279e-06  1.594359e-05 -1.306452e-05
## fregionB       2.990129e-06  6.527332e-06  1.920988e-05  1.087469e-05
## fregionM       2.586856e-05  2.983524e-05  3.202834e-05  7.813656e-06
## fregionZ      -5.812652e-06  1.287623e-06  2.868416e-05 -1.308125e-06
## fregionT       6.077983e-06  1.600694e-05  2.981068e-05  2.532588e-05
## fspecSocial    1.027525e-04  1.034019e-04  1.040363e-04  1.039312e-04
## fspecLang      1.032205e-04  1.036670e-04  1.028865e-04  1.050192e-04
## fspecLaw       1.030240e-04  1.039564e-04  1.051990e-04  1.049214e-04
## fspecMat-Phys  1.032130e-04  1.035211e-04  1.040546e-04  1.056042e-04
## fspecEngin     1.025095e-04  1.027397e-04  1.024381e-04  1.029795e-04
## fspecScience   1.025562e-04  1.024010e-04  1.022359e-04  1.020804e-04
## fspecMedic     1.997466e-04  1.034978e-04  1.030147e-04  1.042026e-04
## fspecEconom    1.034978e-04  1.644054e-04  1.049520e-04  1.050798e-04
## fspecAgric     1.030147e-04  1.049520e-04  7.368809e-04  1.056533e-04
## fspecArt       1.042026e-04  1.050798e-04  1.056533e-04  4.969342e-04

And we can compare the result with manually recontructed estimate

X     <- model.matrix(mAddit);                                               
Bread <- solve(t(X) %*% X) %*% t(X)
Meat  <- diag(residuals(mAddit)^2 / (1 - hatvalues(mAddit))^2)
Bread %*% Meat %*% t(Bread)
##                 (Intercept)      fregionS      fregionC      fregionP
## (Intercept)    0.0002035881 -1.045279e-04 -1.065979e-04 -1.052227e-04
## fregionS      -0.0001045279  1.942935e-04  1.020741e-04  1.023074e-04
## fregionC      -0.0001065979  1.020741e-04  2.077926e-04  1.022329e-04
## fregionP      -0.0001052227  1.023074e-04  1.022329e-04  4.862487e-04
## fregionK      -0.0001034781  1.011661e-04  1.006027e-04  1.010982e-04
## fregionU      -0.0001151940  1.022038e-04  1.021223e-04  1.022390e-04
## fregionL      -0.0001054516  1.022997e-04  1.022098e-04  1.026960e-04
## fregionH      -0.0001069183  1.017729e-04  1.019779e-04  1.022026e-04
## fregionE      -0.0001119843  1.034123e-04  1.036049e-04  1.040351e-04
## fregionJ      -0.0001020705  1.011792e-04  1.015031e-04  1.019322e-04
## fregionB      -0.0001056148  1.015463e-04  1.015702e-04  1.016549e-04
## fregionM      -0.0001238900  1.022109e-04  1.023576e-04  1.023937e-04
## fregionZ      -0.0001021461  1.021408e-04  1.021596e-04  1.025831e-04
## fregionT      -0.0001119894  1.021415e-04  1.022646e-04  1.022399e-04
## fspecSocial   -0.0001095004  5.186032e-06  4.850734e-06  5.542872e-06
## fspecLang     -0.0001018653 -4.794218e-06 -1.832688e-08 -3.318818e-06
## fspecLaw      -0.0001158160  1.136511e-05  1.754327e-05  1.720455e-05
## fspecMat-Phys -0.0001075772 -1.297224e-07  5.170287e-06 -1.690925e-07
## fspecEngin    -0.0001039184  3.516317e-07  2.732096e-06 -6.940539e-06
## fspecScience  -0.0001035725 -2.615603e-07  7.836906e-07  5.838505e-06
## fspecMedic    -0.0001053234 -1.747645e-06  7.730410e-07 -7.512734e-07
## fspecEconom   -0.0001117251  5.388386e-06  7.172107e-06  4.979715e-06
## fspecAgric    -0.0001261814  2.243536e-05  2.673643e-05  2.654674e-05
## fspecArt      -0.0001120803  5.012924e-06  5.279016e-06  1.006513e-06
##                    fregionK      fregionU      fregionL      fregionH
## (Intercept)   -1.034781e-04 -1.151940e-04 -1.054516e-04 -1.069183e-04
## fregionS       1.011661e-04  1.022038e-04  1.022997e-04  1.017729e-04
## fregionC       1.006027e-04  1.021223e-04  1.022098e-04  1.019779e-04
## fregionP       1.010982e-04  1.022390e-04  1.026960e-04  1.022026e-04
## fregionK       4.261994e-04  1.006884e-04  1.008589e-04  1.004968e-04
## fregionU       1.006884e-04  4.576749e-04  1.024748e-04  1.026034e-04
## fregionL       1.008589e-04  1.024748e-04  1.988494e-04  1.020482e-04
## fregionH       1.004968e-04  1.026034e-04  1.020482e-04  4.092964e-04
## fregionE       1.014683e-04  1.040431e-04  1.037727e-04  1.032234e-04
## fregionJ       1.018529e-04  1.014398e-04  1.018628e-04  1.006620e-04
## fregionB       1.004206e-04  1.020086e-04  1.018307e-04  1.018632e-04
## fregionM       1.016880e-04  1.042711e-04  1.025255e-04  1.021591e-04
## fregionZ       1.015116e-04  1.016879e-04  1.025097e-04  1.020868e-04
## fregionT       1.005988e-04  1.031143e-04  1.022234e-04  1.022581e-04
## fspecSocial    6.123929e-06  1.533598e-05  8.484251e-06  7.011846e-06
## fspecLang     -6.607376e-06  7.452497e-06 -7.832722e-06 -3.764169e-06
## fspecLaw       8.453615e-07  2.295827e-05  1.590225e-05  1.080191e-05
## fspecMat-Phys  4.253874e-06  1.057390e-05 -6.122380e-07  1.458788e-05
## fspecEngin     4.393692e-06  5.525088e-06 -1.484394e-06 -9.184423e-08
## fspecScience   4.344921e-07  1.130438e-05 -2.864310e-07  4.852612e-06
## fspecMedic     7.851661e-07  1.821728e-05  2.995977e-07  5.228142e-06
## fspecEconom    8.542314e-06  1.594485e-05  5.500519e-06  5.011322e-06
## fspecAgric     2.221538e-05  2.828049e-05  1.965951e-05  3.473386e-05
## fspecArt      -1.858998e-05  2.214965e-05  3.824366e-06  1.167659e-05
##                    fregionE      fregionJ      fregionB      fregionM
## (Intercept)   -1.119843e-04 -1.020705e-04 -1.056148e-04 -1.238900e-04
## fregionS       1.034123e-04  1.011792e-04  1.015463e-04  1.022109e-04
## fregionC       1.036049e-04  1.015031e-04  1.015702e-04  1.023576e-04
## fregionP       1.040351e-04  1.019322e-04  1.016549e-04  1.023937e-04
## fregionK       1.014683e-04  1.018529e-04  1.004206e-04  1.016880e-04
## fregionU       1.040431e-04  1.014398e-04  1.020086e-04  1.042711e-04
## fregionL       1.037727e-04  1.018628e-04  1.018307e-04  1.025255e-04
## fregionH       1.032234e-04  1.006620e-04  1.018632e-04  1.021591e-04
## fregionE       4.707723e-04  1.025186e-04  1.029284e-04  1.054622e-04
## fregionJ       1.025186e-04  3.853793e-04  1.009154e-04  1.013285e-04
## fregionB       1.029284e-04  1.009154e-04  1.946619e-04  1.021131e-04
## fregionM       1.054622e-04  1.013285e-04  1.021131e-04  5.880047e-04
## fregionZ       1.034574e-04  1.016605e-04  1.016424e-04  1.012773e-04
## fregionT       1.048254e-04  1.008371e-04  1.020891e-04  1.033797e-04
## fspecSocial    9.663411e-06  7.831133e-07  4.647422e-06  2.423658e-05
## fspecLang      4.964611e-07 -7.758178e-06 -4.279974e-06  1.646022e-05
## fspecLaw       2.230938e-05  6.582371e-06  1.292575e-05  3.297904e-05
## fspecMat-Phys  1.127995e-05 -9.293771e-06  8.187614e-06  2.023267e-05
## fspecEngin    -3.150392e-06  8.207013e-07  7.844985e-07  1.964833e-05
## fspecScience  -9.814639e-07  3.705295e-06  5.278403e-07  1.788432e-05
## fspecMedic     5.092648e-06 -2.162083e-06  2.990129e-06  2.586856e-05
## fspecEconom    2.132032e-05  5.159279e-06  6.527332e-06  2.983524e-05
## fspecAgric     3.732227e-05  1.594359e-05  1.920988e-05  3.202834e-05
## fspecArt       1.461463e-05 -1.306452e-05  1.087469e-05  7.813656e-06
##                    fregionZ      fregionT   fspecSocial     fspecLang
## (Intercept)   -1.021461e-04 -1.119894e-04 -1.095004e-04 -1.018653e-04
## fregionS       1.021408e-04  1.021415e-04  5.186032e-06 -4.794218e-06
## fregionC       1.021596e-04  1.022646e-04  4.850734e-06 -1.832688e-08
## fregionP       1.025831e-04  1.022399e-04  5.542872e-06 -3.318818e-06
## fregionK       1.015116e-04  1.005988e-04  6.123929e-06 -6.607376e-06
## fregionU       1.016879e-04  1.031143e-04  1.533598e-05  7.452497e-06
## fregionL       1.025097e-04  1.022234e-04  8.484251e-06 -7.832722e-06
## fregionH       1.020868e-04  1.022581e-04  7.011846e-06 -3.764169e-06
## fregionE       1.034574e-04  1.048254e-04  9.663411e-06  4.964611e-07
## fregionJ       1.016605e-04  1.008371e-04  7.831133e-07 -7.758178e-06
## fregionB       1.016424e-04  1.020891e-04  4.647422e-06 -4.279974e-06
## fregionM       1.012773e-04  1.033797e-04  2.423658e-05  1.646022e-05
## fregionZ       3.365638e-04  1.019333e-04  5.623728e-06 -1.101705e-05
## fregionT       1.019333e-04  1.590440e-04  1.156647e-05  7.722615e-06
## fspecSocial    5.623728e-06  1.156647e-05  1.657535e-04  1.027061e-04
## fspecLang     -1.101705e-05  7.722615e-06  1.027061e-04  2.849076e-04
## fspecLaw       7.272372e-06  1.555445e-05  1.033034e-04  1.025057e-04
## fspecMat-Phys -2.082997e-06  1.186696e-05  1.026083e-04  1.034841e-04
## fspecEngin    -2.203131e-06  5.951689e-06  1.020894e-04  1.028708e-04
## fspecScience  -2.909723e-06  2.070255e-06  1.020857e-04  1.021834e-04
## fspecMedic    -5.812652e-06  6.077983e-06  1.027525e-04  1.032205e-04
## fspecEconom    1.287623e-06  1.600694e-05  1.034019e-04  1.036670e-04
## fspecAgric     2.868416e-05  2.981068e-05  1.040363e-04  1.028865e-04
## fspecArt      -1.308125e-06  2.532588e-05  1.039312e-04  1.050192e-04
##                    fspecLaw fspecMat-Phys    fspecEngin  fspecScience
## (Intercept)   -1.158160e-04 -1.075772e-04 -1.039184e-04 -1.035725e-04
## fregionS       1.136511e-05 -1.297224e-07  3.516317e-07 -2.615603e-07
## fregionC       1.754327e-05  5.170287e-06  2.732096e-06  7.836906e-07
## fregionP       1.720455e-05 -1.690925e-07 -6.940539e-06  5.838505e-06
## fregionK       8.453615e-07  4.253874e-06  4.393692e-06  4.344921e-07
## fregionU       2.295827e-05  1.057390e-05  5.525088e-06  1.130438e-05
## fregionL       1.590225e-05 -6.122380e-07 -1.484394e-06 -2.864310e-07
## fregionH       1.080191e-05  1.458788e-05 -9.184423e-08  4.852612e-06
## fregionE       2.230938e-05  1.127995e-05 -3.150392e-06 -9.814639e-07
## fregionJ       6.582371e-06 -9.293771e-06  8.207013e-07  3.705295e-06
## fregionB       1.292575e-05  8.187614e-06  7.844985e-07  5.278403e-07
## fregionM       3.297904e-05  2.023267e-05  1.964833e-05  1.788432e-05
## fregionZ       7.272372e-06 -2.082997e-06 -2.203131e-06 -2.909723e-06
## fregionT       1.555445e-05  1.186696e-05  5.951689e-06  2.070255e-06
## fspecSocial    1.033034e-04  1.026083e-04  1.020894e-04  1.020857e-04
## fspecLang      1.025057e-04  1.034841e-04  1.028708e-04  1.021834e-04
## fspecLaw       2.336240e-04  1.030371e-04  1.019179e-04  1.023498e-04
## fspecMat-Phys  1.030371e-04  3.565494e-04  1.023021e-04  1.018275e-04
## fspecEngin     1.019179e-04  1.023021e-04  2.016054e-04  1.015482e-04
## fspecScience   1.023498e-04  1.018275e-04  1.015482e-04  2.065269e-04
## fspecMedic     1.030240e-04  1.032130e-04  1.025095e-04  1.025562e-04
## fspecEconom    1.039564e-04  1.035211e-04  1.027397e-04  1.024010e-04
## fspecAgric     1.051990e-04  1.040546e-04  1.024381e-04  1.022359e-04
## fspecArt       1.049214e-04  1.056042e-04  1.029795e-04  1.020804e-04
##                  fspecMedic   fspecEconom    fspecAgric      fspecArt
## (Intercept)   -1.053234e-04 -1.117251e-04 -1.261814e-04 -1.120803e-04
## fregionS      -1.747645e-06  5.388386e-06  2.243536e-05  5.012924e-06
## fregionC       7.730410e-07  7.172107e-06  2.673643e-05  5.279016e-06
## fregionP      -7.512734e-07  4.979715e-06  2.654674e-05  1.006513e-06
## fregionK       7.851661e-07  8.542314e-06  2.221538e-05 -1.858998e-05
## fregionU       1.821728e-05  1.594485e-05  2.828049e-05  2.214965e-05
## fregionL       2.995977e-07  5.500519e-06  1.965951e-05  3.824366e-06
## fregionH       5.228142e-06  5.011322e-06  3.473386e-05  1.167659e-05
## fregionE       5.092648e-06  2.132032e-05  3.732227e-05  1.461463e-05
## fregionJ      -2.162083e-06  5.159279e-06  1.594359e-05 -1.306452e-05
## fregionB       2.990129e-06  6.527332e-06  1.920988e-05  1.087469e-05
## fregionM       2.586856e-05  2.983524e-05  3.202834e-05  7.813656e-06
## fregionZ      -5.812652e-06  1.287623e-06  2.868416e-05 -1.308125e-06
## fregionT       6.077983e-06  1.600694e-05  2.981068e-05  2.532588e-05
## fspecSocial    1.027525e-04  1.034019e-04  1.040363e-04  1.039312e-04
## fspecLang      1.032205e-04  1.036670e-04  1.028865e-04  1.050192e-04
## fspecLaw       1.030240e-04  1.039564e-04  1.051990e-04  1.049214e-04
## fspecMat-Phys  1.032130e-04  1.035211e-04  1.040546e-04  1.056042e-04
## fspecEngin     1.025095e-04  1.027397e-04  1.024381e-04  1.029795e-04
## fspecScience   1.025562e-04  1.024010e-04  1.022359e-04  1.020804e-04
## fspecMedic     1.997466e-04  1.034978e-04  1.030147e-04  1.042026e-04
## fspecEconom    1.034978e-04  1.644054e-04  1.049520e-04  1.050798e-04
## fspecAgric     1.030147e-04  1.049520e-04  7.368809e-04  1.056533e-04
## fspecArt       1.042026e-04  1.050798e-04  1.056533e-04  4.969342e-04
all.equal(Bread %*% Meat %*% t(Bread), VHC)
## [1] TRUE

This can be conpared with the classical estimate of the variance covariance matrix—in the following, we only focuss on some portion of the matrix.

VHC[1:5, 1:5] 
##               (Intercept)      fregionS      fregionC      fregionP
## (Intercept)  0.0002035881 -0.0001045279 -0.0001065979 -0.0001052227
## fregionS    -0.0001045279  0.0001942935  0.0001020741  0.0001023074
## fregionC    -0.0001065979  0.0001020741  0.0002077926  0.0001022329
## fregionP    -0.0001052227  0.0001023074  0.0001022329  0.0004862487
## fregionK    -0.0001034781  0.0001011661  0.0001006027  0.0001010982
##                  fregionK
## (Intercept) -0.0001034781
## fregionS     0.0001011661
## fregionC     0.0001006027
## fregionP     0.0001010982
## fregionK     0.0004261994
vcov(mAddit)[1:5, 1:5]
##               (Intercept)      fregionS      fregionC      fregionP
## (Intercept)  2.049583e-04 -1.009075e-04 -9.704506e-05 -9.933825e-05
## fregionS    -1.009075e-04  1.780801e-04  9.400668e-05  9.440175e-05
## fregionC    -9.704506e-05  9.400668e-05  2.116711e-04  9.436525e-05
## fregionP    -9.933825e-05  9.440175e-05  9.436525e-05  4.825169e-04
## fregionK    -9.449827e-05  9.342499e-05  9.366535e-05  9.394418e-05
##                  fregionK
## (Intercept) -9.449827e-05
## fregionS     9.342499e-05
## fregionC     9.366535e-05
## fregionP     9.394418e-05
## fregionK     4.400577e-04

Note, that some variance component are overestimated by the standard estimate (function vcov()) and some others or uderestimated.

Indidividual work

  • How can the variance-covariance matrix above (or the matrices above) be used to get the estimate for the variance matrix of \(\boldsymbol{Y}\)?
  • (modelMatrix %*% VHC %*% t(modelMatrix))[1:5, 1:5]
    ##               1            2             3             4             5
    ## 1  6.143465e-04 2.644008e-06 -7.700916e-06 -1.319871e-05 -7.700916e-06
    ## 2  2.644008e-06 1.410302e-04  8.037152e-05  8.117887e-05  8.037152e-05
    ## 3 -7.700916e-06 8.037152e-05  1.430679e-04  8.120846e-05  1.430679e-04
    ## 4 -1.319871e-05 8.117887e-05  8.120846e-05  1.874579e-04  8.120846e-05
    ## 5 -7.700916e-06 8.037152e-05  1.430679e-04  8.120846e-05  1.430679e-04