Introduction to Factor Analysis
Factor analysis is an effective statistical method for dimensionality reduction, especially in situations where the data are needed for further analysis and calculations (e.g. regression). It describes the overall correlation among all variables using a potentially much smaller number of variables, which are called factors. These factors are, however, unobserved random variables.
The factor analysis approach searches for groups of covariates which are similar with respect to their mutual correlation. All variables with mutually high correlations are represented by a factor (or a linear combination of factors) instead.
In some sense, factor analysis can be considered a generalization of classical principal component analysis (PCA), with one great advantage at the end: a much more convenient interpretation of the factors.
In the statistical software R, the function factanal() is available under the standard R installation. Besides that, there are many additional functions and packages which can be downloaded and installed in R (e.g. the Factanal() function in the ‘FAiR’ package, or the fa.promax() function in the ‘psych’ package).
For our purposes we mainly use the standard function factanal(). Let us again recall the biological metrics data from the Czech Republic. The data represent 65 different river localities in the Czech Republic, with various biological metrics assessed at each locality (17 metrics in total).
rm(list = ls())
bioData <- read.csv("http://msekce.karlin.mff.cuni.cz/~maciak/NMST539/bioData.csv", header = T)
The correlation structure (which will later be assessed using the factor analysis approach) can either be estimated using a standard variance-covariance matrix (command var(bioData[,2:18])) or it can be visualized using the corrplot() function from the ‘corrplot’ package instead.
library(corrplot)
corrplot(cor(bioData[,2:18]), method="ellipse")

The idea behind the factanal() function is to consider a \(p\)-dimensional (random) vector \(\boldsymbol{X}\) and to express this vector using a set of \(k\) factors, where we require that \(k \ll p\) (dimensionality reduction). The model fitted by the factor analysis approach when applying the factanal() function in R is the following one:
\[
\boldsymbol{X} = \Lambda \boldsymbol{F} + \boldsymbol{e},
\]
where \(\Lambda\) is a \(p\times k\) dimensional matrix of so-called factor loadings, \(\boldsymbol{F}\) is a \(k\)-dimensional (random) vector representing the \(k\) factors (common factors, or scores respectively), and \(\boldsymbol{e}\) is an approximation error (or the specific factors).
The tricky part in this expression is that, besides the random vector \(\boldsymbol{X}\), no other quantity is directly observed. The problem, as such, is unsolvable unless we pose some additional restrictions. This is done by specifying a variance-covariance structure among all quantities which appear in the expression above:
- The scores \(\boldsymbol{F}\) are required to be uncorrelated with unit variance (\(Var(\boldsymbol{F}) = \mathbb{I}_{k}\));
- The error terms \(\boldsymbol{e}\) are independent with some variance (\(Var(\boldsymbol{e}) = \boldsymbol{\Psi}\), where \(Var(e_{i}) = \psi_{ii}\));
- The correlation matrix of \(\boldsymbol{X}\) is decomposed as \(\Sigma = \Lambda \Lambda^{T} + \boldsymbol{\Psi}\).
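As a quick numerical sanity check of this decomposition, the reproduced matrix \(\Lambda \Lambda^{T} + \boldsymbol{\Psi}\) should have (approximately) a unit diagonal, since it reproduces a correlation matrix. A minimal sketch, using the ‘ability.cov’ covariance matrix shipped with base R instead of the bioData set (which would require a download):

```r
# Fit a 2-factor model on the built-in 'ability.cov' data
fa <- factanal(factors = 2, covmat = ability.cov)
Lambda <- unclass(fa$loadings)            # p x k matrix of factor loadings
Psi <- diag(fa$uniquenesses)              # diagonal matrix of specific variances
Sigma.hat <- Lambda %*% t(Lambda) + Psi   # reproduced correlation matrix
round(diag(Sigma.hat), 3)                 # approximately all ones
```

The off-diagonal elements of Sigma.hat approximate the sample correlations; the closer they are, the better the factor model fits.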
Note
- The factor decomposition is not unique: any orthogonal rotation of the factors (and loadings) gives the same fit;
- Factor analysis is invariant with respect to the scale of the variables.
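The rotation non-uniqueness in the first point can be checked directly: rotating the loadings by an orthogonal matrix leaves the common part \(\Lambda \Lambda^{T}\) (and hence the fitted correlation matrix) unchanged. A small sketch, again on the built-in ‘ability.cov’ data rather than bioData:

```r
# Unrotated vs. varimax-rotated solutions of the same 2-factor model
fa.none <- factanal(factors = 2, covmat = ability.cov, rotation = "none")
fa.vmax <- factanal(factors = 2, covmat = ability.cov, rotation = "varimax")
L0 <- unclass(fa.none$loadings)
L1 <- unclass(fa.vmax$loadings)
# the loadings themselves differ, but Lambda Lambda^T does not
max(abs(L0 %*% t(L0) - L1 %*% t(L1)))   # practically zero
```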
Since the factors are not unique with respect to rotation, it is useful to choose a rotation which makes sense. The automatic procedure which tries to find the factors in such a way that the original covariates can be split into disjoint sets is called the varimax procedure. To use the varimax procedure for detecting the right rotation we can specify the additional parameter ‘rotation=“varimax”’ when calling the function factanal() in R.
fa1 <- factanal(bioData[,2:18], factors = 3, rotation="varimax")
print(fa1, digits=2, cutoff=.6, sort=TRUE)
##
## Call:
## factanal(x = bioData[, 2:18], factors = 3, rotation = "varimax")
##
## Uniquenesses:
## SaprInd Lital RETI EPTAbu Marg Metaritr
## 0.15 0.09 0.05 0.05 0.03 0.02
## JepAbu Epiritral Hyporitral Ntaxon Nceled Bindex
## 0.10 0.07 0.39 0.21 0.09 0.19
## EPTTax PosAbu Spasaci MDS_1 MDS_2
## 0.10 0.35 0.11 0.13 0.45
##
## Loadings:
## Factor1 Factor2 Factor3
## SaprInd -0.79
## Lital 0.94
## RETI 0.94
## EPTAbu 0.90
## Metaritr 0.96
## JepAbu 0.68 0.65
## Epiritral 0.89
## Hyporitral 0.71
## PosAbu 0.79
## Spasaci 0.94
## MDS_1 0.91
## Marg 0.98
## Ntaxon 0.65
## Nceled 0.94
## Bindex 0.79
## EPTTax 0.83
## MDS_2 0.71
##
## Factor1 Factor2 Factor3
## SS loadings 8.96 4.05 1.41
## Proportion Var 0.53 0.24 0.08
## Cumulative Var 0.53 0.77 0.85
##
## Test of the hypothesis that 3 factors are sufficient.
## The chi square statistic is 414.98 on 88 degrees of freedom.
## The p-value is 6.88e-44
Is such a factor representation sufficient? Compare this model with some others, fitted with an increasing number of factors:
fa2 <- factanal(bioData[,2:18], factors = 4, rotation="varimax")
fa3 <- factanal(bioData[,2:18], factors = 5, rotation="varimax")
fa4 <- factanal(bioData[,2:18], factors = 9, rotation="varimax")
There are essentially two different approaches to defining the number of factors which should be used. For instance, it is easy to see that for each element \(X_{j}\) of the random vector \(\boldsymbol{X} = (X_{1}, \dots, X_{p})^\top\) it holds that
\[
var(X_{j}) =h_{j}^2 + \psi_{j}, \qquad \textrm{where} \qquad h_{j}^2 = \sum_{\ell = 1}^{k} q_{j \ell}^2,
\]
and \(\boldsymbol{\Psi} = Diag\{\psi_{1}, \dots, \psi_{p}\}\), and \(\Lambda = \{q_{j \ell}\}\) for \(j = 1, \dots, p\) and \(\ell = 1, \dots, k\). The quantity \(h_{j}^2\) is the part of the overall variability of \(X_{j}\) which is explained by the common factors in \(\boldsymbol{F}\) (also called the communality), while the second quantity, \(\psi_{j}\), is the variability which is left over (also called the specific variability).
To decide how many factors are needed, one can base the decision on a comparison of the communality and the specific variability. It is obvious that for \(k = p\) the following holds: \(h_{j}^2 = var(X_{j})\) and \(\psi_{j} = 0\).
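Since factanal() works with the correlation matrix, the decomposition above reads \(h_{j}^2 + \psi_{j} = 1\) for every variable, and both quantities can be read off a fitted object. A sketch on the built-in ‘ability.cov’ data:

```r
fa <- factanal(factors = 2, covmat = ability.cov)
h2 <- rowSums(unclass(fa$loadings)^2)   # communalities h_j^2
psi <- fa$uniquenesses                  # specific variances psi_j
round(h2 + psi, 3)                      # approximately all ones
```

Variables with a large \(\psi_{j}\) are poorly represented by the common factors; adding a factor typically lowers the uniquenesses.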
Alternative approaches to judge the right number of factors:
- Statistical pre-analysis
One usually runs e.g. a principal component analysis to determine how many factors should be enough (in order to have some reasonable proportion of the overall variability explained);
- Expert judgement
Sometimes (under a suitable interpretation) the estimated factors correspond nicely with some latent variables which are not observed but can be identified by expert judgement;
- Maximum likelihood test
Using the likelihood approach we can also test the null hypothesis \(H_{0}\) that \(\Sigma = \Lambda\Lambda^\top + \boldsymbol{\Psi}\) against a general alternative \(H_{1}\) specifying no restrictions on the variance-covariance matrix \(\Sigma\). The likelihood ratio test is given by the test statistic \[
T = - 2 \log \Bigg[\frac{\textrm{maximum likelihood under $H_{0}$}}{\textrm{maximum likelihood under $H_{1}$}}\Bigg].
\]
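This is the test printed at the bottom of the factanal() output above; its degrees of freedom are \(((p - k)^2 - p - k)/2\). A quick check on the built-in ‘ability.cov’ data (\(p = 6\) variables, \(k = 2\) factors; the sample size needed for the test is stored in the object):

```r
fa <- factanal(factors = 2, covmat = ability.cov)
p <- 6; k <- 2
fa$dof                       # degrees of freedom reported by factanal()
((p - k)^2 - (p + k)) / 2    # the same number, from the formula
```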
Note
- Another option for determining the optimal number of factors to be extracted in the analysis is to use the tools from the library ‘nFactors’ (the library can be installed by install.packages("nFactors"));
library(nFactors)
ev <- eigen(cor(bioData[,2:18]))
ap <- parallel(subject=nrow(bioData[,2:18]),var=ncol(bioData[,2:18]), rep=100, cent=.05)
nS <- nScree(x=ev$values, aparallel=ap$eigen$qevpea)
plotnScree(nS)

load <- fa1$loadings[,1:2]
plot(load,type="n")
text(load,labels=names(bioData[,2:18]),cex=.7)

Alternatively, using the library ‘FactoMineR’ and the function PCA(), we can obtain the complete factor map (a covariate map with the mutual correlations).
library(FactoMineR)
result <- PCA(bioData[,2:18])


Note
- The factanal() function in R uses scaled variables as the starting point for the factor analysis: the variance-covariance matrix of \(\boldsymbol{X}\) is replaced by the corresponding correlation matrix, with ones on its diagonal.
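This internal standardization is also the reason for the scale invariance mentioned earlier: rescaling the input data leaves the estimated loadings unchanged, because the correlation matrix does not change. A minimal sketch on simulated one-factor data (a hypothetical example, not the bioData set):

```r
set.seed(539)
n <- 200
f <- rnorm(n)                              # one latent factor
X <- cbind(0.9*f, 0.8*f, 0.7*f, 0.6*f) +   # four noisy indicators of f
  matrix(rnorm(n * 4, sd = 0.5), n, 4)
fa.raw <- factanal(X, factors = 1)
fa.scl <- factanal(1000 * scale(X), factors = 1)   # arbitrarily rescaled input
max(abs(unclass(fa.raw$loadings) - unclass(fa.scl$loadings)))   # ~ 0
```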
Now we have an idea that three factors should perhaps be enough. In order to get the corresponding score values from the factanal() function (these are the values of \(\boldsymbol{F}\) in the expression above, one for each of the \(n = 65\) observations), we need an additional parameter to be specified:
fa1 <- factanal(bioData[,2:18], factors = 3, scores = "regression")
fa1$scores
## Factor1 Factor2 Factor3
## [1,] -0.12449662 0.49318396 0.019226166
## [2,] -0.33849930 -0.39230318 0.119550138
## [3,] -0.22044557 -0.38580832 2.320568152
## [4,] 0.82566318 1.20827606 -0.940891112
## [5,] -0.13932079 0.89310776 -0.532355972
## [6,] -1.01455286 -0.65600386 -0.375228204
## [7,] -0.82343873 -0.62783914 -0.701274829
## [8,] -0.98728818 -0.13782643 -0.433312546
## [9,] 0.22418720 0.88632819 0.795206993
## [10,] 0.52292639 1.06261823 0.295526977
## [11,] 0.29559045 1.25010730 -0.529517247
## [12,] -0.50645369 -0.62925864 -0.373956123
## [13,] -0.93565845 -1.35045706 -0.637327245
## [14,] -0.95854352 -0.20085958 1.192711934
## [15,] 0.42955162 -0.44374222 -1.634492803
## [16,] 1.52669131 0.90258023 -0.647334942
## [17,] -1.06358562 -0.98644111 -0.614191669
## [18,] -0.17714883 0.29705426 -0.273963895
## [19,] -1.17296128 0.15482177 0.740586340
## [20,] -0.78876331 1.64768437 1.135756173
## [21,] -0.68592393 1.86494676 0.622906404
## [22,] 0.03378541 0.51636339 -0.492220100
## [23,] -0.98102088 0.67287079 0.410026081
## [24,] 1.72144792 0.93062040 -1.616832483
## [25,] -0.76259043 -1.35110582 -0.851702050
## [26,] -0.91691053 -1.09563523 -0.287746278
## [27,] -0.28310899 -0.08405725 -0.554590635
## [28,] 1.01603809 0.64177486 -1.212012177
## [29,] 0.02890208 0.83487030 0.220552649
## [30,] 2.69443073 -1.16900885 0.606585871
## [31,] -0.47082315 -0.18615905 2.222434682
## [32,] 1.95090263 -1.29573724 1.303667307
## [33,] 0.28558011 1.04412770 0.364248451
## [34,] -0.18909235 0.97786635 0.090795648
## [35,] -0.48400090 -0.41407901 -0.459370022
## [36,] -0.86086894 -1.09247855 -0.356923055
## [37,] -0.83806848 -0.92300383 -0.815412673
## [38,] -0.91971280 -0.62821762 -0.098009986
## [39,] -0.87317702 -0.29454466 -0.077311663
## [40,] 0.96502394 -1.12911936 4.327243670
## [41,] -0.63704689 0.20145093 0.197835134
## [42,] -0.19959392 1.30938359 0.667994311
## [43,] 3.12996486 -1.62033577 0.044354682
## [44,] 1.20880189 0.64341909 -0.149924195
## [45,] 0.06135524 -0.02822842 -0.557196522
## [46,] -0.86225779 -1.61696484 -1.552055930
## [47,] -0.61532943 -0.45390266 -0.419664831
## [48,] -0.14042956 0.55569570 -0.113699662
## [49,] -0.76637088 0.22722578 -0.178657943
## [50,] 0.92871029 1.82419989 -0.448391534
## [51,] 1.52938447 1.37527251 -1.001758261
## [52,] -0.06380299 -2.13721348 0.714719247
## [53,] -0.48147547 0.55671249 1.028660417
## [54,] -0.01773075 -0.25909705 -0.961153139
## [55,] 0.44621941 0.62532458 -0.231747386
## [56,] -0.48055946 0.70571466 0.712530342
## [57,] -0.57932009 -0.16888336 0.267753908
## [58,] 0.41294090 0.99097762 -0.110134319
## [59,] -0.54167359 0.01836471 0.755755149
## [60,] -0.08414099 0.39657781 0.326141043
## [61,] -0.92105906 -1.35457565 -0.853275093
## [62,] 2.50167388 -1.84503963 -0.802830680
## [63,] -0.84252936 -0.72197336 -0.191797001
## [64,] 1.79639695 -1.48006272 0.583465441
## [65,] 0.21360641 1.45044090 0.001460895
and the corresponding loadings (the matrix \(\Lambda\) in the expression above) are obtained as
fa1$loadings
##
## Loadings:
## Factor1 Factor2 Factor3
## SaprInd -0.794 -0.457 -0.109
## Lital 0.939 0.164
## RETI 0.941 0.203 -0.142
## EPTAbu 0.903 0.357
## Marg 0.984
## Metaritr 0.959 0.193 -0.150
## JepAbu 0.684 -0.115 0.646
## Epiritral 0.886 0.182 -0.340
## Hyporitral 0.710 0.184 0.275
## Ntaxon -0.572 0.654 -0.190
## Nceled 0.102 0.936 -0.151
## Bindex 0.421 0.789
## EPTTax 0.456 0.834
## PosAbu 0.785 -0.159
## Spasaci 0.939
## MDS_1 0.907 0.190 0.113
## MDS_2 -0.165 -0.129 0.714
##
## Factor1 Factor2 Factor3
## SS loadings 8.961 4.053 1.409
## Proportion Var 0.527 0.238 0.083
## Cumulative Var 0.527 0.766 0.848
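The shapes of these two matrices can be illustrated on simulated data (a hypothetical two-factor design, not the bioData set): the scores form an \(n \times k\) matrix with one row per observation, the loadings a \(p \times k\) matrix with one row per covariate, and the regression scores are centered by construction (they are computed from the standardized data).

```r
set.seed(539)
n <- 100
F.true <- matrix(rnorm(n * 2), n, 2)             # true factor scores
L.true <- cbind(c(0.9, 0.8, 0.7, 0, 0, 0),
                c(0, 0, 0, 0.9, 0.8, 0.7))       # true loadings (p = 6, k = 2)
X <- F.true %*% t(L.true) + matrix(rnorm(n * 6, sd = 0.4), n, 6)
fa <- factanal(X, factors = 2, scores = "regression")
dim(fa$scores)                  # 100 x 2: one score vector per observation
dim(fa$loadings)                # 6 x 2: one loading vector per covariate
round(colMeans(fa$scores), 6)   # zero column means
```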
Application in Regression or SEM
Structural equation models (SEM) are statistical modelling techniques with great potential: they combine a structural model, describing the causal dependencies between endogenous and exogenous variables, with a measurement model, describing the relations between the latent variables and their indicators (which is also the case of the latent factors \(\boldsymbol{F}\) in factor analysis).
In this situation it is common to first draw a schematic diagram of the assumed underlying structure. Later, the diagram is used to specify the final model.
An example of such a diagram is below.
In the R software, the package ‘sem’ is available for structural equation modelling:
library("sem")
A simple example uses the dataset ‘PoliticalDemocracy’ from the package ‘lavaan’:
library("lavaan")
## This is lavaan 0.5-23.1097
## lavaan is BETA software! Please report any bugs.
##
## Attaching package: 'lavaan'
## The following objects are masked from 'package:sem':
##
## cfa, sem
data(PoliticalDemocracy)
PoliticalDemocracy
## y1 y2 y3 y4 y5 y6 y7
## 1 2.50 0.000000 3.333333 0.000000 1.250000 0.000000 3.726360
## 2 1.25 0.000000 3.333333 0.000000 6.250000 1.100000 6.666666
## 3 7.50 8.800000 9.999998 9.199991 8.750000 8.094061 9.999998
## 4 8.90 8.800000 9.999998 9.199991 8.907948 8.127979 9.999998
## 5 10.00 3.333333 9.999998 6.666666 7.500000 3.333333 9.999998
## 6 7.50 3.333333 6.666666 6.666666 6.250000 1.100000 6.666666
## 7 7.50 3.333333 6.666666 6.666666 5.000000 2.233333 8.271257
## 8 7.50 2.233333 9.999998 1.496333 6.250000 3.333333 9.999998
## 9 2.50 3.333333 3.333333 3.333333 6.250000 3.333333 3.333333
## 10 10.00 6.666666 9.999998 8.899991 8.750000 6.666666 9.999998
## 11 7.50 3.333333 9.999998 6.666666 8.750000 3.333333 9.999998
## 12 7.50 3.333333 6.666666 6.666666 8.750000 3.333333 6.666666
## 13 7.50 3.333333 9.999998 6.666666 7.500000 3.333333 6.666666
## 14 7.50 7.766664 9.999998 6.666666 7.500000 0.000000 9.999998
## 15 7.50 9.999998 3.333333 10.000000 7.500000 6.666666 9.999998
## 16 7.50 9.999998 9.999998 7.766666 7.500000 1.100000 6.666666
## 17 2.50 3.333333 6.666666 6.666666 5.000000 1.100000 6.666666
## 18 1.25 0.000000 3.333333 3.333333 1.250000 3.333333 3.333333
## 19 10.00 9.999998 9.999998 10.000000 8.750000 9.999998 9.999998
## 20 7.50 3.333299 3.333333 6.666666 7.500000 2.233299 6.666666
## 21 10.00 9.999998 9.999998 10.000000 10.000000 9.999998 9.999998
## 22 1.25 0.000000 0.000000 0.000000 2.500000 0.000000 0.000000
## 23 2.50 0.000000 3.333333 3.333333 2.500000 0.000000 3.333333
## 24 7.50 6.666666 9.999998 10.000000 7.500000 6.666666 9.999998
## 25 8.50 9.999998 6.666666 6.666666 8.750000 9.999998 7.351018
## 26 6.10 0.000000 5.400000 3.333333 0.000000 0.000000 4.696028
## 27 3.30 0.000000 6.666666 3.333333 6.250000 0.000000 6.666666
## 28 2.90 3.333333 6.666666 3.333333 2.385559 0.000000 3.177568
## 29 9.20 0.000000 9.900000 3.333333 7.609660 0.000000 8.118828
## 30 6.90 0.000000 6.666666 3.333333 4.226033 0.000000 0.000000
## 31 2.90 0.000000 3.333333 3.333333 5.000000 0.000000 3.333333
## 32 2.00 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
## 33 5.00 0.000000 3.333333 3.333333 5.000000 0.000000 3.333333
## 34 5.00 0.000000 9.999998 3.333333 0.000000 0.000000 3.333333
## 35 4.10 9.999998 4.700000 6.666666 3.750000 0.000000 7.827667
## 36 6.30 9.999998 9.999998 6.666666 6.250000 2.233333 6.666666
## 37 5.20 4.999998 6.600000 3.333333 3.633403 1.100000 3.314128
## 38 5.00 3.333333 6.400000 6.666666 2.844997 0.000000 4.429657
## 39 3.10 4.999998 4.200000 5.000000 3.750000 0.000000 6.164304
## 40 4.10 9.999998 6.666666 3.333333 5.000000 0.000000 4.938089
## 41 5.00 9.999998 6.666666 1.666666 5.000000 0.000000 6.666666
## 42 5.00 7.700000 6.666666 8.399997 6.250000 4.358243 9.999998
## 43 5.00 6.200000 9.999998 6.060997 5.000000 2.782771 6.666666
## 44 5.60 4.900000 0.000000 0.000000 6.555647 4.055463 6.666666
## 45 5.70 4.800000 0.000000 0.000000 6.555647 4.055463 0.000000
## 46 7.50 9.999998 7.900000 6.666666 3.750000 9.999998 7.631891
## 47 2.50 0.000000 6.666666 3.333333 2.500000 0.000000 0.000000
## 48 8.90 9.999998 9.700000 6.666666 5.000000 9.999998 9.556024
## 49 7.60 0.000000 10.000000 0.000000 5.000000 1.100000 6.666666
## 50 7.80 9.999998 6.666666 6.666666 5.000000 3.333333 6.666666
## 51 2.50 0.000000 6.666666 3.333333 5.000000 0.000000 6.666666
## 52 3.80 0.000000 5.100000 0.000000 3.750000 0.000000 6.666666
## 53 5.00 3.333333 3.333333 2.233333 5.000000 3.333333 6.666666
## 54 6.25 3.333333 9.999998 2.955702 6.250000 5.566663 9.999998
## 55 1.25 0.000000 3.333333 0.000000 2.500000 0.000000 0.000000
## 56 1.25 0.000000 4.700000 0.736999 2.500000 0.000000 3.333333
## 57 1.25 0.000000 6.666666 0.000000 2.500000 0.000000 5.228375
## 58 7.50 7.766664 9.999998 6.666666 7.500000 3.333333 9.999998
## 59 2.50 0.000000 6.666666 4.433333 5.000000 0.000000 6.666666
## 60 7.50 9.999998 9.999998 10.000000 8.750000 9.999998 9.999998
## 61 1.25 0.000000 0.000000 0.000000 1.250000 0.000000 0.000000
## 62 1.25 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000
## 63 2.50 0.000000 0.000000 0.000000 0.000000 0.000000 6.666666
## 64 6.25 2.233299 6.666666 2.970332 3.750000 3.333299 6.666666
## 65 7.50 9.999998 9.999998 10.000000 7.500000 9.999998 9.999998
## 66 5.00 0.000000 6.100000 0.000000 5.000000 3.333333 9.999998
## 67 7.50 9.999998 9.999998 10.000000 3.750000 9.999998 9.999998
## 68 4.90 2.233333 9.999998 0.000000 5.000000 0.000000 3.621989
## 69 5.00 0.000000 8.200000 0.000000 5.000000 0.000000 0.000000
## 70 2.90 3.333333 6.666666 3.333333 2.500000 3.333333 6.666666
## 71 5.40 9.999998 6.666666 3.333333 3.750000 6.666666 6.666666
## 72 7.50 8.800000 9.999998 6.066666 7.500000 6.666666 9.999998
## 73 7.50 7.000000 9.999998 6.852998 7.500000 6.348340 6.666666
## 74 10.00 6.666666 9.999998 10.000000 10.000000 6.666666 9.999998
## 75 3.75 3.333333 0.000000 0.000000 1.250000 3.333333 0.000000
## y8 x1 x2 x3
## 1 3.333333 4.442651 3.637586 2.557615
## 2 0.736999 5.384495 5.062595 3.568079
## 3 8.211809 5.961005 6.255750 5.224433
## 4 4.615086 6.285998 7.567863 6.267495
## 5 6.666666 5.863631 6.818924 4.573679
## 6 0.368500 5.533389 5.135798 3.892270
## 7 1.485166 5.308268 5.075174 3.316213
## 8 6.666666 5.347108 4.852030 4.263183
## 9 3.333333 5.521461 5.241747 4.115168
## 10 10.000000 5.828946 5.370638 4.446216
## 11 6.666666 5.916202 6.423247 3.791545
## 12 6.666666 5.398163 6.246107 4.535708
## 13 10.000000 6.622736 7.872074 4.906154
## 14 0.000000 5.204007 5.225747 4.561047
## 15 10.000000 5.509388 6.202536 4.586286
## 16 6.666666 5.262690 5.820083 3.948911
## 17 0.368500 4.700480 5.023881 4.394491
## 18 3.333333 5.209486 4.465908 4.510268
## 19 10.000000 5.916202 6.732211 5.829084
## 20 2.948164 6.523562 6.992096 6.424591
## 21 10.000000 6.238325 6.746412 5.741711
## 22 0.000000 5.976351 6.712956 5.948168
## 23 3.333333 5.631212 5.937536 5.686755
## 24 7.766666 6.033086 6.093570 4.611429
## 25 6.666666 6.196444 6.704414 5.475261
## 26 3.333333 4.248495 2.708050 1.740830
## 27 3.333333 5.141664 4.564348 2.255134
## 28 1.116666 4.174387 3.688879 3.046927
## 29 3.333333 4.382027 2.890372 1.711279
## 30 0.000000 4.290459 1.609438 1.001674
## 31 3.333333 4.934474 4.234107 1.418971
## 32 0.000000 3.850148 1.945910 2.345229
## 33 3.333333 5.181784 4.394449 3.167167
## 34 0.744370 5.062595 4.595120 3.834970
## 35 6.666666 4.691348 4.143135 2.255134
## 36 2.955702 4.248495 3.367296 3.217506
## 37 3.333333 5.564520 5.236442 2.677633
## 38 1.485166 4.727388 3.610918 1.418971
## 39 3.333333 4.143135 2.302585 1.418971
## 40 2.233333 4.317488 4.955827 4.249888
## 41 0.368500 5.141664 4.430817 3.046927
## 42 4.141377 4.488636 3.465736 2.013579
## 43 4.974739 4.615121 4.941642 2.255134
## 44 3.821796 3.850148 2.397895 1.740830
## 45 0.000000 3.970292 2.397895 1.050741
## 46 6.666666 3.784190 3.091042 2.113313
## 47 0.000000 3.806662 2.079442 2.137561
## 48 6.666666 4.532599 3.610918 1.587802
## 49 1.099999 5.117994 4.934474 3.834970
## 50 6.666666 5.049856 5.111988 4.381490
## 51 3.333333 5.393628 5.638355 4.169451
## 52 1.485166 4.477337 3.931826 2.474671
## 53 5.566663 5.257495 5.840642 5.001796
## 54 6.666666 5.379897 5.505332 3.299937
## 55 0.000000 5.298317 6.274762 4.381490
## 56 3.333333 4.859812 5.669881 3.537416
## 57 0.000000 4.969813 5.564520 4.510268
## 58 6.666666 6.011267 6.253829 5.001796
## 59 1.485166 5.075174 5.252273 5.350708
## 60 10.000000 6.736967 7.125283 6.330518
## 61 0.000000 5.225747 5.451038 3.167167
## 62 0.000000 4.025352 1.791759 2.657972
## 63 2.948164 4.234107 2.708050 2.474671
## 64 3.333333 4.644391 5.564520 3.046927
## 65 10.000000 4.418841 4.941642 3.380653
## 66 3.333333 4.262680 4.219508 4.368462
## 67 10.000000 4.875197 4.700480 3.834970
## 68 3.333333 4.189655 1.386294 1.418971
## 69 0.000000 4.521789 4.127134 2.113313
## 70 3.333333 4.653960 3.555348 1.881917
## 71 1.485166 4.477337 3.091042 1.987909
## 72 6.666666 5.337538 5.631212 3.491004
## 73 7.508044 6.129050 6.403574 5.001796
## 74 10.000000 5.003946 4.962845 3.976994
## 75 0.000000 4.488636 4.897840 2.867566
with the corresponding model structure which we want to fit. First, we define the model:
model <- '
# measurement model
ind60 =~ x1 + x2 + x3
dem60 =~ y1 + y2 + y3 + y4
dem65 =~ y5 + y6 + y7 + y8
# regressions
dem60 ~ ind60
dem65 ~ ind60 + dem60
# residual correlations
y1 ~~ y5
y2 ~~ y4 + y6
y3 ~~ y7
y4 ~~ y8
y6 ~~ y8
'
The double-tilde operator ‘~~’ used above specifies a (residual) variance or covariance between two variables:
variable ~~ variable
## variable ~ ~variable
and finally, we can fit the model:
fit <- sem(model, data=PoliticalDemocracy)
summary(fit, standardized=TRUE)
## lavaan (0.5-23.1097) converged normally after 68 iterations
##
## Number of observations 75
##
## Estimator ML
## Minimum Function Test Statistic 38.125
## Degrees of freedom 35
## P-value (Chi-square) 0.329
##
## Parameter Estimates:
##
## Information Expected
## Standard Errors Standard
##
## Latent Variables:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## ind60 =~
## x1 1.000 0.670 0.920
## x2 2.180 0.139 15.742 0.000 1.460 0.973
## x3 1.819 0.152 11.967 0.000 1.218 0.872
## dem60 =~
## y1 1.000 2.223 0.850
## y2 1.257 0.182 6.889 0.000 2.794 0.717
## y3 1.058 0.151 6.987 0.000 2.351 0.722
## y4 1.265 0.145 8.722 0.000 2.812 0.846
## dem65 =~
## y5 1.000 2.103 0.808
## y6 1.186 0.169 7.024 0.000 2.493 0.746
## y7 1.280 0.160 8.002 0.000 2.691 0.824
## y8 1.266 0.158 8.007 0.000 2.662 0.828
##
## Regressions:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## dem60 ~
## ind60 1.483 0.399 3.715 0.000 0.447 0.447
## dem65 ~
## ind60 0.572 0.221 2.586 0.010 0.182 0.182
## dem60 0.837 0.098 8.514 0.000 0.885 0.885
##
## Covariances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .y1 ~~
## .y5 0.624 0.358 1.741 0.082 0.624 0.296
## .y2 ~~
## .y4 1.313 0.702 1.871 0.061 1.313 0.273
## .y6 2.153 0.734 2.934 0.003 2.153 0.356
## .y3 ~~
## .y7 0.795 0.608 1.308 0.191 0.795 0.191
## .y4 ~~
## .y8 0.348 0.442 0.787 0.431 0.348 0.109
## .y6 ~~
## .y8 1.356 0.568 2.386 0.017 1.356 0.338
##
## Variances:
## Estimate Std.Err z-value P(>|z|) Std.lv Std.all
## .x1 0.082 0.019 4.184 0.000 0.082 0.154
## .x2 0.120 0.070 1.718 0.086 0.120 0.053
## .x3 0.467 0.090 5.177 0.000 0.467 0.239
## .y1 1.891 0.444 4.256 0.000 1.891 0.277
## .y2 7.373 1.374 5.366 0.000 7.373 0.486
## .y3 5.067 0.952 5.324 0.000 5.067 0.478
## .y4 3.148 0.739 4.261 0.000 3.148 0.285
## .y5 2.351 0.480 4.895 0.000 2.351 0.347
## .y6 4.954 0.914 5.419 0.000 4.954 0.443
## .y7 3.431 0.713 4.814 0.000 3.431 0.322
## .y8 3.254 0.695 4.685 0.000 3.254 0.315
## ind60 0.448 0.087 5.173 0.000 1.000 1.000
## .dem60 3.956 0.921 4.295 0.000 0.800 0.800
## .dem65 0.172 0.215 0.803 0.422 0.039 0.039