SEM growth in research applications
Steps in doing SEM
Basics of SEM
CB-SEM vs. PLS-SEM
Munim, Z. H., & Noor, T. (2020). Young people’s perceived service quality and environmental performance of hybrid electric bus service. Travel Behaviour and Society, 20, 133–143. https://doi.org/10.1016/j.tbs.2020.03.003
Measurement model (confirmatory factor analysis): the measurement part of a full SEM model
Structural model: the relationships between constructs
Full SEM model: the combination of the measurement and structural components
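To make the two components concrete, here is a minimal sketch in lavaan syntax (the R package introduced later in this deck). It is not taken from the slides: it assumes the `PoliticalDemocracy` data set bundled with lavaan, and the construct names `ind60`, `dem60` and `dem65` follow that data set. The `=~` lines form the measurement part and the `~` lines form the structural part.

```r
library(lavaan)

# Illustrative model on lavaan's bundled PoliticalDemocracy data
# (an assumption for this sketch, not the data used in the slides)
model <- '
  # measurement part: latent constructs defined by their indicators (=~)
  ind60 =~ x1 + x2 + x3
  dem60 =~ y1 + y2 + y3 + y4
  dem65 =~ y5 + y6 + y7 + y8

  # structural part: regressions among the constructs (~)
  dem60 ~ ind60
  dem65 ~ ind60 + dem60
'

fit <- sem(model, data = PoliticalDemocracy)

# Show only the structural paths
est <- parameterEstimates(fit)
est[est$op == "~", ]
```

Dropping the `~` lines and fitting with `cfa()` would leave only the measurement part, which is what a confirmatory factor analysis estimates.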
“Choice of the method originates from the goal of research. If the existing theory needs to be tested and confirmed, CB-SEM is the chosen one. Nevertheless, for theory development as well as prediction purposes, PLS-SEM is better.” - Dash and Paul (2021)
“… both methods are complementary, not competitive” - Hair et al. (2017)
Hair Jr., J. F., Matthews, L. M., Matthews, R. L., & Sarstedt, M. (2017). PLS-SEM or CB-SEM: updated guidelines on which method to use. International Journal of Multivariate Data Analysis, 1(2), 107–123. https://doi.org/10.1504/IJMDA.2017.087624
lavaan: “developed to provide useRs, researchers and teachers a free open-source, but commercial-quality package for latent variable modeling” - Yves Rosseel (2012)
Check out this lavaan tutorial
library(lavaan)
example(cfa)
cfa> ## The famous Holzinger and Swineford (1939) example
cfa> HS.model <- ' visual =~ x1 + x2 + x3
cfa+ textual =~ x4 + x5 + x6
cfa+ speed =~ x7 + x8 + x9 '
cfa> fit <- cfa(HS.model, data = HolzingerSwineford1939)
cfa> summary(fit, fit.measures = TRUE)
lavaan 0.6.17 ended normally after 35 iterations
Estimator ML
Optimization method NLMINB
Number of model parameters 21
Number of observations 301
Model Test User Model:
Test statistic 85.306
Degrees of freedom 24
P-value (Chi-square) 0.000
Model Test Baseline Model:
Test statistic 918.852
Degrees of freedom 36
P-value 0.000
User Model versus Baseline Model:
Comparative Fit Index (CFI) 0.931
Tucker-Lewis Index (TLI) 0.896
Loglikelihood and Information Criteria:
Loglikelihood user model (H0) -3737.745
Loglikelihood unrestricted model (H1) -3695.092
Akaike (AIC) 7517.490
Bayesian (BIC) 7595.339
Sample-size adjusted Bayesian (SABIC) 7528.739
Root Mean Square Error of Approximation:
RMSEA 0.092
90 Percent confidence interval - lower 0.071
90 Percent confidence interval - upper 0.114
P-value H_0: RMSEA <= 0.050 0.001
P-value H_0: RMSEA >= 0.080 0.840
Standardized Root Mean Square Residual:
SRMR 0.065
Parameter Estimates:
Standard errors Standard
Information Expected
Information saturated (h1) model Structured
Latent Variables:
Estimate Std.Err z-value P(>|z|)
visual =~
x1 1.000
x2 0.554 0.100 5.554 0.000
x3 0.729 0.109 6.685 0.000
textual =~
x4 1.000
x5 1.113 0.065 17.014 0.000
x6 0.926 0.055 16.703 0.000
speed =~
x7 1.000
x8 1.180 0.165 7.152 0.000
x9 1.082 0.151 7.155 0.000
Covariances:
Estimate Std.Err z-value P(>|z|)
visual ~~
textual 0.408 0.074 5.552 0.000
speed 0.262 0.056 4.660 0.000
textual ~~
speed 0.173 0.049 3.518 0.000
Variances:
Estimate Std.Err z-value P(>|z|)
.x1 0.549 0.114 4.833 0.000
.x2 1.134 0.102 11.146 0.000
.x3 0.844 0.091 9.317 0.000
.x4 0.371 0.048 7.779 0.000
.x5 0.446 0.058 7.642 0.000
.x6 0.356 0.043 8.277 0.000
.x7 0.799 0.081 9.823 0.000
.x8 0.488 0.074 6.573 0.000
.x9 0.566 0.071 8.003 0.000
visual 0.809 0.145 5.564 0.000
textual 0.979 0.112 8.737 0.000
speed 0.384 0.086 4.451 0.000
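As a rough check on how the fit indices in the output above relate to the chi-square statistics, the sketch below recomputes RMSEA, CFI and TLI from the printed user-model and baseline-model values. It assumes the usual textbook formulas, with N rather than N - 1 in the RMSEA denominator; the variable names are illustrative.

```r
# Values copied from the summary() output above
chisq_user <- 85.306;  df_user <- 24
chisq_base <- 918.852; df_base <- 36
n_obs <- 301

# RMSEA: excess misfit per degree of freedom and per observation
rmsea <- sqrt(max(chisq_user - df_user, 0) / (df_user * n_obs))

# CFI: 1 minus the ratio of user-model to baseline-model noncentrality
cfi <- 1 - max(chisq_user - df_user, 0) /
  max(chisq_base - df_base, chisq_user - df_user, 0)

# TLI: compares the chi-square/df ratios of the baseline and user models
tli <- (chisq_base / df_base - chisq_user / df_user) /
  (chisq_base / df_base - 1)

round(c(rmsea = rmsea, cfi = cfi, tli = tli), 3)
```

The rounded values agree with the 0.092, 0.931 and 0.896 that lavaan prints for this model.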
sem_fit <- sem(model = sem_model, data = hbat_data)
summary(sem_fit, standardized = TRUE)
lavaan 0.6.17 ended normally after 41 iterations
Estimator ML
Optimization method NLMINB
Number of model parameters 50
Number of observations 400
Model Test User Model:
Test statistic 287.179
Degrees of freedom 181
P-value (Chi-square) 0.000
Parameter Estimates:
Standard errors Standard
Information Expected
Information saturated (h1) model Structured
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
SI =~
SI1 1.000 0.707 0.813
SI2 1.076 0.055 19.670 0.000 0.761 0.869
SI3 1.058 0.066 15.976 0.000 0.749 0.738
SI4 1.158 0.061 19.117 0.000 0.819 0.848
JS =~
JS1 1.000 0.988 0.739
JS2 1.036 0.076 13.664 0.000 1.023 0.748
JS3 0.904 0.072 12.498 0.000 0.894 0.680
JS4 0.912 0.071 12.933 0.000 0.901 0.705
JS5 15.234 1.137 13.396 0.000 15.054 0.732
AC =~
AC1 1.000 1.144 0.822
AC2 1.236 0.067 18.384 0.000 1.414 0.820
AC3 1.037 0.055 18.847 0.000 1.186 0.837
AC4 1.147 0.063 18.261 0.000 1.313 0.816
EP =~
EP1 1.000 1.253 0.685
EP2 1.040 0.075 13.855 0.000 1.303 0.802
EP3 0.835 0.061 13.633 0.000 1.046 0.785
EP4 0.924 0.065 14.130 0.000 1.157 0.824
OC =~
OC1 1.000 1.455 0.577
OC2 1.328 0.110 12.080 0.000 1.932 0.885
OC3 0.790 0.077 10.217 0.000 1.149 0.656
OC4 1.172 0.099 11.802 0.000 1.705 0.832
Regressions:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
JS ~
EP (H1) 0.198 0.049 4.038 0.000 0.252 0.252
AC (H3) -0.009 0.051 -0.186 0.852 -0.011 -0.011
OC ~
EP (H2) 0.523 0.079 6.628 0.000 0.450 0.450
AC (H4) 0.255 0.068 3.745 0.000 0.200 0.200
JS (H5) 0.126 0.078 1.608 0.108 0.085 0.085
SI ~
JS (H6) 0.087 0.036 2.383 0.017 0.121 0.121
OC (H7) 0.269 0.032 8.284 0.000 0.553 0.553
Covariances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
AC ~~
EP 0.368 0.087 4.244 0.000 0.257 0.257
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.SI1 0.256 0.023 11.006 0.000 0.256 0.339
.SI2 0.189 0.021 9.173 0.000 0.189 0.246
.SI3 0.469 0.038 12.259 0.000 0.469 0.456
.SI4 0.263 0.026 9.998 0.000 0.263 0.282
.JS1 0.812 0.074 10.998 0.000 0.812 0.454
.JS2 0.824 0.076 10.818 0.000 0.824 0.440
.JS3 0.930 0.078 11.911 0.000 0.930 0.538
.JS4 0.824 0.071 11.574 0.000 0.824 0.504
.JS5 196.613 17.664 11.131 0.000 196.613 0.465
.AC1 0.628 0.060 10.559 0.000 0.628 0.324
.AC2 0.972 0.092 10.606 0.000 0.972 0.327
.AC3 0.603 0.060 10.118 0.000 0.603 0.300
.AC4 0.865 0.081 10.720 0.000 0.865 0.334
.EP1 1.775 0.145 12.249 0.000 1.775 0.531
.EP2 0.945 0.093 10.200 0.000 0.945 0.358
.EP3 0.682 0.064 10.633 0.000 0.682 0.384
.EP4 0.634 0.067 9.510 0.000 0.634 0.321
.OC1 4.244 0.321 13.227 0.000 4.244 0.667
.OC2 1.038 0.144 7.212 0.000 1.038 0.218
.OC3 1.751 0.137 12.739 0.000 1.751 0.570
.OC4 1.298 0.136 9.547 0.000 1.298 0.309
.SI 0.326 0.036 8.943 0.000 0.652 0.652
.JS 0.916 0.115 7.943 0.000 0.938 0.938
AC 1.309 0.136 9.661 0.000 1.000 1.000
EP 1.569 0.213 7.364 0.000 1.000 1.000
.OC 1.443 0.247 5.840 0.000 0.682 0.682
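The definition of `sem_model` is not shown in these slides. The sketch below is a hypothetical reconstruction read off the printed output: construct names, item names and the labels H1 to H7 come from the output, but the authors' actual model string may differ. The second half counts the free parameters this specification implies, as a consistency check against the 50 parameters and 181 degrees of freedom reported.

```r
# Hypothetical reconstruction of the model string (not the authors' code)
sem_model <- '
  # measurement part
  SI =~ SI1 + SI2 + SI3 + SI4
  JS =~ JS1 + JS2 + JS3 + JS4 + JS5
  AC =~ AC1 + AC2 + AC3 + AC4
  EP =~ EP1 + EP2 + EP3 + EP4
  OC =~ OC1 + OC2 + OC3 + OC4

  # structural part, with hypothesis labels
  JS ~ H1 * EP + H3 * AC
  OC ~ H2 * EP + H4 * AC + H5 * JS
  SI ~ H6 * JS + H7 * OC
'

# Free parameters implied by this specification
n_items      <- 21
n_loadings   <- n_items - 5  # the first loading of each construct is fixed to 1
n_resid_var  <- n_items      # one residual variance per item
n_latent_var <- 5            # variances of AC and EP, disturbances of SI, JS, OC
n_paths      <- 7            # H1 to H7
n_cov        <- 1            # AC ~~ EP, added by sem() for exogenous latents

n_par     <- n_loadings + n_resid_var + n_latent_var + n_paths + n_cov
n_moments <- n_items * (n_items + 1) / 2  # unique variances and covariances

c(parameters = n_par, df = n_moments - n_par)
```

The counts come to 50 parameters and 181 degrees of freedom, matching the summary, which supports (but does not prove) that the reconstruction has the same structure as the fitted model.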
gof_indices <- c("chisq", "df", "pvalue", "gfi",
                 "rmsea", "rmr", "srmr", "nfi",
                 "nnfi", "cfi", "agfi")
fitmeasures(sem_fit, fit.measures = gof_indices)
chisq df pvalue gfi rmsea rmr srmr nfi nnfi cfi
287.179 181.000 0.000 0.938 0.038 0.410 0.060 0.936 0.971 0.975
agfi
0.921
fitmeasures(cfa_fit, fit.measures = gof_indices)
chisq df pvalue gfi rmsea rmr srmr nfi nnfi cfi
240.738 179.000 0.001 0.947 0.029 0.414 0.036 0.946 0.983 0.985
agfi
0.932
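The two sets of indices can also be compared more formally. Assuming `cfa_fit` is the five-construct CFA with all construct covariances free (an assumption, since its specification is not shown), the structural model is nested within it and a chi-square difference test applies. The sketch uses only the printed statistics and base R; the helper `rmsea()` is illustrative and again assumes N in the denominator.

```r
# Values copied from the two fitmeasures() outputs above
chisq_sem <- 287.179; df_sem <- 181
chisq_cfa <- 240.738; df_cfa <- 179
n_obs <- 400

# Chi-square difference test between the nested models
delta_chisq <- chisq_sem - chisq_cfa
delta_df    <- df_sem - df_cfa
p_value     <- pchisq(delta_chisq, df = delta_df, lower.tail = FALSE)

# RMSEA recomputed from chi-square, df and sample size
rmsea <- function(chisq, df, n) sqrt(max(chisq - df, 0) / (df * n))

round(c(delta_chisq = delta_chisq, delta_df = delta_df), 3)
p_value
round(c(rmsea_sem = rmsea(chisq_sem, df_sem, n_obs),
        rmsea_cfa = rmsea(chisq_cfa, df_cfa, n_obs)), 3)
```

The difference is 46.441 on 2 degrees of freedom, with a p-value well below 0.001, so under the nesting assumption the two omitted structural paths cost a noticeable amount of fit even though both models look acceptable on the approximate indices. With both fitted objects available, `anova(cfa_fit, sem_fit)` in lavaan would run the same comparison.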
“SEMinR brings a friendly syntax to creating and estimating SEM. It uses its own PLS-SEM engine and integrates with the Lavaan package for CB-SEM/CFA estimation.” - Soumya Ray & Nicholas Danks (2020)
Check out this SEMinR vignette
Download the book on PLS-SEM using R
Describe the measurement model for each construct and its items
Describe the structural model of causal relationships between constructs
Estimate the model using PLS, CB-SEM, or CFA
library(seminr)

## specifying measurement model
mm <- constructs(
  composite(construct_name = "F1", item_names = multi_items("x", 1:4)),
  composite("F2", multi_items("x", 5:8)),
  composite("F3", multi_items("x", 9:12)))

## specifying structural model
sm <- relationships(
  paths(from = "F1", to = "F2"),
  paths(from = c("F1", "F2"), to = "F3"))

plot(sm)
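The third step, estimation, could look like the sketch below. Because no data set accompanies this example, it simulates a hypothetical data frame `sim_data` with items `x1` to `x12`; the sample size, loadings and path strengths are arbitrary assumptions. The measurement and structural models are repeated so that the block runs on its own.

```r
library(seminr)

# Hypothetical simulated data: three latent scores driving items x1..x12
set.seed(123)
n  <- 300
f1 <- rnorm(n)
f2 <- 0.5 * f1 + rnorm(n)
f3 <- 0.4 * f1 + 0.4 * f2 + rnorm(n)

# Four noisy indicators per latent score
make_items <- function(f) sapply(1:4, function(i) 0.8 * f + rnorm(n, sd = 0.6))
sim_data <- as.data.frame(cbind(make_items(f1), make_items(f2), make_items(f3)))
names(sim_data) <- paste0("x", 1:12)

# Same measurement and structural models as in the slide
mm <- constructs(
  composite("F1", multi_items("x", 1:4)),
  composite("F2", multi_items("x", 5:8)),
  composite("F3", multi_items("x", 9:12)))
sm <- relationships(
  paths(from = "F1", to = "F2"),
  paths(from = c("F1", "F2"), to = "F3"))

# Estimate with SEMinR's PLS-SEM engine
pls_model <- estimate_pls(data = sim_data,
                          measurement_model = mm,
                          structural_model  = sm)

# Path coefficients and R-squared values
summary(pls_model)$paths
```

For real data, `sim_data` would be replaced by the survey data frame, and bootstrapping would normally follow to obtain standard errors for the paths.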
SEM: fundamentals and applications | link: bit.ly/viserdac-demo