Outline

  • SEM growth in research applications

  • Steps in doing SEM

  • Basics of SEM

  • CB-SEM vs. PLS-SEM

SEM growth in research applications

Munim, Z. H., & Noor, T. (2020). Young people’s perceived service quality and environmental performance of hybrid electric bus service. Travel Behaviour and Society, 20, 133–143. https://doi.org/10.1016/j.tbs.2020.03.003

Steps in doing SEM


Basics of SEM

Measurement model

  • measurement part of a full SEM model

  • confirmatory factor analysis

Structural model

  • relationships between constructs

  • a full SEM model is a combination of the measurement and structural components
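
In matrix form, the two components are often written as below. This is only a sketch in one common notation; the symbols are assumptions and do not come from the slides: y is the vector of observed indicators, η the vector of latent constructs, Λ the loading matrix, B the matrix of structural coefficients, and ε and ζ the error terms. Intercepts are omitted for brevity.

```latex
\begin{align}
\mathbf{y} &= \boldsymbol{\Lambda}\,\boldsymbol{\eta} + \boldsymbol{\varepsilon}
  && \text{(measurement model)} \\
\boldsymbol{\eta} &= \mathbf{B}\,\boldsymbol{\eta} + \boldsymbol{\zeta}
  && \text{(structural model)}
\end{align}
```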

Approaches in SEM


Covariance-based SEM (CB-SEM)

  • theory testing and confirmation

Partial least squares SEM (PLS-SEM)

  • prediction and theory development


“Choice of the method originates from the goal of research. If the existing theory needs to be tested and confirmed, CB-SEM is the chosen one. Nevertheless, for theory development as well as prediction purposes, PLS-SEM is better.” - Dash and Paul (2021)


“… both methods are complementary, not competitive” - Hair et al. (2017)

CB-SEM or PLS-SEM

Hair Jr, J.F., Matthews, L.M., Matthews, R.L., Sarstedt, M., 2017. PLS-SEM or CB-SEM: updated guidelines on which method to use. Int. J. Multivar. Data Anal. 1 (2), 107–123. https://doi.org/10.1504/IJMDA.2017.087624

CB-SEM with lavaan R package

What is lavaan?

  • “developed to provide useRs, researchers, and teachers a free open-source, but commercial-quality package for latent variable modeling” - Yves Rosseel (2012)

  • Check out this lavaan tutorial

install.packages("lavaan")
library(lavaan)
example(cfa)

cfa> ## The famous Holzinger and Swineford (1939) example
cfa> HS.model <- ' visual  =~ x1 + x2 + x3
cfa+               textual =~ x4 + x5 + x6
cfa+               speed   =~ x7 + x8 + x9 '

cfa> fit <- cfa(HS.model, data = HolzingerSwineford1939)

cfa> summary(fit, fit.measures = TRUE)
lavaan 0.6.17 ended normally after 35 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        21

  Number of observations                           301

Model Test User Model:
                                                      
  Test statistic                                85.306
  Degrees of freedom                                24
  P-value (Chi-square)                           0.000

Model Test Baseline Model:

  Test statistic                               918.852
  Degrees of freedom                                36
  P-value                                        0.000

User Model versus Baseline Model:

  Comparative Fit Index (CFI)                    0.931
  Tucker-Lewis Index (TLI)                       0.896

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)              -3737.745
  Loglikelihood unrestricted model (H1)      -3695.092
                                                      
  Akaike (AIC)                                7517.490
  Bayesian (BIC)                              7595.339
  Sample-size adjusted Bayesian (SABIC)       7528.739

Root Mean Square Error of Approximation:

  RMSEA                                          0.092
  90 Percent confidence interval - lower         0.071
  90 Percent confidence interval - upper         0.114
  P-value H_0: RMSEA <= 0.050                    0.001
  P-value H_0: RMSEA >= 0.080                    0.840

Standardized Root Mean Square Residual:

  SRMR                                           0.065

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)
  visual =~                                           
    x1                1.000                           
    x2                0.554    0.100    5.554    0.000
    x3                0.729    0.109    6.685    0.000
  textual =~                                          
    x4                1.000                           
    x5                1.113    0.065   17.014    0.000
    x6                0.926    0.055   16.703    0.000
  speed =~                                            
    x7                1.000                           
    x8                1.180    0.165    7.152    0.000
    x9                1.082    0.151    7.155    0.000

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)
  visual ~~                                           
    textual           0.408    0.074    5.552    0.000
    speed             0.262    0.056    4.660    0.000
  textual ~~                                          
    speed             0.173    0.049    3.518    0.000

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)
   .x1                0.549    0.114    4.833    0.000
   .x2                1.134    0.102   11.146    0.000
   .x3                0.844    0.091    9.317    0.000
   .x4                0.371    0.048    7.779    0.000
   .x5                0.446    0.058    7.642    0.000
   .x6                0.356    0.043    8.277    0.000
   .x7                0.799    0.081    9.823    0.000
   .x8                0.488    0.074    6.573    0.000
   .x9                0.566    0.071    8.003    0.000
    visual            0.809    0.145    5.564    0.000
    textual           0.979    0.112    8.737    0.000
    speed             0.384    0.086    4.451    0.000
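
Several headline numbers in this output can be re-derived by hand, which helps when reading it. The base R sketch below uses values copied from the output above and the usual textbook formulas for the degrees of freedom, CFI, TLI, AIC, and BIC. The variable names are illustrative, and lavaan's internal conventions may differ in small details.

```r
# Values copied from the summary() output above
chisq_m <- 85.306;  df_m <- 24   # user model
chisq_b <- 918.852; df_b <- 36   # baseline model
n_items <- 9                     # observed variables x1 ... x9
n_par   <- 21                    # number of model parameters
loglik  <- -3737.745             # loglikelihood of the user model (H0)
n_obs   <- 301                   # number of observations

# Degrees of freedom: unique variances/covariances minus free parameters
df_check <- n_items * (n_items + 1) / 2 - n_par

# Incremental fit indices relative to the baseline model
cfi <- 1 - max(chisq_m - df_m, 0) / max(chisq_b - df_b, chisq_m - df_m, 0)
tli <- (chisq_b / df_b - chisq_m / df_m) / (chisq_b / df_b - 1)

# Information criteria
aic <- -2 * loglik + 2 * n_par
bic <- -2 * loglik + log(n_obs) * n_par

round(c(df = df_check, cfi = cfi, tli = tli, aic = aic, bic = bic), 3)
```

The results agree with the reported values: 24 degrees of freedom, CFI = 0.931, TLI = 0.896, AIC = 7517.490, and BIC = 7595.339.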

Major operators of lavaan syntax


Defining a reflective latent variable

model <- "F1 =~ x1 + x2 + x3 + x4"


Estimate factor covariance

model <- "F1 =~ x1 + x2 + x3 + x4
          F2 =~ x5 + x6 + x7 + x8
          F1 ~~ F2"
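
To try the `=~` and `~~` operators on real data, the sketch below swaps the placeholder factors F1 and F2 for two factors from lavaan's built-in HolzingerSwineford1939 data. That model choice is an assumption made for illustration. Note that cfa() correlates exogenous latent variables by default, so the `~~` line only makes the covariance explicit.

```r
library(lavaan)

# Two correlated factors from the built-in HolzingerSwineford1939 data
model <- "visual  =~ x1 + x2 + x3
          textual =~ x4 + x5 + x6
          visual ~~ textual"

fit <- cfa(model, data = HolzingerSwineford1939)

# Model-implied correlation matrix of the latent variables
lavInspect(fit, "cor.lv")
```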


Major operators of lavaan syntax

Estimate regression

model <- "F1 =~ x1 + x2 + x3 + x4
          F2 =~ x5 + x6 + x7 + x8
          F3 =~ x9 + x10 + x11 + x12
          F1 ~~ F2
          F3 ~ F1 + F2"

Major operators of lavaan syntax

Insert a comment in the syntax

model <- "F1 =~ x1 + x2 + x3 + x4
          F2 =~ x5 + x6 + x7 + x8
          F3 =~ x9 + x10 + x11 + x12
          
          # covariance
          F1 ~~ F2
          
          # F3 is regressed on F1 and F2
          F3 ~ F1 + F2"

Major operators of lavaan syntax

Label a parameter

model <- "F1 =~ x1 + x2 + x3 + x4
          F2 =~ x5 + x6 + x7 + x8
          F3 =~ x9 + x10 + x11 + x12
          
          # covariance
          F1 ~~ F2
          
          # F3 is regressed on F1 and F2
          F3 ~ b1*F1 + b2*F2"
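
Labels do more than name a parameter: in lavaan, reusing one label on several parameters constrains them to be equal. The sketch below shows this on the built-in HolzingerSwineford1939 data. The dataset and the label `a` are illustrative choices, not part of the F1 to F3 example.

```r
library(lavaan)

# Giving two loadings the same label (a) constrains them to be equal
model <- "visual  =~ x1 + a*x2 + a*x3
          textual =~ x4 + x5 + x6"

fit <- cfa(model, data = HolzingerSwineford1939)

# Both labelled loadings are listed with the same estimate
pe <- parameterEstimates(fit)
pe[pe$label == "a", c("lhs", "op", "rhs", "label", "est")]
```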

Major operators of lavaan syntax

Create a new parameter

model <- "F1 =~ x1 + x2 + x3 + x4
          F2 =~ x5 + x6 + x7 + x8
          F3 =~ x9 + x10 + x11 + x12
          
          # regression
          F3 ~ b1*F1 + b2*F2
          F2 ~ b3*F1
          # F1 indirect effect
          ie := b3*b2
          # F1 total effect
          te := b3*b2 + b1"
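
The same pattern can be run end to end on lavaan's built-in PoliticalDemocracy data. This is a sketch under assumptions: the constructs (ind60, dem60, dem65) and the labels a, b, and c are chosen for illustration and are not the slides' F1 to F3 model.

```r
library(lavaan)

model <- "
  # measurement model
  ind60 =~ x1 + x2 + x3
  dem60 =~ y1 + y2 + y3 + y4
  dem65 =~ y5 + y6 + y7 + y8

  # regressions with labelled paths
  dem60 ~ a*ind60
  dem65 ~ c*ind60 + b*dem60

  # defined parameters
  ie := a*b        # indirect effect of ind60 on dem65 via dem60
  te := c + a*b    # total effect of ind60 on dem65
"

fit <- sem(model, data = PoliticalDemocracy)

# Defined parameters are reported with the ':=' operator
pe <- parameterEstimates(fit)
pe[pe$op == ":=", c("lhs", "rhs", "est", "se", "pvalue")]
```

Because `ie` and `te` are defined parameters, lavaan reports standard errors and p-values for them alongside the other estimates.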

Structural model: estimation

sem_model <- "SI =~ SI1 + SI2 + SI3 + SI4
              JS =~ JS1 + JS2 + JS3 + JS4 + JS5
              AC =~ AC1 + AC2 + AC3 + AC4
              EP =~ EP1 + EP2 + EP3 + EP4
              OC =~ OC1 + OC2 + OC3 + OC4
              EP ~~ AC
              JS ~ H1*EP + H3*AC
              OC ~ H2*EP + H4*AC + H5*JS
              SI ~ H6*JS + H7*OC"
sem_fit <- sem(model = sem_model, data = hbat_data)
summary(sem_fit, standardized = TRUE)
lavaan 0.6.17 ended normally after 41 iterations

  Estimator                                         ML
  Optimization method                           NLMINB
  Number of model parameters                        50

  Number of observations                           400

Model Test User Model:
                                                      
  Test statistic                               287.179
  Degrees of freedom                               181
  P-value (Chi-square)                           0.000

Parameter Estimates:

  Standard errors                             Standard
  Information                                 Expected
  Information saturated (h1) model          Structured

Latent Variables:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  SI =~                                                                 
    SI1               1.000                               0.707    0.813
    SI2               1.076    0.055   19.670    0.000    0.761    0.869
    SI3               1.058    0.066   15.976    0.000    0.749    0.738
    SI4               1.158    0.061   19.117    0.000    0.819    0.848
  JS =~                                                                 
    JS1               1.000                               0.988    0.739
    JS2               1.036    0.076   13.664    0.000    1.023    0.748
    JS3               0.904    0.072   12.498    0.000    0.894    0.680
    JS4               0.912    0.071   12.933    0.000    0.901    0.705
    JS5              15.234    1.137   13.396    0.000   15.054    0.732
  AC =~                                                                 
    AC1               1.000                               1.144    0.822
    AC2               1.236    0.067   18.384    0.000    1.414    0.820
    AC3               1.037    0.055   18.847    0.000    1.186    0.837
    AC4               1.147    0.063   18.261    0.000    1.313    0.816
  EP =~                                                                 
    EP1               1.000                               1.253    0.685
    EP2               1.040    0.075   13.855    0.000    1.303    0.802
    EP3               0.835    0.061   13.633    0.000    1.046    0.785
    EP4               0.924    0.065   14.130    0.000    1.157    0.824
  OC =~                                                                 
    OC1               1.000                               1.455    0.577
    OC2               1.328    0.110   12.080    0.000    1.932    0.885
    OC3               0.790    0.077   10.217    0.000    1.149    0.656
    OC4               1.172    0.099   11.802    0.000    1.705    0.832

Regressions:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  JS ~                                                                  
    EP        (H1)    0.198    0.049    4.038    0.000    0.252    0.252
    AC        (H3)   -0.009    0.051   -0.186    0.852   -0.011   -0.011
  OC ~                                                                  
    EP        (H2)    0.523    0.079    6.628    0.000    0.450    0.450
    AC        (H4)    0.255    0.068    3.745    0.000    0.200    0.200
    JS        (H5)    0.126    0.078    1.608    0.108    0.085    0.085
  SI ~                                                                  
    JS        (H6)    0.087    0.036    2.383    0.017    0.121    0.121
    OC        (H7)    0.269    0.032    8.284    0.000    0.553    0.553

Covariances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
  AC ~~                                                                 
    EP                0.368    0.087    4.244    0.000    0.257    0.257

Variances:
                   Estimate  Std.Err  z-value  P(>|z|)   Std.lv  Std.all
   .SI1               0.256    0.023   11.006    0.000    0.256    0.339
   .SI2               0.189    0.021    9.173    0.000    0.189    0.246
   .SI3               0.469    0.038   12.259    0.000    0.469    0.456
   .SI4               0.263    0.026    9.998    0.000    0.263    0.282
   .JS1               0.812    0.074   10.998    0.000    0.812    0.454
   .JS2               0.824    0.076   10.818    0.000    0.824    0.440
   .JS3               0.930    0.078   11.911    0.000    0.930    0.538
   .JS4               0.824    0.071   11.574    0.000    0.824    0.504
   .JS5             196.613   17.664   11.131    0.000  196.613    0.465
   .AC1               0.628    0.060   10.559    0.000    0.628    0.324
   .AC2               0.972    0.092   10.606    0.000    0.972    0.327
   .AC3               0.603    0.060   10.118    0.000    0.603    0.300
   .AC4               0.865    0.081   10.720    0.000    0.865    0.334
   .EP1               1.775    0.145   12.249    0.000    1.775    0.531
   .EP2               0.945    0.093   10.200    0.000    0.945    0.358
   .EP3               0.682    0.064   10.633    0.000    0.682    0.384
   .EP4               0.634    0.067    9.510    0.000    0.634    0.321
   .OC1               4.244    0.321   13.227    0.000    4.244    0.667
   .OC2               1.038    0.144    7.212    0.000    1.038    0.218
   .OC3               1.751    0.137   12.739    0.000    1.751    0.570
   .OC4               1.298    0.136    9.547    0.000    1.298    0.309
   .SI                0.326    0.036    8.943    0.000    0.652    0.652
   .JS                0.916    0.115    7.943    0.000    0.938    0.938
    AC                1.309    0.136    9.661    0.000    1.000    1.000
    EP                1.569    0.213    7.364    0.000    1.000    1.000
   .OC                1.443    0.247    5.840    0.000    0.682    0.682
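
One way to read the Variances block: for an endogenous construct, the Std.all value of its residual variance is the share of variance left unexplained, so one minus that value gives its R-squared. A small base R sketch using the three values copied from the output above:

```r
# Standardized residual variances (Std.all) of the endogenous constructs,
# copied from the Variances block above
std_resid_var <- c(SI = 0.652, JS = 0.938, OC = 0.682)

# Share of variance explained by the structural predictors
r_squared <- 1 - std_resid_var
round(r_squared, 3)
```

This gives roughly 0.348 for SI, 0.062 for JS, and 0.318 for OC. With the fitted object at hand, `summary(sem_fit, rsquare = TRUE)` or `lavInspect(sem_fit, "rsquare")` should report the same quantities directly.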

Fit indices

Goodness of fit indices

  • Goodness-of-fit index (GFI)
  • Adjusted goodness-of-fit index (AGFI)
  • Comparative fit index (CFI)
  • Normed fit index (NFI)
  • Non-normed fit index (NNFI)

Badness of fit indices

  • Standardized root mean square residual (SRMR)
  • Root mean square error of approximation (RMSEA)
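
As an illustration of a badness-of-fit index, RMSEA can be computed from the chi-square statistic, its degrees of freedom, and the sample size. The helper `rmsea()` below is a hypothetical function written for this sketch, using one common textbook formula with n - 1 in the denominator. Some software uses n instead, which makes no difference at three decimals here.

```r
# RMSEA from a chi-square statistic, its degrees of freedom, and sample size
rmsea <- function(chisq, df, n) {
  sqrt(max(chisq - df, 0) / (df * (n - 1)))
}

# Holzinger-Swineford CFA example (chi-square = 85.306, df = 24, n = 301)
rmsea_cfa_example <- rmsea(chisq = 85.306, df = 24, n = 301)

# Structural model (chi-square = 287.179, df = 181, n = 400)
rmsea_sem <- rmsea(chisq = 287.179, df = 181, n = 400)

round(c(cfa_example = rmsea_cfa_example, sem = rmsea_sem), 3)
```

The first value reproduces the RMSEA of 0.092 printed in the CFA example output. Lower values indicate better fit, which is why RMSEA counts as a badness-of-fit index.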

GOF measures for the structural and CFA models

gof_indices <- c('chisq', 'df','pvalue', "gfi", 
                 'rmsea', 'rmr', 'srmr', 'nfi', 
                 'nnfi', 'cfi', 'agfi')
fitmeasures(sem_fit, fit.measures = gof_indices)
  chisq      df  pvalue     gfi   rmsea     rmr    srmr     nfi    nnfi     cfi 
287.179 181.000   0.000   0.938   0.038   0.410   0.060   0.936   0.971   0.975 
   agfi 
  0.921 
fitmeasures(cfa_fit, fit.measures = gof_indices)
  chisq      df  pvalue     gfi   rmsea     rmr    srmr     nfi    nnfi     cfi 
240.738 179.000   0.001   0.947   0.029   0.414   0.036   0.946   0.983   0.985 
   agfi 
  0.932 
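
The two sets of indices can also be compared formally with a chi-square difference test. The base R sketch below uses the values printed above and assumes the structural model is nested in the CFA model, meaning the same measurement model with all factor covariances freely estimated in the CFA. The slides do not show the `cfa_fit` specification, so that nesting is an assumption.

```r
# Fit statistics copied from the fitmeasures() output above
chisq_sem <- 287.179; df_sem <- 181   # structural model
chisq_cfa <- 240.738; df_cfa <- 179   # CFA model

# Chi-square difference test (valid only for nested models)
delta_chisq <- chisq_sem - chisq_cfa
delta_df    <- df_sem - df_cfa
p_value     <- pchisq(delta_chisq, df = delta_df, lower.tail = FALSE)

c(delta_chisq = delta_chisq, delta_df = delta_df, p_value = p_value)
```

The difference is 46.441 on 2 degrees of freedom, with a p-value well below 0.001. The structural restrictions therefore worsen exact fit significantly, even though the approximate fit indices of the two models stay close. With both fitted objects available, `anova(cfa_fit, sem_fit)` should run the same comparison.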

PLS-SEM with SEMinR package

What is SEMinR?

  • “SEMinR brings a friendly syntax to creating and estimating SEM. It uses its own PLS-SEM engine and integrates with the Lavaan package for CB-SEM/CFA estimation.” - Soumya Ray & Nicholas Danks (2020)

  • Check out this SEMinR vignette

  • Download the PLS-SEM book using R

Three main steps in using SEMinR

  1. Describe the measurement model for each construct and its items

  2. Describe the structural model of causal relationships between constructs

  3. Estimate the model using PLS, CB-SEM, or CFA

Major operators of SEMinR syntax

1. Describe the measurement model for each construct and its items

model <- constructs(
  composite(construct_name = "F1", item_names = multi_items("x", 1:4)))

plot(model)


Major operators of SEMinR syntax

2. Describe the structural model of causal relationships between constructs

## specifying measurement model
mm <- constructs(
  composite(construct_name = "F1", item_names = multi_items("x", 1:4)),
  composite("F2", multi_items("x", 5:8)),
  composite("F3", multi_items("x", 9:12)))

## specifying structural model
sm <- relationships(
  paths(from = "F1", to = "F2"),
  paths(from = c("F1", "F2"), to = "F3"))

plot(sm)


Major operators of SEMinR syntax

3. Estimate the model

## estimating the PLS-SEM model
pls_sem_estimate <- 
  estimate_pls(data = my_data,
               measurement_model = mm,
               structural_model = sm)

## plotting pls-sem model
plot(pls_sem_estimate)
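
The three steps can be tried end to end on the `mobi` dataset that ships with seminr. This is a sketch under assumptions: the constructs, their items, and the paths below are chosen for illustration and are not the F1 to F3 model from the slides.

```r
library(seminr)

# Step 1: measurement model (constructs and items are illustrative choices)
mobi_mm <- constructs(
  composite("Image",        multi_items("IMAG", 1:5)),
  composite("Expectation",  multi_items("CUEX", 1:3)),
  composite("Value",        multi_items("PERV", 1:2)),
  composite("Satisfaction", multi_items("CUSA", 1:3)))

# Step 2: structural model
mobi_sm <- relationships(
  paths(from = "Image",       to = c("Expectation", "Satisfaction")),
  paths(from = "Expectation", to = c("Value", "Satisfaction")),
  paths(from = "Value",       to = "Satisfaction"))

# Step 3: estimate with PLS on the bundled mobi data
mobi_pls <- estimate_pls(data = seminr::mobi,
                         measurement_model = mobi_mm,
                         structural_model = mobi_sm)

# Path coefficients, R-squared, and reliability summaries
summary(mobi_pls)
```

Keeping the measurement and structural models as separate objects makes it possible to reuse one measurement model across several structural specifications.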

Well done!

Sample analysis