Conference Presentation
Bayesian Structural Equation Modeling of the WISC-IV with a Large Referred US Sample
GOLAY, Philippe, et al.
Abstract
Numerous studies have supported exploratory and confirmatory bifactor structures of the WISC-IV in US, French, and Irish samples. When investigating the structure of cognitive ability measures like the WISC-IV, subtest scores theoretically associated with one latent variable could also be related to other factors. A major drawback of classical confirmatory factor analysis (CFA) is that the majority of factor loadings need to be fixed to zero to estimate the model parameters. This unnecessarily strict parameterization can lead to model rejection and cause researchers to perform many exploratory modifications to achieve acceptable model fit. Bayesian structural equation modeling (BSEM) overcomes this limitation by replacing fixed-to-zero loadings with approximate zeros that translate into small, but not necessarily zero, cross-loadings. Because all relationships between factors and subtest scores are estimated, both the number of models to be tested and the risk of capitalizing on the chance characteristics of the data are decreased. The objective of this study was to determine whether secondary interpretation of the 10 [...]
Reference
GOLAY, Philippe, et al. Bayesian Structural Equation Modeling of the WISC-IV with a Large Referred US Sample. In: 9th Conference of the International Test Commission, San Sebastian (Spain), 2-5 July 2014.
Available at: http://archive-ouverte.unige.ch/unige:38747
Philippe Golay 1,2, Thierry Lecerf 1, Marley W. Watkins 3 & Gary L. Canivez 4
1 University of Geneva, 2 University of Lausanne, 3 Baylor University, 4 Eastern Illinois University
THE 9TH CONFERENCE OF THE INTERNATIONAL TEST COMMISSION, SAN SEBASTIAN, SPAIN, 2-5 JULY 2014
The Wechsler Intelligence Scale for Children remains the most widely used test in the field of intelligence assessment. General intelligence (g) has traditionally been conceptualized as a superordinate factor (higher-order model). However, recent research has shown better support for g as a breadth factor (bifactor model), with exploratory and confirmatory bifactor structures reported in US, French, Swiss, and Irish samples.
[Figure: path diagrams of the higher-order model and the bifactor model, each with g and four index factors. Verbal Comprehension (VCI): Similarities, Vocabulary, Comprehension. Perceptual Reasoning (PRI): Block Design, Picture Concepts, Matrix Reasoning. Working Memory (WMI): Digit Span, Letter-Number Sequencing. Processing Speed (PSI): Coding, Symbol Search.]
Goal 1: Compare higher-order (indirect hierarchical) versus bifactor (direct hierarchical) models of the 10 WISC-IV core subtests in a large referred US sample.
Many controversies remain about the nature of the constructs measured by each subtest score: scores theoretically associated with one latent variable could also be related to other factors. In particular, there is disagreement about which constructs contribute, at a secondary level, to each of the subtest scores.
Contribution of fluid reasoning to the Similarities verbal subtest score. Contribution of general verbal information and crystallized intelligence to performance on the Picture Concepts subtest. Contribution of visual abilities to the Symbol Search processing speed subtest score.
Goal 2: Determine more precisely which constructs are adequately measured by the WISC-IV core subtests, and whether secondary interpretation of some subtest scores can be supported by the data.
EFA is not very restrictive because the relationships between all items and all factors are estimated. Two decisions remain when selecting a proper solution: the number of factors, chosen on the basis of theoretical and statistical considerations, and the rotation method (orthogonal vs. oblique rotations, hypothesized complexity of the factorial structure).
Most rotation methods are designed to seek a simple structure with low factorial complexity. When several subtest scores are expected to load on more than one factor, these rotations are inefficient and cannot recover the correct structure. Expected factor complexity is not always easy to determine a priori.
In contrast to EFA, CFA allows estimating only some of the model parameters on the basis of theoretical knowledge. With CFA, the majority of factor loadings need to be fixed to zero to estimate the model parameters. Although needed for model identification, these restrictions do not always faithfully reflect the researchers' hypotheses.
Small but not necessarily zero loadings could be equally or even more compatible with theory. This unnecessarily strict parameterization can contribute to poor model fit, distorted factors and biased factor correlations (Marsh, et al., 21). It may also cause researchers to perform many exploratory modifications to achieve acceptable model fit (risk of overfitting and loss of meaning for indices of statistical significance).
Build on the strengths of both methods and avoid their weaknesses: a Bayesian approach to model estimation. BSEM can be seen as an intermediate approach between CFA and EFA. Like CFA, it allows the expected loadings to be specified. At the same time, like EFA, it makes it possible to maintain a certain level of uncertainty and estimate all loadings.
With classical CFA, most secondary loadings are fixed to exactly zero. [Figure: path diagram with Latent variable 1 and Latent variable 2]
Diffuse non-informative priors (zero mean and infinite variance). [Figure: path diagram with Latent variable 1 and Latent variable 2; prior density plotted over the range -1 to 1]
Informative priors (zero mean and small variance). [Figure: path diagram with Latent variable 1 and Latent variable 2; prior density plotted over the range -1 to 1]
Bayesian estimation combines prior distributions for all parameters with the experimental data and forms posterior distributions via Bayes' theorem: posterior ∝ likelihood × prior. The prior variance for the cross-loadings was .01 (standard deviation .1), which results in a 95% credibility interval of ±.2 (small cross-loadings). MCMC estimation was performed with Mplus 7.0.
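The link between the prior variance and the width of the prior interval can be checked with a few lines of Python. This is a minimal sketch, assuming a normal prior with zero mean and variance 0.01; the function name `prior_half_width` is illustrative and not part of the original analysis.

```python
from statistics import NormalDist


def prior_half_width(variance, mass=0.95):
    """Half-width of the central interval of a zero-mean normal prior.

    variance: prior variance (assumed 0.01 for the cross-loadings).
    mass: probability mass enclosed by the interval.
    """
    sd = variance ** 0.5
    # Standard normal quantile for the upper bound of the central interval.
    z = NormalDist().inv_cdf(0.5 + mass / 2.0)
    return z * sd


half_width = prior_half_width(0.01)
print(f"95% prior interval: +/-{half_width:.3f}")
```

Under these assumptions the half-width is about 0.196, which rounds to the ±.2 stated on the slide. A larger prior variance would allow larger cross-loadings, so the choice of variance is a modeling decision and not a fixed rule.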
BSEM overcomes CFA's limitations by replacing fixed-to-zero loadings with approximate zeros that translate into small, but not necessarily zero, cross-loadings. Approximate zeros often reflect theoretical assumptions more accurately and facilitate unbiased estimation of the model parameters.
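The difference between the two specifications can be sketched as a matrix of prior variances for the loadings. The sketch below is illustrative only: the subtest-to-index assignment follows the WISC-IV structure used in this presentation, while the variable names, the use of an infinite variance for the primary loadings, and the value 0.01 for the cross-loadings are assumptions made for the example.

```python
import numpy as np

# Hypothesized primary indicators of each index factor.
SUBTESTS = {
    "VCI": ["Similarities", "Vocabulary", "Comprehension"],
    "PRI": ["Block Design", "Picture Concepts", "Matrix Reasoning"],
    "WMI": ["Digit Span", "Letter-Number Sequencing"],
    "PSI": ["Coding", "Symbol Search"],
}
factors = list(SUBTESTS)
subtests = [s for f in factors for s in SUBTESTS[f]]

# target[i, j] is True when subtest i is a primary indicator of factor j.
target = np.array([[s in SUBTESTS[f] for f in factors] for s in subtests])

CROSS_VAR = 0.01  # assumed small prior variance for the cross-loadings

# Classical CFA: cross-loadings are fixed to zero (a prior with zero variance).
prior_var_cfa = np.where(target, np.inf, 0.0)
# BSEM: cross-loadings receive a zero-mean prior with a small variance.
prior_var_bsem = np.where(target, np.inf, CROSS_VAR)

n_cross = int((~target).sum())
print(f"{n_cross} cross-loadings are estimated in BSEM but fixed in CFA")
```

With 10 subtests and 4 index factors, the sketch yields 30 cross-loadings, each of which is freed in the BSEM specification.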
BSEM allows the estimation of many parameters without depending on the choice of a rotation method, as is needed when performing an EFA. Because all relationships between factors and subtest scores are estimated, this approach eliminates the need to compare many competing models. It is also possible to determine the precise nature of the constructs measured by the core subtest scores of the WISC-IV.
WISC-IV data were obtained from 1,130 US children who were referred for evaluation of learning difficulties. As appears to be common in clinical assessments, only the 10 core subtests were administered.
Sample   N       % Male      Age: Mean (SD)   Age: Min/Max   IQ: Mean (SD)    IQ: Min/Max
US       1,130   62% (696)   10.24 (2.51)     6-0/16-11      89.94 (17.16)    40/147
Model fit. The 95% C.I. columns refer to the difference between the observed and the replicated χ².
Model                                                                        Free parameters   PPP value   95% C.I. lower 2.5%   95% C.I. upper 2.5%   DIC       Estimated number of parameters
1. WISC-IV higher-order model                                                34                .000        34.732                90.485                25983.2   33.785
2. WISC-IV higher-order model with cross-loadings (prior variance = .01)     64                .151        -15.168               45.585                25941.9   4.471
3. WISC-IV bifactor model                                                    40                .000        21.281                76.965                25963.1   27.42
4. WISC-IV bifactor model with cross-loadings (prior variance = .01)         70                .388        -27.95                35.638                25913.3   23.9
Note. A higher posterior predictive p-value (PPP) and a lower DIC indicate better fit to the data. The WISC-IV bifactor model with small cross-loadings showed the best fit to the data overall.
Loading estimates (median) and 95% CI
Subtest                    VCI    PRI    WMI    PSI    VCI 95% CI     PRI 95% CI     WMI 95% CI     PSI 95% CI
Similarities               .798   .134   -.16   -.65   .679, .943     .15, .249      -.123, .88     -.154, .13
Vocabulary                 .932   -.7    .33    -.35   .797, 1.85     -.137, .114    -.78, .143     -.124, .46
Comprehension              .821   -.56   .1     .21    .695, .963     -.18, .61      -.99, .116     -.6, .17
Block Design               -.4    .87    -.96   .52    -.159, .74     .73, 1.41      -.266, .11     -.39, .14
Picture Concepts           .13    .535   .81    .9     .1, .27        .44, .67       -.24, .23      -.7, .91
Matrix Reasoning           .1     .772   .42    .7     -.15, .121     .62, .933      -.74, .16      -.79, .96
Digit Span                 .5     -.26   .713   .23    -.134, .141    -.173, .113    .467, .931     -.72, .112
Letter-Number Sequencing   .32    -.16   .779   .35    -.11, .163     -.168, .123    .559, 1.16     -.62, .134
Coding                     -.5    -.51   .39    .78    -.173, .65     -.194, .82     -.99, .171     .481, .944
Symbol Search              -.17   .11    -.5    .78    -.142, .14     -.38, .259     -.13, .125     .533, 1.21
Loading estimates (median) and 95% CI
Subtest                    G      VCI    PRI    WMI    PSI    G 95% CI      VCI 95% CI     PRI 95% CI     WMI 95% CI     PSI 95% CI
Similarities               .756   .38    .52    -.8    -.51   .68, .823     .22, .54       -.74, .184     -.134, .132    -.156, .54
Vocabulary                 .796   .488   .7     .43    -.14   .713, .887    .267, .66      -.142, .145    -.127, .187    -.138, .12
Comprehension              .691   .394   -.49   .7     .2     .63, .779     .16, .521      -.186, .153    -.141, .157    -.95, .133
Block Design               .699   -.25   .48    -.5    .34    .592, .8      -.176, .13     -.198, .68     -.29, .153     -.19, .162
Picture Concepts           .691   .12    .148   .1     -.24   .61, .767     -.15, .163     -.159, .394    -.165, .188    -.146, .87
Matrix Reasoning           .752   .5     .285   .29    .2     .671, .822    -.127, .125    -.97, .479     -.116, .162    -.119, .112
Digit Span                 .637   .22    -.1    .256   .26    .566, .71     -.116, .149    -.151, .134    -.178, .52     -.84, .131
Letter-Number Sequencing   .738   .39    -.6    .311   .44    .663, .816    -.113, .175    -.156, .141    -.2, .6        -.88, .162
Coding                     .691   -.3    -.27   .36    .554   .63, .779     -.171, .18     -.172, .126    -.127, .179    .341, .811
Symbol Search              .621   -.2    .71    .19    .472   .539, .699    -.135, .124    -.85, .24      -.116, .151    .32, .714
Results of the higher-order models (Models 1 and 2) highlighted two theoretically meaningful cross-loadings. The loading from VCI to Picture Concepts was considered substantial. The cross-loading from PRI to Similarities was also substantial. No other hypothesized cross-loadings were supported.
In contrast, results of the bifactor models (Models 3 and 4) revealed no cross-loadings. The breadth conception of the g-factor left less unmodeled complexity than the higher-order structure.
Loadings of the subtest scores on the g-factor were systematically higher than their respective loadings on the four index factors. Index scores represented rather small deviations from unidimensionality and did not necessarily provide additional and separate information from the Full Scale IQ score (FSIQ).
Results on a sample of 1,130 referred US children showed that the bifactor model fit better than the higher-order solution. Models including small cross-loadings were more adequate. BSEM allowed us to estimate models that were closer to theoretical assumptions. BSEM also made it possible to test more complex models that could not be estimated through maximum likelihood estimation. BSEM suggested a simple and parsimonious interpretation of the subtest scores.
Thank you very much for your attention Contact : philippe.golay@unil.ch
BSEM was conducted using Mplus 7.0 with a Markov Chain Monte Carlo (MCMC) estimation algorithm based on the Gibbs sampler. Three chains with 5, iterations, different starting values, and different random seeds were estimated. The convergence of the chains was verified using the Potential Scale Reduction factor (PSR; Gelman & Rubin, 1992). A Kolmogorov-Smirnov test of equality of the posterior parameter distributions across the three chains was also performed for all models. The first half of each chain was discarded (burn-in phase) and the posterior distributions were estimated on the second half.
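The convergence check described above can be illustrated with a small Python sketch. It implements the classical Gelman and Rubin (1992) potential scale reduction statistic for a single parameter after discarding the first half of each chain. The function name `gelman_rubin` and the simulated chains are assumptions made for the example, and the exact PSR variant computed by Mplus may differ in detail.

```python
import numpy as np


def gelman_rubin(chains):
    """Potential scale reduction for one parameter.

    chains: array of shape (m, n_total), one row of draws per chain.
    The first half of every chain is discarded as burn-in.
    """
    chains = np.asarray(chains, dtype=float)
    n_total = chains.shape[1]
    kept = chains[:, n_total // 2:]
    n = kept.shape[1]

    chain_means = kept.mean(axis=1)
    within = kept.var(axis=1, ddof=1).mean()   # W: mean within-chain variance
    between = n * chain_means.var(ddof=1)      # B: between-chain variance
    pooled = (n - 1) / n * within + between / n
    return float(np.sqrt(pooled / within))


# Simulated draws standing in for three MCMC chains (hypothetical data).
rng = np.random.default_rng(0)
converged = rng.normal(0.0, 0.1, size=(3, 2000))
# Shift each chain by a different constant to mimic chains that disagree.
diverged = converged + np.array([[0.0], [0.5], [1.0]])

print("PSR, converged chains:", round(gelman_rubin(converged), 3))
print("PSR, diverged chains:", round(gelman_rubin(diverged), 3))
```

Values close to 1 indicate that the chains agree; values well above 1 indicate that more iterations are needed. In practice the statistic is computed for every model parameter and the largest value is inspected.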
Second-order loading estimates (median)
Index                         g      95% CI
Verbal Comprehension Index    .874   .793, .932
Perceptual Reasoning Index    .893   .89, .943
Working Memory Index          .921   .816, .977
Processing Speed Index        .79    .59, .827
Loading estimates
Subtest                    g      VCI    PRI    WMI    PSI
Similarities               .756   .38    .52    -.8    -.51
Vocabulary                 .796   .488   .7     .43    -.14
Comprehension              .691   .394   -.49   .7     .2
Block Design               .699   -.25   .48    -.5    .34
Picture Concepts           .691   .12    .148   .1     -.24
Matrix Reasoning           .752   .5     .285   .29    .2
Digit Span                 .637   .22    -.1    .256   .26
Letter-Number Sequencing   .738   .39    -.6    .311   .44
Coding                     .691   -.3    -.27   .36    .554
Symbol Search              .621   -.2    .71    .19    .472
Omega-Hierarchical         .875   .215   .19    .14    .311
Omega-hierarchical coefficients for group (index) factors were likely too low for interpretation.
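For readers unfamiliar with omega-hierarchical, the following Python sketch shows one common way to compute it from standardized bifactor loadings. It assumes orthogonal factors and no cross-loadings, and it uses hypothetical loadings for six indicators and two group factors, not the WISC-IV estimates above. The function name `omega_hierarchical` is illustrative.

```python
def omega_hierarchical(g_loadings, group_loadings):
    """Omega-hierarchical for g and for each group factor.

    g_loadings: standardized general-factor loadings, one per indicator.
    group_loadings: dict mapping a group-factor name to
        {indicator index: standardized loading on that group factor}.
    Assumes orthogonal factors and one group factor per indicator.
    """
    specific = [0.0] * len(g_loadings)
    for loadings in group_loadings.values():
        for i, lam in loadings.items():
            specific[i] = lam
    # Unique variance of each standardized indicator.
    uniqueness = [1.0 - g ** 2 - s ** 2 for g, s in zip(g_loadings, specific)]

    general_var = sum(g_loadings) ** 2
    group_var = {k: sum(v.values()) ** 2 for k, v in group_loadings.items()}
    total_var = general_var + sum(group_var.values()) + sum(uniqueness)
    omega_h = general_var / total_var

    # Same ratio restricted to the indicators of each group factor.
    omega_hs = {}
    for k, v in group_loadings.items():
        idx = list(v)
        sub_total = (sum(g_loadings[i] for i in idx) ** 2
                     + group_var[k]
                     + sum(uniqueness[i] for i in idx))
        omega_hs[k] = group_var[k] / sub_total
    return omega_h, omega_hs


# Hypothetical loadings: six indicators, two group factors A and B.
g = [0.7, 0.7, 0.7, 0.6, 0.6, 0.6]
groups = {"A": {0: 0.4, 1: 0.4, 2: 0.4}, "B": {3: 0.3, 4: 0.3, 5: 0.3}}
omega_h, omega_hs = omega_hierarchical(g, groups)
print(round(omega_h, 3), {k: round(v, 3) for k, v in omega_hs.items()})
```

In this hypothetical example most of the reliable variance is attributable to the general factor, and the group factors account for little unique variance, which is the same pattern the slide describes for the index factors.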