MCAT DIFFERENTIATION – SUPPLEMENTAL MATERIAL
Online Supplemental Material
Differentiation of Cognitive Abilities and the Medical College Admission Test
McLarnon, M. J. W., Goffin, R. D., & Rothstein, M. G. (2017), Personality and Individual Differences
http://doi.org/10.1016/j.paid.2017.11.005
Contents
Table S1 – Full sample correlation matrix of study variables
Table S2 – Low-g group correlation matrix
Table S3 – High-g group correlation matrix
Figure S1 – Model to assess Tenet 2 via differential correlations between the MCAT’s g-factor, its subtests and GPA
Figure S2 – Model to assess Tenet 2 via equivalence of GPA’s residual variance across ability groups
Summary of moderated factor analysis
Table S4 – Linear and moderated factor analysis fit statistics
Table S5 – Parameter estimates from linear and moderated factor models
Moderated factor model Mplus syntax
Extreme group syntax
References used in Online Supplemental Material
Table S1
Full Sample Correlation Matrix of Study Variables
| | Mean | SD | 1. | 2. | 3. | 4. | 5. |
|---|---|---|---|---|---|---|---|
| 1. Verbal Reasoning | 8.85 | 1.95 | -- | | | | |
| 2. Physical Science | 9.81 | 1.92 | .38 | -- | | | |
| 3. Writing | 3.90 | .93 | .23 | .15 | -- | | |
| 4. Biological Science | 10.12 | 1.75 | .46 | .65 | .17 | -- | |
| 5. Grade Point Average | 3.50 | .38 | .22 | .44 | .15 | .44 | -- |
Note. n = 7,498. Raw-score descriptives are presented for the MCAT subtests; scores were standardized prior to differentiating low- and high-g groups. All correlations significant at p < .001.
Table S2
Low-g Group Correlation Matrix
| | Mean | SD | 1. | 2. | 3. | 4. | 5. |
|---|---|---|---|---|---|---|---|
| 1. Verbal Reasoning | 7.31 | 1.82 | -- | | | | |
| 2. Physical Science | 8.20 | 1.50 | .07 | -- | | | |
| 3. Writing | 3.33 | .93 | -.04 | -.21 | -- | | |
| 4. Biological Science | 8.59 | 1.50 | .19 | .48 | -.17 | -- | |
| 5. Grade Point Average | 3.34 | .43 | .10 | .37 | .02 | .36 | -- |
Note. n = 2,421. Raw score descriptives for MCAT subtests presented. All correlations significant at p < .001 except those that are underlined.
Table S3
High-g Group Correlation Matrix
| | Mean | SD | Cohen’s d | 1. | 2. | 3. | 4. | 5. |
|---|---|---|---|---|---|---|---|---|
| 1. Verbal Reasoning | 10.20 | 1.38 | 1.79 | -- | | | | |
| 2. Physical Science | 11.40 | 1.43 | 2.18 | -.02 | -- | | | |
| 3. Writing | 4.42 | .71 | 1.32 | -.08 | -.21 | -- | | |
| 4. Biological Science | 11.62 | 1.21 | 2.22 | .03 | .27 | -.26 | -- | |
| 5. Grade Point Average | 3.66 | .27 | .89 | -.01 | .22 | -.05 | .22 | -- |
Note. n = 2,425. Raw score descriptives for MCAT subtests presented. Cohen’s d compares mean differences in each variable across low- and high-g groups. All mean differences significant at p < .001. All correlations significant at p < .001 except those that are underlined.
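As a rough check on the Cohen’s d column of Table S3, the sketch below recomputes d from the rounded means, SDs, and group sizes reported in Tables S2 and S3. It assumes the conventional pooled-SD formula, which the table note does not state explicitly; the function name cohens_d and the dictionary layout are illustrative only.

```python
from math import sqrt

def cohens_d(m_low, sd_low, n_low, m_high, sd_high, n_high):
    """Standardized mean difference (high minus low) using a pooled SD.

    Assumption: the conventional pooled-variance formula; the table note
    does not say which variant of d was used.
    """
    pooled_var = ((n_low - 1) * sd_low ** 2 + (n_high - 1) * sd_high ** 2) / (
        n_low + n_high - 2
    )
    return (m_high - m_low) / sqrt(pooled_var)

N_LOW, N_HIGH = 2421, 2425  # group sizes from the notes to Tables S2 and S3

# (low-g mean, low-g SD, high-g mean, high-g SD), copied from Tables S2 and S3
descriptives = {
    "Verbal Reasoning": (7.31, 1.82, 10.20, 1.38),
    "Physical Science": (8.20, 1.50, 11.40, 1.43),
    "Writing": (3.33, 0.93, 4.42, 0.71),
    "Biological Science": (8.59, 1.50, 11.62, 1.21),
    "Grade Point Average": (3.34, 0.43, 3.66, 0.27),
}

d_values = {
    name: cohens_d(m_lo, sd_lo, N_LOW, m_hi, sd_hi, N_HIGH)
    for name, (m_lo, sd_lo, m_hi, sd_hi) in descriptives.items()
}

for name, d in d_values.items():
    print(f"{name}: d = {d:.2f}")
```

With these rounded inputs, the pooled-SD formula returns values that round to the d estimates reported in Table S3 (1.79, 2.18, 1.32, 2.22, and .89).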
Figure S1. The first statistical model used to assess Tenet 2 via differential correlations between the MCAT’s g-factor, its subtests and GPA. Correlations with GPA were assessed sequentially, in that each was tested in a separate model. VR = verbal reasoning, BS = biological science, PS = physical science, WRIT = writing. Parameters that were tested for equivalence across low- and high-g groups are given by k subscripts.
Figure S2. The second statistical model used to assess Tenet 2 via the estimates of R² for GPA, or more specifically, equivalence of GPA’s residual variance across ability groups. VR = verbal reasoning, BS = biological science, PS = physical science, WRIT = writing. Parameters that were tested for equivalence across low- and high-g groups are given by k subscripts.
Summary of Moderated Factor Model
Our use of the moderated factor analysis model (i.e., a non-linear structural equation model) was informed by the studies of Tucker-Drob (2009) and Bauer (2016). In a two-stage testing procedure, we first analyzed a linear model with the same technical specifications as those required by the moderated factor model. In contrast to a typical single-group linear factor model, this required specifying random slopes (i.e., random factor loadings; in Mplus, this is invoked by TYPE IS RANDOM;) and a numerical integration algorithm (ALGORITHM IS INTEGRATION;), used in conjunction with a robust maximum likelihood estimator (ESTIMATOR IS MLR;, which was also implemented in the analyses reported in the manuscript). The default options for standard (trapezoidal) numerical integration were used (i.e., 15 integration points, adaptive quadrature, and an accelerated expectation-maximization algorithm; Muthén & Muthén, 2012). As in the analyses reported in the manuscript, Mplus 7.4 (Muthén & Muthén, 2015) was used. Full syntax for the moderated model and the extreme-group models is available in a later section of this Online Supplemental Material.
Including the random-slope and numerical integration settings precludes computation of the typical model fit indices (χ², comparative fit index [CFI], and root mean square error of approximation [RMSEA]). Model comparisons between the linear and moderated models were therefore based on the difference in loglikelihood values (which is distributed as χ², with degrees of freedom equal to the difference in the number of free parameters) and on differences in the information criteria (i.e., Akaike Information Criterion [AIC], Bayesian Information Criterion [BIC], and sample-size adjusted BIC [aBIC]), where lower values indicate better fit. We note, however, that because this baseline model is still a linear factor model, the random-slope and numerical integration specifications can be turned off; doing so yields the same model fit and parameter estimates while also furnishing the typical fit indices. We present these only for the sake of completeness: χ²(2) = 152.537, p < .001, χ² scaling correction factor = 1.0249, CFI = .975, RMSEA = .102 (90% CI = .089 - .116), SRMR = .031. We place emphasis on the loglikelihood and information criteria presented in Table S4 to enable a comparison to the moderated model. The moderated factor model converged without error and gave parameter estimates (for 16 parameters) that were all within bounds (i.e., no negative residual variances or Heywood cases).
Table S4
Model Fit Statistics
| | Linear | Moderated |
|---|---|---|
| #fp | 12 | 16 |
| LL | -37985.46 | -37887.29 |
| LLc | 1.05 | 1.04 |
| AIC | 75994.92 | 75806.58 |
| BIC | 76077.55 | 75916.75 |
| aBIC | 76039.42 | 75865.91 |
| ΔAIC | -188.34 | |
| ΔBIC | -160.80 | |
| ΔaBIC | -173.51 | |
| Δχ²(4) | 193.78, p < .001 | |
Note. #fp = number of free parameters, LL = loglikelihood, LLc = LL scaling correction factor, AIC = Akaike Information Criterion, BIC = Bayesian Information Criterion, aBIC = sample-size adjusted BIC, Δ estimates = differences between the respective columns (moderated minus linear), Δχ²(4) = χ² difference test with 4 degrees of freedom, based on the Satorra and Bentler (1994, 2001) nested model comparison.
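The AIC rows and the Δ rows of Table S4 can be reproduced from the other table entries. The sketch below is a minimal arithmetic check, assuming the standard definition AIC = -2LL + 2(#fp); BIC and aBIC are taken as reported rather than recomputed, and the variable names are illustrative.

```python
def aic(loglik: float, n_params: int) -> float:
    """Akaike information criterion: -2*LL + 2*(number of free parameters)."""
    return -2.0 * loglik + 2.0 * n_params

# Values copied from Table S4 (rounded as reported there).
linear = {"ll": -37985.46, "fp": 12, "bic": 76077.55, "abic": 76039.42}
moderated = {"ll": -37887.29, "fp": 16, "bic": 75916.75, "abic": 75865.91}

aic_linear = aic(linear["ll"], linear["fp"])
aic_moderated = aic(moderated["ll"], moderated["fp"])

# Differences are moderated minus linear, so negative values favor the
# moderated model.
deltas = {
    "AIC": aic_moderated - aic_linear,
    "BIC": moderated["bic"] - linear["bic"],
    "aBIC": moderated["abic"] - linear["abic"],
}

print(f"AIC linear = {aic_linear:.2f}, AIC moderated = {aic_moderated:.2f}")
for name, value in deltas.items():
    print(f"delta {name} = {value:.2f}")
```

All three differences are negative, matching the ΔAIC, ΔBIC, and ΔaBIC rows of the table.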
First, the moderated model’s information criteria were all lower than those of the linear model, suggesting better fit: ΔAIC = -188.341, ΔBIC = -160.798, ΔaBIC = -173.509. Kass and Raftery (1995) suggested that an absolute ΔBIC greater than 10 is very strong evidence in favor of the model with the lower estimate. The nested model comparison based on the loglikelihood estimates, in conjunction with the scaling correction factors associated with robust maximum likelihood estimation (Satorra & Bentler, 1994, 2001), also revealed a significant improvement in fit for the moderated model over the linear model: Δχ²(4) = 193.784, p < .001. Thus, the moderated model fits the data significantly better than the linear model.
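The scaled loglikelihood difference test can be approximated from the entries in Table S4. The sketch below assumes the usual formula for MLR loglikelihoods, cd = (p0·c0 - p1·c1)/(p0 - p1) and TRd = -2(LL0 - LL1)/cd, where the subscript 0 denotes the nested (linear) model and 1 the comparison (moderated) model. Because the scaling correction factors in the table are rounded to two decimals, the result is only approximately equal to the reported Δχ²(4) = 193.78. The function names are illustrative.

```python
from math import exp

def scaled_ll_difference(ll0, c0, p0, ll1, c1, p1):
    """Scaled loglikelihood difference test for nested models under MLR.

    ll0, c0, p0: loglikelihood, scaling correction factor, and number of
    free parameters of the nested model; ll1, c1, p1: the same quantities
    for the comparison model.
    """
    cd = (p0 * c0 - p1 * c1) / (p0 - p1)  # difference-test scaling correction
    trd = -2.0 * (ll0 - ll1) / cd         # scaled chi-square difference
    return trd, cd, p1 - p0

def chi2_sf_df4(x):
    """Upper-tail probability of a chi-square variate with exactly 4 df."""
    return exp(-x / 2.0) * (1.0 + x / 2.0)

# Rounded values from Table S4: linear model first, moderated model second.
trd, cd, df = scaled_ll_difference(-37985.46, 1.05, 12, -37887.29, 1.04, 16)
p_value = chi2_sf_df4(trd)

print(f"cd = {cd:.3f}, scaled difference = {trd:.2f}, df = {df}, p = {p_value:.3g}")
```

With the rounded inputs, cd is about 1.01 and the scaled difference is about 194.4 on 4 degrees of freedom, close to the reported value; the small gap is consistent with rounding in the tabled correction factors.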
Next, Table S5 presents the parameter estimates stemming from both the linear and moderated factor models.
Table S5
Parameter Estimates (and 99% Confidence Intervals) From Full-Sample Linear and Moderated Factor Models
| MCAT subtests | υ | λ1 | λ2 | σ² |
|---|---|---|---|---|
| Linear Model | | | | |
| Verbal Reasoning | .00 (-.03 - .03) | .53 (.50 - .56) | -- | .72 (.69 - .75) |
| Physical Sciences | .00 (-.03 - .03) | .75 (.72 - .77) | -- | .45 (.41 - .48) |
| Biological Sciences | .00 (-.03 - .03) | .87 (.85 - .89) | -- | .25 (.21 - .29) |
| Writing | .00 (-.03 - .03) | .22 (.19 - .25) | -- | .95 (.94 - .97) |
| Moderated Model | | | | |
| Verbal Reasoning | .08 (.04 - .11) | .53 (.49 - .56) | -.08 (-.10 - -.05) | .71 (.67 - .74) |
| Physical Sciences | .01 (-.03 - .05) | .76 (.73 - .79) | -.01 (-.03 - .01) | .42 (.38 - .45) |
| Biological Sciences | .06 (.03 - .10) | .84 (.81 - .88) | -.06 (-.08 - -.04) | .28 (.24 - .32) |
| Writing | .04 (.01 - .08) | .22 (.18 - .25) | -.04 (-.06 - -.01) | .95 (.91 - .99) |
Note. υ = intercept; λ1 = linear factor loadings; λ2 = moderated/non-linear factor loadings; σ² = residual variance. Bolded table entries reflect parameters that have 99% CIs that exclude zero.
We make several observations based on these parameter estimates. First, the factor loadings representing the linear relations are remarkably similar between the linear and moderated models. The residual variances show a similar pattern of relatively consistent estimates. Perhaps most interesting is that the moderated factor loadings for the verbal reasoning, biological sciences, and writing subtests of the MCAT were significant (the 99% CIs excluded zero). The negative values can be interpreted as the (linear) factor loadings decreasing as the level of the g-factor increases (see Tucker-Drob, 2009). Physical science, in contrast, did not demonstrate a significant moderated factor loading, suggesting that its positive relation to g is consistent across the levels of g. In sum, these findings are in alignment with the general propositions of cognitive ability differentiation and the results reported in the manuscript. Thus, regardless of whether multi-group or moderated factor model approaches are used, the findings converge on a similar pattern: there is evidence to support Tenet 1 of cognitive ability differentiation.
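One way to read the negative λ2 estimates is to compute the loading implied at different levels of g. The sketch below assumes the quadratic form y = υ + λ1·g + λ2·g² + e implied by the XWITH specification in the Mplus syntax that follows, so that the slope of a subtest on g at a given level is λ1 + 2·λ2·g. It uses the rounded point estimates from the moderated model in Table S5; the function name implied_loading is illustrative and this is not output produced by Mplus.

```python
# (lambda1, lambda2) point estimates from the moderated model in Table S5
estimates = {
    "Verbal Reasoning": (0.53, -0.08),
    "Physical Sciences": (0.76, -0.01),
    "Biological Sciences": (0.84, -0.06),
    "Writing": (0.22, -0.04),
}

def implied_loading(lam1: float, lam2: float, g: float) -> float:
    """Slope of the subtest on g at level g: d/dg of (lam1*g + lam2*g**2)."""
    return lam1 + 2.0 * lam2 * g

# g is scaled to unit variance in the syntax (MCAT@1), so these levels
# are in standard-deviation units of the latent factor.
g_levels = (-2.0, -1.0, 0.0, 1.0, 2.0)

implied = {
    name: [implied_loading(lam1, lam2, g) for g in g_levels]
    for name, (lam1, lam2) in estimates.items()
}

for name, loadings in implied.items():
    formatted = ", ".join(f"{value:.2f}" for value in loadings)
    print(f"{name}: {formatted}")
```

Under this assumption the implied verbal reasoning loading falls from about .69 at one SD below the mean of g to about .37 at one SD above it, whereas the physical sciences loading changes by only about .04 over the same range.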
Mplus Syntax for Moderated Factor Analysis
TITLE: Moderated Factor Model
!DATA: identifies the data file
!Notes can be added to the syntax file by using an ! at the beginning of the line
DATA: FILE IS data.csv;
!VARIABLE: defines specifics to the variables in the dataset
VARIABLE:
!NAMES ARE defines each column in dataset
NAMES ARE ID YEAR GPA mcatvr mcatps mcatws mcatbs
z_vr z_ps z_ws z_bs MCAT_grp ;
!VR=verbal reasoning, PS=physical science, WS=writing, BS=biological science
!z_vr etc.=z-scored variables
!MCAT_grp 0 = LOW, 1 = MIDDLE, 2 = HIGH;
!MISSING = defines the values that represent missing data
MISSING = .;
!USEOBSERVATIONS selects participants from a particular subgroup (i.e., an extreme group)
!USEOBSERVATIONS IS ;
!USEVARIABLES specifies the variables, defined in the NAMES command to use for analysis
USEVARIABLES = z_vr z_ps z_ws z_bs ;
!IDVAR defines which variable of the dataset contains participant ID numbers
IDVAR = ID;
!GROUPING specifies a multi-group model, as used in the extreme group approach
!GROUPING IS MCAT_grp (0 = LOW 2 = HIGH);
ANALYSIS:
!Use maximum likelihood estimator with robust standard errors
ESTIMATOR IS MLR;
!Specify random slopes (i.e., factor loadings)
TYPE IS RANDOM;
!Model requires numerical integration; default options used for standard/trapezoidal integration
!with 15 integration points, adaptive quadrature, and accelerated expectation-maximization
!algorithm; see Mplus User Guide
ALGORITHM IS INTEGRATION;
!MODEL: is used to specify the model to be analyzed
MODEL:
!Single-factor model, no correlated residuals
!All factor loadings freely estimated with *
!Variance of latent variable fixed at 1 for identification purposes using @
!These factor loadings represent the linear loadings
MCAT BY *z_vr z_ps z_ws z_bs;
MCAT@1;
!Define nonlinear/moderated factor, gxg
gxg | MCAT XWITH MCAT;
!Regress specific abilities on moderated factor, gxg
!Represents moderated factor loadings
z_vr ON gxg;
z_ps ON gxg;
z_ws ON gxg;
z_bs ON gxg;
!OUTPUT request additional output
!SAMPSTAT=sample statistics
!STDYX=standardized parameters
!CINTERVAL=confidence intervals for all parameters
!RESIDUAL=model residual estimates
!TECH1=parameter specification matrices
OUTPUT: SAMPSTAT STDYX CINTERVAL RESIDUAL TECH1;
!PLOT provides various plotting/graphing features
PLOT:
SERIES = z_vr z_ps z_ws z_bs(*);
TYPE IS PLOT3;
Mplus Syntax for Extreme-Group Approach
TITLE: Model 1 – Configural Invariance
!Refer to Moderated Factor Model for syntax details
DATA: FILE IS data.csv;
VARIABLE:
NAMES ARE ID YEAR GPA mcatvr mcatps mcatws mcatbs
z_vr z_ps z_ws z_bs MCAT_grp ;
MISSING = .;
!USEOBSERVATIONS used in preliminary analyses to examine factor model in a single group
!With this syntax activated, only those cases in the Low group would be used for analysis
!USEOBSERVATIONS IS MCAT_grp == 0;
USEVARIABLES = z_vr z_ps z_ws z_bs ;
IDVAR = ID;
GROUPING IS MCAT_grp (0 = LOW 2 = HIGH);
ANALYSIS:
ESTIMATOR IS MLR;
MODEL:
!Single-factor model, no correlated residuals
!All factor loadings freely estimated with *
!Variance of latent variable fixed at 1 for identification purposes using @
MCAT BY *z_vr z_ps z_ws z_bs;
MCAT@1;
!Separate factor models are specified for each ability group
MODEL LOW:
!Freely estimate factor loadings in both groups
MCAT BY *z_vr z_ps z_ws z_bs;
!Latent variable variance fixed at 1 in both groups
MCAT@1;
!Freely estimate means/intercepts of specific abilities in both groups
[z_vr z_ps z_ws z_bs];
!Latent variable mean fixed at 0 for identification in both groups
[MCAT@0];
!Freely estimate residual variances in both groups
z_vr z_ps z_ws z_bs;
MODEL HIGH:
!Freely estimate factor loadings in both groups
MCAT BY *z_vr z_ps z_ws z_bs;
!Latent variable variance fixed at 1 in both groups
MCAT@1;
!Freely estimate means/intercepts of specific abilities in both groups
[z_vr z_ps z_ws z_bs];
!Latent variable mean fixed at 0 for identification in both groups
[MCAT@0];
!Freely estimate residual variances in both groups
z_vr z_ps z_ws z_bs;
OUTPUT: SAMPSTAT STDYX CINTERVAL RESIDUAL TECH1;
PLOT:
SERIES = z_vr z_ps z_ws z_bs(*);
TYPE IS PLOT3;
TITLE: Model 2 – Metric/Factor Loading Invariance
NB. All DATA, VARIABLE, and ANALYSIS commands are carried over from Model 1
MODEL:
MCAT BY *z_vr z_ps z_ws z_bs;
MCAT@1;
MODEL LOW:
!Constrain factor loadings to equality across groups by commenting out factor loadings
!MCAT BY *z_vr z_ps z_ws z_bs;
!Latent variable variance still fixed at 1 in referent group
MCAT@1;
!Freely estimate means/intercepts of specific abilities in both groups
[z_vr z_ps z_ws z_bs];
!Latent variable mean fixed at 0 for identification in both groups
[MCAT@0];
!Freely estimate residual variances in both groups
z_vr z_ps z_ws z_bs;
MODEL HIGH:
!MCAT BY *z_vr z_ps z_ws z_bs;
!Latent variable variance freely estimated in comparison group
MCAT*;
!Freely estimate means/intercepts of specific abilities in both groups
[z_vr z_ps z_ws z_bs];
!Latent variable mean fixed at 0 for identification in both groups
[MCAT@0];
!Freely estimate residual variances in both groups
z_vr z_ps z_ws z_bs;
OUTPUT: SAMPSTAT STDYX CINTERVAL RESIDUAL TECH1;
PLOT:
SERIES = z_vr z_ps z_ws z_bs(*);
TYPE IS PLOT3;
TITLE: Model 3 – Metric + Uniqueness Invariance
NB. All DATA, VARIABLE, and ANALYSIS commands are carried over from Model 1
MODEL:
MCAT BY *z_vr z_ps z_ws z_bs;
MCAT@1;
MODEL LOW:
!MCAT BY *z_vr z_ps z_ws z_bs;
MCAT@1;
[z_vr z_ps z_ws z_bs];
[MCAT@0];
!Constrain residual variances to equality across groups by assigning an arbitrary label to each
!uniqueness estimate
z_vr z_ps z_ws z_bs (v1-v4);
MODEL HIGH:
MCAT*;
[z_vr z_ps z_ws z_bs];
[MCAT@0];
!As the uniqueness estimates share the same labels, v1-v4, across groups, they are constrained to
!equality
z_vr z_ps z_ws z_bs (v1-v4);
OUTPUT: SAMPSTAT STDYX CINTERVAL RESIDUAL TECH1;
PLOT:
SERIES = z_vr z_ps z_ws z_bs(*);
TYPE IS PLOT3;
TITLE: Model 3b – Uniqueness Invariance
NB. All DATA, VARIABLE, and ANALYSIS commands are carried over from Model 1
MODEL:
MCAT BY *z_vr z_ps z_ws z_bs;
MCAT@1;
MODEL LOW:
!Group-specific factor loadings restored (in comparison to Model 3)
MCAT BY *z_vr z_ps z_ws z_bs;
MCAT@1;
[z_vr z_ps z_ws z_bs];
[MCAT@0];
!Constrain residual variances to equality across groups by assigning an arbitrary label to each
!uniqueness estimate
z_vr z_ps z_ws z_bs (v1-v4);
MODEL HIGH:
MCAT BY *z_vr z_ps z_ws z_bs;
!Latent factor variance fixed to 1 for identification purposes
MCAT@1;
[z_vr z_ps z_ws z_bs];
[MCAT@0];
!As the uniqueness estimates share the same labels, v1-v4, across groups, they are constrained to
!equality
z_vr z_ps z_ws z_bs (v1-v4);
OUTPUT: SAMPSTAT STDYX CINTERVAL RESIDUAL TECH1;
PLOT:
SERIES = z_vr z_ps z_ws z_bs(*);
TYPE IS PLOT3;
TITLE: Model 4 – Metric + Factor Variance Invariance
NB. All DATA, VARIABLE, and ANALYSIS commands are carried over from Model 1
MODEL:
MCAT BY *z_vr z_ps z_ws z_bs;
MCAT@1;
MODEL LOW:
!MCAT BY *z_vr z_ps z_ws z_bs;
MCAT@1;
[z_vr z_ps z_ws z_bs];
[MCAT@0];
!Freely estimate uniquenesses/residual variances
z_vr z_ps z_ws z_bs;
MODEL HIGH:
!Fix latent factor variance at 1, in conjunction with constraining factor loadings to equality (as
!they are not specified here)
MCAT@1;
[z_vr z_ps z_ws z_bs];
[MCAT@0];
!Freely estimate uniquenesses/residual variances
z_vr z_ps z_ws z_bs;
OUTPUT: SAMPSTAT STDYX CINTERVAL RESIDUAL TECH1;
PLOT:
SERIES = z_vr z_ps z_ws z_bs(*);
TYPE IS PLOT3;
TITLE: Model 4b – Factor Variance Invariance
NB. All DATA, VARIABLE, and ANALYSIS commands are carried over from Model 1
MODEL:
!Switch identification to marker variable approach; by default factor loading for first item is
!fixed at 1
MCAT BY z_vr z_ps z_ws z_bs;
!Freely estimate latent factor variance
MCAT*;
MODEL LOW:
!Factor loadings for items other than marker variable (see above) freely estimated across groups
MCAT BY z_ps z_ws z_bs;
!Latent factor variance estimated, but constrained to equality across groups as label, lv1, is the
!same in each group
MCAT (lv1);
[z_vr z_ps z_ws z_bs];
[MCAT@0];
z_vr z_ps z_ws z_bs;
MODEL HIGH:
!Factor loadings for items other than marker variable (see above) freely estimated across groups
MCAT BY z_ps z_ws z_bs;
!Latent factor variance estimated, but constrained to equality across groups as label, lv1, is the
!same in each group
MCAT (lv1);
[z_vr z_ps z_ws z_bs];
[MCAT@0];
z_vr z_ps z_ws z_bs;
OUTPUT: SAMPSTAT STDYX CINTERVAL RESIDUAL TECH1;
PLOT:
SERIES = z_vr z_ps z_ws z_bs(*);
TYPE IS PLOT3;
TITLE: Model 5 – Metric + Uniqueness + Factor Variance Invariance
NB. All DATA, VARIABLE, and ANALYSIS commands are carried over from Model 1
MODEL:
MCAT BY *z_vr z_ps z_ws z_bs;
MCAT@1;
MODEL LOW:
!Combine constraints implemented in Models 2, 3, and 4
!Constrain factor loadings to equality across groups by commenting out factor loadings
!MCAT BY *z_vr z_ps z_ws z_bs;
!Latent variable variance still fixed at 1 in referent group
MCAT@1;
[z_vr z_ps z_ws z_bs];
[MCAT@0];
!Constrain residual variances to equality across groups by assigning an arbitrary label to each
!uniqueness estimate
z_vr z_ps z_ws z_bs (v1-v4);
MODEL HIGH:
!Fix latent factor variance at 1, in conjunction with constraining factor loadings to equality (as
!they are not specified here)
MCAT@1;
[z_vr z_ps z_ws z_bs];
[MCAT@0];
z_vr z_ps z_ws z_bs (v1-v4);
OUTPUT: SAMPSTAT STDYX CINTERVAL RESIDUAL TECH1;
PLOT:
SERIES = z_vr z_ps z_ws z_bs(*);
TYPE IS PLOT3;
References
Bauer, D. J. (2016). A more general model for testing measurement invariance and differential item functioning. Psychological Methods, 22, 507-526.
Kass, R. E., & Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90, 773-795.
Muthén, L. K., & Muthén, B. O. (2012). Mplus User’s Guide (7th ed.). Los Angeles, CA: Muthén & Muthén.
Muthén, L. K., & Muthén, B. O. (2015). Mplus 7.4 [Computer program]. Los Angeles, CA: Muthén & Muthén.
Satorra, A., & Bentler, P. M. (1994). Corrections to test statistics and standard errors in covariance structure analysis. In A. von Eye & C. C. Clogg (Eds.), Latent variables analysis: Applications for developmental research (pp. 399-419). Thousand Oaks, CA: Sage.
Satorra, A., & Bentler, P. M. (2001). A scaled difference chi-square test statistic for moment structure analysis. Psychometrika, 66, 507-514.
Tucker-Drob, E. M. (2009). Differentiation of cognitive abilities across the life span. Developmental Psychology, 45, 1097-1118.