MCAT DIFFERENTIATION – SUPPLEMENTAL MATERIAL

Online Supplemental Material

Differentiation of Cognitive Abilities and the Medical College Admission Test



McLarnon, M. J. W., Goffin, R. D., & Rothstein, M. G. (2017), Personality and Individual Differences

http://doi.org/10.1016/j.paid.2017.11.005



Contents

  1. Table S1 – Full sample correlation matrix of study variables

  2. Table S2 – Low-g group correlation matrix

  3. Table S3 – High-g group correlation matrix

  4. Figure S1 – Model to assess Tenet 2 via differential correlations between the MCAT’s g-factor, its subtests and GPA

  5. Figure S2 – Model to assess Tenet 2 via equivalence of GPA’s residual variance across ability groups

  6. Summary of moderated factor analysis

  7. Table S4 – Linear and moderated factor analysis fit statistics

  8. Table S5 – Parameter estimates from linear and moderated factor models

  9. Moderated factor model Mplus syntax

  10. Extreme group syntax

  11. References used in Online Supplemental Material



Table S1

Full Sample Correlation Matrix of Study Variables


Variable                  Mean     SD      1.     2.     3.     4.     5.
1. Verbal Reasoning       8.85     1.95    --
2. Physical Science       9.81     1.92    .38    --
3. Writing                3.90     .93     .23    .15    --
4. Biological Science     10.12    1.75    .46    .65    .17    --
5. Grade Point Average    3.50     .38     .22    .44    .15    .44    --

Note. n = 7,498. Raw score descriptives for MCAT subtests are presented; scores were standardized prior to differentiating low- and high-g groups. All correlations significant at p < .001.





Table S2

Low-g Group Correlation Matrix


Variable                  Mean     SD      1.     2.     3.     4.     5.
1. Verbal Reasoning       7.31     1.82    --
2. Physical Science       8.20     1.50    .07    --
3. Writing                3.33     .93     -.04   -.21   --
4. Biological Science     8.59     1.50    .19    .48    -.17   --
5. Grade Point Average    3.34     .43     .10    .37    .02    .36    --

Note. n = 2,421. Raw score descriptives for MCAT subtests presented. All correlations significant at p < .001 except those that are underlined.





Table S3

High-g Group Correlation Matrix


Variable                  Mean     SD      Cohen’s d   1.     2.     3.     4.     5.
1. Verbal Reasoning       10.20    1.38    1.79        --
2. Physical Science       11.40    1.43    2.18        -.02   --
3. Writing                4.42     .71     1.32        -.08   -.21   --
4. Biological Science     11.62    1.21    2.22        .03    .27    -.26   --
5. Grade Point Average    3.66     .27     .89         -.01   .22    -.05   .22    --

Note. n = 2,425. Raw score descriptives for MCAT subtests presented. Cohen’s d compares mean differences in each variable across low- and high-g groups. All mean differences significant at p < .001. All correlations significant at p < .001 except those that are underlined.
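
As a check on the Cohen’s d values in Table S3, the following Python sketch recomputes each effect size from the group means, standard deviations, and sample sizes reported in Tables S2 and S3. It assumes the pooled-standard-deviation form of d, which the table note does not state explicitly; the function name cohens_d and the dictionary layout are illustrative only.

```python
import math

def cohens_d(mean_low, sd_low, n_low, mean_high, sd_high, n_high):
    """Standardized mean difference using the pooled SD (assumed formula)."""
    pooled_var = ((n_low - 1) * sd_low ** 2 + (n_high - 1) * sd_high ** 2) / (
        n_low + n_high - 2
    )
    return (mean_high - mean_low) / math.sqrt(pooled_var)

# Group sizes from the notes to Tables S2 and S3
N_LOW, N_HIGH = 2421, 2425

# (low-g mean, low-g SD, high-g mean, high-g SD), copied from Tables S2 and S3
descriptives = {
    "Verbal Reasoning": (7.31, 1.82, 10.20, 1.38),
    "Physical Science": (8.20, 1.50, 11.40, 1.43),
    "Writing": (3.33, 0.93, 4.42, 0.71),
    "Biological Science": (8.59, 1.50, 11.62, 1.21),
    "Grade Point Average": (3.34, 0.43, 3.66, 0.27),
}

d_values = {
    name: round(cohens_d(m_lo, s_lo, N_LOW, m_hi, s_hi, N_HIGH), 2)
    for name, (m_lo, s_lo, m_hi, s_hi) in descriptives.items()
}

for name, d in d_values.items():
    print(f"{name}: d = {d:.2f}")
```

Under that assumption the sketch gives 1.79, 2.18, 1.32, 2.22, and 0.89, which agree with the tabled values to two decimals.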





Figure S1. The first statistical model used to assess Tenet 2 via differential correlations between the MCAT’s g-factor, its subtests and GPA. Correlations with GPA were assessed sequentially, in that each was tested in a separate model. VR = verbal reasoning, BS = biological science, PS = physical science, WRIT = writing. Parameters that were tested for equivalence across low- and high-g groups are given by k subscripts.



Figure S2. The second statistical model used to assess Tenet 2 via the estimates of R² for GPA, or more specifically, equivalence of GPA’s residual variance across ability groups. VR = verbal reasoning, BS = biological science, PS = physical science, WRIT = writing. Parameters that were tested for equivalence across low- and high-g groups are given by k subscripts.



Summary of Moderated Factor Model


Our use of the moderated factor analysis model (i.e., a non-linear structural equation model) was informed by the studies of Tucker-Drob (2009) and Bauer (2016). In a two-stage testing procedure, we first analyzed a linear model with the same technical specifications as those required by the moderated factor model. In contrast to a typical single-group linear factor model, this required specifying random slopes (i.e., random factor loadings; in Mplus, this is invoked by TYPE IS RANDOM;) and a numerical integration algorithm (ALGORITHM IS INTEGRATION;), used in conjunction with a robust maximum likelihood estimator (ESTIMATOR IS MLR;, which was also implemented in the analyses reported in the manuscript). The default options for standard (trapezoidal) numerical integration were used (i.e., 15 integration points, adaptive quadrature, and an accelerated expectation-maximization algorithm; Muthén & Muthén, 2012). As in the analyses reported in the manuscript, Mplus 7.4 (Muthén & Muthén, 2015) was used. Full syntax for the moderated model and the extreme group models is available in a later section of this Online Supplemental Material.


Including the random-slope and numerical integration settings precludes the typical model fit indices (χ², comparative fit index [CFI], and root mean square error of approximation [RMSEA]). Comparisons between the linear and moderated models were therefore based on the difference in loglikelihood values (the scaled difference is distributed as χ², with degrees of freedom equal to the difference in the number of free parameters) and on differences in the information criteria (i.e., Akaike Information Criterion [AIC], Bayesian Information Criterion [BIC], and sample-size adjusted BIC [aBIC]), where lower values indicate better fit. We do note, however, that because this baseline model is still a linear factor model, the random-slope and numerical integration specifications can be turned off to obtain the same loglikelihood, information criteria, and parameter estimates, along with the typical estimates of model fit. We present these only for the sake of completeness: χ²(2) = 152.537, p < .001, χ² scaling correction factor = 1.0249, CFI = .975, RMSEA = .102 (90% CI = .089 - .116), SRMR = .031. We place emphasis on the loglikelihood and information criteria presented in Table S4 to enable a comparison to the moderated model. The moderated factor model converged without error and gave parameter estimates (for 16 parameters) that were all within bounds (i.e., no negative residual variances or Heywood cases).


Table S4

Model Fit Statistics



             Linear        Moderated
#fp          12            16
LL           -37985.46     -37887.29
LLc          1.05          1.04
AIC          75994.92      75806.58
BIC          76077.55      75916.75
aBIC         76039.42      75865.91
ΔAIC                       -188.34
ΔBIC                       -160.80
ΔaBIC                      -173.51
Δχ²(4)                     193.78, p < .001

Note. #fp = number of free parameters, LL = loglikelihood, LLc = LL scaling correction factor, AIC = Akaike Information Criterion, BIC = Bayesian Information Criterion, aBIC = sample-size adjusted BIC, Δ estimates = differences between respective columns (moderated minus linear), Δχ²(4) = χ² difference test with 4 degrees of freedom, based on the Satorra and Bentler (1994, 2001) nested model comparison.
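
The AIC rows of Table S4 follow from the loglikelihood and the number of free parameters (AIC = -2 × LL + 2 × #fp). The Python sketch below illustrates that arithmetic and the Δ rows using only values printed in the table; the function and variable names are illustrative. BIC and aBIC also depend on the sample size used in estimation, so they are taken from the table rather than recomputed.

```python
def aic(loglikelihood, n_free_params):
    """Akaike Information Criterion: -2 * LL + 2 * (number of free parameters)."""
    return -2.0 * loglikelihood + 2.0 * n_free_params

# Values copied from Table S4
linear = {"LL": -37985.46, "fp": 12, "BIC": 76077.55, "aBIC": 76039.42}
moderated = {"LL": -37887.29, "fp": 16, "BIC": 75916.75, "aBIC": 75865.91}

aic_linear = round(aic(linear["LL"], linear["fp"]), 2)
aic_moderated = round(aic(moderated["LL"], moderated["fp"]), 2)

# Differences are moderated minus linear; negative values favor the moderated model
delta_aic = round(aic_moderated - aic_linear, 2)
delta_bic = round(moderated["BIC"] - linear["BIC"], 2)
delta_abic = round(moderated["aBIC"] - linear["aBIC"], 2)

print(f"AIC linear = {aic_linear:.2f}, AIC moderated = {aic_moderated:.2f}")
print(f"dAIC = {delta_aic:.2f}, dBIC = {delta_bic:.2f}, daBIC = {delta_abic:.2f}")
```

The recomputed AIC values (75994.92 and 75806.58) and the three differences (-188.34, -160.80, -173.51) match the table.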



First, we note that the moderated model’s information criteria were all lower than those from the linear model, suggesting better fit. In particular, ΔAIC = -188.341, ΔBIC = -160.798, and ΔaBIC = -173.509. Kass and Raftery (1995) have suggested that an absolute ΔBIC greater than 10 is strong evidence in favor of the model with the lower estimate. The nested model comparison based on the loglikelihood estimates, in conjunction with the scaling correction factors associated with robust maximum likelihood estimation (Satorra & Bentler, 1994, 2001), also revealed a significant improvement in fit for the moderated model over the linear model: Δχ²(4) = 193.784, p < .001. Thus, the moderated model fits the data significantly better than the linear model.
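
The scaled loglikelihood difference test can be sketched in Python as follows. The sketch uses the standard formula for MLR loglikelihoods, cd = (p0 × c0 - p1 × c1) / (p0 - p1) and TRd = -2 × (L0 - L1) / cd, where the linear model is the nested model (0) and the moderated model is the comparison model (1). The inputs are the rounded values from Table S4, and the function names are illustrative. The p value uses a closed-form χ² tail probability that holds only for even degrees of freedom.

```python
import math

def scaled_ll_difference(ll0, c0, p0, ll1, c1, p1):
    """Scaled chi-square difference from two MLR loglikelihoods.

    Model 0 is the nested (more restricted) model, model 1 the comparison model.
    """
    cd = (p0 * c0 - p1 * c1) / (p0 - p1)  # difference-test scaling correction
    trd = -2.0 * (ll0 - ll1) / cd         # scaled chi-square difference
    return trd, p1 - p0

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form is valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total

# Rounded values from Table S4 (linear = nested model, moderated = comparison model)
trd, df = scaled_ll_difference(-37985.46, 1.05, 12, -37887.29, 1.04, 16)
p_value = chi2_sf_even_df(trd, df)

print(f"Scaled chi-square difference({df}) = {trd:.2f}, p < .001: {p_value < 0.001}")
```

With the two-decimal scaling correction factors shown in Table S4, this gives a statistic of roughly 194.4 on 4 degrees of freedom. The small gap from the reported 193.784 is presumably due to rounding of the LLc values, and the conclusion (p < .001) is unchanged.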

Next, Table S5 presents the parameter estimates stemming from both the linear and moderated factor models.



Table S5

Parameter Estimates (and 99% Confidence Intervals) From Full-Sample Linear and Moderated Factor Models

MCAT subtests           υ                   λ1                 λ2                    σ²
Linear Model
  Verbal Reasoning      .00 (-.03 - .03)    .53 (.50 - .56)    --                    .72 (.69 - .75)
  Physical Sciences     .00 (-.03 - .03)    .75 (.72 - .77)    --                    .45 (.41 - .48)
  Biological Sciences   .00 (-.03 - .03)    .87 (.85 - .89)    --                    .25 (.21 - .29)
  Writing               .00 (-.03 - .03)    .22 (.19 - .25)    --                    .95 (.94 - .97)
Moderated Model
  Verbal Reasoning      .08 (.04 - .11)     .53 (.49 - .56)    -.08 (-.10 - -.05)    .71 (.67 - .74)
  Physical Sciences     .01 (-.03 - .05)    .76 (.73 - .79)    -.01 (-.03 - .01)     .42 (.38 - .45)
  Biological Sciences   .06 (.03 - .10)     .84 (.81 - .88)    -.06 (-.08 - -.04)    .28 (.24 - .32)
  Writing               .04 (.01 - .08)     .22 (.18 - .25)    -.04 (-.06 - -.01)    .95 (.91 - .99)

Note. υ = intercept; λ1 = linear factor loadings; λ2 = moderated/non-linear factor loadings; σ² = residual variance. Bolded table entries reflect parameters that have 99% CIs that exclude zero.



We make several observations based on these parameter estimates. First, the factor loadings representing the linear relations are remarkably similar between the linear and moderated models. The residual variances show a similarly consistent pattern across the two models. Perhaps most interesting is that the moderated factor loadings for the verbal reasoning, biological sciences, and writing subtests of the MCAT were significant (the 99% CIs excluded zero). The negative values can be interpreted as the (linear) factor loadings decreasing as the level of the g-factor increases (see Tucker-Drob, 2009). Physical science, in contrast, did not demonstrate a significant moderated factor loading, suggesting that its positive relation to g is consistent across the levels of g. In sum, these findings are in alignment with the general propositions of cognitive ability differentiation and the results reported in the manuscript. Thus, regardless of whether multi-group or moderated factor model approaches are used, the findings converge on a similar pattern: there is evidence to support Tenet 1 of cognitive ability differentiation.
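
To make the direction of these effects concrete, the Python sketch below evaluates the loading implied for each subtest at low, average, and high levels of g, using the moderated-model point estimates from Table S5. It assumes that a subtest's loading at factor level g can be written as λ1 + λ2 × g, which corresponds to the model z = υ + λ1 × g + λ2 × g² + e implied by regressing each subtest on the g × g interaction. That parameterization, and the function and variable names, are assumptions made for illustration. Because the factor variance is fixed at 1, g = -1 and g = +1 are one standard deviation below and above the factor mean.

```python
# (lambda1, lambda2) point estimates for the moderated model, copied from Table S5
estimates = {
    "Verbal Reasoning": (0.53, -0.08),
    "Physical Sciences": (0.76, -0.01),
    "Biological Sciences": (0.84, -0.06),
    "Writing": (0.22, -0.04),
}

def implied_loading(lambda1, lambda2, g):
    """Loading at factor level g under the assumed form lambda1 + lambda2 * g."""
    return lambda1 + lambda2 * g

# Factor levels in SD units (the latent variance is fixed at 1)
levels = (-1.0, 0.0, 1.0)

implied = {
    name: [round(implied_loading(l1, l2, g), 2) for g in levels]
    for name, (l1, l2) in estimates.items()
}

for name, loadings in implied.items():
    low, mid, high = loadings
    print(f"{name}: g=-1 -> {low:.2f}, g=0 -> {mid:.2f}, g=+1 -> {high:.2f}")
```

Under this assumption, the verbal reasoning loading falls from about .61 at one standard deviation below the mean of g to about .45 at one standard deviation above it, whereas the physical science loading changes only from about .77 to .75.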



Mplus Syntax for Moderated Factor Analysis


TITLE: Moderated Factor Model


!DATA: identifies the data file

!Notes can be added to the syntax file by using an ! at the beginning of the line

DATA: FILE IS "data.csv";


!VARIABLE: defines specifics to the variables in the dataset

VARIABLE:

!NAMES ARE defines each column in dataset

NAMES ARE ID YEAR GPA mcatvr mcatps mcatws mcatbs

z_vr z_ps z_ws z_bs MCAT_grp ;

!VR=verbal reasoning, PS=physical science, WS=writing, BS=biological science

!z_vr etc.=z-scored variables

!MCAT_grp 0 = LOW, 1 = MIDDLE, 2 = HIGH;

!MISSING = defines the values that represent missing data

MISSING = .;

!USEOBSERVATIONS selects participants from a particular subgroup (i.e., an extreme group)

!USEOBSERVATIONS IS ;

!USEVARIABLES specifies the variables, defined in the NAMES command to use for analysis

USEVARIABLES = z_vr z_ps z_ws z_bs ;

!IDVAR defines which variable of the dataset contains participant ID numbers

IDVAR = ID;

!GROUPING specifies a multi-group model, as used in the extreme group approach

!GROUPING IS MCAT_grp (0 = LOW 2 = HIGH);


ANALYSIS:

!Use maximum likelihood estimator with robust standard errors

ESTIMATOR IS MLR;

!Specify random slopes (i.e., factor loadings)

TYPE IS RANDOM;

!Model requires numerical integration; default options used for standard/trapezoidal integration

!with 15 integration points, adaptive quadrature, and accelerated expectation-maximization

!algorithm; see Mplus User Guide

ALGORITHM IS INTEGRATION;


!MODEL: is used to specify the model to be analyzed

MODEL:

!Single-factor model, no correlated residuals

!All factor loadings freely estimated with *

!Variance of latent variable fixed at 1 for identification purposes using @

!These factor loadings represent the linear loadings

MCAT BY *z_vr z_ps z_ws z_bs;

MCAT@1;


!Define nonlinear/moderated factor, gxg

gxg | MCAT XWITH MCAT;


!Regress specific abilities on moderated factor, gxg

!Represents moderated factor loadings

z_vr ON gxg;
z_ps ON gxg;
z_ws ON gxg;
z_bs ON gxg;


!OUTPUT request additional output

!SAMPSTAT=sample statistics

!STDYX=standardized parameters

!CINTERVAL=confidence intervals for all parameters

!RESIDUAL=model residual estimates

!TECH1=parameter specification matrices

OUTPUT: SAMPSTAT STDYX CINTERVAL RESIDUAL TECH1;


!PLOT provides various plotting/graphing features

PLOT:

SERIES = z_vr z_ps z_ws z_bs(*);
TYPE IS PLOT3;



Mplus Syntax for Extreme-Group Approach


TITLE: Model 1 – Configural Invariance

!Refer to Moderated Factor Model for syntax details

DATA: FILE IS "data.csv";

VARIABLE:

NAMES ARE ID YEAR GPA mcatvr mcatps mcatws mcatbs

z_vr z_ps z_ws z_bs MCAT_grp ;

MISSING = .;

!USEOBSERVATIONS used in preliminary analyses to examine factor model in a single group

!With this syntax activated, only those cases in the Low group would be used for analysis

!USEOBSERVATIONS IS MCAT_grp == 0;

USEVARIABLES = z_vr z_ps z_ws z_bs ;

IDVAR = ID;

GROUPING IS MCAT_grp (0 = LOW 2 = HIGH);


ANALYSIS:

ESTIMATOR IS MLR;


MODEL:

!Single-factor model, no correlated residuals

!All factor loadings freely estimated with *

!Variance of latent variable fixed at 1 for identification purposes using @

MCAT BY *z_vr z_ps z_ws z_bs;

MCAT@1;


!Separate factor models are specified for each ability group

MODEL LOW:
!Freely estimate factor loadings in both groups

MCAT BY *z_vr z_ps z_ws z_bs;
!Latent variable variance fixed at 1 in both groups

MCAT@1;
!Freely estimate means/intercepts of specific abilities in both groups
[z_vr z_ps z_ws z_bs];

!Latent variable mean fixed at 0 for identification in both groups
[MCAT@0];
!Freely estimate residual variances in both groups
z_vr z_ps z_ws z_bs;

MODEL HIGH:
!Freely estimate factor loadings in both groups

MCAT BY *z_vr z_ps z_ws z_bs;
!Latent variable variance fixed at 1 in both groups

MCAT@1;
!Freely estimate means/intercepts of specific abilities in both groups
[z_vr z_ps z_ws z_bs];
!Latent variable mean fixed at 0 for identification in both groups

[MCAT@0];
!Freely estimate residual variances in both groups
z_vr z_ps z_ws z_bs;


OUTPUT: SAMPSTAT STDYX CINTERVAL RESIDUAL TECH1;


PLOT:

SERIES = z_vr z_ps z_ws z_bs(*);
TYPE IS PLOT3;





TITLE: Model 2 – Metric/Factor Loading Invariance

!NB. All DATA, VARIABLE, and ANALYSIS commands are carried over from Model 1


MODEL:

MCAT BY *z_vr z_ps z_ws z_bs;

MCAT@1;


MODEL LOW:
!Constrain factor loadings to equality across groups by commenting out factor loadings

!MCAT BY *z_vr z_ps z_ws z_bs;
!Latent variable variance still fixed at 1 in referent group

MCAT@1;
!Freely estimate means/intercepts of specific abilities in both groups
[z_vr z_ps z_ws z_bs];

!Latent variable mean fixed at 0 for identification in both groups
[MCAT@0];
!Freely estimate residual variances in both groups
z_vr z_ps z_ws z_bs;

MODEL HIGH:

!MCAT BY *z_vr z_ps z_ws z_bs;
!Latent variable variance freely estimated in comparison group

MCAT*;
!Freely estimate means/intercepts of specific abilities in both groups
[z_vr z_ps z_ws z_bs];
!Latent variable mean fixed at 0 for identification in both groups

[MCAT@0];
!Freely estimate residual variances in both groups
z_vr z_ps z_ws z_bs;


OUTPUT: SAMPSTAT STDYX CINTERVAL RESIDUAL TECH1;


PLOT:

SERIES = z_vr z_ps z_ws z_bs(*);
TYPE IS PLOT3;





TITLE: Model 3 – Metric + Uniqueness Invariance

!NB. All DATA, VARIABLE, and ANALYSIS commands are carried over from Model 1


MODEL:

MCAT BY *z_vr z_ps z_ws z_bs;

MCAT@1;


MODEL LOW:

!MCAT BY *z_vr z_ps z_ws z_bs;

MCAT@1;
[z_vr z_ps z_ws z_bs];

[MCAT@0];
!Constrain residual variances to equality across groups by assigning an arbitrary label to each

!uniqueness estimate
z_vr z_ps z_ws z_bs (v1-v4);

MODEL HIGH:

MCAT*;
[z_vr z_ps z_ws z_bs];

[MCAT@0];
!As the uniqueness estimates share the same labels, v1-v4, across groups, they are constrained to

!equality
z_vr z_ps z_ws z_bs (v1-v4);


OUTPUT: SAMPSTAT STDYX CINTERVAL RESIDUAL TECH1;


PLOT:

SERIES = z_vr z_ps z_ws z_bs(*);
TYPE IS PLOT3;





TITLE: Model 3b – Uniqueness Invariance

!NB. All DATA, VARIABLE, and ANALYSIS commands are carried over from Model 1


MODEL:

MCAT BY *z_vr z_ps z_ws z_bs;

MCAT@1;


MODEL LOW:

!Group-specific factor loadings restored (in comparison to Model 3)

MCAT BY *z_vr z_ps z_ws z_bs;

MCAT@1;
[z_vr z_ps z_ws z_bs];

[MCAT@0];
!Constrain residual variances to equality across groups by assigning an arbitrary label to each

!uniqueness estimate
z_vr z_ps z_ws z_bs (v1-v4);

MODEL HIGH:

MCAT BY *z_vr z_ps z_ws z_bs;

!Latent factor variance fixed to 1 for identification purposes

MCAT@1;
[z_vr z_ps z_ws z_bs];

[MCAT@0];
!As the uniqueness estimates share the same labels, v1-v4, across groups, they are constrained to

!equality
z_vr z_ps z_ws z_bs (v1-v4);

OUTPUT: SAMPSTAT STDYX CINTERVAL RESIDUAL TECH1;


PLOT:

SERIES = z_vr z_ps z_ws z_bs(*);
TYPE IS PLOT3;




TITLE: Model 4 – Metric + Factor Variance Invariance

!NB. All DATA, VARIABLE, and ANALYSIS commands are carried over from Model 1


MODEL:

MCAT BY *z_vr z_ps z_ws z_bs;

MCAT@1;


MODEL LOW:

!MCAT BY *z_vr z_ps z_ws z_bs;

MCAT@1;
[z_vr z_ps z_ws z_bs];

[MCAT@0];
!Freely estimate uniquenesses/residual variances
z_vr z_ps z_ws z_bs;

MODEL HIGH:

!Fix latent factor variance at 1, in conjunction with constraining factor loadings to equality (as

!they are not specified here)

MCAT@1;
[z_vr z_ps z_ws z_bs];

[MCAT@0];
!Freely estimate uniquenesses/residual variances
z_vr z_ps z_ws z_bs;


OUTPUT: SAMPSTAT STDYX CINTERVAL RESIDUAL TECH1;


PLOT:

SERIES = z_vr z_ps z_ws z_bs(*);
TYPE IS PLOT3;




TITLE: Model 4b – Factor Variance Invariance

!NB. All DATA, VARIABLE, and ANALYSIS commands are carried over from Model 1


MODEL:

!Switch identification to marker variable approach; by default factor loading for first item is

!fixed at 1

MCAT BY z_vr z_ps z_ws z_bs;

!Freely estimate latent factor variance

MCAT*;


MODEL LOW:

!Factor loadings for items other than marker variable (see above) freely estimated across groups

MCAT BY z_ps z_ws z_bs;

!Latent factor variance estimated, but constrained to equality across groups as label, lv1, is the

!same in each group

MCAT (lv1);


[z_vr z_ps z_ws z_bs];

[MCAT@0];
z_vr z_ps z_ws z_bs;

MODEL HIGH:

!Factor loadings for items other than marker variable (see above) freely estimated across groups

MCAT BY z_ps z_ws z_bs;

!Latent factor variance estimated, but constrained to equality across groups as label, lv1, is the

!same in each group

MCAT (lv1);

[z_vr z_ps z_ws z_bs];

[MCAT@0];
z_vr z_ps z_ws z_bs;


OUTPUT: SAMPSTAT STDYX CINTERVAL RESIDUAL TECH1;


PLOT:

SERIES = z_vr z_ps z_ws z_bs(*);
TYPE IS PLOT3;






TITLE: Model 5 – Metric + Uniqueness + Factor Variance Invariance

!NB. All DATA, VARIABLE, and ANALYSIS commands are carried over from Model 1


MODEL:

MCAT BY *z_vr z_ps z_ws z_bs;

MCAT@1;


MODEL LOW:
!Combine constraints implemented in Models 2, 3, and 4

!Constrain factor loadings to equality across groups by commenting out factor loadings

!MCAT BY *z_vr z_ps z_ws z_bs;
!Latent variable variance still fixed at 1 in referent group

MCAT@1;
[z_vr z_ps z_ws z_bs];

[MCAT@0];
!Constrain residual variances to equality across groups by assigning an arbitrary label to each

!uniqueness estimate
z_vr z_ps z_ws z_bs (v1-v4);

MODEL HIGH:

!Fix latent factor variance at 1, in conjunction with constraining factor loadings to equality (as

!they are not specified here)

MCAT@1;

[z_vr z_ps z_ws z_bs];

[MCAT@0];


z_vr z_ps z_ws z_bs (v1-v4);


OUTPUT: SAMPSTAT STDYX CINTERVAL RESIDUAL TECH1;


PLOT:

SERIES = z_vr z_ps z_ws z_bs(*);
TYPE IS PLOT3;



References

Bauer, D. J. (2016). A more general model for testing measurement invariance and differential item functioning. Psychological Methods, 22, 507-526.

Kass, R. E., & Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90, 773-795.

Muthén, L. K., & Muthén, B. O. (2012). Mplus User’s Guide (7th ed.). Los Angeles, CA: Muthén & Muthén.

Muthén, L. K., & Muthén, B. O. (2015). Mplus 7.4 [Computer program]. Los Angeles, CA: Muthén & Muthén.

Satorra, A., & Bentler, P. M. (1994). Corrections to test statistics and standard errors in covariance structure analysis. In A. von Eye & C. C. Clogg (Eds.), Latent variables analysis: Applications for developmental research (pp. 399-419). Thousand Oaks, CA: Sage.

Satorra, A., & Bentler, P. M. (2001). A scaled difference chi-square test statistic for moment structure analysis. Psychometrika, 66, 507-514.

Tucker-Drob, E. M. (2009). Differentiation of cognitive abilities across the life span. Developmental Psychology, 45, 1097-1118.