A primer on the Mutualism theory of general intelligence

Published: 10/10/24, 14:19

Last updated: 08/07/25, 16:18

A new theory suggests intelligence develops through reciprocal interactions between abilities

Introduction

One of the most replicated findings in psychology is that when a sufficiently large and diverse battery of cognitive tests is administered to a representative sample of people, every test correlates positively with every other, a pattern known as the positive manifold. For a century, psychometricians have explained this by proposing the existence of g, a latent biological variable that links all abilities. Statistically, g is represented by the first factor extracted from the correlation matrix using factor analysis, a method that reduces the dimensionality of the data to a small number of factors, each capturing a cluster of covariance between tests.

Early critics of g pointed out that nothing about the statistical g-factor requires the existence of a real biological factor: if each subtest samples many uncorrelated mental processes, the overlap between the processes sampled by different subtests is sufficient to produce positive correlations. This account is known as sampling theory. While the strength of correlations between subtests does generally correspond to intuitive beliefs about the processes they share, this is not universally the case, and for this reason sampling theory has never seen widespread acceptance.

A new theory called mutualism has been proposed that explains the positive manifold without positing the existence of g. In mutualism, the growth of abilities is coupled: improvement in one domain causes growth in another, inducing correlations between abilities over time. The authors of the introductory paper demonstrated in a simulation that, when growth is coupled, the interaction between baseline ability, growth speed and limited developmental resources is sufficient to create a statistical general factor from abilities that are initially uncorrelated. This offers a novel explanation for why abilities like vocabulary, which are ‘inexpensive’ in terms of developmental resources, explain the most variance in other abilities.
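The coupled-growth idea can be made concrete with a toy simulation. The sketch below is written in the spirit of the introductory paper's simulation, not as a reproduction of it: each ability grows logistically towards a personal limit, and every other ability adds a small boost to that growth. All names and parameter values here (`simulate_mutualism`, `coupling=0.1`, the ranges for limits, growth rates and starting values) are illustrative assumptions.

```python
import numpy as np


def simulate_mutualism(n_people=1000, n_abilities=8, coupling=0.1,
                       dt=0.05, n_steps=4000, seed=0):
    """Euler-integrate coupled logistic growth for a sample of simulated people."""
    rng = np.random.default_rng(seed)
    shape = (n_people, n_abilities)
    # Hypothetical person-specific parameters, drawn independently per ability,
    # so nothing resembling a general factor is built into the starting point.
    limits = rng.uniform(2.0, 4.0, size=shape)   # developmental resources
    rates = rng.uniform(0.2, 0.6, size=shape)    # growth speed
    x = rng.uniform(0.01, 0.10, size=shape)      # baseline ability
    # Every ability weakly supports every other ability (zero on the diagonal).
    M = coupling * (np.ones((n_abilities, n_abilities)) - np.eye(n_abilities))
    for _ in range(n_steps):
        support = x @ M.T  # weighted sum of the other abilities, per person
        x = x + dt * rates * x * (1.0 - x / limits + support / limits)
    return limits, x


def off_diagonal(corr):
    """Return the correlations between different abilities only."""
    return corr[~np.eye(corr.shape[0], dtype=bool)]


limits, final = simulate_mutualism()
r_start = off_diagonal(np.corrcoef(limits, rowvar=False))
final_corr = np.corrcoef(final, rowvar=False)
r_final = off_diagonal(final_corr)
# Share of variance carried by the first eigenvalue: a rough stand-in for g.
first_share = np.linalg.eigvalsh(final_corr)[-1] / final_corr.shape[0]

print("largest |r| between growth limits:", np.abs(r_start).max())
print("smallest r between grown abilities:", r_final.min())
print("variance share of first component:", first_share)
```

The first eigenvalue of the correlation matrix is used here as a simple proxy for a general factor; a full factor analysis would be the closer analogue of what psychometricians do.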

Empirical evidence

In the field of intelligence, mutualism has been tested twice among neurotypical children in the lab and once in a naturalistic setting, using data from a gamified maths revision platform. A lone further study compared coupling in children with a language disorder and neurotypical children; however, methodological issues related to attrition preclude it from discussion here. All of these studies used latent change score modelling (LCSM) to compare competing models of how intelligence develops over time. LCSM is a form of structural equation modelling, in which researchers specify models of the proposed causal connections between variables and use model fit indices to quantify the discrepancy between what each model implies and what is observed in the real data.

Three parameters resembling those used in the introductory paper’s simulations were used to represent causal connections between variables. The change score is an ability’s wave 2 score minus its wave 1 score. The self-feedback parameter is the regression coefficient linking baseline ability to the change score of the same ability. The coupling parameter is the regression coefficient linking one ability at wave 1 to the change score of the other ability. Three models were compared: the g-factor model, defined by the absence of coupling and growth driven by change in the g-factor; the investment model, defined by coupling from matrix reasoning to vocabulary only; and the mutualism model, defined by bidirectional coupling.
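To make the three parameters tangible, here is a minimal sketch that uses observed scores and ordinary least squares. It is a deliberate simplification and not a latent change score model: a real LCSM estimates these paths on latent variables within a structural equation model and compares models by fit indices. The data are simulated, and every number and name (`TRUE`, `fit_change`, the sample size) is a hypothetical chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Hypothetical "true" parameters, used only to generate toy data.
TRUE = {
    "vocab": {"self_feedback": -0.30, "coupling": 0.15},   # matrix -> vocab
    "matrix": {"self_feedback": -0.25, "coupling": 0.10},  # vocab -> matrix
}

# Wave 1 scores: standardised and moderately correlated (r = 0.5).
cov = np.array([[1.0, 0.5], [0.5, 1.0]])
vocab_1, matrix_1 = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

# Wave 2 scores: baseline plus a change driven by self-feedback, coupling, noise.
vocab_2 = vocab_1 + (0.5
                     + TRUE["vocab"]["self_feedback"] * vocab_1
                     + TRUE["vocab"]["coupling"] * matrix_1
                     + rng.normal(0.0, 0.5, n))
matrix_2 = matrix_1 + (0.5
                       + TRUE["matrix"]["self_feedback"] * matrix_1
                       + TRUE["matrix"]["coupling"] * vocab_1
                       + rng.normal(0.0, 0.5, n))


def fit_change(own_w1, own_w2, other_w1):
    """Regress an ability's change score on both wave 1 scores."""
    change = own_w2 - own_w1  # change score: wave 2 minus wave 1
    X = np.column_stack([np.ones_like(own_w1), own_w1, other_w1])
    coef, *_ = np.linalg.lstsq(X, change, rcond=None)
    return {"intercept": coef[0], "self_feedback": coef[1], "coupling": coef[2]}


vocab_fit = fit_change(vocab_1, vocab_2, matrix_1)
matrix_fit = fit_change(matrix_1, matrix_2, vocab_1)
print("vocabulary change:", vocab_fit)
print("matrix reasoning change:", matrix_fit)
```

In this toy setup both coupling coefficients are non-zero, which is the pattern the mutualism model expects. Under the investment model only the path from matrix reasoning to vocabulary would be non-zero, and under the g-factor model as specified in these studies neither would be.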

Mutualism in the lab

The first two lab studies investigated coupling between vocabulary and matrix reasoning in samples of 14- to 25-year-olds and 6- to 8-year-olds respectively. The mutualism model showed the best fit in both studies, albeit less decisively in the three-wave younger sample, suggesting that the stronger model fit in the first study may have been an artefact of regression to the mean.

I think it’s problematic to interpret this as empirical support for mutualism, because of issues that follow from using only two abilities. A g-factor extracted from two abilities may reflect specific non-g variance shared between the tests as much as it reflects common variance caused by g. Adding to this ambiguity, the change scores of the two tests remained positively correlated after controlling for coupling and self-feedback effects, reflecting the influence of an unmodelled third variable, be that g or unmeasured coupling.

Another problematic feature of the studies is their specification of the g-factor model as being without coupling. This is despite the fact that no latent change score modelling study of childhood development has ruled out that g may develop in a coupled or partially coupled manner. Studies using the methodology to study cognitive ageing have shown that some abilities are coupled whereas others are not, suggesting that sampling only abilities that do show coupling may lead to a biased comparison.

Mutualism in the classroom

Mutualism showed a marginally better fit than the investment model in explaining the development of counting, addition, multiplication and division over three years in a study of 12,000 Dutch 6- to 10-year-olds using the revision platform Mathgarden. The change scores of each ability showed strong correlations after controlling for coupling and self-feedback effects. When considered alongside the good fit of the investment model, I believe this may reflect the standardised effect of the curriculum on the development of abilities, independent of coupling and baseline ability.

A finding from this study with negative implications for mutualism is that the number of games played was not associated with stronger coupling. This could mean that coupling is a passive mechanism of development with little environmental input, but it could equally reflect the sorting of high-ability students into a niche, combined with self-feedback effects of their baseline ability impeding coupling.

To observe the causal effect of effort on coupling after controlling for cognitive ageing and the tendency of high-ability people to train harder, a randomised controlled trial of cognitive training is needed.

Cognitive training

Unfortunately, no cognitive training study has used latent change score modelling, meaning coupling must be inferred from the presence of far transfer (gains on untrained abilities), rather than directly estimated.

COGITO’s youth sample resembled that of the first lab study to test mutualism, both in its age range and in its choice of fluid reasoning as a far transfer measure. Participants underwent 100 days of hour-long training sessions in working memory, processing speed and episodic memory.

The authors found no near or far transfer gains for working memory and processing speed, possibly indicating developmental limits on their improvement. However, moderate effect sizes were found for fluid reasoning and episodic memory. The study’s results are lacklustre and developmentally bound, but they offer an example of experimentally induced far transfer in a literature where it is a rarity, leaving open the possibility that the coupling effects observed in the lab studies were not mere passive effects of development.

In contrast to COGITO, which targeted young people at the tail end of their cognitive development, the Abecedarian Project started almost as soon as its subjects were born. Conceived as a pre-school intervention to improve the educational outcomes of African Americans in North Carolina, the Abecedarian Project consisted of an experimental group that received regular guided educational play aimed at building early language and a control group that received only nutritional supplementation. On entry to primary school, the experimental group showed a 7-point advantage in IQ, which persisted in diminished form at 4.4 IQ points at age 21.

In contrast to previous early-life interventions, cognitive training studies and studies on the cognitive outcomes of adoption, the gains here were domain-general rather than improvements on specific abilities. This provides causal evidence that interventions which are sufficiently early and target highly g-loaded abilities such as vocabulary can induce cascades of domain-general improvement, a finding in line with the predictions of mutualism.

It would be unfair to end this segment without mentioning perhaps the most standardised cognitive training regime there is: schooling. The causal effect of a year of schooling on IQ can be teased apart from the developmental effects of ageing by using a method called regression discontinuity analysis. In this method, the distance of a student’s birthday from the cutoff date separating two year groups is used as a predictor alongside school year in a multiple regression predicting IQ.
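The logic of the design can be sketched with simulated data. In the toy example below, children born within six months either side of a cutoff differ smoothly in age but abruptly in year group, so a regression with both predictors can separate the two effects. The effect sizes, the sample size and the variable names are all hypothetical, and the outcome is treated as a raw test score for simplicity.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

# Hypothetical effects used only to generate the toy data.
AGE_SLOPE = 2.0        # score points gained per year of age
SCHOOLING_GAIN = 3.0   # extra points from one additional year of schooling

# Distance of each birthday from the cutoff, in years. Zero or above means the
# child was old enough to be placed in the upper of the two year groups.
distance = rng.uniform(-0.5, 0.5, size=n)
upper_year = (distance >= 0).astype(float)

score = (100.0
         + AGE_SLOPE * distance
         + SCHOOLING_GAIN * upper_year
         + rng.normal(0.0, 10.0, n))

# Multiple regression: intercept, birthday distance (age), and year group.
X = np.column_stack([np.ones(n), distance, upper_year])
coef, *_ = np.linalg.lstsq(X, score, rcond=None)
intercept, age_effect, schooling_effect = coef

print("estimated age effect per year:", age_effect)
print("estimated effect of a year of schooling:", schooling_effect)
```

The coefficient on year group is the jump in scores at the cutoff after the smooth age trend has been accounted for. Real studies also have to deal with children who are held back or accelerated, which this sketch ignores.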

A recent paper reanalysing data from a study using this method found that the subtest gains from a year of schooling showed a moderate negative correlation with their g loading. As mutualism states that g develops through coupling, this would lend credence to the view that coupling effects are passive mechanisms of g’s development rather than being inseparable from experience.

Conclusion

I believe that it’s more accurate to say there is evidence for coupling effects than it is to say there is evidence for mutualism. There is convergent evidence that the environment has little causal role in coupling effects and their strength: from the effect of a year of schooling, from coupling effects not rising with the number of maths games played and from the COGITO intervention’s results. Opposing evidence comes from the Abecedarian Project; however, this is not an environmental stimulus to which most people will be exposed. Therefore, more weight should be placed on the effects of a year of schooling, because they are generalisable.

To reconcile this conflicting evidence, future authors should seek to replicate the COGITO intervention in an early adolescent identical twin sample with co-twin controls. This would allow researchers to observe coupling effects while executive functions are still in development and give them a more concrete understanding of the self-feedback parameter grounded in developmental cascades of gene expression. A more readily available alternative would be to apply latent change score modelling to the Abecedarian Project dataset.

I will end with a quote from a critic of mutualism, Gilles Gignac:

I conclude with the suggestion that belief in the plausibility of the g factor (or mutualism) may be impacted significantly by individual differences in personality, attitudes, and worldviews, rather than rely strictly upon logical and/or empirical evidence.

As the current evidence stands, this may be true, but with the availability of new developmental studies such as the Adolescent Brain Cognitive Development study and old ones like the Louisville Twin Study, there’s less of an excuse than ever.

Written by James Howarth

REFERENCES

Carroll, J. B. (1993). Human cognitive abilities: A survey of factor-analytic studies. Cambridge University Press.

Rindermann, H., Becker, D., & Coyle, T. R. (2020). Survey of expert opinion on intelligence: Intelligence research, experts’ background, controversial issues, and the media. Intelligence, 78, 101406. https://doi.org/10.1016/j.intell.2019.101406

Spearman, C. (1904). “General intelligence,” objectively determined and measured. The American Journal of Psychology, 15(2), 201. https://doi.org/10.2307/1412107

Thomson, G. H. (1916). A hierarchy without a general factor. British Journal of Psychology 1904-1920, 8(3), 271–281. https://doi.org/10.1111/j.2044-8295.1916.tb00133.x

Jensen, A. R. (1998). The g factor: The science of mental ability. Praeger Publishers/Greenwood Publishing Group.

Van Der Maas, H. L. J., Dolan, C. V., Grasman, R. P. P. P., Wicherts, J. M., Huizenga, H. M., & Raijmakers, M. E. J. (2006). A dynamical model of general intelligence: The positive manifold of intelligence by mutualism. Psychological Review, 113(4), 842–861. https://doi.org/10.1037/0033-295X.113.4.842

Johnson, W., Nijenhuis, J. T., & Bouchard, T. J. (2008). Still just 1 g: Consistent results from five test batteries. Intelligence, 36(1), 81–95. https://doi.org/10.1016/j.intell.2007.06.001
