Mental health problems are comorbid, which means that they are positively intercorrelated and don’t tend to occur in isolation. About half of all people diagnosed with major depression, for example, have at least one comorbid mental health problem, such as generalized anxiety disorder or posttraumatic stress disorder. The same holds for medical problems, and on top of that, there is comorbidity between mental and physical health problems: cancer and depression, for example, often go together. Many have speculated about the causal mechanism that governs this comorbidity, and today’s blog post is about one particular theory: the d (disease) factor.
The d factor
A few days ago, a new paper on this topic came out in World Psychiatry, one of the most renowned journals in psychiatry. The authors aim to answer the question of why mental and physical health problems often co-occur. The short paper is entitled “First evidence of a general disease (“d”) factor, a common factor underlying physical and mental illness”. In the paper, the d factor is defined as an “underlying disease dimension [..] that accounts for the individuals’ propensity to develop mental as well as physical conditions”, and as “a general vulnerability to develop any of the included conditions”. This rests on similar work in the mental health literature, where authors have identified a p (for psychopathology) factor that is thought to explain the comorbidity among mental health problems.
In the paper on the d factor, the authors fit a particular statistical model to a large dataset and claim that they have “discovered” the d factor: “the results support the assumption of the existence of a general ‘d’ factor in adults”. The authors also claim that their discovery has “highly relevant research and clinical implications regarding our understanding and management of mental and physical conditions, as well as for service organizations”, “relevant implications for the conceptualization and classification of mental and physical conditions”, and “important implications for clinical practice and policy”.
These are sweeping conclusions that, in my view, are not supported by evidence.
Shortcomings of the paper
The paper repeats mistakes that have been made in the p factor literature; I’ll summarize three of them in some detail below.
1. Bifactor Schmifactor
First, and most importantly, the authors fit three statistical models to the data: a) a correlated factors model, b) a unifactor model, and c) a bifactor model. It doesn’t really matter what these models are or do in particular. What is important here is that when you fit a statistical model to data, you are doing something similar to trying to find the right lid (statistical model) for your pot (data). If the lid fits well, which is determined by so-called “fit indices” in statistics, this indicates that you have a good match between statistical model and data. This may then allow you to conclude that your statistical model represents or describes your data well. For example, a few decades ago scientists found that a statistical model encoding “smoking causes lung cancer” fit data well, corroborating the theory that smoking causes lung cancer.
In the current paper, the authors find that the bifactor model fits the data better than the other two models they fit. Unfortunately, the bifactor model has an exceptionally high fit propensity, meaning that it is a lid that fits all sorts of pots really well. For the smoking and cancer example, this would mean that no matter what the data look like, the statistical model would tell you that smoking causes cancer, because this statistical model tends to fit all sorts of data well.
Worse, it is well established in statistics that even if you buy a pot and lid together (i.e. they are the perfect match!), and you then swap out your newly purchased, precious lid for a lid from the bifactor model, the bifactor model lid will fit better, although we know for a fact that it is the wrong lid for the pot! Statistically speaking, this means that even if you simulate data from e.g. a correlated factors model, the bifactor model will fit these simulated data better than the correlated factors model, raising serious concerns about using the fit of the bifactor model to determine which model is best.
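To make this concrete, here is a minimal simulation sketch of the pot-and-lid problem. All numbers (loadings, sample size, factor correlation) are invented for illustration and have nothing to do with the paper’s data: we generate data from a correlated two-factor model, so we know for a fact that no general factor exists in the data-generating process.

```python
import numpy as np

rng = np.random.default_rng(2023)
n = 5000

# "True" data-generating model: two latent factors correlating 0.5
# (a correlated-factors model, NOT a bifactor model).
factor_cov = np.array([[1.0, 0.5],
                       [0.5, 1.0]])
factors = rng.multivariate_normal(mean=[0.0, 0.0], cov=factor_cov, size=n)

# Six observed variables: the first three load on factor 1, the last three
# on factor 2; the rest of each variable's variance is unique noise.
loadings = np.array([
    [0.7, 0.0], [0.6, 0.0], [0.8, 0.0],
    [0.0, 0.7], [0.0, 0.6], [0.0, 0.8],
])
unique_sd = np.sqrt(1.0 - np.sum(loadings**2, axis=1))
data = factors @ loadings.T + rng.normal(size=(n, 6)) * unique_sd

# All six variables end up positively intercorrelated, even though
# no general factor generated them.
print(np.round(np.corrcoef(data, rowvar=False), 2))
```

If you then feed these data into an SEM package of your choice and fit both the (true) correlated two-factor model and a bifactor model, the bifactor model will typically match or beat the true model on fit indices such as CFI, TLI and RMSEA, despite being the wrong lid for this pot. That is the fit-propensity problem in a nutshell.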
Unfortunately, the authors’ whole argument rests on this one point, the fit of the bifactor model:
“We found that the bifactor model fitted the data best (CFI=0.98, TLI=0.98, RMSEA=0.016) […]. Therefore, our results support the assumption of the existence of a general “d” factor in adults.”
This conclusion is false, and it is a little disheartening that this was not caught in peer review, given how well known the problem is in the statistical literature. There are two more problems I would like to explain.
2. Statistical equivalence
First, there are hundreds of models that represent hundreds of different causal processes, but the authors didn’t fit those hundreds of models to their data: they only fit three. This makes it difficult to conclude that they really found support for their particular theory. For example, one could have the theory that …
- … having depression makes one more vulnerable to developing an anxiety disorder
- … having a physical health problem such as a chronically weak immune system makes one more vulnerable to developing other physical health problems due to this weak immune system
- … the comorbidity of physical and mental health problems, such as between cancer and depression, is not explained by some underlying disease factor; instead, the relation comes from the fact that people with cancer are more likely to develop depression.
This “systems” theory is highly plausible given what we know about mental and physical health comorbidities, but the authors didn’t test it, because they didn’t fit an appropriate statistical model to the data. If you were to fit such a model to the data, it is widely known in statistics that this model would fit the data very well, too! This is due to something known as statistical equivalence: there are multiple competing models that can describe the authors’ data equally well (there are several lids for the pot). Philosophers call this a situation where a theory is “underdetermined by data”, i.e. the data and statistical model together are insufficient to provide strong evidence for the theory. This, of course, is another big problem for the authors’ interpretation that the underlying d factor theory is supported because the bifactor model fits their data: other models they didn’t fit would have at least equal fit, and there may be models with even better fit. How can they then claim they found the right lid?
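To see how two very different causal stories can be statistically equivalent, here is a deliberately tiny, hypothetical illustration (three variables and made-up path coefficients, not the paper’s actual models): a causal chain A → B → C and a single-factor model imply exactly the same correlation matrix, so no fit index can tell them apart.

```python
import numpy as np

# Hypothetical causal chain A -> B -> C with path coefficients a and b
# (all variables standardized to unit variance).
a, b = 0.6, 0.5

# Correlations implied by the chain: r_AB = a, r_BC = b, r_AC = a*b.
chain_implied = np.array([
    [1.0,   a,   a * b],
    [a,     1.0, b    ],
    [a * b, b,   1.0  ],
])

# A single-factor model implies r_ij = loading_i * loading_j.
# With loadings (a, 1, b) it reproduces exactly the same matrix.
loadings = np.array([a, 1.0, b])
factor_implied = np.outer(loadings, loadings)
np.fill_diagonal(factor_implied, 1.0)

print(np.allclose(chain_implied, factor_implied))  # True
```

Both lids fit this pot perfectly; the correlations alone cannot tell you whether an underlying factor or a causal chain produced them.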
3. Discovering vs generating latent variables
Finally, I would argue that the authors did not “discover” the d factor: they “created” it. The d factor is not in the data, but one can fit a bifactor model to construct a new variable, and then one can call it the d factor. Calling this process “discovery” is odd, because in any situation where a set of variables is positively correlated (no matter the causal process that leads to these correlations), I can create a variable that summarizes these variables. For example, I can simulate data in which 10 variables are correlated with each other because every variable causes one other variable: A causes B, B causes C, C causes D, and so on. Now I have 10 intercorrelated variables, and I can fit a latent variable model that summarizes these correlations as a latent variable m (the “matrix” factor). But that has nothing to do with discovery, and concluding that the m factor “underlies” my data, or “accounts for” the correlations among my 10 variables, is not defensible given the evidence I have: a latent variable I myself created.
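Here is a minimal sketch of that thought experiment, with invented numbers: ten variables generated from a pure causal chain, with no common cause anywhere, still yield a perfectly serviceable summary variable. For simplicity I extract the first principal component as the “m factor” rather than fitting a full latent variable model; the point is the same.

```python
import numpy as np

rng = np.random.default_rng(42)
n, k, beta = 5000, 10, 0.7

# Causal chain: variable 1 causes variable 2, which causes variable 3, etc.
# There is no common cause anywhere in the data-generating process.
data = np.zeros((n, k))
data[:, 0] = rng.normal(size=n)
for j in range(1, k):
    data[:, j] = beta * data[:, j - 1] + np.sqrt(1 - beta**2) * rng.normal(size=n)

# All 10 variables are positively intercorrelated.
corr = np.corrcoef(data, rowvar=False)
print(np.round(corr, 2))

# I can always construct a single summary variable ("the m factor") from
# positively correlated variables, here via the first principal component.
eigenvalues, eigenvectors = np.linalg.eigh(corr)
m_weights = eigenvectors[:, -1]                # component with the largest eigenvalue
m_weights = m_weights * np.sign(m_weights[0])  # fix the arbitrary sign
m_scores = data @ m_weights

print(np.round(m_weights, 2))  # every variable gets a positive weight on "m"
```

The summary exists because I built it, not because a common factor “underlies” the chain: the data-generating process contains no such factor.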
Conclusion
There are other challenges with the paper, but these are just your typical challenges with the measures we use in psychology and psychiatry, or with the fact that the data the authors have don’t lend themselves to the type of causal inference the authors engage in (i.e. that the d factor “accounts for” the data or “underlies” the data). Ashley Watts and colleagues recently submitted a paper on the topic (I will link to it here once it is available online) that summarizes all these issues for the p factor literature, and these challenges apply to the current paper as well.
The main concern I have here is, to briefly reiterate, that the conclusion that there is evidence for the existence of an underlying dimension, the d factor, does not follow from demonstrating that a bifactor model fits the data well. A lot more work would need to be done to support this conclusion.
Great, Eiko!
Thanks a lot; quite encouraging that there are people like you!
Unfortunately, most colleagues are reluctant to carefully consider which a priori assumptions they are investing in – and still choose not to study the theory of science (and so fail to see the shortcomings of naive empiricism).
A frequent error is to confuse a theoretical construct with a real entity; another is to take naturalism, determinism and reductionism for granted and – thereby – to fail to distinguish between reactive behavior and purposeful action.
Best wishes
Tobias
Love this article. It seems that if this d factor existed, they would find physical evidence of a tangible process. However, they simply found a statistical correlation in a single data set.
A therapist/researcher still reports that he can predict divorce with 93% accuracy. He used a statistical technique that involves trying hundreds of lids on the pot [lots of permutations of various variables] until the computer found the lid that fit the data best. However, when his model was applied to another data set, the predictive power plummeted.
I’ve gone back to this statement over and over as I look at research finding a correlation.
A might cause B.
B might cause A.
A and B might be caused by C.
And the correlation might be spurious.
You have to consider each option in most cases where a correlation has been found.
That article is from the previous issue of World Psychiatry, and it is problematic for reasons beyond its excessive reliance on suspect statistical methods.
The current (June) issue has an article that suggests possible explanations.
Nesse, R. M. (2023). Evolutionary psychiatry: Foundations, progress and challenges. World Psychiatry, 22(2), 177–202. https://doi.org/10.1002/wps.21072
A good paper to write would focus on the connection between the extensive comorbidity of mental disorders and their amazingly high genetic correlations. I think there are a number of possible explanations. The obvious one is that disorders cause each other: people with addictions lose their jobs and friends and get depressed. But there are also more fundamental reasons, such as generalized dysregulation of brain development that can have many manifestations. I am working on using polygenic scores to try to understand these relationships.