This post is an appendix to another post.
The main issue that Martin Bland noted was that the summary data in the text did not match the data displayed in the graphs. This was subsequently fixed by a correction from Chalder and Wessely: the scores in the previous version had been shunted by +11 for the Fatigue Scale and +12 for the General Health Questionnaire. But for me, a bigger question remains: why did they quote precision estimates (95% CIs) for the means rather than indicating the spread of the data (SDs), which would be more usual? I suspect it was simply a lack of understanding of what the output of the software (SPSS) meant.
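The distinction matters because the two quantities behave very differently: the SD describes the spread of individual scores and stays roughly constant as you collect more data, while the 95% CI of the mean shrinks with sample size. A minimal sketch with made-up, normally distributed scores (purely illustrative; none of these numbers come from the paper):

```python
import random
import statistics

random.seed(1)

# Simulated questionnaire-style scores with mean ~4 and SD ~2,
# at two different sample sizes.
for n in (100, 10000):
    scores = [random.gauss(4, 2) for _ in range(n)]
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    sem = sd / n ** 0.5          # standard error of the mean
    ci = 1.96 * sem              # half-width of the 95% CI of the mean
    print(f"n={n:>5}  mean={mean:.2f}  SD={sd:.2f}  95% CI half-width={ci:.3f}")
```

With a hundredfold increase in n, the SD barely moves but the CI half-width shrinks roughly tenfold. Plotting CIs instead of SDs therefore makes large-sample data look far less variable than it really is.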
This is also clear in the way data are quoted throughout the rest of the paper. Not only that, but they also plotted those precision estimates in figure 3, instead of showing the actual spread of the data, which gives a false impression of how tight the association is. They also fail to indicate that they used the bimodal versions of the scales for that analysis. And because of that, the y-axis just looks, well, odd: it only goes up to 10, rather than 12, which would encompass the entire range of that bimodal scale.
Bland also says:
“There are several more subtle statistical problems: the histograms with unequal interval sizes shown as the same length on the graph; the statement that with such large numbers the distributions of responses to the fatigue and the general health questionnaires follow a normal distribution (the shape of the distribution is not related to the sample size); the ignoring of the cluster sampling; the use of two different scoring systems for the questionnaires.”
None of these issues are addressed by the authors in their reply.
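Bland's point about sample size is worth spelling out, because it is a common misconception: a large n tightens estimates of the mean, but it cannot make a skewed distribution normal. A quick illustration with simulated right-skewed data (arbitrary numbers, nothing from the paper):

```python
import random
import statistics

random.seed(2)

def skewness(xs):
    # Simple moment-based sample skewness; 0 for symmetric data.
    m = statistics.mean(xs)
    sd = statistics.stdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * sd ** 3)

# Exponentially distributed "scores" are strongly right-skewed.
# Increasing n does not pull the skewness towards zero.
for n in (200, 20000):
    xs = [random.expovariate(1.0) for _ in range(n)]
    print(f"n={n:>5}  skewness={skewness(xs):.2f}")
```

The skewness stays near the theoretical value of 2 at both sample sizes. What the central limit theorem promises is normality of the *sampling distribution of the mean*, not of the responses themselves.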
Both authors and editors say that the conclusions are unaffected by this main issue. But further scrutiny would have revealed that the conclusion that “fatigue… is closely associated with psychological morbidity” is flawed, because of the use of the General Health Questionnaire as an indicator of psychological morbidity in a GP patient population. Goldberg (the author of the GHQ) specifically urges caution in using the scale to diagnose psychiatric disorders because of a high false-positive rate if neurological (or even acute medical) symptoms are involved. Does that extend to psychological morbidity? Well, surely any kind of chronic or persistent disease is likely to be comorbid with psychological distress, but that is not what this questionnaire measures. It asks questions that can be answered affirmatively for purely physical reasons, and then interprets those answers as psychological distress.
If your illness affects your concentration (brain fog), sleep, usefulness, decision-making ability, stress (or strain), overwhelm, enjoyment of day-to-day activities, mental state, confidence, worth, or happiness, you will likely have a high score.
What was also interesting to me, and something I hadn’t realised before, was that the GHQ was used as a direct template for the Chalder Fatigue Questionnaire (Fatigue Scale). So of course there is a reasonable chance they will correlate! Many of the questions in the two scales are alternative versions of each other. For example, the GHQ asks “Have you been able to concentrate on whatever you’re doing?”, whereas the CFQ asks “Do you have difficulties concentrating?”. The CFQ is, of course, weighted towards questions about tiredness and fatigue, but the overlap still creates a substantial risk of collinearity between the two sets of items.
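That built-in overlap can be demonstrated directly: if two questionnaires share paraphrased items, their total scores will correlate even when their remaining items measure completely unrelated things. A toy simulation (all item counts and noise levels are arbitrary assumptions, not estimates from either scale):

```python
import random
import statistics

random.seed(3)

def corr(xs, ys):
    # Pearson correlation coefficient.
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    n = len(xs)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / ((n - 1) * sx * sy)

n = 5000
a_scores, b_scores = [], []
for _ in range(n):
    # Four "shared" items: each respondent's answer to a question
    # that appears, reworded, in both questionnaires.
    shared = [random.gauss(0, 1) for _ in range(4)]
    # Four items unique to each scale, independent of everything else.
    a_unique = [random.gauss(0, 1) for _ in range(4)]
    b_unique = [random.gauss(0, 1) for _ in range(4)]
    a_scores.append(sum(shared) + sum(a_unique))
    # Scale B gets the paraphrased versions: same answer plus wording noise.
    b_scores.append(sum(s + random.gauss(0, 0.3) for s in shared) + sum(b_unique))

print(f"r = {corr(a_scores, b_scores):.2f}")
```

Even though half the items in each scale have nothing to do with the other, the total scores correlate at around r ≈ 0.5 under these assumptions. A correlation between the CFQ and GHQ is therefore partly guaranteed by construction, before any genuine association between fatigue and psychological morbidity enters the picture.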
The biggest and potentially the most dangerous assumption we can make is that the tool we are using is accurately measuring what we think it is measuring. No amount of complex statistical validation techniques can tell us that if the underlying assumptions (that led to the construction of the scale) are flawed.
–
Ref.
Pawlikowska T, Chalder T, Hirsch SR, Wallace P, Wright DJM, Wessely SC. Population based study of fatigue and psychological distress. BMJ 1994;308:763–766.