Problems with Pawlikowska et al. (BMJ 1994)

8 08 2023

This post is an appendix to another post.

The main issue that Martin Bland noted was that the summary data in the text did not match the data displayed in the graphs. This was subsequently fixed by the correction provided by Chalder and Wessely: the scores in the previous version had been shunted by +11 for the Fatigue Scale and +12 for the General Health Questionnaire. But for me, a bigger question remains: why did they quote the precision estimates (95% CIs) for the means, rather than indicating the spread of the data (SDs), which would be more usual? I suspect it was purely lack of knowledge about what the output of the software (SPSS) meant.
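The distinction matters because, with a sample this large, the 95% CI of the mean is far narrower than the SD, so the two convey very different things about the data. A minimal sketch of the arithmetic (the sample here is simulated; the mean, SD, and n are illustrative, not the paper's values):

```python
import math
import random

random.seed(0)

# Hypothetical sample of n = 1,000 scores; the distribution and its
# parameters are made up purely to show the CI-vs-SD arithmetic.
sample = [random.gauss(14.0, 4.5) for _ in range(1000)]

n = len(sample)
mean = sum(sample) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
sem = sd / math.sqrt(n)        # standard error of the mean
ci_half_width = 1.96 * sem     # half-width of the 95% CI for the mean

# With n = 1,000, sqrt(n) ~ 31.6, so the CI half-width is roughly
# 1.96/31.6 ~ 6% of the SD: plotting CIs makes the data look far
# tighter than it really is.
print(f"SD = {sd:.2f}, 95% CI half-width = {ci_half_width:.2f}")
```

The CI shrinks towards zero as n grows, while the SD describes the spread of the individual scores and does not; confusing the two is exactly the trap the SPSS output invites.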

This is also clear in the way that data are quoted throughout the rest of the paper. Not only that, but they also plotted those precision estimates in figure 3, instead of showing the actual spread of the data. This gives a false indication of the level of correlation. They also fail to indicate that they have used the bimodal versions of the data for that analysis. And because of that, the y-axis just looks, well, odd, because it only goes up to 10, rather than 12, which would encompass the entire range of that bimodal scale.

Bland also says:

“There are several more subtle statistical problems: the histograms with unequal interval sizes shown as the same length on the graph; the statement that with such large numbers the distributions of responses to the fatigue and the general health questionnaires follow a normal distribution (the shape of the distribution is not related to the sample size); the ignoring of the cluster sampling; the use of two different scoring systems for the questionnaires.”

None of these issues are addressed by the authors in their reply.

Both authors and editors say that the conclusions are unaffected by this main issue. But further scrutiny would have revealed that the conclusion that “fatigue… is closely associated with psychological morbidity” is flawed, because of the use of the General Health Questionnaire as an indicator of psychological morbidity in a GP patient population. Goldberg (the author of the GHQ) specifically urges caution in using the scale to diagnose psychiatric disorders because of a high false-positive rate if neurological (or even acute medical) symptoms are involved. Does that extend to psychological morbidity? Well, surely any kind of chronic or persistent disease is likely to be comorbid with psychological distress, but that’s not what this questionnaire is doing. It is asking questions that could simply be interpreted that way.

If your illness affects your concentration (brain fog), sleep, usefulness, decision-making ability, stress (or strain), overwhelm, enjoyment of day-to-day activities, mental state, confidence, worth, or happiness, you will likely have a high score.

What was also interesting to me, and something I hadn’t realised before, was that the GHQ was used as a direct template for the Chalder Fatigue Questionnaire (Fatigue Scale). So of course there is a reasonable chance they will correlate! Many of the questions in both scales are alternative versions of each other. For example, GHQ asks “Have you been able to concentrate on whatever you’re doing?”, whereas CFQ asks “Do you have difficulties concentrating?”. Of course, the CFQ is biased towards questions that deal with tiredness and fatigue, but that also leads to a substantial risk of collinearity between the items.
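The point about shared item content can be made concrete with a toy simulation: if two scales are built from reworded versions of the same questions, their totals will correlate strongly even when neither is measuring what it claims to. Everything here is invented for illustration; it is not the paper's data.

```python
import random

random.seed(42)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

# Hypothetical setup: each respondent has a single latent "distress"
# level, and every item on BOTH scales is that latent level plus
# item-specific noise -- i.e. the scales are reworded versions of
# each other, tapping one trait.
n_resp, n_items = 500, 10
scale_a, scale_b = [], []
for _ in range(n_resp):
    latent = random.gauss(0, 1)
    scale_a.append(sum(latent + random.gauss(0, 1) for _ in range(n_items)))
    scale_b.append(sum(latent + random.gauss(0, 1) for _ in range(n_items)))

r = pearson(scale_a, scale_b)
print(f"r = {r:.2f}")  # strong correlation from shared item content alone
```

With 10 items per scale loading on one latent trait, the expected correlation between the totals is about 100/110 ≈ 0.9, so a high observed correlation between the GHQ and the Fatigue Scale tells us little beyond the fact that they ask overlapping questions.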

The biggest and potentially the most dangerous assumption we can make is that the tool we are using is accurately measuring what we think it is measuring. No amount of complex statistical validation techniques can tell us that if the underlying assumptions (that led to the construction of the scale) are flawed.

Ref.

Pawlikowska T, Chalder T, Hirsch SR, Wallace P, Wright DJM, Wessely SC. Population based study of fatigue and psychological distress. BMJ 1994;308:763–766.





Is science self-correcting? How flawed stats and false assumptions plague the medical literature

8 08 2023

“Read the evidence!” we are told, when we question long-held views. However, the historic structure of scientific literature often makes it very difficult to track the evolution of evidence through time. As a former medical journal copyeditor (and later, statistician), I’ve often pondered on the inadequacies of the research record to accurately document what has happened in any particular field, because it tends to preserve the wrong thing. Once a controversial or flawed paper appears in print, it is pretty much untouchable unless it is retracted or corrected. Even then, retraction does not erase any erroneous conclusions from consciousness. And any debate that subsequently occurs by means of correspondence is usually long forgotten, particularly if the original paper keeps being cited, if indeed anyone was interested enough to follow it up and read it.

Although the PubMed record may give an indication of how much discussion there was about a paper, the trail is easily lost, often incomplete, and few have the diligence to follow it all the way to any conclusion. Sometimes it fails to log errors, flaws go uncorrected, and the literature rumbles along, oblivious. This is the tale of one such error.





Muddying the waters

18 11 2022

In January 1989, a group of clinicians (a senior registrar, a clinical research worker, and two behavioural therapists) published a discussion paper in the Journal of the Royal College of General Practitioners entitled “Management of chronic (post-viral) fatigue syndrome”. [pdf]
