Elsevier

General Hospital Psychiatry

Volume 29, Issue 5, September–October 2007, Pages 388-395

Psychiatry and Primary Care
Diagnostic accuracy of the mood module of the Patient Health Questionnaire: a systematic review

https://doi.org/10.1016/j.genhosppsych.2007.06.004

Abstract

Objective

The nine-item mood module of the Patient Health Questionnaire (PHQ-9) was developed to screen and to diagnose patients in primary care with depressive disorders. We systematically reviewed the psychometric literature on the PHQ-9 and performed a meta-analysis to ascertain its summary sensitivity and specificity.

Methods

EMBASE, PubMed and PsycINFO were used to search literature up to July 2006. Studies were included if (1) they investigated the diagnostic accuracy of the PHQ-9 and (2) the PHQ-9 had been compared with a reference test. The quality of the studies was appraised using the Quality Assessment of Diagnostic Accuracy Studies. We calculated sensitivity, specificity and confidence intervals for each included study. We used the random effects model to calculate the summary sensitivity and specificity.
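As an illustration of the per-study calculation described above, the following minimal sketch derives sensitivity and specificity with 95% confidence intervals from a 2x2 table. The counts are hypothetical, and the Wilson score interval is an assumption; the abstract does not state which interval method the authors used.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (about 95% for z=1.96).

    Assumption: the review does not name its interval method; Wilson is
    used here only as a common choice for proportions.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def accuracy_from_2x2(tp, fp, fn, tn):
    """Sensitivity and specificity of an index test against a reference test."""
    return {
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_interval(tn, tn + fp)),
    }


# Hypothetical study: 50 patients with MDE on the reference test, 450 without.
result = accuracy_from_2x2(tp=40, fp=20, fn=10, tn=430)
for name, (estimate, (low, high)) in result.items():
    print(f"{name}: {estimate:.2f} ({low:.2f}-{high:.2f})")
```

Pooling such per-study estimates with a random effects model is a separate step that this sketch does not cover.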

Results

We found a sensitivity of 0.77 (0.71–0.84) and a specificity of 0.94 (0.90–0.97) for the PHQ-9. The positive predictive value in an unselected primary care population was 59%, which increased to 85–90% when the prior probability increased to 30–40%.
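The dependence of the positive predictive value on prior probability follows from Bayes' theorem. The sketch below applies it to the summary sensitivity and specificity reported above; the 10% prevalence for an unselected primary care sample is the estimate given in the Discussion, and the function name is illustrative.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability of disease given a positive test, by Bayes' theorem."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Summary estimates from the Results; prevalences of 10%, 30% and 40%.
for prevalence in (0.10, 0.30, 0.40):
    ppv = positive_predictive_value(0.77, 0.94, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.0%}")
```

With these inputs the positive predictive value comes out at roughly 59%, 85% and 90%, in line with the figures reported in the Results.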

Conclusion

In primary care, the PHQ-9 is a valid diagnostic tool if used in selected subgroups of patients with a high prevalence of depressive disorder.

Introduction

Detection of depression, especially in primary care, is far from optimal [1], [2], [3]. Both underdiagnosis and overdiagnosis have been reported, resulting in inadequate treatment. Underdiagnosis is related to the fact that patients present to their family physician with atypical symptoms, either because they are too ashamed to discuss psychological problems or because subjective somatic symptoms are the main reason for their consultation. Family physicians may have difficulty asking patients frankly about psychological symptoms. Sometimes, they do not know how to introduce the idea that depression may be an explanation for patients' physical complaints [4]. Overdiagnosis may occur among patients with subclinical depression or psychological distress who are known to have had earlier episodes of depression [5]. Underdiagnosis carries the risk that patients do not get effective psychiatric treatment or are inappropriately treated for physical symptoms. Conversely, overdiagnosis carries the risk of unnecessary and therefore ineffective psychiatric treatment of minor and self-limiting problems. An instrument to detect a major depressive episode (MDE) should ideally have both a high sensitivity and a high specificity in order to reduce the number of false-negatives and false-positives.

A number of screening instruments to detect depressive episodes have been developed. Recently, these instruments were evaluated by Williams et al. [6] in a literature synthesis reporting similar operating characteristics but differences in administration time, ease of scoring and the ability to serve additional purposes, such as monitoring severity and screening for conditions other than depression. Of these instruments, only the Patient Health Questionnaire (PHQ) was developed for screening and diagnosis as well as monitoring of depression severity. It was developed in 1999 as a self-report version of the Primary Care Evaluation of Mental Disorders (PRIME-MD) [7] aimed at criteria-based diagnosis not only of depressive episodes but also of other mental disorders commonly encountered in primary care. Nowadays, the PHQ is used all over the world and has been translated into more than 25 languages, including German [8], French [9], Spanish [10], Italian [11], Arabic [12], Bengali [13], Turkish [14], Flemish [15] and Dutch [16].

The nine-item depression module of the full PHQ is called the PHQ-9 [17]. In contrast to other depression questionnaires, the PHQ-9 evaluates the nine Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) criteria for MDE [18]. The diagnosis of MDE can be made by a categorical algorithm using these nine items. By calculating a summary score, the severity of an episode can be assessed. Several studies have reported on the diagnostic accuracy of this instrument, but so far, the results have not been synthesized. We systematically reviewed the literature on the diagnostic accuracy of the self-report version of the PHQ-9. We further performed a meta-analysis to calculate its summary sensitivity and specificity.
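The two scoring modes mentioned above, the categorical algorithm and the summary score, can be sketched as follows. The thresholds are an assumption taken from the commonly published PHQ-9 scoring instructions (five or more symptoms present at least "more than half the days", one of them being item 1 or item 2, with item 9 counting if present at all); this text does not spell them out, and the function names are illustrative.

```python
def validate_items(items):
    """Expect nine responses, each scored 0 (not at all) to 3 (nearly every day)."""
    if len(items) != 9 or any(v not in (0, 1, 2, 3) for v in items):
        raise ValueError("PHQ-9 needs nine item scores in the range 0-3")


def phq9_summary_score(items):
    """Severity score: the sum of the nine item scores (range 0-27)."""
    validate_items(items)
    return sum(items)


def phq9_algorithm_mde(items):
    """Categorical algorithm for MDE (thresholds are assumptions, see lead-in)."""
    validate_items(items)
    # Items 1-8 count when scored >= 2; item 9 counts when scored >= 1.
    endorsed = [v >= 2 for v in items[:8]] + [items[8] >= 1]
    # At least one cardinal symptom: item 1 (loss of interest) or item 2 (low mood).
    cardinal = endorsed[0] or endorsed[1]
    return cardinal and sum(endorsed) >= 5


responses = [2, 2, 2, 2, 0, 0, 0, 0, 1]  # hypothetical patient
print(phq9_algorithm_mde(responses), phq9_summary_score(responses))
```

The two outputs need not agree: the algorithm depends on which items are endorsed, whereas the summary score depends only on their total.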

Section snippets

Data sources

We performed a systematic search of literature dating between 1999 (PHQ issued) and July 2006 using the databases EMBASE, PubMed and PsycINFO with the terms “PHQ” and “Patient Health Questionnaire,” both as MeSH headings and as text words. In addition, we checked the references of all included articles for relevant studies.

Study selection

Articles were included if their titles and abstracts were focused on the diagnostic accuracy of the PHQ-9 for MDE. Furthermore, the PHQ had to have been compared with a

Study selection

We found 223 articles, of which 40 were selected for detailed reading. Twenty-eight articles were excluded because (1) they did not concern the self-administered version of the PHQ-9 (n=5), (2) the study did not validate the complete mood module of the PHQ (n=4), (3) the PHQ-9 was only used for detecting any depressive episode and not specifically MDE (n=3), (4) the article was not a diagnostic accuracy study on the PHQ-9 (n=14) or (5) the data had been previously published in articles by the

Discussion

Our meta-analysis, in which we used the random effects model to calculate the summary sensitivity and specificity of four primary care studies, shows that the PHQ-9 has a high specificity of 0.94 (range=0.90–0.97) when used with the algorithm. This indicates that the PHQ-9 is a reliable tool if the user wants to avoid overdiagnosis. On the other hand, the chance of missing a patient with a depressive disorder in an unselected primary care sample (estimated prevalence of MDE=10%) is substantial

Acknowledgments

The authors thank Rob J.P.M. Scholten, MD, PhD, for his statistical advice and for constructing Figure 3 of this article.

References (39)

  • D. Gill et al.

    Frequent consulters in general practice: a systematic review of studies of prevalence, associations and outcome

    J Psychosom Res

    (1999)
  • J.C. Coyne et al.

    Questionnaires for depression and anxiety. Routine screening entails additional pitfalls

    BMJ

    (2001)
  • E.S. Paykel et al.

    The Defeat Depression Campaign: psychiatry in the public arena

    Am J Psychiatry

    (1997)
  • J.P. Docherty

    Barriers to the diagnosis of depression in primary care

    J Clin Psychiatry

    (1997)
  • E. Aragones et al.

    The overdiagnosis of depression in non-depressed patients in primary care

    Fam Pract

    (2006)
  • R.L. Spitzer et al.

    Utility of a new procedure for diagnosing mental disorders in primary care. The PRIME-MD 1000 study

    JAMA

    (1994)
  • K. Gräfe et al.

    Screening psychischer Störungen mit dem “Gesundheitsfragebogen für Patienten (PHQ-D)”: Ergebnisse der deutschen Validierungsstudie

    Diagnostica

    (2004)
  • Dumont P, Andreoli A, Borgacci S. et al. Quick detection of depression: a significant clinical issue. Rev Med Suisse...
  • C. Diez-Quevedo et al.

    Validation and utility of the Patient Health Questionnaire in diagnosing mental disorders in 1003 general hospital Spanish inpatients

    Psychosom Med

    (2001)

K. Wittkampf had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.