Prevalence of burnout among GPs: a systematic review and meta-analysis

Background Burnout is a work-related syndrome documented to have negative consequences for GPs and their patients. Aim To review the existing literature concerning studies published up to December 2020 on the prevalence of burnout among GPs in general practice, and to determine GP burnout estimates worldwide. Design and setting Systematic literature search and meta-analysis. Method Searches of CINAHL Plus, Embase, MEDLINE, PsycINFO, and Scopus were conducted to identify published peer-reviewed quantitative empirical studies in English up to December 2020 that used the Maslach Burnout Inventory-Human Services Survey to establish the prevalence of burnout in practising GPs (that is, excluding GPs in training). A random-effects model was employed. Results Wide-ranging prevalence estimates (6% to 33%) across different dimensions of burnout were reported for 22 177 GPs across 29 countries in the 60 studies included in this review. Mean burnout estimates were: 16.43 for emotional exhaustion; 6.74 for depersonalisation; and 29.28 for personal accomplishment. Subgroup and meta-analyses documented that country-specific factors may be important determinants of the variation in GP burnout estimates. Moderate overall burnout cut-offs were found to be determinants of the variation in moderate overall burnout estimates. Conclusion Moderate to high GP burnout exists worldwide. However, substantial variations in how burnout is characterised and operationalised have resulted in considerable heterogeneity in GP burnout prevalence estimates. This highlights the challenge of developing a uniform approach, and the importance of considering GPs' work context to better characterise burnout.

Burnout is generally referred to as an inability to cope with chronic psychological stress at work due to insufficient resources to cope with job demands. 15,16 Researchers have denoted that burnout captures three dimensions/subscales: emotional exhaustion (EE), cynicism/depersonalisation (DEP), and feelings of reduced personal accomplishment (PA). [17][18][19] This characterisation of burnout is also used in health care, as is aptly captured in the World Health Organization's (WHO) 11th revision of the International Classification of Diseases (ICD-11).
General Practitioner (GP) tasks are related to treating illness in the context of the patient's life, belief systems and community (thus it is person- rather than disease-focused), 20,21 and working with other healthcare professionals to coordinate care and make efficient use of health resources. 22,23 While surveys on physician burnout in the US conducted by other researchers reported that physician specialties that frequently deal with patients and their families, like GPs, experienced considerably higher burnout rates than other specialties, it is unclear how prevalent GP burnout is. 12,24 This systematic review aimed to conduct a synthesis of the evidence on the prevalence of GP burnout documented in the literature. In doing so, it aimed to deliver a baseline picture of burnout in the GP context, to establish the burden GP burnout imposes on the health care system. This, in turn, may help policy makers, healthcare institutions, clinicians, researchers and the public to develop interventions to address the syndrome. This is especially important in the post-COVID-19 environment, 3 which has witnessed a considerably greater burden placed on GPs via more frequent patient visits and other requirements.

Data Sources and Searches
The search strategy for this systematic review combined key words and subject headings covering two concepts: general practice or GP, and burnout.
Primary care physicians typically include GPs as well as other physicians like pediatricians, emergency physicians, and internal medicine specialists. However, this study focuses specifically on physicians who typically undertake generalist patient care such as GPs, and excludes the other sub-specialties of primary care. Only studies that reported prevalence estimates on GP burnout in general practice using the Maslach Burnout Inventory-Human Services Survey (MBI-HSS) were included in this review. Although different burnout scales have been used in prior research, the MBI-HSS was used in this review to allow comparisons in burnout prevalence estimates across studies. Moreover, the MBI-HSS is the most widely used burnout instrument in the literature that measures burnout by capturing the different dimensions of burnout that have been identified in the literature, namely, emotional exhaustion, depersonalisation, and personal accomplishment. The following databases were searched for potentially relevant articles, followed by screening the reference lists of identified articles: CINAHL Plus, Embase, MEDLINE, PsycINFO, and Scopus. The study eligibility criteria and selection are outlined in the Appendix. Details pertaining to the search terms, inclusion and exclusion criteria, and search strategy used for each database are outlined in Appendix S1. The review followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. 4

Data Extraction
The following data were extracted from each article using a standardised form by one of us (C.K.): geographic location, survey period, sample size with response rate, average age of participants (GPs), number and proportion of male participants, average number of years the participants have worked in general practice, practice size, number of hours worked per week, version of MBI-HSS instrument used to measure burnout, cut-off criteria to denote subcomponents of burnout (EE, DEP and low PA) and overall burnout (defined using the criterion used in the study), mean and proportion estimates of subcomponents of burnout and overall burnout, for all the GPs and for male versus female GPs.

Risk of Bias and Quality Assessment
The risk of bias of the included studies was assessed by one reviewer (C.K.) using the Joanna Briggs Institute (JBI) Critical Appraisal Checklist for Studies Reporting Prevalence Data, which scored studies based on 9 items that assessed quality. This checklist is described in the Appendix. Full details of the scoring method used and the quality appraisal results for the studies included in this review are provided in Appendices S2 to S4.

Pooled Analysis
We conducted a meta-analysis of high quality studies defined using a threshold of 7 out of 9 items (77.8%) that satisfied the respective quality criteria pertaining to the JBI checklist. Stata statistical software version 16.0 (StataCorp) was used to obtain pooled burnout estimates. The meta-analysis commands used are summarised in Appendix S5. Pooled mean estimates of the burnout subscales were computed using the metan command for means and standard errors (SEs), with the SEs having been calculated in advance using the standard deviations (SDs).
Prevalence estimates (rates) were computed from these numbers using the metaprop 5 command, reflecting the pooled proportion of GPs who were reported to suffer from burnout.
Accounting for potential heterogeneity across studies, a random-effects model was employed to estimate variances of the raw proportions or means.
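The pooling step described above can be sketched in a few lines of Python. This is only an illustration: it assumes the DerSimonian-Laird estimator of between-study variance, which the text does not name, and the function name `pool_random_effects` and the input values in the usage note are hypothetical, not taken from the included studies. It shows how standard errors are derived from the SDs, how the random-effects pooled mean and its 95% CI are obtained, and how the I² statistic follows from Cochran's Q.

```python
import math


def pool_random_effects(means, sds, ns):
    """Pool study means with a DerSimonian-Laird random-effects model.

    Illustrative sketch only; the estimator is an assumption, not
    necessarily the one used in the review.
    """
    # Standard error of each study mean, SE = SD / sqrt(n), then squared
    variances = [(sd / math.sqrt(n)) ** 2 for sd, n in zip(sds, ns)]

    # Fixed-effect (inverse-variance) weights and pooled mean
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * m for wi, m in zip(w, means)) / sum_w

    # Cochran's Q and its degrees of freedom
    q = sum(wi * (m - fixed) ** 2 for wi, m in zip(w, means))
    df = len(means) - 1

    # Method-of-moments estimate of the between-study variance tau^2
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * m for wi, m in zip(w_re, means)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    # I^2: percentage of total variation attributable to heterogeneity
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * se,
        "ci_high": pooled + 1.96 * se,
        "tau2": tau2,
        "i2": i2,
    }
```

For example, `pool_random_effects([10.0, 20.0], [10.0, 10.0], [100, 100])` pools two made-up studies whose means differ far more than their standard errors would allow, so almost all of the variation is attributed to heterogeneity and the confidence interval widens accordingly.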

Study Characteristics
The PRISMA flow diagram detailing the selection process for the 60 articles included in the systematic review is given in Figure 1. Thirty-one of the 60 (51.7%) identified studies met the threshold of "high quality". Of these studies, 74.2% (23/31) reported the number of GPs that suffered from high or moderate burnout along one or more of the burnout subcomponents (EE, DEP and PA) and overall burnout; 58.1% (18/31) reported mean and standard deviation estimates for one or more of the burnout subcomponents. Appendix S6 provides a description of selected demographic data extracted from the 60 included studies in this review; burnout cut-offs, mean, and proportion estimates are provided in Appendices S7 and S8. Estimates are provided separately for male and female GPs if they are reported in the respective study.
Study time periods ranged from 1987 to 2020, comprising data from 22,177 GPs across 29 countries spanning 5 continents. The majority of these studies (70%; 42/60) were conducted in Europe whereas 18.3% (11/60) were conducted in Asia, with the remaining studies conducted in the following three continents: Africa 1.7% (1/60); North America 3.3% (2/60); and Oceania 6.7% (4/60). Where a study was conducted over different time periods, data for the earliest period were extracted. Most of the studies (70%; 42/60) used the 22-item version of the MBI-HSS.
The reported findings collectively show that there is wide variation in the demographic data, as well as burnout cut-offs and estimates, extracted from the studies included in the review. Selected demographic characteristics reported in the 31 high quality studies are provided in Appendix S9. The heterogeneity in demographic and burnout data observed for the 60 included studies remained for the 31 higher quality studies included in the meta-analysis. However, the ranges of the burnout estimates reported in these studies are considerably narrower than those reported for all 60 studies. Figure 2 reports the pooled random-effects mean estimates using continuous data based on the scores obtained for the different burnout subscales: 16.43 (95% CI, 13.57-19.29; I² = 100.0%; p = 0.00) for EE; 6.74 (95% CI, 5.29-8.18; I² = 99.8%; p = 0.00) for DEP; and 29.28 (95% CI, 23.61-34.96; I² = 100.0%; p = 0.00) for PA. These estimates denote moderate levels of burnout for EE and DEP, and a high level of burnout for PA, based on standard burnout cut-offs for these subscales, indicating significant levels of burnout among GPs. As evident from the high I² values (> 99%), there is considerable heterogeneity across studies. Appendix S11
The sub-group analysis by country revealed that the country in which the study was conducted did not influence High EE. High DEP was significantly higher in China (regression coefficient, 0.543; 95% CI, 0.386 to 0.700; p = 0.00) than in the other countries included in the meta-regression. Low PA was significantly higher in China (regression coefficient, 0.213; 95% CI, 0.088 to 0.339; p = 0.01), Denmark (regression coefficient, 0.220; 95% CI, 0.117 to 0.324; p = 0.00), and England (regression coefficient, 0.211; 95% CI, 0.080 to 0.341; p = 0.01) than in other countries. Overall, there is some evidence that GPs from China experienced higher depersonalisation than GPs from other countries.
Overall, there was high residual heterogeneity for high burnout (≥95% for continent and ≥70% for country) and moderate burnout (≥84% for continent), indicating that continent and country were not important determinants of heterogeneity in the reported GP burnout estimates across studies. There was no residual heterogeneity (0.00%) and high explained between-study variance for the cut-off for moderate overall burnout (adjusted R-squared 99.93%), indicating that this cut-off may be an important determinant of heterogeneity in moderate overall burnout estimates across studies. The findings also reveal that less restrictive burnout criteria are associated with higher GP burnout prevalence. For example, the more restrictive criterion for moderate overall burnout of High EE and/or High DEP has a smaller regression coefficient (0.170) than the less restrictive criterion of High EE and/or High DEP and/or Low PA, which has a regression coefficient of 0.355.
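For readers less familiar with meta-regression output, the two quantities quoted above are conventionally defined as follows. This is a sketch under the usual definitions, which the text does not spell out; the symbols are introduced here for illustration only. $\hat\tau^2_0$ denotes the estimated between-study variance without covariates, $\hat\tau^2_1$ the estimate after including the covariate (for example, the burnout cut-off), and $Q_{\text{res}}$ and $df_{\text{res}}$ the residual heterogeneity statistic and its degrees of freedom.

```latex
R^2_{\text{adj}} = \frac{\hat\tau^2_0 - \hat\tau^2_1}{\hat\tau^2_0} \times 100\%,
\qquad
I^2_{\text{res}} = \max\!\left(0,\; \frac{Q_{\text{res}} - df_{\text{res}}}{Q_{\text{res}}}\right) \times 100\%
```

Under these definitions, an adjusted R-squared near 100% together with a residual I² of 0% means the covariate accounts for essentially all of the between-study variance, which is how the result for the moderate overall burnout cut-off reads.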
Tests of publication bias via funnel plots 25 and Egger tests 26 were conducted, and the results are provided in Appendix S12. The results provide no evidence of publication bias using the dichotomous data. Visual inspection of the funnel plots showed no asymmetry in any of the 9 distributions for burnout studies. Furthermore, the Egger tests did not show significant results and thus suggested no evidence of publication bias among the studies on burnout proportions. However, Egger tests on studies using the continuous data showed some evidence of possible small-study effects, with significant results (p = 0.00) for Mean EE, Mean DEP, and Mean PA.
As another sensitivity test, the meta-analysis was repeated including studies of lower quality (rated 6 or lower on the JBI checklist), which were more susceptible to risk of bias. The results (Appendix S13) showed that the burnout estimates for all studies (including those of lower quality) were similar to those for the higher quality studies alone and still displayed significant heterogeneity.
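The idea behind the Egger test can be sketched as a simple regression. This is an illustration under stated assumptions, not the exact procedure run for Appendix S12: it uses the classic formulation in which each study's standardised effect (effect divided by its standard error) is regressed on its precision (one divided by the standard error), and an intercept far from zero suggests funnel-plot asymmetry. The function name `egger_regression` is hypothetical, and turning the t statistic into a p-value would additionally need a t distribution with n - 2 degrees of freedom.

```python
import math


def egger_regression(effects, ses):
    """Ordinary least squares of (effect / SE) on (1 / SE).

    Returns the intercept, slope, and the t statistic of the intercept.
    Needs at least three studies. Illustrative sketch only.
    """
    n = len(effects)
    if n < 3:
        raise ValueError("Egger regression needs at least three studies")

    x = [1.0 / se for se in ses]                  # precision
    z = [y / se for y, se in zip(effects, ses)]   # standardised effect

    x_bar = sum(x) / n
    z_bar = sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))

    slope = sxz / sxx
    intercept = z_bar - slope * x_bar

    # Residual variance and standard error of the intercept
    resid = [zi - (intercept + slope * xi) for xi, zi in zip(x, z)]
    s2 = sum(r ** 2 for r in resid) / (n - 2)
    se_intercept = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx))

    # t statistic is undefined when the fit is exact
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")

    return {"intercept": intercept, "slope": slope, "t": t_stat, "df": n - 2}
```

A significant intercept, as reported for the mean EE, DEP, and PA scores, indicates that smaller studies tend to report systematically different estimates, which is why the text describes the finding as possible small-study effects and not as proof of publication bias.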

Summary
The 60 studies included in this systematic review reported a wide range of demographic characteristics, burnout cut-offs and prevalence estimates. Some studies characterised burnout as uni- or bi-dimensional, although the vast majority of studies characterised burnout as multidimensional. Other studies contribute to the ambiguity in how burnout is characterised by partitioning burnout into high, moderate and low dimensions, or using different labels (e.g., "severe", "high", "extreme", "full", or "complete" were used to denote high burnout). These variations across studies were observed despite narrowly focusing on only one burnout instrument, the MBI-HSS, and one specialty, general practice, in this review.
In our study, there appears to be some evidence that the country the study was conducted in may influence this heterogeneity. It is conceivable that different national cultural factors (e.g., general practice being perceived as a calling versus a profit-making enterprise) may influence how workload is perceived and thus burnout experienced by GPs. Furthermore, the different features of the primary care system across countries may influence the GP's work environment, which in turn may influence the likelihood of burnout. This review has provided evidence that the cut-offs used to denote burnout play an important role in influencing GP burnout estimates across studies. The more restrictive the burnout criterion used, the lower the burnout estimate reported across studies.

Comparison with existing literature
The wide ranges in burnout estimates reported in this review are consistent with those reported in two recent systematic reviews on the prevalence of physician burnout across a range of specialties. 27,28 The evidence provided in these studies and our study collectively may reflect the heterogeneity across studies in the criteria used to define and measure burnout, and thus highlight the importance of uniformity in how burnout is measured and defined across studies.

Strengths and limitations
This study is, to the authors' knowledge, the first to undertake a systematic review and meta-analysis of studies on the prevalence of GP burnout worldwide. Another strength of this study is that it attempted to conduct a rigorous examination of the burden of GP burnout worldwide based on a clearly defined concept of burnout using the MBI-HSS, and focusing only on general practice.
However, this study has several limitations. First, the studies included in this review were not conducted concurrently. Hence, the findings may be subject to different interpretations across different time periods. Second, the different demographics, at the GP and other levels, across the studies may have influenced how burnout is perceived, and may in turn influence the generalisability of the findings. Third, although every attempt was made to select studies that were similar in their methodological approach for the quantitative analysis, several differences in the study design remained and reduced comparability across the studies. Fourth, given this review's focus on studies using the MBI-HSS, the insights derived in this review should be interpreted with caution, especially given that some researchers have criticised the MBI-HSS instrument and have used other instruments such as the Oldenburg Burnout Inventory and the Copenhagen Burnout Inventory.
Relatedly, the MBI-HSS is subject to criticism of bias generated by self-ratings by respondents on the questionnaire used in the study. To focus narrowly on burnout, studies on constructs related to burnout, like psychological or occupational stress, were not included in the review. To the extent that these studies also capture GP experiences similar to burnout, this review could be criticised as ignoring a vast literature that may be relevant.
In a similar vein, what constitutes burnout has been debated in the literature, and the literature that conflates burnout and depression was excluded. It is conceivable that there is an important overlap between GP mental health, psychological distress and burnout. More importantly, burnout may be more a manifestation of the GP's underlying mental condition than solely due to the workplace context. Hence, the generalisability of this review's findings beyond studies using only the MBI-HSS could be called into question. Relatedly, this literature may also include papers on burnout using the MBI-HSS that may not have been identified in the search strategy used in this systematic review. The MBI-HSS, used in this review, was designed to capture burnout associated with interpersonal relations. However, GP burnout also arises due to factors external to human relations like workload and electronic documentation. Thus, the MBI-HSS may not fully capture GP burnout. Fifth, studies conducted in a language other than English were not included, which may limit this review's generalisability to other studies not conducted in English. Finally, this review only considered peer-reviewed publications and did not consider published data from non-peer-reviewed outlets, which may have introduced another type of selection or publication bias.

Implications for research and/or practice
This study has shown that the approaches used in prior studies to characterise and operationalise GP burnout are inconclusive, with the reported wide-ranging prevalence estimates possibly influenced by a range of factors like using different measurement scales, differing cut-off points to define burnout, differing approaches to how burnout is characterised, and different cultural attributes across countries. An implication of this finding for research, practice, and policy pertaining to addressing GP burnout is that assessing and addressing the syndrome should be undertaken by considering the context the GPs work in.
The work environment is challenging for the GP, as the GP's decisions and actions are influenced by those of the patients and other agents that operate within the primary care system who may have different expectations and demands. 22,29 These differences in values and priorities between the GP and other individuals in the primary care system can result in difficult interactions between the GP and these individuals. Additional research on the reasons for high/moderate burnout was beyond the scope of this paper, but could be related to differences in priorities between the individual GP and the practice the GP is employed at. For example, the emphasis on efficiency could be perceived by GPs as being at the expense of patient welfare, leading to a potential mismatch in values between the practice and the GP. This could interact with the work-related burden imposed on the GP, perhaps exacerbating the level of burnout.
Recent studies have shown that the COVID-19 pandemic also played an important role in influencing physician burnout. For example, one study found that infection or death from COVID-19 among colleagues or relatives was significantly associated with higher emotional exhaustion and lower personal accomplishment. 30 Two other studies reported that GPs described feeling more stressed during the pandemic than they had been previously due to higher workload (e.g., due to new responsibilities such as additional safety protocols, learning new technology, and daily emails for prescriptions). 31,32 The extraordinary impact of the COVID-19 emergency on GPs, as frontline medical providers, was in part produced by the uncertainty of the procedures and treatments required and the immediate saturation of hospitals for critical case management. GPs had to respond directly to a large number of requests without clear prevention or screening instruments. At the time of writing of this paper, GPs are the foundation of COVID-19 vaccination programs in several countries and are heavily involved in administering vaccines, with some even involved in COVID-19 diagnoses, thus increasing their workload even further.
Differences across countries in the severity of the disease as well as the resources available and methods used to curb and treat it (including inefficiencies associated with supplying vaccines to GPs), and operating under different primary care systems, are likely to exacerbate the impact of COVID-19 on GP burnout across countries. Probing GP burnout in more detail within the GP's workplace environment is left for future research.

Funding
This study received no funding.