Abstract
Background General practices in the UK contract with the government to receive additional payments for high-quality primary care. Little is known about the resulting impact on population health.
Aim To estimate the potential reduction in population mortality from implementation of the pay-for-performance contract in England.
Design of study Cross-sectional and modelling study.
Setting Primary care in England.
Method Twenty-five clinical quality indicators in the contract had controlled trial evidence of mortality benefit. This was combined with condition prevalence, and the differences in performance before and after contract implementation, to estimate the potential mortality reduction per indicator. Improvement was adjusted for pre-existing trends where data were available.
Results The 2004 contract potentially reduced mortality by 11 lives per 100 000 people (lower–upper estimates 7–16) over 1 year, as performance improved from baseline to the target for full incentive payment. If all eligible patients were treated, over and above the target, 56 (29–81) lives per 100 000 might have been saved. For the 2006 contract, mortality reduction was effectively zero, because new baseline performance for a typical practice had already exceeded the target performance for full payment.
Conclusion The contract may have delivered substantial health gain, but potential health gain was limited by performance targets for full payment being set lower than typical baseline performance. Information on both baseline performance and population health gain should inform decisions about future selection of indicators for pay-for-performance schemes, and the level of performance at which full payment is triggered.
INTRODUCTION
Pay-for-performance programmes have become increasingly important in the design and delivery of health care for several countries including the US, UK, Canada, Australia, New Zealand, Germany, the Netherlands, and Spain.1 In the UK, a pay-for-performance contract was agreed between the Department of Health and the British Medical Association in 2003 and introduced into primary care in April 2004,2 supported by an £8 billion ($12 billion) investment by the Department of Health over the first 3 years.3 The contract rewarded performance against criteria in four areas: clinical, organisational, patient experience, and additional services. There were 10 clinical domains in the original contract, which was revised to include a further nine domains in 2006.2,4 There were 76 clinical indicators in the 2004 contract, increasing to 80 indicators in the 2006 revisions. Points are allocated to each indicator, and a point represents a payment of £124.60 ($190) for a typical practice.2 The revisions to the contract in 2006 increased the points allocated to clinical indicators from 550 to 665.4
Practices do not need to treat all eligible patients to receive full payment. In the 2004 contract, target levels at which full payment for each indicator is received range from 50% for prescribing a beta-blocker drug to a patient with heart disease (CHD 10), to 90% for several smoking-related indicators. These targets apply to the eligible population after exclusion of all patients for whom the indicated treatment is judged by their doctor to be inappropriate: the Quality and Outcomes Framework (QOF) calls this concept ‘exception reporting’. It was introduced to allow practices to pursue the quality-improvement agenda without being penalised where, for example, patients do not attend for review, or where a medication cannot be prescribed due to a contraindication or side-effect. Appendix 1 gives the full criteria agreed for exception reporting.2
How this fits in
In the UK, a pay-for-performance contract was introduced into primary care in April 2004; it was supported by an £8 billion investment by the Department of Health over the first 3 years. The interventions in this contract have potential for significant mortality reduction; however, this may be limited by pragmatic setting of targets well below 100% of eligible patients. Using measures of health gain (overall population outcomes) may be a better reflection of cost-effectiveness and evidence base for the future development of pay-for-performance programmes.
There is a lack of consensus on decisions about which indicators to include in pay-for-performance programmes, whether to keep the same indicators in or ‘rotate’ them out, and about the target performance level that should be set for full payment to be received. Identifying the best indicators and size of incentives is important because incentives have been shown to change practice, and areas of care not receiving incentives may be relatively ignored.5 Clinical indicators for the pay-for-performance contract in the UK were selected, and the relative size of the financial incentive determined, on the basis of clinical effectiveness and anticipated workload.2 One problem with rewarding workload is that clinical activity may be skewed towards high-workload interventions that may be less clinically effective than other interventions with a lower workload. For example, indicator Asthma 6 — the percentage of patients with asthma who have had an asthma review in the last 15 months — is a high-effort activity that received a maximum payment of 20 points in 2004, although the QOF states that the evidence for improvement in morbidity is ‘not good’. Conversely, indicator Stroke 9 — aspirin therapy in patients after a stroke — is a relatively low-effort intervention that has a robust evidence base for health gain but only had a maximum payment of 4 points. Payments may therefore not reflect population health gain.6
An alternative method for selecting clinical indicators is to estimate population health gain, as was first proposed in 1992.7 Little is currently known about the potential population mortality reduction from the pay-for-performance contract in England, but this would seem an important overall outcome of a clinical contract. The aim of this study was to combine data on baseline performance and clinical effectiveness and to apply it to the English population to estimate the reduction in all-cause mortality for individual indicators, and for the contract overall, in order to understand whether measures of health gain such as mortality reduction should be used by policy makers to inform the choice of new indicators to include in pay-for-performance programmes, and to determine the size of the financial incentives to maximise potential health gain for the population.
METHOD
Calculations were made of the number of additional eligible patients (all those on the relevant disease register who were not excluded by exception reporting) who would receive indicated treatment as a result of performance improving from baseline, both to the target set for full incentive payment, and over and above the target to 100% performance.
Five types of data were used for this analysis: (1) prevalence of each condition; (2) clinical effectiveness of indicated care; (3) baseline performance; (4) level of performance at which full target payment is gained; and (5) maximum realistic (100%) performance (treating all eligible patients who have not been excluded by exception reporting). Baseline performance in the 2004 version of the pay-for-performance contract indicators was obtained from four published studies for the year before the contract was implemented (2003).5,8–10 Where data were present in more than one study, the larger study was used. For the two indicators in each contract for which there were no baseline data, the conservative assumption was made that the indicator was already fully implemented. For baseline performance prior to the 2006 revision of the pay-for-performance contract, the English contract returns in 2005 were used, and for new indicators, the QRESEARCH database in 2005 was used.9,11
Prevalence for each condition was obtained from the pay-for-performance contract returns from all practices in England in 2006 from the NHS Information Centre.11 To estimate performance, including maximum realistic (100%) performance, only eligible patients were considered, that is those not excluded by exception reporting in the 2006 contract data.11 ‘Exception-reported’ patients were deemed by their GP to be unsuitable for that intervention, or did not agree to investigation or treatment.2 Target thresholds that lead to full financial incentive being gained were obtained from the contract documentation for 2004 and 2006.2,4 The prevalence of smoking-related indicators was adjusted to reflect predicted smoking-cessation rates published by the National Institute for Health and Clinical Excellence.12
Clinical effectiveness was obtained from a literature review that identified the highest level of evidence for risk reduction in all-cause mortality for each clinical indicator in the 2004 and 2006 versions of the pay-for-performance contract.13 The study used a technique previously described to estimate potential health gain for each clinical indicator.14 These risk reductions were converted to potential mortality reduction per 100 000 as follows:
absolute risk reduction was multiplied by the number of cases per 100 000 population;
relative risk reduction was multiplied by the control event rate and the number of cases per 100 000 population; and
control event rates were taken from the best matched clinical trial identified in the literature review.13
Odds ratios were converted to relative risk reductions.
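The conversions listed above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' code: the function names are hypothetical, and the odds-ratio conversion shown is the widely used Zhang and Yu approximation, which the paper does not confirm it used.

```python
def lives_from_arr(arr, cases_per_100k):
    """Absolute risk reduction multiplied by cases per 100 000 population."""
    return arr * cases_per_100k


def lives_from_rrr(rrr, control_event_rate, cases_per_100k):
    """Relative risk reduction x control event rate x cases per 100 000."""
    return rrr * control_event_rate * cases_per_100k


def rrr_from_odds_ratio(odds_ratio, control_event_rate):
    """Convert an odds ratio to a relative risk reduction.

    Assumption: uses the Zhang-Yu approximation
    RR = OR / (1 - CER + CER * OR); the paper does not state its method.
    """
    relative_risk = odds_ratio / (
        1 - control_event_rate + control_event_rate * odds_ratio
    )
    return 1 - relative_risk


if __name__ == "__main__":
    # Figures for indicator DM 18 taken from Appendix 2 of the paper.
    raw = lives_from_rrr(rrr=0.61, control_event_rate=0.0286, cases_per_100k=3650)
    print(f"Raw DM 18 estimate: {raw:.1f} lives per 100 000")
```

With the Appendix 2 inputs, the relative-risk route gives roughly 63.7 lives per 100 000 before any adjustment for baseline performance, exception reporting, or trend.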
Sensitivity analyses
Estimates of aggregate potential health gain need to allow for patients receiving multiple interventions for one disease, and also for comorbidity. Where patients receive multiple drug treatments for one condition, the extra reduction in mortality from each additional drug in one condition varies from zero to the sum of the benefits of all the individual drugs.15 Comorbidity means that the total number of people with any condition is less than the sum of people with each individual condition. For example, in the contract conditions, the sum of the prevalences of the following six chronic conditions: heart disease, stroke, hypertension, diabetes, asthma, and chronic obstructive pulmonary disease (COPD) is 29.3%, whereas the prevalence of all of these conditions combined is only 20.4%.11 These conditions include most of the indicators with potential to save lives.
To allow for these two considerations, mid, higher, and lower estimates of aggregate health gain were constructed. For the main analysis (mid estimate), it was assumed that health gain was additive between different indicators and then the overall health gain was reduced by a factor of 20.4/29.3 to adjust for comorbidity. For the higher estimate, the assumption was made that health gain was additive between indicators, and the possible effects of comorbidity were ignored. For the lower estimate, only the indicator from each domain with the highest health gain was used, and it was assumed that more than one intervention in one disease will not further increase health gain. The health gain was then reduced by a further 20.4/29.3 to account for comorbidity. The health gain from one of the diabetes indicators (DM 7) was ignored throughout, as these patients are already included within the diabetes indicator with better glucose control (DM 6). A worked example for indicator DM 18 is given in Appendix 2.
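The three aggregation rules can be written out as a short Python sketch. The function name, the dictionary layout, and the example figures are hypothetical and chosen only for illustration; the comorbidity factor is the 20.4/29.3 ratio given in the text. Exclusions such as DM 7 are assumed to have been applied to the input beforehand.

```python
# Comorbidity adjustment from the text: combined prevalence of 20.4%
# versus summed prevalence of 29.3% for the six chronic conditions.
COMORBIDITY_FACTOR = 20.4 / 29.3


def aggregate_estimates(gains_by_domain):
    """Return higher, mid, and lower aggregate health-gain estimates.

    gains_by_domain maps a domain name to a list of per-indicator
    gains in lives per 100 000 (hypothetical input layout).
    """
    non_empty = [gains for gains in gains_by_domain.values() if gains]
    additive = sum(sum(gains) for gains in non_empty)
    best_only = sum(max(gains) for gains in non_empty)
    return {
        # Additive across indicators, comorbidity ignored.
        "higher": additive,
        # Additive across indicators, then scaled for comorbidity.
        "mid": additive * COMORBIDITY_FACTOR,
        # Only the best indicator per domain, then scaled for comorbidity.
        "lower": best_only * COMORBIDITY_FACTOR,
    }


if __name__ == "__main__":
    # Hypothetical per-indicator gains, not taken from the paper's tables.
    example = {"heart disease": [2.0, 1.0], "diabetes": [3.0, 0.5]}
    for name, value in aggregate_estimates(example).items():
        print(f"{name}: {value:.2f}")
```

By construction the lower estimate never exceeds the mid estimate, which in turn never exceeds the higher estimate, as long as all gains are non-negative.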
Also, there was evidence that clinical activities in primary care were already improving in quality before the 2004 contract. Data were available for the years 2003–2005 for three domains: asthma, diabetes, and heart disease.8 These data show that care for asthma improved by 14%, of which 2% was predicted by the rate of improvement since 1998, for diabetes by 11%, of which 3% was predicted by the trend, and for coronary heart disease (CHD) by 9%, of which 4% was predicted. In order not to overestimate the health gain attributable to the contract, the estimates of health gain were reduced for these conditions by the amount predicted by the trend (that is, 2/14 in asthma, 3/11 in diabetes, and 4/9 in CHD).
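The trend adjustment amounts to keeping only the share of improvement not predicted by the pre-contract trend. A minimal Python sketch follows; the names `TRENDS`, `attributable_fraction`, and `adjust_for_trend` are hypothetical, while the percentages are those quoted in the text.

```python
from fractions import Fraction

# (total improvement %, improvement predicted by the pre-existing trend %)
TRENDS = {
    "asthma": (14, 2),
    "diabetes": (11, 3),
    "CHD": (9, 4),
}


def attributable_fraction(domain):
    """Share of the improvement not explained by the prior trend."""
    total, predicted = TRENDS[domain]
    return 1 - Fraction(predicted, total)


def adjust_for_trend(gain, domain):
    """Scale a health gain by the attributable share.

    Domains without trend data are returned unchanged, since the paper
    adjusted only where data were available.
    """
    if domain not in TRENDS:
        return gain
    return gain * float(attributable_fraction(domain))


if __name__ == "__main__":
    for domain in TRENDS:
        print(domain, attributable_fraction(domain))
```

For diabetes the attributable share is 8/11, which is the multiplier applied in the Appendix 2 worked example.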
RESULTS
Evidence for reduced mortality was found for 25 of the 80 indicators in the 2004 and 2006 versions of the pay-for-performance contract (Appendix 3). Prevalence, risk reductions, control event rates, baseline activity, exception reporting rates, and threshold targets are displayed in Table 1. Mortality reduction for all included indicators is shown in Table 2.
Table 1 Data used to calculate estimates of mortality reduction for 2004 and 2006 indicators.
Table 2 Potential reduction in mortality for a population of 100 000 people from implementation of the 2004 and 2006 contract indicators: (1) when all eligible patients are treated, and (2) when target thresholds are attained.
In the 2004 contract, the mid estimate was for an additional 11 lives to be saved (lower–upper estimates 7–16) per 100 000 population per year when performance improved from the pre-contract baseline to the level of the targets set for full incentive payment. This represents a saving of an additional 6600 lives in the English population per year (lower–upper estimates 4200–9600). In the 2006 contract, additional mortality reduction fell to zero for a typical practice, as baseline performance had already exceeded the targets set for full incentive payment. This decrease in potential lives saved between 2004 and 2006 is due to substantial improvement in baseline performance between 2003 and 2005 in all but two indicators — beta-blockers in heart disease (CHD 10) and angiotensin-converting enzyme (ACE) inhibitor treatment in diabetic kidney disease (DM 15).
If all eligible patients were to receive treatment, over and above the level set by the target for full payment, then in the 2004 contract the mid estimate was for an additional 56 lives to be saved (lower–upper estimates 29–81) per 100 000 population per year in England, as performance improved from baseline to 100% of eligible patients. In the 2006 contract, this fell to a mid estimate of a potential additional 30 lives saved (lower–upper estimates 20–43) per 100 000 population per year. This equates to a possible saving of approximately 18 000 additional lives per year in England (lower–upper estimates 12 000–25 800).
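The national figures follow from scaling the per-100 000 estimates to the population of England. The sketch below assumes a population of 60 million, a value inferred from the ratio of the reported figures (11 per 100 000 corresponding to 6600 lives) and not stated explicitly in the text; the function name is hypothetical.

```python
# Assumption: population inferred from 6600 lives / 11 per 100 000.
ENGLAND_POPULATION = 60_000_000


def national_lives(lives_per_100k, population=ENGLAND_POPULATION):
    """Scale a per-100 000 estimate to a whole population."""
    return lives_per_100k * population / 100_000


if __name__ == "__main__":
    # 2004 contract, baseline to target: mid, lower, upper estimates.
    print([national_lives(x) for x in (11, 7, 16)])
    # 2006 contract, baseline to 100% of eligible patients.
    print([national_lives(x) for x in (30, 20, 43)])
```

Under that assumption the outputs match the national totals reported in the Results (6600, 4200, and 9600 for the 2004 contract; 18 000, 12 000, and 25 800 for the 2006 contract).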
In terms of individual indicators, the clinical indicators with the greatest potential for mortality reduction if all eligible patients were to receive treatment were primary prevention for hypertension and influenza immunisation (12 and 6 lives respectively in 2006). The domains with the largest potential reduction in mortality were heart disease, diabetes, and primary hypertension, which accounted for 4/5 of the total reductions in 2006.
DISCUSSION
Summary of main findings
Between seven and 16 lives per 100 000 population per year in England were potentially saved, with performance improvement from baseline in 2003 up to the target for full incentive payment. There was no additional health gain in the 2006 version of the contract, since on average the target for full incentive payment had by then been achieved for all clinical quality indicators. However, if performance in the 2006 contract rose to 100% (the maximum realistic level at which all eligible — not exception-reported — patients were treated), then this would potentially save 30 lives per 100 000 population.
Strengths and limitations of the study
This study is the first to estimate English population health gain in the pay-for-performance contract. Pre-contract baseline performance is estimated across a substantial proportion of the indicators in the contract for the first time, derived from studies in English practices in 2003. The clinical effectiveness of indicators is derived from the highest quality evidence available, usually controlled trials. Recent (2006–2007) and reliable prevalence data from the contract were used to estimate health gain,11 and an adjustment was made for comorbidity.
The main limitation of the study is the difficulty in attributing the changes in performance over time to the intervention of the contract. The strongest method for attributing causation is the randomised controlled trial, but the contract was not introduced as a research intervention. It was a rapidly introduced policy initiative, so this research attempts to add to the observational knowledge base about the possible past effects and future suggested modifications to the contract. Trend data for the domains of asthma, diabetes, and CHD suggest that performance in primary care was improving before the contract began.8 If underlying trends for quality improvement existed in more domains, and if it is assumed that practices would have continued their considerable efforts to maintain and improve performance, then it is possible that much of the health gain in the contract would have happened anyway without the incentive scheme. The study results allowed for the effects of trends where data were available, but debate about the effect of any underlying quality-improvement trends is less relevant in the light of the study finding that the 2006 contract was likely to have produced zero additional health gain due to the higher baseline performance.
A second limitation is that the measure of health gain used (mortality reduction per year) is narrow, and is only available for 25 indicators. However, these 25 indicators are important in terms of health gain, and are dependent on a further 18 structural and process indicators in the clinical domains being met. The study also searched for full published evidence for quality-adjusted life years (QALYs), but this could only be found for nine indicators. Third, while some estimates for baseline performance were based on a large database of over 3.5 million people, others were based on smaller studies. Furthermore, baseline data were not available for two indicators in each version of the contract. However, these indicators were associated with relatively small health gains per percentage point improvement, so their exclusion from the overall population health-gain estimates is unlikely to affect the study conclusions. Fourth, the study used evidence on performance for a typical practice, and there will have been some practices with baseline performance below the target for full incentive in 2006 that had potential for health gain, albeit small. Fifth, there is no agreed method for aggregating estimates of mortality reduction in patients with more than one coexisting treated condition. This was dealt with by adjusting for comorbidity, and by constructing upper and lower estimates using different assumptions. Finally, the prevalence data were taken from QOF returns from each practice in England and these data are subject to validation checks by the primary care trusts. However, they may still include inaccuracies and may overestimate or underestimate true prevalence, and therefore health gain.
Comparisons with existing literature
Two studies prior to the introduction of the contract evaluated the health-impact gain for a subset of eight and five interventions respectively in primary care.6,16 The risk reductions in all-cause mortality were similar to those found in the present study.
Implications for future research and clinical practice
It may not be cost-effective to leave the same indicators in a pay-for-performance scheme from year to year as performance improves. Adding new indicators while retiring older ones may lead to greater health gain — although only of course if performance is maintained in the old indicators — a hypothesis that remains to be tested. Information on baseline performance and health gain could be used to inform decisions about retiring less-effective indicators, and weighting the size of financial incentives in order to maximise potential health gain for the population. Targets for full payment may need to be revised upwards over time as performance improves — and arguably could simply be set at 100% as long as appropriate procedures for exception reporting are in place.
The methods used to estimate potential reduction in mortality from implementation of the pay-for-performance contract could be extended to other clinical interventions, and used to identify new clinical indicators. As an example, an electronic search of the British Medical Association's ‘Clinical Evidence’ database was conducted, to identify indicators with evidence for mortality reduction that could be used in the primary care setting.17 Six interventions were identified: a Mediterranean diet for secondary prevention of heart disease (RRR [relative risk reduction] 44%),18 beta-blockers for heart failure (RRR 38%),19 smoking cessation in primary prevention (RRR 30%),20 oily fish diet for secondary prevention of heart disease (RRR 26%),21 spironolactone for heart failure (RRR 24%),22 and cardiac rehabilitation (odds ratio [OR] 0.80).23 Further research could use broader measures of health gain such as QALYs. This method could also apply to other areas of care that have the potential to reduce high morbidity and mortality, such as early diagnosis of cancer. According to one study, improving performance in earlier cancer diagnosis could save up to 15 000 lives a year in the UK.24
The pay-for-performance scheme in primary care had substantial potential for mortality reduction, although this was limited by much good-quality care giving a high baseline performance, and by some targets being set below current baseline performance for most practices. The focus on workload may have been right at the time, in the context of a pay award for GPs that was also intended as an investment in practice infrastructure.2 As the framework evolves, it would be sensible to consider using measures of health gain for selecting indicators, as this would bring the principles of cost-effectiveness and evidence-based policy making more directly into the future development of pay-for-performance programmes.
Appendix 1. Criteria for exception reporting
The following criteria have been agreed for exception reporting:2
Patients who have been recorded as refusing to attend review who have been invited on at least 3 occasions during the preceding 12 months
Particular circumstances, for example, terminal illness, extreme frailty
Patients newly diagnosed within the practice or who have recently registered with the practice, who should have measurements made within 3 months and delivery of clinical standards within 9 months; for example, blood pressure or cholesterol measurements within target levels
Patients who are on maximum tolerated doses of medication whose levels remain suboptimal
Patients for whom prescribing a medication is not clinically appropriate, for example, those who have an allergy or another contraindication, or have experienced an adverse reaction
Where a patient has not tolerated medication
Where a patient does not agree to investigation or treatment (informed dissent), and this has been recorded in their medical records
Where the patient has a supervening condition which makes treatment of their condition inappropriate, for example cholesterol reduction where the patient has liver disease
Where an investigative service or secondary care service is unavailable
Appendix 2. Worked example of calculating health gain
Indicator DM 18 about influenza immunisation in diabetes in 2004 is used here as a worked example to demonstrate how potential mortality reduction was calculated.
The potential number of lives that could be saved per year in a population of 100 000 is a product of the prevalence of diabetes in that population (3650), the relative risk reduction in mortality from influenza immunisation (61%) and mortality in the unimmunised population (2.86%). This calculation is 3650 × 0.61 × 0.0286 = 63.7 lives per 100 000.
However, the potential health gain from baseline is only a fraction of this raw estimate, because baseline performance (immunisation) was 70%, and 15% of patients were exception reported. The maximum realistic percentage of all patients is thus 85% (100% of eligible patients after excluding the 15% exception reported), and the maximum potential gain in performance is 15% (the ceiling of 85% minus baseline performance of 70%). So the maximum potential health gain from baseline to ceiling performance is 9.55 lives per 100 000, that is, 15% of the raw estimate of 63.7 lives per 100 000. To adjust for trends for improved performance that was already happening, this figure is multiplied by 8/11, that is, 6.94 lives saved.
The potential health gain from baseline to target performance is even smaller, since target performance is set at 85% of the maximum possible percentage of eligible patients (also 85%), which gives an effective target of 72% of the total population (0.85 × 0.85). With baseline performance at 70%, the potential health gain is 2% × 63.7 = 1.27 lives per 100 000 population. This figure was reduced by 3/11 to reflect the trend in improvement that was already occurring, resulting in a potential health gain of 0.93 lives per 100 000 population.
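The arithmetic of this worked example can be checked with a short Python sketch. The function and variable names are hypothetical; the inputs are the DM 18 figures quoted above. Note one assumption made to follow the text: the effective target of 0.85 × 0.85 = 0.7225 is rounded to 72% before the baseline is subtracted, as in the appendix. Without that rounding the gain to target would be somewhat larger.

```python
def dm18_worked_example():
    """Reproduce the Appendix 2 calculation for indicator DM 18."""
    cases_per_100k = 3650         # people with diabetes per 100 000
    rrr = 0.61                    # relative risk reduction from immunisation
    control_mortality = 0.0286    # mortality in the unimmunised population
    baseline = 0.70               # baseline immunisation performance
    exception_rate = 0.15         # share of patients exception reported
    target_among_eligible = 0.85  # target for full payment
    trend_factor = 8 / 11         # diabetes: 3/11 explained by prior trend

    raw = cases_per_100k * rrr * control_mortality
    ceiling = 1 - exception_rate  # maximum realistic share of all patients
    to_ceiling = raw * (ceiling - baseline)

    # Assumption: round the effective target to 72%, as the text does.
    effective_target = round(target_among_eligible * ceiling, 2)
    to_target = raw * (effective_target - baseline)

    return {
        "raw": raw,
        "to_ceiling": to_ceiling,
        "to_ceiling_adjusted": to_ceiling * trend_factor,
        "to_target": to_target,
        "to_target_adjusted": to_target * trend_factor,
    }


if __name__ == "__main__":
    for name, value in dm18_worked_example().items():
        print(f"{name}: {value:.2f}")
```

The results agree with the figures in this appendix to within rounding of the intermediate values.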
Appendix
Appendix 3 Full description of indicators.
Notes
Funding body
This paper expands on data originally published in a report commissioned and funded by the Policy Research Programme in the Department of Health, to which all authors contributed. The views expressed in the original report are not necessarily those of the Department of Health.
Ethical approval
Not needed.
Competing interests
The authors' work for this paper was independent of the Department of Health. Richard Cookson and Nicholas Steel sit on National Institute for Health and Clinical Excellence advisory committees.
- Received October 23, 2009.
- Revision received December 16, 2009.
- Accepted April 1, 2010.
- © British Journal of General Practice, 2010.