Routine monitoring of UK GPs' mortality rates has been recommended by the Shipman Inquiry, and is likely to be implemented soon.1-3 In this Journal, Mohammed et al4 are to be applauded for their rigorous attempt to address the potential problems of such monitoring.2,3 In particular, they describe the application of structured investigation to practices with unexpectedly high or low mortality rates, which is a potential model for any national system.5 Ultimately, though, many uncertainties remain.
Crucially, what mortality monitoring is intended to achieve needs to be clearly articulated, and reflected in monitoring system design. The two purposes usually identified are, first, to deter or detect murderous clinicians or those dangerous through ignorance, incompetence or illness, and, secondly, to improve the overall quality of care.2,3,6 Whether any system can achieve the first of these is to some extent unknowable, but the Northern Ireland pilot left considerable space for the unscrupulous to avoid detection. One-quarter of practices were excluded from the primary cross-sectional analysis because practice mergers or splits meant they lacked data for a long enough period, and participation in investigation was voluntary. Additionally, mortality rates were monitored at practice level, which reduces the ability to detect individual doctors with high mortality, and will not detect dangerous doctors working out-of-hours or as locums. However, it remains uncertain whether a monitoring system can attribute mortality reliably to individual doctors under current patterns of practice and NHS organisation,3 and monitoring individual GPs in any case ignores the possibility of dangerous nurses or teams. Monitoring mortality at practice level therefore seems preferable until alternatives are proved to be feasible.
The authors say that their ultimate aim was to improve quality of care, but at present there is no evidence that this happened, nor a clear account of how it might. Given that mortality rates in general practice are low and most deaths are not preventable, we cannot assume that simply discussing mortality rates will in itself improve quality. However, it is plausible that it may do so if used alongside complementary approaches, including monitoring processes and intermediate outcomes known to be associated with higher mortality,7 or significant event analysis of selected cases to examine systematically the care given in the period before death.
Although the Northern Ireland system can inform the debate, a number of other questions remain about how a national system should operate. First, it is uncertain which statistical monitoring tool is most appropriate. The Northern Ireland pilot used two kinds of Shewhart control charts. Their key advantage is that they are simple to explain and interpret. Others have proposed control charts using cumulative data, which are more sensitive at detecting individuals and/or practices with high or low mortality, but harder to interpret.6 Which is ‘better’ may depend on the purpose being addressed, with Shewhart charts perhaps more appropriate for quality improvement work that requires discussion and engagement with clinicians, and charts using cumulative data superior for signalling cause for concern.3
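The difference in sensitivity can be made concrete with a toy sketch. The following code, with entirely invented numbers (baseline mortality rate, list size, and yearly death counts are assumptions, not data from the pilot), compares a Shewhart-style chart, which flags only a single year far outside expectation, with a one-sided cumulative sum (CUSUM) chart, which accumulates small but sustained excesses until they cross a decision threshold.

```python
# Toy comparison of a Shewhart-style chart with a cumulative (CUSUM-style)
# chart on simulated yearly practice death counts. All numbers are invented
# for illustration only.
import math

baseline_rate = 0.01   # assumed annual death rate per registered patient
list_size = 2000       # assumed practice list size
# Simulated yearly death counts: six years near baseline, then a modest
# sustained excess in years 7-10.
observed = [19, 22, 18, 21, 20, 17, 26, 27, 28, 29]

expected = baseline_rate * list_size          # 20 deaths/year expected
sd = math.sqrt(expected)                      # Poisson approximation

shewhart_signals = []
cusum = 0.0
cusum_signals = []
k = 0.5 * sd   # CUSUM reference value (allowance per year)
h = 4.0 * sd   # CUSUM decision threshold
for year, obs in enumerate(observed, start=1):
    # Shewhart chart: flag any single year beyond 3 SD of expectation.
    if abs(obs - expected) > 3 * sd:
        shewhart_signals.append(year)
    # One-sided CUSUM: accumulate excess deaths above expectation + allowance.
    cusum = max(0.0, cusum + (obs - expected) - k)
    if cusum > h:
        cusum_signals.append(year)

print("Shewhart signals in years:", shewhart_signals)  # → []
print("CUSUM signals in years:", cusum_signals)        # → [10]
```

No single year here strays beyond three standard deviations, so the Shewhart chart stays silent; the CUSUM chart, by pooling four years of modest excess, signals in year 10. The same accumulation that makes the CUSUM chart sensitive also makes any one point on it harder to discuss with clinicians, which is the trade-off the text describes.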
Secondly, it is uncertain where to set the threshold for triggering an alarm signal. The Northern Ireland system used control limits set at three standard deviations. This follows Shewhart's convention, which has been found to be useful for repeated measurements of single, industrial processes over time. Whether it is appropriate for cross-sectional comparisons of mortality rates with inevitably imperfect case-mix adjustment is uncertain. It is notable that cardiac surgery mortality monitoring has used wider control limits because of the presence of unmeasured case-mix heterogeneity. Wider limits reduce the risk of falsely labelling a surgeon as showing special cause variation, but at the cost of increasing the risk of failing to identify a surgeon with true cause for concern.8-10 Case-mix is more complex in general practice, and adjustment less comprehensive, which would imply wider limits still, although other ways of accounting for unmeasured case-mix heterogeneity, such as adjusting for the resulting over-dispersion in the data, may be more appropriate.3,6 At a minimum, the utility of Shewhart's convention in this context, versus other approaches to setting control limits such as simulation, requires further prospective examination.3,6
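One simple way of adjusting for over-dispersion, sketched below with invented observed and expected death counts for a handful of hypothetical practices, is to estimate how much wider than Poisson the spread of standardised residuals is, and inflate the control limits accordingly. This is an illustrative sketch of the general idea, not a description of any published method.

```python
# Illustrative sketch: widening 3-SD control limits to allow for
# over-dispersion from unmeasured case-mix, via a simple multiplicative
# inflation of the Poisson standard error. All figures are invented.
import math

# (observed deaths, case-mix-expected deaths) for six hypothetical practices
practices = [(24, 20.0), (31, 25.5), (18, 22.0), (42, 26.0), (27, 30.0), (22, 19.5)]

# Standardised residual per practice under the Poisson assumption: (O - E)/sqrt(E)
z = [(o - e) / math.sqrt(e) for o, e in practices]

# Crude over-dispersion factor: mean squared residual (about 1.0 if Poisson holds).
phi = sum(zi * zi for zi in z) / len(z)
inflation = math.sqrt(max(phi, 1.0))   # never tighten limits below Poisson

flagged_poisson = [i for i, zi in enumerate(z) if abs(zi) > 3]
flagged_inflated = [i for i, zi in enumerate(z) if abs(zi) > 3 * inflation]

print(f"over-dispersion factor phi = {phi:.2f}")
print("flagged at 3 SD (Poisson):  ", flagged_poisson)   # → [3]
print("flagged at 3 SD (inflated): ", flagged_inflated)  # → []
```

In this toy example the fourth practice exceeds the naive 3-SD limit, but once the limits are widened to reflect the excess spread across all practices, it no longer signals: the inflated limits treat that spread as unmeasured case-mix rather than special cause.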
Thirdly, whatever statistical technique is used and wherever the alarm threshold is set, this unmeasured case-mix variation means that any system sensitive enough to detect the few practitioners where there is true cause for concern will inevitably also generate false alarms. In Northern Ireland, nearly half of practices showed special cause variation using crude mortality, and 16% did so after adjustment for age, sex, and socioeconomic status. In every such practice examined, the signal had a plausible explanation in the imperfections of the original case-mix adjustment. This emphasises both the importance of explicitly comparing different case-mix adjustment systems to find the most appropriate, and the considerable resource implications of investigation.
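The scale of the problem can be seen from a back-of-envelope calculation. Under a normal approximation, chance alone puts roughly 0.3% of practices outside 3-SD limits; the 16% signal rate reported above is some 60 times that. The practice count below is an assumption for illustration.

```python
# Back-of-envelope sketch: 3-SD 'signals' expected from chance alone versus
# the observed rate. The practice count is invented for illustration; the
# 16% figure comes from the Northern Ireland results discussed in the text.
import math

def two_sided_tail(z: float) -> float:
    """P(|Z| > z) for a standard normal Z, via the complementary error function."""
    return math.erfc(z / math.sqrt(2))

n_practices = 100                      # assumed number of monitored practices
chance_rate = two_sided_tail(3.0)      # ≈ 0.0027
expected_false_alarms = n_practices * chance_rate
observed_signals = 0.16 * n_practices  # 16% signalled after adjustment

print(f"chance alone: {expected_false_alarms:.1f} signals expected")  # ≈ 0.3
print(f"observed:     {observed_signals:.0f} signals")                # 16
# The gap between ~0.3 and 16 signals is the footprint of over-dispersion:
# variation beyond chance that the case-mix adjustment did not capture.
```

Since every one of these excess signals must either be investigated or explained away, the resource cost of investigation scales with this gap, not with the handful of practices where there is true cause for concern.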
Finally, all monitoring systems have unintended, and potentially adverse, consequences. With the advent of the Freedom of Information Act, publication of mortality data is inevitable,10-12 significantly increasing the risk of damage to morale and recruitment through the stigmatisation of practices investigated for a false alarm. The resource implications of investigation are also likely to be significant. There may also be adverse consequences for patients. In industry, variation in raw material inputs is a special cause to be eliminated. In contrast, ‘eliminating’ case-mix variation in health care by selective registration is undesirable.13 An individual practice can lower its measured mortality by ceasing to register patients in nursing homes, but such selection would be likely to reduce access to care for a vulnerable group. Decisions about what data will be publicly available, at what point in any investigation they are released, and the way in which they are presented will be crucial in mitigating such effects.
The considerable uncertainties that this study highlights emphasise the need for more systematic exploration of the choices that have to be made in creating a national system of mortality monitoring. It is necessary to articulate clearly the purposes of monitoring, and to design appropriate statistical tools and appropriate responses to alarm signals for each of these purposes. Designing the right statistical tool will require work to examine the feasibility of monitoring individual clinicians, including locums and nurses, which case-mix adjustment system is most appropriate, and where alarm thresholds should be set. Shewhart charts and the case-mix adjustment used in Northern Ireland are one model, but what is needed is direct comparison of the application of different techniques to the same data. The systematic pyramid model of investigation of alarm signals used in Northern Ireland is attractive, but it remains unclear whether the forensic investigation of potential murderers and supportive quality improvement should be separate processes, or can be carried out simultaneously.2
We do not know whether mortality monitoring in general practice can deter or detect murderers, detect the non-malicious incompetent, or help raise the quality of care in all practices. The Shipman case demands a response, but the worst of all worlds is an expensive monitoring system that does not work. Neither developing an effective system nor establishing with certainty that a system cannot serve a particular purpose can be achieved through piecemeal local implementation. It needs national commitment, coordination and resources3 to create a large-scale prospective pilot across a number of primary care trust areas, to compare different monitoring and investigation systems explicitly, and with external evaluation to check that mortality monitoring actually does serve its intended purposes without major adverse consequences. If mortality monitoring fails such a real-world test, we should abandon it and seek a more suitable alternative.
- © British Journal of General Practice, 2005.