The call for transparency about the quality of care provided to patients has become stronger in recent years in most developed countries. Society demands that care providers, GPs included, account for their performance, and is increasingly prepared to pay for better care. This move towards accountability is the subject of critical debate, including questions about its impact on improving practice and claims that it threatens professionalism, promotes strategic behaviour and even fraud, and increases bureaucracy and the costs of health care.1–3
As public reporting of performance and pay for performance are becoming more of a reality in health care, it is important to observe and evaluate these new ‘accountability approaches’.
Seen from the other side of the North Sea, the pay-for-performance initiative in general practice in the UK (Quality and Outcomes Framework [QOF] and GMS contract) is one of the most interesting quality improvement experiments for GPs in the world today. We in the rest of Europe learn from this experiment. However, we sometimes do things differently, and also ask ourselves why the UK does not take on board some of the relevant experiences gained within Europe on measuring performance in general practice.
One of the crucial issues related to any performance assessment, whatever its objectives, is the validity and reliability of the indicators and measurement instruments used. In particular, when indicators are used for comparison, incentives, or certification, they should be ‘correct and fair’ and avoid unjustified harm to practitioners.4 Indicators should meet the highest standards for quality, their features should be tested, and the scores should be corrected for case-mix differences. Those assessed should be able to check the data before others use them.
There is a vast literature on criteria for indicators and instruments,5,6 including papers focused on measuring patients' views of care.7 However, in different countries many current initiatives related to performance assessment, public reporting, and pay for performance fail to meet such requirements. Based on a large number of projects undertaken in the Netherlands and other European countries, we identified some steps in the development and validation of performance indicators and instruments to measure performance in primary care:
Development of a logical and consistent framework covering all aspects of the field to be assessed with all relevant stakeholders involved. This framework needs to assure good coverage of the field and acceptance by all stakeholders. Strategic behaviour, such as aiming at high performance for only those aspects of care that are measured, needs to be prevented. For instance, indicators reflecting patient experiences need to cover a variety of aspects of primary care services, including accessibility of the practice, professional performance in consultations with the patient, and involvement of patients in decision making and in debates about practice development. Indicators for patient experiences should be developed together with patients to ensure that they reflect patients' priorities.
Specific indicators can be derived from the vast international literature on indicators, evidence-based clinical guidelines, and other sources. A rigorous procedure (for example, the RAND modified Delphi method) is required to select the most relevant indicators. The selection of indicators should be guided by the goals of performance assessment. For example, a review identified six instruments for measuring patient views in the context of formative feedback.8 In most of our indicator projects we observed that many ‘obvious’ indicators do not meet the criteria for relevance and quality. Therefore, indicators need to be selected using robust methods.
Feasible and reliable instruments need to be selected or developed to collect data and measure the indicators. Decisions are required as to whether new instruments should be developed and what type of instruments are suitable (for example, checklists, questionnaires, observations, prospective self-recording).
A test in real practice, such as an audit, is a crucial step to determine whether instruments and their items meet criteria for construct and criterion validity, reliability, case-mix control, sensitivity to change, and acceptability to those assessed. As this test will normally lead to a further reduction of the indicators selected, it is unwise to implement the indicators widely before it has been carried out.
Finally, it is crucial to consider the impact that indicators will have on the delivery of care after wide-scale implementation. Even a rigorously developed indicator may have undesired side effects or show little room for improvement.
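To make the reliability criterion in the practice-test step concrete, the following is a minimal sketch of one commonly used check, Cronbach's alpha for the internal consistency of a multi-item questionnaire. The choice of this statistic, the function name `cronbach_alpha`, and the example scores are assumptions made for illustration; they are not taken from the studies discussed here, and a full validation would also need to address validity, case-mix control, and sensitivity to change.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Return Cronbach's alpha for a list of respondents.

    `responses` is a list of rows, one per respondent, each row holding
    that respondent's numeric scores on the same k questionnaire items.
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer every item")

    # Variance of each item across respondents (columns of the table).
    item_variances = [variance(column) for column in zip(*responses)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])
    if total_variance == 0:
        raise ValueError("total scores do not vary; alpha is undefined")

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores: three respondents, two items on a 1-5 scale.
example_scores = [[1, 2], [2, 1], [3, 3]]
alpha = cronbach_alpha(example_scores)
```

A value near 1 indicates that the items vary together across respondents; the threshold regarded as acceptable depends on how the scores will be used, and is likely to be stricter when they feed into payment or certification.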
In this issue of the BJGP two articles provide evidence of the failure to meet these requirements in some widely used instruments for measuring patient experiences with health care. Garratt et al9 reviewed four questionnaires to measure patient experiences and satisfaction with primary care out-of-hours services: two from the UK and two from the Netherlands. They conclude that all questionnaires had limitations regarding validation.
The article by Hankins et al10 in this issue of the Journal reviewed the value of instruments now used as part of the QOF. It concludes that the two instruments approved by the QOF, which are used to determine part of the financial bonus, appear to have insufficient research evidence of validity and reliability.
Serious consideration of these issues in the ongoing revision of the QOF indicators is essential in our opinion. Current developments outside the UK also need to be taken into consideration. A critical question could be why the QOF has not yet adopted some of the indicators and instruments developed in a European research collaboration in primary care (the TOPAS-Europe association).
The EUROPEP instrument, developed by TOPAS, is probably the most widely used and best validated patient satisfaction instrument for general practice available. This tool was developed and validated using a rigorous procedure. This included a needs assessment of thousands of patients in 11 European countries, including the UK, which showed many similarities across countries in patient expectations of primary care.11 A systematic process was used to develop the instrument (framework development, inventory of available indicators and items, and a selection process), and there has been wide testing in 16 European countries, including the UK.12–14
The EUROPEP is now used as the national instrument to measure patient experience with primary care in some countries, such as Denmark and Switzerland. After several years of use, the EUROPEP has been recently modified by an international working group and a user manual has been developed. It will be tested again in some countries before wider implementation is recommended.
The instrument is short (23 items) and makes international comparison possible. Will the UK continue to drive on the left side of the patient experience road, or harmonize with the rest of Europe in the use of this patient satisfaction instrument?
EUROPEP is used in conjunction with another important instrument that could be included in the QOF: the European Practice Assessment (EPA).15–17 The EPA focuses on practice management and organisation and was developed using the same rigorous validation methods as the EUROPEP. It was developed within a collaboration between researchers and practitioners from general practice from 10 European countries including the UK. In addition to questions for GPs and nurses and visitor observations, the EPA contains a number of questions for patients about factual experiences with their primary care practice; for example, concerning their experiences with accessibility or coordination of care.
It would be great if UK general practice could compare itself with achievements in other countries using such standardised instruments, to see where improvements in measuring and enhancing services for their patients are possible.
An interesting finding from a small group of practices in different countries was that the UK has the highest scores for the topic ‘incident reporting’ (an indicator in the QOF), while it has the lowest score on ‘essential drugs in doctors' emergency bag’ (not in the QOF).17
The newest development of the TOPAS collaboration concerns the validation and testing of a set of European primary care indicators related to cardiovascular risk and disease management (EPA-Cardio) in nine European countries. The first publication of the indicators and their validity and use in practice will soon follow.
Many quality assessment schemes for primary care have been developed around the world in the last decade. The QOF is one of the most intriguing and ambitious ones. However, the validation of the indicators and instruments used (for example, for measuring patient experiences) may be improved. Collaboration and exchange of developments and expertise between different countries could help to take this difficult undertaking forward, and make harmonization and comparison of measures and data between countries possible.
© British Journal of General Practice, 2007.