Quantifying public preferences for different bowel preparation options prior to screening CT colonography: a discrete choice experiment
  1. Alex Ghanouni1,
  2. Steve Halligan2,
  3. Stuart A Taylor2,
  4. Darren Boone2,
  5. Andrew Plumb2,
  6. Sandro Stoffel3,
  7. Stephen Morris4,
  8. Guiqing Lily Yao5,
  9. Shihua Zhu5,
  10. Richard Lilford5,
  11. Jane Wardle1,
  12. Christian von Wagner1
  1. 1Department of Epidemiology and Public Health, University College London, London, UK
  2. 2Centre for Medical Imaging, University College London, London, UK
  3. 3Institute for Health and Consumer Protection, European Commission, Joint Research Centre, Ispra, Italy
  4. 4Department of Applied Health Research, University College London, London, UK
  5. 5Department of Public Health, Epidemiology and Biostatistics, University of Birmingham, Birmingham, UK
  Correspondence to Dr Christian von Wagner; c.wagner{at}ucl.ac.uk

Abstract

Objectives CT colonography (CTC) may be an acceptable test for colorectal cancer screening but bowel preparation can be a barrier to uptake. This study tested the hypothesis that prospective screening invitees would prefer full-laxative preparation with higher sensitivity and specificity for polyps, despite greater burden, over less burdensome reduced-laxative or non-laxative alternatives with lower sensitivity and specificity.

Design Discrete choice experiment.

Setting Online, web-based survey.

Participants 2819 adults (45–54 years) from the UK responded to an online invitation to take part in a cancer screening study. Quota sampling ensured that the sample reflected key demographics of the target population and had no relevant bowel disease or medical qualifications. The analysis comprised 607 participants.

Interventions After receiving information about screening and CTC, participants completed 3–4 choice scenarios. Scenarios showed two hypothetical forms of CTC with different permutations of three attributes: preparation, sensitivity and specificity for polyps.

Primary outcome measures Participants considered the trade-offs in each scenario and stated their preferred test (or chose neither).

Results Preparation and sensitivity for polyps were both significant predictors of preferences (coefficients: −3.834 to −6.346 for preparation, 0.207–0.257 for sensitivity; p<0.0005). These attributes predicted preferences to a similar extent. Realistic specificity values were non-significant (−0.002 to 0.025; p=0.953). Contrary to our hypothesis, probabilities of selecting tests were similar for realistic forms of full-laxative, reduced-laxative and non-laxative preparations (0.362–0.421). However, they were substantially higher for hypothetical improved forms of reduced-laxative or non-laxative preparations with better sensitivity for polyps (0.584–0.837).

Conclusions Uptake of CTC following non-laxative or reduced-laxative preparations is unlikely to be greater than following full-laxative preparation as perceived gains from reduced burden may be diminished by reduced sensitivity. However, both attributes are important so a more sensitive form of reduced-laxative or non-laxative preparation might improve uptake substantially.

  • Preventive Medicine
  • Public Health
  • Radiology & Imaging

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/

Strengths and limitations of this study

  • This is the first quantitative study to investigate whether future screening invitees would prefer a less burdensome preparation experience or a more sensitive and specific CT colonography for colorectal cancer screening.

  • To facilitate informed decision-making, participants were provided with comprehensive information on test practicalities, the probability that precancerous polyps could turn into cancer and the prevalence of polyps.

  • Preferences were obtained in an unfamiliar, hypothetical decision-making context, which may affect the validity of responses.

  • It was necessary to use a heterogeneous and limited literature in order to make assumptions about realistic levels of sensitivity and specificity for different methods of preparation.

Introduction

CT colonography (CTC) has been recommended as a screening test for colorectal cancer (CRC).1 ,2 It is capable of high sensitivity,3 ,4 potentially reducing CRC mortality and incidence.5 In this respect, it may represent an improvement on the only widely available method of screening in the UK (guaiac faecal occult blood testing), which has yet to demonstrate preventative potential with an uptake level of 57% and a threshold for positivity of 5–6 abnormal samples out of a possible six.6 CTC is less invasive, and often preferred by individuals undergoing screening, compared with the ‘gold-standard’ whole-colon test of colonoscopy.7

CTC typically requires patients to undergo bowel purgation beforehand, and this is frequently reported to be the worst part of the investigation.8 ,9 However, this aspect of patients’ experience can be ameliorated by offering reduced-laxative options (eg, diatrizoic acid, the purgative effects of which cause only mild diarrhoea) or a non-laxative preparation (eg, barium sulfate, which has no purgative effect) as an alternative to standard, full-laxative methods (eg, polyethylene glycol).7 ,10–12 Reducing the burden of preparation may improve patients’ expected satisfaction with the test experience13 and reduce perceived barriers, potentially increasing uptake and thereby population health benefits.14–16 This rationale underpinned the decision to offer CTC with reduced-laxative preparation in a trial comparing CTC with colonoscopy.17

A disadvantage of decreased purgation is the likely reduction in not only test sensitivity but also specificity for polyps.18 Previous research has found that patients value sensitivity highly in screening and diagnostic tests, and it can be prioritised over burden.19–21 A recent study gave staged information (in lay language) on screening CTC, the practicalities of different bowel preparations, and the estimated sensitivity and specificity of each method. Participants’ final preferences favoured a full-laxative preparation because of its superior sensitivity and specificity, even though the non-laxative bowel preparation was favoured for its lower physical and lifestyle effects.22 This led to the hypothesis that when making screening decisions, people would prefer a full-laxative preparation for CTC for its sensitivity and specificity, despite the greater burden.

The present study tested this hypothesis on a larger, more representative sample of adults approaching the age for being invited to the UK CRC screening programme, using a discrete choice experiment (DCE). DCEs allow (1) the value placed on key modifiable attributes of CTC (preparation, sensitivity and specificity, in this case) to be compared and (2) predictions of the probability that participants choose a test, which allows an estimate of which method of delivering CTC would achieve the highest level of uptake if offered for screening.

Method

Discrete choice experiments

DCEs have been applied to numerous healthcare contexts,23 including bowel cancer screening,24 although these have assessed preferences for different bowel screening tests and not modifiable attributes of a single, specific investigation. DCEs are based on the premise that an aspect of healthcare may be defined by several key attributes (eg, bowel preparation), and each attribute may take one of several levels (eg, non-laxative, reduced-laxative or full-laxative preparation). A DCE generates a number of hypothetical options which are presented side-by-side for participants to compare and then state which option they prefer (see figure 1 for an example). Each participant answers several questions in this format, and responses can then be analysed to determine the value placed on each attribute, and the attribute that is valued most overall.25 The data can also be used to estimate which form of testing (real or hypothetical) would achieve the highest uptake by asking participants about their willingness to be screened with different methods (eg, Marshall et al).21

Figure 1

Example of both stages of a choice scenario.

Attribute and level selection

The study was designed in accordance with the best practice guidelines for DCEs26 and following a review of the strengths and weaknesses of the literature.24 The three attributes selected were (1) the method of bowel preparation (specifically the intensity of the laxative effect), (2) test sensitivity for ≥10 mm precancerous polyps and (3) test specificity. Selected levels for sensitivity, specificity (presented in table 1) and other statistics were based on realistic values for each method of preparation using the existing literature4 ,27–34 as well as expert radiologists’ opinions. The use of these levels was supported by a previous interview study which found that patients’ preferences were responsive to this range of values.22

Table 1

Attributes and levels, as they were described to participants

Selection of scenarios

Three attributes, each with three levels, generate 27 (3³) possible scenarios. Paired comparisons of every scenario with every other scenario would result in 702 (27² − 27) comparisons. Since this was not a feasible number of scenarios for participants to complete, an efficient fractional factorial design was used, based on the number of attributes and attribute levels, to select a subset of comparisons from the full list. A 100% D-efficient design was achievable with 18 choice scenarios, three of which were ‘rationality’ tests (ie, scenarios in which one test is unequivocally preferable to the alternative on all attributes; these act as checks for ‘rational’ responding, which is an indicator of understanding and engaging with the task). The selection of choice sets ensured that all levels were represented with the same frequency (level balance), that participants did not have to choose between alternatives with similar levels of an attribute (minimal overlap), and that the levels presented for each attribute were uncorrelated (orthogonality). Initially, we used a block design to divide these scenarios into three sets so that each participant would be presented with six scenarios. However, following pilot testing, this number of scenarios was found to cause unacceptable participant burden. Therefore, we redesigned the choice scenarios into six sets. Three sets included one of the three rationality tests and the remaining three had one of the three rationality tests added by the researchers, meaning that each participant would be presented with only three or four scenarios in total. The design was generated using SAS V.9.2 (Market Expo Macros, Cary, North Carolina, USA). The full list of choice scenarios and questionnaire versions is provided in online supplementary appendix 1.
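
The two counts at the start of this section can be checked with a short enumeration. This is a minimal sketch and not the SAS design code used in the study: the level labels for sensitivity and specificity are placeholders (table 1 gives the actual wording), and the code only counts the full factorial; it does not construct the D-efficient subset of 18 scenarios.

```python
from itertools import permutations, product

# Placeholder level labels; only the number of levels (three each) matters here.
preparation = ["non-laxative", "reduced-laxative", "full-laxative"]
sensitivity = ["low", "middle", "high"]
specificity = ["low", "middle", "high"]

# Every combination of one level per attribute: 3 x 3 x 3 profiles.
profiles = list(product(preparation, sensitivity, specificity))

# Every ordered pairing of two different profiles (test A vs test B).
ordered_pairs = list(permutations(profiles, 2))

print(len(profiles))       # 27
print(len(ordered_pairs))  # 702
```

Counting ordered pairs (A vs B treated as distinct from B vs A) is what reproduces the 27² − 27 figure; unordered pairs would give half that number.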

Participants

Following ethical approval, potential participants were recruited from online survey panels located in the UK by Survey Sampling International (SSI, London, UK). Members of the general population who have signed up to such panels receive invitations to participate in online studies in exchange for minor rewards such as air miles. They were also offered a lay summary of the study results.

Participants who responded to the initial invitation from SSI completed a set of demographic questions which excluded them if they reported being: (1) outside the target age range of 45–54 years; (2) medically qualified doctors or nurses; (3) previously diagnosed with bowel cancer, ulcerative colitis or Crohn's disease or (4) part of a quota that was already sufficiently well represented in the sample. Recruitment quotas were set so that the study sample would resemble the population of interest, that is, members of the general population approaching screening age, in terms of key characteristics. Statistics used to define the population of interest were obtained from the most recent census of England and Wales available at the time (2001) via InFuse (http://infuse.mimas.ac.uk/).
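
The four exclusion criteria amount to a simple screening rule, sketched below. The `Respondent` fields, the `quota_cell` labels and the `full_quotas` set are hypothetical names introduced for illustration; the article does not describe how SSI implemented the screen or which demographic cells defined the quotas.

```python
from dataclasses import dataclass


@dataclass
class Respondent:
    age: int
    is_doctor_or_nurse: bool
    has_bowel_condition: bool  # bowel cancer, ulcerative colitis or Crohn's disease
    quota_cell: str            # hypothetical label for the respondent's quota


def is_eligible(r: Respondent, full_quotas: set) -> bool:
    """Return True only if none of the four exclusion criteria applies."""
    if not 45 <= r.age <= 54:        # (1) outside the target age range
        return False
    if r.is_doctor_or_nurse:         # (2) medically qualified
        return False
    if r.has_bowel_condition:        # (3) relevant bowel disease
        return False
    if r.quota_cell in full_quotas:  # (4) quota already filled
        return False
    return True


# Example: a 50-year-old with no exclusions, whose quota is still open.
print(is_eligible(Respondent(50, False, False, "cell-a"), full_quotas={"cell-b"}))
```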

No definitive calculation exists to establish the sample size for a DCE35 ,36; the survey ran until approximately 600 participants had completed it. A soft-launch of the DCE suggested an overall completion rate of 72%, among those who began it. Consequently, it was estimated that approximately 750 individuals would need to begin the DCE to reach this sample size.

Measures

DCEs are cognitively demanding and typically require participants to consider several unfamiliar concepts. Consequently, we used a web-based method of data collection. This allowed greater flexibility for creating a DCE that participants would find engaging, understandable, easy to use without assistance and easy to access. For example, background information was accompanied by an optional audio voiceover to assist with pronunciation of unfamiliar medical terminology and information on CTC was supplemented with images and video of the test being performed. Information was presented in stages, allowing participants to complete the DCE at their own pace and giving the option to return to a previous page of the DCE to remind them of any information.

Eligible participants were presented with the full DCE. This began with information regarding bowel cancer screening and the study context. They were then asked whether they had experience of bowel testing and whether they knew anyone with bowel cancer, followed by information regarding the practicalities of CTC. Participants were then given information about the three methods of preparation and asked to state their first and second preferences, followed by information regarding the practicalities of colonoscopy. The final information page introduced the concepts of sensitivity and specificity using lay terminology that was found comprehensible in previous research22 (figure 2). In the absence of key contextual information, participants may make inaccurate assumptions: for example that polyps are very common or are highly likely to become cancers.37 Consequently, a webpage explained that 10 in 100 people with precancerous polyps (meaning ≥10 mm adenomas) would eventually get bowel cancer (figure 2), and that the prevalence of such polyps in the relevant age group was around 100 in 2500 people. This information was followed by a measure of health literacy (adapted from Woloshin et al38 and Lipkus et al39).
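
To make the scale of these figures concrete, the sketch below converts them into counts per 2500 people screened. The prevalence and transition figures are those given above; the sensitivity values (86%, 89%, 92%) and specificity values (89%, 91%) are the realistic range quoted in the Discussion. Treating every screened person as either having or not having a ≥10 mm polyp is a simplifying assumption, and the function names are illustrative only.

```python
# Integer percentages keep the arithmetic exact.
SCREENED = 2500
WITH_POLYPS = 100            # prevalence: 100 in 2500 people (4%)
CANCER_PER_100_POLYPS = 10   # 10 in 100 people with polyps eventually get bowel cancer


def detected(sensitivity_pct):
    """People with polyps whose polyps the test finds (true positives)."""
    return WITH_POLYPS * sensitivity_pct // 100


def false_alarms(specificity_pct):
    """People without polyps who nevertheless test positive (false positives)."""
    without_polyps = SCREENED - WITH_POLYPS
    return without_polyps * (100 - specificity_pct) // 100


eventual_cancers = WITH_POLYPS * CANCER_PER_100_POLYPS // 100
print(f"eventual cancers among people with polyps: {eventual_cancers}")

for sens in (86, 89, 92):
    print(f"sensitivity {sens}%: {detected(sens)} of {WITH_POLYPS} found")
for spec in (89, 91):
    print(f"specificity {spec}%: {false_alarms(spec)} false positives")
```

Under these assumptions, moving from the lowest to the highest sensitivity finds six more of the 100 people with polyps, while moving from the lowest to the highest specificity avoids 48 false positives among the 2400 people without polyps.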

Figure 2

Information on polyp prevalence, risk of transition to cancer, sensitivity and specificity.

Participants were then presented with one of six randomly determined choice sets of three or four choice scenarios. Overall preferences for each choice scenario were determined over two stages: first, participants were asked to state which of the two tests they thought appeared best (ie, which of the two tests they preferred), after which the non-preferred option was faded out and participants were asked whether they would have the preferred test if it were offered to them in the next month or if they would opt for no testing (ie, whether their overall preference was for having the initially favoured test or having no testing; see figure 1 for an example). The second stage allowed the probabilities of selecting a test to be calculated for different forms of preparation, which relates to potential screening uptake: a test that is more likely to be selected, on average, is more likely to have higher uptake when offered to a population. A two-step approach may also improve comprehension of the various attributes.40

The DCE concluded with questions regarding the subjective difficulty of completing the questions, self-rated health, and a free text field for comments on the DCE. An example of the full survey is available in online supplementary appendix 2.

Members of the public (n=17), recruited from a panel of individuals who took part in a previous study on CRC screening, and the local research group (n=17), piloted the DCE to ensure that it was comprehensible and not excessively burdensome, and that any weaknesses and software bugs were identified and addressed.

Analysis

Basic descriptive statistics were generated for demographic data. Main analyses used Stata V.12 for Windows (StataCorp, College Station, Texas, USA). Coding for the continuous attributes (sensitivity and specificity) was mean centred, while coding for the categorical attribute (type of preparation) used effects coding. A constant term was included in the model to account for the option of choosing neither test.
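
The two coding schemes can be illustrated as follows. This is a sketch of textbook effects coding and mean centring, not the authors' Stata code: the choice of full-laxative preparation as the reference level is an assumption (the article does not state it), and the sensitivity values are the three levels quoted elsewhere in the article.

```python
def effects_code(level, levels):
    """Effects-code one categorical level.

    Returns one column per non-reference level: 1 in the matching column,
    0 elsewhere, and -1 in every column when `level` is the reference
    (taken here to be the last entry of `levels`).
    """
    *coded, reference = levels
    if level == reference:
        return [-1] * len(coded)
    if level not in coded:
        raise ValueError(f"unknown level: {level!r}")
    return [1 if level == c else 0 for c in coded]


def mean_centre(values):
    """Subtract the mean so that 0 represents the average level."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


PREPARATIONS = ["non-laxative", "reduced-laxative", "full-laxative"]  # last = assumed reference

for prep in PREPARATIONS:
    print(prep, effects_code(prep, PREPARATIONS))

print(mean_centre([86, 89, 92]))  # [-3.0, 0.0, 3.0]
```

Unlike dummy coding, effects coding makes each column sum to zero across the levels, so coefficients are read as departures from the average preparation rather than from one baseline preparation.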

The first stage of the analysis used conditional logistic regression with three effects-coded laxative variables, plus centred sensitivity and specificity variables as predictors. The outcome was whether a given option was preferred overall; each choice scenario for each participant generated three observations: scan A preferred overall (yes/no), scan B preferred overall (yes/no) or no testing preferred overall (yes/no). A unique identifier was generated to account for the interrelated nature of responses within choice sets and participants (ie, four participants completing three choice sets had 12 unique identifiers between them). Significant coefficients (p<0.05) denote that an attribute was associated with preferences. We anticipated positive coefficients for the sensitivity and specificity attributes, consistent with prior evidence for a preference for tests with greater ‘accuracy’. We anticipated a negative coefficient for the preparation attribute, based on evidence for a preference for preparations with less powerful purgative effects.
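
The data layout described above, with three observations per answered scenario grouped by a unique identifier, might look like the following. The function and field names are hypothetical; the article specifies only that each scenario yields three yes/no observations and that an identifier links them.

```python
def expand_choice(participant_id, scenario_id, overall_choice):
    """Turn one answered scenario into three rows: scan A, scan B, no testing.

    `overall_choice` is "A", "B" or "none", that is, the overall preference
    after both stages of the question.
    """
    alternatives = ("A", "B", "none")
    if overall_choice not in alternatives:
        raise ValueError(f"unexpected choice: {overall_choice!r}")
    group_id = f"{participant_id}-{scenario_id}"  # one identifier per participant and scenario
    return [
        {"group": group_id, "alternative": alt, "chosen": int(alt == overall_choice)}
        for alt in alternatives
    ]


# Four participants each answering three scenarios give 12 identifiers and 36 rows.
rows = []
for participant in range(1, 5):
    for scenario in range(1, 4):
        rows.extend(expand_choice(participant, scenario, "A"))

print(len({row["group"] for row in rows}), len(rows))  # 12 36
```

In a conditional logistic regression, the identifier defines the sets within which exactly one alternative is marked as chosen.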

Finally, we calculated the probability that a test would be selected (using the ‘predict p1’ and ‘summ p1 if…’ commands in Stata). This extrapolated the statistically significant coefficients observed in the primary analysis to create a new variable for each of the possible combinations of levels for the statistically significant attributes. This was calculated for all choice scenarios where a participant had an overall preference for either scan A or B (ie, did not select no testing). This meant that versions of CTC preceded by different preparations could be ranked in the order of the overall expressed preferences (including hypothetical options such as a best-case scenario with the most preferred levels of each significant attribute). It also allowed a comparison of the relative importance of significant attributes: we first estimated the probability of selecting a test for all three levels of one significant attribute with another significant attribute fixed at the middle value. We then calculated estimates for all three levels of the second attribute when the first attribute was set at its middle value. The ranges generated by this procedure were then compared for the two sets of estimates.
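
For readers unfamiliar with how a conditional logit model turns coefficients into choice probabilities, the general formula is sketched below. This is not the Stata procedure itself. The coefficient values are invented for illustration and are not the estimates reported in table 4; fixing the utility of ‘no testing’ at zero is also an assumption.

```python
import math


def choice_probabilities(utilities):
    """Conditional logit: P(i) = exp(V_i) / sum over j of exp(V_j)."""
    highest = max(utilities)  # subtracting the maximum avoids overflow
    weights = [math.exp(v - highest) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative coefficients only, NOT the study's estimates.
PREP_EFFECT = {"non-laxative": 0.5, "reduced-laxative": 0.0, "full-laxative": -0.5}
B_SENSITIVITY = 0.2  # per percentage point, with sensitivity centred on 89%


def utility(preparation, sensitivity_pct):
    """Linear utility of one test from its preparation and sensitivity."""
    return PREP_EFFECT[preparation] + B_SENSITIVITY * (sensitivity_pct - 89)


probabilities = choice_probabilities([
    utility("full-laxative", 92),  # scan A: more burden, higher sensitivity
    utility("non-laxative", 86),   # scan B: less burden, lower sensitivity
    0.0,                           # no testing (assumed utility of zero)
])
print(probabilities)
```

Repeating the calculation while varying one attribute and holding the other at its middle level gives the ranges of probabilities that are compared to judge relative importance.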

Analyses first included all participants who completed the DCE, and were then repeated including only those who correctly answered the rationality test (ie, who stated a preference for the test that was superior to the alternative on all three attributes). Failure of rationality tests may be due to issues with understanding or insufficient motivation. However, it has been argued that DCE analyses should not automatically exclude such participants.41

Results

The flow of participants through the study is presented in figure 3. The final sample consisted of 607 participants (mean age 49 years, SD 2.9). Of the 792 participants who began the main DCE, 77% completed it, meeting our target completion rate. Participants’ characteristics are described in table 2.

Table 2

Demographic statistics of all discrete choice experiment completers

Figure 3

Flow of participants through the study.

Most participants identified the greatest risk of a disease correctly, and reported that generally they found medical statistics and written medical information easy to understand. However, a substantial minority incorrectly identified the greatest risk of a disease and reported that they found medical statistics and information difficult or very difficult (table 3). The majority reported finding the questions easy and this was supported by free-text responses at the end of the DCE, which were largely positive (eg, “I think the survey was very interesting and made me think what I would do in that situation”; 45 years, male). Most participants (90.3%) passed the rationality test across all versions of the DCE.

Table 3

Statistics on health literacy/numeracy and ease of completing the discrete choice experiment

Attribute valuations and preferences

Coefficients for preparation and sensitivity were both significant and in the predicted directions, indicating that these attributes were associated with preferences (table 4). The full-laxative preparation was preferred less than reduced-laxative preparation, which was preferred less than non-laxative preparation. The positive coefficient for sensitivity indicates that higher values of sensitivity were preferred over lower values. Specificity was not significant, indicating that this did not affect preferences. Subsequent analysis focused on the two attributes that were associated with preferences.

Table 4

Magnitude and direction of preferences for each attribute

Mean probabilities of selecting tests

Relative importance of significant attributes

In order to compare the relative importance of preparation and sensitivity, we first estimated the probability of selecting a test for all three levels of preparation with 89% sensitivity, and for all three levels of sensitivity with reduced-laxative preparation (table 5). The ranges of probabilities were almost equivalent, suggesting that these attributes are equally important determinants of uptake.

Table 5

Estimated probabilities of choosing a test for permutations of preparation used to compare the relative value of significant attributes

Predictions of uptake from participants’ choices

Estimated probabilities of choosing a test for all possible permutations of preparation type and sensitivity are shown in table 6. Although we hypothesised that overall preferences would favour a more sensitive full-laxative preparation, the results showed that its probability of selection was very similar to those for reduced-laxative or non-laxative preparations, despite their lower sensitivity (suggesting similar levels of uptake). However, the probability of choosing hypothetical forms of improved non-laxative or reduced-laxative preparations (with 89% and 92% sensitivity, respectively) was notably higher than for all three currently realistic preparations, suggesting that uptake could be increased considerably by improving sensitivity of less burdensome methods, even if the improvements do not achieve optimal sensitivity (92%). Similarly, the data suggest that a best-case scenario (non-laxative preparation and 92% sensitivity) would have even higher uptake.

Table 6

Probabilities of choosing a test for each type of preparation (options considered realistic are in italics)

There were no meaningful differences between the results of analyses including all participants and those restricted to participants who correctly answered the rationality test, except for a possible trend for rational responders to be more likely to choose all tests.

Discussion

Although previous small-scale qualitative work on screening CTC,22 and several studies on general CRC screening19 ,21 have found sensitivity and specificity to be more important attributes than practicalities of undergoing screening tests, our results using the DCE method find that the influence of preparation burden is comparable to sensitivity—at least within a plausible range of values. Moreover, specificity had no effect on preferences when defined using realistic values.

Previous studies have found evidence to support the use of reduced-laxative or non-laxative preparation instead of full-laxative methods,14 ,15 or made decisions to offer it prior to screening17 on the basis that it may improve uptake. For example, a previous randomised trial assessed screening invitees’ expectations of different bowel preparations (2×50 mL iodinated contrast agent for CTC vs 2 L of polyethylene glycol solution for colonoscopy) and found that participants did anticipate the former to be less burdensome.13 However, such studies on expectations and preferences rarely inform participants of sensitivity and specificity for different methods of testing.7 Our estimates show that when participants were able to factor these attributes into their preferences, only very small overall differences were apparent between the three preparations we considered currently realistic. This suggests that offering one over another may not result in an appreciable increase in uptake. In effect, the perceived benefits from a reduction in preparation burden may be offset by the perceived costs from a reduction in sensitivity.

A strength of this study was that it considered possible future improvements in the sensitivity of reduced-laxative and non-laxative preparations (to 89% for non-laxative preparation), and also hypothetically optimised it to the same level as full-laxative preparation (92% for non-laxative and reduced-laxative preparations). In contrast to the similar probabilities for the realistic forms of preparations, improving sensitivity in our hypothetical ‘best-case scenario’ led to much higher probabilities of selecting tests and so higher expected uptake. Furthermore, it was notable that participants appeared to be willing to compromise on sensitivity; the probability of choosing a test with a hypothetical improved non-laxative preparation with 89% sensitivity was higher than for a test with realistic full-laxative preparation with 92% sensitivity, which suggests that non-laxative preparation could improve uptake even if it could not achieve sensitivity equal to full-laxative preparation. This underscores the value of further research to reduce the burden of bowel preparation and also optimise sensitivity (and specificity). Beyond this, it highlights the need to conduct trials comparing the accuracy and uptake of different methods.

Although our results contradict several previous findings, several factors may explain these differences. Some studies have examined CTC for cancer.21 We examined sensitivity with respect to precancerous polyps, which is the more realistic scenario for use of CTC, but not as well understood by the public who may value polyp detection less than cancer detection.42 Our study aimed to ensure that participants knew enough to make an informed choice by providing detailed information about the test and CRC screening in general. In particular, our information described the prevalence of ≥10 mm adenomas in a screening age population (4%) and further clarified the transition rate from adenoma to cancer (10%). People's expectations about cancer risk can be unrealistically negative in the absence of explicit information,37 and the statistics we presented may have caused participants to view bowel cancer as less common, diminishing the perceived value of sensitivity. In addition, the realistic range of values for sensitivity may have been perceived as narrow, and participants may not have seen much benefit from selecting an option with 92% sensitivity over one with 86% sensitivity (the largest gain possible) compared with previous studies that have used wider ranges.

The information provided may also account for why specificity did not have a significant effect on preferences; we used a realistic range of values (89% to 91%) that may not have been enough to override the influence of the other two attributes. The findings suggest that specificity values in this range are essentially equivalent in terms of their perceived value. This is consistent with previous results showing that potential screening invitees consider sensitivity to be more important than specificity, and would accept a ratio of up to six additional false positives for one additional polyp detected with CTC screening.42 Similar findings have been observed in the context of breast cancer screening, where the trade-off is phrased in terms of the number of false positives considered acceptable to save one life from cancer.43

Although it is possible that participants may have had difficulty understanding unfamiliar concepts, several factors suggest that they were generally able to incorporate them into their decision-making. Most participants correctly answered the rationality test and the question on objective health numeracy. They also reported finding the DCE easy to complete, and medical information and statistics easy to understand in general. Our conclusion is that although sensitivity remains an important attribute, using realistic information and values for sensitivity and specificity diminishes the influence of both attributes on preferences. However, although this may be true of the general population approaching screening age, it is possible that there are important subgroup differences relating to cognitive abilities (eg, individuals with different levels of education) which would be relevant to policy contexts in which invitees have a choice of tests or preparation methods. For example, less-educated individuals may have more difficulty considering statistics, leading to a stronger preference for less physically burdensome preparations compared with individuals with more education. Future research could explore this possibility.

Our study has limitations. Performance characteristics are not well understood for the range of preparations available and data are currently lacking,44 leading to uncertainty in the values used to define sensitivity and specificity for participants in our study. The study was also in a hypothetical context and behaviour may differ when responding to a real invitation for cancer screening. In addition, the choice scenarios constituted an unfamiliar decision-making context and this may have reduced response validity. However, performance on the rationality tests and the reported ease of completing the DCE suggested a high level of understanding. The study also described a range of available preparations, whereas organised screening programmes typically offer invitees the option of a single test, performed to a specific protocol. Hence, the most robust, ecologically valid assessment of absolute preferences and uptake would be a trial in which participants are invited to CTC screening preceded by one of the several forms of preparation, combined with information about sensitivity and specificity (once more precise estimates are available). This would reduce the effects of people's ability to compare several options. The results of the SAVE trial45 will be particularly relevant to the present findings.

In conclusion, this study found that preparation burden and sensitivity for polyps have comparable effects on preferences. Consequently, CTC following non-laxative or reduced-laxative preparations is not likely to result in greater uptake than following full-laxative preparation as any gains in perceived burden are undermined by the reduction in sensitivity. However, uptake may be increased substantially by improving the sensitivity of less burdensome preparations. Future research should focus on refining and comparing bowel preparation regimens.

Acknowledgments

The authors would like to thank Mr John Robinson for creating the online questionnaire to the authors’ specifications, Dr Lesley McGregor for providing voiceovers, Mrs Lynn Faulds Wood for providing CT colonography and colonoscopy video and Survey Sampling International for identifying and inviting participants.

References

Footnotes

  • Contributors SH, CVW, JW and AG conceived of the study. AG, SH, CVW, JW, SAT, DB, AP, GLY, SZ and RL participated in the design. AG participated in the acquisition, analysis and interpretation of the data and drafting of the manuscript. SS and SM also participated in the analysis and interpretation of the data. All the authors participated in the critical revision of the manuscript and approved the final version.

  • Funding This article presents independent research funded by the National Institute for Health Research (NIHR) under its Programme Grants for Applied Research funding scheme (RP-PG-0407-10338). The views expressed in this article are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health.

  • Competing interests None.

  • Patient consent Obtained.

  • Ethics approval This study was approved by the UCL Research Ethics Committee (ref: 2951/002).

  • Provenance and peer review Not commissioned; externally peer reviewed.

  • Data sharing statement The survey used in the study is available as online supplementary material. Data are not currently available to be shared due to confidentiality being a condition of participants’ consent (online supplementary appendix 2).