Predicting the risk of acute kidney injury in primary care: derivation and validation of STRATIFY-AKI

Background
Antihypertensives reduce the risk of cardiovascular disease but are also associated with harms including acute kidney injury (AKI). Few data exist to guide clinical decision making regarding these risks.

Aim
To develop a prediction model estimating the risk of AKI in people potentially indicated for antihypertensive treatment.

Design and setting
Observational cohort study using routine primary care data from the Clinical Practice Research Datalink (CPRD) in England.

Method
People aged ≥40 years, with at least one systolic blood pressure measurement between 130 mmHg and 179 mmHg, were included. Outcomes were admission to hospital or death with AKI within 1, 5, and 10 years. The model was derived with data from CPRD GOLD (n = 1 772 618), using a Fine-Gray competing risks approach, with subsequent recalibration using pseudo-values. External validation used data from CPRD Aurum (n = 3 805 322).

Results
The mean age of participants was 59.4 years and 52% were female. The final model consisted of 27 predictors and showed good discrimination at 1, 5, and 10 years (C-statistic for 10-year risk 0.821, 95% confidence interval [CI] = 0.818 to 0.823). There was some overprediction at the highest predicted probabilities (ratio of observed to expected event probability for 10-year risk 0.633, 95% CI = 0.621 to 0.645), affecting patients with the highest risk. Most patients (>95%) had a low 1- to 5-year risk of AKI, and at 10 years only 0.1% of the population had a high AKI and low CVD risk.

Conclusion
This clinical prediction model enables GPs to accurately identify patients at high risk of AKI, which will aid treatment decisions. As the vast majority of patients were at low risk, such a model may provide useful reassurance that most antihypertensive treatment is safe and appropriate while flagging the few for whom this is not the case.


INTRODUCTION
Blood pressure lowering (antihypertensive) medications are among the most commonly prescribed medications in older people. 1 They are highly effective at reducing the risk of cardiovascular disease (CVD) and mortality; 2 however, they are also associated with adverse events, including acute kidney injury (AKI), electrolyte abnormalities, hypotension, and syncope. 3 At present, decisions about when to start (or continue) antihypertensive therapy are made almost exclusively on the basis of blood pressure level and CVD risk, aided by CVD risk prediction models. 4 In contrast, less emphasis is given to the potential for harm from treatment. To make informed clinical decisions, GPs need to understand both the effect of treatment on adverse events (which has been shown previously), 3 and an individual's underlying risk of harm, which currently remains largely unknown.
One such adverse event is AKI, which is typically defined as an increase in serum creatinine of ≥0.3 mg/dl within the past 48 h or an increase of ≥1.5 times the baseline value within the past 7 days. 5 Over the past decade, automatic reporting of potential AKI on renal function test reports has become usual practice 6 and may lead to GPs modifying potentially beneficial treatment. In serious cases, AKI can lead to admission to hospital and reduced quality of life, and acute renal failure, as AKI was previously known, is still a significant and long-standing issue. 7 Better understanding of an individual's risk of serious AKI (resulting in admission to hospital or death), along with other adverse events, could better inform GPs making antihypertensive treatment decisions, particularly where such a risk is high. This study therefore aimed to use routinely available data from clinical records to develop and externally validate a clinical prediction model to predict an individual's underlying risk of experiencing admission to hospital or death with AKI within the next 1, 5, and 10 years, for patients with an indication for antihypertensive treatment.

Keywords
blood pressure; drug-related side effects and adverse reactions; electronic health records; epidemiology; primary health care; vascular diseases.

METHOD
Extended methods for this study are described in Supplementary Appendix S1. This study used an observational cohort design and aimed to develop and validate a prediction model for admission to hospital or death with AKI. As a prediction modelling study it was not the aim to examine the association between antihypertensive treatment and AKI, which has been studied previously. 3 The current study used routine primary care data from the Clinical Practice Research Datalink (CPRD) in England, linked to Office for National Statistics (ONS) mortality data, basic inpatient Hospital Episode Statistics (HES) data, and patient-level Index of Multiple Deprivation (IMD) data. The model was derived using population data from CPRD GOLD and externally validated using CPRD Aurum, each of which is based on data from English general practices using different electronic health record (EHR) software.

Population
Patients were eligible for this study if they were registered at linked general practices contributing to the CPRD GOLD or Aurum in England. Individual records were included if they related to patients aged ≥40 years, registered to a CPRD 'up-to-standard' practice, and had records available after the study start date (1 January 1998). The study end date was 31 December 2018. Patients entered the cohort following their first systolic blood pressure reading ≥130 mmHg, a threshold chosen to identify a group likely to be considered for antihypertensive therapy. 8 Patients were excluded if they had no record of blood pressure measurement or a systolic blood pressure ≥180 mmHg, as at this level treatment would be indicated regardless of risk of adverse events. The index date was defined as 12 months after the patient was recorded as having a systolic blood pressure reading ≥130 mmHg. Patients experiencing AKI on their index date were excluded from the analysis.
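The eligibility rules described above can be sketched as a simple filter. This Python example is illustrative only: the `Patient` record and its field names are hypothetical, '12 months' is approximated as 365 days, and applying the ≥180 mmHg exclusion to the qualifying reading is an assumption rather than a detail confirmed by the study.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class Patient:
    """Hypothetical record layout, not the CPRD schema."""
    age: int
    qualifying_sbp: Optional[float]   # first systolic reading >=130 mmHg; None if no BP record
    reading_date: Optional[date]
    aki_on_index_date: bool = False


def index_date(p: Patient) -> Optional[date]:
    """Index date: 12 months (approximated here as 365 days) after the qualifying reading."""
    if p.reading_date is None:
        return None
    return p.reading_date + timedelta(days=365)


def is_eligible(p: Patient) -> bool:
    """Apply the age, blood pressure, and AKI-at-index criteria."""
    if p.qualifying_sbp is None or p.reading_date is None:
        return False                       # no blood pressure record
    if p.age < 40:
        return False                       # must be aged >=40 years
    if not 130 <= p.qualifying_sbp < 180:
        return False                       # outside the 130-179 mmHg range
    return not p.aki_on_index_date         # AKI on the index date is excluded
```

For example, a 55-year-old with a reading of 142 mmHg on 1 March 2005 would be eligible under this sketch, with an index date of 1 March 2006.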

Outcomes
The model outcome was defined as first admission to hospital or death with a primary diagnostic code for AKI within 10 years of the index date. This was based on ICD-9/10 codes documented in HES and ONS mortality data (codes are available in Supplementary Box S1).

Model predictors
Potential predictors of AKI were identified from the literature 9,10 and by expert clinical opinion. These included patient demographics, clinical characteristics, previous conditions, and other prescribed medications (see Supplementary Box S2). Predictors were defined as the most recent relevant clinical code before the index date. Antihypertensive medications were defined as a prescription in the 12 months before the index date.

Sample size calculation
Assuming a conservative event rate of 24.6 per 100 000 person-years, 11 an expected median follow-up of 7 years, 12 an estimate of Nagelkerke's R² statistic of 0.15, and a maximum number of 40 parameters in the model, a sample size of approximately 80 000 patients was estimated to be required for the development of this risk equation. 13

How this fits in
Acute kidney injury (AKI) is one of the more serious adverse events associated with antihypertensive treatment, reducing an individual's health-related quality of life and increasing the risk of admission to hospital. Clinical guidelines recommend that when prescribing antihypertensives GPs should take into account the likelihood of both the benefits and harms from treatment, but few data exist in regard to the risk of AKI. A clinical prediction model was developed and externally validated for the risk of AKI up to 10 years in the future in patients eligible for antihypertensive medication, incorporating commonly recorded patient characteristics, comorbidities, and prescribed medications. The model showed good discrimination and good calibration for probabilities up to 20%, enabling GPs to accurately identify patients at higher risk of AKI. This could be useful to reassure the majority of patients starting or continuing treatment that their risk of AKI is very low.

Model development
A multivariable model was fitted using a Fine-Gray subdistribution model that takes into account competing risks to avoid overestimation of predicted probabilities. 14 Deaths from causes other than AKI were treated as a competing event. Automated variable selection methods were not used, as all the variables were predetermined based on the literature and expert clinical opinion. Predictor effects in the model were reported as subdistribution hazard ratios (SHRs) with 95% confidence intervals (CIs), and post-estimation of the baseline cumulative incidence for AKI was calculated using the Breslow estimator as defined in the Fine and Gray article. 14 Analyses were undertaken using the fastcmprsk package in R (version 4.1.0).
Fractional polynomials were used to identify the optimal functional form of continuous variables. The baseline cumulative incidence function at 1, 5, and 10 years was estimated in the derivation dataset to allow individual risk predictions at these time points.
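To illustrate how a baseline cumulative incidence and a linear predictor combine into an individual risk, the following Python sketch uses the standard Fine-Gray relationship, risk = 1 − (1 − baseline CIF)^exp(linear predictor). The baseline values below are hypothetical placeholders and the function names are invented for this example; the study's actual equation is given in its Supplementary Box S3 and is not reproduced here.

```python
import math

# Hypothetical baseline cumulative incidences at each horizon (years);
# placeholders only, not the study's estimates.
BASELINE_CIF = {1: 0.001, 5: 0.008, 10: 0.020}


def linear_predictor(coefficients: dict, values: dict) -> float:
    """Sum of coefficient x predictor value (coefficients are log SHRs)."""
    return sum(coefficients[name] * values.get(name, 0.0) for name in coefficients)


def predicted_risk(lp: float, baseline_cif: float) -> float:
    """Fine-Gray predicted cumulative incidence for a given linear predictor."""
    if not 0.0 <= baseline_cif < 1.0:
        raise ValueError("baseline_cif must be in [0, 1)")
    return 1.0 - (1.0 - baseline_cif) ** math.exp(lp)


# Example with a single binary predictor whose SHR is 1.54.
coefs = {"ace_inhibitor": math.log(1.54)}
lp = linear_predictor(coefs, {"ace_inhibitor": 1.0})
risk_10y = predicted_risk(lp, BASELINE_CIF[10])
```

A linear predictor of zero returns the baseline cumulative incidence unchanged, which is a convenient sanity check when implementing a published equation.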
Initial model calibration was assessed in the development dataset using calibration curves generated from pseudo-values: jack-knife estimators representing an individual's contribution to the cumulative incidence function for AKI accounting for competing risk, and calculated by the Aalen-Johansen method. These were generated separately in 50 groups by linear predictor value, accounting for the competing risk of death. 15 Where calibration was observed to be suboptimal at 5 and 10 years, the model in the development data was recalibrated by fitting a generalised linear model (with logit link function) directly to the pseudo-values, with the linear predictor from the Fine-Gray model as the only variable, and allowing for a non-linear recalibration effect using fractional polynomials.
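The pseudo-value idea can be made concrete with a minimal sketch. The Python below computes an Aalen-Johansen cumulative incidence at a fixed horizon and then the jack-knife pseudo-value for each individual, n × (full estimate) − (n − 1) × (leave-one-out estimate). The function names and the event coding (0 = censored, 1 = event of interest, 2 = competing event) are assumptions for this example, and the quadratic-time loop is intended for small illustrative data only, not cohorts of the size used in the study.

```python
from typing import List, Sequence


def aalen_johansen_cif(times: Sequence[float], events: Sequence[int],
                       horizon: float, cause: int = 1) -> float:
    """Cumulative incidence of `cause` by `horizon` (Aalen-Johansen estimator)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0   # all-cause event-free probability just before the current time
    cif = 0.0
    i = 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        j = i
        while j < len(data) and data[j][0] == t:   # gather ties at time t
            j += 1
        tied = [e for _, e in data[i:j]]
        d_cause = sum(1 for e in tied if e == cause)
        d_any = sum(1 for e in tied if e != 0)
        cif += surv * d_cause / at_risk
        surv *= 1.0 - d_any / at_risk
        at_risk -= len(tied)
        i = j
    return cif


def pseudo_values(times: Sequence[float], events: Sequence[int],
                  horizon: float, cause: int = 1) -> List[float]:
    """Jack-knife pseudo-values for the cumulative incidence at `horizon`."""
    times, events = list(times), list(events)
    n = len(times)
    full = aalen_johansen_cif(times, events, horizon, cause)
    out = []
    for i in range(n):
        loo = aalen_johansen_cif(times[:i] + times[i + 1:],
                                 events[:i] + events[i + 1:], horizon, cause)
        out.append(n * full - (n - 1) * loo)
    return out
```

When there is no censoring, each pseudo-value reduces to a simple indicator of whether that individual had the event of interest by the horizon, which is why pseudo-values can stand in for observed outcomes in calibration plots and recalibration models.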

Missing data
Multiple imputation was used to impute all variables with missing data, separately for each of the development and validation datasets. Ten imputations were generated for each dataset. Imputation models contained all predictors included in the main analysis, as well as the Nelson-Aalen estimator and the outcomes of interest (AKI and death). 16 The model coefficients and performance measures (such as C-statistic) were estimated from each imputation dataset and combined using Rubin's rules. 17
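Rubin's rules pool an estimate across imputed datasets by averaging the point estimates and combining the within- and between-imputation variances. The Python sketch below shows the standard calculation; the function name and the example numbers are illustrative and are not taken from the study.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float],
               variances: Sequence[float]) -> Tuple[float, float]:
    """Return the pooled estimate and its total variance under Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)            # average of the m point estimates
    within = mean(variances)            # average within-imputation variance
    between = variance(estimates)       # sample variance of estimates (m - 1 denominator)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


# Illustrative values for three imputations of a single coefficient.
pooled, total_var = pool_rubin([0.80, 0.82, 0.84], [0.0001, 0.0001, 0.0001])
```

The (1 + 1/m) factor inflates the between-imputation variance to reflect the finite number of imputations, so the pooled standard error is the square root of the total variance rather than of the average within-imputation variance.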

External validation
The external validation was conducted independently by researchers at a different institution. The final model equation (recalibrated at 5 and 10 years; see Supplementary Box S3) was applied to each individual in the validation cohort to give the predicted probabilities of AKI at 1, 5, and 10 years, while taking into account the competing risk of death. 18 Model performance was determined using Royston and Sauerbrei's R²D, a truncated C-statistic and the D-statistic. 19 Model calibration was assessed through comparison of predicted probabilities with observed pseudo-values estimated using jack-knife estimators representing an individual's contribution to the cumulative incidence function for AKI, accounting for competing risks, and calculated by the Aalen-Johansen method, in the external validation cohort. Calibration was presented as the ratio of observed to expected event probabilities and calibration plots to compare the observed versus predicted risks at 1, 5, and 10 years. A random effects meta-analysis was used to examine heterogeneity in model performance across different GP practices, where case mix and outcome prevalence were expected to vary.
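As a rough illustration of the calibration summaries described above, the Python sketch below computes an observed-to-expected ratio and the grouped points that would underlie a calibration plot. It assumes the observed values are pseudo-values (or simple 0/1 outcomes) already computed for each individual; the function names and the equal-sized grouping scheme are assumptions for this example.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def observed_expected_ratio(observed: Sequence[float],
                            predicted: Sequence[float]) -> float:
    """Mean observed event probability divided by mean predicted probability."""
    return mean(observed) / mean(predicted)


def calibration_groups(observed: Sequence[float], predicted: Sequence[float],
                       n_groups: int) -> List[Tuple[float, float]]:
    """Return (mean predicted, mean observed) for groups ordered by predicted risk."""
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    if not 1 <= n_groups <= n:
        raise ValueError("n_groups must be between 1 and the number of individuals")
    points = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        points.append((mean([p for p, _ in chunk]), mean([o for _, o in chunk])))
    return points
```

A ratio below 1 indicates that the model predicts more events than were observed (overprediction), and plotting the grouped points against the line of equality shows where along the risk range any miscalibration occurs.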
The clinical utility of the model was assessed by plotting the 1-, 5-, and 10-year risk of AKI against the 10-year risk of CVD, calculated using the QRISK2 algorithm. 4 A net-benefit analysis was also conducted, in which the harms and benefits of using the model to guide treatment/management decisions were compared with either taking no action for anyone (irrespective of AKI risk) or taking action for everyone. 20
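Net benefit at a given risk threshold is conventionally calculated as the proportion of true positives minus the proportion of false positives weighted by the odds of the threshold. The Python sketch below shows this calculation for simple binary outcomes; it ignores censoring and competing risks, which the study accounted for, and the function names and example data are illustrative assumptions.

```python
from typing import Sequence


def net_benefit(outcomes: Sequence[int], risks: Sequence[float],
                threshold: float) -> float:
    """Net benefit of acting on everyone with predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    n = len(outcomes)
    flagged = [(y, r) for y, r in zip(outcomes, risks) if r >= threshold]
    true_pos = sum(1 for y, _ in flagged if y == 1)
    false_pos = sum(1 for y, _ in flagged if y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(outcomes: Sequence[int], threshold: float) -> float:
    """Net benefit of acting on everyone, regardless of predicted risk."""
    return net_benefit(outcomes, [1.0] * len(outcomes), threshold)


# Illustrative data: 10 patients, one event, two flagged at a 10% threshold.
outcomes = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
risks = [0.30, 0.15, 0.05, 0.04, 0.03, 0.03, 0.02, 0.02, 0.01, 0.01]
nb_model = net_benefit(outcomes, risks, 0.10)
nb_all = net_benefit_treat_all(outcomes, 0.10)
```

The strategy of taking no action for anyone has a net benefit of zero by definition, so a model is useful at a given threshold only if its net benefit exceeds both zero and the treat-all value.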

RESULTS

Population characteristics
The CPRD GOLD derivation cohort included 1 772 618 patients with a mean age of 59.4 years (SD 13.2), including 921 867 females (52%) (Table 1 and Supplementary Figure S1). The 10-year prevalence of significant AKI following the index date was 3% (n = 56 110) with 10% (n = 171 018) of patients experiencing the competing event of death from other causes. Median follow-up time for the cohort was 6.4 years (interquartile range [IQR] 2.7-10.0).
The CPRD Aurum validation cohort contained 3 805 322 patients, with 131 584 (3%) experiencing admission to hospital or death with AKI during 10-year follow-up (incidence by practice shown in Supplementary Figure S2). The competing event of death affected 407 857 (11%) patients during follow-up (data not shown in Tables or Figures). Median follow-up time in the validation cohort was 6.9 years (IQR 2.8-10.0) (Table 1).

Model derivation
The final model included 27 predictors, with transformations used for diastolic blood pressure and total cholesterol because of non-linear relationships with the outcome. Male sex, morbid obesity, smoking, heavy drinking, greater deprivation, increasing age or frailty, and a history of chronic kidney disease and diabetes were associated with an increased risk of AKI. Most antihypertensive medications, with the exception of thiazide and thiazide-like diuretics, were associated with an increased risk of AKI, with angiotensin-converting enzyme inhibitors (SHR 1.54, 95% CI = 1.51 to 1.57) and angiotensin II receptor blockers (SHR 1.43, 95% CI = 1.38 to 1.48) showing the strongest associations (Table 2).

External validation
The distribution of the linear predictor in the validation dataset, grouped by outcome type, can be seen in Supplementary Figure S3.
There was some evidence of model overprediction at each time point, although this was less pronounced in the models that had been recalibrated to the development data at 5 and 10 years (Table 3, Figure 1, Supplementary Figure S4, and Supplementary Table S1). Miscalibration was mostly evident in a small number of patients at higher predicted probabilities (>20% risk).
Net-benefit analysis showed that using the model with an AKI risk threshold of ≥10% to define those at high risk (potentially requiring action) would result in higher clinical utility than other approaches, such as assuming that everyone is at high risk of AKI or that all patients have low risk (Figure 2). Model performance varied more among smaller practices, with more consistent performance seen as practice size increased (see Supplementary Figure S5).

DISCUSSION

Summary
In this study, a clinical prediction model was developed to identify patients with an indication for antihypertensive treatment who are at higher risk of AKI leading to significant harm within 10 years. Most patients had a very low risk, particularly in the medium term (≤5 years). The model incorporated commonly recorded patient characteristics, comorbidities, and prescribed medications, and showed good discrimination on external validation. Where miscalibration occurred, this primarily affected the small proportion of patients with a very high risk of AKI (>20% over 10 years, n = 204 775 patients [5%]). Such a tool could therefore be useful for GPs and pharmacists to reassure most patients that their risk of AKI is low, and that although treatment with medications such as antihypertensives might increase this risk, 3 it is unlikely to outweigh the potential benefits from reducing blood pressure and CVD risk. For the small number where this is not the case, the tool could flag this to allow incorporation into clinical decision making.

Strengths and limitations
This study used two large, population-based cohorts to derive and externally validate a clinical prediction model for admission to hospital or death with AKI. These datasets have been shown to be representative of patients across England and include data collected by many hundreds of GP practices. It is therefore expected that the findings are generalisable to the same population. 21 A strength of this analysis was that it accounted for the competing risk of death in each analysis, which minimises the likelihood of overestimating the underlying risk of AKI. This is important for older patients, where the competing risk of death is high. The model only included predictors that are routinely available in primary care EHRs, and those predictors with missing data at implementation (such as alcohol consumption, ethnicity, and body mass index) could easily be collected within the patient consultation in which the tool is used.
This analysis had some limitations. First, model miscalibration was present, particularly for those with higher predicted risks, although this is common in prediction models based on EHRs commonly used in clinical practice, 22 and has been observed in those previously developed for AKI. 23 Such miscalibration would not be a problem in practice if using lower thresholds to define high/low risk (for example, above or below 10% over 10 years). Second, the model outcome was based on hospital and death registry codes, where AKI was listed as the primary cause of admission/death, rather than guideline recommended changes in creatinine, 5 although many of these codes will have been based on creatinine measurements. This aimed to ensure that the AKI events were truly significant and hence meaningful for both patients and their GPs. It also avoided simply labelling individuals on the basis of blood results that may not have an impact on their quality of life, 24 although the authors acknowledge that such test results are important and can lead to further nephrology referral, investigations, and medication changes, all of which can impose a burden on patients.
Finally, previous prediction models for AKI, based in a secondary care setting, have included conditions such as heart failure, respiratory failure, and prescription of nonsteroidal anti-inflammatory medications. 23,25 These predictors were not included in the present analysis and it is unclear whether [...]

Figure 1. Calibration plots comparing observed and predicted risk of acute kidney injury at: a and b) 1; c and d) 5; and e and f) 10 years in the GOLD derivation dataset and Aurum validation dataset.

Figure 2. [...] (a net benefit of 0.1 means 10 true positives per 100 patients).

Comparison with existing literature
Previous clinical prediction models developed to predict AKI have almost exclusively focused on utility in an inpatient or post-operative setting, 9,23,26 using data from a selected population of patients admitted to hospital for a range of conditions such as heart failure. 27 These models estimated the risk of AKI over shorter periods of follow-up, did not account for the competing risk of death, and did not include prescribed medication as a potential predictor.
To the authors' knowledge, this is the first clinical prediction model for AKI developed for use in a primary care setting and taking into account prescribed medication. Unlike many other models, 9,23,27 the present model was externally validated in a nationally representative population and displayed better discrimination than previous models, 23 even at 10 years post-index date.

Implications for practice
The rationale for developing this model was to provide data to aid GPs in better understanding the balance of benefits and harms of antihypertensive therapy, before prescribing treatment or modifying existing prescriptions. To do this, GPs need to understand the effect of treatment on CVD 2 and adverse events, 3 and an individual's underlying risk of benefit and harm. Many CVD prediction models exist that enable the benefits of antihypertensive treatment to be estimated, 4 but unlike conditions such as atrial fibrillation where stroke prevention is routinely assessed against bleeding risk, the risk of harm from antihypertensive treatment is not well documented or understood. 3,28-30 The present prediction model provides this information.
Given the low risk of AKI seen across the population, it seems likely that these particular harms of treatment would only outweigh the benefits in a small fraction of individuals. Indeed, in the present population, <1% of individuals had a high risk of AKI but low risk of CVD. These individuals were more likely to be obese, be from an area of high deprivation, or be prescribed multiple antihypertensive medications, but using the present tool alongside existing CVD prediction tools 4 would provide the most personalised risk estimates. Such tools could also be enhanced by incorporating similar evidence regarding the risk of falls 31 to develop a multidimensional antihypertensive harm tool. Regular monitoring of creatinine levels is recommended in primary care, including in those with hypertension, 32 and blood test results now routinely alert GPs to possible AKI. 6 What GPs should do with this information remains unclear. 33 The present prediction model could be used to target such monitoring to those most likely to benefit from it.
In conclusion, the present study developed and validated a clinical prediction model for admission to hospital or death with AKI, and found most patients with an indication for antihypertensives had a very low risk of AKI. This model could be used to reassure patients starting or up-titrating antihypertensive treatment, and should be used alongside other prediction models for adverse events related to antihypertensive therapy 31 to allow GPs and patients to better understand the full spectrum of benefits and harms from such treatment.

[Figure: 10-year AKI risk, % plotted against 10-year CVD risk, %]

Ethical approval
The study protocol was approved by the Clinical Practice Research Datalink (CPRD) independent scientific advisory committee in February 2019 before obtaining the data relevant to the project. As all data are fully anonymised, no consent was required.

Data
Data were obtained via a CPRD institutional licence. Requests for data sharing should be made directly to the CPRD. The algorithm is freely available for research use and can be downloaded from: https://process.innovation.ox.ac.uk/software (will be made publicly available on publication of this manuscript). Codelists used to generate the study cohort and variables included in the analysis are available at: https://github.com/jamessheppard48/STRATIFY-BP/tree/STRATIFY-AKI.

Provenance
Freely submitted; externally peer reviewed.
e613 British Journal of General Practice, August 2023