Development and external validation of a new clinical prediction model for early recognition of sepsis in adult patients in primary care: a diagnostic study

Loots, Feike; Smits, Marleen; Hopstaken, Rogier; Jenniskens, Kevin; Schroeten, Fleur; Van den Bruel, Ann; van de Pol, Alma C; Oosterheert, Jan-Jelrik; Bouma, Hjalmar; Little, Paul; Moore, Michael; van Delft, Sanne; Rijpsma, Douwe; Holkenborg, Joris; van Bussel, Bas; Laven, Ralph; Bergmans, Dennis; Hoogerwerf, Jacobien; Latten, Gideon; de Bont, Eefje; Giesen, Paul; den Harder, Annemarie; Kusters, Ron; van Zanten, Arthur; Verheij, Theo


Introduction
Early recognition of sepsis is the critical factor influencing patient outcome. 1-4 Protocols for the early identification of sepsis to trigger the administration of intravenous antibiotics successfully decreased sepsis-related mortality in emergency departments (EDs). 5,6 In patients with community-acquired sepsis, general practitioners (GPs) are often the first healthcare providers to assess the patient. 7,8 GPs' recognition of sepsis and their decision to refer a patient to the hospital are essential for adequate treatment. At the same time, GPs have an essential role in preventing unnecessary referrals, as hospital admission in itself can have a negative impact, especially in older, frail patients.
Currently, GPs' decision to refer patients with severe infections to the hospital is based on an intuitive interpretation of signs, symptoms and the general impression of a patient. 9,10 To date, no diagnostic model is available in primary care to support decisions on the diagnosis and management of sepsis. Clinical scores used in hospitals, like the quick Sequential Organ Failure Assessment (qSOFA) score, 11 Systemic Inflammatory Response Syndrome (SIRS) criteria 12 or National Early Warning Score (NEWS), 13 have not been validated in primary care.
This study aimed to develop and validate a first diagnostic clinical model for the early recognition of sepsis in adults presenting in primary care. Ideally, patients with sepsis are identified early in the course of the disease, and the model was therefore designed to predict whether sepsis is present within 72 hours. In these patients, immediate hospital referral is expected to improve outcome. Clinical signs and symptoms and biomarkers potentially available at the bedside were investigated.

Setting
Patients were enrolled between June 2018 and March 2020 at four participating out-of-hours primary care services in the central and southern Netherlands. The combined area covers ~800,000 inhabitants in a mixed urban, suburban and rural area. In the Netherlands, out-of-hours primary care is organised in large-scale primary care services serving between 50,000 and 400,000 inhabitants. 14 Telephone triage is used to decide who needs to come to the clinic and who is visited at home. Only patients who received home visits were included in the study, as these patients are usually more severely ill than other primary care populations. All participants (or legally authorised representatives of incapacitated patients) gave written informed consent for the study. The protocol for this study has been previously published 15 and can be consulted for further details.

Patients
Acutely ill adult (≥18 years) patients with fever, confusion, general deterioration or otherwise suspected severe infection were eligible for inclusion. Patients were excluded if any of the following criteria were present: 1) non-infectious diagnosis suspected as the cause of the acute complaints (e.g. myocardial infarction or stroke); 2) hospitalisation within seven days before the home visit; 3) a condition present requiring secondary care assessment regardless of the severity of infection (e.g. neutropenic fever); 4) terminal illness or other reason not to be referred to the hospital, despite the presence of a life-threatening condition.

Procedures
The GP assessed eligibility for inclusion at the home visit. Drivers who accompanied the GPs during the home visit were equipped with portable monitoring devices (Philips Intellivue MP2 or X2) to measure blood pressure, peripheral oxygen saturation, heart rate, and respiratory rate. All vital signs and other clinical candidate predictors were registered in a case report form on site. The GP also rated the perceived likelihood of sepsis on a scale from 0 to 10. Either the GP or an on-call laboratory assistant obtained venous blood samples directly after inclusion. Lactate was measured by point-of-care testing (StatStrip Xpress lactate, Nova Biomedical), as lactate cannot be measured reliably from stored blood samples. 18 The venous blood samples were stored at -70 °C for later measurement of C-reactive protein (CRP) and procalcitonin (PCT). All patients received care as usual.

Outcome definitions and assessment
Three expert panels were created, each consisting of one GP, one emergency physician and one intensivist (or acute care internist). These expert panels established the primary outcome "sepsis within 72 hours of inclusion", using all relevant information from medical records, per the Sepsis-3 definition. 19 Cases were divided among the three panels, with 10% of all cases being evaluated by all three panels to assess inter-rater and inter-panel reliability. If panel members could not reach a consensus on the presence or absence of sepsis, the case was discussed in a face-to-face meeting until consensus was reached.
Secondary outcomes assessed by the expert panel included whether the infection was the cause of acute complaints (yes/no) and the need for hospital treatment (on a scale from 0 to 10). Furthermore, the presence or absence of an "adverse outcome", defined as an intensive care unit (ICU) admission within 72 hours or death within 30 days of inclusion, was determined.

Statistical analysis
Baseline characteristics of the study population were described using the mean and standard deviation for continuous variables with a normal distribution and the median and interquartile range (IQR) for variables with a skewed distribution. Inter-rater and inter-panel reliability were assessed using Cohen's Kappa for the primary outcome of sepsis.
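As an illustration of the reliability measure named above, the following is a minimal Python sketch of Cohen's kappa for two sets of ratings on the same cases. It is not the authors' code (the analyses were run in R), the function name `cohens_kappa` and the example ratings are hypothetical, and how agreement across three panels was summarised is not described here; a pairwise comparison is assumed.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters' categorical labels on the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: proportion of cases with identical labels.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per label.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical sepsis judgements (1 = sepsis, 0 = no sepsis) from two panels.
panel_1 = [1, 1, 0, 0]
panel_2 = [1, 0, 0, 0]
print(cohens_kappa(panel_1, panel_2))
```

Kappa corrects the raw agreement for the agreement expected by chance, which matters when one outcome category is much more common than the other.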
Multiple imputation using the Multivariate Imputation by Chained Equations (MICE) procedure 21,22 was used to account for missing data. Regression coefficients and performance measures from the imputed datasets were pooled using Rubin's rules. 23 Combined with previously described methods for imputing missing data and variable selection, 26 optimism in the C statistics of the continuous models was estimated using 10-fold cross-validation and used to adjust them. The calibration slope was used as a shrinkage factor for the model regression coefficients, after which the intercept was re-estimated.
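The pooling and shrinkage steps can be sketched as follows. This is a hedged Python illustration under stated assumptions, not the authors' R implementation: `pool_rubin` applies Rubin's rules to one parameter estimated in each imputed dataset, and `reestimate_intercept` assumes the shrinkage is applied by multiplying the linear predictor (excluding the intercept) by the calibration slope and then refitting only the intercept with that shrunken predictor as an offset. Both function names and all inputs are hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed datasets using Rubin's rules.

    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching estimates and variances from >= 2 imputations")
    pooled = mean(estimates)
    within = mean(variances)       # average within-imputation variance
    between = variance(estimates)  # between-imputation variance (divides by m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


def reestimate_intercept(outcomes, linear_predictors, slope, iterations=50):
    """Shrink the linear predictor by `slope` and refit the intercept.

    Solves the logistic score equation sum(y - risk) = 0 for the intercept
    by Newton-Raphson, with the shrunken linear predictor as a fixed offset.
    """
    if not 0 < sum(outcomes) < len(outcomes):
        raise ValueError("outcomes must contain both events and non-events")
    offsets = [slope * lp for lp in linear_predictors]
    intercept = 0.0
    for _ in range(iterations):
        risks = [1 / (1 + math.exp(-(intercept + o))) for o in offsets]
        gradient = sum(y - r for y, r in zip(outcomes, risks))
        curvature = sum(r * (1 - r) for r in risks)
        if curvature == 0 or abs(gradient) < 1e-12:
            break
        intercept += gradient / curvature
    return intercept
```

After re-estimation, the mean predicted risk equals the observed event rate in the data used for fitting, which is the purpose of updating the intercept once the coefficients have been shrunk.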
Discrimination was evaluated using the area under the receiver-operating curve (C statistic).
Calibration was assessed by visual inspection of the calibration plots and by evaluating the calibration slope and Brier score. In addition, calibration in the external datasets was assessed using the observed/expected (O/E) ratio as a measure of calibration in the large. Percentiles of bootstrapped samples were used to calculate 95% confidence intervals (CI) for performance measures. Performance measures of the continuous and simplified models were compared to each other, as well as to the performance of existing scoring systems (i.e., SIRS, qSOFA, and NEWS), and to the likelihood of sepsis (on a scale from 0 to 10) according to the GP on site.
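The performance measures described above can be made concrete with a short Python sketch. It is illustrative only and not the authors' R code; the function names, the number of bootstrap replicates and the seed are assumptions. The C statistic is computed as the proportion of case/non-case pairs in which the case has the higher predicted risk (ties count half), and the confidence interval uses plain percentiles of bootstrap resamples.

```python
import random


def c_statistic(y_true, y_prob):
    """Area under the ROC curve via pairwise comparison of cases and non-cases."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for p_case in cases:
        for p_control in controls:
            if p_case > p_control:
                concordant += 1.0
            elif p_case == p_control:
                concordant += 0.5
    return concordant / (len(cases) * len(controls))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def oe_ratio(y_true, y_prob):
    """Observed number of events divided by the expected (predicted) number."""
    return sum(y_true) / sum(y_prob)


def bootstrap_percentile_ci(y_true, y_prob, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for a performance metric."""
    n = len(y_true)
    if not 0 < sum(y_true) < n:
        raise ValueError("outcomes must contain both events and non-events")
    rng = random.Random(seed)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        ps = [y_prob[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(metric(ys, ps))
    stats.sort()
    lower = stats[int((alpha / 2) * (n_boot - 1))]
    upper = stats[int((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper
```

An O/E ratio above 1 indicates that the model predicts fewer events than were observed, and a ratio below 1 indicates overprediction.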
R version 4.0.5 was used for the analyses.

Sensitivity analyses
To evaluate potential incorporation bias resulting from the use of the SOFA score (by the expert panel) as part of the sepsis definition, model performance for secondary outcomes was assessed, as well as for a more conservative calculation of the SOFA score (fewer SOFA points for decreased oxygen saturation and altered mental status).

External validation
To test the external validity of both the continuous and simplified models, datasets from patients with suspected infections assessed in two EDs in the Netherlands were used. Discrimination was assessed using the C statistic, and the continuous and simplified models were compared with the NEWS. Calibration was assessed using calibration plots, as well as calibration in the large and the calibration slope. A more detailed description can be found in Supplementary Methods S1.

Study population and outcome
In total, 357 patients were included for analysis (Figure 1, Table S2). Table 1 shows a summary of the characteristics of patients with and without sepsis.

Prediction model development
Of the nine clinical candidate predictors, six were included in the continuous model. A simplified model was created through the dichotomisation of variables included in the continuous model (Box 1). Models without respiratory rate were also evaluated, as the respiratory rate is less feasible for GPs to measure. Heart rate showed collinearity with respiratory rate, and model performance did not decrease after substitution. Consequently, heart rate was used instead of respiratory rate in the final simplified model.
Discrimination of the simplified model (C statistic of 0.80, 95% CI 0.76 to 0.83) was nearly identical to that of the continuous model (Figure 2). Diagnostic accuracy measures for the simplified model at different cutoff scores are presented in Table 2. The calibration of the simplified model was also similar to that of the continuous model (Supplementary Figure S5). The use of multiple cutoff points for individual variables in the model, or grouping score categories, did not significantly improve performance.
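To show how diagnostic accuracy measures at a given cutoff score are derived, the following minimal Python sketch classifies patients as test-positive when their score is at or above the cutoff and computes sensitivity, specificity and predictive values. The function name, the "score at or above cutoff" rule and the example data are assumptions for illustration; they are not taken from Table 2 or Box 1.

```python
def accuracy_at_cutoff(scores, has_sepsis, cutoff):
    """Diagnostic accuracy of 'score >= cutoff' against the reference standard."""
    tp = fp = fn = tn = 0
    for score, sepsis in zip(scores, has_sepsis):
        test_positive = score >= cutoff
        if test_positive and sepsis:
            tp += 1
        elif test_positive:
            fp += 1
        elif sepsis:
            fn += 1
        else:
            tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (empty denominator) are reported as NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical scores and reference-standard outcomes (1 = sepsis).
example_scores = [0, 1, 2, 3, 4, 5]
example_sepsis = [0, 0, 1, 0, 1, 1]
for cutoff in range(1, 6):
    print(cutoff, accuracy_at_cutoff(example_scores, example_sepsis, cutoff))
```

Raising the cutoff trades sensitivity for specificity, which is why accuracy is reported across several thresholds rather than at a single one.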

Comparison with existing models
Performance of the continuous and simplified models was compared to SIRS, qSOFA and NEWS (Table 3).

Sensitivity analyses
The prediction of secondary endpoints (including alternative sepsis definition using more restrictive calculation of the SOFA score) resulted in comparable performance results for the continuous model, simplified model and NEWS for all analyses. C statistics ranged between 0.7 and 0.8 for all outcomes, except for prediction of "adverse outcome" (ICU admission <72 hours or 30-day mortality), where a C statistic of 0.58 (95% CI 0.51 to 0.66) was found for the continuous model, compared to 0.62 (95% CI 0.53 to 0.69) for both the simplified model and NEWS (Supplementary Table S6).

External validation
The first validation dataset (Dataset 1) was from a teaching hospital in the south of the Netherlands and was previously published by Latten et al. 7 The population consisted of 440 patients with a median age of 71 years, of whom 163 (37%) were diagnosed with sepsis (severe sepsis or septic shock according to the Sepsis-2 definitions). 26

Summary
In this observational cohort study, a new and easy-to-use prediction model was developed for the early recognition of sepsis in primary care. Biomarkers provided no significant improvement in prediction performance when added to the model. The respiratory rate could be replaced with heart rate, a more accessible and more reliable measure, without decreasing the prediction performance of the simplified model. The performance of the simplified model was significantly better than that of SIRS and qSOFA and comparable to that of NEWS.
The validity of the simplified model was confirmed in the external validation, although some differences were found in discrimination and calibration compared to the development data.
Three different aspects may have contributed to these discrepancies. Firstly, the outcome "sepsis" was defined differently in the external datasets. The SIRS-based sepsis definition may have introduced incorporation bias in the first external dataset (Dataset 1), resulting in better NEWS predictions. Secondly, the variable "altered mental status" was registered differently. Any empirical change in mental status was sufficient in our cohort, while a decrease in the Glasgow Coma Score was used in the validation cohorts. This score is probably less sensitive to subtle changes in mental status. Finally, resuscitation of septic patients by ambulance personnel, including administration of intravenous fluids and supplemental oxygen, is likely to have occurred. Consequently, vital signs may have normalised by the time patients arrived at the ED and were included in the study. 28

Strengths and limitations
This study is the first to include patients in their home situation, where the decision to refer the patient had yet to be made. This is a major strength, as the potential impact on patient care is larger in these patients than in patients already in, or in transit to, the hospital. Another strength of the study is the prospective design, specifically tailored to developing a clinical prediction rule. As only very few data on the candidate predictors were missing, the study was sufficiently powered according to prevailing sample size calculation methods. 25,29,30 Furthermore, the newly developed models were internally and externally validated and compared to existing scoring systems.
Several limitations of this study should be taken into account. Firstly, using an expert panel as a reference standard for sepsis may have resulted in biased results. Verification bias may have occurred, as patients referred to the hospital received more diagnostic tests than non-referred patients did. Secondly, as some candidate predictors were also part of the SOFA score, this may have resulted in incorporation bias. Therefore, sensitivity analyses were performed using a stricter calculation of the SOFA score and alternative outcomes (i.e. adverse outcomes and need for hospital treatment according to the expert panel). These analyses did not suggest significant bias. Furthermore, not all eligible patients were included in the study. However, the most common reasons for not including eligible patients were not based on patient factors but rather on a shift being too busy, which is unlikely to have resulted in selection bias. Finally, the external validations were performed in patients assessed in the accident and emergency department due to suspected infection. Ideally, validation of the model would have been performed in a primary care population in whom the decision to refer a patient to the hospital had not yet been made. These data were not available to the authors. However, the fact that the model also performed well in other domains underscores its robustness.

Comparison with existing literature
Other clinical prediction rules have been proposed for either sepsis or critically ill patients in the prehospital setting. These were mostly derived from retrospective data retrieved from patients transported by ambulance and used SIRS-based sepsis definitions. 31 Only one prospective cohort study using the Sepsis-3 outcome definition was found in the prehospital setting, which included 551 patients with suspected infection in the ambulance. 32 This study showed blood pressure ≤100 mmHg, temperature >38.5 °C, lactate >4 mmol/L, gastrointestinal symptoms, and altered mental status to be most predictive of sepsis. These findings mainly align with our results and support the decision not to include respiratory rate in the simplified model. In our data, only three patients showed a lactate >4 mmol/L, which might explain why lactate was not found to be a useful predictor in the primary care setting. Two studies were found in which vital signs were measured in acutely ill adult patients in a primary care setting. 32,33 However, both studies only included patients who were referred to a hospital or acute care clinic, and neither reported sepsis as an outcome measure. The performance of the simplified prediction model developed in the current study was comparable to that of the NEWS. NEWS was initially developed for the early detection of clinical deterioration of adult patients admitted to the hospital. 35 Recent studies in the ED setting showed NEWS to be superior to SIRS and qSOFA in predicting sepsis, 36,37 which was confirmed in our study for the primary care setting. An implementation study of the NEWS in the prehospital setting in England showed promising results, 38 but NEWS was only performed in 30% and 63% of cases by GP support teams and ambulance personnel, respectively. 34

Implications for research and practice
Although the difference between empirical clinical assessment by the GP and the performance of our model was modest, the model can help support clinicians during the busy daily routine, reduce variation in the quality of primary care and improve collaboration between primary and secondary care for this potentially life-threatening condition. The model is not intended to overrule the GP's overall judgement but rather to inform the GP on the probability of the outcome sepsis. The GP can subsequently use this information to decide whether or not to refer the patient to the hospital. The presented simplified model is easy to use in daily practice. Compared to the NEWS score, our model does not include respiratory rate and does not have a complex scoring matrix. The results do not mean that respiratory rate should not be measured in severely ill patients, and the minority of GPs who currently use the NEWS score are using a valid and useful model, as our results showed. Our simplified model showed similar diagnostic properties and could be easier to implement in the primary care setting. After the decision to refer a patient due to suspected sepsis, ambulance personnel can score the NEWS depending on local protocols. Before the new model is widely advocated, its effects on referrals and patient outcomes should also be prospectively evaluated in a pragmatic trial in primary care.

Conclusion
A simple score-based model can accurately predict sepsis in adult primary care patients with suspected severe infections at home. Biomarkers do not improve the model's predictive performance. The score does not replace clinical judgement, and further research will have to demonstrate how GPs can best use the score to improve the management of patients with possible sepsis.

Funding
This study was funded by ZonMw (grant number 843001811). Star-shl provided additional funding. The following manufacturers provided in-kind funding of materials: Philips, Nova Biomedical, ThermoFisher. The funders had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; and decision to submit the manuscript for publication.

Ethical approval
The study received ethical approval from the medical research ethics committee Utrecht (reference number 18-169).

Table 2. Diagnostic accuracy measures with 95% confidence intervals of the simplified prediction model for predicting sepsis at different score thresholds in the development data (n=357).