Abstract
Background Electronic health records (EHRs) are increasingly used for research; however, multicomponent outcome measures such as daily functioning cannot yet be readily extracted.
Aim To evaluate whether an electronic frailty index based on routine primary care data can be used as a measure for daily functioning in research with community-dwelling older persons (aged ≥75 years).
Design and setting Cohort study among participants of the Integrated Systemic Care for Older People (ISCOPE) trial (11 476 eligible; 7285 in observational cohort; 3141 in trial; over-representation of frail people).
Method At baseline (T0) and after 12 months (T12), daily functioning was measured with the Groningen Activities Restriction Scale (GARS, range 18–72). Electronic frailty index scores (range 0–1) at T0 and T12 were computed from the EHRs. The electronic frailty index (electronic Frailty Index — Utrecht) was tested for responsiveness and compared with the GARS as a gold standard for daily functioning.
Results In total, 1390 participants with complete EHR and follow-up data were selected (31.4% male; median age = 81 years, interquartile range = 78–85). The electronic frailty index increased with age, was higher for females, and lower for participants living with a partner. It was responsive after an acute major medical event; however, the correlation between the electronic frailty index and GARS at T0 and over time was limited.
Conclusion Because the electronic frailty index does not reflect daily functioning, further research on new methods to measure daily functioning with routine care data (for example, other proxies) is needed before EHRs can be a useful data source for research with older persons.
INTRODUCTION
The use of routine care data such as electronic health records (EHRs) for research and population health management is increasing. These EHRs could be a valuable data source for research with older persons, which is often expensive and time-consuming. Some variables (for example, diagnoses, death, hospital admissions, polypharmacy, and multimorbidity) can be easily extracted from GPs’ EHRs. However, often in research with older persons, complex, multicomponent outcome measures such as quality of life and functioning are used. These variables cannot be readily extracted from EHR data.1,2 Daily functioning, which is often used as an outcome measure in the older population, is such a variable.3–7 It is described in terms of basic Activities of Daily Living (ADL) and instrumental Activities of Daily Living (iADL). Both in research and clinical practice, these are currently assessed with questionnaires such as the Katz ADL scale, the Lawton iADL scale, or the Groningen Activities Restriction Scale (GARS [ADL and iADL]).8–10 As reflected in those questionnaires, daily functioning is the result of a patient’s physical, psychological, cognitive, and social status.11 A potential measure of daily functioning based on items of the EHR should therefore incorporate these different aspects.
The frailty index (FI), as outlined by Rockwood et al,12 integrates the different aspects mentioned above (that is, physical, psychological, cognitive, and social functioning) into one measure.13,14 An FI consists of a comprehensive list of deficits and functional losses in different domains, from which a continuous score is calculated by dividing the number of deficits present in an individual by the total number of deficits from the list (score range 0–1).15,16 Most FIs are derived from questionnaires, but more recently FIs were developed that were derived from routine care data.12,17 Previous research has shown that the scores of the FI are stable across different versions of the FI and across different data sources used.15,16
Some researchers have suggested that the integration of multiple domains of functioning into the FI makes it a potentially useful evaluative measure for health status or functioning.13,15,18 However, other researchers state that a measure of functional decline should not only include the number of deficits, but also the severity and impact of each deficit, which would make the FI unfit as a measure of daily functioning.19,20 If an older person’s daily functioning can be extracted from routine care data, it opens new opportunities for research in large datasets, potentially saving costs and time in research. The FI is currently the only multicomponent outcome measure that can be extracted from EHRs, but it is still unclear whether it could serve as a proxy for daily functioning. The aim of this study was to test whether an electronic FI based on routine primary care data can be used as an evaluative measure for daily functioning in research with older persons.
Daily functioning is a frequently used outcome measure in the older population. If it could be extracted from routine care data it could save cost and time for both research and general practice. Although there are currently no established methods to measure daily functioning with routine primary care data, an electronic frailty index was suggested as a potentially useful evaluative measure for functioning. The electronic frailty index tested in this study (electronic Frailty Index — Utrecht) was responsive after an acute major medical event, but did not compare well with the gold standard for daily functioning (that is, the Groningen Activities Restriction Scale). Therefore, in its current state and context, the electronic frailty index cannot be used in research or general practice because of its limited ability to reflect daily functioning.
METHOD
Design
This was a prospective cohort study embedded in the Integrated Systemic Care for Older People (ISCOPE) trial. Further details about the trial are described elsewhere.21
ISCOPE study
The ISCOPE study included 59 general practices from the Leiden region (the Netherlands). All patients aged ≥75 years enlisted in these practices were invited to participate. Exclusion criteria were:
Postal screening questionnaires together with an invitation to participate in the study were sent to 11 476 older persons. The ISCOPE screening questionnaire consisted of questions on four health domains (that is, functional; somatic; psychological; and social). Those who filled in and returned the ISCOPE screening questionnaire and the informed consent form (n = 7285) were included in the study. Inclusion took place from September 2009 to September 2010. All participants gave informed consent.21
For the trial, a selection of the participants (n = 3141) was included for a 12-month follow-up. This sample consisted of all participants with problems on three or four domains of the ISCOPE screening questionnaire, a random sample of 60% of participants with problems on two domains, and a random sample of 15% of participants with problems on one or no domain. At baseline (T0) they were visited at home by a research nurse to collect extra information on sociodemographic characteristics and to administer additional questionnaires (that is, GARS and Mini-Mental State Examination [MMSE]; range = 0 to 30). After 12 months (T12) the measurements were administered again. In addition, data over a period of 5 years until 1 year after the first home visit were extracted from the participants’ EHRs. The extracted data contained diagnoses with International Classification of Primary Care (ICPC)-1-NL codes, prescriptions with Anatomical Therapeutic Chemical (ATC) codes, and free text. The EHR data were linked to the study data at the person level using a personal identification number.
Participants
Inclusion criteria for this secondary analysis were a complete follow-up (T12), an available EHR, and at least one ICPC or ATC code registered in the EHR (that is, necessary to compute the electronic FI). Participants with missing values on either the GARS or the electronic FI were also excluded from the analyses (n = 23).
Measures
Electronic Frailty Index — Utrecht (eFI-U)
In this study, the electronic FI was used as developed by Drubbel et al 22–26 (the eFI-U). This FI is generated from routine primary care data and consists of a list of 50 deficits (Supplementary Table S1). It includes physical, psychological, cognitive, and social deficits. Each deficit in turn consists of a list of ICPC and ATC codes related to that deficit. If at least one of these ICPC or ATC codes was present in the previous 6 months or 5 years (depending on the code), the corresponding deficit was scored as present (that is, one point). Diagnostic measurement data were not included in the eFI-U of this study, because these data were not extracted in the ISCOPE study.
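The deficit-counting rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: the three deficits, their codes, and their look-back windows below are hypothetical placeholders (the actual eFI-U uses the 50 deficits in Supplementary Table S1), and the function name `frailty_index` is invented for illustration. Codes are matched exactly here, which is a simplifying assumption.

```python
from datetime import date, timedelta

# Hypothetical deficit definitions for illustration only. The real eFI-U
# has 50 deficits; these codes and look-back windows are assumptions.
DEFICITS = {
    "diabetes": {"codes": {"T90"}, "window_days": 5 * 365},
    "sleep_medication": {"codes": {"N05C"}, "window_days": 183},
    "depression": {"codes": {"P76"}, "window_days": 5 * 365},
}


def frailty_index(records, reference_date, deficits=DEFICITS):
    """Return the share of deficits present (range 0 to 1).

    records: iterable of (code, date) pairs taken from the EHR.
    A deficit counts as present if at least one of its codes was
    registered inside that deficit's look-back window.
    """
    records = list(records)
    present = 0
    for spec in deficits.values():
        cutoff = reference_date - timedelta(days=spec["window_days"])
        if any(
            code in spec["codes"] and cutoff <= when <= reference_date
            for code, when in records
        ):
            present += 1
    return present / len(deficits)


example_records = [
    ("T90", date(2009, 1, 1)),   # inside the 5-year window
    ("N05C", date(2008, 1, 1)),  # outside the roughly 6-month window
]
example_score = frailty_index(example_records, date(2010, 1, 1))
```

In this toy example only one of the three placeholder deficits falls inside its window, so the score is one third.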
The Groningen Activities Restriction Scale (GARS)
The GARS was used as a gold standard for measuring daily functioning. The GARS is an 18-item questionnaire with 11 questions on basic ADL and seven questions on iADL. Each question has four answer categories:
fully independent without problems;
fully independent, but with some difficulty;
fully independent, but with a lot of difficulty; and
only with another person’s help.
The total score ranges from 18 to 72 points, with a higher score indicating a lower level of daily functioning or more dependency.
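As a small illustration of the scoring, the sketch below assumes that the four answer categories are coded 1 to 4 in the order listed above, which is consistent with the stated range of 18 to 72 but is not spelled out in the text. The function name `gars_total` is hypothetical.

```python
def gars_total(item_scores):
    """Sum 18 GARS item scores, each assumed to be coded 1 to 4."""
    if len(item_scores) != 18:
        raise ValueError("GARS has 18 items")
    if any(score not in (1, 2, 3, 4) for score in item_scores):
        raise ValueError("each item must be scored 1, 2, 3, or 4")
    return sum(item_scores)


fully_independent = gars_total([1] * 18)  # lowest possible total
fully_dependent = gars_total([4] * 18)    # highest possible total
```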
Subgroups
Subgroups based on the occurrence of an acute major medical event during follow-up were compared. An acute major medical event was defined as a medical event with a sudden onset that is likely to have a large impact on a person’s daily functioning. Hip fracture, myocardial infarction, and stroke were included as acute major medical events. These events were considered to be present either if participants reported them in the follow-up questionnaire, or if corresponding ICPC codes were registered during the follow-up period. Both sources were used to ensure that all participants with an event during the follow-up period were identified. The ICPC codes included were L75 (femur fracture), K75 (acute myocardial infarction), K89 (transient cerebral ischaemia), and K90 (stroke).
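The either-or rule for identifying events can be sketched as follows. The four ICPC codes come from the text; the function name and the choice to match on the first three characters of a code (so that a sub-coded entry would still count) are assumptions made for this illustration.

```python
# ICPC codes named in the text as acute major medical events.
EVENT_CODES = {"L75", "K75", "K89", "K90"}


def had_major_event(self_reported, icpc_codes_in_follow_up):
    """True if the event was self-reported OR a matching code was registered.

    Matching on the first three characters is an assumption, not a
    documented detail of the study.
    """
    coded = any(code[:3] in EVENT_CODES for code in icpc_codes_in_follow_up)
    return bool(self_reported) or coded


coded_only = had_major_event(False, ["T90", "K90"])
reported_only = had_major_event(True, [])
neither = had_major_event(False, ["T90"])
```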
Statistical analysis
Characteristics of the population at baseline were described. The construct validity of the eFI-U was assessed by comparing subgroups based on age, sex, and living status. Based on previous findings with the GARS, it was hypothesised that, if the baseline eFI-U measured daily functioning, average scores would increase with age, be higher for females compared with males, and be highest for those living in a residential care facility and lowest for those living independently with a partner. This was tested with Spearman’s correlation (age), the Mann–Whitney U test (sex), and the Kruskal–Wallis test (living status: independently alone; independently with partner; or residential care facility).27–32
To test the eFI-U for floor and ceiling effects, a histogram of the eFI-U at baseline was created for visual inspection. In addition, the proportion of participants with the lowest or highest possible score was calculated; floor or ceiling effects were considered to be present if this proportion was >15%. The upper limit of the eFI-U was assessed by plotting the 99th percentiles of the baseline eFI-U in the cohort against age.
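The 15% criterion can be checked numerically with a few lines of code. This is a sketch under the assumption that the lowest and highest possible eFI-U scores are 0 and 1; the function name and the toy data are hypothetical and not taken from the study.

```python
def floor_ceiling_flags(scores, lowest=0.0, highest=1.0, threshold=0.15):
    """Return (floor_effect, ceiling_effect) as booleans.

    An effect is flagged when more than `threshold` of the scores sit
    exactly at the lowest or highest possible value.
    """
    n = len(scores)
    share_at_floor = sum(1 for s in scores if s == lowest) / n
    share_at_ceiling = sum(1 for s in scores if s == highest) / n
    return share_at_floor > threshold, share_at_ceiling > threshold


# Toy data: 2 of 10 scores sit at the floor (20%), none at the ceiling.
toy_scores = [0.0, 0.1, 0.2, 0.3, 0.0, 0.4, 0.2, 0.1, 0.3, 0.2]
toy_flags = floor_ceiling_flags(toy_scores)
```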
For both the eFI-U and the GARS the difference between the follow-up and baseline scores was calculated (delta = measurement at 12 months minus measurement at baseline). The delta scores were also corrected for the baseline scores, because the latter influence the potential change over time. The resulting relative deltas were calculated as the actual delta divided by the maximum delta possible for that patient (relative delta = [measurement at 12 months minus measurement at baseline] divided by [total score minus measurement at baseline plus 0.01]). An extra 0.01 was added to the denominator to avoid a value of zero.
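The two change scores can be written out as a short sketch. It assumes that "total score" in the formula means the maximum possible score of the instrument (72 for the GARS, 1 for the eFI-U); the function names are hypothetical.

```python
def delta(baseline, follow_up):
    """Absolute change: measurement at 12 months minus baseline."""
    return follow_up - baseline


def relative_delta(baseline, follow_up, max_score, offset=0.01):
    """Change as a share of the room left above the baseline score.

    The small offset keeps the denominator non-zero when the baseline
    is already at the maximum possible score.
    """
    return (follow_up - baseline) / (max_score - baseline + offset)


# A participant whose GARS rises from 30 to 36 (maximum 72).
gars_change = delta(30, 36)
gars_relative = relative_delta(30, 36, max_score=72)
```

With these inputs the denominator is 72 − 30 + 0.01 = 42.01, so the relative delta is 6 / 42.01.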
To explore responsiveness, the occurrence of an acute major medical event during follow-up was used as an implicit external criterion of larger change. The delta and relative delta eFI-U scores of the groups with and without event were described and compared with a Mann–Whitney U test. Standardised effect sizes (Cohen’s d) were calculated for both the (relative) delta eFI-U and the (relative) delta GARS. The standardised effect sizes of the eFI-U and the GARS were expected to be similar and both were expected to be small to moderate.
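For readers unfamiliar with the standardised effect size, a minimal sketch of Cohen's d follows. It uses the pooled standard deviation of the two groups, which is one common variant; the text does not state which variant the authors used, so this choice is an assumption.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Difference in means divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Toy data: means 4 and 3, both sample variances 4, so d = 1 / 2.
toy_d = cohens_d([2, 4, 6], [1, 3, 5])
```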
Criterion validity of the eFI-U was assessed with Spearman’s correlation between the baseline eFI-U and the baseline GARS. The association between changes (delta and relative delta) in the eFI-U and the GARS was also tested with Spearman’s correlation. If the eFI-U measured daily functioning, the correlation coefficient was expected to be ≥0.70 in both cases.32–34
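Spearman's correlation is the Pearson correlation of the ranked values, with tied observations given their average rank. The standard-library sketch below illustrates that definition; it is not the software used in the study, and the function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

A perfectly monotone increasing relationship gives a coefficient of 1 and a perfectly decreasing one gives −1, regardless of the scale of the two measures.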
To gain a better understanding of the relationship between the eFI-U and the GARS over time, participants were grouped in quartiles according to their delta GARS scores. All delta GARS quartiles were compared on delta eFI-U scores (Jonckheere–Terpstra test) and on the number of acute major medical events during follow-up (χ2 test for trend). In addition, the baseline GARS scores were compared between the delta GARS quartiles to check whether correction for baseline scores was needed. Because of significant differences between the quartiles in GARS score at baseline, the same analyses were repeated with quartiles based on the relative delta. The same analyses were also carried out with quartiles based on the (relative) delta eFI-U scores (Supplementary Tables S2 and S3).
RESULTS
A flowchart of the participants is presented in Figure 1. Table 1 displays the characteristics of the 1390 older persons included in the analyses. The delta eFI-U was approximately normally distributed and ranged from −0.14 to +0.20 (Supplementary Table S4).
Figure 1. Participant inclusion flowchart. ATC = anatomical therapeutic chemical. EHR = electronic health record. GARS = Groningen Activities Restriction Scale. ICPC = International Classification of Primary Care. ISCOPE = Integrated Systemic Care for Older People. T0 = baseline. T12 = after 12 months.
Table 1. Sociodemographic and functional characteristics of the total study population at T0 (baseline)
Construct validity
The baseline eFI-U scores were higher in the older participants, but the association with age was smaller than expected (Spearman’s ρ = 0.071; P = 0.008). As expected, females on average had a higher eFI-U score at baseline compared with males (Mann–Whitney U test, P<0.001; median females = 0.16, interquartile range [IQR] = 0.10 to 0.22 versus males = 0.14, IQR = 0.08 to 0.20). Furthermore, participants who lived in a residential care facility had the highest eFI-U score at baseline and those living independently with a partner the lowest (Kruskal–Wallis test, P<0.001; median institutionalised = 0.18, IQR = 0.12 to 0.26; median independently alone = 0.16, IQR = 0.10 to 0.22; median independently with partner = 0.14, IQR = 0.10 to 0.20) (data not shown).
Floor or ceiling effects
The histogram of the baseline eFI-U showed a slightly right-skewed distribution, approaching a gamma distribution (Figure 2). The baseline eFI-U score in the total group ranged from 0.00 to 0.46. The highest 15% of scores were ≥0.25 and the lowest 15% of scores were ≤0.08, suggesting that there was no floor or ceiling effect. No common maximum of the eFI-U at every age was observed, which again suggested that there was no ceiling effect.16
Figure 2. Distribution of the eFI-U scores at T0 and T12 of the total population (N = 1390).
eFI-U = electronic Frailty Index — Utrecht. T0 = baseline. T12 = after 12 months.
Responsiveness of the Electronic Frailty Index — Utrecht (acute major medical events)
During follow-up, 193 participants (13.9%) experienced an acute major medical event (that is, hip fracture, myocardial infarction, and/or stroke) (Table 2). Of those 193 participants, 185 had one type of event and eight had two different types of events during follow-up. In total, 22 (1.6%) participants had a hip fracture, 64 (4.6%) a myocardial infarction, and 115 (8.3%) a stroke (data not shown). Characteristics of the participants with and without an acute major medical event during follow-up are described in Table 2.
Table 2. Characteristics of subgroups based on the presence of an acute major medical event during follow-up
There was a significant difference in (relative) delta eFI-U between participants with and without an acute major medical event during follow-up (mean absolute delta = 0.039, standard deviation [SD] 0.052 versus 0.020, SD 0.043; P<0.001; relative delta = 0.047, SD 0.064 versus 0.023, SD 0.051; P<0.001) (Table 2). The standardised effect sizes were 0.42 (delta) and 0.45 (relative delta), which can both be considered small but present (data not shown). The difference in delta and relative delta GARS between participants with and without an acute major medical event during follow-up was also significant. The standardised effect size was 0.21 for the delta GARS and 0.23 for the relative delta GARS, which can both be considered small but present, just like the standardised effect sizes of the (relative) delta eFI-U (data not shown).
Criterion validity
At baseline Spearman’s ρ between the eFI-U and the GARS was 0.374 (P<0.001). Figure 3 is a graphic representation of the relationship between the delta eFI-U and the delta GARS. The correlation coefficient between the delta eFI-U and the delta GARS was 0.088 and the correlation coefficient of the relative deltas was 0.097 (both P≤0.001). No regression analysis was done because of the low correlation between the delta GARS and the delta eFI-U.
Figure 3. Delta eFI-U scores against delta GARS scores for those with and without an acute major medical event during follow-up. With event (red) (n = 193); without event (blue) (n = 1197). eFI-U = electronic Frailty Index — Utrecht. GARS = Groningen Activities Restriction Scale.
Comparison GARS quartiles
In more detail, the median delta eFI-U across the quartiles of the delta GARS was 0.02 (IQR = 0.00 to 0.04) for the first quartile, 0.02 (IQR = 0.00 to 0.04) for the second quartile, 0.02 (IQR = 0.00 to 0.06) for the third quartile, and 0.02 (IQR = 0.00 to 0.04) for the fourth quartile (P = 0.003) (Table 3). By contrast, there was a large and significant difference in median delta GARS over the delta GARS quartiles, as expected (P<0.001). Furthermore, the incidence of acute major medical events during follow-up increased over the quartiles (13.0% in the lowest quartile compared with 20.5% in the highest quartile; P = 0.005). The baseline GARS was highest for the participants in the lowest delta GARS quartile (P = 0.029). These differences in GARS at baseline suggest that the small change in the GARS during follow-up in the lowest quartile might be partly due to a high baseline GARS (that is, these participants had little room to score much higher). Therefore, the same analyses were repeated with quartiles based on the relative delta GARS. Apart from the baseline GARS score, the findings did not change much (Supplementary Table S5).
Table 3. Comparison between lowest and highest delta Groningen Activities Restriction Scale (GARS) quartiles
DISCUSSION
Summary
This study explored whether an electronic FI based on routine primary care data can be used as an evaluative measure for daily functioning in research with older persons. As the electronic FI tested in this study (eFI-U) changed over time and did not have floor or ceiling effects, it might be useful as an evaluative measure; however, there was only a moderate overlap between the eFI-U and the GARS. Furthermore, the eFI-U was responsive after an acute major medical event, just like the GARS, but, unlike the GARS, it was barely responsive over time in the population as a whole. These findings suggest that the eFI-U does not reflect daily functioning in older persons.
Strengths and limitations
The main strength of this study is the high generalisability of the results due to the data and the instrument used. Previous studies already showed that the FI, because of the underlying concept of deficit accumulation, is a flexible instrument that can be based on different deficits and data sources, and still give the same results.15,16 The data used in this study (EHRs from Dutch general practices) are similar to many other routine care data in that they contain both codes and free text, which increases the generalisability of the results of this study. Another strength is the availability of a combination of routine care data and standardised questionnaires from the same community-dwelling older population and time period. Combining these data sources allows for a direct comparison of the EHR-derived instrument with a gold standard for daily functioning (that is, GARS). Furthermore, because of the availability of extensive prospective data, the authors were able to assess responsiveness by looking both over time (gradual decline in ADL/iADL) and after an event (sudden change in ADL/iADL).
This study also has some limitations. First, part of the lack of correlation in the study might be explained by the EHR data on which the electronic FI was based. Quality and completeness of coded routine care data fully rely on the ability and willingness of the primary care team to code and prioritise their findings in routine healthcare systems. Second, quite a few patients had to be excluded because they were not selected for follow-up or were lost to follow-up in the ISCOPE trial. This drop-out is likely to be associated with poor daily functioning and/or a higher level of deficits. The attrition and complete case analysis in this study, therefore, might have skewed the responses and weakened the effects found. Some patients were also excluded because of missing or unavailable EHRs; however, most of these missing EHRs are expected to be completely random as they were missing at practice level because of software problems. Thus, the influence on the results is expected to be limited.
Another limitation concerns the combination of the electronic FI tested in this study with the type of data from which it is derived. The eFI-U is a cumulative score based on EHRs of general practices. As a result, those patients who have been registered with their GP for a long time and those who visit more often are more likely to accumulate recorded deficits and thus have a higher eFI-U score compared with other patients. For any instrument based on EHR data, the influence of consultation frequency and registration period, among other factors, should be taken into account.
Comparison with existing literature
In previous literature some researchers suggested that an FI could serve as an evaluative measure for daily functioning because of its multicomponent nature.13,15,18 However, other researchers stated that this was not possible because frailty and disability are different constructs, and because no information on severity and impact is included in an FI.19,20,35,36
The results of this study using the eFI-U are in line with studies that showed a limited association between frailty and daily functioning.35,36 The authors of these studies propose that frailty and disability are overlapping but distinct concepts. Thus, an instrument that is designed to measure frailty will not be able to measure disability and vice versa. The findings of this study showed that an FI based on EHR data also does not reflect measurements of (daily) functioning. Furthermore, these findings are in line with studies on the relationship between the number of diseases or deficits and functional decline.19,20 As was already concluded by those studies, functioning or daily functioning is not only a matter of the number of deficits (which is the approach of an FI), but also of the severity and impact of each deficit. The current study shows that this is also the case when routine primary care data are used to count deficits. An electronic FI could be enriched with information on severity, and more importantly impact, through the use of new techniques such as plain-text mining and other advanced reading techniques, which are a proven approach to increase the quality of algorithms like an electronic FI. However, it is doubtful whether EHRs contain enough information on severity and impact.
Implications for research and practice
An evaluative measure for daily functioning that can be obtained from routine care data could be useful both for research (to replace time-consuming questionnaires) and clinical purposes (to monitor patients). In research, such a measure may save costs and time for both the researcher and the clinician. Furthermore, it may allow for more efficient and faster research, which might in the end improve patient outcomes and day-to-day general practice management. This study showed that the FI (with a deficit-counting approach), in its current state and context, has a limited ability to reflect daily functioning. As the electronic FI does not measure the intended construct, it cannot be used as an evaluative measure of daily functioning for research. The lack of precision and congruence of the eFI-U with the GARS means that it is even further away from use in clinical practice to monitor individual patients’ daily functioning. Further research could focus on other approaches (that is, other proxies or adjusted versions of the electronic FI) to measure daily functioning with routine care data. It must be noted that previous research has shown that the eFI-U can be used in population health management as a frailty identification instrument on a population level.23,25
Acknowledgments
The authors thank the members of the Advisory Board of Older Persons ‘Care and Wellbeing’ South-Holland North for their involvement in the design and conduct of the study.
Notes
Funding
The Integrated Systemic Care for Older People (ISCOPE) trial (the Netherlands trial register: NTR1946) was funded by the Netherlands Organisation for Health Research and Development (Zon-MW) (project number: 311060201). The work of Willeke M Ravensbergen was funded by the Leiden University research profile area: Health, Prevention and the Human Life Cycle. Neither funder had any role in the design and conduct of the study; collection, management, analysis, or interpretation of the data; or preparation, review, or approval of the manuscript. The researchers were independent from the funders.
Ethical approval
The ISCOPE trial was approved by the Medical Ethical Committee of the Leiden University Medical Center (reference number: P09.096). All participants gave written informed consent.
Provenance
Freely submitted; externally peer reviewed.
Competing interests
The authors have declared no competing interests.
- Received February 25, 2020.
- Revision requested April 13, 2020.
- Accepted May 19, 2020.
- © British Journal of General Practice 2020