Trends and clinical characteristics of COVID-19 vaccine recipients: a federated analysis of 57.9 million patients' primary care records in situ using OpenSAFELY

Background On 8 December 2020 NHS England administered the first COVID-19 vaccination. Aim To describe trends and variation in vaccine coverage in different clinical and demographic groups in the first 100 days of the vaccine rollout. Design and setting With the approval of NHS England, a cohort study was conducted of 57.9 million patient records in general practice in England, in situ and within the infrastructure of the electronic health record software vendors EMIS and TPP using OpenSAFELY. Method Vaccine coverage across various subgroups of Joint Committee on Vaccination and Immunisation (JCVI) priority cohorts is described. Results A total of 20 852 692 patients (36.0%) received a vaccine between 8 December 2020 and 17 March 2021. Of patients aged ≥80 years not in a care home (JCVI group 2) 94.7% received a vaccine, but with substantial variation by ethnicity (White 96.2%, Black 68.3%) and deprivation (least deprived 96.6%, most deprived 90.7%). Patients with pre-existing medical conditions were more likely to be vaccinated with two exceptions: severe mental illness (89.5%) and learning disability (91.4%). There were 275 205 vaccine recipients who were identified as care home residents (JCVI group 1; 91.2% coverage). By 17 March, 1 257 914 (6.0%) recipients had a second dose. Conclusion The NHS rapidly delivered mass vaccination. In this study a data-monitoring framework was deployed using publicly auditable methods and a secure in situ processing model, using linked but pseudonymised patient-level NHS data for 57.9 million patients. Targeted activity may be needed to address lower vaccination coverage observed among certain key groups.


INTRODUCTION
On 8 December 2020, the NHS in England administered the first COVID-19 vaccination as part of an ambitious vaccine programme to combat the ongoing pandemic owing to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Vaccination is one of the most cost-effective ways of avoiding disease, and worldwide vaccinations prevent 2-3 million deaths per year, but new vaccines can take many years or decades to develop and become part of routine practice. 1 Since the outset of the COVID-19 pandemic, teams of scientists, clinical trialists, and regulators around the world have worked at unprecedented speed, with >200 vaccines currently being tested. 2 The UK medicines regulator approved two COVID-19 vaccines for use before the end of 2020: the Pfizer-BioNTech mRNA vaccine and the AstraZeneca-Oxford vaccine. 3 The Moderna vaccine was subsequently approved in January 2021, 4 with more vaccine approvals during 2021.
For the UK, the independent Joint Committee on Vaccination and Immunisation (JCVI) provides recommendations on vaccinations to the government and England's NHS. In early December 2020 the JCVI recommended nine priority groups for vaccination (Box 1), largely based on risk of death from COVID-19. 5 This was the basis for the NHS England vaccination programme, with due recognition that vaccination of care home residents may initially lag behind other groups because of complexities around storing and distributing Pfizer-BioNTech outside of larger healthcare settings without appropriate cold chain facilities. 6 Vaccinations were administered initially in hospitals and a number of primary care centres; then in GP surgeries, community pharmacies, and newly established mass vaccination centres.
Both vaccines administered to date require two doses. Given high infection rates and the relatively high protection thought to be offered by the first dose, after the start of the campaign the JCVI recommended extending the interval to 12 weeks. This strategy was intended to prevent the most deaths and admissions to hospital through maximising the number of patients with some protection against the virus as quickly as possible, although GPs were initially allowed some discretion in the exact timing of the second dose. 7,8
OpenSAFELY is a new secure analytics platform for electronic patient records built by the authors' group on behalf of NHS England to deliver urgent academic and operational research during the pandemic. 9,10 Analyses run across all patients' full coded pseudonymised primary care records and include 57.9 million patients (95% of people registered with an English general practice), namely those registered at practices where EMIS or TPP electronic health record (EHR) software is deployed, with patient-level linkage to various sources, such as secondary care data. Code and analysis are shared openly for inspection and re-use. Vaccine administration details are recorded in the National Immunisation Management Service (NIMS) and electronically transmitted to every individual's GP record on a daily basis.
OpenSAFELY can provide detailed information about the demographics and clinical conditions of those vaccinated from each patient's full pseudonymised EHR, which is not available within NIMS. This can reveal whether the vaccine rollout is leaving certain groups behind and whether any targeted action is required to address gaps in coverage.
This study therefore set out to: assess the coverage of COVID-19 vaccination in all patients registered with TPP and EMIS practices in England in near real time in the first 100 days of the campaign; and to describe how coverage varied between key clinical and demographic subgroups.

Study design
A retrospective cohort study was conducted using general practice primary care EHR data from all England GP practices supplied by the EHR vendors EMIS and TPP. The cohort study began on 8 December 2020, the start of the national vaccination campaign, and ended on 17 March 2021. The authors of the current study are producing weekly vaccine coverage reports with a subset of this data 11 and will update this analysis regularly with extended follow-up time using near real-time data as the vaccination campaign progresses.

How this fits in
The COVID-19 vaccination programme was launched in England on 8 December 2020. Limited information is available on vaccination coverage in detailed demographic and clinical subgroups. This study deployed a data-monitoring framework for vaccine coverage using publicly auditable methods and secure in situ processing, for linked but pseudonymised patient-level NHS data on 95% of patients registered with a GP in England. This study highlights lower vaccination coverage in the first 100 days of the vaccine rollout among certain key groups: ethnic minorities, those living in areas of higher deprivation, and individuals living with severe mental illness or learning disabilities where targeted activity may be needed to ensure equitable protection against COVID-19.

OpenSAFELY provides a secure software interface allowing a federated analysis of pseudonymised primary care patient records from England in near real-time within the EMIS and TPP highly secure data environments. Nondisclosive, aggregated results are exported to GitHub where further data processing and analysis take place. This avoids the need for large volumes of potentially disclosive pseudonymised patient data to be transferred off-site. This, in addition to other technical and organisational controls, minimises any risk of re-identification.

Box 1. Priority groups for vaccination advised by the Joint
The dataset available to the platform includes pseudonymised data such as coded diagnoses, medications, and physiological parameters. No free-text data are included. All activity on the platform is publicly logged and all analytic code and supporting clinical coding lists are automatically published. In addition, the framework provides assurance that the analysis is reproducible and reusable. Further details on information governance and platform can be found in Supplementary Appendix S1.

Study population
For the descriptive analysis, all patients registered with a general practice using EMIS (n = 33 873 987) or TPP (n = 24 056 480) in England on 17 March 2021 were included. Patients with unknown date of birth (that is, a default age >121 years) or unknown sex were excluded.

Priority groups for vaccination
Patients were classified into their JCVI priority group (Box 1) using SNOMED-CT codelists and logic defined in the national COVID-19 vaccination uptake reporting specification developed by PRIMIS. 12 Patients were assigned only to their highest priority group and not included again as part of any other priority group. For example, a 76-year-old living in a care home would be assigned to group 1 (care home residents) but not to group 3 (aged 75-79 years). Eligibility defined by occupation (that is, health and care staff in priority groups 1 and 2) was not assessed because this information is largely missing from GP records and, where present, is unreliable. These patients were therefore either classified into a lower-priority group where applicable (for example, by clinical conditions or age), or as 'other' if they did not fall into one of the nine defined groups. In line with the national reporting specification, most criteria were ascertained using the latest available data at the time of analysis, with the exception of age, which was calculated as at 31 March 2021 as recommended by Public Health England. 12
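The "highest priority group only" rule can be illustrated with a short Python sketch. This is not the PRIMIS specification or the authors' code: the `Patient` fields, the boolean flags, and the simplified age thresholds are assumptions made for illustration, and the real logic is driven by SNOMED-CT codelists. Only the ordering rule and the age reference date of 31 March 2021 are taken from the text.

```python
from dataclasses import dataclass
from datetime import date

# Age is calculated as at 31 March 2021 (stated in the text).
AGE_REFERENCE_DATE = date(2021, 3, 31)


@dataclass
class Patient:
    # All field names are hypothetical; the real specification uses codelists.
    date_of_birth: date
    care_home_resident: bool = False
    clinically_extremely_vulnerable: bool = False
    at_risk_condition: bool = False


def age_on(date_of_birth: date, reference: date) -> int:
    """Age in whole years on the reference date."""
    had_birthday = (reference.month, reference.day) >= (
        date_of_birth.month,
        date_of_birth.day,
    )
    return reference.year - date_of_birth.year - (0 if had_birthday else 1)


def assign_priority_group(patient: Patient) -> str:
    """Return the first (highest-priority) group whose rule matches."""
    age = age_on(patient.date_of_birth, AGE_REFERENCE_DATE)
    # Rules are checked in priority order, so each patient is counted once.
    # Thresholds are a simplification (assumption), not the official logic.
    rules = [
        ("1", patient.care_home_resident),
        ("2", age >= 80),
        ("3", age >= 75),
        ("4", age >= 70 or (patient.clinically_extremely_vulnerable and age >= 16)),
        ("5", age >= 65),
        ("6", patient.at_risk_condition and age >= 16),
        ("7", age >= 60),
        ("8", age >= 55),
        ("9", age >= 50),
    ]
    for group, matches in rules:
        if matches:
            return group
    return "other"


# The worked example from the text: a 76-year-old care home resident.
resident = Patient(date(1944, 6, 1), care_home_resident=True)
print(assign_priority_group(resident))
```

Because the rules are evaluated in order and the function returns at the first match, a patient who satisfies several criteria is never counted twice.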

COVID-19 vaccine status
Vaccination information is transmitted back to patients' primary care records in the days following vaccine administration in a designated centre. Patients with any recorded COVID-19 vaccine administration code in their primary care record were identified (only the Pfizer-BioNTech mRNA vaccine and the AstraZeneca-Oxford vaccine were available at the time of analysis). Vaccinations recorded up to 17 March 2021 were included, using the latest date available in the most recent comparable OpenSAFELY-EMIS and OpenSAFELY-TPP database builds. Any COVID-19 vaccination recorded within 19 days of the first dose was classed as a duplicate record entry, and the first vaccination after this period was classed as the second dose of the schedule.
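The duplicate-record rule can be expressed as a minimal Python sketch. The function name and the reading of "within 19 days" as inclusive (a record exactly 19 days after the first dose counts as a duplicate) are assumptions; the study's own code may treat the boundary differently.

```python
from datetime import date
from typing import Iterable, Optional, Tuple

# Records this close to the first dose are treated as duplicates (from the text).
DUPLICATE_WINDOW_DAYS = 19


def classify_doses(
    vaccination_dates: Iterable[date],
) -> Tuple[Optional[date], Optional[date]]:
    """Return (first_dose_date, second_dose_date) from recorded dates.

    Assumption: the window is inclusive, so only a record more than
    19 days after the first dose can be the second dose.
    """
    dates = sorted(set(vaccination_dates))
    if not dates:
        return None, None
    first = dates[0]
    second = next(
        (d for d in dates[1:] if (d - first).days > DUPLICATE_WINDOW_DAYS),
        None,
    )
    return first, second


# A record two days after the first dose is ignored as a duplicate entry.
records = [date(2020, 12, 8), date(2020, 12, 10), date(2021, 3, 1)]
print(classify_doses(records))
```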

Key demographic and clinical characteristics of vaccinated groups
All patient demographics defined by the national reporting specification (for example, ethnicity) were extracted. Demographics not defined by the specification, including the level of deprivation, were also extracted. Deprivation was measured by the Index of Multiple Deprivation (IMD, in quintiles, with higher values indicating greater deprivation), derived from the patient's postcode at Lower Super Output Area. Patients with missing data were grouped into an unknown category.
The population was also described according to the presence or absence of various pre-existing health problems: chronic cardiac disease; diabetes; chronic kidney disease; severe mental illness; learning disabilities; chronic neurological disease (including stroke); asplenia; morbid obesity; chronic liver disease; chronic respiratory disease; and immunosuppression. Patients lacking codes in their primary care record indicating these conditions were assumed to be free of these conditions.

Factors associated with time to COVID-19 vaccination
To examine relationships between multiple patient characteristics in older adults and the chance of receiving a first COVID-19 vaccine by 17 March 2021, a federated survival analysis was carried out with individual Cox regression models fitted to the TPP and EMIS datasets, before combining model coefficients using inverse-variance weighting. Here, patients were included who were registered in England and aged ≥70 years on 7 December 2020 (n = 7 152 830, Supplementary Figure S1) and censored at death or deregistration. Patients aged <70 years were excluded because of the inability to ascertain health/care worker status, a strong determinant of early vaccination. Patients with <1 year of prior follow-up were excluded to ensure completeness of clinical records (n = 237 940), and care home or nursing home residents (n = 154 123) were excluded as the vaccine rollout was organised separately for these individuals. All key demographic and clinical characteristics described in the section above were included in the multivariable model, with age grouped into 5-year age bands. Patients with missing sex or IMD (n = 59 386) were excluded. The model was stratified by GP practice. Hazard ratios from the fully adjusted model are reported with 95% confidence intervals (CIs).
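Inverse-variance weighting of coefficients from separately fitted models can be sketched as follows. This is a generic fixed-effect pooling of log hazard ratios, offered as an illustration rather than the authors' implementation; the function names and the example estimates are hypothetical.

```python
import math
from typing import List, Tuple


def pool_inverse_variance(estimates: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Pool (log hazard ratio, standard error) pairs from separate models.

    Each estimate is weighted by 1 / SE^2, so more precise estimates
    contribute more to the pooled coefficient.
    """
    weights = [1.0 / se**2 for _, se in estimates]
    total_weight = sum(weights)
    pooled = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


def hazard_ratio_with_ci(log_hr: float, se: float, z: float = 1.96) -> Tuple[float, float, float]:
    """Convert a log hazard ratio and SE to (HR, lower, upper) for a 95% CI."""
    return math.exp(log_hr), math.exp(log_hr - z * se), math.exp(log_hr + z * se)


# Hypothetical coefficients for one covariate from the two backends.
backend_estimates = [(-0.30, 0.010), (-0.28, 0.008)]
pooled_log_hr, pooled_se = pool_inverse_variance(backend_estimates)
print(hazard_ratio_with_ci(pooled_log_hr, pooled_se))
```

Only the coefficients and their standard errors need to leave each secure environment, which is what makes this approach suitable for a federated analysis.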

Codelists and implementation
Information on all characteristics was obtained from primary care records by searching TPP SystmOne and EMIS records for specific coded data. EMIS and TPP SystmOne are fully compliant with the mandated NHS standard of SNOMED-CT clinical terminology. Medicines are entered or prescribed in a format compliant with the NHS Dictionary of Medicines and Devices (dm+d). 13 Codelists and logic for most features in the national reporting specification were automatically converted to software (https://codelists.opensafely.org/codelist/primis-covid19-vacc-uptake).

Analysis
Charts of vaccine coverage were generated for all underlying conditions and medicines. Those not presented in this manuscript are available online for inspection in the associated GitHub repository. 14

Software and reproducibility
Data management and analysis were performed using the OpenSAFELY software libraries and Jupyter notebooks, implemented using Python 3 and R (version 4.0.2). More details are available in Supplementary Data and the openly accessible GitHub repository. This is the first analysis delivered using federated analysis through the OpenSAFELY platform: codelists and code for data management and data analysis were specified once using the OpenSAFELY tools; then transmitted securely from the OpenSAFELY jobs server to the OpenSAFELY-TPP platform within TPP's secure environment, and separately to the OpenSAFELY-EMIS platform within EMIS's secure environment, where they were each executed separately against local patient data; summary results were then reviewed for disclosiveness, released, and combined for the final outputs.
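The final step of this federated workflow, combining released summary results from the two backends, can be illustrated with a small Python sketch. The table layout, the function name, and all counts below are hypothetical; the sketch only shows that aggregated counts, not patient-level records, are enough to compute overall coverage.

```python
from collections import Counter
from typing import Dict, Optional, Tuple

# Each backend releases, per group, (number vaccinated, number eligible).
BackendTable = Dict[str, Tuple[int, int]]


def combine_backend_counts(
    *backend_tables: BackendTable,
) -> Dict[str, Tuple[int, int, Optional[float]]]:
    """Sum aggregated counts across backends and compute coverage (%)."""
    vaccinated: Counter = Counter()
    eligible: Counter = Counter()
    for table in backend_tables:
        for group, (n_vaccinated, n_eligible) in table.items():
            vaccinated[group] += n_vaccinated
            eligible[group] += n_eligible
    return {
        group: (
            vaccinated[group],
            eligible[group],
            round(100 * vaccinated[group] / eligible[group], 1)
            if eligible[group]
            else None,
        )
        for group in eligible
    }


# Hypothetical released summaries from two secure environments.
tpp_summary = {"80+": (900, 1000)}
emis_summary = {"80+": (1900, 2000), "75-79": (50, 100)}
print(combine_backend_counts(tpp_summary, emis_summary))
```

Summing numerators and denominators before dividing gives the correct pooled coverage, whereas averaging the two backend percentages would not when the backends differ in size.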
All code for the OpenSAFELY platform for data management, analysis, and secure code execution is shared for review and re-use under open licences at GitHub.com/OpenSAFELY. All code for data management and analysis for this article is shared for scientific review and re-use under open licences on GitHub (https://github.com/opensafely/covid19-vaccine-coverage-tppemis).

Patient and public involvement
Patients were not formally involved in developing this specific study design, which was developed rapidly in the context of the rapid vaccine rollout during a global health emergency. The authors have developed a publicly available website (https://opensafely.org) through which any patient or member of the public is invited to make contact regarding this study or the broader OpenSAFELY project.

Key demographic and clinical characteristics of priority group 2
The substantially larger priority group 2 was offered vaccination alongside group 1. Priority group 2 included those aged ≥80 years who were not known to be living in a care home, and rapidly reached 90% coverage over the first 8 weeks of the vaccination campaign, with a further increase of 4.7% over the remaining weeks of the study period (data available from the GitHub repository). 14 A breakdown of the proportion of patients vaccinated by 17 March in this group by various demographic and clinical categories is provided in Table 2 and Figures 2 and 3. Vaccination was less common among those living in the most deprived postcode areas (90.7% in the most deprived quintile compared to 96.6% in the least deprived, Figure 2a).
Vaccination coverage was initially lower among those with obesity (Figure 3a); however, this gap had largely resolved by mid-March 2021. Vaccination coverage was slightly higher among patients with other physical comorbidities such as chronic respiratory disease (Figure 3b), cardiac disease, or chronic kidney disease. Vaccination coverage was substantially lower among those living with severe mental illness (89.5%, Figure 3c) and learning disabilities (91.4%, Figure 3d), with some improvements over time.

Key demographic and clinical characteristics of other priority groups
Detailed tables and charts of breakdowns for all other priority groups are available on the GitHub repository. 14 The differences by demographic and clinical features observed in priority group 2 were broadly reflected in other priority groups. Key exceptions were that the lower-priority groups, not yet widely vaccinated, had a different pattern of discrepancy by ethnicity, with the South Asian population most vaccinated. This pattern was also seen in several other groups during early stages of the campaign, before widespread targeting of individual groups, such as the 70-74 and 75-79 age groups. Recent improvements in the vaccination coverage of the Bangladeshi community were also noted, where an acceleration with respect to other ethnic groups was seen from around 16 February, particularly noticeable in priority groups 4, 6, 7, 8, and 9.
Figure S1). Associations between patient-level factors and having a COVID-19 vaccine are shown in Figure 4 and Supplementary All of the clinical risk groups were associated with a higher chance of being vaccinated, with the exception of chronic neurological diseases (HR 0.93, 95% CI = 0.93 to 0.93) and severe mental illness (HR 0.75, 95% CI = 0.74 to 0.75) (Figure 4).

DISCUSSION

Summary
The NHS in England has rapidly responded to the availability of COVID-19 vaccines and administered a substantial number of doses in the first 100 days of the vaccination campaign. In this study, 20 852 692 patients (36.0% of patients registered in 97% of GP practices in England) received at least one vaccine dose by 17 March 2021, including 94.7% of eligible patients aged ≥80 years (priority group 2). However, ethnic minorities in priority group 2 were substantially less likely to be vaccinated, and those living in more socioeconomically deprived areas generally had lower vaccine coverage. Similarly, these patterns were broadly observed across the majority of priority groups. Furthermore, in those aged ≥80 years, patients with pre-existing medical conditions were equally likely, or more likely, to have received a vaccine, across all groups of pre-existing medical problems, with two exceptions: vaccination was lower among patients living with severe mental illness and learning disabilities. Similarly, in a risk-adjusted model, chance of vaccination was significantly lower among those living in deprived areas, those from ethnic minority groups, those with chronic neurological disease (including learning disability), and those with severe mental illness.
A total of 3 012 051 people who received a vaccination, most likely health and care workers, were not identified as part of any priority group. Usage was split between two vaccine brands, Pfizer-BioNTech (41.7%, n = 8 691 536) and AstraZeneca-Oxford (57.9%, n = 12 080 194). A second dose of the vaccine was received by 6.0% (n = 1 257 914).
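The percentages quoted above can be reproduced from the reported counts with a short arithmetic check in Python. The counts are taken from the text; the variable and function names are illustrative only.

```python
# Counts as reported in the text.
TOTAL_RECIPIENTS = 20_852_692
reported_counts = {
    "Pfizer-BioNTech": 8_691_536,
    "AstraZeneca-Oxford": 12_080_194,
    "second dose": 1_257_914,
}


def percent_of_recipients(n: int, total: int = TOTAL_RECIPIENTS) -> float:
    """Share of all vaccine recipients, rounded to one decimal place."""
    return round(100 * n / total, 1)


shares = {label: percent_of_recipients(n) for label, n in reported_counts.items()}
print(shares)

# Recipients not attributed to either brand in the figures above.
unattributed = (
    TOTAL_RECIPIENTS
    - reported_counts["Pfizer-BioNTech"]
    - reported_counts["AstraZeneca-Oxford"]
)
print(unattributed, percent_of_recipients(unattributed))
```

The two brand counts sum to slightly less than the total number of recipients, so a small residual (about 0.4%) is not attributed to either brand in these figures.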

Strengths and limitations
The key strengths of this study are the scale, detail, completeness, and timeliness of the underlying raw EHR data. This analysis was executed across the full dataset of all raw, pseudonymised, single-event-level clinical events for 57.9 million patients registered
Another key strength is that all eligible patients were identified in each JCVI priority group by directly implementing the full official SNOMED-CT codelists and logic for the national PRIMIS COVID-19 vaccination uptake reporting specification, thus ensuring that the cohorts are perfectly in line with national procedures and GP expectations.
Some limitations to this analysis are recognised. The population, although extremely large, may not be fully representative of the full eligible population: it does not include individuals not registered with a general practice; or the 4% of patients registered at practices not using TPP and EMIS. Primary care records, while detailed and longitudinal, can be incomplete on certain patient characteristics.
Occupation is generally not available in the EHR so it was not possible to assess the eligibility in priority groups 1 or 2 where this is based on occupation. This means that it was not possible to determine the appropriateness of vaccination for those in the 'other' group. For patients aged ≥80 years, 36.2% and 0.8% had missing ethnicity or IMD information, respectively. In the weekly COVID-19 vaccination coverage report a different ethnicity codelist and additional sources of data within OpenSAFELY-TPP are used to reduce missing data to <10% for ethnicity. 11 This method will be implemented within OpenSAFELY-EMIS when possible.
As a federated analysis has been carried out across two EHR vendors' systems, it is possible that a very small number of patient records are duplicated; however, this has now been established to represent <0.03% of the total patient count. The ascertainment of vaccination status relies on the vaccination administration electronic message being successfully received into the primary care record; while these numbers are consistent with national figures, methods are being explored to also cross-validate this against other sources of person-level vaccination data, broken down by vaccination site type.
Finally, there is currently no well-validated person-level data to identify individuals resident in a care home: this is a limitation for all UK healthcare database studies. The method used for identifying care home residents in this analysis (a clinical code as detailed by the national reporting specification) will lead to under-ascertainment. 15 The authors of the current study are launching a programme of work, in collaboration with the UK health data science community, to describe and validate the best methods for identifying current care home residents, to produce a better understanding of their health outcomes.

Comparison with existing literature
The UK had already administered 40.49 vaccines per 100 people by 17 March 2021, one of the fastest vaccination programmes in the world. 16 18 It is noted that this figure is higher than the latest Office for National Statistics (ONS) estimated population of England (56.2 million): 19 the difference between ONS population estimates and NHS-registered populations is a well-recognised issue and may be caused by over-counting at GP practices, differences in definition, and under-counting by the ONS, or a combination of all three. 20,21 To the authors' knowledge, this manuscript, an update of the 27 January 2021 preprint covering only OpenSAFELY-TPP patients (40% of the population), 22 is the first study to describe in detail the demographic and clinical features of those who have been   30 and may somewhat even out during the course of the rapid vaccine rollout.

Implications for research and practice
The reasons underpinning variation in COVID-19 vaccination coverage are not yet understood, and information presented here should not be misinterpreted as a criticism of the rapidly established NHS vaccination campaign. Further research is needed to understand and address the observed lower vaccination coverage among patients from more deprived areas, and the striking disparity between ethnic groups. The initial preprint on 27 January 2021 and this author group's regular updates 11,22 have received substantial media coverage, particularly with regard to the differences in vaccinations between different ethnic communities. 31-34 The NHS, government, and communities themselves have introduced targeted activities to address the gap including vaccination at places of worship, 35 webinars led by community leaders to tackle misinformation, 36 and targeted funding for groups with remit for tackling any health inequalities. 37,38 The authors note the accelerated increase in the Bangladeshi community from mid-February, which may represent targeted action by a community group and/or a local NHS organisation. The regular OpenSAFELY vaccination coverage reports can support assessment of the success of these activities in increasing vaccination coverage. It is reassuring to see that those with a previous history of various medical problems are being vaccinated at the same rate as other patients: in particular it is reassuring to see no evidence that the vaccine programme is currently missing those with serious physical health problems who are at highest risk of death from COVID-19. The lower vaccination coverage among patients living with severe mental illness and learning disabilities is concerning: this may reflect challenges around access, including for those currently living in institutional settings. However, in the latest weeks there was evidence of the vaccination gap narrowing in these groups (Figures 3c and 3d).
In late February the JCVI recommended expanded vaccine access for all people on the GP learning disability register as well as adults with other related conditions, including cerebral palsy. This update of the previous advice was based on OpenSAFELY analysis showing a higher risk of mortality in those with learning disabilities 39 and the JCVI anticipated that an additional 150 000 people would receive the vaccine sooner as a result of this advice. 40
The authors note that all findings in the present study are from the first 100 days of a major national vaccination programme. Very substantial changes in coverage among different groups are to be expected over the coming months. Weekly vaccine coverage reports are being shared by this author group to assist in monitoring and targeting vaccine initiatives along with machine-readable outputs for re-use in different formats (www.opensafely.org/covid-vaccine-coverage).
More broadly, the UK has an unusually large volume of very detailed longitudinal patient data, especially through primary care. The authors believe the UK has a responsibility to the global community to ensure that these data are used to inform response to the COVID-19 pandemic, in a timely manner, while maintaining the security of individual health records and ensuring the full transparency of all actions on the data to build public trust. To this end, codelists and code for data management and data analysis can only be executed on OpenSAFELY after first being made available at GitHub.com/OpenSAFELY; they are then shared publicly under open licences for review and re-use either at, or before, the time when results are reported.
In conclusion, the NHS in England has rapidly deployed a mass vaccination campaign. Targeted activity may be needed to address lower vaccination coverage observed among certain key groups: ethnic minorities, those living in areas of higher deprivation, and individuals living with severe mental illness or learning disabilities. Live data monitoring is likely to help support those on the frontline making complex operational decisions around vaccine rollout.