Identifying symptoms associated with diagnosis of pancreatic exocrine and neuroendocrine neoplasms: a nested case-control study of the UK primary care population

Abstract

Background
Pancreatic cancer has the worst survival rate among all cancers. Almost 70% of patients in the UK were diagnosed at Stage IV.

Aim
This study aimed to investigate the symptoms associated with the diagnoses of pancreatic ductal adenocarcinoma (PDAC) and pancreatic neuroendocrine neoplasms (PNEN), and comparatively characterise the symptomatology between the two tumour types to inform earlier diagnosis.

Design and setting
A nested case-control study in primary care was conducted using data from the QResearch® database. Patients aged ≥25 years and diagnosed with PDAC or PNEN during 2000 to 2019 were included as cases. Up to 10 controls from the same general practice were matched with each case by age, sex, and calendar year using incidence density sampling.

Method
Conditional logistic regression was used to investigate the association between the 42 shortlisted symptoms and the diagnoses of PDAC and (or) PNEN in different timeframes relative to the index date, adjusting for patients’ sociodemographic characteristics, lifestyle, and relevant comorbidities.

Results
A total of 23 640 patients were identified as diagnosed with PDAC and 596 with PNEN. Of the symptoms identified, 23 were significantly associated with PDAC, and nine symptoms with PNEN. The two alarm symptoms for both tumours were jaundice and gastrointestinal bleeding. The two newly identified symptoms for PDAC were thirst and dark urine. The risk of unintentional weight loss may be longer than 2 years before the diagnosis of PNEN.

Conclusion
PDAC and PNEN have overlapping symptom profiles. The QCancer® (pancreas) risk prediction model could be updated by including the newly identified symptoms and comorbidities, which could help GPs identify high-risk patients for timely investigation in primary care.


INTRODUCTION
Pancreatic cancer is the 10th most common cancer by incidence in the UK but the fifth most common cause of cancer death. It is very aggressive and has the worst survival rate among all types of cancer. 1 Almost 70% of patients were diagnosed at Stage IV. Tumours arising from the pancreas can be classified as exocrine (approximately 95%) or neuroendocrine (≤5%) neoplasms, 2 known as pancreatic ductal adenocarcinoma (PDAC) and pancreatic neuroendocrine neoplasms (PNEN), 3 respectively. Though tumour pathobiology and treatment strategies differ between PDAC and PNEN, the two tumour types share a proclivity to metastasise. [4][5][6] More favourable outcomes were observed when the diagnosis was made at earlier stages. 7 In the absence of a screening programme for pancreatic cancer in the UK, symptomatic presentation in general practice remains a key avenue for earlier diagnosis. However, besides jaundice, the symptoms reported in the current literature as associated with pancreatic cancer are vague and non-specific. 8,9 GPs face the challenge of differentiating a potential malignancy from benign diseases when patients present with non-specific symptoms, so a tumour could easily be 'missed' or its diagnosis delayed. In addition, because PNEN is a rarer type of cancer, no large-scale study has yet characterised its symptomatology, nor have studies comprehensively compared the symptomatology of PDAC with that of PNEN. Therefore, a better understanding of the symptomatology and the timings at which patients present with symptoms could help GPs better manage patients and make clinical decisions. Furthermore, such symptoms could be used in public campaigns to increase awareness of pancreatic cancer.
To address this research gap, the authors conducted this study with three aims: to explore the symptoms that patients presented in primary care in different time windows, which may indicate the diagnosis of PDAC/PNEN; to comparatively characterise the symptomatology of PDAC and PNEN; and to inform the update of the QCancer® (pancreas) prediction model. 10,11 The QCancer (pancreas) score quantifies the risk of an incident diagnosis of pancreatic cancer in the next 2 years, based on an individual patient's characteristics. Such a risk score may help GPs make different decisions: a 2-week-wait referral for patients at high risk; watchful wait and safety netting for patients with low risk.

METHOD
Study design and setting
This was a nested case-control study using the QResearch database (version 44), an extensive, validated, anonymised primary care database comprising records of >35 million patients registered in approximately 1500 GP surgeries, spread throughout the UK, which have been using the EMIS System since 1989. The population in the QResearch database is representative of the UK population. Patients' records have been linked with cancer registries, Office for National Statistics (ONS) mortality records, and Hospital Episode Statistics (HES). The cancer registration data include information on the date of diagnosis, type and location of the tumour, morphology, grade and stage, and treatment.

Study population
The eligible study population was an open cohort of patients aged ≥25 years and registered in the QResearch database between 1 January 2000 and 31 December 2019. Patients with an existing diagnosis of any type of pancreatic cancer before the entry date were excluded. The entry date to the cohort was the latest of the following: the patient's 25th birthday, the date the patient registered with the practice plus 1 year, the date on which the practice computer system was installed plus 1 year, or the beginning of the study period. The right censor date was the earliest of the following: the date of pancreatic cancer diagnosis, the date of death, the date of leaving the practice, or the study end date. Person years were calculated between the study entry date and the right censor date.
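As a rough illustration of the entry and right-censor rules described above, the following Python sketch derives both dates and the person years for a single patient. All function and parameter names are hypothetical, and dividing by 365.25 days per year is an assumption; the paper does not state how person years were computed.

```python
from datetime import date
from typing import Optional

STUDY_START = date(2000, 1, 1)
STUDY_END = date(2019, 12, 31)


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years (29 February falls back to 28 February)."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def entry_date(birth: date, registered: date, system_installed: date) -> date:
    """Latest of: 25th birthday, registration + 1 year,
    practice system installation + 1 year, study start."""
    return max(
        add_years(birth, 25),
        add_years(registered, 1),
        add_years(system_installed, 1),
        STUDY_START,
    )


def censor_date(diagnosis: Optional[date], death: Optional[date],
                left_practice: Optional[date]) -> date:
    """Earliest of: diagnosis, death, leaving the practice, study end."""
    observed = [d for d in (diagnosis, death, left_practice) if d is not None]
    return min(observed + [STUDY_END])


def person_years(entry: date, censor: date) -> float:
    """Follow-up in years; 365.25 days per year is an assumption."""
    return max((censor - entry).days, 0) / 365.25


# Hypothetical patient, for illustration only.
entry = entry_date(date(1970, 5, 1), date(1998, 3, 1), date(1995, 1, 1))
censor = censor_date(date(2010, 6, 30), None, None)
follow_up = person_years(entry, censor)
```

In this hypothetical example the study start date is the latest of the four candidate entry dates, so follow-up runs from 1 January 2000 to the diagnosis date.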

Identification of cases and controls
Cases were patients in the study cohort with an incident diagnosis of PDAC/PNEN, recorded in ≥1 of the four linked sources: GP records, HES, cancer registry, or ONS. The index date for cases was the earliest date the diagnosis was recorded in any of the four data sources. Cases were matched with up to 10 controls from the same practice by age, sex, and calendar year using incidence density sampling. 12 Each control was allocated an index date, which was the date of diagnosis of their matched case.
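The matching step can be sketched in Python as follows. This is a minimal illustration of incidence density sampling, not the authors' implementation: the `Patient` fields, the exact birth-year match used for age, and sampling without replacement within a matched set are all assumptions. The defining feature is that controls are drawn from patients still at risk on the case's index date, so a patient diagnosed later remains eligible as a control for an earlier case.

```python
import random
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass
class Patient:
    pid: int
    practice: int
    sex: str
    birth_year: int
    entry: date                      # cohort entry date
    censor: date                     # right-censor date
    diagnosis: Optional[date] = None


def sample_controls(case: Patient, cohort: List[Patient], n_controls: int = 10,
                    rng: Optional[random.Random] = None
                    ) -> List[Tuple[Patient, date]]:
    """Draw up to n_controls from the case's risk set on the index date."""
    rng = rng or random.Random(0)
    index = case.diagnosis
    risk_set = [
        p for p in cohort
        if p.pid != case.pid
        and p.practice == case.practice
        and p.sex == case.sex
        and p.birth_year == case.birth_year          # assumed age-matching rule
        and p.entry <= index <= p.censor             # under follow-up on the index date
        and (p.diagnosis is None or p.diagnosis > index)  # not yet diagnosed
    ]
    chosen = rng.sample(risk_set, min(n_controls, len(risk_set)))
    # Each control inherits the index date of the matched case.
    return [(control, index) for control in chosen]


# Hypothetical cohort, for illustration only.
case = Patient(1, 7, "F", 1950, date(2001, 1, 1), date(2010, 6, 1), date(2010, 6, 1))
cohort = [
    case,
    Patient(2, 7, "F", 1950, date(2001, 1, 1), date(2015, 1, 1), date(2015, 1, 1)),
    Patient(3, 8, "F", 1950, date(2001, 1, 1), date(2019, 12, 31)),  # other practice
    Patient(4, 7, "F", 1950, date(2001, 1, 1), date(2005, 1, 1)),    # left before index
    Patient(5, 7, "F", 1950, date(2001, 1, 1), date(2019, 12, 31)),
]
matched = sample_controls(case, cohort)
```

Patient 2 in this hypothetical cohort is diagnosed in 2015 but is still eligible as a control for the case diagnosed in 2010.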

Candidate symptoms and potential risk factors
A broad list of 42 symptoms potentially associated with PDAC and PNEN is summarised in Box 1. These symptoms were identified through literature review, 2,10,13,14 information from leading charities such as Cancer Research UK 15 and Pancreatic Cancer UK, 16 National Institute for Health and Care Excellence (NICE) guidelines (NG12 17 and NG85 18), and patient representatives. All occurrences of symptoms in primary care records were extracted, but the analysis was focused on the most recent 5 years before the index date. In addition, the following variables were of research interest and adjusted for in the models: patients' sociodemographic characteristics (ethnicity, socioeconomic deprivation using Townsend quintile), lifestyle factors (smoking and drinking statuses, and body mass index [BMI], using most recently available values before the [...]

How this fits in
This is the largest population-based study of its kind, systematically examining the symptomatology of pancreatic ductal adenocarcinoma (PDAC) and pancreatic neuroendocrine neoplasms (PNEN), and quantifying the association of 42 potential symptoms in different time windows relative to the date of diagnosis. This study confirmed several symptoms as risk factors for PDAC reported in previous UK studies, which had much smaller sample sizes. It also provides a deeper understanding of the symptoms associated with PNEN. Considering that most symptoms associated with pancreatic cancer are non-specific and do not qualify for urgent referral for investigation (2-week-wait) in the current NICE guideline, GPs should be vigilant for patients presenting with several concurrent non-specific symptoms and put appropriate safety-netting strategies in place. GPs should also be more aware of the risk of pancreatic cancer among people with comorbidities, and be careful not to attribute potential symptoms of pancreatic cancer to patients' existing health conditions.

Statistical analysis
Descriptive statistics were used to summarise the sociodemographic and clinical characteristics of patients diagnosed with PDAC and PNEN, and the matched control group. The key clinical characteristics of PDAC and PNEN cases were compared.
Exploratory analyses were conducted to investigate the association between the most recently recorded symptom relative to the index date in seven different periods and patient groups (case/control) using univariable conditional logistic regression. These seven timeframes were <1 month, 1-3 months, 4-6 months, 7-12 months, 1-2 years, 2-3 years, and 3-5 years before the index date. The purpose of setting different timeframes for the same symptoms was to compare how the odds ratio (OR) would change, and the implication of timeframes for earlier diagnosis of PDAC and PNEN based on symptomatic presentation. Based on the exploratory results and the clinical relevance of timeframes for earlier diagnosis of pancreatic cancer, the seven timeframes were narrowed down to the following four in two sets of analysis:
• within 3 months for alarm symptoms, or 1 year for other symptoms (denoted as 3M/1Y); and
• within 6 months for alarm symptoms, or 2 years for other symptoms (denoted as 6M/2Y).
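To make the idea of univariable conditional logistic regression concrete, the sketch below fits the simplest possible case in pure Python: one binary exposure (symptom recorded or not) with matched sets as strata, solved by Newton's method on the conditional likelihood. It is a simplified illustration only. The study itself used categorical symptom variables and was run in Stata, and the function name, input layout, and the normal-approximation confidence interval are assumptions.

```python
import math


def conditional_logit_binary(strata, tol=1e-10, max_iter=100):
    """Fit a conditional logistic model with one binary exposure.

    strata: iterable of (case_exposed, n_exposed, n_total) per matched set,
    where both counts include the case. Returns (odds_ratio, (lower, upper)).
    """
    # Matched sets in which everyone (or no one) is exposed carry no information.
    informative = [(x, m, n) for x, m, n in strata if 0 < m < n]
    if not informative:
        raise ValueError("no matched set has discordant exposure")

    def score_and_information(beta):
        score = information = 0.0
        for x, m, n in informative:
            weight = m * math.exp(beta)
            # Conditional probability that the case is one of the exposed members.
            p = weight / (weight + (n - m))
            score += x - p
            information += p * (1.0 - p)
        return score, information

    beta = 0.0
    for _ in range(max_iter):
        score, information = score_and_information(beta)
        step = score / information
        beta += step
        if abs(step) < tol:
            break

    _, information = score_and_information(beta)
    se = 1.0 / math.sqrt(information)
    return math.exp(beta), (math.exp(beta - 1.96 * se), math.exp(beta + 1.96 * se))


# Hypothetical 1:1 matched pairs: 30 with only the case exposed,
# 10 with only the control exposed, 5 with both exposed (uninformative).
pairs = [(1, 1, 2)] * 30 + [(0, 1, 2)] * 10 + [(1, 2, 2)] * 5
odds_ratio, (lower, upper) = conditional_logit_binary(pairs)
```

For 1:1 matching the estimate reduces to the ratio of discordant pairs, which gives a simple check on the code. The same function accepts 1:10 matched sets, for example `(1, 3, 11)` for an exposed case with two exposed controls out of ten. If every informative set has an exposed case (or none does), no finite estimate exists and the iteration will not converge.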
Alarm symptoms included jaundice, dysphagia, and gastrointestinal (GI) bleeding. These three symptoms were analysed within shorter timeframes (within 3 to 6 months before the index date), as they are widely accepted as 'red flag' symptoms and should be promptly investigated in primary care, or referred to secondary care. The timeframes were longer for the other symptoms (within 1 or 2 years before the index date), as they are non-specific, probably caused by other benign conditions, and not easily ascribed to an underlying tumour. A categorical variable was used to denote whether the patients presented to their GPs for each symptom based on the most recent date of presentation, for example, no record of presenting with jaundice (reference category), presenting with jaundice within 3 months, or >3 months before the index date. Symptoms in other timeframes (6 months, 1 or 2 years) were operationalised in the same way.
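The categorical coding described above can be sketched as follows. The symptom keys, category labels, and the conversion of months and years into days (91, 183, 365, and 730) are assumptions made for illustration; the paper does not specify them. Records dated after the index date are treated here as "no record", which is also an assumption.

```python
from datetime import date
from typing import Dict, Optional

ALARM_SYMPTOMS = {"jaundice", "dysphagia", "gi_bleeding"}

# Window lengths in days (approximate conversions, assumed).
WINDOWS_3M_1Y = {"alarm": 91, "other": 365}
WINDOWS_6M_2Y = {"alarm": 183, "other": 730}


def symptom_category(symptom: str, most_recent: Optional[date],
                     index: date, windows: Dict[str, int]) -> str:
    """Code one symptom from its most recent presentation before the index date."""
    if most_recent is None or most_recent > index:
        return "no record"            # reference category
    limit = windows["alarm" if symptom in ALARM_SYMPTOMS else "other"]
    days_before = (index - most_recent).days
    return "within window" if days_before <= limit else "before window"


# Hypothetical presentations relative to an index date of 1 June 2015.
index = date(2015, 6, 1)
recent_jaundice = symptom_category("jaundice", date(2015, 5, 1), index, WINDOWS_3M_1Y)
older_jaundice = symptom_category("jaundice", date(2015, 1, 1), index, WINDOWS_3M_1Y)
older_jaundice_6m = symptom_category("jaundice", date(2015, 1, 1), index, WINDOWS_6M_2Y)
back_pain = symptom_category("back_pain", date(2014, 9, 1), index, WINDOWS_3M_1Y)
```

Keeping "before window" separate from "no record" mirrors the three-level variable in the text, so presentations older than the cut-off are not pooled with patients who never presented.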
Because of the large difference in sample sizes, PDAC and PNEN were analysed separately. Five variables contained missing data: ethnicity, Townsend quintile, BMI, and smoking and drinking statuses. Multiple imputation with chained equations was used to impute missing values for these variables under the missing at random assumption. Ten imputations were conducted. Multivariable conditional logistic regression models were used to identify symptoms significantly associated with the diagnoses of PDAC and PNEN, adjusting for patient characteristics and comorbidities, with Rubin's rules used to pool the parameter estimates across the 10 imputed datasets. 19 Possible interactions were considered and tested in the model. Odds ratios (OR) and 95% confidence intervals (CI) for each symptom were calculated and visualised in forest plots. Symptoms with an OR >1.2 at a significance level of P<0.01 for PDAC or PNEN were considered clinically and statistically relevant. Sensitivity analyses were conducted in patients (both cases and controls) with at least 3 years of electronic health records (EHRs) before the index date. All statistical analyses were conducted in Stata (version 16.1). The reporting of this study followed the recommendations of the STROBE (strengthening the reporting of observational studies in epidemiology) statement. 20
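The pooling step and the relevance threshold can be illustrated with a short Python sketch of Rubin's rules applied to log odds ratios. The function names and input values are hypothetical, and the normal approximation used for the confidence interval and P-value is a simplifying assumption; Rubin's rules strictly use a t reference distribution, and the study's own estimates came from Stata.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool per-imputation estimates with Rubin's rules.

    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(se ** 2 for se in std_errors)   # average within-imputation variance
    between = variance(estimates)                 # between-imputation variance (m - 1 divisor)
    total = within + (1 + 1 / m) * between
    return pooled, math.sqrt(total)


def summarise_symptom(log_ors, std_errors, or_threshold=1.2, alpha=0.01):
    """Pooled OR, 95% CI, P-value, and the OR > 1.2 with P < 0.01 criterion."""
    log_or, se = pool_rubin(log_ors, std_errors)
    z = log_or / se
    p_value = math.erfc(abs(z) / math.sqrt(2))    # two-sided, normal approximation (assumed)
    odds_ratio = math.exp(log_or)
    return {
        "or": odds_ratio,
        "ci": (math.exp(log_or - 1.96 * se), math.exp(log_or + 1.96 * se)),
        "p": p_value,
        "relevant": odds_ratio > or_threshold and p_value < alpha,
    }


# Hypothetical log odds ratios from three imputed datasets (the study used ten).
pooled_log_or, pooled_se = pool_rubin([0.5, 0.6, 0.7], [0.1, 0.1, 0.1])
strong = summarise_symptom([0.5, 0.6, 0.7], [0.1, 0.1, 0.1])
weak = summarise_symptom([0.10, 0.12, 0.08], [0.02, 0.02, 0.02])
```

The second hypothetical symptom is statistically significant but has a pooled OR below 1.2, so it would not meet the combined relevance criterion.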

RESULTS
Population characteristics
The open cohort included 15 194 279 patients aged ≥25 years, with a total of 100 290 294 person years of follow-up. A total of 23 640 PDAC and 596 PNEN cases were identified from the cohort. Case ascertainment from the linked data sources is shown in Supplementary [...]

[Figure: Adjusted odds ratio (95% CI) for PDAC after multiple imputation (significant results only). Columns: Variables, Odds ratio (95% CI), P-value.]

[...] Tables S2-S5). Most results were congruent in the two models, though symptoms in a shorter timeframe (3M/1Y) generally had higher ORs than those in a longer period (6M/2Y) and had wider confidence intervals. In addition, symptoms within the cut-off periods (for example, 3 to 6 months or 1 to 2 years) were statistically significant. For symptoms recorded earlier than the cut-off periods, there were two main patterns: either the symptoms became non-significant, or the direction of the OR reversed (from >1 to <1, that is, significantly higher odds in controls), which meant the controls were more likely to have consulted with those (non-specific) symptoms before the cut-off periods.
Notably, the association with unintentional weight loss may extend beyond 2 years before the diagnosis of PNEN. Interaction terms were tested; however, they were not included in the final model because they increased the number of parameters to be estimated and the model was unable to converge, especially for PNEN with its small sample size.
Jaundice had the highest adjusted OR in both PDAC and PNEN. Symptoms associated with PNEN were a subset of symptoms associated with PDAC, but the strength of ORs among the significant symptoms may not be the same in PDAC and PNEN. Nine symptoms were significantly associated with the diagnosis of PNEN in the timeframe of 3M/1Y, including jaundice, GI bleeding, diarrhoea, bowel change, vomiting, indigestion, abdominal mass, abdominal pain, and weight loss. The additional significant symptoms associated with PDAC included constipation, steatorrhea, abdominal distension, nausea, flatulence, heartburn, fever, tiredness, appetite loss, itching, back pain, thirst, and dark urine (Figure 1a).

Other risk factors
Compared with people of white ethnicity, people of Indian, Bangladeshi, and other Asian (not including Chinese) ethnicity were less likely to develop PDAC (OR<1). Smoking and drinking were risk factors for PDAC. When considering comorbidities, type 2 diabetes mellitus (T2DM), venous thromboembolism, Cushing's syndrome, and presence of pancreatic cysts significantly increased the risks of PDAC and PNEN. Acute pancreatitis, cholangitis, a family history of GI cancer, and type 1 diabetes were significant risk factors for PDAC, but not for PNEN (Figures 1 and 2, comparing a and b).

Sensitivity analysis
About 25.9% of patients (65 884 out of 254 260, including cases and controls) were excluded from the sensitivity analysis. Compared with the main analysis, the conclusions about symptoms did not change in the sensitivity analysis, though some ORs changed. Complete results of sensitivity analyses are available in Supplementary Tables S6-S9.

DISCUSSION
Summary
[...] identified from the QResearch database. Nine symptoms were significantly associated with the diagnosis of PNEN; these are a subset of the 23 significant symptoms for PDAC. A shorter timeframe (3M/1Y) was considered better than a longer one (6M/2Y) for earlier cancer diagnosis, as cases had higher odds of presenting with symptoms in the 3M/1Y timeframe. Jaundice had the highest adjusted OR in both PDAC and PNEN. Thirst and dark urine were the two newly identified symptoms associated with PDAC, not previously reported in other studies. Thirst could be a symptom explained by T2DM, which is associated with pancreatic cancer. Dark urine could be caused by progressing liver dysfunction, or be a manifestation of biliary duct obstruction.

[Figure: Adjusted odds ratio (95% CI) for PDAC after multiple imputation (significant results only). Symptom period: 6 months/2 years before the index date. Columns: Variables, Odds ratio (95% CI), P-value.]

The complete findings of this study are summarised in Box 2.
Early diagnosis of pancreatic cancer from primary care is still challenging, owing to non-specific symptoms. This study identified 23 symptoms associated with the diagnosis of PDAC and nine symptoms for PNEN. Risk prediction models incorporating comprehensive symptomatology would help identify patients with a high risk of developing pancreatic tumours from primary care. Patients could benefit from an earlier cancer diagnosis and better survival outcomes, which can also save costs for the NHS.

Strengths and limitations
The QResearch database provided rich data for this study, which is by far the largest study of its kind. The representative patient population makes the study findings more generalisable to a broader UK population. The use of EHRs avoided the selection, recall, and responder biases associated with surveys, and also benefited from the accuracy of coding and data completeness in UK general practice. The authors explored the effects of symptoms in seven timeframes first, and then narrowed these down to 3M/1Y and 6M/2Y before the index date. Symptoms recorded earlier than these timeframes (>3M/1Y, >6M/2Y) were kept separate from 'no symptom recorded', which provided new information about symptoms beyond the cut-off periods in cases and controls. Though symptoms were the focus of this study, background risk factors and relevant comorbidities were also included and adjusted for in the model, which is another strength. The authors conducted the study as transparently and thoroughly as possible. The research protocol has been published on the QResearch website. The reporting of this article complies with the STROBE statement.
Information bias in EHRs is the first limitation. The authors could not evaluate how accurately the information was recorded across practices. Recording habits may have differed considerably among GPs. This heterogeneity was mitigated by using all possible Read codes for each variable. Because of the small number of PNEN cases, it is possible that the full burden of symptoms in PNEN could not be captured. Therefore, the authors could not discern whether the lower number of significant symptoms for PNEN reflects a lack of statistical power, a truly less prominent symptomatology, or both. The researchers planned to explore whether any symptom was associated with early/late stage at diagnosis in PDAC and PNEN. Unfortunately, the large amount of missing data in cancer staging (Supplementary Table S10) did not allow them to conduct such an analysis.

Comparison with existing literature
The current study improves on the authors' previous QCancer (pancreas) prediction model 10 through a longer study period, a larger sample of incident cases, more symptoms examined, and the exploration of PNEN, which resulted in an additional 13 significant symptoms being identified. Some UK studies have examined the symptomatology of PDAC in primary care settings with a similar study design (matched case-control study) and statistical method (conditional logistic regression), using the GPRD 21 and the Health Improvement Network (THIN) 22 databases. The findings in this study are generally consistent with these two publications. No studies have systematically and robustly evaluated the symptomatology of PNEN in primary care. In one previous study, patients with PNEN (n = 64) reported their symptoms in a voluntary, internet-based survey. Given the small sample size and potential recall bias in that study, 23 the authors believe that the present population-based approach offers a more robust and generalisable insight into PNEN symptomatology.
Older age, smoking, excess alcohol intake, chronic pancreatitis, and T2DM are common risk factors for pancreatic cancer. [24][25][26][27][28] The findings of the present study are consistent with this.

Implications for research and practice
Most symptoms identified in this study do not qualify for a rapid referral in the current NICE guideline for suspected (pancreatic) cancer pathway referral (NG12). 17 GPs should be alert to patients presenting with alarm symptoms and with non-specific but concerning symptoms, especially when patients have existing comorbidities. Public and patient engagement events could raise public awareness of the symptoms of pancreatic cancer, which may help patients see their GPs more promptly when noticing bodily changes.
Based on the study findings, the authors can update the QCancer (pancreas) prediction model 10 and develop a new model for PNEN. It is also possible to quantify the risk of patients presenting with several concurrent non-specific symptoms, and the predictive values of such symptom combinations. It would be interesting to further understand how GPs managed and investigated patients presenting with different symptom combinations, and the association with the route to diagnosis and cancer stage.
In addition, there is an ongoing project in the authors' team, investigating diabetes as a risk pathway towards pancreatic cancer: Diabetes as a risk pathway towards early diagnosis and prognostication of pancreatic cancer (www.qresearch.org/research/approved-research-programs-and-projects/diabetes-as-a-risk-pathway-towards-early-diagnosis-and-prognostication-of-pancreatic-cancer/).

Ethical approval
This study used QResearch® data and obtained approval from the QResearch Scientific Committee in July 2018. QResearch is a Research Ethics Approved Research Database, with approval confirmed by the East Midlands - Derby Research Ethics Committee (research ethics reference: 18/EM/0400). A dedicated webpage for this project has been created on the QResearch website: www.qresearch.org/research/approved-research-programs-and-projects/adepts-accelerated-diagnosis-in-neuroendocrine-and-pancreatic-tumours/. The study protocol and statistical analysis plan are available from this webpage.

Provenance
Freely submitted; externally peer reviewed.