Main

Increasing the proportion of cancer patients who are diagnosed at an early stage could help decrease the number of cancer-related deaths (Abdel-Rahman et al, 2009). Therefore, national cancer control policies in several countries currently encompass initiatives supporting early detection and diagnosis (Olesen et al, 2009; Richards, 2009; Coleman et al, 2011).

The evidence base supporting these initiatives, however, is complex and heterogeneous (Richards, 2009). Markers and measures of the timeliness of diagnosis currently in use include short-term survival (NCIN (National Cancer Intelligence Network), 2008a; Møller et al, 2009; Rachet et al, 2009), diagnosis after an emergency hospital admission (NCIN (National Cancer Intelligence Network), 2010), and length of time intervals between symptom onset and diagnosis (Neal and Allgar, 2005; Macleod et al, 2009; Olesen et al, 2009). Stage at diagnosis is an excellent measure of early detection, but UK population-based data regarding this measure are limited. A recent National Audit Office report indicated that the completeness of stage information across English cancer registries is <40% (NAO (National Audit Office), 2010).

A better understanding of socio-demographic variation in stage at diagnosis could help stratify and tailor symptom awareness and early diagnosis interventions aimed at specific patient groups. We distinguish between ‘stratification’, that is, the targeting of an intervention to patient populations at higher risk, and ‘tailoring’, that is, the adaptation (or customising) of a generic intervention to make its application more suitable for specific patient groups. An example of this concept relates to targeted interventions to increase breast cancer symptom awareness amongst older women (Forbes et al, 2011). Such understanding can also help focus early diagnosis audit efforts (RCGP (Royal College of General Practitioners), 2011) towards the cancers and patient groups with the greatest potential for improvement.

Against this background, we set out to examine socio-demographic variation in stage at diagnosis for female breast and lung cancers (two common cancers responsible for about 30% of all cancer diagnoses and cancer deaths in England (NCIN (National Cancer Intelligence Network), 2008b)) during a recent period.

Materials and methods

Data

We analysed information on the stage at diagnosis of East of England patients diagnosed with female breast (‘breast’ hereafter) and lung cancer during the 4-year period 2006–2009 (International Classification of Diseases (ICD)-10 codes C50 and C34, respectively). The study period was chosen as the most recent for which data were available at the time of analysis. Anonymous data were extracted from the Eastern Cancer Registration and Information Centre (ECRIC), a population-based cancer registry covering a general population of 5.7 million. The Registry performs excellently on conventional measures of cancer registration quality, such as the proportion of death-certificate-only registrations (0%), and, uniquely among English cancer registries at present, it holds information on stage at diagnosis for a particularly high proportion of patients (NAO (National Audit Office), 2010). Stage at diagnosis was classified using the 5th edition of the TNM classification, comprising stages I–IV (Sobin and Wittekind, 1997). Stage at diagnosis was assigned by CHB and BR, integrating clinical, imaging and pathological information. Patient socioeconomic status was ascribed using the income domain score of the Index of Multiple Deprivation (IMD) 2004 for the Lower Super Output Area (LSOA) of the patient’s residence, which was used to define quintile groups (1=least deprived, or ‘most affluent’; 5=most deprived) (Office of the Deputy Prime Minister, 2004). The income domain of IMD 2004 incorporates information on the proportion of residents of a small area who live in households receiving state-funded support (for example, in the form of income support, unemployment benefit and tax credits).

Tumour histological type was categorised into seven groups for breast cancer (infiltrating ductal carcinoma, lobular carcinoma, mixed ductal lobular, other adenocarcinoma, other specified carcinoma, specified not carcinoma tumours and other unspecified) and eight for lung cancer (adenocarcinoma, squamous cell carcinoma, other non-small cell, small cell carcinoma, large cell carcinoma, carcinoid, other specified and other unspecified), using appropriate ICD-Oncology morphology codes (WHO (World Health Organisation), 2000).
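
The assignment of deprivation quintile groups described above can be illustrated with a minimal Python sketch. It is not the code used in the study. The function names and the example scores are hypothetical, and the sketch assumes that a higher income domain score indicates greater deprivation and that cut points are derived from whatever reference set of area scores is supplied.

```python
from bisect import bisect_left
from statistics import quantiles


def quintile_cut_points(reference_scores):
    """Return the four cut points splitting the reference scores into five groups."""
    return quantiles(reference_scores, n=5, method="inclusive")


def deprivation_quintile(score, cut_points):
    """Map an area score to a group from 1 (least deprived) to 5 (most deprived).

    Assumes that a higher score means greater deprivation.
    """
    return bisect_left(cut_points, score) + 1


# Hypothetical reference scores for 100 small areas (illustration only).
example_scores = list(range(1, 101))
example_cuts = quintile_cut_points(example_scores)
example_groups = [deprivation_quintile(s, example_cuts) for s in example_scores]
```

In practice the reference set would be the income domain scores of all relevant LSOAs, so that each patient inherits the quintile group of the area in which they live.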

Analysis

We aimed to examine socio-demographic variation in advanced stage at diagnosis.

Initial analysis was confined to patients with known stage (complete case analysis). Binary logistic regression was used, defining advanced stage at diagnosis either as diagnosis in stages III/IV or, alternatively, as diagnosis in stages II–IV (that is, diagnosis other than in stage I). For brevity, we present findings regarding variation in diagnosis in stages III/IV (vs I–II) in the main paper and append analysis relating to diagnosis at stage I (vs II–IV). We considered, but did not use, ordinal logistic regression because initial analysis provided evidence of violation of the proportional odds assumption.
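
The two binary outcome definitions can be sketched as follows. This is an illustrative Python fragment rather than the study code; the names `is_advanced` and `complete_cases` are hypothetical, and unknown stage is assumed to be coded as `None`.

```python
STAGE_ORDER = {"I": 1, "II": 2, "III": 3, "IV": 4}


def is_advanced(stage, first_advanced_stage="III"):
    """Return 1 if stage is at or beyond the threshold, 0 if earlier, None if unknown."""
    if stage not in STAGE_ORDER:
        return None
    return int(STAGE_ORDER[stage] >= STAGE_ORDER[first_advanced_stage])


def complete_cases(stages, first_advanced_stage="III"):
    """Keep only the outcomes of patients with known stage (complete case analysis)."""
    outcomes = [is_advanced(s, first_advanced_stage) for s in stages]
    return [o for o in outcomes if o is not None]


# Hypothetical patients, one of whom has unknown stage.
example_stages = ["I", "II", "III", "IV", None]
main_definition = complete_cases(example_stages, "III")        # III/IV vs I/II
alternative_definition = complete_cases(example_stages, "II")  # II-IV vs I
```

Passing the threshold as a parameter keeps both definitions in one place, so the same regression code can be rerun for the appended stage I vs II–IV analysis.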

Mixed-effects logistic regression models were used to predict advanced stage at diagnosis, adjusting for age group, deprivation quintile and tumour type (both cancers), sex (lung cancer) and screening detection status (breast cancer) as fixed-effect categorical variables, and including a random effect for Primary Care Trust. Although the UK government plans to abolish Primary Care Trusts in the future, they were responsible for planning, purchasing and quality assuring preventive services and primary or specialist health care for their residents during the study period (2006–2009). A model using only fixed-effect variables for patient characteristics would assume that all observations are independent. In reality, patients within the same organisation may be more similar to each other than to patients elsewhere. The models used therefore recognise the hierarchical nature of the data, with patient-level observations nested within Primary Care Trusts. As a result, they provide information about patient-level variation (for example, between patients of different age, sex or deprivation status) without the risk of identifying spurious associations arising from potential clustering of different patient subgroups in Primary Care Trusts with higher or lower rates of advanced stage at diagnosis. To explore a potential interaction between age and sex for lung cancer, we included in a subsequent model an interaction term between age group category (treated as continuous) and sex.
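
For readers who prefer a formal statement, a model of this kind can be written as below. The notation is ours and is offered as a sketch: it assumes a normally distributed random intercept for each Primary Care Trust, which is the usual form of such a random effect but is not spelled out in the text above.

```latex
\operatorname{logit}\,\Pr\left(Y_{ij} = 1 \mid u_j\right)
  = \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_j,
\qquad u_j \sim N\left(0, \sigma_u^{2}\right)
```

Here $Y_{ij}$ indicates advanced stage at diagnosis for patient $i$ resident in Primary Care Trust $j$, $\mathbf{x}_{ij}$ holds the indicator variables for the fixed-effect categories (age group, deprivation quintile, tumour type, and sex or screening detection status as applicable), and $u_j$ is the Trust-level intercept. The exponentiated elements of $\boldsymbol{\beta}$ are the adjusted odds ratios.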

Significance testing was principally based on joint log likelihood ratio tests. We specifically focused aspects of the analysis on patients aged ≥70 years because in recent decades improvements in cancer survival in this age group were smaller than those observed in younger patients, a finding thought to partially reflect relatively more advanced stage at diagnosis amongst older patients (Quaglia et al, 2009). Therefore, in addition to testing the overall effect of age, we also examined the significance of differences between patients aged ≥70 years and patients in all other age groups. Further, tests for linear trend were used to examine the significance of deprivation group gradients by treating deprivation quintile as a continuous rather than a categorical variable.
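
The logic of a likelihood ratio test can be sketched in a few lines of Python using only the standard library. This is an illustration, not the Stata routine used in the study; the function names and the log-likelihood values in the example are hypothetical. The test statistic is twice the difference in log likelihood between the fuller and the reduced model, referred to a chi-squared distribution with degrees of freedom equal to the number of parameters dropped (for example, 4 for a joint test of five deprivation groups and 1 for a linear trend).

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with integer df."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)              # Q(1, y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))   # Q(1/2, y)
    # Recurrence: Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1)
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def likelihood_ratio_test(loglik_reduced, loglik_full, df):
    """Return the likelihood ratio statistic and its P-value."""
    statistic = 2.0 * (loglik_full - loglik_reduced)
    return statistic, chi2_sf(statistic, df)


# Hypothetical log likelihoods for models without and with a 5-level factor.
example_statistic, example_p = likelihood_ratio_test(-1000.0, -995.0, df=4)
```

The same function covers both the joint test (several degrees of freedom) and the trend test (one degree of freedom), which is why the two P-values reported for a variable can differ.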

Sensitivity analysis

Complete case analysis may be biased, depending on the mechanism responsible for missing data, that is, if data are not ‘missing completely at random’ (MCAR) (Appendix Table A1) (Sterne et al, 2009). Therefore, we additionally used two sensitivity analysis approaches for handling potential bias arising from missing stage information, each reflecting a different assumption about the mechanism generating the missing data.

First, we used multiple imputation to impute stage. Multiple imputation is a method increasingly used in the context of cancer epidemiological studies (He et al, 2008; Nur et al, 2010; Ali et al, 2011). It assumes that data are ‘missing at random’ (MAR), that is, that any systematic differences between the missing and observed values can be estimated using information from the observed data (note: the MAR assumption does not mean that there are no systematic associations between missing data and specific variables) (Appendix Table A1). In addition to all the variables used in the analysis models, we included in the imputation models survival, tumour histological grade, basis of diagnosis (that is, whether the diagnosis was verified with histology or not), Primary Care Trust and oestrogen receptor status (breast cancer imputation models only). All exposure variables used in either the analysis or imputation models were complete, except for grade and oestrogen receptor status (used in imputation models).
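
Estimates from multiply imputed data sets are conventionally combined using Rubin's rules, and the following Python sketch shows that pooling step for a single coefficient such as a log odds ratio. It is an illustration under stated assumptions, not a reproduction of the Stata commands used in the study; the function name and the example values are hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, squared_standard_errors):
    """Pool one coefficient across imputed data sets using Rubin's rules.

    Returns the pooled estimate and its standard error.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(squared_standard_errors)   # average within-imputation variance
    between = variance(estimates)            # between-imputation variance (m - 1 denominator)
    total = within + (1.0 + 1.0 / m) * between
    return pooled, math.sqrt(total)


# Hypothetical log odds ratios and squared standard errors from 3 imputations.
example_log_or, example_se = pool_rubin([0.10, 0.20, 0.30], [0.04, 0.04, 0.04])
example_or = math.exp(example_log_or)
```

Pooling is done on the log odds scale and the result is exponentiated afterwards. The between-imputation term is what widens the confidence interval to reflect uncertainty about the imputed stage values.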

Second, as it is not possible to verify the MAR assumption empirically, we conducted sensitivity analysis with a more extreme imputation of missing stage that falls under the assumption of data ‘missing not at random’ (MNAR) (Appendix Table A1). To do this, we assigned all patients with unknown stage to the advanced stage category (III/IV), and repeated the analysis. This extreme case scenario approach is based on observations that the survival of patients with missing stage information is typically similar to that of patients diagnosed in advanced stage (ECRIC (Eastern Cancer Registration and Information Centre), 2011). We do not expect this extreme case scenario to represent a true situation, but we use it to illustrate how sensitive the complete case and multiple imputation analyses may be to the MCAR or MAR assumptions, respectively. All analyses were conducted in STATA 11 (StataCorp. 2009, College Station, TX, USA), using the ice and mim commands for multiple imputation (Royston, 2007). Further details are provided in Appendix Table A1.
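
The extreme case scenario recoding amounts to a one-line rule, sketched here in Python for illustration. The function name is hypothetical, and unknown stage is assumed to be any value other than the four TNM stage labels (for example, `None`).

```python
KNOWN_STAGES = ("I", "II", "III", "IV")


def extreme_case_outcome(stage):
    """Return 1 for stage III/IV or unknown stage, and 0 for stage I/II."""
    if stage in ("III", "IV") or stage not in KNOWN_STAGES:
        return 1
    return 0


# Hypothetical patients, the last of whom has unknown stage.
example_stages = ["I", "II", "III", "IV", None]
example_outcomes = [extreme_case_outcome(s) for s in example_stages]
```

Unlike the complete case analysis, no patient is dropped under this rule, so the regression is refitted on the full cohort with every unstaged patient counted as advanced.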

Results

Data relate to 17 836 and 13 286 patients with incident diagnosis of breast and lung cancer, respectively. Information on stage at diagnosis was complete for 16 460 (92%) and 10 435 (79%) patients, respectively. The completeness of stage information varied substantially between patients with different socio-demographic characteristics and tumour types; in particular, missing stage was more frequent in older patients (P<0.001 for both cancers, Appendix Table A2). Among staged patients with breast and lung cancer, 41% and 15% were diagnosed in stage I, and 86% and 21% in stages I/II, respectively (Table 1).

Table 1 Proportion of patients by stage, gender, age and deprivation group categories for breast and lung cancer (2006–2009)

Multivariate complete case analysis

Breast cancer

There was very strong evidence of an association between age and diagnosis in stages III/IV (Table 2). Specifically, for women aged ≥70 years, the frequency of diagnosis in stages III/IV increased progressively with older age (odds ratios (ORs): 1.21, 1.46, 1.68 and 1.78 for women aged 70–74, 75–79, 80–84 and ≥85 years, respectively, P<0.001). Increasing deprivation was associated with a greater frequency of stage III/IV diagnosis (joint log likelihood ratio P=0.010, P for trend=0.002; Table 2).

Table 2 Breast cancer. Independent associations of age and deprivation with advanced stage at diagnosis (i.e., stage III/IV vs stage I/II) (n=16 460)

Lung cancer

There was very strong evidence of an association between age and advanced stage at diagnosis (Table 3). The frequency of stage III/IV diagnosis decreased progressively among patients aged ≥70 years (ORs: 0.82, 0.74, 0.73 and 0.66 for patients aged 70–74, 75–79, 80–84 and ≥85 years, respectively, P<0.001). There was no evidence for deprivation group differences in lung cancer diagnosis at stages III/IV, in spite of an apparent trend towards lower frequency with increasing deprivation (P for trend=0.236) (Table 3). There was strong evidence of a higher frequency of advanced stage at diagnosis in men (OR of 1.14 for diagnosis in stages III/IV, P=0.011). There was no evidence for a differential effect of age in men and women (OR for men vs women per increase in age group category=0.96, 95% CI 0.92–1.01, P=0.100). Although this may reflect lack of power, the size of the interaction indicates that a large synergistic effect is unlikely.

Table 3 Lung cancer. Independent associations of age, deprivation and sex with advanced stage at diagnosis (i.e., stage III/IV vs stage I/II) (n=10 435)

Examining variation in diagnosis in stage I vs II–IV produced overall similar findings for lung cancer. For breast cancer, the findings were similar in respect of variation in older age, but there was no evidence of deprivation differences (Appendix Tables A3 and A4).

Sensitivity analysis

Repeating the analysis using multiple imputation of missing stage information produced highly similar values and patterns to those derived by the complete case analysis (Tables 4 and 5). Specifically, for both breast and lung cancer the same patterns of variation by age, deprivation and sex (for lung cancer only) were apparent. Repeating the analysis using the extreme case scenario approach (missing stage=advanced stage) produced similar patterns of variation for lung cancer. For breast cancer, in the extreme case scenario that the true stage at diagnosis of all women with missing information was either stage III or IV, deprivation differences in advanced stage at diagnosis would be smaller. The full output from all analysis models is provided in Appendix Table A5.

Table 4 Breast cancer. Summary of outputs obtained by complete case analysis and sensitivity analyses (odds ratios for stage III/IV vs I/II).
Table 5 Lung cancer. Summary of outputs obtained by complete case analysis and sensitivity analyses (odds ratios for stage III/IV vs I/II)

Discussion

Summary of findings and comparisons with other literature

Using population-based data, we identified substantial socio-demographic variation in the stage at diagnosis of breast and lung cancer. Breast cancer patients aged ≥70 years had a higher frequency of advanced stage at diagnosis. Conversely, age ≥70 years was associated with a lower frequency of advanced stage at diagnosis for lung cancer. Advanced stage at diagnosis was more frequent in more deprived patients with breast cancer. Men with lung cancer had a higher frequency of advanced stage at diagnosis. The findings were robust to multiple imputation of missing stage (under the MAR assumption). Similar patterns of variation were also observed for extreme case scenario analysis (under the MNAR assumption of missing stage=advanced stage), except that deprivation differences in advanced stage diagnosis for breast cancer were smaller.

Regarding age differences in stage at diagnosis, no age patterns were apparent in a recent analysis of US breast cancer data (CDC, 2010). For lung cancer, evidence from Denmark indicates a lower frequency of advanced stage at diagnosis with increasing age, as observed in our own study (Dalton et al, 2011).

For breast cancer, the observed socioeconomic differences accord with other evidence from the United Kingdom, United States and Canada, indicating a higher frequency of advanced stage at diagnosis among women of lower socioeconomic position (Adams et al, 2004; Clegg et al, 2009; Cuthbertson et al, 2009; Booth et al, 2010). For lung cancer, studies from Canada, Denmark and Sweden have indicated only limited socioeconomic differences in advanced stage at diagnosis (Berglund et al, 2010; Booth et al, 2010; Dalton et al, 2011). A previous UK study reported lower frequency of advanced stage at diagnosis in more deprived patients (Brewster et al, 2001). The findings of our study are similar to those of previous UK research, although there was no independent evidence of an association (P for trend=0.236), which may reflect a lack of power.

Strengths and limitations

The principal strengths of the study are its population-based design, and the high quality and completeness of information on stage at diagnosis and other tumour variables. Unlike previous studies in this field, we adjusted the analysis for tumour subtype and employed sensitivity analysis approaches using different assumptions about potential mechanisms responsible for missing stage data. Previous studies on stage at diagnosis of breast cancer did not encompass adjustment for screening or symptomatic detection status, and this factor complicated the interpretation of age and socioeconomic differences in stage at diagnosis (Macleod et al, 2000; Adams et al, 2004; Cuthbertson et al, 2009). In contrast, our findings indicate that substantial age and deprivation differences in stage at diagnosis of breast cancer exist independently of whether a woman was diagnosed by screening or after symptomatic presentation. A previous UK study on stage at diagnosis of lung cancer only reported on socioeconomic differences (not encompassing age and sex differences) in the mid-1990s (Brewster et al, 2001). Therefore, we believe the findings substantially enrich the currently available evidence on patterns of stage at diagnosis in patients with breast and lung cancer.

The study also has certain limitations. We could not adjust the analysis for ethnicity, a potential confounder of deprivation in particular. During the study period, the proportion of East of England residents belonging to ethnic minorities was relatively small, particularly among persons aged ≥65 years (where the majority of cancer cases occur); 97% of the East of England resident population in this age group were estimated as being British White in 2007 (ONS (Office for National Statistics), 2009). Given the demographic characteristics of the East of England population, the findings can be considered to chiefly describe socio-demographic variation in stage at diagnosis among White British patients. Nevertheless, examination of patterns of stage at diagnosis by ethnic group is warranted in the future.

We examined data from a single region that includes about 10% of the total English population. However, socioeconomic differences in short-term cancer survival (a marker of early diagnosis) are relatively similar across different English regions (Rachet et al, 2009). Inequalities in cancer treatment patterns observed in East of England cancer patients are also similar to those observed nationwide (Wishart et al, 2010). These considerations indicate that the observed socio-demographic patterns of stage at diagnosis may be applicable to the rest of the English population. In addition, the size of the East of England population (5.7 million) is similar to that of several European countries.

In common with previous authoritative UK research (Brewster et al, 2001; Adams et al, 2004; Rachet et al, 2010), we used an area-based measure of socioeconomic status in our study, relating to the population characteristics of highly homogeneous small areas (LSOA) (Woods et al, 2005). Socioeconomic status can be measured either directly (for example, by measuring a person's income, occupation or education) or indirectly (ecologically) by measuring the characteristics of the population of a small area (Liberatos et al, 1988). Both direct and area-based measures of socioeconomic status have limitations (Sloggett et al, 2007), and might be affected by lack of homogeneity within groups (for example, between patients of the same social class, income, education or neighbourhood) (Carstairs and Morris, 1989). Using an area-based measure of socioeconomic status may have either underestimated or overestimated socioeconomic gradients in stage at diagnosis compared with direct measures (Sloggett et al, 2007), and research examining such gradients using both area-based and direct measures would be useful.

Interpretation and research policy implications

A key consideration in interpreting the findings is whether the observed variation in advanced stage at diagnosis, particularly in relation to age, can be considered avoidable. In theory, the findings might in part reflect differences in the malignant potential of tumours between patients of different ages. The analysis was, however, adjusted for tumour subtype. This makes it less likely that age differences in tumour biology are responsible for a major part of the observed age differences in stage at diagnosis.

For breast cancer, it is possible that the observed variation in stage at diagnosis reflects differences in the awareness of cancer symptoms between different patient groups. Awareness of cancer symptoms and signs in the United Kingdom is socio-demographically patterned, and is lower among individuals aged >65 and of lower socioeconomic status (Robb et al, 2009). The findings of the study would support the targeting of breast cancer awareness interventions at older women (Forbes et al, 2011).

The lower frequency of advanced stage at diagnosis among older lung cancer patients could reflect more frequent use of chest X-ray investigations in older patients (for example, in the context of investigating either a chest infection or other clinical presentations such as shortness of breath). A recent population study from Denmark indicated a lower frequency of advanced stage lung cancer diagnosis among patients with higher levels of comorbidity and also (as observed in our study) with increasing age (Dalton et al, 2011). Another potential explanation is that ‘stage for stage’ lung cancer is more symptomatic in older patients, for example, either because of a higher propensity to present with concomitant chest infection (prompting earlier investigation and leading to earlier diagnosis) or earlier presentation of dyspnoea because of physiologically declining lung capacity in older age. Further research is needed to explore the validity of these hypotheses, and to identify the mechanisms responsible for excess risk of advanced stage at diagnosis in relatively younger patients.

There was a substantial excess risk of advanced stage at diagnosis among women with breast cancer aged ≥70 years. These differences should not be dismissed as clinically unimportant; in our study sample, one-third of women with breast cancer were aged ≥70 years. In the United Kingdom, life expectancy for women aged 70 and 80 years is 16.5 and 9.5 years, respectively (ONS (Office for National Statistics), 2011). Decreasing the frequency of advanced stage at diagnosis among women aged ≥70 years can therefore contribute substantially to reducing avoidable mortality in this age group. For lung cancer, by contrast, the findings identify opportunities for achieving earlier stage diagnosis in relatively young patients (for example, those aged 60–74 years).

Conclusion

There is substantial potential for improvements in early diagnosis in older patients with breast cancer and in relatively younger patients with lung cancer. The findings could help guide breast and lung cancer early diagnosis initiatives and research focused on individuals in the age groups at highest risk of advanced stage at diagnosis. These could, for example, encompass age-stratified and tailored cancer symptom awareness interventions, or educational interventions for physicians and healthcare professionals, targeted at patients of different age groups. We provide an exemplar of how population-based cancer registration information could help support national initiatives aimed at improving early diagnosis, and inform further policy and research.