Health Heart

12b. ADULTS WHO SMOKE INDICATOR Lower Tier Local Authority


Information component Pg 4 Health Summary – Indicator 12
Subject category / domain(s) The way we live
Indicator name (* Indicator title in health profile) Estimated prevalence of adult smoking (*Adults who smoke)
PHO with lead responsibility SEPHO
Date of PHO dataset creation 15/12/2006
Indicator definition Prevalence of smoking, percentage of resident population, adults, 2000-2002, persons
Geography Local Authority: County Districts, Metropolitan County Districts, Unitary Authorities, London Boroughs.
Timeliness Updated on an ad hoc basis; the next generation of synthetic estimates is due for release in Summer 2007.
Rationale:What this indicator purports to measure Expected prevalence of adult smoking.
Rationale:Public Health Importance Smoking is the most important cause of preventable ill health and premature mortality in the UK. It is linked to respiratory illness, cancer and coronary heart disease. Smoking not only affects the smoker; over 17,000 children under the age of five are admitted to hospital every year with illnesses resulting from passive smoking. A list of disease-specific conditions attributable to smoking is published in The Smoking Epidemic in England, HDA, 2004. Smoking is a modifiable lifestyle risk factor; effective tobacco control measures can reduce the prevalence of smoking in the population.
Rationale: Purpose behind the inclusion of the indicator To estimate the expected proportion of adult smokers in local authorities given the characteristics of local authority populations. Smoking prevalence is a direct measure of health care need, i.e. the ability to benefit from tobacco control interventions, including smoking cessation services.
Rationale:Policy relevance Choosing Health: Making healthy choices easier. Publications/PublicationsPolicyAndGuidance/DH_4094550. Smoking Kills. A White Paper on Tobacco. Publications/PublicationsPolicyAndGuidance/DH_4006684. Tackling Health Inequalities: A Programme for Action. Publications/PublicationsPolicyAndGuidance/DH_4008268
Interpretation: What a high / low level of indicator value means Given the characteristics of the local population, a high indicator value (red circle in health summary chart) represents a statistically significant higher level of estimated adult smoking prevalence for that local authority when compared to the national value. Given the characteristics of the local population, a low indicator value (amber circle in health summary chart) represents a statistically significant lower level of estimated adult smoking prevalence for that local authority when compared to the national value. However, smoking at any prevalence level greater than 0 is undesirable, and therefore a low indicator value should not be taken to mean that public health action is not needed.
Interpretation: Potential for error due to type of measurement method It is important that users note that, as these synthetic estimates are modelled, they do not take account of any additional local factors that may impact on the true smoking prevalence rate in an area (e.g. local initiatives designed to reduce smoking). The figures, therefore, cannot be used to monitor performance or change over time. The model is a non-aetiological model, i.e. it is not based on known aetiological risk factors. This may lead to estimated smoking levels which are at odds with, for example, local lifestyle survey results or modelled estimates which use known co-variates such as socio-economic status, age, gender and ethnicity, such as the smoking prevalence estimates modelled in the Health Poverty Index (see the variables used in generation of the model in the calculation of indicator section below). There may also be a discrepancy between the modelled lower tier estimates (districts) and upper tier (County geographies and above) estimates, which are based on actual Health Survey for England data. This has led to inconsistencies between lower tier and county estimates for some areas, as the datasets are derived using different methods.
Interpretation: Potential for error due to bias and confounding The synthetic estimates are subject to both sampling error and modelling error. Sampling error arises from the fact that only one of a number of possible samples from the population has been selected. Generally, the smaller the sample size, the larger the variability in the estimates that one would expect to obtain from all the possible samples. The use of statistical models for prediction involves making assumptions about relationships in the data. The suitability of the chosen models for the given data and the validity of the model in describing real world dynamics have a bearing on the nature and magnitude of the errors introduced. A key source of modelling error arises from omitting variables that would otherwise help improve the model predictions, either by error or because there is no available or reliable data source for them. The synthetic estimate generated for a particular area is the expected measure for that area based on its population characteristics – and not an estimate of the actual prevalence. In statistical terms, the synthetic estimate is actually a biased estimate of the true value for the area and, as such, should be treated with caution. As mentioned above, the model-based estimates are unable to take account of any additional local factors that may impact on the true prevalence rate (e.g. local initiatives designed to reduce smoking levels). Validation exercises were used to check the appropriateness of the chosen models. Confidence intervals are placed around the synthetic estimates to capture both sampling and modelling error. The confidence intervals provide a range within which we can be fairly sure the ‘true’ value for that area lies. We recommend that users look at the confidence interval for the estimates, not just the estimate.
Estimates for two areas can only be described as significantly different if the confidence intervals for the estimates do not overlap. Users should also note that the potential sources of bias and error also apply to any ranking or banding of the small-area estimates. NatCen do not encourage any ranking of small area estimates within larger areas such as Local Authorities, Primary Care Organisations and Strategic Health Authorities.
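The non-overlap rule for comparing two areas can be sketched in a few lines of Python. This is only an illustration: the function names and the interval values are hypothetical and are not figures from the dataset.

```python
def intervals_overlap(a, b):
    """Return True if two (lower, upper) confidence intervals overlap."""
    a_low, a_high = a
    b_low, b_high = b
    return a_low <= b_high and b_low <= a_high


def significantly_different(a, b):
    """Two estimates are described as significantly different only
    when their confidence intervals do not overlap."""
    return not intervals_overlap(a, b)


# Hypothetical 95% intervals (percent of adults who smoke).
area_x = (22.1, 27.4)
area_y = (27.9, 33.0)
area_z = (25.0, 30.2)

print(significantly_different(area_x, area_y))  # intervals do not overlap
print(significantly_different(area_x, area_z))  # intervals overlap
```

Because the rule requires complete separation of the two intervals, it is a conservative test: some pairs with overlapping intervals would still differ under a formal test of the difference.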
Confidence Intervals: Definition and purpose A confidence interval is a range of values that is normally used to describe the uncertainty around a point estimate of a quantity, for example, a mortality rate. This uncertainty arises as factors influencing the indicator are subject to chance occurrences that are inherent in the world around us. These occurrences result in random fluctuations in the indicator value between different areas and time periods. In the case of indicators based on a sample of the population, uncertainty also arises from random differences between the sample and the population itself. The stated value should therefore be considered as only an estimate of the true or ‘underlying’ value. Confidence intervals quantify the uncertainty in this estimate and, generally speaking, describe how different the point estimate could have been if the underlying conditions stayed the same, but chance had led to a different set of data. The wider the confidence interval, the greater the uncertainty in the estimate. Confidence intervals are given with a stated probability level. In Health Profiles 2007 this is 95%, and so we say that there is a 95% probability that the interval covers the true value. The use of 95% is arbitrary but is conventional practice in medicine and public health. The confidence intervals have also been used to make comparisons against the national value. For this purpose the national value has been treated as an exact reference value rather than as an estimate and, under these conditions, the interval can be used to test whether the value is statistically significantly different to the national value. If the interval includes the national value, the difference is not statistically significant and the value is shown on the health summary chart with a white symbol.
If the interval does not include the national value, the difference is statistically significant and the value is shown on the health summary chart with a red or amber symbol depending on whether it is worse or better than the national value respectively.
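The comparison against the national value can be illustrated with a minimal Python sketch. The function name and all numbers are hypothetical. The sketch assumes, as stated for this indicator, that a higher smoking prevalence is worse (red) and a lower one is better (amber), and it treats the national value as an exact reference.

```python
def summary_symbol(ci_lower, ci_upper, national_value):
    """Return the health summary chart symbol colour for one area.

    The national value is treated as an exact reference value.
    For smoking prevalence, higher than national counts as worse.
    """
    if ci_lower <= national_value <= ci_upper:
        return "white"   # interval includes the national value
    if ci_lower > national_value:
        return "red"     # significantly higher, i.e. worse
    return "amber"       # significantly lower, i.e. better


national = 26.0  # hypothetical national prevalence (percent)
print(summary_symbol(27.0, 33.0, national))
print(summary_symbol(20.0, 25.0, national))
print(summary_symbol(24.0, 28.0, national))
```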


Indicator definition: Variable Prevalence of smoking. Smoking is defined as self-reported current cigarette smoking.
Indicator definition: Statistic Percentage
Indicator definition: Gender Persons
Indicator definition: age group Adults (aged 16 and over)
Indicator definition: period 2000-2002
Indicator definition: scale Per resident adult population aged 16 and over
Geography: geographies available for this indicator from other providers Ward and new PCO available from Statistics/StatisticalWorkAreas/Statisticalworkareaneighbourhood/DH_4116713
Dimensions of inequality: subgroup analyses of this dataset available from other providers None.
Data extraction: Source National Centre for Social Research (NatCen).
Data extraction: source URL Data received directly from NatCen.
Data extraction: date February 2006
Numerator: definition Numerator is not applicable because the synthetic estimates at local authority level are a weighted average of the ward level synthetic estimates (proportion), aggregated to local authority level, weighting the contribution of each ward in proportion to its population size, derived from the Census 2001 counts.
Numerator: source Model estimates by NatCen using data from a number of sources including Health Survey for England 2000-2002, Census 2001, Index of Multiple Deprivation 2004.
Denominator: definition Not applicable.
Denominator: source Not applicable.
Data quality: Accuracy and completeness The model-based approach generates estimates that are of a different nature from standard survey estimates because they depend upon how well the model specifies the relationship between individuals' healthy lifestyle behaviours and the Census/administrative information about the area in which they live. The accuracy and completeness of the information will be subject to the same constraints as the Health Survey for England and Census data sets on which they are based.


Numerator: extraction Not Applicable
Numerator: aggregation /allocation The ward level synthetic estimates (proportion) are aggregated to local authority level by weighting the contribution of each ward in proportion to its population size, derived from the Census 2001 counts.
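The population-weighted aggregation described above can be sketched as follows. This is a minimal illustration, not NatCen's code: the function name, ward proportions and population counts are all hypothetical.

```python
def aggregate_to_local_authority(wards):
    """Population-weighted average of ward-level proportions.

    `wards` is a list of (estimated_proportion, population) pairs,
    where population stands in for the Census 2001 count.
    """
    total_population = sum(population for _, population in wards)
    if total_population == 0:
        raise ValueError("total ward population must be positive")
    weighted_sum = sum(p * population for p, population in wards)
    return weighted_sum / total_population


# Hypothetical wards: (smoking proportion, adult population).
wards = [(0.30, 4000), (0.20, 6000), (0.25, 10000)]
la_estimate = aggregate_to_local_authority(wards)
print(round(la_estimate, 3))  # (1200 + 1200 + 2500) / 20000 = 0.245
```

Because the local authority figure is built directly from ward proportions in this way, there is no separate count of smokers to serve as a numerator.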
Numerator data caveats See Interpretation: potential sources of error section.
Denominator data caveats Not applicable.
Methods used to calculate indicator value The multi-level logistic regression model-based method used to produce the 2000-2002 ward and PCO level estimates combined two sets of information. First, the Health Survey for England (HSfE) provided health behaviour data (e.g. smoking). Second, the 2001 Census and other administrative data sources provided information about the characteristics of the area in which informants lived. A statistical model was used to examine the relationships between the healthy lifestyle behaviours and area characteristics. The final model was then used to calculate the prevalence estimate of smoking for all wards and PCOs in England. The ward-level characteristics associated with increased propensity for a person to smoke were: a higher proportion of females aged 25-34; a higher proportion of residents of working age who had a limiting longstanding illness; an Index of Multiple Deprivation (IMD) decile ranking of 5 or 8 (where 1 is least deprivation and 10 greatest deprivation); being located in the North West region; and a relatively higher education, skills and training deprivation score. The ward-level characteristics associated with lower propensity for a person to smoke were: a higher proportion of household residents over the age of 16 who were living as a couple; a higher proportion of non-white residents; a higher proportion of residents who were classified as being in managerial and professional occupations; a higher attendance allowance claimant rate; and being located in the South West region.
Small Populations: How Isles of Scilly and City of London populations have been dealt with Isles of Scilly are included with Penwith local authority.  City of London is excluded from the dataset.
Disclosure Control Not applicable.
Confidence Intervals calculation method The following describes the confidence interval calculation method for ward level synthetic estimates. A similar method was applied for local authority level data. Markov Chain Monte Carlo (MCMC) methods, within a Bayesian framework, were used to generate the confidence intervals (actually referred to as ‘credible intervals’) for the synthetic estimates. One of the key differences between Bayesian statistics and traditional (frequentist) methods is that the parameter estimates are treated as random with corresponding (prior) probability distributions; there is no single point estimate for each parameter as there would be for traditional statistical methods. To generate the estimates for parameters, it is necessary to run an iterative procedure (the MCMC procedure) that generates a series of values for each parameter. The sample of values for each parameter can then be used to estimate, for example, the mean and variance for a parameter. The confidence intervals for the synthetic estimates were derived as follows. First, we generated a set of 100,000 values for each parameter using the MCMC procedure in MLwiN. For each ward, we then generated 100,000 feasible values of the synthetic estimate – one for each set of parameter estimates. We then simulated 100,000 estimates of the true measure for the ward by including a random term derived from the area-level variance; this was done by drawing from the normal distribution with zero mean and variance equal to the estimated area-level variance. So, the estimate of the true value based on the rth set of parameters for ward k is the rth feasible value of the synthetic estimate combined with the rth random draw.

The confidence interval (credible interval) for the synthetic estimate for the ward can then be derived directly from this set of feasible true estimates: the 95% credible interval is the range between the 2,500th and 97,500th ordered feasible true estimates (i.e. the 2.5th and 97.5th percentiles).
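The simulation steps can be illustrated with a small Python sketch. It is not the NatCen or MLwiN procedure. The parameter draws are faked with a random number generator rather than taken from an MCMC run, and the single covariate, coefficient values and area-level standard deviation are hypothetical. Adding the random term on the logit scale is also an assumption, made because the model is a logistic regression.

```python
import math
import random


def inv_logit(x):
    """Map a value on the logit scale to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def credible_interval(draws, level=0.95):
    """Percentile interval from simulated values.

    With 100,000 draws and level 0.95 this returns the 2,500th and
    97,500th ordered values (the 2.5th and 97.5th percentiles).
    """
    ordered = sorted(draws)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lower_index = max(int(round(tail * n)) - 1, 0)
    upper_index = min(int(round((1.0 - tail) * n)) - 1, n - 1)
    return ordered[lower_index], ordered[upper_index]


rng = random.Random(0)   # fixed seed so the sketch is repeatable
n_draws = 10_000         # the source describes 100,000 draws
x_k = 0.4                # hypothetical covariate value for ward k
sigma_u = 0.15           # hypothetical area-level standard deviation

feasible_true = []
for _ in range(n_draws):
    # Stand-ins for one MCMC draw of the model parameters.
    intercept = rng.gauss(-1.1, 0.05)
    slope = rng.gauss(0.8, 0.10)
    # Random term: normal, zero mean, area-level variance.
    u = rng.gauss(0.0, sigma_u)
    feasible_true.append(inv_logit(intercept + slope * x_k + u))

lower, upper = credible_interval(feasible_true)
print(f"95% credible interval: {lower:.3f} to {upper:.3f}")
```

The interval is read straight off the ordered simulated values, so it reflects both the spread of the parameter draws and the added area-level random term.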
