Relative risk

In statistics and mathematical epidemiology, relative risk (RR) is the risk of an event (or of developing a disease) relative to exposure. Relative risk is the ratio of the probability of the event occurring in the exposed group to the probability of it occurring in the control (non-exposed) group.


 * $$RR= \frac {p_\mathrm{exposed}}{p_\mathrm{control}} $$

For example, if the probability of developing lung cancer among smokers was 20% and among non-smokers 1%, then the relative risk of cancer associated with smoking would be 20. Smokers would be twenty times as likely as non-smokers to develop lung cancer.
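The definition above can be sketched in a few lines of Python; the group sizes of 1,000 below are invented for illustration, not real epidemiological data:

```python
def relative_risk(events_exposed, n_exposed, events_control, n_control):
    """Ratio of the event probability in the exposed group to that in the control group."""
    p_exposed = events_exposed / n_exposed
    p_control = events_control / n_control
    return p_exposed / p_control

# 20% of smokers vs 1% of non-smokers, as in the lung cancer example.
rr = relative_risk(200, 1000, 10, 1000)
print(rr)  # 20.0
```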

Statistical use and meaning
Relative risk is used frequently in the statistical analysis of binary outcomes where the outcome of interest has relatively low probability. It is thus often suited to clinical trial data, where it is used to compare the risk of developing a disease in people not receiving the new medical treatment (or receiving a placebo) with that in people receiving an established (standard of care) treatment. Alternatively, it is used to compare the risk of developing a side effect in people receiving a drug with that in people not receiving the treatment (or receiving a placebo). It is particularly attractive because it can be calculated by hand in the simple case, but is also amenable to regression modelling, typically in a Poisson regression framework.

In a simple comparison between an experimental group and a control group:


 * A relative risk of 1 means there is no difference in risk between the two groups.
 * An RR of < 1 means the event is less likely to occur in the experimental group than in the control group.
 * An RR of > 1 means the event is more likely to occur in the experimental group than in the control group.

As a consequence of the delta method, the log of the relative risk has a sampling distribution that is approximately normal, with a variance that can be estimated from the number of subjects and the event rate in each group. This permits the construction of a confidence interval (CI) which is symmetric around $$\log(RR)$$, i.e.


 * $$CI = \log(RR)\pm \mathrm{SE}\times z_\alpha$$

where $$z_\alpha$$ is the standard score for the chosen level of significance and SE the standard error. Taking the antilog of the two bounds of the log-CI gives the low and high bounds of an asymmetric confidence interval around the relative risk.
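This interval can be sketched in Python using the usual delta-method standard error for log(RR); the counts and group sizes below are hypothetical:

```python
import math

def rr_confidence_interval(a, n1, c, n2, z=1.96):
    """Approximate CI for RR via a normal approximation on the log scale
    (delta method). a events out of n1 exposed subjects, c out of n2 controls.
    The default z = 1.96 corresponds to a 95% interval."""
    rr = (a / n1) / (c / n2)
    se = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n2)  # SE of log(RR)
    low = math.exp(math.log(rr) - z * se)
    high = math.exp(math.log(rr) + z * se)
    return rr, low, high

rr, low, high = rr_confidence_interval(200, 1000, 10, 1000)
# (low, high) is symmetric around log(rr), hence asymmetric around rr itself.
```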

In regression models, the treatment is typically included as a dummy variable along with other factors that may affect risk. The relative risk is normally reported as calculated for the mean of the sample values of the explanatory variables.

Association with odds ratio
Relative risk is different from the odds ratio, although the odds ratio asymptotically approaches the relative risk for small probabilities. In fact, the odds ratio has much wider use in statistics, since logistic regression, often associated with clinical trials, works with the log of the odds ratio, not relative risk. Because the log of the odds ratio is estimated as a linear function of the explanatory variables, the estimated odds ratio associated with type of treatment would be the same for 70-year-olds and 60-year-olds in a logistic regression model where the outcome is associated with drug and age, although the relative risk might be significantly different. In cases like this, statistical models of the odds ratio often reflect the underlying mechanisms more effectively.
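The point can be illustrated numerically: a fixed odds ratio implies a relative risk that still depends on the baseline risk, via RR = OR / (1 − p₀ + p₀·OR). A sketch, with baseline risks invented for illustration:

```python
def rr_from_or(odds_ratio, p_control):
    """Relative risk implied by a fixed odds ratio at baseline risk p_control."""
    return odds_ratio / (1 - p_control + p_control * odds_ratio)

# The same odds ratio of 2 implies different relative risks at different
# baseline risks (e.g. for younger vs older patients).
print(rr_from_or(2.0, 0.05))  # ~1.90: near the OR when baseline risk is low
print(rr_from_or(2.0, 0.40))  # ~1.43: well below the OR at higher baseline risk
```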

Since relative risk is a more intuitive measure of effectiveness, the distinction is especially important in cases of medium to high probabilities. If action A carries a risk of 99.9% and action B a risk of 99.0%, then the relative risk is just over 1, while the odds associated with action A are almost 10 times higher than the odds with B.
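The figures quoted can be checked directly:

```python
p_a, p_b = 0.999, 0.990

rr = p_a / p_b                  # ~1.009: relative risk just over 1
odds_a = p_a / (1 - p_a)        # ~999
odds_b = p_b / (1 - p_b)        # 99
odds_ratio = odds_a / odds_b    # ~10.1: odds ratio of almost 10
```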

In medical research, the odds ratio is favoured for case-control studies and retrospective studies. Relative risk is used in randomized controlled trials and cohort studies.

In statistical modelling, approaches like Poisson regression (for counts of events per unit exposure) have relative risk interpretations: the estimated effect of an explanatory variable is multiplicative on the rate, and thus leads to a risk ratio or relative risk. Logistic regression (for binary outcomes, or counts of successes out of a number of trials) must be interpreted in odds-ratio terms: the effect of an explanatory variable is multiplicative on the odds and thus leads to an odds ratio.
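The contrast can be sketched numerically. Suppose a fitted coefficient β = log 2 for a binary exposure; the baseline rate and baseline risk below are invented for illustration:

```python
import math

beta = math.log(2.0)  # hypothetical fitted coefficient for a binary exposure

# Poisson (log-link) model: exp(beta) multiplies the rate directly,
# so it is read as a rate ratio / relative risk.
baseline_rate = 0.03
exposed_rate = baseline_rate * math.exp(beta)  # 0.06: rate ratio of 2

# Logistic model: exp(beta) multiplies the odds, not the risk.
p0 = 0.30
odds0 = p0 / (1 - p0)
odds1 = odds0 * math.exp(beta)  # odds ratio of 2
p1 = odds1 / (1 + odds1)
implied_rr = p1 / p0            # ~1.54: well below the odds ratio of 2
```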

Size of relative risk and relevance
In the standard or classical hypothesis testing framework, the null hypothesis is that RR = 1 (the putative risk factor has no effect). The null hypothesis can be rejected in favor of the alternative hypothesis that the factor in question does affect risk if the confidence interval for RR excludes 1.

Critics of the standard approach, notably including John Brignell and Steven Milloy, believe published studies suffer from unduly high type I error rates, and have argued for an additional requirement that the point estimate of RR should exceed 2 (or, if risks are reduced, be below 0.5), and have cited a variety of statements by statisticians and others supporting this view. The issue has arisen particularly in relation to debates about the effects of passive smoking, where the effect size appears to be small (relative to smoking), and exposure levels are difficult to quantify in the affected population.

In support of this claim, it may be observed that, if the base level of risk is low, a small proportionate increase in risk may be of little practical significance. (In the case of lung cancer, however, the base risk is substantial.)
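The point about base rates is simple arithmetic: the same relative risk corresponds to very different absolute risk increases. The numbers below are invented for illustration:

```python
def absolute_increase(p_control, rr):
    """Absolute risk difference implied by a relative risk at a given base risk."""
    return p_control * rr - p_control

# An RR of 2 doubles the risk in both cases, but the practical impact differs:
print(absolute_increase(0.0001, 2.0))  # one extra case per 10,000 exposed
print(absolute_increase(0.10, 2.0))    # one extra case per 10 exposed
```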

In addition, if estimates are biased by the exclusion of relevant factors, the likelihood of a spurious finding of significance is greater if the estimated RR is close to 1. In his paper "Why Most Published Research Findings Are False" John Ioannidis writes, "The smaller the effect sizes in a scientific field, the less likely the research findings are to be true. [...] research findings are more likely true in scientific fields with [...] relative risks 3–20 [...], than in scientific fields where postulated effects are small [...] (relative risks 1.1–1.5)." "if the majority of true genetic or nutritional determinants of complex diseases confer relative risks less than 1.05, genetic or nutritional epidemiology would be largely utopian endeavors."

In assessing results claiming an increase of relative risk arising from exposure to a hazard, statisticians and epidemiologists consider a range of factors including the size of the effect, the level of statistical significance, whether the results arise from a clinical trial or observation of a population, the significance of possible confounding factors, the extent to which results have been replicated, and the presence or absence of a biomedical model for the claimed effect. Important confounding factors for observational studies of health risks include tobacco smoking and social class.

While few statisticians accept the general claim that a relative risk level greater than 2 is required before a finding of increased risk can be accepted, most agree with this view in relation to findings from single studies without biomedical support. Marcia Angell of the New England Journal of Medicine has stated:


 * As a general rule of thumb we are looking for a relative risk of three or more [before accepting a paper for publication], particularly if it is biologically implausible or if it's a brand-new finding.

The arguments of Milloy, Brignell and others, put forward in relation to passive smoking, have been criticised by epidemiologists. Their approach to epidemiology, involving efforts to discredit individual studies rather than addressing the evidence as a whole, was described in the American Journal of Public Health:

A major component of the industry attack was the mounting of a campaign to establish a "bar" for "sound science" that could not be fully met by most individual investigations, leaving studies that did not meet the criteria to be dismissed as "junk science." The campaign also included attempts to characterize relative risks of 2 or less as highly questionable and not amenable to investigation by epidemiologic methods.

These efforts were largely abandoned by the tobacco industry when it became clear that no independent epidemiological organization would agree to the standards proposed by Philip Morris et al.

Statistical significance (confidence) and relative risk
Whether a given relative risk can be considered statistically significant depends on the relative difference between the conditions compared, the amount of measurement, and the noise associated with the measurement of the events considered. In other words, the confidence that a given relative risk is non-random (i.e. not a consequence of chance) depends on the signal-to-noise ratio and the sample size.

Expressed mathematically, the confidence that a result is not due to random chance is given by the following formula by Sackett:

$$\mathrm{confidence} = \frac{\mathrm{signal}}{\mathrm{noise}} \times \sqrt{\mathrm{sample\ size}}$$

For clarity, the dependence of confidence on noise, signal and sample size can also be presented in tabular form (table not reproduced here).

In words, the confidence is higher when the noise is lower, the sample size is larger, or the effect size (signal) is increased. The confidence in a relative risk value (and its associated confidence interval) therefore does not depend on effect size alone: if the sample size is large and the noise is low, even a small effect size can be measured with great confidence. Whether a small effect size is considered important depends on the context of the events compared.
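Sackett's heuristic can be sketched directly; the numbers below are arbitrary, chosen only to show that a weak signal can still be measured with high confidence if the sample is large enough:

```python
import math

def sackett_confidence(signal, noise, sample_size):
    """Sackett's heuristic: confidence = (signal / noise) * sqrt(sample size)."""
    return (signal / noise) * math.sqrt(sample_size)

# Same modest signal-to-noise ratio, increasing sample size:
print(sackett_confidence(1.0, 2.0, 100))    # 5.0
print(sackett_confidence(1.0, 2.0, 10000))  # 50.0
```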

In medicine, small effect sizes (reflected by small relative risk values) are usually considered clinically relevant (if there is great confidence in them) and are frequently used to guide treatment decisions. A relative risk of 1.10 may seem very small, but over a large number of patients it will make a noticeable difference. Whether a given treatment is considered a worthy endeavour depends on the risks, benefits and costs.