Website and R Package for Computing E-Values

Epidemiology. Author manuscript; available in PMC 2019 Sep 1.

Published in final edited form as:

Epidemiology. 2018 Sep; 29(5): e45–e47.

To the Editor

Observational studies often attempt to address questions related to causation. However, even with statistical adjustment for a number of measured confounders, residual unmeasured confounding may still compromise causal conclusions. New methods help quantify evidence strength for causality in the possible presence of unmeasured confounding through a new measure called the E-value¹,². The E-value is defined as the minimum strength of association on the risk ratio scale that an unmeasured confounder would need to have with both the exposure and the outcome, conditional on the measured covariates, to fully explain away a specific exposure–outcome association. As discussed below, the E-value makes no assumptions about whether the unmeasured confounders are binary, continuous, or categorical, about how they are distributed, or about the number of confounders, and it can be applied to several common outcome types in observational research. To facilitate these sensitivity analyses, we provide an R package (“EValue”³) and an online E-value calculator (https://mmathur.shinyapps.io/evalue/) that compute E-values for a variety of outcome measures, including risk ratios, odds ratios, rate ratios, risk differences, hazard ratios, and standardized mean differences².

Suppose we have an observational study with a binary exposure E, a binary outcome D, and a possible binary unmeasured confounder U (though note that, as discussed below, the E-value applies more generally). Two sensitivity parameters jointly determine the maximum bias that could result from unmeasured confounding in the estimated relative risk of the exposure on the outcome. First, to characterize the strength of association between the unmeasured confounder and the outcome, let RR_UD be the relative risk of the outcome comparing subjects with versus without the unmeasured confounder (U = 1 vs. U = 0), taken as the maximum over unexposed (E = 0) or exposed (E = 1) subjects. Second, to characterize the extent to which the prevalence of the unmeasured confounder is unbalanced between the exposed and the unexposed, let RR_EU be the relative risk of U = 1 versus U = 0, comparing the exposed to the unexposed group and again conditional on any measured confounders. Then, if the two sensitivity parameters RR_UD and RR_EU are taken to be equal, the E-value is the minimum value for both associations that would be capable of attenuating the observed association to the null¹. The E-value can be calculated for an observed risk ratio (denoted RR) by $\text{E-value} = RR + \sqrt{RR \times (RR - 1)}$. If the original risk ratio is below 1, then one first takes the inverse before applying the E-value formula. This formula can also be used for hazard ratios or odds ratios with outcomes that are rare at the end of follow-up. For hazard ratios or odds ratios with a common outcome at the end of follow-up, or with continuous outcomes, approximate E-values can still be obtained through various transformations¹.
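The formula above can be sketched in a few lines of Python. This is an illustrative sketch only and is not the code of the EValue package; the function names are hypothetical. The two conversion helpers use approximations that we take from reference 1 (square root of the odds ratio for a common outcome, and exp(0.91 × d) for a standardized mean difference d); they are stated here as assumptions, not derived in this letter.

```python
import math


def evalue_rr(rr: float) -> float:
    """E-value for an observed risk ratio, for attenuation to the null (RR = 1)."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    # A risk ratio below 1 is inverted before the formula is applied.
    if rr < 1:
        rr = 1.0 / rr
    return rr + math.sqrt(rr * (rr - 1.0))


def approx_rr_from_common_or(odds_ratio: float) -> float:
    """Assumed approximation (reference 1): RR ~ sqrt(OR) when the outcome is common."""
    return math.sqrt(odds_ratio)


def approx_rr_from_smd(d: float) -> float:
    """Assumed approximation (reference 1): RR ~ exp(0.91 * d) for a standardized mean difference."""
    return math.exp(0.91 * d)


print(round(evalue_rr(2.0), 2))   # 3.41
print(evalue_rr(0.5) == evalue_rr(2.0))  # protective and harmful ratios are treated symmetrically
print(evalue_rr(approx_rr_from_common_or(4.0)))  # E-value after converting a common-outcome OR
```

For a rare outcome, an odds ratio or hazard ratio can be passed to `evalue_rr` directly, as described in the text.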

For example, with an observed risk ratio of RR = 1.33, we can calculate an E-value of $1.33 + \sqrt{1.33 \times (1.33 - 1)} \approx 2$. This E-value indicates that if there were an unmeasured confounder that (1) doubled the risk of the outcome among either the unexposed or the exposed (RR_UD = 2) and (2) were also twice as prevalent among the exposed as among the unexposed (RR_EU = 2), this amount of confounding could suffice to completely “explain away” the observed association, but weaker confounding could not. Although this interpretation of the two sensitivity parameters is given in the context of a binary unmeasured confounder, the E-value applies without modification to multiple, potentially categorical, confounders by considering the maximum risk ratio comparing any two categories of the unmeasured confounder(s). With a continuous confounder, the interpretations of the parameters RR_UD and RR_EU are slightly different, but the mathematical form of the E-value is unchanged².
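The claim that RR_UD = RR_EU = 2 suffices while weaker confounding does not can be checked numerically. The sketch below assumes the joint bounding factor RR_UD × RR_EU / (RR_UD + RR_EU − 1) from reference 2, which is not written out in this letter; the function name is hypothetical.

```python
def bounding_factor(rr_ud: float, rr_eu: float) -> float:
    """Maximum factor by which confounding of this strength could inflate a risk ratio.

    Assumed form of the joint bounding factor from reference 2.
    """
    return rr_ud * rr_eu / (rr_ud + rr_eu - 1.0)


observed_rr = 1.33

# Confounding at the E-value: dividing by the bound brings the estimate to (just below) 1.
at_evalue = observed_rr / bounding_factor(2.0, 2.0)

# Slightly weaker confounding: the most conservative corrected estimate stays above 1.
weaker = observed_rr / bounding_factor(1.9, 1.9)

print(at_evalue <= 1.0)  # True
print(weaker > 1.0)      # True
```

The bounding factor at (2, 2) is exactly 4/3, which is why the rounded estimate of 1.33 yields an E-value of approximately, rather than exactly, 2.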

Ideally, we believe, E-values would be reported routinely for observational studies to better characterize evidence strength for causality above and beyond the presence of a “statistically significant”, but potentially spurious, association¹,²,⁴. The E-value could be reported for both the point estimate and the corresponding confidence interval limit that is closer to the null; these E-values represent the minimum confounding strength, respectively, capable of attenuating the point estimate to the null and capable of attenuating the confidence interval such that it includes the null¹. Last, it is easy to calculate E-values for values of a true effect other than the null of RR = 1 to assess how much confounding would be needed to move the estimate to any other value. For example, as part of a holistic assessment of the scientific importance of the true causal effect in an observational study, one could choose an effect size threshold below which a causal effect might be considered too weak to be meaningful, as informed by the specific scientific context. Then, one could assess the E-value capable of attenuating the observed association to this small, non-null effect size threshold or, alternatively, the E-value capable of increasing a near-null result to one that is of meaningful size in the given scientific context².
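A minimal sketch of these two extensions follows. It assumes, following references 1 and 2, that the E-value for a non-null true value is obtained by applying the same formula to the ratio of the observed estimate to that value, and that the E-value for the confidence interval is 1 when the interval already contains that value. The function names and the numerical inputs are hypothetical and chosen only for illustration.

```python
import math


def evalue(est: float, true: float = 1.0) -> float:
    """Minimum confounding strength needed to move `est` to `true` (risk ratio scale)."""
    if est <= 0 or true <= 0:
        raise ValueError("risk ratios must be positive")
    ratio = est / true
    # Work with a ratio of at least 1, mirroring the inversion rule for RR < 1.
    if ratio < 1:
        ratio = 1.0 / ratio
    return ratio + math.sqrt(ratio * (ratio - 1.0))


def evalue_ci(est: float, lo: float, hi: float, true: float = 1.0) -> float:
    """E-value for the confidence limit closer to `true`; 1 if the interval contains `true`."""
    if not lo <= est <= hi:
        raise ValueError("expected lo <= est <= hi")
    if lo <= true <= hi:
        return 1.0
    limit = lo if est > true else hi
    return evalue(limit, true)


# Hypothetical estimate and interval, for illustration only.
print(evalue(2.0))                  # point estimate, null of RR = 1
print(evalue_ci(2.0, 1.5, 2.67))    # confidence limit closer to the null
print(evalue(2.0, true=1.2))        # attenuation to a non-null threshold of 1.2
```

Because less confounding is needed to reach a threshold of 1.2 than to reach the null, the third value is smaller than the first.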

In addition to calculating E-values, the R package we provide also produces plots visualizing the maximum possible bias in the observed association as a function of RR_EU and RR_UD. In contrast to existing code², the present R package handles more outcome types and can characterize the minimum confounding strength capable of attenuating the observed association to a non-null threshold of scientific importance. Additionally, we provide a freely available website (https://mmathur.shinyapps.io/evalue/) to easily compute E-values without requiring coding or familiarity with R.
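To give a sense of what such a plot conveys, the sketch below tabulates pairs (RR_EU, RR_UD) whose joint bias is just large enough to explain away an observed risk ratio. It is not the package's plotting code. It assumes the bounding factor RR_UD × RR_EU / (RR_UD + RR_EU − 1) from reference 2, solved for RR_UD, and the function name is hypothetical.

```python
import math


def rr_ud_on_boundary(rr_obs: float, rr_eu: float) -> float:
    """RR_UD needed, at a given RR_EU, for the joint bounding factor to equal rr_obs.

    Assumes rr_obs > 1. Returns infinity when rr_eu is too small for any RR_UD to suffice.
    """
    if rr_obs <= 1:
        raise ValueError("expected an observed risk ratio above 1")
    if rr_eu <= rr_obs:
        return math.inf
    return rr_obs * (rr_eu - 1.0) / (rr_eu - rr_obs)


rr_obs = 1.33
for rr_eu in (1.5, 2.0, 3.0, 5.0):
    # Each printed pair lies on the curve separating sufficient from insufficient confounding.
    print(rr_eu, round(rr_ud_on_boundary(rr_obs, rr_eu), 2))
```

The E-value is the point on this curve where the two parameters are equal; a weaker association with the exposure must be offset by a stronger association with the outcome, and vice versa.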

Acknowledgments

Source of funding: MM was supported by National Defense Science and Engineering Graduate Fellowship 32 CFR 168a. PD was supported by IES Grant R305D150040 from the Institute for Education Science and DMS grant 1713152 from the National Science Foundation. CAR received salary support from McGill University’s Department of Epidemiology, Biostatistics, and Occupational Health. TVW was supported by NIH grant ES017876. The funders had no role in the design, conduct, or reporting of this research.

We thank Jaffer Zaidi for serving as a pilot tester.

Footnotes

Conflicts of interest: The authors declare that they have no conflicts of interest.

Reproducibility: No data analyses were conducted.

Reproducibility

All code for the R package and website is publicly available (https://github.com/mayamathur/evalue).

References

1. VanderWeele TJ, Ding P. Sensitivity analysis in observational research: Introducing the E-value. Annals of Internal Medicine. 2017;167(4):268–274.

2. Ding P, VanderWeele TJ. Sensitivity analysis without assumptions. Epidemiology. 2016;27(3):368.

3. Mathur MB, Ding P, VanderWeele TJ. Package ‘EValue’, version 1.0.0. 2017.

4. Localio AR, Stack CB, Griswold ME. Sensitivity analysis for unmeasured confounding: E-values for observational studies. Annals of Internal Medicine. 2017;167(4):285–286.