Do proton pump inhibitors increase mortality? A systematic review and in‐depth analysis of the evidence

Abstract Proton pump inhibitors (PPIs) were primarily approved for short‐term use (2 to 8 weeks). However, PPI use continues to expand. PPIs are widely believed to be safe; we reviewed emerging evidence on increased mortality with long‐term PPI use. Our 2016 systematic PPI drug class review found that mortality was not reported as an outcome in randomized controlled trials (RCTs) that directly compared different PPIs. As a follow‐on, we sought more recent and comprehensive data on PPI harm outcomes from research syntheses. A search was conducted from January 2014 to January 2020. We searched MEDLINE, EMBASE, and Cochrane Central for evidence from systematic reviews (SRs) and primary studies reporting all‐cause mortality in adults treated with a PPI for any indication (duration >12 weeks) compared to patients without PPI treatment (no use, placebo, or H2RA use). Two independent investigators assessed study eligibility, synthesized evidence, and assessed the quality of the included studies. Data on all‐cause mortality were sought, analyzed, critically examined, and interpreted herein. From 1304 articles, one SR was identified that reported on all‐cause mortality. The SR pooled three observational studies with data to 1 year: odds ratio, 95% confidence interval (CI) 1.53‐1.84. An RCT, the COMPASS (Cardiovascular Outcomes for People Using Anticoagulant Strategies) trial, with data to 3 years, reported a hazard ratio (HR) of 1.03, 95% CI 0.92‐1.15. The US Veterans Affairs cohort study, using a large national dataset with data to 10 years, found an HR of 1.17 (95% CI 1.10‐1.24) and a number needed to harm (NNH) of 22. The most common causes of death were cardiovascular and chronic kidney diseases, with excess deaths of 15 and 4 per 1000 patients, respectively, over the 10‐year period. Harms arising from real‐world medication use are best evaluated using a pharmacovigilance “convergence of proof” approach using data from a variety of sources and various study designs.
Given that most PPI indications for use recommended a treatment duration of less than 12 weeks, it seems clear that PPIs were significantly overused in older patients. The median exposure time to PPI ranged from 1 to 4.6 years. Signals of serious harms, including increased mortality with long‐term PPI use, are reported in observational studies. The COMPASS trial findings are not inconsistent with contemporaneous findings from observational studies. The COMPASS RCT was unlikely to detect an increase in mortality because the trial was not powered to detect this outcome. The potential increase in mortality in older patients associated with prolonged PPI exposure needs to be conveyed to health professionals. Clinicians and patients may be able to reverse the relentless expansion of long‐term PPI exposure by reviewing indications and considering potential harms as well as benefits.


| INTRODUCTION
Prescription proton pump inhibitors (PPIs) are primarily approved for short-term use (2 to 8 weeks) for peptic ulcer disease (PUD), reflux esophagitis, and nonulcer dyspepsia. 1 However, PPI use continues to expand. In British Columbia, Canada for example, 64% of adults ≥age 65 with a prescription for a PPI in 2018 had a cumulative exposure exceeding 2 years; 44% exceeded 5 years.
Long-term PPI use is approved by regulators and/or endorsed by gastroenterologists for gastric bleeding, severe esophagitis or Barrett's esophagus, and prevention of gastric damage associated with the adverse effects of other drugs. These indications account for only a small proportion of long-term PPI use in Canada. 2,3 Although patient populations with indications for long-term use are worthy of study, this group is out of scope for our review.
Unnecessary overuse has not been identified as a concern in this population.
The short-term benefits of PPIs as a drug class are not disputed. 2-5 However, the belief that the positive net benefit to harm ratio with short-term treatment extends to long-term use (greater than 12 weeks) has been challenged by postmarket analyses. 6-9 Health Canada 10 has issued warnings for a number of adverse events and drug interactions that were not recognized when the first

A number of professional associations and independent drug bulletins recommend reducing PPI exposure and provide tools for deprescribing. 11,12 Encouraging restraint has yet to achieve a measurable impact on long-term PPI prescribing for the common indications. Is the evidence of harms sufficient that we should intensify efforts to constrain new prescriptions and to deprescribe for long-term users?
In a systematic review conducted by our group in 2016, we reported on the comparative effectiveness of PPIs, benefits, and harms, as well as evidence for considering deprescribing. 2,3 In many clinical settings, we do not know whether the benefits of long-term PPI use outweigh the harms. Harms were underreported in RCTs that directly compared different PPIs. Mortality, serious adverse events, and withdrawal due to adverse events were not reported. 2,3 We found no long-term, head-to-head comparative RCTs that were specifically designed to monitor adverse effects of PPIs.
Recent evidence from a clinical trial 13 has raised doubts about a growing consensus from observational studies and systematic reviews (SRs) of observational studies that PPI exposure is associated with an increased risk of death and that the risk increases with increased exposure. 14-16 Therefore, the aim of this review was to summarize and critically examine evidence from SRs and primary studies reporting all-cause mortality.

| Searching strategy
In our 2016 systematic review, mortality was not reported as an outcome in RCTs that directly compared different PPIs. 2,3 An updated search was performed by an information specialist from January 2014 (the date of our last comprehensive search and PPI class review) to January 2020 in the following databases: PubMed, MEDLINE, EMBASE (through Ovid), the Cochrane Central Register of Controlled Trials (CENTRAL), and the Cochrane Database of Systematic Reviews. The following combination of Medical Subject Headings (MeSH) and keywords was used for database searching: proton pump inhibitors or PPI and adverse events or esomeprazole or pantoprazole or omeprazole or rabeprazole or lansoprazole and any indications. Alternative spellings and abbreviations of the above keywords were also considered, with no limitation on the language or the publishing date.

KEYWORDS
long-term use, medication harms, mortality, mortality studies, pharmacovigilance, proton pump inhibitors, systematic reviews

| Inclusion criteria
Systematic reviews (with or without meta-analysis) or primary studies were included that met the following criteria: (Cochrane "PICOS"

Primary studies were sought and included if they had not been available by the SR search cutoff dates, up to January 2020.

| Data extraction and synthesis
Two investigators (MBE and CJG) independently selected eligible systematic reviews. Disagreement was resolved by discussion with another investigator (VM). Data on all-cause mortality were sought, synthesized, analyzed, critically examined, and interpreted from SRs and primary studies. We extracted odds ratios (ORs), relative risks (RRs), or hazard ratios (HRs) with 95% CIs from the included studies. We did not reanalyze the authors' original data or conduct new meta-analyses by combining studies.
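As a minimal sketch of how such extracted estimates could be recorded, the snippet below stores the two all-cause mortality hazard ratios quoted in the abstract together with their 95% CIs. The `EffectEstimate` class, its field names, and the `log_se` helper are illustrative assumptions, not the authors' extraction form. The standard error is back-calculated under the assumption that each interval is symmetric on the log scale with a normal critical value of 1.96.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class EffectEstimate:
    """One extracted effect estimate (hypothetical record layout)."""
    study: str
    measure: str   # "OR", "RR", or "HR"
    point: float
    ci_low: float
    ci_high: float

    def log_se(self) -> float:
        # Approximate standard error of log(estimate), assuming the 95% CI
        # is symmetric on the log scale (an assumption, not a reported value).
        return (math.log(self.ci_high) - math.log(self.ci_low)) / (2 * 1.96)


# Values as quoted in the abstract of this review.
compass = EffectEstimate("COMPASS RCT", "HR", 1.03, 0.92, 1.15)
va_cohort = EffectEstimate("VA cohort", "HR", 1.17, 1.10, 1.24)

for est in (compass, va_cohort):
    print(f"{est.study}: {est.measure} {est.point} "
          f"({est.ci_low}-{est.ci_high}), log SE ~ {est.log_se():.3f}")
```

Under these assumptions the larger cohort yields the smaller log-scale standard error, which is one way to see why its interval is narrower than that of the trial.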

| Harm outcome hierarchy
The Therapeutics Initiative analyses all available evidence for harms according to a consistent hierarchy of harm outcomes, ranked by clinical importance starting with all-cause mortality, cause-specific mortality, total serious adverse events, and other adverse events.
For this study we limited our reporting of findings to all-cause mortality and cause-specific mortality.

| RESULTS
Three recent studies reporting on all-cause mortality with PPI use were identified that met our inclusion criteria, each having a different study design. One SR 17 was identified that specifically included all-cause mortality as an outcome in its protocol. An RCT and a longitudinal cohort study that were published after the date of our search for SRs met our inclusion criteria. Figure 1 shows the selection process and provides the reasons why some articles were excluded.

| Appraisal of included studies
The included studies used different study designs and can be evaluated using the three sets of quality criteria appropriate for their respective design. Such heterogeneity is appropriate for considerations of medication harm in the real world. Each publication has been peer reviewed and meets sufficient criteria to be valid for the research question, methods and findings presented.
The representativeness across all included studies is problematic because the populations were primarily Caucasian, which may limit generalizability to other populations. It is known that up to 20% of Asians (vs 3% of Caucasians) have low CYP2C19 enzyme activity and are therefore poor metabolizers of PPIs, with a doubling of plasma PPI levels and therefore greater exposure. 17,21 Each study also has limitations within its respective study design. These are highlighted here.

| Observational studies
Common to all the included studies is the challenge of misclassification of drug use. Prescription data may not truly reflect drug con-

confounder that also caused them to be prescribed a PPI. Healthy populations were not, however, well represented in the study populations of any of the analyses, and each demonstrated that the control population was comparable on comorbidities as well as characteristics such as age and sex.
The pooled analysis SR by Shiraev 2018 included studies if they "examined death or atherosclerotic events (including myocardial infarct, stroke, or peripheral arterial events), and compared a group exposed to PPIs with a control group (not exposed to PPIs), in any group of patients". 17 The search cutoff date of

| RCT
There are several reasons for cautious interpretation of the COMPASS trial results. Serious harms such as cardiovascular disease, kidney disease, or cancer develop over relatively long time periods because of their slow onset. The duration of exposure and follow-up and consistency with the VA cohort means that serious but relatively rare harm may not have been detected. The authors recognized that low event rates for some outcomes limited their ability "to exclude a modest risk increase" from pantoprazole. Of the three included studies, the COMPASS trial was the only one with a potential conflict of interest due to funding of the research and investigators. There is also the chal-

| Interpretation
The VA cohort study found an excess of deaths in its sample that included 12 times as many participants as the COMPASS RCT and follow-up that was over three times longer. Furthermore, the Shiraev 2018 SR pooled analysis was heavily weighted by a study using the Danish national level administrative data collected from routine care transactions. It would be difficult to create an RCT of an adverse drug event on the scale of either study.
The median exposure to PPI was longer than in the COMPASS RCT (4.6 years vs <3 years). With only 3 years of follow-up, COMPASS did not have statistical power to detect 10% increases in risk for several of its prespecified outcomes. For example, COMPASS's point estimate hazard ratio of 1.17 (0.94 to 1.45) for chronic kidney disease was similar to the VA's hazard ratio of 1.16 (1.01 to 1.33) for acute kidney injury.
In the COMPASS RCT, pantoprazole increased enteric infections (mostly C difficile) with an odds ratio of 1.33 (1.01-1.75) and an absolute risk increase of 0.4%. However, the incidence rates for most serious harms, such as cardiovascular disease, hospitalizations, chronic kidney disease, or dementia, were consistently higher among pantoprazole users than in the placebo group. The COMPASS authors admit this limitation, yet conclude, perhaps inappropriately, that PPIs "are not associated with any long-term harm." 13

Cause-specific mortality data are consistent with the overall data analysis as well as with findings of SRs that report on cardiovascular 17 and kidney disease. 25 This consistency within the study is an indication of the VA study's internal validity. The study is also consistent with other data, 17,25 which is an indication of external validity: the findings may be applicable beyond this study population. The Bradford-Hill criteria provide another framework used to

increased gastric microbiota and small intestine bacterial overgrowth, reduced immune response, tubular-interstitial inflammation, increased bone turnover, and accumulation of amyloid in the brain. 27 PPI use was also significantly associated with renal insufficiency even after adjusting for acute interstitial nephritis (AIN) in the Xie et al, 2019 VA cohort analysis. AIN is a drug reaction known to be caused by PPIs. 28 SRs of observational studies have found PPIs to be associated with chronic kidney disease (CKD). 29 The finding of continued renal insufficiency even after adjustment suggested the existence of unrecognized AKI or chronic latent renal injury. 18
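The absolute risk increase of 0.4% reported for enteric infections in COMPASS can be restated as a number needed to harm (NNH), using the standard relation NNH = 1 / absolute risk increase. The sketch below applies that relation in both directions. The function name is an illustrative assumption, and the absolute risk increase implied by the NNH of 22 quoted in the abstract is a back-calculation made here, not a figure reported in this text.

```python
def number_needed_to_harm(absolute_risk_increase: float) -> float:
    """Return NNH = 1 / ARI, with ARI given as a proportion (0.004 = 0.4%)."""
    if absolute_risk_increase <= 0:
        raise ValueError("NNH is only defined for a positive risk increase")
    return 1.0 / absolute_risk_increase


# COMPASS enteric infections: absolute risk increase of 0.4%.
nnh_enteric = number_needed_to_harm(0.004)
print(f"NNH for enteric infection: about {round(nnh_enteric)}")

# VA cohort all-cause mortality: NNH of 22 (from the abstract).
# Inverting the same relation gives the implied excess deaths per 1000.
implied_ari_per_1000 = 1000 / 22
print(f"Implied excess deaths per 1000: about {implied_ari_per_1000:.0f}")
```

Read this way, roughly 250 patients would need to be treated with pantoprazole over the trial period for one additional enteric infection, whereas the cohort NNH corresponds to roughly 45 excess deaths per 1000 patients over its follow-up.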

| DISCUSSION
An evidence-based approach to interpretation of clinical trial data turns first to the hierarchy of evidence. RCTs are higher on the hierarchy than observational studies because randomization provides powerful protection against known and unknown confounders that observational studies do not. Given that the COMPASS findings were from an RCT and found no increase in all-cause mortality and the observational studies found an increase in all-cause mortality with PPI use, the hierarchy of evidence points to the interpretation that the RCT findings should be accepted and the observational findings understood as being most likely explained by an unidentified confounder. 30

Pharmacovigilance ("the science and activities relating to the detection, assessment, understanding, and prevention of adverse effects or any other drug-related problem" 31) challenges the use of the hierarchy of evidence for evaluating drug risk: [N]one of the methods … (experimental data, clinical trials, spontaneous notifications, case-control studies, cohort studies and data mining) should be consid-

and analytic strategies. 33 Older observational studies that use datasets to look for associations between the independent and dependent variables using factorial analyses are primitive by comparison.
Clinicians are correct in being skeptical of associations with an OR or HR of less than 2, given the vulnerability of such analyses to unrecognized confounders. In evaluating clinical data,

Additional features like propensity score analysis and using physician preferences as a calibration check on the analysis also provide important safeguards.
The 95% CI provides a more accurate representation of reality than a single point estimate. COMPASS researchers interpret their findings to "suggest PPI therapy is safe for up to a median of 3 years." 13
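To illustrate why the interval matters more than the point estimate, the sketch below compares the all-cause mortality intervals quoted in the abstract for COMPASS (HR 1.03, 95% CI 0.92-1.15) and the VA cohort (HR 1.17, 95% CI 1.10-1.24). The helper `ci_overlap` is an illustrative assumption, and an overlap check of this kind is an informal reading aid rather than a formal test of heterogeneity.

```python
from typing import Optional, Tuple

Interval = Tuple[float, float]


def ci_overlap(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the range shared by two intervals, or None if they are disjoint."""
    low = max(a[0], b[0])
    high = min(a[1], b[1])
    return (low, high) if low <= high else None


compass_ci: Interval = (0.92, 1.15)   # COMPASS RCT, HR 1.03
va_ci: Interval = (1.10, 1.24)        # VA cohort, HR 1.17

overlap = ci_overlap(compass_ci, va_ci)
print("Shared range of hazard ratios:", overlap)

# Does the VA point estimate fall inside the COMPASS interval?
va_point = 1.17
va_point_inside_compass = compass_ci[0] <= va_point <= compass_ci[1]
print("VA point estimate within COMPASS CI:", va_point_inside_compass)
```

The two intervals share the range 1.10 to 1.15, so the COMPASS interval does not exclude a mortality increase of 10% to 15%, even though the VA point estimate of 1.17 lies just above the COMPASS upper bound.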

| CONCLUSIONS AND IMPLICATIONS FOR PRACTICE
Our interpretive framework supports the principle that no one study or pooled analysis of studies can adequately determine whether the harm associated with drug therapy is real. A convergence of proof using data from various sources and study designs is needed.
Considering the data from the COMPASS RCT together with the pharmaco-epidemiology observational studies leads us to conclude that on balance, it is likely that long-term PPI use increases all-cause mortality in older adults. Given the high prevalence of long-term PPI utilization, this message needs to be conveyed to health professionals and patients.

ACKNOWLEDGMENTS
The authors thank Cochrane Hypertension for the help pro-

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available in the supplementary material of this article. The supplementary file shows a bibliography sorted by harm type.