Volume 57, Issue 5 p. 616-621

A comparison of three different sources of data in assessing the frequencies of adverse reactions to amiodarone

Yoon K. Loke (corresponding author), Sheena Derry, Jeffrey K. Aronson

Department of Clinical Pharmacology, University of Oxford, Radcliffe Infirmary, Oxford OX2 6HE, UK

Correspondence: Yoon K. Loke, Department of Clinical Pharmacology, University of Oxford, Radcliffe Infirmary, Oxford OX2 6HE, UK. Tel: +44 1865 22 4524 Fax: +44 1865 79 1712 E-mail: [email protected]
First published: 03 February 2004



Aims

To compare the frequencies of adverse drug reactions (ADRs) to amiodarone from three separate datasets: (i) a meta-analysis of clinical trials, (ii) spontaneous reports published in medical journals, and (iii) spontaneous reports sent to the World Health Organization (WHO).

Methods

We classified the ADRs into eight categories, based on the site involved, and built a rank order of the ADRs by category (from most to least commonly reported) for each dataset. We also calculated the relative proportions for all eight ADR categories within each dataset, in order to be able to compare the distributions of ADR frequencies: we assigned an index value of 1.0 to the frequency of respiratory toxicity in each set and calculated values for the other ADRs relative to respiratory toxicity.

Results

Thyroid disorders were the most commonly reported ADRs in the WHO dataset. In contrast, published case reports showed a preponderance of respiratory disorders, while in the meta-analysis cardiac conduction problems were the most frequent. The rank orders of ADRs differed among the three datasets, as did the index values of specific ADR categories with respect to the respiratory category.

Conclusions

The distributions of ADR rank order and relative frequencies are dissimilar among the three datasets, as each dataset compiles information in a different way. Nevertheless, each dataset has its own specific strengths, and all three should be used together in obtaining a complete picture of a drug's safety profile. Important therapeutic and regulatory decisions should not simply be based on one source of data.


Introduction

Information on adverse drug reactions (ADRs) often comes from case reports in scientific journals, and it is likely that the numbers of such reports influence people's perceptions of the frequencies of ADRs. Regulatory authorities also have an interest in maintaining spontaneous reporting systems to detect and record ADRs, and marketing approval may be withdrawn as a result of anecdotal reports [1]. However, anecdotal case reports are low in the evidence hierarchy compared with randomized controlled trials, and it is unclear whether the numbers of reports of particular ADRs are in any way indicative of their true incidence. We have investigated the question of whether the ADRs with the highest attributable rates in randomized controlled trials are also the ones that are most often reported in anecdotal journal articles and to the World Health Organization (WHO) International Drug Monitoring Programme.

Direct comparison of the frequencies of case reports of ADRs with their frequencies in trials is not possible, because we cannot, in the absence of data on the number of people exposed, calculate ADR rates from spontaneous reports. However, we can build a rank order of ADRs (from most to least commonly reported) and compare this with a rank order based on data from trials. We can also calculate relative frequencies of ADRs within each dataset and compare them across datasets; for instance, if there are three times as many case reports of one particular ADR compared with another, we can ask whether the attributable rates for these ADRs in a clinical trial also reflect this three-fold difference.

In order to address these questions, we have compared the frequencies of ADRs of amiodarone from three separate sources:

  • a meta-analysis of data from placebo-controlled randomized trials;

  • data from the WHO Collaborating Centre for International Drug Monitoring in Uppsala;

  • and published case reports in scientific journals.

We specifically chose amiodarone as the example to study because it has been available for many years and its adverse effects are numerous and well described. This made it easy to obtain large amounts of data from the sources of interest; this is particularly true for the clinical trials, as the investigators would have been well aware of the adverse effects they should have been looking for. Our study was specifically designed to compare the frequencies of well-established reactions to amiodarone as a test of the different data sources; we did not intend to study the role of different sources in uncovering new or unusual adverse reactions of a freshly launched drug.



Methods

We classified adverse effects into eight broad groups:

  1. Thyroid gland: hyperthyroidism or hypothyroidism.

  2. Respiratory, e.g. pneumonitis, bronchiolitis, pulmonary fibrosis.

  3. Liver, e.g. acute or chronic hepatitis, cirrhosis.

  4. Skin, e.g. photosensitivity, abnormal pigmentation, rash.

  5. Nervous system, e.g. peripheral neuropathy, optic neuropathy, ataxia, movement disorders.

  6. Heart conduction disturbances, e.g. bradycardia, heart block.

  7. Gastrointestinal, e.g. pancreatitis, nausea and vomiting.

  8. Eyes, e.g. corneal deposits, visual impairment.


The adverse effects of amiodarone were evaluated in two separate meta-analyses published in 1997 [2, 3]. We therefore performed an updated search of Medline for trials of amiodarone published in 1997–2002. We used the following inclusion criteria:

  • randomized, placebo controlled trials;

  • double-blind;

  • a planned treatment duration of 1 year or more.

We also checked all the trials listed in the bibliography of a 1998 meta-analysis of the therapeutic effects of amiodarone [4].

Adverse events were reported using different terminology in different papers. We therefore used data (when available) that were described as follows, in this order of preference: severe ADRs, withdrawals due to adverse effects, ADRs judged to be related to treatment, minor ADRs, any other format.

Statistical analysis

We analysed pooled Peto odds ratios and heterogeneity using RevMan 4.1. The number needed to harm (NNH), with 95% confidence intervals (95% CI), was calculated by a computer program, Visual Rx, which applied the calculated odds ratio to the pooled control event rate. The NNH is defined as the number of patients who need to be treated with a drug for a given period of time for one additional patient to be harmed by an ADR. It reflects the absolute frequency of the adverse reaction (the lower the NNH the more frequent the adverse reaction).
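
The step of applying an odds ratio to a control event rate can be sketched as follows. This is a minimal illustration of the standard odds-ratio-to-risk conversion, not a description of how Visual Rx works internally; the function name `nnh_from_odds_ratio` is hypothetical, and the example inputs are the respiratory odds ratio and placebo event count reported in Table 1.

```python
def nnh_from_odds_ratio(odds_ratio: float, control_event_rate: float) -> float:
    """Convert an odds ratio and a control event rate (CER) into an NNH.

    Assumption (not taken from the paper): the event rate on treatment (EER)
    is obtained by applying the odds ratio to the control odds, and
    NNH = 1 / (EER - CER).
    """
    if not 0.0 < control_event_rate < 1.0:
        raise ValueError("control_event_rate must lie strictly between 0 and 1")
    if odds_ratio <= 1.0:
        raise ValueError("this sketch only handles odds_ratio > 1 (harm)")
    cer = control_event_rate
    # Risk on treatment implied by the odds ratio and the control risk
    eer = odds_ratio * cer / (1.0 - cer + odds_ratio * cer)
    # Absolute risk increase, inverted to give the number needed to harm
    return 1.0 / (eer - cer)


# Respiratory toxicity in Table 1: odds ratio 1.78, placebo events 42/2066
respiratory_nnh = nnh_from_odds_ratio(1.78, 42 / 2066)
print(round(respiratory_nnh))
```

Because the NNH depends on the control event rate as well as the odds ratio, two ADRs with the same odds ratio can have very different NNHs.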

Published case reports

We used a sensitive search string (‘amiodarone and case-report in TG’) to search Medline (1966–2000), with no language restrictions. Eligible articles were checked and classified by a clinical pharmacologist (Y.K.L.) experienced in the assessment of ADRs. We selected articles that reported therapy with oral amiodarone, based on the following criteria:

  • indexed as a case report in a MeSH subheading in Medline,

together with at least one of the following:

  • the title or abstract clearly indicated that the purpose of the article was to provide a case report of an adverse drug reaction thought to be due to oral amiodarone,


  • the article's format was consistent with that of a suspected adverse drug reaction report to the Committee on Safety of Medicines, UK (Yellow Card System), in that it provided details on:

      • age, sex, and clinical features (e.g. history and test results) for each individual patient;

      • medication history (when started, dosage, and use of concomitant therapies);

      • chronology and clinical features leading to the diagnosis of the adverse event;

      • a statement of the outcome.

We excluded articles describing:

  • the effects of a particular therapy for dealing with the adverse event;

  • the role of specific radiological or laboratory evaluations in patients with the adverse event;

  • a quiz or medical education series;

  • children (under 18 years);

  • drug interactions;

  • a case series, which we have defined as comprising more than three patients.

We chose not to analyse data from case series, because we wanted to compare data from the highest level in the evidence hierarchy (a meta-analysis) with data from the lowest level (case reports). We consider case series to be at a different level in the evidence hierarchy, above that of isolated reports. Moreover, we were concerned about instances of duplicate publication that would seriously undermine the reliability of adding up the numbers from case series. In the course of this work, we found examples of single case reports that were subsequently republished in larger series of patients. We chose reports of up to three cases as our cut-off point, as this enabled us to check each case in detail to confirm that there was no duplication.

WHO pharmacovigilance

The WHO International Drug Monitoring Programme in Uppsala has been collecting spontaneous reports on the adverse reactions to amiodarone since 1982, and data were kindly supplied to us by Professor I. R. Edwards. We extracted information on the total numbers of reports for the 50 adverse reactions with the highest cumulative counts in the middle of 2001.

Comparing the relative frequencies of ADRs

We calculated the relative proportions for all eight ADR categories within each of the three available datasets, in order to be able to compare the distributions of ADR frequencies. We defined the frequency of respiratory toxicity as the index, with a value of 1.0. The relative frequency of each other ADR was calculated by dividing its frequency by the frequency of respiratory toxicity. For instance, if there were 12 published cases of eye toxicity and 117 cases of respiratory toxicity, the relative frequency of eye toxicity would be 12/117 = 0.10. For the clinical trials, where a higher NNH means a less frequent reaction, the ratio is inverted: if the NNH for respiratory toxicity was 65 and the NNH for eye toxicity was 138, then eye toxicity had an index value of 65/138 = 0.47.
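
The two worked examples above can be reproduced with a short sketch. The function names are hypothetical and chosen only for illustration; the second function assumes that frequency is proportional to 1/NNH, which is why the respiratory NNH appears in the numerator.

```python
def index_from_counts(count: int, respiratory_count: int) -> float:
    """Index value for report counts: count relative to respiratory count."""
    return count / respiratory_count


def index_from_nnh(nnh: float, respiratory_nnh: float) -> float:
    """Index value for trial data.

    Assumption: frequency is proportional to 1 / NNH, so the ratio of
    frequencies is the inverse of the ratio of NNHs.
    """
    return respiratory_nnh / nnh


# Eye toxicity: 12 published cases against 117 respiratory cases
print(f"{index_from_counts(12, 117):.2f}")
# Eye toxicity: NNH of 138 against a respiratory NNH of 65
print(f"{index_from_nnh(138, 65):.2f}")
```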


Results

Meta-analysis of ADRs

Because we had strict selection criteria, we identified only six randomized controlled trials of amiodarone in which ADRs were reported in detail [5–10]. Data were available on just over 4000 patients, who took an average dose of 230 mg day−1 for 22 months. Specific monitoring for ADRs was described in all six trials; heart, lung, and thyroid abnormalities were the ones most commonly sought. Further details of the six trials are available at http://www.uea.ac.uk/~wm107/AmiodaroneADR.html.

There was no significant heterogeneity in the meta-analysis. The pooled Peto odds ratios and NNHs (based on an average treatment duration of about 22 months) for each category of ADR are shown in Table 1, as are the relative frequencies of each ADR compared with the index value of 1.0 for respiratory toxicity. The relative frequencies are reproduced in Table 2 for comparison with the data obtained by the other methods.

Table 1. Meta-analysis of adverse effects of amiodarone in six randomized controlled trials

Site of ADR | Number of trials reporting | Number affected in amiodarone group | Number affected in placebo group | Peto odds ratio (95% CI) | P-value | Number needed to harm (95% CI) | Relative frequency
--- | --- | --- | --- | --- | --- | --- | ---
Heart | 6 | 98/2087 | 43/2066 | 2.40 (1.69, 3.41) | < 0.00001 | 36 (21, 36) | 1.80
Thyroid | 5 | 74/2038 | 12/2014 | 4.19 (2.72, 6.45) | < 0.00001 | 54 (32, 99) | 1.20
Respiratory | 6 | 76/2087 | 42/2066 | 1.78 (1.23, 2.58) | 0.002 | 65 (33, 219) | 1.00
Nervous system | 5 | 48/1782 | 19/1758 | 2.40 (1.48, 3.89) | 0.0004 | 68 (33, 196) | 0.96
Liver | 5 | 30/1782 | 14/1758 | 1.95 (1.07, 3.56) | 0.03 | 134 (50, 1809) | 0.49
Gastrointestinal tract | 5 | 66/2038 | 47/2014 | 1.32 (0.90, 1.94) | 0.15 | 138 (48, ∞) | 0.47
Eyes | 4 | 23/1702 | 6/1676 | 3.05 (1.46, 6.38) | 0.003 | 138 (53, 610) | 0.47
Skin | 6 | 24/2087 | 12/2066 | 1.93 (1.00, 3.72) | 0.05 | 187 (65, ∞) | 0.35
Table 2. Relative frequencies (index values) of adverse drug reactions (ADRs) of amiodarone at eight sites, measured in three different ways

Site of ADR | Meta-analysis: index value | WHO cases: total number | WHO cases: index value | Published case reports: total number | Published case reports: index value
--- | --- | --- | --- | --- | ---
Heart | 1.80 | 474 | 0.44 | 13 | 0.11
Thyroid | 1.20 | 1829 | 1.70 | 51 | 0.44
Respiratory | 1.00 | 1078 | 1.00 | 117 | 1.00
Nervous system | 0.96 | 964 | 0.89 | 54 | 0.46
Liver | 0.49 | 832 | 0.77 | 31 | 0.26
Eyes | 0.47 | 216 | 0.20 | 12 | 0.10
Gastrointestinal tract | 0.47 | 526 | 0.49 | 2 | 0.02
Skin | 0.35 | 1124 | 1.04 | 31 | 0.26

Case reports

The search string yielded 622 hits, of which 13 could not be assessed because they were not available from the British Library (n = 3) or were in a language that none of us was able to translate (n = 10). Of the remaining 609 reports, 357 fulfilled the inclusion criteria for analysable case reports. The absolute numbers in each classification category are shown in Table 2, together with their relative frequencies, compared with the index value of 1.0 for respiratory toxicity.

WHO Reports

There were 640 types of ADRs reported for amiodarone. We analysed the 50 with the highest cumulative counts as of mid 2001 and classified them into our eight categories. Table 2 shows both their absolute frequencies and their relative frequencies compared with the index value of 1.0 for respiratory toxicity.

The rank order of ADR index values, according to data source, is summarized in Figure 1.

Figure 1. Rank order of adverse drug reactions, according to data source
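
The rank orders can be derived directly from the index values in Table 2. The sketch below is an illustration only: the names `INDEX_VALUES` and `rank_order` are hypothetical, the values are transcribed from Table 2, and tied categories (for example eyes and gastrointestinal tract in the meta-analysis) are simply left in table order rather than given a shared rank.

```python
# Index values transcribed from Table 2 (respiratory toxicity = 1.00).
SITES = ["Heart", "Thyroid", "Respiratory", "Nervous system",
         "Liver", "Eyes", "Gastrointestinal tract", "Skin"]

INDEX_VALUES = {
    "Meta-analysis": [1.80, 1.20, 1.00, 0.96, 0.49, 0.47, 0.47, 0.35],
    "WHO cases": [0.44, 1.70, 1.00, 0.89, 0.77, 0.20, 0.49, 1.04],
    "Published case reports": [0.11, 0.44, 1.00, 0.46, 0.26, 0.10, 0.02, 0.26],
}


def rank_order(values: list[float]) -> list[str]:
    """Return the ADR sites ordered from highest to lowest index value.

    Python's sort is stable, so tied sites keep their order from SITES.
    """
    by_site = dict(zip(SITES, values))
    return sorted(by_site, key=by_site.get, reverse=True)


for source, values in INDEX_VALUES.items():
    print(f"{source}: {' > '.join(rank_order(values))}")
```

Printing the three orderings side by side makes it easy to see that the top-ranked category differs in every dataset.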


Discussion

In considering the safety or tolerability of a particular drug, the following questions are important:

  • Which adverse effects are the most likely?

  • How often is a particular adverse effect likely to occur?

  • How much more (or less) common is one adverse effect relative to another? For example, what is the likelihood of a serious adverse effect, such as pulmonary toxicity, compared with a less severe one, such as photosensitivity?

Case reports are widely published in scientific journals [11], but our results suggest that evaluating the numbers of spontaneous reports may not help in answering these questions. There was little agreement between the figures obtained from published reports in journals and those of a formal pharmacovigilance monitoring system. For example, in published reports, pulmonary toxicity was the most common ADR, with roughly twice as many reports (117) as the next most frequent categories, nervous system disorders (54) and thyroid disorders (51). In contrast, WHO monitoring figures suggested that thyroid disorders are actually most common, with 1.7 times as many reports as respiratory toxicity (1829 vs. 1078). Furthermore, the data from these two sources bore little relation to the results of the meta-analysis.

Given these major differences between the three data sources, we need to decide on which we should rely. Do any of the three methods get close to the truth? Here, it may be helpful to review the strengths and weaknesses of each data source.


Meta-analysis is the statistical combination of data from different studies. In this instance, the data come from six randomized clinical trials that were double-blind and placebo-controlled. This minimizes selection and detection bias, especially in the three trials in which safety monitors were blinded to treatment assignation (the reports of the other three trials did not state whether such blinding took place or not). The other major strength of this meta-analysis is that most of the trial protocols included systematic monitoring of cardiac, liver, thyroid, and respiratory adverse effects. This suggests that the clinical trials data are relatively clean and unbiased, and should yield reliable estimates of ADR frequencies.

The weakness of the meta-analysis is that the trial reports provided only a categorized summary of the data, and some details were not clearly reported. The types of ADRs reported varied among the trials, and little mention was made of how investigators monitored adverse effects on the skin, gastrointestinal tract, and eyes. The bulk of the patients were treated for only 2 years or less, and they were mainly middle-aged men. Furthermore, all the patients had significant heart disease, and most would have been closely monitored by specialists in hospital settings. The applicability of such data to unselected, wider populations may be limited. Moreover, only about 2000 patients were treated with amiodarone in the six trials analysed here. These studies are unlikely to yield any useful information on rare and/or previously unrecognized adverse effects.

Spontaneous reports

In contrast, spontaneous reporting, either as published cases or to the WHO, can potentially provide data on a broad spectrum of patients. Moreover, ADRs that occur after prolonged exposure can be detected through spontaneous reports, as can ADRs that are extremely rare. However, the absence of a control group and the lack of a denominator mean that the attributable rate of the ADR cannot be calculated. Under-reporting is a recognized problem [12], as are the many biases inherent in the clinician's decision of whether or not to publish or submit for publication a spontaneous report of a particular ADR. Furthermore, the WHO data are not homogeneous in origin, as they are compiled from many National Centres, which may use different criteria for acceptance of a report (e.g. accepting reports from pharmaceutical companies).

Another influence on the numbers of published reports of an ADR stems from the editorial decision on whether the ADR is considered new, or unusual, or interesting, or is associated with a useful teaching point. The profusion of case reports of pulmonary toxicity might reflect the true frequency, but might simply be a reflection of the life-threatening effects of the ADR, coupled with the journalistic merit in having pictures of the chest radiograph to accompany the narrative. In contrast, because bradycardia and conduction disturbances are well-recognized adverse effects of antiarrhythmic drugs, neither clinicians nor journal editors may feel the need to report them. The high frequency of bradycardia and conduction disturbances in clinical trials reflects intensive cardiological monitoring in these patients, all of whom had significant heart disease.

Spontaneous reporting systems are generally thought to be of most value in the ‘signal detection’ of ADRs, and of limited use in assessing frequencies of ADRs. Our findings support this view, especially as there were considerable differences in the relative frequency distributions of ADRs, even between the two types of spontaneous reports. We have also shown that even the rank order of frequencies of reports of different types of adverse drug reactions does not relate to the rank order of the frequencies of those adverse reactions in randomized controlled trials.

Clinicians and safety monitors may arrive at erroneous conclusions about frequencies of ADRs if they base their judgements on how many cases they identify through searching Medline or a spontaneous reporting database, as is commonly done. While clinical trials may not be helpful in detecting all adverse effects, we believe that the randomized, double-blinded, placebo-controlled nature of the data helps in accurately determining the frequencies of specific well-recognized ADRs in well-defined populations. Furthermore, systematic review of trials yields improved estimates of the rates of adverse effects. In order to increase the chances of performing systematic reviews, authors of reports of clinical trials should be encouraged to report adverse reactions in detail [13].

In contrast, spontaneous case reports are of most value in identifying new or unexpected adverse events that require further evaluation, and in generating hypotheses. Furthermore, case reports may help in demonstrating diagnostic techniques, elucidating or suggesting mechanisms or methods of management, or teaching and reminding [14]. They can also give some idea of the range of ADRs in unselected populations, outside hospitals, during routine therapy.

Although case reports and case series may be relatively low in the hierarchy of evidence, Jenicek notes that they have a very important role as the ‘first line of evidence’ [15]. While it may be difficult to judge the validity of a single isolated report, a systematic study of a series of case reports can provide compelling evidence that a particular event merits further attention. For instance, cerivastatin had to be withdrawn after it was found that 31 patients had died from severe rhabdomyolysis [16]. Indeed, Vandenbroucke rightly points out that case reports and case series are important sources of new ideas in medicine [17].

Systematic review and spontaneous reporting should therefore complement each other in helping us to paint a complete picture of a drug's safety profile.

Acknowledgements

We thank Professor Ralph Edwards for providing data and for helpful comments on an early draft of the manuscript. The information presented here does not represent the opinion of the World Health Organization. The tabulated data from the Uppsala Monitoring Centre are not homogeneous with respect to origin or likelihood that the pharmaceutical product caused the adverse drug reaction. S.D. was supported by a research grant from the Sir Jules Thorn Trust, a charitable foundation.