‘VALIDATION’ OF THE DATA MINING APPROACH

Critics of data mining can reasonably suggest that, with all the possible
relationships in a huge database, many medicine-adverse reaction associations
will occur by chance, even though they seem to be significantly associated. The
Bayesian methodology used by the UMC can take account of the size of the
database in assigning probabilities, and its current implementation is
optimised for the WHO database. The aim of the quantitative analysis is
hypothesis generation, and most false positives can be expected to be
identified as such in the clinical review. Even so, one must be as sure as
possible that national centres and reviewers are not provided with a huge
amount of useless probabilistic information. On the other hand, it is clear
that finding signals early will necessarily entail some false positives.
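
The critics' point about chance associations can be made concrete with a small
calculation. The sketch below is only an illustration under stated assumptions:
the function name, the number of combinations and the false-positive rates are
all hypothetical, and it describes naive per-combination testing rather than
the Bayesian approach used by the UMC.

```python
def expected_chance_flags(n_null_combinations: int, alpha: float) -> float:
    """Expected number of truly unrelated combinations flagged by chance.

    Assumes each of the n_null_combinations is tested at the same
    false-positive rate alpha; by linearity of expectation the result
    is simply their product.
    """
    if n_null_combinations < 0:
        raise ValueError("n_null_combinations must be non-negative")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return n_null_combinations * alpha


# Hypothetical figures, not taken from the WHO database.
N_COMBINATIONS = 1_000_000
for alpha in (0.05, 0.001):
    flagged = expected_chance_flags(N_COMBINATIONS, alpha)
    print(f"alpha={alpha}: about {flagged:,.0f} chance flags expected")
```

Even a strict threshold leaves a sizeable number of chance flags when the
number of combinations is large, which is why the clinical review step matters.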

Determining the performance of the BCPNN is a difficult task because there is
no ‘gold standard’ for comparison. There are also different definitions of the
term ‘signal’. According to the definition used in the WHO programme, a signal
is essentially a hypothesis together with data and arguments, and it is not
only uncertain but also preliminary in nature: the situation may change
substantially over time (Edwards and Biriell, 1994; Meyboom et al., 1997).

In practice, signals can only be validated by increasing recognition with time,
and what is meant by ‘recognition’ is problematic in itself. To gain more
insight into both the BCPNN performance and the ‘validation’ problem in
general, we felt we would achieve a reasonable estimate of the predictive power
of the BCPNN tool by checking historical associations identified by the BCPNN
against standard reference sources (Lindquist et al., 2000). Martindale has
worldwide coverage, recognition and wide availability, and was used as a
standard for well-known, recognised ADRs. The US Physicians’ Desk Reference
(PDR), though not international, gives very recent information on drugs. It has
a comprehensive ADR listing, generally more inclusive than that of Martindale.
However, PDR includes suspected adverse reactions, whether substantiated or
not. We therefore considered an ADR listed in PDR an indication of a possible
drug–ADR relationship.

Two main studies of the performance of the BCPNN were reported in the same
paper (Lindquist et al., 2000). The first study tested the predictive value of
the BCPNN in new signal detection against reference literature sources
(Martindale’s Extra Pharmacopoeia from 1993 and 2000, and the Physicians’ Desk
Reference from 2000). In the study period (the first quarter of 1993), 107
drug-adverse reaction combinations referring to new drugs were highlighted as
new positive associations by the BCPNN. Fifteen drug-adverse reaction
combinations on new drugs became negative BCPNN associations in the study
period.

The BCPNN method detected signals with a positive predictive value of 44% and a
negative predictive value of 85%. Seventeen as yet unconfirmed positive
associations could not be dismissed with certainty as false positives.
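
As a reminder of how these two figures are defined, the sketch below computes
positive and negative predictive values from a 2x2 classification. Here a
‘true positive’ is taken to mean a positive association later listed in a
reference source, and a ‘true negative’ a negative association that is not
listed. The function name and the counts are hypothetical: the counts were
chosen only so that the arithmetic is easy to follow, and they are not the
counts from the study.

```python
def predictive_values(true_pos: int, false_pos: int,
                      true_neg: int, false_neg: int) -> tuple[float, float]:
    """Return (PPV, NPV) for a 2x2 classification.

    PPV = TP / (TP + FP): share of flagged combinations that were confirmed.
    NPV = TN / (TN + FN): share of non-flagged combinations that stayed
    unconfirmed.
    """
    flagged = true_pos + false_pos
    not_flagged = true_neg + false_neg
    if flagged == 0 or not_flagged == 0:
        raise ValueError("each group must contain at least one combination")
    return true_pos / flagged, true_neg / not_flagged


# Illustrative counts only (assumption, not study data).
ppv, npv = predictive_values(true_pos=44, false_pos=56,
                             true_neg=85, false_neg=15)
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")
```

Note that both values depend on how the reference standard classifies each
combination, so any weakness in that standard carries over directly.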

The second study compared the new BCPNN with the results of the former
signalling procedure. Of the 10 drug-adverse reaction signals produced by the
former signal detection system from data sent out for review during the study
period, 6 were also identified by the BCPNN. These 6 associations have all had
a more than ten-fold increase in reports, and 4 of them have been included in
the reference sources. The remaining 4 signals, which were not identified by
the BCPNN, had a small increase, or none, in the number of reports and are not
listed in the reference sources.

The length of time chosen for the retrospective check against the literature
was not arbitrary, but based on the assumption that 7 years would be enough for
ADRs to be included in the reference sources, allowing for the maximum
reporting for new drugs to have taken place (the Weber effect). We know,
however, that one new association appeared in Martindale between 1999 and 2000,
so 7 years still may not be long enough. Publishing delay must be considered in
the use of these reference sources, but this is now minimised by their
availability online.

The use of our selected literature sources as a ‘gold standard’ is open to
debate. The literature is not intended as an early signalling system, and uses
many sources for its information other than the WHO database: the biases
affecting inclusion and exclusion of ADR information may therefore be very
different. Factors such as those affecting the differential reporting to WHO
and the inclusion of new information in the reference sources will have an
effect which is independent of the performance of the BCPNN. The BCPNN is run
every quarter, and we selected just one quarter: since the BCPNN is used in
continuous analysis, the specificity and sensitivity are subject to necessary
time-dependent changes in classification of ‘positives’ and ‘negatives’. It is
difficult to consider something as a ‘non-association’ because of this time
dependency, and it is clear that there is an asymmetry in the effect of time on
our results. This asymmetry can be explained as follows.

Exceptionally high reporting of an ADR-to-product combination, which causes the
combination to stand out from the background of the whole database, will cause
any other product-to-ADR combination containing the product or ADR to stand out
slightly less. It is not common for alterations in the background to
significantly alter the status of an association. On the other hand, it is more
common for the reporting of a particular ADR and medicinal product to increase,
at a rate broadly related to the incidence of the ADR, to the point where the
combination becomes an association. Publicity about an ADR may affect this rate
dramatically, but this by no means invalidates the association; it only
complicates its interpretation. Another asymmetry is that the negative
associations are a selection of all non-associations. Using them assumes that
definite negative associations represent all non-associations, though it is
clear that some non-associations will become positive associations in time.
Thus a non-association can be either a combination of an ADR term with a
medicinal product which is not a positive association and remains stable, or
one which is statistically a negative association at a high probability.
Considering all this, we have in this study defined the inverse of a positive
association as a definite negative association. This again shows the difficulty
of evaluating a signalling system.
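
The first asymmetry described above, in which heavy reporting of one
combination makes other combinations involving the same product stand out
slightly less, can be illustrated with a raw observed-to-expected ratio. This
is a minimal sketch under stated assumptions: the ratio is a simplified
disproportionality measure and not the BCPNN calculation itself, and the
function name and all report counts are hypothetical.

```python
def observed_to_expected(n_both: int, n_drug: int,
                         n_adr: int, n_total: int) -> float:
    """Ratio of observed co-reporting to that expected under independence.

    observed = share of all reports naming both the drug and the ADR
    expected = (share naming the drug) * (share naming the ADR)
    """
    if min(n_drug, n_adr, n_total) <= 0:
        raise ValueError("n_drug, n_adr and n_total must be positive")
    observed = n_both / n_total
    expected = (n_drug / n_total) * (n_adr / n_total)
    return observed / expected


# Hypothetical database: drug A with ADR Y, before a reporting surge.
before = observed_to_expected(n_both=40, n_drug=1_000,
                              n_adr=2_000, n_total=100_000)

# 100 extra reports arrive for drug A with a *different* ADR (X):
# drug A's total and the database total grow, the A-Y count does not.
after = observed_to_expected(n_both=40, n_drug=1_100,
                             n_adr=2_000, n_total=100_100)

print(f"A-Y ratio before surge: {before:.2f}, after surge: {after:.2f}")
```

In this toy example the A-Y ratio falls although nothing about the A-Y reports
themselves has changed, which is the background effect the text describes.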

An assumption was made that a substantial increase in the number of reports of
an association over the period indicated ongoing clinical interest in that
association. More reports may be seen as support for the validity of the
associations, though there is often a tendency for ADRs that are becoming well
known to be reported more anyway.

An obvious limitation of any quantitative analysis of spontaneous reporting
data is the dependence on the terminology used for recording adverse reactions.
There are only a few examples of work done on any of the medical terminologies
in use or proposed to determine their relative value in searching for new drug
signals (Brown, 2002).

Although we found that the use of the BCPNN gave a 44% positive predictive
value and a high negative predictive value of 85%, the normal methods for
assessing the power of a method are difficult to apply to the BCPNN, for the
reasons given above. It is for this reason that ‘validation’ is placed in
quotation marks in the title of this section. The BCPNN is not a panacea for
drug safety monitoring. The drug–ADR combinations which reach significance do
so only in comparison with the background experience of 3+ million case
reports. This is particularly important for commonly reported ADRs, which,
however serious, would not reach significance until the quantitative experience
for a drug and such an ADR is excessive. We have stressed (Lindquist et al.,
2000) that the BCPNN has its limitations and is not a substitute for expert
review, but has a place particularly where large volumes of data are involved.
It is reassuring, however, that all signals identified in the previous system
that went on to become frequently reported in the WHO database were also
identified in the retrospective BCPNN analysis.

On the other hand, the BCPNN has the power to analyse signals further. We are
developing its use for looking at complex variables and in unsupervised pattern
recognition, to see whether parameters such as gender, age or other drug use
increase the strength of association, and whether ‘syndromes’ of reported terms
are present (Orre et al., 2005). However, as with any subdivision of data, a
very large amount of data is necessary initially to attain statistical
significance in subsets. This is a major advantage of using the large pooled
WHO database, and we are trying to maximise this potential.