Emerging Lessons From The Drug Effectiveness Review Project

Peter J. Neumann

Author affiliation: Center for the Evaluation of Value and Risk in Health at the Institute for Clinical Research and Health Policy Studies at Tufts–New England Medical Center

Abstract

The Drug Effectiveness Review Project (DERP) is an alliance of fifteen states and two private organizations that have pooled resources to synthesize and judge clinical evidence for drug-class reviews. The experience shines a bright light on challenges involved in implementing an evidence-based medicine process to inform drug formulary decisions: When should evidence reviewers accept surrogate markers and assume therapeutic class effects? How open and participatory should review procedures be? Should reviewers consider cost-effectiveness information? What is the appropriate role of the public sector in judging evidence? The DERP illustrates that attempts to undertake evidence-based reviews involve not only the methods themselves, which continue to evolve, but also questions of organization, process, and leadership.

An initiative to synthesize and judge clinical evidence about drugs, which started in Oregon, has now spread to fifteen states.

Since 2003, health policy innovators in Oregon have been leading a determined effort—the Drug Effectiveness Review Project (DERP)—to apply principles of evidence-based medicine (EBM) to inform drug formulary decisions. In some ways, the initiative evokes Oregon’s earlier plan to prioritize medical treatments for Medicaid recipients. In a similar fashion, it pushes established health policy boundaries; it shifts the debate from one focused on financing and costs to one focused on the benefits provided; and it has generated an uproar in some quarters. Moreover, like the Oregon Health Plan (OHP), the new program is spearheaded by John Kitzhaber, the state’s former governor and a longtime health policy maverick.

Unlike the OHP, however, the DERP has gained traction elsewhere in the United States. As of early 2006, fifteen states and two private organizations were participating in the project. In this paper I discuss the DERP experience and the debate surrounding it.

The Drug Effectiveness Review Project

Systematic reviews take center stage.

Public and private health care payers have long used preferred drug lists (PDLs) to steer patients to selected products; however, evidence reviews for drugs tended to lack scientific rigor and transparency.1 Rising prescription drug spending in the 1990s focused new attention on the evidence: Did the benefits of the new drugs warrant their costs? A movement emerged to systematically synthesize and scrutinize the evidence.

Creation of the DERP.

The idea of formalizing drug reviews found one of its most ardent adherents in Kitzhaber, the governor of Oregon (1995–2003), who had long pushed his state to finance Medicaid expansions in part by restricting the benefit package to health services with demonstrated value. The notion of subjecting therapeutic drug classes to closer scrutiny was a natural extension.

In 2001, researchers at Oregon Health and Science University (OHSU), which houses an Agency for Healthcare Research and Quality (AHRQ)–designated Evidence-based Practice Center (EPC), began conducting reviews of several therapeutic drug classes for the Oregon Medicaid program. Officials in Idaho and Washington State soon began drawing upon these reviews.2

In 2003 Kitzhaber, newly out of office, created the Center for Evidence-based Policy within OHSU to oversee the newly created DERP. The idea was to invite other states and private organizations to help shape and use drug-class reviews and to share project costs. The DERP commenced its reviews in November 2003, with ten member organizations, and has since expanded to include seventeen participants: fifteen states (Alaska, Arkansas, California, Idaho, Kansas, Michigan, Minnesota, Missouri, Montana, New York, North Carolina, Oregon, Washington, Wisconsin, and Wyoming) and two nonprofits (the California HealthCare Foundation and the Canadian Agency for Drugs and Technologies in Health).3

Governance.

Contrary to some impressions, the DERP is not run by “Oregon” (although former state officials and OHSU researchers have been instrumental in its creation and sustenance). Rather, the program is managed by its varied participants. According to its staff leadership, all DERP participants (Oregon is but one member) are involved in key decisions, such as which drug classes to review.4 Participating organizations all contribute the same amount—$96,600 per year for three years—to finance the $4.2 million project. Importantly, participants retain local authority for interpreting DERP reports, developing PDLs, and negotiating prices. The Center for Evidence-based Policy staffs and manages the project, facilitates collaboration among members, and coordinates external communications.5

The process for reviewing evidence.

In selecting which therapeutic categories to review, DERP participants give priority to certain types of classes: those accounting for a large share of pharmacy budgets; those consisting of multiple drugs; those with substantial off-label use; and those with recent additions of costly drugs.

The evidence reviews themselves are conducted by one of three AHRQ-designated EPCs, one based at OHSU and the others at RAND in southern California and at the University of North Carolina (although to date the OHSU EPC has conducted about 75 percent of the reviews). EPC reviewers have emphasized that the process for synthesizing evidence continues to evolve.6 In reviewing a drug class, reviewers first work with DERP participants and outside experts to frame questions such as these: How should classes be defined? Which indications should be examined? What comparator drugs are appropriate?

The DERP Web site emphasizes that researchers then examine medical literature databases and solicit input from content experts and pharmaceutical companies.7 Reviewers exclude articles if they fail prespecified criteria (for example, if they have inadequate control groups or poor retention rates). EPCs consider only those clinical studies that drug manufacturers are willing to have made public. Moreover, per a decision by DERP participants, the EPCs consider clinical evidence only and do not take evidence on cost-effectiveness into account.

At least two reviewers abstract data from articles. Draft reports are peer reviewed by external experts selected by the EPC. The final report contains a judgment about the quality of the evidence and about whether meaningful clinical differences exist across drugs in a class. Reviews are updated every seven to twenty-four months.8

DERP officials have emphasized that they encourage an explicit, transparent, public process with multiple opportunities for communication and input.9 Before finalizing reports, for example, DERP staffers post drafts on the DERP Web site for two weeks for public comment. Comments received are then made available to the public upon request. Final reports are posted on the DERP Web site.10

Criticism of the DERP.

The DERP has drawn strong criticism from various quarters, notably from the pharmaceutical industry, which has lobbied state legislators against the project, but also from some patient advocacy groups and professional societies, including the National Mental Health Association (NMHA), the National Alliance on Mental Illness (NAMI), the American Psychiatric Association, and the American Medical Association.11

The criticism centers most generally on allegations that the project is a thinly veiled cost containment exercise that restricts access to important therapies. Critics contend that the DERP gives cash-strapped Medicaid programs and other organizations political cover to justify not paying for expensive new drugs. Former Pharmaceutical Research and Manufacturers of America (PhRMA) chief Alan Holmer has written:

The program…has led to the development of more formulary restrictions reducing patients’ access to medicines, according to a Kaiser Family Foundation report. The program distorts “evidence-based medicine,” to create a veil behind which government officials and some managed care organizations justify restrictions on patients’ access to health care in order to reduce short-term drug costs. The Oregon program fails to take into account adequately individual patients’ medical needs and ways to contain health care costs overall in the long- and short-term.12

More specifically, critics argue that DERP reviewers tend to assume that all drugs within a therapeutic class are equivalent (and to encourage participants to steer patients to the lowest-price drug in the class), which they say ignores important differences in medications’ effectiveness and tolerability. Drug industry officials have complained that DERP reviews tend to conclude that drugs are equivalent, simply because there is an absence of data from randomized controlled trials (RCTs) showing the superiority of one drug versus another in the class.13

Critics have singled out DERP reviewers’ proclivity to favor evidence from RCTs to the exclusion of observational studies and other data sources. Detractors also question the manner in which evidence appraisers compare disparate trials of competing within-class products. Furthermore, some critics contend that DERP reviews do not promote the true “value” of drugs in any meaningful sense, because reviews do not consider cost-effectiveness formally.

Some observers have also criticized the review process, arguing that it should allow for more transparency and opportunity for input and feedback. They point out that although DERP officials post draft reports for public comment, they do not actually publish the comments received, nor do they usually submit final reports for publication in peer-reviewed journals.14 Detractors also argue that the review process should involve a broader range of physician-specialists with relevant expertise, as well as patient advocates, and that it should not exclude experts with industry ties.15

Finally, some of the criticism of the DERP is actually leveled at the local pharmacy and therapeutics (P&T) committees that use the reports, not at the DERP per se. Drug industry officials have argued that opportunity for public input during the P&T committee process is limited. The NMHA has stated:

Although NMHA believes in the value of objective review and analysis that the Center strives for, there are serious implications for how the Center’s recommendations are interpreted at the state and local levels and ultimately, how these recommendations may be used to achieve political objectives that restrict access to medications.16

DERP reviewers respond vigorously to these charges. They emphasize that it is simply not true that reports ignore nonrandomized evidence, pointing to numerous reports that include such information. They also note that complaints about the DERP’s exclusion of cost-effectiveness are disingenuous, because the drug industry has fought hard against the inclusion of economic factors. Finally, they point out that the process undertaken by DERP participants is far more transparent and rigorous than processes used elsewhere.17

Influence of the DERP.

Despite the intensity of the debate, there is scant evidence on the DERP’s actual impact (although an evaluation is under way). Different participants have sometimes come to different formulary decisions using the same EPC reports.18 A recent Henry J. Kaiser Family Foundation report found important differences in how four state Medicaid programs use EPC reports.19 Some states (such as Washington) use reports as the main source of clinical evidence for formulary development, while others (such as North Carolina) use them as one of many inputs. Moreover, some nonparticipating organizations use published EPC reports, which makes evaluation of the project’s influence challenging.

The reach of the DERP appears to be growing. The DERP has been reviewing or re-reviewing twenty-six drug classes over a three-year span ending September 2006. Other states and organizations are considering participation.20 DERP officials expect the project to continue beyond the three-year contractual period now in place. They are also discussing the option of conducting additional reviews, including class-versus-class reviews for selected conditions and reviews of clinical guidelines.21

Participating organizations have used DERP reports not only for Medicaid coverage decisions but also to inform drug coverage policy for state employees or other public programs.22 Notably, Consumers Union (CU) and AARP, although not DERP members, have begun adapting DERP reviews for consumers.23

The DERP In Perspective

The evolving science of systematic reviews.

Evidence reviewers have long grappled with clinical studies of variable quality, conflicting results from well-conducted studies, and questions about external validity.24 As David Atkins and colleagues have noted, uncertainties loom large in each step of evidence considerations: How good is the evidence that an intervention can improve important health outcomes? How good is the evidence that an intervention will work in a particular setting? How do potential benefits compare with possible harms or costs of the intervention?25

The value of systematic reviews lies in the structured methodology for searching, selecting, and synthesizing evidence.26 Formal procedures for integrating disparate types of evidence help demonstrate a clear chain of logic that links an intervention to improved health outcomes. The risk otherwise is to focus selectively on particular studies, especially as advocates introduce and promote them.27

This science of conducting reviews has evolved from informal overviews and unstructured review articles. Progress is needed on many fronts: on methods to search for, select, and use observational studies to complement efficacy trials; on the use of decision analysis and cost-effectiveness analysis to complement clinical data; and on the costs and benefits of collecting evidence.28

One of the enduring legacies of the DERP might be its contribution to the methodological basis for reviewing drug classes. Another might be its impact on the generation of more and different types of evidence by industry.

The need for good process.

Progress is also needed to establish well-defined and accepted procedures for conducting reviews and for making coverage and reimbursement decisions based on them.

The process for reviewing evidence.

For participating states, the DERP process likely represents a marked advance over their previous practices and over practices used by the vast majority of states and private plans not participating in the DERP. Additional openness and inclusiveness (for example, adding clinical specialists and consumer advocates to the EPC deliberations; publishing comments received on the DERP reports; and submitting more DERP reports for peer-reviewed publication) are worthy goals, but the benefits must be balanced against the costs in terms of additional time and resources. Finally, it will be important to evaluate the entire enterprise.

The process for making coverage and reimbursement decisions.

At their best, systematic reviews clarify the evidence base: They do not tell decisionmakers what to do, nor do they ensure good decision making.29 Local decisionmakers have different perspectives and values, make different judgments, and face different relative prices.30 Not surprisingly, they sometimes come to different conclusions using the same evidence review.

Experience with DERP reports has revealed that a key challenge involves translating technical details of evidence syntheses for use by local P&T committees. A number of efforts are under way to help make processes for coverage and reimbursement decisions transparent, explicit, and understandable and to allow opportunity for input and appeal.31

When to assume therapeutic-class effects?

The debate over DERP reviews reflects a larger debate over the role of nonrandomized evidence and a related issue concerning therapeutic-class effects. Despite charges otherwise, EPC reports indeed often find differences among drugs in a class. Still, a question lingers: Should evaluators (and decisionmakers) accept evidence based on surrogate markers, especially for drugs with alternatives in the same therapeutic class when the alternatives have evidence from long-term randomized trials? Potentially, a great deal rides on this question in terms of the health and choices of patients (not to mention Medicaid budgets and the stock prices of drug companies).

Consider an example: Company A develops and tests a new drug (drug A) to lower cholesterol. The company incurs the cost of conducting long-term RCTs that demonstrate that the drug not only lowers cholesterol but also reduces mortality. Company B develops a cholesterol-lowering drug (drug B) in the same therapeutic class. Drug B receives FDA approval based on an intermediate endpoint (lipid levels), but company B does not conduct long-term outcome studies. Should a formulary committee that already covers drug A also include drug B? What if drug B has a slightly more favorable side-effect profile or dosing regimen? What if some patients respond better to drug B? What if drug B’s price is 20 percent less than drug A’s?
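To make the last question concrete, consider a back-of-the-envelope illustration; the dollar figures and patient counts here are hypothetical, not drawn from any DERP review. Suppose drug A costs $1,000 per patient per year. At a 20 percent discount, drug B costs

\[
\$1{,}000 \times (1 - 0.20) = \$800 \text{ per patient per year.}
\]

For a Medicaid program treating 50,000 patients in the class, preferring drug B would save roughly $10 million a year ($200 per patient across 50,000 patients). The arithmetic is trivial; the evidentiary judgment is not, because those savings must be weighed against a mortality benefit demonstrated only for drug A.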

A call is often heard for head-to-head RCTs to address such questions. But such trials are lengthy, expensive, and, for drug companies, risky, in part because it is difficult to determine the appropriate comparator years in advance. One would like a better methodological basis for such matters, and researchers are attempting to provide one.32 An assumption of a “class effect” (that all drugs in the therapeutic class have similar effects) might be reasonable in some instances. On the other hand, establishing a link between surrogate endpoints and tangible clinical benefits is complex.33 Moreover, granting reimbursement based on one drug’s surrogate endpoints in cases where another drug has long-term outcome data allows late-to-the-party companies to free-ride on the first movers’ research, thus reducing incentives for companies to undertake long-term research in the first place.34

What role should cost-effectiveness analysis play?

The separation of economic evidence from clinical evidence is understandable as a political construct. Open consideration of cost-effectiveness remains largely anathema in the United States.35

DERP participants have adopted the convention, common among technology assessment organizations, of considering clinical evidence on its own merits, without respect to costs. However, the lack of procedures for considering economic evidence more forthrightly and holistically also creates problems. For one, it likely contributes to some of the frustration and mistrust of efforts like the DERP, because observers assume that costs are considered surreptitiously in reviews or misused downstream by payers. For another, considering costs after the clinical review tends to focus decisions on a drug’s price rather than its overall value.

The entire debate would benefit from more openness and clarity about cost-effectiveness. Rigorous evidence syntheses that consider cost-effectiveness are a natural and expected reaction to the incompleteness of available clinical information, and to rising drug spending and payers’ ongoing fiscal challenges. Although concerns linger about the methodology of cost-effectiveness analysis, questions about the value of drugs are best considered in an evaluation that combines clinical and economic evidence simultaneously.
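For illustration, the standard construct for such a combined evaluation is the incremental cost-effectiveness ratio (ICER), which divides the difference in expected costs between a drug and its comparator by the difference in expected health effects, often measured in quality-adjusted life-years:

\[
\text{ICER} = \frac{C_{\text{drug}} - C_{\text{comparator}}}{E_{\text{drug}} - E_{\text{comparator}}}
\]

On this metric, a cheaper drug with a smaller or less certain health effect can be the worse value, a distinction that is invisible when price enters the picture only after the clinical review is complete.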

The DERP experience underscores the fact that health policy emanating from Oregon, the cauldron of debates over rationing and cost-effectiveness a decade ago, has evolved in an interesting and little-remarked-upon way. The DERP decision to ignore cost-effectiveness considerations reveals a society still unable to consider economic factors openly in evidence reviews, even in a program led from Oregon, the most willing of all states to push health policy limits.

What is the appropriate role of government in judging evidence?

The debate over the DERP reflects another, larger policy discussion about the appropriate role of the public sector in evidence reviews. In part, critics of the DERP fear too much authority being centralized in a single evidence synthesizer.

The idea of pooling resources across states to review drug classes has clearly resonated with many state officials, who have been otherwise engaged in efforts to manage drug spending through generic substitution, incentive-based formularies, drug quantity limits, and disease management.36

The backdrop to the DERP debate is the relative vacuum left by federal and state governments in synthesizing evidence on drug classes. The DERP is something of a political innovation: a nongovernmental entity to conduct drug reviews for mostly government clients. Kitzhaber has argued that the DERP has found a niche because government institutions have been largely incapable of responding. In a recent interview, he stated that the “political establishment is afraid of its own shadow.”37

The federal government has also been hamstrung. The Medicare Prescription Drug, Improvement, and Modernization Act (MMA) of 2003, sec. 1013, contains a provision calling on AHRQ to conduct research on the “outcomes, comparative clinical effectiveness, and appropriateness of health care, including prescription drugs.” Under this initiative, AHRQ has begun to conduct comparative effectiveness reviews of drugs and other technologies, with the Oregon EPC as coordinator. The initiative shows the potential of the federal government to play a more active role, although, to date, the amounts authorized for the effort have been relatively small. Also, the MMA prohibits the Centers for Medicare and Medicaid Services (CMS) from actually using the evidence from AHRQ reviews to withhold coverage of a prescription drug.

The role of John Kitzhaber.

A final point pertains to the remarkable role of Kitzhaber, the former emergency room physician, state legislator, and two-term Oregon governor. He has spent considerable energy over the past two decades fighting to change the terms of the health care debate. “The health care benefit ought to have some value related to health,” he has argued.38 As he puts it, instead of watching politicians debate how to pay for the current system, it is better to reform the system and to avoid paying for services of questionable value.39

Kitzhaber has his devotees and detractors. Some argue that the kind of resolute direction he and his team have provided is required to insulate review staffers from powerful industry and advocacy groups. Others contend that Kitzhaber’s style and his larger agenda to reshape the health care landscape are part of the problem.

Regardless, in forging the DERP, Kitzhaber has focused the microscope on the value of prescription drugs and willed a new review process onto the national stage. His efforts illustrate the inherent power of pooling state Medicaid resources to fund evidence reviews. They also illustrate the potential for leadership, in short supply in health care debates nationwide.

Conclusions

Observers have remarked that like the expression “low-carb,” the term “EBM” is ubiquitous.40 They have also stressed that simply calling something “evidence-based” doesn’t make it so. The real difficulty is defining what EBM is. The DERP reveals the outcome of one EBM process and all of the challenges involved in applying the concept.

DERP reviewers are not alone in confronting challenges in synthesizing evidence—every reviewer faces a similar ordeal. However, the DERP has emerged as an important and special case, and it likely portends the kinds of evidence reviews we will see used more widely in the future. The effort reminds us that attempts to undertake evidence-based reviews involve not only the methods themselves, which continue to evolve, but also key questions of organization, process, and leadership.

Footnotes

  • Peter Neumann (pneumann@tufts-nemc.org) is director of the Center for the Evaluation of Value and Risk in Health at the Institute for Clinical Research and Health Policy Studies at Tufts–New England Medical Center and a professor of medicine at Tufts University School of Medicine, in Boston, Massachusetts.

  • The author is grateful to Jenny Palmer for excellent research assistance. No external funding was received for this paper. The author receives funding from multiple sources for his work, including government, foundation, and industry sources.

NOTES
