Daubert v. Merrell Dow Pharmaceuticals, Inc. (on remand)
43 F.3d 1311 (9th Cir. 1995)
 

Kozinski, Circuit Judge.

On remand from the United States Supreme Court, we undertake "the task of ensuring that an expert’s testimony both rests on a reliable foundation and is relevant to the task at hand." Daubert v. Merrell Dow Pharmaceuticals, Inc. (1993).

I

A. BACKGROUND

Two minors brought suit against Merrell Dow Pharmaceuticals, claiming they suffered limb reduction birth defects because their mothers had taken Bendectin, a drug prescribed for morning sickness to about 17.5 million pregnant women in the United States between 1957 and 1982. This appeal deals with an evidentiary question: whether certain expert scientific testimony is admissible to prove that Bendectin caused the plaintiffs’ birth defects.

For the most part, we don’t know how birth defects come about. We do know they occur in 2-3% of births, whether or not the expectant mother has taken Bendectin. See Jose F. Cordero & Godfrey P. Oakley, Jr., Drug Exposure During Pregnancy: Some Epidemiologic Considerations, 26 Clinical Obstetrics & Gynecology 418, 424-25 (June 1983). Limb defects are even rarer, occurring in fewer than one birth out of every 1000. But scientists simply do not know how teratogens (chemicals known to cause limb reduction defects) do their damage: They cannot reconstruct the biological chain of events that leads from an expectant mother’s ingestion of a teratogenic substance to the stunted development of a baby’s limbs. Nor do they know what it is about teratogens that causes them to have this effect. No doubt, someday we will have this knowledge, and then we will be able to tell precisely whether and how Bendectin (or any other suspected teratogen) interferes with limb development; in the current state of scientific knowledge, however, we are ignorant.

Not knowing the mechanism whereby a particular agent causes a particular effect is not always fatal to a plaintiff’s claim. Causation can be proved even when we don’t know precisely how the damage occurred, if there is sufficiently compelling proof that the agent must have caused the damage somehow. One method of proving causation in these circumstances is to use statistical evidence. If 50 people who eat at a restaurant one evening come down with food poisoning during the night, we can infer that the restaurant’s food probably contained something unwholesome, even if none of the dishes is available for analysis. This inference is based on the fact that, in our health-conscious society, it is highly unlikely that 50 people who have nothing in common except that they ate at the same restaurant would get food poisoning from independent sources.

It is by such means that plaintiffs here seek to establish that Bendectin is responsible for their injuries. They rely on the testimony of three groups of scientific experts. One group proposes to testify that there is a statistical link between the ingestion of Bendectin during pregnancy and limb reduction defects. These experts have not themselves conducted epidemiological (human statistical) studies on the effects of Bendectin; rather, they have reanalyzed studies published by other scientists, none of whom reported a statistical association between Bendectin and birth defects. Other experts proffered by plaintiffs propose to testify that Bendectin causes limb reduction defects in humans because it causes such defects in laboratory animals. A third group of experts sees a link between Bendectin and birth defects because Bendectin has a chemical structure that is similar to other drugs suspected of causing birth defects.

The opinions proffered by plaintiffs’ experts do not, to understate the point, reflect the consensus within the scientific community. The FDA—an agency not known for its promiscuity in approving drugs—continues to approve Bendectin for use by pregnant women because "available data do not demonstrate an association between birth defects and Bendectin." U.S. Department of Health and Human Services News, No. P80-45 (Oct. 7, 1980). Every published study here and abroad—and there have been many—concludes that Bendectin is not a teratogen. In fact, apart from the small but determined group of scientists testifying on behalf of the Bendectin plaintiffs in this and many other cases, there doesn’t appear to be a single scientist who has concluded that Bendectin causes limb reduction defects.

It is largely because the opinions proffered by plaintiffs’ experts run counter to the substantial consensus in the scientific community that we affirmed the district court’s grant of summary judgment the last time the case appeared before us. The standard for admissibility of expert testimony in this circuit at the time was the so-called Frye test: Scientific evidence was admissible if it was based on a scientific technique generally accepted as reliable within the scientific community. We found that the district court properly applied this standard, and affirmed. The Supreme Court reversed, holding that Frye was superseded by Federal Rule of Evidence 702, and remanded for us to consider the admissibility of plaintiffs’ expert testimony under this new standard. . . .

II

A. BRAVE NEW WORLD

Federal judges ruling on the admissibility of expert scientific testimony face a far more complex and daunting task in a post-Daubert world than before. The judge’s task under Frye is relatively simple: to determine whether the method employed by the experts is generally accepted in the scientific community. Under Daubert, we must engage in a difficult, two-part analysis. First, we must determine nothing less than whether the experts’ testimony reflects "scientific knowledge," whether their findings are "derived by the scientific method," and whether their work product amounts to "good science." Second, we must ensure that the proposed expert testimony is "relevant to the task at hand," i.e., that it logically advances a material aspect of the proposing party’s case. The Supreme Court referred to this second prong of the analysis as the "fit" requirement.

The first prong of Daubert puts federal judges in an uncomfortable position. The question of admissibility only arises if it is first established that the individuals whose testimony is being proffered are experts in a particular scientific field; here, for example, the Supreme Court waxed eloquent on the impressive qualifications of plaintiffs’ experts. Yet something doesn’t become "scientific knowledge" just because it’s uttered by a scientist; nor can an expert’s self-serving assertion that his conclusions were "derived by the scientific method" be deemed conclusive, else the Supreme Court’s opinion could have ended with footnote two. As we read the Supreme Court’s teaching in Daubert, therefore, though we are largely untrained in science and certainly no match for any of the witnesses whose testimony we are reviewing, it is our responsibility to determine whether those experts’ proposed testimony amounts to "scientific knowledge," constitutes "good science," and was "derived by the scientific method."

The task before us is more daunting still when the dispute concerns matters at the very cutting edge of scientific research, where fact meets theory and certainty dissolves into probability. As the record in this case illustrates, scientists often have vigorous and sincere disagreements as to what research methodology is proper, what should be accepted as sufficient proof for the existence of a "fact," and whether information derived by a particular method can tell us anything useful about the subject under study.

Our responsibility, then, unless we badly misread the Supreme Court’s opinion, is to resolve disputes among respected, well-credentialed scientists about matters squarely within their expertise, in areas where there is no scientific consensus as to what is and what is not "good science," and occasionally to reject such expert testimony because it was not "derived by the scientific method." Mindful of our position in the hierarchy of the federal judiciary, we take a deep breath and proceed with this heady task.

B. DEUS EX MACHINA

The Supreme Court’s opinion in Daubert focuses closely on the language of Fed. R. Evid. 702, which permits opinion testimony by experts as to matters amounting to "scientific . . . knowledge." The Court recognized, however, that knowledge in this context does not mean absolute certainty. Rather, the Court said, "in order to qualify as ‘scientific knowledge,’ an inference or assertion must be derived by the scientific method." Elsewhere in its opinion, the Court noted that Rule 702 is satisfied where the proffered testimony is "based on scientifically valid principles." Our task, then, is to analyze not what the experts say, but what basis they have for saying it.

Which raises the question: How do we figure out whether scientists have derived their findings through the scientific method or whether their testimony is based on scientifically valid principles? Each expert proffered by the plaintiffs assures us that he has "utiliz[ed] the type of data that is generally and reasonably relied upon by scientists" in the relevant field, and that he has "utilized the methods and methodology that would generally and reasonably be accepted" by people who deal in these matters. The Court held, however, that federal judges perform a "gatekeeping role"; to do so they must satisfy themselves that scientific evidence meets a certain standard of reliability before it is admitted. This means that the expert’s bald assurance of validity is not enough. Rather, the party presenting the expert must show that the expert’s findings are based on sound science, and this will require some objective, independent validation of the expert’s methodology.

While declining to set forth a "definitive checklist or test," the Court did list several factors federal judges can consider in determining whether to admit expert scientific testimony under Fed. R. Evid. 702: whether the theory or technique employed by the expert is generally accepted in the scientific community; whether it’s been subjected to peer review and publication; whether it can be and has been tested; and whether the known or potential rate of error is acceptable.[3] We read these factors as illustrative rather than exhaustive; similarly, we do not deem each of them to be equally applicable (or applicable at all) in every case. Rather, we read the Supreme Court as instructing us to determine whether the analysis undergirding the experts’ testimony falls within the range of accepted standards governing how scientists conduct their research and reach their conclusions.

One very significant fact to be considered is whether the experts are proposing to testify about matters growing naturally and directly out of research they have conducted independent of the litigation, or whether they have developed their opinions expressly for purposes of testifying. That an expert testifies for money does not necessarily cast doubt on the reliability of his testimony, as few experts appear in court merely as an eleemosynary gesture. But in determining whether proposed expert testimony amounts to good science, we may not ignore the fact that a scientist’s normal workplace is the lab or the field, not the courtroom or the lawyer’s office.

3. These factors raise many questions, such as how do we determine whether the rate of error is acceptable, and by what standard? Or, what should we infer from the fact that the methodology has been tested, but only by the party’s own expert or experts? Do we ask whether the methodology they employ to test their methodology is itself methodologically sound? Such questions only underscore the basic problem, which is that we must devise standards for acceptability where respected scientists disagree on what’s acceptable.

That an expert testifies based on research he has conducted independent of the litigation provides important, objective proof that the research comports with the dictates of good science. See Peter W. Huber, Galileo’s Revenge: Junk Science in the Courtroom 206-09 (1991) (describing how the prevalent practice of expert-shopping leads to bad science). For one thing, experts whose findings flow from existing research are less likely to have been biased toward a particular conclusion by the promise of remuneration; when an expert prepares reports and findings before being hired as a witness, that record will limit the degree to which he can tailor his testimony to serve a party’s interests. Then, too, independent research carries its own indicia of reliability, as it is conducted, so to speak, in the usual course of business and must normally satisfy a variety of standards to attract funding and institutional support. Finally, there is usually a limited number of scientists actively conducting research on the very subject that is germane to a particular case, which provides a natural constraint on parties’ ability to shop for experts who will come to the desired conclusion. That the testimony proffered by an expert is based directly on legitimate, preexisting research unrelated to the litigation provides the most persuasive basis for concluding that the opinions he expresses were "derived by the scientific method."

We have examined carefully the affidavits proffered by plaintiffs’ experts, as well as the testimony from prior trials that plaintiffs have introduced in support of that testimony, and find that none of the experts based his testimony on preexisting or independent research. While plaintiffs’ scientists are all experts in their respective fields, none claims to have studied the effect of Bendectin on limb reduction defects before being hired to testify in this or related cases.

If the proffered expert testimony is not based on independent research, the party proffering it must come forward with other objective, verifiable evidence that the testimony is based on "scientifically valid principles." One means of showing this is by proof that the research and analysis supporting the proffered conclusions have been subjected to normal scientific scrutiny through peer review and publication. Huber, Galileo’s Revenge at 209 (suggesting that "[t]he ultimate test of [a scientific expert’s] integrity is her readiness to publish and be damned").

Peer review and publication do not, of course, guarantee that the conclusions reached are correct; much published scientific research is greeted with intense skepticism and is not borne out by further research. But the test under Daubert is not the correctness of the expert’s conclusions but the soundness of his methodology. That the research is accepted for publication in a reputable scientific journal after being subjected to the usual rigors of peer review is a significant indication that it is taken seriously by other scientists, i.e., that it meets at least the minimal criteria of good science. . . . If nothing else, peer review and publication "increase the likelihood that substantive flaws in methodology will be detected."

Bendectin litigation has been pending in the courts for over a decade, yet the only review the plaintiffs’ experts’ work has received has been by judges and juries, and the only place their theories and studies have been published is in the pages of federal and state reporters. None of the plaintiffs’ experts has published his work on Bendectin in a scientific journal or solicited formal review by his colleagues. Despite the many years the controversy has been brewing, no one in the scientific community—except defendant’s experts—has deemed these studies worthy of verification, refutation or even comment. It’s as if there were a tacit understanding within the scientific community that what’s going on here is not science at all, but litigation.

Establishing that an expert’s proffered testimony grows out of pre-litigation research or that the expert’s research has been subjected to peer review are the two principal ways the proponent of expert testimony can show that the evidence satisfies the first prong of Rule 702.[10] Where such evidence is unavailable, the proponent of expert scientific testimony may attempt to satisfy its burden through the testimony of its own experts. For such a showing to be sufficient, the experts must explain precisely how they went about reaching their conclusions and point to some objective source—a learned treatise, the policy statement of a professional association, a published article in a reputable scientific journal or the like—to show that they have followed the scientific method, as it is practiced by (at least) a recognized minority of scientists in their field. See United States v. Rincon, 28 F.3d 921, 924 (9th Cir. 1994) (research must be described "in sufficient detail that the district court [can] determine if the research was scientifically valid").[11]

10. This showing would not, of course, be conclusive. Proffering scientific testimony and making an initial showing that it was derived by the scientific method enables a party to establish a prima facie case as to admissibility under Rule 702. The opposing party would then be entitled to challenge that showing. This it could do by presenting evidence (including expert testimony) that the proposing party’s expert employed unsound methodology or failed to assiduously follow an otherwise sound protocol. Where the opposing party thus raises a material dispute as to the admissibility of expert scientific evidence, the district court must hold an in limine hearing (a so-called Daubert hearing) to consider the conflicting evidence and make findings about the soundness and reliability of the methodology employed by the scientific experts. . . .

11. This underscores the difference between Daubert and Frye. Under Frye, the party proffering scientific evidence had to show it was based on the method generally accepted in the scientific community. The focus under Daubert is on the reliability of the methodology, and in addressing that question the court and the parties are not limited to what is generally accepted; methods accepted by a minority in the scientific community may well be sufficient. However, the party proffering the evidence must explain the expert’s methodology and demonstrate in some objectively verifiable way that the expert has both chosen a reliable scientific method and followed it faithfully. Of course, the fact that one party’s experts use a methodology accepted by only a minority of scientists would be a proper basis for impeachment at trial.

Plaintiffs have made no such showing. As noted above, plaintiffs rely entirely on the experts’ unadorned assertions that the methodology they employed comports with standard scientific procedures. In support of these assertions, plaintiffs offer only the trial and deposition testimony of these experts in other cases. While these materials indicate that plaintiffs’ experts have relied on animal studies, chemical structure analyses and epidemiological data, they neither explain the methodology the experts followed to reach their conclusions nor point to any external source to validate that methodology. We’ve been presented with only the experts’ qualifications, their conclusions and their assurances of reliability. Under Daubert, that’s not enough.

This is especially true of Dr. Palmer—the only expert willing to testify "that Bendectin did cause the limb defects in each of the children." In support of this conclusion, Dr. Palmer asserts only that Bendectin is a teratogen and that he has examined the plaintiffs’ medical records, which apparently reveal the timing of their mothers’ ingestion of the drug. Dr. Palmer offers no tested or testable theory to explain how, from this limited information, he was able to eliminate all other potential causes of birth defects, nor does he explain how he alone can state as a fact that Bendectin caused plaintiffs’ injuries. We therefore agree with the Sixth Circuit’s observation that "Dr. Palmer does not testify on the basis of the collective view of his scientific discipline, nor does he take issue with his peers and explain the grounds for his differences. Indeed, no understandable scientific basis is stated. Personal opinion, not science, is testifying here." For this reason, Dr. Palmer’s testimony is inadmissible as a matter of law under Rule 702.

The failure to make any objective showing as to admissibility under the first prong of Rule 702 would also fatally undermine the testimony of plaintiffs’ other experts, but for the peculiar posture of this case. Plaintiffs submitted their experts’ affidavits while Frye was the law of the circuit and, although they’ve not requested an opportunity to augment their experts’ affidavits in light of Daubert, the interests of justice would be disserved by precluding plaintiffs from doing so. Given the opportunity to augment their original showing of admissibility, plaintiffs might be able to show that the methodology adopted by some of their experts is based on sound scientific principles. For instance, plaintiffs’ epidemiologists might validate their reanalyses by explaining why they chose only certain of the data that was available, or the experts relying on animal studies might point to some authority for extrapolating human causation from teratogenicity in animals.

Were this the only question before us, we would be inclined to remand to give plaintiffs an opportunity to submit additional proof that the scientific testimony they proffer was "derived by the scientific method." Daubert, however, establishes two prongs to the Rule 702 admissibility inquiry. We therefore consider whether the testimony satisfies the second prong of Rule 702: Would plaintiffs’ proffered scientific evidence "assist the trier of fact to . . . determine a fact in issue"? Fed. R. Evid. 702.

C. NO VISIBLE MEANS OF SUPPORT

In elucidating the second requirement of Rule 702, Daubert stressed the importance of the "fit" between the testimony and an issue in the case: "Rule 702’s ‘helpfulness’ standard requires a valid scientific connection to the pertinent inquiry as a precondition to admissibility." Here, the pertinent inquiry is causation. In assessing whether the proffered expert testimony "will assist the trier of fact" in resolving this issue, we must look to the governing substantive standard, which in this case is supplied by California tort law.

Plaintiffs do not attempt to show causation directly; instead, they rely on experts who present circumstantial proof of causation. Plaintiffs’ experts testify that Bendectin is a teratogen because it causes birth defects when it is tested on animals, because it is similar in chemical structure to other suspected teratogens, and because statistical studies show that Bendectin use increases the risk of birth defects. Modern tort law permits such proof, but plaintiffs must nevertheless carry their traditional burden; they must prove that their injuries were the result of the accused cause and not some independent factor. In the case of birth defects, carrying this burden is made more difficult because we know that some defects—including limb reduction defects—occur even when expectant mothers do not take Bendectin, and that most birth defects occur for no known reason.

California tort law requires plaintiffs to show not merely that Bendectin increased the likelihood of injury, but that it more likely than not caused their injuries. See Jones v. Ortho Pharmaceutical Corp., 163 Cal. App. 3d 396, 403, 209 Cal. Rptr. 456 (1985). In terms of statistical proof, this means that plaintiffs must establish not just that their mothers’ ingestion of Bendectin increased somewhat the likelihood of birth defects, but that it more than doubled it—only then can it be said that Bendectin is more likely than not the source of their injury. Because the background rate of limb reduction defects is one per thousand births, plaintiffs must show that among children of mothers who took Bendectin the incidence of such defects was more than two per thousand.
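[Editorial illustration, not part of the opinion.] The arithmetic behind the doubling requirement can be sketched as follows. The symbols are assumptions introduced here for illustration: $r_0$ is the background rate of limb reduction defects, $RR$ is the relative risk among children whose mothers took the drug, and the sketch assumes that any drug-caused cases simply add to the background cases.

```latex
% Among exposed children the defect rate is RR * r_0, of which r_0 would
% have occurred anyway. The share of exposed cases attributable to the drug:
\[
P(\text{drug caused the defect} \mid \text{exposed, injured})
  = \frac{RR \cdot r_0 - r_0}{RR \cdot r_0}
  = \frac{RR - 1}{RR}
\]
% "More likely than not" means this share exceeds one half:
\[
\frac{RR - 1}{RR} > \frac{1}{2} \iff RR > 2,
\qquad\text{so with } r_0 = \frac{1}{1000}: \quad RR \cdot r_0 > \frac{2}{1000}.
\]
```

Under these assumptions, a relative risk of exactly 2 leaves the odds even, which is why the incidence among exposed children must exceed two per thousand and not merely reach it.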

None of plaintiffs’ epidemiological experts claims that ingestion of Bendectin during pregnancy more than doubles the risk of birth defects. To evaluate the relationship between Bendectin and limb reduction defects, an epidemiologist would take a sample of the population and compare the frequency of birth defects in children whose mothers took Bendectin with the frequency of defects in children whose mothers did not. The ratio derived from this comparison would be an estimate of the "relative risk" associated with Bendectin. See generally Joseph L. Fleiss, Statistical Methods for Rates and Proportions (2d ed. 1981). For an epidemiological study to show causation under a preponderance standard, "the relative risk of limb reduction defects arising from the epidemiological data . . . will, at a minimum, have to exceed ‘2’." That is, the study must show that children whose mothers took Bendectin are more than twice as likely to develop limb reduction birth defects as children whose mothers did not. While plaintiffs’ epidemiologists make vague assertions that there is a statistically significant relationship between Bendectin and birth defects, none states that the relative risk is greater than two. These studies thus would not be helpful, and indeed would only serve to confuse the jury, if offered to prove rather than refute causation. A relative risk of less than two may suggest teratogenicity, but it actually tends to disprove legal causation, as it shows that Bendectin does not double the likelihood of birth defects.
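[Editorial illustration, not part of the opinion.] The comparison described above can be sketched in a few lines of Python. Everything here is hypothetical: the `Cohort` type, the function names, and above all the counts, which are invented to show the arithmetic and are not drawn from any Bendectin study. The sketch treats the "more than double" threshold as a simple test on the point estimate and ignores confidence intervals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cohort:
    """Births and observed limb reduction defects in one group (hypothetical type)."""
    births: int
    defects: int


def relative_risk(exposed: Cohort, unexposed: Cohort) -> float:
    """Defect rate among exposed births divided by the rate among unexposed births."""
    if unexposed.defects == 0 or exposed.births == 0:
        raise ValueError("relative risk is undefined for these counts")
    # Cross-multiply so the ratio is computed from integers in a single division.
    return (exposed.defects * unexposed.births) / (unexposed.defects * exposed.births)


def attributable_fraction(rr: float) -> float:
    """Share of exposed cases attributable to the exposure: (RR - 1) / RR, floored at 0."""
    return max(0.0, (rr - 1.0) / rr)


def meets_preponderance(rr: float) -> bool:
    """True only when the relative risk is strictly greater than 2."""
    return rr > 2.0


if __name__ == "__main__":
    # Invented counts: background rate of 1 per 1000, exposed rate of 1.5 per 1000.
    unexposed = Cohort(births=100_000, defects=100)
    exposed = Cohort(births=100_000, defects=150)

    rr = relative_risk(exposed, unexposed)
    print(f"relative risk: {rr:.2f}")
    print(f"attributable fraction: {attributable_fraction(rr):.2f}")
    print(f"meets more-likely-than-not threshold: {meets_preponderance(rr)}")
```

With these invented counts the relative risk is 1.5: the exposure is associated with more defects, yet only about a third of the exposed cases would be attributable to it, which falls short of the preponderance standard the court describes.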

With the exception of Dr. Palmer, whose testimony is inadmissible under the first prong of the Rule 702 analysis, the remaining experts proffered by plaintiffs were equally unprepared to testify that Bendectin caused plaintiffs’ injuries; they were willing to testify only that Bendectin is "capable of causing" birth defects. Plaintiffs argue "these scientists use the words ‘capable of causing’ meaning that it does cause. This is an ambiguity of language. . . . If something is capable of causing damage in humans, it does." But what plaintiffs must prove is not that Bendectin causes some birth defects, but that it caused their birth defects. To show this, plaintiffs’ experts would have had to testify either that Bendectin actually caused plaintiffs’ injuries (which they could not say) or that Bendectin more than doubled the likelihood of limb reduction birth defects (which they did not say).

As the district court properly found below, "the strongest inference to be drawn for plaintiffs based on the epidemiological evidence is that Bendectin could possibly have caused plaintiffs’ injuries." The same is true of the other testimony derived from animal studies and chemical structure analyses—these experts "testify to a possibility rather than a probability." Plaintiffs do not quantify this possibility, or otherwise indicate how their conclusions about causation should be weighted, even though the substantive legal standard has always required proof of causation by a preponderance of the evidence. Unlike these experts’ explanation of their methodology, this is not a shortcoming that could be corrected on remand; plaintiffs’ experts could augment their affidavits with independent proof that their methods were sound, but to augment the substantive testimony as to causation would require the experts to change their conclusions altogether. Any such tailoring of the experts’ conclusions would, at this stage of the proceedings, fatally undermine any attempt to show that these findings were "derived by the scientific method." Plaintiffs’ experts must, therefore, stand by the conclusions they originally proffered, rendering their testimony inadmissible under the second prong of Fed. R. Evid. 702.

Conclusion

The district court’s grant of summary judgment is affirmed.

 

