Until every raven is observed, the statement "All ravens are black" remains a theory. Theories are uncertain, otherwise we'd use the easier-to-spell word "fact". Confirmation or refutation by an experiment is also uncertain. Take those Danish scientists who wrote that article in Science claiming they had discovered a non-black raven: how do we know they weren't looking at a pigeon or, even more likely, weren't drunk?

This is where statistics comes in. Each experiment provides evidence to confirm or refute a theory (unless, of course, it's completely irrelevant). Observing a large number of black ravens provides significant evidence supporting the theory; observing a large number of non-black non-ravens provides a minuscule amount of evidence supporting the theory (unless you live in an alternative universe where almost everything is a raven). Observing a non-black raven, however, is strong evidence against the theory, especially if you're a credible witness.

What a respectable scientist (i.e., a scientist respected by a statistician) would do is:

  • attempt to estimate the conditional probability of the evidence given the hypothesis (and given its negation);
  • make a wild-ass guess of the prior probability of the hypothesis (i.e., will everyone laugh at me for trying to disprove it); then
  • apply Bayes' theorem to find the posterior probability (i.e., what everyone will think once they've read my paper).
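
The three steps above can be sketched in a few lines of Python. This is only an illustration: the function name, the 50% prior, the assumption that 10% of ravens would be non-black if the theory were false, and the 1% chance of a mistaken sighting are all invented numbers, not anything measured.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E)."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)
    return p_e_given_h * prior / p_e

# H: "all ravens are black". All numbers below are hypothetical.
belief = 0.5  # the wild-ass guess of the prior

# Evidence: 20 sampled ravens, each one black.
# Under H a sampled raven is black with probability 1.0;
# under not-H we assume it is black with probability 0.9.
for _ in range(20):
    belief = posterior(belief, 1.0, 0.9)
belief_after_black_ravens = belief

# Evidence: one reported non-black raven.
# Under H the report can only be a mistake (pigeon, drink): assume 0.01.
# Under not-H we assume such a report turns up with probability 0.10.
belief_after_report = posterior(belief, 0.01, 0.10)

print(belief_after_black_ravens, belief_after_report)
```

With these made-up inputs, twenty black ravens push belief up gradually, while a single report of a non-black raven pulls it down sharply. How sharply depends on how likely a mistaken report is, which is where the credibility of the witness enters.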

A note to address dogboy's write-up:

The original point of this node was that in the context of inductive reasoning, logical consistency is equivalent to supporting evidence: a non-black non-raven lends support to the claim that all ravens are black. My point is that not all evidence is equal, and that Bayes' theorem provides the best, and possibly the only consistent, way to update belief as new evidence arrives, especially when dealing with support of theories. "Best", I'll admit, doesn't equate to the respect of statisticians.

I agree with dogboy's assertion that Bayesian reasoning isn't the most common tool of scientists and it's certainly not the only appropriate tool. A classical approach to the raven blackness problem might be to:

  1. Define the null hypothesis to be "there exists a non-black raven".
  2. Define blackness (probably in terms of luminosity and hue).
  3. Measure blackness over some sample of ravens. Convince everyone that your sample is unbiased.
  4. Calculate the P value of your test.
  5. Claim victory (reject the null hypothesis) if P < 0.05.
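
A rough sketch of steps 4 and 5, under a loudly stated assumption: a bare "there exists a non-black raven" gives no sampling distribution to compute with, so the null is restated here as "at least a fraction q of ravens are non-black". The function name, q = 0.05, and the sample sizes are hypothetical, and the sample is assumed to be independent and unbiased (step 3).

```python
def p_value_all_black(n_black, min_nonblack_fraction):
    """P value for seeing n_black black ravens and no others.

    Assumed null: at least `min_nonblack_fraction` of ravens are
    non-black. The chance of drawing only black ravens is largest when
    the fraction sits exactly at that bound, which gives (1 - q) ** n.
    """
    return (1.0 - min_nonblack_fraction) ** n_black

# Hypothetical sample: 59 ravens measured, all of them black.
p = p_value_all_black(59, 0.05)
reject_null = p < 0.05  # the conventional threshold

print(p, reject_null)
```

Under these assumptions 59 black ravens are just enough to reject the restated null at the 0.05 level, and 58 are not. Note that this says nothing about a non-black fraction smaller than q.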

I'll concede that this will get a scientist the respect of most statisticians (especially biostatisticians). But I'll also claim that it's not as convincing as a Bayesian argument, and that the classicist's distaste for a Bayesian's subjective probability isn't consistent with an almost religious acceptance of comparing P values to 0.05, an arbitrary number picked by Ronald A. Fisher in an attempt to eschew the equal arbitrariness of human belief.

The "all ravens are black" problem doesn't really lend itself to controlled experiments in the way that common research problems in physics or biology do. It's more akin to problems in astronomy, which makes heavy use of descriptive statistics. Descriptive statistics are useful for gleaning information from data but not for updating belief in theories; that's where Bayesian reasoning comes in.