This theorem is somewhat unintuitive, and often used in the following nonobvious way: suppose we've made an observation O, and we hypothesise that O happened because of some hypothesis H (e.g. O can be "a coin, tossed 10 times, came up heads 6 times and tails 4 times", and H can be "the coin is a fair coin"). We know how to calculate `P(O | H)`; that's elementary probability. But what we're really interested in (especially if we want to apply the Neyman-Pearson lemma) is the converse probability `P(H | O)`. By Bayes' Theorem, this is just

`P(H | O) = P(O | H) * P(H) / P(O)`    (*)
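The "elementary probability" part, `P(O | H)`, can be computed directly for the coin example: under the fair-coin hypothesis it's just a binomial probability. A minimal sketch (the function name `likelihood` is mine, not from the text):

```python
# P(O | H) for O = "6 heads, 4 tails in 10 tosses" and H = "the coin is fair".
# This is a binomial probability with n = 10, k = 6, p = 0.5.
from math import comb

def likelihood(k_heads, n_tosses, p_heads):
    """Probability of exactly k_heads in n_tosses, given P(heads) = p_heads."""
    return comb(n_tosses, k_heads) * p_heads**k_heads * (1 - p_heads)**(n_tosses - k_heads)

p_O_given_H = likelihood(6, 10, 0.5)
print(p_O_given_H)  # 210/1024 ≈ 0.205
```

Note that this computes only the left-hand factor `P(O | H)` of (*); it says nothing yet about `P(H | O)`.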

where we know how to calculate `P(O | H)`; we *might* also be able to estimate `P(H)`, but usually both `P(H)` and `P(O)` are very hard to estimate (you need to make much stronger assumptions about the real world for that, since the luxury of assuming H -- what we'd like to be true -- is gone!). However, for Neyman-Pearson `P(O)` doesn't matter, and we can (often) "swallow" `P(H)` into a parameter.
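To see why `P(O)` doesn't matter for Neyman-Pearson, divide (*) for two competing hypotheses: `P(H1 | O) / P(H2 | O) = [P(O | H1) / P(O | H2)] * [P(H1) / P(H2)]`, and `P(O)` cancels. A small sketch, with a made-up biased-coin alternative (p = 0.6) that is not from the text:

```python
# P(O) cancels when comparing hypotheses: only the likelihood ratio
# P(O | H1) / P(O | H2) is needed, which is the Neyman-Pearson test statistic.
from math import comb

def likelihood(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# H1: fair coin (p = 0.5); H2: hypothetical biased coin (p = 0.6).
lr = likelihood(6, 10, 0.5) / likelihood(6, 10, 0.6)
print(lr)  # < 1: the observed 6/10 heads slightly favours the p = 0.6 coin
```

Multiplying this ratio by the prior odds `P(H1) / P(H2)` -- the part we can (often) "swallow" into a parameter -- gives the posterior odds, with `P(O)` never computed.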

But think what (*) is telling us! It says how to calculate the probability that our **hypothesis** is true, given what we've **observed**. No wonder it's so hard to discover `P(H)` and `P(O)` -- they tell us how to check if a theory about the real world is true!