Answer to old chestnut: common birthdays
If you have just two people, the probability of a shared birthday is simply 1/365.
Calculating the probability that at least two of N people will have the same birthday is hard to do directly, because there are a lot of pairs that could have the same birthday even for fairly small N, and then you have to account for overcounting in the case that three or more actually have the same birthday, or two or more pairs have the same birthdays.
However, there's a trick that makes this easy to calculate, one that is common to probability puzzles, and that is to calculate the probability that all of the people have different birthdays, and then subtract that from 1.
For instance, if you have three people, the probability that two of them were born on the same day is 1 minus the probability that they were born on three different days.
To calculate this other probability, take each person in any arbitrary sequence. The first person can have any birthday whatsoever and not cause any same-birthday pairs
because there are no pairs. The second person can have any other birthday, and that occurs with a probability of 364/365. The third person can have any birthday except the ones those other two people have, and that occurs with probability 363/365.
These events are independent -- one person's birth did not influence any other person's birth. So, the total probability of all these events occurring is the product:
1 x --- x --- = approx. 0.9917958
And the probability of a shared birthday is 1 minus this, or
0.0082042 -- about 0.82%. Note that this is about three times the probability when you have only two people. The reason for this is that there are three possible pairs that can share a birthday -- A and B, B and C, or A and C. It's not exactly three times because of the small chance that all three have the same birthday. If we just took three times the probability of one pair having the same birthday, we'd be counting that small case three times.
As you add more people to the group, the probability grows quickly, since the number of pairs grows quadratically compared to the number of people. However, the amount that the total probability is reduced by the overcounting effect also increases.
The probability for 2 of N people to have the same birthday is given by:
365 x 364 x 363 x ... x (366-N)
1 - -------------------------------
1 - ---------------
(365-N)! x 365N
where ! represents the factorial
We want this probability to greater than 50%, or the reverse probability to be less than 50%, or
365! < 0.5 x (365-N)! x 365N
We can use Stirling's Formula, an approximization for the factorial function that is very good for large factorials (and pretty good even for factorials of numbers as small as 5 to 10), to make evaluation of this massive equation easier.
Stirling's formula (also called Stirling's approximation) is:
N! ~ NNe-N
This is more commonly written in a logarithmic form, as:
ln N! ~ N ln N - N
(You may also see a more complex version of this formula; both versions come from truncation of Stirling's series, an exact representation of the value of the factorial, and if more of the series is retained you get the extra stuff.)
In this case, if we take the natural logarithm of both sides of the inequality we found above:
ln 365! < ln 0.5 + ln (365-N)! + N ln 365
If we then insert Stirling's approximation for ln N!:
365 ln 365 - 365 < ln 0.5 + (365-N) ln (365-N) - (365-N) + N ln 365
(365-N) ln 365 < ln 0.5 + (365-N) ln (365-N) + N
(365-N) (ln 365 - ln (365-N)) - N < ln 0.5
This number is easily calculated for various N. For N=23 the probability first goes beyond 50% that two people will have the same birthday. The 0.5 number at the right of this last equation is the reverse probability we plugged in at the beginning, so to find the number of people necessary for a 90% chance of a repeated birthday, we plug in 0.1 instead of 0.5, and for N=41 we first get more than a 90% chance of a repeated birthday.