Friday, May 11, 2012

Law of Large Numbers Explained to Kids







Let's go back to our 15-year-old student example. She knows that each time she asks her parents to go to the movie theater, there is a 50% chance that they will say yes. Since she loves going to the movies, she asks her parents every day. Does that mean that after two days she will have seen 1 movie? After 4 days, 2 movies? Maybe, but far from surely.

In fact, the variable "YES" to the question "Can I go to the movie theater?" follows a Bernoulli(0.5) distribution. After 4 days, the probability that she has seen 0 movies is not 0. It is 0.5^4, or roughly 6%. This is not big, but there is still a chance that she has to wait a few more days before seeing a movie. The chart below depicts the probability of no YES after 1, 2, 3, 4 ... 10 days.
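The numbers behind that chart are easy to recompute. Here is a minimal sketch, assuming the parents' answers are independent from one day to the next; the function name `prob_no_yes` is only an illustration, not something from the original post. Four days gives 0.5^4 = 6.25%.

```python
def prob_no_yes(days, p_yes=0.5):
    """Probability of hearing no YES at all in `days` independent asks."""
    return (1 - p_yes) ** days


# Probability of still having seen no movie after 1, 2, ..., 10 days.
for days in range(1, 11):
    print(f"{days:2d} days: {prob_no_yes(days):.2%}")
```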






However, we know that if she repeats the experiment every day for an entire year, she will most likely have seen about 180 movies during the year (365 * 0.5 = 182.5 on average). This is the Law of Large Numbers. If you repeat the same experiment (asking to go to the movie theater) a large number of times, the proportion of "YES" (the arithmetic mean) will tend toward the theoretical probability of getting a "YES".

In the chart below, I depict the proportion of "YES" after 1, 2, 3, ... 365 days. We see that after a full year, the proportion of "YES" settles at roughly 50%. However, if the experiment is repeated only a limited number of times, the proportion of "YES" can deviate significantly from 50%.
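A chart like this one can be reproduced with a small simulation. This is only a sketch under stated assumptions: each day is an independent coin flip with a 50% chance of YES, and the function name `running_proportion` and the seed are made up for the illustration, so the exact path will differ from the chart in the post.

```python
import random


def running_proportion(days=365, p_yes=0.5, seed=2012):
    """Simulate one ask per day and return the proportion of YES after each day."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    yes_count = 0
    proportions = []
    for day in range(1, days + 1):
        if rng.random() < p_yes:  # the parents say YES today
            yes_count += 1
        proportions.append(yes_count / day)
    return proportions


props = running_proportion()
for day in (1, 2, 4, 10, 30, 100, 365):
    print(f"day {day:3d}: proportion of YES = {props[day - 1]:.3f}")
```

The first few days can sit far from 0.5 (after day 1 the proportion is necessarily 0 or 1), while the later values drift toward 0.5.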





Music !


Friday morning commute song: White Trash Story by Casey Donahew.


Thursday, May 10, 2012

The Central Limit Theorem




I know of scarcely anything so apt to impress the imagination as the wonderful form of cosmic order expressed by the "Law of Frequency of Error". The law would have been personified by the Greeks and deified, if they had known of it. It reigns with serenity and in complete self-effacement, amidst the wildest confusion. The huger the mob, and the greater the apparent anarchy, the more perfect is its sway. It is the supreme law of Unreason. Whenever a large sample of chaotic elements are taken in hand and marshaled in the order of their magnitude, an unsuspected and most beautiful form of regularity proves to have been latent all along.


Suppose you are a 15-year-old girl in your fourth year at a mixed college. There are 30 students in your class. During an amazing Statistics class, you start looking around and figure out that there are 11 girls and 19 boys. Assuming that the probability of giving birth to a girl in Quebec is roughly 50%, how come girls make up only 37% of your class?

You text your friend sitting in the other class across the hall, and ask her how many girls there are in her class. There are 20 out of 30 (or 67%).

You start a Facebook page and ask all 15-year-old students across Quebec what the proportion of girls in their classes is. You get 500 answers and start plotting the distribution of the percentages of girls in each class. On the x-axis you have the proportion of girls in a given class, and on the y-axis you have the share of the 500 answers that reported that proportion of girls.


You start putting everything together and you finally figure out that the variable "being a girl" is a discrete random variable with only two possible outcomes: yes or no (or 1 or 0). And it is a yes with a 50% chance. It is called a Bernoulli(p) variable, where p is 50%. Wow! In other words, if you randomly pick a 15-year-old person in the population, you have 1 chance in 2 of picking a girl. Big news.

You continue your investigation and find out that if you add up n independent Bernoulli(p) random variables, you end up with a variable coming from a Binomial(n,p) distribution. For example, the number of girls in a random sample of thirty 15-year-old kiddos follows a Binomial(30,0.5) distribution. The expected value of the number of girls in such a sample is 30*0.5 = 15 (or 50% in percentages)!
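To see how unusual a class with only 11 girls is under this model, the Binomial(30,0.5) probabilities can be computed directly. This is a minimal sketch, assuming each student is an independent draw with a 50% chance of being a girl; the helper `binomial_pmf` is written for this illustration only.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 30, 0.5
expected_girls = n * p  # mean of a Binomial(n, p) variable

# Chance that a class of 30 has 11 girls or fewer.
prob_11_or_fewer = sum(binomial_pmf(k, n, p) for k in range(0, 12))

print(f"expected number of girls: {expected_girls}")
print(f"P(exactly 11 girls)     : {binomial_pmf(11, n, p):.4f}")
print(f"P(11 girls or fewer)    : {prob_11_or_fewer:.4f}")
```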

Now, let's assume for the moment that the 500 answers that you got earlier come from 500 classes of 30 students each. The numbers of girls behind these 500 answers are 500 random variables coming from the same Binomial(30,0.5) distribution.

It starts becoming really interesting when you look at the distribution of those 500 proportions. They are basically 500 realizations of the mean of 30 random variables coming from a Bernoulli(0.5) distribution.

The Central Limit Theorem tells us that the statistical distribution of those 500 means is approximately a Normal Distribution with mean 0.5 and variance 0.5*(1-0.5) / 30.

The Central Limit Theorem basically states that, regardless of the underlying distribution of the independent observations in a sample (as long as it has a finite variance), the mean will approximately follow a Normal Distribution once the sample is large enough. Isn't it a powerful theorem!
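The Facebook survey can be imitated with a quick simulation. The sketch below assumes 500 classes of exactly 30 students, each student being a girl independently with probability 0.5; the function name and the seed are made up for the illustration. It compares the mean and variance of the 500 simulated proportions with the mean 0.5 and variance 0.5*(1-0.5) / 30 given above.

```python
import random
import statistics


def simulate_class_proportions(n_classes=500, class_size=30, p_girl=0.5, seed=2012):
    """Return the proportion of girls in each of n_classes simulated classes."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    proportions = []
    for _class in range(n_classes):
        girls = sum(1 for _student in range(class_size) if rng.random() < p_girl)
        proportions.append(girls / class_size)
    return proportions


proportions = simulate_class_proportions()

clt_mean = 0.5
clt_variance = 0.5 * (1 - 0.5) / 30

print(f"simulated mean    : {statistics.mean(proportions):.4f} (theory {clt_mean})")
print(f"simulated variance: {statistics.variance(proportions):.5f} (theory {clt_variance:.5f})")
```

A histogram of `proportions` should look roughly bell-shaped around 0.5, which is what the plot of the 500 answers shows.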




Wednesday, May 9, 2012

Simpson's Paradox

Simpson's Paradox Explained (Civil Rights Act of 1964)




Here is the breakdown of the votes by the House of Representatives by Region and by Party:


How come the Democrats voted "yes" in higher proportions in both regions, yet the Republicans voted "yes" in higher proportions overall?

This is known as Simpson's Paradox. The direction of the association between the variable "Party" and "Yes" is reversed when the data are aggregated (that is, when the Regions are analyzed together).

It happens because the correlation between the column variable (Party) and "Yes" is weaker than the correlation between the row variable (Region) and "Yes", and because the two parties are not spread evenly across the regions. In other words, "Region" is more strongly correlated with "Yes" than "Party" is, so the overall figures mostly reflect where each party's representatives come from.
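The reversal can be checked with a few lines of arithmetic. Important assumption: the counts below are the House figures as they are commonly cited (for example in the Wikipedia article on Simpson's paradox). They are not copied from the table in this post, so treat them as an assumption and check them against the original breakdown.

```python
# (region, party) -> (yes votes, total votes).
# ASSUMPTION: commonly cited House counts, not taken from this post's table.
votes = {
    ("North", "Democrat"): (145, 154),
    ("North", "Republican"): (138, 162),
    ("South", "Democrat"): (7, 94),
    ("South", "Republican"): (0, 10),
}


def yes_rate(party, region=None):
    """Share of YES votes for a party, in one region or over all regions."""
    yes = total = 0
    for (vote_region, vote_party), (n_yes, n_total) in votes.items():
        if vote_party == party and region in (None, vote_region):
            yes += n_yes
            total += n_total
    return yes / total


for region in ("North", "South", None):
    label = region or "Overall"
    print(
        f"{label:8s} Democrats {yes_rate('Democrat', region):.0%}, "
        f"Republicans {yes_rate('Republican', region):.0%}"
    )
```

With these counts the Democrats are ahead within each region, but most Southern representatives are Democrats and the South voted mostly "no", which drags the Democrats' overall rate below the Republicans'.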

More info on Wikipedia

The Birthday Paradox

What is the probability that at least two people have the same birthday (month and day) in a group of 25 people?



First compute the probability of having no pairs within the 25 people. Imagine that the 25 people are ordered from number 1 to number 25. For individual 1, the probability of having a birthday distinct from all previous individuals is 1 (there is no previous individual). For the second individual, the probability of having a birthday distinct from the previous individual is 364/365 (there is 1 chance in 365 that she has the same birthday). For the third individual, the probability of having a birthday distinct from the previous 2 individuals is 363/365, and so forth.

The joint probability of seeing no pairs in a group of 25 people is simply the product of all 25 individual probabilities (the 25th individual must avoid 24 birthdays, so the last factor is 341/365):

 1 * 364/365 * 363/365 * ... * 341/365 = 43.1%

Then, the probability of seeing at least one pair is 1 - 0.431 = 56.9%. This is more than you would think. Try it on your friends.
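The product is tedious by hand but short in code. Here is a minimal sketch, assuming 365 equally likely birthdays (leap years ignored) and independent people; the function name `prob_shared_birthday` is made up for the illustration.

```python
def prob_shared_birthday(group_size, days_in_year=365):
    """Probability that at least two people in the group share a birthday."""
    prob_all_distinct = 1.0
    for already_seen in range(group_size):
        # The next person must avoid the birthdays of everyone seen so far.
        prob_all_distinct *= (days_in_year - already_seen) / days_in_year
    return 1 - prob_all_distinct


for group_size in (5, 10, 23, 25, 40):
    print(f"{group_size:2d} people: {prob_shared_birthday(group_size):.1%}")
```

For 25 people this gives about 56.9%, the figure computed above.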

This is known as the Birthday Paradox. There are plenty of references to it on the web.

Wikipedia link to the Birthday Paradox