Posts Tagged: probability


30
Aug 10

The Chosen One

Toss one hundred different balls into your basket. Shuffle them up and select one with equal probability amongst the balls. That ball you just selected, it’s special. Before you put it back, increase its weight by 1/100th. Then put it back, mix up the balls and pick again. If you do this enough, at some point there will be a consistent winner which begins to stand out.

The graph above shows the results of 1000 iterations with 20 balls (each victory increases the weight of the winner by 5%). The more balls you have, the longer it takes before a clear winner appears. Here’s the graph for 200 balls (0.5% weight boost for each victory).

As you can see, in this simulation it took about 85,000 iterations before a clear winner appeared.

I contend that as the number of iterations grows, the probability of seeing a Chosen One approaches unity, no matter how many balls you use. In other words, for any number of balls, a single one of them will eventually see its relative weight, compared to the others, diverge. Can you prove this is true?

BTW this is a good Monte Carlo simulation of the Matthew Effect (no relation).

Here is the code in R to replicate:

numbItems = 200
items = 1:numbItems
itemWeights = rep(1/numbItems,numbItems) # Start out uniform
iterations = 100000
itemHistory = rep(0,iterations)
 
for(i in 1:iterations) {
	chosen = sample(items, 1, prob=itemWeights)
	itemWeights[chosen] = itemWeights[chosen] + (itemWeights[chosen] * (1/numbItems))
	itemWeights = itemWeights / sum(itemWeights) # re-Normalze
	itemHistory[i] = chosen
}
 
plot(itemHistory, 1:iterations, pch=".", col="blue")

After many trials using a fixed large number of balls and iterations, I found that the moment of divergence was amazingly consistent. Do you get the same results?


12
Jun 10

A different way to view probability densities

The standard, textbook way to represent a density function looks like this:

Perhaps you have seen this before? (Plot created in R, all source code from this post is included at the end). Not only will you find this plot in statistics books, you’ll also see it in medical texts, sociology, and even economics books. It gives you a clear view of how likely an observation is to fall in a particular range of x. So what’s the problem?

The problem is that what usually concerns us isn’t probability in isolation. What matters is the impact that observations have on some other metric of importance, like the total or average. The key thing we want to know about a distribution is: What range of observations will contribute the most to our expected value, and in what way? We want a measure of influence.

Here’s the plot of the Cauchy density:

From this view, it doesn’t look all that different from the Normal. Sure it’s a little more narrow, with “fatter tails”, but no radical difference, right? Of course, the Cauchy is radically different from the normal distribution. Those slightly fatter tails give very little visual indication that the Cauchy is so extreme-valued that it has no expected value. Integrating to find the exception gives you infinity in both directions. If your distribution is like this, you’ve got problems and your plot should tell you that right away.

Here’s another way to visualize these two probability distributions:

Go ahead and click on the image above to see the full view. I’ll wait for you…

See? By plotting the density multiplied by the observation value on the y-axis, you get a very clear view of how the different ranges of the function effect the expectation. Looking at these, it should be obvious that the Cauchy is an entirely different beast. In the normal distribution, extreme values are so rare as to be irrelevant. This is why researchers like to find ways to treat their sample as normally distributed: a small sample gives enough information to tell the whole story. But if your life (or livelihood) depends on a sum or total amount, you’re probably best off plotting your (empirical) density in the way shown above.

Another bit of insight from this view is that the greatest contribution to the expectation comes at 1 and -1, which in the case of the Normal isn’t the mean, but rather the second central moment (plus or minus). That’s not a coincidence, but it’s also not always the case, as we shall see. But first, what do things look like when a distribution gets completely out of hand?

The Student’s t distribution, on 1 Degree of Freedom , is identical to the Cauchy. But why stop at a single DF? You can go all the way down to the smallest (positive) fraction.

The closer you get to zero, the flatter the curve gets. Can we ever flatten it out completely? Not for a continuous distribution with support over an infinite range. Why not? Because in order for the slope of value * density to continue to flatline it indefinitely, the density function would have to be some multiple of \frac{1}{x}, and of course the area under this function diverges as we go to infinity, and densities are supposed to integrate to 1, not infinity, right?

What would the plot look like for a continuous function that extends to infinity in just one direction? Here’s the regular Exponential(1) density function plot:

Now look at the plot showing contribution to expectation:

Were you guessing it would peak at 1?  Again, the expectation plot provides insight into which ranges of the distribution will have the greatest impact on our aggregate values.

Before I go look at an a discrete distribution, try to picture what the expectation curve would look like for the standard Uniform(0,1) distribution. Did you picture a diagonal line?

Can we flatten things out completely with an infinitely-supported discrete distribution? Perhaps you’ve heard of the St. Petersburg Paradox. It’s a gambling game that works like this: you flip a coin until tails comes up. If you see one head before a tails, you get $1. For 2 heads you get $2, for 3 heads $4, and so on. The payoff doubles each time, and the chances of reaching the next payoff are halved. The paradox is that even though the vast majority of your winnings will be quite modest, your expectation is infinite.  The regular view of the probability mass function for provides almost no insight:

But take a look at the expectation plot:

Flat as a Nebraska wheat field. You can tell right away that something unusual is happening here.

I could go on with more examples, but hopefully you are beginning to see the value in this type of plot. Here is the code, feel free to experiment with other distributions as well.

# Useful way to make dots look like a line
x = seq(-5,5,length=1500)
 
# You've seen this before. Our good friend the Normal
plot(x,dnorm(x),pch=20,col="blue", main="Standard Normal density function")
 
# Cauchy looks a little different, but it's not obvious how different it is 
plot(x,dcauchy(x),pch=20,col="blue", main="Cauchy density function")
 
# New way of plotting the same
plot(x,dnorm(x)*x,pch=20,col="blue", main="Normal density: contribution to expectation")
abline(h=0,lty="dashed",col="gray")
 
plot(x,dcauchy(x)*x,pch=20,col="blue", main="Cauchy density: contribution to expectation")
abline(h=0,lty="dashed",col="gray")
 
# Extreme student-t action:
plot(x,dt(x,0.001)*x,pch=20,col="blue", main="Student's t on 0.001 d.f. contribution to expectation")
abline(h=0,lty="dashed",col="gray")
 
 
# The Exponential
x = seq(0,10,length=1500)
plot(x,dexp(x,1),pch=20,col="blue", main="Standard Exponential density function")
 
# The expectation view:
plot(x,dexp(x,1)*x,pch=20,col="blue", main="Exponential mass contribution to expectation")
 
# What do we see with the St. Petersburg Paradox
x = 2^(0:30)
dStPete <- function(x) {
	return (1/(2*x))
}
 
# Note the log
plot(x,dStPete(x),pch=20,col="blue", main="St. Petersburg mass function", log="x", xlab="Payoff", ylab="Probability",ylim=c(0,.5))
 
# Now we see the light
plot(x,dStPete(x)*x,pch=20,col="blue", main="St. Petersburg mass fcn: contribution to expectation", xlab="Payoff", log="x", ylab="Payoff times probability",ylim=c(0,.5))
abline(h=0,lty="dashed",col="gray")

30
Apr 10

How many girls, how many boys?

I found this interesting question over here at mathoverflow.net. Here’s the question:

If you have a country where every family will continue to have children until they get a boy, then they will stop. What is the proportion of boys to girls in the country.

First off, there are some assumptions you need to make that aren’t stated in the problem. The most important one is that boys are just as likely as girls to be born. This is empirically false, but there’s nothing wrong with assuming it for the problem, so long as the assumption is acknowledged.

My first thought about solving the problem was to think about Martingales and stopping times, but that’s more complication than you need. If you look at it from the point of view of expectation things are simpler.The probability of having a boy first is 1/2, at which point the family stops and you have a ratio of 1 boy per 1 births. The probability of having one girl, then one boy is 1/4, at which point you have a ratio of 1 boy per 2 births. Multiplying the probabilities by the ratios and summing from 1 birth to infinity, you get an expectation of approximately 69.31% boys.

Problem is, this is the expectation for a single family. Because families who have more children (and thus more girls) contribute disproportionately to the pool of children, one family is a biased estimator for the proportion in the entire population. Douglas Zare at the above-linked mathoverflow question does a good job of working out the details for a country with an arbitrary number of families. Here is what he comes up with for the percentage of girls:

Where k is the number of families in the country, and  is the digamma function.

To be true to the new motto of this site, I decided to test this out in R using a Monte Carlo method. Here is my code:

And here is the resulting graph:

Looks like a good match.

Maybe you noticed that at the beginning I mentioned assumptions, as in more than one. We are also assuming that that all of the boys and girls, no matter how old, are still considered boys and girls. All of the parents were already in the country at the beginning of the problem, and then they all started having children until they had a boy and stopped. None of these children have had any children. The process is complete, and the new generation is the last. Obviously even if parents in a country did follow the rule of "babies until boy then stop", the results wouldn't match the theoretical because at any given moment there are many families in the process of still having kids. This is where, if I were so inclined or needed a more accurate model, I would dive back into the Martingale issue and things would get messy.