Probability and statistics blog

Borel-Cantelli and Annihilation Events and more

admin — Wed, 21 Dec 2022 15:02:43 +0000

As should be clear from the timestamps on previous posts, this blog is now mostly in hibernation mode. Meanwhile, I have been writing and recording podcast episodes, sometimes about topics related to what you used to find on this blog. If you want to continue to follow my thoughts on topics like how to evaluate the contents of black boxes, the proper understanding of tail risks, and the use of humans as randomization devices, please subscribe to my substack and add my podcast to your favorite listening device.

Here’s a sampling of those articles and episodes:

• Borel-Cantelli and Annihilation Events

• Institutionalized Tail Risk and the Black Swan Superhighway

• The Pleasures and Perils of Arbitrage (coming soon to my substack)

• Betting Against Pascal

• Black Box Thinking, UFOs, and a Fist Full of Dung

• Deborah Mayo on Error, Replication, and Severe Testing

• Scott Aaronson on the Hunt for Real Randomness

• Russ Roberts on the Curious Task of Epistemology

• Andrew Gelman on Data, Modeling, and Uncertainty Amidst the Forking Paths

• Cargo Cults

Andrew Gelman and other interviews

admin — Wed, 07 Oct 2020 13:23:01 +0000

In addition to updating this blog once every few years, I have a more regular podcast, The Filter. While it isn’t just about statistics, readers of this blog might find the following episodes, among others, interesting:

To subscribe to the podcast, search for The Filter at your favorite podcast directory or add the RSS feed directly.

The epistemology crisis

admin — Mon, 26 Nov 2018 15:37:12 +0000

We have a crisis of epistemology. A tsunami of bad tools, bad ideas, biased actors, and unresolved problems. Among our many issues, we have: Predictions treated as facts, and inherently fuzzy historical data presented without error bars. Small scale studies on college students and professional guinea pigs extrapolated out to whole populations. Overused assumptions of normality and linearity, a holdover from when computation was hard. Scientific consensus treated as sacrosanct, theories with irrefutable tenets that adapt to all conceivable data, bad math that always skews in the direction of orthodoxy, and heretics burned in reputation and job prospects. The ongoing scandals of p-hacking, and the significance cliff itself, along with public confusion over significance versus effect size. The replicability crisis in the social sciences. Overconfidence is everywhere, with extreme predictions given publicity and bad predictions buried. Even basic questions, like how to correctly deal with outliers, let alone define them non-arbitrarily, remain unresolved.

Visualising random variables, Terence Tao style

admin — Tue, 24 May 2016 17:58:45 +0000

Recently mathematician Terence Tao posted some ruminations on how to visualize the different values a random variable could take. He created some basic animated loops that cycled through some samples from the distribution, and proposed a way to represent conditionality as well.

I liked the idea so much I’ve added it to my probability distributions JS library. The numbers can be shown directly, or interpreted as waiting times where each arrival is shown with a flashing symbol.

For documentation and examples see http://statisticsblog.com/probability-distributions/#visualize

If you use this feature do me a favor and let me know.

Probability Podcast Ep2: Imprecise probabilities with Gert de Cooman

admin — Tue, 15 Mar 2016 17:54:25 +0000

I happened to be travelling through Brussels, so I stopped by Ghent, the world hotspot for research into imprecise probabilities, and set up an interview with Gert de Cooman. Gert has been working in imprecise probabilities for more than twenty years, is a founding member and former President of SIPTA, the Society for Imprecise Probability: Theories and Applications, and has helped organize many of the ISIPTA conferences and SIPTA Schools.

Topics include fair betting rates, Dutch books, Monte Carlo methods, Markov chains, utility, and the foundations of probability theory. We had a rich, wide-ranging discussion. You may need to listen two (or more!) times to process everything.

Episode on SoundCloud

Random samples in JS using R functions

admin — Thu, 15 Oct 2015 21:35:54 +0000

For a JavaScript-based project I’m working on, I need to be able to sample from a variety of probability distributions. There are ways to call R from JavaScript, but they depend on the server running R. I can’t depend on that. I need a pure JS solution.

I found a handful of JS libraries that support sampling from distributions, but nothing that lets me use the R syntax I know and (mostly) love. Even more importantly, I would have to trust the quality of the sampling functions, or carefully read through each one and tweak as needed. So I decided to create my own JS library that:

• Conforms to R function names and parameters – e.g. rnorm(50, 0, 1)
• Uses the best entropy available to simulate randomness
• Includes some non-standard distributions that I’ve been using (more on this below)
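
For readers who haven't used R, the naming convention being mirrored looks like this in R itself. This is only a sketch of the R side; the JS library's exact call signatures aren't reproduced in this post, so nothing here should be read as its API.

```r
# R's sampling functions share one pattern: r<dist>(n, parameters...)
set.seed(1)                          # fix the seed so the draws are repeatable
normal_draws  <- rnorm(50, 0, 1)     # 50 draws, mean 0, standard deviation 1
uniform_draws <- runif(5, 0, 10)     # 5 draws between 0 and 10
coin_flips    <- rbinom(10, 1, 0.5)  # 10 single-trial binomial draws (0 or 1)
```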

I’ve made this library public on GitHub and npm.

Not a JS developer? Just want to play with the library? I’ve set up a test page here.

Please keep in mind that this library is still in its infancy. I’d highly recommend you do your own testing on the output of any distribution you use. And of course let me know if you notice any issues.

In terms of additional distributions, these are marked “experimental” in the source code. They include the unreliable friend and its discrete cousin the FML, a frighteningly thick-tailed distribution I’ve been using to model processes that may never terminate.

Arbitrage anyone?

admin — Mon, 28 Sep 2015 18:14:40 +0000

I’m looking to put together a small crew to take on a large arbitrage project. The (rough) model for this would be the “Hong Kong Syndicate,” which took on the horse betting market. To be involved you have to be willing to make a large commitment in terms of time or money (I plan to contribute both). I have a set of proposed guidelines for identifying potential arbitrage targets; send me an email for more info.

UPDATE (Oct 5): Lots of interest in this, hope to finalize a core team this week. Let me know right away if you’re interested. Also, this would not necessarily be targeted at horse racing.

Guide for new users posted

admin — Thu, 19 Feb 2015 21:13:03 +0000

If you are a first-timer here at StatisticsBlog.com, or if you’re looking for a list of Greatest Hits, check out the shiny new Start Here page.

Can pregnant women intuit the sex of their children?

admin — Fri, 12 Dec 2014 00:50:01 +0000

“So let’s start with the fact that the study had only 100 people, which isn’t nearly enough to be able to make any determinations like this. That’s very small power. Secondly, it was already split into two groups, and the two groups by the way have absolutely zero scientific basis. There is no theory that says that if I want a girl or if I want a boy I’m going to be better able at determining whether my baby is in fact a girl or a boy.”

– Maria Konnikova, speaking on Mike Pesca’s podcast, The Gist.

Shown at top, above the quote by Konnikova, is a simulation of the study in question, under the assumption that the results were completely random (the null hypothesis). As usual, you’ll find my code in R at the bottom. The actual group of interest had just 48 women. Of those, 34 correctly guessed the sex of their gestating babies. The probability that you’d get such an extreme result by chance alone is represented by the light green tails. To be conservative, I’m making this a two-tailed test, and considering the areas of interest to be either that the women were very right, or very wrong.

The “power” Konnikova is referring to is the “power of the test.” Detecting small effects requires a large sample; detecting larger effects can be done with a much smaller sample. In general, the larger your sample size, the more power you have. If you want to understand the relationship between power and effect size, I’d recommend this lovely video on the power of the test.
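
To make the trade-off between sample size and effect size concrete, here is a minimal sketch in R of the power of a two-sided exact binomial test against a fair-coin null. The function name power_binom, the 5% significance level, and the hypothetical true accuracies of 70% and 55% are assumptions chosen for illustration; none of them come from the study.

```r
# Power of a two-sided exact binomial test of H0: p = 0.5.
# n      = number of women in the study
# p_true = hypothetical true probability of a correct guess (an assumption)
# alpha  = significance level (0.05 assumed here)
power_binom <- function(n, p_true, alpha = 0.05) {
  k <- 0:n
  # Largest count whose lower-tail probability under H0 is at most alpha / 2;
  # -1 means no count is extreme enough, so the test can never reject
  lower_cut <- max(c(-1, k[pbinom(k, n, 0.5) <= alpha / 2]))
  upper_cut <- n - lower_cut  # mirror image, since H0 is symmetric
  # Probability of landing in either rejection tail when the truth is p_true
  pbinom(lower_cut, n, p_true) +
    pbinom(upper_cut - 1, n, p_true, lower.tail = FALSE)
}

power_binom(48, 0.70)   # large effect, small sample
power_binom(48, 0.55)   # small effect, same sample
power_binom(480, 0.55)  # small effect, ten times the sample
```

Under these assumptions, 48 women give the test a good chance of catching an effect as large as 70% accuracy but little chance of catching 55%, and the larger sample recovers much of that lost power.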

As it turns out, Konnikova’s claims notwithstanding, study authors Victor Shamas and Amanda Dawson had plenty of power to detect what turns out to be a very large effect. Adding together the two green areas in the tails, their study has a p-value of about 0.005. This is a full order of magnitude beyond the generally used threshold for statistical significance. Their study found strong evidence that women can guess the sex of their babies-to-be.
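
The simulated tail area can be cross-checked against an exact binomial calculation. A minimal sketch, assuming the same setup as the simulation (48 women, 34 correct guesses, and a fair coin under the null):

```r
n <- 48        # women in the group of interest
correct <- 34  # correct guesses

# Two-tailed: at least 34 right, or at most 14 right (equally far from 24)
upper_tail <- pbinom(correct - 1, n, 0.5, lower.tail = FALSE)  # P(X >= 34)
lower_tail <- pbinom(n - correct, n, 0.5)                      # P(X <= 14)
p_value <- upper_tail + lower_tail
p_value

# The built-in exact test reports the same two-sided p-value
binom.test(correct, n, p = 0.5)$p.value
```

Both come out at roughly 0.0055, in line with the "about 0.005" read off the simulation.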

Is this finding really as strong as it seems? Perhaps the authors made some mistake in how they set up the experiment, or in how they analyzed the results.

Since apparently Konnikova failed not only to do statistical analysis, but also basic journalism, I decided to clean up on that front as well. I emailed Dr. Victor Shamas to ask how the study was performed. Taking his description at face value, it appears that the particular split of women into categories was baked into the study design; this wasn’t a case of “p-value hacking”, as Konnikova claimed later on in the podcast.

Konnikova misses the entire point of this split, which she says has “absolutely zero scientific basis.” The lack of an existing scientific framework to assimilate the results of the study is meaningless, since the point of the study was to provide evidence (or not) that our scientific understanding lags behind what women seem to intuitively know.

More broadly, the existence of causal relationships does not depend in any way on our ability to understand or describe (model) them, or on whether we happen to have an existing scientific framework to fit them in. I used to see this kind of insistence on having a known mechanism as a dumb argument made by smart people, but I’m coming to see it in a much darker light. The more I learn about the history of science, the more clear it becomes that the primary impediment to the advancement of science isn’t the existence of rubes, it’s the supposedly smart, putatively scientific people who are unwilling to consider evidence that contradicts their worldview, their authority, or their self-image. We see this pattern over and over, perhaps most tragically in the unwillingness of doctors to wash their hands until germ theory was developed, despite evidence that hand washing led to a massive reduction in patient mortality when assisting with births or performing operations.

Despite the strength of Shamas and Dawson’s findings, I wouldn’t view their study as conclusive evidence of the ability to “intuit” the sex of your baby. Perhaps their findings were a fluke, perhaps some hidden factor corrupted the results (did the women get secret ultrasounds on the sly?). Like any reasonable scientist, Shamas wants to do another study to replicate the findings, and told me that he has a specific follow-up in mind.

Code in R:

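# Simulate the study under the null hypothesis:
# 48 women, each guessing correctly with probability 0.5
trials = 100000
results = rep(0,trials)
for(i in 1:trials) {
	results[i] = sum(sample(c(0,1),48,replace=TRUE))
}

# Two-tailed: share of runs at least as extreme as 34 of 48 (or 14 or fewer)
extremes = sum(results <= 14 | results >= 34)
extremes/trials

# Histogram of the simulated counts, with the extreme tails highlighted
dat <- data.frame( x=results, above=((results <= 14) | (results >= 34)))
library(ggplot2)
ggplot(dat, aes(x=x, fill=above)) + geom_histogram(breaks=seq(1,48))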
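
The simulation above can also be written without an explicit loop. A sketch of one alternative, using rbinom to draw all the counts at once; the seed is arbitrary and only there to make the run repeatable.

```r
set.seed(2014)  # arbitrary seed, for repeatability only
trials <- 100000

# Each draw is the number of correct guesses out of 48 under the null
results <- rbinom(trials, size = 48, prob = 0.5)

# Share of simulated studies at least as extreme as 34 of 48 (two-tailed)
extreme_share <- mean(results <= 14 | results >= 34)
extreme_share
```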

Labor day distribution fun

admin — Tue, 02 Sep 2014 01:41:23 +0000

Pinned, entropy augmented, digitally normal distribution, of no particular work-related use and thus perfectly suitable for today. Code in R:

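iters = 1000
sd = 2
precision = 20	# number of digits after the decimal point

results = rep(0,iters)

for(i in 1:iters) {
	# One normal draw per digit, wrapped into [0,10) and cut down to an integer
	x = floor(rnorm(precision,5,sd) %% 10)
	# Glue the digits together behind a decimal point
	results[i] = paste(c('.',x),sep="",collapse="")
}

# Convert the digit strings into numbers between 0 and 1
results = as.numeric(results)

plot(density(results,bw=.01),col="blue",lwd=3,bty="n")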