Statistical wisdom and Weldon’s dice


I went to a library talk by my friend Jonathan Skinner, a professional statistician, reviewing Stephen Stigler’s book The Seven Pillars of Statistical Wisdom. One thing I enjoyed was his quoting Christopher Hitchens: “What can be asserted without evidence can also be dismissed without evidence.”

I also learned how the Arctic and Antarctic got their names. Skinner said Aristotle named them, based on the Greek word for “bear.” That surprised me; could Aristotle have been aware of the poles’ existence? And how could he have known about polar bears? But when I mentioned this to my (smarter) wife, she suggested the “bear” reference was to a constellation. I checked Wikipedia, and while the Greek origin is correct, there was no mention of Aristotle. And of course my wife was right.

Skinner discussed a basic statistical concept, probability, using dice as his example. This reminded me of Tom Stoppard’s play Rosencrantz and Guildenstern Are Dead, and Guildenstern’s repeated coin flips, which come up heads every time. Not statistically impossible, but increasingly unlikely as the number of flips mounts. Stoppard is jesting with the laws of probability.

Of course they tell us heads and tails should be 50-50. But I also remembered a guy who wrote in to Numismatic News, doubting that theory and reporting his own test. He flipped a coin 600 times and got 496 heads! Of course, the probability of that result is not zero. But I actually calculated it: the chance of getting exactly 496 heads in 600 fair flips is one divided by 6.672 times 10 to the 61st power. For readers not mathematically inclined, that’s an exceedingly tiny probability; 10 to the 61st power means 1 followed by 61 zeroes.
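For the curious, that figure is easy to check: the chance of exactly k heads in n fair flips is the binomial count C(n, k) divided by the 2^n possible flip sequences. A quick sketch in Python:

```python
from math import comb

# Chance of exactly 496 heads in 600 fair coin flips:
# "600 choose 496" favorable orderings, out of 2**600 equally likely sequences.
p = comb(600, 496) / 2**600

print(p)      # roughly 1.5e-62
print(1 / p)  # roughly 6.7e61, matching the odds quoted above
```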

However, that guy, as if to flaunt his scientific rigor, explained his procedure: on each of his 600 tosses, he methodically started with the coin in the heads-up position, and then . . . well, enough said.

But Skinner related a similar tale, of Frank Weldon, who in 1894 really did try to put the theory to a rigorous test. He rolled a dozen dice 26,306 times and recorded the results. That huge effort would make him either a martyr to science or a fool (like the Numismatic News guy) because, after all, what is there to test? Is there any sane reason to doubt what such a simple probability calculation dictates?

However, Skinner quoted Yogi Berra: “In theory, theory and practice are the same. In practice they are not.”

Well, guess what. Weldon found the numbers five and six over-represented. With six faces to each die, any two given faces together should come up one-third of the time, or 33.33%. But Weldon got 33.77%. You might think that’s a minor deviation, down to random chance. But statisticians have mathematical tools to test for that, i.e., whether a result is “statistically significant.” And the odds against Weldon’s result were calculated to be 64,499 to one.
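The standard tool here is Pearson’s chi-squared test. As a rough sketch (not the original analysis behind the quoted odds, which examined Weldon’s data in finer detail), here is the simplest two-category version; the count of fives and sixes below is inferred from the 33.77% figure, so it is an approximation:

```python
import math

rolls = 12 * 26306                 # 315,672 individual die rolls
observed = round(0.3377 * rolls)   # about 106,602 fives and sixes, inferred from 33.77%
e_hit = rolls / 3                  # expected fives and sixes if the dice are fair
e_miss = 2 * rolls / 3             # expected ones through fours

# Pearson chi-squared statistic for the two categories (1 degree of freedom)
chi2 = (observed - e_hit)**2 / e_hit + ((rolls - observed) - e_miss)**2 / e_miss

# For 1 degree of freedom the p-value is erfc(sqrt(chi2/2))
p = math.erfc(math.sqrt(chi2 / 2))

print(chi2)  # about 27
print(p)     # far below 0.05: Weldon's excess is very unlikely to be chance
```

Even this crude two-category version puts the deviation well beyond ordinary sampling noise.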

So another fool (er, researcher), Zacariah Labby, decided to repeat Weldon’s experiment, but this time using machinery to roll the dice, and a computer to tabulate the results. He got 33.43%, a smaller deviation, but still statistically significant.

How can this be explained? It had been suggested that the small concave “pips” denoting the numbers on the die faces might affect the results. Indeed, when Labby measured his die faces with highly accurate equipment, he found the dice were not perfectly cubical.

But don’t rush out to a casino to try to capitalize on the Weldon/Labby deviation. Labby concluded his paper by noting that casinos use more precisely engineered dice, without concave pips, to scotch any such bias.

One Response to “Statistical wisdom and Weldon’s dice”

  1. Lee Says:

    In 12×26,306 = 315,672 rolls of a die, Zacariah Labby gets a five or six 105,528 times. This is only 304 more times than would be expected for the fair probability of 1/3, and has a chi-squared value of 1.32 for one degree of freedom. This chi-squared value of 1.32 or more extreme happens about 25% of the time, which is not terribly rare at all.
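    The commenter’s arithmetic is easy to verify; a sketch assuming only the counts quoted above:

```python
import math

rolls = 12 * 26306        # 315,672 individual die rolls
observed = 105528         # fives and sixes in Labby's data, per the comment
expected = rolls / 3      # 105,224 expected for fair dice

# Pearson chi-squared for the two categories (5-or-6 vs. everything else), 1 df
other = rolls - observed
chi2 = (observed - expected)**2 / expected + (other - 2 * rolls / 3)**2 / (2 * rolls / 3)

# p-value for 1 degree of freedom: erfc(sqrt(chi2/2))
p = math.erfc(math.sqrt(chi2 / 2))

print(round(chi2, 2))  # 1.32
print(round(p, 2))     # 0.25 -- not terribly rare, as the comment says
```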

    Failing to find what he wanted to find, he then begins testing other things. This is a major “no no” in statistics because if you try enough things you will eventually find something a little weird. That is, if you try 100 things you should not be surprised if 1 or 2 of them appear to be a 1-in-100 long shot. One thing he tries is the distribution of how many fives or sixes are seen in each set of 12 rolls. The thirteen counts that he gets, for each of 0 through 12 possible high rolls, are also as expected, having a chi-squared value of 6.19 for 12 degrees of freedom. In fact rather than indicating bias, they are tamer than usual; that amount of tameness is expected to occur only about 9% of the time.
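    The 9% figure can also be checked in a few lines. For an even number of degrees of freedom, the chi-squared tail has a closed form via a Poisson sum; a sketch, assuming the chi-squared value of 6.19 quoted above:

```python
import math

def chi2_sf(x, df):
    """Survival function P(X >= x) for a chi-squared variable with an
    even number of degrees of freedom, using the Poisson-tail identity:
    P(X >= x) = P(Poisson(x/2) < df/2)."""
    lam = x / 2.0
    return math.exp(-lam) * sum(lam**i / math.factorial(i) for i in range(df // 2))

p_upper = chi2_sf(6.19, 12)  # chance of a chi-squared of 6.19 or larger
p_lower = 1 - p_upper        # chance of a result this "tame" or tamer

print(round(p_lower, 2))  # 0.09, i.e. about 9% of the time
```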

    Undeterred, he looks for another test to show that things aren’t random. He tallies the occurrences of 1, 2, 3, 4, 5, and 6 separately. Here he claims to find that the distribution is not uniform. I say “claims” because I cannot check it: the one result reported to be statistically significant is the one for which he gave no raw counts, only approximate percentages.

    I will give him the benefit of the doubt and attribute this all to sloppiness rather than sensationalism.
