Daily Mail or the daily espresso in the morning?

An article on the BBC today about research into coffee consumption and depression led me to look up what the Daily Mail had published. The BBC headlined with “Coffee may prevent depression”, whereas the Daily Mail opened with “Coffee is good for you”.

The BBC at least mentioned that the 50,000 study participants were all US female nurses and linked to the research article, though neither revealed that the mean age of those involved was 63. The Daily Mail also adds in a bit of science which is just plain wrong, “…caffeine works like antidepressant pills by stopping the production of certain hormones such as serotonin”.

Both give the figure that those who drink between 2 and 3 cups per day were 15% less likely to suffer depression, without giving the absolute risk difference (there were only 2,607 cases of depression in total over the ten years of the study). A rough, back-of-the-envelope calculation gives: “Out of 10,000 quite old women, half of whom drink coffee, there will be 27 cases of depression amongst the non-drinkers this year and only 24 within the coffee-drinking group.” I guess that’s why I don’t write headlines.
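For anyone who wants to redo the envelope sums, here is a minimal Python sketch. It uses only the figures quoted above (50,000 participants, 2,607 cases over ten years, a 15% relative reduction) and assumes, purely for illustration, that exactly half the group drinks coffee and that the study's overall case rate applies to the group as a whole. The function name and the 50/50 split are my own assumptions, not anything taken from the study.

```python
def annual_cases(participants=50_000, cases=2_607, years=10,
                 relative_risk=0.85, cohort=10_000, drinker_share=0.5):
    """Split a cohort's expected annual cases between non-drinkers and drinkers.

    Assumption: the overall annual rate from the study applies to the cohort,
    and drinkers have `relative_risk` times the non-drinkers' rate.
    """
    total_per_year = cases / years / participants * cohort
    drinkers = cohort * drinker_share
    non_drinkers = cohort - drinkers
    # Non-drinker rate p solves: p * non_drinkers + relative_risk * p * drinkers = total
    rate = total_per_year / (non_drinkers + relative_risk * drinkers)
    return non_drinkers * rate, drinkers * relative_risk * rate


non_drinker_cases, drinker_cases = annual_cases()
print(f"non-drinkers: {non_drinker_cases:.1f}, coffee drinkers: {drinker_cases:.1f}")
```

Under these assumptions the split comes out at roughly 28 versus 24 cases a year, the same ballpark as the figures quoted above: a difference of about four cases per 10,000 women.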

The BBC article mentions other factors in the study, and how the initial headline statistic might come down once they are controlled for. The Daily Mail does not. Neither article considers whether this small difference is causal or merely correlative (perhaps coffee drinkers are less depressed because they drink coffee in the first place when meeting up with their many friends at Starbucks).

courtesy of xkcd.com


You’ve got to be in it to win it

The main UK lottery requires participants to pick 6 numbers from 1 to 49, which are then compared to the 6 that are drawn out of the machine (ignoring the bonus ball). There are prizes for matching anywhere from 3 to 6 of the numbers.

The probability of winning any prize is just under 2%, and in fact the most likely outcome is that you’ll match one or two numbers yet win nothing (which still gives a feeling of having been close). Some observations on winning the jackpot (approx 1 in 14m):

  • Unless you’re quite young, and buy your ticket in the few minutes before the deadline, you’re statistically more likely to die before the draw is made than you are to win the top prize
  • You’ve as much chance of correctly picking the one minute I’m thinking of from the first 27 years of my life as you have of selecting all 6 numbers

And of course, there is also the cliche statistic of comparing winning the lottery to being struck by lightning (about 30-60 people are struck by lightning each year in Britain of whom, on average, three may be killed. source)
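The lottery figures above can be checked with a few lines of Python. This is a sketch that, like the description above, ignores the bonus ball; the function and variable names are my own.

```python
from math import comb


def match_probability(k, picks=6, pool=49):
    """Probability of matching exactly k of the drawn numbers."""
    return comb(picks, k) * comb(pool - picks, picks - k) / comb(pool, picks)


jackpot_odds = comb(49, 6)                                   # possible tickets
p_any_prize = sum(match_probability(k) for k in range(3, 7))  # 3 to 6 matches
p_near_miss = match_probability(1) + match_probability(2)     # some matches, no prize
p_no_match = match_probability(0)
minutes_in_27_years = 27 * 365.25 * 24 * 60

print(f"jackpot: 1 in {jackpot_odds:,}")
print(f"any prize: {p_any_prize:.2%}")
print(f"one or two matches: {p_near_miss:.1%}, no matches: {p_no_match:.1%}")
print(f"minutes in 27 years: {minutes_in_27_years:,.0f}")
```

Matching one or two numbers comes out more likely than matching none at all, and there are a little over 14 million minutes in 27 years.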

3 Logicians walk into a bar

Hat tip SpikedMath

Here are some of my most popular posts and a question on maths

God does not play baseball

A bat and a ball cost $1.10 in total. The bat costs $1 more than the ball. How much does the ball cost?

The same question was used in a survey on belief in god:

“Some say we believe in God because our intuitions about how and why things happen lead us to see a divine purpose behind ordinary events that don’t have obvious human causes,” study researcher Amitai Shenhav of Harvard University said in a statement. “This led us to ask whether the strength of an individual’s beliefs is influenced by how much they trust their natural intuitions versus stopping to reflect on those first instincts.”

Shenhav and his colleagues investigated that question in a series of studies. In the first, 882 American adults answered online surveys about their belief in God. Next, the participants took a three-question math test with questions such as, “A bat and a ball cost $1.10 in total. The bat costs $1 more than the ball. How much does the ball cost?”

The intuitive answer to that question is 10 cents, since most people’s first impulse is to knock $1 off the total. But people who use “reflective” reasoning to question their first impulse are more likely to get the correct answer: 5 cents.

Sure enough, people who went with their intuition on the math test were found to be one-and-a-half times more likely to believe in God than those who got all the answers right. The results held even when taking factors such as education and income into account.
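For anyone who wants to check the 5 cents, here is a tiny Python sketch of the arithmetic, working in cents to avoid rounding. The variable names are mine.

```python
total_cents = 110        # bat + ball
difference_cents = 100   # bat - ball

# ball + (ball + difference) = total, so ball = (total - difference) / 2
ball = (total_cents - difference_cents) // 2
bat = ball + difference_cents

# The intuitive answer fails the check: a 10 cent ball means a $1.10 bat.
intuitive_ball = 10
intuitive_total = intuitive_ball + (intuitive_ball + difference_cents)

print(f"ball: {ball} cents, bat: {bat} cents, total: {ball + bat} cents")
print(f"intuitive answer gives a total of {intuitive_total} cents")
```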

More here

I, Pencil

“I, Pencil”, written by Leonard E. Read some 50 years ago, has been described as an essay after which “many first-time readers never see the world quite the same again”. It takes just a few minutes to read, and here is a flavour:

I am a lead pencil—the ordinary wooden pencil familiar to all boys and girls and adults who can read and write.

Simple? Yet, not a single person on the face of this earth knows how to make me … millions of human beings have had a hand in my creation, no one of whom even knows more than a very few of the others.

It’s a simple story implying that economies cannot be centrally planned, yet as Milton Friedman wrote:

I know of no other piece of literature that so succinctly, persuasively, and effectively illustrates the meaning of both Adam Smith’s invisible hand—the possibility of cooperation without coercion—and Friedrich Hayek’s emphasis on the importance of dispersed knowledge and the role of the price system in communicating information that “will make the individuals do the desirable things without anyone having to tell them what to do.”


And a random but related note on the urban legend that America spent millions of dollars designing a space pen that would work in a zero-gravity vacuum, whereas the Russians used a pencil:

Alas, for all its appeal and plausibility, this is not true. Initially, astronauts and cosmonauts were both equipped with pencils, but there were problems: if a piece of lead broke off, for example, it could float into someone’s eye or nose. A pen was needed, one that would defy gravity, write in extreme heat or cold, and be leak proof: blobs of ink floating around the cabin would be more perilous than a stray pencil lead. A long-time pen maker named Paul C. Fisher patented the “space pen” in 1965 (which he had developed at the cost of a million dollars, at the request of but not under the auspices of NASA.) NASA bought four hundred of them at $6 each, and, after a couple of years of testing, the pens were put into space.


Selection Effects

An article on the BBC website just now regarding the London riots has caught my attention with the headline:

One in four riot suspects had 10 previous offences

And it goes on to say that:

Three-quarters had a previous caution or conviction

Is it really true that 75% of those committing crimes on those evenings had previous convictions, or is it that those with previous convictions have their details on a police database and it was therefore possible to identify (and find) those people based on images from photographs and recordings made last month?

People often fail to notice when a selection effect (selection bias) is occurring, and it crops up in many situations. I remember many weekend nights out in my youth with my good friend John, when he would observe that all the girls were going in the opposite direction to us and ask whether we should go somewhere else. I stuck to the line that we were only ever going to pass people going in the opposite direction. It certainly wasn’t anything to do with us being uncool.

And here’s a good article on World War 2, regarding selection bias and aircraft design. An engineer was asked to inspect aircraft returning from battle over a period of time, and to come up with a recommendation on where to add armour. After building up a statistical model of where the aircraft he inspected had sustained damage, he recommended reinforcing the armour in the parts that were generally not damaged. Can you think why? Answer under the fold.

Read more of this post

You couldn’t make it up

One method that can sometimes be applied to a set of numbers to see if they are genuine or made up is Benford’s Law. I came across it some years ago, and Tim Harford mentioned it this weekend in relation to the data the Greek government supplied for their submission to join the eurozone:

In the late 1990s, eurozone wannabes squeezed and stretched to meet the criteria for accession, including low inflation and government deficits, and moderate levels of debt. The criteria were somewhat irksome, especially for an economy such as Greece, but nevertheless the Greeks seemed to comply…Eventually, it became clear that the Greek numbers did not quite add up.

The law applies to sets of numbers (e.g., lengths of rivers in a country, line items in a company’s cost accounts) that don’t follow any particular statistical distribution, and ideally span several orders of magnitude (i.e., from 5km to 5,000km, or £0.50 to £500,000). It presumes also that the sample you take has no selection effects (see here for a recent catchy BBC headline regarding the London riots that I’m fairly sure contains some stunning selection bias).

Specifically, Benford’s Law has something to say about how many times the digits 1 to 9 should be the leading digit of the numbers in your set. So, items costing £4.00, £412.99, and £4,999.00 all start with the digit 4, and you’d expect around 10% of your set to start with a 4.

That doesn’t seem so remarkable: with just 9 leading digits to pick from, of course you’d expect each to appear around 10% of the time. But in fact no: the lower the digit, the more often you should see it. So, numbers in your set should begin with a 1 around 30% of the time, falling to 10% for a 4, with 9s showing up only 5% of the time. Here is a chart I lifted from Wikipedia, showing the populations of the world’s countries. The red bars are the data, and the black dots are what Benford’s Law predicts:

Source: Wikipedia (here is a thought about the number of wikipedia articles from an earlier post)
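The percentages Benford’s Law predicts come from a simple formula: the expected share of numbers with leading digit d is log10(1 + 1/d). Here is a minimal Python sketch that compares the prediction with a set of numbers. The function names are my own, and the first 100 powers of 2 are just a convenient stand-in dataset (an assumption for illustration, not the country populations in the chart).

```python
from collections import Counter
from math import log10


def benford_expected(digit):
    """Share of numbers Benford's Law predicts will start with this digit."""
    return log10(1 + 1 / digit)


def leading_digit(value):
    """First non-zero digit of a non-zero number, e.g. 412.99 -> 4, 0.5 -> 5."""
    return int(str(abs(value)).lstrip("0.")[0])


def leading_digit_shares(values):
    """Observed share of each leading digit 1 to 9 (zeros are skipped)."""
    counts = Counter(leading_digit(v) for v in values if v != 0)
    total = sum(counts.values())
    return {d: counts[d] / total for d in range(1, 10)}


sample = [2 ** n for n in range(100)]  # stand-in data spanning many magnitudes
observed = leading_digit_shares(sample)
for digit in range(1, 10):
    print(f"{digit}: observed {observed[digit]:.1%}, "
          f"Benford {benford_expected(digit):.1%}")
```

To test a real set of figures, pass your own list of numbers to `leading_digit_shares` and look for digits whose observed share strays a long way from the prediction.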

Using ideas like this to investigate sets of numbers reminds me of a project, at my first employer, from many years ago. A bank had participated in some collaboration with another consultancy, whereby many banks submitted their trade data (number of trades, fees charged, etc). The consultant then calculated some statistics for each bank, such as average cost per trade, and shared them anonymously back with all the participants. That should have been the end of it, except that the consultant had left more information in his spreadsheet than intended: the data was given to many decimal places, albeit formatted to show just one. One of the banks came to my company with that, and asked what could be done.

Now, if you’ve got an average cost per trade of £2.08375638201, you can write a VBA program to cycle through all the possibilities and back-calculate which two numbers would have produced that answer: the total revenue (as shown in one of the participants’ annual reports, and of the format £xxxxx.xx) divided by an integer number of trades. Cute.
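The original was written in VBA and is long gone, but the idea can be sketched in a few lines of Python. This version assumes only that the revenue is a whole number of pence: it tries every plausible trade count and keeps those where average × trades lands on a whole number of pence. The function name, the tolerance and the example figures (£41,675.13 over 19,997 trades) are all made up for illustration.

```python
def candidate_trade_counts(average, max_trades, tolerance=1e-6):
    """Return (trades, revenue) pairs where average * trades is whole pence."""
    matches = []
    for trades in range(1, max_trades + 1):
        revenue_pence = average * trades * 100
        if abs(revenue_pence - round(revenue_pence)) < tolerance:
            matches.append((trades, round(revenue_pence) / 100))
    return matches


# Hypothetical bank: the "leaked" average is revenue / trades at full precision.
leaked_average = 41675.13 / 19997
print(candidate_trade_counts(leaked_average, max_trades=30_000))
```

With the average shown to only one decimal place almost any trade count would fit; it is the stray decimal places that narrow the candidates down, and each surviving revenue figure can then be checked against the banks’ annual reports.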

Other Posts on Maths/Distributions:

Why your friends are more popular than you

Selection effects and the London riots

God and maths tests