Daily Archives: 09/11/2012

What do I want from a beginner’s course in Statistics?

Sometimes I am led to think that statistics was invented by some ugly sadist who takes a secret pleasure in making us, normal mortals, feel inadequate.

First, let me explain where I come from. I am a biologist who managed to avoid Statistics courses as much as possible. I ran away from numbers as the devil from the cross! But when I decided to do a PhD I realised I couldn’t avoid it anymore, and Stats was something that I had to learn very, very quickly.

Well… I bought a series of books and started reading them, only to reach the end of the chapters on descriptive statistics and put them aside. It was a hopeless battle. I went to some post-graduate statistics courses and again, they might as well have been teaching me Chinese!
I developed an aversion to stats, and an inferiority complex, only to learn that I was not alone. Then I started wondering why so many people feel the same as I do about stats. In my career I have met many people who completed PhDs in science and still claim they haven’t a clue about statistics.

I opened a statistics book again when I was asked to teach a course on methods in animal behaviour. There were other courses I was happy to lecture, but they came as a package, and refusing to teach methods in behaviour would have meant that I could not lecture on the other topics that I liked so much. So I went back to my search for the perfect book on statistics.

I googled Statistics for dummies, Statistics with and without tears, but nothing! Attempting to learn statistics alone usually ends in tears! I couldn’t find a book that answered my questions about statistics. But one day I stumbled on the book by Dytham, C. (2003) Choosing and Using Statistics: A Biologist’s Guide. Blackwell Publishing. This was a great day. The book skips all the unnecessary mathematical complications of how we arrive at the formulas for standard deviation, Pearson’s coefficient and so on, and instead focuses on explaining when and why we should use a particular statistical test.

In fact, this is what I was looking for. I don’t care what the formula for a Pearson correlation or the sum of squares looks like. I need to know about the sum of squares as much as I need to know the nuts and bolts propelling my car. I need a car to go from A to B, and I just need to know when to change the oil and fill it up with petrol. I don’t need to know the chemical composition of the oil or where it goes in the engine.

In the same way that I don’t need a mechanical engineer to teach me how to drive a car, I don’t need a statistician to teach me how we arrive at the formulae for standard deviations and all the other statistical opacities. The only thing I need to know is the meaning of those statistics. What do they tell me about my data? Do I need to know if my data is normally distributed? Yes… but once I know the answer I just use this information to decide which type of test I should use to test my hypothesis. I care as much about the formula that lets me test for normality as I care about how to calculate the square root of 456 by hand. This is what calculators and stats programmes are for. We just have to enter our data in a spreadsheet and then click on an off-the-shelf statistical test included with the package.

To me the goal of learning statistics is not to learn how the test works, but what the results mean. What do these tests tell me about my data? I think this is where people who teach statistics get it wrong. They start by giving us all sorts of unnecessary information that does not help much with interpreting our data and results. Above all, what we need to know is the meaning of the tables that the package displays at the end of an analysis.

I have questions such as “Why are degrees of freedom always presented as n-1? Why not n-2 or n-35 for that matter?” What are degrees of freedom anyway? What are they for?
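One way to get a feel for the n-1 is a small Python sketch. It is only an illustration with made-up numbers, not something taken from Dytham's book. Once the mean has been calculated from the sample, the deviations from that mean must add up to zero, so only n-1 of them are free to vary; the last one is forced. That is why the sample variance divides the sum of squares by n-1. Tests that estimate more than one quantity from the data lose more degrees of freedom, which is where values such as n-2 come from.

```python
import statistics

# Made-up sample of five measurements, purely for illustration.
sample = [4.0, 7.0, 6.0, 3.0, 10.0]
n = len(sample)
mean = statistics.fmean(sample)

# Deviations of each value from the sample mean.
deviations = [x - mean for x in sample]

# They always sum to zero, because the mean was computed from the same data.
total = sum(deviations)

# So if we know the first n-1 deviations, the last one is not free:
# it must be whatever brings the sum back to zero.
last_implied = -sum(deviations[:-1])

# The sample variance divides the sum of squares by the n-1 "free" values.
sum_of_squares = sum(d * d for d in deviations)
sample_variance = sum_of_squares / (n - 1)

print("sum of deviations:", total)
print("last deviation:", deviations[-1], "implied by the others:", last_implied)
print("variance by hand:", sample_variance)
print("statistics.variance:", statistics.variance(sample))
```

The function `statistics.variance` in Python's standard library uses the same n-1 divisor, so the two variances printed at the end agree.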

The other problem with statistics courses is that they start without providing an overview of probabilities. And here we are, looking at something like P<0.05 without having a clue what the P stands for.

Many people have just memorised a rule of thumb: if they get something like P<0.05, that is a good thing! It confirms what they wanted to prove. This is not good enough!
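What the P stands for can be shown with a small permutation test in Python. This is a sketch under stated assumptions: the two groups are made-up numbers, the function name `exact_permutation_p` is hypothetical, and a permutation test is just one convenient way to illustrate the idea, not the test a statistics package would necessarily run. P is the probability of seeing a difference at least as large as the one observed, assuming there is really no difference between the groups. If there were no difference, the group labels would be arbitrary, so we try every possible way of relabelling the data and count how often the relabelled difference is as big as the real one.

```python
from itertools import combinations
from statistics import fmean

# Made-up measurements for two groups, purely for illustration.
group_a = [12, 14, 15, 16, 18]
group_b = [20, 21, 23, 24, 26]

pooled = group_a + group_b
observed = abs(fmean(group_b) - fmean(group_a))


def exact_permutation_p(values, size_a, observed_diff):
    """Count relabellings whose mean difference is at least the observed one.

    Hypothetical helper for this sketch: it enumerates every way of
    choosing `size_a` values for group A and puts the rest in group B.
    """
    indices = range(len(values))
    extreme = 0
    total = 0
    for chosen in combinations(indices, size_a):
        chosen_set = set(chosen)
        a = [values[i] for i in chosen]
        b = [values[i] for i in indices if i not in chosen_set]
        diff = abs(fmean(b) - fmean(a))
        # Small tolerance so rounding errors do not hide exact ties.
        if diff >= observed_diff - 1e-9:
            extreme += 1
        total += 1
    return extreme, total


extreme, total = exact_permutation_p(pooled, len(group_a), observed)
p_value = extreme / total

print("observed difference in means:", observed)
print("relabellings at least as extreme:", extreme, "out of", total)
print("P =", p_value)
```

With these numbers only 2 of the 252 possible relabellings give a difference as large as the observed one, so P is well below 0.05. A small P says the data would be surprising if there were no real difference; it does not prove that what we believe is true.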

It wasn’t until I started teaching critical thinking in science, and issues about causal reasoning and bias, that I finally understood the philosophy behind the need for statistics.
It felt as if I had been living in a hazy landscape where the statistical formulae were nothing more than the grey contours of trees on the horizon. And now the sun is starting to shine through the clouds and I realise that those trees actually have green and interesting leaves.

I think that statistics is hated because there are few people who can explain it properly. Statisticians may well know a lot about statistics, but in my experience they have great difficulty in conveying that information to beginners.

My suggestion to teachers of statistics is to start by explaining the principles of the scientific method. Teach first how to generate hypotheses, and how and why we attempt to refute them.

Many of the people I have talked to about statistics have difficulty understanding that they are trying to refute a hypothesis. They start their research wanting to prove that what they believe is true rather than attempting to disprove it.

Actually, what we attempt to do with statistics is not to prove that the alternative hypothesis is true, but to show that the null hypothesis is likely to be false.

Usually, when people start doing research, their first instinct is to seek confirmation of their theories, and this is the wrong approach. These concepts need to be understood even before we start calculating means, medians and modes. They are the frame that holds the whole picture together. Otherwise statistics becomes nothing more than a few incomplete blots on an impressionist aquarelle.

When I was teaching methods in behaviour I created a PowerPoint from the Dytham, C. (2003) book, which I gave away to many graduate students struggling with the very same conceptual problems that I had when I started doing science. I got positive feedback. Many said that this PowerPoint helped them put all the tests they read about in papers into perspective. The slides also provided some complementary help to the book.

The problem with books is that they are static, especially when they present a load of numbers and graphs. All those numbers feel like visual pollution, or like a pneumatic drill in the street when you are trying to listen to Beethoven. But when we see the numbers and explanations appearing progressively in animated slides, and the graphs start taking shape before our eyes, then things start making some sense.

This PowerPoint is not yet complete, but I am giving you access to the part I have done so far, and I would like to ask you whether it helps.

Statistical Tests