## What is the point of statistics?

Let's take a look at an example. Say we ran a simple experiment to determine if FONT SIZE affects READING SPEED. We set up two groups: the first group is timed on how long they take to read a paragraph printed in a Large font, and the second group is timed on how long they take to read a paragraph in a Small font. Everything else remains the same (ceteris paribus; don't you love Latin?).

The scores are tallied and we have two means: M_{large} = 30s, M_{small} = 60s. So we can conclude that people reading smaller fonts read slower, as they take more time!

Yes, the experiment is over, we can now happily go on to publish this amazing research without any useless "statistics".

…or can we? In the above scenario, it's easy to see that there's a difference (M_{small} is twice as large as M_{large}) due to the treatment (change in font size). So what is the problem? We just have to compare the means to determine if there's any effect, right?

Wrong! Using that logic, consider these:

M_{large} = 2s, M_{small} = 60s, significantly different?

How about: M_{large} = 20s, M_{small} = 60s, significantly different?

How about: M_{large} = 50s, M_{small} = 60s, significantly different?

How about: M_{large} = 59s, M_{small} = 60s, significantly different?

How about: M_{large} = 59.9s, M_{small} = 60s, significantly different?

How about: M_{large} = 59.99s, M_{small} = 60s, significantly different?

Did you subjectively assign your own personal value to what makes a number significantly different? If you did, don't you think that subjective element would differ from person to person? That would mean that what is significant to someone may not be significant to someone else. We can't have that, can we?

If you see the point of this, the question becomes apparent: what value do we use as a cut-off, the *empirical* value that lets us say "Yes, this difference between two numbers is real, caused by the effect of the experiment and not simply by random variation in scores due to chance"?

This is what statistical tests are designed to do, and they are quite good at it, taking into account not only the means but also the variance of the scores and the sample sizes.

So, basically, the z-test, Student's t-test and ANOVA are tools used to determine whether the difference we find is plausibly due to chance (not significant) or due to the treatment (significant).
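To make the "means alone are not enough" point concrete, here is a minimal sketch that computes a two-sample t statistic (the Welch form) from summary numbers. The standard deviations and group sizes are made up for illustration; only the 59s vs. 60s means come from the list above. A |t| well above roughly 2 is usually where people start calling a difference significant, but the exact cut-off depends on the degrees of freedom.

```python
from math import sqrt

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic for two independent groups, from summary statistics."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error

# Hypothetical numbers: the same 1-second gap in both scenarios.
noisy_small = welch_t(59, 10, 10, 60, 10, 10)      # SD = 10 s, 10 readers per group
precise_large = welch_t(59, 1, 1000, 60, 1, 1000)  # SD = 1 s, 1000 readers per group

print(round(noisy_small, 2), round(precise_large, 2))  # -0.22 -22.36
```

Same difference in means, wildly different t: with noisy scores and few readers the gap is indistinguishable from chance, while with tight scores and many readers it is not.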

## What is the difference between z-tests, t-tests and ANOVA?

All these tools are used to judge whether a difference in means is:

1. Due to *chance*

2. Due to the *effect of the treatment*

What's the difference?

The answer has to do with the capabilities of each test. What each one can do is pretty limited. So, when do you use which?

The answer is simple: it depends on how many means you have and on your experimental design.

QUESTION: How many sample groups do you have?

Answer 1: I have only one group, and I am, therefore, comparing it against a population.

If you chose answer 1, the next thing to do is check whether you know the POPULATION parameters: specifically, its mean and its standard deviation (or, equivalently, its variance), good ol' mu and sigma.

Since you should already have the population mean (mu), the question is whether you also have the population standard deviation (sigma). Yes? THEN run a z-test. No? THEN run a single-sample t-test. Simple.
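The one-group decision can be sketched with the standard library alone. The reading times, mu = 60 and sigma = 5 below are hypothetical values chosen for illustration. The z-test uses the known sigma and the normal distribution; the single-sample t-test replaces sigma with the sample's own standard deviation, so its statistic has to be looked up against a t distribution with n - 1 degrees of freedom (a table or a stats library, not shown here).

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def z_test(sample, mu, sigma):
    """Sigma is known: return the z statistic and a two-sided p-value."""
    z = (mean(sample) - mu) / (sigma / sqrt(len(sample)))
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

def one_sample_t(sample, mu):
    """Sigma is unknown: estimate it from the sample; return t and df."""
    n = len(sample)
    t = (mean(sample) - mu) / (stdev(sample) / sqrt(n))
    return t, n - 1

# Hypothetical reading times (seconds) for one group of 8 readers.
times = [52, 58, 61, 49, 55, 60, 57, 53]

z, p = z_test(times, mu=60, sigma=5)   # population sigma assumed known
t, df = one_sample_t(times, mu=60)     # population sigma assumed unknown

print(f"z = {z:.2f}, p = {p:.3f}")
print(f"t = {t:.2f}, df = {df}")
```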

Answer 2: I have two groups, and I am, therefore, comparing the two against each other.

Answer 2: Run a t-test. But which one? That depends, my friend, on the design of your experiment: whether it is run BETWEEN GROUPS or WITHIN GROUPS.

A BETWEEN GROUPS study just means that you took your entire sample and divided it into two groups, and each group went through a different treatment condition. Think of pharmacological studies, where one group gets the drug and the other gets the placebo. OBVIOUSLY, you wouldn't give both groups the placebo or both the drug, yeah? If your experiment is a BETWEEN GROUPS study, run an independent t-test.

A WITHIN GROUPS study means that you made your entire sample go through two different conditions. This happens a lot in longitudinal studies. For example, if you were testing whether a memory supplement works, your entire sample would first take a memory test (task 1: without-supplement condition), and then they would all take the supplement and do another memory test (task 2: with-supplement condition). You would need to compare task 1 and task 2 to see if there *was* a difference, yeah? If your experiment is a WITHIN GROUPS study, run a repeated-measures t-test (also known as a paired or within-group t-test).
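Here is a rough sketch of why the design matters, using made-up memory scores. The independent version (pooled-variance form, assuming both groups have similar variance) treats the two lists as unrelated people; the repeated-measures version works on each person's before/after difference, which removes person-to-person variation.

```python
from math import sqrt
from statistics import mean, stdev, variance

def independent_t(group1, group2):
    """Pooled-variance t statistic for two unrelated groups."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / (n1 + n2 - 2)
    return (mean(group2) - mean(group1)) / sqrt(pooled_var * (1 / n1 + 1 / n2))

def repeated_measures_t(before, after):
    """t statistic on per-person differences (same people measured twice)."""
    diffs = [a - b for b, a in zip(before, after)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical memory-test scores for 5 participants.
without_supplement = [10, 12, 9, 14, 11]
with_supplement = [12, 15, 10, 17, 13]

t_between = independent_t(without_supplement, with_supplement)
t_within = repeated_measures_t(without_supplement, with_supplement)

print(round(t_between, 2), round(t_within, 2))  # 1.48 5.88
```

The same ten numbers give a much larger t when analysed as repeated measures, so picking the test that matches the design is not a formality.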

Answer 3: I have more than two groups, and I am, therefore, comparing each of them against all the others.

Answer 3: Whoa. Good luck, buddy. The z-test and *all* t-tests look at the difference between two means. If you have more than two, comparing them pair by pair gets problematic. Take our font size example with, for instance, M_{small} = 10, M_{medium} = 20, M_{large} = 50. Of course, you can run t-tests between each pair of font sizes (small vs. medium, medium vs. large, etc.). But it's easier (and more accurate, because every extra t-test adds another chance of a false positive) to just run an ANOVA. ANalysis Of VAriance lets you compare any number of samples in a single test, without comparing the means directly, pair by pair.
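One common reason a pile of pairwise t-tests is considered less accurate is that the chance of at least one false positive grows with the number of tests. A back-of-the-envelope sketch, assuming the tests are independent and each uses a 0.05 cut-off (both simplifying assumptions):

```python
from math import comb

def familywise_error(n_groups, alpha=0.05):
    """Chance of at least one false positive across all pairwise tests."""
    n_tests = comb(n_groups, 2)  # number of distinct pairs of groups
    return 1 - (1 - alpha) ** n_tests

print(round(familywise_error(3), 3))  # 0.143 (3 pairwise tests)
print(round(familywise_error(5), 3))  # 0.401 (10 pairwise tests)
```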

Why doesn't it just compare the means directly? Cause it doesn't have to. The idea is that if the groups are compared without any experimental condition, their means should differ only by about as much as the random variation inside each group would predict; they all come from the same pool of "participants". When an experimental condition has an effect on a group, its mean shifts, and the variance *between* the group means grows relative to the variance *within* the groups. That's how ANOVA computes its infamous F ratio: between-group variance divided by within-group variance. So the F ratio will tell you whether there is a significant difference somewhere among M_{small} = 10, M_{medium} = 20, M_{large} = 50. Notice that the F ratio doesn't tell you which two specific means differ? Enter: post-hoc testing!
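For the curious, a one-way F ratio can be sketched in a few lines. The individual scores below are invented so that the group means come out to 10, 20 and 50 as in the example; the function computes the textbook ratio of between-group mean square to within-group mean square, and turning F into a p-value would need an F table or a stats library (not shown).

```python
from statistics import mean

def f_ratio(*groups):
    """One-way ANOVA F: between-group mean square / within-group mean square."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # How far each group mean sits from the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # How far each score sits from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical scores with means 10, 20 and 50.
small, medium, large = [9, 10, 11], [19, 20, 21], [49, 50, 51]
f_effect = f_ratio(small, medium, large)

# Three groups that behave identically: no spread between the group means.
f_none = f_ratio([9, 10, 11], [9, 10, 11], [10, 9, 11])

print(round(f_effect, 1), round(f_none, 1))  # 1300.0 0.0
```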

While t-tests compare the difference between the means of two samples, and only two, ANOVA is used to compare more than two samples.

## An illustration to understand the concept of ANOVA :)

Let's say you want to compare three groups:

Group A listens to Jazz

Group B listens to Waltz

Group C listens to News (Control Group)

Now, if Groups A to C all dance the same moves even though they are exposed to different music, THEN there is no EFFECT (your whatever, drug or therapy, does not work). The point is, if exposure to different music has an effect, they must dance differently.

In ANOVA, you compare the variance within each group to the variance between the groups, unlike the t-test, which only compares the MEANS of two groups.

Of course, for a better explanation, see the Wikipedia article on ANOVA.

Hope you understand :)

Moses