The basic idea is simple, though it feels back-to-front. It's popular because it is easier to work with than the alternatives. Here's the gist for testing a correlation:

1. Build a model of what the world could look like if there were no true correlation between your variables of interest.
2. Calculate the actual correlation in the sample you have.
3. Work out how probable a correlation as large as the one you got, or larger in either direction, would be if there were no true correlation.

The argument is similar in structure to modus tollens from classical propositional logic:

If Rex were a duck, then Rex would quack.
Rex does not quack.
Therefore, Rex is not a duck.

The statistical version runs along the same lines:

If there were no true correlation, then the sample correlation would probably be close to zero.
The sample correlation is not close to zero.
Therefore, there probably is a correlation.

Except it's not quite this, because I have slipped the word "probably" in a couple of times and the two probabilities aren't the same. Also I haven't defined "close to zero".

4.2.1 What can samples look like when the true correlation is 0?

Here is an animation of 100 simulated studies, each with 20 simulated participants. By design, there is no true correlation between the variables. But let's see what happens…
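The animation itself can't be reproduced here, but a minimal sketch of one such simulated study is easy to write in R. The variable names and the use of rnorm are my assumptions; the text only specifies 20 participants and a true correlation of 0.

```r
# One simulated study: 20 participants, two unrelated variables.
# rnorm() and the names x/y are illustrative choices, not the book's
# own code; any two independently generated variables would do.
set.seed(42)       # make the sketch reproducible
n <- 20
x <- rnorm(n)      # first simulated variable
y <- rnorm(n)      # generated independently of x, so true r = 0
cor.test(x, y)     # sample r and its p-value
```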
As you will have seen, sometimes the correlation is positive, other times negative, and occasionally it is quite big.

We can run this simulation 10,000 times and draw histograms of the correlation coefficient (r) and the p-value (as calculated by cor.test) for each simulated study.
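A sketch of that simulation, under the same assumptions as above (base-R hist() is my choice for the plots; the book may draw these differently):

```r
# 10,000 simulated studies (true r = 0, n = 20 each), keeping r and p.
set.seed(1)
one_study <- function(n = 20) {
  test <- cor.test(rnorm(n), rnorm(n))   # two independent variables
  c(r = unname(test$estimate), p = test$p.value)
}
sims <- t(replicate(10000, one_study())) # 10,000 rows; columns r and p

hist(sims[, "r"], breaks = 50, main = "Sample r when true r = 0")
hist(sims[, "p"], breaks = 50, main = "p-values when true r = 0")
```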
Here is how they look:

[Figure: 10,000 studies (true r = 0, each sample size 20) — histograms of r and of the p-values]

Most values of r are close to zero, but some are quite far on either side of it. The p-values are uniformly spread from 0 to 1.
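That uniform spread is what makes a fixed significance threshold behave as advertised: if p is uniform when there is no true correlation, then about 5% of null studies will land below .05. Continuing the sketch above (sims is the matrix from the previous block):

```r
# Fraction of null studies that are "significant" at the .05 level;
# with uniformly distributed p-values this should be close to 0.05.
mean(sims[, "p"] < .05)
```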