One-Way ANOVA in R

Learn how to perform One-Way ANOVA in R with code examples, post-hoc tests (Tukey HSD), assumption checks, and result interpretation. Perfect for beginners & researchers!

The two-sample t-test or z-test is used to compare the means of two independent groups. However, if there are more than two groups, One-Way ANOVA (analysis of variance) or one of its extensions can be used in R.

Introduction to One-Way ANOVA

The statistical test associated with ANOVA is the F-test (also called the F-ratio). In the ANOVA procedure, an observed F-value is computed and then compared with a critical F-value derived from the relevant F-distribution. The F-distribution is a family of distributions, each defined by two numbers (the numerator and denominator degrees of freedom). Note that the F-value cannot be negative, as it is a ratio of variances, and variances are never negative.
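The comparison of an observed F-value with a critical F-value can be sketched with base R's qf() and pf() functions. The numbers below (three groups, 32 observations, a 5% significance level, and an observed F-value of 5) are assumptions chosen only for illustration, and the variable names are hypothetical.

```r
# Illustrative values (assumptions, not results from a real analysis)
k     <- 3      # number of groups
n     <- 32     # total number of observations
alpha <- 0.05   # significance level

df1 <- k - 1    # numerator (between-groups) degrees of freedom
df2 <- n - k    # denominator (within-groups) degrees of freedom

# Critical value: the (1 - alpha) quantile of the F(df1, df2) distribution
f_crit <- qf(1 - alpha, df1, df2)
f_crit

# Upper-tail p-value for a hypothetical observed F-value
f_obs   <- 5
p_value <- pf(f_obs, df1, df2, lower.tail = FALSE)
p_value

# Reject the null hypothesis of equal means when f_obs exceeds f_crit
f_obs > f_crit
```

Rejecting when the observed F-value exceeds the critical value is equivalent to rejecting when the p-value is below the significance level.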

The One-Way ANOVA is also known as one-factor ANOVA. It is an extension of the independent two-sample t-test for comparing means when there are more than two groups. The data in One-Way ANOVA is organized into several groups based on a single grouping variable (also called a factor variable).

The F-value is the ratio of the “variance between groups” to the “variance within groups”. The assumptions of ANOVA should also be checked before performing the test. We will learn how to perform One-Way ANOVA in R.
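To make the ratio concrete, here is a minimal sketch that computes the between-groups and within-groups mean squares by hand and compares the resulting F-value with the one reported by aov(). It uses R's built-in mtcars data (mpg grouped by cyl); the variable names are hypothetical and chosen only for this illustration.

```r
y <- mtcars$mpg            # response variable
g <- factor(mtcars$cyl)    # grouping variable as a factor

k <- nlevels(g)            # number of groups
n <- length(y)             # total number of observations

grand_mean  <- mean(y)
group_means <- tapply(y, g, mean)
group_sizes <- tapply(y, g, length)

# Between-groups sum of squares: spread of group means around the grand mean
ss_between <- sum(group_sizes * (group_means - grand_mean)^2)

# Within-groups sum of squares: spread of observations around their group mean
# (ave() returns the group mean for every observation)
ss_within <- sum((y - ave(y, g))^2)

ms_between <- ss_between / (k - 1)   # variance between groups
ms_within  <- ss_within / (n - k)    # variance within groups

f_manual <- ms_between / ms_within

# F-value from aov() for comparison
f_aov <- summary(aov(y ~ g))[[1]][["F value"]][1]

c(manual = f_manual, aov = f_aov)
```

The two values should agree up to floating-point precision, which shows that aov() is computing exactly this ratio.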

Suppose we are interested in whether the mean miles per gallon (mpg) differs with the number of cylinders (cyl) in an automobile, using the dataset “mtcars”. Let us get some basic insight into the data before performing the ANOVA.

# load and attach the data mtcars
attach(mtcars)

# see the variable names and initial observations
head(mtcars)

Let us draw a boxplot of mpg for each cylinder group

boxplot(mpg ~ cyl, main="Boxplot", xlab="Number of Cylinders", ylab="mpg")

Basic Syntax of aov() Function

# Using aov()
result <- aov(dependent_variable ~ independent_variable, data = dataset)
summary(result)

# Using lm() (alternative method)
result <- lm(dependent_variable ~ independent_variable, data = dataset)
anova(result)
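
As a quick check that the two approaches above give the same test, the sketch below fits both models and compares the F-values. The built-in PlantGrowth dataset (weight by group) is an assumption chosen only for illustration; it is not part of the original example, and the object names are hypothetical.

```r
# PlantGrowth ships with R; 'group' is already a factor with three levels
fit_aov <- aov(weight ~ group, data = PlantGrowth)
fit_lm  <- lm(weight ~ group, data = PlantGrowth)

# Extract the F-value for the grouping variable from each ANOVA table
f_from_aov <- summary(fit_aov)[[1]][["F value"]][1]
f_from_lm  <- anova(fit_lm)[["F value"]][1]

c(aov = f_from_aov, lm = f_from_lm)
```

Both routes fit the same linear model, so the choice between them is mostly a matter of which output format is more convenient.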

Now let us perform One-Way ANOVA in R using the aov() function. An example is as follows

aov(mpg ~ cyl)

The variable “mpg” is continuous, and the variable “cyl” is the grouping variable. In the output, note the degrees of freedom for “cyl”: the value is one. This means the results are not correct, because “cyl” has three groups and the degrees of freedom should therefore be two. R has treated “cyl” as a numeric variable, whereas the mode (data type) of the grouping variable required for ANOVA is factor. For this purpose, the “cyl” variable can be converted to a factor as

cyl <- as.factor(cyl)

Now run the aov() function again as

aov(mpg ~ cyl)

Now the results will be as required. To get the ANOVA table, use the summary() function as

summary(aov(mpg ~ cyl))

Let’s store the ANOVA results obtained from the aov() function in an object, say res

res <- aov(mpg ~ cyl)
summary(res)
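
An alternative sketch, assuming one prefers not to attach the data, passes the data frame through the data argument and converts “cyl” to a factor inside the formula. It also shows the difference in degrees of freedom between the numeric and factor versions. The object names (res_alt, df_numeric, df_factor) are hypothetical.

```r
# Same model as above, without attach(): cyl is converted inside the formula
res_alt <- aov(mpg ~ factor(cyl), data = mtcars)
summary(res_alt)

# Degrees of freedom when cyl is (incorrectly) left numeric
df_numeric <- summary(aov(mpg ~ cyl, data = mtcars))[[1]][["Df"]][1]

# Degrees of freedom when cyl is a factor with three levels
df_factor <- summary(res_alt)[[1]][["Df"]][1]

c(numeric = df_numeric, factor = df_factor)
```

With a numeric “cyl”, R fits a single slope (one degree of freedom); with a factor, it estimates a separate mean per group (number of groups minus one).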

Let us find the mean mpg of each cylinder group

print(model.tables(res, "means"), digits = 4)
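
The same group means can be cross-checked with base R's tapply(), which also makes it easy to add group sizes and standard deviations. This is a minimal sketch; the object names are hypothetical.

```r
# Mean mpg for each number of cylinders
group_means <- tapply(mtcars$mpg, mtcars$cyl, mean)
round(group_means, 2)

# Standard deviation of mpg within each group
group_sds <- tapply(mtcars$mpg, mtcars$cyl, sd)
round(group_sds, 2)

# Number of cars in each group (the design is unbalanced)
group_sizes <- table(mtcars$cyl)
group_sizes
```

Looking at the group sizes alongside the means is useful here because the groups are not equally sized.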

Post-hoc tests for ANOVA in R (Tukey HSD)

Post-hoc tests, or multiple pairwise comparison tests, help in finding out which groups differ significantly from one another and which do not. They allow multiple pairwise comparisons without inflating the Type-I error rate. To understand why this matters, suppose the level of significance (Type-I error rate) of each test is 5%. With three groups there are three pairwise comparisons, so the probability of making at least one Type-I error (assuming the three tests are independent), known as the family-wise error rate, will be

$1-(0.95 \times 0.95 \times 0.95) = 1 - 0.857375 \approx 14.3\%$

This is the probability of having at least one false alarm (Type-I error) across the three comparisons.
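
The calculation above can be reproduced for any number of groups. The sketch below assumes independent tests, as in the text; the variable names are hypothetical.

```r
alpha <- 0.05          # significance level of each individual test
k     <- 3             # number of groups
m     <- choose(k, 2)  # number of pairwise comparisons

# Probability of at least one Type-I error across m independent tests
fwer <- 1 - (1 - alpha)^m
fwer

# The same rate for 2 to 6 groups, to show how quickly it grows
groups <- 2:6
data.frame(groups      = groups,
           comparisons = choose(groups, 2),
           fwer        = 1 - (1 - alpha)^choose(groups, 2))
```

The rate grows quickly with the number of groups, which is why unadjusted pairwise tests are avoided after ANOVA.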

Tukey’s post-hoc test can be performed, and the differences in group means plotted, as follows

# Tukey Honestly Significant Differences
TukeyHSD(res)
plot(TukeyHSD(res))
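
To work with the Tukey results programmatically, the comparison table can be pulled out of the returned list. This is a minimal sketch that refits the model with the data argument so that it runs on its own; the object names (fit, tukey, tukey_table) are hypothetical.

```r
fit   <- aov(mpg ~ factor(cyl), data = mtcars)
tukey <- TukeyHSD(fit, conf.level = 0.95)

# The result is a list with one matrix per factor; take the first (only) one.
# Columns: diff (difference in means), lwr/upr (confidence limits), p adj
tukey_table <- tukey[[1]]
tukey_table

# Keep only the pairs whose adjusted p-value is below 0.05
tukey_table[tukey_table[, "p adj"] < 0.05, , drop = FALSE]
```

A pair whose confidence interval (lwr, upr) excludes zero is the same pair that has an adjusted p-value below the chosen level.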

Diagnostic Plots (Checking Model Assumptions)

The diagnostic plots can be used to check the assumptions of homoscedasticity (constant variance) and normality of the residuals, and to detect influential observations.

layout(matrix(c(1, 2, 3, 4), 2, 2))
plot(res)
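
As a numerical complement to the normal Q-Q plot, one option is the Shapiro-Wilk test on the model residuals. This is a sketch of one possible check and not part of the original workflow; the model is refitted with the data argument so the block runs on its own, and the object names are hypothetical.

```r
fit <- aov(mpg ~ factor(cyl), data = mtcars)

# Shapiro-Wilk test: the null hypothesis is that the residuals are normal
sw <- shapiro.test(residuals(fit))
sw

# A small p-value would suggest a departure from normality
sw$p.value
```

With small samples the test has limited power, so it is best read together with the plots and not in place of them.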

Levene’s Test

To check the homogeneity-of-variance assumption of ANOVA, Levene’s test can be used. The leveneTest() function, available in the car package, serves this purpose.

library(car)
leveneTest(res)
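
If the car package is not available, the idea behind Levene’s test can be sketched in base R: run a One-Way ANOVA on the absolute deviations of each observation from its group center. The version below uses the group median as the center, which, to my understanding, matches the default of leveneTest(); treat that as an assumption. Bartlett’s test from base R is shown as another option. The object names are hypothetical.

```r
y <- mtcars$mpg
g <- factor(mtcars$cyl)

# Absolute deviation of each observation from its group median
abs_dev <- abs(y - ave(y, g, FUN = median))

# Levene-type test: One-Way ANOVA on the absolute deviations
levene_manual <- anova(lm(abs_dev ~ g))
levene_manual

# Bartlett's test of equal variances (more sensitive to non-normality)
bt <- bartlett.test(y ~ g)
bt
```

In both tests the null hypothesis is that the group variances are equal, so a small p-value indicates that the equal-variance assumption is doubtful.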
