Step-by-step tutorial for running an ANOVA test in R

R is an open-source statistics program that requires some knowledge of computer programming. It can be obtained from the following sources:

  • (Windows)
  • (Mac)
  • (Linux)

Here, I present a step-by-step guide to running an Analysis of Variance test, commonly called ANOVA, in R. A screenshot of R is shown below:

R screenshot

[sociallocker]NOTE: To add a comment, put the pound sign (#) before the text. Comments are not executed as part of the program. They are used to give information or to record why a statement was used.

 R comments
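As a minimal sketch of how comments behave (the variable names here are my own and only for illustration), R ignores everything from the pound sign to the end of the line:

```r
# A whole-line comment: R skips this line entirely.
x <- c(2, 4, 6)     # an inline comment placed after a statement
mean_x <- mean(x)   # the comment has no effect on the calculation
print(mean_x)
```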

Importing tables from Excel to R:

In R, tables can easily be imported from other programs such as Excel. You can make a table in Excel, save the file in .csv format, and import the data into R. Suppose you saved a .csv file on the Desktop of your C drive (Local Disk). You can import the data into R by giving the full path to the file. In my case, it is as follows:


R import data from CSV

You can also assign the imported data to a name. In my case, I have named it “test1”.

test1 = read.csv("C:\\Users\\Usman\\Desktop\\test.csv")

After assigning the name, you can display the data directly by typing “test1”, as shown in the figure below:

Specifying name to the data from csv in R
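The import step can be tried without an existing file. The sketch below is only an illustration: the values are hypothetical (only the column names Notes, Objects and Points come from this tutorial), and a temporary file stands in for the .csv saved on the Desktop.

```r
# Hypothetical values; only the column names follow the tutorial.
sample_data <- data.frame(
  Notes   = c(10, 12, 9, 14, 11),
  Objects = c("A", "B", "A", "B", "A"),
  Points  = c(1, 2, 1, 2, 1)
)

# Write the table to a temporary .csv file so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
write.csv(sample_data, csv_path, row.names = FALSE)

# Import the file and give the data a name, as in the tutorial.
test1 <- read.csv(csv_path)

# Typing the name (or printing it) displays the imported table.
print(test1)
```

With a real file, the path to it would replace csv_path. In a Windows path, either doubled backslashes or single forward slashes can be used.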

Telling R about the data:

You can explore your data with the following commands, writing the name you assigned within the brackets (test1 in this case):

  • dim(test1)
  • attach(test1)

The “dim” command reports the dimensions of the data as rows followed by columns. In this case, it gives “5 3”, i.e. 5 rows and 3 columns (5 observations of 3 variables). The “attach” command adds the data set to R's search path, so that its variables (Notes, Objects and Points in this case) can be referred to directly by name. You can see the results in the picture below:

Dim and attach commands in R
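A small sketch of these commands on hypothetical data (the values are made up, only the column names follow this tutorial). It also shows with(), a base R function that gives access to the columns by name without attaching the data; this is an alternative I am adding, not part of the original steps.

```r
# Hypothetical data with 5 observations of 3 variables.
test1 <- data.frame(
  Notes   = c(10, 12, 9, 14, 11),
  Objects = c("A", "B", "A", "B", "A"),
  Points  = c(1, 2, 1, 2, 1)
)

# dim() returns the number of rows first, then the number of columns.
d <- dim(test1)
print(d)

# str() gives a compact overview of each variable and its type.
str(test1)

# with() evaluates an expression using the columns of test1,
# so Notes can be used by name without calling attach(test1).
mean_notes <- with(test1, mean(Notes))
print(mean_notes)
```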

Single independent variable ANOVA test:

Now you can run an ANOVA test with a single independent variable, using the following commands:

  • data name for ANOVA=aov(y variable~x variable) #runs the ANOVA test.
  • ls(data name for ANOVA) #lists the items stored by the test.
  • summary(data name for ANOVA) #gives the basic ANOVA output.

In this case, it will be as follows:

  • aov.test1=aov(Notes~Objects)
  • ls(aov.test1)
  • summary(aov.test1)

In this example, “Notes” is the dependent variable (y variable) and “Objects” is the independent variable (x variable). There is only a single independent variable here, i.e. “Objects”.

NOTE: R is case-sensitive, so be careful about the punctuation and capitalization of names and commands. Moreover, you must first run the “attach(specified data name)” command [e.g., attach(test1)], otherwise R will not find “Notes” and “Objects”.

You can see the results of these commands in the figure below:

Single independent variable ANOVA in R
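The single-variable test can be sketched on hypothetical data as follows. The values and the three group labels are assumptions made for illustration, and the data = argument of aov() is used here in place of attach(), so the block runs on its own.

```r
# Hypothetical data: three groups of "Objects" with four "Notes" values each.
test1 <- data.frame(
  Notes   = c(10, 12, 11, 13, 15, 17, 16, 18, 20, 22, 21, 23),
  Objects = factor(rep(c("A", "B", "C"), each = 4))
)

# Run the ANOVA: Notes is the dependent variable, Objects the independent one.
aov.test1 <- aov(Notes ~ Objects, data = test1)

# List the items stored in the fitted object.
print(names(aov.test1))

# Show the basic ANOVA table (Df, Sum Sq, Mean Sq, F value, Pr(>F)).
print(summary(aov.test1))
```

Storing the grouping variable as a factor tells R to treat it as categories and not as a numeric predictor.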

More than one independent variable ANOVA test:

Similarly, you can run an ANOVA test with more than one independent variable by making a small change to the commands, i.e. adding an asterisk (*) between the independent variables. The asterisk tells R to test each variable as well as the interaction between them:

  • aov.test1=aov(Notes~Objects*Points)
  • ls(aov.test1)
  • summary(aov.test1)

So, here I have taken “Notes” as the dependent variable (y variable), and “Objects” and “Points” as the independent variables (x variables). There are now two independent variables: “Objects” and “Points”. You can see the picture below for this analysis.

More than one independent variable ANOVA in R
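As a rough sketch on hypothetical data (the values and the factor levels are assumptions), the formula with an asterisk produces three terms: the two variables and their interaction. For comparison, a plus sign (+) fits the two variables without the interaction.

```r
# Hypothetical data: 3 levels of Objects x 2 levels of Points, 2 values per cell.
test1 <- data.frame(
  Notes   = c(10, 12, 14, 15, 11, 13, 17, 16, 9, 12, 15, 18),
  Objects = factor(rep(c("A", "B", "C"), each = 4)),
  Points  = factor(rep(c("low", "high"), times = 6))
)

# Objects * Points expands to Objects + Points + Objects:Points.
aov.full <- aov(Notes ~ Objects * Points, data = test1)
full_terms <- attr(terms(aov.full), "term.labels")
print(full_terms)
print(summary(aov.full))

# Objects + Points fits the two main effects only, with no interaction term.
aov.main <- aov(Notes ~ Objects + Points, data = test1)
main_terms <- attr(terms(aov.main), "term.labels")
print(main_terms)
print(summary(aov.main))
```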

Basic Interpretation of Results:

Here, you can see that the Pr(>F) values (called p values) are greater than the alpha level (0.05) that is used in most studies, so these results are not statistically significant. On the other hand, if a p value is less than 0.05, then the result is considered statistically significant, and such results may require further analysis such as regression.[/sociallocker]
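The comparison with alpha can also be done in code instead of by reading the table. The sketch below uses hypothetical data (the same made-up values as a simple three-group example) and assumes alpha = 0.05. The p value is taken from the “Pr(>F)” column of the summary table.

```r
# Hypothetical data: three groups of "Objects" with four "Notes" values each.
test1 <- data.frame(
  Notes   = c(10, 12, 11, 13, 15, 17, 16, 18, 20, 22, 21, 23),
  Objects = factor(rep(c("A", "B", "C"), each = 4))
)

fit <- aov(Notes ~ Objects, data = test1)

# summary() returns a list whose first element is the ANOVA table.
anova_table <- summary(fit)[[1]]

# The first row belongs to "Objects"; the last row (Residuals) has no p value.
p_value <- anova_table[["Pr(>F)"]][1]

alpha <- 0.05
significant <- p_value < alpha

if (significant) {
  message("p value is below alpha: the result is statistically significant")
} else {
  message("p value is not below alpha: the result is not statistically significant")
}
```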

Usman Zafar Paracha

Usman Zafar Paracha is Assistant Professor of Pharmaceutics at Hajvery University, Lahore, Pakistan.