Step-by-step tutorial for running an ANOVA test with a completely randomized design in R

R is an open-source statistics program. Using it requires some basic programming knowledge. It can be downloaded from the following sources:

  • http://cran.r-project.org/bin/windows/base/ (Windows)
  • http://cran.r-project.org/bin/macosx/ (Mac)
  • http://cran.r-project.org/ (Linux)

Here, I present a step-by-step guide to running an Analysis of Variance test, commonly called ANOVA, in R. A screenshot of R is shown below:

R screenshot


[sociallocker]NOTE:
To add a comment, put the pound sign (#) before the text. R ignores everything from the # to the end of the line, so comments are not part of the program. They are used to give information or to remind you why a statement was used.

Adding comments
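As a small illustration of the note above, here is a sketch with one full-line comment and two trailing comments. The variable x and its values are made up for this example and are not part of the tutorial data.

```r
# A full-line comment: R skips this line entirely.
x = c(2, 4, 6)   # a trailing comment; the assignment before it still runs
mean(x)          # average of the three made-up values
```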

Importing tables from Excel to R:

In R, tables can easily be imported from other programs such as Excel. Make the table in Excel, save the file in .csv format, and then read it into R. Suppose you saved a .csv file on the Desktop of the C drive (Local Disk). You can import the data by passing the full file path to read.csv. In a Windows path, write each backslash twice (or use forward slashes), because a single backslash starts an escape sequence inside an R string. In my case, the command is as follows:

> read.csv("C:\\Users\\Usman\\Desktop\\test.csv")

Import data from CSV

You can also assign the imported data to a name. In my case, I have named it test1.

> test1 = read.csv("C:\\Users\\Usman\\Desktop\\test.csv")

Specifying name to the data from csv

After assigning the name, you can display the data directly by typing test1, as shown in the figure below:
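The import step can be tried without an Excel file using the sketch below. It assumes the table has three columns named Objects, Notes and Points. The values are reconstructed from the testy output later in this tutorial, but the column headers and the temporary file are assumptions made only for this example.

```r
# Hypothetical stand-in for the Excel table: values taken from the testy
# output in this tutorial, column names assumed.
demo = data.frame(
  Objects = c(223, 234, 332, 445, 343),
  Notes   = c(26, 56, 34, 23, 65),
  Points  = c(2, 546, 1000, 347, 20000)
)

# Save to a temporary .csv so the example does not depend on a fixed path.
csv_path = tempfile(fileext = ".csv")
write.csv(demo, csv_path, row.names = FALSE)

# Import the file and give the data a name, as in the tutorial.
test1 = read.csv(csv_path)
test1        # typing the name prints the data frame
str(test1)   # confirms 5 rows, 3 columns and numeric column types
```

Running str() after an import is a quick way to catch a column that was read as text instead of numbers.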

Concatenating the data rows and generating the treatment factors:

Concatenate the data rows (link the data together in a sequence) of test1 into a single vector testy as follows:

> testy = c(t(as.matrix(test1))) # response data

> testy

[1]   223    26     2   234    56   546   332    34  1000   445    23   347

[13]   343    65 20000

Testy screenshot

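Here, as.matrix converts the data frame into a matrix, t transposes it, and c flattens the result into a plain vector. Because c reads a matrix column by column, transposing first makes it read the original table row by row.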
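The effect of the transpose can be seen in the following sketch, which rebuilds the table inline instead of reading it from a file. The column names are an assumption. The values are the ones shown in the testy output above.

```r
# Inline copy of the table (column names assumed for illustration).
test1 = data.frame(
  Objects = c(223, 234, 332, 445, 343),
  Notes   = c(26, 56, 34, 23, 65),
  Points  = c(2, 546, 1000, 347, 20000)
)

m = as.matrix(test1)   # 5 x 3 numeric matrix

by_column = c(m)       # without t(): all Objects values, then Notes, then Points
testy     = c(t(m))    # with t(): row 1 (Objects, Notes, Points), then row 2, ...

by_column
testy
```

Either order works for ANOVA as long as the treatment factor is built in the same order as the response vector.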

We have three treatment levels (Objects, Notes and Points) and five observations per treatment level. Now we will create variables for the treatment levels, the number of treatment levels and the number of observations per level as follows:

> f = c("Objects", "Notes", "Points")   # treatment levels
> k = 3                    # number of treatment levels
> n = 5                    # number of observations per treatment level

Now we create a vector of treatment factors that corresponds to each element of testy with the gl function.

> testx = gl(k, 1, n*k, labels = f)   # matching treatments

> testx

[1] Objects   Notes    Points  Objects   Notes    Points  Objects   Notes

[9]  Points  Objects   Notes    Points  Objects   Notes    Points

Levels: Objects   Notes   Points

testx screenshot

The gl function generates a factor by specifying the pattern of its levels. Here, k is the number of levels, 1 is the number of consecutive replications (each level appears once before the next one starts), and n*k is the length of the result. The pattern Objects, Notes, Points is therefore repeated five times, giving a factor of length fifteen whose order matches the row-by-row order of testy.
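To make the role of each gl argument concrete, the sketch below contrasts the interleaved pattern used in this tutorial with a blocked pattern, and checks the interleaved one against an equivalent rep() call. The names blocked and same are introduced here only for illustration.

```r
f = c("Objects", "Notes", "Points")   # treatment labels from the tutorial

# Each level once at a time, recycled to length 15 (matches row-by-row data).
testx = gl(3, 1, 15, labels = f)
testx

# Five consecutive copies of each level (would match column-by-column data).
blocked = gl(3, 5, labels = f)
blocked

# The interleaved pattern is the same as repeating the labels five times.
same = factor(rep(f, times = 5), levels = f)
identical(as.character(testx), as.character(same))

table(testx)   # five observations per treatment level
```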

ANOVA analysis:

Now we will apply the function aov as follows:

> aov.test1 = aov(testy ~ testx)

> summary(aov.test1)

            Df    Sum Sq  Mean Sq F value Pr(>F)
testx        2  59013716 29506858   1.159  0.347
Residuals   12 305574720 25464560

ANOVA table
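The whole analysis can be run end to end with the sketch below. It assumes the same inline copy of the data (column names are an assumption) and takes the number of levels and the labels from the data frame instead of typing them by hand. The names tab, f_value and p_value are introduced here for illustration.

```r
# Inline copy of the tutorial data (column names assumed).
test1 = data.frame(
  Objects = c(223, 234, 332, 445, 343),
  Notes   = c(26, 56, 34, 23, 65),
  Points  = c(2, 546, 1000, 347, 20000)
)

testy = c(t(as.matrix(test1)))                      # response, row by row
testx = gl(ncol(test1), 1, length(testy),
           labels = names(test1))                   # matching treatment factor

aov.test1 = aov(testy ~ testx)                      # one-way ANOVA
tab = summary(aov.test1)[[1]]                       # ANOVA table as a data frame
tab

# Pull single numbers out of the table for later use.
f_value = tab[["F value"]][1]
p_value = tab[["Pr(>F)"]][1]
round(c(F = f_value, p = p_value), 3)
```

Extracting the values from the table avoids copying them by hand from the printed output.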

Basic interpretation of the results:

Here, we see that the p-value (Pr(>F)) of 0.347 is greater than the 0.05 (5%) significance level, so we do not reject the null hypothesis (H0) that all treatment means are equal. In other words, the data do not provide significant evidence that the treatments differ.
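The decision can also be traced by hand from the numbers in the ANOVA table. The sketch below recomputes the F value and the p-value from the sums of squares and degrees of freedom, then applies the 0.05 rule. The variable names are introduced here for illustration.

```r
# Values copied from the ANOVA table above.
ss_treat = 59013716;  df_treat = 2
ss_resid = 305574720; df_resid = 12

ms_treat = ss_treat / df_treat          # mean square between treatments
ms_resid = ss_resid / df_resid          # mean square within treatments
f_value  = ms_treat / ms_resid          # F statistic

# Probability of an F at least this large if H0 is true.
p_value = pf(f_value, df_treat, df_resid, lower.tail = FALSE)

alpha  = 0.05
f_crit = qf(1 - alpha, df_treat, df_resid)   # critical F at the 5% level
reject = p_value < alpha                     # same decision as f_value > f_crit

round(c(F = f_value, p = p_value, F_crit = f_crit), 3)
reject
```

Comparing the p-value with alpha and comparing the F value with the critical F are two forms of the same test, so they always lead to the same decision.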

You can ask questions in the comments.

[/sociallocker]

Usman Zafar Paracha

Usman Zafar Paracha is Assistant Professor of Pharmaceutics at Hajvery University, Lahore, Pakistan.