# KeepNotes blog

Stay hungry, Stay Foolish.


I used to think I understood ANOVA well. But when I tried to apply a MANOVA model, I found I was totally wrong. I did not even have a clear idea of which variables, continuous or categorical, should be used in ANOVA. So I decided to keep notes to figure out the differences between ANOVA, MANOVA and ANCOVA.

#### ANOVA

ANOVA (analysis of variance) is a statistical technique that tests whether the mean of a continuous dependent variable differs across the levels of one or more categorical independent variables. ANOVAs commonly come in three forms: one-way ANOVA, two-way ANOVA and N-way ANOVA.

##### Assumptions
• Independence of observations: there are no hidden relationships among observations.
• Normally distributed dependent variable: within each group, the dependent variable (equivalently, the model residuals) follows a normal distribution. If this is not met, you can try a data transformation.
• Homogeneity of variance: the variances in each group are similar. If this is not met, you may be able to use a non-parametric alternative, such as the Kruskal-Wallis test.
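
Here is a minimal sketch of how I would check these assumptions in R. The choice of the `iris` data and of the specific tests (Shapiro-Wilk for normality, Bartlett for equal variances) is my own assumption, not the only option. Welch's ANOVA via `oneway.test()` is included as one more alternative when variances look unequal.

```r
# Fit a one-way ANOVA on a built-in dataset
fit <- aov(Sepal.Length ~ Species, data = iris)

# Normality: Shapiro-Wilk test on the model residuals
normality <- shapiro.test(residuals(fit))

# Homogeneity of variance: Bartlett's test across the groups
homogeneity <- bartlett.test(Sepal.Length ~ Species, data = iris)

# Alternatives if the assumptions look doubtful:
# Kruskal-Wallis (non-parametric) and Welch's ANOVA (unequal variances)
kw    <- kruskal.test(Sepal.Length ~ Species, data = iris)
welch <- oneway.test(Sepal.Length ~ Species, data = iris, var.equal = FALSE)

print(normality)
print(homogeneity)
print(kw)
print(welch)
```

Independence of observations cannot be tested this way; it has to come from the study design.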

Types of data in ANOVA, T test and Chi-Squared Test

| X (independent variable) | X groups | Y (dependent variable) | Analysis |
| --- | --- | --- | --- |
| categorical | Two or more groups | quantitative | ANOVA |
| categorical | Just two groups | quantitative | T test |
| categorical | Two or more groups | categorical | Chi-Squared Test |

##### One-way ANOVA

One-way ANOVA has a single independent variable affecting a dependent variable, and that independent variable can have two or more categories (levels) to compare.

The null hypothesis for the test is that the group means are all equal, that is, there is no difference among group means. A significant result therefore means that at least one group mean differs from the others. If you only want to compare two groups, you can use the T-test instead.
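
To convince myself that the T-test and ANOVA agree when there are only two groups, here is a small sketch on `mtcars` (my own choice of data). It assumes the classic pooled-variance T-test (`var.equal = TRUE`). With R's default Welch T-test the numbers would differ slightly.

```r
# Two groups: automatic vs manual transmission
d <- transform(mtcars, am = factor(am))

# Two-sample T-test with pooled variance
tt <- t.test(mpg ~ am, data = d, var.equal = TRUE)

# One-way ANOVA on the same two groups
av <- summary(aov(mpg ~ am, data = d))[[1]]

t_squared <- unname(tt$statistic)^2
f_value   <- av[["F value"]][1]
p_anova   <- av[["Pr(>F)"]][1]

# With two groups, F equals t^2 and the p-values coincide
print(all.equal(t_squared, f_value))
print(all.equal(tt$p.value, p_anova))
```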

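ANOVA uses the F-test for statistical significance. The F-value is the ratio of the between-group variance (mean square) to the within-group variance (mean square). The smaller the variance within groups is relative to the variance between groups, the larger the F-value, and the smaller the resulting p-value.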
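
To see where the F-value comes from, the sketch below computes it by hand and compares it with `aov()`. The `iris` data and the variable names are my own choices for illustration.

```r
y <- iris$Sepal.Length
g <- iris$Species

k <- nlevels(g)   # number of groups
n <- length(y)    # total number of observations

group_means <- tapply(y, g, mean)
group_sizes <- tapply(y, g, length)

# Between-group and within-group sums of squares
ss_between <- sum(group_sizes * (group_means - mean(y))^2)
ss_within  <- sum((y - group_means[as.character(g)])^2)

# Mean squares, F ratio and its p-value
ms_between <- ss_between / (k - 1)
ms_within  <- ss_within / (n - k)
f_manual   <- ms_between / ms_within
p_manual   <- pf(f_manual, k - 1, n - k, lower.tail = FALSE)

# Compare with the F-value reported by aov()
f_aov <- summary(aov(Sepal.Length ~ Species, data = iris))[[1]][["F value"]][1]
print(all.equal(f_manual, f_aov))
```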

ANOVA only tells you whether there are differences among the levels of the independent variable, but not which levels differ significantly. To find out how the levels differ from one another, perform a Tukey HSD post hoc test (`TukeyHSD()` in R).
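
A minimal sketch of the post hoc step, using `iris` because `Species` has three levels (the dataset is my own choice for illustration):

```r
# One-way ANOVA with a three-level factor
fit <- aov(Sepal.Length ~ Species, data = iris)

# Tukey's Honest Significant Differences for all pairs of levels
tukey <- TukeyHSD(fit)
print(tukey)

# The pairwise table: difference, confidence bounds and adjusted p-value
pairwise <- tukey$Species
print(pairwise)
```

Each row of the table is one pair of levels, so three levels give three comparisons.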

##### Two-way ANOVA

Two-way ANOVA has two independent variables, both categorical, which is the main difference from one-way ANOVA. These categorical variables are also called factors, and each factor can be split into multiple levels. So if one factor has 3 levels and the other factor also has 3 levels, there will be 3x3=9 groups.
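
The group count is just the number of level combinations. A tiny sketch with two made-up factors (`A`, `B` and their level names are hypothetical):

```r
# Hypothetical factors A and B, each with three levels
groups <- expand.grid(A = c("a1", "a2", "a3"), B = c("b1", "b2", "b3"))

# One row per combination of levels, i.e. one row per group
print(groups)
nrow(groups)  # 9
```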

Use a two-way ANOVA when you want to know how two independent variables, in combination, affect a dependent variable. A two-way ANOVA with interaction tests three null hypotheses at the same time:

• There is no difference in group means at any level of the first independent variable.
• There is no difference in group means at any level of the second independent variable.
• The effect of one independent variable does not depend on the level of the other independent variable (a.k.a. no interaction effect).

If you want a two-way ANOVA without an interaction effect, you only need the first two hypotheses.

```r
library(dplyr)  # provides %>% and mutate()

# Keep four columns and convert the grouping variables to factors
data <- mtcars[, c("am", "mpg", "hp", "vs")] %>%
  mutate(am = factor(am), vs = factor(vs))
summary(data)

# One-way ANOVA
one.way <- aov(mpg ~ am, data = data)
summary(one.way)

# Two-way ANOVA without interaction (additive model)
two.way <- aov(mpg ~ am + vs, data = data)
summary(two.way)

# Two-way ANOVA with interaction
two.way.int <- aov(mpg ~ am * vs, data = data)
summary(two.way.int)
```
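
To decide whether the interaction term is worth keeping, one option is to compare the additive and the interaction model directly. This is a sketch using base R only. The names `additive` and `with_int` are mine.

```r
d <- transform(mtcars, am = factor(am), vs = factor(vs))

additive <- aov(mpg ~ am + vs, data = d)
with_int <- aov(mpg ~ am * vs, data = d)

# F-test for the extra interaction term
cmp <- anova(additive, with_int)
print(cmp)

# Cell means for each am x vs combination help to read the interaction
cell_means <- with(d, tapply(mpg, list(am = am, vs = vs), mean))
print(cell_means)
```

If the comparison is not significant, the simpler additive model is usually easier to interpret.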

#### MANOVA

We know that one-way or two-way ANOVA has only one dependent variable, but MANOVA is not limited in this way. MANOVA stands for multivariate analysis of variance, and it is used when there are two or more dependent variables. Its purpose is to test whether the dependent variables, taken together, differ across the levels of the independent variables.

MANOVA assumes that independent variables are categorical and dependent variables are continuous, the same as ANOVA.

Instead of a univariate F value, we would obtain a multivariate F value, and several test statistics are available: Wilks' λ, Hotelling's trace, Pillai's criterion.
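
In R the test statistic is chosen through the `test` argument of `summary()` for a `manova` fit. The sketch below loops over the four names that `summary.manova` accepts (it also offers Roy's largest root) and collects the p-values. The `iris` model is only an illustration.

```r
fit <- manova(cbind(Sepal.Length, Petal.Length) ~ Species, data = iris)

# Test statistics accepted by summary.manova()
tests <- c("Pillai", "Wilks", "Hotelling-Lawley", "Roy")

# Pull the p-value of the Species term for each statistic
p_values <- sapply(tests, function(tst) {
  summary(fit, test = tst)$stats["Species", "Pr(>F)"]
})
print(p_values)
```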

##### Benefits of MANOVA

Sometimes a separate one-way ANOVA for each dependent variable finds no significant difference between groups (levels of the independent variable), so we would conclude that there is no relation between the dependent and independent variables. However, when we apply MANOVA to the same dependent variables simultaneously, it may conclude that the dependent variables are jointly affected by the independent variables, because MANOVA also takes the correlation between the dependent variables into account.

If you're still confused about it, try reading the post "Comparison of MANOVA to ANOVA Using an Example", which gives a better worked example.

When you have multiple dependent variables and would otherwise run a series of one-way ANOVAs, running a single MANOVA first helps protect against an inflated Type I error rate.
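
A quick back-of-the-envelope sketch of why repeated tests inflate the Type I error. It assumes the tests are independent and each uses a significance level of 0.05, which is a simplification since dependent variables are usually correlated.

```r
alpha <- 0.05
k <- 1:5  # number of separate one-way ANOVAs

# Probability of at least one false positive across k independent tests
fwer <- 1 - (1 - alpha)^k

print(data.frame(tests = k, familywise_error = round(fwer, 3)))
```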

Example:

• dependent variables: Sepal.Length and Petal.Length
• independent variables: Species

Fit model and summarize:

```r
# Extract the two dependent variables
sepl <- iris$Sepal.Length
petl <- iris$Petal.Length
# MANOVA test
res.manova <- manova(cbind(sepl, petl) ~ Species, data = iris)
# choose the test statistic, here Wilks' lambda
summary(res.manova, test = "Wilks")
```
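
After a significant MANOVA, a common follow-up is to look at each dependent variable on its own. A minimal sketch using `summary.aov()`, which returns one univariate ANOVA table per dependent variable:

```r
res.manova <- manova(cbind(Sepal.Length, Petal.Length) ~ Species, data = iris)

# One univariate ANOVA table per dependent variable
univ <- summary.aov(res.manova)
print(univ)
```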

#### ANCOVA

ANCOVA is an extension of ANOVA that adds one or more covariates to the model, so it can be used to adjust for other factors that might affect the outcome, such as age, gender or drug use. It can also be used to model a categorical variable together with a continuous variable as predictors (one factor is categorical, the other is quantitative). In that case the covariate is itself a variable of interest, not just something you want to control for.

In principle, you can enter any covariate you want into an ANCOVA. But the more covariates you enter, the fewer residual degrees of freedom you have, so covariates that explain little of the outcome reduce the statistical power. And the lower the power, the less you can rely on the results of the test.
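
The loss of degrees of freedom is easy to see by adding covariates one at a time. The three models below are only an illustration on `iris`.

```r
m1 <- aov(Petal.Length ~ Species, data = iris)
m2 <- aov(Petal.Length ~ Sepal.Length + Species, data = iris)
m3 <- aov(Petal.Length ~ Sepal.Length + Sepal.Width + Species, data = iris)

# Each continuous covariate costs one residual degree of freedom
res_df <- sapply(list(m1, m2, m3), df.residual)
print(res_df)  # 147 146 145
```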

##### Assumptions

Before performing ANCOVA, besides checking normality and homogeneity of variance, we need to verify that the covariate and the independent variable are independent of each other, since adding a covariate to a model only makes sense if the covariate and the independent variable act independently on the dependent variable.
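
A sketch of how these checks could look in R. The first check tests whether the covariate differs across groups. The second check, homogeneity of regression slopes (no covariate-by-group interaction), is an assumption I am adding here because it is commonly listed for ANCOVA.

```r
# 1. Is the covariate independent of the grouping variable?
#    A significant result suggests the covariate differs across groups.
cov_check <- summary(aov(Sepal.Length ~ Species, data = iris))
print(cov_check)

# 2. Homogeneity of regression slopes:
#    a significant Sepal.Length:Species term suggests the slopes differ.
slope_check <- summary(aov(Petal.Length ~ Sepal.Length * Species, data = iris))
print(slope_check)
```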

NOTE: if you use Type I (sequential) sums of squares for the model, the order of the terms matters. The covariate goes first, and the model assumes there is no interaction between the covariate and the independent variable.
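
To see the order dependence, the sketch below fits the same two terms in both orders. `summary()` on an `aov` fit reports Type I sums of squares, so the sum of squares credited to `Species` changes while the residual sum of squares does not.

```r
cov_first   <- summary(aov(Petal.Length ~ Sepal.Length + Species, data = iris))[[1]]
group_first <- summary(aov(Petal.Length ~ Species + Sepal.Length, data = iris))[[1]]

# Row names in these tables are padded with spaces, so trim them
rownames(cov_first)   <- trimws(rownames(cov_first))
rownames(group_first) <- trimws(rownames(group_first))

ss_species_cov_first   <- cov_first["Species", "Sum Sq"]
ss_species_group_first <- group_first["Species", "Sum Sq"]

print(c(covariate_first = ss_species_cov_first,
        group_first     = ss_species_group_first))
```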

Example:

• dependent variable: Petal.Length
• independent variable: Species
• covariate: Sepal.Length

Fit model and summarize:

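```r
# fit ANCOVA model (covariate first, then the grouping factor)
fit <- aov(Petal.Length ~ Sepal.Length + Species, data = iris)
# view summary of model with Type II sums of squares (requires the car package)
car::Anova(fit, type = 2)
```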
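
If the `car` package is not installed, base R's `drop1()` is one alternative I would try. For a model without interaction terms, dropping each term in turn gives marginal F-tests that correspond to the Type II tests. This is a sketch, not a replacement for `car::Anova()` in models with interactions.

```r
fit <- aov(Petal.Length ~ Sepal.Length + Species, data = iris)

# Marginal F-test for each term, given the other term is in the model
marginal <- drop1(fit, test = "F")
print(marginal)
```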