The University of North Carolina at Pembroke
MBA 510--Quantitative Methods
ANOVA and MANOVA


Glossary
Assumptions
Review of One-Way ANOVA
Multiple-Comparisons Problem
Variations on ANOVA
ANOVA
    Factorial: Two-Way ANOVA
    Nested
    Latin Square
ANCOVA
MANOVA
MANCOVA
 

Assumptions:  These assumptions are common to ANOVA, MANOVA, ANCOVA, and MANCOVA.
1. Y is a dependent (or response) variable.  It is a continuous variable (measured on an interval or ratio scale).
Xi (i = 1 to m) are the independent variables.  They are categorical (nominal or ordinal) and represent experimental treatments or random factors.  (If one or more of the Xi is a numeric variable (interval or ratio), the analysis becomes an ANCOVA.)  The different values that Xi can have are called its levels.
Random influences on Y cannot change the value of Xi.
2.  Normality: Within each subpopulation defined by combinations of the Xi, the Ys have a normal distribution.
3.  Homoskedasticity: Within each subpopulation, the Ys have a variance σ², which is the same for all subpopulations.
4.  Independence: The value of each Y in the sample is independent of every other observed value of Y in its subpopulation and of the Y values in other subpopulations.

Review of One-Way ANOVA
Null hypothesis: the means of all subpopulations are equal to each other.
I.e.,  H0: µ1 = µ2 = µ3 = . . . = µk, where k is the number of subpopulations.

Example:

In the graph above, there are four subpopulations representing four geographic regions.  The horizontal axis measures the Y variable, which in this case is consumers' estimated price of a new product.  The graph suggests that consumers in the North put a lower price on the product, while consumers in the East put a slightly higher price on it.  There is no difference between the estimated prices of consumers in the South and in the West.

The first assumption is shown by the fact that the value of Y does seem to depend on X, where X is the categorical variable Region.  Region has four levels: North, South, East, West.

The normality assumption is shown by the fact that the distribution for each group has the normal bell-shaped curve.  The distribution of Y for the combined population (without considering the subpopulations separately) would not be normal.  Imagining what these curves would look like if they were added together suggests that the distribution of Y in the combined population would be bimodal.

The homoskedasticity assumption is shown by each subpopulation's distribution having the same width (and therefore the same height, since the area under the curve must equal 1.00).

The independence assumption cannot be shown on this graph.

The term analysis of variance is misleading.  The purpose of analysis of variance is to detect a difference between the means, µ, of the subpopulations.  The null hypothesis is that there are no differences in the populations and hence that changing the level of X does not have an influence on the value of Y.

The procedure is to estimate the variance, σ², in two different ways and compare the results.  If the results agree, the null hypothesis cannot be rejected.

The direct way to estimate σ² is from the errors in each group.  This is usually referred to as the within-group error.  In the accompanying dataset, the values of Y in the North are 11, 12, 13, 14, and 15, and the average value (y-bar) is 13.  The errors (e = y - y-bar) are -2, -1, 0, 1, and 2.  Taking the sum of the squares of these, we get 10.  Doing the same for all groups and adding the results, we would get SSE = 40.  The variance estimate is SSE/d.f. = 40/16 = 2.50, where d.f. = n - k = 20 - 4 = 16.  This is called the Mean Square for Error, MSE.
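The within-group arithmetic for the North can be checked in a few lines.  This is a minimal Python sketch: only the five North values, SSE = 40, and the 16 degrees of freedom come from the text; the variable names are illustrative.

```python
# Within-group (direct) estimate of the variance, using figures from the text.
north = [11, 12, 13, 14, 15]           # Y values for the North group

y_bar = sum(north) / len(north)        # group mean (y-bar)
errors = [y - y_bar for y in north]    # e = y - y-bar
ss_north = sum(e ** 2 for e in errors) # sum of squared errors for the North

# The text reports SSE = 40 once all four groups' sums of squares are added.
sse = 40
df_error = 20 - 4                      # n - k
mse = sse / df_error                   # Mean Square for Error

print(y_bar, errors, ss_north, mse)
```

The other three groups' raw values are not listed in the text, so their sums of squares are taken as given rather than recomputed.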

The second way to estimate σ² is to measure the variation among the y-bars of the groups.  This is usually referred to as the between-group error.  In the accompanying dataset, the sample means for the North, South, East, and West are 13, 16, 17, and 16, respectively.  Since in this sample each subgroup has the same sample size, the overall mean of the Ys (y-bar-bar, which JMP calls Mean of Response) is (13+16+17+16)/4 = 15.5.  The variance of the y-bars can be estimated from this sample of means.  This estimate is 3.  If all of the subpopulations are really the same, their means (µi) will be identical, and the only reason that the sample means (y-bar) are not equal is randomness within that population.  In this case, the variance of the distribution of y-bar is σ²/ni, where ni is the number of observations in each group--in this case, 5.  From these figures, we can estimate σ² as 3 x 5 = 15.  This is called the Mean Square for the model.  If the population means are not all the same, then the differences among the y-bars are due to more than the variance within the population--there are distinct subpopulations with different means, which increase the variance of the y-bars.
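The between-group estimate can be reproduced from the four sample means alone.  The sketch below uses Python's standard statistics module; the dictionary layout and names are assumptions made for illustration.

```python
from statistics import mean, variance

# Sample means reported in the text, with 5 observations per group.
group_means = {"North": 13, "South": 16, "East": 17, "West": 16}
n_per_group = 5

means = list(group_means.values())
grand_mean = mean(means)           # y-bar-bar; valid because group sizes are equal
var_of_means = variance(means)     # sample variance of the y-bars (divisor k - 1)

# Under H0 the variance of a y-bar is sigma^2 / n, so sigma^2 is estimated by:
ms_model = var_of_means * n_per_group

print(grand_mean, var_of_means, ms_model)
```

The shortcut of averaging the group means only works for equal group sizes; with unequal sizes the grand mean would have to be weighted by the number of observations in each group.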

We compare the two estimates by calculating the F ratio: MSmodel/MSE = 15/2.5 = 6.0.  In other words, the indirect method gives us a variance estimate that is six times that of the direct method.  The p value for the F ratio is 0.0061.  A p value this small makes it hard to believe that, in repeated similar studies with different data, the two variance estimates would tend to be equal.  So, we do have evidence that something is causing the indirect method to give a larger variance.  That something is a significant difference among the sample means.  The null hypothesis that all subpopulation means are equal can be rejected since 0.0061 < 0.05.
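The F ratio and its p value can be sketched without a statistics package.  The helper below is not from the text: it is an illustrative, standard-library-only routine that integrates the F distribution's upper tail numerically (through the regularized incomplete beta function and Simpson's rule).  With SciPy installed, scipy.stats.f.sf(6.0, 3, 16) would normally be used instead.

```python
import math

def f_upper_tail(f_value, df1, df2, steps=2000):
    """Approximate P(F > f_value) for an F(df1, df2) distribution.

    Uses P(F > f) = I_x(df2/2, df1/2) with x = df2 / (df2 + df1 * f),
    where I_x is the regularized incomplete beta function, evaluated by
    Simpson's rule. Assumes df2 >= 2, f_value > 0, and an even `steps`.
    """
    a, b = df2 / 2.0, df1 / 2.0
    x = df2 / (df2 + df1 * f_value)
    h = x / steps

    def integrand(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    total = integrand(0.0) + integrand(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    beta_ab = math.gamma(a) * math.gamma(b) / math.gamma(a + b)
    return (total * h / 3) / beta_ab

ms_model, mse = 15.0, 2.5           # the two variance estimates from the text
f_ratio = ms_model / mse
p_value = f_upper_tail(f_ratio, df1=3, df2=16)

print(f_ratio, round(p_value, 4))   # the text reports a p value of 0.0061
```

The numerical integration is only a stand-in for the distribution functions that JMP or any other statistics package provides.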

The ANOVA Table:
The analysis of variance is usually reported by statistical software packages in the form of a table.
 

Source of Variation   d.f.   Sum of Squares   Mean Square   F ratio   Prob>F
Model                    3               45         15.00      6.00   0.0061
Error                   16               40          2.50
Total                   19               85

For a one-way ANOVA, the degrees of freedom for the model are k-1, the degrees of freedom for the error are n-k, and the total degrees of freedom are n-1, where n is the overall number of observations in the whole sample and k is the number of subpopulations.  MS = SS/d.f. and F = MSmodel/MSE.
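The degrees-of-freedom and mean-square rules above are enough to rebuild the table from the two sums of squares.  The function below is a hypothetical helper written for this illustration, not part of JMP.

```python
def one_way_anova_table(ss_model, ss_error, n, k):
    """Build the one-way ANOVA table from sums of squares, n, and k."""
    df_model, df_error = k - 1, n - k
    ms_model = ss_model / df_model      # MS = SS / d.f.
    ms_error = ss_error / df_error
    return {
        "Model": {"df": df_model, "SS": ss_model, "MS": ms_model,
                  "F": ms_model / ms_error},
        "Error": {"df": df_error, "SS": ss_error, "MS": ms_error},
        "Total": {"df": n - 1, "SS": ss_model + ss_error},
    }

# Figures from the example: 20 observations in 4 regions.
table = one_way_anova_table(ss_model=45, ss_error=40, n=20, k=4)
for source, row in table.items():
    print(source, row)
```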

The Multiple-Comparisons Problem:
The F test tells us whether there are real, significant differences among the subpopulations' means that are likely to appear in a duplication of the study with a different data set.  If the F test tells us that there are real differences, the second phase of the study is to find which subpopulation means are different from which.  If there are k groups, the number of possible pairs of groups is k(k-1)/2.  In this example, k = 4, so there are 6 possible pairs to compare for differences: NS, NE, NW, SE, SW, and EW.  If we use a level of significance of α = 0.05 on each of these six tests, there is a 0.265 (26.5%) chance that we will detect a difference between at least one of these pairs, even if there really are no differences in the population!  The 0.05 is the pair-wise α, whereas the 0.265 is the experiment-wise, or global, α.  The global α should be 0.05 or less.  The formula for finding the global α is 1 - (1 - α)^c, where α is the pair-wise α and c is the number of potential pairs to compare; the formula treats the c tests as independent.  If we look at the results and decide to make only one comparison--the largest mean vs. the smallest mean--the multiple-comparisons problem still exists, because we have made "eye-ball" comparisons to select the largest and the smallest means.  The only time that the multiple-comparisons problem does not exist is when we had a reason for checking a specific pair of means, prior to gathering and analyzing the data.
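The pair count and the global α quoted above can be verified directly.  This is a small sketch; the one-letter region codes are the ones used in the text, and the variable names are illustrative.

```python
from itertools import combinations

regions = ["N", "S", "E", "W"]
pairs = ["".join(p) for p in combinations(regions, 2)]
c = len(pairs)                         # k(k-1)/2 possible pairs

pairwise_alpha = 0.05
# Chance of at least one false detection across c tests,
# treating the tests as independent.
global_alpha = 1 - (1 - pairwise_alpha) ** c

print(pairs, c, round(global_alpha, 3))
```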

There are several approaches to correcting for this problem.
Bonferroni:  This is the simplest approach.  It reduces the global α by dividing the pair-wise α by c.  In the example above, the pair-wise α would have been 0.05/6 = 0.00833.  We could only reject the null hypothesis (and therefore detect a difference between means) if the p value for a pair of means was less than 0.00833.  The Bonferroni approach is too conservative; that is, it makes the global α too small, especially when k is large.  In the example above, the global α is 0.049.  This implies that the pair-wise α could have been a bit larger, making it a bit easier to reject the null hypothesis.

Tukey's honestly significant difference (HSD):  A disadvantage of this method is that it only gives an exact correction when the sample sizes are equal.  Otherwise, it is too conservative.  Hsu called this test the MCA; he recommends it when you want to compare all possible pairs.

Hsu's MCB (Best): Should be used when you want to compare the means with an unknown maximum value (or an unknown minimum).  Included in JMP.

Dunnett:  This test is recommended when one of the levels of X represents a control group (a no-treatment baseline group) and you want to compare the other groups to the control group.  Hsu called this test the MCC.  Included in JMP.

Scheffé:  The Scheffé method can be used when the sample sizes are not equal.  It will always agree with the ANOVA F test in the sense that if the F test detects differences, then at least one Scheffé test will detect a difference (possibly for a contrast other than a simple comparison of two means).  Conversely, if the F test does not detect any differences, then none of the Scheffé tests will.

Duncan: This test has good power.

Ryan-Einot-Gabriel-Welsch (REGW):  This test has good power.

Student's t: Usually you should not use this test.  It makes no correction for multiple comparisons.  JMP only includes this in case you have a pair that you had planned to compare a priori.
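The Bonferroni figures quoted above (a pair-wise α of 0.00833 and a resulting global α of 0.049) can be checked with the same formula used for the uncorrected case.  A minimal sketch, assuming the six tests are treated as independent:

```python
desired_global_alpha = 0.05
c = 6                                        # number of pairs when k = 4

# Bonferroni: divide the desired global alpha by the number of comparisons.
pairwise_alpha = desired_global_alpha / c

# Global alpha actually achieved with that pair-wise alpha.
achieved_global_alpha = 1 - (1 - pairwise_alpha) ** c

print(round(pairwise_alpha, 5), round(achieved_global_alpha, 3))
```

The achieved value falls slightly below 0.05, which is the sense in which the Bonferroni correction is conservative.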
 

ANOVA with JMP IN:

  1. Enter the data into a JMP data table, or read in a JMP dataset.  The data for the table above can be downloaded by clicking here.
  2. Click on Analyze, and then click on Fit Y by X.
  3. In the Fit Y by X--Contextual box, highlight the variable to be used as the dependent variable then click on the Y, Response button.  Highlight the variable to be used as the independent variable, then click on the X, Factor button.  Click the OK button.
  4. Click on the pop-up menu (the red triangle) beside One-Way Analysis of ... by ... ., then click on Means/Anova/t test.
  5. For the multiple comparisons tests, click on the pop-up menu, then click on Compare Means.  Of the tests available in JMP, I recommend Tukey's HSD for most cases.

 

Variations on ANOVA
 

Number of Ys     Xs Categorical Only   Xs Numerical Only       Xs Mixed
One              ANOVA                 Linear Regression       ANCOVA, or Lin. Reg. with dummy variables
More than one    MANOVA                Canonical Correlation   MANCOVA, or Can. Corr. with dummy variables

ANOVA: Analysis of Variance
ANCOVA: Analysis of Covariance
MANOVA: Multivariate Analysis of Variance
MANCOVA: Multivariate Analysis of Covariance
 

ANOVA
Types of Effects:
Fixed Effect: an independent variable that is under the control of the researcher and covers all of the values of interest in the experiment.  Examples: sex (M or F); supplier brand, if there is a finite number of suppliers and all are represented in the study.
Random Effect: an independent variable that covers a subset of all values of interest.  Usually, the values that are used are determined by nature or determined at random.  The researcher is interested in extrapolating from the values used in the study to all values that the variable can have.  Examples: consumers (you may want to say that there is variation from consumer to consumer, but you will have a random sample of consumers); supplier brand (you want to say that there are differences among suppliers, and out of 20 suppliers you use 6 in your study).

Main Effect: the effect of a variable, per se, without any interactions.
Interaction Effect: the effect of two variables acting together apart from the sum of their separate main effects.
For example, Y depends on two factors: sex and region.  The overall average Y is 12. The value of Male is +5, while the value of Female is 0.  The value of West is +9.  The average value of Y for Males from the West is 29.  Without the interaction, we would have predicted the average value of Y for Western Males to be 12 + 5 + 9 = 26.  Since the average for Western Males is really 29, there must be a special interaction effect for West and Male of +3.
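The interaction arithmetic in this example amounts to subtracting the additive prediction from the observed cell mean.  A short sketch using the figures above (the variable names are illustrative):

```python
overall_mean = 12
effect_male = 5            # main effect of sex = Male
effect_west = 9            # main effect of region = West
observed_western_males = 29

# Prediction if the two main effects simply added together.
additive_prediction = overall_mean + effect_male + effect_west

# Whatever is left over is attributed to the West-by-Male interaction.
interaction_effect = observed_western_males - additive_prediction

print(additive_prediction, interaction_effect)
```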
 

Factorial: Two-Way ANOVA
If every combination of the levels of the factors is observed, the analysis is said to be a factorial analysis (or a factorial design); if there is also an equal number of observations for each combination, the design is balanced.  If there is more than one observation per combination of levels, then interaction effects can be estimated.
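Whether a dataset meets these conditions can be checked by counting the observations in each cell.  The sketch below uses a small hypothetical set of (sex, region) labels, one tuple per observation; none of these data come from the course dataset.

```python
from collections import Counter
from itertools import product

# Hypothetical factor labels, one (sex, region) tuple per observation.
observations = [
    ("M", "North"), ("M", "North"), ("M", "West"), ("M", "West"),
    ("F", "North"), ("F", "North"), ("F", "West"), ("F", "West"),
]

cell_counts = Counter(observations)
levels_a = sorted({a for a, _ in observations})
levels_b = sorted({b for _, b in observations})
all_cells = list(product(levels_a, levels_b))

every_cell_observed = all(cell_counts[cell] > 0 for cell in all_cells)
equal_cell_sizes = len({cell_counts[cell] for cell in all_cells}) == 1
# More than one observation per cell is needed to estimate the interaction.
interaction_estimable = min(cell_counts[cell] for cell in all_cells) > 1

print(every_cell_observed, equal_cell_sizes, interaction_estimable)
```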

Factorial ANOVA in JMP:
Use the Fit Model tool in JMP.  (The Fit Y by X tool cannot accommodate more than one X at a time.)
To specify the main effects, highlight the variables to be included as main effects, then click on the Add button.
To specify the interaction effects, highlight the two variables that will be interacting, then click on the Cross button.  (Third-order interactions (interactions among three variables) may exist, but usually they are ignored.)
Click on the Run Model button.

Nested ANOVA
A nested ANOVA includes a variable which has different levels for each level of a second variable.  For example, management wants to study the productivity of its supervisors.  There are three supervisors and each supervisor has five workers.  The five workers that work under supervisor A never work under supervisor B or C.  The same is true of B's and C's workers.  We say that Worker is nested under Supervisor.  A nested ANOVA is not a factorial ANOVA, since there are many Supervisor-Worker pairs that are not observed.  This points out a weakness of the nested ANOVA design: If supervisor C's five workers have higher productivity than the other workers, on the average, it will be impossible to tell whether their high productivity is due to supervisor C's efforts or due to the workers' innate skills.
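The nesting in this example can be made concrete by listing which Supervisor-Worker pairs are actually observed.  In the sketch below the worker IDs (A1, A2, ...) are hypothetical labels; only the structure of three supervisors with five workers each comes from the text.

```python
supervisors = ["A", "B", "C"]
# Hypothetical worker IDs: five workers per supervisor, as in the example.
observed = [(s, f"{s}{i}") for s in supervisors for i in range(1, 6)]

workers = sorted({w for _, w in observed})
supervisors_per_worker = {
    w: {s for s, w2 in observed if w2 == w} for w in workers
}

# Worker is nested under Supervisor if each worker has exactly one supervisor.
is_nested = all(len(s) == 1 for s in supervisors_per_worker.values())

possible_pairs = len(supervisors) * len(workers)
unobserved_pairs = possible_pairs - len(set(observed))

print(is_nested, possible_pairs, unobserved_pairs)
```

The large number of unobserved pairs is what keeps the design from being factorial and what makes supervisor and worker effects hard to separate.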

Nested ANOVA in JMP:
A dataset for nested ANOVA may be downloaded by clicking here.
Use the Fit Model tool in JMP.  (The Fit Y by X tool cannot accommodate more than one X at a time.)
Specify the main effects: highlight the variables to be included as main effects (both Supervisor and Worker in this example), then click on the Add button.
To specify the nested effects, highlight the nested variable (Worker) that you just put into the Construct Model Effects box on the lower right corner of the Model Specification window.  Highlight the variable which it will be nested under (Supervisor) in the Select Columns box.  Click on the Nest button.  You should now see something like Worker(Supervisor) among the effects in the model.  This should be read as "the levels of Worker depend on the Supervisor."
Click on the Run Model button.

Latin Square
A Latin Square design is appropriate when there is one fixed factor and there are two random factors, each with as many levels as the fixed factor.  If there are r levels of the fixed factor, the object of the Latin square design is to arrange for one measurement of Y for each combination of the levels of the two random factors.  For this to work correctly, there must be r levels of the first random factor and r levels of the second random factor.
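One simple way to lay out such a design is a cyclic arrangement, in which each row shifts the list of fixed-factor levels by one position.  The sketch below is an illustration only: the treatment labels A to D are hypothetical, and in practice the rows, columns, and treatment labels would usually be randomized before use.

```python
def latin_square(treatments):
    """Return an r x r cyclic Latin square for the given fixed-factor levels.

    Rows stand for levels of the first random factor and columns for levels
    of the second; each cell names the treatment measured for that pair.
    """
    r = len(treatments)
    return [[treatments[(row + col) % r] for col in range(r)]
            for row in range(r)]

square = latin_square(["A", "B", "C", "D"])
for row in square:
    print(" ".join(row))
```

Every treatment appears exactly once in each row and once in each column, so each combination of the two random factors receives one measurement.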

ANCOVA

MANOVA
Multivariate Analysis of Variance (MANOVA) allows the researcher to analyze several dependent variables at a time with the same set of independent variables.

An example:
Consider the marketing example above.  Suppose the consumers had not only been asked to estimate the price of the product, but also to rate the likelihood that they would buy one on a scale of 0 to 100% and how much they would actually be willing to pay for the product.  There are now three different measures of value instead of just one.  The researcher still wants to know whether consumers in different parts of the country place different values on the product.

The researcher could perform three separate ANOVAs: one with estimated price as the Y, one with likelihood of purchase as the Y, and one with the maximum price as the Y.  However, MANOVA has several advantages over three separate ANOVAs:

Benefits of MANOVA:
1.  Control of the global α:  If the researcher uses α = 0.05 in each of the three separate ANOVAs, and if Region really has no effect on consumers' opinions of the product, the chance of incorrectly detecting a relationship between Region and value in at least one of the ANOVAs would be 0.143 (14.3%), which is rather high.  Using MANOVA to test all three relationships simultaneously can keep the overall α down to the desired 0.05.

2.  Increased power:  The power of a test is the probability that the test will correctly detect a difference when there is a difference; that is, the probability of rejecting H0 when H0 is in fact false (Power = 1 - β).  If the three response measures do in fact reflect the consumers' opinions of the product, they should be somewhat correlated with each other.  Using all three measures in the MANOVA simultaneously lets this correlation reinforce the relationship between perceived value and Region.  By bringing more information to bear on the problem, it may be possible to detect differences among the Regions that would be too weak to detect one Y at a time.

3.  Cross-equation comparisons:  In a MANOVA, it is possible to make comparisons across equations that would not be possible in separate ANOVAs.  One common instance of this is a before-and-after study.  Suppose the consumers had been asked to evaluate the new product once, and then were allowed to use the product and asked to evaluate it again; the MANOVA can then test whether the evaluations changed between the two occasions.  (This is sometimes called a repeated measures study, although some writers use that term to mean simply that there were several observations for each group.)  It would also be possible to compare the strength of the Northern effect on estimated price vs. the strength of the Northern effect on maximum price.  A third type of cross-equation effect would be to test whether the strength of the Northern effect on Y1 and on Y2 and on Y3 adds up to, say, 100.
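The 0.143 figure in the first benefit above comes from the same global-α formula used for multiple comparisons.  A brief sketch, assuming the separate ANOVAs are treated as independent tests:

```python
alpha = 0.05

# Chance of at least one false detection when running m separate ANOVAs,
# each at the given alpha, if Region truly has no effect on any Y.
for m in (1, 2, 3):
    print(m, round(1 - (1 - alpha) ** m, 3))

global_alpha_three = 1 - (1 - alpha) ** 3
```

Because the three Ys are likely to be correlated, the independence assumption is only approximate here, which is part of the reason for testing them jointly.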
 

Drawbacks of MANOVA:
1.  Degrees of freedom: Each additional Y costs one degree of freedom.  If the Y variables are not correlated at all, there is no improvement from using MANOVA instead of separate ANOVAs, and the results may be worse because of the lost degree of freedom.  Furthermore, MANOVA forces all X variables to appear in each equation.  Sometimes there may be a priori reasons to eliminate one of the Xs from an equation.  This would save a degree of freedom.

2.  MANOVA is harder to interpret than ANOVA.  Using multiple Y variables increases the chance that one of the Xs may be related to one of the Ys for unrelated reasons.

Cautions:
Y variables should be moderately correlated--not too high, not too low.  If they are highly correlated, a problem similar to multicollinearity arises.  (In the extreme case, Y1 could be equal to Y2.  Clearly, there is no point in using the same Y twice.)  If the correlation is low, there will be no improvement in power, and a degree of freedom will still be lost.

It is important to think carefully about the selection of the Xs and the Ys to make sure that there are no spurious correlations between them.  It is also important to examine the coefficients of each equation to see that the values make sense.

MANOVA in JMP:
The dataset for ANOVA above can be used for the MANOVA.  It can be downloaded by clicking here.
Click on Analyze on the Top Menu, then click on Fit Model.
Highlight the dependent variables in the Select Columns box on the left, then click on the Y button.
Highlight the independent variables in the Select Columns box on the left, then click on the Add button.
Add any interaction effects or nested effects that are appropriate as described above under ANOVA.
Click on the rectangle beside Personality in the upper right corner of the Fit Model window.  Choose Manova.
Click on Run Model.  The Fit Manova report window will appear.  In the Response Specification section, click on the Choose Response button.
For the standard MANOVA, highlight the Identity response.
For repeated measures (longitudinal) studies, highlight Repeated Measures.
Other options impose various relationships among the Y variables.  For example, if you choose Sum, your dependent variable becomes the sum of the Ys.
When you have chosen the type of response structure for the Ys, click on Run.
The statistics for the MANOVA will appear below the summary statistics in the Fit Manova report window.

Tests:
There are four statistical tests that are commonly used with MANOVA.  For all of these tests the null hypothesis is that there is no relationship between the X variables and the Y variables.  A low p value for any of these tests means that the null hypothesis can be rejected and implies that there is a relationship between at least one X and at least one Y.  All of these tests are equivalent to each other when the sample size is large.

Wilks' lambda (commonly available in most stat packages)
Hotelling's trace (T²)
Pillai's criterion (Pillai-Bartlett trace, V) (more robust, better when sample sizes are small)
Roy's greatest characteristic root (GCR) (depends on assumption of normality)
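All four statistics can be written in terms of the eigenvalues of E⁻¹H, where H is the hypothesis (model) sums-of-squares-and-cross-products matrix and E is the corresponding error matrix.  These definitions are the usual textbook ones and are not spelled out in the notes above; the eigenvalues in the sketch are hypothetical numbers chosen only for illustration.

```python
import math

def manova_statistics(eigenvalues):
    """Compute the four MANOVA test statistics from eigenvalues of E^-1 H."""
    return {
        # Wilks' lambda: product of 1 / (1 + eigenvalue); small values favor rejection.
        "wilks_lambda": math.prod(1.0 / (1.0 + ev) for ev in eigenvalues),
        # Pillai-Bartlett trace: sum of eigenvalue / (1 + eigenvalue).
        "pillai_trace": sum(ev / (1.0 + ev) for ev in eigenvalues),
        # Hotelling's (Hotelling-Lawley) trace: sum of the eigenvalues.
        "hotelling_trace": sum(eigenvalues),
        # Roy's greatest characteristic root: the largest eigenvalue.
        "roy_greatest_root": max(eigenvalues),
    }

# Hypothetical eigenvalues, not taken from the course dataset.
stats = manova_statistics([0.8, 0.2])
for name, value in stats.items():
    print(name, round(value, 4))
```

Some packages report Roy's statistic as the largest eigenvalue divided by one plus itself, so the convention should be checked before comparing output across programs.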

MANCOVA
 

Glossary

balanced design
blocking
control
covariate
degrees of freedom
design
effect
F ratio
factor
factorial design
fixed factor
homoskedasticity
heteroskedasticity
interaction effect
level
main effect
multiple comparisons
power
subpopulation
random factor
 


created April 5, 2001, by James R. Frederick
copyright 2001, James R. Frederick