Quick Navigation
In an unusually detailed preparation for an Environmental Effects Statement for a proposed discharge of dairy wastes into the Curdies River, in western Victoria, a team of stream ecologists wanted to describe the basic patterns of variation in a stream invertebrate thought to be sensitive to nutrient enrichment. As an indicator species, they focused on a small flatworm, Dugesia, and started by sampling populations of this worm at a range of scales. They sampled in two seasons, representing different flow regimes of the river - Winter and Summer. Within each season, they sampled three randomly chosen (well, haphazardly, because sites are nearly always chosen to be close to road access) sites. A total of six sites in all were visited, 3 in each season. At each site, they sampled six stones, and counted the number of flatworms on each stone.
Open the curdies data file.
The SITE variable is supposed to represent a random factorial variable (which site). However, because the contents of this variable are numbers, R initially treats them as numbers, and therefore considers the variable to be numeric rather than categorical. In order to force R to treat this variable as a factor (categorical) it is necessary to first convert this numeric variable into a factor (HINT).
Notice the data set - each of the nested factors is labelled differently - there can be no replicate for the random (nesting) factor.
Normally, we are not interested in formally testing the effect of the nested factor to get the correct F test for the nested factor (SITE), examine a representation of the anova table of the fitted linear model that assumes all factors are fixed (HINT)
A biologist studying starlings wanted to know whether the mean mass of starlings differed according to different roosting situations. She was also interested in whether the mean mass of starlings altered over winter (Northern hemisphere) and whether the patterns amongst roosting situations were consistent throughout winter, therefore starlings were captured at the start (November) and end of winter (January). Ten starlings were captured from each roosting situation in each season, so in total, 80 birds were captured and weighed.
Open the starling data file.
Situation 1 vs Situation 2 Situation 1 vs Situation 3 Situation 1 vs Situation 4 Situation 2 vs Situation 3 Situation 2 vs Situation 4 Situation 3 vs Situation 4
Here is a modified example from Quinn and Keough (2002). Stehman and Meredith (1995) present data from an experiment that was set up to test the hypothesis that healthy spruce seedlings break bud sooner than diseased spruce seedlings. There were 2 factors: pH (3 levels: 3, 5.5, 7) and HEALTH (2 levels: healthy, diseased). The dependent variable was the average (from 5 buds) bud emergence rating (BRATING) on each seedling. The sample size varied for each combination of pH and health, ranging from 7 to 23 seedlings. With two factors, this experiment should be analyzed with a 2 factor (2 x 3) ANOVA.
Open the stehman data file.
The variable PH contains a list of pH values and is supposed to represent a factorial variable. However, because the contents of this variable are numbers, R initially treats them as numbers, and therefore considers the variable to be numeric rather than categorical. In order to force R to treat this variable as a factor (categorical) it is necessary to first convert this numeric variable into a factor (HINT).
An ecologist studying a rocky shore at Phillip Island, in southeastern Australia, was interested in how clumps of intertidal mussels are maintained. In particular, he wanted to know how densities of adult mussels affected recruitment of young individuals from the plankton. As with most marine invertebrates, recruitment is highly patchy in time, so he expected to find seasonal variation, and the interaction between season and density - whether effects of adult mussel density vary across seasons - was the aspect of most interest.
The data were collected from four seasons, and with two densities of adult mussels. The experiment consisted of clumps of adult mussels attached to the rocks. These clumps were then brought back to the laboratory, and the number of baby mussels recorded. There were 3-6 replicate clumps for each density and season combination.
Open the quinn data file.
Confirm the need for a square root transformation, by examining boxplots and mean vs variance plots for both raw and transformed data. Note that square root transformation was selected because the data were counts (count data often includes values of zero - cannot compute log of zero).
Also confirm that the design (model) is unbalanced and thus warrants the use of Type II sums of squares. (HINT)
Minchinton & Ross (1999) investigated the distribution of oyster substrates for limpets in four zones alone the shore (the landward zone high on the shore, the mid zone with mangrove trees, the seaward zone with mangrove trees and the seaward zone without trees) by measuring the number of limpets per oyster shell (expressed as the number of limpets per 100 oysters) in five quadrats per zone. Data were collected from two sites (considered a random factor) so as to provide some estimates of the spatial generality of the observed trends (from Box 9.4 of Quinn & Keough (2002)).
Open the minch data file.
Minchinton & Ross (1999) considered the zone factor to be a fixed factor and the site factor to be a random factor and thus the data represent a Model III design
Make any necessary corrections