Module 7


Advanced Statistical Methods

Objectives of Module 7

This module introduces some advanced statistical techniques, together with code that may be useful. We hope to expand it in the future as the site matures and we integrate more questions from users.


Matrix properties and R code

This file shows some properties of matrices and how to examine them with functions in R. An appreciation of matrices can help with design concepts in data analysis. We come across the correlation matrix, the covariance matrix and the design matrix. In advanced genetic studies, models can incorporate pedigree matrices to look at genotypic effects.

Some matrix properties in R
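As a minimal sketch of the matrices mentioned above, the following base R code builds a covariance matrix, a correlation matrix and a design matrix. The data frame `trial` and its values are hypothetical, made up purely for illustration.

```r
# Hypothetical toy data: two genotypes measured in three blocks
trial <- data.frame(
  genotype = factor(rep(c("G1", "G2"), each = 3)),
  block    = factor(rep(1:3, times = 2)),
  yield    = c(4.1, 4.5, 3.9, 5.2, 5.6, 5.0),
  height   = c(110, 118, 107, 125, 131, 122)
)

# Covariance and correlation matrices of the two response variables
V <- cov(trial[, c("yield", "height")])
R <- cor(trial[, c("yield", "height")])

# Design matrix for a model with genotype and block effects
X <- model.matrix(~ genotype + block, data = trial)

dim(X)          # rows = observations, columns = model parameters
t(X) %*% X      # X'X, the cross-product matrix used in least squares
qr(X)$rank      # rank of the design matrix
isSymmetric(V)  # covariance matrices are symmetric
cov2cor(V)      # rescaling the covariance matrix gives the correlation matrix
```

With the default treatment contrasts, the first level of each factor is absorbed into the intercept, so the design matrix has one column fewer per factor than the factor has levels.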


Mixed Models


An example of a mixed model when we are interested in measurement error at different levels within an experiment

Imagine the sampling within an experiment. This figure shows the field, where there may be many plants within a plot. If we were interested in sampling from leaves, we would then have to decide how many leaves to sample and, in particular, which leaf. (For example, in sorghum we may typically select the last fully expanded leaf, or the leaf below the flag leaf.) Then, within a leaf, there are decisions to make on where to take samples. Discs may be taken, and in some situations there are replicate measurements on an individual sample. Some levels of sampling can be treated as random sampling, and others are fixed. Our treatment comparisons of genotype or nitrogen treatments would be fixed effects in most physiological studies. The position of the samples within a leaf and repeated measures within a sample are random effects. On the right, the two yellow arrows indicate replicate measurements on a leaf sample. These are entered into the analysis as a mean value, because treating them as independent observations would be pseudo-replication.

Sampling within an experiment

Random effects: an example
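The sampling hierarchy described above can be sketched in R with simulated data. Everything here is an assumption made for illustration: the design sizes, the variance values, the variable names, and the choice of the `nlme` package (a recommended package shipped with standard R installations). It is one possible way to fit such a model, not the analysis used for the figure.

```r
library(nlme)  # assumed available; ships with standard R installations

set.seed(1)

# Hypothetical design: 2 nitrogen levels x 4 plots x 3 plants x 2 discs x 2 readings
d <- expand.grid(
  reading  = 1:2, disc = 1:2, plant = 1:3, plot = 1:4,
  nitrogen = factor(c("low", "high"), levels = c("low", "high"))
)
d$plot_id  <- interaction(d$nitrogen, d$plot, drop = TRUE)
d$plant_id <- interaction(d$plot_id, d$plant, drop = TRUE)
d$disc_id  <- interaction(d$plant_id, d$disc, drop = TRUE)

# Simulated response: fixed nitrogen effect plus random variation at each level
plot_eff  <- rnorm(nlevels(d$plot_id),  sd = 1.0)
plant_eff <- rnorm(nlevels(d$plant_id), sd = 0.7)
disc_eff  <- rnorm(nlevels(d$disc_id),  sd = 0.4)
d$y <- 20 + 3 * (d$nitrogen == "high") +
  plot_eff[as.integer(d$plot_id)] +
  plant_eff[as.integer(d$plant_id)] +
  disc_eff[as.integer(d$disc_id)] +
  rnorm(nrow(d), sd = 0.2)

# Replicate readings on the same disc are averaged before analysis,
# so they are not counted as independent observations
disc_means <- aggregate(y ~ nitrogen + plot_id + plant_id + disc_id,
                        data = d, FUN = mean)

# Nitrogen is a fixed effect; plots and plants within plots are random effects
fit <- lme(y ~ nitrogen, random = ~ 1 | plot_id/plant_id, data = disc_means)

fixef(fit)    # estimated intercept and nitrogen effect
VarCorr(fit)  # variance components for plot, plant within plot, and residual
```

Analysing all 96 readings as if they were independent would overstate the replication; the mixed model instead compares nitrogen treatments against the variation between plots.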


Nonlinear Models

Coming soon



ARIMA refers to Autoregressive Integrated Moving Average models. In agriculture these can be applied to spatial situations, where plots close to each other are more likely to be similar. Terms such as AR1 and AR2 refer to the order of the autoregression, that is, how far the regression reaches to close-by units. Most software uses maximum likelihood estimation methods to make the estimates.
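The idea of AR1 and AR2 terms can be sketched with the base R function `arima()`. This is a one-dimensional illustration only: the simulated series stands in for a hypothetical single row of plots, and the autocorrelation of 0.6 is an assumed value. Spatial analyses of field trials are usually fitted in two dimensions with specialised mixed model software.

```r
set.seed(42)

# Simulate 200 values along a hypothetical row of plots from an AR(1) process
# with an assumed autocorrelation of 0.6 between neighbouring plots
x <- arima.sim(model = list(ar = 0.6), n = 200)

# Fit first-order (AR1) and second-order (AR2) models by maximum likelihood
fit_ar1 <- arima(x, order = c(1, 0, 0), method = "ML")
fit_ar2 <- arima(x, order = c(2, 0, 0), method = "ML")

coef(fit_ar1)                  # estimated ar1 coefficient and mean ("intercept")
coef(fit_ar2)                  # ar1, ar2 and mean
AIC(fit_ar1, fit_ar2)          # compare the two fits
acf(x, plot = FALSE)$acf[2:4]  # sample autocorrelation at lags 1 to 3
```

Under an AR1 process the correlation between units k steps apart is the lag-1 correlation raised to the power k, so similarity decays steadily with distance.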


Logistic Regression

Coming soon. Also see Module 3 (regression) and Module 4 (count data).



An overview of types of analysis with multiple response variables


Multiple Environment Trials (MET)

Multi-environment trials (MET) are widely conducted to look at the response of crop genotypes grown across many sites, both within a country and across an agro-ecological region, which is generally part of a continent. Their analysis is a specialised area of statistics and typically involves many years, sites and genotypes. Because the data are unbalanced and the interaction of environments and sites is complex, the models contain both fixed and random effects. Methods able to cope with spatial analysis and these complexities involve mixed models. In some studies there are also aspects of management within a trial that add to the complexity.

MET trials as a presentation


Principal Component Analysis (PCA)

Coming soon


