Title: Statistical Tools for the Analysis of Multi Environment Agronomic Trials
Version: 1.0.0
Description: Data from multi environment agronomic trials, which are often carried out by plant breeders, can be analyzed with the tools offered by this package such as the Additive Main effects and Multiplicative Interaction model or 'AMMI' ('Gauch' 1992, ISBN:9780444892409) and the Site Regression model or 'SREG' ('Cornelius' 1996, <doi:10.1201/9780367802226>). Since these methods perform poorly in the presence of outliers and missing values, this package includes robust versions of the 'AMMI' model ('Rodrigues' 2016, <doi:10.1093/bioinformatics/btv533>), and also imputation techniques specifically developed for this kind of data ('Arciniegas-Alarcón' 2014, <doi:10.2478/bile-2014-0006>).
License: GPL-2
Encoding: UTF-8
LazyData: true
RoxygenNote: 7.3.3
Imports: stats, ggforce, ggplot2, scales, MASS, pcaMethods, rrcov, dplyr, missMDA, tidyr, rlang, corpcor
Suggests: agridat, spelling, knitr, rmarkdown, patchwork, testthat
VignetteBuilder: knitr
Depends: R (≥ 2.12.0)
URL: https://jangelini.github.io/geneticae/, https://github.com/jangelini/geneticae
BugReports: https://github.com/jangelini/geneticae/issues
Language: en-US
NeedsCompilation: no
Packaged: 2026-04-16 17:25:21 UTC; julia
Author: Julia Angelini [aut, cre], Marcos Prunello [aut], Gerardo Cervigni [aut]
Maintainer: Julia Angelini <jangelini_93@hotmail.com>
Repository: CRAN
Date/Publication: 2026-04-16 17:50:02 UTC

Biplot Imputation Method

Description

This function implements the Biplot imputation method as proposed by Yan (2013). It is an iterative algorithm that uses the singular value decomposition (SVD) to impute missing values in a genotype by environment matrix.

Usage

BiplotImputfun(X, precision = 0.01, max.iter = 1000, n_pc = 2)

Arguments

X

A data frame or matrix with genotypes in rows and environments in columns.

precision

(optional) Convergence threshold. The algorithm stops when the relative change in imputed values is less than this value. Default is 0.01.

max.iter

(optional) Maximum number of iterations. Default is 1000.

n_pc

Number of principal components to use for imputation. Default is 2.
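
The iteration described above can be illustrated with a minimal base R sketch. It is not the code of BiplotImputfun(): the function name svd_impute_sketch, the choice of environment means as starting values, and the exact form of the relative-change criterion are assumptions made only for illustration.

```r
# Hypothetical helper (not part of geneticae): generic iterative SVD
# imputation of a genotype x environment matrix.
svd_impute_sketch <- function(X, n_pc = 2, precision = 0.01, max.iter = 1000) {
  X <- as.matrix(X)
  miss <- is.na(X)
  if (!any(miss)) return(list(X = X, iterations = 0L, converged = TRUE))
  # Assumed starting values: environment (column) means of the observed cells
  X[miss] <- colMeans(X, na.rm = TRUE)[col(X)[miss]]
  converged <- FALSE
  for (iter in seq_len(max.iter)) {
    old <- X[miss]
    # Rank-n_pc approximation of the environment-centred matrix
    centre <- colMeans(X)
    s <- svd(sweep(X, 2, centre), nu = n_pc, nv = n_pc)
    fit <- s$u %*% diag(s$d[seq_len(n_pc)], n_pc) %*% t(s$v)
    fit <- sweep(fit, 2, centre, "+")
    # Only the missing cells are updated; observed cells are never changed
    X[miss] <- fit[miss]
    # Relative change of the imputed cells between two iterations
    change <- sqrt(sum((X[miss] - old)^2)) /
      max(sqrt(sum(old^2)), .Machine$double.eps)
    if (change < precision) {
      converged <- TRUE
      break
    }
  }
  list(X = X, iterations = iter, converged = converged)
}

# Usage on simulated data with two missing cells
set.seed(1)
full <- 10 + outer(rnorm(8), rnorm(5)) + matrix(rnorm(40, sd = 0.01), 8, 5)
X <- full
X[2, 3] <- NA
X[5, 1] <- NA
res <- svd_impute_sketch(X, n_pc = 2)
```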

Value

A list containing:

References

Yan, W. (2013). Biplot analysis of incomplete two-way data. Crop Science, 53(1), 48-57. doi:10.2135/cropsci2012.05.0301

Arciniegas-Alarcón, S., García-Peña, M., Krzanowski, W., & Dias, C. T. S. (2014b). An alternative methodology for imputing missing data in trials with genotype-by-environment interaction: some new aspects. Biometrical Letters, 51(2), 75-88. doi:10.2478/bile-2014-0006


EM-AMMI imputation method

Description

This function was written by Paderewski (2013) and imputes missing values using the EM-AMMI algorithm.

Usage

EM.AMMI(
  X,
  PC.nb = 1,
  initial.values = NA,
  precision = 0.01,
  max.iter = 1000,
  change.factor = 1,
  simplified.model = FALSE
)

Arguments

X

a data frame or matrix with genotypes in rows and environments in columns when there are no replications of the experiment.

PC.nb

the number of principal components in the AMMI model that will be used; default value is 1. For PC.nb=0 only main effects are used to estimate cells in the data table (the interaction is ignored). The number of principal components must not be greater than min(number of rows in the X table, number of columns in the X table)–2.

initial.values

(optional) initial values of missing cells. It can be a single value, which then will be used for all empty cells, or a vector of length equal to the number of missing cells (starting from the missing values in the first column). If omitted, initial values will be obtained by the main effects from the corresponding model, that is, by grand mean of observed data increased (or decreased) by row and column main effects.

precision

(optional) algorithm converges if the maximal change in the values of the missing cells in two subsequent steps is not greater than this value (the default is 0.01).

max.iter

(optional) a maximum permissible number of iterations (that is, number of repeats of the algorithm’s steps 2 through 5); default value is 1000.

change.factor

(optional) introduced by analogy to the step size in the gradient descent method, this parameter can shorten the execution time of the algorithm by decreasing the number of iterations. The change.factor=1 (default) defines that the previous approximation is replaced with the new values of missing cells (standard EM-AMMI algorithm). However, when change.factor<1, the new approximations are computed and the values of missing cells are changed in the direction of this new approximation, but the change is smaller. It could be useful if the changes are cyclic and thus convergence could not be reached. Usually, this argument should not affect the final outcome (that is, the imputed values) as compared to the default value of change.factor=1.

simplified.model

the AMMI model contains the general mean, effects of rows, columns and interaction terms. So the EM-AMMI algorithm in step 2 calculates the current effects of rows and columns; these effects change from iteration to iteration because the empty (at the outset) cells in each iteration are filled with different values. In step 3 EM-AMMI uses those effects to re-estimate cells marked as missing (as default, simplified.model=FALSE). It is, however, possible that this procedure will not converge. Thus the user is offered a simplified EM-AMMI procedure that calculates the general mean and effects of rows and columns only in the first iteration and in next iterations uses these values (simplified.model=TRUE). In this simplified procedure the initial values affect the outcome (whilst EM-AMMI results usually do not depend on initial values). For the simplified procedure the number of iterations to convergence is usually smaller and, furthermore, convergence will be reached even in some cases where the regular procedure fails. If the regular procedure does not converge for the standard initial values (see the description of the argument initial.values), the simplified model can be used to determine a better set of initial values.
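
The roles of PC.nb, precision, max.iter and change.factor can be illustrated with the following base R sketch. It is a simplified reading of the description above, not Paderewski's EM.AMMI() code: the name em_ammi_sketch is hypothetical, and the starting values use observed row and column means as a simple approximation to the main effects.

```r
# Hypothetical helper (not part of geneticae): EM-type imputation with an
# additive part plus n_pc multiplicative terms.
em_ammi_sketch <- function(X, n_pc = 1, precision = 0.01,
                           max.iter = 1000, change.factor = 1) {
  X <- as.matrix(X)
  # Upper bound on the number of components stated for PC.nb
  stopifnot(n_pc >= 0, n_pc <= min(dim(X)) - 2)
  miss <- is.na(X)
  # Starting values: grand mean plus row and column deviations (observed data)
  mu <- mean(X, na.rm = TRUE)
  row_eff <- rowMeans(X, na.rm = TRUE) - mu
  col_eff <- colMeans(X, na.rm = TRUE) - mu
  X[miss] <- (mu + outer(row_eff, col_eff, "+"))[miss]
  converged <- !any(miss)
  iter <- 0L
  while (!converged && iter < max.iter) {
    iter <- iter + 1L
    # Additive part re-estimated from the currently completed table
    mu <- mean(X)
    additive <- mu + outer(rowMeans(X) - mu, colMeans(X) - mu, "+")
    fit <- additive
    if (n_pc > 0) {
      # Multiplicative part: leading n_pc terms of the SVD of the residuals
      s <- svd(X - additive, nu = n_pc, nv = n_pc)
      fit <- additive + s$u %*% diag(s$d[seq_len(n_pc)], n_pc) %*% t(s$v)
    }
    # change.factor < 1 moves only part of the way towards the new estimate
    step <- change.factor * (fit[miss] - X[miss])
    X[miss] <- X[miss] + step
    # Converged when the largest change of a missing cell is <= precision
    converged <- max(abs(step)) <= precision
  }
  list(X = X, iterations = iter, converged = converged)
}

# Usage: full update versus a damped update
set.seed(2)
full <- 20 + outer(rnorm(7), rep(1, 5)) + outer(rep(1, 7), rnorm(5)) +
  outer(rnorm(7), rnorm(5))
X <- full
X[1, 2] <- NA
X[4, 5] <- NA
r1 <- em_ammi_sketch(X, n_pc = 1)
r2 <- em_ammi_sketch(X, n_pc = 1, change.factor = 0.5)
```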

Value

A list containing:

References

Paderewski, J. (2013). An R function for imputation of missing cells in two-way data sets by EM-AMMI algorithm. Communications in Biometry and Crop Science 8, 60–69.


EM-SREG Imputation Method

Description

Iterative algorithm to impute missing values in two-way tables using the Site Regression (SREG) model. It supports several variants including standard SVD and Bayesian PCA.

Usage

EM.SREG(
  X,
  PC.nb = 1,
  initial.values = NA,
  precision = 0.01,
  max.iter = 1000,
  change.factor = 1,
  simplified.model = FALSE,
  type = c("EM-SREG", "EM-bSREG")
)

Arguments

X

A data frame or matrix with genotypes in rows and environments in columns.

PC.nb

Number of principal components to be used. Default is 1.

initial.values

(optional) Initial values for missing cells. If NA, initial values are obtained from column means (environment effects).

precision

Convergence threshold. Default is 0.01.

max.iter

Maximum number of iterations. Default is 1000.

change.factor

Step size for updating missing values (standard is 1).

simplified.model

Logical. If TRUE, effects are only calculated in the first iteration.

type

Method type: "EM-SREG" (Standard), "EM-bSREG" (Bayesian).
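
As an illustration of how initial.values and simplified.model interact, the sketch below starts from environment means and either re-estimates them at every iteration or freezes them after the first one. It covers only a standard SVD variant; the name em_sreg_sketch and the details of the update are assumptions, not the code of EM.SREG().

```r
# Hypothetical helper (not part of geneticae): EM-type imputation under a
# site regression structure (environment mean + multiplicative terms).
em_sreg_sketch <- function(X, n_pc = 1, precision = 0.01, max.iter = 1000,
                           change.factor = 1, simplified.model = FALSE) {
  X <- as.matrix(X)
  stopifnot(n_pc >= 1)
  miss <- is.na(X)
  # Initial values: environment (column) means of the observed cells
  env_mean <- colMeans(X, na.rm = TRUE)
  X[miss] <- env_mean[col(X)[miss]]
  converged <- !any(miss)
  iter <- 0L
  while (!converged && iter < max.iter) {
    iter <- iter + 1L
    # Simplified model: keep the environment effects of the first pass
    if (!simplified.model) env_mean <- colMeans(X)
    s <- svd(sweep(X, 2, env_mean), nu = n_pc, nv = n_pc)
    fit <- s$u %*% diag(s$d[seq_len(n_pc)], n_pc) %*% t(s$v)
    fit <- sweep(fit, 2, env_mean, "+")
    step <- change.factor * (fit[miss] - X[miss])
    X[miss] <- X[miss] + step
    converged <- max(abs(step)) <= precision
  }
  list(X = X, iterations = iter, converged = converged)
}

# Usage: regular and simplified procedure on the same incomplete table
set.seed(3)
full <- 5 + outer(rnorm(6), rnorm(4)) + matrix(rep(rnorm(4), each = 6), 6, 4)
X <- full
X[2, 1] <- NA
X[6, 3] <- NA
a <- em_sreg_sketch(X)
b <- em_sreg_sketch(X, simplified.model = TRUE)
```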

Value

A list containing:

References

Angelini, J., Cervigni, G. D. L., & Quaglino, M. B. (2024). New imputation methodologies for genotype-by-environment data: an extensive study of properties of estimators. Euphytica, 220(6), 92. doi:10.1007/s10681-024-03344-z


Eigenvector Imputation Function (Internal)

Description

Internal function for GxE imputation using the Krzanowski (1988) eigenvector approach with a leave-one-out strategy.

Usage

Eigenvectorfun(X, f)

Arguments

X

A matrix with missing values (NAs).

f

Number of components (rank) to use for the reconstruction.

Value

A list with the number of iterations, convergence status, final rank used, and the imputed matrix.


GabrielEigen imputation method

Description

GabrielEigen imputation method

Usage

Gabriel.Calinski(X)

Arguments

X

a data frame or matrix with genotypes in rows and environments in columns when there are no replications of the experiment.

Value

A list containing:

References

Arciniegas-Alarcón S., García-Peña M., Dias C.T.S., Krzanowski W.J. (2010). An alternative methodology for imputing missing data in trials with genotype-by-environment interaction. Biometrical Letters 47, 1–14.


Weighted GabrielEigen imputation method

Description

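Imputes missing values in a genotype-by-environment matrix using the weighted GabrielEigen method (Arciniegas-Alarcón et al., 2014).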

Usage

WGabriel(DBmiss, Winf, Wsup)

Arguments

DBmiss

a data frame or matrix that contains the genotypes in the rows and the environments in the columns when there are no replications of the experiment.

Winf

lower weight

Wsup

upper weight

Value

A list containing:

References

Arciniegas-Alarcón S., García-Peña M., Krzanowski W.J., Dias C.T.S. (2014). An alternative methodology for imputing missing data in trials with genotype-by-environment interaction: some new aspects. Biometrical Letters 51, 75-88.


Imputation of missing cells in two-way data sets

Description

Missing values are not allowed by the AMMI, GGE or SREG methods. This function provides several methods to impute missing observations in data from multi-environment trials so that these models can subsequently be fitted.

Usage

imputation(
  Data,
  genotype = "gen",
  environment = "env",
  response = "yield",
  rep = NULL,
  type = "EM-AMMI",
  nPC = 2,
  initial.values = NA,
  precision = 0.01,
  maxiter = 1000,
  change.factor = 1,
  simplified.model = FALSE,
  scale = TRUE,
  method = "EM",
  row.w = NULL,
  coeff.ridge = 1,
  seed = NULL,
  nb.init = 1,
  Winf = 0.8,
  Wsup = 1
)

Arguments

Data

dataframe containing genotypes, environments, repetitions (if any) and the phenotypic trait of interest. Other variables that will not be used in the analysis can be present.

genotype

column name containing genotypes.

environment

column name containing environments.

response

column name containing the phenotypic trait.

rep

column name containing replications. If this argument is NULL, there are no replications available in the data. Defaults to NULL.

type

imputation method. Either "EM-AMMI", "EM-GGE", "EM-SREG", "EM-bSREG", "Gabriel", "Eigenvector", "WGabriel", "EM-PCA". Defaults to "EM-AMMI".

nPC

number of components used to predict the missing values. Defaults to 2.

initial.values

initial values of the missing cells. It can be a single value or a vector of length equal to the number of missing cells.

precision

threshold for assessing convergence.

maxiter

maximum number of iterations for the algorithm.

change.factor

When 'change.factor' is equal to 1, the previous approximation is changed with the new values (standard EM). Smaller values can help convergence if changes are cyclic.

simplified.model

logical. If TRUE, calculates effects only in the first iteration to speed up convergence or help in cases where the regular procedure fails.

scale

boolean. By default TRUE for "EM-PCA".

method

"Regularized" or "EM" for "EM-PCA".

row.w

row weights for "EM-PCA".

coeff.ridge

ridge coefficient for "EM-PCA".

seed

integer for random initialization in "EM-PCA".

nb.init

number of random initializations for "EM-PCA".

Winf

lower weight for WGabriel.

Wsup

upper weight for WGabriel.
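
The Data, genotype, environment, response and rep arguments describe long-format input, whereas the imputation methods operate on a genotype-by-environment table. A minimal base R sketch of that reshaping follows; it assumes that replications are averaged within each cell, and the toy data frame long is invented for illustration.

```r
# Toy long-format data; column names follow the defaults of imputation()
long <- data.frame(
  gen   = rep(c("G1", "G2", "G3"), each = 4),
  env   = rep(rep(c("E1", "E2"), each = 2), times = 3),
  rep   = rep(1:2, times = 6),
  yield = c(4.1, 4.3, 5.0, 5.2, 3.8, 4.0, NA, NA, 4.6, 4.4, 5.5, 5.7)
)

# Mean over replications for every genotype x environment cell
wide <- tapply(long$yield, list(long$gen, long$env), mean, na.rm = TRUE)

# A cell with no observed replication yields NaN; mark it as missing
wide[is.nan(wide)] <- NA
wide
```

The resulting matrix has genotypes in rows and environments in columns, which is the layout expected by the matrix-level functions documented above.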

Details

Often, multi-environment experiments are unbalanced because several genotypes are not tested in some environments. Several methodologies have been proposed in order to solve this lack of balance caused by missing values, some of which are included in this function (see the type argument).

Value

A matrix of the imputed data.

References

Paderewski, J. (2013). An R function for imputation of missing cells in two-way data sets by EM-AMMI algorithm. Communications in Biometry and Crop Science 8, 60–69.

Yan, W. (2013). Biplot analysis of incomplete two-way data. Crop Science, 53(1), 48-57. doi:10.2135/cropsci2012.05.0301

Arciniegas-Alarcón, S., García-Peña, M., Krzanowski, W., & Dias, C. T. S. (2014b). An alternative methodology for imputing missing data in trials with genotype-by-environment interaction: some new aspects. Biometrical Letters, 51(2), 75-88. doi:10.2478/bile-2014-0006

Angelini, J., Cervigni, G. D. L., & Quaglino, M. B. (2024). New imputation methodologies for genotype-by-environment data: an extensive study of properties of estimators. Euphytica, 220(6), 92. doi:10.1007/s10681-024-03344-z

Julie Josse, Francois Husson (2016). missMDA: A Package for Handling Missing Values in Multivariate Data Analysis. Journal of Statistical Software 70, 1-31.

Arciniegas-Alarcón S., García-Peña M., Dias C.T.S., Krzanowski W.J. (2010). An alternative methodology for imputing missing data in trials with genotype-by-environment interaction. Biometrical Letters 47, 1–14.

Arciniegas-Alarcón S., García-Peña M., Krzanowski W.J., Dias C.T.S. (2014). An alternative methodology for imputing missing data in trials with genotype-by-environment interaction: some new aspects. Biometrical Letters 51, 75-88.

Examples

library(geneticae)
# Data without replications
library(agridat)
data(yan.winterwheat)

# generating missing values
yan.winterwheat[1,3]<-NA
yan.winterwheat[3,3]<-NA
yan.winterwheat[2,3]<-NA

imputation(yan.winterwheat, genotype = "gen", environment = "env",
           response = "yield", type = "EM-AMMI")

# Data with replications
data(plrv)
plrv[1,3] <- NA
plrv[3,3] <- NA
plrv[2,3] <- NA
imputation(plrv, genotype = "Genotype", environment = "Locality",
           response = "Yield", rep = "Rep", type = "EM-AMMI")


Clones from the PLRV population

Description

Resistance study to PLRV (Potato Leaf Roll Virus), which causes leaf curl. 28 genotypes were tested at 6 locations in Peru. Each clone was evaluated three times in each environment, and yield, plant weight and plot weight were recorded.

Usage

data(plrv)

Format

Data frame with 504 observations and 6 variables (genotype, locality, repetition, weightPlant, weightPlot and yield).

References

Felipe de Mendiburu (2020). agricolae: Statistical Procedures for Agricultural Research. R package version 1.3-2. https://CRAN.R-project.org/package=agricolae

Examples

library(geneticae)
data(plrv)
str(plrv)


Robust AMMI Model

Description

Fits a classical or robust Additive Main effects and Multiplicative Interaction (AMMI) model.

Usage

rAMMIModel(
  Data,
  genotype = "gen",
  environment = "env",
  response = "Y",
  rep = NULL,
  Ncomp = 2,
  type = "AMMI"
)

Arguments

Data

a dataframe with genotypes, environments and the phenotypic trait.

genotype

column name containing genotypes.

environment

column name containing environments.

response

column name containing the phenotypic trait.

rep

column name containing replications. If provided, means are calculated.

Ncomp

number of principal components to retain.

type

method for fitting: '"AMMI"', '"rAMMI"', '"hAMMI"', '"gAMMI"', '"lAMMI"' or '"ppAMMI"'.
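
For orientation, the classical fit (type = "AMMI") can be sketched in base R as an additive fit followed by an SVD of the interaction residuals. This is an illustration under assumptions, not the code of rAMMIModel(): the name ammi_sketch, the symmetric scaling of the scores and the element names of the returned list are hypothetical, and the robust variants, which replace these steps with robust counterparts, are not covered.

```r
# Hypothetical helper (not part of geneticae): classical AMMI decomposition
# of a complete genotype x environment table of cell means.
ammi_sketch <- function(Y, Ncomp = 2) {
  Y <- as.matrix(Y)
  mu <- mean(Y)
  gen_eff <- rowMeans(Y) - mu          # genotype main effects
  env_eff <- colMeans(Y) - mu          # environment main effects
  # Interaction residuals left after the additive main-effects fit
  resid <- Y - mu - outer(gen_eff, env_eff, "+")
  s <- svd(resid, nu = Ncomp, nv = Ncomp)
  d <- s$d[seq_len(Ncomp)]
  list(
    grand_mean    = mu,
    gen_effects   = gen_eff,
    env_effects   = env_eff,
    # Assumed symmetric scaling: sqrt of singular values on both sides
    gen_scores    = s$u %*% diag(sqrt(d), Ncomp),
    env_scores    = s$v %*% diag(sqrt(d), Ncomp),
    var_explained = 100 * d^2 / sum(s$d^2)
  )
}

# Usage on a simulated 5 x 4 table
set.seed(4)
Y <- matrix(rnorm(20, mean = 50, sd = 5), 5, 4)
fit <- ammi_sketch(Y, Ncomp = 3)
```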


AMMI Biplots with ggplot2

Description

Produces a biplot for objects of class 'AMMI'.

Usage

rAMMIPlot(
  model_res,
  colGen = "gray47",
  colEnv = "darkred",
  sizeGen = 6,
  sizeEnv = 6,
  titles = TRUE,
  footnote = TRUE,
  axis_expand = 1.2,
  limits = TRUE,
  axes = TRUE,
  axislabels = TRUE
)

Arguments

model_res

an object of class 'AMMI' from rAMMIModel.

colGen

genotype colour. Defaults to "gray47".

colEnv

environment colour. Defaults to "darkred".

sizeGen

genotype text size.

sizeEnv

environment text size.

titles

logical, show plot title.

footnote

logical, show footnote with explained variance.

axis_expand

expansion factor for axis limits.

limits

logical. If 'TRUE' axes are automatically rescaled. Defaults to 'TRUE'.

axes

logical, if this argument is 'TRUE' axes passing through the origin are drawn. Defaults to 'TRUE'.

axislabels

logical, if this argument is 'TRUE' axis labels are included. Defaults to 'TRUE'.


Site Regression model

Description

The Site Regression model (also called genotype + genotype-by-environment (GGE) model) is a powerful tool for effective analysis and interpretation of data from multi-environment trials in breeding programs. There are different functions in R to fit the SREG model; however, this function has the following improvements:

Usage

rSREGModel(
  Data,
  genotype = "gen",
  environment = "env",
  response = "yield",
  rep = NULL,
  model = "SREG",
  SVP = "symmetrical"
)

Arguments

Data

dataframe with genotypes, environments, repetitions (if any) and the phenotypic trait of interest. Additional variables that will not be used in the model may be present in the data.

genotype

column name for genotypes.

environment

column name for environments.

response

column name for the phenotypic trait.

rep

column name for replications. If this argument is NULL, there are no replications in the data. Defaults to NULL.

model

method for fitting the SREG model: '"SREG"', '"CovSREG"', '"hSREG"' or '"ppSREG"' (see References). Defaults to '"SREG"'.

SVP

method for singular value partitioning. Either '"row"', '"column"', or '"symmetrical"'. Defaults to '"symmetrical"'.
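
The effect of SVP can be illustrated with a base R sketch of the classical SREG fit: the table is centred by environment and the singular values are split between genotype and environment coordinates. The sketch assumes the usual convention (all of the singular value on genotypes for "row", all on environments for "column", the square root on each for "symmetrical"); the name sreg_sketch and the returned element names are hypothetical, and any additional scaling applied by rSREGModel() is not reproduced.

```r
# Hypothetical helper (not part of geneticae): environment-centred SVD with
# singular value partitioning.
sreg_sketch <- function(Y, SVP = c("symmetrical", "row", "column"), n_pc = 2) {
  SVP <- match.arg(SVP)
  Y <- as.matrix(Y)
  # Remove environment main effects; genotype + GxE variation remains
  Yc <- sweep(Y, 2, colMeans(Y))
  s <- svd(Yc, nu = n_pc, nv = n_pc)
  d <- s$d[seq_len(n_pc)]
  # Share of each singular value assigned to the genotype coordinates
  f <- switch(SVP, row = 1, column = 0, symmetrical = 0.5)
  list(
    gen_coord = s$u %*% diag(d^f, n_pc),
    env_coord = s$v %*% diag(d^(1 - f), n_pc),
    varexpl   = 100 * d^2 / sum(s$d^2)
  )
}

# Usage: the three partitions give different coordinates but the same
# rank-2 approximation of the centred table
set.seed(5)
Y <- matrix(rnorm(24, mean = 30, sd = 4), 6, 4)
m_row <- sreg_sketch(Y, "row")
m_col <- sreg_sketch(Y, "column")
m_sym <- sreg_sketch(Y, "symmetrical")
```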

Details

A linear model by robust regression using an M estimator proposed by Huber (1964, 1973) fitted by iterated re-weighted least squares, in combination with three robust SVD/PCA procedures, resulted in a total of three robust SREG alternatives. The robust SVD/PCA considered were:

Value

A list of class GGE_Model containing:

model

SREG model version.

coordgenotype

plotting coordinates for each genotype in every component.

coordenviroment

plotting coordinates for each environment in every component.

eigenvalues

vector of eigenvalues for each component.

vartotal

overall variance.

varexpl

percentage of variance explained by each component.

labelgen

genotype names.

labelenv

environment names.

axes

axis labels.

Data

scaled and centered input data.

SVP

name of SVP method.

A biplot of class ggplot

References

Julia Angelini, Gabriela Faviere, Eugenia Bortolotto, Gerardo Domingo Lucio Cervigni & Marta Beatriz Quaglino (2022) Handling outliers in multi-environment trial data analysis: in the direction of robust SREG model, Journal of Crop Improvement, DOI: 10.1080/15427528.2022.2051217

Examples

 library(geneticae)

 # Data without replication
 library(agridat)
 data(yan.winterwheat)
 GGE1 <- rSREGModel(yan.winterwheat, genotype="gen", environment="env", response="yield")

 # Data with replication
 data(plrv)
 GGE2 <- rSREGModel(plrv, genotype = "Genotype", environment = "Locality",
                  response = "Yield", rep = "Rep")


GGE biplots with ggplot2

Description

GGE biplots are used for visual examination of the relationships between test environments, genotypes, and genotype-by-environment interactions. 'rSREGPlot()' produces a biplot as an object of class 'ggplot', using the output of the rSREGModel function. Several types of biplots are offered which focus on different aspects of the analysis. Customization options are also included. This function is a modification of the 'GGEPlot' function from the GGEBiplots package.

Usage

rSREGPlot(
  rSREGModel,
  type = "Biplot",
  d1 = 1,
  d2 = 2,
  selectedE = NA,
  selectedG = NA,
  selectedG1 = NA,
  selectedG2 = NA,
  colGen = "gray47",
  colEnv = "darkred",
  colSegment = "gray30",
  colHull = "gray30",
  sizeGen = 6,
  sizeEnv = 6,
  largeSize = 4.5,
  axis_expand = 1.2,
  axislabels = TRUE,
  axes = TRUE,
  limits = TRUE,
  titles = TRUE,
  footnote = TRUE
)

Arguments

rSREGModel

An object of class GGE_Model, as returned by rSREGModel.

type

type of biplot to produce.

  • "Biplot": Basic biplot.

  • "Selected Environment": Ranking of cultivars based on their performance in any given environment.

  • "Selected Genotype": Ranking of environments based on the performance of any given cultivar.

  • "Relationship Among Environments".

  • "Comparison of Genotype".

  • "Which Won Where/What": Identifying the 'best' cultivar in each environment.

  • "Discrimination vs. representativeness": Evaluating the environments based on both discriminating ability and representativeness.

  • "Ranking Environments": Ranking environments with respect to the ideal environment.

  • "Mean vs. stability": Evaluating cultivars based on both average yield and stability.

  • "Ranking Genotypes": Ranking genotypes with respect to the ideal genotype.

d1

PCA component to plot on x axis. Defaults to 1.

d2

PCA component to plot on y axis. Defaults to 2.

selectedE

name of the environment to evaluate when 'type="Selected Environment"'.

selectedG

name of the genotype to evaluate when 'type="Selected Genotype"'.

selectedG1

name of the genotype to compare to 'selectedG2' when 'type="Comparison of Genotype"'.

selectedG2

name of the genotype to compare to 'selectedG1' when 'type="Comparison of Genotype"'.

colGen

genotype attributes colour. Defaults to '"gray47"'.

colEnv

environment attributes colour. Defaults to '"darkred"'.

colSegment

segment or circle lines colour. Defaults to '"gray30"'.

colHull

hull colour when 'type="Which Won Where/What"'. Defaults to "gray30".

sizeGen

genotype labels text size. Defaults to 6.

sizeEnv

environment labels text size. Defaults to 6.

largeSize

larger labels text size to use for two selected genotypes in 'type="Comparison of Genotype"', and for the outermost genotypes in 'type="Which Won Where/What"'. Defaults to 4.5.

axis_expand

multiplication factor to expand the axis limits by to enable fitting of labels. Defaults to 1.2.

axislabels

logical, if this argument is 'TRUE' labels for axes are included. Defaults to 'TRUE'.

axes

logical, if this argument is 'TRUE' x and y axes going through the origin are drawn. Defaults to 'TRUE'.

limits

logical, if this argument is 'TRUE' the axes are re-scaled. Defaults to 'TRUE'.

titles

logical, if this argument is 'TRUE' a plot title is included. Defaults to 'TRUE'.

footnote

logical, if this argument is 'TRUE' a footnote is included. Defaults to 'TRUE'.
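
The "outermost genotypes" highlighted when type = "Which Won Where/What" are the vertices of the polygon enclosing all genotype markers. A minimal base R sketch of how such vertices can be found with a convex hull is given below; the coordinates in coord_gen are invented for illustration, and the sketch does not reproduce how rSREGPlot() draws the polygon.

```r
# Invented genotype coordinates on the two plotted components
coord_gen <- rbind(
  G1 = c(-2.0,  0.5),
  G2 = c( 0.2,  1.8),
  G3 = c( 1.9, -0.3),
  G4 = c( 0.1, -1.6),
  G5 = c( 0.3,  0.2)   # lies inside the polygon formed by G1 to G4
)

# Indices of the points lying on the convex hull
hull_idx <- grDevices::chull(coord_gen)

# Names of the outermost (vertex) genotypes
outermost <- rownames(coord_gen)[hull_idx]
outermost
```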

Value

A biplot of class ggplot

References

Yan W, Kang M (2003). GGE Biplot Analysis: A Graphical Tool for Breeders, Geneticists, and Agronomists. CRC Press.

Sam Dumble (2017). GGEBiplots: GGE Biplots with 'ggplot2'. R package version 0.1.1. https://CRAN.R-project.org/package=GGEBiplots

Examples

 library(geneticae)

 # Data without replication
 library(agridat)
 data(yan.winterwheat)
 GGE1 <- rSREGModel(yan.winterwheat)
 rSREGPlot(GGE1)

 # Data with replication
 data(plrv)
 GGE2 <- rSREGModel(plrv, genotype = "Genotype", environment = "Locality",
                  response = "Yield", rep = "Rep")
 rSREGPlot(GGE2)