The case study handles the situation in German Holstein Frisian (HF) breeding where the general Index (RZG) is based on several traits. The composition of the index changed in 2021, so that there are now eight instead of 6 traits in the index. Also, the exact trait definition was changed. In total, there are 10 “breeding goal traits”. See Simianer et al. “How economic weights translate into genetic and phenotypic progress, and vice versa” Genet Sel Evol 55, 38 (2023) for more details on it.
The economic weights (w) of the indices are as follows,
with traits not in the index having zero weight.
tn <- c("RZM", "RZN", "RZEo", "RZEn", "RZR", "RZKm", "RZKd", "RZH", "RZC", "RZS")
w_old <- c(0.45, 0.2, 0.15, 0, 0.1, 0.03, 0, 0, 0, 0.07)
names(w_old) <- tn; w_old
#>  RZM  RZN RZEo RZEn  RZR RZKm RZKd  RZH  RZC  RZS 
#> 0.45 0.20 0.15 0.00 0.10 0.03 0.00 0.00 0.00 0.07
w_new <- c(0.36, 0.18, 0, 0.15, 0.07, 0.015, 0.015, 0.18, 0.03, 0)
names(w_new) <- tn; w_old
#>  RZM  RZN RZEo RZEn  RZR RZKm RZKd  RZH  RZC  RZS 
#> 0.45 0.20 0.15 0.00 0.10 0.03 0.00 0.00 0.00 0.07Breeding values are scaled to a mean of 100 index points and a additive genetic standard deviation of 12 index points. This makes it easy to set up the genetic variance-covariance matrix ( \(\Gamma = G\) ) from the genetic correlation matrix by simply multiplying the correlation matrix with a constant of 122.
G <- matrix(
  c(1.0,0.13,0.13,0.07,-0.15,0.11,0.07,0.09,-0.02,0.04,
    0.13,1.0,0.23,0.28,0.43,0.25,0.22,0.78,0.13,0.46,
    0.13,0.23,1.0,0.92,0.02,0.09,-0.05,0.25,-0.1,0.19,
    0.07,0.28,0.92,1.0,0.06,0.08,-.03,0.31,-.1,0.25,
    -0.15,0.43,.02,0.06,1.0,0.32,0.19,0.41,0.04,0.15,
    0.11,0.25,0.09,0.08,0.32,1.0,0.0,0.25,0.04,0.13,
    0.07,0.22,-.05,-.03,0.19,0,1.0,0.23,0.05,0.10,
    0.09,0.78,0.25,0.31,0.41,0.25,0.23,1.0,0.1,0.57,
    -.02,0.13,-.10,-.1,0.04,0.04,0.05,0.1,1.0,0.02,
    0.04,0.46,0.19,0.25,0.15,0.13,0.10,0.57,0.02,1.0)
  ,byrow = TRUE, nrow = length(tn), ncol = length(tn), dimnames = list(tn, tn)
)
G <- G*12^2
G
#>         RZM    RZN   RZEo   RZEn    RZR   RZKm   RZKd    RZH    RZC    RZS
#> RZM  144.00  18.72  18.72  10.08 -21.60  15.84  10.08  12.96  -2.88   5.76
#> RZN   18.72 144.00  33.12  40.32  61.92  36.00  31.68 112.32  18.72  66.24
#> RZEo  18.72  33.12 144.00 132.48   2.88  12.96  -7.20  36.00 -14.40  27.36
#> RZEn  10.08  40.32 132.48 144.00   8.64  11.52  -4.32  44.64 -14.40  36.00
#> RZR  -21.60  61.92   2.88   8.64 144.00  46.08  27.36  59.04   5.76  21.60
#> RZKm  15.84  36.00  12.96  11.52  46.08 144.00   0.00  36.00   5.76  18.72
#> RZKd  10.08  31.68  -7.20  -4.32  27.36   0.00 144.00  33.12   7.20  14.40
#> RZH   12.96 112.32  36.00  44.64  59.04  36.00  33.12 144.00  14.40  82.08
#> RZC   -2.88  18.72 -14.40 -14.40   5.76   5.76   7.20  14.40 144.00   2.88
#> RZS    5.76  66.24  27.36  36.00  21.60  18.72  14.40  82.08   2.88 144.00In our case, reliabilities (\(r^2_{AI}\)) of the estimated breeding values are available for all traits.
r2 <- c(0.743, 0.673, 0.638, 0.717, 0.541, 0.635, 0.604, 0.720, 0.499, 0.764)
names(r2) <- tn; r2
#>   RZM   RZN  RZEo  RZEn   RZR  RZKm  RZKd   RZH   RZC   RZS 
#> 0.743 0.673 0.638 0.717 0.541 0.635 0.604 0.720 0.499 0.764Note: If you calculate an index for observed phenotypes based on own performances, the reliability ( \(r^2\) ) equals the narrow sense heritability ( \(h^2\) ).
If we regard a situation where breeding value estimation is only performed for the actual index traits, \(r^2\) needs to be subsetted.
r2_old <- r2[w_old != 0]
r2_new <- r2[w_new != 0]For the case of the old index, estimates of the observed genetic gains of the traits in the index are available from evaluations of the HF breeding program.
deltautr <- c(0.28401392, 0.21637703, 0.17932963, 0.09986764, 0.08767239, 0.13273939)
names(deltautr) <- tn[w_old>0]We may further need heritabilities (\(h^2\)) of the traits to translate genetic gain into phenotypic gain.
h2 <- c(0.314, 0.090, 0.194, 0.194, 0.013, 0.049, 0.033, 0.061, 0.014, 0.273)
names(h2) <- tnFurther, to allow residual errors to be correlated, we need a variance-covariance matrix of breeding values (\(H\)) as starting point for the internal estimation process.
H <- matrix (
  c(1.00,0.06,0.06,-.20,0.05,0.03,
    0.06,1.00,0.14,0.40,0.20,0.46,
    0.06,0.14,1.00,-.03,0.03,0.15,
    -.20,0.40,-.03,1.00,0.30,0.13,
    0.05,0.20,0.03,0.30,1.00,0.11,
    0.03,0.46,0.15,0.13,0.11,1.00),
  nrow=6, ncol=6,
  dimnames = list(tn[w_old != 0], tn[w_old != 0]))
H <- H * 144
H
#>         RZM    RZN   RZEo    RZR   RZKm    RZS
#> RZM  144.00   8.64   8.64 -28.80   7.20   4.32
#> RZN    8.64 144.00  20.16  57.60  28.80  66.24
#> RZEo   8.64  20.16 144.00  -4.32   4.32  21.60
#> RZR  -28.80  57.60  -4.32 144.00  43.20  18.72
#> RZKm   7.20  28.80   4.32  43.20 144.00  15.84
#> RZS    4.32  66.24  21.60  18.72  15.84 144.00IndexWizardAll index calculations are performed within the function
SelInd() . Minimum required input are w,
G and r2. Note that all vectors and matrices
need to be named to allow checks and sorting based on trait names.
res <- SelInd(
  w = w_old,
  G = G,
  r2 = r2_old
)
#> - only reliabilities given
#>   --> setting up E based on r2 and G
#>   --> residual errors are assumed to be uncorrelated!
#> - no selection intensity provided
#>   --> can only compute the relative genetic and phenotypic trend
#> - no heritabilities provided
#>   --> cannot compute the expected relative phenotypic trend
#> - No observed composition of genetic progress given
#>   --> cannot calculate realized weightsThe function by default informs the user, if certain calculations
cannot be performed. This behavior can be silenced with setting
verbose = FALSE.
res <- SelInd(w = w_old,  G = G,  r2 = r2_old, verbose = FALSE)The summary() function further gives information on the
number of traits, and all available entries in the SelInd
object.
summary(res)
#> An object of class SelInd. The index is based on:
#>    - n = 10 breeding goal traits
#>    - 6 traits with economic weight != 0
#>    - m = 6 index traits
#> 
#> Residual errors were assumed to be uncorrelated.
#> 
#> The SelInd object contains the entries w, G, E, r2, b, b_scaled, var_I, d_G_exp_scaled, r_IP, r_IH, del_d_scaled, delta, D, del_d_abs.
#> The SelInd object does not contain the entries H, h2, w_real, b_real, i, d_G_obs, d_G_exp, d_P_exp, d_P_exp_scaled, dG.The print() function further re-formats the output to a
more readable format, which includes rounding to two decimals. If you
want to see the “pure” list, use print.default()
instead.
res
#> An object of class SelInd. The index is based on:
#>    - n = 10 breeding goal traits
#>    - 6 traits with economic weight != 0
#>    - m = 6 index traits
#> 
#> --------------------------------------------------------------------
#> 
#> Matrices G, E present in SelInd object, but not printed to reduce complexity.
#> Extract them by the use of the `$` operator.
#> 
#> --------------------------------------------------------------------
#> 
#> Vaiance of the index (`var_I`), expected genetic gain (`dG`)
#> and assumed selection intensity (`i`):
#> 
#> $var_I:
#> [1] 41.29
#> 
#> --------------------------------------------------------------------
#> 
#> Economic (`w`) and index weights (`b`). Potentially they are rescaled (`scaled`)
#> so that sum(abs()) = 1. The weights might represent realized (`real`) weights
#> based on an observed composition of the genetic trend:
#> 
#> $w:
#>  RZM  RZN RZEo RZEn  RZR RZKm RZKd  RZH  RZC  RZS 
#> 0.45 0.20 0.15 0.00 0.10 0.03 0.00 0.00 0.00 0.07 
#> 
#> $b:
#>  RZM  RZN RZEo  RZR RZKm  RZS 
#> 0.45 0.24 0.17 0.09 0.07 0.11 
#> 
#> $b_scaled:
#>  RZM  RZN RZEo  RZR RZKm  RZS 
#> 0.40 0.21 0.15 0.08 0.06 0.09 
#> 
#> --------------------------------------------------------------------
#> 
#> reliabilities (`r2`) and heritabilities (`h2`) of the traits:
#> 
#> $r2:
#>  RZM  RZN RZEo  RZR RZKm  RZS 
#> 0.74 0.67 0.64 0.54 0.64 0.76 
#> 
#> --------------------------------------------------------------------
#> 
#> Composition (`d`) of genetic (`G`) / phenotypic (`P`) trend.
#> The composition might be observed (`obs`) or expected (`exp`).
#> The composition might be scaled (`scaled`) so that sum(abs()) = 1:
#> 
#> $d_G_exp_scaled:
#>  RZM  RZN RZEo RZEn  RZR RZKm RZKd  RZH  RZC  RZS 
#> 0.20 0.16 0.11 0.10 0.05 0.08 0.04 0.14 0.00 0.11 
#> 
#> --------------------------------------------------------------------
#> 
#> Analytic measures:
#> Correlation between the overall index and the phenotype of trait j (`r_IP`) and
#> the loss in prediction accuracy when omitting trait j from the index (`r_IH`).
#> Further the approximate first derivative of d_G_exp with respect to w (`del_d_scaled`) with rows
#> being scaled  so that sum(abs()) = 1.
#> 
#> $r_IP:
#>  RZM  RZN RZEo  RZR RZKm  RZS 
#> 0.80 0.68 0.49 0.25 0.37 0.43 
#> 
#> $r_IH:
#>  RZM  RZN RZEo  RZR RZKm  RZS 
#> 0.30 0.05 0.03 0.01 0.01 0.01 
#> 
#> $del_d_scaled:
#>        RZM   RZN  RZEo  RZEn   RZR  RZKm  RZKd   RZH   RZC   RZS
#> RZM   0.15 -0.15 -0.09 -0.10 -0.14 -0.06 -0.02 -0.14 -0.01 -0.13
#> RZN  -0.16  0.23 -0.01  0.02  0.15  0.04  0.05  0.18  0.04  0.12
#> RZEo -0.11 -0.01  0.36  0.33 -0.03 -0.02 -0.05  0.02 -0.05  0.02
#> RZEn -0.13  0.03  0.33  0.31 -0.01 -0.02 -0.04  0.05 -0.04  0.06
#> RZR  -0.14  0.15 -0.02  0.00  0.31  0.12  0.05  0.13  0.02  0.04
#> RZKm -0.09  0.06 -0.03 -0.02  0.19  0.49 -0.01  0.07  0.02  0.01
#> RZKd -0.08  0.16 -0.13 -0.10  0.18 -0.03  0.07  0.13  0.04  0.08
#> RZH  -0.15  0.19  0.01  0.04  0.14  0.05  0.04  0.17  0.03  0.18
#> RZC  -0.06  0.19 -0.16 -0.13  0.11  0.07  0.05  0.12  0.06  0.04
#> RZS  -0.14  0.13  0.02  0.05  0.05  0.01  0.02  0.19  0.01  0.39Nevertheless, each entry can also be extracted by the $
operator.
res$w
#>  RZM  RZN RZEo RZEn  RZR RZKm RZKd  RZH  RZC  RZS 
#> 0.45 0.20 0.15 0.00 0.10 0.03 0.00 0.00 0.00 0.07
res$b_scaled
#>        RZM        RZN       RZEo        RZR       RZKm        RZS 
#> 0.39887304 0.20882509 0.15191062 0.08378771 0.06243987 0.09416367The basic usage of the selection index would be the calculation of
index weights (b) to combine several estimated breeding
values to one combined selection criterium (I). We can do
this for both of our indices as follows:
res_old <- SelInd(w = w_old, G = G,  r2 = r2_old, verbose = FALSE)
res_new <- SelInd(w = w_new, G = G,  r2 = r2_new, verbose = FALSE)The according index weights would then be:
round(res_old$b,2)
#>  RZM  RZN RZEo  RZR RZKm  RZS 
#> 0.45 0.24 0.17 0.09 0.07 0.11
round(res_new$b,2)
#>  RZM  RZN RZEn  RZR RZKm RZKd  RZH  RZC 
#> 0.36 0.22 0.17 0.07 0.05 0.04 0.23 0.03Let’s assume an arbitrary individual with following estimated breeding values:
ind <- c(RZM = 130, RZN = 110, RZEo = 105, RZEn = 106, RZR = 95,
         RZKm = 100, RZKd = 101, RZH = 115,  RZC = 90,  RZS = 120)The RZG_old of this animal would then be calculated as follows:
t(res_old$b_scaled) %*% ind[names(res_old$b_scaled)]
#>          [,1]
#> [1,] 116.2783The same individual would have a slightly lower RZG given the new index.
t(res_new$b_scaled) %*% ind[names(res_new$b_scaled)]
#>          [,1]
#> [1,] 114.4104Further interest might be, how the index translates into genetic
gain. We can derive the expected composition of the total genetic gain
by extracting d_G_exp_scaled. This is scaled so that the
sum of absolute values in the vector sum up to one. Note that genetic
gain is also expected in breeding goal traits that are not part of the
index as correlated selection response.
res_old <- SelInd(w = w_old, G = G,  r2 = r2_old, verbose = FALSE)
round(res_old$d_G_exp_scaled, 2)
#>  RZM  RZN RZEo RZEn  RZR RZKm RZKd  RZH  RZC  RZS 
#> 0.20 0.16 0.11 0.10 0.05 0.08 0.04 0.14 0.00 0.11Even though the expected phenotypic trend in natural units equals the
expected genotypic trend in natural units, a comparison of phenotypes
across traits would usually follow a scaling in phenotypic standard
deviations to make traits comparable. Due to different heritabilities of
the traits, this also leads to another composition of the genetic gain.
To calculate the expected composition of the total phenotypic gain
(d_P_exp_scaled), we need to pass also heritabilities to
the function. In our example, the phenotypic gain depends relatively
more on milk (“RZM”) than the genetic gain, as milk comes with a higher
heritability.
h2
#>   RZM   RZN  RZEo  RZEn   RZR  RZKm  RZKd   RZH   RZC   RZS 
#> 0.314 0.090 0.194 0.194 0.013 0.049 0.033 0.061 0.014 0.273
res_old <- SelInd(w = w_old, G = G,  r2 = r2_old, h2 = h2, verbose = FALSE)
round(res_old$d_P_exp_scaled, 2)
#>  RZM  RZN RZEo RZEn  RZR RZKm RZKd  RZH  RZC  RZS 
#> 0.29 0.13 0.13 0.12 0.02 0.05 0.02 0.09 0.00 0.15The total genetic gain further depends on the selection intensity
(i). So if we are interested in the total genetic gain in
genetic standard deviations (dG), we need to further assume
a selection intensity.
res_old <- SelInd(w = w_old, G = G,  r2 = r2_old, h2 = h2, i = 0.1, verbose = FALSE)
round(res_old$dG, 2)
#> [1] 0.64We can then also retrieve the expected gain for the single traits
(d_G_exp or d_P_exp) scaled in genetic/
phenotypic standard deviations.
round(res_old$d_G_exp, 2)
#>  RZM  RZN RZEo RZEn  RZR RZKm RZKd  RZH  RZC  RZS 
#> 0.83 0.67 0.47 0.44 0.22 0.35 0.16 0.58 0.02 0.45
round(res_old$d_P_exp, 2)
#>  RZM  RZN RZEo RZEn  RZR RZKm RZKd  RZH  RZC  RZS 
#> 0.04 0.02 0.02 0.02 0.00 0.01 0.00 0.01 0.00 0.02Of strong practical interest should be the question how the genetic gain changes when the index is changed. This question can either be answered by comparison of different indices, or by calculation of analytic measures.
We will first regard a situation where we not only have a numeric
change in the economic weights, but also a discrete change in the index
traits. Traits in w, but not in r2 have an
economic weight of zero, which allows to calculate information on the
indirect response through correlations to index traits.
res_old <- SelInd(w = w_old, G = G,  r2 = r2_old, h2 = h2, verbose = FALSE)
res_new <- SelInd(w = w_new, G = G,  r2 = r2_new, h2 = h2, verbose = FALSE)This then allows to compare the changes between the two indices, and we see that the total gain will be slightly less due to milk (“RZM”), but more due to gain in functional traits.
round(res_new$d_G_exp_scaled - res_old$d_G_exp_scaled,2)
#>   RZM   RZN  RZEo  RZEn   RZR  RZKm  RZKd   RZH   RZC   RZS 
#> -0.06  0.01 -0.01  0.00  0.02 -0.01  0.02  0.03  0.01 -0.01The same can also be done for the phenotypic trend, where this effect is even stronger due to the higher heritability of Milk.
round(res_new$d_P_exp_scaled - res_old$d_P_exp_scaled,2)
#>   RZM   RZN  RZEo  RZEn   RZR  RZKm  RZKd   RZH   RZC   RZS 
#> -0.07  0.02  0.00  0.01  0.01  0.00  0.01  0.03  0.00  0.00If we have a certain index and want to see which effect the single traits have on the composition of the total index, several analytic measures might help.
res_old <- SelInd(w = w_old, G = G,  r2 = r2_old, verbose = FALSE)First one would be the correlation between the overall index \(I\) and a phenotype in trait \(j\) (\(r_{I,y_j}\); r_IP ). It
reflects, to what extent a trait contributes to the response in overall
genetic merit.
round(res_old$r_IP, 2)
#>  RZM  RZN RZEo  RZR RZKm  RZS 
#> 0.80 0.68 0.49 0.25 0.37 0.43Second one is the loss of accuracy of prediction if trait \(j\) is omitted from the index (\(\frac{r_{IH_{-j}}}{r_{IH}}\);
r_IH), which reflects the contribution of trait \(j\) to the overall selection objective.
round(res_old$r_IH, 2)
#>  RZM  RZN RZEo  RZR RZKm  RZS 
#> 0.30 0.05 0.03 0.01 0.01 0.01Further, the first derivative of \(d\) with respect to \(w\) (\(\frac{\delta d}{\delta w}\);
del_d_scaled) tells us how a change in the economic weight
in some trait would affect the composition of the genetic trend. Note
that the values are scaled so that the sum of the absolute values within
a row sums up to one. The first row thereby tells us that a change of
the weight for “RZM” mainly affects the gain in “RZM”, but only
moderately affects the functional traits. A change in the weight for
“RZN” (second row) also indirectly affects e.g. “RZR”.
round(res_old$del_d_scaled, 2)
#>        RZM   RZN  RZEo  RZEn   RZR  RZKm  RZKd   RZH   RZC   RZS
#> RZM   0.15 -0.15 -0.09 -0.10 -0.14 -0.06 -0.02 -0.14 -0.01 -0.13
#> RZN  -0.16  0.23 -0.01  0.02  0.15  0.04  0.05  0.18  0.04  0.12
#> RZEo -0.11 -0.01  0.36  0.33 -0.03 -0.02 -0.05  0.02 -0.05  0.02
#> RZEn -0.13  0.03  0.33  0.31 -0.01 -0.02 -0.04  0.05 -0.04  0.06
#> RZR  -0.14  0.15 -0.02  0.00  0.31  0.12  0.05  0.13  0.02  0.04
#> RZKm -0.09  0.06 -0.03 -0.02  0.19  0.49 -0.01  0.07  0.02  0.01
#> RZKd -0.08  0.16 -0.13 -0.10  0.18 -0.03  0.07  0.13  0.04  0.08
#> RZH  -0.15  0.19  0.01  0.04  0.14  0.05  0.04  0.17  0.03  0.18
#> RZC  -0.06  0.19 -0.16 -0.13  0.11  0.07  0.05  0.12  0.06  0.04
#> RZS  -0.14  0.13  0.02  0.05  0.05  0.01  0.02  0.19  0.01  0.39Since V0.2.0.0, we calculate an approximate derivative by incrementing the weight of one trait slightly, while reducing the weight of the other traits accordingly, to be able to account for the side restriction that \(\sum w = 0\).
A further interesting feature of the method is to compare the
expected composition of the gain with the observed genetic trend. Note
that we subset w and G to the index traits, to
have a matching scaling.
res_old <- SelInd(w = w_old[w_old>0], G = G[w_old>0,w_old>0],  r2 = r2_old, verbose = FALSE)
round(deltautr - res_old$d_G_exp_scaled, 3)
#>    RZM    RZN   RZEo    RZR   RZKm    RZS 
#>  0.007 -0.007  0.022  0.027 -0.031 -0.018This shows us that the resulting trend matches the expected well with probably slightly more weight on exterior (“RZEo”) and reproduction (“RZR”).
We can also check what this meant for the economic and index weights by calculating “realized weights” given the observed gains
res_old <- SelInd(
  w = w_old[w_old>0],
  G = G[w_old>0, w_old>0],
  r2 = r2_old,
  d_G_obs = deltautr,
  verbose = FALSE)
round(res_old$w_real - res_old$w, 2)
#>   RZM   RZN  RZEo   RZR  RZKm   RZS 
#> -0.05 -0.11  0.03  0.12 -0.10 -0.03
round(res_old$b_real - res_old$b_scaled, 2)
#>   RZM   RZN  RZEo   RZR  RZKm   RZS 
#> -0.01 -0.05  0.04  0.10 -0.08 -0.03This reveals that there was effectively slightly more weight on “RZEo” and “RZR” in practical breeding decisions than suggested by the index.