Chapter 7 Neutral and adaptive genetic variation effect on individual growth

We investigated effects of ecological and evolutionary processes on individual growth, using genetic species and kinship. The individual growth of individual \(i\) in population \(p\) between individual recruitment \(y_0\) and 2017, corresponds to the difference of DBH between the two years, and is defined with a hierarchical model in a lognormal distribution as follows:

\[DBH_{y=2017,p,i} - DBH_{y=y0,p,i} \sim logN(log[\sum_{y=y0}^{y=2017}AGR(DBH_{y,p,i})], \sigma^2_1)\]

where the difference of DBH \(DBH_{y=2017,p,i}-DBH_{y=y_0,p,i}\) is defined with a lognormal distribution located on the logarithm of the sum of annual growth rates \(AGR\) during the period \(y_0-2017\) and of shape \(\sigma_1\). The annual growth rate \(AGR\) for individual \(i\) in population \(p\) at year \(y\) with a diameter of \(DBH_{y,p,i}\) is defined following a Gompertz model (Gompertz 1825) already identified as the best model for growth-trajectories in Paracou (Hérault et al. 2011):

\[AGR(DBH_{y,p,i}) = Gmax_i.exp(-\frac12[\frac{log(\frac{DBH_{y,p,i}}{Doptp})}{Ksp}]^2)\]

where \(Gmax_i\) is the maximum growth potential (maximal AGR during individual life) for individual \(i\), \(Dopt_p\) is the population optimal diameter at which the individual reaches its maximum growth potential, and \(Ks_p\) is the population kurtosis defining the width of the bell-shaped growth-trajectory (see figure 1 in Hérault et al. 2011). To ease model inference, population optimal diameter \(Dopt_p\) and kurtosis \(Ks_p\) were defined as random population effects centered on a global \(Dopt\) and \(Ks\) with corresponding variances \(\sigma^2_{P,Dopt}\) and \(\sigma^2_{P,Ks}\). Individual \(i\) maximum growth potential \(Gmax_i\) was defined in a nested Animal model with a lognormal distribution:

\[Gmax_i \sim logN(log(Gmax_p.a_i), \sigma_{R,Gmax})\] \[a_i \sim MVlogN(log(1), \sigma_{G,Gmax}.K)\]

where \(Gmax_p\) is the mean \(Gmax\) of population \(p\), \(a_i\) is the breeding value of individual \(i\), and \(\sigma_{R,Gmax}\) is the shape of the lognormal distribution. Individual breeding values \(a_i\) are defined following a multivariate lognormal law \(MVlogN\) with a co-shape matrix defined as the product of the kinship matrix \(K\) and the genotypic variation \(\sigma_{G,Gmax}\). To estimate variances on a normal-scale, we log-transformed population fixed effect, genetic additive values, and calculated conditional and marginal \(R^2\) (Nakagawa & Schielzeth 2013). We used Bayesian inference with No-U-Turn Sampler (NUTS, Hoffman & Gelman 2014) using stan language (Carpenter et al. 2017).

7.1 Growth data

trees <- src_sqlite(file.path("data", "Paracou","trees", "Paracou.sqlite")) %>% 
  tbl("Paracou") %>% 
  filter(Genus == "Symphonia") %>% 
  mutate(DBH = CircCorr/pi) %>% 
  filter(!(CodeMeas %in% c(4))) %>% 
  collect()
trees <- read_tsv(file.path(path, "..", "variantCalling", "paracou",
                          "symcapture.all.biallelic.snp.filtered.nonmissing.paracou.fam"),
         col_names = c("FID", "IID", "FIID", "MIID", "sex", "phenotype")) %>% 
  mutate(Ind = gsub(".g.vcf", "", IID)) %>% 
  mutate(X = gsub("P", "", Ind)) %>% 
  separate(X, c("Plot", "SubPlot", "TreeFieldNum"), convert = T) %>% 
  left_join(trees) %>% 
  left_join(read_tsv(file.path(path, "bayescenv", "paracou3pop.popmap"),
                     col_names = c("IID", "pop"))) %>% 
  left_join(read_tsv(file.path(path, "populations", "paracou.hybridmap")),
            by = "Ind", suffix = c("", ".hybrid"))
trees <- trees %>% 
  group_by(Ind) %>% 
  mutate(Y0 = dplyr::first(CensusYear), DBH0 = dplyr::first(DBH), 
         DBHtoday = dplyr::last(DBH), N = n()) %>%  
  ungroup() %>% 
  dplyr::select(Ind, Xutm, Yutm, IID, pop, Y0, DBH0, DBHtoday, N) %>% 
  unique() %>% 
  mutate(DBHtoday = ifelse(DBHtoday == DBH0, DBHtoday + 0.1, DBHtoday))
write_tsv(trees, file = "save/Growth.tsv")

7.2 Simulated Growth and Animal Model

We used the following growth model with a lognormal distribution to estimate individual growth potential and associated genotypic variation:

\[\begin{equation} DBH_{y=today,p,i} - DBH_{y=y0,p,i} \sim \\ \mathcal{logN} (log(\sum _{y=y_0} ^{y=today} \theta_{1,p,i}.exp(-\frac12.[\frac{log(\frac{DBH_{y,p,i}}{100.\theta_{2,p}})}{\theta_{3,p}}]^2)), \sigma_1) \\ \theta_{1,p,i} \sim \mathcal {logN}(log(\theta_{1,p}.a_{1,i}), \sigma_2) \\ \theta_{2,p} \sim \mathcal {logN}(log(\theta_2),\sigma_3) \\ \theta_{3,p} \sim \mathcal {logN}(log(\theta_3),\sigma_4) \\ a_{1,i} \sim \mathcal{MVlogN}(log(1), \sigma_5.K) \tag{7.1} \end{equation}\]

We fitted the equivalent model with the following priors:

\[\begin{equation} DBH_{y=today,p,i} - DBH_{y=y0,p,i} \sim \\ \mathcal{logN} (log(\sum _{y=y_0} ^{y=today} \hat{\theta_{1,p,i}}.exp(-\frac12.[\frac{log(\frac{DBH_{y,p,i}}{100.\hat{\theta_{2,p}}})}{\hat{\theta_{3,p}}}]^2)), \sigma_1) \\ \hat{\theta_{1,p,i}} = e^{log(\theta_{1,p}.\hat{a_{1,i}}) + \sigma_2.\epsilon_{1,i}} \\ \hat{\theta_{2,p}} = e^{log(\theta_2) + \sigma_3.\epsilon_{2,p}} \\ \hat{\theta_{3,p}} = e^{log(\theta_3) + \sigma_4.\epsilon_{3,p}} \\ \hat{a_{1,i}} = e^{\sigma_5.A.\epsilon_{4,i}} \\ \epsilon_{1,i} \sim \mathcal{N}(0,1) \\ \epsilon_{2,p} \sim \mathcal{N}(0,1) \\ \epsilon_{3,p} \sim \mathcal{N}(0,1) \\ \epsilon_{4,i} \sim \mathcal{N}(0,1) \\ ~ \\ (\theta_{1,p}, \theta_2, \theta_3) \sim \mathcal{logN}^3(log(1),1) \\ (\sigma_1, \sigma_2, \sigma_3, \sigma_4, \sigma_5) \sim \mathcal{N}^5_T(0,1) \\ ~ \\ V_P = Var(log(\mu_p)) \\ V_G=\sigma_5^2\\ V_R=\sigma_2^2 \tag{7.2} \end{equation}\]

Table 7.1: Animal model fitted versus expected values.
Parameter Estimate Standard error Expected \(\hat R\)
thetap1[1] 0.8266180 0.1121729 0.5300000 1.0008119
thetap1[2] 0.4816256 0.0889950 0.5400000 1.0015447
thetap1[3] 0.3413224 0.0629704 0.3600000 1.0009558
theta2 0.2596370 0.0764302 0.2500000 1.0004898
theta3 0.7276357 0.1440250 0.7000000 0.9996955
Vp 0.1379774 0.0635966 0.0352066 1.0016146
Vg 0.0280998 0.0758363 0.1350074 1.0016744
Vr 0.5217135 0.1090083 0.4489000 1.0170006
Parameters for Growth & Animal model: trace plots and expected values in red.

Figure 7.1: Parameters for Growth & Animal model: trace plots and expected values in red.

7.3 Genetic variance

We used the following growth model with a lognormal distribution to estimate individual growth potential and associated genotypic variation:

\[\begin{equation} DBH_{y=today,p,i} - DBH_{y=y0,p,i} \sim \\ \mathcal{logN} (log(\sum _{y=y_0} ^{y=today} \theta_{1,p,i}.exp(-\frac12.[\frac{log(\frac{DBH_{y,p,i}}{100.\theta_{2,p}})}{\theta_{3,p}}]^2)), \sigma_1) \\ \theta_{1,p,i} \sim \mathcal {logN}(log(\theta_{1,p}.a_{1,i}), \sigma_2) \\ \theta_{2,p} \sim \mathcal {logN}(log(\theta_2),\sigma_3) \\ \theta_{3,p} \sim \mathcal {logN}(log(\theta_3),\sigma_4) \\ a_{1,i} \sim \mathcal{MVlogN}(log(1), \sigma_5.K) \tag{7.3} \end{equation}\]

We fitted the equivalent model with the following priors:

\[\begin{equation} DBH_{y=today,p,i} - DBH_{y=y0,p,i} \sim \\ \mathcal{logN} (log(\sum _{y=y_0} ^{y=today} \hat{\theta_{1,p,i}}.exp(-\frac12.[\frac{log(\frac{DBH_{y,p,i}}{100.\hat{\theta_{2,p}}})}{\hat{\theta_{3,p}}}]^2)), \sigma_1) \\ \hat{\theta_{1,p,i}} = e^{log(\theta_{1,p}.\hat{a_{1,i}}) + \sigma_2.\epsilon_{1,i}} \\ \hat{\theta_{2,p}} = e^{log(\theta_2) + \sigma_3.\epsilon_{2,p}} \\ \hat{\theta_{3,p}} = e^{log(\theta_3) + \sigma_4.\epsilon_{3,p}} \\ \hat{a_{1,i}} = e^{\sigma_5.A.\epsilon_{4,i}} \\ \epsilon_{1,i} \sim \mathcal{N}(0,1) \\ \epsilon_{2,p} \sim \mathcal{N}(0,1) \\ \epsilon_{3,p} \sim \mathcal{N}(0,1) \\ \epsilon_{4,i} \sim \mathcal{N}(0,1) \\ ~ \\ (\theta_{1,p}, \theta_2, \theta_3) \sim \mathcal{logN}^3(log(1),1) \\ (\sigma_1, \sigma_2, \sigma_3, \sigma_4, \sigma_5) \sim \mathcal{N}^5_T(0,1) \\ ~ \\ V_P = Var(log(\mu_p)) \\ V_G=\sigma_5^2\\ V_R=\sigma_2^2 \tag{7.4} \end{equation}\]

Table 7.2: Individual growth potential model.
Parameter Estimate Standard error \(\hat R\) \(N_{eff}\)
thetap1[1] 0.5423337 0.0699702 1.002517 1993
thetap1[2] 0.5596376 0.1084135 1.001838 2331
thetap1[3] 0.3485849 0.0301761 1.002514 1218
theta2 0.2528277 0.0858377 1.005150 2249
theta3 0.6963378 0.1038580 1.006909 1345
sigma[1] 0.1373708 0.0742090 1.356851 22
sigma[2] 0.5075622 0.1714122 1.107800 61
sigma[3] 0.2805989 0.2994495 1.000577 2627
sigma[4] 0.1377079 0.2346745 1.002896 2934
sigma[5] 0.4684783 0.1850532 1.068632 87
Trace plots of model parameters.

Figure 7.2: Trace plots of model parameters.

Energy of the model.

Figure 7.3: Energy of the model.

Species predicted growth curves.

Figure 7.4: Species predicted growth curves.

R2 for Gmax.

Figure 7.5: R2 for Gmax.

Genetic variance partitioning for Gmax.

Figure 7.6: Genetic variance partitioning for Gmax.

Relation between genotypic values for individual growth potential (Gmax) and neighbourhood crowding index (NCI), an indirect measurement of access to light, for different classes of diameters. Regression lines represent a linear model of form y ~ x. Annotations give for each diameter class the Pearson’s R correlation coefficient and the associated p-value.

Figure 7.7: Relation between genotypic values for individual growth potential (Gmax) and neighbourhood crowding index (NCI), an indirect measurement of access to light, for different classes of diameters. Regression lines represent a linear model of form y ~ x. Annotations give for each diameter class the Pearson’s R correlation coefficient and the associated p-value.

Spatial autocorrelogram (Moran's I) of variables and associated genetic additive values (a).

Figure 7.8: Spatial autocorrelogram (Moran’s I) of variables and associated genetic additive values (a).

7.4 Confounding phenotypic and environmental variation

If the phenotypic variation is only plastic and confounded with the environmental variation, the genotypic variance associated to the phenotype while controlling for the environmental variation should be null (\(\sigma^2_{G|E}=0\)). Instead, we still observed a non null genotypic variation while controlling for the environmental variation (\(\frac{\sigma^2_{G|E}}{\sigma^2_P}=0.12\), Fig. 7.13).

We used the following growth model with a lognormal distribution to estimate individual growth potential and associated genotypic variation controlling for environmental variation (\(NCI\) and \(TWI\)) at the individual scale:

\[\begin{equation} DBH_{y=today,p,i} - DBH_{y=y0,p,i} \sim \\ \mathcal{logN} (log(\sum _{y=y_0} ^{y=today} \theta_{1,p,i}.exp(-\frac12.[\frac{log(\frac{DBH_{y,p,i}}{100.\theta_{2,p}})}{\theta_{3,p}}]^2)), \sigma_1) \\ \theta_{1,p,i} \sim \mathcal {logN}(log(\theta_{1,p}.a_{1,i}. \beta_1.TWI_i.\beta_2.NCI_i), \sigma_2) \\ \theta_{2,p} \sim \mathcal {logN}(log(\theta_2),\sigma_3) \\ \theta_{3,p} \sim \mathcal {logN}(log(\theta_3),\sigma_4) \\ a_{1,i} \sim \mathcal{MVlogN}(log(1), \sigma_5.K) \tag{7.5} \end{equation}\]

We fitted the equivalent model with following priors:

\[\begin{equation} DBH_{y=today,p,i} - DBH_{y=y0,p,i} \sim \\ \mathcal{logN} (log(\sum _{y=y_0} ^{y=today} \hat{\theta_{1,p,i}}.exp(-\frac12.[\frac{log(\frac{DBH_{y,p,i}}{100.\hat{\theta_{2,p}}})}{\hat{\theta_{3,p}}}]^2)), \sigma_1) \\ \hat{\theta_{1,p,i}} = e^{log(\theta_{1,p}.\hat{a_{1,i}}. \beta_1.TWI_i.\beta_2.NCI_i) + \sigma_2.\epsilon_{1,i}} \\ \hat{\theta_{2,p}} = e^{log(\theta_2) + \sigma_3.\epsilon_{2,p}} \\ \hat{\theta_{3,p}} = e^{log(\theta_3) + \sigma_4.\epsilon_{3,p}} \\ \hat{a_{1,i}} = e^{\sigma_5.A.\epsilon_{4,i}} \\ \epsilon_{1,i} \sim \mathcal{N}(0,1) \\ \epsilon_{2,p} \sim \mathcal{N}(0,1) \\ \epsilon_{3,p} \sim \mathcal{N}(0,1) \\ \epsilon_{4,i} \sim \mathcal{N}(0,1) \\ ~ \\ (\theta_{1,p}, \theta_2, \theta_3) \sim \mathcal{logN}^3(log(1),1) \\ (\sigma_1, \sigma_2, \sigma_3, \sigma_4, \sigma_5) \sim \mathcal{N}^5_T(0,1) \\ ~ \\ V_P = Var(log(\mu_p)) \\ V_G=\sigma_5^2\\ V_{TWI} = Var(log(\beta_1.TWI_i)) \\ V_{NCI} = Var(log(\beta_2.NCI_i)) \\ V_R=\sigma_2^2 \tag{7.6} \end{equation}\]

Table 7.3: Individual growth potential model.
Parameter Estimate Standard error \(\hat R\) \(N_{eff}\)
thetap1[1] 0.6819703 0.1750196 1.0017710 3771
thetap1[2] 0.4693647 0.1630881 1.0003561 3238
thetap1[3] 0.7133448 0.1741461 1.0009126 3999
beta[1] 1.0338461 0.4773498 1.0009493 5751
beta[2] 1.0231845 0.4821474 0.9998852 6093
theta2 0.2212118 0.0872344 1.0031382 1564
theta3 0.6542952 0.1080025 1.0011451 2755
sigma[1] 0.1302832 0.0953805 1.3708842 22
sigma[2] 0.8841277 0.1185958 1.1145255 81
sigma[3] 0.3078621 0.3262250 1.0020929 2855
sigma[4] 0.1441136 0.2493211 1.0005756 4162
sigma[5] 0.4157660 0.2270843 1.1430011 55
Traceplot of model parameters.

Figure 7.9: Traceplot of model parameters.

Energy of the model.

Figure 7.10: Energy of the model.

Species predicted growth curve.

Figure 7.11: Species predicted growth curve.

R2 for Gmax.

Figure 7.12: R2 for Gmax.

Variance partitioning for individual maximum growth potential (Gmax). Variation of individual maximum growth potential has been partitioned into among-species (red), among-genotype (brown), along forest gap dynamics (NCI, green), along topography (TWI, blue), and residual (pink) variations.

Figure 7.13: Variance partitioning for individual maximum growth potential (Gmax). Variation of individual maximum growth potential has been partitioned into among-species (red), among-genotype (brown), along forest gap dynamics (NCI, green), along topography (TWI, blue), and residual (pink) variations.

Relation between genotypic values for individual growth potential controlling for TWI and NCI (Gmax) and neighbourhood crowding index (NCI), an indirect measurement of access to light, for different classes of diameters. Regression lines represent a linear model of form y ~ x. Annotations give for each diameter class the Pearson’s R correlation coefficient and the associated p-value.

Figure 7.14: Relation between genotypic values for individual growth potential controlling for TWI and NCI (Gmax) and neighbourhood crowding index (NCI), an indirect measurement of access to light, for different classes of diameters. Regression lines represent a linear model of form y ~ x. Annotations give for each diameter class the Pearson’s R correlation coefficient and the associated p-value.

Spatial autocorrelogram (Moran's I) of variables and associated genetic multiplicative values.

Figure 7.15: Spatial autocorrelogram (Moran’s I) of variables and associated genetic multiplicative values.

7.5 Manuscript figure

References

Carpenter, B., Gelman, A., Hoffman, M.D., Lee, D., Goodrich, B., Betancourt, M., Brubaker, M., Guo, J., Li, P. & Riddell, A. (2017). Stan : A Probabilistic Programming Language. Journal of Statistical Software, 76. Retrieved from http://www.jstatsoft.org/v76/i01/

Gompertz, B. (1825). On the nature of the function expressive of the law of humanmortality, and on a newmode of determining the value of life contingencies. Philosophical Transactions of the Royal Society of London, 115, 513–583. Retrieved from https://www.tandfonline.com/doi/full/10.1080/14786445908642737

Hérault, B., Bachelot, B., Poorter, L., Rossi, V., Bongers, F., Chave, J., Paine, C.E.T., Wagner, F. & Baraloto, C. (2011). Functional traits shape ontogenetic growth trajectories of rain forest tree species. Journal of Ecology, 99, 1431–1440.

Hoffman, M.D. & Gelman, A. (2014). The no-U-turn sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo. Journal of Machine Learning Research, 15, 1593–1623. Retrieved from http://arxiv.org/abs/1111.4246

Nakagawa, S. & Schielzeth, H. (2013). A general and simple method for obtaining R2 from generalized linear mixed-effects models. Methods in Ecology and Evolution, 4, 133–142.