Forest

Evaluation of the TROLL version 4 final structure and composition of the forest in relation to field observations and remote sensing at two Amazonian sites with spin-up repetitions.

Structure

The main development of TROLL version 4 is the inclusion of a new explicit modelling of daily water and carbon fluxes as well as a new leaf phenology. We thus focused TROLL version 4 evaluation on fluxes, as forest structure and composition was extensively assessed in Maréchaux and Chave (2017) . However here, we quickly investigate back forest structure and composition in Paracou and Tapajos.

Code
files <- list.files("results/eval_isa/", pattern = "final_pattern.txt", 
                    recursive = T, full.names = T) 
names(files) <- list.files("results/eval_isa/", pattern = "final_pattern.txt", 
                    recursive = T, full.names = F) 
troll_forest <- files %>% 
  lapply(vroom::vroom) %>% 
  bind_rows(.id = "file") %>% 
  separate(file, "sim", "/") %>% 
  separate(sim, c("site", "repetition")) %>%
  mutate(repetition = as.numeric(gsub("R", "", repetition))) %>% 
  mutate(species = gsub("_", " ", s_name)) %>% 
  mutate(dbh = dbh*100)
write_tsv(troll_forest, "outputs/evaluation_forest2.tsv")

files <- list.files("results/eval_isa/", pattern = "CHM.txt", 
                    recursive = T, full.names = T) 
names(files) <- list.files("results/eval_isa/", pattern = "CHM.txt", 
                    recursive = T, full.names = F) 
troll_height <- files %>% 
  lapply(vroom::vroom) %>% 
  bind_rows(.id = "file") %>% 
  separate(file, "sim", "/") %>% 
  separate(sim, c("site", "repetition")) %>%
  mutate(repetition = as.numeric(gsub("R", "", repetition))) %>% 
  group_by(site, repetition, height) %>% 
  summarise(n = n()) %>% 
  mutate(pct = n/sum(n)*100)
write_tsv(troll_height, "outputs/evaluation_height2.tsv")

Forest

We characterized global forests structure through their distribution of basal area (m2/ha) and log-transformed number of stems per 10-cm classes of diameter at breast height (DBH, cm) for trees above 10 cm DBH. We found simulated back the same distributions in Paracou with a small lack of smaller diameters and slight increase in trees with a diameter around 100 cm. But beware of stochasticity that may have resulted in a few big trees surviving in the 16-ha simulations. We found back in Paracou a total basal area of 29 m2/ha against a mean observed of 30 across plots and a total number of 576 individuals simulated per hectare against a mean observed of 655 across plots. In Tapajos, we found very similar distribution but with increased small diameters in simulation and a lack of big trees (above 100cm). We thus overestimated basal area (30 m2/ha against 24 observed) and abundance (598 against 632 observed). However, Tapajos data are based on only 8 very small plots of 0.25ha, which may have resulted in an over-representation of a few big individuals. The resulting quantiles comparisons are very good with a correlation coefficient above 0.93 for all metrics and sites except Tapajos were basal area correlation coefficient was only 0.72.

Code
structure <- read_tsv("outputs/evaluation_forest2.tsv") %>% 
  filter(dbh >= 10) %>% 
  mutate(plot = repetition) %>% 
  select(site, plot, dbh) %>% 
  mutate(dbh_class = cut(dbh, breaks = seq(10, 200, by = 10), 
                         labels = seq(10, 190, by = 10)+5)) %>% 
  mutate(dbh_class = as.numeric(as.character(dbh_class))) %>% 
  group_by(site, plot, dbh_class) %>% 
  summarise(abundance = n()/4,
            ba = sum((dbh/2)^2*pi)/10^4/4) %>% 
  mutate(log_abundance = log10(abundance)) %>% 
  gather(variable, simulated, -site, -dbh_class, -plot) %>% 
  full_join(read_tsv("outputs/structure.tsv") %>% 
              filter(type != "understory") %>% 
              select(-type)) %>% 
  gather(type, value, -site, -dbh_class, -variable, -plot) %>% 
  group_by(site, dbh_class, variable, type) %>% 
  summarise(l = quantile(value, 0.025, na.rm = TRUE), 
            m = quantile(value, 0.5, na.rm = TRUE), 
            h = quantile(value, 0.975, na.rm = TRUE)) %>% 
  na.omit() %>% 
  mutate(varlong = recode(variable, "ba" = 'Basal~area~"["~m^{2}~ha^{-~1}~"]"',
                          "log_abundance" = 'Log[10]~abundance~"["~ha^{-~1}~"]"'))
# read_tsv("outputs/structure.tsv") %>% 
#               filter(type != "understory") %>% 
#               select(-type) %>% 
#   group_by(site, variable, plot) %>% 
#   summarise(observed = sum(observed)) %>% 
#   summarise(l = quantile(observed, 0.025, na.rm = TRUE), 
#             m = quantile(observed, 0.5, na.rm = TRUE), 
#             h = quantile(observed, 0.975, na.rm = TRUE))
Code
labs <- structure %>% 
  filter(variable %in% c("ba", "log_abundance")) %>% 
  mutate(m = ifelse(variable == "log_abundance", 10^m, m)) %>% 
  group_by(variable, varlong, type, site) %>% 
  summarise(total = sum(m)) %>% 
  group_by(variable, site) %>% 
  pivot_wider(names_from = type, values_from = total) %>% 
  mutate(label = paste0("Field: ", round(observed), "\nTROLL: ", round(simulated))) %>% 
  mutate(dbh_class = 120) %>% 
  mutate(m = ifelse(variable == "log_abundance", 1.5, 5)) 
filter(structure, variable %in% c("ba", "log_abundance")) %>% 
  ggplot(aes(dbh_class, m)) +
  geom_linerange(aes(ymin = l, ymax = h, col = type)) +
  geom_point(aes(col = type)) +
  geom_line(aes(col = type)) +
  theme_bw() +
  facet_grid(varlong ~ site, scales = "free_y", labeller = label_parsed) +
  xlab("DBH [ cm ]") + ylab("") +
  scale_color_discrete("") +
  theme(legend.position = "bottom") +
  geom_text(data = labs, aes(label = label)) +
  scale_color_manual("", values = as.vector(cols[c("obs", "sim")]), 
                     labels = c("Field", "TROLL"))

Evaluation of the size structure estimated by TROLL at the Paracou and Tapajos sites, expressed in terms of number of stems and basal area. Confidence intervals on inventory analysis at 95 % are shown with error bars and are based on among plots variations. The confidence on the TROLL values was estimated generating an ensemble of 10 simulations (see methods).
Code
filter(structure, variable %in% c("ba", "abundance")) %>% 
  select(site, dbh_class, variable, m, type) %>% 
   pivot_wider(names_from = type, values_from = m) %>% 
  ungroup() %>% 
  mutate(observed = ifelse(is.na(observed), 0, observed)) %>% 
  mutate(simulated = ifelse(is.na(simulated), 0, simulated)) %>% 
  nest_by(site, variable) %>% 
  mutate(R2 = summary(lm(observed ~ 0 + simulated, data = data))$r.squared,
         CC = cor(data$simulated, data$observed),
         RMSEP = sqrt(mean((data$simulated-data$observed)^2)),
         RRMSEP = sqrt(mean((data$simulated-data$observed)^2))/mean(data$observed),
         SD =sd(data$simulated-data$observed),
         RSD =sd(data$simulated-data$observed)/mean(data$observed)) %>% 
  select(-data) %>% 
  knitr::kable(caption = "Evaluation of number of stems and basal area distribution at Paracou and Tapajos.")
Evaluation of number of stems and basal area distribution at Paracou and Tapajos.
site variable R2 CC RMSEP RRMSEP SD RSD
Paracou abundance 0.9992724 0.9995453 4.4209881 0.0870730 4.4393945 0.0874356
Paracou ba 0.9923733 0.9920770 0.3110493 0.1204531 0.3240880 0.1255023
Tapajos abundance 0.9956985 0.9979059 22.6077689 1.0616908 22.3931696 1.0516129
Tapajos ba 0.8382692 0.8096759 0.7846062 0.5498374 0.8087298 0.5667427

Understory

Taking advantage of new inventories of understory in Paracou, we assessed TROLL version 4 ability to simulate understory. We found TROLL to understimate the number of small trees with DBH close to 1cm and overestimate bigger trees with DBH closer to 10. We ewpected this behaviour as TROLL only simulates one tree per square meter. Nonetheless we found back a basal area of 4 m2/ha despite an underestimation of 2,310 individuals per hectare against 3,787, observed.

Code
understory <- read_tsv("outputs/evaluation_forest2.tsv") %>% 
  filter(dbh <= 10) %>% 
  mutate(plot = repetition) %>% 
  select(site, plot, dbh) %>% 
  mutate(dbh_class = cut(dbh, breaks = seq(1, 10, by = 1), 
                         labels = seq(1, 9, by = 1)+0.5)) %>% 
  mutate(dbh_class = as.numeric(as.character(dbh_class))) %>% 
  group_by(site, plot, dbh_class) %>% 
  summarise(abundance = n()/4,
            ba = sum((dbh/2)^2*pi)/10^4/4) %>% 
  mutate(log_abundance = log10(abundance)) %>% 
  gather(variable, simulated, -site, -dbh_class, -plot) %>% 
  full_join(read_tsv("outputs/structure.tsv") %>% 
            filter(type == "understory") %>% 
              select(-type)) %>% 
  gather(type, value, -site, -dbh_class, -variable, -plot) %>% 
  group_by(site, dbh_class, variable, type) %>% 
  summarise(l = quantile(value, 0.025, na.rm = TRUE), 
            m = quantile(value, 0.5, na.rm = TRUE), 
            h = quantile(value, 0.975, na.rm = TRUE)) %>% 
  na.omit() %>% 
  mutate(varlong = recode(variable, "ba" = 'Basal~area~"["~m^{2}~ha^{-~1}~"]"',
                          "log_abundance" = 'Log[10]~number~of~stems~"["~ha^{-~1}~"]"'))
Code
labs <- understory %>% 
  filter(site == "Paracou") %>% 
  filter(variable %in% c("ba", "log_abundance")) %>% 
  mutate(m = ifelse(variable == "log_abundance", 10^m, m)) %>% 
  group_by(variable, varlong, type, site) %>% 
  summarise(total = sum(m)) %>% 
  group_by(variable, site) %>% 
  pivot_wider(names_from = type, values_from = total) %>% 
  mutate(label = paste0("Field: ", round(observed,1), "\nTROLL: ", round(simulated,1))) %>% 
  mutate(dbh_class = 5) %>% 
  mutate(m = ifelse(variable == "log_abundance", 3, 0.5)) 
g <- filter(understory, variable %in% c("ba", "log_abundance")) %>% 
  filter(site == "Paracou") %>% 
  ggplot(aes(dbh_class, m)) +
  geom_linerange(aes(ymin = l, ymax = h, col = type)) +
  geom_point(aes(col = type)) +
  geom_line(aes(col = type)) +
  theme_bw() +
  facet_wrap(site ~ varlong, scales = "free_y", labeller = label_parsed) +
  xlab("DBH [ cm ]") + ylab("") +
  scale_color_discrete("") +
  theme(legend.position = "bottom") +
  # geom_text(data = labs, aes(label = label)) +
  geom_text(aes(col = "observed", label = lab), 
            data = data.frame(dbh_class = 5, 
                              site = "Paracou",
                              m = c(3.25, .55),
                              lab = c("3,787 trees/ha",
                                      "3.7 m2/ha"),
                              varlong = c("Log[10]~number~of~stems~\"[\"~ha^{-~1}~\"]\"",
                                          "Basal~area~\"[\"~m^{2}~ha^{-~1}~\"]\""))) +
  geom_text(aes(col = "simulated", label = lab), 
            data = data.frame(dbh_class = 5, 
                              site = "Paracou",
                              m = c(3, .5),
                              lab = c("2,138 trees/ha",
                                      "2.9 m2/ha"),
                              varlong = c("Log[10]~number~of~stems~\"[\"~ha^{-~1}~\"]\"",
                                          "Basal~area~\"[\"~m^{2}~ha^{-~1}~\"]\""))) +
  geom_text(aes(col = "simulated", label = lab), 
            data = data.frame(dbh_class = c(4,6), 
                              site = "Paracou",
                              m = c(2, .2),
                              lab = c("RMSEP=342.15",
                                      "RMSEP=0.12"),
                              varlong = c("Log[10]~number~of~stems~\"[\"~ha^{-~1}~\"]\"",
                                          "Basal~area~\"[\"~m^{2}~ha^{-~1}~\"]\""))) +
  scale_color_manual("", values = as.vector(cols[c("obs", "sim")]), 
                     labels = c("Field", "TROLL"))
ggsave("figures/f03.png", g, dpi = 300, width = 8, height = 5, bg = "white")
g

Evaluation of the size structure estimated by TROLL at the Paracou site with understorey trees, expressed in terms of number of stems and basal area. Confidence intervals on inventory analysis at 95 % are shown with error bars and are based on among plots variations. The confidence on the TROLL values was estimated generating an ensemble of 10 simulations (see methods).
Code
filter(understory, variable %in% c("ba", "abundance")) %>% 
  select(site, dbh_class, variable, m, type) %>% 
   pivot_wider(names_from = type, values_from = m) %>% 
  ungroup() %>% 
  mutate(observed = ifelse(is.na(observed), 0, observed)) %>% 
  mutate(simulated = ifelse(is.na(simulated), 0, simulated)) %>% 
  nest_by(site, variable) %>% 
  mutate(R2 = summary(lm(observed ~ 0 + simulated, data = data))$r.squared,
         CC = cor(data$simulated, data$observed),
         RMSEP = sqrt(mean((data$simulated-data$observed)^2)),
         RRMSEP = sqrt(mean((data$simulated-data$observed)^2))/mean(data$observed),
         SD =sd(data$simulated-data$observed),
         RSD =sd(data$simulated-data$observed)/mean(data$observed)) %>% 
  select(-data) %>% 
  na.omit() %>% 
  knitr::kable(caption = "Evaluation of number of stems and basal area distribution at Paracou for the understory.")
Evaluation of number of stems and basal area distribution at Paracou for the understory.
site variable R2 CC RMSEP RRMSEP SD RSD
Paracou abundance 0.9643668 0.9964503 415.2703897 0.9869114 379.2107604 0.9012138
Paracou ba 0.9312054 0.9003840 0.1338779 0.3251708 0.0793203 0.1926579

Height

Code
g <- read_tsv("outputs/evaluation_height2.tsv") %>% 
  group_by(site, height) %>% 
  summarise(l = quantile(pct, 0.025, na.rm = TRUE), 
            m = quantile(pct, 0.5, na.rm = TRUE), 
            h = quantile(pct, 0.975, na.rm = TRUE)) %>% 
  mutate(type = "TROLL") %>% 
  bind_rows(
    read_tsv("outputs/hieght_chm.tsv") %>% 
      rename(m = pct)
  ) %>% 
  filter(height > 5) %>% 
  ggplot(aes(height, m, col = type)) +
  geom_ribbon(aes(ymin = l, ymax = h, fill = type), col = NA, alpha = 0.2) +
  geom_line() +
  facet_wrap(~ site) +
  theme_bw() +
  coord_flip() +
  ylab("Density [ % ]") + xlab("Height [ m ]") +
  scale_color_manual("", values =  as.vector(cols[c("obs", "sim")])) +
  scale_fill_manual("", values =  as.vector(cols[c("obs", "sim")])) +
  theme(legend.position = "bottom") +
  geom_text(aes(col = "TROLL", label = lab), 
            data = data.frame(height = 60, 
                              m = 2,
                              site = c("Paracou", "Tapajos"),
                              lab = c("RMSEP=0.76",
                                      "RMSEP=0.56")))
ggsave("figures/f02.png", g, dpi = 300, width = 8, height = 5, bg = "white")
g

Evaluation of the height structure estimated by TROLL at the Paracou and Tapajos sites. Confidence intervals on inventory analysis at 95% are shown with shaded areas. The confidence on the TROLL values was estimated generating an ensemble of 10 simulations (see methods).
Code
read_tsv("outputs/evaluation_height2.tsv") %>% 
  group_by(site, height) %>% 
  summarise(l = quantile(pct, 0.025, na.rm = TRUE), 
            m = quantile(pct, 0.5, na.rm = TRUE), 
            h = quantile(pct, 0.975, na.rm = TRUE)) %>% 
  mutate(type = "TROLL") %>% 
  bind_rows(
    read_tsv("outputs/hieght_chm.tsv") %>% 
      rename(m = pct)
  ) %>% 
  filter(height > 5) %>% 
  select(site, height, m, type) %>% 
  pivot_wider(names_from = type, values_from = m) %>% 
  rename(simulated = TROLL) %>% 
  gather(type, observed, -site, -height, -simulated) %>% 
  mutate(observed = ifelse(is.na(observed), 0, observed)) %>% 
  mutate(simulated = ifelse(is.na(simulated), 0, simulated)) %>% 
  ungroup() %>% 
  nest_by(site, type) %>% 
  mutate(R2 = summary(lm(observed ~ 0 + simulated, data = data))$r.squared,
         CC = cor(data$simulated, data$observed),
         RMSEP = sqrt(mean((data$simulated-data$observed)^2)),
         RRMSEP = sqrt(mean((data$simulated-data$observed)^2))/mean(data$observed),
         SD =sd(data$simulated-data$observed),
         RSD =sd(data$simulated-data$observed)/mean(data$observed)) %>% 
  select(-data) %>% 
  knitr::kable(caption = "Evaluation of height distribution at Paracou and Tapajos.")
Evaluation of height distribution at Paracou and Tapajos.
site type R2 CC RMSEP RRMSEP SD RSD
Paracou ALS 0.9179100 0.9453754 0.8344928 0.4653980 0.8418443 0.4694979
Tapajos ALS 0.9668732 0.9601405 0.3920132 0.2364139 0.3952359 0.2383574

Composition

Species

Observed species rank-abundance curves in Paracou and Tapajos revealed low evenness with a few dominating species even more intense in Tapajos. However, Tapajos data are based on only 8 very small plots of 0.25ha, which may not be representative despite their virtual pooling in two 1-ha plots, and the quality of the species identifications can be lower. Simulated rank-abundance curve in Paracou was similar to the observed one with a correlation coefficient of 0.95 but with higher evenness resulting in underestimation of dominant species abundances and overestimation of rare species abundances. Simulated rank-abundance curve in Tapajos was very similar to the one simulated in Paracou but differed a lot of the one in Tapajos with strong underestimation of dominant species abundances and overestimation of rare species abundances. But once again observation data in Tapajos might be question for species resolution.

Code
compositon <- read_tsv("outputs/evaluation_forest2.tsv") %>% 
  filter(dbh >= 10) %>% 
  mutate(plot = repetition) %>% 
  mutate(species = gsub("_", " ", s_name)) %>% 
  select(site, plot, species) %>% 
  group_by(site, plot, species) %>% 
  summarise(simulated = n()/4) %>% 
  mutate(variable = "abundance") %>% 
  select(site, plot, species, variable, simulated) %>% 
  full_join(read_tsv("outputs/composition.tsv")) %>% 
  gather(type, value, -site, -plot, -variable, -species) %>% 
  group_by(site, plot, variable, type) %>% 
  arrange(desc(value)) %>% 
  mutate(rank = 1:n()) %>% 
  group_by(site, rank, type, variable) %>% 
  summarise(l = quantile(value, 0.025, na.rm = TRUE), 
            m = quantile(value, 0.5, na.rm = TRUE), 
            h = quantile(value, 0.975, na.rm = TRUE)) %>% 
  filter(rank < 115)
Code
g <- compositon %>% 
  ggplot(aes(rank, m, col = type)) +
  geom_line(aes(y = l), linewidth = 0.5, linetype = "dashed") +
  geom_line(aes(y = h), linewidth = 0.5, linetype = "dashed") +
  geom_line() +
  theme_bw() +
  facet_wrap(~ site) +
  xlab("Rank") + ylab(expression(Abundance~"["~ha^{-~1}~"]")) +
  scale_color_discrete("") +
  theme(legend.position = "bottom") +
  scale_y_log10()  +
  scale_color_manual("", values = as.vector(cols[c("obs", "sim")]), 
                     labels = c("Field", "TROLL")) +
   geom_text(aes(col = "simulated", label = lab), 
            data = data.frame(rank = 25, 
                              m = 1,
                              site = c("Paracou", "Tapajos"),
                              lab = c("RMSEP=0.59",
                                      "RMSEP=1.10")))
ggsave("figures/f04.png", g, dpi = 300, width = 8, height = 5, bg = "white")
g

Evaluation of the species composition estimated by TROLL at the Paracou and Tapajos sites expressed in terms of species rank-abundance curve. Confidence intervals on inventory analysis at 95 % are shown with error bars and are based on among plots variations. The confidence on the TROLL values was estimated generating an ensemble of 10 simulations (see methods).
Code
compositon %>% 
  select(site, rank, m, type) %>% 
  mutate(m = log(m)) %>% 
   pivot_wider(names_from = type, values_from = m) %>% 
  ungroup() %>% 
  na.omit() %>% 
  nest_by(site) %>% 
  mutate(R2 = summary(lm(observed ~ 0 + simulated, data = data))$r.squared,
         CC = cor(data$simulated, data$observed),
         RMSEP = sqrt(mean((data$simulated-data$observed)^2)),
         RRMSEP = sqrt(mean((data$simulated-data$observed)^2))/mean(data$observed),
         SD =sd(data$simulated-data$observed),
         RSD =sd(data$simulated-data$observed)/mean(data$observed)) %>% 
  select(-data) %>% 
  knitr::kable(caption = "Evaluation of rank-abundance distribution at Paracou and Tapajos.")
Evaluation of rank-abundance distribution at Paracou and Tapajos.
site R2 CC RMSEP RRMSEP SD RSD
Paracou 0.8910454 0.8796668 0.5876773 0.5563652 0.4362957 0.4130494
Tapajos 0.4660703 0.9579145 1.1039089 2.2045961 0.6776426 1.3533074
Code
read_tsv("outputs/evaluation_forest2.tsv") %>% 
  filter(dbh >= 10) %>% 
  mutate(plot = repetition) %>% 
  mutate(species = gsub("_", " ", s_name)) %>% 
  select(site, plot, species) %>% 
  group_by(site, plot, species) %>% 
  summarise(simulated = n()/4) %>% 
  group_by(site, species) %>% 
  summarise(sim_m = median(simulated), 
            sim_h = quantile(simulated, .95),
            sim_l = quantile(simulated, .5)) %>% 
  left_join(read_tsv("outputs/composition.tsv") %>% 
              filter(variable == "abundance") %>% 
              group_by(site, species) %>% 
              summarise(obs_m = median(observed), 
                        obs_h = quantile(observed, .95),
                        obs_l = quantile(observed, .5))) %>%  
  ggplot(aes(obs_m, sim_m)) +
  geom_abline() +
  geom_point() + 
  theme_bw() +
  scale_y_log10() +
  scale_x_log10() +
  geom_smooth(method = "lm", se = FALSE) +
  ggpubr::stat_cor(aes(label = ..rr.label..)) +
  xlab(expression("Observed mean abundance ["~ha^{-1}~"]")) +
  ylab(expression("Simulated mean abundance ["~ha^{-1}~"]")) +
  coord_equal()

Abund-abund
Code
troll_traits <- list(
    read_tsv("outputs/Paracou_species_troll.tsv"),
    read_tsv("outputs/Tapajos_species_troll.tsv")
) %>% bind_rows() %>% 
  unique() %>% 
  rename_all(~ gsub("s_", "", .)) %>% 
  select(-seedmass, -regionalfreq) %>% 
  rename(species = name) %>% 
  mutate(species = gsub("_", " ", species)) %>% 
  gather(trait, trait_value, -species)
t <- read_tsv("outputs/evaluation_forest2.tsv") %>% 
  filter(dbh >= 10) %>% 
  mutate(plot = repetition) %>% 
  mutate(species = gsub("_", " ", s_name)) %>% 
  select(site, plot, species) %>% 
  group_by(site, plot, species) %>% 
  summarise(simulated = n()/4) %>% 
  mutate(type = "sim") %>% 
  rename(value = simulated) %>% 
  bind_rows(read_tsv("outputs/composition.tsv") %>% 
              filter(variable == "abundance") %>% 
              group_by(site, species) %>% 
              mutate(type = "obs") %>% 
              rename(value = observed) %>% 
              select(-variable)) %>% 
  left_join(troll_traits)
t %>% 
  filter(trait %in% c("wsg", "hmax")) %>% 
  ggplot(aes(trait_value, value, col = type)) +
  geom_point(alpha = .2) +
  theme_bw() +
  facet_wrap(~ trait, scales = "free_x") +
  scale_y_log10() +
  geom_smooth(method = "lm", se = FALSE) +
  ggpubr::stat_cor(aes(label = ..rr.label..)) +
  theme(legend.position = "bottom") +
  xlab("Trait value") + ylab("Mean abundance ["~ha^{-1}~"]") +
  scale_color_manual("", values = as.vector(cols[c("obs", "sim")]), 
                     labels = c("Field", "TROLL"))

Abund-trait

Functional

Simulated functional trait distribution in Paracou and Tapajos matched the ones observed with quantiles correlations above 0.9 for all traits and sites, at th exception of leaf area in Paracou. However due to a poor species resolution in Tapajos we used mean genus values, which may result in an overestimation of assessment in Tapajos. Leaf area differences in Paracou come to the lack of very high leaf area species in the simulations compared to the Paracou plots, however the resulting correlation coefficient might have differed if we would have used log-transformed leaf area). Diving into the observed data, the high leaf area values are due to pioneer species including Cecropia obtusa and sciadophylla, Sterculia speciosa and Perebea guianensis. The three genus are present in the simulations inputs, but seems to lack in the final forest due to either a lack of forest gaps or a lack of survival.

Code
troll_traits <- list(
    read_tsv("outputs/Paracou_species_troll.tsv"),
    read_tsv("outputs/Tapajos_species_troll.tsv")
) %>% bind_rows() %>% 
  unique() %>% 
  rename_all(~ gsub("s_", "", .)) %>% 
  select(-seedmass, -regionalfreq) %>% 
  rename(species = name) %>% 
  mutate(species = gsub("_", " ", species)) %>% 
  gather(trait, trait_value, -species)
troll_traitdist <- read_tsv("outputs/evaluation_forest2.tsv") %>% 
  filter(dbh >= 10) %>% 
  mutate(plot = repetition) %>% 
  select(site, plot, species) %>% 
  left_join(troll_traits)
all_traits_q <- bind_rows(
  mutate(troll_traitdist, type = "sim"),
  mutate(read_tsv("outputs/functional_composition.tsv"), type = "obs") %>% 
    mutate(trait = ifelse(trait == "leafaera", "leafarea", trait))
) %>% 
  group_by(site, trait, plot, type) %>% 
  summarise(quantile = list(quantile(trait_value, seq(0, 1, length.out = 100)))) %>% 
  unnest_longer(quantile) %>% 
  group_by(site, trait, type, quantile_id)%>% 
  summarise(l = quantile(quantile, 0.025, na.rm = TRUE), 
            m = quantile(quantile, 0.5, na.rm = TRUE), 
            h = quantile(quantile, 0.975, na.rm = TRUE)) %>% 
  gather(quantile, value, -site, -trait, -type, -quantile_id) %>% 
  mutate(quantile = paste0(quantile, "_", type)) %>% 
  ungroup() %>% 
  select(-type) %>% 
  pivot_wider(names_from = quantile, values_from = value)
Code
read_tsv("outputs/functional_composition.tsv") %>% 
   mutate(trait = ifelse(trait == "leafaera", "leafarea", trait)) %>% 
  filter(site == "Paracou") %>% 
  ggplot(aes(trait_value, group = plot)) +
  geom_density(aes(col = "Observed")) +
  geom_density(aes(col = "Simulated"),
               data = filter(troll_traitdist, site == "Paracou")) +
  facet_wrap(~ trait, scales = "free") +
  theme_bw() +
  ggtitle("Paracou", "546,498 inds represented on 577,761 (94.59%)") +
  theme(axis.title = element_blank(), legend.position = "bottom") +
  scale_color_discrete("") +
  scale_color_manual("", values = as.vector(cols[c("obs", "sim")]), 
                     labels = c("Field", "TROLL"))

Evaluation of the functional composition estimated by TROLL at the Paracou site expressed in terms of density distribution per trait and site. The analyses have been done at the species level in Paracou.
Code
read_tsv("outputs/functional_composition.tsv") %>% 
   mutate(trait = ifelse(trait == "leafaera", "leafarea", trait)) %>% 
  filter(site == "Tapajos") %>% 
  ggplot(aes(trait_value, group = plot)) +
  geom_density(aes(col = "Observed")) +
  geom_density(aes(col = "Simulated"),
               data = filter(troll_traitdist, site == "Tapajos")) +
  facet_wrap(~ trait, scales = "free") +
  theme_bw() +
  ggtitle("Tapajos Genus level", "19,188 inds represented on 19,499 (98.65%)") +
  theme(axis.title = element_blank(), legend.position = "bottom") +
  scale_color_discrete("") +
  scale_color_manual("", values = as.vector(cols[c("obs", "sim")]), 
                     labels = c("Field", "TROLL"))

Evaluation of the functional composition estimated by TROLL at the Tapajos site expressed in terms of density distribution per trait and site. The analyses have been done at the genus level in Tapajos.
Code
g <- read_tsv("outputs/functional_composition.tsv") %>%
 mutate(trait = ifelse(trait == "leafaera", "leafarea", trait)) %>% 
  mutate(trait_long = recode(trait, "ah" = "a[h]", "dbhmax" = "dbh[thres]", "hmax" = "h[lim]",
                             "leafarea" = "LA", "Nmass" = "N[mass]", "Pmass" = "P[mass]",
                             "tlp" = "pi[tlp]", "wsg" = "WSG")) %>% 
  filter(!(trait %in% c("ah", "hmax"))) %>% 
  mutate(varlong = paste0('"', site, ':"~', trait_long)) %>% 
  ggplot(aes(trait_value, group = plot)) +
  geom_density(aes(col = "Observed")) +
  geom_density(aes(col = "Simulated"), 
               data = troll_traitdist %>% 
                   mutate(trait_long = recode(trait, "ah" = "a[h]", "dbhmax" = "dbh[thres]",
                                              "hmax" = "h[lim]",
                             "leafarea" = "LA", "Nmass" = "N[mass]", "Pmass" = "P[mass]",
                             "tlp" = "pi[tlp]", "wsg" = "WSG")) %>% 
                 filter(!(trait %in% c("ah", "hmax"))) %>% 
  mutate(varlong = paste0('"', site, ':"~', trait_long))) +
  facet_wrap(~ varlong, scales = "free", labeller = label_parsed, nrow = 2) +
  theme_bw() +
  theme(axis.title = element_blank(), legend.position = "bottom",
        axis.ticks.y = element_blank(), axis.text.y = element_blank(),
        panel.spacing = unit(0,'lines'), axis.text.x = element_text(angle = 90)) +
  scale_color_discrete("") +
  scale_color_manual("", values = as.vector(cols[c("obs", "sim")]), 
                     labels = c("Field", "TROLL"))
ggsave("figures/f05.png", g, dpi = 300, width = 8, height = 5, bg = "white")
g

Evaluation of the functional composition estimated by TROLL at the Paracou and Tapajos sites expressed in terms of density distribution per trait and site. The analyses have been done at the species level in Paracou and the genus level in Tapajos.
Code
all_traits_q %>% 
  rename(simulated = m_sim, observed = m_obs) %>% 
  select(site, trait, quantile_id, observed, simulated) %>% 
  nest_by(site, trait) %>% 
  mutate(R2 = summary(lm(observed ~ 0 + simulated, data = data))$r.squared,
         CC = cor(data$simulated, data$observed),
         RMSEP = sqrt(mean((data$simulated-data$observed)^2)),
         RRMSEP = sqrt(mean((data$simulated-data$observed)^2))/mean(data$observed),
         SD =sd(data$simulated-data$observed),
         RSD =sd(data$simulated-data$observed)/mean(data$observed)) %>% 
  select(-data) %>% 
  knitr::kable(caption = "Evaluation of functional trait distribution for 100 quantiles at Paracou and Tapajos.")
Evaluation of functional trait distribution for 100 quantiles at Paracou and Tapajos.
site trait R2 CC RMSEP RRMSEP SD RSD
Paracou LMA 0.9896053 0.9201970 20.6118778 0.2050327 10.3013630 0.1024708
Paracou Nmass 0.9969127 0.9838207 0.0012712 0.0578075 0.0012768 0.0580632
Paracou Pmass 0.9618742 0.9302059 0.0001623 0.2200538 0.0001504 0.2039853
Paracou ah 0.9989955 0.9818008 0.0098758 0.0334267 0.0096867 0.0327864
Paracou dbhmax 0.9928933 0.9834815 0.0511103 0.1017157 0.0473316 0.0941956
Paracou hmax 0.9991629 0.9703460 1.4301038 0.0293869 1.3969925 0.0287065
Paracou leafarea 0.7546753 0.7600747 82.9686622 0.9118195 75.8408574 0.8334854
Paracou tlp 0.9991258 0.9830354 0.1178088 -0.0638092 0.0651376 -0.0352807
Paracou wsg 0.9992249 0.9882680 0.0329489 0.0532088 0.0180513 0.0291508
Tapajos LMA 0.9875950 0.9603714 16.6987934 0.1851223 7.1638947 0.0794187
Tapajos Nmass 0.9949649 0.9831347 0.0032181 0.1478773 0.0022077 0.1014460
Tapajos Pmass 0.9840617 0.9777216 0.0002339 0.2997895 0.0001740 0.2229925
Tapajos ah 0.9888110 0.9212514 0.0311372 0.1061306 0.0310128 0.1057066
Tapajos dbhmax 0.9883715 0.9857450 0.0606284 0.1239182 0.0572870 0.1170887
Tapajos hmax 0.9950017 0.9140041 3.4151245 0.0718007 3.3006111 0.0693932
Tapajos leafarea 0.8806318 0.8726881 47.4153861 0.5669423 40.7312316 0.4870204
Tapajos tlp 0.9950154 0.9541119 0.1536362 -0.0799200 0.1462942 -0.0761008
Tapajos wsg 0.9959010 0.9677242 0.0418398 0.0658250 0.0420499 0.0661555