The purpose of this vignette is to use {nicheROVER} and {nichetools} to extract and then visualize estimates of isotopic niche size and similarities using three isotopes (i.e., \(\delta\)13C, \(\delta\)15N, and \(\delta\)34S) for multiple freshwater fish using {ggplot2}.
This vignette can be used for additional purposes, including estimating niche size and similarities among different groups of aquatic and/or terrestrial species. Furthermore, niche size and similarities for different behaviours exhibited within a population can be estimated using behavioural data generated from acoustic telemetry (e.g., differences in habitat occupancy).
First, we will load the packages needed to perform the analysis and visualization. We will use {nicheROVER} and {nichetools} to perform the analysis. We will use {dplyr}, {tidyr}, {purrr}, and {stringr} to manipulate data and iterate processes. Lastly, we will use {ggplot2}, {ggtext}, and {patchwork} to plot, add labels, and arrange plots.
I will add that many of the {dplyr} and {tidyr} functions and processes can be replaced using {data.table}, which is great when working with large data sets.
{
library(dplyr)
library(ggplot2)
library(ggtext)
library(ggh4x)
library(nicheROVER)
library(nichetools)
library(patchwork)
library(purrr)
library(stringr)
library(tidyr)
}
#> Warning: package 'patchwork' was built under R version 4.4.1

For the purpose of the vignette, we will be using the fish data frame that is available within {nicheROVER}.
We will first use the function janitor::clean_names() to clean up the column names. For your own analysis, you will need to replace fish with your data frame, loaded from a csv, rds, or qs file. There are multiple ways to do this; I prefer using readr::read_csv(), but base R’s read.csv() works perfectly fine.
If there are any isotopic values that did not run and are
NA, they will need to be removed because functions in
{nicheROVER} will not accommodate values of
NA.
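As a minimal sketch of this step, the base R example below drops incomplete rows from a small made-up data frame. The object names (`df_example`, `iso_cols`, `df_complete`) and the values are hypothetical and only stand in for your own data; `tidyr::drop_na()` would be an equivalent alternative.

```r
# Toy data frame standing in for your own isotope data (values are made up)
df_example <- data.frame(
  species = c("ARCS", "ARCS", "BDWF", "BDWF"),
  d13c = c(-24.1, NA, -25.3, -26.0),
  d15n = c(12.4, 11.9, 10.2, NA),
  d34s = c(-3.2, -2.8, 1.1, 0.4)
)

# Keep only the rows where all three isotopes were measured
iso_cols <- c("d13c", "d15n", "d34s")
df_complete <- df_example[complete.cases(df_example[, iso_cols]), ]

nrow(df_complete)
```

Rows with a missing value in any of the three isotope columns are removed, so only fully measured samples are passed on to {nicheROVER}.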
### Estimate posterior distribution with Normal-Inverse-Wishart (NIW) priors.
We will take 1,000 posterior samples for each group. You can change this, but I suggest nothing less than 1,000.
We will then split the data frame into a list, with each species as a data frame object within the list. We will then iterate over the list, using map() from {purrr}, to estimate the posterior distribution using Normal-Inverse-Wishart (NIW) priors.
You will notice that \(\delta\)34S has been selected for this analysis, which differs from the other vignette focused on using {nichetools} with {nicheROVER} for two isotopes.
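The split-and-iterate step described above might look like the following sketch. The object names `nsample` and `fish_par` and the seed are assumptions made for illustration, and lower-casing the column names here simply stands in for janitor::clean_names().

```r
library(nicheROVER)
library(dplyr)
library(purrr)

# fish ships with {nicheROVER}; lower-casing the names is a stand-in for
# janitor::clean_names() (assumption for this sketch)
df <- fish
names(df) <- tolower(names(df))

# number of posterior samples per species
nsample <- 1000

set.seed(1) # hypothetical seed so the posterior draws are reproducible
fish_par <- df %>%
  split(.$species) %>%                          # one data frame per species
  map(~ select(.x, d13c, d15n, d34s)) %>%       # keep the three isotopes
  map(~ niw.post(nsamples = nsample, X = .x))   # NIW posterior draws

# each list element holds posterior draws of mu and Sigma
dim(fish_par[[1]]$mu)
dim(fish_par[[1]]$Sigma)
```

Each element of the resulting list contains a matrix of \(\mu\) draws and an array of \(\Sigma\) draws for one species.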
We will use extract_mu() to extract posteriors for \(\mu\) estimates. The default output of extract_mu() is long format, which works for plotting with {ggplot2} and with other functions in {nichetools}. If we want wide format, we can set the argument format to "wide"; however, it is unlikely you will need this data in wide format.
The default output lacks some information needed for plotting. We will need to add columns for the element abbreviation and the mass number (stored in a column named neutron) to be used in axis labeling.
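A minimal sketch of adding these labeling columns is shown below. The toy data frame `df_mu_example` is hypothetical: it only mimics the `sample_name`, `isotope`, and `mu_est` columns used in the plotting code later on, and its values are made up.

```r
library(dplyr)

# Toy stand-in for the long-format output of extract_mu(); the real object
# holds many posterior draws per species and isotope
df_mu_example <- data.frame(
  sample_name = rep(c("ARCS", "BDWF"), each = 3),
  isotope = rep(c("d13c", "d15n", "d34s"), times = 2),
  mu_est = c(-24.7, 12.9, -3.1, -25.8, 10.4, 0.9)
)

# Add the element abbreviation and the number shown as a superscript
df_mu_example <- df_mu_example %>%
  mutate(
    element = case_when(
      isotope == "d15n" ~ "N",
      isotope == "d13c" ~ "C",
      isotope == "d34s" ~ "S"
    ),
    neutron = case_when(
      isotope == "d15n" ~ 15,
      isotope == "d13c" ~ 13,
      isotope == "d34s" ~ 34
    )
  )
```

The `element` and `neutron` columns can then be pulled into axis titles with unique() inside each plot.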
We will use extract_sigma() to extract posterior estimates for \(\Sigma\). The default output of extract_sigma() is wide format, which doesn’t work for plotting with {ggplot2} but does work with other functions in {nichetools}. If we want long format for plotting, we can set the argument format to "long".
Remember to change the argument isotope_n from
2 to 3, considering we are working with three
isotopes in this vignette.
For plotting, we will need the extracted \(\Sigma\) values to be in long format. We also need to remove the \(\Sigma\) values where both isotope columns refer to the same isotope.
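The reshaping and filtering described here could be sketched as follows. The toy data frame is an assumption about the wide layout (one row per `id`, one column per isotope); the `sample_number` column and all values are made up, while `id`, `isotope`, `post_sample`, and `sample_name` match the columns used in the plotting code further down.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for wide-format sigma output: one row per row of the
# covariance matrix (id) and one column per isotope (values are made up)
df_sigma_example <- data.frame(
  sample_name = "ARCS",
  sample_number = 1,
  id = c("d13c", "d15n", "d34s"),
  d13c = c(1.20, 0.35, -0.10),
  d15n = c(0.35, 0.80, 0.22),
  d34s = c(-0.10, 0.22, 2.50)
)

df_sigma_long <- df_sigma_example %>%
  pivot_longer(
    cols = -c(id, sample_name, sample_number),
    names_to = "isotope",
    values_to = "post_sample"
  ) %>%
  # drop entries where the row and column refer to the same isotope
  filter(id != isotope)
```

For three isotopes this leaves six `id` and `isotope` combinations per posterior draw.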
For most plotting within this vignette, I will split() the data frame by isotope to create a list, and then use imap() to iterate over that list and create the plots. We will use geom_density() to represent densities for both \(\mu\) and \(\Sigma\). Plot objects will then be stored in a list.
First we will plot \(\mu\) for each isotope. We will use {patchwork} to configure plots for multi-panel figures. This package is phenomenal and uses math operators to configure and manipulate the plots to create multi-panel figures.
For labeling, we are also going to use element_markdown() from {ggtext} to render the labels that are needed to correctly display the isotopic signature. If you are working with other data, please replace the labels accordingly.
posterior_plots <- df_mu %>%
split(.$isotope) %>%
imap(
~ ggplot(data = ., aes(x = mu_est)) +
geom_density(aes(fill = sample_name), alpha = 0.5) +
scale_fill_viridis_d(begin = 0.25, end = 0.75,
option = "D", name = "Species") +
theme_bw() +
theme(panel.grid = element_blank(),
axis.title.x = element_markdown(),
axis.title.y = element_markdown(),
legend.position = "none",
legend.background = element_blank()
) +
labs(
x = paste("\u00b5<sub>\U03B4</sub>", "<sub><sup>",
unique(.$neutron), "</sup></sub>",
"<sub>",unique(.$element), "</sub>", sep = ""),
y = paste0("p(\u00b5 <sub>\U03B4</sub>","<sub><sup>",
unique(.$neutron), "</sup></sub>",
"<sub>",unique(.$element),"</sub>",
" | X)"))
)
posterior_plots$d15n +
theme(legend.position = c(0.18, 0.82)) +
posterior_plots$d13c +
posterior_plots$d34s

For labeling purposes, we need to add columns for the element abbreviation and the mass number (the neutron_ columns below). I do this by using case_when(), which is a vectorized if-else statement.
df_sigma_cn <- df_sigma_cn %>%
mutate(
element_id = case_when(
id == "d15n" ~ "N",
id == "d13c" ~ "C",
id == "d34s" ~ "S",
),
neutron_id = case_when(
id == "d15n" ~ 15,
id == "d13c" ~ 13,
id == "d34s" ~ 34,
),
element_iso = case_when(
isotope == "d15n" ~ "N",
isotope == "d13c" ~ "C",
isotope == "d34s" ~ "S",
),
neutron_iso = case_when(
isotope == "d15n" ~ 15,
isotope == "d13c" ~ 13,
isotope == "d34s" ~ 34,
)
)

Next, we will plot the posteriors for \(\Sigma\).
sigma_plots <- df_sigma_cn %>%
group_split(id, isotope) %>%
imap(
~ ggplot(data = ., aes(x = post_sample)) +
geom_density(aes(fill = sample_name), alpha = 0.5) +
scale_fill_viridis_d(begin = 0.25, end = 0.75,
option = "D", name = "Species") +
theme_bw() +
theme(panel.grid = element_blank(),
axis.title.x = element_markdown(),
axis.title.y = element_markdown(),
legend.position = "none"
) +
labs(
x = paste("\U03A3","<sub>\U03B4</sub>",
"<sub><sup>", unique(.$neutron_id), "</sup></sub>",
"<sub>",unique(.$element_id),"</sub>"," ",
"<sub>\U03B4</sub>",
"<sub><sup>", unique(.$neutron_iso), "</sup></sub>",
"<sub>",unique(.$element_iso),"</sub>", sep = ""),
y = paste("p(", "\U03A3","<sub>\U03B4</sub>",
"<sub><sup>", unique(.$neutron_id), "</sup></sub>",
"<sub>",unique(.$element_id),"</sub>"," ",
"<sub>\U03B4</sub>",
"<sub><sup>", unique(.$neutron_iso), "</sup></sub>",
"<sub>",unique(.$element_iso),"</sub>", " | X)", sep = "")
)
)
sigma_plots[[1]] +
theme(legend.position = c(0.1, 0.82)) +
sigma_plots[[2]] +
sigma_plots[[4]]

We will then use niche_ellipse() to extract an ellipse for each \(\Sigma\) estimate (i.e., 1,000). The function reports how long it took to process, which is useful to know when working with large sets of isotope data. The function also has the argument random, which by default is set to TRUE. This argument randomly subsamples and returns 10 ellipse estimates out of the total number of samples taken, which in this case is 1,000. The set_seed argument allows you to set the set.seed value by giving a numerical value, which makes the results of the function reproducible; by default a random value is used. I highly suggest using set_seed, otherwise the function will subsample different values on each run. There is no fixed default value because CRAN does not allow one. You can change the number of subsamples, but 10 seems pretty standard. If you’d like all 1,000 ellipses, you can set random to FALSE.
Remember to change the argument isotope_n from
2 to 3, considering we are working with three
isotopes in this vignette.
ellipse_df <- niche_ellipse(dat_mu = df_mu,
dat_sigma = df_sigma,
isotope_n = 3,
set_seed = 4)
#> → Total time processing was 0.08 secs

ellipse_df <- ellipse_df %>%
mutate(
element_id = case_when(
iso_a == "d15n" ~ "N",
iso_a == "d13c" ~ "C",
iso_a == "d34s" ~ "S",
),
neutron_id = case_when(
iso_a == "d15n" ~ 15,
iso_a == "d13c" ~ 13,
iso_a == "d34s" ~ 34,
),
element_iso = case_when(
iso_b == "d15n" ~ "N",
iso_b == "d13c" ~ "C",
iso_b == "d34s" ~ "S",
),
neutron_iso = case_when(
iso_b == "d15n" ~ 15,
iso_b == "d13c" ~ 13,
iso_b == "d34s" ~ 34,
)
)

We will first plot the ellipses for each sample_name.
ellipse_plots <-
ellipse_df %>%
split(.$iso_combos) %>%
map(~ .x %>%
ggplot() +
geom_polygon(data = .x,
mapping = aes(x = x, y = y,
group = interaction(sample_number, sample_name),
color = sample_name),
fill = NA,
linewidth = 0.5) +
scale_colour_viridis_d(begin = 0.25, end = 0.75,
option = "D", name = "species",
) +
theme_bw(base_size = 10) +
theme(axis.text = element_text(colour = "black"),
axis.title.x = element_markdown(),
axis.title.y = element_markdown(),
panel.grid = element_blank(),
legend.position = "none",
legend.title = element_text(hjust = 0.5),
legend.background = element_blank()
) +
labs(
x = paste("\U03B4","<sup>", unique(.$neutron_id), "</sup>",
unique(.$element_id), sep = ""),
y = paste("\U03B4","<sup>", unique(.$neutron_iso), "</sup>",
unique(.$element_iso), sep = ""),
)
)
ellipse_plots
#> $`d13c - d15n`
#>
#> $`d13c - d34s`
#>
#> $`d15n - d34s`
We need to turn df into long format so that we can iterate over it using imap() to easily create density plots. You will notice that I again use case_when() to make columns of element abbreviations and mass numbers (the neutron column) that will be used in plot labeling.
iso_long <- df %>%
pivot_longer(cols = -species,
names_to = "isotope",
values_to = "value") %>%
mutate(
element = case_when(
isotope == "d15n" ~ "N",
isotope == "d13c" ~ "C",
isotope == "d34s" ~ "S",
),
neutron = case_when(
isotope == "d15n" ~ 15,
isotope == "d13c" ~ 13,
isotope == "d34s" ~ 34,
)
)

We will then make density plots for each isotope using geom_density().
iso_density <- iso_long %>%
group_split(isotope) %>%
imap(
~ ggplot(data = .) +
geom_density(aes(x = value,
fill = species),
alpha = 0.35,
linewidth = 0.8) +
scale_fill_viridis_d(begin = 0.25, end = 0.75,
option = "D", name = "Species") +
theme_bw(base_size = 10) +
theme(axis.text = element_text(colour = "black"),
panel.grid = element_blank(),
legend.position = c(0.15, 0.55),
legend.background = element_blank(),
axis.title.x = element_markdown(family = "sans")) +
labs(x = paste("\U03B4",
"<sup>", unique(.$neutron), "</sup>",unique(.$element),
sep = ""),
y = "Density")
)
d13c_density <- iso_density[[1]] +
scale_x_continuous(breaks = rev(seq(-20, -34, -2)),
limits = rev(c(-20, -34)))
d15n_density <- iso_density[[2]] +
scale_x_continuous(breaks = seq(5, 15, 2.5),
limits = c(5, 15)) +
theme(
legend.position = "none"
)
d34s_density <- iso_density[[3]] +
theme(
legend.position = "none"
)

# split_iso <- create_isotope_pairs(isotope_n = 3)
# iso_split <- split_iso |>
# purrr::map(~ iso_long |>
# mutate(
# isotope = factor(isotope, level = c("d34s",
# "d13c",
# "d15n"))
# ) |>
# arrange(isotope) |>
# filter(isotope %in% .x) |>
# pivot_wider(id_cols = species,
# names_from = "isotope",
# values_from = c("value", "element", "neutron")) |>
# unnest()
# )

Lastly, we will use geom_point() to make isotopic biplots.
# iso_biplot <- iso_split %>%
# purrr::map(~ ggplot(data = .x, aes(x = .x[[2]], y = .x[[3]])) +
# geom_point(aes(fill = .x$species), size = 3,
# shape = 21, colour = "black",
# stroke = 0.8, alpha = 0.70) +
# scale_fill_viridis_d(begin = 0.25, end = 0.75,
# option = "D", name = "species") +
# theme_bw(base_size = 10) +
# theme(axis.text = element_text(colour = "black"),
# axis.title.x = element_markdown(),
# axis.title.y = element_markdown(),
# panel.grid = element_blank(),
# legend.position = "none",
# legend.title = element_text(hjust = 0.5),
# legend.background = element_blank()
# ) +
# labs(
# x = paste("\U03B4","<sup>", unique(.x[[6]]), "</sup>",
# unique(.x[[4]]), sep = ""),
# y = paste("\U03B4","<sup>", unique(.x[[7]]), "</sup>",
# unique(.x[[5]]), sep = ""),
# )
# )

We can also use the function plot_annotation() to add lettering to the figure that can be referenced in the figure description. To adjust where plot_annotation() places the lettering, we need to add plot.tag.position = c(x, y) to the theme() call in every plot.
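As a minimal sketch of this lettering step, the example below combines three toy panels built from the built-in mtcars data, which only stand in for the isotope plots. The object names (`p1`, `p2`, `p3`, `combined`) and the tag position values are hypothetical; the `&` operator from {patchwork} is one way to apply the same theme() setting to every panel at once.

```r
library(ggplot2)
library(patchwork)

# Toy panels standing in for the density plots and biplots
p1 <- ggplot(mtcars, aes(x = mpg)) + geom_density()
p2 <- ggplot(mtcars, aes(x = wt)) + geom_density()
p3 <- ggplot(mtcars, aes(x = mpg, y = wt)) + geom_point()

# `+` places panels side by side and `/` stacks rows;
# tag_levels = "a" letters the panels a, b, c;
# `&` applies the tag position to every panel in the layout
combined <- (p1 + p2) / p3 +
  plot_annotation(tag_levels = "a") &
  theme(plot.tag.position = c(0.1, 0.95))
```

Printing `combined` draws the three-panel figure with a letter in the upper-left corner of each panel.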
We will use the overlap() function from {nicheROVER} to estimate the percentage of similarity among species. We will set overlap() to assess similarity based on the 95% niche region.
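A sketch of this call is given below. It rebuilds the posterior list from the fish data so that it runs on its own; the object names `fish_par` and `over_stat`, the seed, and the choice of nreps and nprob are assumptions for illustration, and lower-casing the column names stands in for janitor::clean_names().

```r
library(nicheROVER)

# Rebuild the posterior draws so this example is self-contained
df <- fish
names(df) <- tolower(names(df))

set.seed(1) # hypothetical seed for reproducibility
fish_par <- lapply(
  split(df[, c("d13c", "d15n", "d34s")], df$species),
  function(x) niw.post(nsamples = 1000, X = x)
)

# Overlap among all species pairs for the 95% niche region;
# nreps must not exceed the number of posterior samples
over_stat <- overlap(fish_par, nreps = 1000, nprob = 1000, alpha = 0.95)

dim(over_stat)
```

The result is an array indexed by species pair and posterior draw, with overlap expressed as a proportion between 0 and 1.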
We are then going to transform this output into a data frame using extract_overlap() so that we can plot it and assess overall similarities among species.
We then are going to take our newly made data frame and extract out the median percentage of similarities and the 2.5% and 97.5% quantiles.
over_sum <- over_stat_df %>%
group_by(sample_name_a, sample_name_b) %>%
summarise(
median_niche_overlap = round(median(niche_overlap_perc), digits = 2),
qual_2.5 = round(quantile(niche_overlap_perc,
probs = 0.025, na.rm = TRUE), digits = 2),
qual_97.5 = round(quantile(niche_overlap_perc,
probs = 0.975, na.rm = TRUE), digits = 2)
) %>%
ungroup() %>%
pivot_longer(cols = -c(sample_name_a, sample_name_b, median_niche_overlap),
names_to = "percentage",
values_to = "niche_overlap_qual") %>%
mutate(
percentage = as.numeric(str_remove(percentage, "qual_"))
)

We are now going to use ggplot(), geom_violin(), and stat_summary() to represent the posterior distributions and the median of the posterior distributions. Representations of posterior distributions should use either the mode or the median, not the mean. See the following publication as a guide to the representation of posterior distributions in visualizations and in text. You can extract credible intervals or equal-tailed intervals using {bayestestR}.
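Before plotting, a brief aside on the intervals mentioned above: a 95% equal-tailed interval is simply the 2.5% and 97.5% quantiles of the posterior draws. The base R sketch below uses simulated draws (not results from this analysis), and the object names are hypothetical; bayestestR::eti() and bayestestR::hdi() offer packaged versions of equal-tailed and highest-density intervals.

```r
set.seed(1)

# Simulated stand-in for posterior draws of percent overlap (not real results)
post_draws <- rbeta(1000, shape1 = 8, shape2 = 4) * 100

# Point summary: the median of the posterior draws
post_median <- median(post_draws)

# 95% equal-tailed interval: the 2.5% and 97.5% quantiles
eti_95 <- quantile(post_draws, probs = c(0.025, 0.975))

post_median
eti_95
```

Reporting the median together with this interval follows the recommendation to avoid the mean when summarizing a posterior.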
ggplot(data = over_stat_df, aes(x = sample_name_a,
y = niche_overlap_perc,
fill = sample_name_b)) +
geom_violin() +
stat_summary(fun = median, geom = "point",
size = 3,
position = position_dodge(width = 0.9)) +
geom_vline(xintercept = seq(1.5, 3.5, 1),
linetype = 2) +
scale_fill_viridis_d(begin = 0.25, end = 0.75,
option = "D", name = "Species",
alpha = 0.35) +
theme_bw() +
theme(
panel.grid = element_blank(),
axis.text = element_text(colour = "black"),
legend.background = element_blank(),
strip.background = element_blank()
) +
labs(x = paste("Overlap Probability (%)", "\u2013",
"Niche Region Size: 95%"),
y = "p(Percent Overlap | X)")

We are now going to estimate the overall size of the niche for each posterior sample by using the function extract_niche_size(), which is a wrapper around niche.size() and some data manipulation functions.
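To illustrate what such a wrapper builds on, the sketch below calls niche.size() from {nicheROVER} directly on every posterior draw of \(\Sigma\). This is not the internal code of extract_niche_size(); the object names `fish_par` and `niche_size_mat` and the seed are assumptions, and the posterior list is rebuilt here so the example runs on its own.

```r
library(nicheROVER)

# Rebuild the posterior draws so this example is self-contained
df <- fish
names(df) <- tolower(names(df))

set.seed(1) # hypothetical seed for reproducibility
fish_par <- lapply(
  split(df[, c("d13c", "d15n", "d34s")], df$species),
  function(x) niw.post(nsamples = 1000, X = x)
)

# One niche size per posterior draw of Sigma, for each species
niche_size_mat <- sapply(
  fish_par,
  function(spec) apply(spec$Sigma, 3, niche.size, alpha = 0.95)
)

dim(niche_size_mat) # posterior draws by species
```

The matrix has one column per species, which would still need to be reshaped into long format before plotting.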
We will now use geom_violin() and stat_summary() to plot the distribution of niche size for each species, with the median shown as a point.
ggplot(data = niche_size,
aes(x = sample_name, y = niche_size)) +
geom_violin(
width = 0.2) +
stat_summary(fun = median, geom = "point",
size = 3,
position = position_dodge(width = 0.9)) +
theme_bw(base_size = 15) +
theme(panel.grid = element_blank(),
axis.text = element_text(colour = "black")) +
labs(x = "Species",
y = "Niche Size")

Now that we have determined our niche sizes and similarities, we can make inferences about the species, trophic similarities, and the ecosystem.