Code
# Access Biological Data with {gmRi}
# nmfs_bio <- gmRi::gmri_survdat_prep(survdat_source = "bio")
withr::with_dir(rprojroot::find_root('_targets.R'),
tar_load(survdat_biological))Change in Size-at-age Relationships for Groundfish Species
This report digs into changes in size with age using biological data from the NMFS bottom trawl survey. This dataset is a subset of the groundfish trawl survey’s total catch.
The subset is saved while at sea to be brought back for additional workout for things like aging or histology. Because these fishes are worked up at an individual level they have additional data than is otherwise available from the general survey catch data.
Because this is a subset of the full catch data, this information is better at size-at-age analysis and not suited for size-frequency information.
The biological data has its own target, but is loaded from box using the {gmRi} package just like the catch data, pulling the data off our shared cloud storage.
# Access Biological Data with {gmRi}
# nmfs_bio <- gmRi::gmri_survdat_prep(survdat_source = "bio")
withr::with_dir(rprojroot::find_root('_targets.R'),
tar_load(survdat_biological))Year Groupings
For some of the analyses below the data are grouped into 5-year increments to ensure that there is enough data at-age that might be sparse for any single given year. The function to apply these labels is below:
# 1. Make 5 year increments
bio_5yr_prep <- function(survdat_biological){
# Drop NA age data, restrict to summer and fall
nmfs_bio <- survdat_biological %>%
filter(is.na(age) == FALSE,
season %in% c("Summer", "Fall"))
#Identify the breaks
yr5_breaks <- seq(1970, 2020, by = 5)
yr5_ends <- seq(1974, 2024, by = 5)
# Build labels
yr5_labels <- str_c(yr5_breaks, "-", yr5_ends)
yr5_labels <- yr5_labels[1:(length(yr5_labels)-1)]
#Add 5-year groups, decade labels
nmfs_bio <- nmfs_bio %>%
mutate(
yearclass = est_year - (age-1),
five_yr_group = cut(
est_year,
breaks = yr5_breaks,
labels = yr5_labels,
include.lowest = TRUE,
ordered_result = TRUE),
decade = floor_decade(est_year)) %>%
as.data.frame()
}
# run cleanup
nmfs_bio <- bio_5yr_prep(survdat_biological = survdat_biological)Following this step the biological data is now ready for looking at individual years, cohorts, or five-year groups.
Species Selection
In the original proposal it was stated that 17 species would be used, species that are regularly measured and aged. Ordering all species by the number aged we can pull out the following top species:
#### Data Prep ####
# Rank species by how many measurements there are
species_abunds <- nmfs_bio %>%
count(comname) %>%
arrange(desc(n)) # ordered by number measured
# Reorder alphabetically
vonbert_species <- sort(species_abunds$comname[1:17])These will be the species that we assess length-at-age characteristics for:
acadian redfish, american plaice, atlantic cod, atlantic herring, black sea bass, butterfish, goosefish, haddock, pollock, red hake, scup, silver hake, summer flounder, white hake, winter flounder, witch flounder and yellowtail flounder
Survey Seasons
We care about data from the two main survey seasons, so we limit the data to just those seasons before splitting it out into lists for each species. The lists contain all data for a given species, broken into groups at a set interval of time. In this case it is five year increments.
# Name the list so it carries through
names(vonbert_species) <- vonbert_species
# Does not work for because of bad start points or no age data:
# skates, dogfish, fourspot flounder, goosefish
##### Pull species data into the list, and make group id for later
species_data <- map(vonbert_species, function(species_name){
# Drop NA ages, set increment labels
nmfs_bio %>%
filter(comname == species_name) %>%
mutate(group_id = str_c(est_year, comname, sep = "-"))
})
# Set Aside the von bert species
all_vonbert <- bind_rows(species_data, .id = "common name") Growth Increment Calculation
One aspect of growth beyond size-at-age is the annual growth between years of a specific age, or the growth increment.
# Age-Size Increments Differences
vb_incs <- all_vonbert %>%
group_by(`common name`, est_year, age) %>%
summarise(avg_len = mean(length_cm, na.rm = T),
avg_wt = mean(indwt, na.rm = T),
.groups = "drop") %>%
arrange(`common name`,age) %>%
group_split(`common name`, est_year) %>%
map_dfr(function(x){
x %>%
mutate(
lead_age = lead(age, n = 1),
diff_name = str_c(age, " - ", lead_age),
len_age_p1 = lead(avg_len, n = 1),
wt_age_p1 = lead(avg_wt, n = 1),
len_inc = len_age_p1 - avg_len,
wt_inc = wt_age_p1 - avg_wt) %>%
filter((lead_age - age == 1))
})Age Data Availability
The availability of data for fitting growth curves varies by species & sex for the different seasons and years of the survey. Some species are much longer lived and have a larger age range that may be difficult to accurately fit without good representation in the data.
# Make summary table
all_vonbert %>%
group_by(comname, decade) %>%
summarise(
number_aged = n(),
min_age = min(age),
max_age = max(age),
age_range = max_age - min_age,
num_per_age = number_aged / age_range,
.groups = "drop") %>%
mutate(across(where(is.numeric), round, 1),
age_range = NULL) %>%
setNames(
c("Common Name", "Decade", "# Aged",
"Age Min", "Age Max", "Avg. Fish / Age")) %>%
DT::datatable()Before fitting anything, let’s first look at how the size at age and growth increments look visually. To avoid plot overload the focus will be on 5 main species that have a high ecological or commercial significance.
# Species we want panels for:
key_species <- sort(c(
"atlantic cod",
"atlantic herring",
"american plaice",
"haddock",
"acadian redfish",
"silver hake",
"winter flounder"
))Just looking at the raw data available, it becomes obvious that there are differences in the number of fish at various ages through time.
With cases of large cohorts passing through, or older individuals being caught less frequently. Particularly large cohorts of haddock are a good example.
# Plotting age distribution as ggridges
plot_age_ridgeplot <- function(species){
species_data[[species]] %>%
filter(age <= 15) %>%
mutate(yr = factor(est_year),
yr = fct_rev(yr)) %>%
ggplot(aes(x = age, y = yr)) +
geom_density_ridges(fill = gmri_cols("gmri blue"), color = "white") +
facet_wrap(~comname) +
scale_x_continuous(limits = c(0,NA), expand = expansion(add = c(0, 0))) +
labs(x = "Age", y = "Year") +
guides(size = guide_legend(title.position = "top", title.hjust = 0.5))
}
# # Process Each species
# ridge_plots <- map(vonbert_species, plot_age_ridgeplot)# Loop through the key species:
walk(key_species, plot_panelset, plot_fun = plot_age_ridgeplot)






Another feature that the age data gives us, is the ability to track cohorts (fish born on the same year) through time and how size at age varies across these cohorts.
plot_length_bubbles <- function(spec_choice){
species_data[[spec_choice]] %>%
group_by(comname, yearclass, age) %>%
summarise(avg_len = mean(length_cm, na.rm = T),
.groups = "drop") %>%
ggplot(aes(x = yearclass,
y = age,
fill = avg_len)) +
geom_point(shape = 22, color = "white", size = 2.5) +
scale_fill_gmri(discrete = F) +
facet_wrap(~comname) +
theme(legend.position = "bottom") +
labs(x = "Cohort", y = "Age", fill = "Average Length (cm)") +
guides(fill = guide_colorbar(title.position = "top", title.hjust = 0.5))
}# Loop through the key species:
walk(key_species, plot_panelset, plot_fun = plot_length_bubbles)






plot_wt_bubbles <- function(spec_choice){
species_data[[spec_choice]] %>%
group_by(comname, yearclass, age) %>%
summarise(avg_wt = mean(indwt, na.rm = T),
.groups = "drop") %>%
ggplot(aes(x = yearclass, y = age, fill = avg_wt)) +
geom_point(shape = 22, color = "white", size = 3) +
scale_fill_gmri(discrete = F) +
facet_wrap(~comname) +
scale_x_continuous(limits = c(1989, 2020)) +
theme(legend.position = "bottom") +
labs(x = "Cohort", y = "Age", fill = "Average Weight (kg)") +
guides(fill = guide_colorbar(title.position = "top", title.hjust = 0.5))
}# Loop through the key species:
walk(key_species, plot_panelset, plot_fun = plot_wt_bubbles)






From that same raw data, we have measurements of individuals sampled and the age that they were later given by otolith readers. Without needing to fit growth curves we can look at how these sizes change over the years:
# Plotting a species changing weight at age:
saa_boxplot <- function(comname_select, age_cutoff = 8){
# Pull species
this_species <- all_vonbert %>%
filter(comname == comname_select) %>%
mutate(decade = str_c(decade, "'s")) %>%
filter(est_year >= 1990,
age <= age_cutoff) %>%
mutate(age = fct_drop(factor(age)))
# Length plot for decades as box-whisker
len_plot <- this_species %>%
drop_na(length_cm) %>%
ggplot() +
geom_boxplot(aes(age, length_cm, fill = decade), position = "dodge",
width = .5, alpha = 0.8,
outlier.colour = NA) +
scale_fill_gmri() +
theme(legend.position = "none") +
labs(x = "Age", y = "Length (cm)",
fill = "Decadal Distribution",
title = str_to_title(comname_select),
subtitle = "Decadal Change to Size at Age")
# weight plot for decades as box-whisker
wt_plot <- this_species %>%
drop_na(indwt) %>%
ggplot() +
geom_boxplot(aes(age, indwt, fill = decade), position = "dodge",
width = .5, alpha = 0.8,
outlier.colour = NA) +
scale_fill_gmri() +
theme(legend.position = "bottom") +
labs(x = "Age", y = "Weight (kg)",
fill = "Decadal Distribution")
# Return the two plots
return(len_plot / wt_plot)
}
# Test it:
# plot_panelset(spec = "atlantic cod", plot_fun = saa_boxplot)# Loop through the key species:
walk(key_species, plot_panelset, plot_fun = saa_boxplot)






Declines in size at age can be indicative of a number of stressors including fishing pressures, foraging difficulty, and physiological stressors like temperature increases.
# Plot individual Sizes and the trends with them:
plot_raw_saa <- function(
spec_choice,
max_age = 4,
gam_knots = 15,
var_choice = "length",
facet_prefix = "Age: ",
facet_suffix = ""){
# Code to switch variables and their labels:
var_sym <- switch(var_choice,
"length" = sym("length_cm"),
"weight" = sym("indwt"))
var_label <- switch(var_choice,
"length" = "Length (cm)",
"weight" = "Weight (kg)")
# Add pre/suffix to facet labels
appender <- function(string, prefix = facet_prefix, suffix = facet_suffix) {
paste0(prefix, string)
}
# Plot them
all_vonbert %>%
filter(`common name` == spec_choice,
age <= max_age) %>%
drop_na({{var_sym}}) %>%
ggplot(aes(est_year, y = {{var_sym}} )) +
geom_point(alpha = 0.3) +
geom_smooth(method = "gam",
formula = y ~ s(x, bs = "cs", k = gam_knots),
size = 1,
color = gmri_cols("orange"),
se = FALSE) +
facet_wrap(~age, ncol = 1, scales = "free",
labeller = as_labeller(appender)) +
labs(
x = "Year",
y = var_label,
title = str_to_title(spec_choice),
subtitle = str_c("Changes in ", var_choice, " at age."))
}# Loop through the key species:
walk(key_species, plot_panelset, plot_fun = plot_raw_saa, var_choice = "length")






Growth increments relate changes in growth to specific points in a species life. Decline in growth increments for young fish can relate to things like changes prey availability. Declines in growth at later ages can also sometimes indicate early/late sexual maturation.
# Put together a timeseries of:
# how big were the age 0's,
# how big were the age 1's
# Plotting Age Increments
plot_age_increments <- function(
spec_choice,
var_choice = "length",
gam_knots = 15,
facet_prefix = "Age Increment: ",
facet_suffix = ""){
# Code to switch variables and their labels:
var_sym <- switch(var_choice,
"length" = sym("len_inc"),
"weight" = sym("wt_inc"))
var_label <- switch(var_choice,
"length" = "Annual Growth (cm)",
"weight" = "Annual Growth (kg)")
# Add pre/suffix to facet labels
appender <- function(string, prefix = facet_prefix, suffix = facet_suffix) {
paste0(prefix, string)}
# Make the plot
vb_incs %>%
filter(`common name` == spec_choice,
lead_age <= 8) %>%
drop_na({{var_sym}}) %>%
ggplot(aes(est_year, {{var_sym}})) +
# geom_col(fill = gmri_cols("gmri blue")) +
geom_point(color = gmri_cols("gmri blue")) +
geom_line(color = gmri_cols("gmri blue"), group = 1, alpha = 0.3, size = 0.8) +
geom_smooth(method = "gam",
formula = y ~ s(x, bs = "cs", k = gam_knots),
se = FALSE,
size = 1,
color = gmri_cols("orange")) +
facet_wrap(~diff_name, ncol = 1, scales = "free",
labeller = as_labeller(appender)) +
theme(legend.position = "bottom") +
labs(x = "",
y = var_label,
color = "Age Increment",
title = str_to_title(spec_choice),
subtitle = str_c("Changes in annual ", var_choice, " growth increments."))
}# Loop through the key species:
walk(key_species, plot_panelset, plot_fun = plot_age_increments, var_choice = "weight")






Growth in fishes is commonly modeled using the “Von-Bertalanffy” growth function to capture how a fishes size (length or weight) changes with age. To give a coarse glimpse into how this relationship may have changed over time we can look at how the parameters have changed over consecutive 5-year periods.
The Von-Bertalanffy parameterization used is the “Typical” von Bertalanffy parameterization, also known as the Beverton-Holt parameterization, which is implemented using the {FSA} package.
We can check that this is the case by displaying the equations:
# "typical" von bert
typical <- growthFunShow(param = "Typical", "vonBertalanffy")
typical_plot <- ggplot() +
annotate(x = 1, y = 1, label = typical, geom = "text", size = 6) +
labs(title = '"Typical" Parameterization') +
theme_void()
bholt <- growthFunShow(param = "BevertonHolt", "vonBertalanffy")
bholt_plot <- ggplot() +
annotate(x = 1, y = 1, label = bholt, geom = "text", size = 6) +
labs(title = '"Beverton-Holt" Parameterization') +
theme_void()
typical_plot | bholt_plot
To start things off we first set the von Bert parameterization to solve using nonlinear least squares: nls()
( vb <- vbFuns(param="Typical") )function (t, Linf, K = NULL, t0 = NULL)
{
if (length(Linf) == 3) {
K <- Linf[[2]]
t0 <- Linf[[3]]
Linf <- Linf[[1]]
}
Linf * (1 - exp(-K * (t - t0)))
}
<bytecode: 0x7f7ca1e0e9b8>
<environment: 0x7f7ca19d3b18>
Then once we have set that as the function we want to use, we can write a wrapper around it that will take input length and age data, some reasonable parameter starting points, and some minimum number of observations that we set to make sure groups with very few aged fish don’t break the workflow.
# Function to Pull Vonbert Coefficients
vonbert_coef <- function(length_age_dat, start_points, min_obs = 20){
# Von Bert Fitting Using: {FSA}
# Don't run on fewer than a minimum number of observations
num_aged <- nrow(length_age_dat)
if(num_aged < min_obs){
na_df <- data.frame("Linf" = NA,
"K" = NA,
"t0" = NA,
"n_aged" = num_aged,
"len_age_1" = NA,
"len_age_2" = NA,
"len_age_3" = NA)
return(na_df)
}
# Use nls to estimate VBGF parameters using starting points
vbert_fit <- nls(length_cm ~ vb(age, Linf, K, t0),
data = length_age_dat,
start = start_points)
# Access parameters with coef()
# Estimate sizes at age
vbert_coef <- as.data.frame(t(coef(vbert_fit))) %>%
mutate(n_aged = num_aged,
len_age_1 = Linf * (1 - exp(-1 * K * (1 - t0))),
len_age_2 = Linf * (1 - exp(-1 * K * (2 - t0))),
len_age_3 = Linf * (1 - exp(-1 * K * (3 - t0))),
len_age_4 = Linf * (1 - exp(-1 * K * (4 - t0))),
len_age_5 = Linf * (1 - exp(-1 * K * (5 - t0))),
len_age_6 = Linf * (1 - exp(-1 * K * (6 - t0))),
len_age_7 = Linf * (1 - exp(-1 * K * (7 - t0))),
len_age_8 = Linf * (1 - exp(-1 * K * (8 - t0)))
)
return(vbert_coef)
}This next function is a little redundant, but makes it easier to use purrr::map and repeat the estimations for each species:
# Function for running that for a single Species
species_vonbert <- function(comname, split_col, min_obs){
# Get starting points from all data
test_starts <- vbStarts(length_cm ~ age, data = species_data[[comname]])
# Get coefficients for groups
species_data[[comname]] %>%
split(.[split_col]) %>%
map_dfr(vonbert_coef, test_starts, min_obs = min_obs, .id = split_col) %>%
mutate(comname = comname)
}Once everything is prepped we can then fun the von Bert parameter estimation for all the species we are interested using the desired groups.
# Running everything?
#species_coef <- vonbert_species[c(1:17)] %>%
species_coef <- key_species %>%
map_dfr(.f = species_vonbert,
split_col = "five_yr_group",
min_obs = 30)Since the table of coefficients is somewhat simple here is a function to select a species, the x_column to use (corresponding to the time grouping), and the name of the coefficient to plot.
# pick species and coefficients
plot_vonbert_coef <- function(comnames, x_col, coef_name){
# Set columns as symbols
x_col_sym <- sym(x_col)
coef_sym <- sym(coef_name)
# Set the coefficient for labelling the plot
coef_label <- switch (
coef_name,
Linf = "Asymptotic Max Length (cm)",
K = "Body Growth Coefficient (K)",
t0 = "Hypothetical Age of Zero Size (t0)",
n_aged = "Number Aged",
len_age_1 = "Age 1 length (cm)",
len_age_2 = "Age 2 length (cm)",
len_age_3 = "Age 3 length (cm)",
len_age_4 = "Age 4 length (cm)",
len_age_5 = "Age 5 length (cm)",
len_age_6 = "Age 6 length (cm)",
len_age_7 = "Age 7 length (cm)",
len_age_8 = "Age 8 length (cm)"
)
species_coef %>%
filter(comname %in% comnames) %>%
ggplot(aes(y = {{coef_sym}}, x = {{x_col_sym}})) +
geom_line(color = gmri_cols("gmri blue"), group = 1, linetype = 1) +
geom_point(color = gmri_cols("gmri blue"),size = 0.8) +
scale_x_discrete(guide = guide_axis(n.dodge = 2)) +
facet_wrap(~comname, ncol = 1, scales = "free") +
labs(x = "Five-Year Period", y = coef_label)
}Using that function we can now take a look at the coefficients for any given species/coefficient combination.
plot_vonbert_coef(comnames = c("atlantic cod", "atlantic herring", "haddock", "winter flounder", "silver hake"),
x_col = "five_yr_group",
coef_name = "Linf")
plot_vonbert_coef(comnames = c("atlantic cod", "atlantic herring", "haddock", "winter flounder", "silver hake"),
x_col = "five_yr_group",
coef_name = "K")
plot_vonbert_coef(comnames = c("atlantic cod", "atlantic herring", "haddock", "winter flounder", "silver hake"),
x_col = "five_yr_group",
coef_name = "t0")
plot_vonbert_coef(comnames = c("atlantic cod", "atlantic herring", "haddock", "winter flounder", "silver hake"),
x_col = "five_yr_group",
coef_name = "n_aged")
plot_vonbert_coef(comnames = c("atlantic cod", "atlantic herring", "haddock", "winter flounder", "silver hake"),
x_col = "five_yr_group",
coef_name = "len_age_1")
plot_vonbert_coef(comnames = c("atlantic cod", "atlantic herring", "haddock", "winter flounder", "silver hake"),
x_col = "five_yr_group",
coef_name = "len_age_2")
plot_vonbert_coef(comnames = c("atlantic cod", "atlantic herring", "haddock", "winter flounder", "silver hake"),
x_col = "five_yr_group",
coef_name = "len_age_3")
plot_vonbert_coef(comnames = c("atlantic cod", "atlantic herring", "haddock", "winter flounder", "silver hake"),
x_col = "five_yr_group",
coef_name = "len_age_4")
plot_vonbert_coef(comnames = c("atlantic cod", "atlantic herring", "haddock", "winter flounder", "silver hake"),
x_col = "five_yr_group",
coef_name = "len_age_5")
plot_vonbert_coef(comnames = c("atlantic cod", "atlantic herring", "haddock", "winter flounder", "silver hake"),
x_col = "five_yr_group",
coef_name = "len_age_6")
plot_vonbert_coef(comnames = c("atlantic cod", "atlantic herring", "haddock", "winter flounder", "silver hake"),
x_col = "five_yr_group",
coef_name = "len_age_7")
plot_vonbert_coef(comnames = c("atlantic cod", "atlantic herring", "haddock", "winter flounder", "silver hake"),
x_col = "five_yr_group",
coef_name = "len_age_8")
# Make the plot for each species
vb_fits_plot = function(spec_name){
# The raw data:
t_dat <- species_data[[spec_name]]
# The group coefficients, augmented with x values across a range
age_max <- max(t_dat$age)
x_range <- seq(from = 0, to = age_max, by = .1)
t_coef <- species_coef %>% filter(comname == spec_name) %>%
right_join(
y = data.frame(
five_yr_group = rep(unique(t_dat$five_yr_group), length(x_range)),
age = rep(x_range, length(unique(t_dat$five_yr_group)))),
by = "five_yr_group"
) %>%
mutate(
# Calculate the length at age
pred_length = Linf * (1 - exp(-1 * K * (age - t0)))
)
# Make plot?
ggplot(t_dat, aes(x = age, y = length_cm)) +
geom_point(alpha = 0.25) +
#facet_wrap(~five_yr_group, ncol = 1) +
geom_line(data = t_coef,
aes(x = age, y = pred_length, group = five_yr_group),
color = "orange",
size = 1) +
scale_x_continuous(breaks = seq(0, age_max, by = 2),
limits = c(0, age_max),
expand = expansion(mult = c(0, .15))) +
geom_dl(data = t_coef,
aes(x = age, y = pred_length, label = five_yr_group),
method= list("last.qp", cex = 0.65, hjust = -.15),
size = 12) +
labs(x = "Age",
y = "Length (cm)",
title = spec_name,
subtitle = "Progression of Length at Age Characteristics")
}# Loop through the key species:
walk(key_species,
plot_panelset,
plot_fun = vb_fits_plot)






For each species, I’m curious if Size At Age, Growth Increment, K, and Linf for pre-2010 differs from post-2010, and whether heatwave years (2012, 2016, 2018, 2020, 2021–did I miss any?) differ from non-heatwave years. Could you set up some anovas to make these comparisons for each species (and where appropriate, species by age)?
Things that have yearly values:
* Size at age
* Growth increments
# Growth increments
regime_dat <- vb_incs %>% mutate(regime = ifelse(est_year < 2010, "Normal Regime", "Warm Regime"))
# Actual sizes
regime_dat <- bind_rows(species_data) %>% mutate(regime = ifelse(est_year < 2010, "Normal Regime", "Warm Regime"))
spec <- "haddock"
regime_dat %>%
filter(
comname == spec,
#`common name` == spec,
age <= 5) %>%
ggplot(aes(x = indwt, group = regime, fill = regime)) +
geom_density(alpha = 0.5) +
facet_wrap(~age, ncol = 1, scales = "free") +
scale_fill_gmri() +
theme(legend.position = "bottom") +
labs(x = "Individual Weight (kg)",
y = "Count",
title = spec)
Things that were derived in 5-year groups:
* K
* t0
* L infinity
A work by Adam A. Kemberling
Akemberling@gmri.org