NEFSC Trawl Bodymass Distribution Allocations

Supplemental materials for size spectrum manuscript.

Author

Affiliation

Gulf of Maine Research Institute

Published

November 28, 2022

Patterns in Biomass Allocation

The following code digs into where the biomass is allocated among the 62 species sampled by the NE Groundfish Survey. Stratified abundances are used. Individual lengths and bodymasses are used to isolate what proportion of the overall abundances and biomasses reside.

Data is prepared and updated using {targets} to ensure a consistent data state and a reproducible workflow.

Target Data

Data for the report comes directly from the {targets} workflow found in _targets.R. For this markdown I’m loading in the results from the size spectrum slope analysis and the data that went into it so I can dig into any odd patterns I see. There is also regional SST based on the same trawl regions.

Exploration of Abundance/Size Information

The biological data was filtered prior to doing the size spectrum analysis so that any individuals smaller than 1g were removed. Then to dig into where the biomass andd abundances were distributed I’ve made some size bins to help tease out what the community looks like.

Group Summary Functions

The following function is used to process the same numbers for each combination of factor groups without me rewriting the function over and over again.

Same idea here, but for plotting them based on their groupings

All Data

These figures are helpful for just displaying how the overall size distributions have changed with abundance, biomass, or species type over the years:

Size Characteristics
Functional Groups

Regional Differences

These first plots get at how the different regions compare to one another. Are we seeing striking patterns across them all simultaneously, or are there localized patterns that only occur in some of them.

Length Bins
Weight Bins

Functional Group Differences

Thought it may be helpful to have a table here to check which species are currently assigned to which group.

Fishery Status Diffferences

Thought it may be helpful to have a table here to check which species are currently assigned to which group.

Community Size Spectra Results

Community size spectrum slopes were estimated using 2 methods for comparison. An individual size distribution approach ISD developed by A. Edwards and using a more traditional method of binning biomass information into bins before fitting a slope to those bins.

The first method uses the {sizeSpectra} package which were shown to be the most accurate when using simulated data compared to any of the binning methods.

The second method uses binned abundances, with bodymass bins of width 0.5 on a log10 scale, so 10^0 - 10^0.5 etc. These bins were then normalized by dividing the abundances by the bin width to account for the increasing bin width.

Starting data used for both methods was the same. A minimum individual biomass of 1 gram was used to avoid issue with sampling biases for smaller individuals. Area-stratified total abundances (and their corresponding length-based biomasses) were used to preserve the importance of the sampling design.

Pull Results from Both Methods

Biomass Data to Match Size Spectrum Analysis

Region & Season
Region & Functional Group

The first lens to look at is the seasonal variation across all the different areas. This was the typical grouping that we have been focusing on.1

The second lens that I feel is important is the functional groups. This is the not the typical grouping that we have been focusing on, but clears the air on where the biomass increase is coming from.

Edwards Methodology

The Edwards methodology differs from the other methods for estimating size spectra by avoiding the subjective decisions around how to bin data prior to fitting the log-linear regression.

Instead, Edwards uses the individual size distributions (how many individuals of any given length/weight). Abundances are totaled into discrete size bins based on expected biomass at length and length + 1, and individuals that fall within those bins are totaled to get abundances across a continuous distribution of individual bodymass.

The next difference is that the individual body size data is presumed to follow a bounded power-law distribution, with a minimum and maximum body size. Using the individual size data, maximum likelihood estimation is used to solve for the parameter (b) that minimizes the negative log-likelihood.

Year and Region Only
Year, Region, and Season

Odd Years

In the previous figure, for Georges Bank there are a number of years when b is close to -1.2 that do not follow the rest of the trend. These occur on 1974 and 1980. This section exists to see why that is:

Individual Size Distribution
Binned Biomass

The individual size distribution plots look like a lot but are not too complicated with a little explanation.

The blue bars are the width of what a given species biomass can be based on. What the species is, and its L-W relationship. Since fishes are measured to the nearest cm the left bound is the biomass for that length, and the right bound is how much mass it could also have before jumping into the next 1cm increment.

The height of the blue bars indicate how many individuals fall within that size range. Eventually building up the range of possible sizes caught, and how many there are/were.

The red line is a bounded power law fit to those blue lines. The power law relationship between abundance and size is the foundation for all the size spectra methods, in this case individual sizes are used rather than the more typical larger bins.

Data Prep

Plotting Function

Individual Size Distributions

Like any of these methods sometimes the assumed relationship does not perfectly fit the data. In the case of these two years there appears to be in the 10g-1kg range.

If I plot the same data using the binning approach you still see that the bins that cover 10^1 through 10^3 grams are higher than the fit line (analogous to red line for ISD). But, because the size distribution is no longer continuous, you can potentially run into issues of bins not accurately representing the data that goes into them.

Log10 Binning Results

For each of the slope/intercepts derived using a linear model and binned data I also pulled the adjusted r-square to get a sense of whether or not certain groups had poor fits that should be investigated.

Sea Surface Temperature

Long-Term Patterns

External Driver Timelines

Several external forces have been suggested as both drivers and/or contributors to changes in the composition of the Northeast Shelf fish community.

Among them are: * Changes in temperature regimes
* Historical fishing removal
* Climate Oscillations * Trophic Cascades

Each of these have different time horizons and periodicities associated with them. This timeline aims to place them on a common timeline to see the agreement or alignment between competing factors.

Historical periods related to fishing gear changes and the decline of the groundfish fishery can be found here:A brief History of the Groundfishing Industry of New England

`<!DOCTYPE html>

A work by Adam A. Kemberling

Akemberling@gmri.org

`{=html}

--- title: "NEFSC Trawl Bodymass Distribution Allocations" author: name: "Adam A. Kemberling" title: "Quantitative Research Associate" url: https://github.com/adamkemberling affiliation: Gulf of Maine Research Institute description: | Supplemental materials for size spectrum manuscript. date: "`r Sys.Date()`" format: html: self-contained: true code-fold: true code-tools: true df-print: kable toc: true toc-depth: 3 editor: source execute: echo: false warning: false message: false fig.align: "center" comment: "" --- ```{r setup} #### Packages #### library(targets) library(here) library(sf) library(gmRi) library(patchwork) library(gt) library(knitr) library(tidyverse) library(ggstream) # Support functions source(here("R/support/sizeSpectra_support.R")) # Set theme theme_set(theme_bw() + theme(legend.position = "bottom", strip.text = element_text(color = "white", face = "bold"), strip.background = element_rect( color = "white", fill = "#36454F", size = 0.75, linetype="solid"), legend.background = element_rect(fill = "transparent", color = "black"))) ``` # Patterns in Biomass Allocation The following code digs into where the biomass is allocated among the 62 species sampled by the NE Groundfish Survey. Stratified abundances are used. Individual lengths and bodymasses are used to isolate what proportion of the overall abundances and biomasses reside. Data is prepared and updated using {targets} to ensure a consistent data state and a reproducible workflow. **Target Data** Data for the report comes directly from the {targets} workflow found in `_targets.R`. For this markdown I'm loading in the results from the size spectrum slope analysis and the data that went into it so I can dig into any odd patterns I see. There is also regional SST based on the same trawl regions. ```{r} #### Data #### #### Size Spectrum Targets # SS and manual bins together withr::with_dir(rprojroot::find_root('_targets.R'), tar_load(size_spectrum_indices)) # Format Columns size_spectrum_indices <- size_spectrum_indices %>% mutate(season = fct_rev(season), survey_area = factor(survey_area, levels = c("GoM", "GB", "SNE", "MAB")), yr = as.numeric(as.character(Year))) #### OISST Data # OISST for all the regions withr::with_dir(rprojroot::find_root('_targets.R'), tar_load(regional_oisst)) #### Size Data Targets # 1. Biological data used as input withr::with_dir(rprojroot::find_root('_targets.R'), tar_load(catch_1g_labelled)) catch_size_bins <- catch_1g_labelled # # quick little format # catch_1g <- catch_1g %>% # mutate(fishery = case_when( # fishery == "com" ~ "Commercially Targeted", # fishery == "nc" ~ "Not Commercially Targeted", # TRUE ~ "Not Labelled")) ``` # Exploration of Abundance/Size Information The biological data was filtered prior to doing the size spectrum analysis so that any individuals smaller than 1g were removed. Then to dig into where the biomass andd abundances were distributed I've made some size bins to help tease out what the community looks like. ```{r} # # Cut up discrete length and weight bins # catch_size_bins <- catch_1g %>% # mutate( # length_bin = case_when( # length_cm <= 5 ~ "0 - 5cm", # length_cm <= 10 ~ "5 - 10cm", # length_cm <= 25 ~ "10 - 25cm", # length_cm <= 50 ~ "25 - 50cm", # length_cm <= 75 ~ "50 - 75cm", # length_cm <= 100 ~ "75 - 100cm", # length_cm >= 100 ~ ">= 100cm"), # length_bin = factor(length_bin, levels = c( # "0 - 5cm", "5 - 10cm", "10 - 25cm", # "25 - 50cm", "50 - 75cm", "75 - 100cm", ">= 100cm"))) # # # # Weight bins # catch_size_bins <- catch_size_bins %>% # mutate( # weight_bin = case_when( # ind_weight_kg <= 0.001 ~ "0 - 1g", # ind_weight_kg <= 0.005 ~ "1 - 5g", # ind_weight_kg <= 0.010 ~ "5 - 10g", # ind_weight_kg <= 0.050 ~ "10 - 50g", # ind_weight_kg <= 0.100 ~ "50 - 100g", # ind_weight_kg <= 0.500 ~ "100 - 500g", # ind_weight_kg <= 1.000 ~ ".5 - 1kg", # ind_weight_kg <= 5.000 ~ "1 - 2kg", # ind_weight_kg <= 5.000 ~ "2 - 5kg", # ind_weight_kg <= 10.00 ~ "5 - 10kg", # ind_weight_kg >= 10.00 ~ ">= 10kg" ), # weight_bin = factor(weight_bin, levels = c( # "0 - 1g", "1 - 5g", "5 - 10g", "10 - 50g", # "50 - 100g", "100 - 500g", ".5 - 1kg", # "1 - 2kg", "2 - 5kg", "5 - 10kg", ">= 10kg"))) # # # # Rename the functional groups # catch_size_bins <- catch_size_bins %>% # mutate( # spec_class = case_when( # spec_class == "gf" ~ "Groundfish", # spec_class == "pel" ~ "Pelagic", # spec_class == "dem" ~ "Demersal", # spec_class == "el" ~ "Elasmobranch", # spec_class == "<NA>" ~ "NA")) # # # Make regions go N->S # catch_size_bins <- catch_size_bins %>% # mutate(survey_area = factor(survey_area, levels = c("GoM", "GB", "SNE", "MAB")), # season = factor(season, levels = c("Spring", "Fall"))) ``` **Group Summary Functions** The following function is used to process the same numbers for each combination of factor groups without me rewriting the function over and over again. ```{r} # Function to process summaries for various factor combinations get_group_summaries <- function(...){ # Do some grouping to get totals group_totals <- catch_size_bins %>% group_by(...) %>% summarise(total_survey_catch = sum(numlen, na.rm = T), total_lw_bio = sum(sum_weight_kg, na.rm = T), total_strat_abund = sum(strat_total_abund_s, na.rm = T), total_strat_lw_bio = sum(strat_total_lwbio_s, na.rm = T), .groups = "drop") # length bins group_lengths <- catch_size_bins %>% group_by(..., length_bin) %>% summarise(lenbin_survey_catch = sum(numlen), lenbin_lw_bio = sum(sum_weight_kg), lenbin_strat_abund = sum(strat_total_abund_s), lenbin_strat_lw_bio = sum(strat_total_lwbio_s), .groups = "drop") %>% left_join(group_totals) %>% mutate( perc_total_catch = (lenbin_survey_catch - total_survey_catch) * 100, perc_lw_bio = (lenbin_lw_bio - total_lw_bio) * 100, perc_strat_abund = (lenbin_strat_abund - total_strat_abund) * 100, perc_strat_lw_bio = (lenbin_strat_lw_bio - total_strat_lw_bio) * 100) # weight bins group_weights <- catch_size_bins %>% group_by(..., weight_bin) %>% summarise(wtbin_survey_catch = sum(numlen), wtbin_lw_bio = sum(sum_weight_kg), wtbin_strat_abund = sum(strat_total_abund_s), wtbin_strat_lw_bio = sum(strat_total_lwbio_s), .groups = "drop") %>% left_join(group_totals) %>% mutate( perc_total_catch = (wtbin_survey_catch / total_survey_catch) * 100, perc_lw_bio = (wtbin_lw_bio / total_lw_bio) * 100, perc_strat_abund = (wtbin_strat_abund / total_strat_abund) * 100, perc_strat_lw_bio = (wtbin_strat_lw_bio / total_strat_lw_bio) * 100) return(list("length_bins" = drop_na(group_lengths), "weight_bins" = drop_na(group_weights))) } # Process each group regions <- get_group_summaries(Year, survey_area) seasons <- get_group_summaries(Year, season) fgroups <- get_group_summaries(Year, hare_group) fishery <- get_group_summaries(Year, fishery) ``` **Same idea here, but for plotting them based on their groupings** ```{r} # 1. Abundance by length abundance_by_length <- function(length_summary, facet_var){ ggplot(length_summary, aes(Year, lenbin_strat_abund, fill = length_bin)) + geom_col(position = "stack", color = "gray80", size = 0.1) + facet_wrap(as.formula(paste("~", facet_var)), scales = "free", ncol = 1) + scale_fill_gmri() + scale_y_continuous(labels = comma_format()) + labs(y = "Stratified Total Abundance", title = "Abundance Allocation by Length", x = "", fill = "Individual Length (cm)") } # test # abundance_by_length(regions$length_bins, "survey_area") # 2. Biomass by length biomass_by_length <- function(length_summary, facet_var){ ggplot(length_summary, aes(Year, lenbin_strat_lw_bio, fill = length_bin)) + geom_col(position = "stack", color = "gray80", size = 0.1) + facet_wrap(as.formula(paste("~", facet_var)), scales = "free", ncol = 1) + scale_fill_gmri() + scale_y_continuous(labels = comma_format()) + labs(y = "Stratified Total Biomass (kg)", title = "Biomass Allocation by Length", x = "", fill = "Individual Length (cm)") } # 3. Abundance by bodymass abundance_by_bodymass <- function(bodymass_summary, facet_var){ ggplot(bodymass_summary, aes(Year, wtbin_strat_abund, fill = weight_bin)) + geom_col(position = "stack", color = "gray80", size = 0.1) + facet_wrap(as.formula(paste("~", facet_var)), scales = "free", ncol = 1) + scale_fill_gmri() + scale_y_continuous(labels = comma_format()) + labs(y = "Stratified Total Abundance", title = "Abundance Allocation by Bodymass", x = "", fill = "Individual Weight (kg)") } # 4. Biomass across bodymass biomass_by_bodymass <- function(bodymass_summary, facet_var){ ggplot(bodymass_summary, aes(Year, wtbin_strat_lw_bio, fill = weight_bin)) + geom_col(position = "stack", color = "gray80", size = 0.1) + facet_wrap(as.formula(paste("~", facet_var)), scales = "free", ncol = 1) + scale_fill_gmri() + scale_y_continuous(labels = comma_format()) + labs(y = "Stratified Total Biomass (kg)", title = "Biomass Allocation by Bodymass", x = "", fill = "Individual Weight (kg)") } ``` ## All Data These figures are helpful for just displaying how the overall size distributions have changed with abundance, biomass, or species type over the years: ::: panel-tabset ### Size Characteristics ```{r} # make a column that contains all data catch_size_bins <- catch_size_bins %>% mutate(group_col = "All Data") # run summary all_data_summs <- get_group_summaries(Year, group_col) ``` ```{r all-data-stream-plots} #| fig.height: 8 #| fig.width: 8 # Stream Plots len_abund_p <- ggplot(all_data_summs$length_bins, aes(Year, lenbin_strat_abund, fill = length_bin)) + geom_stream(type = "ridge", color = "gray80", size = 0.1) + scale_fill_gmri() + facet_wrap(~group_col) + scale_y_continuous(expand = c(0,0)) + labs(x = "", y = "Stratified Total Abundance", title = "Length Distributions") + theme(axis.text.y = element_blank(), legend.position = "none") # Length Bins len_bio_p <- ggplot(all_data_summs$length_bins, aes(Year, lenbin_strat_lw_bio, fill = length_bin)) + geom_stream(type = "ridge", color = "gray80", size = 0.1) + scale_fill_gmri() + facet_wrap(~group_col) + scale_y_continuous(expand = c(0,0)) + labs(x = "", y = "Stratified Total Biomass", fill = "Length Bin") + theme(axis.text.y = element_blank()) + guides(fill = guide_legend(title.position = "top", title.hjust = 0.5)) # Weight Bins wt_abund_p <- ggplot(all_data_summs$weight_bins, aes(Year, wtbin_strat_abund, fill = weight_bin)) + geom_stream(type = "ridge", color = "gray80", size = 0.1) + scale_fill_gmri() + facet_wrap(~group_col) + scale_y_continuous(expand = c(0,0)) + labs(x = "", y = "Stratified Total Abundance", title = "Bodymass Distributions") + theme(axis.text.y = element_blank(), legend.position = "none") wt_bio_p <- ggplot(all_data_summs$weight_bins, aes(Year, wtbin_strat_lw_bio, fill = weight_bin)) + geom_stream(type = "ridge", color = "gray80", size = 0.1) + scale_fill_gmri() + facet_wrap(~group_col) + scale_y_continuous(expand = c(0,0)) + labs(x = "", y = "Stratified Total Biomass", fill = "Weight Bin") + theme(axis.text.y = element_blank()) + guides(fill = guide_legend(title.position = "top", title.hjust = 0.5)) # Plot it (len_abund_p / len_bio_p) | (wt_abund_p / wt_bio_p) ``` ### Functional Groups ```{r fgroup-stream-plots} #| fig.height: 6 #| fig.width: 8 all_fgroups <- catch_size_bins %>% drop_na(hare_group) %>% group_by(group_col, Year, hare_group) %>% summarise(total_survey_catch = sum(numlen, na.rm = T), total_lw_bio = sum(sum_weight_kg, na.rm = T), total_strat_abund = sum(strat_total_abund_s, na.rm = T), total_strat_lw_bio = sum(strat_total_lwbio_s, na.rm = T), .groups = "drop") # Abundance f_abund <- ggplot(all_fgroups, aes(Year, total_strat_abund, fill = hare_group)) + geom_stream(type = "ridge", color = "gray80", size = 0.1) + scale_fill_gmri() + facet_wrap(~group_col) + scale_y_continuous(expand = c(0,0)) + labs(x = "", y = "Stratified Total Abundance", fill = "Functional Group", title = "Relative Abundances") + theme(axis.text.y = element_blank()) + guides(fill = guide_legend(title.position = "top", title.hjust = 0.5)) # Biomass f_biom <- ggplot(all_fgroups, aes(Year, total_strat_lw_bio, fill = hare_group)) + geom_stream(type = "ridge", color = "gray80", size = 0.1) + scale_fill_gmri() + facet_wrap(~group_col) + scale_y_continuous(expand = c(0,0)) + labs(x = "", y = "Stratified Total Biomass", fill = "Functional Group", title = "Relative Community Biomass") + theme(axis.text.y = element_blank()) + guides(fill = guide_legend(title.position = "top", title.hjust = 0.5)) (f_abund | f_biom) + plot_layout(guides = "collect") ``` ::: ## Regional Differences These first plots get at how the different regions compare to one another. Are we seeing striking patterns across them all simultaneously, or are there localized patterns that only occur in some of them. ```{r} # Do some grouping regions <- get_group_summaries(Year, survey_area) ``` ::: panel-tabset ### Length Bins ```{r} # Get fraction for different length/weight classes # length bins regional_lengths <- regions$length_bins ``` ```{r} #| fig.height: 8 #| fig.width: 8 # Abundance by length abundance_by_length(regional_lengths, "survey_area") ``` ```{r} #| fig.height: 8 #| fig.width: 8 # biomass biomass_by_length(regional_lengths, "survey_area") ``` ### Weight Bins ```{r} # weight bins regional_weights <- regions$weight_bins ``` ```{r} #| fig.height: 8 #| fig.width: 8 abundance_by_bodymass(regional_weights, "survey_area") ``` ```{r} #| fig.height: 8 #| fig.width: 8 biomass_by_bodymass(regional_weights, "survey_area") ``` ::: ## Seasonal Differences ```{r} # Do some grouping seasons <- get_group_summaries(Year, season) ``` ::: panel-tabset ### Length Bins ```{r} # Get fraction for different length/weight classes # length bins seasonal_lengths <- seasons$length_bins ``` ```{r} #| fig.height: 8 #| fig.width: 8 abundance_by_length(seasonal_lengths, "season") ``` ```{r} #| fig.height: 8 #| fig.width: 8 biomass_by_length(seasonal_lengths, "season") ``` ### Weight Bins ```{r} # weight bins seasonal_weights <- seasons$weight_bins ``` ```{r} #| fig.height: 8 #| fig.width: 8 abundance_by_bodymass(seasonal_weights, "season") ``` ```{r} #| fig.height: 8 #| fig.width: 8 biomass_by_bodymass(seasonal_weights, "season") ``` ::: ## Functional Group Differences ```{r} # Do some grouping fgroups <- get_group_summaries(Year, hare_group) ``` ::: panel-tabset ### Length Bins ```{r} # Get fraction for different length/weight classes # length bins fgroup_lengths <- fgroups$length_bins ``` ```{r} #| fig.height: 8 #| fig.width: 8 abundance_by_length(fgroup_lengths, "hare_group") ``` ```{r} #| fig.height: 8 #| fig.width: 8 biomass_by_length(fgroup_lengths, "hare_group") ``` ### Weight Bins ```{r} # weight bins fgroup_weights <- fgroups$weight_bins ``` ```{r} #| fig.height: 8 #| fig.width: 8 abundance_by_bodymass(fgroup_weights, "hare_group") ``` ```{r} #| fig.height: 8 #| fig.width: 8 biomass_by_bodymass(fgroup_weights, "hare_group") ``` ### Functional Group Key Thought it may be helpful to have a table here to check which species are currently assigned to which group. ```{r} # Display table of whichever species fall into the categories catch_size_bins %>% distinct(comname, scientific_name, hare_group) %>% rename(`Common Name` = comname, `Scientific Name` = scientific_name, `Functional Group` = hare_group) %>% arrange(`Functional Group`, `Common Name`) %>% DT::datatable(filter = "top", rownames = FALSE) ``` ::: ## Fishery Status Diffferences ```{r} # Do some grouping fishery <- get_group_summaries(Year, fishery) ``` ::: panel-tabset ### Length Bins ```{r} # Get fraction for different length/weight classes # length bins fish_lengths <- fishery$length_bins ``` ```{r} #| fig.height: 8 #| fig.width: 8 abundance_by_length(fish_lengths, "fishery") ``` ```{r} #| fig.height: 8 #| fig.width: 8 # # Biomass from Survey using L-W biomass_by_length(fish_lengths, "fishery") ``` ### Weight Bins ```{r} # weight bins fish_weights <- fishery$weight_bins ``` ```{r} #| fig.height: 8 #| fig.width: 8 abundance_by_bodymass(fish_weights, "fishery") ``` ```{r} #| fig.height: 8 #| fig.width: 8 biomass_by_bodymass(fish_weights, "fishery") ``` ### Fishery Group Key Thought it may be helpful to have a table here to check which species are currently assigned to which group. ```{r} # Display table of whichever specis fall into the categories catch_size_bins %>% distinct(comname, scientific_name, fishery) %>% rename(`Common Name` = comname, `Scientific Name` = scientific_name, `Fishery Status` = fishery) %>% arrange(`Fishery Status`, `Common Name`) %>% DT::datatable(filter = "top", rownames = FALSE) ``` ::: --- # Community Size Spectra Results Community size spectrum slopes were estimated using 2 methods for comparison. An individual size distribution approach ISD developed by A. Edwards and using a more traditional method of binning biomass information into bins before fitting a slope to those bins. The first method uses the {sizeSpectra} package which were shown to be the most accurate when using simulated data compared to any of the binning methods. The second method uses binned abundances, with bodymass bins of width 0.5 on a log10 scale, so 10^0 - 10^0.5 etc. These bins were then normalized by dividing the abundances by the bin width to account for the increasing bin width. Starting data used for both methods was the same. A minimum individual biomass of 1 gram was used to avoid issue with sampling biases for smaller individuals. Area-stratified total abundances (and their corresponding length-based biomasses) were used to preserve the importance of the sampling design. **Pull Results from Both Methods** ```{r} # Pull the group ID for the slopes grouped on year, season, and region warmem_group_slopes <- size_spectrum_indices %>% filter(`group ID`== "single years * season * region") # Or just regions and years year_region_slopes <- size_spectrum_indices %>% filter(`group ID` == "single years * region") ``` ## Biomass Data to Match Size Spectrum Analysis ::: panel-tabset ### Region & Season The first lens to look at is the seasonal variation across all the different areas. This was the typical grouping that we have been focusing on.1 ```{r} # Do some grouping warmem_totals <- get_group_summaries(Year, survey_area, season) # Stream Plot Version ggplot(warmem_totals$weight_bins, aes(Year, wtbin_strat_lw_bio, fill = weight_bin)) + geom_stream(type = "ridge") + facet_grid(survey_area~season) + scale_y_continuous(labels = comma_format()) + scale_fill_gmri() + labs(y = "Stratified Total Biomass (kg)", title = "Biomass Allocation by Weight", x = "", fill = "Individual Weight (kg)") # # Bar graphs # ggplot(warmem_totals$weight_bins, # aes(Year, wtbin_strat_lw_bio, fill = weight_bin)) + # geom_col(position = "stack", color = "gray80", size = 0.1) + # facet_grid(survey_area~season, scales = "free") + # scale_fill_gmri() + # scale_y_continuous(labels = comma_format()) + # labs(y = "Stratified Total Biomass (kg)", # title = "Biomass Allocation by Weight", # x = "", # fill = "Individual Weight (kg)") ``` ### Region & Functional Group The second lens that I feel is important is the functional groups. This is the not the typical grouping that we have been focusing on, but clears the air on where the biomass increase is coming from. ```{r} # Do some grouping fgroup_area <- get_group_summaries(Year, survey_area, hare_group) ggplot(fgroup_area$weight_bins, aes(Year, wtbin_strat_lw_bio, fill = weight_bin)) + geom_col(position = "stack", color = "gray80", size = 0.1) + facet_grid(survey_area~hare_group, scales = "free") + scale_fill_gmri() + scale_y_continuous(labels = comma_format()) + labs(y = "Stratified Total Biomass (kg)", title = "Biomass Allocation by Weight", x = "", fill = "Individual Weight (kg)") ``` ::: ## Edwards Methodology The Edwards methodology differs from the other methods for estimating size spectra by avoiding the subjective decisions around how to bin data prior to fitting the log-linear regression. Instead, Edwards uses the individual size distributions (how many individuals of any given length/weight). Abundances are totaled into discrete size bins based on expected biomass at length and length + 1, and individuals that fall within those bins are totaled to get abundances across a continuous distribution of individual bodymass. The next difference is that the individual body size data is presumed to follow a bounded power-law distribution, with a minimum and maximum body size. Using the individual size data, maximum likelihood estimation is used to solve for the parameter (b) that minimizes the negative log-likelihood. ::: panel-tabset ### Year and Region Only ```{r} # Just years and regions (ss_patterns <- year_region_slopes %>% ggplot(aes(yr, b, color = survey_area)) + geom_line(aes(group = survey_area), linetype = 3) + geom_point(alpha = 0.6) + geom_smooth(method = "lm", formula = y ~ x, alpha = 0.2) + facet_wrap(~survey_area, ncol = 1) + scale_color_gmri() + labs(x = "", y = "Size Spectrum Slope (b)", color = "", title = "Results from Area-Stratified Abundance - Edwards Method")) ``` ### Year, Region, and Season ```{r} # Plot the sizeSpectra slopes - year*region*season (ss_patterns <- warmem_group_slopes %>% ggplot(aes(yr, b, color = survey_area)) + geom_line(aes(group = survey_area), linetype = 3) + geom_point(alpha = 0.6) + geom_smooth(method = "lm", formula = y ~ x, alpha = 0.2) + facet_grid(survey_area~season) + scale_color_gmri() + labs(x = "", y = "Size Spectrum Slope (b)", color = "", title = "Results from Area-Stratified Abundance - Edwards Method") ) ``` ::: ## Odd Years In the previous figure, for Georges Bank there are a number of years when b is close to -1.2 that do not follow the rest of the trend. These occur on 1974 and 1980. This section exists to see why that is: ```{r} # Find the years odd_years <- warmem_group_slopes %>% filter(survey_area == "GB", b < -1.2) %>% split(.$Year) # Find the data odd_year_dat <- catch_size_bins %>% filter(survey_area == "GB", Year %in% c(1974, 1980), season == "Fall") %>% split(.$Year) ``` ::: panel-tabset ### Individual Size Distribution The individual size distribution plots look like a lot but are not too complicated with a little explanation. The blue bars are the width of what a given species biomass can be based on. What the species is, and its L-W relationship. Since fishes are measured to the nearest cm the left bound is the biomass for that length, and the right bound is how much mass it could also have before jumping into the next 1cm increment. The height of the blue bars indicate how many individuals fall within that size range. Eventually building up the range of possible sizes caught, and how many there are/were. The red line is a bounded power law fit to those blue lines. The power law relationship between abundance and size is the foundation for all the size spectra methods, in this case individual sizes are used rather than the more typical larger bins. **Data Prep** ```{r} # ISD plot preparation # counts how many of each size there are with overlapping sizes counting # towards the discrete size bins when possible isd_prepped <- map(odd_year_dat, ~ isd_plot_prep(as.data.frame(.x), stratified_abundance = T, min_weight_g = 1)) ``` **Plotting Function** ```{r} # Function for ISD plots # should work on whatever the grouping variable is as long as the lists match #' @title Plot Individual Size Distribution Curves #' #' @param isd_data_prepped Data from isd_plot_prep #' @param mle_results Corresponding group results from group_mle_calc #' @param abundance_used Label to add for context of what abundance source was used #' @param plot_rects Flag to plot bodymass rectangles #' @param show_pl_fit Flag to include powerlaw fit #' @param xlim_global global xlimits so that other plots can be held side-by-side #' @param group_name name of the group for the data to use as label #' #' @return #' @export #' #' @examples ggplot_isd <- function(isd_data_prepped, mle_results, abundance_vals = "observed", plot_rects = TRUE, show_pl_fit = TRUE, xlim_global = NULL, group_name = NULL){ # Columns and labels abundance_lab <- c("observed" = "Survey Abundance", "stratified" = "Stratified Abundance") abundance_label <- abundance_lab[[abundance_vals]] # Power law parameters and summary details for the group of data: b.MLE <- mle_results$b total_abundance <- mle_results$n b.confMin <- mle_results$confMin b.confMax <- mle_results$confMax # Create range of x values from the group to get power law predictions # PLB = bounded power-law # min and max weights for predictions xmin <- mle_results$xmin xmax <- mle_results$xmax # Create x values (individual bodymass) to predict across # break up the Xlim into pieces between min and max x.PLB <- seq(from = xmin, to = xmax, length.out = 2000) # get the length of that vector x.PLB.length <- length(x.PLB) # remove last entry, add an entry .9999 of the way there, and cap it with the last entry wtf x.PLB <- c(x.PLB[-x.PLB.length], 0.99999 * x.PLB[x.PLB.length], x.PLB[x.PLB.length]) # Y values for plot limits/bounds/predictions from bounded power law pdf y.PLB <- (1 - sizeSpectra::pPLB(x = x.PLB, b = b.MLE, xmin = min(x.PLB), xmax = max(x.PLB))) * total_abundance y.PLB.confMin <- (1 - sizeSpectra::pPLB(x = x.PLB, b = b.confMin, xmin = min(x.PLB), xmax = max(x.PLB))) * total_abundance y.PLB.confMax <- (1 - sizeSpectra::pPLB(x = x.PLB, b = b.confMax, xmin = min(x.PLB), xmax = max(x.PLB))) * total_abundance # Put it in a df to make it easier PLB_df <- data.frame( x.PLB = x.PLB, y.PLB = y.PLB, confMin = y.PLB.confMin, confMax = y.PLB.confMax) # Coefficient Labels group_name <- str_replace_all(group_name, "_", " ") the_slope <- as.character(round(mle_results$b, 3)) #### First Plot p1 <- ggplot() # Toggle Rectangles if(plot_rects == TRUE) { p1 <- p1 + geom_rect(data = isd_data_prepped, aes(xmin = wmin_g, xmax = wmax_g, ymin = lowCount, ymax = highCount), color = "gray70", fill = "transparent")} # Add segments for bin widths p1 <- p1 + geom_segment(data = isd_data_prepped, aes(x = wmin_g, xend = wmax_g, y = countGTEwmin, yend = countGTEwmin), color = "blue") # Set limits to global limits if desired if(is.null(xlim_global) == FALSE){ p1 <- p1 + scale_x_log10(limits = xlim_global, labels = trans_format("log10", math_format(10^.x))) } else{ p1 <- p1 + scale_x_log10(labels = trans_format("log10", math_format(10^.x))) } # Toggle to turn on the fit lines or not if(show_pl_fit == TRUE) { p1 <- p1 + geom_line(data = PLB_df, aes(x.PLB, y.PLB), color = "darkred") + geom_line(data = PLB_df, aes(x.PLB, confMin), color = "darkred", linetype = 2) + geom_line(data = PLB_df, aes(x.PLB, confMax), color = "darkred", linetype = 2) } # Finish off labels p1 <- p1 + labs(x = "Individual Bodymass (g)", y = "Fishes with BodyMass \u2265 x", title = group_name, tag = str_c("b = ", the_slope), caption = str_c(abundance_label, " used for size spectrum slope estimation.")) + theme(plot.tag.position = "topright") #### Second Plot, log transformed y axis # Start plot p2 <- p1 # Change the y-axis, log transformation p2 <- p2 + scale_y_log10(labels = trans_format("log10", math_format(10^.x))) + labs(x = "Individual Bodymass (g)", y = "Fishes with BodyMass \u2265 x", title = group_name, tag = str_c("b = ", the_slope), caption = str_c(abundance_label, " used for size spectrum slope estimation.")) + theme(plot.tag.position = "topright") # Build the stacked plot p3 <- p1 / p2 # Put figures in list plot_list <- list(obs_y = p1, log10_y = p2, stacked = p3) return(plot_list) } ``` **Individual Size Distributions** Like any of these methods sometimes the assumed relationship does not perfectly fit the data. In the case of these two years there appears to be in the 10g-1kg range. ```{r} isd_1974 <- ggplot_isd(isd_data_prepped = isd_prepped$`1974`, mle_results = odd_years$`1974`, abundance_vals = "stratified", plot_rects = FALSE, show_pl_fit = TRUE, group_name = "1974, Fall, GB")$obs_y # isd_1980 <- ggplot_isd(isd_data_prepped = isd_prepped$`1980`, # mle_results = odd_years$`1974`, # abundance_vals = "stratified", # plot_rects = FALSE, # show_pl_fit = TRUE, # group_name = "1980, Fall, GB")$obs_y isd_1974 #/ isd_1980 ``` ### Binned Biomass If I plot the same data using the binning approach you still see that the bins that cover 10^1 through 10^3 grams are higher than the fit line (analogous to red line for ISD). But, because the size distribution is no longer continuous, you can potentially run into issues of bins not accurately representing the data that goes into them. ```{r} # Log10 binning, 0.5 increments on log10 scale # assign bins #l10_assigned <- map(odd_year_dat, ~ assign_log10_bins(.x)) l10_assigned <- odd_year_dat # plot bin structure for survey abundance and stratified abundance l10_1974 <- plot_log10_ss(l10_assigned = l10_assigned$`1974`) #l10_1980 <- plot_log10_ss(l10_assigned = l10_assigned$`1980`) # Grab just the stratified abundance one l10_1974[[2]] + labs(title = "1974, Fall, GB") #l10_1980[[2]] + labs(title = "1980, Fall, GB") ``` ::: ## Log10 Binning Results ::: panel-tabset ### Slope ```{r} # plot trends of log10 slopes (l10_patterns <- warmem_group_slopes %>% ggplot(aes(yr, l10_slope_strat, color = survey_area)) + geom_line(aes(group = survey_area), linetype = 3) + geom_point(alpha = 0.6) + # geom_smooth(method = "lm", formula = y ~ x, # alpha = 0.2) + facet_grid(survey_area~season) + scale_color_gmri() + labs(x = "", y = "Size Spectrum Slope (b)", color = "", title = "Results from Area-Stratified Abundance - log10 bins")) ``` ### Intercept ```{r} # plot trends of log10 slopes (int_patterns <- warmem_group_slopes %>% ggplot(aes(yr, l10_int_strat, color = survey_area)) + geom_line(aes(group = survey_area), linetype = 3) + geom_point(alpha = 0.6) + # geom_smooth(method = "lm", formula = y ~ x, # alpha = 0.2) + facet_grid(survey_area~season) + scale_color_gmri() + labs(x = "", y = "Size Spectrum Intercept", color = "", title = "Results from Area-Stratified Abundance - log10 bins") ) ``` ### Fit Statistics (adjusted r-squared) For each of the slope/intercepts derived using a linear model and binned data I also pulled the adjusted r-square to get a sense of whether or not certain groups had poor fits that should be investigated. ```{r} # Reshaping # Goal: Row for Years, columns for each different group l10_fit_dat <- warmem_group_slopes %>% select(Year, season, survey_area, l10_slope_strat, l10_int_strat, l10_rsqr_strat) %>% mutate(yr = as.numeric(Year)) l10_fit_dat %>% ggplot(aes(yr, l10_rsqr_strat, color = survey_area)) + geom_rect(xmin = -Inf, xmax = Inf, ymin = 0, ymax = 0.2, fill = "gray80", color = "transparent") + geom_hline(yintercept = 0.2, linetype = 2, size = 0.5, color = "red") + geom_line(aes(group = survey_area), linetype = 3) + geom_point(alpha = 0.6) + # geom_smooth(method = "lm", formula = y ~ x, # alpha = 0.2) + facet_grid(survey_area~season) + scale_color_gmri() + scale_y_continuous(limits = c(0,1), expand = c(0,0)) + labs(x = "", y = "Linear Model R-Square", color = "", title = "Results from Area-Stratified Abundance - log10 bins") ``` ::: ## Sea Surface Temperature ::: panel-tabset ### Long-Term Patterns ```{r} # plot trends of log10 slopes (sst_patterns <- regional_oisst %>% ggplot(aes(yr, sst_anom, color = survey_area)) + geom_line(aes(group = survey_area), linetype = 3) + geom_point(alpha = 0.6) + geom_smooth(method = "lm", formula = y ~ x, alpha = 0.2) + facet_grid(survey_area~.) + scale_color_gmri() + labs(x = "", y = "Mean Temperature Anomaly", color = "", title = "") ) ``` ::: # External Driver Timelines Several external forces have been suggested as both drivers and/or contributors to changes in the composition of the Northeast Shelf fish community. Among them are: * Changes in temperature regimes * Historical fishing removal * Climate Oscillations * Trophic Cascades Each of these have different time horizons and periodicities associated with them. This timeline aims to place them on a common timeline to see the agreement or alignment between competing factors. Historical periods related to fishing gear changes and the decline of the groundfish fishery can be found here:[A brief History of the Groundfishing Industry of New England](https://www.fisheries.noaa.gov/new-england-mid-atlantic/commercial-fishing/brief-history-groundfishing-industry-new-england#period-2.-cod-to-haddock-(1920-1930)) ```{r, fig.height=4} library(vistime) # Fishing History fishing_history <- data.frame( event = c("Introduction\nof Steam\nTrawlers", "Market\nExpansion", "Landings\nDecline", "Foreign\nFishing\nPressure", "2nd\nIndustrial\nRevolution", "Too Many\nFisherman,\nToo Few\nFish"), start = c("1906-01-01", "1920-01-01", "1930-01-01", "1960-01-01", "1977-01-01", "1985-01-01"), end = c("1919-12-31", "1929-12-31", "1959-12-31", "1976-12-31", "1984-12-31", "1994-12-31"), group = c("Fishing History") ) # SST History sst_history <- data.frame( event = c("Familiar Climate", "New Normal"), start = c("1982-01-01", "2009-01-01"), end = c("2008-12-31", "2020-12-31"), group = c("Sea Surface Temperature") ) # Survey History gf_history <- data.frame( event = c("Last Large\nGroundfish Caught", "Low Abundance", "Many\nSmall Fishes"), start = c("1971-01-01", "1990-01-01", "2010-01-01"), end = c("1989-12-31", "2008-12-31", "2018-12-31"), group = c("Survey Patterns") ) # Full Timeline timeline_data <- bind_rows(list(fishing_history, sst_history, gf_history)) %>% mutate(group = factor(group, levels = c("Fishing History", "Survey Patterns", "Sea Surface Temperature"))) # Plot the timeseries gg_vistime(timeline_data, optimize_y = T) # hc_vistime(timeline_data) ``` `r insert_gmri_footer()`