The following code digs into where the biomass is allocated among the 62 species sampled by the NE Groundfish Survey. Stratified abundances are used. Individual lengths and bodymasses are used to isolate what proportion of the overall abundances and biomasses reside.
Data is prepared and updated using {targets} to ensure a consistent data state and a reproducible workflow.
Target Data
Data for the report comes directly from the {targets} workflow found in _targets.R. For this markdown I’m loading in the results from the size spectrum slope analysis and the data that went into it so I can dig into any odd patterns I see. There is also regional SST based on the same trawl regions.
Exploration of Abundance/Size Information
The biological data was filtered prior to doing the size spectrum analysis so that any individuals smaller than 1g were removed. Then to dig into where the biomass andd abundances were distributed I’ve made some size bins to help tease out what the community looks like.
Group Summary Functions
The following function is used to process the same numbers for each combination of factor groups without me rewriting the function over and over again.
Same idea here, but for plotting them based on their groupings
All Data
These figures are helpful for just displaying how the overall size distributions have changed with abundance, biomass, or species type over the years:
These first plots get at how the different regions compare to one another. Are we seeing striking patterns across them all simultaneously, or are there localized patterns that only occur in some of them.
Thought it may be helpful to have a table here to check which species are currently assigned to which group.
Community Size Spectra Results
Community size spectrum slopes were estimated using 2 methods for comparison. An individual size distribution approach ISD developed by A. Edwards and using a more traditional method of binning biomass information into bins before fitting a slope to those bins.
The first method uses the {sizeSpectra} package which were shown to be the most accurate when using simulated data compared to any of the binning methods.
The second method uses binned abundances, with bodymass bins of width 0.5 on a log10 scale, so 10^0 - 10^0.5 etc. These bins were then normalized by dividing the abundances by the bin width to account for the increasing bin width.
Starting data used for both methods was the same. A minimum individual biomass of 1 gram was used to avoid issue with sampling biases for smaller individuals. Area-stratified total abundances (and their corresponding length-based biomasses) were used to preserve the importance of the sampling design.
The first lens to look at is the seasonal variation across all the different areas. This was the typical grouping that we have been focusing on.1
The second lens that I feel is important is the functional groups. This is the not the typical grouping that we have been focusing on, but clears the air on where the biomass increase is coming from.
Edwards Methodology
The Edwards methodology differs from the other methods for estimating size spectra by avoiding the subjective decisions around how to bin data prior to fitting the log-linear regression.
Instead, Edwards uses the individual size distributions (how many individuals of any given length/weight). Abundances are totaled into discrete size bins based on expected biomass at length and length + 1, and individuals that fall within those bins are totaled to get abundances across a continuous distribution of individual bodymass.
The next difference is that the individual body size data is presumed to follow a bounded power-law distribution, with a minimum and maximum body size. Using the individual size data, maximum likelihood estimation is used to solve for the parameter (b) that minimizes the negative log-likelihood.
In the previous figure, for Georges Bank there are a number of years when b is close to -1.2 that do not follow the rest of the trend. These occur on 1974 and 1980. This section exists to see why that is:
The individual size distribution plots look like a lot but are not too complicated with a little explanation.
The blue bars are the width of what a given species biomass can be based on. What the species is, and its L-W relationship. Since fishes are measured to the nearest cm the left bound is the biomass for that length, and the right bound is how much mass it could also have before jumping into the next 1cm increment.
The height of the blue bars indicate how many individuals fall within that size range. Eventually building up the range of possible sizes caught, and how many there are/were.
The red line is a bounded power law fit to those blue lines. The power law relationship between abundance and size is the foundation for all the size spectra methods, in this case individual sizes are used rather than the more typical larger bins.
Data Prep
Plotting Function
Individual Size Distributions
Like any of these methods sometimes the assumed relationship does not perfectly fit the data. In the case of these two years there appears to be in the 10g-1kg range.
If I plot the same data using the binning approach you still see that the bins that cover 10^1 through 10^3 grams are higher than the fit line (analogous to red line for ISD). But, because the size distribution is no longer continuous, you can potentially run into issues of bins not accurately representing the data that goes into them.
For each of the slope/intercepts derived using a linear model and binned data I also pulled the adjusted r-square to get a sense of whether or not certain groups had poor fits that should be investigated.
Several external forces have been suggested as both drivers and/or contributors to changes in the composition of the Northeast Shelf fish community.
Among them are: * Changes in temperature regimes
* Historical fishing removal
* Climate Oscillations * Trophic Cascades
Each of these have different time horizons and periodicities associated with them. This timeline aims to place them on a common timeline to see the agreement or alignment between competing factors.
---title: "NEFSC Trawl Bodymass Distribution Allocations"author: name: "Adam A. Kemberling" title: "Quantitative Research Associate" url: https://github.com/adamkemberling affiliation: Gulf of Maine Research Institutedescription: | Supplemental materials for size spectrum manuscript.date: "`r Sys.Date()`"format: html: self-contained: true code-fold: true code-tools: true df-print: kable toc: true toc-depth: 3editor: sourceexecute: echo: false warning: false message: false fig.align: "center" comment: ""---```{r setup}#### Packages ####library(targets)library(here)library(sf)library(gmRi)library(patchwork)library(gt)library(knitr)library(tidyverse)library(ggstream)# Support functionssource(here("R/support/sizeSpectra_support.R"))# Set theme theme_set(theme_bw() +theme(legend.position ="bottom",strip.text =element_text(color ="white", face ="bold"),strip.background =element_rect(color ="white", fill ="#36454F", size =0.75, linetype="solid"), legend.background =element_rect(fill ="transparent", color ="black")))```# Patterns in Biomass AllocationThe following code digs into where the biomass is allocated among the 62 species sampled by the NE Groundfish Survey. Stratified abundances are used. Individual lengths and bodymasses are used to isolate what proportion of the overall abundances and biomasses reside.Data is prepared and updated using {targets} to ensure a consistent data state and a reproducible workflow.**Target Data**Data for the report comes directly from the {targets} workflow found in `_targets.R`. For this markdown I'm loading in the results from the size spectrum slope analysis and the data that went into it so I can dig into any odd patterns I see. There is also regional SST based on the same trawl regions.```{r}#### Data ######## Size Spectrum Targets # SS and manual bins togetherwithr::with_dir(rprojroot::find_root('_targets.R'), tar_load(size_spectrum_indices)) # Format Columnssize_spectrum_indices <- size_spectrum_indices %>%mutate(season =fct_rev(season),survey_area =factor(survey_area, levels =c("GoM", "GB", "SNE", "MAB")),yr =as.numeric(as.character(Year)))#### OISST Data# OISST for all the regionswithr::with_dir(rprojroot::find_root('_targets.R'), tar_load(regional_oisst)) #### Size Data Targets# 1. Biological data used as inputwithr::with_dir(rprojroot::find_root('_targets.R'), tar_load(catch_1g_labelled)) catch_size_bins <- catch_1g_labelled# # quick little format# catch_1g <- catch_1g %>% # mutate(fishery = case_when(# fishery == "com" ~ "Commercially Targeted",# fishery == "nc" ~ "Not Commercially Targeted",# TRUE ~ "Not Labelled"))```# Exploration of Abundance/Size InformationThe biological data was filtered prior to doing the size spectrum analysis so that any individuals smaller than 1g were removed. Then to dig into where the biomass andd abundances were distributed I've made some size bins to help tease out what the community looks like.```{r}# # Cut up discrete length and weight bins# catch_size_bins <- catch_1g %>% # mutate(# length_bin = case_when(# length_cm <= 5 ~ "0 - 5cm",# length_cm <= 10 ~ "5 - 10cm",# length_cm <= 25 ~ "10 - 25cm",# length_cm <= 50 ~ "25 - 50cm",# length_cm <= 75 ~ "50 - 75cm",# length_cm <= 100 ~ "75 - 100cm",# length_cm >= 100 ~ ">= 100cm"),# length_bin = factor(length_bin, levels = c(# "0 - 5cm", "5 - 10cm", "10 - 25cm",# "25 - 50cm", "50 - 75cm", "75 - 100cm", ">= 100cm")))# # # # Weight bins# catch_size_bins <- catch_size_bins %>% # mutate(# weight_bin = case_when(# ind_weight_kg <= 0.001 ~ "0 - 1g",# ind_weight_kg <= 0.005 ~ "1 - 5g",# ind_weight_kg <= 0.010 ~ "5 - 10g",# ind_weight_kg <= 0.050 ~ "10 - 50g",# ind_weight_kg <= 0.100 ~ "50 - 100g",# ind_weight_kg <= 0.500 ~ "100 - 500g",# ind_weight_kg <= 1.000 ~ ".5 - 1kg",# ind_weight_kg <= 5.000 ~ "1 - 2kg",# ind_weight_kg <= 5.000 ~ "2 - 5kg",# ind_weight_kg <= 10.00 ~ "5 - 10kg",# ind_weight_kg >= 10.00 ~ ">= 10kg" ),# weight_bin = factor(weight_bin, levels = c(# "0 - 1g", "1 - 5g", "5 - 10g", "10 - 50g",# "50 - 100g", "100 - 500g", ".5 - 1kg",# "1 - 2kg", "2 - 5kg", "5 - 10kg", ">= 10kg")))# # # # Rename the functional groups# catch_size_bins <- catch_size_bins %>% # mutate(# spec_class = case_when(# spec_class == "gf" ~ "Groundfish",# spec_class == "pel" ~ "Pelagic",# spec_class == "dem" ~ "Demersal",# spec_class == "el" ~ "Elasmobranch",# spec_class == "<NA>" ~ "NA"))# # # Make regions go N->S# catch_size_bins <- catch_size_bins %>% # mutate(survey_area = factor(survey_area, levels = c("GoM", "GB", "SNE", "MAB")),# season = factor(season, levels = c("Spring", "Fall")))```**Group Summary Functions**The following function is used to process the same numbers for each combination of factor groups without me rewriting the function over and over again.```{r}# Function to process summaries for various factor combinationsget_group_summaries <-function(...){# Do some grouping to get totals group_totals <- catch_size_bins %>%group_by(...) %>%summarise(total_survey_catch =sum(numlen, na.rm = T),total_lw_bio =sum(sum_weight_kg, na.rm = T),total_strat_abund =sum(strat_total_abund_s, na.rm = T),total_strat_lw_bio =sum(strat_total_lwbio_s, na.rm = T), .groups ="drop") # length bins group_lengths <- catch_size_bins %>%group_by(..., length_bin) %>%summarise(lenbin_survey_catch =sum(numlen),lenbin_lw_bio =sum(sum_weight_kg),lenbin_strat_abund =sum(strat_total_abund_s),lenbin_strat_lw_bio =sum(strat_total_lwbio_s), .groups ="drop") %>%left_join(group_totals) %>%mutate(perc_total_catch = (lenbin_survey_catch - total_survey_catch) *100,perc_lw_bio = (lenbin_lw_bio - total_lw_bio) *100,perc_strat_abund = (lenbin_strat_abund - total_strat_abund) *100,perc_strat_lw_bio = (lenbin_strat_lw_bio - total_strat_lw_bio) *100)# weight bins group_weights <- catch_size_bins %>%group_by(..., weight_bin) %>%summarise(wtbin_survey_catch =sum(numlen),wtbin_lw_bio =sum(sum_weight_kg),wtbin_strat_abund =sum(strat_total_abund_s),wtbin_strat_lw_bio =sum(strat_total_lwbio_s),.groups ="drop") %>%left_join(group_totals) %>%mutate(perc_total_catch = (wtbin_survey_catch / total_survey_catch) *100,perc_lw_bio = (wtbin_lw_bio / total_lw_bio) *100,perc_strat_abund = (wtbin_strat_abund / total_strat_abund) *100,perc_strat_lw_bio = (wtbin_strat_lw_bio / total_strat_lw_bio) *100)return(list("length_bins"=drop_na(group_lengths),"weight_bins"=drop_na(group_weights)))}# Process each groupregions <-get_group_summaries(Year, survey_area)seasons <-get_group_summaries(Year, season)fgroups <-get_group_summaries(Year, hare_group)fishery <-get_group_summaries(Year, fishery)```**Same idea here, but for plotting them based on their groupings**```{r}# 1. Abundance by lengthabundance_by_length <-function(length_summary, facet_var){ggplot(length_summary,aes(Year, lenbin_strat_abund, fill = length_bin)) +geom_col(position ="stack", color ="gray80", size =0.1) +facet_wrap(as.formula(paste("~", facet_var)), scales ="free", ncol =1) +scale_fill_gmri() +scale_y_continuous(labels =comma_format()) +labs(y ="Stratified Total Abundance",title ="Abundance Allocation by Length",x ="",fill ="Individual Length (cm)")}# test# abundance_by_length(regions$length_bins, "survey_area")# 2. Biomass by lengthbiomass_by_length <-function(length_summary, facet_var){ggplot(length_summary,aes(Year, lenbin_strat_lw_bio, fill = length_bin)) +geom_col(position ="stack", color ="gray80", size =0.1) +facet_wrap(as.formula(paste("~", facet_var)), scales ="free", ncol =1) +scale_fill_gmri() +scale_y_continuous(labels =comma_format()) +labs(y ="Stratified Total Biomass (kg)",title ="Biomass Allocation by Length",x ="",fill ="Individual Length (cm)")}# 3. Abundance by bodymassabundance_by_bodymass <-function(bodymass_summary, facet_var){ggplot(bodymass_summary,aes(Year, wtbin_strat_abund, fill = weight_bin)) +geom_col(position ="stack", color ="gray80", size =0.1) +facet_wrap(as.formula(paste("~", facet_var)), scales ="free", ncol =1) +scale_fill_gmri() +scale_y_continuous(labels =comma_format()) +labs(y ="Stratified Total Abundance",title ="Abundance Allocation by Bodymass",x ="",fill ="Individual Weight (kg)")}# 4. Biomass across bodymassbiomass_by_bodymass <-function(bodymass_summary, facet_var){ggplot(bodymass_summary,aes(Year, wtbin_strat_lw_bio, fill = weight_bin)) +geom_col(position ="stack", color ="gray80", size =0.1) +facet_wrap(as.formula(paste("~", facet_var)), scales ="free", ncol =1) +scale_fill_gmri() +scale_y_continuous(labels =comma_format()) +labs(y ="Stratified Total Biomass (kg)",title ="Biomass Allocation by Bodymass",x ="",fill ="Individual Weight (kg)")}```## All Data These figures are helpful for just displaying how the overall size distributions have changed with abundance, biomass, or species type over the years:::: panel-tabset### Size Characteristics```{r}# make a column that contains all datacatch_size_bins <- catch_size_bins %>%mutate(group_col ="All Data") # run summaryall_data_summs <-get_group_summaries(Year, group_col)``````{r all-data-stream-plots}#| fig.height: 8#| fig.width: 8# Stream Plotslen_abund_p <-ggplot(all_data_summs$length_bins, aes(Year, lenbin_strat_abund, fill = length_bin)) +geom_stream(type ="ridge", color ="gray80", size =0.1) +scale_fill_gmri() +facet_wrap(~group_col) +scale_y_continuous(expand =c(0,0)) +labs(x ="", y ="Stratified Total Abundance", title ="Length Distributions") +theme(axis.text.y =element_blank(),legend.position ="none")# Length Binslen_bio_p <-ggplot(all_data_summs$length_bins, aes(Year, lenbin_strat_lw_bio, fill = length_bin)) +geom_stream(type ="ridge", color ="gray80", size =0.1) +scale_fill_gmri() +facet_wrap(~group_col) +scale_y_continuous(expand =c(0,0)) +labs(x ="", y ="Stratified Total Biomass", fill ="Length Bin") +theme(axis.text.y =element_blank()) +guides(fill =guide_legend(title.position ="top", title.hjust =0.5))# Weight Binswt_abund_p <-ggplot(all_data_summs$weight_bins, aes(Year, wtbin_strat_abund, fill = weight_bin)) +geom_stream(type ="ridge", color ="gray80", size =0.1) +scale_fill_gmri() +facet_wrap(~group_col) +scale_y_continuous(expand =c(0,0)) +labs(x ="", y ="Stratified Total Abundance", title ="Bodymass Distributions") +theme(axis.text.y =element_blank(),legend.position ="none")wt_bio_p <-ggplot(all_data_summs$weight_bins, aes(Year, wtbin_strat_lw_bio, fill = weight_bin)) +geom_stream(type ="ridge", color ="gray80", size =0.1) +scale_fill_gmri() +facet_wrap(~group_col) +scale_y_continuous(expand =c(0,0)) +labs(x ="", y ="Stratified Total Biomass", fill ="Weight Bin") +theme(axis.text.y =element_blank()) +guides(fill =guide_legend(title.position ="top", title.hjust =0.5))# Plot it(len_abund_p / len_bio_p) | (wt_abund_p / wt_bio_p)```### Functional Groups```{r fgroup-stream-plots}#| fig.height: 6#| fig.width: 8all_fgroups <- catch_size_bins %>%drop_na(hare_group) %>%group_by(group_col, Year, hare_group) %>%summarise(total_survey_catch =sum(numlen, na.rm = T),total_lw_bio =sum(sum_weight_kg, na.rm = T),total_strat_abund =sum(strat_total_abund_s, na.rm = T),total_strat_lw_bio =sum(strat_total_lwbio_s, na.rm = T), .groups ="drop") # Abundancef_abund <-ggplot(all_fgroups, aes(Year, total_strat_abund, fill = hare_group)) +geom_stream(type ="ridge", color ="gray80", size =0.1) +scale_fill_gmri() +facet_wrap(~group_col) +scale_y_continuous(expand =c(0,0)) +labs(x ="", y ="Stratified Total Abundance", fill ="Functional Group", title ="Relative Abundances") +theme(axis.text.y =element_blank()) +guides(fill =guide_legend(title.position ="top", title.hjust =0.5))# Biomassf_biom <-ggplot(all_fgroups, aes(Year, total_strat_lw_bio, fill = hare_group)) +geom_stream(type ="ridge", color ="gray80", size =0.1) +scale_fill_gmri() +facet_wrap(~group_col) +scale_y_continuous(expand =c(0,0)) +labs(x ="", y ="Stratified Total Biomass", fill ="Functional Group", title ="Relative Community Biomass") +theme(axis.text.y =element_blank()) +guides(fill =guide_legend(title.position ="top", title.hjust =0.5))(f_abund | f_biom) +plot_layout(guides ="collect")```:::## Regional DifferencesThese first plots get at how the different regions compare to one another. Are we seeing striking patterns across them all simultaneously, or are there localized patterns that only occur in some of them.```{r}# Do some groupingregions <-get_group_summaries(Year, survey_area)```::: panel-tabset### Length Bins```{r}# Get fraction for different length/weight classes# length binsregional_lengths <- regions$length_bins``````{r}#| fig.height: 8#| fig.width: 8# Abundance by lengthabundance_by_length(regional_lengths, "survey_area")``````{r}#| fig.height: 8#| fig.width: 8# biomassbiomass_by_length(regional_lengths, "survey_area")```### Weight Bins```{r}# weight binsregional_weights <- regions$weight_bins``````{r}#| fig.height: 8#| fig.width: 8abundance_by_bodymass(regional_weights, "survey_area")``````{r}#| fig.height: 8#| fig.width: 8biomass_by_bodymass(regional_weights, "survey_area")```:::## Seasonal Differences```{r}# Do some groupingseasons <-get_group_summaries(Year, season)```::: panel-tabset### Length Bins```{r}# Get fraction for different length/weight classes# length binsseasonal_lengths <- seasons$length_bins``````{r}#| fig.height: 8#| fig.width: 8abundance_by_length(seasonal_lengths, "season")``````{r}#| fig.height: 8#| fig.width: 8biomass_by_length(seasonal_lengths, "season")```### Weight Bins```{r}# weight binsseasonal_weights <- seasons$weight_bins``````{r}#| fig.height: 8#| fig.width: 8abundance_by_bodymass(seasonal_weights, "season")``````{r}#| fig.height: 8#| fig.width: 8biomass_by_bodymass(seasonal_weights, "season")```:::## Functional Group Differences```{r}# Do some groupingfgroups <-get_group_summaries(Year, hare_group)```::: panel-tabset### Length Bins```{r}# Get fraction for different length/weight classes# length binsfgroup_lengths <- fgroups$length_bins``````{r}#| fig.height: 8#| fig.width: 8abundance_by_length(fgroup_lengths, "hare_group")``````{r}#| fig.height: 8#| fig.width: 8biomass_by_length(fgroup_lengths, "hare_group")```### Weight Bins```{r}# weight binsfgroup_weights <- fgroups$weight_bins``````{r}#| fig.height: 8#| fig.width: 8abundance_by_bodymass(fgroup_weights, "hare_group")``````{r}#| fig.height: 8#| fig.width: 8biomass_by_bodymass(fgroup_weights, "hare_group")```### Functional Group KeyThought it may be helpful to have a table here to check which species are currently assigned to which group.```{r}# Display table of whichever species fall into the categoriescatch_size_bins %>%distinct(comname, scientific_name, hare_group) %>%rename(`Common Name`= comname, `Scientific Name`= scientific_name,`Functional Group`= hare_group) %>%arrange(`Functional Group`, `Common Name`) %>% DT::datatable(filter ="top", rownames =FALSE)```:::## Fishery Status Diffferences```{r}# Do some groupingfishery <-get_group_summaries(Year, fishery) ```::: panel-tabset### Length Bins```{r}# Get fraction for different length/weight classes# length binsfish_lengths <- fishery$length_bins``````{r}#| fig.height: 8#| fig.width: 8abundance_by_length(fish_lengths, "fishery")``````{r}#| fig.height: 8#| fig.width: 8# # Biomass from Survey using L-Wbiomass_by_length(fish_lengths, "fishery")```### Weight Bins```{r}# weight binsfish_weights <- fishery$weight_bins``````{r}#| fig.height: 8#| fig.width: 8abundance_by_bodymass(fish_weights, "fishery")``````{r}#| fig.height: 8#| fig.width: 8biomass_by_bodymass(fish_weights, "fishery")```### Fishery Group KeyThought it may be helpful to have a table here to check which species are currently assigned to which group.```{r}# Display table of whichever specis fall into the categoriescatch_size_bins %>%distinct(comname, scientific_name, fishery) %>%rename(`Common Name`= comname, `Scientific Name`= scientific_name,`Fishery Status`= fishery) %>%arrange(`Fishery Status`, `Common Name`) %>% DT::datatable(filter ="top", rownames =FALSE)```:::---# Community Size Spectra Results Community size spectrum slopes were estimated using 2 methods for comparison. An individual size distribution approach ISD developed by A. Edwards and using a more traditional method of binning biomass information into bins before fitting a slope to those bins.The first method uses the {sizeSpectra} package which were shown to be the most accurate when using simulated data compared to any of the binning methods. The second method uses binned abundances, with bodymass bins of width 0.5 on a log10 scale, so 10^0 - 10^0.5 etc. These bins were then normalized by dividing the abundances by the bin width to account for the increasing bin width.Starting data used for both methods was the same. A minimum individual biomass of 1 gram was used to avoid issue with sampling biases for smaller individuals. Area-stratified total abundances (and their corresponding length-based biomasses) were used to preserve the importance of the sampling design. **Pull Results from Both Methods**```{r}# Pull the group ID for the slopes grouped on year, season, and regionwarmem_group_slopes <- size_spectrum_indices %>%filter(`group ID`=="single years * season * region")# Or just regions and yearsyear_region_slopes <- size_spectrum_indices %>%filter(`group ID`=="single years * region")```## Biomass Data to Match Size Spectrum Analysis::: panel-tabset### Region & SeasonThe first lens to look at is the seasonal variation across all the different areas. This was the typical grouping that we have been focusing on.1```{r}# Do some groupingwarmem_totals <-get_group_summaries(Year, survey_area, season)# Stream Plot Versionggplot(warmem_totals$weight_bins, aes(Year, wtbin_strat_lw_bio, fill = weight_bin)) +geom_stream(type ="ridge") +facet_grid(survey_area~season) +scale_y_continuous(labels =comma_format()) +scale_fill_gmri() +labs(y ="Stratified Total Biomass (kg)", title ="Biomass Allocation by Weight",x ="", fill ="Individual Weight (kg)")# # Bar graphs# ggplot(warmem_totals$weight_bins, # aes(Year, wtbin_strat_lw_bio, fill = weight_bin)) +# geom_col(position = "stack", color = "gray80", size = 0.1) +# facet_grid(survey_area~season, scales = "free") +# scale_fill_gmri() +# scale_y_continuous(labels = comma_format()) +# labs(y = "Stratified Total Biomass (kg)", # title = "Biomass Allocation by Weight",# x = "", # fill = "Individual Weight (kg)")```### Region & Functional GroupThe second lens that I feel is important is the functional groups. This is the not the typical grouping that we have been focusing on, but clears the air on where the biomass increase is coming from.```{r}# Do some groupingfgroup_area <-get_group_summaries(Year, survey_area, hare_group)ggplot(fgroup_area$weight_bins, aes(Year, wtbin_strat_lw_bio, fill = weight_bin)) +geom_col(position ="stack", color ="gray80", size =0.1) +facet_grid(survey_area~hare_group, scales ="free") +scale_fill_gmri() +scale_y_continuous(labels =comma_format()) +labs(y ="Stratified Total Biomass (kg)", title ="Biomass Allocation by Weight",x ="", fill ="Individual Weight (kg)")```:::## Edwards MethodologyThe Edwards methodology differs from the other methods for estimating size spectra by avoiding the subjective decisions around how to bin data prior to fitting the log-linear regression.Instead, Edwards uses the individual size distributions (how many individuals of any given length/weight). Abundances are totaled into discrete size bins based on expected biomass at length and length + 1, and individuals that fall within those bins are totaled to get abundances across a continuous distribution of individual bodymass. The next difference is that the individual body size data is presumed to follow a bounded power-law distribution, with a minimum and maximum body size. Using the individual size data, maximum likelihood estimation is used to solve for the parameter (b) that minimizes the negative log-likelihood.::: panel-tabset### Year and Region Only```{r}# Just years and regions(ss_patterns <- year_region_slopes %>%ggplot(aes(yr, b, color = survey_area)) +geom_line(aes(group = survey_area), linetype =3) +geom_point(alpha =0.6) +geom_smooth(method ="lm", formula = y ~ x, alpha =0.2) +facet_wrap(~survey_area, ncol =1) +scale_color_gmri() +labs(x ="", y ="Size Spectrum Slope (b)", color ="",title ="Results from Area-Stratified Abundance - Edwards Method"))```### Year, Region, and Season ```{r}# Plot the sizeSpectra slopes - year*region*season(ss_patterns <- warmem_group_slopes %>%ggplot(aes(yr, b, color = survey_area)) +geom_line(aes(group = survey_area), linetype =3) +geom_point(alpha =0.6) +geom_smooth(method ="lm", formula = y ~ x, alpha =0.2) +facet_grid(survey_area~season) +scale_color_gmri() +labs(x ="", y ="Size Spectrum Slope (b)", color ="",title ="Results from Area-Stratified Abundance - Edwards Method") )```:::## Odd Years In the previous figure, for Georges Bank there are a number of years when b is close to -1.2 that do not follow the rest of the trend. These occur on 1974 and 1980. This section exists to see why that is:```{r}# Find the yearsodd_years <- warmem_group_slopes %>%filter(survey_area =="GB", b <-1.2) %>%split(.$Year)# Find the dataodd_year_dat <- catch_size_bins %>%filter(survey_area =="GB", Year %in%c(1974, 1980), season =="Fall") %>%split(.$Year)```::: panel-tabset### Individual Size DistributionThe individual size distribution plots look like a lot but are not too complicated with a little explanation.The blue bars are the width of what a given species biomass can be based on. What the species is, and its L-W relationship. Since fishes are measured to the nearest cm the left bound is the biomass for that length, and the right bound is how much mass it could also have before jumping into the next 1cm increment.The height of the blue bars indicate how many individuals fall within that size range. Eventually building up the range of possible sizes caught, and how many there are/were.The red line is a bounded power law fit to those blue lines. The power law relationship between abundance and size is the foundation for all the size spectra methods, in this case individual sizes are used rather than the more typical larger bins.**Data Prep**```{r}# ISD plot preparation# counts how many of each size there are with overlapping sizes counting# towards the discrete size bins when possibleisd_prepped <-map(odd_year_dat, ~isd_plot_prep(as.data.frame(.x),stratified_abundance = T,min_weight_g =1))```**Plotting Function**```{r}# Function for ISD plots # should work on whatever the grouping variable is as long as the lists match#' @title Plot Individual Size Distribution Curves#'#' @param isd_data_prepped Data from isd_plot_prep#' @param mle_results Corresponding group results from group_mle_calc#' @param abundance_used Label to add for context of what abundance source was used#' @param plot_rects Flag to plot bodymass rectangles#' @param show_pl_fit Flag to include powerlaw fit#' @param xlim_global global xlimits so that other plots can be held side-by-side#' @param group_name name of the group for the data to use as label#'#' @return#' @export#'#' @examplesggplot_isd <-function(isd_data_prepped, mle_results, abundance_vals ="observed",plot_rects =TRUE,show_pl_fit =TRUE,xlim_global =NULL,group_name =NULL){# Columns and labels abundance_lab <-c("observed"="Survey Abundance", "stratified"="Stratified Abundance") abundance_label <- abundance_lab[[abundance_vals]]# Power law parameters and summary details for the group of data: b.MLE <- mle_results$b total_abundance <- mle_results$n b.confMin <- mle_results$confMin b.confMax <- mle_results$confMax# Create range of x values from the group to get power law predictions# PLB = bounded power-law# min and max weights for predictions xmin <- mle_results$xmin xmax <- mle_results$xmax# Create x values (individual bodymass) to predict across# break up the Xlim into pieces between min and max x.PLB <-seq(from = xmin, to = xmax, length.out =2000) # get the length of that vector x.PLB.length <-length(x.PLB) # remove last entry, add an entry .9999 of the way there, and cap it with the last entry wtf x.PLB <-c(x.PLB[-x.PLB.length], 0.99999* x.PLB[x.PLB.length], x.PLB[x.PLB.length])# Y values for plot limits/bounds/predictions from bounded power law pdf y.PLB <- (1- sizeSpectra::pPLB(x = x.PLB, b = b.MLE, xmin =min(x.PLB), xmax =max(x.PLB))) * total_abundance y.PLB.confMin <- (1- sizeSpectra::pPLB(x = x.PLB, b = b.confMin, xmin =min(x.PLB), xmax =max(x.PLB))) * total_abundance y.PLB.confMax <- (1- sizeSpectra::pPLB(x = x.PLB, b = b.confMax, xmin =min(x.PLB), xmax =max(x.PLB))) * total_abundance# Put it in a df to make it easier PLB_df <-data.frame(x.PLB = x.PLB,y.PLB = y.PLB,confMin = y.PLB.confMin,confMax = y.PLB.confMax)# Coefficient Labels group_name <-str_replace_all(group_name, "_", " ") the_slope <-as.character(round(mle_results$b, 3))#### First Plot p1 <-ggplot()# Toggle Rectanglesif(plot_rects ==TRUE) { p1 <- p1 +geom_rect(data = isd_data_prepped, aes(xmin = wmin_g, xmax = wmax_g, ymin = lowCount, ymax = highCount),color ="gray70",fill ="transparent")}# Add segments for bin widths p1 <- p1 +geom_segment(data = isd_data_prepped,aes(x = wmin_g, xend = wmax_g, y = countGTEwmin, yend = countGTEwmin),color ="blue")# Set limits to global limits if desiredif(is.null(xlim_global) ==FALSE){ p1 <- p1 +scale_x_log10(limits = xlim_global, labels =trans_format("log10", math_format(10^.x))) } else{ p1 <- p1 +scale_x_log10(labels =trans_format("log10", math_format(10^.x))) }# Toggle to turn on the fit lines or notif(show_pl_fit ==TRUE) { p1 <- p1 +geom_line(data = PLB_df, aes(x.PLB, y.PLB), color ="darkred") +geom_line(data = PLB_df, aes(x.PLB, confMin), color ="darkred", linetype =2) +geom_line(data = PLB_df, aes(x.PLB, confMax), color ="darkred", linetype =2) }# Finish off labels p1 <- p1 +labs(x ="Individual Bodymass (g)",y ="Fishes with BodyMass \u2265 x",title = group_name, tag =str_c("b = ", the_slope),caption =str_c(abundance_label, " used for size spectrum slope estimation.")) +theme(plot.tag.position ="topright")#### Second Plot, log transformed y axis# Start plot p2 <- p1# Change the y-axis, log transformation p2 <- p2 +scale_y_log10(labels =trans_format("log10", math_format(10^.x))) +labs(x ="Individual Bodymass (g)",y ="Fishes with BodyMass \u2265 x",title = group_name, tag =str_c("b = ", the_slope),caption =str_c(abundance_label, " used for size spectrum slope estimation.")) +theme(plot.tag.position ="topright")# Build the stacked plot p3 <- p1 / p2# Put figures in list plot_list <-list(obs_y = p1,log10_y = p2,stacked = p3)return(plot_list)}```**Individual Size Distributions**Like any of these methods sometimes the assumed relationship does not perfectly fit the data. In the case of these two years there appears to be in the 10g-1kg range.```{r}isd_1974 <-ggplot_isd(isd_data_prepped = isd_prepped$`1974`, mle_results = odd_years$`1974`, abundance_vals ="stratified", plot_rects =FALSE, show_pl_fit =TRUE, group_name ="1974, Fall, GB")$obs_y# isd_1980 <- ggplot_isd(isd_data_prepped = isd_prepped$`1980`, # mle_results = odd_years$`1974`, # abundance_vals = "stratified", # plot_rects = FALSE, # show_pl_fit = TRUE, # group_name = "1980, Fall, GB")$obs_yisd_1974 #/ isd_1980```### Binned BiomassIf I plot the same data using the binning approach you still see that the bins that cover 10^1 through 10^3 grams are higher than the fit line (analogous to red line for ISD). But, because the size distribution is no longer continuous, you can potentially run into issues of bins not accurately representing the data that goes into them.```{r}# Log10 binning, 0.5 increments on log10 scale# assign bins#l10_assigned <- map(odd_year_dat, ~ assign_log10_bins(.x))l10_assigned <- odd_year_dat# plot bin structure for survey abundance and stratified abundancel10_1974 <-plot_log10_ss(l10_assigned = l10_assigned$`1974`) #l10_1980 <- plot_log10_ss(l10_assigned = l10_assigned$`1980`) # Grab just the stratified abundance onel10_1974[[2]] +labs(title ="1974, Fall, GB") #l10_1980[[2]] + labs(title = "1980, Fall, GB")```:::## Log10 Binning Results::: panel-tabset### Slope ```{r}# plot trends of log10 slopes(l10_patterns <- warmem_group_slopes %>%ggplot(aes(yr, l10_slope_strat, color = survey_area)) +geom_line(aes(group = survey_area), linetype =3) +geom_point(alpha =0.6) +# geom_smooth(method = "lm", formula = y ~ x, # alpha = 0.2) +facet_grid(survey_area~season) +scale_color_gmri() +labs(x ="", y ="Size Spectrum Slope (b)", color ="",title ="Results from Area-Stratified Abundance - log10 bins"))```### Intercept ```{r}# plot trends of log10 slopes(int_patterns <- warmem_group_slopes %>%ggplot(aes(yr, l10_int_strat, color = survey_area)) +geom_line(aes(group = survey_area), linetype =3) +geom_point(alpha =0.6) +# geom_smooth(method = "lm", formula = y ~ x, # alpha = 0.2) +facet_grid(survey_area~season) +scale_color_gmri() +labs(x ="", y ="Size Spectrum Intercept", color ="",title ="Results from Area-Stratified Abundance - log10 bins"))```### Fit Statistics (adjusted r-squared) For each of the slope/intercepts derived using a linear model and binned data I also pulled the adjusted r-square to get a sense of whether or not certain groups had poor fits that should be investigated.```{r}# Reshaping# Goal: Row for Years, columns for each different groupl10_fit_dat <- warmem_group_slopes %>%select(Year, season, survey_area, l10_slope_strat, l10_int_strat, l10_rsqr_strat) %>%mutate(yr =as.numeric(Year))l10_fit_dat %>%ggplot(aes(yr, l10_rsqr_strat, color = survey_area)) +geom_rect(xmin =-Inf, xmax =Inf, ymin =0, ymax =0.2, fill ="gray80", color ="transparent") +geom_hline(yintercept =0.2, linetype =2, size =0.5, color ="red") +geom_line(aes(group = survey_area), linetype =3) +geom_point(alpha =0.6) +# geom_smooth(method = "lm", formula = y ~ x, # alpha = 0.2) +facet_grid(survey_area~season) +scale_color_gmri() +scale_y_continuous(limits =c(0,1), expand =c(0,0)) +labs(x ="", y ="Linear Model R-Square", color ="",title ="Results from Area-Stratified Abundance - log10 bins")```:::## Sea Surface Temperature::: panel-tabset### Long-Term Patterns```{r}# plot trends of log10 slopes(sst_patterns <- regional_oisst %>%ggplot(aes(yr, sst_anom, color = survey_area)) +geom_line(aes(group = survey_area), linetype =3) +geom_point(alpha =0.6) +geom_smooth(method ="lm", formula = y ~ x, alpha =0.2) +facet_grid(survey_area~.) +scale_color_gmri() +labs(x ="", y ="Mean Temperature Anomaly", color ="",title =""))```:::# External Driver TimelinesSeveral external forces have been suggested as both drivers and/or contributors to changes in the composition of the Northeast Shelf fish community.Among them are: * Changes in temperature regimes * Historical fishing removal * Climate Oscillations * Trophic CascadesEach of these have different time horizons and periodicities associated with them. This timeline aims to place them on a common timeline to see the agreement or alignment between competing factors.Historical periods related to fishing gear changes and the decline of the groundfish fishery can be found here:[A brief History of the Groundfishing Industry of New England](https://www.fisheries.noaa.gov/new-england-mid-atlantic/commercial-fishing/brief-history-groundfishing-industry-new-england#period-2.-cod-to-haddock-(1920-1930))```{r, fig.height=4}library(vistime)# Fishing Historyfishing_history <-data.frame(event =c("Introduction\nof Steam\nTrawlers", "Market\nExpansion", "Landings\nDecline", "Foreign\nFishing\nPressure", "2nd\nIndustrial\nRevolution","Too Many\nFisherman,\nToo Few\nFish"),start =c("1906-01-01", "1920-01-01", "1930-01-01", "1960-01-01", "1977-01-01", "1985-01-01"),end =c("1919-12-31", "1929-12-31", "1959-12-31", "1976-12-31", "1984-12-31", "1994-12-31"),group =c("Fishing History"))# SST Historysst_history <-data.frame(event =c("Familiar Climate", "New Normal"),start =c("1982-01-01", "2009-01-01"),end =c("2008-12-31", "2020-12-31"),group =c("Sea Surface Temperature"))# Survey Historygf_history <-data.frame(event =c("Last Large\nGroundfish Caught", "Low Abundance", "Many\nSmall Fishes"),start =c("1971-01-01", "1990-01-01", "2010-01-01"),end =c("1989-12-31", "2008-12-31", "2018-12-31"),group =c("Survey Patterns"))# Full Timelinetimeline_data <-bind_rows(list(fishing_history, sst_history, gf_history)) %>%mutate(group =factor(group, levels =c("Fishing History", "Survey Patterns", "Sea Surface Temperature")))# Plot the timeseriesgg_vistime(timeline_data, optimize_y = T)# hc_vistime(timeline_data)````r insert_gmri_footer()`