Detailing the Maine Coastal Current Information with FVCOM
Published
March 4, 2025
Code
{library(raster)library(sf) library(fvcom) library(ncdf4) library(tidyverse)library(gmRi)library(patchwork)library(rnaturalearth)library(showtext)# Cyclic color palettes in scico# From: https://www.fabiocrameri.ch/colourmaps/library(scico)library(legendry)}# namespace conflictsconflicted::conflict_prefer("select", "dplyr")conflicted::conflict_prefer("filter", "dplyr")# Set the themetheme_set(theme_bw() +map_theme())# Project pathslob_ecol_path <-cs_path("mills", "Projects/Lobster ECOL")fvcom_path <-cs_path("res", "FVCOM/Lobster-ECOL")poly_paths <-cs_path("mills", "Projects/Lobster ECOL/Spatial_Defs")# Shapefilesnew_england <-ne_states("united states of america", returnclass ="sf") %>%filter(postal %in%c("VT", "ME", "RI", "MA", "CT", "NH", "NY", "MD", "VA", "NJ", "DE", "NC", "PA", "WV"))canada <-ne_states("canada", returnclass ="sf")# # # Support functions for FVCOMsource(here::here("R/FVCOM_Support.R"))
Code
# Use GMRI styleuse_gmri_style_rmd()
Daily Maine Coastal Current Information from FVCOM
The Gulf of Maine Coastal Current (GMCC) can be divided into to principal branches (Eastern Maine Coastal Current EMCC & Western Maine Coastal Current WMCC), separated by a junction near Penobscot bay.
This junction is important as it is a site where a variable portion of current flow shifts from continuing Southwest, and veers offshore.
This shift in current flow has important implications for nutrient and plankton distribution, and larval transport. This is why this section of the current flow is used as an indicator of general circulation patterns within the Gulf of Maine.
Code
# Loading FVCOM# Get some file namesfvcom_surfbot_files <-setNames(list.files(fvcom_path, full.names = T, pattern =".nc"),str_remove(list.files(fvcom_path, full.names = F, pattern =".nc"), ".nc"))# Open onegom3_early <-nc_open(fvcom_surfbot_files[[1]])# So the data is daily...# ncvar_get(gom3_early, varid = "surface_v", start = c(1,1), count = c(1,-1))# Get the mesh itself as a simple feature collectiongom3_mesh <-get_mesh_geometry(gom3_early, what ='lonlat')
Maine Coastal Current Region
The primary area of concern rgarding capturing the continuous flow of the Maine Coastal Current is the area off the coast of penobscot bay.
The following polygon was created as a boundary for subsetting the relevant data for calculating MCC indices.
We may want to extend our area east (Li et al 2022): > The EMCC generally bifurcates southeast of Mount Desert Island (D. Brooks & Townsend, 1989; D. Li et al., 2021; Luerssen et al., 2005; Pettigrew et al., 1998, 2005).
Code
# Make a polygon for the area of interest:#Original Areamcc_turnoff_coords <-tribble(~"lon", ~"lat",-69.34, 43.21, # Bottom Left-69.78, 43.62, # Off Popham-67.45, 44.34, # Off Jonesport-67.4, 43.8, # Bottom right-69.34, 43.21, # Bottom left again)# # Expanded Area?# mcc_turnoff_coords <- tribble(# ~"lon", ~"lat",# -69.75, 43, # Bottom Left# -70, 43.5, # Top Left# -67.25, 44.45, # Off Jonesport# -66.9, 43.95, # Bottom right# -69.75, 43, # Bottom left again# )# Make it a polygonmcc_turnoff_poly <-st_polygon(list(cbind(mcc_turnoff_coords$lon, mcc_turnoff_coords$lat))) %>%st_sfc(crs =4326) %>%st_as_sf() %>%mutate(area ="Maine Coastal Current Region") st_geometry(mcc_turnoff_poly) <-"geometry"
Using that boundary we can clip the FVCOM mesh and obtain the indices to use for pulling data from the many netcdf files.
Code
# Project polygonpoly_projected <-st_transform(mcc_turnoff_poly, st_crs(gom3_mesh)) # Turn off s2sf::sf_use_s2(FALSE)# Trim itmcc_studyarea_mesh <-mesh_trim(mesh = gom3_mesh, domain =st_as_sf(poly_projected) )# Map thatggplot() +geom_sf(data = new_england) +geom_sf(data = canada) +geom_sf(data = mcc_studyarea_mesh, color ="black", alpha =0.6) +geom_sf(data = mcc_turnoff_poly, fill ="orange", alpha =0.4) +theme_classic() +map_theme() +coord_sf(xlim =c(-70, -67.3), ylim =c(43.2, 44.7), expand = F) +labs(title ="MCC Turnoff Study Area - FVCOM Mesh Intersection")
Grab the Surface Current Information
The clipped mesh contains node and element index numbers. We will want to use these to pull variables for each of these locations. These will then be used for principal component analyses whos goal is to decompose the correlation structures between all these locations.
Pull Daily Surface Current Values
The data obtained from the Chen lab directly contains surface current vector information, and does not contain siglev coordinates.
The code to pull these variables from these files is slightly different for this reason (mainly just one less dimension to provide “start” and “count” arguments for within ncdf4::ncvar_get())
Daily current information gives us the option of estimating a more refined/targeted seasonal index and would give us higher resolution throughout the year.
Literature suggests connectivity is highest in the winter (concurrent with offshore EMCC veering) and again in spring/summer (time of strongest EMCC flow).
During late-fall and early winter the inshore flow reverses*
Both the circulation and particle tracking models suggested that the connectivity generally peaks twice annually, highest in winter and then secondarily in late spring or early summer. The former is concurrent with the most southwest offshore veering of the EMCC, while the latter is concurrent with the strongest EMCC. Moreover, the counter-WMCC can reduce the connectivity and result in year-to-year variations. Li et al. 2022
Code
#' @title Extract Netcdf Data with Index List#' #' @description Subsets data from the results of ncvar_get(). Accepts single or#' multiple indexes for one or more dimensions using a list. List order should#' match the relevant dimensions for the variable.#' #' If no indexing is supplied for a relevant dimension, all elements are returned.#'#' @param nc_array #' @param index_list #'#' @returns#' @export#'#' @examplessubset_nc_var <-function(nc_array, index_list) {# Get the number of dimensions dims <-length(dim(nc_array))# Create a list of indices, defaulting to ":" (keeping all elements) full_index_list <-rep(list(quote(expr = )), dims) # Update with user-specified indicesfor (dim_idx inseq_along(index_list)) {if (!is.null(index_list[[dim_idx]])) { full_index_list[[dim_idx]] <- index_list[[dim_idx]] } }# Use do.call to apply indexing dynamically extracted_data <-do.call(`[`, c(list(nc_array), full_index_list, list(drop =TRUE)))return(extracted_data)}
Code
# Function to grab each of them for a netcdf connection:# Loop through daily surfbot files and pull values for the proper nodesget_elem_timeseries <-function(fpath, elem_list){# Open (lazy-load) the netcdf connection fvcom_x <-nc_open(fpath)# Time dimension info time_dim <-ncvar_get(fvcom_x, "Times")# Get U u_array <-ncvar_get(nc = fvcom_x, varid ="surface_u", start =c(1, 1),count =c(-1, -1))# Get V v_array <-ncvar_get(nc = fvcom_x, varid ="surface_v", start =c(1, 1),count =c(-1, -1))# Get Lon/Lat lon_array <-ncvar_get(fvcom_x, varid ="lonc", start =c(1), count =c(-1)) lat_array <-ncvar_get(fvcom_x, varid ="latc", start =c(1), count =c(-1))# Close connection to the netcdf:nc_close(fvcom_x)# Start index (just grab all the way through) daily_df <-map_dfr(elem_list, function(elem_x){ elem_lon <- lon_array[[elem_x]] elem_lat <- lat_array[[elem_x]]data.frame("time"= time_dim,"lonc"= elem_lon,"latc"= elem_lat,"u"=subset_nc_var(u_array, list(elem_x)) ,"v"=subset_nc_var(v_array, list(elem_x)) ) }, .id ="elem")# Return the tablemessage(str_c("Completed: ", fpath))return(daily_df)}
Code
# Get the elem numbers we care about for ncvar_get()mcc_elems <-unique(mcc_studyarea_mesh$elem) %>%setNames(unique(mcc_studyarea_mesh$elem))# # Test one#t1 <- get_elem_timeseries(fpath = fvcom_surfbot_files[[1]], elem_list = mcc_elems[1])# Run them alldaily_mcc_surface_currents <-map_dfr(.x = fvcom_surfbot_files,.f =~get_elem_timeseries(fpath = .x, elem_list = mcc_elems))# Save themwrite_csv( daily_mcc_surface_currents, here::here("data/daily_mcc_surface_currents_vectors.csv"))
Visual Inspection of Current Vectors
If we load in the table(s) of the element values processed above, we can start to look what the current vectors look like on a map.
Current vectors can be visualized easily using ggplot with metR::geom_vector().
Confirm Daily Records
The following figure shows the daily Eastward and Northward velocities at centroid #48733.
Code
# Load daily:daily_mcc_sc <-read_csv(here::here("data/daily_mcc_surface_currents_vectors.csv")) %>%mutate(time =as.Date(time),label = time,year =str_sub(label, 1,4),month =as.numeric(str_sub(label, 6, 7)))# Plot the velocities for an element as a checkdaily_mcc_sc %>%filter(elem ==48733) %>%filter(between(time, as.Date("1978-01-01"), as.Date("2019-12-31"))) %>%ggplot() +geom_line(aes(time, u, color ="Eastward Velocity (u)"), alpha =0.3) +geom_line(aes(time, v, color ="Northward Velovity (v)"), alpha =0.3) +labs(title ="Confirming Daily Records",subtitle ="centroid index: 48733" )
Mapping Long-term Averages
Before proceeding too far, here is what the average flow direction is across the domain of interest to us:
latc/lonc coordinate pairs for the centers of each triangle provide a unique ID for each element.
When transposed we can create a matrix where each triangular element value has one row per month (or daily, whatever timestep), and one column for each variable (eastward velocity, northward etc.). From this matrix we can perform a Principal Component Analysis (PCA) returning 2 or more principal components that explain a share of the variance in the matrix.
Question: What variables to include/exclude for PCA
We can perform the PCA using only one or more of the current variables, essentially just the water flow characteristics. Or we could include temperature and salinity and focus more on the water mass characteristics.
The primary variables of interest for this step are:
u The Daily Eastward Velocity in meters s-1
v The Daily Northward Water Velocity in meters s-1
temperature and salinity may also be used to determine/delineate the different water masses.
Method/Approach Questions:
Take a step back, understand the flow before trying to relate the PCA back.
Our aim is generate a timeseries that is either a direct measure or a good proxy for MCC flow continuity. We care about this because flow characteristics determine whether lobster larva are transported along the coast or advected offshore.
In chatting with Damien, there may not be 1 best way to get the meaningful information out of the different FVCOM variables available.
It may make sense to tailor the data we use and the index we generate to lobster question:
This could mean limitieng the time span, to include the time of year when lobster larva are dispersing.
Simple Test: PCA Approach
The following approach uses Northward (v) and Eastward(u) water velocity of the surface layer for the principal component analysis.
Code
# For Area weighted PCA, get mesh element areas# Get the surface area of all the elementselem_areas_df <-map_dfr(split(mcc_studyarea_mesh, mcc_studyarea_mesh$elem),function(x){data.frame("elem"= x$elem, "area"=st_area(x))})# A. Surface current variables (u,v)pca_mat <- daily_mcc_sc %>%# Join surface areas and weight u + v# left_join(elem_areas_df) %>% # mutate(across(c(u,v), ~.x*sqrt(area))) %>% # Weight the current values by the areasselect(time, elem, u, v) %>%pivot_wider(names_from = elem, values_from =c(u, v)) %>%column_to_rownames("time")# # Inspect# pca_mat[1:6,1:6]# Do PCA# If all columns have comparable variances and you want to retain absolute differences in current strength, center but do not scale.mcc_pca <-prcomp(pca_mat, scale. =TRUE, center =TRUE)
PC Loadings
The following maps show how the two principal component rotations/loadings (the same if matrix is scaled) relate to areas along the study area.
Code
# Summary - Proportion of variancemcc_pca_summ <-summary(mcc_pca)mcc_pca_summ$importance[1:2, 1:2]
PC1 PC2
Standard deviation 21.45602 18.97389
Proportion of Variance 0.39482 0.30876
# Pull Principal Components overall valuesmcc_pcomponents <-data.frame("date"=rownames(pca_mat),"PC1"= mcc_pca$x[,"PC1"],"PC2"= mcc_pca$x[,"PC2"]) %>%mutate(date =as.Date(date), mon = lubridate::month(date), yr = lubridate::year(date))# Make a monthly summary, with a 12-month rolling averagemcc_pc_monthly <- mcc_pcomponents %>%pivot_longer(starts_with("PC"), names_to ="Principal Component", values_to ="vals") %>%group_by(yr, mon, `Principal Component`) %>%summarise(vals =mean(vals),.groups ="drop") %>%mutate(date =as.Date(str_c( yr,str_pad(mon, side ="left", width =2, pad ="0"),"15",sep ="-"))) %>%group_by(`Principal Component`) %>%arrange(date) %>%mutate(vals_rollmean = zoo::rollmean(vals, k =12, fill =NA, align ="center"))# Plot daily with the monthly smoothmcc_pc_monthly %>%ggplot() +geom_line(aes(date, vals, color =`Principal Component`),linewidth =0.4, alpha =0.4) +geom_line(aes(date, vals_rollmean, color =`Principal Component`),linetype =1) + rcartocolor::scale_color_carto_d() +labs(title ="MCC Surface Current PCA Timeseries",y ="Principal Component Value",x ="Date",color ="12-month average")
Validation Exercise: Pettigrew 2005 Years
I need to figure out how to relate the principal components to the open/closed gate language of pettigrew 2005.
This study used 1998-2001 as the study years, and characterized strong offshore veering in 1998 and strong continuous flow in 2000.
These years may be useful for characterizing the relationship between the current principal components. Here is what the current directions were like during those years:
Choose the 12-month periods where principal components are at their peaks. Make maps of what the current directions over that same period of time.
Code
# Get the dates when the 12 month average are highestmcc_peaks <- mcc_pc_monthly %>%select(-vals) %>%filter(date >as.Date("1990-01-01")) %>%pivot_wider(names_from =`Principal Component`, values_from = vals_rollmean) %>%mutate(PC_state =case_when( PC1 ==max(PC1, na.rm = T) ~"PC1 High", PC2 ==max(PC2, na.rm = T) ~"PC2 High", PC1 ==min(PC1, na.rm = T) ~"PC1 low", PC2 ==min(PC2, na.rm = T) ~"PC2 low",TRUE~NA)) %>%filter(is.na(PC_state) ==FALSE)# Plot the timeseries with those breaksmcc_pc_monthly %>%ggplot(aes(date, vals, color =`Principal Component`)) +geom_line(aes(date, vals, color =`Principal Component`),linewidth =0.4, alpha =0.4) +geom_line(aes(date, vals_rollmean, color =`Principal Component`),linetype =1) +geom_vline(data = mcc_peaks,aes(xintercept = date), color ="gray30", linetype =2) + rcartocolor::scale_color_carto_d() +labs(title ="MCC Surface Current PCA Timeseries",subtitle ="Peak States in 12-month average (dashed line)",y ="Principal Component Value",x ="Date")
Code
# Get the average currents for the 12 months around those breakspeak_directions <- mcc_peaks %>%split(.$date) %>%map_dfr(function(peak_date){# Get the start-end date brackets for the year start_month <- peak_date$date - lubridate::dmonths(6) end_month <- peak_date$date + lubridate::dmonths(6) yr <- peak_date$yr mon <-str_pad(peak_date$mon, width =2, side ="left", pad ="0") state <- peak_date$PC_state# Filter to that period, get average conditions daily_mcc_sc %>%filter(between(time, start_month, end_month)) %>%group_by(lonc, latc) %>%summarise( across(.cols =c(u,v), .fns =~mean(.x, na.rm = T)),.groups ="drop") %>%mutate(# Radian to degree conversionangle =atan2(v, u) *180/ pi, angle =if_else(angle<0, angle+360, angle),speed =sqrt(u^2+ v^2), dx = u / speed *0.05, dy = v / speed *0.05) %>%mutate(start_month = start_month, end_month = end_month, PC_state = state,ymon =str_c(yr, mon, sep ="-"),.before ="lonc")# End the function })# Map the current directions at those timespeak_direction_maps <-ggplot(peak_directions) +geom_sf(data = new_england) +geom_segment(aes( lonc, latc, xend = lonc + dx, yend = latc + dy,color = angle),arrow =arrow(length =unit(0.05, "cm")),linewidth =0.5) +theme_minimal() +theme(legend.position ="right") +coord_sf(xlim =c(-70, -67.3), ylim =c(43.2, 44.7), expand = F) +facet_wrap(~PC_state*ymon, ncol =2) +scale_color_scico(palette ='romaO',breaks =c(45, 90, 135, 180, 225, 270, 315, 360),labels =c("NE", "N", "NW", "W", "SW", "S", "SE", "E"),limits =c(0,360)) +guides(color =guide_colring(reverse = T, start = pi/2)) +labs(title ="Peak/Trough Periods in PC Timeseries",subtitle ="Average Surface Current Direction", x ="Longitude", y ="Latitude",color ="Current Direction")# Stack thempeak_direction_maps
PCA on Angular Direction and Speed Directly?
We could try coding the PCA to use angle and speed directly rather than the Northward and Eastward velocity components.
The cosine and sine components of the current’s angular direction preserve the cyclical nature of the currents.
Code
# Get the Cyclic X and Y components using trigangular_pca_df <- daily_mcc_sc %>%mutate(angle_rad =atan2(v, u), # Compute angle in radianscos_theta =cos(angle_rad), # Cyclic X componentsin_theta =sin(angle_rad), # Cyclic Y componentspeed =sqrt(u^2+ v^2) # Optional: current speed )# A. Pivot wider to get those three things by the mesh elementangular_pca_mat <- angular_pca_df %>%select(time, elem, cos_theta, sin_theta, speed) %>%pivot_wider(names_from = elem, values_from =c(cos_theta, sin_theta, speed), names_sep ="X") %>%column_to_rownames("time")# Now run PCA using cos_theta and sin_theta instead of u, vangular_pca_result <-prcomp(angular_pca_mat, center =TRUE, scale. =TRUE)
Angular PC Loadings
The following maps show how the two principal component rotations/loadings (the same if matrix is scaled) relate to areas along the study area.
Code
# Summary - Proportion of varianceangular_pca_summ <-summary(angular_pca_result)angular_pca_summ$importance[1:2, 1:2]
PC1 PC2
Standard deviation 21.23591 19.03156
Proportion of Variance 0.25784 0.20709
# Pull Principal Components overall valuesangular_pcomponents <-data.frame("date"=rownames(angular_pca_mat),"PC1"= angular_pca_result$x[,"PC1"],"PC2"= angular_pca_result$x[,"PC2"]) %>%mutate(date =as.Date(date), mon = lubridate::month(date), yr = lubridate::year(date))# Make a monthly summary:angular_pc_monthly <- angular_pcomponents %>%pivot_longer(starts_with("PC"), names_to ="Principal Component", values_to ="vals") %>%group_by(yr, mon, `Principal Component`) %>%summarise(vals =mean(vals),.groups ="drop") %>%mutate(date =as.Date(str_c( yr,str_pad(mon, side ="left", width =2, pad ="0"),"15",sep ="-"))) %>%group_by(`Principal Component`) %>%arrange(date) %>%mutate(vals_rollmean = zoo::rollmean(vals, k =12, fill =NA, align ="center"))
Validation Exercise: Check PC Timeseries Peaks
Code
# Get the dates when the 12 month average are highestangular_peaks <- angular_pc_monthly %>%select(-vals) %>%filter(date >as.Date("1990-01-01")) %>%pivot_wider(names_from =`Principal Component`, values_from = vals_rollmean) %>%mutate(PC_state =case_when( PC1 ==max(PC1, na.rm = T) ~"PC1 High", PC2 ==max(PC2, na.rm = T) ~"PC2 High", PC1 ==min(PC1, na.rm = T) ~"PC1 low", PC2 ==min(PC2, na.rm = T) ~"PC2 low",TRUE~NA)) %>%filter(is.na(PC_state) ==FALSE)# Timeseriesangular_monthly_timeseries <- angular_pc_monthly %>%ggplot(aes(date, vals, color =`Principal Component`)) +geom_line(aes(date, vals, color =`Principal Component`),linewidth =0.4, alpha =0.4) +geom_line(aes(date, vals_rollmean, color =`Principal Component`),linetype =1) +geom_vline(data =data.frame(angular_peaks),aes(xintercept = date), color ="gray30", linetype =2) + rcartocolor::scale_color_carto_d() +labs(title ="MCC Angular PCA Timeseries",subtitle ="Peak States in 12-month average (dashed line)",y ="Principal Component Value",x ="Date")angular_monthly_timeseries
Code
# 12-month average of lead-in# Get the average currents for the 12 months around those breaksangular_peak_directions <- angular_peaks %>%split(.$date) %>%map_dfr(function(peak_date){# Get the start-end date brackets for the year start_month <- peak_date$date - lubridate::dmonths(6) end_month <- peak_date$date + lubridate::dmonths(6) yr <- peak_date$yr state <- peak_date$PC_state mon <-str_pad(peak_date$mon, width =2, side ="left", pad ="0")# Filter to that period, get average conditions daily_mcc_sc %>%filter(between(time, start_month, end_month)) %>%group_by(lonc, latc) %>%summarise( across(.cols =c(u,v), .fns =~mean(.x, na.rm = T)),.groups ="drop") %>%mutate(# Radian to degree conversionangle =atan2(v, u) *180/ pi, angle =if_else(angle<0, angle+360, angle),speed =sqrt(u^2+ v^2), dx = u / speed *0.05, dy = v / speed *0.05) %>%mutate(start_month = start_month, end_month = end_month, PC_state = state,ymon =str_c(yr, mon, sep ="-"),.before ="lonc")# End the function })# Map thempeak_direction_maps <-ggplot(angular_peak_directions) +geom_sf(data = new_england) +geom_segment(aes( lonc, latc, xend = lonc + dx, yend = latc + dy,color = angle),arrow =arrow(length =unit(0.05, "cm")),linewidth =0.5) +theme_minimal() +theme(legend.position ="right") +coord_sf(xlim =c(-70, -67.3), ylim =c(43.2, 44.7), expand = F) +facet_wrap(~PC_state*ymon, ncol =2) +scale_color_scico(palette ='romaO',breaks =c(45, 90, 135, 180, 225, 270, 315, 360),labels =c("NE", "N", "NW", "W", "SW", "S", "SE", "E"),limits =c(0,360)) +guides(color =guide_colring(reverse = T, start = pi/2)) +labs(title ="Peak/Trough Periods in AngularPC Timeseries",subtitle ="Average Surface Current Direction", x ="Longitude", y ="Latitude",color ="Current Direction")# Stack thempeak_direction_maps
Alternative Approaches: Area-Flux
An alternative way to potentially get at MCC flow continuity is by measuring water volume flux directly for some transect along this area, and potentially use that instead of the PCA approach.
We could frame it as either the total southwestern flux, or if its clear/consistent where positionally the current veers south, we could measure southward flux in that area.
It would be super interesting to correlate the surface volume flux with the river output from the penobscot.
Source Code
---title: "Daily MCC Metrics"description: | Detailing the Maine Coastal Current Information with FVCOMdate: "Updated on: `r Sys.Date()`"format: html: code-fold: true code-tools: true df-print: kable self-contained: trueexecute: echo: true warning: false message: false fig.align: "center" comment: ""---```{r}{library(raster)library(sf) library(fvcom) library(ncdf4) library(tidyverse)library(gmRi)library(patchwork)library(rnaturalearth)library(showtext)# Cyclic color palettes in scico# From: https://www.fabiocrameri.ch/colourmaps/library(scico)library(legendry)}# namespace conflictsconflicted::conflict_prefer("select", "dplyr")conflicted::conflict_prefer("filter", "dplyr")# Set the themetheme_set(theme_bw() +map_theme())# Project pathslob_ecol_path <-cs_path("mills", "Projects/Lobster ECOL")fvcom_path <-cs_path("res", "FVCOM/Lobster-ECOL")poly_paths <-cs_path("mills", "Projects/Lobster ECOL/Spatial_Defs")# Shapefilesnew_england <-ne_states("united states of america", returnclass ="sf") %>%filter(postal %in%c("VT", "ME", "RI", "MA", "CT", "NH", "NY", "MD", "VA", "NJ", "DE", "NC", "PA", "WV"))canada <-ne_states("canada", returnclass ="sf")# # # Support functions for FVCOMsource(here::here("R/FVCOM_Support.R"))``````{r}#| label: style-sheet#| results: asis# Use GMRI styleuse_gmri_style_rmd()``````{r}#| label: fonts-config#| echo: false# Path to the directory containing the font file (replace with your actual path)font_dir <-paste0(system.file("stylesheets", package ="gmRi"), "/GMRI_fonts/Avenir/")# Register the fontfont_add(family ="Avenir",file.path(font_dir, "LTe50342.ttf"),bold =file.path(font_dir, "LTe50340.ttf"),italic =file.path(font_dir, "LTe50343.ttf"),bolditalic =file.path(font_dir, "LTe50347.ttf"))# Load the fontshowtext::showtext_auto()```# Daily Maine Coastal Current Information from FVCOMThe Gulf of Maine Coastal Current (GMCC) can be divided into to principal branches (Eastern Maine Coastal Current EMCC & Western Maine Coastal Current WMCC), separated by a junction near Penobscot bay. This junction is important as it is a site where a variable portion of current flow shifts from continuing Southwest, and veers offshore. This shift in current flow has important implications for nutrient and plankton distribution, and larval transport. This is why this section of the current flow is used as an indicator of general circulation patterns within the Gulf of Maine.```{r}#| label: load-daily-fvcom# Loading FVCOM# Get some file namesfvcom_surfbot_files <-setNames(list.files(fvcom_path, full.names = T, pattern =".nc"),str_remove(list.files(fvcom_path, full.names = F, pattern =".nc"), ".nc"))# Open onegom3_early <-nc_open(fvcom_surfbot_files[[1]])# So the data is daily...# ncvar_get(gom3_early, varid = "surface_v", start = c(1,1), count = c(1,-1))# Get the mesh itself as a simple feature collectiongom3_mesh <-get_mesh_geometry(gom3_early, what ='lonlat') ```### Maine Coastal Current RegionThe primary area of concern rgarding capturing the continuous flow of the Maine Coastal Current is the area off the coast of penobscot bay.The following polygon was created as a boundary for subsetting the relevant data for calculating MCC indices.We may want to extend our area east (Li et al 2022): > The EMCC generally bifurcates southeast of Mount Desert Island (D. Brooks & Townsend, 1989; D. Li et al., 2021; Luerssen et al., 2005; Pettigrew et al., 1998, 2005).```{r}# Make a polygon for the area of interest:#Original Areamcc_turnoff_coords <-tribble(~"lon", ~"lat",-69.34, 43.21, # Bottom Left-69.78, 43.62, # Off Popham-67.45, 44.34, # Off Jonesport-67.4, 43.8, # Bottom right-69.34, 43.21, # Bottom left again)# # Expanded Area?# mcc_turnoff_coords <- tribble(# ~"lon", ~"lat",# -69.75, 43, # Bottom Left# -70, 43.5, # Top Left# -67.25, 44.45, # Off Jonesport# -66.9, 43.95, # Bottom right# -69.75, 43, # Bottom left again# )# Make it a polygonmcc_turnoff_poly <-st_polygon(list(cbind(mcc_turnoff_coords$lon, mcc_turnoff_coords$lat))) %>%st_sfc(crs =4326) %>%st_as_sf() %>%mutate(area ="Maine Coastal Current Region") st_geometry(mcc_turnoff_poly) <-"geometry"```Using that boundary we can clip the FVCOM mesh and obtain the indices to use for pulling data from the many netcdf files.```{r}#| label: Trim Mesh to MCC area# Project polygonpoly_projected <-st_transform(mcc_turnoff_poly, st_crs(gom3_mesh)) # Turn off s2sf::sf_use_s2(FALSE)# Trim itmcc_studyarea_mesh <-mesh_trim(mesh = gom3_mesh, domain =st_as_sf(poly_projected) )# Map thatggplot() +geom_sf(data = new_england) +geom_sf(data = canada) +geom_sf(data = mcc_studyarea_mesh, color ="black", alpha =0.6) +geom_sf(data = mcc_turnoff_poly, fill ="orange", alpha =0.4) +theme_classic() +map_theme() +coord_sf(xlim =c(-70, -67.3), ylim =c(43.2, 44.7), expand = F) +labs(title ="MCC Turnoff Study Area - FVCOM Mesh Intersection")```## Grab the Surface Current InformationThe clipped mesh contains node and element index numbers. We will want to use these to pull variables for each of these locations. These will then be used for principal component analyses whos goal is to decompose the correlation structures between all these locations.### Pull Daily Surface Current ValuesThe data obtained from the Chen lab directly contains surface current vector information, and does not contain `siglev` coordinates. The code to pull these variables from these files is slightly different for this reason (mainly just one less dimension to provide "start" and "count" arguments for within `ncdf4::ncvar_get()`)Daily current information gives us the option of estimating a more refined/targeted seasonal index and would give us higher resolution throughout the year.Literature suggests connectivity is highest in the winter (concurrent with offshore EMCC veering) and again in spring/summer (time of strongest EMCC flow).During late-fall and early winter the inshore flow reverses*> Both the circulation and particle tracking models suggested that the connectivity generally peaks twice annually, highest in winter and then secondarily in late spring or early summer. The former is concurrent with the most southwest offshore veering of the EMCC, while the latter is concurrent with the strongest EMCC. Moreover, the counter-WMCC can reduce the connectivity and result in year-to-year variations. Li et al. 2022```{r}#' @title Extract Netcdf Data with Index List#' #' @description Subsets data from the results of ncvar_get(). Accepts single or#' multiple indexes for one or more dimensions using a list. List order should#' match the relevant dimensions for the variable.#' #' If no indexing is supplied for a relevant dimension, all elements are returned.#'#' @param nc_array #' @param index_list #'#' @returns#' @export#'#' @examplessubset_nc_var <-function(nc_array, index_list) {# Get the number of dimensions dims <-length(dim(nc_array))# Create a list of indices, defaulting to ":" (keeping all elements) full_index_list <-rep(list(quote(expr = )), dims) # Update with user-specified indicesfor (dim_idx inseq_along(index_list)) {if (!is.null(index_list[[dim_idx]])) { full_index_list[[dim_idx]] <- index_list[[dim_idx]] } }# Use do.call to apply indexing dynamically extracted_data <-do.call(`[`, c(list(nc_array), full_index_list, list(drop =TRUE)))return(extracted_data)}``````{r}#| label: subset daily currents within MCC area# Function to grab each of them for a netcdf connection:# Loop through daily surfbot files and pull values for the proper nodesget_elem_timeseries <-function(fpath, elem_list){# Open (lazy-load) the netcdf connection fvcom_x <-nc_open(fpath)# Time dimension info time_dim <-ncvar_get(fvcom_x, "Times")# Get U u_array <-ncvar_get(nc = fvcom_x, varid ="surface_u", start =c(1, 1),count =c(-1, -1))# Get V v_array <-ncvar_get(nc = fvcom_x, varid ="surface_v", start =c(1, 1),count =c(-1, -1))# Get Lon/Lat lon_array <-ncvar_get(fvcom_x, varid ="lonc", start =c(1), count =c(-1)) lat_array <-ncvar_get(fvcom_x, varid ="latc", start =c(1), count =c(-1))# Close connection to the netcdf:nc_close(fvcom_x)# Start index (just grab all the way through) daily_df <-map_dfr(elem_list, function(elem_x){ elem_lon <- lon_array[[elem_x]] elem_lat <- lat_array[[elem_x]]data.frame("time"= time_dim,"lonc"= elem_lon,"latc"= elem_lat,"u"=subset_nc_var(u_array, list(elem_x)) ,"v"=subset_nc_var(v_array, list(elem_x)) ) }, .id ="elem")# Return the tablemessage(str_c("Completed: ", fpath))return(daily_df)}``````{r}#| label: extract daily currents for elements within MCC area#| eval: false# Get the elem numbers we care about for ncvar_get()mcc_elems <-unique(mcc_studyarea_mesh$elem) %>%setNames(unique(mcc_studyarea_mesh$elem))# # Test one#t1 <- get_elem_timeseries(fpath = fvcom_surfbot_files[[1]], elem_list = mcc_elems[1])# Run them alldaily_mcc_surface_currents <-map_dfr(.x = fvcom_surfbot_files,.f =~get_elem_timeseries(fpath = .x, elem_list = mcc_elems))# Save themwrite_csv( daily_mcc_surface_currents, here::here("data/daily_mcc_surface_currents_vectors.csv"))```## Visual Inspection of Current VectorsIf we load in the table(s) of the element values processed above, we can start to look what the current vectors look like on a map.Current vectors can be visualized easily using ggplot with `metR::geom_vector()`.#### Confirm Daily RecordsThe following figure shows the daily Eastward and Northward velocities at centroid #48733.```{r}#| label: load daily current vector files#| eval: true# Load daily:daily_mcc_sc <-read_csv(here::here("data/daily_mcc_surface_currents_vectors.csv")) %>%mutate(time =as.Date(time),label = time,year =str_sub(label, 1,4),month =as.numeric(str_sub(label, 6, 7)))# Plot the velocities for an element as a checkdaily_mcc_sc %>%filter(elem ==48733) %>%filter(between(time, as.Date("1978-01-01"), as.Date("2019-12-31"))) %>%ggplot() +geom_line(aes(time, u, color ="Eastward Velocity (u)"), alpha =0.3) +geom_line(aes(time, v, color ="Northward Velovity (v)"), alpha =0.3) +labs(title ="Confirming Daily Records",subtitle ="centroid index: 48733" )```#### Mapping Long-term AveragesBefore proceeding too far, here is what the average flow direction is across the domain of interest to us:```{r}#| label: plot current direction#| fig-height: 10# Quick verification that directions make senseaverage_directions <- daily_mcc_sc %>%mutate(period = lubridate::month(time, label =TRUE)) %>%bind_rows(mutate(daily_mcc_sc, period ="Year Round")) %>%mutate(period =factor(period, levels =c(month.abb, "Year Round"))) %>%group_by(period, lonc, latc) %>%summarise(across(.cols =c(u,v), .fns =~mean(.x, na.rm = T))) %>%mutate(# Radian to degree conversionangle =atan2(v, u) *180/ pi, # Convert from radians to degreesangle =if_else(angle<0, angle+360, angle),speed =sqrt(u^2+ v^2), # Compute speed (magnitude)dx = u / speed *0.05, # Scale x-component for visualizationdy = v / speed *0.05# Scale y-component for visualization )# Map themggplot(average_directions) +geom_sf(data = new_england) +geom_segment(aes(lonc, latc, xend = lonc + dx, yend = latc + dy, color = angle),arrow =arrow(length =unit(0.03, "cm")),linewidth =0.25) +coord_sf(xlim =c(-70, -67.3), ylim =c(43.2, 44.7), expand = F) +facet_wrap(~period, ncol =3) +scale_color_scico(palette ='romaO',breaks =c(45, 90, 135, 180, 225, 270, 315, 360),labels =c("NE", "N", "NW", "W", "SW", "S", "SE", "E"),limits =c(0,360)) +guides(color =guide_colring(reverse = T, start = pi/2)) +theme(legend.position ="right") +labs(title ="Average Surface Current Direction", x ="Longitude", y ="Latitude")```## MCC Index from Principal Component Analysislatc/lonc coordinate pairs for the centers of each triangle provide a unique ID for each element. When transposed we can create a matrix where each triangular element value has one row per month (or daily, whatever timestep), and one column for each variable (eastward velocity, northward etc.). From this matrix we can perform a Principal Component Analysis (PCA) returning 2 or more principal components that explain a share of the variance in the matrix.**Question: What variables to include/exclude for PCA**We can perform the PCA using only one or more of the current variables, essentially just the water flow characteristics. **Or** we could include temperature and salinity and focus more on the water mass characteristics.[Code used for past indices looked at the vertically averaged Eastward current vector in isolation.](https://github.com/dzaugis/Ecosystem_Indicators/blob/6d21e553614cb06eb7ea02e4546535cf038d7678/Code/MCC_index_report.Rmd#L69-L76)**The primary variables of interest for this step are:** - `u` The Daily Eastward Velocity in meters s-1 - `v` The Daily Northward Water Velocity in meters s-1temperature and salinity may also be used to determine/delineate the different water masses.### Method/Approach Questions:Take a step back, understand the flow before trying to relate the PCA back.Our aim is generate a timeseries that is either a direct measure or a good proxy for MCC flow continuity. We care about this because flow characteristics determine whether lobster larva are transported along the coast or advected offshore.In chatting with Damien, there may not be 1 best way to get the meaningful information out of the different FVCOM variables available.It may make sense to tailor the data we use and the index we generate to lobster question: - This could mean limitieng the time span, to include the time of year when lobster larva are dispersing.# Simple Test: PCA ApproachThe following approach uses Northward (v) and Eastward(u) water velocity of the surface layer for the principal component analysis. ```{r}#| label: surface-currents-pca# For Area weighted PCA, get mesh element areas# Get the surface area of all the elementselem_areas_df <-map_dfr(split(mcc_studyarea_mesh, mcc_studyarea_mesh$elem),function(x){data.frame("elem"= x$elem, "area"=st_area(x))})# A. Surface current variables (u,v)pca_mat <- daily_mcc_sc %>%# Join surface areas and weight u + v# left_join(elem_areas_df) %>% # mutate(across(c(u,v), ~.x*sqrt(area))) %>% # Weight the current values by the areasselect(time, elem, u, v) %>%pivot_wider(names_from = elem, values_from =c(u, v)) %>%column_to_rownames("time")# # Inspect# pca_mat[1:6,1:6]# Do PCA# If all columns have comparable variances and you want to retain absolute differences in current strength, center but do not scale.mcc_pca <-prcomp(pca_mat, scale. =TRUE, center =TRUE)```### PC LoadingsThe following maps show how the two principal component rotations/loadings (the same if matrix is scaled) relate to areas along the study area.```{r}# Summary - Proportion of variancemcc_pca_summ <-summary(mcc_pca)mcc_pca_summ$importance[1:2, 1:2]# Pull + Reshape the Rotations / Loadingsmcc_loadings <-data.frame("PC1"= mcc_pca$rotation[,"PC1"],"PC2"= mcc_pca$rotation[,"PC2"]) %>%rownames_to_column("loc") %>%separate(col ="loc", into =c("var", "elem"), sep ="_") %>%pivot_longer(cols =starts_with("PC"), names_to ="PC", values_to ="PC_rotation") %>%mutate(longname =if_else(var =="u", "Eastward Velocity (u)", "Northward Velocity (v)"))# Map The Loadingsmcc_studyarea_mesh %>%mutate(elem =as.character(elem)) %>%left_join(mcc_loadings) %>%st_as_sf() %>%ggplot() +geom_sf(data = new_england) +geom_sf(aes(fill = PC_rotation)) +facet_grid(PC~longname) +coord_sf(xlim =c(-70, -67.3), ylim =c(43.2, 44.7), expand = F) +scale_fill_distiller(palette ="RdBu", limits =c(-0.06, 0.06)) +map_theme(legend.title.position ="top",legend.title =element_text(hjust =0.5),legend.position ="bottom")```### PC Timeseries```{r}# Pull Principal Components overall valuesmcc_pcomponents <-data.frame("date"=rownames(pca_mat),"PC1"= mcc_pca$x[,"PC1"],"PC2"= mcc_pca$x[,"PC2"]) %>%mutate(date =as.Date(date), mon = lubridate::month(date), yr = lubridate::year(date))# Make a monthly summary, with a 12-month rolling averagemcc_pc_monthly <- mcc_pcomponents %>%pivot_longer(starts_with("PC"), names_to ="Principal Component", values_to ="vals") %>%group_by(yr, mon, `Principal Component`) %>%summarise(vals =mean(vals),.groups ="drop") %>%mutate(date =as.Date(str_c( yr,str_pad(mon, side ="left", width =2, pad ="0"),"15",sep ="-"))) %>%group_by(`Principal Component`) %>%arrange(date) %>%mutate(vals_rollmean = zoo::rollmean(vals, k =12, fill =NA, align ="center"))# Plot daily with the monthly smoothmcc_pc_monthly %>%ggplot() +geom_line(aes(date, vals, color =`Principal Component`),linewidth =0.4, alpha =0.4) +geom_line(aes(date, vals_rollmean, color =`Principal Component`),linetype =1) + rcartocolor::scale_color_carto_d() +labs(title ="MCC Surface Current PCA Timeseries",y ="Principal Component Value",x ="Date",color ="12-month average")```### Validation Exercise: Pettigrew 2005 YearsI need to figure out how to relate the principal components to the open/closed gate language of pettigrew 2005.This study used 1998-2001 as the study years, and characterized strong offshore veering in 1998 and strong continuous flow in 2000.These years may be useful for characterizing the relationship between the current principal components. Here is what the current directions were like during those years:```{r}# Quick verification that directions make sensepettigrew_directions <- daily_mcc_sc %>%filter(between(time, as.Date("1998-01-01"), as.Date("2001-12-31"))) %>%mutate(year = lubridate::year(time)) %>%filter(year !=1999) %>%mutate(gate_position =case_match( year, 1998~"Gate Closed", 2000~"Gate Open", 2001~"Mixed")) %>%group_by(gate_position, year, lonc, latc) %>%summarise(across(.cols =c(u,v), .fns =~mean(.x, na.rm = T)),.groups ="drop") %>%mutate(# Radian to degree conversionangle =atan2(v, u) *180/ pi, # Convert from radians to degreesangle =if_else(angle<0, angle+360, angle),speed =sqrt(u^2+ v^2), # Compute speed (magnitude)dx = u / speed *0.05, # Scale x-component for visualizationdy = v / speed *0.05) # Scale y-component for visualization# Map themggplot(pettigrew_directions) +geom_sf(data = new_england) +geom_segment(aes( lonc, latc, xend = lonc + dx, yend = latc + dy,color = angle),arrow =arrow(length =unit(0.05, "cm")),linewidth =0.5) +theme_minimal() +theme(legend.position ="right") +coord_sf(xlim =c(-70, -67.3), ylim =c(43.2, 44.7), expand = F) +facet_wrap(~gate_position*year, ncol =2) +scale_color_scico(palette ='romaO',breaks =c(45, 90, 135, 180, 225, 270, 315, 360),labels =c("NE", "N", "NW", "W", "SW", "S", "SE", "E"),limits =c(0,360)) +guides(color =guide_colring(reverse = T, start = pi/2)) +labs(title ="MCC Pettigrew 2025: Gate Open/Closed/Mixed Example Years",subtitle ="Average Surface Current Direction", x ="Longitude", y ="Latitude",color ="Current Direction")```### Validation Exercise: PC Timeseries PeaksChoose the 12-month periods where principal components are at their peaks. Make maps of what the current directions over that same period of time. ```{r}# Get the dates when the 12 month average are highestmcc_peaks <- mcc_pc_monthly %>%select(-vals) %>%filter(date >as.Date("1990-01-01")) %>%pivot_wider(names_from =`Principal Component`, values_from = vals_rollmean) %>%mutate(PC_state =case_when( PC1 ==max(PC1, na.rm = T) ~"PC1 High", PC2 ==max(PC2, na.rm = T) ~"PC2 High", PC1 ==min(PC1, na.rm = T) ~"PC1 low", PC2 ==min(PC2, na.rm = T) ~"PC2 low",TRUE~NA)) %>%filter(is.na(PC_state) ==FALSE)# Plot the timeseries with those breaksmcc_pc_monthly %>%ggplot(aes(date, vals, color =`Principal Component`)) +geom_line(aes(date, vals, color =`Principal Component`),linewidth =0.4, alpha =0.4) +geom_line(aes(date, vals_rollmean, color =`Principal Component`),linetype =1) +geom_vline(data = mcc_peaks,aes(xintercept = date), color ="gray30", linetype =2) + rcartocolor::scale_color_carto_d() +labs(title ="MCC Surface Current PCA Timeseries",subtitle ="Peak States in 12-month average (dashed line)",y ="Principal Component Value",x ="Date")``````{r}# Get the average currents for the 12 months around those breakspeak_directions <- mcc_peaks %>%split(.$date) %>%map_dfr(function(peak_date){# Get the start-end date brackets for the year start_month <- peak_date$date - lubridate::dmonths(6) end_month <- peak_date$date + lubridate::dmonths(6) yr <- peak_date$yr mon <-str_pad(peak_date$mon, width =2, side ="left", pad ="0") state <- peak_date$PC_state# Filter to that period, get average conditions daily_mcc_sc %>%filter(between(time, start_month, end_month)) %>%group_by(lonc, latc) %>%summarise( across(.cols =c(u,v), .fns =~mean(.x, na.rm = T)),.groups ="drop") %>%mutate(# Radian to degree conversionangle =atan2(v, u) *180/ pi, angle =if_else(angle<0, angle+360, angle),speed =sqrt(u^2+ v^2), dx = u / speed *0.05, dy = v / speed *0.05) %>%mutate(start_month = start_month, end_month = end_month, PC_state = state,ymon =str_c(yr, mon, sep ="-"),.before ="lonc")# End the function })# Map the current directions at those timespeak_direction_maps <-ggplot(peak_directions) +geom_sf(data = new_england) +geom_segment(aes( lonc, latc, xend = lonc + dx, yend = latc + dy,color = angle),arrow =arrow(length =unit(0.05, "cm")),linewidth =0.5) +theme_minimal() +theme(legend.position ="right") +coord_sf(xlim =c(-70, -67.3), ylim =c(43.2, 44.7), expand = F) +facet_wrap(~PC_state*ymon, ncol =2) +scale_color_scico(palette ='romaO',breaks =c(45, 90, 135, 180, 225, 270, 315, 360),labels =c("NE", "N", "NW", "W", "SW", "S", "SE", "E"),limits =c(0,360)) +guides(color =guide_colring(reverse = T, start = pi/2)) +labs(title ="Peak/Trough Periods in PC Timeseries",subtitle ="Average Surface Current Direction", x ="Longitude", y ="Latitude",color ="Current Direction")# Stack thempeak_direction_maps```# PCA on Angular Direction and Speed Directly?We could try coding the PCA to use angle and speed directly rather than the Northward and Eastward velocity components.The cosine and sine components of the current's angular direction preserve the cyclical nature of the currents. ```{r}# Get the Cyclic X and Y components using trigangular_pca_df <- daily_mcc_sc %>%mutate(angle_rad =atan2(v, u), # Compute angle in radianscos_theta =cos(angle_rad), # Cyclic X componentsin_theta =sin(angle_rad), # Cyclic Y componentspeed =sqrt(u^2+ v^2) # Optional: current speed )# A. Pivot wider to get those three things by the mesh elementangular_pca_mat <- angular_pca_df %>%select(time, elem, cos_theta, sin_theta, speed) %>%pivot_wider(names_from = elem, values_from =c(cos_theta, sin_theta, speed), names_sep ="X") %>%column_to_rownames("time")# Now run PCA using cos_theta and sin_theta instead of u, vangular_pca_result <-prcomp(angular_pca_mat, center =TRUE, scale. =TRUE)```### Angular PC LoadingsThe following maps show how the two principal component rotations/loadings (the same if matrix is scaled) relate to areas along the study area.```{r}# Summary - Proportion of varianceangular_pca_summ <-summary(angular_pca_result)angular_pca_summ$importance[1:2, 1:2]# Pull + Reshape the Rotations / Loadingsangular_loadings <-data.frame("PC1"= angular_pca_result$rotation[,"PC1"],"PC2"= angular_pca_result$rotation[,"PC2"]) %>%rownames_to_column("loc") %>%separate(col ="loc", into =c("var", "elem"), sep ="X") %>%pivot_longer(cols =starts_with("PC"), names_to ="PC", values_to ="PC_rotation") %>%mutate(longname =case_match( var,"cos_theta"~"Cyclic X component","sin_theta"~"Cyclic Y component","speed"~"Current Speed"))# Map The Loadingsmcc_studyarea_mesh %>%mutate(elem =as.character(elem)) %>%left_join(angular_loadings) %>%st_as_sf() %>%ggplot() +geom_sf(data = new_england) +geom_sf(aes(fill = PC_rotation)) +facet_grid(PC~longname) +coord_sf(xlim =c(-70, -67.3), ylim =c(43.2, 44.7), expand = F) +scale_fill_distiller(palette ="RdBu", limits =c(-0.06, 0.06)) +map_theme(legend.title.position ="top",legend.title =element_text(hjust =0.5),legend.position ="bottom")```### Angular PC Timeseries```{r}# Pull Principal Components overall valuesangular_pcomponents <-data.frame("date"=rownames(angular_pca_mat),"PC1"= angular_pca_result$x[,"PC1"],"PC2"= angular_pca_result$x[,"PC2"]) %>%mutate(date =as.Date(date), mon = lubridate::month(date), yr = lubridate::year(date))# Make a monthly summary:angular_pc_monthly <- angular_pcomponents %>%pivot_longer(starts_with("PC"), names_to ="Principal Component", values_to ="vals") %>%group_by(yr, mon, `Principal Component`) %>%summarise(vals =mean(vals),.groups ="drop") %>%mutate(date =as.Date(str_c( yr,str_pad(mon, side ="left", width =2, pad ="0"),"15",sep ="-"))) %>%group_by(`Principal Component`) %>%arrange(date) %>%mutate(vals_rollmean = zoo::rollmean(vals, k =12, fill =NA, align ="center"))```### Validation Exercise: Check PC Timeseries Peaks```{r}# Get the dates when the 12 month average are highestangular_peaks <- angular_pc_monthly %>%select(-vals) %>%filter(date >as.Date("1990-01-01")) %>%pivot_wider(names_from =`Principal Component`, values_from = vals_rollmean) %>%mutate(PC_state =case_when( PC1 ==max(PC1, na.rm = T) ~"PC1 High", PC2 ==max(PC2, na.rm = T) ~"PC2 High", PC1 ==min(PC1, na.rm = T) ~"PC1 low", PC2 ==min(PC2, na.rm = T) ~"PC2 low",TRUE~NA)) %>%filter(is.na(PC_state) ==FALSE)# Timeseriesangular_monthly_timeseries <- angular_pc_monthly %>%ggplot(aes(date, vals, color =`Principal Component`)) +geom_line(aes(date, vals, color =`Principal Component`),linewidth =0.4, alpha =0.4) +geom_line(aes(date, vals_rollmean, color =`Principal Component`),linetype =1) +geom_vline(data =data.frame(angular_peaks),aes(xintercept = date), color ="gray30", linetype =2) + rcartocolor::scale_color_carto_d() +labs(title ="MCC Angular PCA Timeseries",subtitle ="Peak States in 12-month average (dashed line)",y ="Principal Component Value",x ="Date")angular_monthly_timeseries``````{r}# 12-month average of lead-in# Get the average currents for the 12 months around those breaksangular_peak_directions <- angular_peaks %>%split(.$date) %>%map_dfr(function(peak_date){# Get the start-end date brackets for the year start_month <- peak_date$date - lubridate::dmonths(6) end_month <- peak_date$date + lubridate::dmonths(6) yr <- peak_date$yr state <- peak_date$PC_state mon <-str_pad(peak_date$mon, width =2, side ="left", pad ="0")# Filter to that period, get average conditions daily_mcc_sc %>%filter(between(time, start_month, end_month)) %>%group_by(lonc, latc) %>%summarise( across(.cols =c(u,v), .fns =~mean(.x, na.rm = T)),.groups ="drop") %>%mutate(# Radian to degree conversionangle =atan2(v, u) *180/ pi, angle =if_else(angle<0, angle+360, angle),speed =sqrt(u^2+ v^2), dx = u / speed *0.05, dy = v / speed *0.05) %>%mutate(start_month = start_month, end_month = end_month, PC_state = state,ymon =str_c(yr, mon, sep ="-"),.before ="lonc")# End the function })# Map thempeak_direction_maps <-ggplot(angular_peak_directions) +geom_sf(data = new_england) +geom_segment(aes( lonc, latc, xend = lonc + dx, yend = latc + dy,color = angle),arrow =arrow(length =unit(0.05, "cm")),linewidth =0.5) +theme_minimal() +theme(legend.position ="right") +coord_sf(xlim =c(-70, -67.3), ylim =c(43.2, 44.7), expand = F) +facet_wrap(~PC_state*ymon, ncol =2) +scale_color_scico(palette ='romaO',breaks =c(45, 90, 135, 180, 225, 270, 315, 360),labels =c("NE", "N", "NW", "W", "SW", "S", "SE", "E"),limits =c(0,360)) +guides(color =guide_colring(reverse = T, start = pi/2)) +labs(title ="Peak/Trough Periods in AngularPC Timeseries",subtitle ="Average Surface Current Direction", x ="Longitude", y ="Latitude",color ="Current Direction")# Stack thempeak_direction_maps```---# Alternative Approaches: Area-FluxAn alternative way to potentially get at MCC flow continuity is by measuring water volume flux directly for some transect along this area, and potentially use that instead of the PCA approach.We could frame it as either the total southwestern flux, or if its clear/consistent where positionally the current veers south, we could measure southward flux in that area.It would be super interesting to correlate the surface volume flux with the river output from the penobscot.