We recently developed a fork of FRAM that allows FRAM runs to be executed in batches (https://github.com/FRAMverse/FRAM_automation). This can facilitate sensitivity analyses, in which we explore the consequences for fishery mortalities or ERs of changes in one or more key inputs of interest. Here we provide a tutorial for recently added functionality in framrsquared that can be used to generate runs for sensitivity analyses, which can then be batch-run in this new FRAM fork.
Overview
I begin by describing the fundamental framework underlying the sensitivity analysis functions, then walk through those functions themselves, and end with an example of using these functions to run a sensitivity analysis on Elwha for our new OOB work.
Philosophy and functions
Match/Replace dataframe approach
Fundamentally, any change to the FRAM database – including for a sensitivity analysis – needs to identify
- (1) what should be changed, and
- (2) what new values should be used.
The first of these requires identifying the table (e.g., perhaps the StockRecruit table), the columns to change (e.g., perhaps RecruitScaleFactor), and the rows to change (perhaps all rows with StockID 17 or 18, or all rows with Age 2, or some mixing and matching of conditions).
The philosophy I took in writing the modify_table() and sensitivity_*() functions was to identify both (1) and (2) in a systematic, flexible way. All sensitivity analysis functions rely on modify_table() under the hood; even though it is unlikely that we will use modify_table() directly, it is important to understand the structure of its input.
modify_table() answers both (1) and (2) above using a "match/replace" dataframe. Each column name must start with the prefix "match_" or "replace_", and the rest of the column name should be the name of a column in the relevant FRAM database table (with capitalization matching exactly). Any column whose label starts with "match_" will be used to identify the rows to change, and any column whose label starts with "replace_" provides the values that should be used. This is somewhat analogous to a join() framework, in that we have "match by" columns (in a join, we would pass these to the by = argument) and other columns that are not used for matching. The primary difference from a join framework is that rather than adding new columns to a dataframe, we are replacing values in place.
As an example, suppose we were asked to look at the consequences of changing the size limits for Chinook in the Area 7 sport fishery (Fishery ID 36). Currently those are 520 for each time step, but let's imagine the request was to see what happens if we increased the size limit to 600 in time step 1, left the size limit at 520 in time step 2, and decreased the size limit to 450 in time step 3. Time step 4 should match time step 1, so it should also be 600. Let's say we're supposed to test the effects on the last three years' final preseason runs, which have run IDs 28, 29, and 30. For reference, the size limits are specified in the MinimumSize column of the SizeLimits table.
We could specify those changes with the following match/replace dataframe:
## make the match/replace dataframe for a single run
mr_df = data.frame(match_FisheryID = c(36, 36, 36),
                   match_TimeStep = c(1, 3, 4),
                   replace_MinimumSize = c(600, 450, 600))
Before worrying about the run IDs, our match/replace dataframe looks like this:
mr_df
##   match_FisheryID match_TimeStep replace_MinimumSize
## 1              36              1                 600
## 2              36              3                 450
## 3              36              4                 600
If we used this as it is, as in modify_table(fram_db, "SizeLimits", mr_df), modify_table() would find all entries with fishery 36 and time step 1 or 4 and replace the MinimumSize with 600. Similarly, it would find all entries with fishery 36 and time step 3 and replace the MinimumSize with 450. We can constrain these changes to run IDs 28 through 30 by adding a match_RunID column:
mr_df = tidyr::expand_grid(match_RunID = 28:30, mr_df)
mr_df
## # A tibble: 9 × 4
##   match_RunID match_FisheryID match_TimeStep replace_MinimumSize
##         <int>           <dbl>          <dbl>               <dbl>
## 1          28              36              1                 600
## 2          28              36              3                 450
## 3          28              36              4                 600
## 4          29              36              1                 600
## 5          29              36              3                 450
## 6          29              36              4                 600
## 7          30              36              1                 600
## 8          30              36              3                 450
## 9          30              36              4                 600
This framework allows us to specify any number of changes (to a single table), with any kind of simple matching criterion. Further, the specified changes are easy to review, and they can easily be saved to a CSV or Excel file for later reference.
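For example, a one-line way to save the specified changes alongside the database for later review (the file name here is just illustrative):
## save the match/replace dataframe for later review
write.csv(mr_df, "area7_sizelimit_changes.csv", row.names = FALSE)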
Scaling
We may not always want to specify our new values in absolute terms, especially if we're looking at the joint effects of changing several values. The calc_from_scaling() function allows changes to be specified relative to current values, and returns the corresponding match/replace table. For example, this allows us to implement scenarios like "What if catch in these ten fisheries were 50% of what we thought it was? What if it was 150% of what we thought it was?" calc_from_scaling() takes the scenario as a modified version of the match/replace dataframe (with the prefix "scale_" instead of "replace_"), runs the calculations to find what the actual replacement values should be, and then returns a match/replace dataframe that can be fed into modify_table(). This is the framework behind sensitivity_scaled().
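As a sketch of that workflow, reusing the size-limit example purely to illustrate the structure: the scale dataframe uses the same match_ columns as before, with a scale_ column in place of the replace_ column. The argument order and names of calc_from_scaling() shown here are assumptions, so check its help page before relying on them.
## scale dataframe: raise the Area 7 sport minimum size by 10% in all
## time steps of run 30 (illustrative values only)
scale_df = data.frame(match_RunID = 30,
                      match_FisheryID = 36,
                      scale_MinimumSize = 1.1)
## calc_from_scaling() reads the current values from the database and
## converts the scale_ column into a replace_ column (argument order is
## an assumption here)
mr_df_scaled = calc_from_scaling(fram_db, "SizeLimits", scale_df)
## the result is an ordinary match/replace dataframe
modify_table(fram_db, "SizeLimits", mr_df_scaled)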
sensitivity_*() functions
As of writing, there are three functions designed to generate sensitivity analyses:
- sensitivity_exact() takes a sequence of exact values to be used in the sensitivity analyses. Useful for cases when, for example, we want to look at what happens if a recruit scaler varies between 0.1 and 10.
- sensitivity_scaled() takes a sequence of scaling values. Useful if we want to look at concurrent changes for subsets of stocks or fisheries in a way that is consistent with an existing run; e.g., "What would have happened last year if all coastal stocks were 10%, 20%, …, 200% of the values used in the post-season run?"
- sensitivity_custom() takes a list of match/replace dataframes. This is harder to use, but supports more complex changes like those needed to explore how changes in age composition affect ERs. I would recommend writing additional functions to programmatically generate those lists.
Each of these functions starts with a template run in the FRAM database, makes as many copies of that template run as necessary, changes each copy so that the set of runs spans the spectrum of values needed for the sensitivity analysis, and optionally copies a TAMM run into a directory with appropriate naming so that the folder can be used to load the runs and TAMMs into the FRAM multirun fork. By default these functions also save the changes made and the associated run IDs into a log file in the same directory as the FRAM database.
Common sensitivity functions
sensitivity_exact() and sensitivity_scaled() have similar arguments, most of them exactly the same. For simplicity, I will not describe them all here; instead, see their respective help pages. Each of these functions only allows specification of a single new value (either an exact value or a scaling factor) for each sensitivity run, although that value can be applied to multiple columns and rows of a table. For example, it is possible to use sensitivity_exact() to set the recruit scalers of a dozen stocks to 0.1 in the first run, 0.2 in the second run, 0.3 in the third run, and so on. For cases in which multiple values need to be changed but not in concert, use sensitivity_custom(). (For complex scenarios in which multiple values should change differently from one another but proportionally to some starting ratio, it may be possible to first modify the values of the template run to the desired ratio and then use sensitivity_scaled().)
When designing a sensitivity analysis, we need to first decide:
- What is our starting run? This determines template_run.
- What do we want to be changing across these runs? This determines table_name and match_df.
- What sequence of values do we want to look at? This determines scale_values or exact_values. Several notes here. First, because we provide a vector of values, we can choose a linear scale (e.g., seq(0.1, 2, by = 0.1)), or we can choose something more complex like a log scale (e.g., exp(seq(log(10), log(100000), length = 50))). Second, the length of the sequence of values we provide determines how many runs are generated. In general it looks like we need to keep the number of runs in FRAM Access databases to no more than 500, so bear that in mind when designing sensitivity analyses. If there is a need for many more runs in a single database, consider updating the FRAM fork to fold in the SQLite support Ty wrote, and then modify these functions to work with SQLite as well.
- Do we want to use TAMMs with our sensitivity runs? If so, what TAMM file do we want to be our template, and where do we want the sensitivity-run TAMMs to be saved? Using TAMM runs will slow the actual sensitivity analysis runs in FRAM, but they are necessary for things like accounting for coastal iterations (Coho) or handling a number of important calculations (Chinook).
A sketch pulling these choices together follows this list.
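Here is a rough sketch of what such a call might look like, continuing the Area 7 sport size-limit example with sensitivity_exact(). The arguments template_run, table_name, match_df, and exact_values come from the discussion above; the connection object, how the target column (MinimumSize) is identified, and any TAMM-related arguments are assumptions here, so confirm the details against ?sensitivity_exact.
## connect to a *copy* of the FRAM database so the original stays untouched
fram_db = framrsquared::connect_fram_db("sensitivity_copy.mdb")
## rows to change in every sensitivity run: Area 7 sport (Fishery ID 36)
match_df = data.frame(match_FisheryID = 36)
## one sensitivity run per value: size limits from 450 to 650 in steps of 50
exact_values = seq(450, 650, by = 50)
## sketch of the call; how the MinimumSize column is specified and the
## TAMM-related arguments are assumptions -- see ?sensitivity_exact
sensitivity_exact(fram_db,
                  template_run = 30,
                  table_name = "SizeLimits",
                  match_df = match_df,
                  exact_values = exact_values)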
Handling complicated things
sensitivity_custom() allows the specification of arbitrary scenarios for each sensitivity run; the only constraint is that (currently) only one table can be changed for each sensitivity run. If, in the future, we end up with frequent changes that must be specified simultaneously across multiple tables, it may be useful to change sensitivity_custom() to optionally take a more complex structure. Effectively, we would want a list of lists for scenario_list, but it would be safest if the "inner" list (multiple match/replace tables to be applied to a single run) were written as some kind of custom class / object type so that there is less space for mistakes.
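As a small illustration of the current interface, scenario_list is simply a list of match/replace dataframes, one per sensitivity run. The example below shuffles recruit scalers between ages 2 and 3 of stock 17 in three hypothetical ways; the values are arbitrary, and the other arguments to sensitivity_custom() shown here are assumptions, so see ?sensitivity_custom for the real argument list.
## each element of scenario_list is one run's match/replace dataframe
scenario_list = list(
  data.frame(match_StockID = 17, match_Age = c(2, 3),
             replace_RecruitScaleFactor = c(1.2, 0.8)),
  data.frame(match_StockID = 17, match_Age = c(2, 3),
             replace_RecruitScaleFactor = c(1.0, 1.0)),
  data.frame(match_StockID = 17, match_Age = c(2, 3),
             replace_RecruitScaleFactor = c(0.8, 1.2))
)
## sketch of the call; arguments other than scenario_list are assumptions
sensitivity_custom(fram_db,
                   template_run = 30,
                   table_name = "StockRecruit",
                   scenario_list = scenario_list)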
We may sometimes want our sensitivity analysis to look at the effects of changing the TAMM rather than values directly in FRAM. While there is no direct method to do so, we can use sensitivity_scaled() as a quick way to create a series of identical runs and associated TAMMs by providing scale_values = rep(1, nrun), where nrun is the number of desired runs. We can then write code to programmatically change the values in each TAMM. If this is a frequent occurrence, we might write functions here or in {TAMMsupport} to handle the TAMM modifications.
Example
As part of the Elwha Out of Base (OOB) work in 2025, we want to look at how changes to the Elwha starting abundances affect ERs across the board. I started by making a copy of the Elwha OOB "valid run" database to ensure I had no way of accidentally messing up the reference database.
Remainder of this section to be completed after merging