Sensitivities, adjoints and optimization
Jutul.jl is built from the ground up to support gradient-based optimization problems. This includes both data assimilation/parameter calibration and control problems.
An example application from JutulDarcy.jl, "A fully differentiable geothermal doublet: History matching and control optimization", demonstrates many of these functions.
Objective functions
There are two main types of objective functions supported in Jutul: those that are evaluated globally over all time-steps, and those that are evaluated locally (typically as a sum over all time-steps).
Jutul.AbstractJutulObjective — Type
Abstract type for Jutul objectives.
For either type of objective, you must implement the correct interface. This can be done either by passing a function directly, letting Jutul guess what kind of objective it is from the number of arguments, or by explicitly making a subtype that is a Julia callable struct.
These functions take in the step_info Dict, which is worth having a look at if you plan on writing objective functions:
Jutul.optimization_step_info — Function
optimization_step_info(step::Int, time::Real, dt; kwarg...)
Optimization step information is normally not set up manually, but the output from this function will be passed to objective functions as step_info. The returned value is a Dict with the following fields:
Fields
:time - The time at the start of the step. To get the time at the end of the step, use step_info[:time] + step_info[:dt].
:dt - The time step size for this step.
:step - The step number, starting at 1. Note that this is the report step, and multiple step_info entries could have the same step if substeps are used.
:Nstep - The total number of steps in the simulation.
:substep - The substep number, starting at 1. This is used to indicate that multiple steps are taken within a single step.
:substep_global - The global substep number, starting at 1. This counter does not reset between global steps.
:Nsubstep_global - The total number of substeps in the simulation.
:total_time - The total time of the simulation (i.e. dt + time at the final step and substep).
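The sketch below builds one such Dict directly from the documented signature and reads a couple of its fields; the time and time step values are arbitrary example numbers.

using Jutul

info = Jutul.optimization_step_info(1, 0.0, 86400.0)  # step 1, start time 0, dt of one day
t_end = info[:time] + info[:dt]                       # time at the end of the step
n = info[:step]                                       # report step number (here 1)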
Global objectives
These are objectives that are defined over the entire simulation time. This means that you define the objective function as a single function that takes in the solution for all time-steps together with forces, time-step information, initial state and input data used to set up the model (if any).
Jutul.AbstractGlobalObjective — Type
Abstract type for an objective given as a global objective function of the form:
F(model, state0, states, step_infos, forces, input_data)
Note that if substeps are enabled by setting output_substates=true in the simulator setup, the lengths of forces and states will be dynamic and equal to the length of step_infos. If you want to recover the state at a specific step, you can inspect the :step field of the corresponding entry in the step_infos array.
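As an illustration, a global objective could look like the following minimal sketch. Only the six-argument signature follows the documentation; the state field :Pressure and the target value are hypothetical and depend on the model.

# Hypothetical global objective: mismatch of the final pressure field against a target.
function final_pressure_mismatch(model, state0, states, step_infos, forces, input_data)
    p_end = states[end][:Pressure]
    return sum(abs2, p_end .- 1.0e7)
end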
Jutul.WrappedGlobalObjective — Type
WrappedGlobalObjective(objective)
A global (in time) objective that wraps a function/callable. This type automatically wraps functions/callable structs that are passed to the optimization interface if they have the global signature (six arguments).
Local/sum objectives
Jutul.AbstractSumObjective — Type
Abstract type for an objective given as a sum of function values of the form:
F(model, state, dt, step_info, forces)
evaluated for each step. This means that the objective is the sum of all of these values. If you want the objective to depend on only a single step, you can look up the :step field of step_info.
Jutul.WrappedSumObjective — Type
WrappedSumObjective(objective)
An objective that is a sum of function values evaluated for each step, defined by a function. This type automatically wraps functions/callable structs that are passed to the optimization interface if they have the sum signature (five arguments).
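For example, a sum objective can be written either as a plain five-argument function or as a callable struct subtyping Jutul.AbstractSumObjective, as in the sketch below. The :TotalMasses field is hypothetical and stands in for whatever quantity your model actually stores in its states.

using Jutul

# Plain function form with the documented five-argument signature.
mass_in_place(model, state, dt, step_info, forces) = dt * sum(state[:TotalMasses])

# Equivalent callable struct form, carrying an extra weight parameter.
struct WeightedMassObjective <: Jutul.AbstractSumObjective
    weight::Float64
end

function (obj::WeightedMassObjective)(model, state, dt, step_info, forces)
    return obj.weight * dt * sum(state[:TotalMasses])
end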
Generic optimization interface
The generic optimization interface is very general, handling gradients with respect to any parameter used in a function that sets up a complete simulation case from an AbstractDict. This makes use of AbstractDifferentiation.jl, and the default configuration assumes that your setup function can be differentiated with the ForwardDiff backend. In practice, this means that you must take care when initializing arrays and other types so that they can hold the AD type (e.g. avoid using zeros without a type argument). In addition, there may be a large number of calls to the setup function, which can sometimes be slow. Alternatively, the numerical parameter optimization interface can be used, which only differentiates with respect to numerical parameters inside the model.
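The array-initialization point is illustrated by the self-contained sketch below; the function and its contents are made up, and the zeros-with-a-type pattern is the only part that matters.

using ForwardDiff

function pore_volume_sum(poro)
    T = eltype(poro)            # Float64 normally, a Dual type under ForwardDiff
    pv = zeros(T, length(poro)) # good: follows the element type of the input
    # pv = zeros(length(poro))  # bad: hard-coded Float64, fails under ForwardDiff
    for i in eachindex(poro)
        pv[i] = 1000.0 * poro[i]
    end
    return sum(pv)
end

ForwardDiff.gradient(pore_volume_sum, [0.2, 0.25, 0.3])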
Jutul.DictOptimization.DictParameters — Type
DictParameters(parameters)
DictParameters(parameters::AbstractDict, setup_function = missing;
    strict = true,
    verbose = true,
    active_type = Float64
)
Set up a DictParameters object for optimization. Optionally, a setup function can be provided that takes an instance with the same keys as parameters together with a step_info dictionary. The setup function should return a JutulCase set up from the parameters in the Dict.
Optional keyword arguments:
strict: If true, the optimization will throw an error if any of the parameters do not have at least one of the upper or lower bounds set.
verbose: If true, the optimization will print information about the optimization process.
active_type: The type of the parameters that are considered active in the optimization. Defaults to Float64. This is used to determine which parameters are active and should be optimized. This means that all entries (and entries in nested dictionaries) of the parameters dictionary must be of this type or an array with this type as element type.
Jutul.DictOptimization.free_optimization_parameter! — Function
free_optimization_parameter!(dopt, "parameter_name", rel_min = 0.01, rel_max = 100.0)
free_optimization_parameter!(dopt, ["dict_name", "parameter_name"], abs_min = -8.0, abs_max = 7.0)
Free an existing parameter for optimization in the DictParameters object. This will allow the parameter to be optimized through a call to optimize.
Nesting structures
If your DictParameters has a nested structure, you can use a vector of strings or symbols to specify the parameter name, e.g. ["dict_name", "parameter_name"] to access the parameter located at ["dict_name"]["parameter_name"].
Setting limits
The limits can be set using the following keyword arguments:
abs_min: Absolute minimum value for the parameter. If not set, no absolute minimum will be applied.
abs_max: Absolute maximum value for the parameter. If not set, no absolute maximum will be applied.
rel_min: Relative minimum value for the parameter. If not set, no relative minimum will be applied.
rel_max: Relative maximum value for the parameter. If not set, no relative maximum will be applied.
For any of these entries it is possible to pass either a scalar or an array. If an array is passed, it must have the same size as the parameter being set.
Note that if dopt.strict is set to true, at least one of the upper or lower bounds must be set for free parameters. If dopt.strict is set to false, the bounds are optional and the DictParameters object can be used to compute sensitivities, but the built-in optimization routine assumes that finite limits are set for all parameters.
Other keyword arguments
initial: Initial value for the parameter. If not set, the current value in dopt.parameters will be used.
scaler=missing: Optional scaler for the parameter. If not set, no scaling will be applied. Available scalers are :log and :exp. The scaler is applied to the parameter during optimization.
lumping=missing: Optional lumping array for the parameter. If not set, no lumping will be applied. The lumping array should have the same size as the parameter and contain positive integers. The lumping array defines groups of indices that should be lumped together, i.e. the same value will be used for all indices in the same group. The lumping array should contain all integers from 1 to the maximum value in the array, and all indices in the same group should have the same value in the initial parameter, otherwise an error will be thrown.
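For example, given a DictParameters instance dprm (see the workflow sketch further down), parameters might be freed as follows. The entry names, bounds and lumping pattern are hypothetical.

# Free a flat entry with relative bounds and lump its six entries into three groups.
free_optimization_parameter!(dprm, "porosity",
    rel_min = 0.5, rel_max = 2.0, lumping = [1, 1, 2, 2, 3, 3])

# Free a nested entry with absolute bounds and logarithmic scaling.
free_optimization_parameter!(dprm, ["well", "rate"],
    abs_min = 1.0e-3, abs_max = 0.1, scaler = :log)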
Jutul.DictOptimization.freeze_optimization_parameter! — Function
freeze_optimization_parameter!(dopt, "parameter_name")
freeze_optimization_parameter!(dopt, ["dict_name", "parameter_name"])
freeze_optimization_parameter!(dopt::DictParameters, parameter_name, val = missing)
Freeze an optimization parameter in the DictParameters object. This will remove the parameter from the optimization targets and set its value to val if provided. Any limits/lumping/scaling settings for this parameter will be removed.
Jutul.DictOptimization.set_optimization_parameter! — Function
set_optimization_parameter!(dopt::DictParameters, parameter_name, value)
Set a specific optimization parameter in the DictParameters object. This function will update the value of the parameter in the dopt.parameters dictionary.
Jutul.DictOptimization.optimize — Function
optimized_dict = optimize(dopt, objective)
optimize(dopt::DictParameters, objective, setup_fn = dopt.setup_function;
grad_tol = 1e-6,
obj_change_tol = 1e-6,
max_it = 25,
opt_fun = missing,
maximize = false,
simulator = missing,
config = missing,
solution_history = false,
backend_arg = (
use_sparsity = false,
di_sparse = true,
single_step_sparsity = false,
do_prep = true,
),
kwarg...
)
Optimize parameters defined in a DictParameters object using the provided objective function. At least one variable has to be declared to be free using free_optimization_parameter! prior to calling the optimizer.
Arguments
dopt::DictParameters: Container with parameters to optimize
objective: The objective function to minimize (or maximize)
setup_fn: Function to set up the optimization problem. Defaults to dopt.setup_function
Keyword Arguments
grad_tol: Gradient tolerance for stopping criterion
obj_change_tol: Objective function change tolerance for stopping criterion
max_it: Maximum number of iterations
opt_fun: Optional custom optimization function. If missing, L-BFGS will be used
maximize: Set to true to maximize the objective instead of minimizing
simulator: Optional simulator object used in forward simulations
config: Optional configuration for the setup
solution_history: If true, stores all intermediate solutions
backend_arg: Options for the autodiff backend:
    use_sparsity: Enable sparsity detection for the objective function
    di_sparse: Use sparse differentiation
    single_step_sparsity: Enable single step sparsity detection (if sparsity does not change during timesteps)
    do_prep: Perform preparation step
Returns
The optimized parameters as a dictionary.
Notes
- The function stores the optimization history and optimized parameters in the input dopt object.
- If solution_history is true, intermediate solutions are stored in dopt.history.solutions.
- The default optimization algorithm is L-BFGS with box constraints.
Jutul.DictOptimization.parameters_gradient — Function
parameters_gradient(dopt::DictParameters, objective, setup_fn = dopt.setup_function)
Compute the gradient of the objective function with respect to the parameters defined in the DictParameters object. This function will return the gradient as a dictionary with the same structure as the input parameters, where each entry is a vector of gradients for each parameter. Only gradients with respect to free parameters will be computed.
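Putting the pieces together, a typical workflow with the generic interface is outlined below. The parameter names, bounds and objective are made up, the setup function body is elided since it depends entirely on the model, and the functions are assumed to be available in scope (they are documented under Jutul.DictOptimization); only the call sequence follows the documented API.

using Jutul

prm = Dict(
    "porosity" => fill(0.2, 6),
    "well" => Dict("rate" => 0.05),
)

# The setup function must build and return a JutulCase from the Dict entries.
function setup_case(prm, step_info = missing)
    # ... construct model, time steps, forces, parameters and state0 from prm
    # and return them as a JutulCase ...
end

dprm = DictParameters(prm, setup_case)
free_optimization_parameter!(dprm, "porosity", rel_min = 0.5, rel_max = 2.0)
free_optimization_parameter!(dprm, ["well", "rate"], abs_min = 1.0e-3, abs_max = 0.1)

# Five-argument sum objective (wrapped automatically); :TotalMasses is hypothetical.
obj(model, state, dt, step_info, forces) = dt * sum(state[:TotalMasses])

tuned = optimize(dprm, obj)              # optimized copy of the parameter Dict
grads = parameters_gradient(dprm, obj)   # gradients for the free parameters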
Numerical parameter optimization interface
Jutul.solve_adjoint_sensitivities — Function
solve_adjoint_sensitivities(model, states, reports_or_timesteps, G; extra_timing = false, state0 = setup_state(model), forces = setup_forces(model), raw_output = false, kwarg...)
Compute sensitivities of the parameters of model for the objective function G.
The objective function is at the moment assumed to be a sum over all states of the form: obj = Σₙ G(model, state, dt_n, n, forces_for_step_n)
Solves the adjoint equations: for model equations F, the gradient of the objective with respect to the parameters p is ∇ₚG = Σₙ (∂Fₙ / ∂p)ᵀ λₙ, n ∈ [1, N], where the Lagrange multipliers λₙ are found from the adjoint equations (∂Fₙ / ∂xₙ)ᵀ λₙ = - (∂G / ∂xₙ)ᵀ - (∂Fₙ₊₁ / ∂xₙ)ᵀ λₙ₊₁, with the last term omitted for step n = N.
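A hedged sketch of a direct call, assuming model, state0 and forces from an existing setup and states/reports from a completed simulation; the state field used in the objective is hypothetical.

# Per-step objective contribution; :TotalMasses stands in for a real state field.
obj(model, state, dt, step_or_info, forces) = dt * sum(state[:TotalMasses])

∇G = solve_adjoint_sensitivities(model, states, reports, obj;
    state0 = state0, forces = forces)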
Jutul.solve_adjoint_sensitivities! — Function
solve_adjoint_sensitivities!(∇G, storage, states, state0, timesteps, G; forces)
Non-allocating version of solve_adjoint_sensitivities.
Jutul.solve_numerical_sensitivities — Function
solve_numerical_sensitivities(model, states, reports, G, target;
forces = setup_forces(model),
state0 = setup_state(model),
parameters = setup_parameters(model),
epsilon = 1e-8)
Compute sensitivities of the model parameter with name target for objective function G.
This method uses numerical perturbation and is primarily intended for testing of solve_adjoint_sensitivities.
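For a consistency check, the two can be compared along the following lines, assuming model, states, reports, state0, parameters and forces from an existing setup and obj from the adjoint sketch above; the parameter name :porosity is an assumption.

g_adj = solve_adjoint_sensitivities(model, states, reports, obj;
    state0 = state0, forces = forces)
g_num = solve_numerical_sensitivities(model, states, reports, obj, :porosity;
    state0 = state0, parameters = parameters, forces = forces)
# Compare the :porosity entry of g_adj against g_num; inspect both results
# interactively, as the exact return layout is not fixed by this sketch.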
Jutul.setup_parameter_optimization — Function
setup_parameter_optimization(model, state0, param, dt, forces, G, opt_cfg = optimization_config(model, param);
grad_type = :adjoint,
config = nothing,
print = 1,
copy_case = true,
param_obj = false,
kwarg...
)
Set up function handles for optimizing the case defined by the inputs to simulate together with a per-timestep objective function G. The objective function should fit the format described in AbstractGlobalObjective or AbstractSumObjective.
Generally, calling any of these functions will mutate the data Dict. The options are:
F_o(x): evaluate the objective
dF_o(dFdx, x): evaluate the gradient of the objective, mutating dFdx (may trigger evaluation of F_o)
F_and_dF(F, dFdx, x): evaluate F and/or dF. If nothing is passed for an entry, the corresponding entry is skipped.
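The handles are intended to be plugged into a generic gradient-based optimizer. The sketch below shows the calling pattern only; how the handles and the initial parameter vector x are obtained from the return value of setup_parameter_optimization is not shown here and may vary between versions.

v = F_o(x)                 # objective value at the parameter vector x
g = similar(x)
dF_o(g, x)                 # gradient of the objective, written into g in place
F_and_dF(v, g, x)          # objective and gradient together
F_and_dF(nothing, g, x)    # pass `nothing` to skip an entry, here the objective value

The three-argument form mirrors the fg!-style callbacks used by optimizers such as Optim.jl.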