Sensitivities, adjoints and optimization

Jutul.jl is built from the ground up to support gradient-based optimization problems. This includes both data assimilation/parameter calibration and control problems.

An example application from JutulDarcy.jl demonstrates many of these functions: A fully differentiable geothermal doublet: History matching and control optimization

Objective functions

There are two main types of objective functions supported in Jutul: those that are evaluated globally over all time-steps, and those that are evaluated locally (typically as a sum over all time-steps).

For either type of objective, you must implement the correct interface. This can be done either by passing a function directly, letting Jutul infer from the number of arguments what kind of objective it is, or by explicitly defining a subtype as a Julia callable struct.

These functions take in the step_info Dict, which is worth having a look at if you plan on writing objective functions:

Jutul.optimization_step_info (Function)
optimization_step_info(step::Int, time::Real, dt; kwarg...)

Optimization step information is normally not set up manually; the output from this function is passed to objective functions as step_info. The output is a Dict with the following fields:

Fields

  • :time - The time at the start of the step. To get time at the end of the step, use step_info[:time] + step_info[:dt]
  • :dt - The time step size for this step.
  • :step - The step number, starting at 1. Note that this is the report step, and multiple step_info entries can have the same step if substeps are used.
  • :Nstep - The total number of steps in the simulation.
  • :substep - The substep number, starting at 1. This is used to indicate that multiple steps are taken within a single step.
  • :substep_global - The global substep number, starting at 1 and will not reset between global steps.
  • :Nsubstep_global - The total number of substeps in the simulation.
  • :total_time - The total time of the simulation (i.e. dt + time at the final step and substep)
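As an illustration, a sum-type objective (see below) can use these fields, for example to only count the contribution from the final time in the schedule. This is a sketch; the :Pressure field is a hypothetical state entry:

```julia
# Sketch of a sum objective that only contributes at the end of the
# simulation, using the step_info fields listed above.
function final_time_objective(model, state, dt, step_info, forces)
    t_end = step_info[:time] + step_info[:dt]
    if t_end >= step_info[:total_time]
        return sum(state[:Pressure])  # :Pressure is a hypothetical field
    end
    return 0.0
end
```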

Global objectives

These are objectives that are defined over the entire simulation time. This means that you define the objective function as a single function that takes in the solution for all time-steps together with forces, time-step information, initial state and input data used to set up the model (if any).

Jutul.AbstractGlobalObjective (Type)

Abstract type for an objective defined as a global objective function of the form:

F(model, state0, states, step_infos, forces, input_data)

Note that if substeps are enabled by setting output_substates=true in the simulator setup, the length of forces and states will be dynamic and equal to the length of step_infos. If you want to recover the state at a specific report step, you can inspect the :step field of the corresponding entry in the step_infos array.

Jutul.WrappedGlobalObjective (Type)
WrappedGlobalObjective(objective)

A global (in time) objective that wraps a function/callable. This type automatically wraps functions/callable structs that are passed to the optimization interface if they have the global signature (six arguments).

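A minimal global objective is then a function with the six-argument signature above. The following sketch averages a hypothetical :Pressure field over all stored states, weighted by step length:

```julia
# Sketch: global objective computing a dt-weighted time average over
# all stored states.
function my_global_objective(model, state0, states, step_infos, forces, input_data)
    total = 0.0
    for (state, info) in zip(states, step_infos)
        total += info[:dt] * sum(state[:Pressure])  # :Pressure is hypothetical
    end
    return total / step_infos[end][:total_time]
end
```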

Local/sum objectives

Jutul.AbstractSumObjective (Type)

Abstract type for an objective defined as a sum of function values of the form:

F(model, state, dt, step_info, forces)

evaluated for each step. This means that the objective is the sum of all of these values. If you want the objective to depend only on a single step, you can check the :step field of step_info.

Jutul.WrappedSumObjective (Type)
WrappedSumObjective(objective)

An objective that is a sum of function values, evaluated for each step, defined by a function. This type automatically wraps functions/callable structs that are passed to the optimization interface if they have the sum signature (five arguments).


Generic optimization interface

The generic optimization interface is very general: it handles gradients with respect to any parameter used in a function that sets up a complete simulation case from an AbstractDict. This makes use of AbstractDifferentiation.jl, and the default configuration assumes that your setup function can be differentiated with the ForwardDiff backend. In practice, this means that you must take care when initializing arrays and other types so that they can hold the AD type (e.g. avoid calling zeros without an element type). In addition, there may be a large number of calls to the setup function, which can sometimes be slow. Alternatively, the numerical parameter optimization interface can be used, which only differentiates with respect to numerical parameters inside the model.

Jutul.DictOptimization.DictParameters (Type)
DictParameters(parameters)
DictParameters(parameters::AbstractDict, setup_function = missing;
        strict = true,
        verbose = true,
        active_type = Float64
    )

Set up a DictParameters object for optimization. Optionally, a setup function can be provided; it takes an instance with the same keys as parameters together with a step_info dictionary, and should return a JutulCase set up from the parameters in the Dict.

Optional keyword arguments:

  • strict: If true, the optimization will throw an error if any free parameter does not have at least one of the upper or lower bounds set.
  • verbose: If true, the optimization will print information about the optimization process.
  • active_type: The type of parameters that are considered active in the optimization. Defaults to Float64. This is used to determine which parameters are active and should be optimized: all entries of the parameters dictionary (including entries in nested dictionaries) must be of this type, or arrays with this element type.
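A sketch of how this fits together. Here build_my_case stands in for your own case construction, and the parameter names are hypothetical:

```julia
using Jutul

# Hypothetical parameter dictionary; active entries are Float64 (the
# default active_type) or arrays with Float64 elements.
prm = Dict(
    "porosity" => fill(0.2, 100),
    "rate" => 1.5,
)

# The setup function receives the (possibly perturbed) parameters
# together with a step_info dictionary and must return a JutulCase.
function my_setup(prm, step_info)
    return build_my_case(prm["porosity"], prm["rate"])  # build_my_case: your code
end

dopt = DictParameters(prm, my_setup)
```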
Jutul.DictOptimization.free_optimization_parameter! (Function)
free_optimization_parameter!(dopt, "parameter_name", rel_min = 0.01, rel_max = 100.0)
free_optimization_parameter!(dopt, ["dict_name", "parameter_name"], abs_min = -8.0, abs_max = 7.0)

Free an existing parameter for optimization in the DictParameters object. This will allow the parameter to be optimized through a call to optimize.

Nesting structures

If your DictParameters has a nesting structure, you can use a vector of strings or symbols to specify the parameter name, e.g. ["dict_name", "parameter_name"] to access the parameter located at ["dict_name"]["parameter_name"].

Setting limits

The limits can be set using the following keyword arguments:

  • abs_min: Absolute minimum value for the parameter. If not set, no absolute minimum will be applied.
  • abs_max: Absolute maximum value for the parameter. If not set, no absolute maximum will be applied.
  • rel_min: Relative minimum value for the parameter. If not set, no relative minimum will be applied.
  • rel_max: Relative maximum value for the parameter. If not set, no relative maximum will be applied.

For any of these entries it is possible to pass either a scalar or an array. If an array is passed, it must have the same size as the parameter being set.

Note that if dopt.strict is set to true, at least one of the upper or lower bounds must be set for free parameters. If dopt.strict is set to false, the bounds are optional and the DictParameters object can be used to compute sensitivities, but the built-in optimization routine assumes that finite limits are set for all parameters.

Other keyword arguments

  • initial: Initial value for the parameter. If not set, the current value in dopt.parameters will be used.
  • scaler=missing: Optional scaler for the parameter. If not set, no scaling will be applied. Available scalers are :log and :exp. The scaler will be applied to the parameter before it is passed to the optimizer.
  • lumping=missing: Optional lumping array for the parameter. If not set, no lumping will be applied. The lumping array should have the same size as the parameter and contain positive integers. The lumping array defines groups of indices that should be lumped together, i.e. the same value will be used for all indices in the same group. The lumping array should contain all integers from 1 to the maximum value in the array, and all indices in the same group should have the same value in the initial parameter, otherwise an error will be thrown.
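For example, assuming a DictParameters instance dopt containing a 100-cell "porosity" array and a scalar "rate" (both hypothetical names):

```julia
# Free "porosity" with relative bounds and lump the 100 cells into two
# groups (the initial values within each group must be equal).
lump = vcat(fill(1, 50), fill(2, 50))
free_optimization_parameter!(dopt, "porosity";
    rel_min = 0.5, rel_max = 2.0, lumping = lump)

# Free a positive scalar with absolute bounds and a log scaler.
free_optimization_parameter!(dopt, "rate";
    abs_min = 1e-3, abs_max = 10.0, scaler = :log)
```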
Jutul.DictOptimization.freeze_optimization_parameter! (Function)
freeze_optimization_parameter!(dopt, "parameter_name")
freeze_optimization_parameter!(dopt, ["dict_name", "parameter_name"])
freeze_optimization_parameter!(dopt::DictParameters, parameter_name, val = missing)

Freeze an optimization parameter in the DictParameters object. This will remove the parameter from the optimization targets and set its value to val if provided. Any limits/lumping/scaling settings for this parameter will be removed.

Jutul.DictOptimization.set_optimization_parameter! (Function)
set_optimization_parameter!(dopt::DictParameters, parameter_name, value)

Set a specific optimization parameter in the DictParameters object. This function will update the value of the parameter in the dopt.parameters dictionary.

Jutul.DictOptimization.optimize (Function)
optimized_dict = optimize(dopt, objective)
optimize(dopt::DictParameters, objective, setup_fn = dopt.setup_function;
    grad_tol = 1e-6,
    obj_change_tol = 1e-6,
    max_it = 25,
    opt_fun = missing,
    maximize = false,
    simulator = missing,
    config = missing,
    solution_history = false,
    backend_arg = (
        use_sparsity = false,
        di_sparse = true,
        single_step_sparsity = false,
        do_prep = true,
    ),
    kwarg...
)

Optimize parameters defined in a DictParameters object using the provided objective function. At least one variable has to be declared to be free using free_optimization_parameter! prior to calling the optimizer.

Arguments

  • dopt::DictParameters: Container with parameters to optimize
  • objective: The objective function to minimize (or maximize)
  • setup_fn: Function to set up the optimization problem. Defaults to dopt.setup_function

Keyword Arguments

  • grad_tol: Gradient tolerance for stopping criterion
  • obj_change_tol: Objective function change tolerance for stopping criterion
  • max_it: Maximum number of iterations
  • opt_fun: Optional custom optimization function. If missing, L-BFGS will be used
  • maximize: Set to true to maximize the objective instead of minimizing
  • simulator: Optional simulator object used in forward simulations
  • config: Optional configuration for the setup
  • solution_history: If true, stores all intermediate solutions
  • backend_arg: Options for the autodiff backend:
    • use_sparsity: Enable sparsity detection for the objective function
    • di_sparse: Use sparse differentiation
    • single_step_sparsity: Enable single step sparsity detection (if sparsity does not change during timesteps)
    • do_prep: Perform preparation step

Returns

The optimized parameters as a dictionary.

Notes

  • The function stores the optimization history and optimized parameters in the input dopt object.
  • If solution_history is true, intermediate solutions are stored in dopt.history.solutions.
  • The default optimization algorithm is L-BFGS with box constraints.
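Putting the pieces together, a typical call could look like the following sketch, where dopt is a DictParameters with at least one free parameter and my_objective is a sum or global objective (both hypothetical names):

```julia
# Minimize my_objective over the free parameters in dopt, keeping
# intermediate solutions for later inspection.
optimized = optimize(dopt, my_objective;
    max_it = 50,
    solution_history = true)

# optimized has the same structure as dopt.parameters; the optimization
# history is also stored on dopt itself.
```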
Jutul.DictOptimization.parameters_gradient (Function)
parameters_gradient(dopt::DictParameters, objective, setup_fn = dopt.setup_function)

Compute the gradient of the objective function with respect to the parameters defined in the DictParameters object. This function will return the gradient as a dictionary with the same structure as the input parameters, where each entry is a vector of gradients for each parameter. Only gradients with respect to free parameters will be computed.

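This can be useful as a gradient check before running the optimizer. A sketch, where dopt is a DictParameters object and my_objective is an objective function (hypothetical names):

```julia
# Compute the gradient of the objective with respect to all free
# parameters, without running an optimization.
grad = parameters_gradient(dopt, my_objective)
# grad mirrors the structure of dopt.parameters, with entries only for
# the free parameters.
```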

Numerical parameter optimization interface

Jutul.solve_adjoint_sensitivities (Function)
solve_adjoint_sensitivities(model, states, reports_or_timesteps, G; extra_timing = false, state0 = setup_state(model), forces = setup_forces(model), raw_output = false, kwarg...)

Compute sensitivities of the model parameters for the objective function G.

The objective function is at the moment assumed to be a sum over all states of the form: obj = Σₙ G(model, state, dt_n, n, forces_for_step_n)

Solves the adjoint equations: for model equations F, the gradient with respect to parameters p is ∇ₚG = Σₙ (∂Fₙ / ∂p)ᵀ λₙ for n ∈ [1, N], where the Lagrange multipliers λₙ are given by the adjoint equations (∂Fₙ / ∂xₙ)ᵀ λₙ = - (∂G / ∂xₙ)ᵀ - (∂Fₙ₊₁ / ∂xₙ)ᵀ λₙ₊₁. The last term is omitted for step n = N, and G is the objective function.

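A sketch of usage after a completed forward simulation (model, states, reports, state0, forces and the sum objective G are assumed to come from your own setup):

```julia
# Compute parameter sensitivities of the sum objective G using the
# adjoint method, reusing the stored forward solution.
grad = solve_adjoint_sensitivities(model, states, reports, G;
    state0 = state0, forces = forces)
```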
Jutul.solve_numerical_sensitivities (Function)
solve_numerical_sensitivities(model, states, reports, G, target;
                                            forces = setup_forces(model),
                                            state0 = setup_state(model),
                                            parameters = setup_parameters(model),
                                            epsilon = 1e-8)

Compute sensitivities of model parameter with name target for objective function G.

This method uses numerical perturbation and is primarily intended for testing of solve_adjoint_sensitivities.

Jutul.setup_parameter_optimization (Function)
setup_parameter_optimization(model, state0, param, dt, forces, G, opt_cfg = optimization_config(model, param);
    grad_type = :adjoint,
    config = nothing,
    print = 1,
    copy_case = true,
    param_obj = false,
    kwarg...
)

Set up function handles for optimizing the case defined by the inputs to simulate together with a per-timestep objective function G. The objective function should fit the format described in AbstractGlobalObjective or AbstractSumObjective.

Calling any of these functions will generally mutate the data Dict. The returned functions are:

  • F_o(x): evaluate objective
  • dF_o(dFdx, x): evaluate gradient of objective, mutating dFdx (may trigger evaluation of F_o)
  • F_and_dF(F, dFdx, x): evaluate F and/or dF. If nothing is passed for an entry, the corresponding entry is skipped.
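The handles can be combined with a generic gradient-based optimizer. The following sketch assumes that the first return values are F_o, dF_o and F_and_dF together with an initial guess x0; check the actual return value in your version, as this unpacking is an assumption:

```julia
# Sketch: fixed-step gradient descent on the returned closures.
F_o, dF_o, F_and_dF, x0 = setup_parameter_optimization(
    model, state0, param, dt, forces, G)

x = copy(x0)
dFdx = similar(x0)
for it in 1:10
    f = F_o(x)             # objective value at x
    dF_o(dFdx, x)          # gradient, written into dFdx
    @. x -= 0.1 * dFdx     # illustration only; use a real optimizer in practice
end
```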