Adding new objective functions

Attention

This page is mainly intended to explain some of the logic inside of objective functions. For simple objectives like shown in this page, it is recommended to use the GenericObjective (for objectives that simply just use values computable already in the data index, see List of Variables) or ObjectiveFromUser (for quantities which are derived from things computable from the data index) classes. The benefit of making a full objective class like shown in this page is mainly when dealing with multiple objects at once (see e.g PlasmaVesselDistance), or more complicated objectives (like EffectiveRipple) or having better control over default values and overriding some methods such as print_value. Most objectives can trivially be made with GenericObjective and ObjectiveFromUser:

from desc.objectives import ObjectiveFromUser, GenericObjective
from desc.equilibrium import Equilibrium
from desc.integrals import surface_min, surface_max

eq = Equilibrium()

# QS triple product, already exists as "f_T" in the data index
obj_QS_triple = GenericObjective(f="f_T", thing=eq)

# Mirror ratio, manually computed from "|B|"
def fun_mirror_ratio(grid, data):
    max_tz_B = surface_max(grid=grid, x=data["|B|"], surface_label="rho")
    min_tz_B = surface_min(grid=grid, x=data["|B|"], surface_label="rho")

    max_tz_B = grid.compress(max_tz_B, surface_label="rho")
    min_tz_B = grid.compress(min_tz_B, surface_label="rho")

    mirror_ratio = (max_tz_B - min_tz_B) / (min_tz_B + max_tz_B)

    return mirror_ratio
    # alternatively, "mirror ratio" is something that can be computed in the data index
    # directly (see List of Variables docs), so can replace entirety of the above function code with this return statement
    # return grid.compress(data["mirror ratio"])
    # or can just use GenericObjective(f="mirror ratio", thing=eq) like given above
obj_mirror_ratio = ObjectiveFromUser(fun=fun_mirror_ratio, thing=eq)

This guide walks through creating a new objective to optimize using Quasi-symmetry and mirror ratio as an example. The primary methods needed for a new objective are __init__, build, and compute. The base class _Objective provides a number of other methods that generally do not need to be re-implemented for subclasses such as derivative calculation, deviation from the target value, scaled cost and many common attributes.

__init__ should generally just assign attributes and store inputs. The object(s) that will be optimized by the objective is (are) also defined in the __init__ method. This (these) can be any object that is inherited from Optimizable super-class, i.e. Equilibrium, Coil, Surface. __init__ method should not do any expensive calculations, these should be in build or compute. The main arguments are summarized in the example below.

build is called before optimization either by user or automatically. It is used to precompute things that will be constant during the optimization like transform matrices that convert spectral coefficients to real space values, the values of the fixed plasma profiles, names of the physics quantities required that are registered as compute functions, shape of the objective value etc. The quantities that will be used by compute are then packaged into constants. build method also performs any necessary checks on the inputs before starting the optimization.

compute is where the actual calculation of the objective takes place. Objectives generally return a vector of residuals that are minimized in a least squares sense, though the exact method will depend on the optimization algorithm. The main thing here is calling compute_fun to get physics quantities, and then performing any post-processing we want such as averaging, combining, etc. compute must always return a 1-D array.

Now let’s look at QuasisymmetryTripleProduct as an example. This objective takes in an Equilibrium object and computes the quasi-symmetry triple product.

First, we need some common imports that almost all objectives in DESC need. One thing to keep in mind here is that desc.compute.utils offers 2 functions namely compute and _compute. The latter is the JIT compatible version of the former, and is used in the compute method of the objective. The former has additional checks that make it incompatible to use in JIT-compiled functions. We import _compute with an alias compute_fun to avoid confusion with the compute method of the objective.

from desc.objectives.objective_funs import _Objective
from desc.objectives.normalization import compute_scaling_factors
from desc.compute import get_profiles, get_transforms
from desc.compute.utils import _compute as compute_fun
from desc.grid import LinearGrid

_Objective parent class provides a lot of functionality for objectives, such as compute_scaled for scaling the result of compute method, compute_scaled_error for computing the error from the target, jac_scaled_error for computing the Jacobian of the scaled error, and many other methods. The docstring of the objective should explain special inputs. The docstrings for common inputs (i.e. target, bounds, weight, jac_chunk_size etc) can be inherited from the base class and can be adjusted as shown below.

class QuasisymmetryTripleProduct(_Objective):  # need to subclass from ``desc.objectives._Objective``
    """Give a description of what it is and what it's useful for.

    Parameters
    ----------
    eq : Equilibrium
        Equilibrium that will be optimized to satisfy the Objective.
    grid : Grid, optional
        Collocation grid containing the nodes to evaluate at.

    """
    # Most of the documentation is shared among all objectives, so we just inherit
    # the docstring from the base class and add a few details specific to this objective.
    # See the documentation of `collect_docs` for more details.
    __doc__ = __doc__.rstrip() + collect_docs(
        target_default="``target=0``.", bounds_default="``target=0``."
    )

    _coordinates = "rtz"    # What coordinates is this objective a function of, with r=rho, t=theta, z=zeta?
                            # i.e. if only a profile, it is "r" , while if all 3 coordinates it is "rtz"
    _units = "(T^4/m^2)"    # units of the output
    _print_value_fmt = "Quasi-symmetry error: "    # string with the name of the printed value, used when showing results of an optimization
    _static_attrs = _Objective._static_attrs + [] # list of strings of attribute names that should be considered static by jax, eg strings and booleans or anything used for control flow.

__init__ method should assign the optimizable thing(s) to the things attribute, which is a list of objects that will be optimized. For this example, we will optimize an Equilibrium object, so we assign it to the things as a list. As explained before, the __init__ method should not do any expensive calculations, so we just assign the attributes and call the parent class’s __init__ method which will handle common inputs and finalize the initialization.

def __init__(
    self,
    eq,
    target=None,
    bounds=None,
    weight=1,
    normalize=True,
    normalize_target=True,
    grid=None,
    name="QS triple product",
    jac_chunk_size=None,
):
    # we don't have to do much here, mostly just call ``super().__init__()``
    if target is None and bounds is None:
        target = 0 # default target value
    self._grid = grid
    super().__init__(
        things=[eq], # things is a list of things that will be optimized, in this case just the equilibrium
        target=target,
        bounds=bounds,
        weight=weight,
        normalize=normalize,
        normalize_target=normalize_target,
        name=name,
        jac_chunk_size=jac_chunk_size
    )

build method can be thought as a pre-computation step that prepares the objective for optimization by storing the constants needed for compute method to prevent extra computations. This method is not JIT-compiled, so it can perform any Python code.

grid is a Grid object that contains the nodes where the objective will be evaluated. If it is not provided, a default grid is created based on the grid requirements for the objective. For example, if the objective needs to compute a volumetric quantity, a grid that covers the entire plasma volume needs to be chosen as default, or if there is an integral quantity a grid with proper quadrature points needs to be chosen. Sometimes 2 grids are needed, for example coil objectives, one for the evaluation points on plasma surface and one for the coil segments for Biot-Savart integration.

Probably the most important part of the build method is to call get_profiles and get_transforms functions from desc.compute.utils. These functions return the profiles and transforms needed to compute the physics quantities from the equilibrium object. Both functions return dictionaries. Since these require information on the computation grid, one needs to call them after assigning the grid to the objective.

_data_keys is a list of strings that specifies which physics quantities are needed to be computed, for this example, from the equilibrium object. If there are multiple things in self.things, one can create separate lists for each thing. One can use a different name instead of _data_keys, but it is a convention in most DESC objectives. _dim_f is the size of the output vector returned by compute method. This quantity is used in ObjectiveFunction class to conduct concatenation or splitting and the name _dim_f has to be kept to prevent errors. One should also define the proper normalization factor for the objective, if needed. The units of the normalization factor should be such that the objective value is unitless.

We put all the constants into a dictionary called self._constants. This dictionary will be passed to the compute method as the constants argument, so it can access the transforms and profiles needed to compute the objective. Alternatively, one can also store the constants as attributes of the objective, for instance self._transforms and self._profiles. Finally, we call the parent class’s build method for common parts of building the objective.

def build(self, use_jit=True, verbose=1):
    """Build constant arrays.

    Parameters
    ----------
    use_jit : bool, optional
        Whether to just-in-time compile the objective and derivatives.
    verbose : int, optional
        Level of output.

    """
    # things is the list of things that will be optimized,
    # we assigned things to be just eq in the init, so we know that the
    # first (and only) element of things is the equilibrium
    eq = self.things[0]
    # need some sensible default grid
    if self._grid is None:
        grid = LinearGrid(M=eq.M_grid, N=eq.N_grid, NFP=eq.NFP, sym=eq.sym)
    else:
        grid = self._grid
    # dim_f = size of the output vector returned by self.compute
    # usually the same as self.grid.num_nodes, unless you're doing some down-sampling
    # or averaging etc.
    self._dim_f = self.grid.num_nodes
    # What data from desc.compute is needed? Here we want the QS triple product.
    self._data_keys = ["f_T"]

    # some helper code for profiling and logging
    timer = Timer()
    if verbose > 0:
        print("Precomputing transforms")
    timer.start("Precomputing transforms")

    # helper functions for building transforms etc to compute given
    # quantities. Alternatively, these can be created manually based on the
    # equilibrium, though in most cases that isn't necessary.
    profiles = get_profiles(self._data_keys, obj=eq, grid=grid)
    transforms = get_transforms(self._data_keys, obj=eq, grid=grid)
    self._constants = {
        "transforms": transforms,
        "profiles": profiles,
    }

    timer.stop("Precomputing transforms")
    if verbose > 1:
        timer.disp("Precomputing transforms")


    # We try to normalize things to order(1) by dividing things by some
    # characteristic scale for a given quantity.
    # See ``desc.objectives.compute_scaling_factors`` for examples.
    if self._normalize:
        scales = compute_scaling_factors(eq)
        # since the objective has units of T^4/m^2, the normalization here is
        # based on a characteristic field strength and minor radius.
        self._normalization = (
            scales["B"] ** 4 / scales["a"] ** 2
        )

    # finally, call ``super.build()``
    super().build(use_jit=use_jit, verbose=verbose)

The actual computation of the objective happens in compute method. This method is JIT-compiled (unless use_jit=False is passed to build method), so it should only contain JIT-compatible code. This method takes in the parameters of the thing(s) to be optimized, which is the dictionary form of the state vector such as R_lmn, Z_lmn, etc. for the Equilibrium object. Objectives with multiple things can have multiple parameters, one for each thing in self.things, in this case, the function signature would be compute(self, params_1, params_2, params3, ..., constants=None), see the PlasmaVesselDistance objective for an example of this. The constants argument is a dictionary of any other constant and usually set to None so that the self.constants are used.

def compute(self, params, constants=None):
    """Signature should take params (or possibly multiple params, one for each thing in self.things),
       which is the params_dict of the expected thing(s) to be optimized.
       It also takes in constants, which is a dictionary of any other constant data needed to compute
       the objective, and is usually none by default so the self.constants are used.

    Parameters
    ----------
    params : dict
        Dictionary of equilibrium degrees of freedom, eg Equilibrium.params_dict
    constants : dict
        Dictionary of constant data, eg transforms, profiles etc. Defaults to
        self.constants

    Returns
    -------
    f : ndarray
        Quasi-symmetry flux function error at each node (T^4/m^2).

    """
    if constants is None:
        constants = self.constants

    # here we get the physics quantities from ``desc.compute.utils._compute``
    data = compute_fun(
        "desc.equilibrium.equilibrium.Equilibrium",
        self._data_keys,                 # quantities we want
        params=params,                   # params from input containing the equilibrium R_lmn, Z_lmn, etc
        transforms=self._transforms,     # transforms and profiles from self.build
        profiles=self._profiles,
    )
    # next we do any additional processing, such as combining things,
    # averaging, etc. Here we just return the QS triple product f_T evaluated at each
    # node in the grid.
    f = data["f_T"]
    # this is all we need to do here. Applying objective weights/targets/bounds
    # is handled by the base _Objective class, as well as the normalizations to be unitless
    # and to make the objective value independent of grid resolution.
    return f

An example that is slightly more complex is shown below for computing the mirror ratio on each flux surface in the passed-in grid for an Equilibrium. (Some of the redundant comments from above are not repeated here)

from desc.objectives.objective_funs import _Objective
from desc.compute import get_profiles, get_transforms
from desc.compute.utils import _compute as compute_fun
from desc.grid import LinearGrid
from desc.integrals.surface_integral import surface_max, surface_min

class MirrorRatio(_Objective):
    """Target a particular value mirror ratio.

    The mirror ratio is defined as:

    (Bₘₐₓ - Bₘᵢₙ) / (Bₘₐₓ + Bₘᵢₙ)

    Where Bₘₐₓ and Bₘᵢₙ are the maximum and minimum values of ||B|| on a given surface.
    Returns one value for each surface in ``grid``.

    Parameters
    ----------
    eq : Equilibrium or OmnigenousField
        Equilibrium or OmnigenousField that will be optimized to satisfy the Objective.
    grid : Grid, optional
        Collocation grid containing the nodes to evaluate at. Defaults to
        ``LinearGrid(M=eq.M_grid, N=eq.N_grid)`` for ``Equilibrium``
        or ``LinearGrid(theta=2*eq.M_B, N=2*eq.N_x)`` for ``OmnigenousField``.

    """

    __doc__ = __doc__.rstrip() + collect_docs(
        target_default="``target=0.2``.",
        bounds_default="``target=0.2``.",
    )

    _coordinates = "r"  # Because the mirror ratio is a function of flux surface (rho) alone, we set
                        # _coordinates="r"
    _units = "(dimensionless)"
    _print_value_fmt = "Mirror ratio: "
    _static_attrs = _Objective._static_attrs + [] # list of strings of attribute names that should be considered static by jax, eg strings and booleans or anything used for control flow.

    def __init__(
        self,
        eq,
        *, # this just means all kwargs after this must be passed as kwargs, not as positional arguments
        grid=None,
        target=None,
        bounds=None,
        weight=1,
        normalize=True,
        normalize_target=True,
        loss_function=None,
        deriv_mode="auto",
        name="mirror ratio",
        jac_chunk_size=None,
    ):
        if target is None and bounds is None:
            target = 0.2 # default target value
        self._grid = grid
        super().__init__(
            things=eq,
            target=target,
            bounds=bounds,
            weight=weight,
            normalize=normalize,
            normalize_target=normalize_target,
            loss_function=loss_function,
            deriv_mode=deriv_mode,
            name=name,
            jac_chunk_size=jac_chunk_size,
        )

    def build(self, use_jit=True, verbose=1):
        """Build constant arrays.

        Parameters
        ----------
        use_jit : bool, optional
            Whether to just-in-time compile the objective and derivatives.
        verbose : int, optional
            Level of output.

        """
        eq = self.things[0]
        from desc.equilibrium import Equilibrium
        from desc.magnetic_fields import OmnigenousField

        # set defaults if grid is not passed in
        # Note that the grid has resolution in all three coordinates because
        # in order to compute mirror ratio (a flux surface quantity), we require
        # computation of |B| across the entire volume (i.e. poloidally and toroidally
        # on each flux surface) so that we can take the necessary min/maxes.
        if self._grid is None and isinstance(eq, Equilibrium):
            # default grid here only has rho=1.0 so a single flux surface, but the objective
            # is written generally for arbitrary number of surfaces
            grid = LinearGrid(
                M=eq.M_grid,
                N=eq.N_grid,
                NFP=eq.NFP,
                sym=eq.sym,
            )
        elif self._grid is None and isinstance(eq, OmnigenousField):
            # we have a different default grid when an OmnigenousField is the
            # object being optimized
            grid = LinearGrid(
                theta=2 * eq.M_B,
                N=2 * eq.N_x,
                NFP=eq.NFP,
            )
        else:
            grid = self._grid

        # because the mirror ratio is a flux-surface quantity, we will
        # in the end only be returning an array of size grid.num_rho, which is
        # the number of flux surfaces in our grid (i.e. the number of unique rho
        # values in the grid)
        self._dim_f = grid.num_rho

        # we will only need "|B|" to compute this quantity. For this example objective
        # we compute mirror ratio manually in the objective compute method, but one
        # may look at the List of Variables docs and see that you could also compute
        # "mirror ratio" as a key directly, but for the sake of demonstrating
        # functionality we compute it manually here
        self._data_keys = ["|B|"]

        timer = Timer()
        if verbose > 0:
            print("Precomputing transforms")
        timer.start("Precomputing transforms")

        profiles = get_profiles(self._data_keys, obj=eq, grid=grid)
        transforms = get_transforms(self._data_keys, obj=eq, grid=grid)
        self._constants = {
            "transforms": transforms,
            "profiles": profiles,
        }

        timer.stop("Precomputing transforms")
        if verbose > 1:
            timer.disp("Precomputing transforms")

        super().build(use_jit=use_jit, verbose=verbose)

    def compute(self, params, constants=None):
        """Compute mirror ratio.

        Parameters
        ----------
        params : dict
            Dictionary of equilibrium or field degrees of freedom,
            eg Equilibrium.params_dict
        constants : dict
            Dictionary of constant data, eg transforms, profiles etc. Defaults to
            self.constants

        Returns
        -------
        M : ndarray
            Mirror ratio on each surface.

        """
        if constants is None:
            constants = self.constants

        # we use this compute_fun to compute quantities inside of our
        # objective functions, as opposed to using `eq.compute`, because
        # we jit our objective compute functions and some logic inside of
        # `eq.compute` does not work under jit.
        data = compute_fun(
            # parameterization is the object (or type of object) that we are computing quantities with
            # i.e. ``desc.equilibrium.equilibrium.Equilibrium`` or just the Equilibrium object directly
            parameterization=self.things[0],
            # names is the names of the data (see List of Variables doc page) we want to compute
            names=self._data_keys,
            # params is the dict of the DOFs of the parameterization, and are used to compute these
            # quantities, i.e. for Equilibrium this contains R_lmn, Z_lmn, etc.
            params=params,
            # transforms and profiles pre-computed from the build method
            transforms=constants["transforms"],
            profiles=constants["profiles"],
        )
        # now data is a dictionary containing the key "|B|", the magnetic field evaluated at our grid
        # which is accessible through constants["transforms"]["grid"]


        # compute max and min of |B| on each flux surface in the grid
        ## we have utility functions which, given a quantity computed on a grid,
        ## return the max or min of that quantity on each coordinate surface
        max_tz_B = surface_max(
            grid=constants["transforms"]["grid"], x=data["|B|"], surface_label="rho"
        )  # also can be computed directly as "max_tz |B|" in our list of variables
        min_tz_B = surface_min(
            grid=constants["transforms"]["grid"], x=data["|B|"], surface_label="rho"
        )  # also can be computed directly as "min_tz |B|" in our list of variables

        # to avoid issues with array shapes, these two arrays (max_tz_B and min_tz_B)
        # are still the same shape as data["|B|"] i.e. still are 1-D arrays of length grid.num_nodes,
        # (i.e. there are data corresponding to nodes (rho, theta, zeta) = (1.0, 0, 0) and (1.0, pi, 0),
        # which have the same max_tz_B). This is useful if we, for instance, wanted to  multiply
        # a flux-surface quantity like max_tz_B with a non-flux-surface quantity like sqrt(g)
        # without needing to worry about shape mismatches.

        # However, since we only need these quantities on each flux surface from
        # here on out, we can use the grid.compress function to reduce these quantities
        # down to just the values at each unique rho surface
        max_tz_B = constants["transforms"]["grid"].compress(
            max_tz_B, surface_label="rho"
        )
        min_tz_B = constants["transforms"]["grid"].compress(
            min_tz_B, surface_label="rho"
        )
        # now max_tz_B and min_tz_B are just 1-D arrays of size grid.num_rho

        # Finally, compute the mirror ratio using the above max/min on each flux surface
        mirror_ratio = (max_tz_B - min_tz_B) / (min_tz_B + max_tz_B)

        return mirror_ratio # return the value of the objective

Converting to Cartesian coordinates

The above examples of quasi-symmetry and mirror ratio are quantities that are independent of the coordinate system. desc.compute.utils._compute always returns all vector quantities in toroidal coordinates \((R,\phi,Z)\). If you would prefer to work in Cartesian coordinates \((X,Y,Z)\) for any intermediate computations within your new objective, you will have to manually convert these vectors using the geometry utility functions rpz2xyz and/or rpz2xyz_vec. See the PlasmaVesselDistance objective for an example of this.

Adapting Existing Objectives with Different Loss Functions

If your desired objective is already implemented in DESC, but not in the correct form, a few different loss functions are available through the loss_function kwarg when instantiating an Objective, to modify the objective cost in order to adapt the objective to your desired purpose. For example, the DESC RotationalTransform objective with target=iota_target by default forms the residual by taking the target and subtracting it from the profile at the points in the grid, resulting in a residual of the form \(\iota_{err} = \sum_{i} (\iota_i - iota_{target})^2\), i.e. the residual is the sum of squared pointwise error between the current rotational transform profile and the target passed into the objective. If the desired objective instead is to optimize to target an average rotational transform of iota_target, we can adapt the RotationalTransform object by passing in loss_function="mean". The options available for the loss_function kwarg are [None,"mean","min","max","sum"], with None meaning using the usual default objective cost, while "mean" takes the average of the raw objective values (before subtracting the target/bounds or normalization), "min" takes the minimum, "max" takes the maximum, and "sum" takes the sum.