{ "cells": [ { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "# Perturbation Theory" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "In DESC, we employ perturbation techniques to find neighboring solutions. The basic ideas and outline of the relevant equations are given below. For full details and examples see our [paper](https://arxiv.org/abs/2203.15927)." ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "Suppose we have a vector valued function $f(x,c)$, where we can interpret $f$ as a measure of equilibrium, $x$ as the optimization vector and $c$ as some additional parameter.\n", "\n", "Further suppose that we have $f(x,c) = 0$, so that we are at an equilibrium. We can then ask \"if we change the parameter $c$, how does $x$ need to change to maintain equilibrium?\"\n", "\n", "We can expand $f$ in a Taylor series:\n", "\n", "$$\n", "\\begin{align*}\n", "f(x+\\Delta x, c+\\Delta c) &= f(x,c) + \\frac{df}{dx}\\Delta x + \\frac{df}{dc}\\Delta c \\\\\n", "&+ \\frac{1}{2}\\frac{d^2f}{dx^2}\\Delta x \\Delta x + \\frac{1}{2}\\frac{d^2f}{dc^2}\\Delta c \\Delta c + \\frac{d^2f}{dxdc} \\Delta x \\Delta c \\\\\n", "&+ \\frac{1}{6}\\frac{d^3f}{dx^3}\\Delta x \\Delta x \\Delta x + \\frac{1}{6}\\frac{d^3f}{dc^3}\\Delta c \\Delta c \\Delta c + \\frac{1}{2}\\frac{d^3f}{dx^2 dc}\\Delta x \\Delta x \\Delta c + \\frac{1}{2} \\frac{d^3f}{dxdc^2}\\Delta x \\Delta c \\Delta c + ...\n", "\\end{align*}\n", "$$\n" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "Where we assume that we know $\\Delta c$, the change in the parameters, and we want to find $\\Delta x$\n", "\n", "In general, this is a tensor polynomial equation for $\\Delta x$ which is difficult/impossible to solve (ie, just storing $\\frac{d^2f}{dx^2}$ and $\\frac{d^3f}{dx^3}$ in memory could take hundreds or thousands of GB). To get around this, we can further expand $\\Delta x$ and $\\Delta c$ in a perturbation series in powers of $\\epsilon$:\n", "\n", "$$\\Delta x = \\epsilon x_1 + \\epsilon^2 x_2 + \\epsilon^3 x_3$$\n", "\n", "$$\\Delta c = \\epsilon c_1$$" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "Plugging this expansion in we get:\n", "\n", "$$\n", "\\begin{align*}\n", "f(x+\\Delta x, c+\\Delta c) &= f(x,c) + \\frac{df}{dx}\\epsilon x_1 + \\frac{df}{dx}\\epsilon^2 x_2 + \\frac{df}{dx}\\epsilon^3 x_3 + \\frac{df}{dc}\\epsilon c_1 \\\\\n", "&+ \\frac{1}{2}\\frac{d^2f}{dx^2}(\\epsilon^2 x_1^2 + 2 \\epsilon^3 x_1 x_2 + 2 \\epsilon^4 x_1 x_3) + \\frac{1}{2}\\frac{d^2f}{dc^2} (\\epsilon^2 c_1^2) + \\frac{d^2f}{dxdc} (\\epsilon^2 x_1 c_1 + \\epsilon^3 x_2 c_1 + \\epsilon^4 x_3 c_1) \\\\\n", "&+ \\frac{1}{6}\\frac{d^3f}{dx^3}(\\epsilon^3 x_1^3 + 3 \\epsilon^4 x_1^2 x_2 + 3 \\epsilon^5 x_1^2 x_3) + \\frac{1}{6}\\frac{d^3f}{dc^3}\\epsilon^3 c_1^3 \\\\\n", "&+ \\frac{1}{2}\\frac{d^3f}{dx^2 dc}(\\epsilon^3 x_1^2 c_1 + 2 \\epsilon^4 x_1 x_2 c_1 + 2 \\epsilon^5 x_1 x_3 c_1) \\\\\n", "&+ \\frac{1}{2}\\frac{d^3f}{dxdc^2}(\\epsilon^3 x_1 c_1^2 + \\epsilon^4 x_2 c_1^2 + \\epsilon^5 x_3 c_1^2)\n", "\\end{align*}\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "Collecting terms of order $\\epsilon$ and setting equal to zero gives the first order term $x_1$:\n", "\n", "$$0 = \\frac{df}{dx}\\epsilon x_1 + \\frac{df}{dc}\\epsilon c_1$$\n", "\n", "$$x_1 = - \\left(\\frac{df}{dx}\\right)^{-1} \\left(\\frac{df}{dc}c_1 \\right)$$\n", "\n", "Where the inverse is meant in the least squares sense." ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "Similary for 2nd order, $\\epsilon^2$:\n", "\n", "$$0 = \\frac{df}{dx}\\epsilon^2 x_2 + \\frac{1}{2}\\frac{d^2f}{dx^2}\\epsilon^2 x_1^2 + \\frac{1}{2}\\frac{d^2f}{dc^2}\\epsilon^2 c_1^2 + \\frac{d^2f}{dxdc}\\epsilon^2 x_1 c_1$$\n", "\n", "$$x_2 = -\\left( \\frac{df}{dx} \\right)^{-1} \\left(\\frac{1}{2}\\frac{d^2f}{dx^2} x_1^2 + \\frac{1}{2}\\frac{d^2f}{dc^2} c_1^2 + \\frac{d^2f}{dxdc} x_1 c_1 \\right)$$" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "Order $\\epsilon^3$:\n", "\n", "$$\n", "0 = \\frac{df}{dx}\\epsilon^3 x_3 + \\frac{d^2f}{dx^2} \\epsilon^3 x_1 x_2 + \\frac{d^2f}{dxdc}\\epsilon^3 x_2 c_1 + \\frac{1}{6}\\frac{d^3f}{dx^3}\\epsilon^3 x_1^3 + \\frac{1}{6}\\frac{d^3f}{dc^3}\\epsilon^3 c_1^3 + \\frac{1}{2}\\frac{d^3f}{dx^2 dc}\\epsilon^3 x_1^2 c_1 + \\frac{1}{2}\\frac{d^2f}{dx dc^2}\\epsilon^3 x_1 c_1^2\n", "$$\n", "\n", "$$\n", "x_3 = - \\left( \\frac{df}{dx} \\right) ^{-1} \\left( \\frac{d^2f}{dx^2} x_1 x_2 + \\frac{d^2f}{dxdc}x_2 c_1 + \\frac{1}{6}\\frac{d^3f}{dx^3}x_1^3 + \\frac{1}{6}\\frac{d^3f}{dc^3} c_1^3 + \\frac{1}{2}\\frac{d^3f}{dx^2 dc} x_1^2 c_1 + \\frac{1}{2}\\frac{d^2f}{dx dc^2} x_1 c_1^2 \\right)\n", "$$" ] }, { "cell_type": "markdown", "metadata": { "pycharm": { "name": "#%% md\n" } }, "source": [ "The advantage of this approach is that the higher order derivatives only ever appear multiplied by known quantities, so those terms can be efficiently computed as jacobian vector products (effectively a directional derivative) without needing to calculate the full derivative tensors. Additionally, the only operator that needs to be inverted is $\\frac{df}{dx}$, which is the regular jacobian of the objective and is already computed and inverted during optimization." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" } }, "nbformat": 4, "nbformat_minor": 4 }