Transforms

DESC is a pseudo-spectral code: the dependent variables \(R\), \(Z\), and \(\lambda\), as well as parameters such as the plasma boundary and profiles, are represented by spectral basis functions. These representations are evaluated at a grid of collocation nodes in real space. See the section on basis functions for more information.

Representing the parameters as sums of spectral basis functions simplifies finding solutions to the relevant physics equations. This is analogous to how the Fourier transform reduces a complicated operation like differentiation in real space to multiplication by frequency in frequency space. A more relevant example is solving a partial differential equation (PDE) by expressing its solution as a linear combination of spectral basis functions, which transforms the PDE into a system of algebraic equations (nonlinear, in general) for the expansion coefficients. These equations are typically easier to solve numerically than the original PDE.
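The Fourier-differentiation analogy can be demonstrated in a few lines of NumPy (an illustrative sketch, not DESC code): on a periodic grid, differentiation in real space becomes multiplication by \(ik\) in frequency space.

```python
import numpy as np

# Spectral differentiation of f(zeta) = sin(2*zeta) on a periodic grid:
# in frequency space, d/dzeta is just multiplication by i*k.
n = 64
zeta = np.linspace(0, 2 * np.pi, n, endpoint=False)
f = np.sin(2 * zeta)

k = np.fft.fftfreq(n, d=1.0 / n)              # integer wavenumbers
df = np.fft.ifft(1j * k * np.fft.fft(f)).real

# df matches the exact derivative 2*cos(2*zeta) to machine precision
```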

Once the spectral coefficients of the relevant parameters (such as the plasma boundary) are known, these spectral representations must be transformed back to real space in order to evaluate and interpret their behavior there.

The Transform class provides methods to transform between spectral and real space. Each Transform object contains a spectral basis and a grid.

build() and transform(c)

The build() method builds the matrices for a particular grid which define the transformation from spectral to real space. This is done by evaluating the basis at each point of the grid. Generic examples of this type of transformation are the inverse Fourier transform and a change of basis matrix for finite dimensional vector spaces.

The transform(c) method applies the resulting matrix to a given vector \(\mathbf{c}\), which specifies the coefficients of the basis associated with this Transform object. This transforms the given vector of spectral coefficients to real space values.

The matrices are computed for each derivative order specified when the Transform object was constructed. The highest derivative order at which to compute the transforms is specified by an array of three integers (one for each coordinate in \(\rho, \theta, \zeta\)) given as the derivs argument.
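As a toy illustration of derivative-order matrices (plain NumPy, not DESC's actual data structures), consider the 1-D Fourier basis \(\{1, \cos\zeta\}\): the matrix for derivative order \((0,0,1)\) is built by evaluating the \(\zeta\)-derivative of each basis function at the nodes.

```python
import numpy as np

# Transform matrices for derivative orders (0,0,0) and (0,0,1) of the toy
# basis {1, cos(zeta)}: row i evaluates each basis function (or its
# zeta-derivative) at grid node zeta_i.
zeta = np.array([0.0, np.pi / 2, np.pi])

A_000 = np.column_stack([np.ones_like(zeta), np.cos(zeta)])    # values
A_001 = np.column_stack([np.zeros_like(zeta), -np.sin(zeta)])  # d/dzeta

c = np.array([2.0, 4.0])  # coefficients of f(zeta) = 2 + 4*cos(zeta)
print(A_000 @ c)          # f at the nodes
print(A_001 @ c)          # df/dzeta at the nodes
```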

Define the transform matrix for the derivative of order \((d\rho,d\theta,d\zeta)\) (where each entry is a non-negative integer) as \(A_{(d\rho,d\theta,d\zeta)}\). This matrix maps a given set of coefficients \(\mathbf{c}\) of a spectral basis, evaluated on a certain grid, to real space values \(\mathbf{x}\).

\[A\mathbf{c} = \mathbf{x}\]
  • \(\mathbf{c}\) is a vector of length Transform.basis.num_modes (the number of modes in the basis)

  • \(\mathbf{x}\) is a vector of length Transform.grid.num_nodes (the number of nodes in the grid)

  • \(A\) is a matrix of shape (num_nodes,num_modes).

As a simple example, if the basis is a Fourier series given by \(f(\zeta) = 2 + 4\cos(\zeta)\), and the grid is \(\mathbf{\zeta} =\begin{bmatrix}0\\ \pi\end{bmatrix}\), then

\[\begin{split}\mathbf{c}=\begin{bmatrix} 2\\ 4 \end{bmatrix}\end{split}\]
\[\begin{split}A_{(0, 0, 0)} = \begin{bmatrix} 1 & \cos(0)\\ 1 & \cos(\pi) \end{bmatrix} = \begin{bmatrix} 1 & 1\\ 1 & -1 \end{bmatrix}\end{split}\]
\[\begin{split}A_{(0, 0, 0)}\mathbf{c} = \begin{bmatrix} 1& 1\\ 1& -1 \end{bmatrix} \begin{bmatrix} 2\\ 4 \end{bmatrix} = \begin{bmatrix} 6 \\ -2 \end{bmatrix}\end{split}\]
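The worked example above can be checked directly with NumPy (a sketch, not DESC code):

```python
import numpy as np

# The basis {1, cos(zeta)} evaluated at the grid zeta = [0, pi] gives A,
# and A @ c reproduces the real space values from the example above.
zeta = np.array([0.0, np.pi])
A = np.column_stack([np.ones_like(zeta), np.cos(zeta)])  # [[1, 1], [1, -1]]
c = np.array([2.0, 4.0])
x = A @ c  # -> [6, -2]
```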

For FourierZernikeBasis, the construction of the matrix \(A\) is a bit more involved but the idea is the same. We write the summation in

\[R(\rho,\theta,\zeta) = \sum_{m=-M,n=-N,l=0}^{M,N,L} R_{lmn} \mathcal{Z}_l^m (\rho,\theta) \mathcal{F}^n(\zeta)\]

as a matrix vector multiply where each row of \(A\) is formed by evaluating \(\mathcal{Z}_l^m (\rho,\theta) \mathcal{F}^n(\zeta)\) for each mode \(lmn\) at a given collocation point with \((\rho, \theta, \zeta)\).

Improvement ideas: As of January 22, 2025, this implementation is not optimal. There are more efficient ways of computing the transform from spectral to real space values, for example:

  • using an FFT in both the toroidal and poloidal directions

  • using partial summation

For more details, check the GitHub issues labeled transforms.

build_pinv() and fit(x)

The build_pinv() method builds the matrix \(A^{\dagger}\) which defines the pseudo-inverse (Moore–Penrose inverse) transformation, where \(A^{\dagger}A = \mathbb{I}\) (note that \(AA^{\dagger}\neq\mathbb{I}\) in general, since the code computes \(A^{\dagger}\) as a left inverse of \(A\)).

In particular, this is a transformation from real space values to the coefficients of a spectral basis. Generic examples of this type of transformation are the Fourier transform and a change of basis matrix for finite dimensional vector spaces. Continuing from the forward transform above, what we want to achieve is

\[A \mathbf{c} = \mathbf{x}\]
\[A^{\dagger}A \mathbf{c} = A^{\dagger}\mathbf{x}\]
\[\mathbf{c} = A^{\dagger}\mathbf{x}\]

Any vector of values in real space can be represented as coefficients to some linear combination of a basis in spectral space. However, the basis of a particular Transform may not be able to exactly represent a given vector of real space values. In that case, the system \(A \mathbf{c} = \mathbf{x}\) would be inconsistent.

The fit(x) method applies \(A^{\dagger}\) to the vector \(\mathbf{x}\) of real space values. This yields the coefficients that best allow the basis of a Transform object to approximate \(\mathbf{x}\) in spectral space. The pseudo-inverse transform, \(A^{\dagger}\), applied to \(\mathbf{x}\) represents the least-squares solution for the unknown given by \(\mathbf{c}\) to the system \(A \mathbf{c} = \mathbf{x}\).
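A minimal NumPy sketch of this least-squares fit (with hypothetical values, not DESC code): the basis \(\{1, \cos\zeta\}\) cannot exactly represent the sampled values below, so the pseudo-inverse returns the best-fit coefficients.

```python
import numpy as np

# Basis {1, cos(zeta)} on a 3-node grid; x is chosen so that A c = x has
# no exact solution, and pinv(A) @ x gives the least-squares fit.
zeta = np.array([0.0, np.pi / 2, np.pi])
A = np.column_stack([np.ones_like(zeta), np.cos(zeta)])

x = np.array([6.0, 1.0, -2.0])  # not exactly representable in this basis
c = np.linalg.pinv(A) @ x       # least-squares coefficients

# the residual |A c - x| is minimal over all coefficient vectors
residual = np.linalg.norm(A @ c - x)
```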

It is required of the least-squares solution, \(A^{\dagger} \mathbf{x}\), that

\[A^{\dagger} \mathbf{x} = \underset{\mathbf{c}}{\arg\min} \, \lVert A \mathbf{c} - \mathbf{x} \rVert \; \text{so that} \; \lVert A A^{\dagger} \mathbf{x} - \mathbf{x}\rVert \; \text{is minimized}\]

For this to be true, \(A A^{\dagger}\) must be the orthogonal projection onto the image of the transformation \(A\). It follows that

\[A A^{\dagger} \mathbf{x} - \mathbf{x} \in (\text{image}(A))^{\perp} = \text{kernel}(A^T)\]
\[\begin{split}\begin{align*} A^T (A A^{\dagger} \mathbf{x} - \mathbf{x}) &= 0 \\ A^T A A^{\dagger} \mathbf{x} &= A^T \mathbf{x} \\ A^{\dagger} &= (A^T A)^{-1} A^{T} \quad \text{if} \; A^TA \; \text{is invertible} \end{align*}\end{split}\]

Equivalently, if \(A = U S V^{T}\) is the singular value decomposition of the transform matrix \(A\), then

\[A^{\dagger} = V S^{+} U^{T}\]

where the diagonal of \(S^{+}\) contains the reciprocals of the nonzero singular values on the diagonal of \(S\), with any zero singular values kept as \(0\). (If \(S\) is square and has no zero singular values, then \(S^{+}=S^{-1} \implies A^{\dagger}=A^{-1}\); \(A^{-1}\) exists because \(A^T A\) has no zero eigenvalues, its eigenvalues being the squared singular values of \(A\).)
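This SVD construction can be verified with NumPy (an illustrative sketch with a random full-rank matrix):

```python
import numpy as np

# Build A^dagger = V S^+ U^T from the SVD and check it is a left inverse.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))  # more nodes than modes, full column rank

U, s, Vt = np.linalg.svd(A, full_matrices=False)
s_plus = np.where(s > 1e-12, 1.0 / s, 0.0)  # invert only nonzero singular values
A_pinv = Vt.T @ np.diag(s_plus) @ U.T

# A_pinv matches np.linalg.pinv(A), and A_pinv @ A = I (left inverse)
```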

Transform build options

There are three different options from which the user can choose to build the transform matrix and its pseudoinverse.

Option 1: direct1

With this option, the transformation matrix is computed by directly evaluating the basis functions on the given grid. The computation of the pseudo-inverse matrix as discussed above is outsourced to JAX (the implementation is very similar to our own function desc.utils.svd_inv_null). This option can handle arbitrary grids and uses the full matrices for the transforms (though you can still choose to discard the less significant singular values in the singular value decomposition). This makes direct1 robust. However, no simplifying assumptions are made, so it is typically the slowest option.

The relevant code for this option builds the matrices exactly as discussed above.

To build the transform matrix for every combination of derivatives up to the given order:

for d in self.derivatives:
    self._matrices["direct1"][d[0]][d[1]][d[2]] = self.basis.evaluate(
        self.grid.nodes, d
    )

The transform(c) method for a specified derivative combination:

A = self.matrices["direct1"][dr][dt][dz]
return jnp.matmul(A, c)

To build the pseudo-inverse:

# A is the full transform matrix for the zeroth derivative order
self._matrices["pinv"] = (
    jnp.linalg.pinv(A, rcond=rcond) if A.size else np.zeros_like(A.T)
)

The fit(x) method:

Ainv = self.matrices["pinv"]
c = jnp.matmul(Ainv, x)

Option 2: direct2 and option 3: fft

Functions of the toroidal coordinate \(\zeta\) use a Fourier series for their basis.

So, a Discrete Fourier Transform (DFT) can be used to transform real space values to spectral space for the pseudo-inverse matrix. Both the direct2 and fft methods use this property. The difference is that direct2 creates a matrix \(A\) to compute the DFT by direct evaluation, whereas the fft method uses a Fast Fourier Transform (FFT), an efficient algorithm that reduces the number of operations from \(\mathcal{O}(n^2)\) to \(\mathcal{O}(n \log n)\), where \(n\) is the data size.
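The distinction can be illustrated in NumPy (a sketch, not DESC code): the same DFT computed once via an explicit matrix, as direct2 does, and once via the FFT.

```python
import numpy as np

# Explicit DFT matrix F[k, j] = exp(-2*pi*i*k*j/n) vs the FFT: identical
# results, but the matrix multiply costs O(n^2) while the FFT is O(n log n).
n = 8
x = np.arange(n, dtype=float)

k = np.arange(n)
F = np.exp(-2j * np.pi * np.outer(k, k) / n)  # direct evaluation (direct2-style)

dft = F @ x
fft = np.fft.fft(x)                           # fft-style
```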

The implementation first splits the summation so that the Zernike part and the Fourier part are separated:

\[R(\rho,\theta,\zeta) = \sum_{l=0, m=-M}^{L, M} \mathcal{Z}_l^m (\rho,\theta) \sum_{n=-N}^{N} R_{lmn} \mathcal{F}^n(\zeta)\]

The inner summation can be computed using a discrete Fourier transform (either the FFT or direct evaluation). For the outer summation over \(lm\), we again construct a matrix \(A\) by evaluating the Zernike polynomials at all collocation points, but this time only for the unique \(lm\) modes (whereas previously we were redundantly evaluating the same \(lm\) mode for multiple values of \(n\)). Taking the discrete Fourier transform requires some re-ordering and re-shaping. Once the transform is applied, the summation becomes

\[R(\rho,\theta,\zeta) = \sum_{l=0, m=-M}^{L, M} \mathcal{Z}_l^m (\rho,\theta) \, c_{lm}(\zeta)\]
\[R(\rho,\theta,\zeta) = A_{Zernike} \mathbf{c}_{fft}\]
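Partial summation can be illustrated with toy separable bases in NumPy (a sketch with random stand-in matrices, not DESC's actual Zernike/Fourier implementation): instead of one big matrix over all (node, mode) pairs, the sum over \(n\) and the sum over \(lm\) are done as two small matrix multiplies.

```python
import numpy as np

# Z stands in for the Zernike basis evaluated on (rho, theta) nodes, F for
# the Fourier basis on zeta nodes, and C for the coefficients c_{lm,n}.
rng = np.random.default_rng(0)
n_theta, n_zeta, n_lm, n_n = 6, 5, 4, 3
Z = rng.standard_normal((n_theta, n_lm))
F = rng.standard_normal((n_zeta, n_n))
C = rng.standard_normal((n_lm, n_n))

# Direct evaluation: one (num_nodes x num_modes) matrix acting on all modes.
x_direct = np.kron(Z, F) @ C.ravel()

# Partial summation: sum over n first, then over lm (two small matmuls).
x_partial = (Z @ C @ F.T).ravel()

# both orderings give the same real space values
```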

Future work

We could also use an FFT in the poloidal direction. This might require some tweaking of the grid points, especially for stellarator-symmetric grids.

There are many GitHub issues on transforms, and this notebook will be updated regularly.