What Is the Complex Step Approximation?

In many situations we need to evaluate the derivative of a function but we do not have an explicit formula for the derivative. The complex step approximation approximates the derivative (and the function value itself) from a single function evaluation. The catch is that it involves complex arithmetic.

For an analytic function $f$ we have the Taylor expansion

$\notag \qquad\qquad\qquad\qquad f(x + \mathrm{i}h) = f(x) + \mathrm{i}h f'(x) - h^2\displaystyle\frac{f''(x)}{2} + O(h^3), \qquad\qquad\qquad\qquad(*)$

where $\mathrm{i} = \sqrt{-1}$ is the imaginary unit. Assume that $f$ maps the real line to the real line and that $x$ and $h$ are real. Then equating real and imaginary parts in $(*)$ gives $\mathrm{Re} f(x+\mathrm{i}h) = f(x) + O(h^2)$ and $\mathrm{Im} f(x+\mathrm{i}h) = hf'(x) + O(h^3)$ . This means that for small $h$ , the approximations

$\notag f(x) \approx \mathrm{Re} f(x+\mathrm{i}h), \quad f'(x) \approx \mathrm{Im} \displaystyle\frac{f(x+\mathrm{i}h)}{h}$

both have error $O(h^2)$ . So a single evaluation of $f$ at a complex argument gives, for small $h$ , a good approximation to $f'(x)$ , as well as a good approximation to $f(x)$ if we need it.

The usual way to approximate derivatives is with finite differences, for example by the forward difference approximation

$\notag f'(x) \approx \displaystyle\frac{f(x+h) - f(x)}{h}.$

This approximation has error $O(h)$ so it is less accurate than the complex step approximation for a given $h$ , but more importantly it is prone to numerical cancellation. For small $h$ , $f(x+h)$ and $f(x)$ agree to many significant digits and so in floating-point arithmetic the difference approximation suffers a loss of significant digits. Consequently, as $h$ decreases the error in the computed approximation eventually starts to increase. As numerical analysis textbooks explain, the optimal choice of $h$ that balances truncation error and rounding errors is approximately

$\notag h_{\mathrm{opt}} = 2\Bigl|\displaystyle\frac{u f(x)}{f''(x))} \Bigr|^{1/2},$

where $u$ is the unit roundoff. The optimal error is therefore of order $u^{1/2}$ .

A simple example illustrate these ideas. For the function $f(x) = \mathrm{e}^x$ with $x = 1$ , we plot in the figure below the relative error for the finite difference, in blue, and the relative error for the complex step approximation, in orange, for $h$ ranging from about $10^{-5}$ to $10^{-11}$ . The dotted lines show $u$ and $u^{1/2}$ . The computations are in double precision ( $u \approx 1.1\times 10^{-16}$ ). The finite difference error decreases with $h$ until it reaches about $h_{\mathrm{opt}} = 2.1\times 10^{-8}$ ; thereafter the error grows, giving the characteristic V-shaped error curve. The complex step error decreases steadily until it is of order $u$ for $h \approx u^{1/2}$ , and for each $h$ it is about the square of the finite difference error, as expected from the theory.

Remarkably, one can take $h$ extremely small in the complex step approximation (e.g., $h = 10^{-100}$ ) without any ill effects from roundoff.

The complex step approximation carries out a form of approximate automatic differentiation, with the variable $h$ functioning like a symbolic variable that propagates through the computations in the imaginary parts.

The complex step approximation applies to gradient vectors and it can be extended to matrix functions. If $f$ is analytic and maps real $n\times n$ matrices to real $n\times n$ matrices and $A$ and $E$ are real then (Al-Mohy and Higham, 2010)

$\notag L_f(A,E) \approx \mathrm{Im} \displaystyle\frac{f(A+\mathrm{i}hE)}{h},$

where $L_f(A,E)$ is the Fréchet derivative of $f$ at $A$ in the direction $E$ . It is important to note that the method used to evaluate $f$ must not itself use complex arithmetic (as methods based on the Schur decomposition do); if it does, then the interaction of those complex terms with the much smaller $\mathrm{i}hE$ term can lead to damaging subtractive cancellation.

The complex step approximation has also been extended to higher derivatives by using “different imaginary units” in different components (Lantoine et al., 2012).

Here are some applications where the complex step approximation has been used.

Sensitivity analysis in engineering applications (Giles et al., 2003).
Approximating gradients in deep learning (Goodfellow et al., 2016).
Approximating the exponential of an operator in option pricing (Ackerer and Filipović, 2019).

Software has been developed for automatically carrying out the complex step method—for example, by Shampine (2007).

The complex step approximation has been rediscovered many times. The earliest published appearance that we are aware of is in a paper by Squire and Trapp (1998), who acknowledge earlier work of Lyness and Moler on the use of complex variables to approximate derivatives.

References

This is a minimal set of references, which contain further useful references within.

Awad H. Al-Mohy and Nicholas J. Higham, The Complex Step Approximation to the Fréchet Derivative of a Matrix Function, Numer. Algorithms 53, 133–148, 2010.
Damien Ackerer and Damir Filipović, Option Pricing with Orthogonal Polynomial Expansions, Mathematical Finance 30, 47–84, 2019.
Michael B. Giles, Mihai C. Duta, Jens-Dominik Möuller, and Niles A. Pierce, Algorithm Developments for Discrete Adjoint Methods, AIAA Journal 4(2), 198–205, 2003.
Ian Goodfellow, Yoshua Bengio, and Aaron Courville, Deep Learning, MIT Press, 2016. Page 434.
Gregory Lantoine, Ryan P. Russell P., and Thierry Dargent, Using Multicomplex Variables for Automatic Computation of High-Order Derivatives, ACM Trans. Math. Software 38, 16:1–16:21, 2012.
L. F. Shampine, Accurate Numerical Derivatives in MATLAB, ACM Trans. Math. Software 33,26:1–26:17, 2007.
W. Squire and G. E. Trapp (1998), Using Complex Variables to Estimate Derivatives of Real Functions, SIAM Rev., 40(1), 110–112.

What Is the Complex Step Approximation? | Nick Higham

What Is the Complex Step Approximation?

References

Recommend

Michael J. D. Powell (1936–2015)

What Is a Modified Cholesky Factorization? | Nick Higham

Demystifying Dispatchers in Akka

Introducing Transparent Traits in Scala 3

How to Persist and Sharing Data in Docker

Loading JSON data into Snowflake

Data-Driven Approach to Your Cloud Migration Journey

A Quick Demo: Kafka to Flink to Cassandra

Exploring Eval Monad in Cats

Two-Way SMS Using Amazon Pinpoint

About Joyk