Jun 23, 2014

Zeros of the Laguerre Polynomial

The zeros of the Laguerre polynomials $L_n(x)$ are useful in the Gauss-Laguerre quadrature: \[
\int_0^\infty \exp(-x) \; f(x) = \sum_{k=1}^n w_k f(x_k) + \frac{n!^2}{(2n)!} f^{(2n)}(\xi)
\] where $x_k$ is the $k$-th root of the Laguerre polynomial, $0 < x_1 < x_2 < \cdots < x_n$ and \[
w_k = \frac{x_k}{(n+1)^2 \left[ L_{n+1}(x) \right]^2}.
\] The figure below is a illustration.

Fig. 1, A illustration of the Gauss-Laguerre quadrature. The widths of the boxes equal the quadrature weights. The area of the boxes approximates the area of the function.

The general strategy to compute the zeros of a function numerically is to get a decent analytical estimation and then improve it with Newton iteration or similar methods.

The Laguerre polynomial $L_n(x)$ has $n$ real non-degenerate zeros. These zeros lies within $\frac{1.20241}{2n+1} < x_1 < \cdots < x_n < 2n-2+\sqrt{1+4(n-1)^2 \cos^2\frac{\pi}{n+1}}$ according to Ismail and Li (1992) [1]. Individual zero can also be bounded $ \frac{j^2_{0,k}}{4n+2} < x_k < \frac{2k+1}{2n+1}(2k+1+\sqrt{(2k+1)^2+\frac{1}{4}})$, where $j_{n,m}$ is the $m$-th zero of the Bessel function $J_n(x)$ Abramowitz and Stegun 22.16 [2]. \[
j_{n,m} = \theta_{n,m} - \frac{4n^2-1}{8\theta_{n,m}} - \frac{4(4n^2-1)(28n^2-31)}{3(8\theta_{n,m})^3} - \frac{32(4n^2-1)(1328n^4-3928n^2+3779)}{15(8\theta_{n,m})^5} - \mathcal O(\theta_{n,m}^7)
\] where $\theta_{n,m} = (m+\frac{n}{2}-\frac{1}{4})\pi$. Finally, a fairly decent estimation [3], \[
x_k = \frac{j_{0,k}^2}{4n+2}\left( 1 + \frac{j^2_{0,k}-2}{12(2n+1)^2} \right) + \mathcal O(1/n^5).
\] Numerical recipe provides another (poor but somewhat useful) estimation [4] \[
x_1 = \frac{3}{1+2.4 n}, x_2 = x_1 + \frac{15}{1+0.9 n}, x_k = x_{k-1} + \frac{1+2.55(k-2)}{1.9 (k-2)}.
Fig. 2, Estimating the zeros of $L_{16}(x)$ using various results.

Once we obtain the initial estimations, we can use adaptive method to update them. There are several methods available in the market [5]. Newton-Raphson iteration is the most popular one. Secant method, Halley's method and Schroeder's method are its close cousins. Given the lower and upper bounds, the binary search (or bisection method) is also useful. Brent's method is a combination of the secant method and binary search. Laguerre's method is also a valuable method designed particularly for polynomials. Jenkins-Traub algorithm is a more sophisticated method for polynomials.

Table: root-finding algorithms used by some popular softwares. (Taken from Mekwi 2001 [5])

In this particular problem, we want to obtain the zeros of the Laguerre polynomial. The Newton procedure is \[
x \to x - \frac{x L_n(x)}{n(L_n(x) - L_{n-1}(x))}.
\] Laguerre polynomial oscillates very rapidly at large $n$ and $x$.  This causes a severe overshoot problem in the iteration process at large $k$ where the estimation formula works poorly. The problem can be partly cured by including a damping function $\rho(x) = \exp{-x/2}$. Then the Newton iteration becomes \[
x \to x - \frac{x L_n(x)}{n(L_n(x) - L_{n-1}(x)) - \frac{1}{2}x L_n(x)}.
\] In general, one can tune the iteration procedure with relaxation parameters: \[
x \to x - \omega \frac{x L_n(x)}{n(L_n(x) - L_{n-1}(x)) - \lambda x L_n(x)}.
\] However, the problem persists at large $n$. This problem is not limited to the Newton method. The fundamental issue is that the estimation formula becomes less accurate at large $k$. It is more likely to start from the wrong "valley" or hit small $L'_n(x)$. Both cases result in converging to the wrong zero.

Fig. 3a, Newton iteration.
Fig. 3b,  Newton iteration improved by relaxation parameters.

One simple solution of this problem is to lower the order of the polynomial as more and more zeros are found. Suppose we have already got the first $m, (m<n)$ zeros. Let $L_n(x) = (x-x_1)\cdots(x-x_m)p_m(x)$. Then the rest of the zeros of $L_n(x)$ contains in $p_m(x)$. $p_m$ is a lower order polynomial and behaves much smoother than $L_n$. We shall use $p_m$ to do the Newton iteration instead of $L_n$. The iteration procedure becomes \[
 x \to x - \frac{x L_n(x)}{n(L_n(x) - L_{n-1}(x)) -  L_n(x)\sum_{i=1}^{m}\frac{x}{x-x_i} }
\] Or even, \[
 x \to x - \frac{x L_n(x)}{n(L_n(x) - L_{n-1}(x)) -  L_n(x)\sum_{i=1,i\ne m}^{n}\frac{x}{x-x_i} }
\] This method can also be tuned with relaxation parameters: \[
 x \to x - \frac{x L_n(x)}{n(L_n(x) - L_{n-1}(x)) -  L_n(x)\left(\lambda x + \mu \sum_{i=1,i\ne m}^{n}\frac{x}{x-x_i}\right) }

This method is in fact share the same spirit (and of course weakness) with the Bairstow's method, which works with generic polynomial thus divides the main polynomial by $x^2 + ax + b$. This method is essentially the Aberth-Ehrlich's method [6]. (Of course it has been known before!)

Note that we always express everything (derivatives) in terms of the Laguerre polynomials, which are evaluated from the recurrence relations: \[
L_0(x) = 1, \; L_1(x) = 1-x, \quad L_{n+1}(x) = \frac{2n+1-x}{n+1}L_n(x) - \frac{n}{n+1} L_{n-1}(x)
\] To incorporate the damping factor $\exp(-x/2)$, we can add it to $L_0 \to \exp(-x/2)$ and $L_1 \to (1-x) \exp(-x/2)$. Alternatively, one can compute the ratio $L_n(x)/L'_n(x) = \frac{x}{n(1-r_n)}$, $r_n \equiv \frac{L_{n-1}(x)}{L_n(x)} $ directly: \[
r_1 = \frac{1}{1-x}, \; r_{n}^{-1}(x) = \frac{2n-1-x}{n} - \frac{n-1}{n} r_{n-1}
\] which can be written as a (finite) continued fraction: \[
n r_n(x) = \frac{n^2}{2n-1-x-}\frac{(n-1)^2}{2n-3-x-}\cdots \frac{1}{1-x}, \; \frac{L_n(x)}{L'_n(x)} =\frac{x}{n-n r_n(x)}

Below are some applications of this method comparing with the zeros obtained from Mathematica's NSolve function. This method shows more stability at large $n$ and $k$.


[1] M.E.H. Ismail, X. Li, Bounds on the extreme zeros of orthogonal polynomials, Proc. Amer. Math. Soc., 115 (1992), pp. 131–140.

[2]  M. Abramowitz and I. Stegun, eds. (1972), Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, Dover Publications, 22.16.

[3] M. Abramowitz and I. Stegun, eds. (1972), Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, Dover Publications, 9.5.12.

[4] W. H. Press, S. A. Teukolsky, W. T. Wetterling and B. P. Flannery, Numerical Recipes: The Art of Scientific Computing (2007), 3rd Edition, Cambridge University Press.

[5] Wankere R. Mekwi, Iterative Methods for Roots of Polynomials, MSc thesis (2001), University of Oxford, Trinity.

[6] O. Aberth,  Iteration methods for finding all zeros of a polynomial simultaneously, Math. Comp. 27, pp. 339-344 (1973); L. W. Ehrlich, A modified Newton method for polynomials, Comm. ACM 10, pp. 107-108, (1967).

Jun 20, 2014

Numerical Differentiation

Let $f(x)$ be a smooth function. We want to take its derivative at some point $x$, numerically. In this post, we assume the function is sampled on an equal length mesh. $h$ is the step, $0 \le h \ll 1$. We consider the leading order and higher order methods. Here by order we mean the order of residual error, $\mathcal O(h^n)$.

Leading Order Finite Differences

There are three standard way: the forward difference, the backward difference and the central (or symmetric) difference. They are defined as following: \[
f'(x) \simeq \frac{f(x+h)-f(x)}{h} + \mathcal O (h), \\
f'(x) \simeq \frac{f(x)-f(x-h)}{h} + \mathcal O (h), \\
f'(x) \simeq \frac{f(x+h)-f(x-h)}{2h} + \mathcal O (h^2)
\] The numerical accuracy can be seen from the Taylor expansion: \[
f(x+h) = f(x) + h f'(x) + \frac{h^2}{2}f''(x) + \frac{h^3}{3!}f'''(x) + ... \\
f(x-h) = f(x) - h f'(x) + \frac{h^2}{2}f''(x) - \frac{h^3}{3!}f'''(x) + ....
\] This method, used recursively, also produces higher order numerical differentiation. Central difference for example: \[
f^{(n)}(x) \simeq \frac{f^{(n-1)}(x+h) - f^{(n-1)}(x-h)}{2h}
\] Therefore [theorem 1], \[
f^{(n)}(x) = \frac{1}{h^n}\sum_{k = 0}^n {n \choose k} (-1)^k f(x+(n/2-k)h) + \frac{n h^2}{4!}f^{(n+2)}(\xi);
\] The accuracy can be proven by substituting in the Taylor expansion. Just note (Stirling number cf. The Stirling number S(n,k) is the number of ways to partition a set of n objects into k groups.) \[
S(m,n) \equiv \frac{1}{n!}\sum_{k=0}^n {n \choose k} (-1)^{n-k} k^m = \left\{ \begin{array}{ll} 0, & m<n \\ 1, &m=n \\  n(n+1)/2, &m=n+1\\ S(n-1,m-1) + m S(n-1,m), & \\ \end{array}\right. \] When $n = 1 \bmod 2$, the meshes are on half integers. One can use $h \to 2h$ to recover the integral grid. The series sometimes is defined as an operator known as the central difference operator (Abramowitz and Stegun, $\S$25.1, "Differences"): \[
\delta_n^{2k} f = \sum_{j=0}^{2k} (-1)^j { 2k \choose j} f_{n+k-j}, \\
\delta_{n+\frac{1}{2}}^{2k+1} f = \sum_{j=0}^{2k+1} (-1)^j { 2k+1 \choose j} f_{n+k+1-j}.
\] Then our theorem 1 can be written as: $ \frac{d^n}{d x^n} f(x) = \frac{\delta_0^{n}}{\Delta x^n}  f(x) + \mathcal O(\Delta x^2)$.

High Order Finite Differences

In the previous section, we have discussed approximate the differentiation numerically up to leading order $\mathcal O(h)$. The higher order numerical differentiations are rarely used but nevertheless helpful to know. Inspired by the central difference series, in general, we can approximate the $n$-th order derivative by series \[
f^{(n)}(x) = \sum_k c_k^{(n,r)} f(x + k h) + \mathcal O(h^r)
\] Obviously, the larger $n$ and/or $r$ is, the more terms we need. We can use the Lagrange polynomial to approximate the function $f(x)$ up to $f^{(n+r)}(\xi)$ and then take the $n$-th order derivative of the Lagrange polynomial. Then \[
f^{(n)} (x) = \sum_k l_k^{(n)}(x) f(x+kh),
\] where $l_i(x) = \prod_{j\ne i} \frac{x-x_j}{x_i-x_j}$.

example: the finite-difference approximation of the first order derivative ($f_n = f(x+nh)/h$): \[
f'(x) \simeq  \sum_{k=1}^{n} \frac{(-1)^{k-1} (n)_k}{(n+1)^{(k)}}\frac{f(x+kh) - f(x-kh)}{kh},
\] where $(x)_k = x(x-1)\cdots(x-k+1)$ is the falling factorial and $x^{(k)} = x(x+1)\cdots(x+k-1)$ is the rising factorial.
second order derivatives ($f_n = f(x+nh)/h^2$): \[
f''(x) \simeq 2\sum_{k=1}^n \frac{(-1)^{k-1} (n)_k}{(n+1)^{(k)}} \frac{ f(x+kh)+f(x-kh) }{k^2h^2} -  \frac{2H_{n,2}}{ h^2} f(x),
\] where $H_{n,2} = \sum_{k=1}^n \frac{1}{k^2}$ is called harmonic number.
third order derivatives ($f_n = f(x+nh)/h^3$):

In Mathematica, function InterpolationPolynomial[] provides the Lagrange polynomial. The coefficients can be computed as following:
L[K_] := InterpolatingPolynomial[
  Table[{k, Subscript[f, k]}, {k, -K, K}], x]

fd[K_, n_] := Collect[#, Variables[#]] & @(D[L[K], {x, n}] /. x -> 0)