Fractals/Mathematics/Newton method

applications of Newton's method in case of complex quadratic polynomialsEdit

Introduction[1][2]

recurrence relationsEdit

Define the iterated function f^n by:

\begin{align}
f^0 (z, c) &= z \\
f^1 (z, c) &= z^2 + c \\
f^{m+1} (z, c) &= f (f^m (z, c), c)
\end{align}

Pick arbitrary names for the iterated function and derivatives:

\begin{align}
A_m &= f^m \\
B_m &= \dfrac{\partial}{\partial z} f^m \\
C_m &= \dfrac{\partial}{\partial z} \dfrac{\partial}{\partial z} f^m \\
D_m &= \dfrac{\partial}{\partial c} f^m \\
E_m &= \dfrac{\partial}{\partial c} \dfrac{\partial}{\partial z} f^m
\end{align}

These derivatives can be computed by recurrence relations.

Initial conditions:

\begin{align}
A_0 &= z \\
B_0 &= 1 \\
C_0 &= 0 \\
D_0 &= 0 \\
E_0 &= 0
\end{align}

Recurrence steps:

\begin{align}
A_{m+1} &= A_m ^ 2 + c \\
B_{m+1} &= 2 A_m B_m \\
C_{m+1} &= 2 (B_m ^ 2 + A_m C_m) \\
D_{m+1} &= 2 A_m D_m + 1 \\
E_{m+1} &= 2 (A_m E_m + B_m D_m)
\end{align}

derivation of recurrence relationsEdit

Derivation of A_0:

\begin{align}
  & A_0 & \\
= & & \text{ ( definition of }A\text{ ) } \\
  & f^0 (z, c) & \\
= & & \text{ ( definition of }f\text{ ) } \\
  & z &
\end{align}

Derivation of B_0:

\begin{align}
  & B_0 & \\
= & & \text{ ( definition of }B\text{ ) } \\
  & \dfrac{\partial}{\partial z} f^0 (z, c) & \\
= & & \text{ ( definition of }f\text{ ) } \\
  & \dfrac{\partial}{\partial z} z & \\
= & & \text{ ( derivative ) } \\
  & 1 &
\end{align}

Derivation of C_0:

\begin{align}
  & C_0 & \\
= & & \text{ ( definition of }C\text{ ) } \\
  & \dfrac{\partial}{\partial z} \dfrac{\partial}{\partial z} f^0 (z, c) & \\
= & & \text{ ( definition of }f\text{ ) } \\
  & \dfrac{\partial}{\partial z} \dfrac{\partial}{\partial z} z & \\
= & & \text{ ( derivative ) } \\
  & \dfrac{\partial}{\partial z} 1 & \\
= & & \text{ ( derivative ) } \\
  & 0 &
\end{align}

Derivation of D_0:

\begin{align}
  & D_0 & \\
= & & \text{ ( definition of }D\text{ ) } \\
  & \dfrac{\partial}{\partial c} f^0 (z, c) & \\
= & & \text{ ( definition of }f\text{ ) } \\
  & \dfrac{\partial}{\partial c} z & \\
= & & \text{ ( derivative ) } \\
  & 0 &
\end{align}

Derivation of E_0:

\begin{align}
  & E_0 & \\
= & & \text{ ( definition of }E\text{ ) } \\
  & \dfrac{\partial}{\partial c} \dfrac{\partial}{\partial z} f^0 (z, c) & \\
= & & \text{ ( definition of }f\text{ ) } \\
  & \dfrac{\partial}{\partial c} \dfrac{\partial}{\partial z} z & \\
= & & \text{ ( derivative ) } \\
  & \dfrac{\partial}{\partial c} 1 & \\
= & & \text{ ( derivative ) } \\
  & 0 &
\end{align}

Derivation of A_{m+1}:

\begin{align}
  & A_{m+1} & \\
= & & \text{ ( definition of }A\text{ ) } \\
  & f^{m+1} (z, c) & \\
= & & \text{ ( definition of }f\text{ ) } \\
  & f (f^m (z, c), c) & \\
= & & \text{ ( definition of }A\text{ ) } \\
  & f (A_m , c) & \\
= & & \text{ ( definition of }f\text{ ) } \\
  & A_m ^ 2 + c &
\end{align}

Derivation of B_{m+1}:

\begin{align}
  & B_{m+1} & \\
= & & \text{ ( definition of }B\text{ ) } \\
  & \dfrac{\partial}{\partial z} A_{m+1} & \\
= & & \text{ ( definition of }A\text{ ) } \\
  & \dfrac{\partial}{\partial z} (A_m ^ 2 + c) & \\
= & & \text{ ( distributivity ) } \\
  & \dfrac{\partial}{\partial z} A_m ^ 2 + \dfrac{\partial}{\partial z} c & \\
= & & \text{ ( constant derivative ) } \\
  & \dfrac{\partial}{\partial z} A_m ^ 2 + 0 & \\
= & & \text{ ( zero ) } \\
  & \dfrac{\partial}{\partial z} A_m ^ 2 & \\
= & & \text{ ( chain rule ) } \\
  & 2 A_m (\dfrac{\partial}{\partial z} A_m) & \\
= & & \text{ ( definition of }B\text{ ) } \\
  & 2 A_m B_m &
\end{align}

Derivation of C_{m+1}:

\begin{align}
  & C_{m+1} & \\
= & & \text{ ( definition of }C\text{ ) } \\
  & \dfrac{\partial}{\partial z} B_{m+1} & \\
= & & \text{ ( definition of }B\text{ ) } \\
  & \dfrac{\partial}{\partial z} (2 A_m B_m) & \\
= & & \text{ ( linearity ) } \\
  & 2 \dfrac{\partial}{\partial z} (A_m B_m) & \\
= & & \text{ ( product rule ) } \\
  & 2 ( (\dfrac{\partial}{\partial z} A_m) B_m + A_m (\dfrac{\partial}{\partial z} B_m) ) & \\
= & & \text{ ( definition of }B\text{ ) } \\
  & 2 ( B_m B_m + A_m (\dfrac{\partial}{\partial z} B_m) ) & \\
= & & \text{ ( algebra ) } \\
  & 2 ( B_m ^ 2 + A_m (\dfrac{\partial}{\partial z} B_m) ) & \\
= & & \text{ ( definition of }C\text{ ) } \\
  & 2 ( B_m ^ 2 + A_m C_m ) &
\end{align}

Derivation of D_{m+1}:

\begin{align}
  & D_{m+1} & \\
= & & \text{ ( definition of }D\text{ ) } \\
  & \dfrac{\partial}{\partial c} A_{m+1} & \\
= & & \text{ ( definition of }A\text{ ) } \\
  & \dfrac{\partial}{\partial c} (A_m ^ 2 + c) & \\
= & & \text{ ( distributivity ) } \\
  & \dfrac{\partial}{\partial c} A_m ^ 2 + \dfrac{\partial}{\partial c} c & \\
= & & \text{ ( derivative ) } \\
  & \dfrac{\partial}{\partial c} A_m ^ 2 + 1 & \\
= & & \text{ ( chain rule ) } \\
  & 2 A_m (\dfrac{\partial}{\partial c} A_m) + 1 & \\
= & & \text{ ( definition of }D\text{ ) } \\
  & 2 A_m D_m + 1 &
\end{align}

Derivation of E_{m+1}:

\begin{align}
  & E_{m+1} & \\
= & & \text{ ( definition of }E\text{ ) } \\
  & \dfrac{\partial}{\partial c} B_{m+1} & \\
= & & \text{ ( definition of }B\text{ ) } \\
  & \dfrac{\partial}{\partial c} (2 A_m B_m) & \\
= & & \text{ ( linearity ) } \\
  & 2 \dfrac{\partial}{\partial c} (A_m B_m) & \\
= & & \text{ ( product rule ) } \\
  & 2 ( A_m (\dfrac{\partial}{\partial c} B_m) + (\dfrac{\partial}{\partial c} A_m) B_m ) & \\
= & & \text{ ( definition of }E\text{ ) } \\
  & 2 ( A_m E_m + (\dfrac{\partial}{\partial c} A_m) B_m ) & \\
= & & \text{ ( definition of }D\text{ ) } \\
  & 2 ( A_m E_m + D_m B_m ) &
\end{align}

centerEdit

A center n of period p satisfies f^p (0, n) = 0. Applying Newton's method in one variable:

\begin{align}
n_0 &= \text{ initial estimate } \\
n_{m+1} &= n_m - \frac{f^p(0, n_m)}{\dfrac{\partial }{\partial c}f^p(0, n_m)}
\end{align}

or equivalently:

\begin{align}
n_0 &= \text{ initial estimate } \\
n_{m+1} &= n_m - \frac{A_p (0, n_m)}{D_p (0, n_m)}
\end{align}

boundaryEdit

The boundary of the component with center n of period p can be parameterized by internal angles.


The boundary point c = b at angle t (measured in turns) satisfies system of 2 equations :


\begin{cases}
f^p(w, b) - w &= 0 \\
\dfrac{\partial }{\partial z}f^p(w, b) - e^{2 \pi i t} &= 0
\end{cases}

Defining a function of two complex variables:

\begin{align}
g \begin{pmatrix} z \\ c \end{pmatrix} &= \begin{pmatrix} f^p(z, c) - z \\ \dfrac{\partial }{\partial z}f^p(z, c) - e^{2 \pi i t} \end{pmatrix}
\end{align}

and applying Newton's method in two variables:

\begin{align}
\boldsymbol{v}_0 = \begin{pmatrix} n \\ n \end{pmatrix} \\
J_g(\boldsymbol{v}_m) (\boldsymbol{v}_{m+1} - \boldsymbol{v}_m) = -g(\boldsymbol{v}_m)
\end{align}

where

\begin{align}
J_g = \begin{pmatrix}
\dfrac{\partial}{\partial z} (f^p(z, c) - z) & 
\dfrac{\partial}{\partial c} (f^p(z, c) - z) \\
\dfrac{\partial}{\partial z} (\dfrac{\partial }{\partial z}f^p(z, c) - e^{2 \pi i t}) &
\dfrac{\partial}{\partial c} (\dfrac{\partial }{\partial z}f^p(z, c) - e^{2 \pi i t})
\end{pmatrix}
\end{align}

This can be expressed using the recurrence relations as:

\begin{align}
\begin{pmatrix} w_0 \\ b_0 \end{pmatrix} &= \begin{pmatrix} n \\ n \end{pmatrix} \\
\begin{pmatrix} B_p - 1 & D_p \\ C_p & E_p \end{pmatrix} \begin{pmatrix} w_{m+1} - w_m \\ b_{m+1} - b_m \end{pmatrix} &= - \begin{pmatrix} A_p - w_m \\ B_p - e^{2 \pi i t} \end{pmatrix}
\end{align}

where \{A,B,C,D,E\}_p are evaluated at (w_m, b_m).

ExampleEdit

"The process is for numerically finding one boundary point on one component at a time - you can't solve for the whole boundary of all the components at once. So pick and fix one particular nucleus n and one particular internal angle t " (Claude Heiland-Allen [3])

Lets try find point of boundary ( bond point )

c = b

of hyperbolic component of Mandelbrot set for :

  • period p=3
  • angle t=0
  • center ( nucleus ) c = n = -0.12256116687665+0.74486176661974i

There is only one such point.

Initial estimates (first guess) will be center ( nucleus ) c = n of this components. This center ( complex number) will be saved in a vector of complex numbers containing two copies of that nucleus:

\begin{align}
\boldsymbol{v}_0 = 
\begin{pmatrix} 
n\\
n
 \end{pmatrix}
=
\begin{pmatrix} 
-0.12256116687665+0.74486176661974i\\
-0.12256116687665+0.74486176661974i 
 \end{pmatrix}

\end{align}



The boundary point at angle t (measured in turns) satisfies system of 2 equations :


\begin{cases}
f^3(z, c) - z &= 0 \\
\dfrac{\partial }{\partial z}f^3(z, c) - e^{2 \pi i 0} &= 0
\end{cases}

so :


\begin{cases}
z^8 +4cz^6 +6c^2z^4+2cz^4 +4c^3z^2 +4c^2z^2 -z+c^4 +2c^3 +c^2+c = 0 \\
8z^7 +24cz^5 +24c^2z^3 +8cz^3 +8c^3z +8c^2z-1 = 0
\end{cases}

Then function g

\begin{align}
g(\boldsymbol{v}) = \begin{cases}
z^8 +4cz^6 +6c^2z^4+2cz^4 +4c^3z^2 +4c^2z^2 -z+c^4 +2c^3 +c^2+c  \\
8z^7 +24cz^5 +24c^2z^3 +8cz^3 +8c^3z +8c^2z-1 
\end{cases}
\end{align}


Jacobian matrix J is


J_g(\boldsymbol{v}) = 
\begin{pmatrix}
8z^7+24cz^5+24c^2z^3+8cz^3+8c^3z+8c^2z-1 & 4z^6+12cz^4+2z^4+12c^2z^2+8cz^2+4c^3+6c^2+2c+1\\
56z^6+120cz^4+72c^2z^2+24cz^2+8c^3+8c^2 & 24z^5+48cz^3+8z^3+24c^2z+16cz
\end{pmatrix}


J^{-1}_g(\boldsymbol{v}) = 
\begin{pmatrix}
\frac{24z^5+48cz^3+8z^3+24c^2z+16cz}{d}&
\frac{-4*z^6-12*c*z^4-2*z^4-12*c^2*z^2-8*c*z^2-4*c^3-6*c^2-2*c-1}{d}\\
\frac{-56*z^6-120*c*z^4-72*c^2*z^2-24*c*z^2-8*c^3-8*c^2}{d}&
\frac{8*z^7+24*c*z^5+24*c^2*z^3+8*c*z^3+8*c^3*z+8*c^2*z-1}{d}
\end{pmatrix}


where denominator d is :

d = (24z^5+48cz^3+8z^3+24c^2z+16cz)(8z^7+24cz^5+24c^2z^3+8cz^3+8c^3z+8c^2z-1)+(-56z^6-120cz^4-72c^2z^2-24cz^2-8c^3-8c^2)(4z^6+12cz^4+2z^4+12c^2z^2+8cz^2+4c^3+6c^2+2c+1)


Then find better aproximation of point c=b using iteration :


 \boldsymbol{v}_{m+1} = \frac{ \boldsymbol{v}_m -g(\boldsymbol{v}_m)}{J_g(\boldsymbol{v}_m)} = (\boldsymbol{v}_m -g(\boldsymbol{v}_m) ) J^{-1}_g(\boldsymbol{v}_m)

using  \boldsymbol{v}_0 as an starting point :

 \boldsymbol{v}_0

 \boldsymbol{v}_1 = (\boldsymbol{v}_0 -g(\boldsymbol{v}_0) ) J^{-1}_g(\boldsymbol{v}_0)

 \boldsymbol{v}_2 = (\boldsymbol{v}_1 -g(\boldsymbol{v}_1) ) J^{-1}_g(\boldsymbol{v}_1)

 ...

 \boldsymbol{v}_{m+1} = (\boldsymbol{v}_m -g(\boldsymbol{v}_m) ) J^{-1}_g(\boldsymbol{v}_m)

Maxima CAS codeEdit

Boundaries of Components of Mandelbrot set by Newton method. Image and Maxima CAS code

In Maxima CAS one can use Newton method for solving systems of multiple nonlinear functions. It is implemented in mnewton function. See directory :

/Maxima..../share/contrib/mnewton.mac

which uses linsolve_by_lu defined in :

 /share/linearalgebra/lu.lisp

See this image for more code.

Cpp codeEdit

/*
cpp code by Claude Heiland-Allen 
http://mathr.co.uk/blog/
licensed GPLv3+
 
 
g++ -std=c++98 -Wall -pedantic -Wextra -O3 -o bonds bonds.cc
 
./bonds period nucleus_re nucleus_im internal_angle
./bonds 1 0 0 0
./bonds 1 0 0 0.5
./bonds 1 0 0 0.333333333333333
./bonds 2 -1 0 0
./bonds 2 -1 0 0.5
./bonds 2 -1 0 0.333333333333333
./bonds 3 -0.12256116687665 0.74486176661974 0
./bonds 3 -0.12256116687665 0.74486176661974 0.5
./bonds 3 -0.12256116687665 0.74486176661974 0.333333333333333
 
for b in $(seq -w 0 999)
do
  ./bonds 1 0 0 0.$b 2>/dev/null
done > period-1.txt
gnuplot
plot "./period-1.txt" with lines
 
 
-------------
for b in $(seq -w 0 999)
do
  ./bonds 3 -0.12256116687665 0.74486176661974 0.$b 2>/dev/null
done > period-3a.txt
 
gnuplot
plot "./period-3a.txt" with lines
*/
#include <complex>
#include <stdio.h>
#include <stdlib.h>
 
typedef std::complex<double> complex;
 
const double pi = 3.141592653589793;
const double eps = 1.0e-12;
 
struct param {
  int period;
  complex nucleus, bond;
};
 
struct recurrence {
  complex A, B, C, D, E;
};
 
void recurrence_init(recurrence &r, const complex &z) {
  r.A = z;
  r.B = 1;
  r.C = 0;
  r.D = 0;
  r.E = 0;
}
 
/*
void recurrence_step(recurrence &rr, const recurrence &r, const complex &c) {
  rr.A = r.A * r.A + c;
  rr.B = 2.0 * r.A * r.B;
  rr.C = 2.0 * (r.B * r.B + r.A * r.C);
  rr.D = 2.0 * r.A * r.D + 1.0;
  rr.E = 2.0 * (r.A * r.E + r.B * r.D);
}
*/
 
void recurrence_step_inplace(recurrence &r, const complex &c) {
  // this works because no component of r is read after it is written
  r.E = 2.0 * (r.A * r.E + r.B * r.D);
  r.D = 2.0 * r.A * r.D + 1.0;
  r.C = 2.0 * (r.B * r.B + r.A * r.C);
  r.B = 2.0 * r.A * r.B;
  r.A = r.A * r.A + c;
}
 
struct matrix {
  complex a, b, c, d;
};
 
struct vector {
  complex u, v;
};
 
vector operator+(const vector &u, const vector &v) {
  vector vv = { u.u + v.u, u.v + v.v };
  return vv;
}
 
vector operator*(const matrix &m, const vector &v) {
  vector vv = { m.a * v.u + m.b * v.v, m.c * v.u + m.d * v.v };
  return vv;
}
 
matrix operator*(const complex &s, const matrix &m) {
  matrix mm = { s * m.a, s * m.b, s * m.c, s * m.d };
  return mm;
}
 
complex det(const matrix &m) {
  return m.a * m.d - m.b * m.c;
}
 
matrix inv(const matrix &m) {
  matrix mm = { m.d, -m.b, -m.c, m.a };
  return (1.0/det(m)) * mm;
}
 
// newton stores <z, c> into vector as <u, v>
 
vector newton_init(const param &h) {
  vector vv = { h.nucleus, h.nucleus };
  return vv;
}
 
void print_equation(const matrix &m, const vector &v, const vector &k) {
  fprintf(stderr, "/  % 20.16f%+20.16fi  % 20.16f%+20.16fi  \\  /  z  -  % 20.16f%+20.16fi  \\  =  /  % 20.16f%+20.16fi  \\\n", real(m.a), imag(m.a), real(m.b), imag(m.b), real(v.u), imag(v.u), real(k.u), imag(k.u));
  fprintf(stderr, "\\  % 20.16f%+20.16fi  % 20.16f%+20.16fi  /  \\  c  -  % 20.16f%+20.16fi  /     \\  % 20.16f%+20.16fi  /\n", real(m.c), imag(m.c), real(m.d), imag(m.d), real(v.v), imag(v.v), real(k.v), imag(k.v));
}
 
vector newton_step(const vector &v, const param &h) {
  recurrence r;
  recurrence_init(r, v.u);
  for (int i = 0; i < h.period; ++i) {
    recurrence_step_inplace(r, v.v);
  }
  // matrix equation: J * (vv - v) = -g
  // solved by: vv = inv(J) * -g + v
  // note: the C++ variable g has the value of semantic variable -g
  matrix J = { r.B - 1.0, r.D, r.C, r.E };
  vector g = { v.u - r.A, h.bond - r.B };
  print_equation(J, v, g);
  return inv(J) * g + v;
}
 
int main(int argc, char **argv) {
  if (argc < 5) {
    fprintf(stderr, "usage: %s period nucleus_re nucleus_im internal_angle\n", argv[0]);
    return 1;
  }
  param h;
  h.period  = atoi(argv[1]);
  h.nucleus = complex(atof(argv[2]), atof(argv[3]));
  double angle = atof(argv[4]);
  if (angle == 0 && h.period != 1) {
    fprintf(stderr, "warning: possibly wrong results due to convergence into parent!\n");
  }
  h.bond = std::polar(1.0, 2.0 * pi * angle);
  fprintf(stderr, "initial parameters:\n");
  fprintf(stderr, "period = %d\n", h.period);
  fprintf(stderr, "nucleus = % 20.16f%+20.16fi\n", real(h.nucleus), imag(h.nucleus));
  fprintf(stderr, "bond = % 20.16f%+20.16fi\n", real(h.bond), imag(h.bond));
  vector v = newton_init(h);
  fprintf(stderr, "z =% 20.16f%+20.16fi\n", real(v.u), imag(v.u));
  fprintf(stderr, "c =% 20.16f%+20.16fi\n", real(v.v), imag(v.v));
  fprintf(stderr, "\n");
  fprintf(stderr, "newton iteration:\n");
  vector vv;
  do {
    vv = v;
    v = newton_step(vv, h);
    fprintf(stderr, "z =% 20.16f%+20.16fi\n", real(v.u), imag(v.u));
    fprintf(stderr, "c =% 20.16f%+20.16fi\n", real(v.v), imag(v.v));
    fprintf(stderr, "\n");
  } while (abs(v.u - vv.u) + abs(v.v - vv.v) > eps);
  fprintf(stderr, "bond location:\n");
  fprintf(stderr, "c =% 20.16f%+20.16fi\n", real(v.v), imag(v.v));
  // suitable for gnuplot
  printf("%.16f %.16f\n", real(v.v), imag(v.v));
  return 0;
}

sizeEdit

Newton's method (providing the initial estimate is sufficiently good and the function is well-behaved) provides successively more accurate approximations. How to tell when the approximation is good enough? One way might be to compare the center with points on its boundary, and continue to increase precision until this distance is accurate to enough bits.

Algorithm:

  1. given a center location estimate n_m accurate to P bits of precision
  2. compute a boundary location estimate b_m accurate to the same precision, using center n_m as starting value
  3. compute |n_m - b_m| to find an estimate of the size of the component
    1. if it isn't zero, and if it is accurate enough (to a few 10s of bits, perhaps) then finish
    2. otherwise increase precision, refine center estimate to new precision, try again from the top

Measuring effective accuracy of the distance between two very close points might be done by comparing floating point exponents with floating point mantissa precision.

external raysEdit

External rays

See alsoEdit

ReferencesEdit

  1. NEWTON’S METHOD AND FRACTALS by AARON BURTON
  2. Newton, Chebyshev, and Halley Basins of Attraction; A Complete Geometric Approach by Bart D. Stewart
  3. Home page of Claude Heiland-Allen
Last modified on 19 June 2013, at 20:50