Nonlinear Equations in One Variable
This chapter is concerned with finding solutions to the scalar, nonlinear equation
f (x) = 0,
where the variable x ranges over an interval [a,b]. The topic provides us with an opportunity to discuss
various issues and concepts that arise in more general circumstances.
There are many canned routines that do the job of finding a solution to a nonlinear scalar
equation; the one in MATLAB is called fzero; type help fzero to see how this function is
used and what input parameters it requires.
Following an extended introduction to our topic in Section 3.1 are three sections, 3.2–3.4, each
devoted to a different method or a class of methods for solving our nonlinear equation. A closely
related problem is that of minimizing a function in one variable, and this is discussed in Section 3.5.
3.1 Solving nonlinear equations
Referring to our prototype problem introduced above, f(x) = 0, let us further assume that the function
f is continuous on the interval, denoted f ∈ C[a,b]. Throughout our discussion we denote a solution
of the equation (called a root, or zero) by x∗.
Note: Why introduce nonlinear equations before introducing their easier comrades, the
linear ones?! Because one linear equation in one unknown is just too easy and not particu-
larly illuminating, and systems of equations bring a wave of complications with them. We
have a keen interest in solving scalar nonlinear equations not only because such problems
arise frequently in many applications, but also because it is an opportunity to discuss var-
ious issues and concepts that arise in more general circumstances and highlight ideas that
are extensible well beyond just one equation in one unknown.
Example 3.1. Here are a few simple functions and their roots.
1. f (x) = x −1, on the interval [a,b] = [0,2].
Obviously there is one root for this linear equation: x∗ = 1.
is there, even though the fixed point iteration with the
“natural g” does not converge to it. Not everything natural is good for you! There are other choices
of g for the purpose of fixed point iteration that perform better here.
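The mechanics of fixed point iteration can be sketched as follows. The example above is not reproduced here; the iteration below uses the hypothetical illustration x = e^(−x), whose natural g(x) = e^(−x) does satisfy |g′(x∗)| < 1 at the fixed point, so it converges:

```python
import math

def fixed_point(g, x0, atol=1e-10, maxiter=100):
    """Iterate x_{k+1} = g(x_k) until successive iterates agree to atol."""
    x = x0
    for _ in range(maxiter):
        x_new = g(x)
        if abs(x_new - x) < atol:
            return x_new
        x = x_new
    raise RuntimeError("fixed point iteration did not converge")

# Illustration (not the example above): solve x = exp(-x).
# Here |g'(x*)| = exp(-x*) ~ 0.57 < 1, so the iteration converges.
root = fixed_point(lambda x: math.exp(-x), 0.5)
```

A different choice of g for the same root may converge faster or slower, or not at all; this is precisely the sense in which fixed point iteration is a family of methods rather than one method.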
Note: A discussion of rates of convergence similar to that appearing here is also given in
Section 7.3. There it is more crucial, because here we will soon see methods that converge
faster than a rate of convergence can suitably quantify.
Rate of convergence
Suppose now that at a particular root of a given fixed point iteration, ρ = |g′(x∗)| satisfies
0 < ρ < 1. Then the error is reduced by roughly the factor ρ at each iteration, and we define the
rate of convergence as rate = −log₁₀ ρ. Note that the rate becomes negative for
ρ > 1, indicating no convergence of the fixed point iteration, and that it becomes infinite
for ρ = 0, indicating that the error reduction per iteration is faster than by any constant factor. We
encounter such methods next.
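With rate = −log₁₀ ρ, roughly 1/rate iterations are required to reduce the error by a factor of 10. A small illustrative computation (the value ρ = 0.5 here is hypothetical):

```python
import math

def convergence_rate(rho):
    """rate = -log10(rho); about 1/rate iterations gain one decimal digit."""
    return -math.log10(rho)

# e.g., rho = 0.5 (error halved each iteration): rate ~ 0.301,
# so about 3.3 iterations are needed per decimal digit of accuracy.
rate = convergence_rate(0.5)
iters_per_digit = 1.0 / rate
```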
Specific exercises for this section: Exercises 3–4.
3.4 Newton’s method and variants
The bisection method requires the function f to be merely continuous (which is good) but makes
no use of further information on f, such as the availability of its derivatives (which causes it
to be painfully slow at times). At the other end of the scale is Newton’s method, which requires
more knowledge and smoothness of the function f but which converges much faster in appropriate
circumstances.
Newton’s method
Newton’s method is the most basic fast method for root finding. The principle we use below to
derive it can be directly extended to more general problems.
Assume that the first and second derivatives of f exist and are continuous: f ∈ C²[a,b].
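Newton’s iteration is x_{k+1} = x_k − f(x_k)/f′(x_k). A minimal Python sketch (the book’s own codes are in MATLAB, and the test function below is a hypothetical illustration):

```python
def newton(f, fprime, x0, atol=1e-10, maxiter=50):
    """Newton's method: x_{k+1} = x_k - f(x_k)/f'(x_k)."""
    x = x0
    for _ in range(maxiter):
        dx = f(x) / fprime(x)
        x -= dx
        if abs(dx) < atol:
            return x
    raise RuntimeError("Newton's method did not converge")

# Hypothetical example: f(x) = x^2 - 2, whose positive root is sqrt(2).
root = newton(lambda x: x * x - 2.0, lambda x: 2.0 * x, 1.5)
```

Note the price paid for speed: each iteration needs f′ as well as f, and convergence is guaranteed only from a sufficiently good x₀.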
Globalizing Newton’s method
Both Newton’s method and the secant method are guaranteed to converge under appropriate conditions
only locally: there is the assumption in the theorem statement that the initial iterate x₀
is already “close enough” to an isolated root. This is in contrast to the situation for the bisection
method, which is guaranteed to converge provided only that a bracketing interval is found on which
f changes sign. The δ-neighborhood for which Newton’s method is guaranteed to converge may be
large for some applications, as is the case for Example 3.7, but it may be small for others, and it is
more involved to assess ahead of time which is the case for a given problem.
A natural idea, then, is to combine two methods into a hybrid one. We start with a rough plot
or probe of the given function f in order to bracket roots. Next, starting from a given bracket with
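One way to realize such a hybrid, sketched under the assumption that a bracket [a,b] with f(a)f(b) < 0 is already in hand: take a Newton step when it stays inside the current bracket, and fall back on a bisection step otherwise. (The safeguarding details below are one possible choice, not the only one.)

```python
def hybrid_newton_bisection(f, fprime, a, b, atol=1e-10, maxiter=100):
    """Safeguarded Newton: maintain a sign-change bracket [a, b]; take a
    Newton step when it lands inside the bracket, otherwise bisect."""
    fa, fb = f(a), f(b)
    assert fa * fb < 0, "need a bracketing interval"
    x = 0.5 * (a + b)
    for _ in range(maxiter):
        fx = f(x)
        # Shrink the bracket so that the sign change is preserved.
        if fa * fx < 0:
            b, fb = x, fx
        else:
            a, fa = x, fx
        dfx = fprime(x)
        x_new = x - fx / dfx if dfx != 0 else 0.5 * (a + b)
        if not (a < x_new < b):      # Newton step left the bracket:
            x_new = 0.5 * (a + b)    # fall back on bisection
        if abs(x_new - x) < atol:
            return x_new
        x = x_new
    return x

# Hypothetical example: f(x) = x^3 - 2x - 5 has a root near 2.0946 in [1, 3].
root = hybrid_newton_bisection(lambda x: x**3 - 2*x - 5,
                               lambda x: 3*x**2 - 2, 1.0, 3.0)
```

This retains bisection’s guarantee (the bracket never grows) while recovering Newton’s fast local convergence once the iterates settle near the root.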
φ″(x) > 0 for all x, we can have only one minimum point. (Think of how the curve
would have to look if we had more than one minimum.) Applying any of the algorithms described in
Sections 3.2–3.4, we obtain the unique minimum point as the root of φ′(x), given by
x̂ ≈ 1.5601412791.
At this point, φ(x̂) ≈ 9.21018833518615.
Local and global minimum
The procedures outlined above find local minima. The problem of finding a global minimizer, i.e.,
a point x̂ such that φ(x̂) ≤ φ(x) for all x ∈ [a,b], is significantly more difficult in general. In the
worst situations, where not much is known about the objective function φ, a suitable algorithm may
involve a complete enumeration, i.e., finding all local minima and then comparing them. At the
other end of the scale, if the objective function is known to be convex, which means that
φ(θx + (1−θ)y) ≤ θφ(x) + (1−θ)φ(y)   ∀ 0 ≤ θ ≤ 1
for any two points x, y ∈ [a,b], then there is a unique minimum and the problems of finding local and
global minima coincide. Such is the instance considered in Example 3.10, but not that considered in
Example 3.11.
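The defining inequality can be spot-checked numerically on a grid of point pairs and θ values; passing such a check is necessary but of course not sufficient for convexity. A minimal sketch, with hypothetical test functions:

```python
def is_convex_on_samples(phi, a, b, n=50):
    """Spot-check the convexity inequality
    phi(theta*x + (1-theta)*y) <= theta*phi(x) + (1-theta)*phi(y)
    on a grid of point pairs and theta values (necessary, not sufficient)."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    thetas = [j / 10 for j in range(11)]
    eps = 1e-12  # tolerance for roundoff in the comparison
    for x in xs:
        for y in xs:
            for t in thetas:
                lhs = phi(t * x + (1 - t) * y)
                rhs = t * phi(x) + (1 - t) * phi(y)
                if lhs > rhs + eps:
                    return False
    return True

# Hypothetical examples on [-1, 1]: x^2 is convex, -x^2 is not.
assert is_convex_on_samples(lambda x: x * x, -1.0, 1.0)
assert not is_convex_on_samples(lambda x: -x * x, -1.0, 1.0)
```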
Specific exercises for this section: Exercises 21–23.
3.6 Exercises
0. Review questions
(a) What is a nonlinear equation?
(b) Is the bisection method (i) efficient? (ii) robust? Does it (iii) require a minimal amount
of additional knowledge? (iv) require f to satisfy only minimum smoothness properties?
(v) generalize easily to several functions in several variables?
(c) Answer similar questions for the Newton and secant methods.
(d) State at least one advantage and one disadvantage of the recursive implementation of the
bisection method over the iterative nonrecursive implementation.
(e) In what way is the fixed point iteration a family of methods, rather than just one method
like bisection or secant?
(f) What is the basic condition for convergence of the fixed point iteration, and how does
the speed of convergence relate to the derivative of the iteration function g?
(g) Suppose a given fixed point iteration does not converge: does this mean that there is
no root in the relevant interval? Answer a similar question for the Newton and secant
methods.
(h) State at least two advantages and two disadvantages of Newton’s method.
(i) What are order of convergence and rate of convergence, and how do they relate?
(j) State at least one advantage and one disadvantage of the secant method over Newton’s.
(k) In what situation does Newton’s method converge only linearly?
(l) Explain the role that roundoff errors play in the convergence of solvers for nonlinear
equations, and explain their relationship with convergence errors.
(m) State a similarity and a difference between the problem of minimizing a function φ(x)
and that of solving the nonlinear equation φ′(x) = 0.
(n) State what a convex function is, and explain what happens if an objective function is
convex.
1. Apply the bisection routine bisect to find the root of the function
f(x) = √x − 1.1
starting from the interval [0,2] (that is, a = 0 and b = 2), with atol = 1.e-8.
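The book’s bisect routine is not reproduced in this excerpt; a minimal stand-in in the same spirit (find c in [a,b] with f(c) ≈ 0, assuming f changes sign on [a,b]) looks as follows:

```python
import math

def bisect(f, a, b, atol=1e-8):
    """Bisection: repeatedly halve a bracketing interval [a, b]."""
    fa, fb = f(a), f(b)
    assert fa * fb < 0, "f must change sign on [a, b]"
    while (b - a) / 2 > atol:
        m = (a + b) / 2
        fm = f(m)
        if fa * fm <= 0:
            b, fb = m, fm
        else:
            a, fa = m, fm
    return (a + b) / 2

# The function of this exercise: f(x) = sqrt(x) - 1.1, exact root 1.21,
# with f(0) = -1.1 < 0 and f(2) = sqrt(2) - 1.1 > 0.
root = bisect(lambda x: math.sqrt(x) - 1.1, 0.0, 2.0, atol=1e-8)
```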
Downloaded 02/04/18 to 132.174.254.159. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
the cost of approximating f . Determine approximately for what values of α Newton’s method
is more efficient (in terms of number of function evaluations) than the secant method. You
may neglect the asymptotic error constants in your calculations. Assume that both methods
are starting with initial guesses of a similar quality.
9. This exercise essentially utilizes various forms of Taylor’s expansion and relies on expertise
in calculus.
(b) Give one potential advantage and two potential disadvantages for this iteration as compared
to Newton’s method for f(x).
(c) Try this method on the problem of Exercise 10. What are your observations?
12. Given a > 0, we wish to compute x = ln a using addition, subtraction, multiplication, division,
and the exponential function, eˣ.
(a) Suggest an iterative formula based on Newton’s method, and write it in a way suitable
for numerical computation.
(b) Show that your formula converges quadratically.
(c) Write down an iterative formula based on the secant method.
(d) State which of the secant and Newton’s methods is expected to perform better in this
case in terms of overall number of exponential function evaluations. Assume a fair
comparison, i.e., same floating point system, “same quality” initial guesses, and identical
convergence criterion.
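One natural choice for part (a), shown here as a sketch of a possible answer rather than the only one: apply Newton’s method to f(x) = eˣ − a, which simplifies to x_{k+1} = x_k − 1 + a·e^(−x_k) and uses only the permitted operations.

```python
import math

def ln_via_newton(a, x0=1.0, atol=1e-14, maxiter=60):
    """Newton's method on f(x) = exp(x) - a, simplified to
    x_{k+1} = x_k - 1 + a * exp(-x_k)."""
    x = x0
    for _ in range(maxiter):
        x_new = x - 1.0 + a * math.exp(-x)
        if abs(x_new - x) < atol:
            return x_new
        x = x_new
    raise RuntimeError("did not converge")

approx = ln_via_newton(2.0)   # should approximate ln 2
```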
13. Write a MATLAB program to find all the roots of a given, twice continuously differentiable,
function f ∈ C².
(b) Plot a graph of the function on the interval [0.1,1].
(c) As you can see from the graph, the root is between 0.5 and 0.6. Write MATLAB
routines for finding the root, using the following:
i. The bisection method, with the initial interval [0.5,0.6]. Explain why this choice of
the initial interval is valid.
ii. A linearly convergent fixed point iteration, with x₀ = 0.5. Show that the conditions
of the Fixed Point Theorem (for the function g you have selected) are satisfied.
iii. Newton’s method, with x₀ = 0.5.
iv. The secant method, with x₀ = 0.5 and x₁ = 0.6.
For each of the methods:
• Use |xₖ − xₖ₋₁| < 10⁻¹⁰ as a convergence criterion.
• Print out the iterates and show the progress in the number of correct decimal digits
throughout the iteration.
• Explain the convergence behavior and how it matches theoretical expectations.
16. We saw in Example 3.5 that the function
f (x) = αcosh(x/4)−x
has two roots for α = 2 and none for α = 10. Is there an α for which there is precisely one
root? If yes, then find such an α and the corresponding root; if not, then justify.
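A numerical way to probe this question (a sketch, not the worked answer): a single root occurs precisely when the two curves α cosh(x/4) and x are tangent, i.e., when f(x) = 0 and f′(x) = (α/4) sinh(x/4) − 1 = 0 hold simultaneously. Eliminating α leaves one scalar equation in t = x/4, namely coth(t) = t, which bisection can solve.

```python
import math

def bisect(g, a, b, atol=1e-12):
    """Simple bisection, assuming g(a) and g(b) differ in sign."""
    while (b - a) / 2 > atol:
        m = (a + b) / 2
        if g(a) * g(m) <= 0:
            b = m
        else:
            a = m
    return (a + b) / 2

# Tangency of alpha*cosh(x/4) with x: eliminating alpha from
# alpha*cosh(t) = 4t and alpha*sinh(t) = 4 (t = x/4) gives coth(t) = t.
t = bisect(lambda t: t - 1.0 / math.tanh(t), 1.0, 2.0)
x = 4.0 * t                  # the corresponding single root
alpha = 4.0 / math.sinh(t)   # the critical value of alpha
```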
17. The derivative of the sinc function is given by
f(x) = (x cos(x) − sin(x))/x².
(a) Show that near x = 0, this function can be approximated by
f(x) ≈ −x/3.
The error in this approximation gets smaller as x approaches 0.
(b) Find all the roots of f in the interval [−10,10] for tol = 10⁻⁸.
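A scan-and-bisect sketch for part (b). The roots are x = 0 (where the part (a) approximation f(x) ≈ −x/3 takes over) together with the solutions of x cos(x) = sin(x); the grid size below is an assumption chosen so that x = 0 is not itself a grid point.

```python
import math

def f(x):
    """Derivative of sinc; use the series value near x = 0."""
    if abs(x) < 1e-8:
        return -x / 3.0
    return (x * math.cos(x) - math.sin(x)) / (x * x)

def scan_roots(a, b, n=1999, atol=1e-10):
    """Scan a grid for sign changes of f, then bisect each bracket.
    n = 1999 keeps x = 0 off the grid so its sign change is detected."""
    roots = []
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    for lo, hi in zip(xs[:-1], xs[1:]):
        if f(lo) * f(hi) < 0:
            while (hi - lo) / 2 > atol:
                m = (lo + hi) / 2
                if f(lo) * f(m) <= 0:
                    hi = m
                else:
                    lo = m
            roots.append((lo + hi) / 2)
    return roots

roots = scan_roots(-10.0, 10.0)   # expect 0 and two symmetric pairs
```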
Figure 3.5. Graph of an anonymous function; see Exercise 18.
18. Consider finding the root of a given nonlinear function f (x), known to exist in a given interval
[a,b], using one of the following three methods: bisection, Newton, and secant. For each of
the following instances, one of these methods has a distinct advantage over the other two.
Match problems and methods and justify briefly.
(a) f (x) = x −1 on the interval [0,2.5].
(b) f (x) is given in Figure 3.5 on [0,4].
(c) f ∈ C⁵[0.1,0.2], the derivatives of f are all bounded in magnitude by 1, and f′(x) is
hard to specify explicitly or evaluate.
19. You are required by a computer manufacturer to write a library function for a given floating
point system to find the cube root y^(1/3) of any given positive number y. Any such relevant
floating point number can be represented as y = a × 2^e, where a is a normalized fraction
(0.5 ≤ a < 1) and e is an integer exponent. This library function must be very efficient and
it should always work. For efficiency purposes it makes sense to store some useful constants
ahead of computation time, e.g., the constants 2^(1/3), 2/3, and a/3, should these prove useful.
(a) Show how y^(1/3) can be obtained, once a^(1/3) has been calculated for the corresponding
fraction, in at most five additional flops.
(b) Derive the corresponding Newton iteration. What is the flop count per iteration?
(c) How would you choose an initial approximation? Roughly how many iterations are
needed? (The machine rounding unit is 2⁻⁵².)
[You might find this exercise challenging.]
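For part (b), a sketch of one possible answer: Newton’s method applied to f(x) = x³ − a gives x_{k+1} = (2/3)x_k + (a/3)/x_k², which is exactly where the stored constants 2/3 and a/3 come in.

```python
def cuberoot_newton(a, x0=1.0, atol=1e-15, maxiter=60):
    """Newton's method for x^3 = a:
    x_{k+1} = (2/3) * x_k + (a/3) / x_k^2."""
    two_thirds = 2.0 / 3.0   # constants stored ahead of
    a_third = a / 3.0        # computation time, as suggested
    x = x0
    for _ in range(maxiter):
        x_new = two_thirds * x + a_third / (x * x)
        if abs(x_new - x) < atol:
            return x_new
        x = x_new
    return x

# Hypothetical example with a normalized fraction a in [0.5, 1):
r = cuberoot_newton(0.7)
```

Each iteration costs one division, two multiplications, and one addition once the constants are precomputed.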
20. Suppose that the division button of your calculator has stopped working, and you have addition,
subtraction, and multiplication only. Given a real number b ≠ 0, suggest a quadratically
convergent iterative formula to compute 1/b, correct to a user-specified tolerance. Write
a MATLAB routine that implements your algorithm, using |xₖ − xₖ₋₁| < 10⁻¹⁰ as a convergence
criterion, and apply your algorithm to b = π (that is, we compute 1/π), with two different
initial guesses: (a) x₀ = 1; and (b) x₀ = 0.1.
Explain your results.
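One standard choice, shown as a sketch of a possible answer rather than the only one: Newton’s method applied to f(x) = 1/x − b gives x_{k+1} = x_k(2 − b·x_k), which is division-free and quadratically convergent for a suitable x₀.

```python
import math

def reciprocal(b, x0, atol=1e-10, maxiter=100):
    """Division-free Newton iteration for 1/b:
    x_{k+1} = x_k * (2 - b * x_k)."""
    x = x0
    for _ in range(maxiter):
        x_new = x * (2.0 - b * x)
        if abs(x_new - x) < atol:
            return x_new
        x = x_new
    raise RuntimeError("did not converge (try a better x0)")

# b = pi with x0 = 0.1 converges; x0 = 1 diverges, since this
# iteration converges only for 0 < x0 < 2/b.
approx = reciprocal(math.pi, 0.1)
```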
21. Write a MATLAB program to minimize a smooth, scalar function in one variable φ(x) over a
given interval. The specifications are similar to those in Exercise 13. Your program should first
find and display all critical points, i.e., zeros of φ′(x). Then it should determine which of these
correspond to a local minimum by checking the sign of φ″, and display the local minimum
points. Finally, it should determine the global minimum point(s), value, and arguments by
minimizing over the local minimum values.
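A sketch of such a program, in Python rather than MATLAB; the scan-then-bisect strategy for locating the zeros of φ′ is an assumption, and the test function is the one from Exercise 22 below.

```python
import math

def minimize_scalar(phi, dphi, ddphi, a, b, n=2000, atol=1e-10):
    """Find critical points (zeros of phi'), classify them by the sign of
    phi'', and return the global minimizer among the local minima."""
    # 1. Scan a grid for sign changes of phi' and bisect each bracket.
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    critical = []
    for lo, hi in zip(xs[:-1], xs[1:]):
        if dphi(lo) * dphi(hi) < 0:
            while (hi - lo) / 2 > atol:
                m = (lo + hi) / 2
                if dphi(lo) * dphi(m) <= 0:
                    hi = m
                else:
                    lo = m
            critical.append((lo + hi) / 2)
    # 2. Keep those with phi'' > 0 (local minima); take the smallest value.
    minima = [x for x in critical if ddphi(x) > 0]
    return min(minima, key=phi)

# Exercise 22's function: phi(x) = cos(x) + x/10 on [-10, 10].
xhat = minimize_scalar(lambda x: math.cos(x) + x / 10,
                       lambda x: -math.sin(x) + 0.1,
                       lambda x: -math.cos(x),
                       -10.0, 10.0)
```

(As specified, only interior local minima are compared; for a true global minimum over a closed interval the endpoint values would also have to be examined.)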
22. Apply your program from Exercise 21 to find the (unique) global minimum of the function
φ(x) = cos(x) + x/10 over the interval [−10,10]. What are x̂ and φ(x̂)?
23. Use your program from Exercise 21 to minimize
(a) φ(x) = sin(x) and
(b) φ(x) = −sinc(x) = −sin(x)/x
over the interval [−10,10] with tol = 10⁻⁸.
You might find that determining the global minimum of the second function is significantly
more challenging, even though the global minimum for the first one is not unique. Explain
the source of difficulties and how you have addressed them.
[Exercises 17 and 14 are relevant here.]
3.7 Additional notes
Just about any introductory text on numerical methods or numerical analysis covers the topics intro-
duced in this chapter, although the goals and the manner of these treatments may vary significantly.
For different flavors, see Heath [39], Cheney and Kincaid [12], and Burden and Faires [11], as well
as the classics Conte and de Boor [13] and Dahlquist and Björck [15].
For us this chapter is mainly an opportunity to introduce and demonstrate, within a simple
context, several basic notions and issues arising also in much more complex numerical computations.
We have thus avoided methods that do not seem to shed much light on other problems as well.
There is a (somewhat more complicated) analogue to the bisection method for function min-
imization. This classical algorithm is called golden section search and is described, together with
extensions, in Press et al. [59]; alternatively, see the crisper Heath [39].