










Fundamental Methods of Extrapolation
Eric Hung-Lin Liu
Keywords: numerical analysis, extrapolation, Richardson, Romberg, numerical differentiation, numerical integration
Abstract. Extrapolation is an incredibly powerful technique for increasing speed and accuracy in various numerical tasks in scientific computing. As we will see, extrapolation can transform even the most mundane of algorithms (such as the Trapezoid Rule) into an extremely fast and accurate algorithm, increasing the rate of convergence by more than one order of magnitude. Within this paper, we will first present some background theory to motivate the derivation of Richardson and Romberg Extrapolation and provide support for their validity and stability as numerical techniques. Then we will present examples from differentiation and integration to demonstrate the incredible power of extrapolation. We will also give some MATLAB code to show some simple implementation methods in addition to discussing error bounds and differentiating between "true" and "false" convergence.
1 Introduction

Extrapolation is an extremely powerful tool available to numerical analysts for improving the performance of a wide variety of mathematical methods. Since many "interesting" problems cannot be solved analytically, we often have to use computers to derive approximate solutions. In order to make this process efficient, we have to think carefully about our implementations: this is where extrapolation comes in. We will examine some examples in greater detail later, but we will briefly survey them now. First, consider numerical differentiation: in theory, it is performed by letting the step-size tend to zero. The problem is that modern computers have limited precision; as we will see, decreasing the step-size past $10^{-8}$ severely worsens the performance because of floating point round-off errors. The choice of step-size in integration is also important, but for a different reason. Having a tiny step-size means more accurate results, but it also requires more function evaluations, which could be very expensive depending on the application at hand. The large number of floating point operations associated with a small step-size has the negative effect of magnifying the problems of limited precision (though it is not nearly as serious as with differentiation). Another example is the evaluation of (infinite) series. For slowly converging series, we will need to sum a very large number of terms to get an accurate result; the problem is that this could be very slow, and again limited precision comes into play. Extrapolation attempts to rectify these problems by computing "weighted averages" of
relatively poor estimates to eliminate error modes–specifically, we will be examining various forms of Richardson Extrapolation. Richardson and Richardson-like schemes are but a subset of all extrapolation methods, but they are among the most common. They have numerous applications and are not terribly difficult to understand and implement. We will demonstrate that various numerical methods taught in introductory calculus classes are in fact insufficient, but they can all be greatly improved through proper application of extrapolation techniques.
We are very often able to associate some infinite sequence $\{A_n\}$ with a known function $A(h)$. For example, $A(h)$ may represent a first divided differences scheme for differentiation, where $h$ is the step-size. Note that $h$ may be continuous or discrete. The sequence $\{A_n\}$ is described by $A_n = A(h_n)$, $n \in \mathbb{N}$, for a monotonically decreasing sequence $\{h_n\}$ that converges to 0 in the limit. Thus $\lim_{h \to 0^+} A(h) = A$ and similarly $\lim_{n \to \infty} A_n = A$. Computing $A(0)$ is exactly what we want to do; continuing the differentiation example, roughly speaking $A(0)$ is the "point" where the divided difference "becomes" the derivative. Interestingly enough, although we require that $A(h)$ have an asymptotic series expansion, we need not determine it uniquely. In fact we will see that knowing the $p_n$ is sufficient. Specifically, we will examine functions with expansions that take this form:
$$A(h) = A + \sum_{k=1}^{s} \alpha_k h^{p_k} + O(h^{p_{s+1}}) \quad \text{as } h \to 0^+ \qquad (1)$$
where $p_n \ne 0$ for all $n$, $p_1 < p_2 < \cdots < p_{s+1}$, and all the $\alpha_k$ are independent of $h$. The $p_n$ can in fact be imaginary, in which case we compare only the real parts. Clearly, having $p_1 > 0$ guarantees the existence of $\lim_{h \to 0^+} A(h) = A$. In the case where that limit does not exist, $A$ is the antilimit of $A(h)$, and then we know that $p_i \le 0$ for at least $i = 1$. However, as long as the sequence $\{p_n\}$ is monotonically increasing such that $\lim_{k \to \infty} p_k = +\infty$, (1) exists and $A(h)$ has the expansion $A(h) \sim A + \sum_{k=1}^{\infty} \alpha_k h^{p_k}$. We do not need to declare any conditions on the
convergence of the infinite series. We do assume that the $p_k$ are known, but we do not need to know the $\alpha_k$, since as we will see, these constant factors are not significant. Now suppose that $p_1 > 0$ (hence it is true for all $p_n$); then when $h$ is sufficiently small, $A(h)$ approximates $A$ with error $A(h) - A = O(h^{p_1})$. In most cases, $p_1$ is not particularly large (i.e. $p_1 < 4$), so it would appear that we will need to decrease $h$ to get useful results. But as previously noted, this is almost always prohibitive in terms of round-off errors and/or function evaluations. We should not despair, because we are now standing on the doorstep of the fundamental idea behind Richardson Extrapolation. That idea is to use equation (1) to somehow eliminate the $h^{p_1}$ error term, thus obtaining a new approximation $A_1(h) - A = O(h^{p_2})$. It is obvious that $|A_1(h) - A| < |A(h) - A|$ since $p_2 > p_1$. As we will see later, this is accomplished by taking a "weighted average" of $A(h)$ and $A(\frac{h}{s})$.
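As a concrete preview (our own illustrative instance, not part of the derivation below: assume $p_1 = 2$, $p_2 = 4$, and $s = 2$, the situation we will later meet with centered differences), the "weighted average" that cancels the leading error term is

$$A_1(h) = \frac{2^2 A(\tfrac{h}{2}) - A(h)}{2^2 - 1} = \frac{4A(\tfrac{h}{2}) - A(h)}{3} = A + O(h^4),$$

since the $h^2$ contributions of $4A(\tfrac{h}{2})$ and $A(h)$ are identical and cancel exactly.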
2 Simple Numerical Differentiation and Integration Schemes
Before continuing with our development of the theory of extrapolation, we will briefly review differentiation by first divided differences and quadrature using the (composite) trapezoid and midpoint rules. These are among the simplest algorithms; they are straightforward enough to derive and analyze directly.
From equation (4), it is evident that $h_{\text{optimum}} \sim \sqrt{\epsilon_m}$, where $\epsilon_m$ is machine precision, which is roughly $10^{-16}$ in the IEEE double precision standard. Thus our best error is $4(f''(x)f(x)\epsilon_m)^{1/2} \sim 10^{-8}$. This is rather dismal, and we certainly can do better. Naturally it is impossible to modify the arithmetic error: it is a property of double precision representations. So we look to the truncation error: a prime suspect for extrapolation.
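The following MATLAB fragment is a minimal sketch of this error floor (our own illustration, using the same test derivative $\frac{d}{dx}e^x\big|_{x=1}$ that Figure 1 is based on); the error of the first divided difference initially shrinks like $O(h)$ but stalls near $10^{-8}$ once round-off dominates:

```matlab
% First divided (forward) difference of f(x) = exp(x) at x = 1.
% Truncation error ~ h, round-off error ~ eps/h, so the total error
% bottoms out near sqrt(eps) ~ 1e-8 and then grows again.
f      = @(x) exp(x);
x0     = 1;
exact  = exp(1);                     % d/dx e^x at x = 1
h      = 10.^(-(0:16));              % step sizes 1, 1e-1, ..., 1e-16
approx = (f(x0 + h) - f(x0)) ./ h;   % first divided differences
err    = abs(approx - exact);
loglog(h, err, 'o-'), grid on
xlabel('h'), ylabel('absolute error')    % compare with Figure 1
```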
Figure 1 (Failure of First Divided Differences): approximation errors from the first divided difference estimate of $\frac{d}{dx}e^x\big|_{x=1}$, plotted as $\log_{10}(\text{error})$ against $\log_{10}(dx)$.
The extent of this problem can be seen in Fig. 1. It displays the numerical errors resulting from the first divided difference approximation of $\frac{d}{dx}e^x\big|_{x=1}$. The $y$-axis represents the logarithm of the error, while the $x$-axis shows the approximate step size. The log-log scale is convenient for visualizing rough order of magnitude approximations for numerical errors, with the additional benefit of linearizing functions with polynomial error expressions. This method is commonplace in numerical analysis. We should not be applying first divided differences to estimate derivatives. The previous method is $O(h)$. By using centered divided differences, we can obtain $O(h^2)$ without any additional work. We initially avoided this because the example of numerical errors is most severe using first differences. The formula for the centered differences approximation is:
$$\delta_0(h) = \frac{f(x_0 + h) - f(x_0 - h)}{2h} \quad \text{for } 0 < h \le a \qquad (5)$$
The derivative approximation then has error:
$$\delta_0(h) - f'(x_0) = \frac{f'''(\xi(h))}{3!}\, h^2 = O(h^2) \qquad (6)$$
Sidi [6] provides the full expansion, obtained with a Taylor series expansion around $x_0$:

$$\delta_0(h) - f'(x_0) = \sum_{k=1}^{s} \frac{f^{(2k+1)}(x_0)}{(2k+1)!}\, h^{2k} + R_s(h), \quad \text{where} \quad R_s(h) = h^{2s+2}\,\frac{f^{(2s+3)}(\xi(h))}{(2s+3)!} = O(h^{2s+2}) \qquad (7)$$
First we should note that since the integration schemes presented are used over a range of interpolation points $\{x_j\}$, the given formulas correspond to the Composite Trapezoid and Midpoint schemes, although we will drop the "composite" label.
$$\text{Composite Trapezoid Rule:} \quad \int_a^b f(x)\,dx = \frac{h}{2}\left[f(a) + 2\sum_{j=1}^{n-1} f(x_j) + f(b)\right] + O(h^2) \qquad (8)$$
For a uniformly chosen set of points $\{x_j\}$ in the range $[a, b]$ such that $x_j = a + jh$ and $h = (b - a)/n$, where $n$ is the total number of subintervals (so there are $n + 1$ interpolation points). Observe that all the Trapezoid Rule really does is fit a line to the function between each pair of points in the interval.
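For reference, here is a direct MATLAB sketch of the composite Trapezoid Rule (8) as just described (the function name and interface are our own choices, not from the paper):

```matlab
% Composite Trapezoid Rule on [a, b] with n uniform subintervals, h = (b - a)/n.
% f must accept vector arguments (e.g. f = @(x) exp(x)).
function T = trap(f, a, b, n)
    h = (b - a) / n;
    x = a + (1:n-1) * h;                       % interior points x_j = a + j*h
    T = (h/2) * (f(a) + 2*sum(f(x)) + f(b));   % equation (8) without the O(h^2) term
end
```

For example, `trap(@sin, 0, pi, 64)` approximates $\int_0^{\pi} \sin x \, dx = 2$ with an error of order $h^2$.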
The Trapezoid Rule is a special case (n = 1) of the (n+1)-point closed Newton-Cotes formula. Also, Simpson’s Rule is another special case (n = 2) of this formula. The composite rules arise from iterating the Newton-Cotes expressions over a range of points. The following is adapted from Burden and Faires:
Theorem 1. Suppose that $\sum_{i=0}^{n} a_i f(x_i)$ denotes the (n+1)-point closed Newton-Cotes formula with $x_0 = a$, $x_n = b$, and $h = \frac{b-a}{n}$. Then there exists $\xi \in (a, b)$ such that

$$\int_a^b f(x)\,dx = \sum_{i=0}^{n} a_i f(x_i) + \frac{h^{n+3} f^{(n+2)}(\xi)}{(n+2)!} \int_0^n t^2 \prod_{j=1}^{n}(t - j)\,dt$$

if $n$ is even and $f \in C^{n+2}[a, b]$, and

$$\int_a^b f(x)\,dx = \sum_{i=0}^{n} a_i f(x_i) + \frac{h^{n+2} f^{(n+1)}(\xi)}{(n+1)!} \int_0^n t \prod_{j=1}^{n}(t - j)\,dt$$

if $n$ is odd and $f \in C^{n+1}[a, b]$.
Proof. As implied in the introduction, we are assuming the validity of the following expansion:

$$\int_a^b f(x)\,dx \approx \sum_{i=0}^{n} a_i f(x_i)$$

where

$$a_i = \int_{x_0}^{x_n} L_i(x)\,dx = \int_{x_0}^{x_n} \prod_{\substack{j=0 \\ j \ne i}}^{n} \frac{x - x_j}{x_i - x_j}\,dx$$
3 The Richardson and Romberg Extrapolation Technique
The fundamental idea behind these extrapolation schemes is the usage of multiple, low-accuracy evaluations to eliminate error modes and create a highly accurate result. As mentioned earlier, we are assuming that the working functions have the following property: $A(h) - A = O(h^{p_1})$, where $p_1$ is known or obtainable. To motivate our understanding of extrapolation, we will write some arbitrary quadrature rule $I$ as a function of its associated step-size $h$: $I(h)$. For example, the Trapezoid Rule could be written as:
$$I(h) = \frac{h}{2}\left[f(a) + 2\sum_{j=1}^{n-1} f(x_j) + f(b)\right] + O(h^2)$$
As previously mentioned, suppose the expansion $I(h) = I_0 + h^{p_1} k_1 + h^{p_2} k_2 + \cdots$ exists, where $\{p_i\} \subset \mathbb{R}$ is the set of error powers and $\{k_i\} \subset \mathbb{R}$ are some known or unknown constants. Now consider:

$$I\!\left(\frac{h}{s}\right) = I_0 + \frac{h^{p_1} k_1}{s^{p_1}} + \frac{h^{p_2} k_2}{s^{p_2}} + \cdots$$
To eliminate the $h^{p_1}$ error term (because it is the dominant error power), we will compute $-I(h) + s^{p_1} I\!\left(\frac{h}{s}\right)$:
$$s^{p_1} I\!\left(\frac{h}{s}\right) - I(h) = s^{p_1} I_0 - I_0 + 0 \cdot O(h^{p_1}) + \left(\frac{h^{p_2} k_2}{s^{p_2 - p_1}} - h^{p_2} k_2\right) + \cdots \qquad (10)$$

$$I'(h) = \frac{s^{p_1} I\!\left(\frac{h}{s}\right) - I(h)}{s^{p_1} - 1} = I_0 + \frac{s^{p_1 - p_2} - 1}{s^{p_1} - 1}\, h^{p_2} k_2 + \cdots \qquad (11)$$

$$= I_0 + h^{p_2} k_2' + \cdots \qquad (12)$$

$$I'(h) = I\!\left(\frac{h}{s}\right) + \frac{I\!\left(\frac{h}{s}\right) - I(h)}{s^{p_1} - 1} \qquad (13)$$
And that is the basic idea underlying Romberg Extrapolation. Note that $s$ is the weight by which we decrease the step-size in each new evaluation of $I$; $s = 2$ is a very common choice. From (13), we can continue iteratively (or recursively) to eliminate $O(h^{p_2})$. Using that result, we can eliminate $O(h^{p_3})$, and continue on indefinitely, in theory. In reality, we are limited by floating point precision. The following table presents a graphical representation of the extrapolation process. Notice that the left-most column is the only column that involves actual function evaluations. The second column represents estimates of $I$ with the $O(h^{p_1})$ term eliminated using the algorithm given above. The third column eliminates $O(h^{p_2})$, and so forth. The actual approximations that should be used for error checking can be found by reading off the entries on the main diagonal of this matrix.
Note: more generally, we need not decrease $h$ geometrically by factors of $\frac{1}{s}$, but this approach is often easiest to visualize, and it simplifies the algebra in most applications without damaging performance. Common choices for $s$ are small integers (e.g. $s = 2$) because larger integers may cause the number of function evaluations to grow prohibitively quickly.
Table 1: Iteration of a Generic Extrapolation Method
$I_0^{(0)}(h)$
$I_0^{(1)}(\frac{h}{s})$   $I_1^{(0)}(h)$
$I_0^{(2)}(\frac{h}{s^2})$   $I_1^{(1)}(\frac{h}{s})$   $I_2^{(0)}(h)$
$I_0^{(3)}(\frac{h}{s^3})$   $I_1^{(2)}(\frac{h}{s^2})$   $I_2^{(1)}(\frac{h}{s})$   $I_3^{(0)}(h)$
$\vdots$   $\vdots$   $\vdots$   $\vdots$   $\ddots$
It turns out that the Richardson Extrapolation process has several nice convergence properties, the proofs of which are presented in Sidi [6]. Due to their length, we will simply summarize the results here instead of going through the full exercise. The convergence results for the Richardson method fall into two areas: convergence of columns and convergence of diagonals. One result demonstrates that the approximation made by each column is at least as good as the column that immediately precedes it, and hence every previous column. The requirement is that the integer $s$ in Equation 1 exist and is chosen to be as large as possible. By column, we are referring to the columns represented in Table 1; column 0 is the base sequence, column 1 is the first round of extrapolation, column 2 is the second round, and so forth. Specifically, if column $n$ converges, then column $n + 1$ converges at least as quickly as column $n$. In the problems we will consider, column 0 converges because the baseline algorithms (e.g. Trapezoid Rule) are proven to work. Hence extrapolation will not theoretically make anything worse. Performing extrapolation with the wrong set of $p_n$ will cause the columns after the first to yield little to no improvement. If column $n$ diverges, then in the worst case, column $n + 1$ diverges as quickly as column $n$; in the best case, column $n + 1$ will actually converge. Finally, looking back at Equation 1, if the $\alpha_k \ne 0$ for all $k$ and $\lim_{h \to 0^+} I(h) = I$, then each column converges more quickly than all the preceding columns, the best case being linear convergence.

The convergence condition for diagonals is more restrictive, but if it is satisfied the diagonals enjoy much faster convergence than the columns. The condition imposed is that $I(h)$ must have an asymptotic expansion of the form given in Equation 1 for all $s$. Thus if $I(h)$ has a valid expansion, then all diagonal sequences converge to $I$ superlinearly. This means that for a fixed $j$, $I_n^{(j)} - I$ tends to 0 asymptotically like $e^{-\lambda n}$ for every $\lambda > 0$. For an even smaller set of cases, the convergence of diagonals is even better than $e^{-\lambda n}$, but the examples we are considering are of the former type. Note that the convergence results of columns and diagonals may hold for other extrapolation schemes outside of the Richardson method; the conditions required are not unique to Richardson Extrapolation.
When implementing Richardson or Romberg extrapolation, it is most common to do it iteratively, unless storage is an extreme issue. Although the problem can be solved just as easily using recursion, iteration is generally preferred. In an iterative method, storage of the full table/matrix is not necessary; it is only necessary to store the last two rows. However, one should keep at least a short history (1 or 2 entries) from the main diagonal for error comparisons. When the difference between successive estimates decreases to some given tolerance (the desired accuracy), the iteration can be stopped.
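A compact MATLAB sketch of this iterative strategy (our own illustration, not code from the paper) is given below; it performs Romberg integration with the Trapezoid Rule as the base scheme, $s = 2$ and $p_k = 2k$, keeping only the previous row of Table 1 and checking convergence along the main diagonal:

```matlab
% Romberg integration: iteratively extrapolate the composite Trapezoid Rule.
% Only the previous row of the table is stored; the best current estimate is
% the last entry of the newest row (the main diagonal of Table 1).
function [I, rowsUsed] = romberg(f, a, b, maxRows, tol)
    prev = [];                                  % previous row of the table
    for j = 0:maxRows-1
        n = 2^j;                                % halve the step size every row (s = 2)
        h = (b - a) / n;
        x = a + (1:n-1) * h;
        row = zeros(1, j+1);
        row(1) = (h/2) * (f(a) + 2*sum(f(x)) + f(b));      % Trapezoid estimate
        for k = 1:j
            % eliminate the O(h^(2k)) error term: s = 2, p_k = 2k, so s^(p_k) = 4^k
            row(k+1) = row(k) + (row(k) - prev(k)) / (4^k - 1);
        end
        if j > 0 && abs(row(end) - prev(end)) < tol        % diagonal error check
            I = row(end);  rowsUsed = j + 1;  return
        end
        prev = row;
    end
    I = prev(end);  rowsUsed = maxRows;
end
```

For instance, `romberg(@(x) exp(x), 0, 1, 12, 1e-12)` reaches machine-level accuracy for $\int_0^1 e^x\,dx$ after only a handful of rows.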
differences. Even the single step showed great improvements over the non-extrapolated centered difference. But unfortunately, not even extrapolation can eliminate the problems of floating point errors. What it can do, and has done here, is eliminate truncation error due to "poor" initial approximation methods.
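As a small illustration of the single extrapolation step discussed here (again our own sketch, using $\frac{d}{dx}e^x\big|_{x=1}$), one Richardson step applied to the centered difference (5) with $s = 2$ and $p_1 = 2$ raises the order from $O(h^2)$ to $O(h^4)$:

```matlab
% One Richardson step applied to the centered difference delta_0(h).
f  = @(x) exp(x);   x0 = 1;   exact = exp(1);
h  = 0.1;
d  = @(h) (f(x0 + h) - f(x0 - h)) / (2*h);   % centered difference, O(h^2)
d1 = (4*d(h/2) - d(h)) / 3;                  % eliminate the h^2 term, now O(h^4)
fprintf('centered: %.2e   extrapolated: %.2e\n', ...
        abs(d(h) - exact), abs(d1 - exact));
```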
Figure 2 (Improvement of Divided Differences from Extrapolation): extrapolated and un-extrapolated first and centered differences for $\frac{d}{dx}e^x\big|_{x=1}$, plotted as $\log_{10}(\text{error})$ against $\log_{10}(h)$. The blue line shows the performance of the first divided difference approximation and the red line is the centered difference. The upper black line marks the extrapolated first divided difference while the lower black line is for the centered case.
Figure 3 displays the LHS, Trapezoid, Midpoint, and Simpson Rules in their standard forms (dotted lines). If it is not clear, LHS is the black line marked by asterisks, the midpoint/trapezoid rules virtually lie on top of each other, and the pink line marked by Xs is Simpson's Rule. Once extrapolated (solid lines), Simpson, Midpoint, and Trapezoid are nearly indistinguishable. (Why does this imply that we should never start with Simpson's Rule when extrapolating?) Even the lowly Left Hand Sum is now performing at a decent level. But clearly, the Midpoint or Trapezoid Rule is the best choice here. The deciding factor between them is typically where we want the function evaluations to occur (i.e. Midpoint avoids the endpoints). In all cases, the difference between extrapolated and un-extrapolated results is stunning. As we said, extrapolation is the only way to overcome the error bound of $10^{-8}$ and achieve high accuracy in differentiation. In integration, using a maximum step-size of only $\frac{1}{64}$, the Midpoint and Trapezoidal schemes reach a result that is accurate to machine precision. In the un-extrapolated schemes, we can only achieve this level of accuracy using Simpson's rule with $h = \frac{1}{8196}$. The un-extrapolated versions of the other three schemes do not even come close to machine precision; graphically, it seems that the Midpoint and Trapezoid Rules would need a step-size on the order of $10^{-6}$, while we would have to decrease it even further to $10^{-16}$ to accommodate LHS. Even Simpson's Rule, the best of the un-extrapolated schemes, requires over an order of
magnitude more function evaluations than the extrapolated Midpoint and Trapezoid schemes. Since the integral we used for demonstration is “nice” and because modern computers are fast, the run-time difference is not significant. But conceivably, if f (x) took one minute to evaluate, then the run-time difference would be huge (dominated by calls to f (x)) since the subtraction and division operations of the extrapolative process are comparatively cheap.
Figure 3 (Comparison of Extrapolated and Un-Extrapolated Integration Methods): LHS, Trapezoid, Midpoint, and Simpson Rules, extrapolated and un-extrapolated, plotted as $\log_{10}(\text{error})$ against $\log_{10}(h)$. Black is LHS, blue is Midpoint, red is Trapezoid, and pink is Simpson. The dotted lines are the un-extrapolated results while the solid lines show the extrapolated performance of each integration scheme. See Section 5.1 for more details.
Oftentimes, we find ourselves attempting to extrapolate expressions where the error powers have not been analytically pre-derived for our convenience. We have already seen a few examples of this; other situations include most "canned" methods for numerically solving ODEs. However, not all cases are so nice. For example, it is easy to apply extrapolation to evaluate infinite series or sequences, but the error terms are usually not obvious in those cases. Differentiation and integration are special in that it is relatively easy to analytically show how to form the $\{p_i\}$. In these situations, we are left with two options. We can either attempt to analytically derive the error terms or we can use graphical methods. This is another situation where using log-log plots is ideal. We begin with:
$$I(h) - I\!\left(\frac{h}{s}\right) = (1 - s^{-p_1}) h^{p_1} k_1 + (1 - s^{-p_2}) h^{p_2} k_2 + \cdots = (1 - s^{-p_1}) h^{p_1} k_1 + O(h^{p_2})$$
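A brief MATLAB sketch of this graphical approach (our own illustration; the integrand here is only a stand-in, since the one used for Figure 4 is not shown): the slope of $\log_{10}|I(h) - I(\frac{h}{2})|$ plotted against $\log_{10} h$ approaches $p_1$, which is 2 for the Trapezoid Rule.

```matlab
% Estimate the leading error power p1 of the Trapezoid Rule from the slope of
% log10|I(h) - I(h/2)| versus log10(h).  trap() is the composite Trapezoid
% sketch given earlier; the integrand is only an example.
f  = @(x) exp(-x.^2);
a  = 0;   b = 1;
n  = 2.^(2:10);                             % successive halvings of h
I  = arrayfun(@(k) trap(f, a, b, k), n);
h  = (b - a) ./ n;
d  = abs(I(1:end-1) - I(2:end));            % |I(h) - I(h/2)|
p1 = diff(log10(d)) ./ diff(log10(h(1:end-1)));
disp(p1)                                    % slopes approach p1 = 2
```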
Figure 4 (Using Slope to Find Error Exponents): Trapezoid Rule results, plotted as $\log_{10}(\mathrm{diff}(S))$ against $\log_{10}(h)$.
Table 2: Slope between step-sizes of the line for determining $p_2$ in Figure 4
Table 3: Extrapolation Results for $W(n)$

n     Number of Terms   First Round    Second         Third          Fourth
2     24                3.412384377    3.137572137    3.141356616    3.
4     29                3.274978257    3.140410496    3.141562164    3.
8     37                3.207694377    3.141274247    3.141588874    3.
16    47                3.174484312    3.141510218    3.141592187    3.
32    62                3.157997265    3.141571695    3.
64    82                3.14978448     3.
128   111               3.
Following our previous discussions, again we reiterate that the most accurate result of each column occurs in its last row. The extrapolation process was performed with $s = 1/2$ and $p_i = -i$. With series evaluations, we get more accurate results for a larger number of terms, which corresponds to a larger $n$, as opposed to integration and differentiation where we decreased $h$. As a result, the $p_i$ are negative; the sign in the exponent "changes twice" to maintain the validity of the expansion that Richardson Extrapolation uses. In Table 3, we see that after four rounds of extrapolation on the original calculations for $W(n)$, we have obtained $\pi$ to 10 decimal places. The largest $n$ used was $n = 128$; the sum in that case has 111 terms, as shown in the table. So in total, we evaluated the expression for the summand 392 times. If we used just $n = 2048$, we would require slightly more than 400 terms in our sum, but we would only get 4 digits of $\pi$. With $n = 8192$, we would get 5 digits of $\pi$ but at a cost of 800 terms. Going to $n = 32768$ yields 7 digits, but it requires a massive 1600 terms. From this example, the power of extrapolation should be even more obvious. The expression $W(n)$ converges extremely slowly: curve-fitting yields $W(n) \approx \pi + \frac{0.5237657}{n} + \frac{0.0291701}{n^2}$, which should be clear given the fairly poor result for $n$ as large as 32768. But with extrapolation methods, we arrived at the answer using an order of magnitude fewer summand terms, with greater accuracy to boot.
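Since the definition of $W(n)$ is not reproduced above, the MATLAB sketch below applies the same idea to a stand-in: the partial sums of $\sum_{k \ge 1} 1/k^2 = \pi^2/6$, whose error has an asymptotic expansion in powers of $1/n$. Writing $h = 1/n$ turns this into exactly the setting of equation (13) with $s = 2$ and integer error powers (an equivalent reformulation of using $s = 1/2$ with negative $p_i$):

```matlab
% Richardson extrapolation of the partial sums S_n = sum_{k=1}^{n} 1/k^2,
% for which S_n = pi^2/6 - 1/n + 1/(2n^2) - ...  (an expansion in powers of 1/n).
exact = pi^2 / 6;
m  = 6;                            % number of extrapolation rounds
n0 = 8;                            % coarsest number of terms
T  = zeros(m+1, m+1);              % lower-triangular extrapolation table
for j = 0:m
    n = n0 * 2^j;                  % n doubles each row, i.e. h = 1/n halves (s = 2)
    T(j+1, 1) = sum(1 ./ (1:n).^2);               % column 0: raw partial sums
    for k = 1:j
        % eliminate the O(n^-k) error term, cf. equation (13) with s = 2, p = k
        T(j+1, k+1) = T(j+1, k) + (T(j+1, k) - T(j, k)) / (2^k - 1);
    end
end
disp(abs(diag(T) - exact))         % the diagonal converges far faster than column 0
```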
6 Advanced Methods
Thus far we have concentrated on the Richardson Extrapolation process since it is the easiest to understand, not to mention that it has plenty of applications and is extremely useful. But there are many more schemes used in practice. So we will briefly discuss a small subset of those by attempting to generalize some of the notions presented already. Implicit in our work so far is the idea that the Richardson Extrapolation process used in differentiation and integration (among other things) operates on polynomial interpolants. Our method has been to begin with a standard scheme and perform extrapolation on it. These schemes are all based on polynomial interpolation–the Newton-Cotes formulas were even verified in this way. For example, the Trapezoid Rule performs a linear fit and Simpson’s Rule uses a quadratic interpolant. But there are other possibilities too–we have in no way limited ourselves
where $\rho_i(h) = \zeta(-s-i)\log(h) - \zeta'(-s-i)$. There are also analytic expressions for $a_i$ and $b_i$, but since these correspond to the $\alpha_k$ coefficients, their values are not of particular importance. From here we see that the $\phi_k(h)$ are known:

$$\phi_k(h) = \begin{cases} h^{s+i+1} & \text{if } i = \lfloor \frac{2k}{3} \rfloor,\ k = 1, 2, 4, 5, 7, 8, \ldots \\ h^{\frac{2k}{3}} & \text{if } k = 3, 6, 9, \ldots \end{cases} \qquad (22)$$
The solution of the system in the First Generalization is the $A_n^{(j)}$, which can be written as follows, using Cramer's Rule:

$$A_n^{(j)} = \frac{\Gamma_n^{(j)}(a)}{\Gamma_n^{(j)}(1)} \qquad (23)$$
For convenience, we have defined $a(m) = A(y_m)$, $g_k(m) = \phi_k(y_m)$, and $\Gamma_n^{(j)}(a)$ as the determinant of a matrix of $n+1$ rows and $n+1$ columns whose $i$th row is $[\,g_1(j+i-1)\ \ g_2(j+i-1)\ \cdots\ g_n(j+i-1)\ \ a(j+i-1)\,]$. Now define $\chi_p^{(j)}(b) = \frac{\Gamma_p^{(j)}(b)}{\Gamma_p^{(j)}(1)}$, so that $\chi_p^{(j)}(a) = A_p^{(j)}$, and we will also need to let $G_p^{(j)}$ be the determinant of a $p \times p$ matrix with $i$th row given by $[\,g_1(j+i-1)\ \ g_2(j+i-1)\ \cdots\ g_p(j+i-1)\,]$ for $p \ge 1$, and $G_0^{(j)} = 1$. Then we can use the Sylvester Determinant Identity to relate $\Gamma$ and $G$:

$$\Gamma_p^{(j)}(b)\, G_{p-1}^{(j+1)} = \Gamma_{p-1}^{(j+1)}(b)\, G_p^{(j)} - \Gamma_{p-1}^{(j)}(b)\, G_p^{(j+1)}$$

Then beginning from $\chi_p^{(j)}(b) = \frac{\Gamma_p^{(j)}(b)\, G_{p-1}^{(j+1)}}{\Gamma_p^{(j)}(1)\, G_{p-1}^{(j+1)}}$ and applying our relation of $\Gamma$ and $G$, we arrive at the final result of the E-Algorithm:
$$\chi_p^{(j)}(b) = \frac{\chi_{p-1}^{(j+1)}(b)\,\chi_{p-1}^{(j)}(g_p) - \chi_{p-1}^{(j)}(b)\,\chi_{p-1}^{(j+1)}(g_p)}{\chi_{p-1}^{(j)}(g_p) - \chi_{p-1}^{(j+1)}(g_p)}, \qquad (24)$$

where $b = a$ and $b = g_k$ for $k = p+1, p+2, \ldots$. The initial conditions are that $\chi_0^{(j)}(b) = b(j)$.
Note that we can save a considerable amount of computational work by setting $w_p^{(j)} = \frac{\chi_{p-1}^{(j+1)}(g_p)}{\chi_{p-1}^{(j)}(g_p)}$:

$$\chi_p^{(j)}(b) = \frac{\chi_{p-1}^{(j+1)}(b) - w_p^{(j)}\,\chi_{p-1}^{(j)}(b)}{1 - w_p^{(j)}} \qquad (25)$$
This forms the basis of the E-Algorithm. Note that what we have presented here does not actually have much practical application. The E-Algorithm solves systems of the form $A(y_l) = A_n^{(j)} + \sum_{k=1}^{n} \beta_k \phi_k(y_l)$. By itself it is not an extrapolative scheme; we have to first create some conditions to specify the $g_k(m)$. Picking the $g_k(m)$ in a useful manner is by no means an easy task, and there is ongoing research in that area. But given the generality of the E-Algorithm, many extrapolative schemes are based on it.
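For concreteness, the following MATLAB sketch implements recursion (25) directly (the interface, names, and the responsibility for supplying sensible $\phi_k$ are our own assumptions, not the paper's): given samples $a(j) = A(y_j)$ and a matrix $g(k, j) = \phi_k(y_j)$, it returns the extrapolated values $A_p^{(1)}$.

```matlab
% E-Algorithm via recursion (25).  a is a vector with a(j) = A(y_j); g is a
% matrix with g(k, j) = phi_k(y_j), with at least numel(a) - 1 rows.
function Ap = e_algorithm(a, g)
    m    = numel(a);
    chiA = a(:).';                 % chi_0^(j)(a)   = a(j)
    chiG = g;                      % chi_0^(j)(g_k) = g_k(j)
    Ap   = zeros(1, m);
    Ap(1) = chiA(1);               % A_0^(1)
    for p = 1:m-1
        J = m - p;                                          % valid j at level p
        w = chiG(p, 2:J+1) ./ chiG(p, 1:J);                 % w_p^(j)
        chiA = (chiA(2:J+1) - w .* chiA(1:J)) ./ (1 - w);   % eq. (25) with b = a
        newG = zeros(size(chiG, 1), J);
        for k = p+1:size(chiG, 1)
            newG(k, :) = (chiG(k, 2:J+1) - w .* chiG(k, 1:J)) ./ (1 - w);  % b = g_k
        end
        chiG = newG;
        Ap(p+1) = chiA(1);         % A_p^(1)
    end
end
```

With $\phi_k(y) = y^{p_k}$ and a decreasing sequence $y_j$, this reduces to the Richardson process described earlier.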
7 Conclusion
Extrapolation allows us to achieve greater precision with fewer function evaluations than un-extrapolated methods. In some cases (as we saw with simple difference schemes), extrapolation is the only way to improve numeric behavior using the same algorithm (i.e. without switching to centered differences or something better). Using sets of poor estimates computed with relatively large step-sizes, we can easily and quickly eliminate error modes, reaching results that would otherwise be unobtainable without significantly smaller step-sizes and many more function evaluations. Thus we turn relatively poor numeric schemes like the Trapezoid Rule into excellent performers. The material presented here is certainly not the limit of extrapolation methods, as we attempted to show in the last section. Still, as our examples demonstrate, the Richardson Extrapolation Process that we emphasized is more than sufficient for many applications including evaluating integrals, derivatives, infinite sums, infinite sequences, function approximations at a point (e.g. evaluating $\pi$ or $\gamma$), and more.
According to [1], it has been proven (by Delahaye and Germain-Bonne) that there cannot exist a general extrapolative scheme that works for all sequences. The concept of a working extrapolation scheme can be re-stated in the following way: if we apply an extrapolation method to a sequence $S_n$ (for example, $S_i = \mathrm{trap}(h/s^i)$), the resulting sequence $E_n$ must converge faster than $S_n$, meaning that $\lim_{n \to \infty} \frac{E_n - S}{S_n - S} = 0$. Thus the application of extrapolation to cases more difficult than the fundamental examples presented here is quite difficult, because for any given situation, we must first determine which scheme is best-suited for the problem. The metric for "best-suited" is not necessarily obvious either: it could be ease of implementation, number of function evaluations required, numeric stability, or others. Delahaye's negative result also means that each new extrapolative scheme is interesting in its own right, since it has the potential to solve a wider class of problems which no other method can attack.
On a closing note, we will now quickly outline how Romberg extrapolation can be used to easily derive Simpson's Rule from the Trapezoid Rule. More generally, extrapolating the $n = i$ Closed Newton-Cotes formula results in the $n = i+1$ version. By now, this should not be all that surprising, but without knowledge of extrapolation, this relation between the $i$th and $(i+1)$th formulas is not obvious at all. We start with $s = 2$ and $p = 2$; the choice of $p$ is based on the error term from the Closed Newton-Cotes formula. Alternatively, it could have been determined graphically, as shown earlier.
$$\frac{4\,\mathrm{trap}(\frac{h}{2}) - \mathrm{trap}(h)}{2^2 - 1} = \mathrm{Simp}\!\left(\frac{h}{2}\right)$$

$$= \frac{h}{3}\left[f(a) + 2\sum_{j=1}^{2n-1} f(x_j) + f(b)\right] - \frac{h}{6}\left[f(a) + 2\sum_{j=1}^{n-1} f(x_{2j}) + f(b)\right]$$

$$= \frac{h}{6}\left[f(a) + 2\sum_{j=1}^{n-1} f(x_{2j}) + 4\sum_{j=1}^{n} f(x_{2j-1}) + f(b)\right]$$

where the grid points are $x_j = a + j\frac{h}{2}$, so the $x_{2j}$ are the original Trapezoid points and the $x_{2j-1}$ are the new midpoints.
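A quick numerical confirmation of this identity in MATLAB (our own sketch; it reuses the `trap` function sketched earlier and compares against a directly coded composite Simpson's Rule):

```matlab
% Check that one Richardson step on the Trapezoid Rule reproduces the
% composite Simpson Rule with step h/2.
f = @(x) exp(x);   a = 0;   b = 1;   n = 8;        % n subintervals, h = (b - a)/n
h = (b - a) / n;
richardson = (4*trap(f, a, b, 2*n) - trap(f, a, b, n)) / 3;
x    = a : h/2 : b;                                % grid with spacing h/2
simp = (h/6) * (f(x(1)) + 4*sum(f(x(2:2:end-1))) ...
                + 2*sum(f(x(3:2:end-2))) + f(x(end)));
disp(abs(richardson - simp))                       % agrees to round-off
```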