Introduction to Real Analysis
Lee Larson
University of Louisville
August 4, 2017
About This Document
I often teach the MATH 501-502: Introduction to Real Analysis course at
the University of Louisville. The course is intended for a mix of mostly senior
mathematics majors with a smattering of other students from mathematics,
physics and engineering. These are notes I’ve compiled over the years. They
cover the basic ideas of analysis on the real line.
Prerequisites are a good calculus course, including standard differentiation
and integration methods for real functions, and a course in which the students
must read and write proofs. Some familiarity with basic set theory and standard
proof methods such as induction and contradiction is needed. The most important thing is some mathematical sophistication beyond the basic algorithm- and
computation-based courses.
Feel free to use these notes for any purpose, as long as you give me blame
or credit. In return, I only ask you to tell me about mistakes. Any suggestions
for improvements and additions are very much appreciated. I can be contacted
using the email address on the Web page referenced below.
The notes are updated and corrected quite often. The date of the version
you’re reading is at the bottom-left of most pages. The latest version is available
for download at the Web address math.louisville.edu/~lee/ira.
There are many exercises at the ends of the chapters. There is no general
collection of solutions.
Some early versions of the notes leaked out onto the Internet and they are
being offered by a few of the usual download sites. The early versions were meant
for me and my classes only, and contain many typos and a few — gasp! — outright
mistakes. Please help me expunge these escapees from the Internet by pointing
those who offer the older files to the latest version.
Contents
About This Document
Chapter 1. Basic Ideas
  1. Sets
  2. Algebra of Sets
  3. Indexed Sets
  4. Functions and Relations
  5. Cardinality
  6. Exercises
Chapter 2. The Real Numbers
  1. The Field Axioms
  2. The Order Axiom
  3. The Completeness Axiom
  4. Comparisons of Q and R
  5. Exercises
Chapter 3. Sequences
  1. Basic Properties
  2. Monotone Sequences
  3. Subsequences and the Bolzano-Weierstrass Theorem
  4. Lower and Upper Limits of a Sequence
  5. The Nested Interval Theorem
  6. Cauchy Sequences
  7. Exercises
Chapter 4. Series
  1. What is a Series?
  2. Positive Series
  3. Absolute and Conditional Convergence
  4. Rearrangements of Series
  5. Exercises
Chapter 5. The Topology of R
  1. Open and Closed Sets
  2. Relative Topologies and Connectedness
  3. Covering Properties and Compactness on R
  4. More Small Sets
  5. Exercises
Chapter 6. Limits of Functions
  1. Basic Definitions
  2. Unilateral Limits
  3. Continuity
  4. Unilateral Continuity
  5. Continuous Functions
  6. Uniform Continuity
  7. Exercises
Chapter 7. Differentiation
  1. The Derivative at a Point
  2. Differentiation Rules
  3. Derivatives and Extreme Points
  4. Differentiable Functions
  5. Applications of the Mean Value Theorem
  6. Exercises
Chapter 8. Integration
  1. Partitions
  2. Riemann Sums
  3. Darboux Integration
  4. The Integral
  5. The Cauchy Criterion
  6. Properties of the Integral
  7. The Fundamental Theorem of Calculus
  8. Change of Variables
  9. Integral Mean Value Theorems
  10. Exercises
Chapter 9. Sequences of Functions
  1. Pointwise Convergence
  2. Uniform Convergence
  3. Metric Properties of Uniform Convergence
  4. Series of Functions
  5. Continuity and Uniform Convergence
  6. Integration and Uniform Convergence
  7. Differentiation and Uniform Convergence
  8. Power Series
  9. Exercises
Chapter 10. Fourier Series
  1. Trigonometric Polynomials
  2. The Riemann-Lebesgue Lemma
  3. The Dirichlet Kernel
  4. Dini’s Test for Pointwise Convergence
  5. Gibbs Phenomenon
  6. A Divergent Fourier Series
  7. The Fejér Kernel
  8. Exercises
Appendix. Bibliography
Appendix. Index
CHAPTER 1
Basic Ideas
In the end, all mathematics can be boiled down to logic and set theory. Because of this, any careful presentation of fundamental mathematical ideas is
inevitably couched in the language of logic and sets. This chapter defines enough
of that language to allow the presentation of basic real analysis. Much of it will be
familiar to you, but look at it anyway to make sure you understand the notation.
1. Sets
Set theory is a large and complicated subject in its own right. There is no time
in this course to touch on any but the simplest parts of it. Instead, we’ll just look
at a few topics from what is often called “naive set theory,” many of which should
already be familiar to the reader.
We begin with a few definitions.
A set is a collection of objects called elements. Usually, sets are denoted by the
capital letters A, B, · · · , Z . A set can consist of any type and number of elements.
Even other sets can be elements of a set. The sets dealt with here usually have
real numbers as their elements.
If a is an element of the set A, we write a ∈ A. If a is not an element of the set
A, we write a ∉ A.
If all the elements of A are also elements of B , then A is a subset of B . In this
case, we write A ⊂ B or B ⊃ A. In particular, notice that whenever A is a set, then
A ⊂ A.
Two sets A and B are equal, if they have the same elements. In this case we
write A = B . It is easy to see that A = B iff A ⊂ B and B ⊂ A. Establishing that both
of these containments are true is the most common way to show that two sets
are equal.
If A ⊂ B and A ≠ B, then A is a proper subset of B. In cases when this is
important, it is written A ⊊ B instead of just A ⊂ B.
There are several ways to describe a set.
A set can be described in words such as “P is the set of all presidents of the
United States.” This is cumbersome for complicated sets.
All the elements of the set could be listed in curly braces as S = {2, 0, a}. If the
set has many elements, this is impractical, or impossible.
More common in mathematics is set builder notation. Some examples are
P = {p : p is a president of the United States}
= {Washington, Adams, Jefferson, · · · , Clinton, Bush, Obama, Trump}
and
A = {n : n is a prime number} = {2, 3, 5, 7, 11, · · · }.
In general, the set builder notation defines a set in the form
{formula for a typical element : objects to plug into the formula}.
A more complicated example is the set of perfect squares:
S = {n² : n is an integer} = {0, 1, 4, 9, · · · }.
The existence of several sets will be assumed. The simplest of these is the
empty set, which is the set with no elements. It is denoted as ∅. The set of natural
numbers is N = {1, 2, 3, · · · }, consisting of the positive integers. The set
Z = {· · · , −2, −1, 0, 1, 2, · · · } is the set of all integers. The set ω = {n ∈ Z : n ≥ 0} = {0, 1, 2, · · · }
consists of the nonnegative integers. Clearly, ∅ ⊂ A for any set A, and
∅ ⊂ N ⊂ ω ⊂ Z.
DEFINITION 1.1. Given any set A, the power set of A, written P(A), is the set
consisting of all subsets of A; i.e.,
P(A) = {B : B ⊂ A}.
For example, P({a, b}) = {∅, {a}, {b}, {a, b}}. Also, for any set A, it is always
true that ∅ ∈ P(A) and A ∈ P(A). If a ∈ A, it is rarely true that a ∈ P(A), but it is
always true that {a} ∈ P(A). Make sure you understand why!
An amusing example is P(∅) = {∅}. (Don’t confuse ∅ with {∅}! The former is
empty and the latter has one element.) Now, consider
P(∅) = {∅}
P(P(∅)) = {∅, {∅}}
P(P(P(∅))) = {∅, {∅}, {{∅}}, {∅, {∅}}}
After continuing this n times, for some n ∈ N, the resulting set,
P (P (· · · P ( ) · · · )),
is very large. In fact, since a set with k elements has 2^k elements in its power set,
there are 2^(2^(2^2)) = 65,536 elements after only five iterations of the example. Beyond
this, the numbers are too large to print. Number sequences such as this one are
sometimes called tetrations.
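The growth described above can be checked by brute force. The following Python sketch is only an illustration; the helper name power_set is an assumption introduced here and does not come from the notes.

```python
from itertools import chain, combinations


def power_set(s):
    """Return the power set of s as a frozenset of frozensets."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)


# Apply P repeatedly, starting from the empty set, and record the sizes.
current = frozenset()
sizes = []
for _ in range(5):
    current = power_set(current)
    sizes.append(len(current))

print(sizes)  # [1, 2, 4, 16, 65536]
```

A sixth iteration would need 2^65536 subsets, which is why the sketch stops at five.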
2. Algebra of Sets
Let A and B be sets. There are four common binary operations used on sets.¹
¹ In the following, some logical notation is used. The symbol ∨ is the logical nonexclusive “or.”
The symbol ∧ is the logical “and.” Their truth tables are as follows:
 ∧ | T  F        ∨ | T  F
 T | T  F        T | T  T
 F | F  F        F | T  F
FIGURE 1.1. These are Venn diagrams showing the four standard binary operations on sets. In this figure, the set which results from the
operation is shaded.
The union of A and B is the set containing all the elements in either A or B :
A ∪ B = {x : x ∈ A ∨ x ∈ B }.
The intersection of A and B is the set containing the elements contained in
both A and B :
A ∩ B = {x : x ∈ A ∧ x ∈ B }.
The difference of A and B is the set of elements in A and not in B :
A \ B = {x : x ∈ A ∧ x ∉ B }.
The symmetric difference of A and B is the set of elements in one of the sets,
but not the other:
A∆B = (A ∪ B ) \ (A ∩ B ).
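The four operations correspond directly to Python's built-in set operators. The sets A and B below are arbitrary examples chosen for this sketch, not taken from the notes.

```python
A = {1, 2, 3, 4}
B = {3, 4, 5}

union = A | B                  # {x : x in A or x in B}
intersection = A & B           # {x : x in A and x in B}
difference = A - B             # {x : x in A and x not in B}
symmetric_difference = A ^ B   # elements in exactly one of A and B

# The symmetric difference agrees with (A ∪ B) \ (A ∩ B).
assert symmetric_difference == (A | B) - (A & B)

print(union, intersection, difference, symmetric_difference)
```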
Another common set operation is complementation. The complement of a
set A is usually thought of as the set consisting of all elements which are not in A.
But, a little thinking will convince you this is not a meaningful definition because
the collection of elements not in A is not a precisely understood collection. To
make sense of the complement of a set, there must be a well-defined universal
set U which contains all the sets in question. Then the complement of a set A ⊂ U
is A^c = U \ A. It is usually the case that the universal set U is evident from the
context in which it is used.
With these operations, an extensive algebra for the manipulation of sets can
be developed. It’s usually done hand in hand with formal logic because the two
subjects share much in common. These topics are studied as part of Boolean
algebra.² Several examples of set algebra are given in the following theorem and
its corollary.
² George Boole (1815–1864)
THEOREM 1.2. Let A, B and C be sets.
(a) A \ (B ∪ C) = (A \ B) ∩ (A \ C)
(b) A \ (B ∩ C) = (A \ B) ∪ (A \ C)
PROOF. (a) This is proved as a sequence of equivalences.³
x ∈ A \ (B ∪C ) ⇐⇒ x ∈ A ∧ x ∉ (B ∪C )
⇐⇒ x ∈ A ∧ x ∉ B ∧ x ∉ C
⇐⇒ (x ∈ A ∧ x ∉ B ) ∧ (x ∈ A ∧ x ∉ C )
⇐⇒ x ∈ (A \ B ) ∩ (A \C )
(b) This is also proved as a sequence of equivalences.
x ∈ A \ (B ∩C ) ⇐⇒ x ∈ A ∧ x ∉ (B ∩C )
⇐⇒ x ∈ A ∧ (x ∉ B ∨ x ∉ C )
⇐⇒ (x ∈ A ∧ x ∉ B ) ∨ (x ∈ A ∧ x ∉ C )
⇐⇒ x ∈ (A \ B ) ∪ (A \C )
Theorem 1.2 is a version of a pair of set equations which are often called
De Morgan’s Laws.⁴ The more usual statement of De Morgan’s Laws is in Corollary 1.3, which is an obvious consequence of Theorem 1.2 when there is a universal set to make complementation well-defined.
COROLLARY 1.3 (De Morgan’s Laws). Let A and B be sets.
(a) (A ∪ B)^c = A^c ∩ B^c
(b) (A ∩ B)^c = A^c ∪ B^c
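Theorem 1.2 and Corollary 1.3 can be sanity-checked by exhausting every choice of subsets of a small universal set. This is a finite check under assumptions made here (the universal set U = {0, 1, 2, 3} and the helper names are illustrative), not a substitute for the proof.

```python
from itertools import chain, combinations

U = frozenset(range(4))  # small universal set, chosen only for illustration


def subsets(s):
    """Yield every subset of s as a frozenset."""
    items = list(s)
    for combo in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    ):
        yield frozenset(combo)


def complement(a):
    """Complement of a relative to the universal set U."""
    return U - a


def check_all():
    """Return True if all four identities hold for all subsets of U."""
    all_subsets = list(subsets(U))
    for a in all_subsets:
        for b in all_subsets:
            # Corollary 1.3 (De Morgan's Laws)
            if complement(a | b) != complement(a) & complement(b):
                return False
            if complement(a & b) != complement(a) | complement(b):
                return False
            for c in all_subsets:
                # Theorem 1.2 (a) and (b)
                if a - (b | c) != (a - b) & (a - c):
                    return False
                if a - (b & c) != (a - b) | (a - c):
                    return False
    return True


print(check_all())  # True
```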
3. Indexed Sets
We often have occasion to work with large collections of sets. For example,
we could have a sequence of sets A₁, A₂, A₃, · · · , where there is a set A_n associated
with each n ∈ N. In general, let Λ be a set and suppose for each λ ∈ Λ there is a
set A_λ. The set {A_λ : λ ∈ Λ} is called a collection of sets indexed by Λ. In this case,
Λ is called the indexing set for the collection.
EXAMPLE 1.1. For each n ∈ N, let A_n = {k ∈ Z : k² ≤ n}. Then
A₁ = A₂ = A₃ = {−1, 0, 1}, A₄ = {−2, −1, 0, 1, 2}, · · · ,
A₆₁ = {−7, −6, −5, −4, −3, −2, −1, 0, 1, 2, 3, 4, 5, 6, 7}, · · ·
is a collection of sets indexed by N.
³ The logical symbol ⇐⇒ is the same as “if, and only if.” If A and B are any two statements,
then A ⇐⇒ B is the same as saying A implies B and B implies A. It is also common to use iff in
this way.
⁴ Augustus De Morgan (1806–1871)
Two of the basic binary operations can be extended to work with indexed collections. In particular, using the indexed collection from the previous paragraph,
we define
⋃_{λ∈Λ} A_λ = {x : x ∈ A_λ for some λ ∈ Λ}
and
⋂_{λ∈Λ} A_λ = {x : x ∈ A_λ for all λ ∈ Λ}.
De Morgan’s Laws can be generalized to indexed collections.
THEOREM 1.4. If {B_λ : λ ∈ Λ} is an indexed collection of sets and A is a set, then
A \ ⋃_{λ∈Λ} B_λ = ⋂_{λ∈Λ} (A \ B_λ)
and
A \ ⋂_{λ∈Λ} B_λ = ⋃_{λ∈Λ} (A \ B_λ).
PROOF. The proof of this theorem is Exercise 1.4.
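As an illustration, the family from Example 1.1 can be built for finitely many indices and used to test both the indexed union and intersection and the two identities of Theorem 1.4. The finite index range 1..61, the set S standing in for A, and the function name A are assumptions of this sketch.

```python
from math import isqrt


def A(n):
    """A_n = {k in Z : k^2 <= n}, as in Example 1.1."""
    r = isqrt(n)
    return set(range(-r, r + 1))


index_set = range(1, 62)            # finite stand-in for the indexing set
family = [A(n) for n in index_set]

big_union = set().union(*family)                # x in A_n for some n
big_intersection = set.intersection(*family)    # x in A_n for all n

# Theorem 1.4, with the family playing the role of the B's and S the role of A.
S = set(range(-10, 11))
lhs_union = S - big_union
rhs_union = set.intersection(*[S - B for B in family])
lhs_inter = S - big_intersection
rhs_inter = set().union(*[S - B for B in family])

print(lhs_union == rhs_union, lhs_inter == rhs_inter)  # True True
```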
4. Functions and Relations
4.1. Tuples. When listing the elements of a set, the order in which they are
listed is unimportant; e.g., {e, l, v, i, s} = {l, i, v, e, s}. If the order in which n items
are listed is important, the list is called an n-tuple. (Strictly speaking, an n-tuple
is not a set.) We denote an n-tuple by enclosing the ordered list in parentheses.
For example, if x₁, x₂, x₃, x₄ are four items, the 4-tuple (x₁, x₂, x₃, x₄) is different
from the 4-tuple (x₂, x₁, x₃, x₄).
Because they are used so often, the cases when n = 2 and n = 3 have special
names: a 2-tuple is called an ordered pair and a 3-tuple is called an ordered triple.
DEFINITION 1.5. Let A and B be sets. The set of all ordered pairs
A × B = {(a, b) : a ∈ A ∧ b ∈ B}
is called the Cartesian product of A and B.⁵
EXAMPLE 1.2. If A = {a, b, c} and B = {1, 2}, then
A × B = {(a, 1), (a, 2), (b, 1), (b, 2), (c, 1), (c, 2)}
and
B × A = {(1, a), (1, b), (1, c), (2, a), (2, b), (2, c)}.
Notice that A × B ≠ B × A because of the importance of order in the ordered pairs.
A useful way to visualize the Cartesian product of two sets is as a table. The
Cartesian product A × B from Example 1.2 is listed as the entries of the following
table.
 × |   1       2
 a | (a, 1)  (a, 2)
 b | (b, 1)  (b, 2)
 c | (c, 1)  (c, 2)
⁵ René Descartes (1596–1650)
Of course, the common Cartesian plane from your analytic geometry course
is nothing more than a generalization of this idea of listing the elements of a
Cartesian product as a table.
The definition of Cartesian product can be extended to the case of more than
two sets. If A₁, A₂, · · · , A_n are sets, then
A₁ × A₂ × · · · × A_n = {(a₁, a₂, · · · , a_n) : a_k ∈ A_k for 1 ≤ k ≤ n}
is a set of n-tuples. This is often written as
∏_{k=1}^{n} A_k = A₁ × A₂ × · · · × A_n.
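Cartesian products of finite sets can be listed with itertools.product from the Python standard library. The sketch below reuses the sets of Example 1.2; the variable names are chosen here for illustration.

```python
from itertools import product

A = {"a", "b", "c"}
B = {1, 2}

AxB = set(product(A, B))   # ordered pairs (a, b) with a in A and b in B
BxA = set(product(B, A))   # ordered pairs (b, a) with b in B and a in A

print(len(AxB), len(BxA))  # 6 6
print(AxB == BxA)          # False: order matters in ordered pairs

# A product of more than two sets yields n-tuples, here ordered triples.
triples = set(product(A, B, B))
print(len(triples))        # 12
```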
4.2. Relations.
DEFINITION 1.6. If A and B are sets, then any R ⊂ A × B is a relation from A
to B. If (a, b) ∈ R, we write aRb.
In this case,
dom (R) = {a : (a, b) ∈ R} ⊂ A
is the domain of R and
ran (R) = {b : (a, b) ∈ R} ⊂ B
is the range of R. It may happen that dom (R) and ran (R) are proper subsets of A
and B , respectively.
In the special case when R ⊂ A × A, for some set A, there is some additional
terminology.
R is symmetric, if aRb ⇐⇒ bRa.
R is reflexive, if aRa whenever a ∈ dom (R).
R is transitive, if aRb ∧ bRc =⇒ aRc.
R is an equivalence relation on A, if it is symmetric, reflexive and transitive.
EXAMPLE 1.3. Let R be the relation on Z × Z defined by aRb ⇐⇒ a ≤ b. Then
R is reflexive and transitive, but not symmetric.
EXAMPLE 1.4. Let R be the relation on Z × Z defined by aRb ⇐⇒ a < b. Then
R is transitive, but neither reflexive nor symmetric.
EXAMPLE 1.5. Let R be the relation on Z × Z defined by aRb ⇐⇒ a² = b². In
this case, R is an equivalence relation. It is evident that aRb iff b = a or b = −a.
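The three examples can be tested on a finite window of integers by representing a relation as a set of ordered pairs. This is a sketch under assumptions made here: the window {−3, ..., 3} and the function names are illustrative, and reflexivity is tested on the domain of the relation, as in the definition above.

```python
def domain(R):
    """dom(R) = {a : (a, b) in R}."""
    return {a for (a, b) in R}


def is_reflexive(R):
    return all((a, a) in R for a in domain(R))


def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)


def is_transitive(R):
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)


window = range(-3, 4)  # finite stand-in for Z
leq = {(a, b) for a in window for b in window if a <= b}
lt = {(a, b) for a in window for b in window if a < b}
same_square = {(a, b) for a in window for b in window if a * a == b * b}

for name, R in [("<=", leq), ("<", lt), ("a^2 = b^2", same_square)]:
    print(name, is_reflexive(R), is_symmetric(R), is_transitive(R))
```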
4.3. Functions.
DEFINITION 1.7. A relation R ⊂ A × B is a function if
aRb₁ ∧ aRb₂ =⇒ b₁ = b₂.
If f ⊂ A × B is a function and dom f = A, then we usually write f : A → B
and use the usual notation f(a) = b instead of a f b.
If f : A → B is a function, the usual intuitive interpretation is to regard f
as a rule that associates each element of A with a unique element of B . It’s not
necessarily the case that each element of B is associated with something from A;
i.e., B may not be ran f . It’s also common for more than one element of A to be
associated with the same element of B .
EXAMPLE 1.6. Define f : N → Z by f(n) = n² and g : Z → Z by g(n) = n². In
this case ran f = {n² : n ∈ N} and ran g = ran f ∪ {0}. Notice that even though
f and g use the same formula, they are actually different functions.
DEFINITION 1.8. If f : A → B and g : B → C, then the composition of g with f
is the function g ◦ f : A → C defined by g ◦ f(a) = g(f(a)).
In Example 1.6, g ◦ f(n) = g(f(n)) = g(n²) = (n²)² = n⁴ makes sense for all
n ∈ N, but f ◦ g is undefined at n = 0 because g(0) = 0 ∉ N.
There are several important types of functions.
DEFINITION 1.9. A function f : A → B is a constant function, if ran f has a
single element; i.e., there is a b ∈ B such that f (a) = b for all a ∈ A. The function
f is surjective (or onto B ), if ran f = B .
In a sense, constant and surjective functions are the opposite extremes. A
constant function has the smallest possible range and a surjective function has
the largest possible range. Of course, a function f : A → B can be both constant
and surjective, if B has only one element.
DEFINITION 1.10. A function f : A → B is injective (or one-to-one), if f(a) =
f(b) implies a = b.
The terminology “one-to-one” is very descriptive because such a function
uniquely pairs up the elements of its domain and range. An illustration of this
definition is in Figure 1.2. In Example 1.6, f is injective while g is not.
DEFINITION 1.11. A function f : A → B is bijective, if it is both surjective and
injective.
A bijective function can be visualized as uniquely pairing up all the elements
of A and B. Some authors, favoring less pretentious language, use the more
descriptive terminology one-to-one correspondence instead of bijection. This
pairing up of the elements from each set is like counting them and finding they
have the same number of elements. Given any two sets, no matter how many
elements they have, the intuitive idea is they have the same number of elements
if, and only if, there is a bijection between them.
The following theorem shows that this property of counting the number of
elements works in a familiar way. (Its proof is left as an easy exercise.)
THEOREM 1.12. If f : A → B and g : B → C are bijections, then g ◦ f : A → C is
a bijection.
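For finite sets, these definitions can be tested directly by storing a function as a Python dict from domain elements to values. The helper names and the sample functions below are assumptions of this sketch; the last two dicts restrict the squaring functions of Example 1.6 to finite windows.

```python
def is_injective(f):
    """No two domain elements share a value."""
    return len(set(f.values())) == len(f)


def is_surjective(f, codomain):
    """Every element of the codomain is a value of f."""
    return set(f.values()) == set(codomain)


def is_bijective(f, codomain):
    return is_injective(f) and is_surjective(f, codomain)


def compose(g, f):
    """Return g ∘ f as a dict; every value of f must be a key of g."""
    return {a: g[f[a]] for a in f}


B = ["x", "y", "z"]
C = [10, 20, 30]
f = {1: "y", 2: "z", 3: "x"}        # a bijection from {1, 2, 3} onto B
g = {"x": 30, "y": 10, "z": 20}     # a bijection from B onto C
h = compose(g, f)

print(is_bijective(f, B), is_bijective(g, C), is_bijective(h, C))  # True True True

# Squaring on a window of N is injective; on a window of Z it is not.
square_on_N = {n: n * n for n in range(1, 6)}
square_on_Z = {n: n * n for n in range(-5, 6)}
print(is_injective(square_on_N), is_injective(square_on_Z))  # True False
```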
4.4. Inverse Functions.
DEFINITION 1.13. If f : A → B, C ⊂ A and D ⊂ B, then the image of C is the
set f(C) = {f(a) : a ∈ C}. The inverse image of D is the set f⁻¹(D) = {a : f(a) ∈ D}.
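A small sketch of images and inverse images for a finite function follows. The function is stored as a dict, and the names image and inverse_image, along with the sample function, are assumptions made here. Note that an inverse image is defined even when the function is not injective or surjective.

```python
def image(f, C):
    """f(C) = {f(a) : a in C}."""
    return {f[a] for a in C}


def inverse_image(f, D):
    """f^{-1}(D) = {a : f(a) in D}."""
    return {a for a in f if f[a] in D}


# Squaring on the window {-3, ..., 3}.
f = {n: n * n for n in range(-3, 4)}

print(image(f, {-2, 2, 3}))       # {4, 9}
print(inverse_image(f, {4, 5}))   # {-2, 2}
print(inverse_image(f, {5}))      # set(), since 5 is not a value of f
```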
FIGURE 1.2. These diagrams show two functions, f : A → B and g :
A → B. The function g is injective and f is not because f(a) = f(c).
Definitions 1.11 and 1.13 work together in the following way. Suppose f : A →
B is bijective and b ∈ B. The fact that f is surjective guarantees that f⁻¹({b}) ≠ ∅.
Since f is injective, f⁻¹({b}) contains only one element, say a, where f(a) = b. In
this way, it is seen that f⁻¹ is a rule that assigns each element of B to exactly one
element of A; i.e., f⁻¹ is a function with domain B and range A.
DEFINITION 1.14. If f : A → B is bijective, the inverse of f is the function
f⁻¹ : B → A with the property that f⁻¹ ◦ f(a) = a for all a ∈ A and f ◦ f⁻¹(b) = b
for all b ∈ B.⁶
There is some ambiguity in the meaning of f⁻¹ between Definitions 1.13 and 1.14. The
former is an operation working with subsets of A and B ; the latter is a function
working with elements of A and B . It’s usually clear from the context which
meaning is being used.
EXAMPLE 1.7. Let A = N and B be the even natural numbers. If f : A → B is
f(n) = 2n and g : B → A is g(n) = n/2, it is clear f is bijective. Since f ◦ g(n) =
f(n/2) = 2n/2 = n and g ◦ f(n) = g(2n) = 2n/2 = n, we see g = f⁻¹. (Of course, it
is also true that f = g⁻¹.)
EXAMPLE 1.8. Let f : N → Z be defined by
f(n) = (n − 1)/2, if n is odd,
f(n) = −n/2, if n is even.
⁶ The notation f⁻¹(x) for the inverse is unfortunate because it is so easily confused with the
multiplicative inverse, (f(x))⁻¹. For a discussion of this, see [8]. The context is usually enough to
avoid confusion.
FIGURE 1.3. This is one way to visualize a general invertible function.
First f does something to a and then f⁻¹ undoes it.
It’s quite easy to see that f is bijective and
f⁻¹(n) = 2n + 1, if n ≥ 0,
f⁻¹(n) = −2n, if n < 0.
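The function of Example 1.8 and its inverse can be written out and checked on a finite range. The names f and f_inv are chosen here for illustration, and N is taken to start at 1, as in these notes. A finite round-trip check is evidence, not a proof of bijectivity.

```python
def f(n):
    """f : N -> Z from Example 1.8, where N = {1, 2, 3, ...}."""
    return (n - 1) // 2 if n % 2 == 1 else -(n // 2)


def f_inv(m):
    """The inverse f^{-1} : Z -> N from Example 1.8."""
    return 2 * m + 1 if m >= 0 else -2 * m


print([f(n) for n in range(1, 8)])   # [0, -1, 1, -2, 2, -3, 3]

# Both compositions act as the identity on the ranges tested.
print(all(f_inv(f(n)) == n for n in range(1, 1001)))      # True
print(all(f(f_inv(m)) == m for m in range(-500, 501)))    # True
```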
Given any set A, it’s obvious there is a bijection f : A → A and, if g : A →
B is a bijection, then so is g⁻¹ : B → A. Combining these observations with
Theorem 1.12, an easy theorem follows.
THEOREM 1.15. Let S be a collection of sets. The relation on S defined by
A ∼ B ⇐⇒ there is a bijection f : A → B
is an equivalence relation.
4.5. Schröder-Bernstein Theorem. The following theorem is a powerful
tool in set theory, and shows that a seemingly intuitively obvious statement
is sometimes difficult to verify. It will be used in Section 5.
THEOREM 1.16 (Schröder-Bernstein⁷). Let A and B be sets. If there are injective
functions f : A → B and g : B → A, then there is a bijective function h : A → B.
PROOF. Let B₁ = B \ f(A). If B_k ⊂ B is defined for some k ∈ N, let A_k = g(B_k)
and B_{k+1} = f(A_k). This inductively defines A_k and B_k for all k ∈ N. Use these sets
to define Ã = ⋃_{k∈N} A_k and h : A → B as
h(x) = g⁻¹(x), if x ∈ Ã,
h(x) = f(x), if x ∈ A \ Ã.
It must be shown that h is well-defined, injective and surjective.
To show h is well-defined, let x ∈ A. If x ∈ A \ Ã, then it is clear h(x) = f(x) is
defined. On the other hand, if x ∈ Ã, then x ∈ A_k for some k. Since x ∈ A_k = g(B_k),
we see h(x) = g⁻¹(x) is defined. Therefore, h is well-defined.
To show h is injective, let x, y ∈ A with x ≠ y.
If both x, y ∈ Ã or x, y ∈ A \ Ã, then the assumptions that g and f are injective,
respectively, imply h(x) ≠ h(y).
⁷ Felix Bernstein (1878–1956), Ernst Schröder (1841–1902).
This is often called the Cantor-Schröder-Bernstein or Cantor-Bernstein Theorem, despite the fact
that it was apparently first proved by Richard Dedekind (1831–1916).
FIGURE 1.4. Here are the first few steps from the construction used in
the proof of Theorem 1.16.
The remaining case is when x ∈ Ã and y ∈ A \ Ã. Suppose x ∈ A_k and h(x) =
h(y). If k = 1, then h(x) = g⁻¹(x) ∈ B₁ and h(y) = f(y) ∈ f(A) = B \ B₁. This is
clearly incompatible with the assumption that h(x) = h(y). Now, suppose k > 1.
Then there is an x₁ ∈ B₁ such that
x = g ◦ f ◦ g ◦ f ◦ · · · ◦ f ◦ g(x₁)   (k − 1 f’s and k g’s).
This implies
h(x) = g⁻¹(x) = f ◦ g ◦ f ◦ · · · ◦ f ◦ g(x₁) = f(y)   (k − 1 f’s and k − 1 g’s)
so that
y = g ◦ f ◦ g ◦ f ◦ · · · ◦ f ◦ g(x₁) ∈ A_{k−1} ⊂ Ã   (k − 2 f’s and k − 1 g’s).
This contradiction shows that h(x) ≠ h(y). We conclude h is injective.
To show h is surjective, let y ∈ B. If y ∈ B_k for some k, then h(A_k) = g⁻¹(A_k) =
B_k shows y ∈ h(A). If y ∉ B_k for any k, y ∈ f(A) because B₁ = B \ f(A), and
g(y) ∉ Ã, so y = h(x) = f(x) for some x ∈ A. This shows h is surjective.
The Schröder-Bernstein theorem has many consequences, some of which
are at first a bit unintuitive, such as the following corollary.
COROLLARY 1.17. There is a bijective function h : N → N × N.
PROOF. If f : N → N × N is f(n) = (n, 1), then f is clearly injective. On the
other hand, suppose g : N × N → N is defined by g((a, b)) = 2^a 3^b. The uniqueness
of prime factorizations guarantees g is injective. An application of Theorem 1.16
yields h.
To appreciate the power of the Schröder-Bernstein theorem, try to find an
explicit bijection h : N → N × N.
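The construction in the proof of Theorem 1.16 can be traced by machine for the two injections used in Corollary 1.17. The sketch below is one way to do this and rests on assumptions made here: the helper names g_inverse and in_A_tilde are hypothetical, and membership in Ã is decided by following the chain x ∈ A_{k+1} ⇐⇒ x = g(f(a)) with a ∈ A_k backwards until it stops. The function it computes is the h produced by the proof, which is not a simple closed formula.

```python
def g_inverse(x):
    """Return (a, b) with 2**a * 3**b == x and a, b >= 1, or None if x is not in ran g."""
    a = 0
    while x % 2 == 0:
        x //= 2
        a += 1
    b = 0
    while x % 3 == 0:
        x //= 3
        b += 1
    if x == 1 and a >= 1 and b >= 1:
        return (a, b)
    return None


def in_A_tilde(x):
    """Decide whether x lies in some A_k by tracing the construction backwards."""
    while True:
        pair = g_inverse(x)
        if pair is None:
            return False   # x is not in g(N x N), so it lies in no A_k
        a, b = pair
        if b != 1:
            return True    # pair is in B_1 = (N x N) \ f(N), so x is in A_1
        x = a              # pair = f(a), so x is in A_{k+1} iff a is in A_k


def h(x):
    """The map from the proof: g^{-1} on A-tilde, f elsewhere, with f(n) = (n, 1)."""
    return g_inverse(x) if in_A_tilde(x) else (x, 1)


print([h(n) for n in (1, 6, 12, 18)])  # [(1, 1), (6, 1), (12, 1), (1, 2)]
```

The loop in in_A_tilde terminates because each step replaces x = 2^a · 3 by the strictly smaller number a.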
5. Cardinality
There is a way to use sets and functions to formalize and generalize how we
count. For example, suppose we want to count how many elements are in the set
{a, b, c}. The natural way to do this is to point at each element in succession and
say “one, two, three.” What we’re doing is defining a bijective function between
{a, b, c} and the set {1, 2, 3}. This idea can be generalized.
DEFINITION 1.18. Given n ∈ N, the set n̄ = {1, 2, · · · , n} is called an initial
segment of N. The trivial initial segment is 0̄ = ∅. A set S has cardinality n, if
there is a bijective function f : S → n̄. In this case, we write card (S) = n.
The cardinalities defined in Definition 1.18 are called the finite cardinal
numbers. They correspond to the everyday counting numbers we usually use.
The idea can be generalized still further.
DEFINITION 1.19. Let A and B be two sets. If there is an injective function
f : A → B , we say card (A) ≤ card (B ).
According to Theorem 1.16, the Schröder-Bernstein Theorem, if card (A) ≤
card (B ) and card (B ) ≤ card (A), then there is a bijective function f : A → B . As
expected, in this case we write card (A) = card (B ). When card (A) ≤ card (B ), but
no such bijection exists, we write card (A) < card (B ). Theorem 1.15 shows that
card (A) = card (B ) is an equivalence relation between sets.
The idea here, of course, is that card (A) = card (B ) means A and B have the
same number of elements and card (A) < card (B ) means A is a smaller set than
B . This simple intuitive understanding has some surprising consequences when
the sets involved do not have finite cardinality.
In particular, the set A is countably infinite, if card (A) = card (N). In this case,
it is common to write card (N) = ℵ₀.⁸ When card (A) ≤ ℵ₀, then A is said to be
a countable set. In other words, the countable sets are those having finite or
countably infinite cardinality.
EXAMPLE 1.9. Let f : N → Z be defined as
f(n) = (n + 1)/2, when n is odd,
f(n) = 1 − n/2, when n is even.
It’s easy to show f is a bijection, so card (N) = card (Z) = ℵ₀.
THEOREM 1.20. Suppose A and B are countable sets.
(a) A × B is countable.
(b) A ∪ B is countable.
PROOF. (a) This is a consequence of Corollary 1.17.
(b) This is Exercise 1.21.
⁸ The symbol ℵ is the Hebrew letter “aleph” and ℵ₀ is usually pronounced “aleph nought.”
An alert reader will have noticed from previous examples that
ℵ₀ = card (Z) = card (ω) = card (N) = card (N × N) = card (N × N × N) = · · ·
A logical question is whether all sets either have finite cardinality, or are
countably infinite. That this is not so is seen by letting S = N in the following
theorem.
THEOREM 1.21. If S is a set, card (S) < card (P(S)).
PROOF. Noting that 0 = card (∅) < 1 = card (P(∅)), the theorem is true when
S is empty.
Suppose S ≠ ∅. Since {a} ∈ P(S) for all a ∈ S, it follows that card (S) ≤
card (P(S)). Therefore, it suffices to prove there is no surjective function f :
S → P(S).
To see this, assume there is such a function f and let T = {x ∈ S : x ∉ f (x)}.
Since f is surjective, there is a t ∈ S such that f (t ) = T . Either t ∈ T or t ∉ T .
If t ∈ T = f (t ), then the definition of T implies t ∉ T , a contradiction. On
the other hand, if t ∉ T = f (t ), then the definition of T implies t ∈ T , another
contradiction. These contradictions lead to the conclusion that no such function
f can exist.
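The diagonal argument in this proof can be watched in action on a small finite set. The sketch below enumerates every function f : S → P(S) for S = {0, 1, 2} and confirms that the set T = {x ∈ S : x ∉ f(x)} is never a value of f. The set S and the helper names are assumptions chosen for illustration. A finite enumeration only illustrates the argument, which works for every set.

```python
from itertools import chain, combinations, product


def power_set(s):
    """Return all subsets of s as a list of frozensets."""
    items = sorted(s)
    return [
        frozenset(c)
        for c in chain.from_iterable(
            combinations(items, r) for r in range(len(items) + 1)
        )
    ]


def diagonal_set(f):
    """T = {x in S : x not in f(x)} for a function f stored as a dict."""
    return frozenset(x for x in f if x not in f[x])


S = [0, 1, 2]
P = power_set(S)

total = 0
missed = 0
for images in product(P, repeat=len(S)):   # every function S -> P(S)
    f = dict(zip(S, images))
    total += 1
    if diagonal_set(f) not in set(f.values()):
        missed += 1

print(total, missed)  # 512 512
```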
A set S is said to be uncountably infinite, or just uncountable, if ℵ₀ < card (S).
Theorem 1.21 implies ℵ₀ < card (P(N)), so P(N) is uncountable. In fact, the same
argument implies
ℵ₀ = card (N) < card (P(N)) < card (P(P(N))) < · · ·
So, there are an infinite number of distinct infinite cardinalities.
In 1874 Georg Cantor [5] proved card (R) = card (P(N)) > ℵ₀, where R is the
set of real numbers. (A version of Cantor’s theorem appears in Theorem 2.27
below.) This naturally led to the question whether there are sets S such that
ℵ₀ < card (S) < card (R). Cantor spent many years trying to answer this question and
never succeeded. His assumption that no such sets exist came to be called the
continuum hypothesis.
The importance of the continuum hypothesis was highlighted by David
Hilbert at the 1900 International Congress of Mathematicians in Paris, when
he put it first on his famous list of the 23 most important open problems in mathematics. Kurt Gödel proved in 1940 that the continuum hypothesis cannot be
disproved using standard set theory, but he did not prove it was true. In 1963 it
was proved by Paul Cohen that the continuum hypothesis is actually unprovable
as a theorem in standard set theory.
So, the continuum hypothesis is a statement with the strange property that
it can be neither proved nor disproved within the framework of ordinary set theory. This
means that in the standard axiomatic development of set theory, the continuum
hypothesis, or a careful negation of it, can be taken as an additional axiom without
causing any contradictions. The technical terminology is that the continuum
hypothesis is independent of the axioms of set theory.
The proofs of these theorems are extremely difficult and entire broad areas of
mathematics were invented just to make their proofs possible. Even today, there
are some deep philosophical questions swirling around them. A more technical
introduction to many of these ideas is contained in the book by Ciesielski [9]. A
nontechnical and very readable history of the efforts by mathematicians to understand the continuum hypothesis is the book by Aczel [1]. A shorter, nontechnical
account of Cantor’s work is in an article by Dauben [10].
6. Exercises
1.1. If a set S has n elements for n ∈ ω, then how many elements are in P (S)?
1.2. Is there a set S such that S ∩ P(S) ≠ ∅?
1.3. Prove that for any sets A and B ,
(a) A = (A ∩ B ) ∪ (A \ B )
(b) A ∪ B = (A \ B ) ∪ (B \ A) ∪ (A ∩ B ) and that the sets A \ B , B \ A and A ∩ B
are pairwise disjoint.
(c) A \ B = A ∩ B^c.
1.4. Prove Theorem 1.4.
1.5. For any sets A, B , C and D,
(A × B ) ∩ (C × D) = (A ∩C ) × (B ∩ D)
and
(A × B ) ∪ (C × D) ⊂ (A ∪C ) × (B ∪ D).
Why does equality not hold in the second expression?
1.6. Prove Theorem 1.15.
1.7. Suppose R is an equivalence relation on A. For each x ∈ A define C_x = {y ∈
A : xRy}. Prove that if x, y ∈ A, then either C_x = C_y or C_x ∩ C_y = ∅. (The collection
{C_x : x ∈ A} is the set of equivalence classes induced by R.)
1.8. If f : A → B and g : B → C are bijections, then so is g ◦ f : A → C .
1.9. Prove or give a counterexample: f : X → Y is injective iff whenever A and B
are disjoint subsets of Y, then f⁻¹(A) ∩ f⁻¹(B) = ∅.
1.10. If f : A → B is bijective, then f⁻¹ is unique.
1.11. Prove that f : X → Y is surjective iff for each subset A ⊂ X , Y \ f (A) ⊂
f (X \ A).
1.12. Suppose that A_k is a set for each positive integer k.
(a) Show that x ∈ ⋂_{n=1}^∞ ⋃_{k=n}^∞ A_k iff x ∈ A_k for infinitely many sets A_k.
(b) Show that x ∈ ⋃_{n=1}^∞ ⋂_{k=n}^∞ A_k iff x ∈ A_k for all but finitely many of the
sets A_k.
The set ⋂_{n=1}^∞ ⋃_{k=n}^∞ A_k from (a) is often called the superior limit of the sets A_k
and ⋃_{n=1}^∞ ⋂_{k=n}^∞ A_k is often called the inferior limit of the sets A_k.
1.13. Given two sets A and B, it is common to let A^B denote the set of all
functions f : B → A. Prove that for any set A, card (2^A) = card (P(A)). This is why
many authors use 2^A as their notation for P(A).
1.14. Let S be a set. Prove the following two statements are equivalent:
(a) S is infinite; and,
(b) there is a proper subset T of S and a bijection f : S → T .
This statement is often used as the definition of when a set is infinite.
1.15. If S is an infinite set, then there is a countably infinite collection of
nonempty pairwise disjoint infinite sets T_n, n ∈ N such that S = ⋃_{n∈N} T_n.
1.16. Without using the Schröder-Bernstein theorem, find a bijection f : [0, 1] →
(0, 1).
1.17. If f : [0, ∞) → (0, ∞) and g : (0, ∞) → [0, ∞) are given by f(x) = x + 1 and
g(x) = x, then the proof of the Schröder-Bernstein theorem yields what bijection
h : [0, ∞) → (0, ∞)?
1.18. Find a function f : R \ {0} → R \ {0} such that f⁻¹ = 1/f.
1.19. Find a bijection f : [0, ∞) → (0, ∞).
1.20. If f : A → B and g : B → A are functions such that f ◦ g(x) = x for all x ∈ B
and g ◦ f(x) = x for all x ∈ A, then f^{-1} = g.
1.21. If A and B are sets such that card (A) = card (B) = ℵ0, then card (A ∪ B) = ℵ0.
1.22. Using the notation from the proof of the Schröder-Bernstein Theorem, let
A = [0, ∞), B = (0, ∞), f(x) = x + 1 and g(x) = x. Determine h(x).
1.23. Using the notation from the proof of the Schröder-Bernstein Theorem, let
A = N, B = Z, f(n) = n and
\[ g(n) = \begin{cases} 1 - 3n, & n \le 0 \\ 3n - 1, & n > 0 \end{cases}. \]
Calculate h(6) and h(7).
1.24. Suppose that in the statement of the Schröder-Bernstein theorem A =
B = Z and f(n) = g(n) = 2n. Following the procedure in the proof yields what
function h?
1.25. If {A_n : n ∈ N} is a collection of countable sets, then ⋃_{n∈N} A_n is countable.
1.26. If A and B are sets, the set of all functions f : A → B is often denoted by B^A.
If S is a set, prove that card (2^S) = card (P(S)).
1.27. If ℵ0 ≤ card (S), then there is an injective function f : S → S that is not
surjective.
1.28. If card (S) = ℵ0, then there is a sequence of pairwise disjoint sets T_n, n ∈ N
such that card (T_n) = ℵ0 for every n ∈ N and ⋃_{n∈N} T_n = S.
CHAPTER 2
The Real Numbers
This chapter concerns what can be thought of as the rules of the game: the
axioms of the real numbers. These axioms imply all the properties of the real
numbers and, in a sense, any set satisfying them is uniquely determined to be
the real numbers.
The axioms are presented here as rules without very much justification. Other
approaches can be used. For example, a common approach is to begin with the
Peano axioms — the axioms of the natural numbers — and build up to the real
numbers through several “completions” of the natural numbers. It’s also possible
to begin with the axioms of set theory to build up the Peano axioms as theorems
and then use those to prove our axioms as further theorems. No matter how it’s
done, there are always some axioms at the base of the structure and the rules for
the real numbers are the same, whether they’re axioms or theorems.
We choose to start at the top because the other approaches quickly turn into
a long and tedious labyrinth of technical exercises without much connection to
analysis.
1. The Field Axioms
These first six axioms are called the field axioms because any object satisfying
them is called a field. They give the arithmetic properties of the real numbers.
A field is a nonempty set F along with two binary operations, multiplication
× : F × F → F and addition + : F × F → F satisfying the following axioms.1
AXIOM 1 (Associative Laws). If a, b, c ∈ F, then (a + b) + c = a + (b + c) and
(a × b) × c = a × (b × c).
AXIOM 2 (Commutative Laws). If a, b ∈ F, then a + b = b + a and a × b = b × a.
AXIOM 3 (Distributive Law). If a, b, c ∈ F, then a × (b + c) = (a × b) + (a × c).
AXIOM 4 (Existence of identities). There are 0, 1 ∈ F such that a + 0 = a and
a × 1 = a, for all a ∈ F.
1Given a set A, a function f : A × A → A is called a binary operation. In other words, a binary
operation is just a function with two arguments. The standard notations of +(a, b) = a + b and
×(a, b) = a × b are used here. The symbol × is unfortunately used for both the Cartesian product
and the field operation, but the context in which it’s used removes the ambiguity.
AXIOM 5 (Existence of an additive inverse). For each a ∈ F there is −a ∈ F
such that a + (−a) = 0.
AXIOM 6 (Existence of a multiplicative inverse). For each a ∈ F \ {0} there is
a^{-1} ∈ F such that a × a^{-1} = 1.
Although these axioms seem to contain most properties of the real numbers
we normally use, they don’t characterize the real numbers; they just give the rules
for arithmetic. There are many other fields besides the real numbers and studying
them is a large part of most abstract algebra courses.
EXAMPLE 2.1. From elementary algebra we know that the rational numbers
Q = {p/q : p ∈ Z ∧ q ∈ N}
form a field. It is shown in Theorem 2.14 that √2 ∉ Q, so Q doesn’t contain all the
real numbers.
EXAMPLE 2.2. Let F = {0, 1, 2} with addition and multiplication calculated
modulo 3. The addition and multiplication tables are as follows.

    + | 0 1 2        × | 0 1 2
    --+------        --+------
    0 | 0 1 2        0 | 0 0 0
    1 | 1 2 0        1 | 0 1 2
    2 | 2 0 1        2 | 0 2 1

It is easy to check that the field axioms are satisfied. This field is usually called Z3.
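Because F has only three elements, the claim that the field axioms hold can also be checked mechanically. The following Python sketch is not part of the original notes, and the name is_field_mod is made up for this illustration. It tests Axioms 1 through 6 by brute force for arithmetic modulo n, under the assumption that n ≥ 2 so that 0 and 1 are distinct elements of the set.

```python
from itertools import product

def is_field_mod(n):
    """Brute-force check of Axioms 1-6 on {0, ..., n-1} with arithmetic modulo n (n >= 2)."""
    F = range(n)
    add = lambda a, b: (a + b) % n
    mul = lambda a, b: (a * b) % n
    for a, b, c in product(F, repeat=3):
        # Axiom 1: associative laws
        if add(add(a, b), c) != add(a, add(b, c)):
            return False
        if mul(mul(a, b), c) != mul(a, mul(b, c)):
            return False
        # Axiom 2: commutative laws
        if add(a, b) != add(b, a) or mul(a, b) != mul(b, a):
            return False
        # Axiom 3: distributive law
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):
            return False
    for a in F:
        # Axiom 4: 0 and 1 act as identities
        if add(a, 0) != a or mul(a, 1) != a:
            return False
        # Axiom 5: some b satisfies a + b = 0
        if not any(add(a, b) == 0 for b in F):
            return False
        # Axiom 6: every nonzero a has some b with a * b = 1
        if a != 0 and not any(mul(a, b) == 1 for b in F):
            return False
    return True

print(is_field_mod(3))  # the field of Example 2.2
print(is_field_mod(4))  # arithmetic modulo 4, for comparison
```

The check succeeds for n = 3, in agreement with Example 2.2. It fails for n = 4 because 2 has no multiplicative inverse modulo 4, so Axiom 6 is violated.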
The following theorems, containing just a few useful properties of fields, are
presented mostly as examples showing how the axioms are used. More complete
developments can be found in any beginning abstract algebra text.
THEOREM 2.1. The additive and multiplicative identities of a field F are
unique.
PROOF. Suppose e_1 and e_2 are both multiplicative identities in F. Then
e_1 = e_1 × e_2 = e_2,
so the multiplicative identity is unique. The proof for the additive identity is
essentially the same.
THEOREM 2.2. Let F be a field. If a, b ∈ F with b ≠ 0, then −a and b^{-1} are
unique.
PROOF. Suppose b_1 and b_2 are both multiplicative inverses for b ≠ 0. Then,
using Axioms 4 and 1,
b_1 = b_1 × 1 = b_1 × (b × b_2) = (b_1 × b) × b_2 = 1 × b_2 = b_2.
This shows the multiplicative inverse is unique. The proof is essentially the same
for the additive inverse.
There are many other properties of fields which could be proved here, but
they correspond to the usual properties of the real numbers learned in elementary
school, so we omit them. Some of them are in the exercises at the end of this
chapter.
From now on, the standard notations for algebra will usually be used; e.g.,
we will allow ab instead of a × b, a − b for a + (−b) and a/b instead of a × b^{-1}. The
reader may also use the standard facts she learned from elementary algebra.
2. The Order Axiom
The axiom of this section gives the order and metric properties of the real
numbers. In a sense, the following axiom adds some geometry to a field.
AXIOM 7 (Order axiom). There is a set P ⊂ F such that
(a) If a, b ∈ P , then a + b, ab ∈ P .2
(b) If a ∈ F, then exactly one of the following is true: a ∈ P , −a ∈ P or
a = 0.
Any field F satisfying the axioms so far listed is naturally called an ordered field.
Of course, the set P is known as the set of positive elements of F. Using Axiom
7(b), we see F is divided into three pairwise disjoint sets: P , {0} and {−x : x ∈ P }.
The latter of these is, of course, the set of negative elements from F. The following
definition introduces familiar notation for order.
DEFINITION 2.3. We write a < b or b > a, if b − a ∈ P. The meanings of a ≤ b
and b ≥ a are now as expected.
Notice that a > 0 ⇐⇒ a = a − 0 ∈ P and a < 0 ⇐⇒ −a = 0 − a ∈ P , so a > 0
and a < 0 agree with our usual notions of positive and negative.
Our goal is to capture all the properties of the real numbers with the axioms.
The order axiom eliminates many fields from consideration. For example, Exercise 2.7 shows the field Z3 of Example 2.2 is not an ordered field. On the other
hand, facts from elementary algebra imply Q is an ordered field, so the first seven
axioms still don’t “capture” the real numbers.
Following are a few standard properties of ordered fields.
THEOREM 2.4. Let F be an ordered field and a ∈ F. Then a ≠ 0 iff a^2 > 0.
PROOF. (⇒) If a > 0, then a^2 > 0 by Axiom 7(a). If a < 0, then −a > 0 by Axiom
7(b) and a^2 = 1a^2 = (−1)(−1)a^2 = (−a)^2 > 0.
(⇐) Since 0^2 = 0, this is obvious.
THEOREM 2.5. If F is an ordered field and a, b, c ∈ F, then
(a) a < b ⇐⇒ a + c < b + c,
(b) a < b ∧ b < c =⇒ a < c,
(c) a < b ∧ c > 0 =⇒ ac < bc,
(d) a < b ∧ c < 0 =⇒ ac > bc.
PROOF. (a) a < b ⇐⇒ b − a ∈ P ⇐⇒ (b + c) − (a + c) ∈ P ⇐⇒ a + c < b + c.
(b) By supposition, both b − a, c − b ∈ P. Using the fact that P is closed under
addition, we see (b − a) + (c − b) = c − a ∈ P. Therefore, c > a.
(c) Since both b − a, c ∈ P and P is closed under multiplication, c(b − a) =
cb − ca ∈ P and, therefore, ac < bc.
(d) By assumption, b − a, −c ∈ P. Apply part (c) and Exercise 2.1.
2Algebra texts would say P is closed under addition and multiplication. In Chapter 5 we’ll
use the word “closed” with a different meaning. This is one of the cases where algebraists and
analysts speak different languages. Fortunately, the context usually erases confusion.
THEOREM 2.6 (Two Out of Three Rule). Let F be an ordered field and a, b, c ∈ F.
If ab = c and any two of a, b or c are positive, then so is the third.
PROOF. If a > 0 and b > 0, then Axiom 7(a) implies c > 0. Next, suppose a > 0
and c > 0. In order to force a contradiction, suppose b ≤ 0. In this case, Axiom
7(b) shows
0 ≤ a(−b) = −(ab) = −c < 0,
which is impossible. The case b > 0 and c > 0 follows in the same way with the
roles of a and b exchanged.
COROLLARY 2.7. 1 > 0.
PROOF. Exercise 2.2.
COROLLARY 2.8. Let F be an ordered field and a ∈ F. If a > 0, then a^{-1} > 0. If
a < 0, then a^{-1} < 0.
PROOF. The proof is Exercise 2.3.
An ordered field begins to look like what we expect for the real numbers. The
number line works pretty much as usual. Combining Corollary 2.7 and Axiom 7(a),
it follows that 2 = 1 + 1 > 1 > 0, 3 = 2 + 1 > 2 > 0 and so forth. By induction, it is
seen there is a copy of N embedded in F. Similarly, there are also copies of Z and
Q in F. This shows every ordered field is infinite. But, there might be holes in the
line. For example, if F = Q, numbers like √2, e and π are missing.
DEFINITION 2.9. If F is an ordered field and a < b in F, then
(a, b) = {x ∈ F : a < x < b}, (a, ∞) = {x ∈ F : a < x} and (−∞, a) = {x ∈ F : a > x}
are called open intervals. (The latter two are sometimes called open right and left
rays, respectively.)
The sets
[a, b] = {x ∈ F : a ≤ x ≤ b}, [a, ∞) = {x ∈ F : a ≤ x} and (−∞, a] = {x ∈ F : a ≥ x}
are called closed intervals. (As above, the latter two are sometimes called closed
rays.)
The sets [a, b) = {x ∈ F : a ≤ x < b} and (a, b] = {x ∈ F : a < x ≤ b} are half-open intervals.
The difference between the open and closed intervals is that open intervals
don’t contain their endpoints and closed intervals contain their endpoints. In
the case of a ray, the interval only has one endpoint. It is incorrect to write a ray
as (a, ∞] or [−∞, a] because neither ∞ nor −∞ is an element of F. The symbols
∞ and −∞ are just place holders telling us the intervals continue forever to the
right or left.
2.1. Metric Properties. The order axiom on a field F allows us to introduce
the idea of the distance between points in F. To do this, we begin with the
following familiar definition.
DEFINITION 2.10. Let F be an ordered field. The absolute value function on F
is a function | · | : F → F defined as
\[ |x| = \begin{cases} x, & x \ge 0 \\ -x, & x < 0 \end{cases}. \]
The most important properties of the absolute value function are contained
in the following theorem.
THEOREM 2.11. Let F be an ordered field and x, y ∈ F. Then
(a) |x| ≥ 0 and |x| = 0 ⇐⇒ x = 0;
(b) |x| = |−x|;
(c) −|x| ≤ x ≤ |x|;
(d) |x| ≤ y ⇐⇒ −y ≤ x ≤ y; and,
(e) |x + y| ≤ |x| + |y|.
PROOF. (a) The fact that |x| ≥ 0 for all x ∈ F follows from Axiom 7(b).
Since 0 = −0, the second part is clear.
(b) If x ≥ 0, then −x ≤ 0 so that |−x| = −(−x) = x = |x|. If x < 0, then −x > 0
and |x| = −x = |−x|.
(c) If x ≥ 0, then −|x| = −x ≤ x = |x|. If x < 0, then −|x| = −(−x) = x < −x =
|x|.
(d) This is left as Exercise 2.4.
(e) Add the two sets of inequalities −|x| ≤ x ≤ |x| and −|y| ≤ y ≤ |y| to see
−(|x| + |y|) ≤ x + y ≤ |x| + |y|. Now apply (d). This is usually called the
triangle inequality.
From studying analytic geometry and calculus, we are used to thinking of
|x − y| as the distance between the numbers x and y. This notion of a distance
between two points of a set can be generalized.
DEFINITION 2.12. Let S be a set and d : S × S → F satisfy
(a) for all x, y ∈ S, d (x, y) ≥ 0 and d (x, y) = 0 ⇐⇒ x = y,
(b) for all x, y ∈ S, d (x, y) = d (y, x), and
(c) for all x, y, z ∈ S, d (x, z) ≤ d (x, y) + d (y, z).
Then the function d is a metric on S. The pair (S, d ) is called a metric space.
A metric is a function which defines the distance between any two points of a
set.
EXAMPLE 2.3. Let S be a set and define d : S × S → R by
\[ d(x, y) = \begin{cases} 1, & x \ne y \\ 0, & x = y \end{cases}. \]
It can readily be verified that d is a metric on S. This simplest of all metrics is
called the discrete metric and it can be defined on any set. It’s not often useful.
THEOREM 2.13. If F is an ordered field, then d(x, y) = |x − y| is a metric on F.
PROOF. Use parts (a), (b) and (e) of Theorem 2.11.
The metric on F derived from the absolute value function is called the standard metric on F. There are other metrics sometimes defined for specialized
purposes, but we won’t have need of them.
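The three conditions of Definition 2.12 can be tested numerically on a finite sample of points. The Python sketch below is an illustration only and is not taken from the notes: the helper is_metric_on and the sample of rational points are assumptions made for the example. Passing such a check is evidence, not a proof, because only finitely many points are examined.

```python
from fractions import Fraction
from itertools import product

def discrete(x, y):
    """The discrete metric of Example 2.3."""
    return 0 if x == y else 1

def standard(x, y):
    """The standard metric |x - y| of Theorem 2.13."""
    return abs(x - y)

def is_metric_on(d, points):
    """Check conditions (a), (b), (c) of Definition 2.12 on a finite list of points."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:                      # (a) nonnegativity
            return False
        if (d(x, y) == 0) != (x == y):       # (a) d(x, y) = 0 exactly when x = y
            return False
        if d(x, y) != d(y, x):               # (b) symmetry
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z):      # (c) triangle inequality
            return False
    return True

# A small sample of rational points; exact arithmetic avoids rounding issues.
sample = [Fraction(n, 3) for n in range(-4, 5)]

print(is_metric_on(discrete, sample))
print(is_metric_on(standard, sample))
print(is_metric_on(lambda x, y: (x - y) ** 2, sample))
```

The last call uses the squared difference, which is not a metric: it satisfies (a) and (b) but violates the triangle inequality, so the check rejects it.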
3. The Completeness Axiom
All the axioms given so far are obvious from beginning algebra, and, on
the surface, it’s not obvious they haven’t captured all the properties of the real
numbers. Since Q satisfies them all, the following theorem shows we’re not yet
done.
THEOREM 2.14. There is no α ∈ Q such that α^2 = 2.
PROOF. Assume to the contrary that there is α ∈ Q with α^2 = 2. Then there
are p, q ∈ N such that α = p/q with p and q relatively prime. Now,
\[ \left(\frac{p}{q}\right)^2 = 2 \implies p^2 = 2q^2 \tag{2.1} \]
shows p^2 is even. Since the square of an odd number is odd, p must be even;
i.e., p = 2r for some r ∈ N. Substituting this into (2.1) shows 2r^2 = q^2. The same
argument as above establishes q is also even. This contradicts the assumption
that p and q are relatively prime. Therefore, no such α exists.
Since we suspect √2 is a perfectly fine number, there’s still something missing
from the list of axioms. Completeness is the missing idea.
The Completeness Axiom is somewhat more complicated than the previous
axioms, and several definitions are needed in order to state it.
3.1. Bounded Sets.
DEFINITION 2.15. A subset S of an ordered field F is bounded above, if there
exists M ∈ F such that M ≥ x for all x ∈ S. A subset S of an ordered field F is
bounded below, if there exists m ∈ F such that m ≤ x for all x ∈ S. The elements
M and m are called an upper bound and lower bound for S, respectively. If S is
bounded both above and below, it is a bounded set.
There is no requirement in the definition that the upper and lower bounds
for a set are elements of the set. They can be elements of the set, but typically are
not. For example, if S = (−∞, 0), then [0, ∞) is the set of all upper bounds for S,
but none of them is in S. On the other hand, if T = (−∞, 0], then [0, ∞) is again
the set of all upper bounds for T , but in this case 0 is an upper bound which is
also an element of T .
A set need not have upper or lower bounds. For example S = (−∞, 0) has no
lower bounds, while P = (0, ∞) has no upper bounds. The set of integers, Z, has neither
upper nor lower bounds. If a set has no upper bound, it is unbounded above and,
if it has no lower bound, then it is unbounded below. In either case, it is usually
just said to be unbounded.
If M is an upper bound for the set S, then every x ≥ M is also an upper bound
for S. Considering some simple examples should lead you to suspect that among
the upper bounds for a set, there is one that is best in the sense that everything
greater is an upper bound and everything less is not an upper bound. This is the
basic idea of completeness.
DEFINITION 2.16. Suppose F is an ordered field and S is bounded above in F.
A number B ∈ F is called a least upper bound of S if
(a) B is an upper bound for S, and
(b) if α is any upper bound for S, then B ≤ α.
If S is bounded below in F, then a number b ∈ F is called a greatest lower bound
of S if
(a) b is a lower bound for S, and
(b) if α is any lower bound for S, then b ≥ α.
THEOREM 2.17. If F is an ordered field and A ⊂ F is nonempty, then A has at
most one least upper bound and at most one greatest lower bound.
PROOF. Suppose u_1 and u_2 are both least upper bounds for A. Since u_1
and u_2 are both upper bounds for A, two applications of Definition 2.16 show
u_1 ≤ u_2 ≤ u_1 =⇒ u_1 = u_2. The proof of the other case is similar.
DEFINITION 2.18. If A ⊂ F is nonempty and bounded above, then the least
upper bound of A is written lub A. When A is not bounded above, we write
lub A = ∞. When A = ∅, then lub A = −∞.
If A ⊂ F is nonempty and bounded below, then the greatest lower bound of
A is written glb A. When A is not bounded below, we write glb A = −∞. When
A = ∅, then glb A = ∞.3
Notice the symbol “∞” is not an element of F. Writing lub A = ∞ is just a
convenient way to say A has no upper bounds. Similarly lub ∅ = −∞ tells us ∅
has every real number as an upper bound.
THEOREM 2.19. Let A ⊂ F and α ∈ F. α = lub A iff (α, ∞) ∩ A = ∅ and for all
ε > 0, (α − ε, α] ∩ A ≠ ∅. Similarly, α = glb A iff (−∞, α) ∩ A = ∅ and for all ε > 0,
[α, α + ε) ∩ A ≠ ∅.
3Some people prefer the notation sup A and inf A instead of lub A and glb A, respectively. They
stand for the supremum and infimum of A.
PROOF. We will prove the first statement, concerning the least upper bound.
The second statement, concerning the greatest lower bound, follows similarly.
(⇒) If x ∈ (α, ∞) ∩ A, then α cannot be an upper bound of A, which is a
contradiction. If there is an ε > 0 such that (α − ε, α] ∩ A = ∅, then from above, we
conclude
∅ = ((α − ε, α] ∩ A) ∪ ((α, ∞) ∩ A) = (α − ε, ∞) ∩ A.
So, α − ε/2 is an upper bound for A which is less than α = lub A. This contradiction
shows (α − ε, α] ∩ A ≠ ∅.
(⇐) The assumption that (α, ∞) ∩ A = ∅ implies α ≥ lub A. On the other hand,
suppose lub A < α. By assumption, applied with ε = α − lub A, there is an
x ∈ (lub A, α] ∩ A. This is clearly a
contradiction, since lub A < x ∈ A. Therefore, α = lub A.
An eagle-eyed reader may wonder why the intervals in Theorem 2.19 are
(α − ε, α] and [α, α + ε) instead of (α − ε, α) and (α, α + ε). Just consider the case
A = {α} to see that the theorem fails when the intervals are open. When lub A ∉ A
or glb A ∉ A, the intervals can be open, as shown in the following corollary.
COROLLARY 2.20. If A is bounded above and α = lub A ∉ A, then for all ε > 0,
(α − ε, α) ∩ A is an infinite set. Similarly, if A is bounded below and β = glb A ∉ A,
then for all ε > 0, (β, β + ε) ∩ A is an infinite set.
PROOF. Let ε > 0. According to Theorem 2.19, there is an x_1 ∈ (α − ε, α] ∩ A.
By assumption, x_1 < α. We continue by induction. Suppose n ∈ N and x_n has
been chosen to satisfy x_n ∈ (α − ε, α) ∩ A. Use Theorem 2.19 as before to choose
x_{n+1} ∈ (x_n, α) ∩ A. The set {x_n : n ∈ N} is infinite and contained in (α − ε, α) ∩ A.
When F = Q, Theorem 2.14 shows there is no least upper bound for A = {x :
x^2 < 2} in Q. In a sense, Q has a hole where this least upper bound should be.
Adding the following completeness axiom enlarges Q to fill in the holes.
AXIOM 8 (Completeness). Every nonempty set which is bounded above has a
least upper bound.
This is the final axiom. Any field F satisfying all eight axioms is called a
complete ordered field. We assume the existence of a complete ordered field, R,
called the real numbers.
In naive set theory it can be shown that if F1 and F2 are both complete ordered
fields, then they are the same, in the following sense. There exists a unique
bijective function i : F1 → F2 such that i (a + b) = i (a) + i (b), i (ab) = i (a)i (b) and
a < b ⇐⇒ i (a) < i (b). Such a function i is called an order isomorphism. The
existence of such an order isomorphism shows that R is essentially unique. More
reading on this topic can be done in some advanced texts [12, 13].
Every statement about upper bounds has a dual statement about lower
bounds. A proof of the following dual to Axiom 8 is left as an exercise.
COROLLARY 2.21. Every nonempty subset of R which is bounded below has a
greatest lower bound.
In Section 4 it will be shown that there is an x ∈ R satisfying x^2 = 2. This will
show R removes the deficiency of Q highlighted by Theorem 2.14. The Completeness Axiom plugs up the holes in Q.
3.2. Some Consequences of Completeness. The property of completeness
is what separates analysis from geometry and algebra. Analysis requires the
use of approximation, infinity and more dynamic visualizations than algebra or
classical geometry. The rest of this course is largely concerned with applications
of completeness.
THEOREM 2.22 (Archimedean Principle). If a ∈ R, then there exists n_a ∈ N
such that n_a > a.
PROOF. If the theorem is false, then a is an upper bound for N. Let β = lub N.
According to Theorem 2.19 there is an m ∈ N such that m > β − 1. But, this is a
contradiction because β = lub N < m + 1 ∈ N.
Some other variations on this theme are in the following corollaries.
COROLLARY 2.23. Let a, b ∈ R with a > 0.
(a) There is an n ∈ N such that an > b.
(b) There is an n ∈ N such that 0 < 1/n < a.
(c) There is an n ∈ N such that n − 1 ≤ a < n.
PROOF. (a) Use Theorem 2.22 to find n ∈ N where n > b/a.
(b) Let b = 1 in part (a).
(c) Theorem 2.22 guarantees that S = {n ∈ N : n > a} ≠ ∅. If n is the least
element of this set, then n − 1 ∉ S and n − 1 ≤ a < n.
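For concrete numbers, the integers promised by Theorem 2.22 and Corollary 2.23 can be written down explicitly. The following Python sketch is an illustration rather than part of the notes; the function names are invented here, and exact rational inputs (Fraction) are assumed so that no rounding occurs.

```python
from fractions import Fraction
import math

def n_exceeding(a):
    """Return a natural number n with n > a (Theorem 2.22)."""
    # floor(a) + 1 is the least integer greater than a; max(1, ...) keeps n in N.
    return max(1, math.floor(a) + 1)

def corollary_2_23(a, b):
    """For a > 0, return (n1, n2, n3) witnessing parts (a), (b), (c) of Corollary 2.23."""
    n1 = n_exceeding(b / a)   # part (a): a * n1 > b
    n2 = n_exceeding(1 / a)   # part (b): 0 < 1/n2 < a
    n3 = n_exceeding(a)       # part (c): n3 - 1 <= a < n3
    return n1, n2, n3

a, b = Fraction(3, 7), Fraction(22, 5)
n1, n2, n3 = corollary_2_23(a, b)
print(n1, n2, n3)
print(a * n1 > b, 0 < Fraction(1, n2) < a, n3 - 1 <= a < n3)
```

The proof of part (c) picks the least element of {n ∈ N : n > a}. For a > 0 that least element is floor(a) + 1, which is what n_exceeding computes.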
COROLLARY 2.24. If I is any interval from R, then I ∩ Q ≠ ∅ and I ∩ Q^c ≠ ∅.
PROOF. See Exercises 2.15 and 2.17.
A subset of R which intersects every interval is said to be dense in R. Corollary
2.24 shows both the rational and irrational numbers are dense.
4. Comparisons of Q and R
All of the above still does not establish that Q is different from R. In Theorem 2.14, it was shown that the equation x^2 = 2 has no solution in Q. The
following theorem shows x^2 = 2 does have solutions in R. Since a copy of Q is
embedded in R, it follows, in a sense, that R is bigger than Q.
THEOREM 2.25. There is a positive α ∈ R such that α^2 = 2.
PROOF. Let S = {x > 0 : x^2 < 2}. Then 1 ∈ S, so S ≠ ∅. If x ≥ 2, then Theorem 2.5(c) implies x^2 ≥ 4 > 2, so S is bounded above. Let α = lub S. It will be
shown that α^2 = 2.
Suppose first that α^2 < 2. This assumption implies (2 − α^2)/(2α + 1) > 0.
According to Corollary 2.23, there is an n ∈ N large enough so that
\[ 0 < \frac{1}{n} < \frac{2 - \alpha^2}{2\alpha + 1} \implies 0 < \frac{2\alpha + 1}{n} < 2 - \alpha^2. \]
Therefore,
\[ \left(\alpha + \frac{1}{n}\right)^2 = \alpha^2 + \frac{2\alpha}{n} + \frac{1}{n^2} = \alpha^2 + \frac{1}{n}\left(2\alpha + \frac{1}{n}\right) \le \alpha^2 + \frac{2\alpha + 1}{n} < \alpha^2 + (2 - \alpha^2) = 2, \]
so α + 1/n ∈ S, which contradicts the fact that α = lub S. Therefore, α^2 ≥ 2.
Next, assume α^2 > 2. In this case, choose n ∈ N so that
\[ 0 < \frac{1}{n} < \frac{\alpha^2 - 2}{2\alpha} \implies 0 < \frac{2\alpha}{n} < \alpha^2 - 2. \]
Then
\[ \left(\alpha - \frac{1}{n}\right)^2 = \alpha^2 - \frac{2\alpha}{n} + \frac{1}{n^2} > \alpha^2 - \frac{2\alpha}{n} > \alpha^2 - (\alpha^2 - 2) = 2, \]
so α − 1/n is an upper bound for S smaller than α, which again contradicts that α = lub S.
Therefore, α^2 = 2.

α_1 = .α_1(1) α_1(2) α_1(3) α_1(4) α_1(5) ...
α_2 = .α_2(1) α_2(2) α_2(3) α_2(4) α_2(5) ...
α_3 = .α_3(1) α_3(2) α_3(3) α_3(4) α_3(5) ...
α_4 = .α_4(1) α_4(2) α_4(3) α_4(4) α_4(5) ...
α_5 = .α_5(1) α_5(2) α_5(3) α_5(4) α_5(5) ...
 ⋮        ⋮      ⋮      ⋮      ⋮      ⋮
FIGURE 2.1. The proof of Theorem 2.27 is called the “diagonal argument” because it constructs a new number z by working down the
main diagonal of the array shown above, making sure z(n) ≠ α_n(n) for
each n ∈ N.
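The set S from the proof of Theorem 2.25 can be explored with exact rational arithmetic. The Python sketch below is an illustration and not the method of the proof: it uses bisection, an assumption made here for simplicity, to produce rationals lo ∈ S and an upper bound hi for S that are as close together as desired. The name bracket_sqrt2 is hypothetical.

```python
from fractions import Fraction

def bracket_sqrt2(eps):
    """Return rationals (lo, hi) with lo**2 < 2 < hi**2 and hi - lo < eps."""
    lo, hi = Fraction(1), Fraction(2)   # 1 is in S and 2 is an upper bound for S
    while hi - lo >= eps:
        mid = (lo + hi) / 2
        if mid * mid < 2:
            lo = mid    # mid belongs to S
        else:
            hi = mid    # mid is an upper bound for S; mid*mid = 2 is impossible by Theorem 2.14
    return lo, hi

lo, hi = bracket_sqrt2(Fraction(1, 1000))
print(lo, hi)
print(float(lo), float(hi))
```

Every rational upper bound found this way can be improved by running the loop longer, which is another way of seeing that S has no least upper bound in Q. In R, the brackets close in on α = lub S.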
Theorem 2.14 leads to the obvious question of how much bigger R is than
Q. First, note that since N ⊂ Q, it is clear that card (Q) ≥ ℵ0 . On the other hand,
every q ∈ Q has a unique reduced fractional representation q = m(q)/n(q) with
m(q) ∈ Z and n(q) ∈ N. This gives an injective function f : Q → Z × N defined by
f (q) = (m(q), n(q)), and we conclude card (Q) ≤ card (Z × N) = ℵ0 . The following
theorem ensues.
THEOREM 2.26. card (Q) = ℵ0.
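The injection f : Q → Z × N used above is easy to experiment with, because Python's Fraction type stores every rational in lowest terms with a positive denominator. The sketch below is only an illustration; the function name f mirrors the text, and the sample of rationals is made up.

```python
from fractions import Fraction

def f(q):
    """Map a rational q to the pair (m(q), n(q)) of its reduced representation."""
    q = Fraction(q)
    return (q.numerator, q.denominator)

# Different ways of writing the same rational give the same pair ...
print(f(Fraction(6, -8)), f(Fraction(-3, 4)))

# ... and distinct rationals give distinct pairs on this sample.
sample = {Fraction(m, n) for m in range(-5, 6) for n in range(1, 6)}
print(len(sample) == len({f(q) for q in sample}))
```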
In 1874, Georg Cantor first showed that R is not countable. The following
proof is his famous diagonal argument from 1891.
THEOREM 2.27. card (R) > ℵ0.
PROOF. It suffices to prove that card ([0, 1]) > ℵ0. If this is not true, then there
is a bijection α : N → [0, 1]; i.e.,
\[ [0, 1] = \{\alpha_n : n \in \mathbb{N}\}. \tag{2.2} \]
Each x ∈ [0, 1] can be written in the decimal form x = Σ_{n=1}^∞ x(n)/10^n where
x(n) ∈ {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} for each n ∈ N. This decimal representation is not
necessarily unique. For example,
\[ \frac{1}{2} = \frac{5}{10} = \frac{4}{10} + \sum_{n=2}^{\infty} \frac{9}{10^n}. \]
In such a case, there is a choice of x(n) so it is constantly 9 or constantly 0 from
some N onward. When given a choice, we will always opt to end the number with
a string of nines. With this convention, the decimal representation of x is unique.
Define z ∈ [0, 1] by choosing z(n) ∈ {d ∈ ω : d ≤ 8} such that z(n) ≠ α_n(n).
Let z = Σ_{n=1}^∞ z(n)/10^n. Since z ∈ [0, 1], there is an n ∈ N such that z = α_n. But,
this is impossible because z(n) differs from α_n in the nth decimal place. This
contradiction shows card ([0, 1]) > ℵ0.
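The choice of the digits z(n) in the diagonal argument can be made completely explicit. In the Python sketch below, a finite table of digits stands in for the infinite list α_1, α_2, ..., so it only illustrates the construction and proves nothing by itself. The rule (d + 1) mod 9 is one possible choice, assumed here; any rule producing a digit at most 8 that differs from α_n(n) would do.

```python
def diagonal_digits(rows):
    """rows[n][k] is the (k+1)-st decimal digit of the (n+1)-st listed number.

    Returns digits z with z[n] != rows[n][n] and every z[n] <= 8.
    """
    # For d in 0..9, (d + 1) % 9 lies in 0..8 and is never equal to d.
    return [(row[n] + 1) % 9 for n, row in enumerate(rows)]

rows = [
    [1, 4, 1, 5, 9],
    [9, 9, 9, 9, 9],
    [0, 0, 0, 0, 0],
    [5, 0, 8, 0, 0],
    [2, 7, 1, 8, 8],
]
z = diagonal_digits(rows)
print(z)
print(all(z[n] != rows[n][n] for n in range(len(rows))))
```

The resulting digit string disagrees with the nth row in the nth place, so it cannot appear anywhere in the table.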
Around the turn of the twentieth century these then-new ideas about infinite
sets were very controversial in mathematics. This is because some of these ideas
are very unintuitive. For example, the rational numbers are a countable set
and the irrational numbers are uncountable, yet between every two rational
numbers are an uncountable number of irrational numbers and between every
two irrational numbers there are a countably infinite number of rational numbers.
It would seem there are either too few or too many gaps in the sets to make this
possible. Such a seemingly paradoxical situation flies in the face of our intuition,
which was developed with finite sets in mind.
This brings us back to the discussion of cardinalities and the Continuum
Hypothesis at the end of Section 1.5. Most of the time, people working in
real analysis assume the Continuum Hypothesis is true. With this assumption
and Theorem 2.27 it follows that whenever A ⊂ R, then either card (A) ≤ ℵ0 or
card (A) = card (R) = card (P (N)).4 Since P (N) has many more elements than N,
any countable subset of R is considered to be a small set, in the sense of cardinality, even if it is infinite. This works against the intuition of many beginning
students who are not used to thinking of Q, or any other infinite set as being
small. But it turns out to be quite useful because the fact that the union of a
countably infinite number of countable sets is still countable can be exploited in
many ways.5
In later chapters, other useful small versus large dichotomies will be found.
5. Exercises
2.1. Prove that if a, b ∈ F, where F is a field, then (−a)b = −(ab) = a(−b).
2.2. Prove 1 > 0.
4Since ℵ0 is the smallest infinite cardinal, ℵ1 is used to denote the smallest uncountable
cardinal. You will also see card (R) = c, where c is the old-style (Fraktur) German letter c, standing
for the “cardinality of the continuum.” Assuming the continuum hypothesis, it follows that ℵ0 <
ℵ1 = c.
5See Problem 1.25 on page 114.
2.3. Prove Corollary 2.8: If a > 0, then so is a^{-1}. If a < 0, then so is a^{-1}.
2.4. Prove |x| ≤ y iff −y ≤ x ≤ y.
2.5. Let F be an ordered field and a, b, c ∈ F. If ab = c and two of a, b and c are
negative, then the third is positive.
2.6. If S ⊂ R is bounded above, then
lub S = glb {x : x is an upper bound for S}.
2.7. Prove there is no set P ⊂ Z3 which makes Z3 into an ordered field.
2.8. If α is an upper bound for S and α ∈ S, then α = lub S.
2.9. Let A and B be subsets of R that are bounded above. Define A + B = {a + b :
a ∈ A ∧ b ∈ B }. Prove that lub (A + B ) = lub A + lub B .
2.10. If A ⊂ Z is bounded below, then A has a least element.
2.11. If F is an ordered field and a ∈ F such that 0 ≤ a < ε for every ε > 0, then
a = 0.
2.12. Let x ∈ R. Prove |x| < ε for all ε > 0 iff x = 0.
2.13. If p is a prime number, then the equation x^2 = p has no rational solutions.
2.14. If p is a prime number and ε > 0, then there are x, y ∈ Q such that x^2 < p <
y^2 < x^2 + ε.
2.15. If a < b, then (a, b) ∩ Q ≠ ∅.
2.16. If q ∈ Q and a ∈ R \ Q, then q + a ∈ R \ Q. Moreover, if q ≠ 0, then aq ∈ R \ Q.
2.17. Prove that if a < b, then there is a q ∈ Q such that a < √2 q < b.
2.18. Prove Corollary 2.24.
2.19. If F is an ordered field and x_1, x_2, . . . , x_n ∈ F for some n ∈ N, then
\[ \left| \sum_{i=1}^{n} x_i \right| \le \sum_{i=1}^{n} |x_i|. \tag{2.5} \]
2.20. Let F be an ordered field. (a) Prove F has no upper or lower bounds.
(b) Every element of F is both an upper and lower bound for ∅.
2.21. Prove Corollary 2.21.
2.22. Prove card (Q^c) = c.
2.23. If A ⊂ R and B = {x : x is an upper bound for A}, then lub (A) = glb (B ).
CHAPTER 3
Sequences
We begin our study of analysis with sequences. There are several reasons
for starting here. First, sequences are the simplest way to introduce limits, the
central idea of calculus. Second, sequences are a direct route to the topology of
the real numbers. The combination of limits and topology provides the tools to
finally prove the theorems you’ve already used in your calculus course.
1. Basic Properties
DEFINITION 3.1. A sequence is a function a : N → R.
Instead of using the standard function notation of a(n) for sequences, it is
usually more convenient to write the argument of the function as a subscript, a_n.
EXAMPLE 3.1. Let the sequence a_n = 1 − 1/n. The first three elements are
a_1 = 0, a_2 = 1/2, a_3 = 2/3, etc.
EXAMPLE 3.2. Let the sequence b_n = 2^n. Then b_1 = 2, b_2 = 4, b_3 = 8, etc.
EXAMPLE 3.3. Let the sequence c_n = 100 − 5n so c_1 = 95, c_2 = 90, c_3 = 85, etc.
EXAMPLE 3.4. If a and r are constants, then a sequence given by c_1 = a,
c_2 = ar, c_3 = ar^2 and in general c_n = ar^{n−1} is called a geometric sequence. The
number r is called the ratio of the sequence. Staying away from the trivial cases
where a = 0 or r = 0, a geometric sequence can always be recognized by noticing
that c_{n+1}/c_n = r for all n ∈ N. Example 3.2 is a geometric sequence with a = r = 2.
EXAMPLE 3.5. If a and d are constants, then a sequence of the form d_n =
a + (n − 1)d is called an arithmetic sequence. Another way of looking at this is
that d_n is an arithmetic sequence if d_{n+1} − d_n = d for all n ∈ N. Example 3.3 is an
arithmetic sequence with a = 95 and d = −5.
EXAMPLE 3.6. Some sequences are not defined by an explicit formula, but are
defined recursively. This is an inductive method of definition in which successive
terms of the sequence are defined by using other terms of the sequence. The most
famous of these is the Fibonacci sequence. To define the Fibonacci sequence,
f_n, let f_1 = 0, f_2 = 1 and for n > 2, let f_n = f_{n−2} + f_{n−1}. The first few terms are
0, 1, 1, 2, 3, 5, 8, . . . . There actually is a simple formula that directly gives f_n, and its
derivation is Exercise 3.6.
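The sequences of Examples 3.4 through 3.6 are all generated by simple recurrences, which makes them easy to tabulate. The Python sketch below is an illustration added here, with generator names chosen for the example; each generator yields the terms in order, starting with the first term.

```python
from itertools import islice

def geometric(a, r):
    """Yield c_1 = a, c_2 = a*r, c_3 = a*r**2, ... (Example 3.4)."""
    c = a
    while True:
        yield c
        c *= r

def arithmetic(a, d):
    """Yield d_1 = a, d_2 = a + d, d_3 = a + 2*d, ... (Example 3.5)."""
    t = a
    while True:
        yield t
        t += d

def fibonacci():
    """Yield f_1 = 0, f_2 = 1 and then f_n = f_(n-2) + f_(n-1) (Example 3.6)."""
    f_prev, f_cur = 0, 1
    while True:
        yield f_prev
        f_prev, f_cur = f_cur, f_prev + f_cur

def first(n, seq):
    """Return the first n terms of a sequence generator as a list."""
    return list(islice(seq, n))

print(first(3, geometric(2, 2)))     # Example 3.2
print(first(3, arithmetic(95, -5)))  # Example 3.3
print(first(7, fibonacci()))
```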
EXAMPLE 3.7. These simple definitions can lead to complex problems. One
famous case is a hailstone sequence. Let h_1 be any natural number. For n > 1,
recursively define
\[ h_n = \begin{cases} 3h_{n-1} + 1, & \text{if } h_{n-1} \text{ is odd} \\ h_{n-1}/2, & \text{if } h_{n-1} \text{ is even} \end{cases}. \]
Lothar Collatz conjectured in 1937 that any hailstone sequence eventually settles
down to repeating the pattern 1, 4, 2, 1, 4, 2, · · · . Many people have tried to prove
this and all have failed.
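A hailstone sequence is easy to compute for any particular starting value, even though nobody knows whether it always reaches 1. The Python sketch below is an illustration; the function name and the max_terms safeguard are assumptions made here, the latter because termination is exactly what the conjecture leaves open.

```python
def hailstone(h1, max_terms=10_000):
    """Return h_1, h_2, ... up to the first term equal to 1, or max_terms terms if 1 is not reached."""
    terms = [h1]
    while terms[-1] != 1 and len(terms) < max_terms:
        h = terms[-1]
        # Odd terms are followed by 3h + 1, even terms by h / 2.
        terms.append(3 * h + 1 if h % 2 else h // 2)
    return terms

print(hailstone(6))
print(len(hailstone(7)))
```

Starting from 6 the terms are 6, 3, 10, 5, 16, 8, 4, 2, 1, after which the pattern 4, 2, 1 would repeat forever.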
It’s often inconvenient for the domain of a sequence to be N, as required by
Definition 3.1. For example, the sequence beginning 1, 2, 4, 8, . . . can be written
20 , 21 , 22 , 23 , . . . . Written this way, it’s natural to let the sequence function be 2n
with domain ω. As long as there is a simple substitution to write the sequence
function in the form of Definition 3.1, there’s no reason to adhere to the letter
of the law. In general, the domain of a sequence can be any set of the form
{n ∈ Z : n ≥ N } for some N ∈ Z.
Definition 3.2. A sequence $a_n$ is bounded if $\{a_n : n \in \mathbb{N}\}$ is a bounded set.
This definition is extended in the obvious way to bounded above and bounded
below.
The sequence of Example 3.1 is bounded, but the sequence of Example 3.2 is
not, although it is bounded below.
Definition 3.3. A sequence $a_n$ converges to L ∈ R if for all ε > 0 there exists
an N ∈ N such that whenever n ≥ N, then $|a_n - L| < \varepsilon$. If a sequence does not
converge, then it is said to diverge.
When $a_n$ converges to L, we write $\lim_{n\to\infty} a_n = L$, or often, more simply,
$a_n \to L$.
Example 3.8. Let $a_n = 1 - 1/n$ be as in Example 3.1. We claim $a_n \to 1$. To see
this, let ε > 0 and choose N ∈ N such that 1/N < ε. Then, if n ≥ N,
\[
|a_n - 1| = |(1 - 1/n) - 1| = 1/n \le 1/N < \varepsilon,
\]
so $a_n \to 1$.
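The choice of N in this argument can be made concrete. The following Python sketch finds the smallest N with 1/N < ε for the sequence 1 − 1/n and spot-checks the defining inequality on a finite range of indices; the names `first_index` and `a` are assumptions of this example, and a finite check only illustrates the proof rather than replacing it.

```python
def first_index(eps):
    """Smallest natural number N with 1/N < eps, for eps > 0."""
    n = 1
    while 1 / n >= eps:
        n += 1
    return n


def a(n):
    """The sequence 1 - 1/n from the example."""
    return 1 - 1 / n


eps = 0.01
N = first_index(eps)
# Spot-check |a_n - 1| < eps on a finite range of indices n >= N.
assert all(abs(a(n) - 1) < eps for n in range(N, N + 10_000))
print(N)  # 101
```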
Example 3.9. The sequence $b_n = 2^n$ of Example 3.2 diverges. To see this,
suppose not. Then there is an L ∈ R such that $b_n \to L$. If ε = 1, there must be an
N ∈ N such that $|b_n - L| < \varepsilon$ whenever n ≥ N. Choose n ≥ N. Then $|L - 2^n| < 1$ implies
$L < 2^n + 1$. But, then
\[
|b_{n+1} - L| \ge b_{n+1} - L = 2^{n+1} - L > 2^{n+1} - (2^n + 1) = 2^n - 1 \ge 1 = \varepsilon.
\]
This violates the condition on N. We conclude that for every L ∈ R there exists
an ε > 0 such that for no N ∈ N is it true that whenever n ≥ N, then $|b_n - L| < \varepsilon$.
Therefore, $b_n$ diverges.
Definition 3.4. A sequence $a_n$ diverges to ∞ if for every B > 0 there is an
N ∈ N such that n ≥ N implies $a_n > B$. The sequence $a_n$ is said to diverge to −∞
if $-a_n$ diverges to ∞.
When $a_n$ diverges to ∞, we write $\lim_{n\to\infty} a_n = \infty$, or often, more simply,
$a_n \to \infty$.
A common mistake is to forget that $a_n \to \infty$ actually means the sequence
diverges in a particular way. Don’t be fooled by the suggestive notation into
treating ∞ as a number!
Example 3.10. It is easy to prove that the sequence $a_n = 2^n$ of Example 3.2
diverges to ∞.
Theorem 3.5. If $a_n \to L$, then L is unique.
Proof. Suppose $a_n \to L_1$ and $a_n \to L_2$. Let ε > 0. According to Definition 3.3,
there exist $N_1, N_2$ ∈ N such that $n \ge N_1$ implies $|a_n - L_1| < \varepsilon/2$ and $n \ge N_2$ implies
$|a_n - L_2| < \varepsilon/2$. Set $N = \max\{N_1, N_2\}$. If n ≥ N, then
\[
|L_1 - L_2| = |L_1 - a_n + a_n - L_2| \le |L_1 - a_n| + |a_n - L_2| < \varepsilon/2 + \varepsilon/2 = \varepsilon.
\]
Since ε is an arbitrary positive number, an application of Exercise 2.12 shows
$L_1 = L_2$.
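Theorem 3.6. $a_n \to L$ iff for all ε > 0, the set $\{n : a_n \notin (L - \varepsilon, L + \varepsilon)\}$ is finite.
Proof. (⇒) Let ε > 0. According to Definition 3.3, there is an N ∈ N such that
$\{a_n : n \ge N\} \subset (L - \varepsilon, L + \varepsilon)$. Then $\{n : a_n \notin (L - \varepsilon, L + \varepsilon)\} \subset \{1, 2, \dots, N - 1\}$, which is
finite.
(⇐) Let ε > 0. By assumption $\{n : a_n \notin (L - \varepsilon, L + \varepsilon)\}$ is finite, so let
$N = \max\{n : a_n \notin (L - \varepsilon, L + \varepsilon)\} + 1$ (or N = 1 if this set is empty). If n ≥ N, then $a_n \in (L - \varepsilon, L + \varepsilon)$. By Definition 3.3, $a_n \to L$.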
Corollary 3.7. If $a_n$ converges, then $a_n$ is bounded.
Proof. Suppose $a_n \to L$. According to Theorem 3.6 there are a finite number
of terms of the sequence lying outside (L − 1, L + 1). Since any finite set is bounded,
the conclusion follows.
The converse of this corollary is not true. For example, $a_n = (-1)^n$ is bounded,
but does not converge. The main use of Corollary 3.7 is as a quick first check to
see whether a sequence might converge. It’s usually pretty easy to determine
whether a sequence is bounded. If it isn’t, it must diverge.
The following theorem lets us analyze some complicated sequences by breaking them down into combinations of simpler sequences.
Theorem 3.8. Let $a_n$ and $b_n$ be sequences such that $a_n \to A$ and $b_n \to B$.
Then
(a) $a_n + b_n \to A + B$,
(b) $a_n b_n \to AB$, and
(c) $a_n / b_n \to A/B$ as long as $b_n \ne 0$ for all n ∈ N and $B \ne 0$.
Proof.
(a) Let ε > 0. There are $N_1, N_2$ ∈ N such that $n \ge N_1$ implies
$|a_n - A| < \varepsilon/2$ and $n \ge N_2$ implies $|b_n - B| < \varepsilon/2$. Define $N = \max\{N_1, N_2\}$.
If n ≥ N, then
\[
|(a_n + b_n) - (A + B)| \le |a_n - A| + |b_n - B| < \varepsilon/2 + \varepsilon/2 = \varepsilon.
\]
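Therefore $a_n + b_n \to A + B$.
(b) Let ε > 0 and let α > 0 be an upper bound for $|a_n|$, which exists by Corollary 3.7. Choose $N_1, N_2$ ∈ N such
that $n \ge N_1 \implies |a_n - A| < \varepsilon/(2(|B| + 1))$ and $n \ge N_2 \implies |b_n - B| < \varepsilon/(2\alpha)$.
If $n \ge N = \max\{N_1, N_2\}$, then
\begin{align*}
|a_n b_n - AB| &= |a_n b_n - a_n B + a_n B - AB| \\
&\le |a_n b_n - a_n B| + |a_n B - AB| \\
&= |a_n|\,|b_n - B| + |B|\,|a_n - A| \\
&< \alpha \frac{\varepsilon}{2\alpha} + |B| \frac{\varepsilon}{2(|B| + 1)} \\
&< \varepsilon/2 + \varepsilon/2 = \varepsilon.
\end{align*}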
(c) First, notice that it suffices to show that $1/b_n \to 1/B$, because part (b) of
this theorem can be used to achieve the full result.
Let ε > 0. Choose N ∈ N so that the following two conditions are
satisfied: $n \ge N \implies |b_n| > |B|/2$ and $|b_n - B| < B^2 \varepsilon/2$. Then, when
n ≥ N,
\[
\left| \frac{1}{b_n} - \frac{1}{B} \right| = \left| \frac{B - b_n}{b_n B} \right| < \frac{B^2 \varepsilon/2}{(|B|/2)\,|B|} = \varepsilon.
\]
Therefore $1/b_n \to 1/B$.
If you’re not careful, you can easily read too much into the previous theorem
and try to use its converse. Consider the sequences $a_n = (-1)^n$ and $b_n = -a_n$.
Their sum, $a_n + b_n = 0$, product $a_n b_n = -1$ and quotient $a_n / b_n = -1$ all converge,
but the original sequences diverge.
It is often easier to prove that a sequence converges by comparing it with a
known sequence than it is to analyze it directly. For example, a sequence such as
$a_n = \sin^2 n / n^3$ can easily be seen to converge to 0 because it is dominated by $1/n^3$.
The following theorem makes this idea more precise. It’s called the Sandwich
Theorem here, but is also called the Squeeze, Pinching, Pliers or Comparison
Theorem in different texts.
Theorem 3.9 (Sandwich Theorem). Suppose $a_n$, $b_n$ and $c_n$ are sequences
such that $a_n \le b_n \le c_n$ for all n ∈ N.
(a) If $a_n \to L$ and $c_n \to L$, then $b_n \to L$.
(b) If $b_n \to \infty$, then $c_n \to \infty$.
(c) If $b_n \to -\infty$, then $a_n \to -\infty$.
Proof.
(a) Let ε > 0. There is an N ∈ N large enough so that when
n ≥ N, then $L - \varepsilon < a_n$ and $c_n < L + \varepsilon$. These inequalities imply
$L - \varepsilon < a_n \le b_n \le c_n < L + \varepsilon$. Theorem 3.6 shows $b_n \to L$.
(b) Let B > 0 and choose N ∈ N so that $n \ge N \implies b_n > B$. Then $c_n \ge b_n > B$
whenever n ≥ N. This shows $c_n \to \infty$.
(c) This is essentially the same as part (b).
2. Monotone Sequences
One of the problems with using the definition of convergence to prove a
given sequence converges is the limit of the sequence must be known in order
to verify the sequence converges. This gives rise in the best cases to a “chicken
and egg” problem of somehow determining the limit before you even know the
sequence converges. In the worst case, there is no nice representation of the
limit to use, so you don’t even have a “target” to shoot at. The next few sections
are ultimately concerned with removing this deficiency from Definition 3.3, but
some interesting side issues are explored along the way.
Not surprisingly, we begin with the simplest case.
Definition 3.10. A sequence $a_n$ is increasing if $a_{n+1} \ge a_n$ for all n ∈ N. It is
strictly increasing if $a_{n+1} > a_n$ for all n ∈ N.
A sequence $a_n$ is decreasing if $a_{n+1} \le a_n$ for all n ∈ N. It is strictly decreasing
if $a_{n+1} < a_n$ for all n ∈ N.
If $a_n$ is any of the four types listed above, then it is said to be a monotone
sequence.
Notice the ≤ and ≥ in the definitions of increasing and decreasing sequences,
respectively. Many calculus texts use strict inequalities because they seem to
better match the intuitive idea of what an increasing or decreasing sequence
should do. For us, the nonstrict inequalities are more convenient.
Theorem 3.11. A bounded monotone sequence converges.
Proof. Suppose $a_n$ is a bounded increasing sequence, $L = \operatorname{lub}\{a_n : n \in \mathbb{N}\}$
and ε > 0. Clearly, $a_n \le L$ for all n ∈ N. According to Theorem 2.19, there exists
an N ∈ N such that $a_N > L - \varepsilon$. Because the sequence is increasing, $L \ge a_n \ge a_N > L - \varepsilon$
for all n ≥ N. This shows $a_n \to L$.
If $a_n$ is decreasing, let $b_n = -a_n$ and apply the preceding argument.
The key idea of this proof is the existence of the least upper bound of the sequence when the range of the sequence is viewed as a set of numbers. This means
the Completeness Axiom implies Theorem 3.11. In fact, it isn’t hard to prove Theorem 3.11 also implies the Completeness Axiom, showing they are equivalent
statements. Because of this, Theorem 3.11 is often used as the Completeness
Axiom on R instead of the least upper bound property we used in Axiom 8.
Example 3.11. The sequence $e_n = (1 + 1/n)^n$ converges.
Looking at the first few terms of this sequence, $e_1 = 2$, $e_2 = 2.25$, $e_3 \approx 2.37$,
$e_4 \approx 2.44$, it seems to be increasing. To show this is indeed the case, fix n ∈ N and
use the binomial theorem to expand the product as
\[
e_n = \sum_{k=0}^{n} \binom{n}{k} \frac{1}{n^k} \tag{3.1}
\]
and
\[
e_{n+1} = \sum_{k=0}^{n+1} \binom{n+1}{k} \frac{1}{(n+1)^k}. \tag{3.2}
\]
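For 1 ≤ k ≤ n, the kth term of (3.1) is
\begin{align*}
\binom{n}{k} \frac{1}{n^k} &= \frac{n(n-1)(n-2) \cdots (n-(k-1))}{k!\, n^k} \\
&= \frac{1}{k!} \, \frac{n-1}{n} \, \frac{n-2}{n} \cdots \frac{n-k+1}{n} \\
&= \frac{1}{k!} \left(1 - \frac{1}{n}\right) \left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{k-1}{n}\right) \\
&\le \frac{1}{k!} \left(1 - \frac{1}{n+1}\right) \left(1 - \frac{2}{n+1}\right) \cdots \left(1 - \frac{k-1}{n+1}\right) \\
&= \frac{1}{k!} \, \frac{n}{n+1} \, \frac{n-1}{n+1} \cdots \frac{n+1-(k-1)}{n+1} \\
&= \frac{(n+1)n(n-1)(n-2) \cdots (n+1-(k-1))}{k!\, (n+1)^k} \\
&= \binom{n+1}{k} \frac{1}{(n+1)^k},
\end{align*}
which is the kth term of (3.2). The inequality is strict when k ≥ 2, and the terms with k = 0 are both equal to 1. Since (3.2) also has one more positive term in the
sum, it follows that $e_n < e_{n+1}$, and the sequence $e_n$ is increasing.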
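Noting that $1/k! \le 1/2^{k-1}$ for k ∈ N, we can bound the kth term of (3.1).
\begin{align*}
\binom{n}{k} \frac{1}{n^k} &= \frac{n!}{k!\,(n-k)!} \, \frac{1}{n^k} \\
&= \frac{n-1}{n} \, \frac{n-2}{n} \cdots \frac{n-k+1}{n} \, \frac{1}{k!} \\
&\le \frac{1}{k!} \\
&\le \frac{1}{2^{k-1}}.
\end{align*}
Substituting this into (3.1) yields
\begin{align*}
e_n = \sum_{k=0}^{n} \binom{n}{k} \frac{1}{n^k} &\le 1 + 1 + \frac{1}{2} + \frac{1}{4} + \cdots + \frac{1}{2^{n-1}} \\
&= 1 + \frac{1 - \frac{1}{2^n}}{1 - \frac{1}{2}} \\
&< 3,
\end{align*}
so $e_n$ is bounded.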
Since $e_n$ is increasing and bounded, Theorem 3.11 implies $e_n$ converges. Of
course, you probably remember from your calculus course that $e_n \to e \approx 2.71828$.
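The two facts established in this example, monotonicity and the bound 3, can be checked for the first several terms with exact rational arithmetic. The sketch below is only a numerical illustration under the assumption that 50 terms are enough to see the pattern; the function name `e_term` is made up for this example.

```python
from fractions import Fraction


def e_term(n):
    """Exact value of (1 + 1/n)^n as a rational number."""
    return (1 + Fraction(1, n)) ** n


terms = [e_term(n) for n in range(1, 51)]
# The terms are strictly increasing and bounded above by 3.
assert all(x < y for x, y in zip(terms, terms[1:]))
assert all(t < 3 for t in terms)
print(float(terms[0]), float(terms[1]), float(terms[-1]))
```

The growth toward e is slow: the 50th term is still below 2.7, which shows why the monotone convergence argument is more useful here than direct computation.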
Theorem 3.12. An unbounded monotone sequence diverges to ∞ or −∞,
depending on whether it is increasing or decreasing, respectively.
Proof. Suppose $a_n$ is increasing and unbounded. If B > 0, the fact that $a_n$ is
unbounded yields an N ∈ N such that $a_N > B$. Since $a_n$ is increasing, $a_n \ge a_N > B$
for all n ≥ N. This shows $a_n \to \infty$.
The proof when the sequence decreases is similar.
3. Subsequences and the Bolzano-Weierstrass Theorem
Definition 3.13. Let $a_n$ be a sequence and σ : N → N be a function such
that m < n implies σ(m) < σ(n); i.e., σ is a strictly increasing sequence of natural
numbers. Then $b_n = a \circ \sigma(n) = a_{\sigma(n)}$ is a subsequence of $a_n$.
The idea here is that the subsequence b n is a new sequence formed from an
old sequence a n by possibly leaving terms out of a n . In other words, all the terms
of b n must also appear in a n , and they must appear in the same order.
Example 3.12. Let σ(n) = 3n and $a_n$ be a sequence. Then the subsequence
$a_{\sigma(n)}$ looks like
\[
a_3, a_6, a_9, \dots, a_{3n}, \dots
\]
The subsequence has every third term of the original sequence.
Example 3.13. If $a_n = \sin(n\pi/2)$, then some possible subsequences are
\[
b_n = a_{4n+1} \implies b_n = 1,
\]
\[
c_n = a_{2n} \implies c_n = 0,
\]
and
\[
d_n = a_{n^2} \implies d_n = (1 + (-1)^{n+1})/2.
\]
Theorem 3.14. $a_n \to L$ iff every subsequence of $a_n$ converges to L.
Proof. (⇒) Suppose σ : N → N is strictly increasing, as in the preceding
definition. With a simple induction argument, it can be seen that σ(n) ≥ n for all
n. (See Exercise 3.8.)
Now, suppose $a_n \to L$ and $b_n = a_{\sigma(n)}$ is a subsequence of $a_n$. If ε > 0, there
is an N ∈ N such that n ≥ N implies $a_n \in (L - \varepsilon, L + \varepsilon)$. From the preceding
paragraph, it follows that when n ≥ N, then $b_n = a_{\sigma(n)} = a_m$ for some m ≥ n. So,
$b_n \in (L - \varepsilon, L + \varepsilon)$ and $b_n \to L$.
(⇐) Since $a_n$ is a subsequence of itself, it is obvious that $a_n \to L$.
The main use of Theorem 3.14 is not to show that sequences converge, but,
rather to show they diverge. It gives two strategies for doing this: find two subsequences converging to different limits, or find a divergent subsequence. In
Example 3.13, the subsequences b n and c n demonstrate the first strategy, while
d n demonstrates the second.
Even if a given sequence is badly behaved, it is possible there are well-behaved
subsequences. For example, consider the divergent sequence $a_n = (-1)^n$. In this
case, $a_n$ diverges, but the two subsequences $a_{2n} = 1$ and $a_{2n+1} = -1$ are constant
sequences, so they converge.
Theorem 3.15. Every sequence has a monotone subsequence.
Proof. Let $a_n$ be a sequence and $T = \{n \in \mathbb{N} : m > n \implies a_m \ge a_n\}$. There
are two cases to consider, depending on whether T is finite.
First, assume T is infinite. Define σ(1) = min T and assuming σ(n) is defined,
set σ(n + 1) = min T \ {σ(1), σ(2), . . . , σ(n)}. This inductively defines a strictly increasing function σ : N → N. The definition of T guarantees $a_{\sigma(n)}$ is an increasing
subsequence of $a_n$.
Now, assume T is finite. Let σ(1) = max T + 1 (or σ(1) = 1 if T is empty). If σ(n) has been chosen for
some n ∈ N with σ(n) ∉ T, then the definition of T implies there is an m > σ(n) such that
$a_m \le a_{\sigma(n)}$. Set σ(n + 1) = m; since m is larger than every element of T, it is again not in T. This inductively defines the strictly increasing
function σ : N → N such that $a_{\sigma(n)}$ is a decreasing subsequence of $a_n$.
If the sequence in Theorem 3.15 is bounded, then the corresponding monotone subsequence is also bounded. Recalling Theorem 3.11, we arrive at the
following famous theorem.
Theorem 3.16 (Bolzano-Weierstrass). Every bounded sequence has a convergent subsequence.
4. Lower and Upper Limits of a Sequence
There are an uncountable number of strictly increasing functions σ : N → N,
so every sequence a n has an uncountable number of subsequences. If a n converges, then Theorem 3.14 shows all of these subsequences converge to the same
limit. It’s also apparent that when a n → ∞ or a n → −∞, then all its subsequences
diverge in the same way. When a n does not converge or diverge to ±∞, the
situation is a bit more difficult because some subsequences may converge and
others may diverge.
Example 3.14. Let $\mathbb{Q} = \{q_n : n \in \mathbb{N}\}$ and α ∈ R. Since every interval contains
an infinite number of rational numbers, it is possible to choose
$\sigma(1) = \min\{k : |q_k - \alpha| < 1\}$. In general, assuming σ(n) has been chosen, choose
$\sigma(n+1) = \min\{k > \sigma(n) : |q_k - \alpha| < 1/n\}$. Such a choice is always possible because
$\mathbb{Q} \cap (\alpha - 1/n, \alpha + 1/n) \setminus \{q_k : k \le \sigma(n)\}$ is infinite. This induction yields a subsequence $q_{\sigma(n)}$
of $q_n$ converging to α.
If a n is a sequence and b n is a convergent subsequence of a n with b n → L,
then L is called an accumulation point of a n . A convergent sequence has only
one accumulation point, but a divergent sequence may have many accumulation points. As seen in Example 3.14, a sequence may have all of R as its set of
accumulation points.
To make some sense out of this, suppose $a_n$ is a bounded sequence, and
$T_n = \{a_k : k \ge n\}$. Define
\[
\ell_n = \operatorname{glb} T_n \quad \text{and} \quad \mu_n = \operatorname{lub} T_n.
\]
Because $T_n \supset T_{n+1}$, it follows that for all n ∈ N,
\[
\ell_1 \le \ell_n \le \ell_{n+1} \le \mu_{n+1} \le \mu_n \le \mu_1. \tag{3.3}
\]
This shows $\ell_n$ is an increasing sequence bounded above by $\mu_1$ and $\mu_n$ is a decreasing sequence bounded below by $\ell_1$. Theorem 3.11 implies both $\ell_n$ and $\mu_n$
converge. If $\ell_n \to \ell$ and $\mu_n \to \mu$, (3.3) shows for all n,
\[
\ell_n \le \ell \le \mu \le \mu_n. \tag{3.4}
\]
Suppose $b_n \to \beta$ is any convergent subsequence of $a_n$. From the definitions
of $\ell_n$ and $\mu_n$, it is seen that $\ell_n \le b_n \le \mu_n$ for all n. Now (3.4) shows $\ell \le \beta \le \mu$.
The normal terminology for $\ell$ and $\mu$ is given by the following definition.
Definition 3.17. Let $a_n$ be a sequence. If $a_n$ is bounded below, then the
lower limit of $a_n$ is
\[
\liminf a_n = \lim_{n\to\infty} \operatorname{glb}\{a_k : k \ge n\}.
\]
If $a_n$ is bounded above, then the upper limit of $a_n$ is
\[
\limsup a_n = \lim_{n\to\infty} \operatorname{lub}\{a_k : k \ge n\}.
\]
When $a_n$ is unbounded, the lower and upper limits are set to appropriate infinite
values, while recalling the familiar warnings about ∞ not being a number.
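Example 3.15. Define
\[
a_n = \begin{cases} 2 + 1/n, & n \text{ odd} \\ 1 - 1/n, & n \text{ even.} \end{cases}
\]
Then
\[
\mu_n = \operatorname{lub}\{a_k : k \ge n\} = \begin{cases} 2 + 1/n, & n \text{ odd} \\ 2 + 1/(n+1), & n \text{ even} \end{cases} \ \downarrow 2
\]
and
\[
\ell_n = \operatorname{glb}\{a_k : k \ge n\} = \begin{cases} 1 - 1/n, & n \text{ even} \\ 1 - 1/(n+1), & n \text{ odd} \end{cases} \ \uparrow 1.
\]
So,
\[
\limsup a_n = 2 > 1 = \liminf a_n.
\]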
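The tail bounds in this example can be tabulated directly. The Python sketch below computes the greatest lower bound and least upper bound of a finite window of each tail; for this particular sequence the extreme values of a tail occur at its first even and first odd index, so a finite window happens to give the exact values. The names `a` and `tail_bounds` and the window length are assumptions of the sketch, and the shortcut would not be valid for an arbitrary sequence.

```python
from fractions import Fraction


def a(n):
    """The sequence of the example: 2 + 1/n for odd n, 1 - 1/n for even n."""
    return 2 + Fraction(1, n) if n % 2 == 1 else 1 - Fraction(1, n)


def tail_bounds(n, window=1000):
    """Return (glb, lub) of {a_k : n <= k < n + window}.

    For this sequence the odd-indexed terms decrease to 2 and the
    even-indexed terms increase to 1, so the window captures the
    glb and lub of the whole tail.
    """
    tail = [a(k) for k in range(n, n + window)]
    return min(tail), max(tail)


for n in (1, 2, 10, 100):
    lower, upper = tail_bounds(n)
    print(n, float(lower), float(upper))
```

The printed lower bounds increase toward 1 and the upper bounds decrease toward 2, matching the lower and upper limits found above.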
Suppose $a_n$ is bounded above and both $\mu_n$ and $\mu$ are as in the discussion
preceding the definition. Choose σ(1) so $a_{\sigma(1)} > \mu_1 - 1$. If σ(n) has been chosen
for some n ∈ N, then choose σ(n + 1) > σ(n) to satisfy
\[
\mu_n \ge a_{\sigma(n+1)} > \operatorname{lub} T_{n+1} - 1/n = \mu_{n+1} - 1/n.
\]
This inductively defines a subsequence $a_{\sigma(n)} \to \mu = \limsup a_n$, where the convergence is guaranteed by Theorem 3.9, the Sandwich Theorem.
In the cases when $\limsup a_n = \infty$ and $\limsup a_n = -\infty$, it is left to the reader
to show there is a subsequence $b_n \to \limsup a_n$.
Similar arguments can be made for lim inf a n .
To summarize: If β is an accumulation point of $a_n$, then
\[
\liminf a_n \le \beta \le \limsup a_n.
\]
In case $a_n$ is bounded, both $\liminf a_n$ and $\limsup a_n$ are accumulation points of
$a_n$, and $a_n$ converges iff $\liminf a_n = \lim_{n\to\infty} a_n = \limsup a_n$.
The following theorem has been proved.
Theorem 3.18. Let $a_n$ be a sequence.
(a) There are subsequences of $a_n$ converging to $\liminf a_n$ and $\limsup a_n$.
(b) If α is an accumulation point of $a_n$, then $\liminf a_n \le \alpha \le \limsup a_n$.
(c) $\liminf a_n = \limsup a_n \in \mathbb{R}$ iff $a_n$ converges.
5. The Nested Interval Theorem
Definition 3.19. A collection of sets $\{S_n : n \in \mathbb{N}\}$ is said to be nested if
$S_{n+1} \subset S_n$ for all n ∈ N.
Theorem 3.20 (Nested Interval Theorem). If $\{I_n = [a_n, b_n] : n \in \mathbb{N}\}$ is a nested
collection of closed intervals such that $\lim_{n\to\infty} (b_n - a_n) = 0$, then there is an x ∈ R
such that $\bigcap_{n \in \mathbb{N}} I_n = \{x\}$.
Proof. Since the intervals are nested, it’s clear that $a_n$ is an increasing sequence bounded above by $b_1$ and $b_n$ is a decreasing sequence bounded below by
$a_1$. Applying Theorem 3.11 twice, we find there are α, β ∈ R such that $a_n \to \alpha$ and
$b_n \to \beta$.
We claim α = β. To see this, let ε > 0 and use the “shrinking” condition on
the intervals to pick N ∈ N so that $b_N - a_N < \varepsilon$. The nestedness of the intervals
implies $a_N \le a_n < b_n \le b_N$ for all n ≥ N. Therefore
\[
a_N \le \operatorname{lub}\{a_n : n \ge N\} = \alpha \le b_N \quad \text{and} \quad a_N \le \operatorname{glb}\{b_n : n \ge N\} = \beta \le b_N.
\]
This shows $|\alpha - \beta| \le |b_N - a_N| < \varepsilon$. Since ε > 0 was chosen arbitrarily, we conclude
α = β.
Let x = α = β. It remains to show that $\bigcap_{n \in \mathbb{N}} I_n = \{x\}$.
First, we show that $x \in \bigcap_{n \in \mathbb{N}} I_n$. To do this, fix N ∈ N. Since $a_n$ increases to x,
it’s clear that $x \ge a_N$. Similarly, $x \le b_N$. Therefore $x \in [a_N, b_N]$. Because N was
chosen arbitrarily, it follows that $x \in \bigcap_{n \in \mathbb{N}} I_n$.
Next, suppose there are $x, y \in \bigcap_{n \in \mathbb{N}} I_n$ and let ε > 0. Choose N ∈ N such that
$b_N - a_N < \varepsilon$. Then $\{x, y\} \subset \bigcap_{n \in \mathbb{N}} I_n \subset [a_N, b_N]$ implies $|x - y| < \varepsilon$. Since ε was
chosen arbitrarily, we see x = y. Therefore $\bigcap_{n \in \mathbb{N}} I_n = \{x\}$.
Example 3.16. If $I_n = (0, 1/n]$ for all n ∈ N, then the collection $\{I_n : n \in \mathbb{N}\}$ is
nested, but $\bigcap_{n \in \mathbb{N}} I_n = \emptyset$. This shows the assumption that the intervals be closed
in the Nested Interval Theorem is necessary.
Example 3.17. If $I_n = [n, \infty)$ then the collection $\{I_n : n \in \mathbb{N}\}$ is nested, but
$\bigcap_{n \in \mathbb{N}} I_n = \emptyset$. This shows the assumption that the lengths of the intervals be
bounded is necessary. (It will be shown in Corollary 5.11 that when their lengths
don’t go to 0, then the intersection is nonempty, but the uniqueness of x is lost.)
6. Cauchy Sequences
Often the biggest problem with showing that a sequence converges using
the techniques we have seen so far is we must know ahead of time to what it
converges. This is the “chicken and egg” problem mentioned above. An escape
from this dilemma is provided by Cauchy sequences.
Definition 3.21. A sequence $a_n$ is a Cauchy sequence if for all ε > 0 there is
an N ∈ N such that n, m ≥ N implies $|a_n - a_m| < \varepsilon$.
This definition is a bit more subtle than it might at first appear. It sort of says
that all the terms of the sequence are close together from some point onward.
The emphasis is on all the terms from some point onward. To stress this, first
consider a negative example.
Example 3.18. Suppose $a_n = \sum_{k=1}^{n} 1/k$ for n ∈ N. There’s a trick for showing
the sequence $a_n$ diverges. First, note that $a_n$ is strictly increasing. For any n ∈ N,
consider
\[
a_{2^n - 1} = \sum_{k=1}^{2^n - 1} \frac{1}{k} = \sum_{j=0}^{n-1} \sum_{k=0}^{2^j - 1} \frac{1}{2^j + k} > \sum_{j=0}^{n-1} \sum_{k=0}^{2^j - 1} \frac{1}{2^{j+1}} = \sum_{j=0}^{n-1} \frac{1}{2} = \frac{n}{2} \to \infty.
\]
Hence, the subsequence $a_{2^n - 1}$ is unbounded and the sequence $a_n$ diverges. (To
see how this works, write out the first few sums of the form $a_{2^n - 1}$.)
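Writing out the first few of these sums is quick to automate. The following Python sketch computes the partial sums of 1 + 1/2 + ... + 1/(2^n − 1) exactly and checks them against the lower bound n/2 from the grouping argument; the function name `harmonic` is an assumption of this example, and checking ten values only illustrates the inequality.

```python
from fractions import Fraction


def harmonic(n):
    """Exact partial sum 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


for n in range(1, 11):
    value = harmonic(2**n - 1)
    # The grouping argument gives the lower bound n/2.
    assert value > Fraction(n, 2)
    print(n, 2**n - 1, float(value))
```

The output also shows how slowly the sums grow: roughly a thousand terms are needed before the partial sum passes 7.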
On the other hand, $|a_{n+1} - a_n| = 1/(n+1) \to 0$ and indeed, if m is fixed,
$|a_{n+m} - a_n| \to 0$. This makes it seem as though the terms are getting close together,
as in the definition of a Cauchy sequence. But, $a_n$ is not a Cauchy sequence, as
shown by the following theorem.
Theorem 3.22. A sequence converges iff it is a Cauchy sequence.
Proof. (⇒) Suppose $a_n \to L$ and ε > 0. There is an N ∈ N such that n ≥ N
implies $|a_n - L| < \varepsilon/2$. If m, n ≥ N, then
\[
|a_m - a_n| = |a_m - L + L - a_n| \le |a_m - L| + |L - a_n| < \varepsilon/2 + \varepsilon/2 = \varepsilon.
\]
This shows $a_n$ is a Cauchy sequence.
(⇐) Let $a_n$ be a Cauchy sequence. First, we claim that $a_n$ is bounded. To see
this, let ε = 1 and choose N ∈ N such that n, m ≥ N implies $|a_n - a_m| < 1$. In this
case, $a_N - 1 < a_n < a_N + 1$ for all n ≥ N, so $\{a_n : n \ge N\}$ is a bounded set. The set
$\{a_n : n < N\}$, being finite, is also bounded. Since $\{a_n : n \in \mathbb{N}\}$ is the union of these
two bounded sets, it too must be bounded.
Because $a_n$ is a bounded sequence, Theorem 3.16 implies it has a convergent
subsequence $b_n = a_{\sigma(n)} \to L$. Let ε > 0 and choose N ∈ N so that n, m ≥ N implies
©Lee Larson (Lee.Larson@Louisville.edu)
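$|a_n - a_m| < \varepsilon/2$ and $|b_n - L| < \varepsilon/2$. If n ≥ N, then σ(n) ≥ n ≥ N and
\begin{align*}
|a_n - L| &= |a_n - b_n + b_n - L| \\
&\le |a_n - b_n| + |b_n - L| \\
&= |a_n - a_{\sigma(n)}| + |b_n - L| \\
&< \varepsilon/2 + \varepsilon/2 = \varepsilon.
\end{align*}
Therefore, $a_n \to L$.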
The fact that Cauchy sequences converge is yet another equivalent version
of completeness. In fact, most advanced texts define completeness as “Cauchy
sequences converge.” This is convenient in general spaces because the definition
of a Cauchy sequence only needs the metric on the space and none of its other
structure.
A typical example of the usefulness of Cauchy sequences is given below.
Definition 3.23. A sequence $x_n$ is contractive if there is a c ∈ (0, 1) such that
$|x_{k+1} - x_k| \le c\,|x_k - x_{k-1}|$ for all k > 1. The number c is called the contraction constant.
Theorem 3.24. If a sequence is contractive, then it converges.
Proof. Let $x_k$ be a contractive sequence with contraction constant c ∈ (0, 1).
We first claim that if n ∈ N, then
\[
|x_n - x_{n+1}| \le c^{n-1} |x_1 - x_2|. \tag{3.5}
\]
This is proved by induction. When n = 1, the statement is
\[
|x_1 - x_2| \le c^0 |x_1 - x_2| = |x_1 - x_2|,
\]
which is trivially true. Suppose that $|x_n - x_{n+1}| \le c^{n-1} |x_1 - x_2|$ for some n ∈ N.
Then, from the definition of a contractive sequence and the induction hypothesis,
\[
|x_{n+1} - x_{n+2}| \le c\,|x_n - x_{n+1}| \le c\,c^{n-1} |x_1 - x_2| = c^n |x_1 - x_2|.
\]
This shows the claim is true in the case n + 1. Therefore, by induction, the claim
is true for all n ∈ N.
To show $x_n$ is a Cauchy sequence, let ε > 0. Since $c^n \to 0$, we can choose
N ∈ N so that
\[
\frac{c^{N-1}}{1-c} |x_1 - x_2| < \varepsilon. \tag{3.6}
\]
Let n > m ≥ N. Then
\begin{align*}
|x_n - x_m| &= |x_n - x_{n-1} + x_{n-1} - x_{n-2} + x_{n-2} - \cdots - x_{m+1} + x_{m+1} - x_m| \\
&\le |x_n - x_{n-1}| + |x_{n-1} - x_{n-2}| + \cdots + |x_{m+1} - x_m|
\end{align*}
Now, use (3.5) on each of these terms.
\begin{align*}
&\le c^{n-2} |x_1 - x_2| + c^{n-3} |x_1 - x_2| + \cdots + c^{m-1} |x_1 - x_2| \\
&= |x_1 - x_2| \,(c^{n-2} + c^{n-3} + \cdots + c^{m-1})
\end{align*}
Apply the formula for a geometric sum.
\begin{align*}
&= |x_1 - x_2| \, c^{m-1} \, \frac{1 - c^{n-m}}{1 - c} \\
&< |x_1 - x_2| \, \frac{c^{m-1}}{1 - c} \tag{3.7}
\end{align*}
Use (3.6) to estimate the following.
\begin{align*}
&\le |x_1 - x_2| \, \frac{c^{N-1}}{1 - c} \\
&< |x_1 - x_2| \, \frac{\varepsilon}{|x_1 - x_2|} \\
&= \varepsilon
\end{align*}
This shows $x_n$ is a Cauchy sequence and must converge by Theorem 3.22.
Example 3.19. Let −1 < r < 1 and define the sequence $s_n = \sum_{k=0}^{n} r^k$. (You no
doubt recognize this as the geometric series from your calculus course.) If r = 0,
the convergence of $s_n$ is trivial. So, suppose $r \ne 0$. In this case,
\[
\frac{|s_{n+1} - s_n|}{|s_n - s_{n-1}|} = \left| \frac{r^{n+1}}{r^n} \right| = |r| < 1
\]
and $s_n$ is contractive. Theorem 3.24 implies $s_n$ converges.
Example 3.20. Suppose f(x) = 2 + 1/x, $a_1 = 2$ and $a_{n+1} = f(a_n)$ for n ∈ N. It
is evident that $a_n \ge 2$ for all n. Some algebra gives
\[
\left| \frac{a_{n+1} - a_n}{a_n - a_{n-1}} \right| = \left| \frac{f(f(a_{n-1})) - f(a_{n-1})}{f(a_{n-1}) - a_{n-1}} \right| = \frac{1}{1 + 2a_{n-1}} \le \frac{1}{5}.
\]
This shows $a_n$ is a contractive sequence and, according to Theorem 3.24, $a_n \to L$
for some L ≥ 2. Since $a_{n+1} = 2 + 1/a_n$, taking the limit as n → ∞ of both sides
gives L = 2 + 1/L. A bit more algebra shows $L = 1 + \sqrt{2}$.
L is called a fixed point of the function f ; i.e. f (L) = L. Many approximation
techniques for solving equations involve such iterative techniques depending
upon contraction to find fixed points.
The calculations in the proof of Theorem 3.24 give the means to approximate
the fixed point to within an allowable error. Looking at line (3.7), notice
\[
|x_n - x_m| < |x_1 - x_2| \, \frac{c^{m-1}}{1 - c}.
\]
Let n → ∞ in this inequality to arrive at the error estimate
\[
|L - x_m| \le |x_1 - x_2| \, \frac{c^{m-1}}{1 - c}. \tag{3.8}
\]
In Example 3.20, $a_1 = 2$, $a_2 = 5/2$ and c ≤ 1/5. Suppose we want to approximate L to 5 decimal places of accuracy. It suffices to find n satisfying
$|a_n - L| < 5 \times 10^{-6}$. Using (3.8), with m = 9 shows
\[
|a_1 - a_2| \, \frac{c^{m-1}}{1 - c} \le 1.6 \times 10^{-6}.
\]
Some arithmetic gives $a_9 \approx 2.41421$. The calculator value is
\[
L = 1 + \sqrt{2} \approx 2.414213562,
\]
confirming our estimate.
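The arithmetic behind this estimate can be reproduced in a few lines. The Python sketch below iterates f(x) = 2 + 1/x from the starting value 2, evaluates the right-hand side of the error estimate (3.8) with c = 1/5 and m = 9, and compares the result with 1 + √2. The function names `iterate` and `error_bound` are assumptions of this sketch, and floating-point arithmetic is used, so the printed digits are approximate.

```python
import math


def f(x):
    """The function f(x) = 2 + 1/x from Example 3.20."""
    return 2 + 1 / x


def iterate(a1, m):
    """Return a_m for the recursion a_1 = a1, a_{n+1} = f(a_n)."""
    a = a1
    for _ in range(m - 1):
        a = f(a)
    return a


def error_bound(a1, c, m):
    """Right-hand side of (3.8): |a_1 - a_2| * c^(m-1) / (1 - c)."""
    return abs(a1 - f(a1)) * c ** (m - 1) / (1 - c)


a9 = iterate(2.0, 9)
bound = error_bound(2.0, 1 / 5, 9)
actual_error = abs(a9 - (1 + math.sqrt(2)))
# The true error must not exceed the a priori bound.
assert actual_error <= bound
print(a9, bound, actual_error)
```

The actual error turns out to be noticeably smaller than the bound, which is typical: (3.8) is a worst-case estimate based only on the contraction constant.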
7. Exercises
3.1. Let the sequence $a_n = \dfrac{6n - 1}{3n + 2}$. Use the definition of convergence for a
sequence to show $a_n$ converges.
3.2. If $a_n$ is a sequence such that $a_{2n} \to L$ and $a_{2n+1} \to L$, then $a_n \to L$.
3.3. Let $a_n$ be a sequence such that $a_{2n} \to A$ and $a_{2n} - a_{2n-1} \to 0$. Then $a_n \to A$.
3.4. If $a_n$ is a sequence of positive numbers converging to 0, then
$\sqrt{a_n} \to 0$.
3.5. Find examples of sequences $a_n$ and $b_n$ such that $a_n \to 0$ and $b_n \to \infty$ such
that
(a) $a_n b_n \to 0$
(b) $a_n b_n \to \infty$
(c) $\lim_{n\to\infty} a_n b_n$ does not exist, but $a_n b_n$ is bounded.
(d) Given c ∈ R, $a_n b_n \to c$.
3.6. If $x_n$ and $y_n$ are sequences such that $\lim_{n\to\infty} x_n = L \ne 0$ and $\lim_{n\to\infty} x_n y_n$
exists, then $\lim_{n\to\infty} y_n$ exists.
3.7. Determine the limit of $a_n = \sqrt[n]{n!}$. (Hint: If n is even, then $n! > (n/2)^{n/2}$.)
3.8. If σ : N → N is strictly increasing, then σ(n) ≥ n for all n ∈ N.
3.9. Analyze the sequence given by $a_n = \sum_{k=n+1}^{2n} 1/k$.
3.10. Every unbounded sequence contains a monotonic subsequence.
3.11. Find a sequence $a_n$ such that given x ∈ [0, 1], there is a subsequence $b_n$ of
$a_n$ such that $b_n \to x$.
3.12. A sequence $a_n$ converges to 0 iff $|a_n|$ converges to 0.
3.13. Define the sequence $a_n = \sqrt{n}$ for n ∈ N. Show that $|a_{n+1} - a_n| \to 0$, but $a_n$
is not a Cauchy sequence.
3.14. Suppose a sequence is defined by $a_1 = 0$, $a_2 = 1$ and $a_{n+1} = \frac{1}{2}(a_n + a_{n-1})$
for n ≥ 2. Prove $a_n$ converges, and determine its limit.
3.15. If the sequence $a_n$ is defined recursively by $a_1 = 1$ and $a_{n+1} = \sqrt{a_n + 1}$,
then show $a_n$ converges and determine its limit.
3.16. Let $a_1 = 3$ and $a_{n+1} = 2 - 1/a_n$ for n ∈ N. Analyze the sequence.
3.17. If $a_n$ is a sequence such that $\lim_{n\to\infty} |a_{n+1}/a_n| = \rho < 1$, then $a_n \to 0$.
3.18. Prove that the sequence $a_n = n^3/n!$ converges.
3.19. Let $a_n$ and $b_n$ be sequences. Prove that both sequences $a_n$ and $b_n$ converge
iff both $a_n + b_n$ and $a_n - b_n$ converge.
3.20. Let $a_n$ be a bounded sequence. Prove that given any ε > 0, there is an
interval I with length ε such that $\{n : a_n \in I\}$ is infinite. Is it necessary that $a_n$ be
bounded?
3.21. A sequence $a_n$ converges in the mean if $\bar{a}_n = \frac{1}{n} \sum_{k=1}^{n} a_k$ converges. Prove
that if $a_n \to L$, then $\bar{a}_n \to L$, but the converse is not true.
3.22. Find a sequence $x_n$ such that for all n ∈ N there is a subsequence of $x_n$
converging to n.
3.23. If $a_n$ is a Cauchy sequence whose terms are integers, what can you say
about the sequence?
3.24. Show $a_n = \sum_{k=0}^{n} 1/k!$ is a Cauchy sequence.
3.25. If $a_n$ is a sequence such that every subsequence of $a_n$ has a further
subsequence converging to L, then $a_n \to L$.
3.26. If a, b ∈ (0, ∞), then show $\sqrt[n]{a^n + b^n} \to \max\{a, b\}$.
3.27. If 0 < α < 1 and $s_n$ is a sequence satisfying $|s_{n+1}| < \alpha |s_n|$, then $s_n \to 0$.
3.28. If c ≥ 1 in the definition of a contractive sequence, can the sequence
converge?
3.29. If $a_n$ is a convergent sequence and $b_n$ is a sequence such that
$|a_m - a_n| \ge |b_m - b_n|$ for all m, n ∈ N, then $b_n$ converges.
3.30. If $a_n \ge 0$ for all n ∈ N and $a_n \to L$, then $\sqrt{a_n} \to \sqrt{L}$.
3.31. If $a_n$ is a Cauchy sequence and $b_n$ is a subsequence of $a_n$ such that $b_n \to L$,
then $a_n \to L$.
3.32. Let $x_1 = 3$ and $x_{n+1} = 2 - 1/x_n$ for n ∈ N. Analyze the sequence.
3.33. Let $a_n$ be a sequence. $a_n \to L$ iff $\limsup a_n = L = \liminf a_n$.
3.34. Is $\limsup (a_n + b_n) = \limsup a_n + \limsup b_n$?
3.35. If $a_n$ is a sequence of positive numbers, then $1/\liminf a_n = \limsup 1/a_n$.
3.36. $\limsup (a_n + b_n) \le \limsup a_n + \limsup b_n$.
3.37. $a_n = 1/n$ is not contractive.
3.38. The equation $x^3 - 4x + 2 = 0$ has one real root lying between 0 and 1. Find
a sequence of rational numbers converging to this root. Use this sequence to
approximate the root to five decimal places.
3.39. Approximate a solution of $x^3 - 5x + 1 = 0$ to within $10^{-4}$ using a Cauchy
sequence.
3.40. Prove or give a counterexample: If a n → L and σ : N → N is bijective, then
b n = a σ(n) converges. Note that b n might not be a subsequence of a n . (b n is
called a rearrangement of a n .)
CHAPTER 4
Series
Given a sequence a n , in many contexts it is natural to ask about the sum of
all the numbers in the sequence. If only a finite number of the a n are nonzero,
this is trivial—and not very interesting. If an infinite number of the terms aren’t
zero, the path becomes less obvious. Indeed, it’s even somewhat questionable
whether it makes sense at all to add an infinite number of numbers.
There are many approaches to this question. The method given below is the
most common technique. Others are mentioned in the exercises.
1. What is a Series?
The idea behind adding up an infinite collection of numbers is a reduction to
the well-understood idea of a sequence. This is a typical approach in mathematics: reduce a question to a previously solved problem.
Definition 4.1. Given a sequence $a_n$, the series having $a_n$ as its terms is the
new sequence
\[
s_n = \sum_{k=1}^{n} a_k = a_1 + a_2 + \cdots + a_n.
\]
The numbers $s_n$ are called the partial sums of the series. If $s_n \to S \in \mathbb{R}$, then the
series converges to S. This is normally written as
\[
\sum_{k=1}^{\infty} a_k = S.
\]
Otherwise, the series diverges.
The notation $\sum_{n=1}^{\infty} a_n$ is understood to stand for the sequence of partial sums
of the series with terms $a_n$. When there is no ambiguity, this is often abbreviated
to just $\sum a_n$.
Example 4.1. If $a_n = (-1)^n$ for n ∈ N, then $s_1 = -1$, $s_2 = -1 + 1 = 0$,
$s_3 = -1 + 1 - 1 = -1$ and in general
\[
s_n = \frac{(-1)^n - 1}{2}
\]
does not converge because it oscillates between −1 and 0. Therefore, the series
$\sum (-1)^n$ diverges.
Example 4.2 (Geometric Series). Recall that a sequence of the form $a_n = c\,r^{n-1}$
is called a geometric sequence. It gives rise to a series
\[
\sum_{n=1}^{\infty} c\,r^{n-1} = c + cr + cr^2 + cr^3 + \cdots
\]
called a geometric series. The number r is called the ratio of the series.
Suppose $a_n = r^{n-1}$ for $r \ne 1$. Then,
\[
s_1 = 1, \quad s_2 = 1 + r, \quad s_3 = 1 + r + r^2, \ \dots
\]
In general, it can be shown by induction (or even long division of polynomials)
that
\[
s_n = \sum_{k=1}^{n} a_k = \sum_{k=1}^{n} r^{k-1} = \frac{1 - r^n}{1 - r}. \tag{4.1}
\]
The convergence of $s_n$ in (4.1) depends on the value of r. Letting n → ∞, it’s
apparent that $s_n$ diverges when $|r| > 1$ and converges to 1/(1 − r) when $|r| < 1$.
When r = 1, $s_n = n \to \infty$. When r = −1, it’s essentially the same as Example 4.1,
and therefore diverges. In summary,
\[
\sum_{n=1}^{\infty} c\,r^{n-1} = \frac{c}{1 - r}
\]
for $|r| < 1$, and diverges when $|r| \ge 1$. This is called a geometric series with ratio r.
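A quick numerical comparison makes the closed form for the partial sums concrete. The Python sketch below sums the terms of a geometric series one by one and compares the result with (4.1) scaled by c and with the limit c/(1 − r); the function names are assumptions of this example, and the values c = 1, r = 1/2 are chosen only for illustration.

```python
def geometric_partial_sum(c, r, n):
    """s_n = c + c*r + ... + c*r^(n-1), summed term by term."""
    return sum(c * r ** (k - 1) for k in range(1, n + 1))


def closed_form(c, r, n):
    """Formula (4.1) scaled by c; requires r != 1."""
    return c * (1 - r ** n) / (1 - r)


c, r = 1.0, 0.5
for n in (1, 5, 10, 50):
    # Columns: n, term-by-term sum, closed form, limit c / (1 - r).
    print(n, geometric_partial_sum(c, r, n), closed_form(c, r, n), c / (1 - r))
```

With ratio 1/2 the partial sums approach 2 quickly, while a ratio with absolute value at least 1 makes the same loop grow without bound or oscillate.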
Figure 4.1. Stepping to the wall. [Figure omitted: successive steps shown against distance from the wall, with marks at 2, 1, 1/2 and 0.]
In some cases, the geometric series has an intuitively plausible limit. If you
start two meters away from a wall and keep stepping halfway to the wall, no
number of steps will get you to the wall, but a large number of steps will get you
as close to the wall as you want. (See Figure 4.1.) So, the total distance stepped
has limiting value 2. The total distance after n steps is the nth partial sum of a
geometric series with ratio r = 1/2 and c = 1.
EXAMPLE 4.3 (Harmonic Series). The series $\sum_{n=1}^{\infty} 1/n$ is called the harmonic series. It was shown in Example 3.18 that the harmonic series diverges.
EXAMPLE 4.4. The terms of the sequence
$$a_n = \frac{1}{n^2 + n}, \quad n \in \mathbb{N},$$
can be decomposed into partial fractions as
$$a_n = \frac{1}{n} - \frac{1}{n+1}.$$
If $s_n$ is the series having $a_n$ as its terms, then $s_1 = 1/2 = 1 - 1/2$. We claim that $s_n = 1 - 1/(n+1)$ for all $n \in \mathbb{N}$. To see this, suppose $s_k = 1 - 1/(k+1)$ for some $k \in \mathbb{N}$. Then
$$s_{k+1} = s_k + a_{k+1} = 1 - \frac{1}{k+1} + \frac{1}{k+1} - \frac{1}{k+2} = 1 - \frac{1}{k+2}$$
and the claim is established by induction. Now it's easy to see that
$$\sum_{n=1}^{\infty} \frac{1}{n^2 + n} = \lim_{n\to\infty}\left(1 - \frac{1}{n+1}\right) = 1.$$
This is an example of a telescoping series. The name is apparently based on the
idea that the middle terms of the series cancel, causing the series to collapse like
a handheld telescope.
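The telescoping formula $s_n = 1 - 1/(n+1)$ from Example 4.4 can be confirmed with exact arithmetic. This is a small sketch; the function name is illustrative.

```python
from fractions import Fraction

def telescoping_partial_sum(n):
    """Return s_n = sum of 1/(k**2 + k) for k = 1..n, as an exact fraction."""
    return sum(Fraction(1, k * k + k) for k in range(1, n + 1))

for n in (1, 2, 10, 100):
    # The middle terms cancel, leaving 1 - 1/(n + 1).
    assert telescoping_partial_sum(n) == 1 - Fraction(1, n + 1)
```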
The following theorem is an easy consequence of the properties of sequences
shown in Theorem 3.8.
THEOREM 4.2. Let $\sum a_n$ and $\sum b_n$ be convergent series.
(a) If $c \in \mathbb{R}$, then $\sum c\,a_n = c \sum a_n$.
(b) $\sum (a_n + b_n) = \sum a_n + \sum b_n$.
(c) $a_n \to 0$.
PROOF. Let $A_n = \sum_{k=1}^{n} a_k$ and $B_n = \sum_{k=1}^{n} b_k$ be the sequences of partial sums for each of the two series. By assumption, there are numbers $A$ and $B$ where $A_n \to A$ and $B_n \to B$.
(a) $\sum_{k=1}^{n} c\,a_k = c \sum_{k=1}^{n} a_k = cA_n \to cA$.
(b) $\sum_{k=1}^{n} (a_k + b_k) = \sum_{k=1}^{n} a_k + \sum_{k=1}^{n} b_k = A_n + B_n \to A + B$.
(c) For $n > 1$, $a_n = \sum_{k=1}^{n} a_k - \sum_{k=1}^{n-1} a_k = A_n - A_{n-1} \to A - A = 0$.
Notice that the first two parts of Theorem 4.2 show that the set of all convergent series is closed under linear combinations.
Theorem 4.2(c) is very useful because its contrapositive provides the most
basic test for divergence.
COROLLARY 4.3 (Going to Zero Test). If $a_n \not\to 0$, then $\sum a_n$ diverges.
Many have made the mistake of reading too much into Corollary 4.3. It can only be used to show divergence. When the terms of a series do tend to zero, that does not guarantee convergence. Example 4.3 shows the condition in Theorem 4.2(c) is necessary, but not sufficient, for convergence.
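The harmonic series makes the point concrete: its terms tend to zero, yet its partial sums pass every bound. The sketch below checks the grouping estimate $s_{2^k} \ge 1 + k/2$, which is a standard bound assumed here for illustration rather than quoted from this section.

```python
from fractions import Fraction

def harmonic_partial_sum(n):
    """Return 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# The terms 1/n shrink to 0, but the partial sums at n = 2**k keep growing:
# each doubling of n adds at least 1/2 to the sum.
for k in range(0, 11):
    assert harmonic_partial_sum(2**k) >= 1 + Fraction(k, 2)
```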
Another useful observation is that the partial sums of a convergent series are
a Cauchy sequence. The Cauchy criterion for sequences can be rephrased for
series as the following theorem, the proof of which is Exercise 4.4.
THEOREM 4.4 (Cauchy Criterion for Series). Let $\sum a_n$ be a series. The following statements are equivalent.
(a) $\sum a_n$ converges.
(b) For every $\varepsilon > 0$ there is an $N \in \mathbb{N}$ such that whenever $n \ge m \ge N$, then
$$\left|\sum_{i=m}^{n} a_i\right| < \varepsilon.$$
2. Positive Series
Most of the time, it is very hard or impossible to determine the exact limit of
a convergent series. We must satisfy ourselves with determining whether a series
converges, and then approximating its sum. For this reason, the study of series
usually involves learning a collection of theorems that might answer whether a
given series converges, but don’t tell us to what it converges. These theorems are
usually called the convergence tests. The reader probably remembers a battery
of such tests from her calculus course. There is a myriad of such tests, and the
standard ones are presented in the next few sections, along with a few of those
less widely used.
Since convergence of a series is determined by convergence of the sequence of its partial sums, the easiest series to study are those with well-behaved partial sums. Series with monotone sequences of partial sums are certainly the simplest such series.
DEFINITION 4.5. The series $\sum a_n$ is a positive series, if $a_n \ge 0$ for all $n$.
The advantage of a positive series is that its sequence of partial sums is
nonnegative and increasing. Since an increasing sequence converges if and only
if it is bounded above, there is a simple criterion to determine whether a positive
series converges. All of the standard convergence tests for positive series exploit
this criterion.
2.1. The Most Common Convergence Tests. All beginning calculus courses
contain several simple tests to determine whether positive series converge. Most
of them are presented below.
2.1.1. Comparison Tests. The most basic convergence tests are the comparison tests. In these tests, the behavior of one series is inferred from that of another
series. Although they’re easy to use, there is one often fatal catch: in order to use
a comparison test, you must have a known series to which you can compare the
mystery series. For this reason, a wise mathematician collects example series
for her toolbox. The more samples in the toolbox, the more powerful are the
comparison tests.
THEOREM 4.6 (Comparison Test). Suppose $\sum a_n$ and $\sum b_n$ are positive series with $a_n \le b_n$ for all $n$.
(a) If $\sum b_n$ converges, then so does $\sum a_n$.
(b) If $\sum a_n$ diverges, then so does $\sum b_n$.
FIGURE 4.2. This diagram shows the groupings used in inequality (4.3). Upper grouping: $a_2 + a_3 \le 2a_2$, $a_4 + \cdots + a_7 \le 4a_4$, $a_8 + \cdots + a_{15} \le 8a_8$. Lower grouping: $a_2 \ge a_2$, $a_3 + a_4 \ge 2a_4$, $a_5 + \cdots + a_8 \ge 4a_8$, $a_9 + \cdots + a_{16} \ge 8a_{16}$.
PROOF. Let $A_n$ and $B_n$ be the partial sums of $\sum a_n$ and $\sum b_n$, respectively. It follows from the assumptions that $A_n$ and $B_n$ are increasing and for all $n \in \mathbb{N}$,
$$A_n \le B_n. \tag{4.2}$$
If $\sum b_n = B$, then (4.2) implies $B$ is an upper bound for $A_n$, and $\sum a_n$ converges.
On the other hand, if $\sum a_n$ diverges, $A_n \to \infty$ and the Sandwich Theorem 3.9(b) shows $B_n \to \infty$.
EXAMPLE 4.5. Example 4.3 shows that $\sum 1/n$ diverges. If $p \le 1$, then $1/n^p \ge 1/n$, and Theorem 4.6 implies $\sum 1/n^p$ diverges.
EXAMPLE 4.6. The series $\sum \sin^2 n / 2^n$ converges because
$$\frac{\sin^2 n}{2^n} \le \frac{1}{2^n}$$
for all $n$ and the geometric series $\sum 1/2^n = 1$.
THEOREM 4.7 (Cauchy's Condensation Test¹). Suppose $a_n$ is a decreasing sequence of nonnegative numbers. Then $\sum a_n$ converges iff $\sum 2^n a_{2^n}$ converges.
PROOF. Since $a_n$ is decreasing, for $n \in \mathbb{N}$,
$$\sum_{k=2^n}^{2^{n+1}-1} a_k \le 2^n a_{2^n} \le 2 \sum_{k=2^{n-1}}^{2^n - 1} a_k. \tag{4.3}$$
(See Figure 4.2.) Adding for $1 \le n \le m$ gives
$$\sum_{k=2}^{2^{m+1}-1} a_k \le \sum_{k=1}^{m} 2^k a_{2^k} \le 2 \sum_{k=1}^{2^m - 1} a_k. \tag{4.4}$$
Suppose $\sum a_n$ converges to $S$. The right-hand inequality of (4.4) shows $\sum_{k=1}^{m} 2^k a_{2^k} \le 2S$ and $\sum 2^k a_{2^k}$ must converge. On the other hand, if $\sum a_n$ diverges, then the left-hand side of (4.4) is unbounded, forcing $\sum 2^k a_{2^k}$ to diverge.
¹ The series $\sum 2^n a_{2^n}$ is sometimes called the condensed series associated with $\sum a_n$.
EXAMPLE 4.7 (p-series). For fixed $p \in \mathbb{R}$, the series $\sum 1/n^p$ is called a p-series. The special case when $p = 1$ is the harmonic series. Notice
$$\sum \frac{2^n}{(2^n)^p} = \sum \left(2^{1-p}\right)^n$$
is a geometric series with ratio $2^{1-p}$, so it converges only when $2^{1-p} < 1$. Since $2^{1-p} < 1$ only when $p > 1$, it follows from the Cauchy Condensation Test that the p-series converges when $p > 1$ and diverges when $p \le 1$. (Of course, the divergence half of this was already known from Example 4.5.)
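The two-sided bound (4.4) behind the condensation test can be checked directly for a particular decreasing sequence. The sketch below uses $a_n = 1/n^2$ as an assumed example and exact fractions; the names are illustrative.

```python
from fractions import Fraction

def a(n):
    """Example decreasing sequence a_n = 1/n**2 (an assumption for illustration)."""
    return Fraction(1, n * n)

def condensation_bounds(m):
    """Return the three quantities compared in inequality (4.4)."""
    left = sum(a(k) for k in range(2, 2**(m + 1)))           # k = 2 .. 2^(m+1) - 1
    middle = sum(2**k * a(2**k) for k in range(1, m + 1))    # condensed partial sum
    right = 2 * sum(a(k) for k in range(1, 2**m))            # k = 1 .. 2^m - 1
    return left, middle, right

for m in range(1, 9):
    left, middle, right = condensation_bounds(m)
    assert left <= middle <= right
```

For this choice of $a_n$ the condensed terms are $2^k / 4^k = (1/2)^k$, the geometric series predicted by Example 4.7 with $p = 2$.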
The p-series are often useful for the Comparison Test, and also occur in many areas of advanced mathematics such as harmonic analysis and number theory.
THEOREM 4.8 (Limit Comparison Test). Suppose $\sum a_n$ and $\sum b_n$ are positive series with
$$\alpha = \liminf \frac{a_n}{b_n} \le \limsup \frac{a_n}{b_n} = \beta. \tag{4.5}$$
(a) If $\alpha \in (0, \infty)$ and $\sum a_n$ converges, then so does $\sum b_n$, and if $\sum b_n$ diverges, then so does $\sum a_n$.
(b) If $\beta \in (0, \infty)$ and $\sum b_n$ converges, then so does $\sum a_n$, and if $\sum a_n$ diverges, then so does $\sum b_n$.
PROOF. To prove (a), suppose $\alpha > 0$. There is an $N \in \mathbb{N}$ such that
$$n \ge N \implies \frac{\alpha}{2} < \frac{a_n}{b_n}. \tag{4.6}$$
If $n > N$, then (4.6) gives
$$\frac{\alpha}{2} \sum_{k=N}^{n} b_k < \sum_{k=N}^{n} a_k. \tag{4.7}$$
If $\sum a_n$ converges, then (4.7) shows the partial sums of $\sum b_n$ are bounded and $\sum b_n$ converges. If $\sum b_n$ diverges, then (4.7) shows the partial sums of $\sum a_n$ are unbounded, and $\sum a_n$ must diverge.
The proof of (b) is similar.
The following easy corollary is the form this test takes in most calculus books.
It’s easier to use than Theorem 4.8 and suffices most of the time.
COROLLARY 4.9 (Limit Comparison Test). Suppose $\sum a_n$ and $\sum b_n$ are positive series with
$$\alpha = \lim_{n\to\infty} \frac{a_n}{b_n}. \tag{4.8}$$
If $\alpha \in (0, \infty)$, then $\sum a_n$ and $\sum b_n$ either both converge or both diverge.
EXAMPLE 4.8. To test the series $\sum \frac{1}{2^n - n}$ for convergence, let
$$a_n = \frac{1}{2^n - n} \quad\text{and}\quad b_n = \frac{1}{2^n}.$$
Then
$$\lim_{n\to\infty} \frac{a_n}{b_n} = \lim_{n\to\infty} \frac{1/(2^n - n)}{1/2^n} = \lim_{n\to\infty} \frac{2^n}{2^n - n} = \lim_{n\to\infty} \frac{1}{1 - n/2^n} = 1 \in (0, \infty).$$
Since $\sum 1/2^n = 1$, the original series converges by the Limit Comparison Test.
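The ratio in Example 4.8 can be tabulated exactly to see how quickly it approaches 1. This is a small sketch; the function name is illustrative.

```python
from fractions import Fraction

def comparison_ratio(n):
    """Return a_n / b_n for a_n = 1/(2**n - n) and b_n = 1/2**n, exactly."""
    a_n = Fraction(1, 2**n - n)
    b_n = Fraction(1, 2**n)
    return a_n / b_n

# The ratio equals 2**n / (2**n - n), which is slightly above 1 and tends to 1.
ratios = [comparison_ratio(n) for n in (1, 5, 10, 50)]
```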
2.1.2. Geometric Series-Type Tests. The most important series is undoubtedly the geometric series. Several standard tests are basically comparisons to geometric series.
THEOREM 4.10 (Root Test). Suppose $\sum a_n$ is a positive series and
$$\rho = \limsup a_n^{1/n}.$$
If $\rho < 1$, then $\sum a_n$ converges. If $\rho > 1$, then $\sum a_n$ diverges.
PROOF. First, suppose $\rho < 1$ and $r \in (\rho, 1)$. There is an $N \in \mathbb{N}$ so that $a_n^{1/n} < r$ for all $n \ge N$. This is the same as $a_n < r^n$ for all $n \ge N$. Using this, it follows that when $n \ge N$,
$$\sum_{k=1}^{n} a_k = \sum_{k=1}^{N-1} a_k + \sum_{k=N}^{n} a_k < \sum_{k=1}^{N-1} a_k + \sum_{k=N}^{n} r^k < \sum_{k=1}^{N-1} a_k + \frac{r^N}{1 - r}.$$
This shows the partial sums of $\sum a_n$ are bounded. Therefore, it must converge.
If $\rho > 1$, there is an increasing sequence of integers $k_n \to \infty$ such that $a_{k_n}^{1/k_n} > 1$ for all $n \in \mathbb{N}$. This shows $a_{k_n} > 1$ for all $n \in \mathbb{N}$. By Corollary 4.3, $\sum a_n$ diverges.
EXAMPLE 4.9. For any $x \in \mathbb{R}$, the series $\sum |x^n|/n!$ converges. To see this, note that according to Exercise 3.3.7,
$$\left(\frac{|x^n|}{n!}\right)^{1/n} = \frac{|x|}{(n!)^{1/n}} \to 0 < 1.$$
Applying the Root Test shows the series converges.
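The decay of the $n$th roots in Example 4.9 can be observed numerically. The sketch below works with logarithms (through `math.lgamma`, which returns $\ln \Gamma$, so `lgamma(n + 1)` is $\ln n!$) to avoid overflow; the function name is illustrative.

```python
import math

def nth_root_of_term(x, n):
    """Return (|x|**n / n!)**(1/n), computed via logarithms to avoid overflow."""
    if x == 0:
        return 0.0
    log_term = n * math.log(abs(x)) - math.lgamma(n + 1)  # ln(|x|^n / n!)
    return math.exp(log_term / n)

# For fixed x the roots shrink toward 0, so the limsup in the Root Test is 0 < 1.
roots = [nth_root_of_term(10.0, n) for n in (100, 1000, 10000)]
```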
EXAMPLE 4.10. Consider the p-series $\sum 1/n$ and $\sum 1/n^2$. The first diverges and the second converges. Since $n^{1/n} \to 1$ and $n^{2/n} \to 1$, it can be seen that when $\rho = 1$, the Root Test is inconclusive.
THEOREM 4.11 (Ratio Test). Suppose $\sum a_n$ is a positive series. Let
$$r = \liminf \frac{a_{n+1}}{a_n} \le \limsup \frac{a_{n+1}}{a_n} = R.$$
If $R < 1$, then $\sum a_n$ converges. If $r > 1$, then $\sum a_n$ diverges.
PROOF. First, suppose $R < 1$ and $\rho \in (R, 1)$. There exists $N \in \mathbb{N}$ such that $a_{n+1}/a_n < \rho$ whenever $n \ge N$. This implies $a_{n+1} < \rho a_n$ whenever $n \ge N$. From this it's easy to prove by induction that $a_{N+m} < \rho^m a_N$ whenever $m \in \mathbb{N}$. It follows that, for $n > N$,
$$\sum_{k=1}^{n} a_k = \sum_{k=1}^{N} a_k + \sum_{k=N+1}^{n} a_k = \sum_{k=1}^{N} a_k + \sum_{k=1}^{n-N} a_{N+k} < \sum_{k=1}^{N} a_k + \sum_{k=1}^{n-N} a_N \rho^k < \sum_{k=1}^{N} a_k + \frac{a_N \rho}{1 - \rho}.$$
Therefore, the partial sums of $\sum a_n$ are bounded, and $\sum a_n$ converges.
If $r > 1$, then choose $N \in \mathbb{N}$ so that $a_{n+1} > a_n$ for all $n \ge N$. It's now apparent that $a_n \not\to 0$, so $\sum a_n$ diverges by Corollary 4.3.
In calculus books, the ratio test usually takes the following simpler form.
COROLLARY 4.12 (Ratio Test). Suppose $\sum a_n$ is a positive series. Let
$$r = \lim_{n\to\infty} \frac{a_{n+1}}{a_n}.$$
If $r < 1$, then $\sum a_n$ converges. If $r > 1$, then $\sum a_n$ diverges.
From a practical viewpoint, the ratio test is often easier to apply than the root
test. But, the root test is actually the stronger of the two in the sense that there
are series for which the ratio test fails, but the root test succeeds. (See Exercise
4.10, for example.) This happens because
$$\liminf \frac{a_{n+1}}{a_n} \le \liminf a_n^{1/n} \le \limsup a_n^{1/n} \le \limsup \frac{a_{n+1}}{a_n}. \tag{4.9}$$
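To see this, note the middle inequality is always true. To prove the right-hand inequality, choose $r > \limsup a_{n+1}/a_n$. It suffices to show $\limsup a_n^{1/n} \le r$. As in the proof of the ratio test, $a_{n+k} < r^k a_n$ for all sufficiently large $n$. This implies
$$a_{n+k} < r^{n+k} \frac{a_n}{r^n},$$
which leads to
$$a_{n+k}^{1/(n+k)} < r \left(\frac{a_n}{r^n}\right)^{1/(n+k)}.$$
Letting $k \to \infty$ with $n$ fixed shows $\limsup a_n^{1/n} \le r$. The left-hand inequality is proved similarly.
THEOREM 4.13 (Kummer's Test). Suppose $\sum a_n$ is a positive series, $p_n$ is a sequence of positive numbers and
$$\alpha = \liminf \left(p_n \frac{a_n}{a_{n+1}} - p_{n+1}\right) \le \limsup \left(p_n \frac{a_n}{a_{n+1}} - p_{n+1}\right) = \beta.$$
If $\alpha > 0$, then $\sum a_n$ converges. If $\sum 1/p_n$ diverges and $\beta < 0$, then $\sum a_n$ diverges.
PROOF. Let $s_n = \sum_{k=1}^{n} a_k$, suppose $\alpha > 0$ and choose $r \in (0, \alpha)$. There must be an $N > 1$ such that
$$p_n \frac{a_n}{a_{n+1}} - p_{n+1} > r, \quad \forall n \ge N.$$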
Rearranging this gives
$$p_n a_n - p_{n+1} a_{n+1} > r a_{n+1}, \quad \forall n \ge N. \tag{4.11}$$
For $M > N$, (4.11) implies
$$\sum_{n=N}^{M} \left(p_n a_n - p_{n+1} a_{n+1}\right) > \sum_{n=N}^{M} r a_{n+1}$$
$$p_N a_N - p_{M+1} a_{M+1} > r (s_{M+1} - s_N)$$
$$p_N a_N - p_{M+1} a_{M+1} + r s_N > r s_{M+1}$$
$$\frac{p_N a_N + r s_N}{r} > s_{M+1}$$
Since $N$ is fixed, the left side is an upper bound for $s_{M+1}$, and it follows that $\sum a_n$ converges.
Next suppose $\sum 1/p_n$ diverges and $\beta < 0$. There must be an $N \in \mathbb{N}$ such that
$$p_n \frac{a_n}{a_{n+1}} - p_{n+1} < 0, \quad \forall n \ge N.$$
This implies
$$p_n a_n < p_{n+1} a_{n+1}, \quad \forall n \ge N.$$
Therefore, $p_n a_n > p_N a_N$ whenever $n > N$ and
$$a_n > p_N a_N \frac{1}{p_n}, \quad \forall n > N.$$
Because $N$ is fixed and $\sum 1/p_n$ diverges, the Comparison Test shows $\sum a_n$ diverges.
Kummer's test is powerful. In fact, it can be shown that, given any positive series, a judicious choice of the sequence $p_n$ can always be made to determine whether it converges. (See Exercise 4.17, [20] and [19].) But, as stated, Kummer's test is not very useful because choosing $p_n$ for a given series is often difficult. Experience has led to some standard choices that work with large classes of series. For example, Exercise 4.9 asks you to prove the choice $p_n = 1$ for all $n$ reduces Kummer's test to the standard ratio test. Other useful choices are shown in the following theorems.
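THEOREM 4.14 (Raabe's Test). Let $\sum a_n$ be a positive series such that $a_n > 0$ for all $n$. Define
$$\alpha = \liminf_{n\to\infty} n \left(\frac{a_n}{a_{n+1}} - 1\right) \le \limsup_{n\to\infty} n \left(\frac{a_n}{a_{n+1}} - 1\right) = \beta.$$
If $\alpha > 1$, then $\sum a_n$ converges. If $\beta < 1$, then $\sum a_n$ diverges.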
PROOF. Let $p_n = n$ in Kummer's test, Theorem 4.13.
When Raabe’s test is inconclusive, there are even more delicate tests, such as
the theorem given below.
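THEOREM 4.15 (Bertrand's Test). Let $\sum a_n$ be a positive series such that $a_n > 0$ for all $n$. Define
$$\alpha = \liminf_{n\to\infty} \ln n \left(n \left(\frac{a_n}{a_{n+1}} - 1\right) - 1\right) \le \limsup_{n\to\infty} \ln n \left(n \left(\frac{a_n}{a_{n+1}} - 1\right) - 1\right) = \beta.$$
If $\alpha > 1$, then $\sum a_n$ converges. If $\beta < 1$, then $\sum a_n$ diverges.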
PROOF. Let $p_n = n \ln n$ in Kummer's test.
EXAMPLE 4.11. Consider the series
$$\sum a_n = \sum \left(\prod_{k=1}^{n} \frac{2k}{2k+1}\right)^{p}. \tag{4.12}$$
It's of interest to know for what values of $p$ it converges.
An easy computation shows that $a_{n+1}/a_n \to 1$, so the ratio test is inconclusive.
Next, try Raabe's test. Manipulating
$$\lim_{n\to\infty} n \left(\frac{a_n}{a_{n+1}} - 1\right) = \lim_{n\to\infty} \frac{\left(\frac{2n+3}{2n+2}\right)^{p} - 1}{\frac{1}{n}}$$
it becomes a 0/0 form and can be evaluated with L'Hospital's rule.²
$$\lim_{n\to\infty} \frac{n^2\, p \left(\frac{3+2n}{2+2n}\right)^{p}}{(1+n)(3+2n)} = \frac{p}{2}.$$
From Raabe's test, Theorem 4.14, it follows that the series converges when $p > 2$ and diverges when $p < 2$. Raabe's test is inconclusive when $p = 2$.
² See §5.2.
Now, suppose $p = 2$. Consider
$$\lim_{n\to\infty} \ln n \left(n \left(\frac{a_n}{a_{n+1}} - 1\right) - 1\right) = -\lim_{n\to\infty} \ln n\, \frac{4 + 3n}{4(1+n)^2} = 0$$
and Bertrand's test, Theorem 4.15, shows divergence.
The series (4.12) converges only when $p > 2$.
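The Raabe limit $p/2$ found in Example 4.11 can be sanity-checked numerically. The sketch below uses the ratio $a_n/a_{n+1} = ((2n+3)/(2n+2))^p$ from the example; the function name is illustrative and floating-point arithmetic is assumed to be accurate enough at this scale.

```python
def raabe_quantity(p, n):
    """Return n*(a_n/a_{n+1} - 1) for the series (4.12)."""
    ratio = ((2 * n + 3) / (2 * n + 2)) ** p   # a_n / a_{n+1}
    return n * (ratio - 1)

# For large n the quantity should be close to p/2.
samples = {p: raabe_quantity(p, 10**6) for p in (1, 2, 3, 4)}
```

Values of $p$ with $p/2 > 1$ fall on the convergent side of Raabe's test, and those with $p/2 < 1$ on the divergent side.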
3. Absolute and Conditional Convergence
The tests given above are for the restricted case when a series has positive terms. If the stipulation that the series be positive is thrown out, things become considerably more complicated. But, as is often the case in mathematics, some problems can be attacked by reducing them to previously solved cases. The following definition and theorem show how to do this for some special cases.
DEFINITION 4.16. Let $\sum a_n$ be a series. If $\sum |a_n|$ converges, then $\sum a_n$ is absolutely convergent. If it is convergent, but not absolutely convergent, then it is conditionally convergent.
Since $\sum |a_n|$ is a positive series, the preceding tests can be used to determine its convergence. The following theorem shows that this is also enough for convergence of the original series.
THEOREM 4.17. If $\sum a_n$ is absolutely convergent, then it is convergent.
PROOF. Let $\varepsilon > 0$. Theorem 4.4 yields an $N \in \mathbb{N}$ such that when $n \ge m \ge N$,
$$\varepsilon > \sum_{k=m}^{n} |a_k| \ge \left|\sum_{k=m}^{n} a_k\right| \ge 0.$$
Another application of Theorem 4.4 finishes the proof.
EXAMPLE 4.12. The series $\sum (-1)^{n+1}/n$ is called the alternating harmonic series. (See Figure 4.3.) Since the harmonic series diverges, we see the alternating harmonic series is not absolutely convergent.
On the other hand, if $s_n = \sum_{k=1}^{n} (-1)^{k+1}/k$, then
$$s_{2n} = \sum_{k=1}^{n} \left(\frac{1}{2k-1} - \frac{1}{2k}\right) = \sum_{k=1}^{n} \frac{1}{2k(2k-1)}$$
is a positive series that converges by the Comparison Test. Since $|s_{2n} - s_{2n-1}| = 1/2n \to 0$, it's clear that $s_{2n-1}$ must also converge to the same limit. Therefore, $s_n$ converges and $\sum (-1)^{n+1}/n$ is conditionally convergent. (Another way to show the alternating harmonic series converges is shown in Example 3.18.)
To summarize: absolute convergence implies convergence, but convergence
does not imply absolute convergence.
There are a few tests that address conditional convergence. Following are the best known.
THEOREM 4.18 (Abel's Test). Let $a_n$ and $b_n$ be sequences satisfying
(a) $s_n = \sum_{k=1}^{n} a_k$ is a bounded sequence.
(b) $b_n \ge b_{n+1}$, $\forall n \in \mathbb{N}$
(c) $b_n \to 0$
Then $\sum a_n b_n$ converges.
FIGURE 4.3. This plot shows the first 35 partial sums of the alternating harmonic series. It can be shown it converges to $\ln 2 \approx 0.6931$, which is the level of the dashed line. Notice how the odd partial sums decrease to $\ln 2$ and the even partial sums increase to $\ln 2$.
To prove this theorem, the following lemma is needed.
LEMMA 4.19 (Summation by Parts). For every pair of sequences $a_n$ and $b_n$,
$$\sum_{k=1}^{n} a_k b_k = b_{n+1} \sum_{k=1}^{n} a_k - \sum_{k=1}^{n} (b_{k+1} - b_k) \sum_{\ell=1}^{k} a_\ell.$$
PROOF. Let $s_0 = 0$ and $s_n = \sum_{k=1}^{n} a_k$ when $n \in \mathbb{N}$. Then
$$\sum_{k=1}^{n} a_k b_k = \sum_{k=1}^{n} (s_k - s_{k-1}) b_k$$
$$= \sum_{k=1}^{n} s_k b_k - \sum_{k=1}^{n} s_{k-1} b_k$$
$$= \sum_{k=1}^{n} s_k b_k - \left(\sum_{k=1}^{n} s_k b_{k+1} - s_n b_{n+1}\right)$$
$$= b_{n+1} \sum_{k=1}^{n} a_k - \sum_{k=1}^{n} (b_{k+1} - b_k) \sum_{\ell=1}^{k} a_\ell.$$
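The summation by parts identity of Lemma 4.19 is purely algebraic, so it can be tested exactly on sample data. The sketch below uses 1-indexed lists (index 0 is a placeholder) and the assumed sample sequences $a_k = (-1)^{k+1}$ and $b_k = 1/k$; the function names are illustrative.

```python
from fractions import Fraction

def direct_sum(a, b, n):
    """Left side of the identity: sum of a_k * b_k for k = 1..n."""
    return sum(a[k] * b[k] for k in range(1, n + 1))

def summation_by_parts(a, b, n):
    """Right side: b_{n+1} * s_n - sum of (b_{k+1} - b_k) * s_k, with s_k the partial sums of a."""
    s = [Fraction(0)]                       # s[0] = 0
    for k in range(1, n + 1):
        s.append(s[-1] + a[k])              # s[k] = a_1 + ... + a_k
    return b[n + 1] * s[n] - sum((b[k + 1] - b[k]) * s[k] for k in range(1, n + 1))

# Sample data; index 0 is unused so that indices match the text.
a = [None] + [Fraction((-1) ** (k + 1)) for k in range(1, 12)]
b = [None] + [Fraction(1, k) for k in range(1, 13)]

for n in range(1, 12):
    assert direct_sum(a, b, n) == summation_by_parts(a, b, n)
```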
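PROOF. To prove the theorem, suppose $\left|\sum_{k=1}^{n} a_k\right| < M$ for all $n \in \mathbb{N}$. Let $\varepsilon > 0$ and choose $N \in \mathbb{N}$ such that $b_N < \varepsilon/2M$. If $N \le m < n$, use Lemma 4.19 to write
$$\left|\sum_{\ell=m}^{n} a_\ell b_\ell\right| = \left|\sum_{\ell=1}^{n} a_\ell b_\ell - \sum_{\ell=1}^{m-1} a_\ell b_\ell\right|$$
$$= \left| b_{n+1} \sum_{\ell=1}^{n} a_\ell - \sum_{\ell=1}^{n} (b_{\ell+1} - b_\ell) \sum_{k=1}^{\ell} a_k - \left( b_m \sum_{\ell=1}^{m-1} a_\ell - \sum_{\ell=1}^{m-1} (b_{\ell+1} - b_\ell) \sum_{k=1}^{\ell} a_k \right) \right|$$
Using (a) gives
$$\le (b_{n+1} + b_m) M + M \sum_{\ell=m}^{n} |b_{\ell+1} - b_\ell|$$
Now, use (b) to see
$$= (b_{n+1} + b_m) M + M \sum_{\ell=m}^{n} (b_\ell - b_{\ell+1})$$
and then telescope the sum to arrive at
$$= (b_{n+1} + b_m) M + M (b_m - b_{n+1}) = 2M b_m \le 2M b_N < 2M \frac{\varepsilon}{2M} = \varepsilon.$$
This shows $\sum a_\ell b_\ell$ satisfies Theorem 4.4, and therefore converges.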
There’s one special case of this theorem that’s most often seen in calculus
texts.
COROLLARY 4.20 (Alternating Series Test). If $c_n$ decreases to 0, then the series $\sum (-1)^{n+1} c_n$ converges. Moreover, if $s_n = \sum_{k=1}^{n} (-1)^{k+1} c_k$ and $s_n \to s$, then $|s_n - s| \le c_{n+1}$.
PROOF. Let $a_n = (-1)^{n+1}$ and $b_n = c_n$ in Theorem 4.18 to see the series converges to some number $s$. For $n \in \mathbb{N}$, let $s_n = \sum_{k=1}^{n} (-1)^{k+1} c_k$ and $s_0 = 0$. Since
$$s_{2n} - s_{2n+2} = -c_{2n+1} + c_{2n+2} \le 0 \quad\text{and}\quad s_{2n+1} - s_{2n+3} = c_{2n+2} - c_{2n+3} \ge 0,$$
it must be that $s_{2n} \uparrow s$ and $s_{2n+1} \downarrow s$. For all $n \in \omega$,
$$0 \le s_{2n+1} - s \le s_{2n+1} - s_{2n+2} = c_{2n+2} \quad\text{and}\quad 0 \le s - s_{2n} \le s_{2n+1} - s_{2n} = c_{2n+1}.$$
This shows $|s_n - s| \le c_{n+1}$ for all $n$.
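The error estimate in Corollary 4.20 can be observed on the alternating harmonic series, whose sum is $\ln 2$ according to the caption of Figure 4.3. The sketch below is a floating-point check with illustrative names, not a proof.

```python
import math

def alternating_partial_sum(c, n):
    """Return the sum of (-1)**(k+1) * c(k) for k = 1..n."""
    return sum((-1) ** (k + 1) * c(k) for k in range(1, n + 1))

def c(k):
    """Terms of the harmonic series, decreasing to 0."""
    return 1.0 / k

s = math.log(2.0)   # limit of the alternating harmonic series

for n in (1, 2, 10, 101, 1000):
    # The distance from the nth partial sum to the limit is at most the next term.
    assert abs(alternating_partial_sum(c, n) - s) <= c(n + 1)
```

The bound is what makes alternating series convenient for approximation: stopping after $n$ terms costs no more than the first omitted term.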
F IGURE 4.4. Here is a more whimsical way to visualize the partial
sums of the alternating harmonic series.
A series such as that in Corollary 4.20 is called an alternating series. More formally, if $a_n$ is a sequence such that $a_n / a_{n+1} < 0$ for all $n$, then $\sum a_n$ is an alternating series. Informally, it just means the series alternates between positive and negative terms.
E XAMPLE 4.13. Corollary 4.20 provides another way to prove the alternating
harmonic series in Example 4.12 converges. Figures 4.3 and 4.4 show how the
partial sums bounce up and down across the sum of the series.
4. Rearrangements of Series
This is an advanced section that can be omitted.
We want to use our standard intuition about adding lists of numbers when
working with series. But, this intuition has been formed by working with finite
sums and does not always work with series.
EXAMPLE 4.14. Suppose $\sum (-1)^{n+1}/n = \gamma$ so that $\sum (-1)^{n+1} 2/n = 2\gamma$. It's easy to show $\gamma > 1/2$. Consider the following calculation.
$$2\gamma = \sum (-1)^{n+1} \frac{2}{n} = 2 - 1 + \frac{2}{3} - \frac{1}{2} + \frac{2}{5} - \frac{1}{3} + \cdots$$
Rearrange and regroup.
$$= (2 - 1) - \frac{1}{2} + \left(\frac{2}{3} - \frac{1}{3}\right) - \frac{1}{4} + \left(\frac{2}{5} - \frac{1}{5}\right) - \frac{1}{6} + \cdots$$
$$= 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$$
$$= \gamma$$
So, $\gamma = 2\gamma$ with $\gamma \ne 0$. Obviously, rearranging and regrouping of this series is a questionable thing to do.
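The effect in Example 4.14 shows up numerically: summed in its original order the doubled series approaches $2\gamma$, while the rearranged order approaches $\gamma$. The sketch below takes $\gamma = \ln 2$ from the caption of Figure 4.3 and uses floating point; the function names are illustrative.

```python
import math

def doubled_series_partial_sum(n_terms):
    """Partial sum of sum (-1)**(n+1) * 2/n in its original order."""
    return sum((-1) ** (n + 1) * 2.0 / n for n in range(1, n_terms + 1))

def rearranged_partial_sum(n_blocks):
    """Partial sum of the rearrangement, one block at a time.

    Block k uses the positive term 2/(2k-1) and the two negative
    terms -1/(2k-1) and -1/(2k), so every term of the doubled series
    appears exactly once.
    """
    total = 0.0
    for k in range(1, n_blocks + 1):
        total += 2.0 / (2 * k - 1) - 1.0 / (2 * k - 1) - 1.0 / (2 * k)
    return total

original = doubled_series_partial_sum(30000)    # close to 2 ln 2
rearranged = rearranged_partial_sum(10000)      # close to ln 2
```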
In order to carefully consider the problem of rearranging a series, a precise
definition is needed.
DEFINITION 4.21. Let $\sigma : \mathbb{N} \to \mathbb{N}$ be a bijection and $\sum a_n$ be a series. The new series $\sum a_{\sigma(n)}$ is a rearrangement of the original series.
The problem with Example 4.14 is that the series is conditionally convergent.
Such examples cannot happen with absolutely convergent series. For the most
part, absolutely convergent series behave as we are intuitively led to expect.
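THEOREM 4.22. If $\sum a_n$ is absolutely convergent and $\sum a_{\sigma(n)}$ is a rearrangement of $\sum a_n$, then $\sum a_{\sigma(n)} = \sum a_n$.
PROOF. Let $\sum a_n = s$ and $\varepsilon > 0$. Choose $N \in \mathbb{N}$ so that $N \le m < n$ implies $\sum_{k=m}^{n} |a_k| < \varepsilon$. Choose $M \ge N$ such that
$$\{1, 2, \ldots, N\} \subset \{\sigma(1), \sigma(2), \ldots, \sigma(M)\}.$$
If $P > M$, then
$$\left|\sum_{k=1}^{P} a_k - \sum_{k=1}^{P} a_{\sigma(k)}\right| \le \sum_{k=N+1}^{\infty} |a_k| \le \varepsilon$$
and both series converge to the same number.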
When a series is conditionally convergent, the result of a rearrangement is
hard to predict. This is shown by the following surprising theorem.
THEOREM 4.23 (Riemann Rearrangement). If $\sum a_n$ is conditionally convergent and $c \in \mathbb{R} \cup \{-\infty, \infty\}$, then there is a rearrangement $\sigma$ such that $\sum a_{\sigma(n)} = c$.
To prove this, the following lemma is needed.
LEMMA 4.24. If $\sum a_n$ is conditionally convergent and
$$b_n = \begin{cases} a_n, & a_n > 0 \\ 0, & a_n \le 0 \end{cases} \quad\text{and}\quad c_n = \begin{cases} -a_n, & a_n < 0 \\ 0, & a_n \ge 0 \end{cases},$$
then both $\sum b_n$ and $\sum c_n$ diverge.
PROOF. Suppose $\sum b_n$ converges. By assumption, $\sum a_n$ converges, so Theorem 4.2 implies
$$\sum c_n = \sum b_n - \sum a_n$$
converges. Another application of Theorem 4.2 shows
$$\sum |a_n| = \sum b_n + \sum c_n$$
converges. This is a contradiction of the assumption that $\sum a_n$ is conditionally convergent, so $\sum b_n$ cannot converge.
A similar contradiction arises under the assumption that $\sum c_n$ converges.
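The idea behind Theorem 4.23 can be tried numerically with a greedy rule: add unused positive terms while the running total is at or below the target, and unused negative terms while it is above. The sketch below applies this to the alternating harmonic series. It follows the spirit of the construction rather than its exact bookkeeping, and the function name and step counts are assumptions chosen for illustration.

```python
def greedy_rearrangement_sum(target, steps):
    """Sum `steps` terms of a rearranged alternating harmonic series aimed at `target`.

    Positive terms 1, 1/3, 1/5, ... and negative terms -1/2, -1/4, ...
    are each used in their original order, exactly once.
    """
    pos_denominator = 1   # next unused positive term is 1/pos_denominator
    neg_denominator = 2   # next unused negative term is -1/neg_denominator
    total = 0.0
    for _ in range(steps):
        if total <= target:
            total += 1.0 / pos_denominator
            pos_denominator += 2
        else:
            total -= 1.0 / neg_denominator
            neg_denominator += 2
    return total

# Different targets, same terms: the rearranged sums follow the target.
near_three_halves = greedy_rearrangement_sum(1.5, 100000)
near_minus_one = greedy_rearrangement_sum(-1.0, 100000)
```

The rule can keep going only because both the positive part and the negative part diverge, which is what Lemma 4.24 guarantees.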
PROOF. (Theorem 4.23) Let $b_n$ and $c_n$ be as in Lemma 4.24 and define the subsequence $a_n^+$ of $b_n$ by removing those terms for which $b_n = 0$ and $a_n \ne 0$. Define the subsequence $a_n^-$ of $c_n$ by removing those terms for which $c_n = 0$. The series $\sum_{n=1}^{\infty} a_n^+$ and $\sum_{n=1}^{\infty} a_n^-$ are still divergent because only terms equal to zero have been removed from $b_n$ and $c_n$.
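Now, let $c \in \mathbb{R}$ and $m_0 = n_0 = 0$. According to Lemma 4.24, we can define the natural numbers
$$m_1 = \min\left\{ n : \sum_{k=1}^{n} a_k^+ > c \right\} \quad\text{and}\quad n_1 = \min\left\{ n : \sum_{k=1}^{m_1} a_k^+ - \sum_{\ell=1}^{n} a_\ell^- < c \right\}.$$
If $m_p$ and $n_p$ have been chosen for some $p \in \mathbb{N}$, then define
$$m_{p+1} = \min\left\{ n : \sum_{k=0}^{p-1} \left( \sum_{\ell=m_k+1}^{m_{k+1}} a_\ell^+ - \sum_{\ell=n_k+1}^{n_{k+1}} a_\ell^- \right) + \sum_{\ell=m_p+1}^{n} a_\ell^+ > c \right\}$$
and
$$n_{p+1} = \min\left\{ n : \sum_{k=0}^{p-1} \left( \sum_{\ell=m_k+1}^{m_{k+1}} a_\ell^+ - \sum_{\ell=n_k+1}^{n_{k+1}} a_\ell^- \right) + \sum_{\ell=m_p+1}^{m_{p+1}} a_\ell^+ - \sum_{\ell=n_p+1}^{n} a_\ell^- < c \right\}.$$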
Consider the series
$$a_1^+ + a_2^+ + \cdots + a_{m_1}^+ - a_1^- - a_2^- - \cdots - a_{n_1}^- \tag{4.13}$$
$$+\, a_{m_1+1}^+ + a_{m_1+2}^+ + \cdots + a_{m_2}^+ - a_{n_1+1}^- - a_{n_1+2}^- - \cdots - a_{n_2}^-$$
$$+\, a_{m_2+1}^+ + a_{m_2+2}^+ + \cdots + a_{m_3}^+ - a_{n_2+1}^- - a_{n_2+2}^- - \cdots - a_{n_3}^-$$
$$+ \cdots$$
It is clear this series is a rearrangement of
n p were chosen guarantee that
p−1
m k+1
a+ −
0<
k=0
=m k +1
and
p
0 0, ∀n ∈ N and a n → 0;
(b) $B_n = \sum_{k=1}^{n} b_k$ is a bounded sequence; and,
(c) $\sum_{n=1}^{\infty} a_n b_n$ diverges.
4.15. Let $a_n$ be a sequence such that $a_{2n} \to A$ and $a_{2n} - a_{2n-1} \to 0$. Then $a_n \to A$.
4.16. Prove Bertrand’s test, Theorem 4.15.
4.17. Let $\sum a_n$ be a positive series. Prove that $\sum a_n$ converges if and only if there is a sequence of positive numbers $p_n$ and $\alpha > 0$ such that
$$\lim_{n\to\infty} \left( p_n \frac{a_n}{a_{n+1}} - p_{n+1} \right) = \alpha.$$
(Hint: If $s = \sum a_n$ and $s_n = \sum_{k=1}^{n} a_k$, then let $p_n = (s - s_n)/a_n$.)
4.18. Prove that $\sum_{n=0}^{\infty} \frac{x^n}{n!}$ converges for all $x \in \mathbb{R}$.
4.19. Find all values of $x$ for which $\sum_{k=0}^{\infty} k^2 (x+3)^k$ converges.
4.20. For what values of $x$ does the series
$$\sum_{n=1}^{\infty} \frac{(-1)^{n+1} x^{2n-1}}{2n-1} \tag{4.19}$$
converge?
4.21. For what values of $x$ does $\sum_{n=1}^{\infty} \frac{(x+3)^n}{n 4^n}$ converge absolutely, converge conditionally or diverge?
4.22. For what values of $x$ does $\sum_{n=1}^{\infty} \frac{n+6}{n^2 (x-1)^n}$ converge absolutely, converge conditionally or diverge?
4.23. For what positive values of $\alpha$ does $\sum_{n=1}^{\infty} \alpha^n n^\alpha$ converge?
4.24. Prove that $\sum \cos\frac{n\pi}{3} \sin\frac{\pi}{n}$ converges.
4.25. For a series $\sum_{n=1}^{\infty} a_n$ with partial sums $s_n$, define
$$\sigma_n = \frac{1}{n} \sum_{k=1}^{n} s_k.$$
Prove that if $\sum_{n=1}^{\infty} a_n = s$, then $\sigma_n \to s$. Find an example where $\sigma_n$ converges, but $\sum_{n=1}^{\infty} a_n$ does not. (If $\sigma_n$ converges, the sequence is said to be Cesàro summable.)
4.26. If $a_n$ is a sequence with a subsequence $b_n$, then $\sum_{n=1}^{\infty} b_n$ is a subseries of $\sum_{n=1}^{\infty} a_n$. Prove that if every subseries of $\sum_{n=1}^{\infty} a_n$ converges, then $\sum_{n=1}^{\infty} a_n$ converges absolutely.
4.27. If $\sum_{n=1}^{\infty} a_n$ is a convergent positive series, then so is $\sum_{n=1}^{\infty} a_n^2$. Give an example to show the converse is not true.
4.28. Prove or give a counterexample: If $\sum_{n=1}^{\infty} a_n$ converges, then $\sum_{n=1}^{\infty} a_n^2$ converges.
4.29. For what positive values of $\alpha$ does $\sum_{n=1}^{\infty} \alpha^n n^\alpha$ converge?
4.30. If $a_n \ge 0$ for all $n \in \mathbb{N}$ and there is a $p > 1$ such that $\lim_{n\to\infty} n^p a_n$ exists and is finite, then $\sum_{n=1}^{\infty} a_n$ converges. Is this true for $p = 1$?
4.31. Finish the proof of Theorem 4.23.
4.32. Leonhard Euler started with the equation
$$\frac{x}{x-1} + \frac{x}{1-x} = 0,$$
transformed it to
$$\frac{1}{1 - 1/x} + \frac{x}{1-x} = 0,$$
and then used geometric series to write it as
$$\cdots + \frac{1}{x^2} + \frac{1}{x} + 1 + x + x^2 + x^3 + \cdots = 0. \tag{4.23}$$
Show how Euler did his calculation and find his mistake.
4.33. Let $\sum a_n$ be a conditionally convergent series and $c \in \mathbb{R} \cup \{-\infty, \infty\}$. There is a sequence $b_n$ such that $|b_n| = 1$ for all $n \in \mathbb{N}$ and $\sum a_n b_n = c$.
CHAPTER 5
The Topology of R
1. Open and Closed Sets
DEFINITION 5.1. A set $G \subset \mathbb{R}$ is open if for every $x \in G$ there is an $\varepsilon > 0$ such that $(x - \varepsilon, x + \varepsilon) \subset G$. A set $F \subset \mathbb{R}$ is closed if $F^c$ is open.
The idea is that about every point of an open set, there is some room inside the set on both sides of the point. It is easy to see that any open interval $(a, b)$ is an open set because if $a < x < b$ and $\varepsilon = \min\{x - a, b - x\}$, then $(x - \varepsilon, x + \varepsilon) \subset (a, b)$. It's obvious $\mathbb{R}$ itself is an open set.
On the other hand, any closed interval $[a, b]$ is a closed set. To see this, it must be shown its complement is open. Let $x \in [a, b]^c$ and $\varepsilon = \min\{|x - a|, |x - b|\}$. Then $(x - \varepsilon, x + \varepsilon) \cap [a, b] = \emptyset$, so $(x - \varepsilon, x + \varepsilon) \subset [a, b]^c$. Therefore, $[a, b]^c$ is open, and its complement, namely $[a, b]$, is closed.
A singleton set $\{a\}$ is closed. To see this, suppose $x \ne a$ and $\varepsilon = |x - a|$. Then $a \notin (x - \varepsilon, x + \varepsilon)$, and $\{a\}^c$ must be open. The definition of a closed set implies $\{a\}$ is closed.
Open and closed sets can get much more complicated than the intervals examined above. For example, similar arguments show $\mathbb{Z}$ is a closed set and $\mathbb{Z}^c$ is open. Both have an infinite number of disjoint pieces.
A common mistake is to assume all sets are either open or closed. Most sets are neither open nor closed. For example, if $S = [a, b)$ for some numbers $a < b$, then no matter the size of $\varepsilon > 0$, neither $(a - \varepsilon, a + \varepsilon)$ nor $(b - \varepsilon, b + \varepsilon)$ is contained in $S$ or $S^c$.
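THEOREM 5.2.
(a) Both $\emptyset$ and $\mathbb{R}$ are open.
(b) If $\{G_\lambda : \lambda \in \Lambda\}$ is a collection of open sets, then $\bigcup_{\lambda \in \Lambda} G_\lambda$ is open.
(c) If $\{G_k : 1 \le k \le n\}$ is a finite collection of open sets, then $\bigcap_{k=1}^{n} G_k$ is open.
PROOF.
(a) $\emptyset$ is open vacuously. $\mathbb{R}$ is obviously open.
(b) If $x \in \bigcup_{\lambda \in \Lambda} G_\lambda$, then there is a $\lambda_x \in \Lambda$ such that $x \in G_{\lambda_x}$. Since $G_{\lambda_x}$ is open, there is an $\varepsilon > 0$ such that $x \in (x - \varepsilon, x + \varepsilon) \subset G_{\lambda_x} \subset \bigcup_{\lambda \in \Lambda} G_\lambda$. This shows $\bigcup_{\lambda \in \Lambda} G_\lambda$ is open.
(c) If $x \in \bigcap_{k=1}^{n} G_k$, then $x \in G_k$ for $1 \le k \le n$. For each $G_k$ there is an $\varepsilon_k$ such that $(x - \varepsilon_k, x + \varepsilon_k) \subset G_k$. Let $\varepsilon = \min\{\varepsilon_k : 1 \le k \le n\}$. Then $(x - \varepsilon, x + \varepsilon) \subset G_k$ for $1 \le k \le n$, so $(x - \varepsilon, x + \varepsilon) \subset \bigcap_{k=1}^{n} G_k$. Therefore $\bigcap_{k=1}^{n} G_k$ is open.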
The word finite in part (c) of the theorem is important because the intersection of an infinite number of open sets need not be open. For example, let $G_n = (-1/n, 1/n)$ for $n \in \mathbb{N}$. Then each $G_n$ is open, but $\bigcap_{n \in \mathbb{N}} G_n = \{0\}$ is not.
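The claim that the intersection of the sets $G_n = (-1/n, 1/n)$ is $\{0\}$ rests on the fact that every $x \ne 0$ is excluded from some $G_n$, while 0 lies in all of them. The sketch below finds such an index; the function names are illustrative.

```python
from fractions import Fraction
import math

def excluding_index(x):
    """Return an n with x not in G_n = (-1/n, 1/n); requires x != 0."""
    if x == 0:
        raise ValueError("0 belongs to every G_n")
    # n > 1/|x| forces 1/n < |x|, so x falls outside (-1/n, 1/n).
    return math.floor(1 / abs(x)) + 1

def in_G(x, n):
    """Return True when x lies in the open interval (-1/n, 1/n)."""
    return -Fraction(1, n) < x < Fraction(1, n)

for x in (Fraction(1, 3), Fraction(-1, 1000), Fraction(5)):
    n = excluding_index(x)
    assert not in_G(x, n)   # x is missing from G_n, hence from the intersection
    assert in_G(0, n)       # 0 is still inside
```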
Applying DeMorgan’s laws to the parts of Theorem 5.2 gives the following.
COROLLARY 5.3.
(a) Both $\emptyset$ and $\mathbb{R}$ are closed.
(b) If $\{F_\lambda : \lambda \in \Lambda\}$ is a collection of closed sets, then $\bigcap_{\lambda \in \Lambda} F_\lambda$ is closed.
(c) If $\{F_k : 1 \le k \le n\}$ is a finite collection of closed sets, then $\bigcup_{k=1}^{n} F_k$ is closed.
Surprisingly, $\emptyset$ and $\mathbb{R}$ are both open and closed. They are the only subsets of $\mathbb{R}$ with this dual personality. Sets that are both open and closed are sometimes said to be clopen.
1.1. Topological Spaces. The preceding theorem provides the starting point
for a fundamental area of mathematics called topology. The properties of the
open sets of R motivated the following definition.
DEFINITION 5.4. For $X$ a set, not necessarily a subset of $\mathbb{R}$, let $\mathcal{T} \subset \mathcal{P}(X)$. The set $\mathcal{T}$ is called a topology on $X$ if it satisfies the following three conditions.
(a) The union of any collection of sets from $\mathcal{T}$ is also in $\mathcal{T}$.
(b) The intersection of any finite collection of sets from $\mathcal{T}$ is also in $\mathcal{T}$.
(c) $X \in \mathcal{T}$ and $\emptyset \in \mathcal{T}$.
The pair $(X, \mathcal{T})$ is called a topological space. The elements of $\mathcal{T}$ are the open sets of the topological space. The closed sets of the topological space are those sets whose complements are open.
If $\mathcal{O} = \{G \subset \mathbb{R} : G \text{ is open}\}$, then Theorem 5.2 shows $(\mathbb{R}, \mathcal{O})$ is a topological space and $\mathcal{O}$ is called the standard topology on $\mathbb{R}$. While the standard topology is the most widely used topology, there are many other possible topologies on $\mathbb{R}$. For example, $\mathcal{R} = \{(a, \infty) : a \in \mathbb{R}\} \cup \{\mathbb{R}, \emptyset\}$ is a topology on $\mathbb{R}$ called the right ray topology. The collection $\mathcal{F} = \{S \subset \mathbb{R} : S^c \text{ is finite}\} \cup \{\emptyset\}$ is called the finite complement topology. The study of topologies is a huge subject, further discussion of which would take us too far afield. There are many fine books on the subject ([16]) to which one can refer.
1.2. Limit Points and Closure.
DEFINITION 5.5. $x_0$ is a limit point¹ of $S \subset \mathbb{R}$ if for every $\varepsilon > 0$,
$$(x_0 - \varepsilon, x_0 + \varepsilon) \cap S \setminus \{x_0\} \ne \emptyset.$$
The derived set of $S$ is
$$S' = \{x : x \text{ is a limit point of } S\}.$$
A point $x_0 \in S \setminus S'$ is an isolated point of $S$.
¹ This use of the term limit point is not universal. Some authors use the term accumulation point. Others use condensation point, although this is more often used for those cases when every neighborhood of $x_0$ intersects $S$ in an uncountable set.
Notice that limit points of $S$ need not be elements of $S$, but isolated points of $S$ must be elements of $S$. In a sense, limit points and isolated points are at opposite extremes. The definitions can be restated as follows:
$x_0$ is a limit point of $S$ iff $\forall \varepsilon > 0\ (S \cap (x_0 - \varepsilon, x_0 + \varepsilon) \setminus \{x_0\} \ne \emptyset)$
$x_0$ is an isolated point of $S$ iff $\exists \varepsilon > 0\ (S \cap (x_0 - \varepsilon, x_0 + \varepsilon) = \{x_0\})$
EXAMPLE 5.1. If $S = (0, 1]$, then $S' = [0, 1]$ and $S$ has no isolated points.
EXAMPLE 5.2. If $T = \{1/n : n \in \mathbb{Z} \setminus \{0\}\}$, then $T' = \{0\}$ and all points of $T$ are isolated points of $T$.
THEOREM 5.6. $x_0$ is a limit point of $S$ iff there is a sequence $x_n \in S \setminus \{x_0\}$ such that $x_n \to x_0$.
PROOF. (⇒) For each $n \in \mathbb{N}$ choose $x_n \in S \cap (x_0 - 1/n, x_0 + 1/n) \setminus \{x_0\}$. Then $|x_n - x_0| < 1/n$ for all $n \in \mathbb{N}$, so $x_n \to x_0$.
(⇐) Suppose $x_n$ is a sequence from $S \setminus \{x_0\}$ converging to $x_0$. If $\varepsilon > 0$, the definition of convergence for a sequence yields an $N \in \mathbb{N}$ such that whenever $n \ge N$, then $x_n \in S \cap (x_0 - \varepsilon, x_0 + \varepsilon) \setminus \{x_0\}$. This shows $S \cap (x_0 - \varepsilon, x_0 + \varepsilon) \setminus \{x_0\} \ne \emptyset$, and $x_0$ must be a limit point of $S$.
There is some common terminology making much of this easier to state. If $x_0 \in \mathbb{R}$ and $G$ is an open set containing $x_0$, then $G$ is called a neighborhood of $x_0$. The observations given above can be restated in terms of neighborhoods.
COROLLARY 5.7. Let $S \subset \mathbb{R}$.
(a) $x_0$ is a limit point of $S$ iff every neighborhood of $x_0$ contains an infinite number of points from $S$.
(b) $x_0 \in S$ is an isolated point of $S$ iff there is a neighborhood of $x_0$ containing only a finite number of points from $S$.
Following is a generalization of Theorem 3.16.
THEOREM 5.8 (Bolzano-Weierstrass Theorem). If S ⊂ R is bounded and infinite, then S′ ≠ ∅.
PROOF. For the purposes of this proof, if I = [a, b] is a closed interval, let
I^L = [a, (a + b)/2] be the closed left half of I and I^R = [(a + b)/2, b] be the closed
right half of I.
Suppose S is a bounded and infinite set. The assumption that S is bounded
implies the existence of an interval I 1 = [−B, B] containing S. Since S is infinite,
at least one of the two sets I 1^L ∩ S or I 1^R ∩ S is infinite. Let I 2 be either I 1^L or I 1^R such
that I 2 ∩ S is infinite.
If I n is such that I n ∩ S is infinite, let I n+1 be either I n^L or I n^R, where I n+1 ∩ S is
infinite.
In this way, a nested sequence of intervals, I n for n ∈ N, is defined such that
I n ∩ S is infinite for all n ∈ N and the length of I n is B/2^(n−2) → 0. According to the
Nested Interval Theorem, there is an x 0 ∈ R such that ⋂_{n∈N} I n = {x 0 }.
To see x 0 is a limit point of S, for each n, choose x n ∈ S ∩ I n \ {x 0 }. This is
possible because S ∩ I n is infinite. Since both x n and x 0 lie in I n , it follows that |x n − x 0 | ≤ B/2^(n−2), so x n → x 0 .
Theorem 5.6 shows x 0 ∈ S′.
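The bisection used in the proof can be imitated numerically. The following is a minimal sketch and not part of the proof. It assumes a caller-supplied oracle `has_infinitely_many(a, b)` (a hypothetical helper introduced here) that reports whether [a, b] contains infinitely many points of S, and it is run on the set S = {1/n : n ∈ N} with B = 1.

```python
from fractions import Fraction

def bisect_to_limit_point(has_infinitely_many, bound, steps):
    """Follow the proof's bisection: keep a half whose intersection with S is infinite.

    has_infinitely_many(a, b) is an assumed oracle answering whether the
    closed interval [a, b] contains infinitely many points of S.
    """
    a, b = Fraction(-bound), Fraction(bound)   # I_1 = [-B, B]
    for _ in range(steps):
        mid = (a + b) / 2
        if has_infinitely_many(a, mid):        # the left half I^L works
            b = mid
        else:                                  # otherwise the right half I^R must work
            a = mid
    return a, b

# Oracle for S = {1/n : n in N}: [a, b] contains infinitely many points of S
# exactly when a <= 0 < b, because the points 1/n accumulate only at 0.
def oracle(a, b):
    return a <= 0 < b

lo, hi = bisect_to_limit_point(oracle, bound=1, steps=10)
print(lo, hi)
```

Each pass halves the interval, so after 10 steps its length is 2/2^10, and the limit point 0 stays inside every interval produced.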
THEOREM 5.9. A set S ⊂ R is closed iff it contains all its limit points.
PROOF. (⇒) Suppose S is closed and x 0 is a limit point of S. If x 0 ∉ S, then
S^c open implies the existence of ε > 0 such that (x 0 − ε, x 0 + ε) ∩ S = ∅. This
contradicts the fact that x 0 is a limit point of S. Therefore, x 0 ∈ S, and S contains
all its limit points.
(⇐) Since S contains all its limit points, if x 0 ∉ S, there must exist an ε > 0
such that (x 0 − ε, x 0 + ε) ∩ S = ∅. It follows from this that S^c is open. Therefore S
is closed.
DEFINITION 5.10. The closure of a set S is the set S̄ = S ∪ S′.
For the set S of Example 5.1, S̄ = [0, 1]. In Example 5.2, T̄ = {1/n : n ∈ Z \ {0}} ∪
{0}. According to Theorem 5.9, the closure of any set is a closed set. A useful way
to think about this is that S̄ is the smallest closed set containing S. This is made
more precise in Exercise 5.2.
Following is a generalization of Theorem 3.20.
COROLLARY 5.11. If {F n : n ∈ N} is a nested collection of nonempty closed and
bounded sets, then ⋂_{n∈N} F n ≠ ∅.
PROOF. Form a sequence x n by choosing x n ∈ F n for each n ∈ N. Since the F n
are nested, {x n : n ∈ N} ⊂ F 1 , and the boundedness of F 1 implies x n is a bounded
sequence. An application of Theorem 3.16 yields a subsequence y n of x n such
that y n → y. It suffices to prove y ∈ F n for all n ∈ N.
To do this, fix n 0 ∈ N. Because y n is a subsequence of x n and the F n are nested, it is
easy to see y n ∈ F n0 for all n ≥ n 0 . Using the fact that y n → y, we see y ∈ F̄ n0 , the closure of F n0 . Since
F n0 is closed, Theorem 5.9 shows y ∈ F n0 .
2. Relative Topologies and Connectedness
2.1. Relative Topologies. Another useful topological notion is that of a relative or subspace topology. In our case, this amounts to using the standard
topology on R to generate a topology on a subset of R. The definition is as follows.
DEFINITION 5.12. Let X ⊂ R. The set S ⊂ X is relatively open in X, if there
is a set G, open in R, such that S = G ∩ X. The set T ⊂ X is relatively closed in
X, if there is a set F, closed in R, such that T = F ∩ X. (If there is no chance for
confusion, the simpler terminology open in X and closed in X is sometimes used.)
It is left as exercises to show that if X ⊂ R and S consists of all relatively open
subsets of X , then (X , S ) is a topological space and T is relatively closed in X , if
X \ T ∈ S . (See Exercises 5.12 and 5.13.)
EXAMPLE 5.3. Let X = [0, 1]. The subsets [0, 1/2) = X ∩ (−1, 1/2) and (1/4, 1] =
X ∩ (1/4, 2) are both relatively open in X.
EXAMPLE 5.4. If X = Q, then {x ∈ Q : −√2 < x < √2} = (−√2, √2) ∩ Q =
[−√2, √2] ∩ Q is clopen relative to Q.
2.2. Connected Sets. One place where the relative topologies are useful is
in relation to the following definition.
DEFINITION 5.13. A set S ⊂ R is disconnected if there are two open intervals
U and V such that U ∩ V = ∅, U ∩ S ≠ ∅, V ∩ S ≠ ∅ and S ⊂ U ∪ V. Otherwise, it is
connected. The sets U ∩ S and V ∩ S are said to be a separation of S.
In other words, S is disconnected if it can be written as the union of two
disjoint and nonempty sets that are both relatively open in S. Since both these
sets are complements of each other relative to S, they are both clopen in S. This,
in turn, implies S is disconnected if it has a proper relatively clopen subset.
EXAMPLE 5.5. Let S = {x} be a set containing a single point. S is connected
because there cannot exist nonempty disjoint open sets U and V such that
S ∩ U ≠ ∅ and S ∩ V ≠ ∅. The same argument shows that ∅ is connected.
EXAMPLE 5.6. If S = [−1, 0) ∪ (0, 1], then U = (−2, 0) and V = (0, 2) are open
sets such that U ∩ V = ∅, U ∩ S ≠ ∅, V ∩ S ≠ ∅ and S ⊂ U ∪ V. This shows S is
disconnected.
EXAMPLE 5.7. The sets U = (−∞, √2) and V = (√2, ∞) are open sets such
that U ∩ V = ∅, U ∩ Q ≠ ∅, V ∩ Q ≠ ∅ and Q ⊂ U ∪ V = R \ {√2}. This shows Q is
disconnected. In fact, the only connected subsets of Q are single points. Sets with
this property are often called totally disconnected.
The notion of connectedness is not really very interesting on R because the
connected sets are exactly what one would expect. It becomes more complicated
in higher dimensional spaces.
THEOREM 5.14. A nonempty set S ⊂ R is connected iff it is either a single point
or an interval.
PROOF. (⇒) If S is not a single point or an interval, there must be numbers
r < s < t such that r, t ∈ S and s ∉ S. In this case, the sets U = (−∞, s) and
V = (s, ∞) are a disconnection of S.
(⇐) It was shown in Example 5.5 that a set containing a single point is connected. So, assume S is an interval.
Suppose S is not connected with U and V forming a disconnection of S.
Choose u ∈ U ∩ S and v ∈ V ∩ S. There is no generality lost by assuming u < v, so
that [u, v] ⊂ S.
Let A = {x : [u, x) ⊂ U }.
We claim A ≠ ∅. To see this, use the fact that U is open to find ε ∈ (0, v − u)
such that (u − ε, u + ε) ⊂ U. Then u < u + ε/2 < v, so u + ε/2 ∈ A.
Define w = lub A.
Since v ∈ V it is evident u < w ≤ v and w ∈ S.
If w ∈ U, then u < w < v and there is ε ∈ (0, v − w) such that (w − ε, w + ε) ⊂ U
and [u, w + ε) ⊂ S because w + ε < v. Since [u, w) ⊂ U by the definition of w, this gives
[u, w + ε) ⊂ U, which contradicts the definition of w, so w ∉ U.
If w ∈ V, then there is an ε > 0 such that (w − ε, w] ⊂ V. In particular, this
shows w = lub A ≤ w − ε < w. This contradiction forces the conclusion that
w ∉ V.
Now, putting all this together, we see w ∈ S ⊂ U ∪ V and w ∉ U ∪ V . This is a
clear contradiction, so we’re forced to conclude there is no separation of S.
3. Covering Properties and Compactness on R
3.1. Open Covers.
DEFINITION 5.15. Let S ⊂ R. A collection of open sets, O = {G λ : λ ∈ Λ}, is an
open cover of S if S ⊂ ⋃_{G∈O} G. If O′ ⊂ O is also an open cover of S, then O′ is an
open subcover of S from O.
EXAMPLE 5.8. Let S = (0, 1) and O = {(1/n, 1) : n ∈ N}. We claim that O is an
open cover of S. To prove this, let x ∈ (0, 1). Choose n 0 ∈ N such that 1/n 0 < x.
Then
x ∈ (1/n 0 , 1) ⊂ ⋃_{n∈N} (1/n, 1) = ⋃_{G∈O} G.
Since x is an arbitrary element of (0, 1), it follows that (0, 1) = ⋃_{G∈O} G.
Suppose O′ is any infinite subset of O and x ∈ (0, 1). Since O′ is infinite, there
exists an n ∈ N such that x ∈ (1/n, 1) ∈ O′. The rest of the proof proceeds as above.
On the other hand, if O′ is a finite subset of O, then let M = max{n : (1/n, 1) ∈
O′}. If 0 < x < 1/M, it is clear that x ∉ ⋃_{G∈O′} G, so O′ is not an open cover of (0, 1).
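The failure of every finite subfamily in Example 5.8 can be checked concretely. The sketch below is only an illustration: the function names `uncovered_point` and `covers` are introduced here, and the choice x = 1/(2M) is one of many points with 0 < x < 1/M.

```python
from fractions import Fraction

def uncovered_point(indices):
    """Given a finite, nonempty collection of n's, return an x in (0, 1)
    that lies in none of the intervals (1/n, 1)."""
    M = max(indices)
    return Fraction(1, 2 * M)      # 0 < x < 1/M <= 1/n for every chosen n

def covers(indices, x):
    """True if x lies in some interval (1/n, 1) with n taken from indices."""
    return any(Fraction(1, n) < x < 1 for n in indices)

finite_family = [2, 5, 17]
x = uncovered_point(finite_family)
print(x, covers(finite_family, x))
```

Enlarging the family eventually captures any fixed x, which is why the full collection O is a cover even though no finite part of it is.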
EXAMPLE 5.9. Let T = [0, 1) and 0 < ε < 1. If
O = {(1/n, 1) : n ∈ N} ∪ {(−ε, ε)},
then O is an open cover of T.
It is evident that any open subcover of T from O must contain (−ε, ε), because
that is the only element of O which contains 0. Choose n ∈ N such that 1/n < ε.
Then O′ = {(−ε, ε), (1/n, 1)} is an open subcover of T from O which contains only
two elements.
THEOREM 5.16 (Lindelöf Property). If S ⊂ R and O is any open cover of S, then
O contains a subcover with a countable number of elements.
PROOF. Let O = {G λ : λ ∈ Λ} be an open cover of S ⊂ R. Since O is an open
cover of S, for each x ∈ S there is a λx ∈ Λ and numbers p x , q x ∈ Q satisfying
x ∈ (p x , q x ) ⊂ G λx ∈ O . The collection T = {(p x , q x ) : x ∈ S} is an open cover of S.
Thinking of the collection T = {(p x , q x ) : x ∈ S} as a set of ordered pairs of
rational numbers, it is seen that card (T ) ≤ card (Q × Q) = ℵ0 , so T is countable.
©Lee Larson (Lee.Larson@Louisville.edu)
For each interval I ∈ T, choose a λI ∈ Λ such that I ⊂ G λI . Then
S ⊂ ⋃_{I∈T} I ⊂ ⋃_{I∈T} G λI
shows O′ = {G λI : I ∈ T } ⊂ O is an open subcover of S from O. Also, card (O′) ≤
card (T ) ≤ ℵ0 , so O′ is a countable open subcover of S from O.
COROLLARY 5.17. Any open subset of R can be written as a countable union of
pairwise disjoint open intervals.
PROOF. Let G be open in R. For x ∈ G let αx = glb {y : (y, x] ⊂ G} and βx =
lub {y : [x, y) ⊂ G}. The fact that G is open implies αx < x < βx . Define I x =
(αx , βx ).
Then I x ⊂ G. To see this, suppose x < w < βx . Choose y ∈ (w, βx ). The
definition of βx guarantees w ∈ (x, y) ⊂ G. Similarly, if αx < w < x, it follows that
w ∈ G.
This shows O = {I x : x ∈ G} has the property that G = ⋃_{x∈G} I x .
Suppose x, y ∈ G and I x ∩ I y ≠ ∅. There is no generality lost in assuming
x < y. In this case, there must be a w ∈ (x, y) such that w ∈ I x ∩ I y . We know
from above that both [x, w] ⊂ G and [w, y] ⊂ G, so [x, y] ⊂ G. It follows that
αx = α y < x < y < βx = β y and I x = I y .
From this we conclude O consists of pairwise disjoint open intervals.
To finish, apply Theorem 5.16 to extract a countable subcover from O.
Corollary 5.17 can also be proved by a different strategy. Instead of using
Theorem 5.16 to extract a countable subcover, we could just choose one rational
number from each interval in the cover. The pairwise disjointness of the intervals
in the cover guarantees this will give a bijection between O and a subset of Q. This
method has the advantage of showing O itself is countable from the start.
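For an open set given as a finite union of open intervals, the component intervals of Corollary 5.17 can be computed directly. The following is a minimal sketch under that finiteness assumption. The function name `components` is introduced here, and sorting followed by merging stands in for the glb/lub construction used in the proof.

```python
def components(intervals):
    """Merge a finite list of open intervals (a, b), with a < b, into the
    pairwise disjoint open intervals whose union is the same set.

    Two open intervals are merged only when they overlap. For instance,
    (0, 1) and (1, 2) stay separate because the point 1 belongs to neither.
    """
    merged = []
    for a, b in sorted(intervals):
        if merged and a < merged[-1][1]:           # overlaps the current component
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:                                      # starts a new component
            merged.append((a, b))
    return merged

print(components([(3, 5), (0, 1), (1, 2), (4, 7), (0.5, 0.75)]))
```

The strict inequality in the overlap test is the design point: touching endpoints do not join two components, because the shared endpoint is not in the open set.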
3.2. Compact Sets. There is a class of sets for which the conclusion of Lindelöf’s theorem can be strengthened.
DEFINITION 5.18. An open cover O of a set S is a finite cover, if O has only a
finite number of elements. The definition of a finite subcover is analogous.
DEFINITION 5.19. A set K ⊂ R is compact, if every open cover of K contains a
finite subcover.
THEOREM 5.20 (Heine-Borel). A set K ⊂ R is compact iff it is closed and
bounded.
PROOF. (⇒) Suppose K is unbounded. The collection O = {(−n, n) : n ∈ N} is
an open cover of K. If O′ is any finite subset of O, then ⋃_{G∈O′} G is a bounded set
and cannot cover the unbounded set K. This shows K cannot be compact, and
every compact set must be bounded.
Suppose K is not closed. According to Theorem 5.9, there is a limit point x of
K such that x ∉ K. Define O = {[x − 1/n, x + 1/n]^c : n ∈ N}. Then O is a collection
of open sets and K ⊂ ⋃_{G∈O} G = R \ {x}. Let O′ = {[x − 1/n i , x + 1/n i ]^c : 1 ≤ i ≤ N }
be a finite subset of O and M = max{n i : 1 ≤ i ≤ N }. Since x is a limit point of K,
there is a y ∈ K ∩ (x − 1/M, x + 1/M). Clearly, y ∉ ⋃_{G∈O′} G = [x − 1/M, x + 1/M]^c ,
so O′ cannot cover K. This shows every compact set must be closed.
(⇐) Let K be closed and bounded and let O be an open cover of K. Applying
Theorem 5.16, if necessary, we can assume O is countable. Thus, O = {G n : n ∈ N}.
For each n ∈ N, define
F n = K \ ⋃_{i=1}^n G i = K ∩ ⋂_{i=1}^n G i^c .
Then F n is a sequence of nested, bounded and closed subsets of K. Since O covers
K, it follows that
⋂_{n∈N} F n ⊂ K \ ⋃_{n∈N} G n = ∅.
According to Corollary 5.11, the only way this can happen is if F n = ∅ for
some n ∈ N. Then K ⊂ ⋃_{i=1}^n G i , and O′ = {G i : 1 ≤ i ≤ n} is a finite subcover of K
from O.
Compactness shows up in several different, but equivalent ways on R. We’ve
already seen several of them, but their equivalence is not obvious. The following
theorem shows a few of the most common manifestations of compactness.
THEOREM 5.21. Let K ⊂ R. The following statements are equivalent to each
other.
(a) K is compact.
(b) K is closed and bounded.
(c) Every infinite subset of K has a limit point in K.
(d) Every sequence {a n : n ∈ N} ⊂ K has a subsequence converging to an
element of K.
(e) If F n is a nested sequence of nonempty relatively closed subsets of K, then
⋂_{n∈N} F n ≠ ∅.
PROOF. (a) ⇐⇒ (b) is the Heine-Borel Theorem, Theorem 5.20.
That (b)⇒(c) is the Bolzano-Weierstrass Theorem, Theorem 5.8.
(c)⇒(d) is contained in the sequence version of the Bolzano-Weierstrass
theorem, Theorem 3.16.
(d)⇒(e) Let F n be as in (e). For each n ∈ N, choose a n ∈ F n . By assumption,
a n has a convergent subsequence b n → b with b ∈ K. Each F n contains a tail of the sequence
b n , so b ∈ F̄ n ∩ K = F n for each n, because F n is relatively closed in K. Therefore, b ∈ ⋂_{n∈N} F n , and (e) follows.
(e)⇒(b). Suppose K is such that (e) is true.
Let F n = K ∩ ((−∞, −n] ∪ [n, ∞)). Then F n is a sequence of sets which are
relatively closed in K such that ⋂_{n∈N} F n = ∅. If K is unbounded, then F n ≠ ∅
for all n ∈ N, and a contradiction of (e) is evident. Therefore, K must be bounded.
If K is not closed, then there must be a limit point x of K such that x ∉ K.
Define a sequence of relatively closed and nested subsets of K by F n = [x − 1/n, x +
1/n] ∩ K for n ∈ N. Each F n is nonempty because x is a limit point of K. Then ⋂_{n∈N} F n = ∅, because x ∉ K. This contradiction of (e)
shows that K must be closed.
These various ways of looking at compactness have been given different
names by topologists. Property (c) is called limit point compactness and (d) is
called sequential compactness. There are topological spaces in which various of
the equivalences do not hold.
4. More Small Sets
This is an advanced section that can be omitted.
We’ve already seen one way in which a subset of R can be considered small: if
its cardinality is at most ℵ0 . Such sets are small in the set-theoretic sense. This
section shows how sets can be considered small in the metric and topological
senses.
4.1. Sets of Measure Zero. An interval is the only subset of R for which most
people could immediately come up with some sort of measure — its length. This
idea of measuring a set by length can be generalized. For example, we know every
open set can be written as a countable union of open intervals, so it is natural to
assign the sum of the lengths of its component intervals as the measure of the
set. Discounting some technical difficulties, such as components with infinite
length, this is how the Lebesgue measure of an open set is defined. It is possible
to assign a measure to more complicated sets, but we’ll only address the special
case of sets with measure zero, sometimes called Lebesgue null sets.
DEFINITION 5.22. A set S ⊂ R has measure zero if given any ε > 0 there is a
sequence (a n , b n ) of open intervals such that
S ⊂ ⋃_{n∈N} (a n , b n ) and ∑_{n=1}^∞ (b n − a n ) < ε.
Such sets are small in the metric sense.
EXAMPLE 5.10. If S = {a} contains only one point, then S has measure zero.
To see this, let ε > 0. Note that S ⊂ (a − ε/4, a + ε/4) and this single interval has
length ε/2 < ε.
There are complicated sets of measure zero, as we’ll see later. For now, we’ll
start with a simple theorem.
THEOREM 5.23. If {S n : n ∈ N} is a countable collection of sets of measure zero,
then ⋃_{n∈N} S n has measure zero.
PROOF. Let ε > 0. For each n, let {(a n,k , b n,k ) : k ∈ N} be a collection of intervals such that
S n ⊂ ⋃_{k∈N} (a n,k , b n,k ) and ∑_{k=1}^∞ (b n,k − a n,k ) < ε/2^n .
Then
⋃_{n∈N} S n ⊂ ⋃_{n∈N} ⋃_{k∈N} (a n,k , b n,k ) and ∑_{n=1}^∞ ∑_{k=1}^∞ (b n,k − a n,k ) < ∑_{n=1}^∞ ε/2^n = ε.
Combining this with Example 5.10 gives the following corollary.
COROLLARY 5.24. Every countable set has measure zero.
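The covering argument behind Corollary 5.24 can be tried on the first few terms of an enumeration. The sketch below is only an illustration under stated assumptions: the function name `cover` and the particular lengths ε/2^(n+1) are choices made here, and any lengths with a sum below ε would serve equally well.

```python
from fractions import Fraction

def cover(points, eps):
    """Cover the n-th point (n = 1, 2, ...) by an open interval of
    length eps / 2**(n + 1), centered at the point."""
    eps = Fraction(eps)
    intervals = []
    for n, p in enumerate(points, start=1):
        half = eps / 2 ** (n + 2)          # half-length, so the full length is eps / 2**(n + 1)
        intervals.append((p - half, p + half))
    return intervals

# First few terms of one enumeration (with repeats) of the rationals in [0, 1].
points = [Fraction(p, q) for q in range(1, 6) for p in range(q + 1)]
ivs = cover(points, Fraction(1, 100))
total = sum(b - a for a, b in ivs)
print(len(ivs), total < Fraction(1, 100))
```

However many terms of the enumeration are included, the total length stays below ε/2, since it is a partial sum of the geometric series ∑ ε/2^(n+1).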
The rational numbers form a large set in the sense that every interval contains a
rational number. But we now see it is small in both the set-theoretic and metric
senses because it is countable and of measure zero.
Uncountable sets of measure zero are constructed in Section 4.3.
There is some standard terminology associated with sets of measure zero. If a
property P is true, except on a set of measure zero, then it is often said “P is true
almost everywhere” or “almost every point satisfies P .” It is also said “P is true on
a set of full measure.” For example, “Almost every real number is irrational.” or
“The irrational numbers are a set of full measure.”
4.2. Dense and Nowhere Dense Sets. We begin by considering a way that a
set can be considered topologically large in an interval. If I is any interval, recall
from Corollary 2.24 that I ∩ Q ≠ ∅ and I ∩ Q^c ≠ ∅. An immediate consequence
of this is that every real number is a limit point of both Q and Q^c . In this sense,
the rational and irrational numbers are both uniformly distributed across the
number line. This idea is generalized in the following definition.
DEFINITION 5.25. Let A ⊂ B ⊂ R. A is said to be dense in B, if B ⊂ Ā.
Both the rational and irrational numbers are dense in every interval. Corollary 5.17 shows the rational and irrational numbers are dense in every open set.
It’s not hard to construct other sets dense in every interval. For example, the set
of dyadic numbers, D = {p/2^q : p, q ∈ Z}, is dense in every interval, and dense
in the rational numbers.
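The density of the dyadic numbers can be made concrete by producing a dyadic number inside any given interval. The following is a minimal sketch. The function name `dyadic_between` is introduced here, and the method (choose q with 1/2^q smaller than the gap, then step just past a) is one of several that work.

```python
from fractions import Fraction
import math

def dyadic_between(a, b):
    """Return a dyadic rational p / 2**q with a < p / 2**q < b, assuming a < b."""
    a, b = Fraction(a), Fraction(b)
    q = 0
    while Fraction(1, 2 ** q) >= b - a:    # make the grid spacing smaller than the gap
        q += 1
    p = math.floor(a * 2 ** q) + 1         # first grid point strictly to the right of a
    return Fraction(p, 2 ** q)

d = dyadic_between(Fraction(1, 3), Fraction(3, 8))
print(d)
```

The result lies below b because p/2^q ≤ a + 1/2^q < a + (b − a).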
On the other hand, Z is not dense in any interval because it’s closed and
contains no interval. If A ⊂ B , where B is an open set, then A is not dense in B , if
A contains any interval-sized gaps.
THEOREM 5.26. Let A ⊂ B ⊂ R. A is dense in B iff whenever I is an open
interval such that I ∩ B ≠ ∅, then I ∩ A ≠ ∅.
PROOF. (⇒) Assume there is an open interval I such that I ∩ B ≠ ∅ and
I ∩ A = ∅. If x ∈ I ∩ B, then I is a neighborhood of x that does not intersect A.
Definition 5.5 shows x ∉ A′, and x ∉ A, so x ∉ Ā, a contradiction of the assumption that B ⊂ Ā.
This contradiction implies that whenever I ∩ B ≠ ∅, then I ∩ A ≠ ∅.
(⇐) If x ∈ B ∩ A = A, then x ∈ Ā. Assume x ∈ B \ A. By assumption, for each
n ∈ N, there is an x n ∈ (x − 1/n, x + 1/n) ∩ A. Since x n → x, this shows x ∈ A′ ⊂ Ā.
It now follows that B ⊂ Ā.
If B ⊂ R and I is an open interval with I ∩ B ≠ ∅, then I ∩ B is often called a
portion of B. The previous theorem says that A is dense in B iff every portion of
B intersects A.
If A being dense in B is thought of as A being a large subset of B , then perhaps
when A is not dense in B , it can be thought of as a small subset. But, thinking of
A as being small when it is not dense isn’t quite so clear when it is noticed that A
could still be dense in some portion of B , even if it isn’t dense in B . To make A be
a truly small subset of B in the topological sense, it should not be dense in any
portion of B . The following definition gives a way to assure this is true.
DEFINITION 5.27. Let A ⊂ B ⊂ R. A is said to be nowhere dense in B if B \ Ā is
dense in B.
The following theorem shows that a nowhere dense set is small in the sense
mentioned above because it fails to be dense in any part of B .
THEOREM 5.28. Let A ⊂ B ⊂ R. A is nowhere dense in B iff for every open
interval I such that I ∩ B ≠ ∅, there is an open interval J ⊂ I such that J ∩ B ≠ ∅
and J ∩ A = ∅.
PROOF. (⇒) Let I be an open interval such that I ∩ B ≠ ∅. By assumption,
B \ Ā is dense in B, so Theorem 5.26 implies I ∩ (B \ Ā) ≠ ∅. If x ∈ I ∩ (B \ Ā), then
there is an open interval J such that x ∈ J ⊂ I and J ∩ Ā = ∅. Since A ⊂ Ā, this J
satisfies the theorem.
(⇐) Let I be an open interval with I ∩ B ≠ ∅. By assumption, there is an open
interval J ⊂ I such that J ∩ B ≠ ∅ and J ∩ A = ∅. It follows that J ∩ Ā = ∅, so I ∩ (B \ Ā) ≠ ∅. Theorem 5.26 implies
B \ Ā is dense in B.
EXAMPLE 5.11. Let G be an open set that is dense in R. If I is any open interval,
then Theorem 5.26 implies I ∩ G ≠ ∅. Because G is open, if x ∈ I ∩ G, then there is
an open interval J such that x ∈ J ⊂ I ∩ G. Now, Theorem 5.28 shows G^c is nowhere
dense.
The nowhere dense sets are topologically small in the following sense.
THEOREM 5.29 (Baire). If I is an open interval, then I cannot be written as a
countable union of nowhere dense sets.
PROOF. Let A n be a sequence of nowhere dense subsets of I. According to
Theorem 5.28, there is a bounded open interval J 1 ⊂ I such that J 1 ∩ A 1 = ∅. By
shortening J 1 a bit at each end, if necessary, it may be assumed that J̄ 1 ⊂ I and J̄ 1 ∩ A 1 = ∅.
Assume J n has been chosen for some n ∈ N. Applying Theorem 5.28 again, choose
an open interval J n+1 as above so J̄ n+1 ⊂ J n and J̄ n+1 ∩ A n+1 = ∅. Corollary 5.11
shows
I \ ⋃_{n∈N} A n ⊃ ⋂_{n∈N} J̄ n ≠ ∅
and the theorem follows.
Theorem 5.29 is called the Baire category theorem because of the terminology
introduced by René-Louis Baire in 1899.³ He said a set was of the first category, if
it could be written as a countable union of nowhere dense sets. An easy example
of such a set is any countable set, which is a countable union of singletons. All
other sets are of the second category.⁴ Theorem 5.29 can be stated as “Any open
interval is of the second category.” Or, more generally, as “Any nonempty open
set is of the second category.”
³ René-Louis Baire (1874-1932) was a French mathematician. He proved the Baire category
theorem in his 1899 doctoral dissertation.
A set is called a Gδ set, if it is the countable intersection of open sets. It is
called an Fσ set, if it is the countable union of closed sets. DeMorgan’s laws show
that the complement of an Fσ set is a Gδ set and vice versa. It’s evident that any
countable subset of R is an Fσ set, so Q is an Fσ set.
On the other hand, suppose Q is a Gδ set. Then there is a sequence of open
sets G n such that Q = ⋂_{n∈N} G n . Since Q is dense, each G n must be dense and
Example 5.11 shows G n^c is nowhere dense. From DeMorgan’s law, R = Q ∪ ⋃_{n∈N} G n^c ,
showing R is a first category set and violating the Baire category theorem. Therefore, Q is not a Gδ set.
Essentially the same argument shows any countable subset of R is a first
category set. The following protracted example shows there are uncountable sets
of the first category.
4.3. The Cantor Middle-Thirds Set. One particularly interesting example of
a nowhere dense set is the Cantor Middle-Thirds set, introduced by the German
mathematician Georg Cantor in 1884.⁵ It has many strange properties, only a few
of which will be explored here.
To start the construction of the Cantor Middle-Thirds set, let C 0 = [0, 1] and
C 1 = C 0 \ (1/3, 2/3) = [0, 1/3] ∪ [2/3, 1]. Remove the open middle thirds of the
intervals comprising C 1 , to get
C 2 = [0, 1/9] ∪ [2/9, 1/3] ∪ [2/3, 7/9] ∪ [8/9, 1].
Continuing in this way, if C n consists of 2^n pairwise disjoint closed intervals each
of length 3^(−n), construct C n+1 by removing the open middle third from each of
those closed intervals, leaving 2^(n+1) closed intervals each of length 3^(−(n+1)). This
gives a nested sequence of closed sets C n each consisting of 2^n closed intervals of
length 3^(−n). (See Figure 5.1.) The Cantor Middle-Thirds set is
C = ⋂_{n∈N} C n .
Corollaries 5.3 and 5.11 show C is closed and nonempty. In fact, the latter is
apparent because {0, 1/3, 2/3, 1} ⊂ C n for every n. At each step in the construction,
2^n open middle thirds, each of length 3^(−(n+1)), were removed from the intervals
comprising C n . The total length of the open intervals removed was
∑_{n=0}^∞ 2^n /3^(n+1) = (1/3) ∑_{n=0}^∞ (2/3)^n = 1.
⁴ Baire did not define any categories other than these two. Some authors call first category
sets meager sets, so as not to make readers fruitlessly wait for definitions of third, fourth and fifth
category sets.
⁵ Cantor’s original work [6] is reprinted with an English translation in Edgar’s Classics on Fractals [11]. Cantor only mentions his eponymous set in passing and it had actually been presented
earlier by others.
FIGURE 5.1. Shown here are the first few steps in the construction of
the Cantor Middle-Thirds set.
Because of this, Example 5.11 implies C is nowhere dense in [0, 1].
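The stages of the construction are easy to generate exactly. The following is a minimal sketch using exact rational arithmetic. The function name `cantor_stage` is introduced here for illustration, and the stages are counted from C 0 = [0, 1].

```python
from fractions import Fraction

def cantor_stage(n):
    """Return the closed intervals of C_n as (left, right) pairs,
    starting from C_0 = [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        next_intervals = []
        for a, b in intervals:
            third = (b - a) / 3
            next_intervals.append((a, a + third))        # keep the left closed third
            next_intervals.append((b - third, b))        # keep the right closed third
        intervals = next_intervals
    return intervals

c2 = cantor_stage(2)
print(c2)
c5 = cantor_stage(5)
print(len(c5), sum(b - a for a, b in c5))
```

The second line of output illustrates the counts used in the text: C 5 has 2^5 intervals whose lengths sum to (2/3)^5.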
C is an example of a perfect set; i.e., a closed set all of whose points are limit
points of itself. (See Exercise 5.25.) Any closed set without isolated points is
perfect. The Cantor Middle-Thirds set is interesting because it is an example of a
perfect set without any interior points. Many people call any bounded perfect set
without interior points a Cantor set. Most of the time, when someone refers to
the Cantor set, they mean C .
There is another way to view the Cantor set. Notice that at the nth stage of the
construction, removing the middle thirds of the intervals comprising C n removes
those points whose base 3 representation contains the digit 1 in position n + 1.
Then,
C = { c = ∑_{n=1}^∞ c n /3^n : c n ∈ {0, 2} }.    (5.1)
So, C consists of all numbers c ∈ [0, 1] that can be written in base 3 without using
the digit 1.⁶
If c ∈ C, then (5.1) shows c = ∑_{n=1}^∞ c n /3^n for some sequence c n with range in {0, 2}.
Moreover, every such sequence corresponds to a unique element of C. Define
φ : C → [0, 1] by
φ(c) = φ( ∑_{n=1}^∞ c n /3^n ) = ∑_{n=1}^∞ (c n /2)/2^n .    (5.2)
Since c n is a sequence from {0, 2}, then c n /2 is a sequence from {0, 1} and φ(c) can
be considered the binary representation of a number in [0, 1]. According to (5.1),
it follows that φ is a surjection and
φ(C) = { ∑_{n=1}^∞ (c n /2)/2^n : c n ∈ {0, 2} } = { ∑_{n=1}^∞ b n /2^n : b n ∈ {0, 1} } = [0, 1].
Therefore, card (C) = card ([0, 1]) > ℵ0 .
The Cantor set is topologically small because it is nowhere dense and large
from the set-theoretic viewpoint because it is uncountable.
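For elements of C with a finite ternary expansion, the map φ of (5.2) can be evaluated exactly. The sketch below is restricted to finite digit strings, which is an assumption made here for computability. The function names `cantor_point` and `phi` are introduced for illustration.

```python
from fractions import Fraction

def cantor_point(ternary_digits):
    """The element of C with the finite ternary expansion c_1, ..., c_N, as in (5.1)."""
    return sum(Fraction(c, 3 ** n) for n, c in enumerate(ternary_digits, start=1))

def phi(ternary_digits):
    """Apply (5.2): halve each digit c_n in {0, 2} and read the result in base 2."""
    assert all(c in (0, 2) for c in ternary_digits)
    return sum(Fraction(c // 2, 2 ** n) for n, c in enumerate(ternary_digits, start=1))

digits = [2, 0, 2]                 # c = 0.202 in base 3
print(cantor_point(digits), phi(digits))
```

Here the ternary digits 2, 0, 2 become the binary digits 1, 0, 1, so the point 20/27 of C is sent to 5/8.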
The Cantor set is also a set of measure zero. To see this, let C n be as in the
construction of the Cantor set given above. Then C ⊂ C n and C n consists of 2^n
pairwise disjoint closed intervals each of length 3^(−n). Their total length is (2/3)^n .
Given ε > 0, choose n ∈ N so (2/3)^n < ε/2. Each of the closed intervals comprising
C n can be placed inside a slightly longer open interval so the sum of the lengths
of the 2^n open intervals is less than ε.
⁶ Notice that 1 = ∑_{n=1}^∞ 2/3^n , 1/3 = ∑_{n=2}^∞ 2/3^n , etc.
5. Exercises
5.1. If G is an open set and F is a closed set, then G \ F is open and F \G is closed.
5.2. Let S ⊂ R and ℱ = {F : F is closed and S ⊂ F }. Prove S̄ = ⋂_{F∈ℱ} F. This proves
that S̄ is the smallest closed set containing S.
5.3. Let A and B be subsets of R. Prove or give a counterexample:
(a) the closure of A ∪ B equals Ā ∪ B̄, and
(b) the closure of A ∩ B equals Ā ∩ B̄.
5.4. If S is a finite subset of R, then S is closed.
5.5. For any sets A, B ⊂ R, define
A + B = {a + b : a ∈ A and b ∈ B }.
(a) If X, Y ⊂ R, then X̄ + Ȳ is contained in the closure of X + Y.
(b) Find an example to show equality may not hold in the preceding statement.
5.6. Q is neither open nor closed.
5.7. A set S ⊂ R is open iff ∂S ∩ S = ∅. (∂S is the set of boundary points of S.)
5.8. (a) Every closed set can be written as a countable intersection of open sets.
(b) Every open set can be written as a countable union of closed sets.
In other words, every closed set is a Gδ set and every open set is an Fσ set.
5.9. Find a sequence of open sets G n such that ⋂_{n∈N} G n is neither open nor
closed.
5.10. An open set G is called regular if G = (Ḡ)°. Find an open set that is not
regular.
5.11. Let ℛ = {(x, ∞) : x ∈ R} and T = ℛ ∪ {R, ∅}. Prove that (R, T ) is a topological
space. This is called the right ray topology on R.
5.12. If X ⊂ R and S is the collection of all sets relatively open in X , then (X , S )
is a topological space.
5.13. If X ⊂ R and G is an open set, then X \G is relatively closed in X .
5.14. For any set S, let F = {T ⊂ S : card (S \ T ) < ℵ0 } ∪ {∅}. Then (S, F ) is a
topological space. This is called the finite complement topology.
5.15. An uncountable subset of R must have a limit point.
5.16. If S ⊂ R, then S′ is closed.
5.17. Prove that the set of accumulation points of any sequence is closed.
5.18. Prove any closed set is the set of accumulation points for some sequence.
5.19. If a n is a sequence such that a n → L, then {a n : n ∈ N} ∪ {L} is compact.
5.20. If F is closed and K is compact, then F ∩ K is compact.
5.21. If {K α : α ∈ A} is a collection of compact sets, then ⋂_{α∈A} K α
is compact.
5.22. Prove the union of a finite number of compact sets is compact. Give an
example to show this need not be true for the union of an infinite number of
compact sets.
5.23. (a) Give an example of a set S such that S is disconnected, but S ∪ {1} is
connected. (b) Prove that 1 must be a limit point of S.
5.24. If K is compact and V is open with K ⊂ V, then there is an open set U such
that K ⊂ U ⊂ Ū ⊂ V.
5.25. If C is the Cantor middle-thirds set, then C = C′.
5.26. If x ∈ R and K is compact, then there is a z ∈ K such that |x − z| = glb{|x − y| :
y ∈ K }. Is z unique?
5.27. If K is compact and O is an open cover of K , then there is an ε > 0 such
that for all x ∈ K there is a G ∈ O with (x − ε, x + ε) ⊂ G.
5.28. Let f : [a, b] → R be a function such that for every x ∈ [a, b] there is a δx > 0
such that f is bounded on (x − δx , x + δx ). Prove f is bounded.
5.29. Is the function defined by (5.2) a bijection?
5.30. If A is nowhere dense in an interval I , then A contains no interval.
5.31. Use the Baire category theorem to show R is uncountable.
5.32. If G is a dense Gδ subset of R, then G c is a first category set.
CHAPTER 6
Limits of Functions
1. Basic Definitions
DEFINITION 6.1. Let D ⊂ R, x 0 be a limit point of D and f : D → R. The
limit of f (x) at x 0 is L, if for each ε > 0 there is a δ > 0 such that when x ∈ D with
0 < |x − x 0 | < δ, then |f (x) − L| < ε. When this is the case, we write lim_{x→x 0} f (x) = L.
It should be noted that the limit of f at x 0 is determined by the values of f
near x 0 and not at x 0 . In fact, f need not even be defined at x 0 .
FIGURE 6.1. This figure shows a way to think about the limit. The
graph of f must not leave the top or bottom of the box (x 0 − δ, x 0 + δ) ×
(L − ε, L + ε), except possibly the point (x 0 , f (x 0 )).
A useful way of rewording the definition is to say that lim_{x→x 0} f (x) = L iff
for every ε > 0 there is a δ > 0 such that x ∈ (x 0 − δ, x 0 + δ) ∩ D \ {x 0 } implies
f (x) ∈ (L − ε, L + ε). This can also be succinctly stated as
∀ε > 0 ∃δ > 0 ( f ( (x 0 − δ, x 0 + δ) ∩ D \ {x 0 } ) ⊂ (L − ε, L + ε) ).
EXAMPLE 6.1. If f (x) = c is a constant function and x 0 ∈ R, then for any
positive numbers ε and δ,
x ∈ (x 0 − δ, x 0 + δ) ∩ D \ {x 0 } ⇒ |f (x) − c| = |c − c| = 0 < ε.
This shows the limit of every constant function exists at every point, and the limit
is just the value of the function.
FIGURE 6.2. The function from Example 6.3. Note that the graph is a
line with one “hole” in it.
EXAMPLE 6.2. Let f (x) = x, x 0 ∈ R, and ε = δ > 0. Then
x ∈ (x 0 − δ, x 0 + δ) ∩ D \ {x 0 } ⇒ |f (x) − x 0 | = |x − x 0 | < δ = ε.
This shows that the identity function has a limit at every point and its limit is just
the value of the function at that point.
EXAMPLE 6.3. Let f (x) = (2x^2 − 8)/(x − 2). In this case, the implied domain of f is
D = R \ {2}. We claim that lim_{x→2} f (x) = 8.
To see this, let ε > 0 and choose δ ∈ (0, ε/2). If 0 < |x − 2| < δ, then
|f (x) − 8| = |(2x^2 − 8)/(x − 2) − 8| = |2(x + 2) − 8| = 2|x − 2| < ε.
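The choice of δ in Example 6.3 can be spot-checked with exact arithmetic. The sketch below is a sanity check on sample points, not a proof. The function name `check`, the sample count, and the particular choice δ = ε/4 from the allowed range (0, ε/2) are assumptions made here.

```python
from fractions import Fraction

def f(x):
    """The function of Example 6.3, defined for x != 2."""
    return (2 * x**2 - 8) / (x - 2)

def check(eps, samples=200):
    """Sample points with 0 < |x - 2| < delta and test |f(x) - 8| < eps,
    using delta = eps / 4, one admissible choice in (0, eps / 2)."""
    eps = Fraction(eps)
    delta = eps / 4
    for k in range(1, samples + 1):
        h = delta * k / (samples + 1)       # 0 < h < delta
        for x in (2 - h, 2 + h):            # points on both sides of 2
            if not abs(f(x) - 8) < eps:
                return False
    return True

print(check(Fraction(1, 10)), check(Fraction(1, 1000)))
```

Because f(x) = 2(x + 2) away from x = 2, every sampled point satisfies |f(x) − 8| = 2|x − 2| < 2δ < ε, in agreement with the computation in the example.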
EXAMPLE 6.4. Let f (x) = √(x + 1). Then the implied domain of f is D = [−1, ∞).
We claim that lim_{x→−1} f (x) = 0.
To see this, let ε > 0 and choose δ ∈ (0, ε^2 ). If 0 < x − (−1) = x + 1 < δ, then
|f (x) − 0| = √(x + 1) < √δ < √(ε^2) = ε.
FIGURE 6.3. The function f (x) = |x|/x from Example 6.5.
EXAMPLE 6.5. If f (x) = |x|/x for x ≠ 0, then lim_{x→0} f (x) does not exist. (See
Figure 6.3.) To see this, suppose lim_{x→0} f (x) = L, ε = 1 and δ > 0. If L ≥ 0 and
−δ < x < 0, then f (x) = −1 ≤ L − ε. If L < 0 and 0 < x < δ, then f (x) = 1 > L + ε.
These inequalities show for any L and every δ > 0, there is an x with 0 < |x| < δ
such that |f (x) − L| ≥ ε.
FIGURE 6.4. This is the function from Example 6.6. The graph shown here is on the interval $[0.03, 1]$. There are an infinite number of oscillations from $-1$ to $1$ on any open interval containing the origin.
There is an obvious similarity between the definition of limit of a sequence
and limit of a function. The following theorem makes this similarity explicit, and
gives another way to prove facts about limits of functions.
THEOREM 6.2. Let $f : D \to \mathbb{R}$ and $x_0$ be a limit point of $D$. Then $\lim_{x\to x_0} f(x) = L$ iff whenever $x_n$ is a sequence from $D \setminus \{x_0\}$ such that $x_n \to x_0$, then $f(x_n) \to L$.
PROOF. (⇒) Suppose $\lim_{x\to x_0} f(x) = L$ and $x_n$ is a sequence from $D \setminus \{x_0\}$ such that $x_n \to x_0$. Let $\varepsilon > 0$. There exists a $\delta > 0$ such that $|f(x) - L| < \varepsilon$ whenever $x \in (x_0-\delta, x_0+\delta) \cap D \setminus \{x_0\}$. Since $x_n \to x_0$, there is an $N \in \mathbb{N}$ such that $n \ge N$ implies $0 < |x_n - x_0| < \delta$. In this case, $|f(x_n) - L| < \varepsilon$, showing $f(x_n) \to L$.
(⇐) Suppose whenever $x_n$ is a sequence from $D \setminus \{x_0\}$ such that $x_n \to x_0$, then $f(x_n) \to L$, but $\lim_{x\to x_0} f(x) \ne L$. Then there exists an $\varepsilon > 0$ such that for all $\delta > 0$ there is an $x \in (x_0-\delta, x_0+\delta) \cap D \setminus \{x_0\}$ such that $|f(x) - L| \ge \varepsilon$. In particular, for each $n \in \mathbb{N}$, there must exist $x_n \in (x_0 - 1/n, x_0 + 1/n) \cap D \setminus \{x_0\}$ such that $|f(x_n) - L| \ge \varepsilon$. Since $x_n \to x_0$ while $f(x_n)$ does not converge to $L$, this is a contradiction. Therefore, $\lim_{x\to x_0} f(x) = L$.
Theorem 6.2 is often used to show a limit doesn’t exist. Suppose we want to show $\lim_{x\to x_0} f(x)$ doesn’t exist. There are two strategies: find a sequence $x_n \to x_0$ such that $f(x_n)$ has no limit; or, find two sequences $y_n \to x_0$ and $z_n \to x_0$ such that $f(y_n)$ and $f(z_n)$ converge to different limits. Either way, the theorem shows $\lim_{x\to x_0} f(x)$ fails to exist.
In Example 6.5, we could choose $x_n = (-1)^n/n$ so $f(x_n)$ oscillates between $-1$ and $1$. Or, we could choose $y_n = 1/n = -z_n$ so $f(y_n) \to 1$ and $f(z_n) \to -1$.
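Both strategies can be seen concretely for the function of Example 6.5. The sketch below is an illustration only; the variable names are chosen here and are not part of the notes.

```python
def f(x):
    # f(x) = |x|/x, defined for x != 0
    return abs(x) / x

# Strategy 1: x_n = (-1)^n / n tends to 0, but f(x_n) alternates.
xs = [(-1) ** n / n for n in range(1, 9)]
values = [f(x) for x in xs]

# Strategy 2: y_n = 1/n and z_n = -1/n both tend to 0,
# while f(y_n) and f(z_n) are constant sequences with different values.
ys = [1 / n for n in range(1, 9)]
zs = [-1 / n for n in range(1, 9)]
f_ys = [f(y) for y in ys]
f_zs = [f(z) for z in zs]

print(values)
print(f_ys, f_zs)
```

A finite list of terms cannot prove divergence, but it shows the pattern that Theorem 6.2 turns into a proof.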
EXAMPLE 6.6. Let $f(x) = \sin(1/x)$, $a_n = \frac{1}{n\pi}$ and $b_n = \frac{2}{(4n+1)\pi}$. Then $a_n \downarrow 0$, $b_n \downarrow 0$, $f(a_n) = 0$ and $f(b_n) = 1$ for all $n \in \mathbb{N}$. An application of Theorem 6.2 shows $\lim_{x\to 0} f(x)$ does not exist. (See Figure 6.4.)
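The two sequences of Example 6.6 can be tabulated numerically. This is a sketch under the usual floating-point caveats: the computed values of $f(a_n)$ are only approximately zero because π is not represented exactly.

```python
import math

def f(x):
    return math.sin(1 / x)

def a(n):
    # a_n = 1/(n*pi), so 1/a_n = n*pi and sin(n*pi) = 0
    return 1 / (n * math.pi)

def b(n):
    # b_n = 2/((4n+1)*pi), so 1/b_n = 2*n*pi + pi/2 and the sine is 1
    return 2 / ((4 * n + 1) * math.pi)

for n in (1, 10, 100, 1000):
    print(n, a(n), f(a(n)), b(n), f(b(n)))
```

Both sequences shrink toward 0, yet the function values stay near 0 along one and near 1 along the other.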
THEOREM 6.3 (Squeeze Theorem). Suppose $f$, $g$ and $h$ are all functions defined on $D \subset \mathbb{R}$ with $f(x) \le g(x) \le h(x)$ for all $x \in D$. If $x_0$ is a limit point of $D$ and $\lim_{x\to x_0} f(x) = \lim_{x\to x_0} h(x) = L$, then $\lim_{x\to x_0} g(x) = L$.
PROOF. Let $x_n$ be any sequence from $D \setminus \{x_0\}$ such that $x_n \to x_0$. According to Theorem 6.2, both $f(x_n) \to L$ and $h(x_n) \to L$. Since $f(x_n) \le g(x_n) \le h(x_n)$, an application of the Sandwich Theorem for sequences shows $g(x_n) \to L$. Now, another use of Theorem 6.2 shows $\lim_{x\to x_0} g(x) = L$.
FIGURE 6.5. This is the function from Example 6.7. The bounding lines $y = x$ and $y = -x$ are also shown. There are an infinite number of oscillations between $-x$ and $x$ on any open interval containing the origin.
EXAMPLE 6.7. Let $f(x) = x \sin(1/x)$. Since $-1 \le \sin(1/x) \le 1$ when $x \ne 0$, we see that $-|x| \le x \sin(1/x) \le |x|$ for $x \ne 0$. Since $\lim_{x\to 0} |x| = 0$, Theorem 6.3 implies $\lim_{x\to 0} x \sin(1/x) = 0$. (See Figure 6.5.)
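The bounds used in Example 6.7 are easy to observe numerically. The sample points below are an arbitrary choice made for this sketch.

```python
import math

def g(x):
    # g(x) = x * sin(1/x), defined for x != 0
    return x * math.sin(1 / x)

# Points approaching 0 from both sides.
points = [s * 10.0 ** (-k) for k in range(1, 8) for s in (1, -1)]

for x in points:
    # Each row shows -|x| <= g(x) <= |x|.
    print(x, -abs(x), g(x), abs(x))
```

As the points approach 0, both bounds shrink to 0 and force the middle column to do the same.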
THEOREM 6.4. Suppose $f : D \to \mathbb{R}$ and $g : D \to \mathbb{R}$ and $x_0$ is a limit point of $D$. If $\lim_{x\to x_0} f(x) = L$ and $\lim_{x\to x_0} g(x) = M$, then
(a) $\lim_{x\to x_0} (f+g)(x) = L + M$,
(b) $\lim_{x\to x_0} (af)(x) = aL$, $\forall a \in \mathbb{R}$,
(c) $\lim_{x\to x_0} (fg)(x) = LM$, and
(d) $\lim_{x\to x_0} (1/f)(x) = 1/L$, as long as $L \ne 0$.
PROOF. Suppose $a_n$ is a sequence from $D \setminus \{x_0\}$ converging to $x_0$. Then Theorem 6.2 implies $f(a_n) \to L$ and $g(a_n) \to M$. (a)–(d) follow at once from the corresponding properties for sequences.
EXAMPLE 6.8. Let $f(x) = 3x + 2$. If $g_1(x) = 3$, $g_2(x) = x$ and $g_3(x) = 2$, then $f(x) = g_1(x) g_2(x) + g_3(x)$. Examples 6.1 and 6.2 along with parts (a) and (c) of Theorem 6.4 immediately show that for every $x_0 \in \mathbb{R}$, $\lim_{x\to x_0} f(x) = f(x_0)$.
In the same manner as Example 6.8, it can be shown for every rational function $f(x)$, that $\lim_{x\to x_0} f(x) = f(x_0)$ whenever $f(x_0)$ exists.
2. Unilateral Limits
DEFINITION 6.5. Let $f : D \to \mathbb{R}$ and $x_0$ be a limit point of $(-\infty, x_0) \cap D$. $f$ has $L$ as its left-hand limit at $x_0$ if for all $\varepsilon > 0$ there is a $\delta > 0$ such that $f((x_0-\delta, x_0) \cap D) \subset (L-\varepsilon, L+\varepsilon)$. In this case, we write $\lim_{x\uparrow x_0} f(x) = L$.
Let $f : D \to \mathbb{R}$ and $x_0$ be a limit point of $D \cap (x_0, \infty)$. $f$ has $L$ as its right-hand limit at $x_0$ if for all $\varepsilon > 0$ there is a $\delta > 0$ such that $f(D \cap (x_0, x_0+\delta)) \subset (L-\varepsilon, L+\varepsilon)$. In this case, we write $\lim_{x\downarrow x_0} f(x) = L$.²
These are called the unilateral or one-sided limits of $f$ at $x_0$. When they are different, the graph of $f$ is often said to have a “jump” at $x_0$, as in the following example.
EXAMPLE 6.9. As in Example 6.5, let $f(x) = |x|/x$. Then $\lim_{x\downarrow 0} f(x) = 1$ and $\lim_{x\uparrow 0} f(x) = -1$. (See Figure 6.3.)
In parallel with Theorem 6.2, the one-sided limits can also be reformulated in terms of sequences.
THEOREM 6.6. Let $f : D \to \mathbb{R}$ and $x_0 \in \mathbb{R}$.
(a) Let $x_0$ be a limit point of $D \cap (x_0, \infty)$. $\lim_{x\downarrow x_0} f(x) = L$ iff whenever $x_n$ is a sequence from $D \cap (x_0, \infty)$ such that $x_n \to x_0$, then $f(x_n) \to L$.
(b) Let $x_0$ be a limit point of $(-\infty, x_0) \cap D$. $\lim_{x\uparrow x_0} f(x) = L$ iff whenever $x_n$ is a sequence from $(-\infty, x_0) \cap D$ such that $x_n \to x_0$, then $f(x_n) \to L$.
The proof of Theorem 6.6 is similar to that of Theorem 6.2 and is left to the reader.
THEOREM 6.7. Let $f : D \to \mathbb{R}$ and $x_0$ be a limit point of $D$.
\[ \lim_{x\to x_0} f(x) = L \iff \lim_{x\uparrow x_0} f(x) = L = \lim_{x\downarrow x_0} f(x) \]
PROOF. This proof is left as an exercise.
THEOREM 6.8. If $f : (a, b) \to \mathbb{R}$ is monotone, then both unilateral limits of $f$ exist at every point of $(a, b)$.
PROOF. To be specific, suppose $f$ is increasing and $x_0 \in (a, b)$. Let $\varepsilon > 0$ and $L = \operatorname{lub}\{f(x) : a < x < x_0\}$. According to Corollary 2.20, there must exist an $x \in (a, x_0)$ such that $L - \varepsilon < f(x) \le L$. Define $\delta = x_0 - x$. If $y \in (x_0 - \delta, x_0)$, then $L - \varepsilon < f(x) \le f(y) \le L$. This shows $\lim_{x\uparrow x_0} f(x) = L$.
The proof that $\lim_{x\downarrow x_0} f(x)$ exists is similar.
To handle the case when $f$ is decreasing, consider $-f$ instead of $f$.
3. Continuity
DEFINITION 6.9. Let $f : D \to \mathbb{R}$ and $x_0 \in D$. $f$ is continuous at $x_0$ if for every $\varepsilon > 0$ there exists a $\delta > 0$ such that when $x \in D$ with $|x - x_0| < \delta$, then $|f(x) - f(x_0)| < \varepsilon$. The set of all points at which $f$ is continuous is denoted $C(f)$.
FIGURE 6.6. The function $f$ is continuous at $x_0$, if given any $\varepsilon > 0$ there is a $\delta > 0$ such that the graph of $f$ does not cross the top or bottom of the dashed rectangle $(x_0-\delta, x_0+\delta) \times (f(x_0)-\varepsilon, f(x_0)+\varepsilon)$.
Several useful ways of rephrasing this are contained in the following theorem.
They are analogous to the similar statements made about limits. Proofs are left to
the reader.
T HEOREM 6.10. Let f : D → R and x 0 ∈ D. The following statements are
equivalent.
(a) x 0 ∈ C ( f ),
(b) For all ε > 0 there is a δ > 0 such that
x ∈ (x 0 − δ, x 0 + δ) ∩ D ⇒ f (x) ∈ ( f (x 0 ) − ε, f (x 0 ) + ε),
(c) For all ε > 0 there is a δ > 0 such that
f ((x 0 − δ, x 0 + δ) ∩ D) ⊂ ( f (x 0 ) − ε, f (x 0 ) + ε).
EXAMPLE 6.10. Define
\[ f(x) = \begin{cases} \dfrac{2x^2-8}{x-2}, & x \ne 2 \\ 8, & x = 2. \end{cases} \]
It follows from Example 6.3 that $2 \in C(f)$.
There is a subtle difference between the treatment of the domain of the
function in the definitions of limit and continuity. In the definition of limit, the
“target point,” x 0 is required to be a limit point of the domain, but not actually
be an element of the domain. In the definition of continuity, x 0 must be in
the domain of the function, but does not have to be a limit point. To see a
consequence of this difference, consider the following example.
EXAMPLE 6.11. If $f : \mathbb{Z} \to \mathbb{R}$ is an arbitrary function, then $C(f) = \mathbb{Z}$. To see this, let $n_0 \in \mathbb{Z}$, $\varepsilon > 0$ and $\delta = 1$. If $x \in \mathbb{Z}$ with $|x - n_0| < \delta$, then $x = n_0$. It follows that $|f(x) - f(n_0)| = 0 < \varepsilon$, so $f$ is continuous at $n_0$.
² Calculus books often use the notation $\lim_{x\uparrow x_0} f(x) = \lim_{x\to x_0-} f(x)$ and $\lim_{x\downarrow x_0} f(x) = \lim_{x\to x_0+} f(x)$.
This leads to the following theorem.
THEOREM 6.11. Let $f : D \to \mathbb{R}$ and $x_0 \in D$. If $x_0$ is a limit point of $D$, then $x_0 \in C(f)$ iff $\lim_{x\to x_0} f(x) = f(x_0)$. If $x_0$ is an isolated point of $D$, then $x_0 \in C(f)$.
PROOF. If $x_0$ is isolated in $D$, then there is a $\delta > 0$ such that $(x_0-\delta, x_0+\delta) \cap D = \{x_0\}$. For any $\varepsilon > 0$, the definition of continuity is satisfied with this $\delta$.
Next, suppose $x_0$ is a limit point of $D$.
The definition of continuity says that $f$ is continuous at $x_0$ iff for all $\varepsilon > 0$ there is a $\delta > 0$ such that when $x \in (x_0-\delta, x_0+\delta) \cap D$, then $f(x) \in (f(x_0)-\varepsilon, f(x_0)+\varepsilon)$.
The definition of limit says that $\lim_{x\to x_0} f(x) = f(x_0)$ iff for all $\varepsilon > 0$ there is a $\delta > 0$ such that when $x \in (x_0-\delta, x_0+\delta) \cap D \setminus \{x_0\}$, then $f(x) \in (f(x_0)-\varepsilon, f(x_0)+\varepsilon)$.
Comparing these two definitions, it is clear that $x_0 \in C(f)$ implies
\[ \lim_{x\to x_0} f(x) = f(x_0). \]
On the other hand, suppose $\lim_{x\to x_0} f(x) = f(x_0)$ and $\varepsilon > 0$. Choose $\delta$ according to the definition of limit. When $x \in (x_0-\delta, x_0+\delta) \cap D \setminus \{x_0\}$, then $f(x) \in (f(x_0)-\varepsilon, f(x_0)+\varepsilon)$. When $x = x_0$, then $|f(x) - f(x_0)| = |f(x_0) - f(x_0)| = 0 < \varepsilon$. Therefore, when $x \in (x_0-\delta, x_0+\delta) \cap D$, then $f(x) \in (f(x_0)-\varepsilon, f(x_0)+\varepsilon)$, and $x_0 \in C(f)$, as desired.
E XAMPLE 6.12. If f (x) = c, for some c ∈ R, then Example 6.1 and Theorem
6.11 show that f is continuous at every point.
E XAMPLE 6.13. If f (x) = x, then Example 6.2 and Theorem 6.11 show that f
is continuous at every point.
C OROLLARY 6.12. Let f : D → R and x 0 ∈ D. x 0 ∈ C ( f ) iff whenever x n is a
sequence from D with x n → x 0 , then f (x n ) → f (x 0 ).
P ROOF. Combining Theorem 6.11 with Theorem 6.2 shows this to be true.
EXAMPLE 6.14 (Dirichlet Function). Suppose
\[ f(x) = \begin{cases} 1, & x \in \mathbb{Q} \\ 0, & x \notin \mathbb{Q}. \end{cases} \]
For each $x \in \mathbb{Q}$, there is a sequence of irrational numbers converging to $x$, and for each $y \in \mathbb{Q}^c$ there is a sequence of rational numbers converging to $y$. Corollary 6.12 shows $C(f) = \emptyset$.
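EXAMPLE 6.15 (Salt and Pepper Function). Since $\mathbb{Q}$ is a countable set, it can be written as a sequence, $\mathbb{Q} = \{q_n : n \in \mathbb{N}\}$. Define
\[ f(x) = \begin{cases} 1/n, & x = q_n \\ 0, & x \in \mathbb{Q}^c. \end{cases} \]
If $x \in \mathbb{Q}$, then $x = q_n$ for some $n$ and $f(x) = 1/n > 0$. There is a sequence $x_k$ from $\mathbb{Q}^c$ such that $x_k \to x$ and $f(x_k) = 0$ for all $k$, so $f(x_k)$ does not converge to $f(x) = 1/n$. Therefore $C(f) \cap \mathbb{Q} = \emptyset$.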
On the other hand, let $x \in \mathbb{Q}^c$ and $\varepsilon > 0$. Choose $N \in \mathbb{N}$ large enough so that $1/N < \varepsilon$ and let $\delta = \min\{|x - q_n| : 1 \le n \le N\}$. If $|x - y| < \delta$, there are two cases to consider. If $y \in \mathbb{Q}^c$, then $|f(y) - f(x)| = |0 - 0| = 0 < \varepsilon$. If $y \in \mathbb{Q}$, then the choice of $\delta$ guarantees $y = q_n$ for some $n > N$. In this case, $|f(y) - f(x)| = f(y) = f(q_n) = 1/n < 1/N < \varepsilon$. Therefore, $x \in C(f)$.
This shows that $C(f) = \mathbb{Q}^c$.
It is a consequence of the Baire category theorem that there is no function f
such that C ( f ) = Q. Proving this would take us too far afield.
The following theorem is an almost immediate consequence of Theorem 6.4.
THEOREM 6.13. Let $f : D \to \mathbb{R}$ and $g : D \to \mathbb{R}$. If $x_0 \in C(f) \cap C(g)$, then
(a) $x_0 \in C(f+g)$,
(b) $x_0 \in C(\alpha f)$, $\forall \alpha \in \mathbb{R}$,
(c) $x_0 \in C(fg)$, and
(d) $x_0 \in C(f/g)$ when $g(x_0) \ne 0$.
C OROLLARY 6.14. If f is a rational function, then f is continuous at each point
of its domain.
P ROOF. This is a consequence of Examples 6.12 and 6.13 used with Theorem
6.13.
T HEOREM 6.15. Suppose f : D f → R and g : D g → R such that f (D f ) ⊂ D g . If
there is an x 0 ∈ C ( f ) such that f (x 0 ) ∈ C (g ), then x 0 ∈ C (g ◦ f ).
PROOF. Let $\varepsilon > 0$ and choose $\delta_1 > 0$ such that
\[ g\bigl((f(x_0)-\delta_1, f(x_0)+\delta_1) \cap D_g\bigr) \subset (g \circ f(x_0) - \varepsilon, g \circ f(x_0) + \varepsilon). \]
Choose $\delta_2 > 0$ such that
\[ f\bigl((x_0-\delta_2, x_0+\delta_2) \cap D_f\bigr) \subset (f(x_0)-\delta_1, f(x_0)+\delta_1). \]
Then
\[ g \circ f\bigl((x_0-\delta_2, x_0+\delta_2) \cap D_f\bigr) \subset g\bigl((f(x_0)-\delta_1, f(x_0)+\delta_1) \cap D_g\bigr) \subset (g \circ f(x_0) - \varepsilon, g \circ f(x_0) + \varepsilon). \]
Since this shows Theorem 6.10(c) is satisfied at $x_0$ with the function $g \circ f$, it follows that $x_0 \in C(g \circ f)$.
EXAMPLE 6.16. If $f(x) = \sqrt{x}$ for $x \ge 0$, then according to Problem 6.8, $C(f) = [0, \infty)$. Theorem 6.15 shows $f \circ f(x) = \sqrt[4]{x}$ is continuous on $[0, \infty)$.
In a similar way, it can be shown by induction that $f(x) = x^{m/2^n}$ is continuous on $[0, \infty)$ for all $m, n \in \mathbb{N}$.
4. Unilateral Continuity
DEFINITION 6.16. Let $f : D \to \mathbb{R}$ and $x_0 \in D$. $f$ is left-continuous at $x_0$ if for every $\varepsilon > 0$ there is a $\delta > 0$ such that $f((x_0-\delta, x_0] \cap D) \subset (f(x_0)-\varepsilon, f(x_0)+\varepsilon)$.
Let $f : D \to \mathbb{R}$ and $x_0 \in D$. $f$ is right-continuous at $x_0$ if for every $\varepsilon > 0$ there is a $\delta > 0$ such that $f([x_0, x_0+\delta) \cap D) \subset (f(x_0)-\varepsilon, f(x_0)+\varepsilon)$.
EXAMPLE 6.17. Let the floor function be
\[ \lfloor x \rfloor = \max\{n \in \mathbb{Z} : n \le x\} \]
and the ceiling function be
\[ \lceil x \rceil = \min\{n \in \mathbb{Z} : n \ge x\}. \]
The floor function is right-continuous, but not left-continuous at each integer, and the ceiling function is left-continuous, but not right-continuous at each integer.
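The one-sided behavior in Example 6.17 can be observed with Python's standard `math.floor` and `math.ceil`. The helper `values_near` and the offset `h` are assumptions of this sketch; sampling a single nearby point only illustrates the jump and proves nothing.

```python
import math

def values_near(g, n, h=1e-9):
    """Return g just left of n, at n, and just right of n."""
    return g(n - h), g(n), g(n + h)

# Floor agrees with its value at 3 on the right, but not on the left.
print(values_near(math.floor, 3))  # (2, 3, 3)

# Ceiling agrees with its value at 3 on the left, but not on the right.
print(values_near(math.ceil, 3))   # (3, 3, 4)
```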
THEOREM 6.17. Let $f : D \to \mathbb{R}$ and $x_0 \in D$. $x_0 \in C(f)$ iff $f$ is both right- and left-continuous at $x_0$.
PROOF. The proof of this theorem is left as an exercise.
According to Theorem 6.8, when $f$ is monotone on an interval $(a, b)$, the unilateral limits of $f$ exist at every point. In order for such a function to be continuous at $x_0 \in (a, b)$, it must be the case that
\[ \lim_{x\uparrow x_0} f(x) = f(x_0) = \lim_{x\downarrow x_0} f(x). \]
If either of the two equalities is violated, the function is not continuous at $x_0$.
In the case when $\lim_{x\uparrow x_0} f(x) \ne \lim_{x\downarrow x_0} f(x)$, it is said that a jump discontinuity occurs at $x_0$.
EXAMPLE 6.18. The function
\[ f(x) = \begin{cases} |x|/x, & x \ne 0 \\ 0, & x = 0 \end{cases} \]
has a jump discontinuity at $x = 0$.
In the case when $\lim_{x\uparrow x_0} f(x) = \lim_{x\downarrow x_0} f(x) \ne f(x_0)$, it is said that $f$ has a removable discontinuity at $x_0$. The discontinuity is called “removable” because in this case, the function can be made continuous at $x_0$ by merely redefining its value at the single point, $x_0$, to be the value of the two one-sided limits.
EXAMPLE 6.19. The function $f(x) = \frac{x^2-4}{x-2}$ is not continuous at $x = 2$ because 2 is not in the domain of $f$. Since $\lim_{x\to 2} f(x) = 4$, if the domain of $f$ is extended to include 2 by setting $f(2) = 4$, then this extended $f$ is continuous everywhere. (See Figure 6.7.)
T HEOREM 6.18. If f : (a, b) → R is monotone, then (a, b) \C ( f ) is countable.
PROOF. In light of the discussion above and Theorem 6.8, it is apparent that the only types of discontinuities $f$ can have are jump discontinuities.
To be specific, suppose $f$ is increasing and $x_0, y_0 \in (a, b) \setminus C(f)$ with $x_0 < y_0$. In this case, the fact that $f$ is increasing implies
\[ \lim_{x\uparrow x_0} f(x) < \lim_{x\downarrow x_0} f(x) \le \lim_{x\uparrow y_0} f(x) < \lim_{x\downarrow y_0} f(x). \]
This implies that for any two distinct $x_0, y_0 \in (a, b) \setminus C(f)$, there are disjoint open intervals, $I_{x_0} = (\lim_{x\uparrow x_0} f(x), \lim_{x\downarrow x_0} f(x))$ and $I_{y_0} = (\lim_{x\uparrow y_0} f(x), \lim_{x\downarrow y_0} f(x))$. For each $x \in (a, b) \setminus C(f)$, choose $q_x \in I_x \cap \mathbb{Q}$. Because of the pairwise disjointness of the intervals $\{I_x : x \in (a, b) \setminus C(f)\}$, this defines a bijection between $(a, b) \setminus C(f)$ and a subset of $\mathbb{Q}$. Therefore, $(a, b) \setminus C(f)$ must be countable.
A similar argument holds for a decreasing function.
FIGURE 6.7. The function $f(x) = \frac{x^2-4}{x-2}$ from Example 6.19. Note that the graph is a line with one “hole” in it. Plugging up the hole removes the discontinuity.
Theorem 6.18 implies that a monotone function is continuous at “nearly every” point in its domain. Characterizing the points of discontinuity as countable
is the best that can be hoped for, as seen in the following example.
EXAMPLE 6.20. Let $D = \{d_n : n \in \mathbb{N}\}$ be a countable set and define $J_x = \{n : d_n < x\}$. The function
\[ f(x) = \begin{cases} 0, & J_x = \emptyset \\ \sum_{n \in J_x} \dfrac{1}{2^n}, & J_x \ne \emptyset \end{cases} \tag{6.1} \]
is increasing and $C(f) = D^c$. The proof of this statement is left as Exercise 6.9.
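To get a feel for formula (6.1), it can be evaluated when $D$ is replaced by a short finite list. This is a simplification assumed here for illustration: the list `d` is hypothetical, and a finite list only produces finitely many jumps.

```python
def jump_function(d):
    """Return f(x) = sum of 2**-n over all n with d_n < x, where d_n = d[n-1]."""
    def f(x):
        return sum(2.0 ** -n for n, dn in enumerate(d, start=1) if dn < x)
    return f

d = [0.5, 0.25, 0.75]  # hypothetical finite stand-in for the countable set D
f = jump_function(d)

for x in (0.25, 0.3, 0.5, 0.6, 0.75, 0.8):
    print(x, f(x))
```

The value changes only when $x$ passes some $d_n$, where it increases by $2^{-n}$. Because $J_x$ uses the strict inequality $d_n < x$, the jump shows up immediately to the right of $d_n$.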
5. Continuous Functions
Up until now, continuity has been considered as a property of a function at a
point. There is much that can be said about functions continuous everywhere.
D EFINITION 6.19. Let f : D → R and A ⊂ D. We say f is continuous on A if
A ⊂ C ( f ). If D = C ( f ), then f is continuous.
Continuity at a point is, in a sense, a metric property of a function because
it measures relative distances between points in the domain and image sets.
Continuity on a set becomes more of a topological property, as shown by the next
few theorems.
T HEOREM 6.20. f : D → R is continuous iff whenever G is open in R, then
f −1 (G) is relatively open in D.
PROOF. (⇒) Assume $f$ is continuous on $D$ and let $G$ be open in $\mathbb{R}$. Let $x \in f^{-1}(G)$ and choose $\varepsilon > 0$ such that $(f(x)-\varepsilon, f(x)+\varepsilon) \subset G$. Using the continuity of $f$ at $x$, we can find a $\delta > 0$ such that $f((x-\delta, x+\delta) \cap D) \subset (f(x)-\varepsilon, f(x)+\varepsilon) \subset G$. This implies that $(x-\delta, x+\delta) \cap D \subset f^{-1}(G)$. Because $x$ was an arbitrary element of $f^{-1}(G)$, it follows that $f^{-1}(G)$ is relatively open in $D$.
(⇐) Choose $x \in D$ and let $\varepsilon > 0$. By assumption, the set $f^{-1}((f(x)-\varepsilon, f(x)+\varepsilon))$ is relatively open in $D$. This implies the existence of a $\delta > 0$ such that $(x-\delta, x+\delta) \cap D \subset f^{-1}((f(x)-\varepsilon, f(x)+\varepsilon))$. It follows from this that $f((x-\delta, x+\delta) \cap D) \subset (f(x)-\varepsilon, f(x)+\varepsilon)$, and $x \in C(f)$.
A function as simple as any constant function demonstrates that $f(G)$ need not be open when $G$ is open. Defining $f : [0, \infty) \to \mathbb{R}$ by $f(x) = \sin x \tan^{-1} x$ shows that the image of a closed set need not be closed because $f([0, \infty)) = (-\pi/2, \pi/2)$.
T HEOREM 6.21. If f is continuous on a compact set K , then f (K ) is compact.
PROOF. Let $\mathcal{O}$ be an open cover of $f(K)$ and $\mathcal{I} = \{f^{-1}(G) : G \in \mathcal{O}\}$. By Theorem 6.20, $\mathcal{I}$ is a collection of sets which are relatively open in $K$. Since $\mathcal{I}$ covers $K$, $\mathcal{I}$ is an open cover of $K$. Using the fact that $K$ is compact, we can choose a finite subcover of $K$ from $\mathcal{I}$, say $\{G_1, G_2, \dots, G_n\}$. There are $\{H_1, H_2, \dots, H_n\} \subset \mathcal{O}$ such that $f^{-1}(H_k) = G_k$ for $1 \le k \le n$. Then
\[ f(K) \subset f\Bigl(\bigcup_{1 \le k \le n} G_k\Bigr) \subset \bigcup_{1 \le k \le n} H_k. \]
Thus, $\{H_1, H_2, \dots, H_n\}$ is a subcover of $f(K)$ from $\mathcal{O}$.
Several of the standard calculus theorems giving properties of continuous functions are consequences of Theorem 6.21. In a calculus course, $K$ is usually a compact interval, $[a, b]$.
COROLLARY 6.22. If $f : K \to \mathbb{R}$ is continuous and $K$ is compact, then $f$ is bounded.
PROOF. By Theorem 6.21, $f(K)$ is compact. Now, use the Bolzano-Weierstrass theorem to conclude $f$ is bounded.
COROLLARY 6.23 (Maximum Value Theorem). If $f : K \to \mathbb{R}$ is continuous and $K$ is compact, then there are $m, M \in K$ such that $f(m) \le f(x) \le f(M)$ for all $x \in K$.
PROOF. According to Theorem 6.21 and the Bolzano-Weierstrass theorem, $f(K)$ is closed and bounded. Because of this, $\operatorname{glb} f(K) \in f(K)$ and $\operatorname{lub} f(K) \in f(K)$. It suffices to choose $m \in f^{-1}(\operatorname{glb} f(K))$ and $M \in f^{-1}(\operatorname{lub} f(K))$.
C OROLLARY 6.24. If f : K → R is continuous and invertible and K is compact,
then f −1 : f (K ) → K is continuous.
P ROOF. Let G be open in K . According to Theorem 6.20, it suffices to show
f (G) is open in f (K ).
To do this, note that K \G is compact, so by Theorem 6.21, f (K \G) is compact,
and therefore closed. Because f is injective, f (G) = f (K ) \ f (K \ G). This shows
f (G) is open in f (K ).
T HEOREM 6.25. If f is continuous on an interval I , then f (I ) is an interval.
PROOF. If $f(I)$ is not connected, there must exist two disjoint open sets, $U$ and $V$, such that $f(I) \subset U \cup V$ and $f(I) \cap U \ne \emptyset \ne f(I) \cap V$. In this case, Theorem 6.20 implies $f^{-1}(U)$ and $f^{-1}(V)$ are both relatively open in $I$. They are clearly disjoint and $f^{-1}(U) \cap I \ne \emptyset \ne f^{-1}(V) \cap I$. But, this implies $f^{-1}(U)$ and $f^{-1}(V)$ disconnect $I$, which is a contradiction. Therefore, $f(I)$ is connected.
C OROLLARY 6.26 (Intermediate Value Theorem). If f : [a, b] → R is continuous
and α is between f (a) and f (b), then there is a c ∈ [a, b] such that f (c) = α.
P ROOF. This is an easy consequence of Theorem 6.25 and Theorem 5.14.
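Corollary 6.26 is an existence statement and gives no way to find $c$. Bisection is a standard numerical technique, not taken from these notes, for approximating such a $c$. The sketch below uses a hypothetical function `bisect_value` and assumes that $a < b$, that $f$ is continuous on $[a, b]$, and that $\alpha$ lies between $f(a)$ and $f(b)$.

```python
def bisect_value(f, a, b, alpha, tol=1e-12):
    """Approximate some c in [a, b] with f(c) = alpha by repeated halving."""
    fa = f(a) - alpha
    fb = f(b) - alpha
    if fa == 0:
        return a
    if fb == 0:
        return b
    if fa * fb > 0:
        raise ValueError("alpha must lie between f(a) and f(b)")
    while b - a > tol:
        c = (a + b) / 2
        fc = f(c) - alpha
        if fc == 0:
            return c
        if fa * fc < 0:
            b = c            # the sign change is in [a, c]
        else:
            a, fa = c, fc    # the sign change is in [c, b]
    return (a + b) / 2

# Example: f(x) = x^2 on [1, 2] takes the value 2 at sqrt(2).
c = bisect_value(lambda x: x * x, 1.0, 2.0, 2.0)
print(c)
```

Each step keeps an interval on which $f - \alpha$ changes sign, so the Intermediate Value Theorem guarantees that a solution remains inside it.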
D EFINITION 6.27. A function f : D → R has the Darboux property if whenever
a, b ∈ D and γ is between f (a) and f (b), then there is a c between a and b such
that f (c) = γ.
Calculus texts usually call the Darboux property the intermediate value property. Corollary 6.26 shows that a function continuous on an interval has the
Darboux property. The next example shows continuity is not necessary for the
Darboux property to hold.
EXAMPLE 6.21. The function
\[ f(x) = \begin{cases} \sin(1/x), & x \ne 0 \\ 0, & x = 0 \end{cases} \]
is not continuous, but does have the Darboux property. (See Figure 6.4.) It can be seen from Example 6.6 that $0 \notin C(f)$.
To see $f$ has the Darboux property, choose two numbers $a < b$.
If $a > 0$ or $b < 0$, then $f$ is continuous on $[a, b]$ and Corollary 6.26 suffices to finish the proof.
On the other hand, if $0 \in [a, b]$, then there must exist an $n \in \mathbb{Z}$ such that both $\frac{2}{(4n+1)\pi}, \frac{2}{(4n+3)\pi} \in [a, b]$. Since $f\bigl(\frac{2}{(4n+1)\pi}\bigr) = 1$, $f\bigl(\frac{2}{(4n+3)\pi}\bigr) = -1$ and $f$ is continuous on the interval between them, we see $f([a, b]) = [-1, 1]$, which is the entire range of $f$. The claim now follows.
6. Uniform Continuity
Most of the ideas contained in this section will not be needed until we begin
developing the properties of the integral in Chapter 8.
DEFINITION 6.28. A function $f : D \to \mathbb{R}$ is uniformly continuous if for all $\varepsilon > 0$ there is a $\delta > 0$ such that when $x, y \in D$ with $|x - y| < \delta$, then $|f(x) - f(y)| < \varepsilon$.
The idea here is that in the ordinary definition of continuity, the $\delta$ in the definition depends on both $\varepsilon$ and the $x$ at which continuity is being tested; i.e., $\delta$ is really a function of both $\varepsilon$ and $x$. With uniform continuity, $\delta$ only depends on $\varepsilon$; i.e., $\delta$ is only a function of $\varepsilon$, and the same $\delta$ works across the whole domain.
T HEOREM 6.29. If f : D → R is uniformly continuous, then it is continuous.
P ROOF. This proof is left as Exercise 6.30.
The converse is not true.
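EXAMPLE 6.22. Let $f(x) = 1/x$ on $D = (0, 1)$ and $\varepsilon > 0$. It’s clear that $f$ is continuous on $D$. Let $\delta > 0$ and choose $m, n \in \mathbb{N}$ such that $m > 1/\delta$ and $n - m > \varepsilon$. If $x = 1/m$ and $y = 1/n$, then $0 < y < x < \delta$, so $|x - y| < \delta$, and $f(y) - f(x) = n - m > \varepsilon$. Since no $\delta$ works for this $\varepsilon$, $f$ is not uniformly continuous.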
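The construction in Example 6.22 is explicit enough to compute. In the sketch below, the function name `witnesses` and the particular formulas for $m$ and $n$ are choices made here; any integers with $m > 1/\delta$ and $n - m > \varepsilon$ would do.

```python
import math

def witnesses(delta, eps):
    """Return (x, y) in (0, 1) with |x - y| < delta but 1/y - 1/x > eps."""
    m = math.floor(1 / delta) + 2   # ensures m > 1/delta (and m >= 2)
    n = m + math.floor(eps) + 1     # ensures n - m > eps
    return 1 / m, 1 / n

for delta in (0.5, 1e-3, 1e-6):
    x, y = witnesses(delta, eps=1.0)
    print(delta, x, y, 1 / y - 1 / x)
```

No matter how small δ is made, the two points are closer than δ while their images differ by more than ε.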
T HEOREM 6.30. If f : D → R is continuous and D is compact, then f is uniformly continuous.
PROOF. Suppose $f$ is not uniformly continuous. Then there is an $\varepsilon > 0$ such that for every $n \in \mathbb{N}$ there are $x_n, y_n \in D$ with $|x_n - y_n| < 1/n$ and $|f(x_n) - f(y_n)| \ge \varepsilon$. An application of the Bolzano-Weierstrass theorem yields a subsequence $x_{n_k}$ of $x_n$ such that $x_{n_k} \to x_0 \in D$.
Since $f$ is continuous at $x_0$, there is a $\delta > 0$ such that whenever $x \in (x_0-\delta, x_0+\delta) \cap D$, then $|f(x) - f(x_0)| < \varepsilon/2$. Choose $n_k \in \mathbb{N}$ such that $1/n_k < \delta/2$ and $x_{n_k} \in (x_0 - \delta/2, x_0 + \delta/2)$. Then both $x_{n_k}, y_{n_k} \in (x_0-\delta, x_0+\delta)$ and
\[ \varepsilon \le |f(x_{n_k}) - f(y_{n_k})| = |f(x_{n_k}) - f(x_0) + f(x_0) - f(y_{n_k})| \le |f(x_{n_k}) - f(x_0)| + |f(x_0) - f(y_{n_k})| < \varepsilon/2 + \varepsilon/2 = \varepsilon, \]
which is a contradiction.
Therefore, $f$ must be uniformly continuous.
The following corollary is an immediate consequence of Theorem 6.30.
C OROLLARY 6.31. If f : [a, b] → R is continuous, then f is uniformly continuous.
THEOREM 6.32. Let $D \subset \mathbb{R}$ and $f : D \to \mathbb{R}$. If $f$ is uniformly continuous and $x_n$ is a Cauchy sequence from $D$, then $f(x_n)$ is a Cauchy sequence.
P ROOF. The proof is left as Exercise 6.37.
Uniform continuity is necessary in Theorem 6.32. To see this, let f : (0, 1) → R
be f (x) = 1/x and x n = 1/n. Then x n is a Cauchy sequence, but f (x n ) = n is not.
This idea is explored in Exercise 6.32.
It’s instructive to think about the converse to Theorem 6.32. Let f (x) = x 2 ,
defined on all of R. Since f is continuous everywhere, Corollary 6.12 shows f
maps Cauchy sequences to Cauchy sequences. On the other hand, in Exercise
6.36, it is shown that f is not uniformly continuous. Therefore, the converse to
Theorem 6.32 is false. Those functions mapping Cauchy sequences to Cauchy
sequences are sometimes said to be Cauchy continuous. The converse to Theorem
6.32 can be tweaked to get a true statement.
T HEOREM 6.33. Let f : D → R where D is bounded. If f is Cauchy continuous,
then f is uniformly continuous.
PROOF. Suppose $f$ is not uniformly continuous. Then there is an $\varepsilon > 0$ and sequences $x_n$ and $y_n$ from $D$ such that $|x_n - y_n| < 1/n$ and $|f(x_n) - f(y_n)| \ge \varepsilon$. Since $D$ is bounded, the sequence $x_n$ is bounded and the Bolzano-Weierstrass theorem gives a Cauchy subsequence, $x_{n_k}$. The new sequence
\[ z_k = \begin{cases} x_{n_{(k+1)/2}}, & k \text{ odd} \\ y_{n_{k/2}}, & k \text{ even} \end{cases} \]
is easily shown to be a Cauchy sequence. But, $f(z_k)$ is not a Cauchy sequence, since $|f(z_k) - f(z_{k+1})| \ge \varepsilon$ for all odd $k$. This contradicts the fact that $f$ is Cauchy continuous. We’re forced to conclude the assumption that $f$ is not uniformly continuous is false.
7. Exercises
6.1. Prove $\lim_{x\to -2} (x^2 + 3x) = -2$.
6.2. Give examples of functions $f$ and $g$ such that neither function has a limit at $a$, but $f + g$ does. Do the same for $fg$.
6.3. Let $f : D \to \mathbb{R}$ and $a$ be a limit point of $D$.
\[ \lim_{x\to a} f(x) = L \iff \lim_{x\uparrow a} f(x) = \lim_{x\downarrow a} f(x) = L \]
6.4. Find two functions defined on $\mathbb{R}$ such that
\[ 0 = \lim_{x\to 0} \bigl(f(x) + g(x)\bigr) \ne \lim_{x\to 0} f(x) + \lim_{x\to 0} g(x). \]
6.5. If $\lim_{x\to a} f(x) = L > 0$, then there is a $\delta > 0$ such that $f(x) > 0$ when $0 < |x - a| < \delta$.
6.6. If $\mathbb{Q} = \{q_n : n \in \mathbb{N}\}$ is an enumeration of the rational numbers and
\[ f(x) = \begin{cases} 1/n, & x = q_n \\ 0, & x \in \mathbb{Q}^c \end{cases} \]
then $\lim_{x\to a} f(x) = 0$, for all $a \in \mathbb{Q}^c$.
6.7. Use the definition of continuity to show $f(x) = x^2$ is continuous everywhere on $\mathbb{R}$.
6.8. Prove that $f(x) = \sqrt{x}$ is continuous on $[0, \infty)$.
6.9. If $f$ is defined as in (6.1), then $D = C(f)^c$.
6.10. If $f : \mathbb{R} \to \mathbb{R}$ is monotone, then there is a countable set $D$ such that the values of $f$ can be altered on $D$ in such a way that the altered function is left-continuous at every point of $\mathbb{R}$.
6.11. Does there exist an increasing function f : R → R such that C ( f ) = Q?
6.12. If $f : \mathbb{R} \to \mathbb{R}$ and there is an $\alpha > 0$ such that $|f(x) - f(y)| \le \alpha |x - y|$ for all $x, y \in \mathbb{R}$, then show that $f$ is continuous.
6.13. Suppose f and g are each defined on an open interval I , a ∈ I and a ∈
C ( f ) ∩ C (g ). If f (a) > g (a), then there is an open interval J such that f (x) > g (x)
for all x ∈ J .
6.14. If f , g : (a, b) → R are continuous, then G = {x : f (x) < g (x)} is open.
6.15. If f : R → R and a ∈ C ( f ) with f (a) > 0, then there is a neighborhood G of a
such that f (G) ⊂ (0, ∞).
6.16. Let f and g be two functions which are continuous on a set D ⊂ R. Prove
or give a counter example: {x ∈ D : f (x) > g (x)} is relatively open in D.
6.17. If f , g : R → R are functions such that f (x) = g (x) for all x ∈ Q and C ( f ) =
C (g ) = R, then f = g .
6.18. Let I = [a, b]. If f : I → I is continuous, then there is a c ∈ I such that
f (c) = c.
6.19. Find an example to show the conclusion of Problem 6.18 fails if I = (a, b).
6.20. If f and g are both continuous on [a, b], then {x : f (x) ≤ g (x)} is compact.
6.21. If f : [a, b] → R is continuous, not constant,
m = glb { f (x) : a ≤ x ≤ b} and M = lub { f (x) : a ≤ x ≤ b},
then f ([a, b]) = [m, M ].
6.22. Suppose f : R → R is a function such that every interval has points at which
f is negative and points at which f is positive. Prove that every interval has points
where f is not continuous.
6.23. If f : [a, b] → R has a limit at every point, then f is bounded. Is this true for
f : (a, b) → R?
6.24. Give an example of a bounded function f : R → R with a limit at no point.
6.25. If $f : \mathbb{R} \to \mathbb{R}$ is continuous and periodic, then there are $x_m, x_M \in \mathbb{R}$ such that $f(x_m) \le f(x) \le f(x_M)$ for all $x \in \mathbb{R}$. (A function $f$ is periodic, if there is a $p > 0$ such that $f(x + p) = f(x)$ for all $x \in \mathbb{R}$. The least such $p$ is called the period of $f$.)
6.26. A set S ⊂ R is disconnected iff there is a continuous f : S → R such that
f (S) = {0, 1}.
6.27. If f : R → R satisfies f (x + y) = f (x) + f (y) for all x and y and 0 ∈ C ( f ), then
f is continuous.
6.28. Assume that f : R → R is such that f (x + y) = f (x) f (y) for all x, y ∈ R. If f
has a limit at zero, prove that either limx→0 f (x) = 1 or f (x) = 0 for all x ∈ R \ {0}.
6.29. If F ⊂ R is closed, then there is an f : R → R such that F = C ( f )c .
6.30. If f : [a, b] → R is uniformly continuous, then f is continuous.
6.31. A function f : R → R is periodic with period p > 0, if f (x + p) = f (x) for all x.
If f : R → R is periodic with period p and continuous on [0, p], then f is uniformly
continuous.
6.32. Prove that an unbounded function on a bounded open interval cannot be
uniformly continuous.
6.33. If f : D → R is uniformly continuous on a bounded set D, then f is
bounded.
6.34. Prove Theorem 6.29.
6.35. Every polynomial of odd degree has a root.
6.36. Show f (x) = x 2 , with domain R, is not uniformly continuous.
6.37. Prove Theorem 6.32.
CHAPTER 7
Differentiation
1. The Derivative at a Point
DEFINITION 7.1. Let $f$ be a function defined on a neighborhood of $x_0$. $f$ is differentiable at $x_0$, if the following limit exists:
\[ f'(x_0) = \lim_{h\to 0} \frac{f(x_0+h) - f(x_0)}{h}. \]
Define $D(f) = \{x : f'(x) \text{ exists}\}$.
The standard notations for the derivative will be used; e.g., $f'(x)$, $\frac{df(x)}{dx}$, $Df(x)$, etc.
An equivalent way of stating this definition is to note that if $x_0 \in D(f)$, then
\[ f'(x_0) = \lim_{x\to x_0} \frac{f(x) - f(x_0)}{x - x_0}. \]
(See Figure 7.1.)
This can be interpreted in the standard way as the limiting slope of the secant line as the points of intersection approach each other.
EXAMPLE 7.1. If $f(x) = c$ for all $x$ and some $c \in \mathbb{R}$, then
\[ \lim_{h\to 0} \frac{f(x_0+h) - f(x_0)}{h} = \lim_{h\to 0} \frac{c - c}{h} = 0. \]
So, $f'(x) = 0$ everywhere.
EXAMPLE 7.2. If $f(x) = x$, then
\[ \lim_{h\to 0} \frac{f(x_0+h) - f(x_0)}{h} = \lim_{h\to 0} \frac{x_0 + h - x_0}{h} = \lim_{h\to 0} \frac{h}{h} = 1. \]
So, $f'(x) = 1$ everywhere.
THEOREM 7.2. For any function $f$, $D(f) \subset C(f)$.
PROOF. Suppose $x_0 \in D(f)$. Then
\[ \lim_{x\to x_0} \bigl(f(x) - f(x_0)\bigr) = \lim_{x\to x_0} \frac{f(x) - f(x_0)}{x - x_0}\,(x - x_0) = f'(x_0) \cdot 0 = 0. \]
This shows $\lim_{x\to x_0} f(x) = f(x_0)$, and $x_0 \in C(f)$.
FIGURE 7.1. These graphs illustrate that the two standard ways of writing the difference quotient are equivalent.
Of course, the converse of Theorem 7.2 is not true.
EXAMPLE 7.3. The function $f(x) = |x|$ is continuous on $\mathbb{R}$, but
\[ \lim_{h\downarrow 0} \frac{f(0+h) - f(0)}{h} = 1 = -\lim_{h\uparrow 0} \frac{f(0+h) - f(0)}{h}, \]
so $f'(0)$ fails to exist.
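The two one-sided difference quotients in Example 7.3 can be tabulated directly. The helper name `difference_quotient` is an assumption of this sketch.

```python
def difference_quotient(f, x0, h):
    # The quotient from Definition 7.1, evaluated at one nonzero h.
    return (f(x0 + h) - f(x0)) / h

# h decreasing to 0 from the right, and increasing to 0 from the left.
right = [difference_quotient(abs, 0.0, 10.0 ** -k) for k in range(1, 8)]
left = [difference_quotient(abs, 0.0, -(10.0 ** -k)) for k in range(1, 8)]

print(right)
print(left)
```

The quotient is constant on each side of 0, with values 1 and −1, so by Theorem 6.7 the two-sided limit defining $f'(0)$ cannot exist.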
Theorem 7.2 and Example 7.3 show that differentiability is a strictly stronger condition than continuity. For a long time most mathematicians believed that every continuous function must certainly be differentiable at some point. In the nineteenth century, several researchers, most notably Bolzano and Weierstrass, presented examples of functions continuous everywhere and differentiable nowhere.² It has since been proved that, in a technical sense, the “typical” continuous function is nowhere differentiable [4]. So, contrary to the impression left by many beginning calculus courses, differentiability is the exception rather than the rule, even for continuous functions.
2. Differentiation Rules
Following are the standard rules for differentiation learned by every calculus
student.
THEOREM 7.3. Suppose $f$ and $g$ are functions such that $x_0 \in D(f) \cap D(g)$.
(a) $x_0 \in D(f+g)$ and $(f+g)'(x_0) = f'(x_0) + g'(x_0)$.
(b) If $a \in \mathbb{R}$, then $x_0 \in D(af)$ and $(af)'(x_0) = a f'(x_0)$.
(c) $x_0 \in D(fg)$ and $(fg)'(x_0) = f'(x_0) g(x_0) + f(x_0) g'(x_0)$.
2Bolzano presented his example in 1834, but it was little noticed. The 1872 example of
Weierstrass is more wellknown [2]. A translation of Weierstrass’ original paper [21] is presented
by Edgar [11]. Weierstrass’ example is not very transparent because it depends on trigonometric
series. Many more elementary constructions have since been made. One such will be presented in
Example 9.9.
August 4, 2017
http://math.louisville.edu/∼lee/ira
2. DIFFERENTIATION RULES
73
(d) If g (x 0 ) = 0, then x 0 ∈ D( f /g ) and
f
g
(x 0 ) =
f (x 0 )g (x 0 ) − f (x 0 )g (x 0 )
.
(g (x 0 ))2
PROOF. (a)
\[ \lim_{h\to 0} \frac{(f+g)(x_0+h) - (f+g)(x_0)}{h} = \lim_{h\to 0} \frac{f(x_0+h) + g(x_0+h) - f(x_0) - g(x_0)}{h} \]
\[ = \lim_{h\to 0} \left( \frac{f(x_0+h) - f(x_0)}{h} + \frac{g(x_0+h) - g(x_0)}{h} \right) = f'(x_0) + g'(x_0) \]
(b)
\[ \lim_{h\to 0} \frac{(af)(x_0+h) - (af)(x_0)}{h} = a \lim_{h\to 0} \frac{f(x_0+h) - f(x_0)}{h} = a f'(x_0) \]
(c)
\[ \lim_{h\to 0} \frac{(fg)(x_0+h) - (fg)(x_0)}{h} = \lim_{h\to 0} \frac{f(x_0+h)g(x_0+h) - f(x_0)g(x_0)}{h} \]
Now, "slip a 0" into the numerator and factor the fraction.
\[ = \lim_{h\to 0} \frac{f(x_0+h)g(x_0+h) - f(x_0)g(x_0+h) + f(x_0)g(x_0+h) - f(x_0)g(x_0)}{h} \]
\[ = \lim_{h\to 0} \left( \frac{f(x_0+h) - f(x_0)}{h}\, g(x_0+h) + f(x_0)\, \frac{g(x_0+h) - g(x_0)}{h} \right) \]
Finally, use the definition of the derivative and the continuity of $f$ and $g$ at $x_0$.
\[ = f'(x_0)g(x_0) + f(x_0)g'(x_0) \]
(d) It will be proved that if $g(x_0) \ne 0$, then $(1/g)'(x_0) = -g'(x_0)/(g(x_0))^2$. This
statement, combined with (c), yields (d).
\[ \lim_{h\to 0} \frac{(1/g)(x_0+h) - (1/g)(x_0)}{h} = \lim_{h\to 0} \frac{\dfrac{1}{g(x_0+h)} - \dfrac{1}{g(x_0)}}{h} \]
\[ = \lim_{h\to 0} \frac{g(x_0) - g(x_0+h)}{h} \cdot \frac{1}{g(x_0+h)g(x_0)} = -\frac{g'(x_0)}{(g(x_0))^2} \]
Plug this into (c) to see
\[ \left(\frac{f}{g}\right)'(x_0) = \left(f \cdot \frac{1}{g}\right)'(x_0) = f'(x_0)\,\frac{1}{g(x_0)} + f(x_0)\,\frac{-g'(x_0)}{(g(x_0))^2} = \frac{f'(x_0)g(x_0) - f(x_0)g'(x_0)}{(g(x_0))^2}. \]
Combining Examples 7.1 and 7.2 with Theorem 7.3, the following corollary is
easy to prove.
COROLLARY 7.4. A rational function is differentiable at every point of its domain.
THEOREM 7.5 (Chain Rule). If $f$ and $g$ are functions such that $x_0 \in D(f)$ and
$f(x_0) \in D(g)$, then $x_0 \in D(g \circ f)$ and $(g \circ f)'(x_0) = g' \circ f(x_0)\, f'(x_0)$.
PROOF. Let $y_0 = f(x_0)$. By assumption, there is an open interval $J$ containing
$f(x_0)$ such that $g$ is defined on $J$. Since $J$ is open and $x_0 \in C(f)$, there is an open
interval $I$ containing $x_0$ such that $f(I) \subset J$.
Define $h : J \to \mathbb{R}$ by
\[ h(y) = \begin{cases} \dfrac{g(y) - g(y_0)}{y - y_0} - g'(y_0), & y \ne y_0 \\ 0, & y = y_0 \end{cases}. \]
Since $y_0 \in D(g)$, we see
\[ \lim_{y\to y_0} h(y) = \lim_{y\to y_0} \frac{g(y) - g(y_0)}{y - y_0} - g'(y_0) = g'(y_0) - g'(y_0) = 0 = h(y_0), \]
so $y_0 \in C(h)$. Now, $x_0 \in C(f)$ and $f(x_0) = y_0 \in C(h)$, so Theorem 6.15 implies
$x_0 \in C(h \circ f)$. In particular
\[ \lim_{x\to x_0} h \circ f(x) = 0. \tag{7.1} \]
From the definition of $h \circ f$ for $x \in I$ with $f(x) \ne f(x_0)$, we can solve for
\[ g \circ f(x) - g \circ f(x_0) = \bigl(h \circ f(x) + g' \circ f(x_0)\bigr)\bigl(f(x) - f(x_0)\bigr). \tag{7.2} \]
Notice that (7.2) is also true when $f(x) = f(x_0)$. Divide both sides of (7.2) by $x - x_0$,
and use (7.1) to obtain
\[ \lim_{x\to x_0} \frac{g \circ f(x) - g \circ f(x_0)}{x - x_0} = \lim_{x\to x_0} \bigl(h \circ f(x) + g' \circ f(x_0)\bigr) \frac{f(x) - f(x_0)}{x - x_0} \]
\[ = \bigl(0 + g' \circ f(x_0)\bigr) f'(x_0) = g' \circ f(x_0)\, f'(x_0). \]
THEOREM 7.6. Suppose $f : [a,b] \to [c,d]$ is continuous and invertible. If $x_0 \in D(f)$
and $f'(x_0) \ne 0$ for some $x_0 \in (a,b)$, then $f(x_0) \in D(f^{-1})$ and
$(f^{-1})'(f(x_0)) = 1/f'(x_0)$.
PROOF. Let $y_0 = f(x_0)$ and suppose $y_n$ is any sequence in $f([a,b]) \setminus \{y_0\}$ converging to $y_0$ and $x_n = f^{-1}(y_n)$. By Theorem 6.24, $f^{-1}$ is continuous, so
\[ x_0 = f^{-1}(y_0) = \lim_{n\to\infty} f^{-1}(y_n) = \lim_{n\to\infty} x_n. \]
Therefore,
\[ \lim_{n\to\infty} \frac{f^{-1}(y_n) - f^{-1}(y_0)}{y_n - y_0} = \lim_{n\to\infty} \frac{x_n - x_0}{f(x_n) - f(x_0)} = \frac{1}{f'(x_0)}. \]
EXAMPLE 7.4. It follows easily from Theorem 7.3 that $f(x) = x^3$ is differentiable everywhere with $f'(x) = 3x^2$. Define $g(x) = \sqrt[3]{x}$. Then $g(x) = f^{-1}(x)$.
Suppose $g(y_0) = x_0$ for some $y_0 \in \mathbb{R}$. According to Theorem 7.6,
\[ g'(y_0) = \frac{1}{f'(x_0)} = \frac{1}{3x_0^2} = \frac{1}{3(g(y_0))^2} = \frac{1}{3(\sqrt[3]{y_0})^2} = \frac{1}{3y_0^{2/3}}. \tag{7.3} \]
If $h(x) = x^{2/3}$, then $h(x) = g(x)^2$, so (7.3) and the Chain Rule show
\[ h'(x) = \frac{2}{3\sqrt[3]{x}}, \quad x \ne 0, \]
as expected.
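Formula (7.3) can be sanity-checked numerically. The sketch below compares the value predicted by Theorem 7.6 with a central difference estimate of the slope of the cube root; the helper names `cbrt` and `central_difference`, the step size and the sample points are all illustrative assumptions.

```python
import math

def cbrt(y):
    """Real cube root, valid for negative y as well."""
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

def central_difference(func, x, h=1e-6):
    """Symmetric difference quotient; only an estimate of the derivative."""
    return (func(x + h) - func(x - h)) / (2.0 * h)

results = []
for y0 in (-8.0, 0.5, 27.0):
    x0 = cbrt(y0)                        # x0 = g(y0), so f(x0) = y0
    predicted = 1.0 / (3.0 * x0 ** 2)    # 1 / f'(x0), as in (7.3)
    estimated = central_difference(cbrt, y0)
    results.append((y0, predicted, estimated))
    print(y0, predicted, estimated)
```

The sample points avoid $y_0 = 0$, where $f'(x_0) = 0$ and Theorem 7.6 does not apply.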
In the same manner as Example 7.4, the usual power rule for differentiation
can be proved.
COROLLARY 7.7. Suppose $q \in \mathbb{Q}$, $f(x) = x^q$ and $D$ is the domain of $f$. Then
$f'(x) = q x^{q-1}$ on the set
\[ \begin{cases} D, & \text{when } q \ge 1 \\ D \setminus \{0\}, & \text{when } q < 1 \end{cases}. \]
3. Derivatives and Extreme Points
As learned in calculus, the derivative is a powerful tool for determining the
behavior of functions. The following theorems form the basis for much of differential calculus. First, we state a few familiar definitions.
DEFINITION 7.8. Suppose $f : D \to \mathbb{R}$ and $x_0 \in D$. $f$ is said to have a relative
maximum at $x_0$ if there is a $\delta > 0$ such that $f(x) \le f(x_0)$ for all $x \in (x_0 - \delta, x_0 + \delta) \cap D$.
$f$ has a relative minimum at $x_0$ if $-f$ has a relative maximum at $x_0$. If $f$ has either
a relative maximum or a relative minimum at $x_0$, then it is said that $f$ has a
relative extreme value at $x_0$.
The absolute maximum of $f$ occurs at $x_0$ if $f(x_0) \ge f(x)$ for all $x \in D$. The
definitions of absolute minimum and absolute extreme are analogous.
Examples like f (x) = x on (0, 1) show that even the nicest functions need not
have relative extrema.
THEOREM 7.9. Suppose $f : (a,b) \to \mathbb{R}$. If $f$ has a relative extreme value at $x_0$
and $x_0 \in D(f)$, then $f'(x_0) = 0$.
PROOF. Suppose $f(x_0)$ is a relative maximum value of $f$. Then there must be
a $\delta > 0$ such that $f(x) \le f(x_0)$ whenever $x \in (x_0 - \delta, x_0 + \delta)$. Since $f'(x_0)$ exists,
\[ x \in (x_0 - \delta, x_0) \implies \frac{f(x) - f(x_0)}{x - x_0} \ge 0 \implies f'(x_0) = \lim_{x\uparrow x_0} \frac{f(x) - f(x_0)}{x - x_0} \ge 0 \tag{7.4} \]
and
\[ x \in (x_0, x_0 + \delta) \implies \frac{f(x) - f(x_0)}{x - x_0} \le 0 \implies f'(x_0) = \lim_{x\downarrow x_0} \frac{f(x) - f(x_0)}{x - x_0} \le 0. \tag{7.5} \]
Combining (7.4) and (7.5) shows $f'(x_0) = 0$.
If $f(x_0)$ is a relative minimum value of $f$, apply the previous argument to
$-f$.
Suppose f : [a, b] → R is continuous. Corollary 6.23 guarantees f has both an
absolute maximum and minimum on the compact interval [a, b]. Theorem 7.9
implies these extrema must occur at points of the set
\[ C = \{x \in (a,b) : f'(x) = 0\} \cup \{x \in [a,b] : f'(x) \text{ does not exist}\}. \]
The elements of C are often called the critical points or critical numbers of f on
[a, b]. To find the maximum and minimum values of f on [a, b], it suffices to find
its maximum and minimum on the smaller set C , which is often finite.
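As a small worked instance of this procedure, consider the hypothetical example $f(x) = x^3 - 3x$ on $[-2, 3]$, which is not taken from the notes. Here $f'(x) = 3x^2 - 3$ vanishes at $x = \pm 1$, and the two-sided derivative does not exist at the endpoints, so the critical set is $\{-2, -1, 1, 3\}$. A sketch of the search:

```python
def f(x):
    return x ** 3 - 3 * x

a, b = -2, 3

# Roots of f'(x) = 3x^2 - 3 inside (a, b), found by hand for this example.
stationary_points = [-1, 1]

# The endpoints belong to C because the two-sided derivative is undefined there.
critical_points = [a] + stationary_points + [b]

values = {x: f(x) for x in critical_points}
maximum = max(values.values())
minimum = min(values.values())
print(values, maximum, minimum)
```

Only four function evaluations are needed, compared with the uncountably many points of the interval.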
4. Differentiable Functions
Differentiation becomes most useful when a function has a derivative at each
point of an interval.
DEFINITION 7.10. The function $f$ is differentiable on an open interval $I$ if
$I \subset D(f)$. If $f$ is differentiable on its domain, then it is said to be differentiable. In
this case, the function $f'$ is called the derivative of $f$.
The fundamental theorem about differentiable functions is the Mean Value
Theorem. Following is its simplest form.
LEMMA 7.11 (Rolle's Theorem). If $f : [a,b] \to \mathbb{R}$ is continuous on $[a,b]$, differentiable on $(a,b)$ and $f(a) = 0 = f(b)$, then there is a $c \in (a,b)$ such that $f'(c) = 0$.
PROOF. Since $[a,b]$ is compact, Corollary 6.23 implies the existence of $x_m, x_M \in [a,b]$
such that $f(x_m) \le f(x) \le f(x_M)$ for all $x \in [a,b]$. If $f(x_m) = f(x_M)$, then $f$
is constant on $[a,b]$ and any $c \in (a,b)$ satisfies the lemma. Otherwise, either
$f(x_m) < 0$ or $f(x_M) > 0$. If $f(x_m) < 0$, then $x_m \in (a,b)$ and Theorem 7.9 implies
$f'(x_m) = 0$. If $f(x_M) > 0$, then $x_M \in (a,b)$ and Theorem 7.9 implies $f'(x_M) = 0$.
Rolle's Theorem is just a stepping-stone on the path to the Mean Value Theorem. Two versions of the Mean Value Theorem follow. The first is a version more
general than the one given in most calculus courses. The second is the usual
version.$^4$
THEOREM 7.12 (Cauchy Mean Value Theorem). If $f : [a,b] \to \mathbb{R}$ and $g : [a,b] \to \mathbb{R}$
are both continuous on $[a,b]$ and differentiable on $(a,b)$, then there is a
$c \in (a,b)$ such that
\[ g'(c)\bigl(f(b) - f(a)\bigr) = f'(c)\bigl(g(b) - g(a)\bigr). \]
PROOF. Let
\[ h(x) = \bigl(g(b) - g(a)\bigr)\bigl(f(a) - f(x)\bigr) + \bigl(g(x) - g(a)\bigr)\bigl(f(b) - f(a)\bigr). \]
Because of the assumptions on $f$ and $g$, $h$ is continuous on $[a,b]$ and differentiable on $(a,b)$ with $h(a) = h(b) = 0$. Rolle's Theorem (Lemma 7.11) implies there is a $c \in (a,b)$
such that $h'(c) = 0$. Then
\[ 0 = h'(c) = -\bigl(g(b) - g(a)\bigr) f'(c) + g'(c)\bigl(f(b) - f(a)\bigr) \implies g'(c)\bigl(f(b) - f(a)\bigr) = f'(c)\bigl(g(b) - g(a)\bigr). \]
$^4$Theorem 7.12 is also sometimes called the Generalized Mean Value Theorem.
FIGURE 7.2. This is a "picture proof" of Corollary 7.13.
COROLLARY 7.13 (Mean Value Theorem). If $f : [a,b] \to \mathbb{R}$ is continuous on
$[a,b]$ and differentiable on $(a,b)$, then there is a $c \in (a,b)$ such that
$f(b) - f(a) = f'(c)(b - a)$.
PROOF. Let $g(x) = x$ in Theorem 7.12.
Many of the standard theorems of elementary calculus are easy consequences
of the Mean Value Theorem. For example, following are the usual theorems about
monotonicity.
First, recall the following definitions.
DEFINITION 7.14. A function $f : (a,b) \to \mathbb{R}$ is increasing on $(a,b)$, if $a < x < y < b$
implies $f(x) \le f(y)$. It is decreasing, if $-f$ is increasing. When it is increasing
or decreasing, it is monotone.
Notice with these definitions, a constant function is both increasing and
decreasing. In the case when a < x < y < b implies f (x) < f (y), then f is strictly
increasing. The definition of strictly decreasing is analogous.
THEOREM 7.15. Suppose $f : (a,b) \to \mathbb{R}$ is a differentiable function. $f$ is increasing on $(a,b)$ iff $f'(x) \ge 0$ for all $x \in (a,b)$. $f$ is decreasing on $(a,b)$ iff $f'(x) \le 0$ for
all $x \in (a,b)$.
PROOF. Only the first assertion is proved because the proof of the second is
essentially the same with all the inequalities reversed.
($\Rightarrow$) If $x, y \in (a,b)$ with $x \ne y$, then the assumption that $f$ is increasing gives
\[ \frac{f(y) - f(x)}{y - x} \ge 0 \implies f'(x) = \lim_{y\to x} \frac{f(y) - f(x)}{y - x} \ge 0. \]
($\Leftarrow$) Let $x, y \in (a,b)$ with $x < y$. According to Corollary 7.13, there is a $c \in (x,y)$
such that $f(y) - f(x) = f'(c)(y - x) \ge 0$. This shows $f(x) \le f(y)$, so $f$ is increasing
on $(a,b)$.
COROLLARY 7.16. Let $f : (a,b) \to \mathbb{R}$ be a differentiable function. $f$ is constant
iff $f'(x) = 0$ for all $x \in (a,b)$.
It follows from Theorem 7.2 that every differentiable function is continuous.
But, it’s not true that a derivative must be continuous.
EXAMPLE 7.5. Let
\[ f(x) = \begin{cases} x^2 \sin\frac{1}{x}, & x \ne 0 \\ 0, & x = 0 \end{cases}. \]
We claim $f$ is differentiable everywhere, but $f'$ is not continuous.
To see this, first note that when $x \ne 0$, the standard differentiation formulas
give that $f'(x) = 2x \sin(1/x) - \cos(1/x)$. To calculate $f'(0)$, choose any $h \ne 0$. Then
\[ \left| \frac{f(h)}{h} \right| = \left| \frac{h^2 \sin(1/h)}{h} \right| \le \left| \frac{h^2}{h} \right| = |h| \]
and it easily follows from the definition of the derivative and the Squeeze Theorem
(Theorem 6.3) that $f'(0) = 0$.
Therefore,
\[ f'(x) = \begin{cases} 0, & x = 0 \\ 2x \sin\frac{1}{x} - \cos\frac{1}{x}, & x \ne 0 \end{cases}. \]
Let $x_n = 1/(2\pi n)$ for $n \in \mathbb{N}$. Then $x_n \to 0$ and
\[ f'(x_n) = 2x_n \sin(1/x_n) - \cos(1/x_n) = \frac{1}{\pi n} \sin 2\pi n - \cos 2\pi n = -1 \]
for all $n$. Therefore, $f'(x_n) \to -1 \ne 0 = f'(0)$, and $f'$ is not continuous at 0.
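The sequence used in Example 7.5 is easy to tabulate. The sketch below hard-codes the formula for $f'$ derived in the example; the function name `f_prime` and the number of terms are illustrative choices.

```python
import math

def f_prime(x):
    """Derivative of x^2 sin(1/x), extended by f'(0) = 0 as in Example 7.5."""
    if x == 0:
        return 0.0
    return 2.0 * x * math.sin(1.0 / x) - math.cos(1.0 / x)

# x_n = 1 / (2 pi n) tends to 0, yet f'(x_n) stays at -1 (up to rounding).
points = [1.0 / (2.0 * math.pi * n) for n in range(1, 6)]
slopes = [f_prime(x) for x in points]

for x, slope in zip(points, slopes):
    print(x, slope)
print(f_prime(0.0))
```

In floating point the slopes agree with $-1$ only up to rounding error, but the gap between them and $f'(0) = 0$ is unmistakable.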
But, derivatives do share one useful property with continuous functions; they
satisfy an intermediate value property. Compare the following theorem with
Corollary 6.26.
THEOREM 7.17 (Darboux's Theorem). If $f$ is differentiable on an open set
containing $[a,b]$ and $\gamma$ is between $f'(a)$ and $f'(b)$, then there is a $c \in [a,b]$ such
that $f'(c) = \gamma$.
PROOF. If $f'(a) = f'(b)$, then $c = a$ satisfies the theorem. So, we may as well
assume $f'(a) \ne f'(b)$. There is no generality lost in assuming $f'(a) < f'(b)$, for,
otherwise, we just replace $f$ with $g = -f$. If $\gamma = f'(a)$ or $\gamma = f'(b)$, then $c = a$ or $c = b$
works, so we may also assume $f'(a) < \gamma < f'(b)$.
Let $h(x) = f(x) - \gamma x$ so that $D(f) = D(h)$ and $h'(x) = f'(x) - \gamma$. In particular,
this implies $h'(a) < 0 < h'(b)$. Because of this, there must be an $\varepsilon > 0$ small enough
so that
\[ \frac{h(a + \varepsilon) - h(a)}{\varepsilon} < 0 \implies h(a + \varepsilon) < h(a) \]
and
\[ \frac{h(b) - h(b - \varepsilon)}{\varepsilon} > 0 \implies h(b - \varepsilon) < h(b). \]
(See Figure 7.3.) In light of these two inequalities and Corollary 6.23, there must
be a $c \in (a,b)$ such that $h(c) = \operatorname{glb}\{h(x) : x \in [a,b]\}$. Now Theorem 7.9 gives
$0 = h'(c) = f'(c) - \gamma$, and the theorem follows.
FIGURE 7.3. This could be the function $h$ of Theorem 7.17.
Here’s an example showing a possible use of Theorem 7.17.
EXAMPLE 7.6. Let
\[ f(x) = \begin{cases} 0, & x \ne 0 \\ 1, & x = 0 \end{cases}. \]
Theorem 7.17 implies $f$ is not a derivative.
A more striking example is the following.
EXAMPLE 7.7. Define
\[ f(x) = \begin{cases} \sin\frac{1}{x}, & x \ne 0 \\ 1, & x = 0 \end{cases} \quad\text{and}\quad g(x) = \begin{cases} \sin\frac{1}{x}, & x \ne 0 \\ -1, & x = 0 \end{cases}. \]
Since
\[ f(x) - g(x) = \begin{cases} 0, & x \ne 0 \\ 2, & x = 0 \end{cases} \]
does not have the intermediate value property, at least one of $f$ or $g$ is not a
derivative. (Actually, neither is a derivative because $f(x) = -g(-x)$.)
5. Applications of the Mean Value Theorem
In the following sections, the standard notion of higher order derivatives
is used. To make this precise, suppose $f$ is defined on an interval $I$. The
function $f$ itself can be written $f^{(0)}$. If $f$ is differentiable, then $f'$ is written
$f^{(1)}$. Continuing inductively, if $n \in \omega$, $f^{(n)}$ exists on $I$ and $x_0 \in D(f^{(n)})$, then
$f^{(n+1)}(x_0) = d f^{(n)}(x_0)/dx$.
5.1. Taylor's Theorem. The motivation behind Taylor's theorem is the attempt to approximate a function $f$ near a number $a$ by a polynomial. The
polynomial of degree 0 which does the best job is clearly $p_0(x) = f(a)$. The
best polynomial of degree 1 is the tangent line to the graph of the function,
$p_1(x) = f(a) + f'(a)(x - a)$. Continuing in this way, we approximate $f$ near $a$
by the polynomial $p_n$ of degree $n$ such that $f^{(k)}(a) = p_n^{(k)}(a)$ for $k = 0, 1, \ldots, n$. A
simple induction argument shows that
\[ p_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!} (x - a)^k. \tag{7.6} \]
This is the well-known Taylor polynomial of $f$ at $a$.
Many students leave calculus with the mistaken impression that (7.6) is the
important part of Taylor's theorem. But, the important part of Taylor's theorem
is the fact that in many cases it is possible to determine how large $n$ must be to
achieve a desired accuracy in the approximation of $f$; i.e., the error term is the
important part.
THEOREM 7.18 (Taylor's Theorem). If $f$ is a function such that $f, f', \ldots, f^{(n)}$
are continuous on $[a,b]$ and $f^{(n+1)}$ exists on $(a,b)$, then there is a $c \in (a,b)$ such
that
\[ f(b) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!} (b - a)^k + \frac{f^{(n+1)}(c)}{(n+1)!} (b - a)^{n+1}. \]
PROOF. Let the constant $\alpha$ be defined by
\[ f(b) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!} (b - a)^k + \frac{\alpha}{(n+1)!} (b - a)^{n+1} \tag{7.7} \]
and define
\[ F(x) = f(b) - \left( \sum_{k=0}^{n} \frac{f^{(k)}(x)}{k!} (b - x)^k + \frac{\alpha}{(n+1)!} (b - x)^{n+1} \right). \]
From (7.7) we see that $F(a) = 0$. Direct substitution in the definition of $F$ shows
that $F(b) = 0$. From the assumptions in the statement of the theorem, it is easy to
see that $F$ is continuous on $[a,b]$ and differentiable on $(a,b)$. An application of
Rolle's Theorem yields a $c \in (a,b)$ such that
\[ 0 = F'(c) = -\left( \frac{f^{(n+1)}(c)}{n!} (b - c)^n - \frac{\alpha}{n!} (b - c)^n \right) \implies \alpha = f^{(n+1)}(c), \]
as desired.
Now, suppose $f$ is defined on an open interval $I$ with $a, x \in I$. If $f$ is $n + 1$
times differentiable on $I$, then Theorem 7.18 implies there is a $c$ between $a$ and $x$
such that
\[ f(x) = p_n(x) + R_f(n, x, a), \]
where $R_f(n, x, a) = \frac{f^{(n+1)}(c)}{(n+1)!} (x - a)^{n+1}$ is the error in the approximation.$^6$
FIGURE 7.4. Here are several of the Taylor polynomials ($n = 2, 4, 6, 8, 10, 20$) for the function
$\cos(x)$, centered at $a = 0$, graphed along with $\cos(x)$.
EXAMPLE 7.8. Let $f(x) = \cos x$. Suppose we want to approximate $f(2)$ to 5
decimal places of accuracy. Since it's an easy point to work with, we'll choose
$a = 0$. Then, for some $c \in (0, 2)$,
\[ |R_f(n, 2, 0)| = \frac{|f^{(n+1)}(c)|}{(n+1)!}\, 2^{n+1} \le \frac{2^{n+1}}{(n+1)!}. \tag{7.8} \]
A bit of experimentation with a calculator shows that $n = 12$ is the smallest $n$ such
that the right-hand side of (7.8) is less than $5 \times 10^{-6}$. After doing some arithmetic,
it follows that
\[ p_{12}(2) = 1 - \frac{2^2}{2!} + \frac{2^4}{4!} - \frac{2^6}{6!} + \frac{2^8}{8!} - \frac{2^{10}}{10!} + \frac{2^{12}}{12!} = -\frac{27809}{66825} \approx -0.41615 \]
is a 5 decimal place approximation to $\cos(2)$. (A calculator gives the value $\cos(2) =
-0.416146836547142$ which is about 0.00000316 larger, comfortably less than the
desired maximum error.)
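The arithmetic in Example 7.8 can be reproduced exactly with rational numbers. This is only a sketch; the helper names `remainder_bound` and `cos_taylor` are illustrative and not part of the notes.

```python
from fractions import Fraction
from math import cos, factorial

def remainder_bound(n, x=2):
    """Right-hand side of (7.8): x^(n+1) / (n+1)!, as an exact fraction."""
    return Fraction(x ** (n + 1), factorial(n + 1))

def cos_taylor(n, x):
    """Taylor polynomial p_n of cos centered at 0; only even powers occur."""
    x = Fraction(x)
    return sum(Fraction((-1) ** (k // 2), factorial(k)) * x ** k
               for k in range(0, n + 1, 2))

# Find the smallest n whose error bound drops below 5 * 10^-6.
tolerance = Fraction(5, 10 ** 6)
n = 0
while remainder_bound(n) >= tolerance:
    n += 1

p = cos_taylor(n, 2)
print(n, p, float(p), abs(float(p) - cos(2)))
```

Working with `Fraction` avoids rounding until the final comparison with the floating-point value of $\cos(2)$.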
But, things don't always work out the way we might like. Consider the following example.
EXAMPLE 7.9. Suppose
\[ f(x) = \begin{cases} e^{-1/x^2}, & x \ne 0 \\ 0, & x = 0 \end{cases}. \]
Figure 7.5 below has a graph of this function. In Example 7.11 below it is shown
that $f$ is differentiable to all orders everywhere and $f^{(n)}(0) = 0$ for all $n \ge 0$. With
this function the Taylor polynomial centered at 0 gives a useless approximation.
$^6$There are several different formulas for the error. The one given here is sometimes called the
Lagrange form of the remainder. In Example 8.4 a form of the remainder using integration instead
of differentiation is derived.
5.2. L'Hôpital's Rules and Indeterminate Forms. According to Theorem 6.4,
\[ \lim_{x\to a} \frac{f(x)}{g(x)} = \frac{\lim_{x\to a} f(x)}{\lim_{x\to a} g(x)} \]
whenever $\lim_{x\to a} f(x)$ and $\lim_{x\to a} g(x)$ both exist and $\lim_{x\to a} g(x) \ne 0$. But, it
is easy to find examples where both $\lim_{x\to a} f(x) = 0$ and $\lim_{x\to a} g(x) = 0$ and
$\lim_{x\to a} f(x)/g(x)$ exists, as well as similar examples where $\lim_{x\to a} f(x)/g(x)$ fails
to exist. Because of this, such a limit problem is said to be in the indeterminate
form 0/0. The following theorem allows us to determine many such limits.
THEOREM 7.19 (Easy L'Hôpital's Rule). Suppose $f$ and $g$ are each continuous
on $[a,b]$, differentiable on $(a,b)$ and $f(b) = g(b) = 0$. If $g'(x) \ne 0$ on $(a,b)$ and
$\lim_{x\uparrow b} f'(x)/g'(x) = L$, where $L$ could be infinite, then $\lim_{x\uparrow b} f(x)/g(x) = L$.
PROOF. Let $x \in [a,b)$, so $f$ and $g$ are continuous on $[x,b]$ and differentiable
on $(x,b)$. Cauchy's Mean Value Theorem, Theorem 7.12, implies there is a $c(x) \in (x,b)$
such that
\[ f'(c(x))\, g(x) = g'(c(x))\, f(x) \implies \frac{f(x)}{g(x)} = \frac{f'(c(x))}{g'(c(x))}. \]
Since $x < c(x) < b$, it follows that $\lim_{x\uparrow b} c(x) = b$. This shows that
\[ L = \lim_{x\uparrow b} \frac{f'(x)}{g'(x)} = \lim_{x\uparrow b} \frac{f'(c(x))}{g'(c(x))} = \lim_{x\uparrow b} \frac{f(x)}{g(x)}. \]
Several things should be noted about this proof. First, there is nothing special
about the left-hand limit used in the statement of the theorem. It could just as
easily be written in terms of the right-hand limit. Second, if $\lim_{x\to a} f(x)/g(x)$ is
not of the indeterminate form 0/0, then applying L'Hôpital's rule will usually give
a wrong answer. To see this, consider
\[ \lim_{x\to 0} \frac{x}{x + 1} = 0 \ne 1 = \lim_{x\to 0} \frac{1}{1}. \]
Another case where the indeterminate form 0/0 occurs is in the limit at
infinity. That L’Hôpital’s rule works in this case can easily be deduced from
Theorem 7.19.
COROLLARY 7.20. Suppose $f$ and $g$ are differentiable on $(a, \infty)$ and
\[ \lim_{x\to\infty} f(x) = \lim_{x\to\infty} g(x) = 0. \]
If $g'(x) \ne 0$ on $(a, \infty)$ and $\lim_{x\to\infty} f'(x)/g'(x) = L$, where $L$ could be infinite, then
$\lim_{x\to\infty} f(x)/g(x) = L$.
PROOF. There is no generality lost by assuming $a > 0$. Let
\[ F(x) = \begin{cases} f(1/x), & x \in (0, 1/a] \\ 0, & x = 0 \end{cases} \quad\text{and}\quad G(x) = \begin{cases} g(1/x), & x \in (0, 1/a] \\ 0, & x = 0 \end{cases}. \]
Then
\[ \lim_{x\downarrow 0} F(x) = \lim_{x\to\infty} f(x) = 0 = \lim_{x\to\infty} g(x) = \lim_{x\downarrow 0} G(x), \]
so both $F$ and $G$ are continuous at 0. It follows that both $F$ and $G$ are continuous
on $[0, 1/a]$ and differentiable on $(0, 1/a)$ with $G'(x) = -g'(1/x)/x^2 \ne 0$ on $(0, 1/a)$
and $\lim_{x\downarrow 0} F'(x)/G'(x) = \lim_{x\to\infty} f'(x)/g'(x) = L$. The rest follows from Theorem
7.19, written in terms of the right-hand limit at 0.
The other standard indeterminate form arises when
\[ \lim_{x\to\infty} f(x) = \infty = \lim_{x\to\infty} g(x). \]
This is called an $\infty/\infty$ indeterminate form. It is often handled by the following
theorem.
THEOREM 7.21 (Hard L'Hôpital's Rule). Suppose that $f$ and $g$ are differentiable
on $(a, \infty)$ and $g'(x) \ne 0$ on $(a, \infty)$. If
\[ \lim_{x\to\infty} f(x) = \lim_{x\to\infty} g(x) = \infty \quad\text{and}\quad \lim_{x\to\infty} \frac{f'(x)}{g'(x)} = L \in \mathbb{R} \cup \{-\infty, \infty\}, \]
then
\[ \lim_{x\to\infty} \frac{f(x)}{g(x)} = L. \]
PROOF. First, suppose $L \in \mathbb{R}$ and let $\varepsilon > 0$. Choose $a_1 > a$ large enough so that
\[ \left| \frac{f'(x)}{g'(x)} - L \right| < \varepsilon, \quad \forall x > a_1. \]
Since $\lim_{x\to\infty} f(x) = \infty = \lim_{x\to\infty} g(x)$, we can assume there is an $a_2 > a_1$ such
that both $f(x) > 0$ and $g(x) > 0$ when $x > a_2$. Finally, choose $a_3 > a_2$ such that
whenever $x > a_3$, then $f(x) > f(a_2)$ and $g(x) > g(a_2)$.
Let $x > a_3$ and apply Cauchy's Mean Value Theorem, Theorem 7.12, to $f$ and
$g$ on $[a_2, x]$ to find a $c(x) \in (a_2, x)$ such that
\[ \frac{f'(c(x))}{g'(c(x))} = \frac{f(x) - f(a_2)}{g(x) - g(a_2)} = \frac{f(x)}{g(x)} \cdot \frac{1 - \frac{f(a_2)}{f(x)}}{1 - \frac{g(a_2)}{g(x)}}. \tag{7.9} \]
If
\[ h(x) = \frac{1 - \frac{g(a_2)}{g(x)}}{1 - \frac{f(a_2)}{f(x)}}, \]
then (7.9) implies
\[ \frac{f(x)}{g(x)} = \frac{f'(c(x))}{g'(c(x))}\, h(x). \]
Since $\lim_{x\to\infty} h(x) = 1$, there is an $a_4 > a_3$ such that whenever $x > a_4$, then
$|h(x) - 1| < \varepsilon$. If $x > a_4$, then
\[ \left| \frac{f(x)}{g(x)} - L \right| = \left| \frac{f'(c(x))}{g'(c(x))}\, h(x) - L \right| = \left| \frac{f'(c(x))}{g'(c(x))}\, h(x) - L h(x) + L h(x) - L \right| \]
\[ \le \left| \frac{f'(c(x))}{g'(c(x))} - L \right| |h(x)| + |L|\, |h(x) - 1| < \varepsilon(1 + \varepsilon) + |L| \varepsilon = (1 + |L| + \varepsilon)\varepsilon \]
can be made arbitrarily small through a proper choice of $\varepsilon$. Therefore
\[ \lim_{x\to\infty} f(x)/g(x) = L. \]
The case when $L = \infty$ is done similarly by first choosing a $B > 0$ and adjusting
(7.9) so that $f'(x)/g'(x) > B$ when $x > a_1$. A similar adjustment is necessary when
$L = -\infty$.
There is a companion corollary to Theorem 7.21 which is proved in the same
way as Corollary 7.20.
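COROLLARY 7.22. Suppose that $f$ and $g$ are continuous on $(a,b]$ and differentiable on $(a,b)$ with $g'(x) \ne 0$ on $(a,b)$. If
\[ \lim_{x\downarrow a} f(x) = \lim_{x\downarrow a} g(x) = \infty \quad\text{and}\quad \lim_{x\downarrow a} \frac{f'(x)}{g'(x)} = L \in \mathbb{R} \cup \{-\infty, \infty\}, \]
then
\[ \lim_{x\downarrow a} \frac{f(x)}{g(x)} = L. \]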
EXAMPLE 7.10. If $\alpha > 0$, then $\lim_{x\to\infty} \ln x / x^{\alpha}$ is of the indeterminate form
$\infty/\infty$. Taking derivatives of the numerator and denominator yields
\[ \lim_{x\to\infty} \frac{1/x}{\alpha x^{\alpha - 1}} = \lim_{x\to\infty} \frac{1}{\alpha x^{\alpha}} = 0. \]
Theorem 7.21 now implies $\lim_{x\to\infty} \ln x / x^{\alpha} = 0$, and therefore $\ln x$ increases more
slowly than any positive power of $x$.
EXAMPLE 7.11. Let $f$ be as in Example 7.9. (See Figure 7.5.) It is clear $f^{(n)}(x)$
exists whenever $n \in \omega$ and $x \ne 0$. We claim $f^{(n)}(0) = 0$. To see this, we first prove
that
\[ \lim_{x\to 0} \frac{e^{-1/x^2}}{x^n} = 0, \quad \forall n \in \mathbb{Z}. \tag{7.10} \]
When $n \le 0$, (7.10) is obvious. So, suppose (7.10) is true whenever $m \le n$ for
some $n \in \omega$. Making the substitution $u = 1/x$, we see
\[ \lim_{x\downarrow 0} \frac{e^{-1/x^2}}{x^{n+1}} = \lim_{u\to\infty} \frac{u^{n+1}}{e^{u^2}}. \tag{7.11} \]
The right-hand side is an $\infty/\infty$ indeterminate form, so L'Hôpital's rule can be
used. Since
\[ \lim_{u\to\infty} \frac{(n+1)u^n}{2u e^{u^2}} = \lim_{u\to\infty} \frac{(n+1)u^{n-1}}{2 e^{u^2}} = \frac{n+1}{2} \lim_{x\downarrow 0} \frac{e^{-1/x^2}}{x^{n-1}} = 0 \]
by the inductive hypothesis, Theorem 7.21 gives (7.11) in the case of the right-hand limit. The left-hand limit is handled similarly. Finally, (7.10) follows by
induction.
When $x \ne 0$, a bit of experimentation can convince the reader that $f^{(n)}(x)$
is of the form $p_n(1/x) e^{-1/x^2}$, where $p_n$ is a polynomial. Induction and repeated
applications of (7.10) establish that $f^{(n)}(0) = 0$ for $n \in \omega$.
FIGURE 7.5. This is a plot of $f(x) = \exp(-1/x^2)$. Notice how the graph
flattens out near the origin.
6. Exercises
7.1. If
\[ f(x) = \begin{cases} x^2, & x \in \mathbb{Q} \\ 0, & \text{otherwise} \end{cases}, \]
then show $D(f) = \{0\}$ and find $f'(0)$.
7.2. Let $f$ be a function defined on some neighborhood of $x = a$ with $f(a) = 0$.
Prove $f'(a) = 0$ if and only if $a \in D(|f|)$.
7.3. If $f$ is defined on an open set containing $x_0$, the symmetric derivative of $f$ at
$x_0$ is defined as
\[ f^s(x_0) = \lim_{h\to 0} \frac{f(x_0 + h) - f(x_0 - h)}{2h}. \]
Prove that if $f'(x)$ exists, then so does $f^s(x)$. Is the converse true?
7.4. Let $G$ be an open set and $f \in D(G)$. If there is an $a \in G$ such that $\lim_{x\to a} f'(x)$
exists, then $\lim_{x\to a} f'(x) = f'(a)$.
7.5. Suppose $f$ is continuous on $[a,b]$ and $f''$ exists on $(a,b)$. If there is an
$x_0 \in (a,b)$ such that the line segment between $(a, f(a))$ and $(b, f(b))$ contains the
point $(x_0, f(x_0))$, then there is a $c \in (a,b)$ such that $f''(c) = 0$.
7.6. If $\Delta = \{f : f = F' \text{ for some } F : \mathbb{R} \to \mathbb{R}\}$, then $\Delta$ is closed under addition and
scalar multiplication. (This shows the derivatives form a vector space.)
7.7. If
\[ f_1(x) = \begin{cases} 1/2, & x = 0 \\ \sin(1/x), & x \ne 0 \end{cases} \quad\text{and}\quad f_2(x) = \begin{cases} 1/2, & x = 0 \\ \sin(-1/x), & x \ne 0 \end{cases}, \]
then at least one of $f_1$ and $f_2$ is not in $\Delta$.
7.8. Prove or give a counterexample: If $f$ is continuous on $\mathbb{R}$ and differentiable
on $\mathbb{R} \setminus \{0\}$ with $\lim_{x\to 0} f'(x) = L$, then $f$ is differentiable on $\mathbb{R}$.
7.9. Suppose $f$ is differentiable everywhere and $f(x + y) = f(x) f(y)$ for all $x, y \in \mathbb{R}$.
Show that $f'(x) = f'(0) f(x)$ and determine the value of $f(0)$.
7.10. If $I$ is an open interval, $f$ is differentiable on $I$ and $a \in I$, then there is a
sequence $a_n \in I \setminus \{a\}$ such that $a_n \to a$ and $f'(a_n) \to f'(a)$.
7.11. Use the definition of the derivative to find $\frac{d}{dx} \sqrt{x}$.
7.12. Let $f$ be continuous on $[0, \infty)$ and differentiable on $(0, \infty)$. If $f(0) = 0$ and
$|f'(x)| < |f(x)|$ for all $x > 0$, then $f(x) = 0$ for all $x \ge 0$.
7.13. Suppose $f : \mathbb{R} \to \mathbb{R}$ is such that $f'$ is continuous on $[a,b]$. If there is a
$c \in (a,b)$ such that $f'(c) = 0$ and $f''(c) > 0$, then $f$ has a local minimum at $c$.
7.14. Prove or give a counterexample: If $f$ is continuous on $\mathbb{R}$ and differentiable
on $\mathbb{R} \setminus \{0\}$ with $\lim_{x\to 0} f'(x) = L$, then $f$ is differentiable on $\mathbb{R}$.
7.15. Let $f$ be continuous on $[a,b]$ and differentiable on $(a,b)$. If $f(a) = \alpha$ and
$|f'(x)| < \beta$ for all $x \in (a,b)$, then calculate a bound for $f(b)$.
7.16. Suppose that $f : (a,b) \to \mathbb{R}$ is differentiable and $f'$ is bounded. If $x_n$ is a
sequence from $(a,b)$ such that $x_n \to a$, then $f(x_n)$ converges.
7.17. Let $G$ be an open set and $f \in D(G)$. If there is an $a \in G$ such that $\lim_{x\to a} f'(x)$
exists, then $\lim_{x\to a} f'(x) = f'(a)$.
7.18. Prove or give a counterexample: If $f \in D((a,b))$ such that $f'$ is bounded,
then there is an $F \in C([a,b])$ such that $f = F$ on $(a,b)$.
7.19. Show that $f(x) = x^3 + 2x + 1$ is invertible on $\mathbb{R}$ and, if $g = f^{-1}$, then find
$g'(1)$.
7.20. Suppose that $I$ is an open interval and that $f''(x) \ge 0$ for all $x \in I$. If $a \in I$,
then show that the part of the graph of $f$ on $I$ is never below the tangent line to
the graph at $(a, f(a))$.
7.21. Suppose $f$ is continuous on $[a,b]$ and $f''$ exists on $(a,b)$. If there is an
$x_0 \in (a,b)$ such that the line segment between $(a, f(a))$ and $(b, f(b))$ contains the
point $(x_0, f(x_0))$, then there is a $c \in (a,b)$ such that $f''(c) = 0$.
7.22. Let $f$ be defined on a neighborhood of $x$.
(a) If $f''(x)$ exists, then
\[ \lim_{h\to 0} \frac{f(x - h) - 2f(x) + f(x + h)}{h^2} = f''(x). \]
(b) Find a function $f$ where this limit exists, but $f''(x)$ does not exist.
7.23. If $f : \mathbb{R} \to \mathbb{R}$ is differentiable everywhere and is even, then $f'$ is odd. If
$f : \mathbb{R} \to \mathbb{R}$ is differentiable everywhere and is odd, then $f'$ is even.$^7$
7.24. Prove that
\[ \left| \sin x - \left( x - \frac{x^3}{6} + \frac{x^5}{120} \right) \right| < \frac{1}{5040} \]
when $|x| \le 1$.
7.25. The exponential function $e^x$ is not a polynomial.
$^7$A function $g$ is even if $g(-x) = g(x)$ for every $x$ and it is odd if $g(-x) = -g(x)$ for every $x$. Even
and odd functions are described as such because this is how $g(x) = x^n$ behaves when $n$ is an even
or odd integer, respectively.
CHAPTER 8
Integration
Contrary to the impression given by most calculus courses, there are many
ways to define integration. The one given here is called the Riemann integral or
the Riemann-Darboux integral, and it is the one most commonly presented to
calculus students.
1. Partitions
A partition of the interval [a, b] is a finite set P ⊂ [a, b] such that {a, b} ⊂ P . The
set of all partitions of [a, b] is denoted part ([a, b]). Basically, a partition should
be thought of as a way to divide an interval into a finite number of subintervals
by choosing some points where it is divided.
If $P \in$ part$([a,b])$, then the elements of $P$ can be ordered in a list as $a = x_0 <
x_1 < \cdots < x_n = b$. The adjacent points of this partition determine $n$ compact
intervals of the form $I_k^P = [x_{k-1}, x_k]$, $1 \le k \le n$. If the partition is understood from
the context, we write $I_k$ instead of $I_k^P$. It's clear these intervals only intersect at
their common endpoints and there is no requirement they have the same length.
Since it's inconvenient to always list each part of a partition, we'll use the
partition of the previous paragraph as the generic partition. Unless it's necessary
within the context to specify some other form for a partition, assume any partition
is the generic partition. (See Figure 8.1.)
If $I$ is any interval, its length is written $|I|$. Using the notation of the previous
paragraph, it follows that
\[ \sum_{k=1}^{n} |I_k| = \sum_{k=1}^{n} (x_k - x_{k-1}) = x_n - x_0 = b - a. \]
The norm of a partition $P$ is
\[ \|P\| = \max\{|I_k^P| : 1 \le k \le n\}. \]
In other words, the norm of $P$ is just the length of the longest subinterval determined by $P$. If $|I_k| = \|P\|$ for every $I_k$, then $P$ is called a regular partition.
FIGURE 8.1. The generic partition with five subintervals.
Suppose $P, Q \in$ part$([a,b])$. If $P \subset Q$, then $Q$ is called a refinement of $P$. When
this happens, we write $P \ll Q$. In this case, it's easy to see that $P \ll Q$ implies
$\|P\| \ge \|Q\|$. It also follows at once from the definitions that $P \cup Q \in$ part$([a,b])$ with
$P \ll P \cup Q$ and $Q \ll P \cup Q$. The partition $P \cup Q$ is called the common refinement
of $P$ and $Q$.
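These definitions translate directly into a few lines of code. In the sketch below a partition is represented as a list of its points; the function names `norm` and `common_refinement` and the sample partitions of $[0, 1]$ are illustrative assumptions.

```python
def norm(partition):
    """Length of the longest subinterval determined by the partition."""
    points = sorted(partition)
    return max(right - left for left, right in zip(points, points[1:]))

def common_refinement(p, q):
    """The partition made of all points of p together with all points of q."""
    return sorted(set(p) | set(q))

# Two partitions of [0, 1]; neither one refines the other.
P = [0, 0.5, 1]
Q = [0, 0.25, 1]
R = common_refinement(P, Q)

print(R, norm(P), norm(Q), norm(R))
```

Here `R` contains every point of `P` and of `Q`, and its norm is no larger than either of theirs, as the inequality for refinements predicts.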
2. Riemann Sums
Let $f : [a,b] \to \mathbb{R}$ and $P \in$ part$([a,b])$. Choose $x_k^* \in I_k$ for each $k$. The set
$\{x_k^* : 1 \le k \le n\}$ is called a selection from $P$. The expression
\[ \mathcal{R}(f, P, x_k^*) = \sum_{k=1}^{n} f(x_k^*) |I_k| \]
is the Riemann sum for $f$ with respect to the partition $P$ and selection $x_k^*$. The
Riemann sum is the usual first step toward integration in a calculus course and
can be visualized as the sum of the areas of rectangles with height $f(x_k^*)$ and
width $|I_k|$, as long as the rectangles are allowed to have negative area when
$f(x_k^*) < 0$. (See Figure 8.2.)
Notice that given a particular function $f$ and partition $P$, there are an uncountably infinite number of different possible Riemann sums, depending on
the selection $x_k^*$. This sometimes makes working with Riemann sums quite complicated.
EXAMPLE 8.1. Suppose $f : [a,b] \to \mathbb{R}$ is the constant function $f(x) = c$. If
$P \in$ part$([a,b])$ and $\{x_k^* : 1 \le k \le n\}$ is any selection from $P$, then
\[ \mathcal{R}(f, P, x_k^*) = \sum_{k=1}^{n} f(x_k^*) |I_k| = c \sum_{k=1}^{n} |I_k| = c(b - a). \]
FIGURE 8.2. The Riemann sum $\mathcal{R}(f, P, x_k^*)$ is the sum of the areas
of the rectangles in this figure. Notice the right-most rectangle has
negative area because $f(x_4^*) < 0$.
EXAMPLE 8.2. Suppose $f(x) = x$ on $[a,b]$. Choose any $P \in$ part$([a,b])$ where
$\|P\| < 2(b - a)/n$. (Convince yourself this is always possible.$^1$) Make two specific
selections $l_k^* = x_{k-1}$ and $r_k^* = x_k$. If $x_k^*$ is any other selection from $P$, then
$l_k^* \le x_k^* \le r_k^*$ and the fact that $f$ is increasing on $[a,b]$ gives
\[ \mathcal{R}(f, P, l_k^*) \le \mathcal{R}(f, P, x_k^*) \le \mathcal{R}(f, P, r_k^*). \]
With this in mind, consider the following calculation.
\[ \begin{aligned} \mathcal{R}(f, P, r_k^*) - \mathcal{R}(f, P, l_k^*) &= \sum_{k=1}^{n} (r_k^* - l_k^*) |I_k| \\ &= \sum_{k=1}^{n} (x_k - x_{k-1}) |I_k| \\ &= \sum_{k=1}^{n} |I_k|^2 \\ &\le \sum_{k=1}^{n} \|P\|^2 \\ &= n \|P\|^2 \\ &< \frac{4(b - a)^2}{n} \end{aligned} \tag{8.1} \]
This shows that if a partition is chosen with a small enough norm, all the Riemann
sums for $f$ over that partition will be close to each other.
In the special case when $P$ is a regular partition, $|I_k| = (b - a)/n$, $r_k^* = a +
k(b - a)/n$ and
\[ \begin{aligned} \mathcal{R}(f, P, r_k^*) &= \sum_{k=1}^{n} r_k^* |I_k| \\ &= \sum_{k=1}^{n} \left( a + \frac{k(b - a)}{n} \right) \frac{b - a}{n} \\ &= \frac{b - a}{n} \left( na + \frac{b - a}{n} \sum_{k=1}^{n} k \right) \\ &= \frac{b - a}{n} \left( na + \frac{b - a}{n} \cdot \frac{n(n+1)}{2} \right) \\ &= \frac{b - a}{2} \left( a\, \frac{n - 1}{n} + b\, \frac{n + 1}{n} \right). \end{aligned} \]
In the limit as $n \to \infty$, this becomes the familiar formula $(b^2 - a^2)/2$, for the
integral of $f(x) = x$ over $[a,b]$.
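The closed form obtained for the regular partition can be compared with a directly computed Riemann sum. The following is a sketch using exact rational arithmetic; the helper names and the sample values $a = 1$, $b = 3$, $n = 100$ are illustrative assumptions.

```python
from fractions import Fraction

def regular_partition(a, b, n):
    """The n + 1 points of the regular partition of [a, b]."""
    a, b = Fraction(a), Fraction(b)
    return [a + k * (b - a) / n for k in range(n + 1)]

def riemann_sum(f, partition, selection):
    """Sum of f(x_k*) |I_k| over the subintervals of the partition."""
    return sum(f(s) * (right - left)
               for left, right, s in zip(partition, partition[1:], selection))

def identity(x):
    return x

a, b, n = 1, 3, 100
P = regular_partition(a, b, n)
right_sum = riemann_sum(identity, P, P[1:])    # selection r_k* = x_k
left_sum = riemann_sum(identity, P, P[:-1])    # selection l_k* = x_{k-1}

# Closed form from the example: (b - a)/2 * (a (n - 1)/n + b (n + 1)/n).
closed_form = Fraction(b - a, 2) * (Fraction(a * (n - 1), n) + Fraction(b * (n + 1), n))
print(left_sum, right_sum, closed_form)
```

For these values the two sums bracket $(b^2 - a^2)/2 = 4$ and differ by $(b - a)^2/n = 1/25$, which is below the bound $4(b - a)^2/n$ from (8.1).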
DEFINITION 8.1. The function $f$ is Riemann integrable on $[a,b]$, if there
exists a number $\mathcal{R}(f)$ such that for all $\varepsilon > 0$ there is a $\delta > 0$ so that whenever
$P \in$ part$([a,b])$ with $\|P\| < \delta$, then
\[ |\mathcal{R}(f) - \mathcal{R}(f, P, x_k^*)| < \varepsilon \]
for any selection $x_k^*$ from $P$.
$^1$This is with the generic partition.
THEOREM 8.2. If f : [a, b] → R and $\mathcal{R}(f)$ exists, then $\mathcal{R}(f)$ is unique.
PROOF. Suppose $\mathcal{R}_1(f)$ and $\mathcal{R}_2(f)$ both satisfy the definition and ε > 0. For i = 1, 2 choose $\delta_i > 0$ so that whenever $\|P\| < \delta_i$, then
\[
\left|\mathcal{R}_i(f) - \mathcal{R}(f, P, x_k^*)\right| < \varepsilon/2,
\]
as in the definition above. If P ∈ part([a, b]) so that $\|P\| < \delta_1 \wedge \delta_2$, then
\[
\left|\mathcal{R}_1(f) - \mathcal{R}_2(f)\right| \le \left|\mathcal{R}_1(f) - \mathcal{R}(f, P, x_k^*)\right| + \left|\mathcal{R}_2(f) - \mathcal{R}(f, P, x_k^*)\right| < \varepsilon
\]
and it follows $\mathcal{R}_1(f) = \mathcal{R}_2(f)$.
THEOREM 8.3. If f : [a, b] → R and $\mathcal{R}(f)$ exists, then f is bounded.
PROOF. Left as Exercise 8.1.
3. Darboux Integration
As mentioned above, a difficulty with handling Riemann sums is there are
so many different ways to choose partitions and selections that working with
them is unwieldy. One way to resolve this problem was shown in Example 8.2,
where it was easy to find largest and smallest Riemann sums associated with each
partition. However, that’s not always a straightforward calculation, so to use that
idea, a little more care must be taken.
DEFINITION 8.4. Let f : [a, b] → R be bounded and P ∈ part([a, b]). For each $I_k$ determined by P, let
\[
M_k = \operatorname{lub}\{f(x) : x \in I_k\} \quad\text{and}\quad m_k = \operatorname{glb}\{f(x) : x \in I_k\}.
\]
The upper and lower Darboux sums for f on [a, b] are
\[
\overline{\mathcal{D}}(f, P) = \sum_{k=1}^n M_k |I_k| \quad\text{and}\quad \underline{\mathcal{D}}(f, P) = \sum_{k=1}^n m_k |I_k|.
\]
The following theorem is the fundamental relationship between Darboux
sums. Pay careful attention because it’s the linchpin holding everything together!
THEOREM 8.5. If f : [a, b] → R is bounded and P, Q ∈ part([a, b]) with P ⊂ Q (that is, Q is a refinement of P), then
\[
\underline{\mathcal{D}}(f, P) \le \underline{\mathcal{D}}(f, Q) \le \overline{\mathcal{D}}(f, Q) \le \overline{\mathcal{D}}(f, P).
\]
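PROOF. Let P be the generic partition and let $Q = P \cup \{\bar{x}\}$, where $\bar{x} \in (x_{k_0-1}, x_{k_0})$ for some $k_0$. Clearly, P ⊂ Q. Let
\[
\begin{aligned}
M_l &= \operatorname{lub}\{f(x) : x \in [x_{k_0-1}, \bar{x}]\}, & m_l &= \operatorname{glb}\{f(x) : x \in [x_{k_0-1}, \bar{x}]\}, \\
M_r &= \operatorname{lub}\{f(x) : x \in [\bar{x}, x_{k_0}]\}, & m_r &= \operatorname{glb}\{f(x) : x \in [\bar{x}, x_{k_0}]\}.
\end{aligned}
\]
Then
\[
m_{k_0} \le m_l \le M_l \le M_{k_0} \quad\text{and}\quad m_{k_0} \le m_r \le M_r \le M_{k_0}
\]
so that
\[
\begin{aligned}
m_{k_0}|I_{k_0}| &= m_{k_0}\left(|[x_{k_0-1}, \bar{x}]| + |[\bar{x}, x_{k_0}]|\right) \\
&\le m_l|[x_{k_0-1}, \bar{x}]| + m_r|[\bar{x}, x_{k_0}]| \\
&\le M_l|[x_{k_0-1}, \bar{x}]| + M_r|[\bar{x}, x_{k_0}]| \\
&\le M_{k_0}|[x_{k_0-1}, \bar{x}]| + M_{k_0}|[\bar{x}, x_{k_0}]| \\
&= M_{k_0}|I_{k_0}|.
\end{aligned}
\]
This implies
\[
\begin{aligned}
\underline{\mathcal{D}}(f, P) &= \sum_{k=1}^n m_k|I_k| \\
&= \sum_{k \ne k_0} m_k|I_k| + m_{k_0}|I_{k_0}| \\
&\le \sum_{k \ne k_0} m_k|I_k| + m_l|[x_{k_0-1}, \bar{x}]| + m_r|[\bar{x}, x_{k_0}]| \\
&= \underline{\mathcal{D}}(f, Q) \\
&\le \overline{\mathcal{D}}(f, Q) \\
&= \sum_{k \ne k_0} M_k|I_k| + M_l|[x_{k_0-1}, \bar{x}]| + M_r|[\bar{x}, x_{k_0}]| \\
&\le \sum_{k=1}^n M_k|I_k| \\
&= \overline{\mathcal{D}}(f, P).
\end{aligned}
\]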
The argument given above shows that the theorem holds if Q has one more point
than P . Using induction, this same technique also shows the theorem holds when
Q has an arbitrarily larger number of points than P .
The main lesson to be learned from Theorem 8.5 is that refining a partition causes the lower Darboux sum to increase and the upper Darboux sum to decrease. Moreover, if P, Q ∈ part([a, b]) and f : [a, b] → [−B, B], then,
\[
-B(b-a) \le \underline{\mathcal{D}}(f, P) \le \underline{\mathcal{D}}(f, P \cup Q) \le \overline{\mathcal{D}}(f, P \cup Q) \le \overline{\mathcal{D}}(f, Q) \le B(b-a).
\]
Therefore every Darboux lower sum is less than or equal to every Darboux upper sum. Consider the following definition with this in mind.
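The effect of refinement on Darboux sums can be seen in a small computation. This is only an illustrative sketch: the helper `darboux_sums` is a made-up name, and it assumes f is increasing, so that the glb and lub on each subinterval are attained at the left and right endpoints.

```python
import math

def darboux_sums(f, partition):
    """Lower and upper Darboux sums of an increasing function f.

    For increasing f, the glb and lub of f on [l, r] are f(l) and f(r).
    """
    pieces = list(zip(partition, partition[1:]))
    lower = sum(f(l) * (r - l) for l, r in pieces)
    upper = sum(f(r) * (r - l) for l, r in pieces)
    return lower, upper

P = [0.0, 0.5, 1.0]
Q = sorted(set(P) | {0.25, 0.8})   # Q contains P, so Q refines P

lower_P, upper_P = darboux_sums(math.exp, P)
lower_Q, upper_Q = darboux_sums(math.exp, Q)
print(lower_P, lower_Q, upper_Q, upper_P)
```

The four printed numbers appear in increasing order, matching the chain of inequalities in Theorem 8.5.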
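DEFINITION 8.6. The upper and lower Darboux integrals of a bounded function f : [a, b] → R are
\[
\overline{\mathcal{D}}(f) = \operatorname{glb}\{\overline{\mathcal{D}}(f, P) : P \in \operatorname{part}([a, b])\}
\]
and
\[
\underline{\mathcal{D}}(f) = \operatorname{lub}\{\underline{\mathcal{D}}(f, P) : P \in \operatorname{part}([a, b])\},
\]
respectively.
As a consequence of the observations preceding the definition, it follows that $\overline{\mathcal{D}}(f) \ge \underline{\mathcal{D}}(f)$ always. In the case $\overline{\mathcal{D}}(f) = \underline{\mathcal{D}}(f)$, the function is said to be Darboux integrable on [a, b], and the common value is written $\mathcal{D}(f)$.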
The following is obvious.
COROLLARY 8.7. A bounded function f : [a, b] → R is Darboux integrable if and only if for all ε > 0 there is a P ∈ part([a, b]) such that $\overline{\mathcal{D}}(f, P) - \underline{\mathcal{D}}(f, P) < \varepsilon$.
Which functions are Darboux integrable? The following corollary gives a first
approximation to an answer.
COROLLARY 8.8. If f ∈ C([a, b]), then $\mathcal{D}(f)$ exists.
PROOF. Let ε > 0. According to Corollary 6.31, f is uniformly continuous, so there is a δ > 0 such that whenever x, y ∈ [a, b] with $|x - y| < \delta$, then $|f(x) - f(y)| < \varepsilon/(b-a)$. Let P ∈ part([a, b]) with $\|P\| < \delta$. By Corollary 6.23, in each subinterval $I_i$ determined by P, there are $x_i^*, y_i^* \in I_i$ such that
\[
f(x_i^*) = \operatorname{lub}\{f(x) : x \in I_i\} \quad\text{and}\quad f(y_i^*) = \operatorname{glb}\{f(x) : x \in I_i\}.
\]
Since $|x_i^* - y_i^*| \le |I_i| < \delta$, we see $0 \le f(x_i^*) - f(y_i^*) < \varepsilon/(b-a)$, for 1 ≤ i ≤ n. Then
\[
\begin{aligned}
\overline{\mathcal{D}}(f) - \underline{\mathcal{D}}(f) &\le \overline{\mathcal{D}}(f, P) - \underline{\mathcal{D}}(f, P) \\
&= \sum_{i=1}^n f(x_i^*)|I_i| - \sum_{i=1}^n f(y_i^*)|I_i| \\
&= \sum_{i=1}^n \left(f(x_i^*) - f(y_i^*)\right)|I_i| \\
&< \frac{\varepsilon}{b-a}\sum_{i=1}^n |I_i| \\
&= \varepsilon
\end{aligned}
\]
and the corollary follows.
This corollary should not be construed to imply that only continuous functions are Darboux integrable. In fact, the set of integrable functions is much more
extensive than only the continuous functions. Consider the following example.
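EXAMPLE 8.3. Let f be the salt and pepper function of Example 6.15. It was shown that $C(f) = \mathbb{Q}^c$. We claim that f is Darboux integrable over any compact interval [a, b].
To see this, let ε > 0 and N ∈ N so that $1/N < \varepsilon/2(b-a)$. Let
\[
\{q_{k_i} : 1 \le i \le m\} = \{q_k : 1 \le k \le N\} \cap [a, b]
\]
and choose P ∈ part([a, b]) such that $\|P\| < \varepsilon/2m$. Then
\[
\begin{aligned}
\overline{\mathcal{D}}(f, P) &= \sum_{\ell=1}^n \operatorname{lub}\{f(x) : x \in I_\ell\}|I_\ell| \\
&= \sum_{q_{k_i} \notin I_\ell} \operatorname{lub}\{f(x) : x \in I_\ell\}|I_\ell| + \sum_{q_{k_i} \in I_\ell} \operatorname{lub}\{f(x) : x \in I_\ell\}|I_\ell| \\
&\le \frac{1}{N}(b-a) + m\|P\| \\
&< \frac{\varepsilon}{2(b-a)}(b-a) + m\frac{\varepsilon}{2m} \\
&= \varepsilon.
\end{aligned}
\]
Since f(x) = 0 whenever $x \in \mathbb{Q}^c$, it follows that $\underline{\mathcal{D}}(f, P) = 0$. Therefore, $\overline{\mathcal{D}}(f) = \underline{\mathcal{D}}(f) = 0$ and $\mathcal{D}(f) = 0$.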
4. The Integral
There are now two different definitions for the integral. It would be embarrassing if they gave different answers. The following theorem shows they're really
different sides of the same coin.2
T HEOREM 8.9. Let f : [a, b] → R.
(a) R f exists iff D f exists.
(b) If R f exists, then R f = D f .
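PROOF. (a) (=⇒) Suppose $\mathcal{R}(f)$ exists and ε > 0. By Theorem 8.3, f is bounded. Choose P ∈ part([a, b]) such that
\[
\left|\mathcal{R}(f) - \mathcal{R}(f, P, x_k^*)\right| < \varepsilon/4
\]
for all selections $x_k^*$ from P. From each $I_k$, choose $\overline{x}_k^*$ and $\underline{x}_k^*$ so that
\[
M_k - f(\overline{x}_k^*) < \frac{\varepsilon}{4(b-a)} \quad\text{and}\quad f(\underline{x}_k^*) - m_k < \frac{\varepsilon}{4(b-a)}.
\]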
2Theorem 8.9 shows that the two integrals presented here are the same. But, there are many
other integrals, and not all of them are equivalent. For example, the wellknown Lebesgue integral
includes all Riemann integrable functions, but not all Lebesgue integrable functions are Riemann
integrable. The Denjoy integral is another extension of the Riemann integral which is not the same
as the Lebesgue integral. For more discussion of this, see [12].
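Then
\[
\begin{aligned}
\overline{\mathcal{D}}(f, P) - \mathcal{R}(f, P, \overline{x}_k^*) &= \sum_{k=1}^n M_k|I_k| - \sum_{k=1}^n f(\overline{x}_k^*)|I_k| \\
&= \sum_{k=1}^n \left(M_k - f(\overline{x}_k^*)\right)|I_k| \\
&< \frac{\varepsilon}{4(b-a)}(b-a) = \frac{\varepsilon}{4}.
\end{aligned}
\]
In the same way,
\[
\mathcal{R}(f, P, \underline{x}_k^*) - \underline{\mathcal{D}}(f, P) < \varepsilon/4.
\]
Therefore,
\[
\begin{aligned}
\overline{\mathcal{D}}(f) - \underline{\mathcal{D}}(f) &= \operatorname{glb}\{\overline{\mathcal{D}}(f, Q) : Q \in \operatorname{part}([a, b])\} - \operatorname{lub}\{\underline{\mathcal{D}}(f, Q) : Q \in \operatorname{part}([a, b])\} \\
&\le \overline{\mathcal{D}}(f, P) - \underline{\mathcal{D}}(f, P) \\
&< \left(\mathcal{R}(f, P, \overline{x}_k^*) + \frac{\varepsilon}{4}\right) - \left(\mathcal{R}(f, P, \underline{x}_k^*) - \frac{\varepsilon}{4}\right) \\
&\le \left|\mathcal{R}(f, P, \overline{x}_k^*) - \mathcal{R}(f, P, \underline{x}_k^*)\right| + \frac{\varepsilon}{2} \\
&\le \left|\mathcal{R}(f, P, \overline{x}_k^*) - \mathcal{R}(f)\right| + \left|\mathcal{R}(f) - \mathcal{R}(f, P, \underline{x}_k^*)\right| + \frac{\varepsilon}{2} \\
&< \varepsilon.
\end{aligned}
\]
Since ε is an arbitrary positive number, this shows $\mathcal{D}(f)$ exists and equals $\mathcal{R}(f)$, which is part (b) of the theorem.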
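(⇐=) Suppose f : [a, b] → [−B, B], $\mathcal{D}(f)$ exists and ε > 0. Since $\mathcal{D}(f)$ exists, there is a $P_1$ ∈ part([a, b]), with points $a = p_0 < \cdots < p_m = b$, such that
\[
\overline{\mathcal{D}}(f, P_1) - \underline{\mathcal{D}}(f, P_1) < \frac{\varepsilon}{2}.
\]
Set δ = ε/8mB. Choose P ∈ part([a, b]) with $\|P\| < \delta$ and let $P_2 = P \cup P_1$. Since $P_1 \subset P_2$, according to Theorem 8.5,
\[
\overline{\mathcal{D}}(f, P_2) - \underline{\mathcal{D}}(f, P_2) < \frac{\varepsilon}{2}.
\]
Thinking of P as the generic partition, the interiors of its intervals $(x_{i-1}, x_i)$ may or may not contain points of $P_1$. For 1 ≤ i ≤ n, let
\[
Q_i = \{x_{i-1}, x_i\} \cup \left(P_1 \cap (x_{i-1}, x_i)\right) \in \operatorname{part}(I_i).
\]
If $P_1 \cap (x_{i-1}, x_i) = \emptyset$, then $\overline{\mathcal{D}}(f, P)$ and $\overline{\mathcal{D}}(f, P_2)$ have the term $M_i|I_i|$ in common because $Q_i = \{x_{i-1}, x_i\}$.
Otherwise, $P_1 \cap (x_{i-1}, x_i) \ne \emptyset$ and
\[
\overline{\mathcal{D}}(f, Q_i) \ge -B\|P_2\| \ge -B\|P\| > -B\delta.
\]
Since $P_1$ has m − 1 points in (a, b), there are at most m − 1 of the $Q_i$ not contained in P.
This leads to the estimate
\[
\overline{\mathcal{D}}(f, P) - \overline{\mathcal{D}}(f, P_2) = \overline{\mathcal{D}}(f, P) - \sum_{i=1}^n \overline{\mathcal{D}}(f, Q_i) < (m-1)2B\delta < \frac{\varepsilon}{4}.
\]
In the same way,
\[
\underline{\mathcal{D}}(f, P_2) - \underline{\mathcal{D}}(f, P) < (m-1)2B\delta < \frac{\varepsilon}{4}.
\]
Putting these estimates together yields
\[
\begin{aligned}
\overline{\mathcal{D}}(f, P) - \underline{\mathcal{D}}(f, P) &= \left(\overline{\mathcal{D}}(f, P) - \overline{\mathcal{D}}(f, P_2)\right) + \left(\overline{\mathcal{D}}(f, P_2) - \underline{\mathcal{D}}(f, P_2)\right) + \left(\underline{\mathcal{D}}(f, P_2) - \underline{\mathcal{D}}(f, P)\right) \\
&< \frac{\varepsilon}{4} + \frac{\varepsilon}{2} + \frac{\varepsilon}{4} = \varepsilon.
\end{aligned}
\]
This shows that, given ε > 0, there is a δ > 0 so that $\|P\| < \delta$ implies
\[
\overline{\mathcal{D}}(f, P) - \underline{\mathcal{D}}(f, P) < \varepsilon.
\]
Since
\[
\underline{\mathcal{D}}(f, P) \le \mathcal{D}(f) \le \overline{\mathcal{D}}(f, P) \quad\text{and}\quad \underline{\mathcal{D}}(f, P) \le \mathcal{R}(f, P, x_i^*) \le \overline{\mathcal{D}}(f, P)
\]
for every selection $x_i^*$ from P, it follows that $|\mathcal{R}(f, P, x_i^*) - \mathcal{D}(f)| < \varepsilon$ when $\|P\| < \delta$. We conclude f is Riemann integrable and $\mathcal{R}(f) = \mathcal{D}(f)$.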
From Theorem 8.9, we are justified in using a single notation for both $\mathcal{R}(f)$ and $\mathcal{D}(f)$. The obvious choice is the familiar $\int_a^b f(x)\,dx$, or, more simply, $\int_a^b f$.
When proving statements about the integral, it’s convenient to switch back
and forth between the Riemann and Darboux formulations. Given f : [a, b] → R
the following three facts summarize much of what we know.
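(1) $\int_a^b f$ exists iff there is an α ∈ R such that for all ε > 0 there is a δ > 0 such that whenever P ∈ part([a, b]) with $\|P\| < \delta$ and $x_i^*$ is a selection from P, then $|\mathcal{R}(f, P, x_i^*) - \alpha| < \varepsilon$. In this case $\int_a^b f = \alpha$.
(2) $\int_a^b f$ exists iff ∀ε > 0 ∃P ∈ part([a, b]) $\left(\overline{\mathcal{D}}(f, P) - \underline{\mathcal{D}}(f, P) < \varepsilon\right)$.
(3) For any P ∈ part([a, b]) and selection $x_i^*$ from P,
\[
\underline{\mathcal{D}}(f, P) \le \mathcal{R}(f, P, x_i^*) \le \overline{\mathcal{D}}(f, P).
\]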
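Fact (3) is easy to watch in action. The sketch below is an illustration under stated assumptions rather than anything from the notes: `sandwich` is a hypothetical helper, f(x) = x² on [0, 1] is chosen because it is increasing (so the glb and lub on each subinterval sit at the endpoints), and the selections are drawn at random with a fixed seed.

```python
import random

def sandwich(f, partition, rng):
    """Return (lower Darboux sum, Riemann sum for a random selection, upper Darboux sum).

    Assumes f is increasing, so glb and lub on [l, r] are f(l) and f(r).
    """
    pieces = list(zip(partition, partition[1:]))
    lower = sum(f(l) * (r - l) for l, r in pieces)
    upper = sum(f(r) * (r - l) for l, r in pieces)
    riemann = sum(f(rng.uniform(l, r)) * (r - l) for l, r in pieces)
    return lower, riemann, upper

rng = random.Random(0)                     # fixed seed so runs are repeatable
partition = [k / 20 for k in range(21)]    # regular partition of [0, 1]
results = [sandwich(lambda x: x * x, partition, rng) for _ in range(100)]
print(results[0])
```

Every one of the 100 random Riemann sums lands between the lower and upper Darboux sums for the same partition.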
5. The Cauchy Criterion
We now face a conundrum. In order to show that $\int_a^b f$ exists, we must know
its value. It’s often very hard to determine the value of an integral, even if the
integral exists. We’ve faced this same situation before with sequences. The basic
definition of convergence for a sequence, Definition 3.2, requires the limit of the
sequence be known. The path out of the dilemma in the case of sequences was
the Cauchy criterion for convergence, Theorem 3.22. The solution is the same
here, with a Cauchy criterion for the existence of the integral.
THEOREM 8.10 (Cauchy Criterion). Let f : [a, b] → R. The following statements are equivalent.
(a) $\int_a^b f$ exists.
(b) Given ε > 0 there exists P ∈ part([a, b]) such that if $P \subset Q_1$ and $P \subset Q_2$, then
\[
\left|\mathcal{R}(f, Q_1, x_k^*) - \mathcal{R}(f, Q_2, y_k^*)\right| < \varepsilon \tag{8.2}
\]
for any selections from $Q_1$ and $Q_2$.
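PROOF. (=⇒) Assume $\int_a^b f$ exists and let ε > 0. According to Definition 8.1, there is a δ > 0 such that whenever P ∈ part([a, b]) with $\|P\| < \delta$, then $\left|\int_a^b f - \mathcal{R}(f, P, x_i^*)\right| < \varepsilon/2$ for every selection. Fix such a P. If $P \subset Q_1$ and $P \subset Q_2$, then $\|Q_1\| < \delta$, $\|Q_2\| < \delta$ and a simple application of the triangle inequality shows
\[
\left|\mathcal{R}(f, Q_1, x_k^*) - \mathcal{R}(f, Q_2, y_k^*)\right| \le \left|\mathcal{R}(f, Q_1, x_k^*) - \int_a^b f\right| + \left|\int_a^b f - \mathcal{R}(f, Q_2, y_k^*)\right| < \varepsilon.
\]
(⇐=) Let ε > 0 and choose P ∈ part([a, b]) satisfying (b) with ε/2 in place of ε.
We first claim that f is bounded. To see this, suppose it is not. Then it must be unbounded on an interval $I_{k_0}$ determined by P. Fix a selection $\{x_k^* \in I_k : 1 \le k \le n\}$ and let $y_k^* = x_k^*$ for $k \ne k_0$ with $y_{k_0}^*$ any element of $I_{k_0}$. Then
\[
\frac{\varepsilon}{2} > \left|\mathcal{R}(f, P, x_k^*) - \mathcal{R}(f, P, y_k^*)\right| = \left|f(x_{k_0}^*) - f(y_{k_0}^*)\right||I_{k_0}|.
\]
But, the right-hand side can be made bigger than ε/2 with an appropriate choice of $y_{k_0}^*$ because of the assumption that f is unbounded on $I_{k_0}$. This contradiction forces the conclusion that f is bounded.
Thinking of P as the generic partition and using $m_k$ and $M_k$ as usual with Darboux sums, for each k, choose $x_k^*, y_k^* \in I_k$ such that
\[
M_k - f(x_k^*) < \frac{\varepsilon}{4n|I_k|} \quad\text{and}\quad f(y_k^*) - m_k < \frac{\varepsilon}{4n|I_k|}.
\]
With these selections,
\[
\begin{aligned}
\overline{\mathcal{D}}(f, P) - \underline{\mathcal{D}}(f, P) &= \overline{\mathcal{D}}(f, P) - \mathcal{R}(f, P, x_k^*) + \mathcal{R}(f, P, x_k^*) - \mathcal{R}(f, P, y_k^*) + \mathcal{R}(f, P, y_k^*) - \underline{\mathcal{D}}(f, P) \\
&= \sum_{k=1}^n \left(M_k - f(x_k^*)\right)|I_k| + \mathcal{R}(f, P, x_k^*) - \mathcal{R}(f, P, y_k^*) + \sum_{k=1}^n \left(f(y_k^*) - m_k\right)|I_k| \\
&\le \sum_{k=1}^n \left|M_k - f(x_k^*)\right||I_k| + \left|\mathcal{R}(f, P, x_k^*) - \mathcal{R}(f, P, y_k^*)\right| + \sum_{k=1}^n \left|f(y_k^*) - m_k\right||I_k| \\
&< \sum_{k=1}^n \frac{\varepsilon}{4n|I_k|}|I_k| + \left|\mathcal{R}(f, P, x_k^*) - \mathcal{R}(f, P, y_k^*)\right| + \sum_{k=1}^n \frac{\varepsilon}{4n|I_k|}|I_k| \\
&< \frac{\varepsilon}{4} + \frac{\varepsilon}{2} + \frac{\varepsilon}{4} = \varepsilon.
\end{aligned}
\]
Corollary 8.7 implies $\mathcal{D}(f)$ exists and Theorem 8.9 finishes the proof.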
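COROLLARY 8.11. If $\int_a^b f$ exists and [c, d] ⊂ [a, b], then $\int_c^d f$ exists.
PROOF. Let $P_0 = \{a, b, c, d\}$ ∈ part([a, b]) and ε > 0. Using Theorem 8.10, choose a partition $P_\varepsilon$ such that $P_0 \subset P_\varepsilon$ and whenever $P_\varepsilon \subset P$ and $P_\varepsilon \subset P'$, then
\[
\left|\mathcal{R}(f, P, x_k^*) - \mathcal{R}(f, P', y_k^*)\right| < \varepsilon.
\]
Let $P_\varepsilon^1 = P_\varepsilon \cap [a, c]$, $P_\varepsilon^2 = P_\varepsilon \cap [c, d]$ and $P_\varepsilon^3 = P_\varepsilon \cap [d, b]$. Suppose $P_\varepsilon^2 \subset Q_1$ and $P_\varepsilon^2 \subset Q_2$. Then $P_\varepsilon^1 \cup Q_i \cup P_\varepsilon^3$ for i = 1, 2 are refinements of $P_\varepsilon$ and
\[
\left|\mathcal{R}(f, Q_1, x_k^*) - \mathcal{R}(f, Q_2, y_k^*)\right| = \left|\mathcal{R}(f, P_\varepsilon^1 \cup Q_1 \cup P_\varepsilon^3, x_k^*) - \mathcal{R}(f, P_\varepsilon^1 \cup Q_2 \cup P_\varepsilon^3, y_k^*)\right| < \varepsilon
\]
for any selections, provided the two selections on the right agree on the intervals of $P_\varepsilon^1$ and $P_\varepsilon^3$. An application of Theorem 8.10 on [c, d] shows $\int_c^d f$ exists.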
6. Properties of the Integral
THEOREM 8.12. If $\int_a^b f$ and $\int_a^b g$ both exist, then
(a) If α, β ∈ R, then $\int_a^b (\alpha f + \beta g)$ exists and $\int_a^b (\alpha f + \beta g) = \alpha\int_a^b f + \beta\int_a^b g$.
(b) $\int_a^b fg$ exists.
(c) $\int_a^b |f|$ exists.
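PROOF. (a) Let ε > 0. If α = 0, in light of Example 8.1, it is clear αf is integrable. So, assume α ≠ 0, and choose a partition $P_f$ ∈ part([a, b]) such that whenever $P_f \subset P$, then
\[
\left|\mathcal{R}(f, P, x_k^*) - \int_a^b f\right| < \frac{\varepsilon}{2|\alpha|}.
\]
Then
\[
\begin{aligned}
\left|\mathcal{R}(\alpha f, P, x_k^*) - \alpha\int_a^b f\right| &= \left|\sum_{k=1}^n \alpha f(x_k^*)|I_k| - \alpha\int_a^b f\right| \\
&= |\alpha|\left|\sum_{k=1}^n f(x_k^*)|I_k| - \int_a^b f\right| \\
&= |\alpha|\left|\mathcal{R}(f, P, x_k^*) - \int_a^b f\right| \\
&< |\alpha|\frac{\varepsilon}{2|\alpha|} \\
&= \frac{\varepsilon}{2}.
\end{aligned}
\]
This shows αf is integrable and $\int_a^b \alpha f = \alpha\int_a^b f$.
Assuming β ≠ 0, in the same way, we can choose a $P_g$ ∈ part([a, b]) such that when $P_g \subset P$, then
\[
\left|\mathcal{R}(g, P, x_k^*) - \int_a^b g\right| < \frac{\varepsilon}{2|\beta|}.
\]
Let $P_\varepsilon = P_f \cup P_g$ be the common refinement of $P_f$ and $P_g$, and suppose $P_\varepsilon \subset P$. Then
\[
\left|\mathcal{R}(\alpha f + \beta g, P, x_k^*) - \left(\alpha\int_a^b f + \beta\int_a^b g\right)\right| \le |\alpha|\left|\mathcal{R}(f, P, x_k^*) - \int_a^b f\right| + |\beta|\left|\mathcal{R}(g, P, x_k^*) - \int_a^b g\right| < \varepsilon
\]
for any selection. This shows αf + βg is integrable and $\int_a^b (\alpha f + \beta g) = \alpha\int_a^b f + \beta\int_a^b g$.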
(b) Claim: If $\int_a^b h$ exists, then so does $\int_a^b h^2$.
To see this, suppose first that 0 ≤ h(x) ≤ M on [a, b]. If M = 0, the claim is trivially true, so suppose M > 0. Let ε > 0 and choose P ∈ part([a, b]) such that
\[
\overline{\mathcal{D}}(h, P) - \underline{\mathcal{D}}(h, P) < \frac{\varepsilon}{2M}.
\]
For each 1 ≤ k ≤ n, let
\[
m_k = \operatorname{glb}\{h(x) : x \in I_k\} \le \operatorname{lub}\{h(x) : x \in I_k\} = M_k.
\]
Since h ≥ 0,
\[
m_k^2 = \operatorname{glb}\{h(x)^2 : x \in I_k\} \le \operatorname{lub}\{h(x)^2 : x \in I_k\} = M_k^2.
\]
Using this, we see
\[
\begin{aligned}
\overline{\mathcal{D}}(h^2, P) - \underline{\mathcal{D}}(h^2, P) &= \sum_{k=1}^n (M_k^2 - m_k^2)|I_k| \\
&= \sum_{k=1}^n (M_k + m_k)(M_k - m_k)|I_k| \\
&\le 2M\sum_{k=1}^n (M_k - m_k)|I_k| \\
&= 2M\left(\overline{\mathcal{D}}(h, P) - \underline{\mathcal{D}}(h, P)\right) \\
&< \varepsilon.
\end{aligned}
\]
Therefore, $h^2$ is integrable when h ≥ 0.
If h is not nonnegative, let m = glb {h(x) : a ≤ x ≤ b}. Then h − m ≥ 0, and h − m is integrable by (a). From the claim, $(h - m)^2$ is integrable. Since
\[
h^2 = (h - m)^2 + 2mh - m^2,
\]
it follows from (a) that $h^2$ is integrable.
Finally, $fg = \frac{1}{4}\left((f + g)^2 - (f - g)^2\right)$ is integrable by the claim and (a).
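(c) Claim: If h ≥ 0 is integrable, then so is $\sqrt{h}$.
To see this, let ε > 0 and choose P ∈ part([a, b]) such that
\[
\overline{\mathcal{D}}(h, P) - \underline{\mathcal{D}}(h, P) < \varepsilon^2.
\]
For each 1 ≤ k ≤ n, let
\[
m_k = \operatorname{glb}\{\sqrt{h(x)} : x \in I_k\} \le \operatorname{lub}\{\sqrt{h(x)} : x \in I_k\} = M_k
\]
and define
\[
A = \{k : M_k - m_k < \varepsilon\} \quad\text{and}\quad B = \{k : M_k - m_k \ge \varepsilon\}.
\]
Then
\[
\sum_{k \in A} (M_k - m_k)|I_k| < \varepsilon(b-a). \tag{8.3}
\]
Using the fact that $m_k \ge 0$, we see that $M_k - m_k \le M_k + m_k$, and
\[
\begin{aligned}
\sum_{k \in B} (M_k - m_k)|I_k| &\le \frac{1}{\varepsilon}\sum_{k \in B} (M_k + m_k)(M_k - m_k)|I_k| \\
&= \frac{1}{\varepsilon}\sum_{k \in B} (M_k^2 - m_k^2)|I_k| \\
&\le \frac{1}{\varepsilon}\left(\overline{\mathcal{D}}(h, P) - \underline{\mathcal{D}}(h, P)\right) \\
&< \varepsilon.
\end{aligned}
\tag{8.4}
\]
Combining (8.3) and (8.4), it follows that
\[
\overline{\mathcal{D}}(\sqrt{h}, P) - \underline{\mathcal{D}}(\sqrt{h}, P) < \varepsilon(b-a) + \varepsilon = \varepsilon((b-a) + 1)
\]
can be made arbitrarily small. Therefore, $\sqrt{h}$ is integrable.
Since $|f| = \sqrt{f^2}$, an application of (b) and the claim suffice to prove (c).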
THEOREM 8.13. If $\int_a^b f$ exists, then
(a) If f ≥ 0 on [a, b], then $\int_a^b f \ge 0$.
(b) $\left|\int_a^b f\right| \le \int_a^b |f|$.
(c) If a ≤ c ≤ b, then $\int_a^b f = \int_a^c f + \int_c^b f$.
PROOF. (a) Since all the Riemann sums are nonnegative, this follows at once.
(b) It is always true that |f| + f ≥ 0 and |f| − f ≥ 0, so by (a), $\int_a^b (|f| + f) \ge 0$ and $\int_a^b (|f| - f) \ge 0$. Rearranging these shows $-\int_a^b f \le \int_a^b |f|$ and $\int_a^b f \le \int_a^b |f|$. Therefore, $\left|\int_a^b f\right| \le \int_a^b |f|$, which is (b).
(c) By Corollary 8.11, all the integrals exist. Let ε > 0 and choose $P_l$ ∈ part([a, c]) and $P_r$ ∈ part([c, b]) such that whenever $P_l \subset Q_l$ and $P_r \subset Q_r$, then,
\[
\left|\mathcal{R}(f, Q_l, x_k^*) - \int_a^c f\right| < \frac{\varepsilon}{2} \quad\text{and}\quad \left|\mathcal{R}(f, Q_r, y_k^*) - \int_c^b f\right| < \frac{\varepsilon}{2}.
\]
If $P = P_l \cup P_r$ and $Q = Q_l \cup Q_r$, then P, Q ∈ part([a, b]) and P ⊂ Q. The triangle inequality gives
\[
\left|\mathcal{R}(f, Q, x_k^*) - \int_a^c f - \int_c^b f\right| < \varepsilon.
\]
Since every refinement of P has the form $Q_l \cup Q_r$, part (c) follows.
There's some notational trickery that can be played here. If $\int_a^b f$ exists, then we define $\int_b^a f = -\int_a^b f$. With this convention, it can be shown
\[
\int_a^b f = \int_a^c f + \int_c^b f \tag{8.5}
\]
no matter the order of a, b and c, as long as at least two of the integrals exist. (See Problem 8.4.)
7. The Fundamental Theorem of Calculus
THEOREM 8.14 (Fundamental Theorem of Calculus 1). Suppose f, F : [a, b] → R satisfy
(a) $\int_a^b f$ exists
(b) F ∈ C([a, b]) ∩ D((a, b))
(c) F′(x) = f(x), ∀x ∈ (a, b)
Then $\int_a^b f = F(b) - F(a)$.
PROOF. Let ε > 0. According to (a) and Definition 8.1, P ∈ part([a, b]) can be chosen such that
\[
\left|\mathcal{R}(f, P, x_k^*) - \int_a^b f\right| < \varepsilon
\]
for every selection from P. On each interval $[x_{k-1}, x_k]$ determined by P, the function F satisfies the conditions of the Mean Value Theorem. (See Corollary 7.13.) Therefore, for each k, there is a $c_k \in (x_{k-1}, x_k)$ such that
\[
F(x_k) - F(x_{k-1}) = F'(c_k)(x_k - x_{k-1}) = f(c_k)|I_k|.
\]
So,
\[
\begin{aligned}
\left|\int_a^b f - (F(b) - F(a))\right| &= \left|\int_a^b f - \sum_{k=1}^n \left(F(x_k) - F(x_{k-1})\right)\right| \\
&= \left|\int_a^b f - \sum_{k=1}^n f(c_k)|I_k|\right| \\
&= \left|\int_a^b f - \mathcal{R}(f, P, c_k)\right| \\
&< \varepsilon
\end{aligned}
\]
and the theorem follows.
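The telescoping step in this proof can be made concrete. The sketch below is an illustration only, with hypothetical names: it takes F(x) = x² and f(x) = 2x, for which the Mean Value Theorem point on [l, r] happens to be the midpoint, and checks with exact rational arithmetic that the Riemann sum built from those points equals F(b) − F(a) for an irregular partition.

```python
from fractions import Fraction

def riemann_sum(f, partition, selection):
    """Riemann sum of f for the given partition and one selected point per subinterval."""
    pieces = zip(partition, partition[1:])
    return sum(f(c) * (r - l) for (l, r), c in zip(pieces, selection))

def F(x):
    return x * x

def f(x):
    return 2 * x          # F' = f

P = [Fraction(0), Fraction(1, 3), Fraction(1), Fraction(5, 2)]

# For F(x) = x^2 the Mean Value Theorem point on [l, r] is the midpoint,
# because (r^2 - l^2)/(r - l) = r + l = 2 * (l + r)/2.
c = [(l + r) / 2 for l, r in zip(P, P[1:])]

total = riemann_sum(f, P, c)
print(total, F(P[-1]) - F(P[0]))
```

With the Mean Value Theorem points as the selection, the Riemann sum is not merely close to F(b) − F(a); it is equal to it, which is exactly what the proof exploits.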
EXAMPLE 8.4. The Fundamental Theorem of Calculus can be used to give a different form of Taylor's theorem. As in Theorem 7.18, suppose f and its first n + 1 derivatives exist on [a, b] and $\int_a^b f^{(n+1)}$ exists. There is a function $R_f(n, x, t)$ such that
\[
R_f(n, x, t) = f(x) - \sum_{k=0}^n \frac{f^{(k)}(t)}{k!}(x - t)^k
\]
for a ≤ t ≤ b. Differentiating both sides of the equation with respect to t, note that the right-hand side telescopes, so the result is
\[
\frac{d}{dt}R_f(n, x, t) = -\frac{(x - t)^n}{n!}f^{(n+1)}(t).
\]
Using Theorem 8.14 and the fact that $R_f(n, x, x) = 0$ gives
\[
\begin{aligned}
R_f(n, x, c) &= R_f(n, x, c) - R_f(n, x, x) \\
&= \int_x^c \frac{d}{dt}R_f(n, x, t)\,dt \\
&= \int_c^x \frac{(x - t)^n}{n!}f^{(n+1)}(t)\,dt,
\end{aligned}
\]
which is the integral form of the remainder from Taylor's formula.
COROLLARY 8.15 (Integration by Parts). If f, g ∈ C([a, b]) ∩ D((a, b)) and both f′g and fg′ are integrable on [a, b], then
\[
\int_a^b fg' + \int_a^b f'g = f(b)g(b) - f(a)g(a).
\]
PROOF. Use Theorems 7.3(c) and 8.14.
Suppose $\int_a^b f$ exists. By Corollary 8.11, f is integrable on every interval [a, x], for x ∈ [a, b]. This allows us to define a function F : [a, b] → R as $F(x) = \int_a^x f$, called the indefinite integral of f on [a, b].
THEOREM 8.16 (Fundamental Theorem of Calculus 2). Let f be integrable on [a, b] and F be the indefinite integral of f. Then F ∈ C([a, b]) and F′(x) = f(x) whenever x ∈ C(f) ∩ (a, b).
PROOF. To show F ∈ C([a, b]), let $x_0$ ∈ [a, b] and ε > 0. Since $\int_a^b f$ exists, there is an M > lub {|f(x)| : a ≤ x ≤ b}. Choose 0 < δ < ε/M and $x \in (x_0 - \delta, x_0 + \delta) \cap [a, b]$. Then
\[
|F(x) - F(x_0)| = \left|\int_{x_0}^x f\right| \le M|x - x_0| < M\delta < \varepsilon
\]
and $x_0 \in C(F)$.
FIGURE 8.3. This figure illustrates a "box" argument showing $\lim_{h \to 0} \frac{1}{h}\int_x^{x+h} f = f(x)$.
Let $x_0$ ∈ C(f) ∩ (a, b) and ε > 0. There is a δ > 0 such that $x \in (x_0 - \delta, x_0 + \delta) \subset (a, b)$ implies $|f(x) - f(x_0)| < \varepsilon$. If 0 < h < δ, then
\[
\begin{aligned}
\left|\frac{F(x_0 + h) - F(x_0)}{h} - f(x_0)\right| &= \left|\frac{1}{h}\int_{x_0}^{x_0+h} f - f(x_0)\right| \\
&= \left|\frac{1}{h}\int_{x_0}^{x_0+h} (f(t) - f(x_0))\,dt\right| \\
&\le \frac{1}{h}\int_{x_0}^{x_0+h} |f(t) - f(x_0)|\,dt \\
&< \frac{1}{h}\int_{x_0}^{x_0+h} \varepsilon\,dt \\
&= \varepsilon.
\end{aligned}
\]
This shows $F'_+(x_0) = f(x_0)$. It can be shown in the same way that $F'_-(x_0) = f(x_0)$. Therefore $F'(x_0) = f(x_0)$.
The right picture makes Theorem 8.16 almost obvious. Consider Figure 8.3. Suppose x ∈ C(f) and ε > 0. There is a δ > 0 such that
\[
f((x - \delta, x + \delta) \cap [a, b]) \subset (f(x) - \varepsilon/2, f(x) + \varepsilon/2).
\]
Let
\[
m = \operatorname{glb}\{f(y) : |x - y| < \delta\} \le \operatorname{lub}\{f(y) : |x - y| < \delta\} = M.
\]
Apparently M − m ≤ ε and for 0 < h < δ,
\[
mh \le \int_x^{x+h} f \le Mh \implies m \le \frac{F(x + h) - F(x)}{h} \le M.
\]
Since M − m → 0 as h → 0, a "squeezing" argument shows
\[
\lim_{h \downarrow 0} \frac{F(x + h) - F(x)}{h} = f(x).
\]
A similar argument establishes the limit from the left and F′(x) = f(x).
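The difference quotient in this argument can be watched numerically. The following is a rough sketch under stated assumptions: the helper `integral` is a plain midpoint rule (an approximation introduced here, not the definition of the integral used in the notes), f = cos is chosen because it is continuous everywhere, and x = 0.7 is an arbitrary point.

```python
import math

def integral(f, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

f = math.cos
x = 0.7

def average(h):
    """(F(x + h) - F(x)) / h, where F is the indefinite integral of f."""
    return integral(f, x, x + h) / h

errors = [abs(average(h) - f(x)) for h in (0.1, 0.01, 0.001)]
print(errors)
```

The errors shrink roughly in proportion to h, which is the numerical face of the squeezing argument: the average of f over [x, x + h] is trapped between m and M, and both tend to f(x).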
EXAMPLE 8.5. The usual definition of the natural logarithm function depends on the Fundamental Theorem of Calculus. Recall for x > 0,
\[
\ln(x) = \int_1^x \frac{1}{t}\,dt.
\]
Since f(t) = 1/t is continuous on (0, ∞), Theorem 8.16 shows
\[
\frac{d}{dx}\ln(x) = \frac{1}{x}.
\]
It should also be noted that the notational convention mentioned above equation (8.5) is used to get ln(x) < 0 when 0 < x < 1.
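The integral definition of the logarithm is easy to approximate directly. This is a minimal sketch, assuming a midpoint rule as the approximation scheme; the name `ln` and the step count are choices made here for illustration. A negative step is used when 0 < x < 1, which mirrors the sign convention just mentioned.

```python
import math

def ln(x, n=20_000):
    """Midpoint-rule approximation of the integral of 1/t from 1 to x, for x > 0."""
    h = (x - 1) / n          # negative when 0 < x < 1, so the result is negative there
    return h * sum(1 / (1 + (k + 0.5) * h) for k in range(n))

print(ln(2), math.log(2))
print(ln(0.5))
print(ln(6), ln(2) + ln(3))
```

The last line compares ln(6) with ln(2) + ln(3), a numerical preview of the identity f(xy) = f(x) + f(y) asked for in Exercise 8.11.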
It's easy to read too much into the Fundamental Theorem of Calculus. We are tempted to start thinking of integration and differentiation as opposites of each other. But, this is far from the truth. Integration and antidifferentiation are different operations that happen to be tied together, sometimes, by the Fundamental Theorem of Calculus. Consider the following examples.
EXAMPLE 8.6. Let
\[
f(x) = \begin{cases} |x|/x, & x \ne 0 \\ 0, & x = 0 \end{cases}
\]
It's easy to prove that f is integrable over any compact interval, and that $F(x) = \int_{-1}^x f = |x| - 1$ is an indefinite integral of f. But, F is not differentiable at x = 0 and f is not a derivative, according to Theorem 7.17.
EXAMPLE 8.7. Let
\[
f(x) = \begin{cases} x^2\sin\frac{1}{x^2}, & x \ne 0 \\ 0, & x = 0 \end{cases}
\]
It's straightforward to show that f is differentiable and
\[
f'(x) = \begin{cases} 2x\sin\frac{1}{x^2} - \frac{2}{x}\cos\frac{1}{x^2}, & x \ne 0 \\ 0, & x = 0 \end{cases}
\]
Since f′ is unbounded near x = 0, it follows from Theorem 8.3 that f′ is not integrable over any interval containing 0.
EXAMPLE 8.8. Let f be the salt and pepper function of Example 6.15. It was shown in Example 8.3 that $\int_a^b f = 0$ on any interval [a, b]. If $F(x) = \int_0^x f$, then F(x) = 0 for all x and F′ = f only on $C(f) = \mathbb{Q}^c$.
8. Change of Variables
Integration by substitution works side-by-side with the Fundamental Theorem of Calculus in the integration section of any calculus course. Most of the time calculus books require all functions in sight to be continuous. In that case, a substitution theorem is an easy consequence of the Fundamental Theorem and the Chain Rule. (See Exercise 8.13.) More general statements are true, but they are harder to prove.
THEOREM 8.17. If f and g are functions such that
(a) g is strictly monotone on [a, b],
(b) g is continuous on [a, b],
(c) g is differentiable on (a, b), and
(d) both $\int_{g(a)}^{g(b)} f$ and $\int_a^b (f \circ g)g'$ exist,
then
\[
\int_{g(a)}^{g(b)} f = \int_a^b (f \circ g)g'. \tag{8.6}
\]
P ROOF. Suppositions (a) and (b) show g is a bijection from [a, b] to an interval
[c, d ]. The correspondence between the endpoints depends on whether g is
increasing or decreasing.
Let ε > 0.
From (d) and Definition 8.1, there is a $\delta_1 > 0$ such that whenever P ∈ part([a, b]) with $\|P\| < \delta_1$, then
\[
\left|\mathcal{R}((f \circ g)g', P, x_i^*) - \int_a^b (f \circ g)g'\right| < \frac{\varepsilon}{2} \tag{8.7}
\]
for any selection from P. Choose $P_1$ ∈ part([a, b]) such that $\|P_1\| < \delta_1$.
Using the same argument, there is a $\delta_2 > 0$ such that whenever Q ∈ part([c, d]) with $\|Q\| < \delta_2$, then
\[
\left|\mathcal{R}(f, Q, x_i^*) - \int_c^d f\right| < \frac{\varepsilon}{2} \tag{8.8}
\]
for any selection from Q. As above, choose $Q_1$ ∈ part([c, d]) such that $\|Q_1\| < \delta_2$.
Setting $P_2 = P_1 \cup \{g^{-1}(x) : x \in Q_1\}$ and $Q_2 = Q_1 \cup \{g(x) : x \in P_1\}$, it is apparent that $P_1 \subset P_2$, $Q_1 \subset Q_2$, $\|P_2\| \le \|P_1\| < \delta_1$, $\|Q_2\| \le \|Q_1\| < \delta_2$ and $Q_2 = \{g(x) : x \in P_2\}$. From (8.7) and (8.8), it follows that
\[
\left|\int_a^b (f \circ g)g' - \mathcal{R}((f \circ g)g', P_2, x_i^*)\right| < \frac{\varepsilon}{2} \quad\text{and}\quad \left|\int_c^d f - \mathcal{R}(f, Q_2, y_i^*)\right| < \frac{\varepsilon}{2} \tag{8.9}
\]
for any selections from $P_2$ and $Q_2$.
Label the points of $P_2$ as $a = x_0 < x_1 < \cdots < x_n = b$ and those of $Q_2$ as $c = y_0 < y_1 < \cdots < y_n = d$. From (b), (c) and the Mean Value Theorem, for each i, choose $c_i \in (x_{i-1}, x_i)$ such that
\[
g(x_i) - g(x_{i-1}) = g'(c_i)(x_i - x_{i-1}). \tag{8.10}
\]
Notice that $\{c_i : 1 \le i \le n\}$ is a selection from $P_2$.
First, assume g is strictly increasing. In this case $g(x_i) = y_i$ for 0 ≤ i ≤ n and $g(c_i) \in (y_{i-1}, y_i)$ for 0 < i ≤ n, so $g(c_i)$ is a selection from $Q_2$.
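\[
\begin{aligned}
\left|\int_{g(a)}^{g(b)} f - \int_a^b (f \circ g)g'\right| &= \left|\int_{g(a)}^{g(b)} f - \mathcal{R}(f, Q_2, g(c_i)) + \mathcal{R}(f, Q_2, g(c_i)) - \int_a^b (f \circ g)g'\right| \\
&\le \left|\int_{g(a)}^{g(b)} f - \mathcal{R}(f, Q_2, g(c_i))\right| + \left|\mathcal{R}(f, Q_2, g(c_i)) - \int_a^b (f \circ g)g'\right|
\end{aligned}
\]
Use the triangle inequality and (8.9). Expand the second Riemann sum.
\[
< \frac{\varepsilon}{2} + \left|\sum_{i=1}^n f(g(c_i))\left(g(x_i) - g(x_{i-1})\right) - \int_a^b (f \circ g)g'\right|
\]
Apply the Mean Value Theorem, as in (8.10), and then use (8.9).
\[
\begin{aligned}
&= \frac{\varepsilon}{2} + \left|\sum_{i=1}^n f(g(c_i))g'(c_i)(x_i - x_{i-1}) - \int_a^b (f \circ g)g'\right| \\
&= \frac{\varepsilon}{2} + \left|\mathcal{R}((f \circ g)g', P_2, c_i) - \int_a^b (f \circ g)g'\right| \\
&< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} \\
&= \varepsilon
\end{aligned}
\]
and (8.6) follows.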
Now assume g is strictly decreasing on [a, b]. The proof is much the same as above, except the bookkeeping is trickier because order is reversed instead of preserved by g. This means $g(x_k) = y_{n-k}$ when 0 ≤ k ≤ n and $g(c_{n-k+1}) \in (y_{k-1}, y_k)$ for 0 < k ≤ n. Therefore, $g(c_{n-k+1})$ is a selection from $Q_2$. From the Mean Value Theorem,
\[
\begin{aligned}
y_k - y_{k-1} &= g(x_{n-k}) - g(x_{n-k+1}) \\
&= -\left(g(x_{n-k+1}) - g(x_{n-k})\right) \\
&= -g'(c_{n-k+1})(x_{n-k+1} - x_{n-k}),
\end{aligned}
\tag{8.11}
\]
where $c_k \in (x_{k-1}, x_k)$ is as above. The rest of the proof is much like the case when g is increasing.
\[
\begin{aligned}
\left|\int_{g(a)}^{g(b)} f - \int_a^b (f \circ g)g'\right| &= \left|-\int_{g(b)}^{g(a)} f + \mathcal{R}(f, Q_2, g(c_{n-k+1})) - \mathcal{R}(f, Q_2, g(c_{n-k+1})) - \int_a^b (f \circ g)g'\right| \\
&\le \left|-\int_{g(b)}^{g(a)} f + \mathcal{R}(f, Q_2, g(c_{n-k+1}))\right| + \left|-\mathcal{R}(f, Q_2, g(c_{n-k+1})) - \int_a^b (f \circ g)g'\right|
\end{aligned}
\]
Use (8.9), expand the second Riemann sum and apply (8.11).
\[
\begin{aligned}
&< \frac{\varepsilon}{2} + \left|-\sum_{k=1}^n f(g(c_{n-k+1}))(y_k - y_{k-1}) - \int_a^b (f \circ g)g'\right| \\
&= \frac{\varepsilon}{2} + \left|\sum_{k=1}^n f(g(c_{n-k+1}))g'(c_{n-k+1})(x_{n-k+1} - x_{n-k}) - \int_a^b (f \circ g)g'\right|
\end{aligned}
\]
Reverse the order of the sum and use (8.9).
\[
\begin{aligned}
&= \frac{\varepsilon}{2} + \left|\sum_{k=1}^n f(g(c_k))g'(c_k)(x_k - x_{k-1}) - \int_a^b (f \circ g)g'\right| \\
&= \frac{\varepsilon}{2} + \left|\mathcal{R}((f \circ g)g', P_2, c_k) - \int_a^b (f \circ g)g'\right| \\
&< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} \\
&= \varepsilon
\end{aligned}
\]
The theorem has been proved.
EXAMPLE 8.9. Suppose we want to calculate $\int_{-1}^1 \sqrt{1 - x^2}\,dx$. Using the notation of Theorem 8.17, let $f(x) = \sqrt{1 - x^2}$, g(x) = sin x and [a, b] = [−π/2, π/2]. In this case, g is an increasing function. Then (8.6) becomes
\[
\begin{aligned}
\int_{-1}^1 \sqrt{1 - x^2}\,dx &= \int_{\sin(-\pi/2)}^{\sin(\pi/2)} \sqrt{1 - x^2}\,dx \\
&= \int_{-\pi/2}^{\pi/2} \sqrt{1 - \sin^2 x}\,\cos x\,dx \\
&= \int_{-\pi/2}^{\pi/2} \cos^2 x\,dx \\
&= \frac{\pi}{2}.
\end{aligned}
\]
On the other hand, it can also be done with a decreasing function. If g(x) = cos x and [a, b] = [0, π], then
\[
\begin{aligned}
\int_{-1}^1 \sqrt{1 - x^2}\,dx &= \int_{\cos\pi}^{\cos 0} \sqrt{1 - x^2}\,dx \\
&= -\int_{\cos 0}^{\cos\pi} \sqrt{1 - x^2}\,dx \\
&= -\int_0^\pi \sqrt{1 - \cos^2 x}\,(-\sin x)\,dx \\
&= \int_0^\pi \sqrt{1 - \cos^2 x}\,\sin x\,dx \\
&= \int_0^\pi \sin^2 x\,dx \\
&= \frac{\pi}{2}.
\end{aligned}
\]
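Both substitutions can be checked numerically. The sketch below is illustrative only; `integral` is a simple midpoint rule introduced here as an approximation, and the variable names are arbitrary. For the decreasing substitution, (8.6) gives the integral from g(0) = 1 to g(π) = −1, so a sign change is needed to compare it with the integral from −1 to 1.

```python
import math

def integral(f, a, b, n=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# Direct approximation of the integral of sqrt(1 - x^2) over [-1, 1].
direct = integral(lambda x: math.sqrt(1 - x * x), -1, 1)

# g(x) = sin x on [-pi/2, pi/2]: integrand (f o g) g'.
increasing = integral(
    lambda x: math.sqrt(1 - math.sin(x) ** 2) * math.cos(x), -math.pi / 2, math.pi / 2
)

# g(x) = cos x on [0, pi]: integrand (f o g) g', with g' = -sin.
decreasing = integral(
    lambda x: math.sqrt(1 - math.cos(x) ** 2) * (-math.sin(x)), 0, math.pi
)

print(direct, increasing, -decreasing, math.pi / 2)
```

All three approximations agree with π/2 to several decimal places, the direct one a little less accurately because the integrand has vertical tangents at ±1.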
9. Integral Mean Value Theorems
THEOREM 8.18. Suppose f, g : [a, b] → R are such that
(a) g(x) ≥ 0 on [a, b],
(b) f is bounded and m ≤ f(x) ≤ M for all x ∈ [a, b], and
(c) $\int_a^b f$ and $\int_a^b fg$ both exist.
There is a c ∈ [m, M] such that
\[
\int_a^b fg = c\int_a^b g.
\]
PROOF. Obviously,
\[
m\int_a^b g \le \int_a^b fg \le M\int_a^b g. \tag{8.12}
\]
If $\int_a^b g = 0$, we're done. Otherwise, let
\[
c = \frac{\int_a^b fg}{\int_a^b g}.
\]
Then $\int_a^b fg = c\int_a^b g$ and from (8.12), it follows that m ≤ c ≤ M.
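The number c in this proof is a weighted average of f with weight g, and it can be computed for a concrete pair of functions. This is a hedged sketch: the choices f(x) = x² and g(x) = e^{−x} on [0, 2] are examples picked here, and `integral` is a midpoint-rule approximation rather than an exact integral.

```python
import math

def integral(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def f(x):
    return x * x              # on [0, 2]: m = 0 <= f(x) <= 4 = M

def g(x):
    return math.exp(-x)       # a nonnegative weight

a, b, m, M = 0.0, 2.0, 0.0, 4.0
c = integral(lambda x: f(x) * g(x), a, b) / integral(g, a, b)
print(c)
```

The computed c lies between m and M, as (8.12) requires. Since this f is continuous, Corollary 8.19 says c is also a value f takes somewhere on the interval.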
COROLLARY 8.19. Let f and g be as in Theorem 8.18, but additionally assume f is continuous. Then there is a c ∈ (a, b) such that
\[
\int_a^b fg = f(c)\int_a^b g.
\]
PROOF. This follows from Theorem 8.18 and Corollaries 6.23 and 6.26.
THEOREM 8.20. Suppose f, g : [a, b] → R are such that
(a) g(x) ≥ 0 on [a, b],
(b) f is bounded and m ≤ f(x) ≤ M for all x ∈ [a, b], and
(c) $\int_a^b f$ and $\int_a^b fg$ both exist.
There is a c ∈ [a, b] such that
\[
\int_a^b fg = m\int_a^c g + M\int_c^b g.
\]
PROOF. For a ≤ x ≤ b let
\[
G(x) = m\int_a^x g + M\int_x^b g.
\]
By Theorem 8.16, G ∈ C([a, b]) and
\[
\operatorname{glb} G \le G(b) = m\int_a^b g \le \int_a^b fg \le M\int_a^b g = G(a) \le \operatorname{lub} G.
\]
Now, apply Corollary 6.26 to find c where $G(c) = \int_a^b fg$.
10. Exercises
8.1. If f : [a, b] → R and $\mathcal{R}(f)$ exists, then f is bounded.
8.2. Let
\[
f(x) = \begin{cases} 1, & x \in \mathbb{Q} \\ 0, & x \notin \mathbb{Q} \end{cases}.
\]
(a) Use Definition 8.1 to show f is not integrable on any interval.
(b) Use Definition 8.6 to show f is not integrable on any interval.
8.3. Calculate $\int_2^5 x^2$ using the definition of integration.
8.4. If at least two of the integrals exist, then
\[
\int_a^b f = \int_a^c f + \int_c^b f
\]
no matter the order of a, b and c.
8.5. If α > 0, f : [a, b] → [α, β] and $\int_a^b f$ exists, then $\int_a^b 1/f$ exists.
8.6. If f : [a, b] → [0, ∞) is continuous and $\mathcal{D}(f) = 0$, then f(x) = 0 for all x ∈ [a, b].
8.7. If $\int_a^b f$ exists, then $\lim_{x \downarrow a} \int_x^b f = \int_a^b f$.
8.8. If f is monotone on [a, b], then $\int_a^b f$ exists.
8.9. If f and g are integrable on [a, b], then
\[
\left|\int_a^b fg\right| \le \left[\left(\int_a^b f^2\right)\left(\int_a^b g^2\right)\right]^{1/2}.
\]
(Hint: Expand $\int_a^b (xf + g)^2$ as a quadratic with variable x.)3
8.10. If f : [a, b] → [0, ∞) is continuous, then there is a c ∈ [a, b] such that
\[
f(c) = \left(\frac{1}{b-a}\int_a^b f^2\right)^{1/2}.
\]
8.11. If $f(x) = \int_1^x \frac{dt}{t}$ for x > 0, then f(xy) = f(x) + f(y) for x, y > 0.
8.12. If f(x) = ln(|x|) for x ≠ 0, then f′(x) = 1/x.
8.13. In the statement of Theorem 8.17, make the additional assumptions that f and g′ are both continuous. Use the Fundamental Theorem of Calculus to give an easier proof.
8.14. Find a function f : [a, b] → R such that
(a) f is continuous on [c, b] for all c ∈ (a, b],
(b) $\lim_{x \downarrow a} \int_x^b f = 0$, and
(c) $\lim_{x \downarrow a} f(x)$ does not exist.
8.15. Find a bounded function solving Exercise 8.14.
3This is variously called the Cauchy inequality, Cauchy-Schwarz inequality, or the Cauchy-Schwarz-Bunyakowsky inequality. Rearranging the last one, some people now call it the CBS inequality.
CHAPTER 9
Sequences of Functions
1. Pointwise Convergence
We have accumulated much experience working with sequences of numbers.
The next level of complexity is sequences of functions. This chapter explores
several ways that sequences of functions can converge to another function. The
basic starting point is contained in the following definitions.
D EFINITION 9.1. Suppose S ⊂ R and for each n ∈ N there is a function f n :
S → R. The collection { f n : n ∈ N} is a sequence of functions defined on S.
For each fixed x ∈ S, f n (x) is a sequence of numbers, and it makes sense to
ask whether this sequence converges. If f n (x) converges for each x ∈ S, a new
function f : S → R is defined by
f (x) = lim f n (x).
n→∞
The function f is called the pointwise limit of the sequence $f_n$, or, equivalently, it is said $f_n$ converges pointwise to f. This is abbreviated $f_n \xrightarrow{S} f$, or simply $f_n \to f$, if the domain is clear from the context.
EXAMPLE 9.1. Let
\[
f_n(x) = \begin{cases} 0, & x < 0 \\ x^n, & 0 \le x < 1 \\ 1, & x \ge 1 \end{cases}.
\]
Then $f_n \to f$ where
\[
f(x) = \begin{cases} 0, & x < 1 \\ 1, & x \ge 1 \end{cases}.
\]
(See Figure 9.1.) This example shows that a pointwise limit of continuous functions need not be continuous.
FIGURE 9.1. The first ten functions from the sequence of Example 9.1.
FIGURE 9.2. The first four functions from the sequence of Example 9.2.
EXAMPLE 9.2. For each n ∈ N, define $f_n$ : R → R by
\[
f_n(x) = \frac{nx}{1 + n^2x^2}.
\]
(See Figure 9.2.) Clearly, each $f_n$ is an odd function and $\lim_{x \to \infty} f_n(x) = 0$. A bit of calculus shows that $f_n(1/n) = 1/2$ and $f_n(-1/n) = -1/2$ are the extreme values of $f_n$. Finally, if x ≠ 0,
\[
|f_n(x)| = \frac{|nx|}{1 + n^2x^2} < \frac{|nx|}{n^2x^2} = \frac{1}{|nx|}
\]
implies $f_n \to 0$. This example shows that functions can remain bounded away from 0 and still converge pointwise to 0.
EXAMPLE 9.3. Define $f_n$ : R → R by
\[
f_n(x) = \begin{cases} 2^{2n+4}x - 2^{n+3}, & \frac{1}{2^{n+1}} < x < \frac{3}{2^{n+2}} \\ -2^{2n+4}x + 2^{n+4}, & \frac{3}{2^{n+2}} \le x < \frac{1}{2^n} \\ 0, & \text{otherwise} \end{cases}
\]
To figure out what this looks like, it might help to look at Figure 9.3.
The graph of $f_n$ is a piecewise linear function supported on $[1/2^{n+1}, 1/2^n]$ and the area under the isosceles triangle of the graph over this interval is 1. Therefore, $\int_0^1 f_n = 1$ for all n.
If x > 0, then whenever $x > 1/2^n$, we have $f_n(x) = 0$. From this it follows that $f_n \to 0$.
The lesson to be learned from this example is that it may not be true that $\lim_{n \to \infty} \int_0^1 f_n = \int_0^1 \lim_{n \to \infty} f_n$.
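The two claims about these triangles, area 1 and pointwise limit 0, can be verified exactly. This is a sketch with made-up helper names; because each f_n is continuous and piecewise linear, the trapezoid rule applied on its breakpoints gives the exact integral, which is the assumption the `area` function relies on.

```python
from fractions import Fraction

def f(n, x):
    """The n-th function of Example 9.3, evaluated exactly."""
    left, peak, right = Fraction(1, 2 ** (n + 1)), Fraction(3, 2 ** (n + 2)), Fraction(1, 2 ** n)
    if left < x < peak:
        return 2 ** (2 * n + 4) * x - 2 ** (n + 3)
    if peak <= x < right:
        return -(2 ** (2 * n + 4)) * x + 2 ** (n + 4)
    return Fraction(0)

def area(n):
    """Exact integral of f_n over [0, 1]: trapezoid rule on the breakpoints of f_n."""
    nodes = [Fraction(0), Fraction(1, 2 ** (n + 1)), Fraction(3, 2 ** (n + 2)),
             Fraction(1, 2 ** n), Fraction(1)]
    return sum((f(n, l) + f(n, r)) / 2 * (r - l) for l, r in zip(nodes, nodes[1:]))

areas = [area(n) for n in range(1, 8)]
tail = [f(n, Fraction(1, 10)) for n in range(4, 20)]   # 1/2^n < 1/10 once n >= 4
print(areas, tail[:3])
```

Every area is exactly 1 while the values at x = 1/10 are 0 from n = 4 onward, so the limit of the integrals is 1 but the integral of the limit is 0.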
FIGURE 9.3. The first four functions $f_n \to 0$ from the sequence of Example 9.3.
FIGURE 9.4. The first ten functions of the sequence $f_n \to |x|$ from Example 9.4.
EXAMPLE 9.4. Define $f_n : \mathbb{R} \to \mathbb{R}$ by
$$f_n(x) = \begin{cases} \frac{n}{2}x^2 + \frac{1}{2n}, & |x| \le \frac{1}{n} \\ |x|, & |x| > \frac{1}{n} \end{cases}.$$
(See Figure 9.4.) The parabolic section in the center was chosen so $f_n(\pm 1/n) = 1/n$ and $f_n'(\pm 1/n) = \pm 1$. This splices the sections together at $(\pm 1/n, 1/n)$ so $f_n$ is differentiable everywhere. It's clear $f_n \to |x|$, which is not differentiable at 0.
This example shows that the limit of differentiable functions need not be differentiable.
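For Example 9.4 the gap between $f_n$ and $|x|$ can be measured directly. On $|x| \le 1/n$ the difference is $\frac{n}{2}(|x| - 1/n)^2$, which is largest at $x = 0$, where it equals $1/(2n)$. The sketch below confirms this on a grid; the grid size and helper names are arbitrary choices.

```python
def f_n(n, x):
    """n-th function of Example 9.4: a parabola spliced into |x| near 0."""
    if abs(x) <= 1 / n:
        return n / 2 * x * x + 1 / (2 * n)
    return abs(x)


def sup_gap(n, steps=2000):
    """Largest value of |f_n(x) - |x|| over a uniform grid on [-1, 1] containing 0."""
    xs = (-1 + 2 * i / steps for i in range(steps + 1))
    return max(abs(f_n(n, x) - abs(x)) for x in xs)


gaps = {n: sup_gap(n) for n in (1, 10, 100)}
```

The largest gap shrinks like $1/(2n)$, independently of $x$.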
The examples given above show that continuity, integrability and differentiability are not preserved in the pointwise limit of a sequence of functions. To
have any hope of preserving these properties, a stronger form of convergence is
needed.
2. Uniform Convergence
DEFINITION 9.2. The sequence $f_n : S \to \mathbb{R}$ converges uniformly to $f : S \to \mathbb{R}$ on $S$, if for each $\varepsilon > 0$ there is an $N \in \mathbb{N}$ so that whenever $n \ge N$ and $x \in S$, then $|f_n(x) - f(x)| < \varepsilon$.
In this case, we write $f_n \overset{S}{\rightrightarrows} f$, or simply $f_n \rightrightarrows f$, if the set $S$ is clear from the context.
FIGURE 9.5. $|f_n(x) - f(x)| < \varepsilon$ on $[a, b]$, as in Definition 9.2.
The difference between pointwise and uniform convergence is that with
pointwise convergence, the convergence of f n to f can vary in speed at each
point of S. With uniform convergence, the speed of convergence is roughly the
same all across S. Uniform convergence is a stronger condition to place on the
sequence f n than pointwise convergence in the sense of the following theorem.
THEOREM 9.3. If $f_n \overset{S}{\rightrightarrows} f$, then $f_n \xrightarrow{S} f$.
PROOF. Let $x_0 \in S$ and $\varepsilon > 0$. There is an $N \in \mathbb{N}$ such that when $n \ge N$, then $|f(x) - f_n(x)| < \varepsilon$ for all $x \in S$. In particular, $|f(x_0) - f_n(x_0)| < \varepsilon$ when $n \ge N$. This shows $f_n(x_0) \to f(x_0)$. Since $x_0 \in S$ is arbitrary, it follows that $f_n \to f$.
The first three examples given above show the converse to Theorem 9.3
is false. There is, however, one interesting and useful case in which a partial
converse is true.
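To see concretely why Example 9.2 converges pointwise but not uniformly, one can estimate the largest value of $|f_n|$ on a grid. This is only a sketch: the grid on $[-3, 3]$ and the helper names are assumptions made for illustration, and a grid maximum only approximates a supremum.

```python
def f_n(n, x):
    """n-th function of Example 9.2."""
    return n * x / (1 + n * n * x * x)


def sup_on_grid(g, a, b, steps=20000):
    """Maximum of |g| over a uniform grid on [a, b]."""
    return max(abs(g(a + (b - a) * i / steps)) for i in range(steps + 1))


sups = [sup_on_grid(lambda x, n=n: f_n(n, x), -3.0, 3.0) for n in (1, 2, 5, 10)]
values_at_one = [f_n(n, 1.0) for n in (1, 10, 100, 1000)]
```

The grid maxima stay near 1/2 for every n, so no single N works for all x at once, even though the values at any fixed x shrink to 0.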
DEFINITION 9.4. If $f_n \xrightarrow{S} f$ and $f_n(x) \uparrow f(x)$ for all $x \in S$, then $f_n$ increases to $f$ on $S$. If $f_n \xrightarrow{S} f$ and $f_n(x) \downarrow f(x)$ for all $x \in S$, then $f_n$ decreases to $f$ on $S$. In either case, $f_n$ is said to converge to $f$ monotonically.
The functions of Example 9.4 decrease to $|x|$. Notice that in this case, the convergence also happens to be uniform. The following theorem shows Example 9.4 to be an instance of a more general phenomenon.
THEOREM 9.5 (Dini's Theorem). If
(a) $S$ is compact,
(b) $f_n \xrightarrow{S} f$ monotonically,
(c) $f_n \in C(S)$ for all $n \in \mathbb{N}$, and
(d) $f \in C(S)$,
then $f_n \rightrightarrows f$.
PROOF. There is no loss of generality in assuming $f_n \downarrow f$, for otherwise we consider $-f_n$ and $-f$. With this assumption, if $g_n = f_n - f$, then $g_n$ is a sequence of continuous functions decreasing to 0. It suffices to show $g_n \rightrightarrows 0$.
To do so, let $\varepsilon > 0$. Using continuity and pointwise convergence, for each $x \in S$ find an open set $G_x$ containing $x$ and an $N_x \in \mathbb{N}$ such that $g_{N_x}(y) < \varepsilon$ for all $y \in G_x$. Notice that the monotonicity condition guarantees $g_n(y) < \varepsilon$ for every $y \in G_x$ and $n \ge N_x$.
The collection $\{G_x : x \in S\}$ is an open cover for $S$, so it must contain a finite subcover $\{G_{x_i} : 1 \le i \le k\}$. Let $N = \max\{N_{x_i} : 1 \le i \le k\}$ and choose $m \ge N$. If $x \in S$, then $x \in G_{x_i}$ for some $i$, and $0 \le g_m(x) \le g_N(x) \le g_{N_{x_i}}(x) < \varepsilon$. It follows that $g_n \rightrightarrows 0$.
FIGURE 9.6. This shows a typical function from the sequence of Example 9.5.
EXAMPLE 9.5. Let $f_n(x) = x^n$ for $n \in \mathbb{N}$; then $f_n$ decreases to 0 on $[0, 1)$. If $0 < a < 1$, Dini's Theorem shows $f_n \rightrightarrows 0$ on the compact interval $[0, a]$. On the whole interval $[0, 1)$, $f_n(x) > 1/2$ when $2^{-1/n} < x < 1$, so $f_n$ is not uniformly convergent. (Why doesn't this violate Dini's Theorem?)
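The two halves of Example 9.5 can be compared numerically. The sketch below uses the fact that $x^n$ is increasing on $[0, a]$, so its supremum there is $a^n$. On $[0, 1)$ it exhibits a point, depending on $n$, where $f_n$ still exceeds 1/2. The function names are illustrative.

```python
def sup_on_closed(n, a):
    """Supremum of x**n on [0, a]; attained at x = a because x**n is increasing."""
    return a ** n


def witness(n):
    """A point strictly between 2**(-1/n) and 1, where x**n > 1/2."""
    return 2.0 ** (-1.0 / (2 * n))


sups_on_0_a = [sup_on_closed(n, 0.9) for n in (1, 10, 100, 400)]
values_at_witness = [witness(n) ** n for n in (1, 10, 100, 1000)]
```

On $[0, 0.9]$ the suprema collapse to 0, while on $[0, 1)$ the witness values stay near $2^{-1/2}$ for every n.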
3. Metric Properties of Uniform Convergence
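If $S \subset \mathbb{R}$, let $B(S) = \{f : S \to \mathbb{R} : f \text{ is bounded}\}$. For $f \in B(S)$, define $\|f\|_S = \operatorname{lub}\{|f(x)| : x \in S\}$. (It is abbreviated to $\|f\|$, if the domain $S$ is clear from the context.) Apparently, $\|f\| \ge 0$, $\|f\| = 0 \iff f \equiv 0$ and, if $g \in B(S)$, then $\|f - g\| = \|g - f\|$. Moreover, if $h \in B(S)$, then
$$\begin{aligned}
\|f - g\| &= \operatorname{lub}\{|f(x) - g(x)| : x \in S\} \\
&\le \operatorname{lub}\{|f(x) - h(x)| + |h(x) - g(x)| : x \in S\} \\
&\le \operatorname{lub}\{|f(x) - h(x)| : x \in S\} + \operatorname{lub}\{|h(x) - g(x)| : x \in S\} \\
&= \|f - h\| + \|h - g\|.
\end{aligned}$$
Combining all this, it follows that $\|f - g\|$ is a metric¹ on $B(S)$.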
The definition of uniform convergence implies that for a sequence of bounded functions $f_n : S \to \mathbb{R}$,
$$f_n \rightrightarrows f \iff \|f_n - f\| \to 0.$$
Because of this, the metric $\|f - g\|$ is often called the uniform metric or the sup-metric. Many ideas developed using the metric properties of $\mathbb{R}$ can be carried over into this setting. In particular, there is a Cauchy criterion for uniform convergence.
DEFINITION 9.6. Let $S \subset \mathbb{R}$. A sequence of functions $f_n : S \to \mathbb{R}$ is a Cauchy sequence under the uniform metric, if given $\varepsilon > 0$, there is an $N \in \mathbb{N}$ such that when $m, n \ge N$, then $\|f_n - f_m\| < \varepsilon$.
THEOREM 9.7. Let $f_n \in B(S)$. There is a function $f \in B(S)$ such that $f_n \rightrightarrows f$ iff $f_n$ is a Cauchy sequence in $B(S)$.
PROOF. (⇒) Let $f_n \rightrightarrows f$ and $\varepsilon > 0$. There is an $N \in \mathbb{N}$ such that $n \ge N$ implies $\|f_n - f\| < \varepsilon/2$. If $m \ge N$ and $n \ge N$, then
$$\|f_m - f_n\| \le \|f_m - f\| + \|f - f_n\| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$$
shows $f_n$ is a Cauchy sequence.
(⇐) Suppose $f_n$ is a Cauchy sequence in $B(S)$ and $\varepsilon > 0$. Choose $N \in \mathbb{N}$ so that $\|f_m - f_n\| < \varepsilon$ whenever $m \ge N$ and $n \ge N$. In particular, for a fixed $x_0 \in S$ and $m, n \ge N$, $|f_m(x_0) - f_n(x_0)| \le \|f_m - f_n\| < \varepsilon$ shows the sequence $f_n(x_0)$ is a Cauchy sequence in $\mathbb{R}$ and therefore converges. Since $x_0$ is an arbitrary point of $S$, this defines an $f : S \to \mathbb{R}$ such that $f_n \to f$.
Finally, if $m, n \ge N$ and $x \in S$, the fact that $|f_n(x) - f_m(x)| < \varepsilon$ gives
$$|f_n(x) - f(x)| = \lim_{m\to\infty} |f_n(x) - f_m(x)| \le \varepsilon.$$
This shows that when $n \ge N$, then $\|f_n - f\| \le \varepsilon$. We conclude that $f \in B(S)$ and $f_n \rightrightarrows f$.
¹ Definition 2.12
A collection of functions S is said to be complete under uniform convergence,
if every Cauchy sequence in S converges to a function in S . Theorem 9.7 shows
B (S) is complete under uniform convergence. We’ll see several other collections
of functions that are complete under uniform convergence.
E XAMPLE 9.6. For S ⊂ R let L(S) be all the functions f : S → R such that
f (x) = mx + b for some constants m and b. In particular, let f n be a Cauchy
sequence in $L([0, 1])$. Theorem 9.7 shows there is an $f : [0, 1] \to \mathbb{R}$ such that $f_n \rightrightarrows f$. In order to show $L([0, 1])$ is complete, it suffices to show $f \in L([0, 1])$.
To do this, let f n (x) = m n x + b n for each n. Then f n (0) = b n → f (0) and
m n = f n (1) − b n → f (1) − f (0).
Given any x ∈ [0, 1],
f n (x) − (( f (1) − f (0))x + f (0)) = m n x + b n − (( f (1) − f (0))x + f (0))
= (m n − ( f (1) − f (0)))x + b n − f (0) → 0.
This shows f (x) = ( f (1) − f (0))x + f (0) ∈ L([0, 1]) and therefore L([0, 1]) is complete.
EXAMPLE 9.7. Let $P = \{p(x) : p \text{ is a polynomial}\}$. The sequence of polynomials $p_n(x) = \sum_{k=0}^n x^k/k!$ increases to $e^x$ on $[0, a]$ for any $a > 0$, so Dini's Theorem shows $p_n \rightrightarrows e^x$ on $[0, a]$. But, $e^x \notin P$, so $P$ is not complete. (See Exercise 7.25.)
4. Series of Functions
The definitions of pointwise and uniform convergence are extended in the natural way to series of functions. If $\sum_{k=1}^\infty f_k$ is a series of functions defined on a set $S$, then the series converges pointwise or uniformly, depending on whether the sequence of partial sums, $s_n = \sum_{k=1}^n f_k$, converges pointwise or uniformly, respectively. It is absolutely convergent or absolutely uniformly convergent, if $\sum_{n=1}^\infty |f_n|$ is convergent or uniformly convergent on $S$, respectively.
The following theorem is obvious and its proof is left to the reader.
THEOREM 9.8. Let $\sum_{n=1}^\infty f_n$ be a series of functions defined on $S$. If $\sum_{n=1}^\infty f_n$ is absolutely convergent, then it is convergent. If $\sum_{n=1}^\infty f_n$ is absolutely uniformly convergent, then it is uniformly convergent.
The following theorem is a restatement of Theorem 9.5 for series.
THEOREM 9.9. If $\sum_{n=1}^\infty f_n$ is a series of nonnegative continuous functions converging pointwise to a continuous function on a compact set $S$, then $\sum_{n=1}^\infty f_n$ converges uniformly on $S$.
A simple, but powerful technique for showing uniform convergence of series
is the following.
THEOREM 9.10 (Weierstrass M-Test). If $f_n : S \to \mathbb{R}$ is a sequence of functions and $M_n$ is a sequence of nonnegative numbers such that $\|f_n\|_S \le M_n$ for all $n \in \mathbb{N}$ and $\sum_{n=1}^\infty M_n$ converges, then $\sum_{n=1}^\infty f_n$ is absolutely uniformly convergent.
PROOF. Let $\varepsilon > 0$ and $s_n$ be the sequence of partial sums of $\sum_{n=1}^\infty |f_n|$. Using the Cauchy criterion for convergence of a series, choose an $N \in \mathbb{N}$ such that when $n > m \ge N$, then $\sum_{k=m}^n M_k < \varepsilon$. So,
$$\|s_n - s_m\| = \left\| \sum_{k=m+1}^n |f_k| \right\| \le \sum_{k=m+1}^n \|f_k\| \le \sum_{k=m}^n M_k < \varepsilon.$$
This shows $s_n$ is a Cauchy sequence and must converge according to Theorem 9.7.
EXAMPLE 9.8. Let $a > 0$ and $M_n = a^n/n!$. Since
$$\lim_{n\to\infty} \frac{M_{n+1}}{M_n} = \lim_{n\to\infty} \frac{a}{n+1} = 0,$$
the Ratio Test shows $\sum_{n=0}^\infty M_n$ converges. When $x \in [-a, a]$,
$$\left| \frac{x^n}{n!} \right| \le \frac{a^n}{n!}.$$
The Weierstrass M-Test now implies $\sum_{n=0}^\infty x^n/n!$ converges absolutely uniformly on $[-a, a]$ for any $a > 0$. (See Exercise 9.4.)
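The bound behind the M-Test in Example 9.8 can be observed numerically. In the sketch below, the tail $\sum_{n>N} a^n/n!$ is computed as $e^a$ minus a partial sum, and the worst error of the partial sums of $\sum x^n/n!$ over a grid on $[-a, a]$ is compared with it. The value $a = 2$, the grid size and the helper names are assumptions made for illustration.

```python
import math


def partial_sum(x, N):
    """s_N(x) = sum_{n=0}^{N} x**n / n!."""
    term, total = 1.0, 1.0
    for n in range(1, N + 1):
        term *= x / n
        total += term
    return total


def m_tail(a, N):
    """sum_{n>N} M_n with M_n = a**n / n!, computed as e**a minus the partial sum."""
    return math.exp(a) - partial_sum(a, N)


def sup_error(a, N, steps=4000):
    """Largest |e**x - s_N(x)| over a uniform grid on [-a, a]."""
    xs = (-a + 2 * a * i / steps for i in range(steps + 1))
    return max(abs(math.exp(x) - partial_sum(x, N)) for x in xs)


a = 2.0
tails = [m_tail(a, N) for N in (5, 10, 15)]
errors = [sup_error(a, N) for N in (5, 10, 15)]
```

The error over the whole interval is controlled by a single number, the tail of $\sum M_n$, which does not depend on $x$.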
5. Continuity and Uniform Convergence
THEOREM 9.11. If $f_n : S \to \mathbb{R}$ is such that each $f_n$ is continuous at $x_0$ and $f_n \overset{S}{\rightrightarrows} f$, then $f$ is continuous at $x_0$.
PROOF. Let $\varepsilon > 0$. Since $f_n \rightrightarrows f$, there is an $N \in \mathbb{N}$ such that whenever $n \ge N$ and $x \in S$, then $|f_n(x) - f(x)| < \varepsilon/3$. Because $f_N$ is continuous at $x_0$, there is a $\delta > 0$ such that $x \in (x_0 - \delta, x_0 + \delta) \cap S$ implies $|f_N(x) - f_N(x_0)| < \varepsilon/3$. Using these two estimates, it follows that when $x \in (x_0 - \delta, x_0 + \delta) \cap S$,
$$\begin{aligned}
|f(x) - f(x_0)| &= |f(x) - f_N(x) + f_N(x) - f_N(x_0) + f_N(x_0) - f(x_0)| \\
&\le |f(x) - f_N(x)| + |f_N(x) - f_N(x_0)| + |f_N(x_0) - f(x_0)| \\
&< \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon.
\end{aligned}$$
Therefore, $f$ is continuous at $x_0$.
The following corollary is immediate from Theorem 9.11.
C OROLLARY 9.12. If f n is a sequence of continuous functions converging uniformly to f on S, then f is continuous.
Example 9.1 shows that continuity is not preserved under pointwise convergence. Corollary 9.12 establishes that if S is compact, then C (S) is complete under
the uniform metric.
The fact that C ([a, b]) is closed under uniform convergence is often useful
because, given a “bad” function f ∈ C ([a, b]), it’s often possible to find a sequence
f n of “good” functions in C ([a, b]) converging uniformly to f . Following is the
most widely used theorem of this type.
THEOREM 9.13 (Weierstrass Approximation Theorem). If $f \in C([a, b])$, then there is a sequence of polynomials $p_n \rightrightarrows f$.
To prove this theorem, we first need a lemma.
LEMMA 9.14. For $n \in \mathbb{N}$ let $c_n = \left( \int_{-1}^1 (1 - t^2)^n \, dt \right)^{-1}$ and
$$k_n(t) = \begin{cases} c_n(1 - t^2)^n, & |t| \le 1 \\ 0, & |t| > 1 \end{cases}.$$
(See Figure 9.7.) Then
(a) $k_n(t) \ge 0$ for all $t \in \mathbb{R}$ and $n \in \mathbb{N}$;
(b) $\int_{-1}^1 k_n = 1$ for all $n \in \mathbb{N}$; and,
(c) if $0 < \delta < 1$, then $k_n \rightrightarrows 0$ on $(-\infty, -\delta] \cup [\delta, \infty)$.
FIGURE 9.7. Here are the graphs of $k_n(t)$ for $n = 1, 2, 3, 4, 5$.
PROOF. Parts (a) and (b) follow easily from the definition of $k_n$.
To prove (c) first note that
$$1 = \int_{-1}^1 k_n \ge \int_{-1/\sqrt{n}}^{1/\sqrt{n}} c_n(1 - t^2)^n \, dt \ge c_n \frac{2}{\sqrt{n}} \left( 1 - \frac{1}{n} \right)^n.$$
Since $\left( 1 - \frac{1}{n} \right)^n \uparrow \frac{1}{e}$, it follows there is an $\alpha > 0$ such that $c_n < \alpha\sqrt{n}$.² Letting $\delta \in (0, 1)$ and $\delta \le t \le 1$,
$$k_n(t) \le k_n(\delta) \le \alpha\sqrt{n}(1 - \delta^2)^n \to 0$$
by L'Hospital's Rule. Since $k_n$ is an even function, this establishes (c).
² Repeated application of integration by parts shows
$$c_n = \frac{1}{2} \times \frac{n + 1/2}{n} \times \frac{n - 1/2}{n - 1} \times \frac{n - 3/2}{n - 2} \times \cdots \times \frac{3/2}{1} = \frac{\Gamma(n + 3/2)}{\sqrt{\pi}\,\Gamma(n + 1)}.$$
With the aid of Stirling's formula, it can be shown $c_n \approx 0.565\sqrt{n}$.
A sequence of functions satisfying conditions such as those in Lemma 9.14 is
called a convolution kernel or a Dirac sequence.3 Several such kernels play a key
role in the study of Fourier series, as we will see in Theorems 10.5 and 10.13. The
one defined above is called the Landau kernel.4
We now turn to the proof of the theorem.
PROOF. There is no generality lost in assuming $[a, b] = [0, 1]$, for otherwise we consider the linear change of variables $g(x) = f((b - a)x + a)$. Similarly, we can assume $f(0) = f(1) = 0$, for otherwise we consider $g(x) = f(x) - (f(1) - f(0))x - f(0)$, which is a polynomial added to $f$. We can further assume $f(x) = 0$ when $x \notin [0, 1]$.
Set
$$p_n(x) = \int_{-1}^1 f(x + t)k_n(t) \, dt. \tag{9.1}$$
To see $p_n$ is a polynomial, change variables in the integral using $u = x + t$ to arrive at
$$p_n(x) = \int_{x-1}^{x+1} f(u)k_n(u - x) \, du = \int_0^1 f(u)k_n(x - u) \, du,$$
because $f(x) = 0$ when $x \notin [0, 1]$ and $k_n$ is even. Notice that $k_n(x - u)$ is a polynomial in $u$ with coefficients being polynomials in $x$, so integrating $f(u)k_n(x - u)$ yields a polynomial in $x$. (Just try it for a small value of $n$ and a simple function $f$!)
Use (9.1) and Lemma 9.14(b) to see for $\delta \in (0, 1)$ that
$$\begin{aligned}
|p_n(x) - f(x)| &= \left| \int_{-1}^1 f(x + t)k_n(t) \, dt - f(x) \right| \\
&= \left| \int_{-1}^1 (f(x + t) - f(x))k_n(t) \, dt \right| \\
&\le \int_{-1}^1 |f(x + t) - f(x)| k_n(t) \, dt \\
&= \int_{-\delta}^{\delta} |f(x + t) - f(x)| k_n(t) \, dt + \int_{\delta < |t| \le 1} |f(x + t) - f(x)| k_n(t) \, dt.
\end{aligned} \tag{9.2}$$
We'll handle each of the final integrals in turn.
Let $\varepsilon > 0$ and use the uniform continuity of $f$ to choose a $\delta \in (0, 1)$ such that when $|t| < \delta$, then $|f(x + t) - f(x)| < \varepsilon/2$. Then, using Lemma 9.14(b) again,
$$\int_{-\delta}^{\delta} |f(x + t) - f(x)| k_n(t) \, dt < \frac{\varepsilon}{2} \int_{-\delta}^{\delta} k_n(t) \, dt < \frac{\varepsilon}{2}. \tag{9.3}$$
³ Given two functions $f$ and $g$ defined on $\mathbb{R}$, the convolution of $f$ and $g$ is the integral
$$f \star g(x) = \int_{-\infty}^{\infty} f(t)g(x - t) \, dt.$$
The term convolution kernel is used because such kernels typically replace $g$ in the convolution given above, as can be seen in the proof of the Weierstrass approximation theorem.
⁴ It was investigated by the German mathematician Edmund Landau (1877–1938).
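According to Lemma 9.14(c), there is an $N \in \mathbb{N}$ so that when $n \ge N$ and $|t| \ge \delta$, then $k_n(t) < \frac{\varepsilon}{8(\|f\| + 1)(1 - \delta)}$. Using this, it follows that
$$\begin{aligned}
\int_{\delta < |t| \le 1} |f(x + t) - f(x)| k_n(t) \, dt &= \int_{-1}^{-\delta} |f(x + t) - f(x)| k_n(t) \, dt + \int_{\delta}^{1} |f(x + t) - f(x)| k_n(t) \, dt \\
&\le 2\|f\| \int_{-1}^{-\delta} k_n(t) \, dt + 2\|f\| \int_{\delta}^{1} k_n(t) \, dt \\
&< 2\|f\| \frac{\varepsilon}{8(\|f\| + 1)(1 - \delta)}(1 - \delta) + 2\|f\| \frac{\varepsilon}{8(\|f\| + 1)(1 - \delta)}(1 - \delta) < \frac{\varepsilon}{2}.
\end{aligned} \tag{9.4}$$
Combining (9.3) and (9.4), it follows from (9.2) that $|p_n(x) - f(x)| < \varepsilon$ for all $x \in [0, 1]$ and $p_n \rightrightarrows f$.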
COROLLARY 9.15. If $f \in C([a, b])$ and $\varepsilon > 0$, then there is a polynomial $p$ such that $\|f - p\|_{[a,b]} < \varepsilon$.
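The construction in the proof of Theorem 9.13 can be tried numerically. The sketch below approximates $p_n(x) = \int_0^1 f(u)k_n(x - u)\,du$ with a midpoint rule for the tent function $f(u) = 1/2 - |u - 1/2|$, which is continuous on $[0, 1]$, vanishes at both endpoints and has a corner at $1/2$. The choice of $f$, the quadrature rule, the grid sizes and all helper names are assumptions made for illustration, and the constant $c_n$ is computed from the product $\frac{1}{2}\prod_{k=1}^n (k + 1/2)/k$.

```python
def landau_constant(n):
    """c_n = 1 / integral_{-1}^{1} (1 - t^2)^n dt, via (1/2) * prod_{k=1}^{n} (k + 1/2) / k."""
    c = 0.5
    for k in range(1, n + 1):
        c *= (k + 0.5) / k
    return c


def tent(u):
    """Test function: continuous on [0, 1], zero at 0 and 1, corner at 1/2."""
    return 0.5 - abs(u - 0.5)


def p_n(n, x, cells=2000):
    """Midpoint-rule approximation of integral_0^1 tent(u) * k_n(x - u) du for x in [0, 1]."""
    c = landau_constant(n)
    h = 1.0 / cells
    total = 0.0
    for j in range(cells):
        u = (j + 0.5) * h
        total += tent(u) * c * (1.0 - (x - u) ** 2) ** n
    return total * h


def sup_error(n, points=101):
    """Largest |p_n(x) - tent(x)| over a uniform grid on [0, 1]."""
    grid = [i / (points - 1) for i in range(points)]
    return max(abs(p_n(n, x) - tent(x)) for x in grid)


errors = [sup_error(n) for n in (4, 16, 64)]
```

In this experiment the errors shrink slowly, roughly like $1/\sqrt{n}$, which is consistent with the kernel concentrating near 0 at that rate.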
The theorems of this section can also be used to construct some striking
examples of functions with unwelcome behavior. Following is perhaps the most
famous.
E XAMPLE 9.9. There is a continuous f : R → R that is differentiable nowhere.
PROOF. Thinking of the canonical example of a continuous function that fails to be differentiable at a point (the absolute value function), we start with a “sawtooth” function. (See Figure 9.8.)
$$s_0(x) = \begin{cases} x - 2n, & 2n \le x < 2n + 1,\ n \in \mathbb{Z} \\ 2n + 2 - x, & 2n + 1 \le x < 2n + 2,\ n \in \mathbb{Z} \end{cases}$$
Notice that $s_0$ is continuous and periodic with period 2 and maximum value 1. Compress it both vertically and horizontally:
$$s_n(x) = \left( \frac{3}{4} \right)^n s_0(4^n x), \quad n \in \mathbb{N}.$$
Each $s_n$ is continuous and periodic with period $p_n = 2/4^n$ and $\|s_n\| = (3/4)^n$.
FIGURE 9.8. $s_0$, $s_1$ and $s_2$ from Example 9.9.
FIGURE 9.9. The nowhere differentiable function $f$ from Example 9.9. It is periodic with period 2 and one complete period is shown.
Finally, the desired function is
$$f(x) = \sum_{n=0}^\infty s_n(x).$$
Since $\|s_n\| = (3/4)^n$, the Weierstrass M-Test implies the series defining $f$ is uniformly convergent and Corollary 9.12 shows $f$ is continuous on $\mathbb{R}$. We will show $f$ is differentiable nowhere.
Let $x \in \mathbb{R}$, $m \in \mathbb{N}$ and $h_m = 1/(2 \cdot 4^m)$.
If $n > m$, then $h_m/p_n = 4^{n-m-1} \in \omega$, so $s_n(x \pm h_m) - s_n(x) = 0$ and
$$\frac{f(x \pm h_m) - f(x)}{\pm h_m} = \sum_{k=0}^m \frac{s_k(x \pm h_m) - s_k(x)}{\pm h_m}. \tag{9.5}$$
On the other hand, if $n < m$, then a worst-case estimate is
$$\left| \frac{s_n(x \pm h_m) - s_n(x)}{h_m} \right| \le \left( \frac{3}{4} \right)^n \Big/ \frac{1}{4^n} = 3^n.$$
This gives
$$\left| \sum_{k=0}^{m-1} \frac{s_k(x \pm h_m) - s_k(x)}{\pm h_m} \right| \le \sum_{k=0}^{m-1} \left| \frac{s_k(x \pm h_m) - s_k(x)}{\pm h_m} \right| \le \frac{3^m - 1}{3 - 1} < \frac{3^m}{2}. \tag{9.6}$$
Since $s_m$ is linear on intervals of length $4^{-m} = 2 \cdot h_m$ with slope $\pm 3^m$ on those linear segments, at least one of the following is true:
$$\left| \frac{s_m(x + h_m) - s_m(x)}{h_m} \right| = 3^m \quad \text{or} \quad \left| \frac{s_m(x - h_m) - s_m(x)}{-h_m} \right| = 3^m. \tag{9.7}$$
Suppose the first of these is true. The argument is essentially the same in the second case.
Using (9.5), (9.6) and (9.7), the following estimate ensues
$$\begin{aligned}
\left| \frac{f(x + h_m) - f(x)}{h_m} \right| &= \left| \sum_{k=0}^\infty \frac{s_k(x + h_m) - s_k(x)}{h_m} \right| \\
&= \left| \sum_{k=0}^m \frac{s_k(x + h_m) - s_k(x)}{h_m} \right| \\
&\ge \left| \frac{s_m(x + h_m) - s_m(x)}{h_m} \right| - \left| \sum_{k=0}^{m-1} \frac{s_k(x \pm h_m) - s_k(x)}{\pm h_m} \right| \\
&> 3^m - \frac{3^m}{2} = \frac{3^m}{2}.
\end{aligned}$$
Since $3^m/2 \to \infty$, it is apparent $f'(x)$ does not exist.
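The growth of the difference quotients in Example 9.9 can be observed in floating point. By (9.5), the quotient of $f$ with increment $\pm h_m$ equals the quotient of the partial sum through $k = m$, so the sketch below only sums those terms. The sample points and helper names are arbitrary choices, and rounding error limits how large $m$ can usefully be taken.

```python
def s0(x):
    """Sawtooth with period 2 and maximum value 1."""
    r = x % 2.0          # Python's % returns a value in [0, 2) for a positive modulus
    return r if r < 1.0 else 2.0 - r


def s(n, x):
    """s_n(x) = (3/4)**n * s0(4**n * x)."""
    return 0.75 ** n * s0(4.0 ** n * x)


def quotient(x, m, sign):
    """Difference quotient of f at x with increment sign * h_m, using (9.5)."""
    h = sign / (2.0 * 4.0 ** m)
    total = sum(s(k, x + h) - s(k, x) for k in range(m + 1))
    return total / h


def best_quotient(x, m):
    """The larger of the two one-sided quotients in absolute value."""
    return max(abs(quotient(x, m, +1)), abs(quotient(x, m, -1)))


samples = {(x, m): best_quotient(x, m) for x in (0.1, 0.3, 0.7, 1.234) for m in range(1, 9)}
```

At each sample point, at least one of the two one-sided quotients exceeds $3^m/2$, matching the estimate in the proof.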
There are many other constructions of nowhere differentiable continuous
functions. The first was published by Weierstrass [21] in 1872, although it was
known in the folklore sense among mathematicians earlier than this. (There is
an English translation of Weierstrass’ paper in [11].) In fact, it is now known in a
technical sense that the “typical” continuous function is nowhere differentiable
[4].
6. Integration and Uniform Convergence
One of the recurring questions with integrals is when it is true that
$$\lim_{n\to\infty} \int f_n = \int \lim_{n\to\infty} f_n.$$
This is often referred to as “passing the limit through the integral.” At some
point in her career, any student of advanced analysis or probability theory will
be tempted to just blithely pass the limit through. But functions such as those of
Example 9.3 show that some care is needed. A common criterion for doing so is
uniform convergence.
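THEOREM 9.16. If $f_n : [a, b] \to \mathbb{R}$ is such that $\int_a^b f_n$ exists for each $n$ and $f_n \rightrightarrows f$ on $[a, b]$, then
$$\int_a^b f = \lim_{n\to\infty} \int_a^b f_n.$$
PROOF. Some care must be taken in this proof, because there are actually two things to prove. Before the equality can be shown, it must be proved that $f$ is integrable.
To show that $f$ is integrable, let $\varepsilon > 0$ and $N \in \mathbb{N}$ such that $\|f - f_n\| < \frac{\varepsilon}{3(b - a)}$ whenever $n \ge N$. If $P \in \operatorname{part}([a, b])$, then
$$\begin{aligned}
|\mathcal{R}(f, P, x_k^*) - \mathcal{R}(f_N, P, x_k^*)| &= \left| \sum_{k=1}^n f(x_k^*)|I_k| - \sum_{k=1}^n f_N(x_k^*)|I_k| \right| \\
&= \left| \sum_{k=1}^n (f(x_k^*) - f_N(x_k^*))|I_k| \right| \\
&\le \sum_{k=1}^n |f(x_k^*) - f_N(x_k^*)||I_k| \\
&< \frac{\varepsilon}{3(b - a)} \sum_{k=1}^n |I_k| = \frac{\varepsilon}{3}.
\end{aligned} \tag{9.8}$$
According to Theorem 8.10, there is a $P \in \operatorname{part}([a, b])$ such that whenever $Q_1$ and $Q_2$ are refinements of $P$, then
$$|\mathcal{R}(f_N, Q_1, x_k^*) - \mathcal{R}(f_N, Q_2, y_k^*)| < \frac{\varepsilon}{3}. \tag{9.9}$$
Combining (9.8) and (9.9) yields
$$\begin{aligned}
|\mathcal{R}(f, Q_1, x_k^*) - \mathcal{R}(f, Q_2, y_k^*)| &= |\mathcal{R}(f, Q_1, x_k^*) - \mathcal{R}(f_N, Q_1, x_k^*) + \mathcal{R}(f_N, Q_1, x_k^*) \\
&\qquad - \mathcal{R}(f_N, Q_2, y_k^*) + \mathcal{R}(f_N, Q_2, y_k^*) - \mathcal{R}(f, Q_2, y_k^*)| \\
&\le |\mathcal{R}(f, Q_1, x_k^*) - \mathcal{R}(f_N, Q_1, x_k^*)| + |\mathcal{R}(f_N, Q_1, x_k^*) - \mathcal{R}(f_N, Q_2, y_k^*)| \\
&\qquad + |\mathcal{R}(f_N, Q_2, y_k^*) - \mathcal{R}(f, Q_2, y_k^*)| \\
&< \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.
\end{aligned}$$
Another application of Theorem 8.10 shows that $f$ is integrable.
Finally, when $n \ge N$,
$$\left| \int_a^b f - \int_a^b f_n \right| = \left| \int_a^b (f - f_n) \right| < \int_a^b \frac{\varepsilon}{3(b - a)} = \frac{\varepsilon}{3} < \varepsilon$$
shows that $\int_a^b f_n \to \int_a^b f$.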
COROLLARY 9.17. If $\sum_{n=1}^\infty f_n$ is a series of integrable functions converging uniformly on $[a, b]$, then
$$\int_a^b \sum_{n=1}^\infty f_n = \sum_{n=1}^\infty \int_a^b f_n.$$
Use of this corollary is sometimes referred to as “reversing summation and
integration.” It’s tempting to do this reversal, but without some condition such as
uniform convergence, justification for the action is often difficult.
EXAMPLE 9.10. It was shown in Example 4.2 that the geometric series
$$\sum_{n=0}^\infty t^n = \frac{1}{1 - t}, \quad -1 < t < 1.$$
In Exercise 9.3, you are asked to prove this convergence is uniform on any compact subset of $(-1, 1)$. Substituting $-t$ for $t$ in the above formula, it follows that
$$\sum_{n=0}^\infty (-t)^n \rightrightarrows \frac{1}{1 + t}$$
on $[0, x]$, when $0 < x < 1$. Corollary 9.17 implies
$$\ln(1 + x) = \int_0^x \frac{dt}{1 + t} = \sum_{n=0}^\infty \int_0^x (-t)^n \, dt = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots.$$
The same argument works when $-1 < x < 0$, so
$$\ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots$$
when $x \in (-1, 1)$.
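The series for $\ln(1 + x)$ obtained by integrating term by term, $\sum_{n=0}^\infty (-1)^n x^{n+1}/(n+1)$, can be compared with a library logarithm. This is a small sketch; the function name and the number of terms are arbitrary, and `math.log1p` is the Python standard library function computing $\ln(1 + x)$.

```python
import math


def log1p_series(x, terms):
    """Partial sum of sum_{n=0}^{terms-1} (-1)**n * x**(n+1) / (n+1)."""
    return sum((-1) ** n * x ** (n + 1) / (n + 1) for n in range(terms))


approx_pos = log1p_series(0.5, 60)
approx_neg = log1p_series(-0.5, 60)
```

At $x = \pm 1/2$ the terms decay geometrically, so sixty terms already agree with `math.log1p` to near machine precision.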
Combining Theorem 9.16 with Dini's Theorem gives the following.
COROLLARY 9.18 (Monotone Convergence Theorem). If $f_n$ is a sequence of continuous functions converging monotonically to a continuous function $f$ on $[a, b]$, then $\int_a^b f_n \to \int_a^b f$.
7. Differentiation and Uniform Convergence
The relationship between uniform convergence and differentiation is somewhat more complex than those relationships we've already examined. First, because there are two sequences involved, $f_n$ and $f_n'$, either of which may converge or diverge at a point; and second, because differentiation is more “delicate” than continuity or integration.
Example 9.4 is an explicit example of a sequence of differentiable functions
converging uniformly to a function which is not differentiable at a point. The
derivatives of the functions from that example converge pointwise to a function
that is not a derivative. Combining the Weierstrass Approximation Theorem and
Example 9.9 pushes this to the extreme by showing the existence of a sequence
of polynomials converging uniformly to a continuous nowhere differentiable
function.
The following theorem starts to shed some light on the situation.
THEOREM 9.19. If $f_n$ is a sequence of derivatives defined on $[a, b]$ and $f_n \rightrightarrows f$, then $f$ is a derivative.
PROOF. For each $n$, let $F_n$ be an antiderivative of $f_n$. By considering $F_n(x) - F_n(a)$, if necessary, there is no generality lost with the assumption that $F_n(a) = 0$ for all $n$.
Let $\varepsilon > 0$. There is an $N \in \mathbb{N}$ such that
$$m, n \ge N \implies \|f_m - f_n\| < \frac{\varepsilon}{b - a}.$$
If $x \in [a, b]$ and $m, n \ge N$, then the Mean Value Theorem and the assumption that $F_m(a) = F_n(a) = 0$ yield a $c \in [a, b]$ such that
$$\begin{aligned}
|F_m(x) - F_n(x)| &= |(F_m(x) - F_n(x)) - (F_m(a) - F_n(a))| \\
&= |f_m(c) - f_n(c)| \, |x - a| \le \|f_m - f_n\|(b - a) < \varepsilon.
\end{aligned} \tag{9.10}$$
This shows $F_n$ is a Cauchy sequence in $C([a, b])$ and there is an $F \in C([a, b])$ with $F_n \rightrightarrows F$.
It suffices to show $F' = f$. To do this, several estimates are established.
Let $M \in \mathbb{N}$ so that
$$m, n \ge M \implies \|f_m - f_n\| < \frac{\varepsilon}{3}.$$
Notice this implies
$$\|f - f_n\| \le \frac{\varepsilon}{3}, \quad \forall n \ge M. \tag{9.11}$$
For such $m, n \ge M$ and $x, y \in [a, b]$ with $x \ne y$, another application of the Mean Value Theorem gives
$$\begin{aligned}
\left| \frac{F_n(x) - F_n(y)}{x - y} - \frac{F_m(x) - F_m(y)}{x - y} \right| &= \frac{1}{|x - y|} |(F_n(x) - F_m(x)) - (F_n(y) - F_m(y))| \\
&= \frac{1}{|x - y|} |f_n(c) - f_m(c)| \, |x - y| \le \|f_n - f_m\| < \frac{\varepsilon}{3}.
\end{aligned}$$
Letting $m \to \infty$, it follows that
$$\left| \frac{F_n(x) - F_n(y)}{x - y} - \frac{F(x) - F(y)}{x - y} \right| \le \frac{\varepsilon}{3}, \quad \forall n \ge M. \tag{9.12}$$
Fix $n \ge M$ and $x \in [a, b]$. Since $F_n'(x) = f_n(x)$, there is a $\delta > 0$ so that
$$\left| \frac{F_n(x) - F_n(y)}{x - y} - f_n(x) \right| < \frac{\varepsilon}{3}, \quad \forall y \in (x - \delta, x + \delta) \setminus \{x\}. \tag{9.13}$$
Finally, using (9.12), (9.13) and (9.11), we see
$$\begin{aligned}
\left| \frac{F(x) - F(y)}{x - y} - f(x) \right| &= \left| \frac{F(x) - F(y)}{x - y} - \frac{F_n(x) - F_n(y)}{x - y} + \frac{F_n(x) - F_n(y)}{x - y} - f_n(x) + f_n(x) - f(x) \right| \\
&\le \left| \frac{F(x) - F(y)}{x - y} - \frac{F_n(x) - F_n(y)}{x - y} \right| + \left| \frac{F_n(x) - F_n(y)}{x - y} - f_n(x) \right| + |f_n(x) - f(x)| \\
&< \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.
\end{aligned}$$
This establishes that
$$\lim_{y \to x} \frac{F(x) - F(y)}{x - y} = f(x),$$
as desired.
COROLLARY 9.20. If $G_n \in C([a, b])$ is a sequence such that $G_n' \rightrightarrows g$ and $G_n(x_0)$ converges for some $x_0 \in [a, b]$, then $G_n \rightrightarrows G$ where $G' = g$.
PROOF. Suppose $G_n(x_0) \to \alpha$. For each $n$ choose an antiderivative $F_n$ of $G_n'$ such that $F_n(a) = 0$. Theorem 9.19 shows $g$ is a derivative and an argument similar to that in the proof of Theorem 9.19 shows $F_n \rightrightarrows F$ on $[a, b]$, where $F' = g$. Since $F_n' - G_n' = 0$, Corollary 7.16 shows $G_n(x) = F_n(x) + (G_n(x_0) - F_n(x_0))$. Define $G(x) = F(x) + (\alpha - F(x_0))$.
Let $\varepsilon > 0$ and $x \in [a, b]$. There is an $N \in \mathbb{N}$ such that
$$n \ge N \implies \|F_n - F\| < \frac{\varepsilon}{3} \text{ and } |G_n(x_0) - \alpha| < \frac{\varepsilon}{3}.$$
If $n \ge N$,
$$\begin{aligned}
|G_n(x) - G(x)| &= |F_n(x) + (G_n(x_0) - F_n(x_0)) - (F(x) + (\alpha - F(x_0)))| \\
&\le |F_n(x) - F(x)| + |G_n(x_0) - \alpha| + |F_n(x_0) - F(x_0)| \\
&< \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.
\end{aligned}$$
This shows $G_n \rightrightarrows G$ on $[a, b]$ where $G' = F' = g$.
COROLLARY 9.21. If $f_n$ is a sequence of differentiable functions defined on $[a, b]$ such that $\sum_{k=1}^\infty f_k(x_0)$ exists for some $x_0 \in [a, b]$ and $\sum_{k=1}^\infty f_k'$ converges uniformly, then
$$\left( \sum_{k=1}^\infty f_k \right)' = \sum_{k=1}^\infty f_k'.$$
PROOF. Left as an exercise.
EXAMPLE 9.11. Let $a > 0$ and $f_n(x) = x^n/n!$. Note that $f_n' = f_{n-1}$ for $n \in \mathbb{N}$. Example 9.8 shows $\sum_{n=0}^\infty f_n'(x)$ is uniformly convergent on $[-a, a]$. Corollary 9.21 shows
$$\left( \sum_{n=0}^\infty \frac{x^n}{n!} \right)' = \sum_{n=0}^\infty \frac{x^n}{n!} \tag{9.14}$$
on $[-a, a]$. Since $a$ is an arbitrary positive constant, (9.14) is seen to hold on all of $\mathbb{R}$.
If $f(x) = \sum_{n=0}^\infty x^n/n!$, then the argument given above implies $f$ satisfies the initial value problem
$$f'(x) = f(x), \quad f(0) = 1.$$
As is well-known, the unique solution to this problem is $f(x) = e^x$. Therefore,
$$e^x = \sum_{n=0}^\infty \frac{x^n}{n!}.$$
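The relation $f_n' = f_{n-1}$ used in Example 9.11 means that differentiating a partial sum of $\sum x^n/n!$ gives the previous partial sum. The sketch below checks this exactly with rational arithmetic; representing a polynomial as a list of coefficients is an illustrative choice.

```python
from fractions import Fraction
from math import factorial


def exp_coeffs(N):
    """Coefficients [1/0!, 1/1!, ..., 1/N!] of the N-th partial sum of sum x**n / n!."""
    return [Fraction(1, factorial(n)) for n in range(N + 1)]


def derivative(coeffs):
    """Coefficients of the term-by-term derivative of a polynomial."""
    return [n * c for n, c in enumerate(coeffs)][1:]


derived = derivative(exp_coeffs(8))
```

Here `derived` equals `exp_coeffs(7)`, the finite version of identity (9.14).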
8. Power Series
8.1. The Radius and Interval of Convergence. One place where uniform
convergence plays a key role is with power series. Recall the definition.
DEFINITION 9.22. A power series is a function of the form
$$f(x) = \sum_{n=0}^\infty a_n(x - c)^n. \tag{9.15}$$
Members of the sequence $a_n$ are the coefficients of the series. The domain of $f$ is the set of all $x$ at which the series converges. The constant $c$ is called the center of the series.
To determine the domain of (9.15), let $x \in \mathbb{R} \setminus \{c\}$ and use the Root Test to see the series converges when
$$\limsup |a_n(x - c)^n|^{1/n} = |x - c| \limsup |a_n|^{1/n} < 1$$
and diverges when
$$|x - c| \limsup |a_n|^{1/n} > 1.$$
If $r \limsup |a_n|^{1/n} \le 1$ for some $r \ge 0$, then these inequalities imply (9.15) is absolutely convergent when $|x - c| < r$. In other words, if
$$R = \operatorname{lub}\{r : r \limsup |a_n|^{1/n} < 1\}, \tag{9.16}$$
then the domain of (9.15) is an interval of radius $R$ centered at $c$. The root test gives no information about convergence when $|x - c| = R$. This $R$ is called the radius of convergence of the power series. Assuming $R > 0$, the open interval centered at $c$ with radius $R$ is called the interval of convergence. It may be different from the domain of the series because the series may converge at neither, one, or both endpoints of the interval of convergence.
The ratio test can also be used to determine the radius of convergence, but, as shown in (4.9), it will not work as often as the root test. When it does,
$$R = \operatorname{lub}\left\{ r : r \lim_{n\to\infty} \left| \frac{a_{n+1}}{a_n} \right| < 1 \right\}. \tag{9.17}$$
This is usually easier to compute than (9.16), and both will give the same value for $R$, when they can both be evaluated.
EXAMPLE 9.12. Calling to mind Example 4.2, it is apparent the geometric power series $\sum_{n=0}^\infty x^n$ has center 0, radius of convergence 1 and domain $(-1, 1)$.
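EXAMPLE 9.13. For the power series $\sum_{n=1}^\infty 2^n(x + 2)^n/n$, we compute
$$\limsup \left( \frac{2^n}{n} \right)^{1/n} = 2 \implies R = \frac{1}{2}.$$
The interval of convergence is therefore $(-5/2, -3/2)$. At $x = -2 + \frac{1}{2}$ the series is the harmonic series, which diverges, and at $x = -2 - \frac{1}{2}$ it is the alternating harmonic series, which converges, so the domain is $[-5/2, -3/2)$.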
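The numbers in Example 9.13 can be probed numerically. The sketch below estimates $(2^n/n)^{1/n}$ for a large $n$ using logarithms, and evaluates partial sums of the series at the two endpoints $x = -5/2$ and $x = -3/2$. The cutoff $10^5$ and the helper names are arbitrary, and a finite partial sum can only suggest convergence or divergence, not prove it.

```python
import math


def nth_root_of_coefficient(n):
    """(2**n / n) ** (1/n), computed with logarithms to avoid overflow."""
    return math.exp((n * math.log(2.0) - math.log(n)) / n)


def partial_sum(x, N):
    """sum_{n=1}^{N} 2**n * (x + 2)**n / n."""
    return sum((2.0 * (x + 2.0)) ** n / n for n in range(1, N + 1))


root_estimate = nth_root_of_coefficient(10 ** 5)
left_endpoint = partial_sum(-2.5, 10 ** 5)     # terms are (-1)**n / n
right_endpoint = partial_sum(-1.5, 10 ** 5)    # terms are 1 / n
```

The root estimate is close to 2, the left-endpoint sums settle near $-\ln 2$, and the right-endpoint sums keep growing like $\ln N$.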
EXAMPLE 9.14. The power series $\sum_{n=1}^\infty x^n/n$ has interval of convergence $(-1, 1)$ and domain $[-1, 1)$. Notice it is not absolutely convergent when $x = -1$.
EXAMPLE 9.15. The power series $\sum_{n=1}^\infty x^n/n^2$ has interval of convergence $(-1, 1)$, domain $[-1, 1]$ and is absolutely convergent on its whole domain.
The preceding is summarized in the following theorem.
T HEOREM 9.23. Let the power series be as in (9.15) and R be given by either
(9.16) or (9.17).
(a) If R = 0, then the domain of the series is {c}.
(b) If $R > 0$ the series converges absolutely at $x$ when $|c - x| < R$ and diverges at $x$ when $|c - x| > R$. In the case when $R = \infty$, the series converges everywhere.
(c) If R ∈ (0, ∞), then the series may converge at none, one or both of c − R
and c + R.
8.2. Uniform Convergence of Power Series. The partial sums of a power
series are a sequence of polynomials converging pointwise on the domain of the
series. As has been seen, pointwise convergence is not enough to say much about
the behavior of the power series. The following theorem opens the door to a lot
more.
T HEOREM 9.24. A power series converges absolutely and uniformly on compact
subsets of its interval of convergence.
PROOF. There is no generality lost in assuming the series has the form of (9.15) with $c = 0$. Let the radius of convergence be $R > 0$ and $K$ be a compact subset of $(-R, R)$ with $\alpha = \operatorname{lub}\{|x| : x \in K\}$. Choose $r \in (\alpha, R)$. If $x \in K$, then $|a_n x^n| \le |a_n r^n|$ for $n \in \mathbb{N}$. Since $\sum_{n=0}^\infty |a_n r^n|$ converges, the Weierstrass M-Test shows $\sum_{n=0}^\infty a_n x^n$ is absolutely and uniformly convergent on $K$.
The following two corollaries are immediate consequences of Corollary 9.12
and Theorem 9.16, respectively.
C OROLLARY 9.25. A power series is continuous on its interval of convergence.
COROLLARY 9.26. If $[a, b]$ is an interval contained in the interval of convergence for the power series $\sum_{n=0}^\infty a_n(x - c)^n$, then
$$\int_a^b \sum_{n=0}^\infty a_n(x - c)^n = \sum_{n=0}^\infty a_n \int_a^b (x - c)^n.$$
EXAMPLE 9.16. Define
$$f(x) = \begin{cases} \frac{\sin x}{x}, & x \ne 0 \\ 1, & x = 0 \end{cases}.$$
Since $\lim_{x\to 0} f(x) = 1$, $f$ is continuous everywhere. Suppose we want $\int_0^\pi f$ with an accuracy of five decimal places.
If $x \ne 0$,
$$f(x) = \frac{1}{x} \sum_{n=1}^\infty \frac{(-1)^{n+1}}{(2n - 1)!} x^{2n-1} = \sum_{n=0}^\infty \frac{(-1)^n}{(2n + 1)!} x^{2n}.$$
The latter series converges to $f$ everywhere. Corollary 9.26 implies
$$\begin{aligned}
\int_0^\pi f(x) \, dx &= \int_0^\pi \sum_{n=0}^\infty \frac{(-1)^n}{(2n + 1)!} x^{2n} \, dx \\
&= \sum_{n=0}^\infty \frac{(-1)^n}{(2n + 1)!} \int_0^\pi x^{2n} \, dx \\
&= \sum_{n=0}^\infty \frac{(-1)^n}{(2n + 1)(2n + 1)!} \pi^{2n+1}.
\end{aligned} \tag{9.18}$$
The latter series satisfies the Alternating Series Test. Since $\pi^{15}/(15 \times 15!) \approx 1.5 \times 10^{-6}$, Corollary 4.20 shows
$$\int_0^\pi f(x) \, dx \approx \sum_{n=0}^6 \frac{(-1)^n}{(2n + 1)(2n + 1)!} \pi^{2n+1} \approx 1.85194.$$
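The arithmetic in Example 9.16 is easy to reproduce. The sketch below sums the first seven terms of (9.18) and evaluates the first omitted term, which bounds the error of an alternating series with decreasing terms. The function names are illustrative.

```python
import math


def term(n):
    """n-th term of (9.18): (-1)**n * pi**(2n+1) / ((2n+1) * (2n+1)!)."""
    k = 2 * n + 1
    return (-1) ** n * math.pi ** k / (k * math.factorial(k))


def partial_sum(N):
    """Sum of the terms of (9.18) for n = 0, ..., N."""
    return sum(term(n) for n in range(N + 1))


estimate = partial_sum(6)
first_omitted = abs(term(7))   # pi**15 / (15 * 15!)
```

The first omitted term is about $1.5 \times 10^{-6}$, so stopping at $n = 6$ is enough for five decimal places.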
The next question is: What about differentiability?
Notice that the continuity of the exponential function and L'Hospital's Rule give
$$\lim_{n\to\infty} n^{1/n} = \lim_{n\to\infty} \exp\left( \frac{\ln n}{n} \right) = \exp\left( \lim_{n\to\infty} \frac{\ln n}{n} \right) = \exp(0) = 1.$$
Therefore, for any sequence $a_n$,
$$\limsup (n|a_n|)^{1/n} = \limsup n^{1/n}|a_n|^{1/n} = \limsup |a_n|^{1/n}. \tag{9.19}$$
Now, suppose the power series $\sum_{n=0}^\infty a_n x^n$ has a nontrivial interval of convergence, $I$. Formally differentiating the power series term-by-term gives a new power series $\sum_{n=1}^\infty n a_n x^{n-1}$. According to (9.19) and Theorem 9.23, the term-by-term differentiated series has the same interval of convergence as the original. Its partial sums are the derivatives of the partial sums of the original series and Theorem 9.24 guarantees they converge uniformly on any compact subset of $I$. Corollary 9.21 shows
$$\frac{d}{dx} \sum_{n=0}^\infty a_n x^n = \sum_{n=0}^\infty \frac{d}{dx} a_n x^n = \sum_{n=1}^\infty n a_n x^{n-1}, \quad \forall x \in I.$$
This process can be continued inductively to obtain the same results for all higher order derivatives. We have proved the following theorem.
order derivatives. We have proved the following theorem.
THEOREM 9.27. If $f(x) = \sum_{n=0}^\infty a_n(x - c)^n$ is a power series with nontrivial interval of convergence, $I$, then $f$ is differentiable to all orders on $I$ with
$$f^{(m)}(x) = \sum_{n=m}^\infty \frac{n!}{(n - m)!} a_n(x - c)^{n-m}. \tag{9.20}$$
Moreover, the differentiated series has $I$ as its interval of convergence.
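Formula (9.20) can be tested on the geometric series, where $a_n = 1$, $c = 0$ and $f(x) = 1/(1 - x)$, whose $m$-th derivative is $m!/(1 - x)^{m+1}$. The sketch below compares a long partial sum of the differentiated series with that closed form at $x = 1/2$. The number of terms and the function names are arbitrary choices, and `math.perm(n, m)` is the standard library function returning $n!/(n - m)!$ (Python 3.8 or later).

```python
import math


def differentiated_geometric(x, m, terms=400):
    """Partial sum of sum_{n>=m} n!/(n-m)! * x**(n-m), i.e. (9.20) with a_n = 1 and c = 0."""
    return sum(math.perm(n, m) * x ** (n - m) for n in range(m, m + terms))


def closed_form(x, m):
    """m-th derivative of 1 / (1 - x)."""
    return math.factorial(m) / (1.0 - x) ** (m + 1)


pairs = [(differentiated_geometric(0.5, m), closed_form(0.5, m)) for m in range(4)]
```

For $m = 0, 1, 2, 3$ the closed forms at $x = 1/2$ are 2, 4, 16 and 96, and the partial sums should match them closely.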
8.3. Taylor Series. Suppose $f(x) = \sum_{n=0}^\infty a_n x^n$ has $I = (-R, R)$ as its interval of convergence for some $R > 0$. According to Theorem 9.27,
$$f^{(m)}(0) = \frac{m!}{(m - m)!} a_m \implies a_m = \frac{f^{(m)}(0)}{m!}, \quad \forall m \in \omega.$$
Therefore,
$$f(x) = \sum_{n=0}^\infty \frac{f^{(n)}(0)}{n!} x^n, \quad \forall x \in I.$$
This is a remarkable result! It shows that the values of $f$ on $I$ are completely determined by its values on any neighborhood of 0. This is summarized in the following theorem.
THEOREM 9.28. If a power series $f(x) = \sum_{n=0}^\infty a_n(x - c)^n$ has a nontrivial interval of convergence $I$, then
$$f(x) = \sum_{n=0}^\infty \frac{f^{(n)}(c)}{n!}(x - c)^n, \quad \forall x \in I. \tag{9.21}$$
The series (9.21) is called the Taylor series5 for f centered at c. The Taylor
series can be formally defined for any function that has derivatives of all orders at
c, but, as Example 7.9 shows, there is no guarantee it will converge to the function
anywhere except at c. Taylor’s Theorem 7.18 can be used to examine the question
of pointwise convergence. If f can be represented by a power series on an open
interval I , then f is said to be analytic on I .
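The formula (9.21) can be tried out numerically. The following is a minimal sketch, not part of the original notes: the helper name `taylor_partial_sum` is hypothetical, and the exponential function is used only because every derivative of exp at 0 is known to equal 1.

```python
import math

def taylor_partial_sum(derivatives_at_c, c, x):
    """Sum f^(n)(c)/n! * (x - c)^n over the derivatives supplied."""
    total = 0.0
    for n, d in enumerate(derivatives_at_c):
        total += d / math.factorial(n) * (x - c) ** n
    return total

# Assumption for illustration: f = exp and c = 0, so every derivative is 1.
exp_derivatives = [1.0] * 20
approx = taylor_partial_sum(exp_derivatives, 0.0, 1.5)
print(approx, math.exp(1.5))
```

With twenty terms the partial sum should already agree with `math.exp(1.5)` to many decimal places, which is consistent with exp being analytic on the whole real line.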
8.4. The Endpoints of the Interval of Convergence. We have seen that at
the endpoints of its interval of convergence a power series may diverge or even
absolutely converge. A natural question when it does converge is the following:
What is the relationship between the value at the endpoint and the values inside
the interval of convergence?
THEOREM 9.29 (Abel). A power series is continuous on its domain.
PROOF. Let $f(x) = \sum_{n=0}^\infty a_n (x-c)^n$ have radius of convergence $R$ and interval of convergence $I$. If $I = \{c\}$, the theorem is vacuously true from Definition 6.9. If $I = \mathbb{R}$, the theorem follows from Corollary 9.25. So, assume $R \in (0, \infty)$. It must be shown that if $f$ converges at an endpoint of $I = (c-R, c+R)$, then $f$ is continuous at that endpoint.
It can be assumed $c = 0$ and $R = 1$. There is no loss of generality with either of these assumptions because otherwise just replace $f(x)$ with $f(Rx + c)$. The theorem will be proved for $\alpha = c + R$ since the other case is proved similarly.
5 When c = 0, it is often called the Maclaurin series for f.
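Set $s = f(1)$, $s_{-1} = 0$ and $s_n = \sum_{k=0}^n a_k$ for $n \in \omega$. For $|x| < 1$,
\begin{align*}
\sum_{k=0}^n a_k x^k &= \sum_{k=0}^n (s_k - s_{k-1}) x^k \\
&= \sum_{k=0}^n s_k x^k - \sum_{k=1}^n s_{k-1} x^k \\
&= s_n x^n + \sum_{k=0}^{n-1} s_k x^k - x \sum_{k=0}^{n-1} s_k x^k \\
&= s_n x^n + (1-x) \sum_{k=0}^{n-1} s_k x^k.
\end{align*}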
When $n \to \infty$, since $s_n$ is bounded and $|x| < 1$,
\[ f(x) = (1-x) \sum_{k=0}^\infty s_k x^k. \tag{9.22} \]
Since $(1-x) \sum_{n=0}^\infty x^n = 1$, (9.22) implies
\[ |f(x) - s| = \left| (1-x) \sum_{k=0}^\infty (s_k - s) x^k \right|. \tag{9.23} \]
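Let $\varepsilon > 0$. Choose $N \in \mathbb{N}$ such that whenever $n \ge N$, then $|s_n - s| < \varepsilon/2$. Choose $\delta \in (0, 1)$ so
\[ \delta \sum_{k=0}^N |s_k - s| < \varepsilon/2. \]
Suppose $x$ is such that $1 - \delta < x < 1$. With these choices, (9.23) becomes
\begin{align*}
|f(x) - s| &\le (1-x) \left| \sum_{k=0}^N (s_k - s) x^k \right| + (1-x) \left| \sum_{k=N+1}^\infty (s_k - s) x^k \right| \\
&< \delta \sum_{k=0}^N |s_k - s| + \frac{\varepsilon}{2} (1-x) \sum_{k=N+1}^\infty x^k < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.
\end{align*}
It has been shown that $\lim_{x \uparrow 1} f(x) = f(1)$, so $1 \in C(f)$.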
Here is an example showing the power of these techniques.
EXAMPLE 9.17. The series
\[ \sum_{n=0}^\infty (-1)^n x^{2n} = \frac{1}{1+x^2} \]
has $(-1, 1)$ as its interval of convergence. If $0 \le x < 1$, then Corollary 9.17 justifies
\[ \arctan(x) = \int_0^x \frac{dt}{1+t^2} = \int_0^x \sum_{n=0}^\infty (-1)^n t^{2n} \, dt = \sum_{n=0}^\infty \frac{(-1)^n}{2n+1} x^{2n+1}. \]
This series for the arctangent converges by the alternating series test when $x = 1$, so Theorem 9.29 implies
\[ \sum_{n=0}^\infty \frac{(-1)^n}{2n+1} = \lim_{x \uparrow 1} \arctan(x) = \arctan(1) = \frac{\pi}{4}. \tag{9.24} \]
A bit of rearranging gives the formula
\[ \pi = 4 \left( 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots \right), \]
which is known as Gregory's series for π.
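Gregory's series converges, but slowly. The following sketch, which is an illustration and not part of the original notes, prints a few partial sums next to their distance from π; the function name `gregory_partial_sum` is hypothetical.

```python
import math

def gregory_partial_sum(terms):
    """Return 4 * sum_{n=0}^{terms-1} (-1)^n / (2n + 1)."""
    return 4.0 * sum((-1) ** n / (2 * n + 1) for n in range(terms))

for terms in (10, 100, 1000):
    approx = gregory_partial_sum(terms)
    print(terms, approx, abs(approx - math.pi))
```

By the alternating series test, the error after N terms is at most the first omitted term, 4/(2N + 1), so roughly ten times as many terms are needed for each additional correct digit.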
Finally, Abel's theorem opens up an interesting idea for the summation of series. Suppose $\sum_{n=0}^\infty a_n$ is a series. The Abel sum of this series is
\[ A\sum_{n=0}^\infty a_n = \lim_{x \uparrow 1} \sum_{n=0}^\infty a_n x^n. \]
Consider the following example.
EXAMPLE 9.18. Let $a_n = (-1)^n$ so
\[ \sum_{n=0}^\infty a_n = 1 - 1 + 1 - 1 + 1 - 1 + \cdots \]
diverges. But,
\[ A\sum_{n=0}^\infty a_n = \lim_{x \uparrow 1} \sum_{n=0}^\infty (-x)^n = \lim_{x \uparrow 1} \frac{1}{1+x} = \frac{1}{2}. \]
This shows the Abel sum of a series may exist when the ordinary sum does
not. Abel’s theorem guarantees when both exist they are the same.
Abel summation is one of many different summation methods used in areas
such as harmonic analysis. (For another see Exercise 4.25.)
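The Abel sum of the series 1 − 1 + 1 − 1 + ⋯ can be watched numerically. This is a rough sketch under stated assumptions: the names `power_series_value` and `grandi` are hypothetical, and the infinite series is truncated at 20000 terms, which is enough for the values of x used here.

```python
def power_series_value(a, x, terms):
    """Approximate sum_{n>=0} a(n) x^n by its first `terms` terms."""
    return sum(a(n) * x ** n for n in range(terms))

def grandi(n):
    """Coefficients a_n = (-1)^n of the series 1 - 1 + 1 - 1 + ..."""
    return (-1) ** n

for x in (0.9, 0.99, 0.999):
    value = power_series_value(grandi, x, 20000)
    print(x, value, 1 / (1 + x))
```

As x increases toward 1 the printed values approach 1/2, while the ordinary partial sums (the same function evaluated at x = 1) keep alternating between 1 and 0.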
THEOREM 9.30 (Tauber). If $\sum_{n=0}^\infty a_n$ is a series satisfying
(a) $n a_n \to 0$ and
(b) $A\sum_{n=0}^\infty a_n = A$,
then $\sum_{n=0}^\infty a_n = A$.
PROOF. Let $s_n = \sum_{k=0}^n a_k$. For $x \in (0, 1)$ and $n \in \mathbb{N}$,
\begin{align*}
\left| s_n - \sum_{k=0}^\infty a_k x^k \right| &= \left| \sum_{k=0}^n a_k - \sum_{k=0}^n a_k x^k - \sum_{k=n+1}^\infty a_k x^k \right| \\
&= \left| \sum_{k=0}^n a_k (1 - x^k) - \sum_{k=n+1}^\infty a_k x^k \right| \\
&= \left| \sum_{k=0}^n a_k (1-x)(1 + x + \cdots + x^{k-1}) - \sum_{k=n+1}^\infty a_k x^k \right| \\
&\le (1-x) \sum_{k=0}^n k|a_k| + \sum_{k=n+1}^\infty |a_k| x^k. \tag{9.25}
\end{align*}
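Let $\varepsilon > 0$. According to (a) and Exercise 3.21, there is an $N \in \mathbb{N}$ such that
\[ n \ge N \implies n|a_n| < \frac{\varepsilon}{2} \quad\text{and}\quad \frac{1}{n} \sum_{k=0}^n k|a_k| < \frac{\varepsilon}{2}. \tag{9.26} \]
Let $n \ge N$ and $1 - 1/n < x < 1$. Using the right term in (9.26),
\[ (1-x) \sum_{k=0}^n k|a_k| < \frac{1}{n} \sum_{k=0}^n k|a_k| < \frac{\varepsilon}{2}. \tag{9.27} \]
Using the left term in (9.26) gives
\[ \sum_{k=n+1}^\infty |a_k| x^k < \sum_{k=n+1}^\infty \frac{\varepsilon}{2k} x^k < \frac{\varepsilon}{2n} \frac{x^{n+1}}{1-x} < \frac{\varepsilon}{2}. \tag{9.28} \]
Combining (9.27) and (9.28) with (9.25) shows
\[ \left| s_n - \sum_{k=0}^\infty a_k x^k \right| < \varepsilon. \]
Assumption (b) implies $s_n \to A$.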
9. Exercises
9.1. If $f_n(x) = nx(1-x)^n$ for $0 \le x \le 1$, then show $f_n$ converges pointwise, but not uniformly on $[0, 1]$.
9.2. Show $\sin^n x$ converges uniformly on $[0, a]$ for all $a \in (0, \pi/2)$. Does $\sin^n x$ converge uniformly on $[0, \pi/2)$?
9.3. Show that $\sum x^n$ converges uniformly on $[-r, r]$ when $0 < r < 1$, but not on $(-1, 1)$.
9.4. Prove $\sum_{n=0}^\infty x^n/n!$ does not converge uniformly on $\mathbb{R}$.
9.5. The series
\[ \sum_{n=0}^\infty \frac{\cos nx}{e^{nx}} \]
is uniformly convergent on any set of the form $[a, \infty)$ with $a > 0$.
9.6. A sequence of functions $f_n : S \to \mathbb{R}$ is uniformly bounded on $S$ if there is an $M > 0$ such that $\|f_n\|_S \le M$ for all $n \in \mathbb{N}$. Prove that if $f_n$ is uniformly convergent on $S$ and each $f_n$ is bounded on $S$, then the sequence $f_n$ is uniformly bounded on $S$.
9.7. Let $S \subset \mathbb{R}$ and $c \in \mathbb{R}$. If $f_n : S \to \mathbb{R}$ is a Cauchy sequence, then so is $c f_n$.
9.8. If $S \subset \mathbb{R}$ and $f_n, g_n : S \to \mathbb{R}$ are Cauchy sequences, then so is $f_n + g_n$.
9.9. Let $S \subset \mathbb{R}$. If $f_n, g_n : S \to \mathbb{R}$ are uniformly bounded Cauchy sequences, then so is $f_n g_n$.
9.10. Prove or give a counterexample: If $f_n$ is a sequence of monotone functions converging pointwise to a continuous function $f$, then $f_n \rightrightarrows f$.
9.11. Prove or give a counterexample: If $f_n : [a, b] \to \mathbb{R}$ is a sequence of monotone functions converging pointwise to a continuous function $f$, then $f_n \rightrightarrows f$.
9.12. Prove there is a sequence of polynomials on $[a, b]$ converging uniformly to a nowhere differentiable function.
9.13. Prove Corollary 9.21.
9.14. Prove $\pi = 2\sqrt{3} \sum_{n=0}^\infty \frac{(-1)^n}{3^n (2n+1)}$. (This is the Madhava-Leibniz series which was used in the fourteenth century to compute π to 11 decimal places.)
9.15. If $\exp(x) = \sum_{n=0}^\infty \frac{x^n}{n!}$, then $\frac{d}{dx} \exp(x) = \exp(x)$ for all $x \in \mathbb{R}$.
9.16. Is $\sum_{n=2}^\infty \frac{1}{n \ln n}$ Abel convergent?
CHAPTER 10
Fourier Series
In the late eighteenth century, it was well-known that complicated functions could sometimes be approximated by a sequence of polynomials. Some of the leading mathematicians at that time, including such luminaries as Daniel Bernoulli, Euler and d'Alembert, began studying the possibility of using sequences of trigonometric functions for approximation. In 1807, this idea opened into a huge area of research when Joseph Fourier used series of sines and cosines to solve several outstanding partial differential equations of physics.¹
In particular, he used series of the form
\[ \sum_{n=0}^\infty a_n \cos nx + b_n \sin nx \]
to approximate his solutions. Series of this form are called trigonometric series,
and the ones derived from Fourier’s methods are called Fourier series. Much of
the mathematical research done in the nineteenth and early twentieth century
was devoted to understanding the convergence of Fourier series. This chapter
presents nothing more than the tip of that huge iceberg.
1. Trigonometric Polynomials
DEFINITION 10.1. A function of the form
\[ p(x) = \sum_{k=0}^n \alpha_k \cos kx + \beta_k \sin kx \tag{10.1} \]
is called a trigonometric polynomial. The largest value of $k$ such that $|\alpha_k| + |\beta_k| \ne 0$ is the degree of the polynomial. Denote by $\mathcal{T}$ the set of all trigonometric polynomials.
Evidently, all functions in $\mathcal{T}$ are 2π-periodic and $\mathcal{T}$ is closed under addition and multiplication by real numbers. Indeed, it is a real vector space, in the sense of linear algebra, and the set $\{\sin nx : n \in \mathbb{N}\} \cup \{\cos nx : n \in \omega\}$ is a basis for $\mathcal{T}$.
The following theorem can be proved using integration by parts or trigonometric identities.
THEOREM 10.2. If $m, n \in \mathbb{Z}$, then
\[ \int_{-\pi}^\pi \sin mx \cos nx \, dx = 0, \tag{10.2} \]
\[ \int_{-\pi}^\pi \sin mx \sin nx \, dx = \begin{cases} 0, & m \ne n \\ 0, & m = 0 \text{ or } n = 0 \\ \pi, & m = n \ne 0 \end{cases} \tag{10.3} \]
and
\[ \int_{-\pi}^\pi \cos mx \cos nx \, dx = \begin{cases} 0, & m \ne n \\ 2\pi, & m = n = 0 \\ \pi, & m = n \ne 0 \end{cases}. \tag{10.4} \]
1 Fourier's methods can be seen in most books on partial differential equations, such as [3]. For example, see solutions of the heat and wave equations using the method of separation of variables.
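The orthogonality relations of Theorem 10.2 are easy to spot-check numerically for nonnegative m and n. The sketch below is only an illustration: `integrate` is a hypothetical midpoint-rule helper, not something from the notes, and the choice of 2000 subintervals is an assumption that is more than enough for these low frequencies.

```python
import math

def integrate(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def sin_cos(m, n):
    return integrate(lambda x: math.sin(m * x) * math.cos(n * x), -math.pi, math.pi)

def sin_sin(m, n):
    return integrate(lambda x: math.sin(m * x) * math.sin(n * x), -math.pi, math.pi)

def cos_cos(m, n):
    return integrate(lambda x: math.cos(m * x) * math.cos(n * x), -math.pi, math.pi)

print(sin_cos(2, 3))   # expected to be near 0, as in (10.2)
print(sin_sin(3, 3))   # expected to be near pi, as in (10.3)
print(cos_cos(0, 0))   # expected to be near 2*pi, as in (10.4)
print(cos_cos(2, 5))   # expected to be near 0, as in (10.4)
```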
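If $p(x)$ is as in (10.1), then Theorem 10.2 shows
\[ 2\pi\alpha_0 = \int_{-\pi}^\pi p(x) \, dx, \]
and for $n > 0$,
\[ \pi\alpha_n = \int_{-\pi}^\pi p(x) \cos nx \, dx, \qquad \pi\beta_n = \int_{-\pi}^\pi p(x) \sin nx \, dx. \]
Combining these, it follows that if
\[ a_n = \frac{1}{\pi} \int_{-\pi}^\pi p(x) \cos nx \, dx \quad\text{and}\quad b_n = \frac{1}{\pi} \int_{-\pi}^\pi p(x) \sin nx \, dx \]
for $n \in \omega$, then
\[ p(x) = \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos nx + b_n \sin nx. \tag{10.5} \]
(Remember that all but a finite number of the $a_n$ and $b_n$ are 0!)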
At this point, the logical question is whether this same method can be used to represent a more general 2π-periodic function. For any function $f$, integrable on $[-\pi, \pi]$, the coefficients can be defined as above; i.e., for $n \in \omega$,
\[ a_n = \frac{1}{\pi} \int_{-\pi}^\pi f(x) \cos nx \, dx \quad\text{and}\quad b_n = \frac{1}{\pi} \int_{-\pi}^\pi f(x) \sin nx \, dx. \tag{10.6} \]
The numbers $a_n$ and $b_n$ are called the Fourier coefficients of $f$. The problem is whether and in what sense an equation such as (10.5) might be true. This turns out to be a very deep and difficult question with no short answer.² Because we don't know whether equality in the sense of (10.5) is true, the usual practice is to write
\[ f(x) \sim \frac{a_0}{2} + \sum_{n=1}^\infty a_n \cos nx + b_n \sin nx, \tag{10.7} \]
indicating that the series on the right is calculated from the function on the left using (10.6). The series is called the Fourier series for $f$.
2 Many people, including me, would argue that the study of Fourier series has been the most important area of mathematical research over the past two centuries. Huge mathematical disciplines, including set theory, measure theory and harmonic analysis trace their lineage back to basic questions about Fourier series. Even after centuries of study, research in this area continues unabated.
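FIGURE 10.1. This shows $f(x) = |x|$, $s_1(x)$ and $s_3(x)$, where $s_n(x)$ is the nth partial sum of the Fourier series for $f$.
EXAMPLE 10.1. Let $f(x) = |x|$. Since $f$ is an even function and $\sin nx$ is odd,
\[ b_n = \frac{1}{\pi} \int_{-\pi}^\pi |x| \sin nx \, dx = 0 \]
for all $n \in \mathbb{N}$. On the other hand,
\[ a_n = \frac{1}{\pi} \int_{-\pi}^\pi |x| \cos nx \, dx = \begin{cases} \pi, & n = 0 \\ \dfrac{2(\cos n\pi - 1)}{n^2 \pi}, & n \in \mathbb{N} \end{cases} \]
for $n \in \omega$. Therefore,
\[ |x| \sim \frac{\pi}{2} - \frac{4\cos x}{\pi} - \frac{4\cos 3x}{9\pi} - \frac{4\cos 5x}{25\pi} - \frac{4\cos 7x}{49\pi} - \frac{4\cos 9x}{81\pi} - \cdots \]
(See Figure 10.1.)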
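The coefficients in Example 10.1 can be cross-checked by evaluating the integrals in (10.6) numerically. This is a sketch under stated assumptions: the midpoint-rule helper `integrate` and the names `fourier_a`, `fourier_b` and `closed_form_a` are hypothetical, and the number of subintervals is chosen even so that the kink of |x| at 0 falls on a grid boundary.

```python
import math

def integrate(g, a, b, n=10000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def fourier_a(f, n):
    """Numerical version of a_n from (10.6)."""
    return integrate(lambda x: f(x) * math.cos(n * x), -math.pi, math.pi) / math.pi

def fourier_b(f, n):
    """Numerical version of b_n from (10.6)."""
    return integrate(lambda x: f(x) * math.sin(n * x), -math.pi, math.pi) / math.pi

def closed_form_a(n):
    """The coefficient a_n of |x| as stated in Example 10.1."""
    if n == 0:
        return math.pi
    return 2 * (math.cos(n * math.pi) - 1) / (n * n * math.pi)

for n in range(6):
    print(n, fourier_a(abs, n), closed_form_a(n), fourier_b(abs, n))
```

The numerical and closed-form values of a_n should agree to several decimal places, the even-indexed a_n with n > 0 should be essentially zero, and every b_n should vanish up to rounding.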
There are at least two fundamental questions arising from (10.7): Does the
Fourier series of f converge to f ? Can f be recovered from its Fourier series,
even if the Fourier series does not converge to f ? These are often called the
convergence and representation questions, respectively. The next few sections
will give some partial answers.
2. The Riemann-Lebesgue Lemma
We learned early in our study of series that the first and simplest convergence test is to check whether the terms go to zero. For Fourier series, this is always the case.
THEOREM 10.3 (Riemann-Lebesgue Lemma). If $f$ is a function such that $\int_a^b f$ exists, then
\[ \lim_{\alpha\to\infty} \int_a^b f(t) \cos \alpha t \, dt = 0 \quad\text{and}\quad \lim_{\alpha\to\infty} \int_a^b f(t) \sin \alpha t \, dt = 0. \]
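PROOF. Since the two limits have similar proofs, only the first will be proved.
Let $\varepsilon > 0$ and $P$ be a generic partition of $[a, b]$ satisfying
\[ 0 < \int_a^b f - \underline{D}(f, P) < \frac{\varepsilon}{2}. \]
For $m_i = \operatorname{glb}\{f(x) : x_{i-1} < x < x_i\}$, define a function $g$ on $[a, b]$ by $g(x) = m_i$ when $x_{i-1} \le x < x_i$ and $g(b) = m_n$. Note that $\int_a^b g = \underline{D}(f, P)$, so
\[ 0 < \int_a^b (f - g) < \frac{\varepsilon}{2}. \tag{10.8} \]
Choose
\[ \alpha > \frac{4}{\varepsilon} \sum_{i=1}^n |m_i|. \tag{10.9} \]
Since $f \ge g$,
\begin{align*}
\left| \int_a^b f(t) \cos \alpha t \, dt \right| &= \left| \int_a^b (f(t) - g(t)) \cos \alpha t \, dt + \int_a^b g(t) \cos \alpha t \, dt \right| \\
&\le \left| \int_a^b (f(t) - g(t)) \cos \alpha t \, dt \right| + \left| \int_a^b g(t) \cos \alpha t \, dt \right| \\
&\le \int_a^b (f - g) + \left| \frac{1}{\alpha} \sum_{i=1}^n m_i \left( \sin(\alpha x_i) - \sin(\alpha x_{i-1}) \right) \right| \\
&\le \int_a^b (f - g) + \frac{2}{\alpha} \sum_{i=1}^n |m_i|
\end{align*}
Use (10.8) and (10.9).
\[ < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \]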
COROLLARY 10.4. If $f$ is integrable on $[-\pi, \pi]$ with $a_n$ and $b_n$ the Fourier coefficients of $f$, then $a_n \to 0$ and $b_n \to 0$.
3. The Dirichlet Kernel
Suppose $f$ is integrable on $[-\pi, \pi]$ and 2π-periodic on $\mathbb{R}$, so the Fourier series of $f$ exists. The partial sums of the Fourier series are written as $s_n(f, x)$, or more simply $s_n(x)$ when there is only one function in sight. To be more precise,
\[ s_n(f, x) = \frac{a_0}{2} + \sum_{k=1}^n (a_k \cos kx + b_k \sin kx). \]
Notice s n is a trigonometric polynomial of degree at most n.
We begin with the following calculation.
FIGURE 10.2. The Dirichlet kernel $D_n(s)$ for n = 1, 4, 7.
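\begin{align*}
s_n(x) &= \frac{a_0}{2} + \sum_{k=1}^n (a_k \cos kx + b_k \sin kx) \\
&= \frac{1}{2\pi} \int_{-\pi}^\pi f(t) \, dt + \sum_{k=1}^n \frac{1}{\pi} \int_{-\pi}^\pi \left( f(t) \cos kt \cos kx + f(t) \sin kt \sin kx \right) dt \\
&= \frac{1}{2\pi} \int_{-\pi}^\pi f(t) \left( 1 + \sum_{k=1}^n 2(\cos kt \cos kx + \sin kt \sin kx) \right) dt \\
&= \frac{1}{2\pi} \int_{-\pi}^\pi f(t) \left( 1 + \sum_{k=1}^n 2\cos k(x - t) \right) dt
\end{align*}
Substitute $s = x - t$ and use the assumption that $f$ is 2π-periodic.
\[ = \frac{1}{2\pi} \int_{-\pi}^\pi f(x - s) \left( 1 + 2\sum_{k=1}^n \cos ks \right) ds \tag{10.10} \]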
The sequence of trigonometric polynomials from within the integral,
\[ D_n(s) = 1 + 2\sum_{k=1}^n \cos ks, \tag{10.11} \]
is called the Dirichlet kernel. Its properties will prove useful for determining the pointwise convergence of Fourier series.
THEOREM 10.5. The Dirichlet kernel has the following properties.
(a) $D_n(s)$ is an even 2π-periodic function for each $n \in \mathbb{N}$.
(b) $D_n(0) = 2n + 1$ for each $n \in \mathbb{N}$.
(c) $|D_n(s)| \le 2n + 1$ for each $n \in \mathbb{N}$ and all $s$.
(d) $\frac{1}{2\pi} \int_{-\pi}^\pi D_n(s) \, ds = 1$ for each $n \in \mathbb{N}$.
(e) $D_n(s) = \dfrac{\sin(n + 1/2)s}{\sin s/2}$ for each $n \in \mathbb{N}$ and $s/2$ not an integer multiple of π.
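PROOF. Properties (a)–(d) follow from the definition of the kernel.
The proof of property (e) uses some trigonometric manipulation. Suppose $n \in \mathbb{N}$ and $s \ne 2m\pi$ for any $m \in \mathbb{Z}$.
\[ D_n(s) = 1 + 2\sum_{k=1}^n \cos ks \]
Use the facts that the cosine is even and the sine is odd.
\begin{align*}
&= \sum_{k=-n}^n \cos ks + \frac{\cos \frac{s}{2}}{\sin \frac{s}{2}} \sum_{k=-n}^n \sin ks \\
&= \frac{1}{\sin \frac{s}{2}} \sum_{k=-n}^n \left( \sin \frac{s}{2} \cos ks + \cos \frac{s}{2} \sin ks \right) \\
&= \frac{1}{\sin \frac{s}{2}} \sum_{k=-n}^n \sin\left(k + \frac{1}{2}\right)s
\end{align*}
This is a telescoping sum.
\[ = \frac{\sin(n + \frac{1}{2})s}{\sin \frac{s}{2}} \]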
According to (10.10),
\[ s_n(f, x) = \frac{1}{2\pi} \int_{-\pi}^\pi f(x - t) D_n(t) \, dt. \]
This is similar to a situation we’ve seen before within the proof of the Weierstrass approximation theorem, Theorem 9.13. The integral given above is a
convolution integral similar to that used in the proof of Theorem 9.13, although
the Dirichlet kernel isn’t a convolution kernel in the sense of Lemma 9.14 because
it doesn’t satisfy conditions (a) and (c) of that lemma. (See Figure 10.3.)
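The two descriptions of the Dirichlet kernel, the sum (10.11) and the closed form in Theorem 10.5(e), can be compared numerically. The following is a small sketch with hypothetical function names; it is not taken from the notes, and the sample points are arbitrary values where sin(s/2) is not zero.

```python
import math

def dirichlet_sum(n, s):
    """D_n(s) computed from the definition (10.11)."""
    return 1.0 + 2.0 * sum(math.cos(k * s) for k in range(1, n + 1))

def dirichlet_closed(n, s):
    """D_n(s) from Theorem 10.5(e); requires sin(s/2) != 0."""
    return math.sin((n + 0.5) * s) / math.sin(s / 2)

for n in (1, 4, 7):
    for s in (0.3, 1.0, 2.5):
        print(n, s, dirichlet_sum(n, s), dirichlet_closed(n, s))
```

The printed pairs should match up to rounding. The kernel takes negative values at some of these sample points, which is the reason it fails condition (a) of Lemma 9.14 as mentioned above.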
4. Dini’s Test for Pointwise Convergence
THEOREM 10.6 (Dini's Test). Let $f : \mathbb{R} \to \mathbb{R}$ be a 2π-periodic function integrable on $[-\pi, \pi]$ with Fourier series given by (10.7). If there is a $\delta > 0$ and $s \in \mathbb{R}$ such that
\[ \int_0^\delta \left| \frac{f(x + t) + f(x - t) - 2s}{t} \right| dt < \infty, \]
then
\[ \frac{a_0}{2} + \sum_{k=1}^\infty (a_k \cos kx + b_k \sin kx) = s. \]
FIGURE 10.3. This graph shows $D_{50}(t)$ and the envelope $y = \pm 1/\sin(t/2)$. As $n$ gets larger, $D_n(t)$ fills the envelope more completely.
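PROOF. Since $D_n$ is even,
\begin{align*}
s_n(x) &= \frac{1}{2\pi} \int_{-\pi}^\pi f(x - t) D_n(t) \, dt \\
&= \frac{1}{2\pi} \int_{-\pi}^0 f(x - t) D_n(t) \, dt + \frac{1}{2\pi} \int_0^\pi f(x - t) D_n(t) \, dt \\
&= \frac{1}{2\pi} \int_0^\pi \left( f(x + t) + f(x - t) \right) D_n(t) \, dt.
\end{align*}
By Theorem 10.5(d) and (e),
\begin{align*}
s_n(x) - s &= \frac{1}{2\pi} \int_0^\pi \left( f(x + t) + f(x - t) - 2s \right) D_n(t) \, dt \\
&= \frac{1}{2\pi} \int_0^\pi \frac{f(x + t) + f(x - t) - 2s}{t} \cdot \frac{t}{\sin \frac{t}{2}} \cdot \sin\left(n + \frac{1}{2}\right)t \, dt.
\end{align*}
Since $t/\sin\frac{t}{2}$ is bounded on $(0, \pi)$, Theorem 10.3 shows $s_n(x) - s \to 0$. Now use Corollary 8.11 to finish the proof.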
EXAMPLE 10.2. Suppose $f(x) = x$ for $-\pi < x \le \pi$ and is 2π-periodic on $\mathbb{R}$. Since $f$ is odd, $a_n = 0$ for all $n$. Integration by parts gives $b_n = (-1)^{n+1} 2/n$ for $n \in \mathbb{N}$. Therefore,
\[ f \sim \sum_{n=1}^\infty (-1)^{n+1} \frac{2}{n} \sin nx. \]
For $x \in (-\pi, \pi)$, let $0 < \delta < \min\{\pi - x, \pi + x\}$. (This is just the distance from $x$ to the closest endpoint of $(-\pi, \pi)$.) Using Dini's test, we see
\[ \int_0^\delta \left| \frac{f(x + t) + f(x - t) - 2x}{t} \right| dt = \int_0^\delta \left| \frac{x + t + x - t - 2x}{t} \right| dt = 0 < \infty, \]
so
\[ \sum_{n=1}^\infty (-1)^{n+1} \frac{2}{n} \sin nx = x \quad\text{for } -\pi < x < \pi. \tag{10.12} \]
In particular, when $x = \pi/2$, (10.12) gives another way to derive (9.24). When $x = \pi$, the series converges to 0, which is the middle of the "jump" for $f$.
FIGURE 10.4. This plot shows the function of Example 10.2 and $s_8(x)$ for that function.
This behavior of converging to the middle of a jump discontinuity is typical. To see this, denote the one-sided limits of $f$ at $x$ by
\[ f(x-) = \lim_{t \uparrow x} f(t) \quad\text{and}\quad f(x+) = \lim_{t \downarrow x} f(t), \]
and suppose $f$ has a jump discontinuity at $x$ with
\[ s = \frac{f(x-) + f(x+)}{2}. \]
Guided by Dini's test, consider
\begin{align*}
\int_0^\delta \left| \frac{f(x + t) + f(x - t) - 2s}{t} \right| dt &= \int_0^\delta \left| \frac{f(x + t) + f(x - t) - f(x-) - f(x+)}{t} \right| dt \\
&\le \int_0^\delta \left| \frac{f(x + t) - f(x+)}{t} \right| dt + \int_0^\delta \left| \frac{f(x - t) - f(x-)}{t} \right| dt
\end{align*}
If both of the integrals on the right are finite, then the integral on the left is also
finite. This amounts to a proof of the following corollary.
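COROLLARY 10.7. Suppose $f : \mathbb{R} \to \mathbb{R}$ is 2π-periodic and integrable on $[-\pi, \pi]$. If both one-sided limits exist at $x$ and there is a $\delta > 0$ such that both
\[ \int_0^\delta \left| \frac{f(x + t) - f(x+)}{t} \right| dt < \infty \quad\text{and}\quad \int_0^\delta \left| \frac{f(x - t) - f(x-)}{t} \right| dt < \infty, \]
then the Fourier series of $f$ converges to
\[ \frac{f(x-) + f(x+)}{2}. \]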
The Dini test given above provides a powerful condition sufficient to ensure
the pointwise convergence of a Fourier series. There is a plethora of ever more
abstruse conditions that can be proved in a similar fashion to show pointwise
convergence.
The problem is complicated by the fact that there are continuous functions
with Fourier series divergent at a point and integrable functions with Fourier
series diverging everywhere [14]. In Section 6, a continuous function whose
Fourier series diverges is constructed.
5. Gibbs Phenomenon
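For $x \in [-\pi, \pi)$ define
\[ f(x) = \begin{cases} \dfrac{|x|}{x}, & 0 < |x| < \pi \\ 0, & x = 0, \pi \end{cases} \tag{10.13} \]
and extend $f$ 2π-periodically to all of $\mathbb{R}$. This function is often called a square wave. A straightforward calculation gives
\[ f \sim \frac{4}{\pi} \sum_{k=1}^\infty \frac{\sin(2k - 1)x}{2k - 1}. \]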
Corollary 10.7 shows s n (x) → f (x) everywhere. This convergence cannot be uniform because all the partial sums are continuous and f is discontinuous at every
integer multiple of π. A plot of s 19 (x) is shown in Figure 10.5. Notice the higher
peaks in the oscillation of s n (x) just before and after the jump discontinuities
of f . This behavior is not unique to f , as it can also be seen in Figure 10.4. If a
function is discontinuous at some point, the partial sums of its Fourier series
will always have such higher peaks near that point. This behavior is called Gibbs
phenomenon.3
Instead of doing a general analysis of Gibbs phenomenon, we’ll only analyze
the simple case shown in the square wave f . It’s basically a calculus exercise.
3 It is named after the American mathematical physicist, J. W. Gibbs, who pointed it out in 1899. He was not the first to notice the phenomenon, as the British mathematician Henry Wilbraham had published a little-noticed paper on it in 1848. Gibbs' interest in the phenomenon was sparked by investigations of the American experimental physicist A. A. Michelson who wrote a letter to Nature disputing the possibility that a Fourier series could converge to a discontinuous function. The ensuing imbroglio is recounted in a marvelous book by Paul J. Nahin [17].
FIGURE 10.5. This is a plot of $s_{19}(f, x)$, where $f$ is defined by (10.13).
FIGURE 10.6. This is a plot of $s_9(f, x)$, where $f$ is defined by (10.13).
To locate the peaks in the graph, differentiate the partial sums.
\[ s_{2n-1}'(x) = \frac{d}{dx} \frac{4}{\pi} \sum_{k=1}^n \frac{\sin(2k - 1)x}{2k - 1} = \frac{4}{\pi} \sum_{k=1}^n \cos(2k - 1)x \]
It is left as Exercise 10.10 to show this has a closed form.
\[ s_{2n-1}'(x) = \frac{2}{\pi} \frac{\sin 2nx}{\sin x} \]
Looking at the numerator, we see $s_{2n-1}'(x)$ has critical numbers at $x = k\pi/2n$ for $k \in \mathbb{Z}$. In the interval $(0, \pi)$, $s_{2n-1}(k\pi/2n)$ is a relative maximum for odd $k$ and a relative minimum for even $k$. (See Figure 10.6.) The value $s_{2n-1}(\pi/2n)$ is the height of the left-most peak. What is the behavior of these maxima?
From Figure 10.7 it appears they have an asymptotic limit near 1.18. To prove this, consider the following calculation.
\begin{align*}
s_{2n-1}\left(f, \frac{\pi}{2n}\right) &= \frac{4}{\pi} \sum_{k=1}^n \frac{\sin\left((2k - 1)\frac{\pi}{2n}\right)}{2k - 1} \\
&= \frac{2}{\pi} \sum_{k=1}^n \frac{\sin\frac{(2k-1)\pi}{2n}}{\frac{(2k-1)\pi}{2n}} \cdot \frac{\pi}{n}
\end{align*}
The last sum is a midpoint Riemann sum for the function $\frac{\sin x}{x}$ on the interval $[0, \pi]$ using a regular partition of $n$ subintervals. Example 9.16 shows
\[ \frac{2}{\pi} \int_0^\pi \frac{\sin x}{x} \, dx \approx 1.17898. \]
Since f (0+) − f (0−) = 2, this is an overshoot of a bit less than 9%. There is a
similar undershoot on the other side. It turns out this is typical behavior at points
of discontinuity [22].
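The height of the leftmost peak can also be computed directly from the partial sums of the square wave series. The sketch below is only an illustration; `square_wave_partial` is a hypothetical name, and the parameter `n` counts the nonzero terms so that the function evaluates the partial sum of index 2n − 1.

```python
import math

def square_wave_partial(n, x):
    """s_{2n-1}(f, x) = (4/pi) * sum_{k=1}^{n} sin((2k-1)x) / (2k-1)."""
    return 4 / math.pi * sum(
        math.sin((2 * k - 1) * x) / (2 * k - 1) for k in range(1, n + 1)
    )

# Evaluate each partial sum at its leftmost critical number x = pi/(2n).
for n in (5, 50, 500, 2000):
    peak = square_wave_partial(n, math.pi / (2 * n))
    print(n, peak)
```

The printed peak heights should settle near 1.179 rather than dropping to 1, in line with the Riemann sum argument above: the overshoot moves toward the jump as n grows but does not shrink.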
6. A Divergent Fourier Series
This is an advanced section that can be omitted.
Eighteenth century mathematicians, including Fourier, Cauchy, Euler, Weierstrass, Lagrange and Riemann, had computed the Fourier series for many functions. They believed from these examples that the Fourier series of a continuous
function must converge to that function. Fourier went far beyond this claim in his
important 1822 book Théorie Analytique de la Chaleur. Cauchy even published a
flawed proof of this “fact.”
Finally, in 1873, Paul du Bois-Reymond settled the question by giving the construction of a continuous function whose Fourier series diverges at a point [18]. It was finally shown in 1966 by Lennart Carleson [7] that the Fourier series of a continuous function converges to that function everywhere with the exception of a set of measure zero.
The problems around the convergence of Fourier series motivated a huge
amount of research that is still going on today. In this section, we look at the tip of
that iceberg by presenting a continuous function F (t ) whose Fourier series fails
to converge when t = 0.
6.1. The Conjugate Dirichlet Kernel.
FIGURE 10.7. This is a plot of $s_n(f, \pi/2n)$ for $n = 1, 2, \cdots, 100$. The dots come in pairs because $s_{2n-1}(f, \pi/2n) = s_{2n}(f, \pi/2n)$.
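LEMMA 10.8. If $m, n \in \omega$ and $0 < |t| < \pi$, then
\[ \sum_{k=m}^n \sin kt = \frac{\cos(m - \frac{1}{2})t - \cos(n + \frac{1}{2})t}{2\sin\frac{t}{2}}. \]
PROOF.
\begin{align*}
\sum_{k=m}^n \sin kt &= \frac{1}{\sin\frac{t}{2}} \sum_{k=m}^n \sin\frac{t}{2} \sin kt \\
&= \frac{1}{2\sin\frac{t}{2}} \sum_{k=m}^n \left( \cos\left(k - \frac{1}{2}\right)t - \cos\left(k + \frac{1}{2}\right)t \right) \\
&= \frac{\cos\left(m - \frac{1}{2}\right)t - \cos\left(n + \frac{1}{2}\right)t}{2\sin\frac{t}{2}}
\end{align*}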
DEFINITION 10.9. The conjugate Dirichlet kernel⁴ is
\[ \tilde{D}_n(t) = \sum_{k=1}^n \sin kt, \quad n \in \mathbb{N}. \tag{10.14} \]
We'll not have much use for the conjugate Dirichlet kernel, except as a convenient way to refer to sums of the form (10.14).
Lemma 10.8 immediately gives the following bound.
COROLLARY 10.10. If $0 < |t| < \pi$, then
\[ \left| \tilde{D}_n(t) \right| \le \frac{1}{\left| \sin\frac{t}{2} \right|}. \]
6.2. A Sawtooth Wave. If the function $f(x) = (\pi - x)/2$ on $[0, 2\pi)$ is extended 2π-periodically to $\mathbb{R}$, then the graph of the resulting function is often referred to as a "sawtooth wave". It has a particularly nice Fourier series:
\[ \frac{\pi - x}{2} \sim \sum_{k=1}^\infty \frac{\sin kx}{k}. \]
According to Corollary 10.7
\[ \sum_{k=1}^\infty \frac{\sin kx}{k} = \begin{cases} 0, & x = 2n\pi,\ n \in \mathbb{Z} \\ f(x), & \text{otherwise} \end{cases}. \]
We're interested in various partial sums of this series.
LEMMA 10.11. If $m, n \in \omega$ with $m \le n$ and $0 < |t| < 2\pi$, then
\[ \left| \sum_{k=m}^n \frac{\sin kt}{k} \right| \le \frac{1}{m \left| \sin\frac{t}{2} \right|}. \]
4 In this case, the word "conjugate" does not refer to the complex conjugate, but to the harmonic conjugate. They are related by $D_n(t) + i\tilde{D}_n(t) = 1 + 2\sum_{k=1}^n e^{kit}$.
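PROOF.
\[ \left| \sum_{k=m}^n \frac{\sin kt}{k} \right| = \left| \sum_{k=m}^n \left( \tilde{D}_k(t) - \tilde{D}_{k-1}(t) \right) \frac{1}{k} \right| \]
Use summation by parts.
\begin{align*}
&= \left| \sum_{k=m}^n \tilde{D}_k(t) \left( \frac{1}{k} - \frac{1}{k+1} \right) + \frac{\tilde{D}_n(t)}{n+1} - \frac{\tilde{D}_{m-1}(t)}{m} \right| \\
&\le \sum_{k=m}^n \left| \tilde{D}_k(t) \right| \left( \frac{1}{k} - \frac{1}{k+1} \right) + \frac{\left| \tilde{D}_n(t) \right|}{n+1} + \frac{\left| \tilde{D}_{m-1}(t) \right|}{m}
\end{align*}
Apply Corollary 10.10.
\begin{align*}
&\le \frac{1}{2\left| \sin\frac{t}{2} \right|} \left( \sum_{k=m}^n \left( \frac{1}{k} - \frac{1}{k+1} \right) + \frac{1}{n+1} + \frac{1}{m} \right) \\
&= \frac{1}{2\left| \sin\frac{t}{2} \right|} \left( \frac{1}{m} - \frac{1}{n+1} + \frac{1}{n+1} + \frac{1}{m} \right) \\
&= \frac{1}{m \left| \sin\frac{t}{2} \right|}.
\end{align*}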
PROPOSITION 10.12. If $n \in \mathbb{N}$ and $0 < |t| < \pi$, then
\[ \left| \sum_{k=1}^n \frac{\sin kt}{k} \right| \le 1 + \pi. \]
P ROOF.
n
sin kt
=
k
k=1
1≤k≤ 1t
≤
1≤k≤ 1t
≤
1≤k≤ 1t
sin kt
sin kt
+
k
k
1
0,
.
1 ≤ m < 2n
Rearrange the sum in the definition of $f_n$ to see
\[ f_n(t) = \sum_{k=1}^n \frac{\cos(n - k)t - \cos(n + k)t}{k} = 2\sin nt \sum_{k=1}^n \frac{\sin kt}{k}. \]
This closed form for $f_n$ combined with Proposition 10.12 implies the sequence of functions $f_n$ is uniformly bounded:
\[ |f_n(t)| = \left| 2\sin nt \sum_{k=1}^n \frac{\sin kt}{k} \right| \le 2 + 2\pi. \tag{10.16} \]
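At last, the main function can be defined.
\[ F(t) = \sum_{n=1}^\infty \frac{f_{2^{n^3}}(t)}{n^2} \]
The Weierstrass M-Test along with (10.16) implies the series defining $F$ is uniformly convergent and therefore $F$ is continuous on $\mathbb{R}$. Using (10.15), consider
\begin{align*}
s_{2^{m^3}}(F, 0) &= \frac{1}{2\pi} \int_{-\pi}^\pi F(t) D_{2^{m^3}}(t) \, dt \\
&= \frac{1}{2\pi} \int_{-\pi}^\pi \sum_{n=1}^\infty \frac{f_{2^{n^3}}(t)}{n^2} D_{2^{m^3}}(t) \, dt
\end{align*}
The uniform convergence allows the sum and integration to be reordered.
\begin{align*}
&= \sum_{n=1}^\infty \frac{1}{n^2} \frac{1}{2\pi} \int_{-\pi}^\pi f_{2^{n^3}}(t) D_{2^{m^3}}(t) \, dt \\
&= \sum_{n=1}^\infty \frac{1}{n^2} s_{2^{m^3}}\left( f_{2^{n^3}}, 0 \right)
\end{align*}
Use (10.15).
\begin{align*}
&= \sum_{n=m}^\infty \frac{1}{n^2} s_{2^{m^3}}\left( f_{2^{n^3}}, 0 \right) \\
&> \frac{1}{m^2} s_{2^{m^3}}\left( f_{2^{m^3}}, 0 \right) \\
&= \frac{1}{m^2} \sum_{k=1}^{2^{m^3}} \frac{1}{k} \\
&> \frac{1}{m^2} \ln 2^{m^3} \\
&= m \ln 2
\end{align*}
This implies $\limsup s_n(F, 0) \ge \lim_{m\to\infty} m \ln 2 = \infty$, so $s_n(F, 0)$ does not converge.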
7. The Fejér Kernel
Since pointwise convergence of the partial sums seems complicated, why
not change the rules of the game? Instead of looking at the sequence of partial
sums, consider a rolling average instead:
\[ \sigma_n(f, x) = \frac{1}{n + 1} \sum_{k=0}^n s_k(f, x). \]
The trigonometric polynomials σn ( f , x) are called the Cesàro means of the partial
sums. If limn→∞ σn ( f , x) exists, then the Fourier series for f is said to be (C , 1)
summable at x. The idea is that this averaging will “smooth out” the partial sums,
making them more nicely behaved. It is not hard to show that if s n ( f , x) converges
at some x, then σn ( f , x) will converge to the same thing. But there are sequences
for which σn ( f , x) converges and s n ( f , x) does not. (See Exercises 3.21 and 4.25.)
As with s n (x), we’ll simply write σn (x) instead of σn ( f , x), when it is clear
which function is being considered.
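The averaging idea can be illustrated on a plain numerical sequence before applying it to Fourier partial sums. In this sketch, which is not from the notes, the hypothetical helper `cesaro_means` is applied to the partial sums 1, 0, 1, 0, … of the divergent series 1 − 1 + 1 − 1 + ⋯.

```python
def cesaro_means(values):
    """Return the running averages sigma_n = (v_0 + ... + v_n) / (n + 1)."""
    means = []
    running_total = 0.0
    for n, v in enumerate(values):
        running_total += v
        means.append(running_total / (n + 1))
    return means

# Partial sums of 1 - 1 + 1 - 1 + ... alternate between 1 and 0.
partial_sums = [1 if k % 2 == 0 else 0 for k in range(1001)]
means = cesaro_means(partial_sums)
print(partial_sums[-3:], means[-1])
```

The partial sums never settle, but their averages approach 1/2, which is an instance of a sequence whose Cesàro means converge although the sequence itself does not.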
We start with a calculation.
\begin{align*}
\sigma_n(x) &= \frac{1}{n + 1} \sum_{k=0}^n s_k(x) \\
&= \frac{1}{n + 1} \sum_{k=0}^n \frac{1}{2\pi} \int_{-\pi}^\pi f(x - t) D_k(t) \, dt \\
&= \frac{1}{2\pi} \int_{-\pi}^\pi f(x - t) \frac{1}{n + 1} \sum_{k=0}^n D_k(t) \, dt \tag{*} \\
&= \frac{1}{2\pi} \int_{-\pi}^\pi f(x - t) \frac{1}{n + 1} \sum_{k=0}^n \frac{\sin(k + 1/2)t}{\sin t/2} \, dt \\
&= \frac{1}{2\pi} \int_{-\pi}^\pi f(x - t) \frac{1}{(n + 1)\sin^2 t/2} \sum_{k=0}^n \sin t/2 \, \sin(k + 1/2)t \, dt
\end{align*}
Use the identity $2\sin A \sin B = \cos(A - B) - \cos(A + B)$.
\[ = \frac{1}{2\pi} \int_{-\pi}^\pi f(x - t) \frac{1/2}{(n + 1)\sin^2 t/2} \sum_{k=0}^n \left( \cos kt - \cos(k + 1)t \right) dt \]
The sum telescopes.
\[ = \frac{1}{2\pi} \int_{-\pi}^\pi f(x - t) \frac{1/2}{(n + 1)\sin^2 t/2} \left( 1 - \cos(n + 1)t \right) dt \]
Use the identity $2\sin^2 A = 1 - \cos 2A$.
\[ = \frac{1}{2\pi} \int_{-\pi}^\pi f(x - t) \frac{1}{n + 1} \left( \frac{\sin\frac{n+1}{2}t}{\sin\frac{t}{2}} \right)^2 dt \tag{**} \]
FIGURE 10.8. A plot of $K_5(t)$, $K_8(t)$ and $K_{10}(t)$.
The Fejér kernel is the sequence of functions highlighted above; i.e.,
\[ K_n(t) = \frac{1}{n + 1} \left( \frac{\sin\frac{n+1}{2}t}{\sin\frac{t}{2}} \right)^2, \quad n \in \mathbb{N}. \tag{10.17} \]
Comparing the lines labeled (*) and (**) in the previous calculation, we see another form for the Fejér kernel is
\[ K_n(t) = \frac{1}{n + 1} \sum_{k=0}^n D_k(t). \tag{10.18} \]
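The agreement between the two forms (10.17) and (10.18) of the Fejér kernel can be spot-checked numerically. This is a minimal sketch with hypothetical function names, evaluated only at points where sin(t/2) is not zero.

```python
import math

def dirichlet(n, t):
    """D_n(t) = 1 + 2 * sum_{k=1}^{n} cos(kt)."""
    return 1.0 + 2.0 * sum(math.cos(k * t) for k in range(1, n + 1))

def fejer_average(n, t):
    """K_n(t) as the average of D_0, ..., D_n, as in (10.18)."""
    return sum(dirichlet(k, t) for k in range(n + 1)) / (n + 1)

def fejer_closed(n, t):
    """K_n(t) from (10.17); requires sin(t/2) != 0."""
    return (math.sin((n + 1) * t / 2) / math.sin(t / 2)) ** 2 / (n + 1)

for n in (5, 8, 10):
    for t in (0.2, 1.3, -2.7):
        print(n, t, fejer_average(n, t), fejer_closed(n, t))
```

Unlike the Dirichlet kernel, every value printed here is nonnegative, which is visible at once from the squared closed form.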
Once again, we're confronted with a convolution integral containing a kernel:
\[ \sigma_n(x) = \frac{1}{2\pi} \int_{-\pi}^\pi f(x - t) K_n(t) \, dt. \]
THEOREM 10.13. The Fejér kernel has the following properties.⁵
(a) $K_n(t)$ is an even 2π-periodic function for each $n \in \mathbb{N}$.
(b) $K_n(0) = n + 1$ for each $n \in \omega$.
(c) $K_n(t) \ge 0$ for each $n \in \mathbb{N}$.
(d) $\frac{1}{2\pi} \int_{-\pi}^\pi K_n(t) \, dt = 1$ for each $n \in \omega$.
(e) If $0 < \delta < \pi$, then $K_n \rightrightarrows 0$ on $[-\pi, -\delta] \cup [\delta, \pi]$.
(f) If $0 < \delta < \pi$, then $\int_{-\pi}^{-\delta} K_n(t) \, dt \to 0$ and $\int_\delta^\pi K_n(t) \, dt \to 0$.
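PROOF. Theorem 10.5 and (10.18) imply (a), (b) and (d). Equation (10.17) implies (c).
Let $\delta$ be as in (e). In light of (a), it suffices to prove (e) for the interval $[\delta, \pi]$. Noting that $\sin t/2$ is increasing on $[\delta, \pi]$, it follows that for $\delta \le t \le \pi$,
\begin{align*}
K_n(t) &= \frac{1}{n + 1} \left( \frac{\sin\frac{n+1}{2}t}{\sin\frac{t}{2}} \right)^2 \\
&\le \frac{1}{n + 1} \left( \frac{1}{\sin\frac{t}{2}} \right)^2 \\
&\le \frac{1}{n + 1} \cdot \frac{1}{\sin^2\frac{\delta}{2}} \to 0
\end{align*}
It follows that $K_n \rightrightarrows 0$ on $[\delta, \pi]$ and (e) has been proved.
Theorem 9.16 and (e) imply (f).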
THEOREM 10.14 (Fejér). If $f : \mathbb{R} \to \mathbb{R}$ is 2π-periodic, integrable on $[-\pi, \pi]$ and continuous at $x$, then $\sigma_n(x) \to f(x)$.
PROOF. Since $f$ is 2π-periodic and $\int_{-\pi}^\pi f(t) \, dt$ exists, so does $\int_{-\pi}^\pi (f(x - t) - f(x)) \, dt$. Theorem 8.3 gives an $M > 0$ so $|f(x - t) - f(x)| < M$ for all $t$.
Let $\varepsilon > 0$ and choose $\delta > 0$ such that $|f(x) - f(y)| < \varepsilon/3$ whenever $|x - y| < \delta$. By Theorem 10.13(f), there is an $N \in \mathbb{N}$ so that whenever $n \ge N$,
\[ \frac{1}{2\pi} \int_{-\pi}^{-\delta} K_n(t) \, dt < \frac{\varepsilon}{3M} \quad\text{and}\quad \frac{1}{2\pi} \int_\delta^\pi K_n(t) \, dt < \frac{\varepsilon}{3M}. \]
5 Compare this theorem with Lemma 9.14.
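We start calculating.
\begin{align*}
|\sigma_n(x) - f(x)| &= \left| \frac{1}{2\pi} \int_{-\pi}^\pi f(x - t) K_n(t) \, dt - \frac{1}{2\pi} \int_{-\pi}^\pi f(x) K_n(t) \, dt \right| \\
&= \left| \frac{1}{2\pi} \int_{-\pi}^\pi (f(x - t) - f(x)) K_n(t) \, dt \right| \\
&= \left| \frac{1}{2\pi} \int_{-\pi}^{-\delta} (f(x - t) - f(x)) K_n(t) \, dt + \frac{1}{2\pi} \int_{-\delta}^{\delta} (f(x - t) - f(x)) K_n(t) \, dt + \frac{1}{2\pi} \int_{\delta}^{\pi} (f(x - t) - f(x)) K_n(t) \, dt \right| \\
&\le \left| \frac{1}{2\pi} \int_{-\pi}^{-\delta} (f(x - t) - f(x)) K_n(t) \, dt \right| + \left| \frac{1}{2\pi} \int_{-\delta}^{\delta} (f(x - t) - f(x)) K_n(t) \, dt \right| + \left| \frac{1}{2\pi} \int_{\delta}^{\pi} (f(x - t) - f(x)) K_n(t) \, dt \right| \\
&< \frac{M}{2\pi} \int_{-\pi}^{-\delta} K_n(t) \, dt + \frac{1}{2\pi} \int_{-\delta}^{\delta} |f(x - t) - f(x)| K_n(t) \, dt + \cdots
\end{align*}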
, ≥, 23
∞, infinity, 27
∩, intersection, 13
∧, logical and, 12
∨, logical or, 12
lub , least upper bound, 27
n, initial segment, 111
N, natural numbers, 12
ω, nonnegative integers, 12
part ([a, b]) partitions of [a, b], 81
→ pointwise convergence, 91
P (A), power set, 12
Π, indexed product, 16
R, real numbers, 28
R (.) Riemann integral, 83
R (., ., .) Riemann sum, 82
⊂, subset, 11
, proper subset, 11
⊃, superset, 11
, proper superset, 11
∆, symmetric difference, 13
×, product (Cartesian or real), 15, 21
T , trigonometric polynomials, 101
uniform convergence, 94
∪, union, 12
Z, integers, 12
Abel’s test, 411
absolute value, 25
accumulation point, 38
almost every, 510
alternating harmonic series, 411
Alternating Series Test, 413
and ∧, 12
Archimedean Principle, 29
axioms of R
additive inverse, 21
associative laws, 21
commutative laws, 21
completeness, 28
distributive law, 21
identities, 21
multiplicative inverse, 22
order, 23
Baire category theorem, 511
Baire, RenéLouis, 511
Bertrand’s test, 410
BolzanoWeierstrass Theorem, 38, 53
bound
lower, 26
upper, 26
bounded, 26
above, 26
below, 26
Cantor, Georg, 112
diagonal argument, 210
middlethirds set, 512
cardinality, 111
countably infinite, 111
finite, 111
uncountably infinite, 112
Cartesian product, 15
Cauchy
condensation test, 45
continuous, 613
criterion, 43
Mean Value Theorem, 76
sequence, 311
CauchySchwarz Inequality, 822
ceiling function, 69
clopen set, 52
closed set, 51
closure of a set, 54
Cohen, Paul, 112
compact, 57
equivalences, 58
comparison test, 44
completeness, 26
composition, 17
connected set, 55
continuous, 65
Cauchy, 613
left, 68
right, 68
uniformly, 612
continuum hypothesis, 112, 211
convergence
pointwise, 91
uniform, 94
convolution kernel, 910
critical number, 76
critical point, 76
Darboux
integral, 86
lower integral, 86
lower sum, 84
Theorem, 78
upper integral, 86
upper sum, 84
De Morgan’s Laws, 14
dense set, 29, 510
irrational numbers, 29
rational numbers, 29
derivative, 71
chain rule, 74
rational function, 74
derived set, 52
De Morgan’s Laws, 15
diagonal argument, 2-10
differentiable function, 7-6
Dini
    Test, 10-6
    Theorem, 9-5
Dirac sequence, 9-10
Dirichlet
    conjugate kernel, 10-12
    function, 6-7
    kernel, 10-5
disconnected set, 5-5
extreme point, 7-5
Fejér
    kernel, 10-17
    theorem, 10-17
field, 2-1
    complete ordered, 2-8
field axioms, 2-1
finite cover, 5-7
floor function, 6-9
Fourier series, 10-2
full measure, 5-10
function, 1-6
    bijective, 1-7
    composition, 1-7
    constant, 1-7
    decreasing, 7-7
    differentiable, 7-6
    even, 7-17
    image of a set, 1-7
    increasing, 7-7
    injective, 1-7
    inverse, 1-8
    inverse image of a set, 1-7
    monotone, 7-7
    odd, 7-17
    one-to-one, 1-7
    onto, 1-7
    salt and pepper, 6-7
    surjective, 1-7
Fundamental Theorem of Calculus, 8-14, 8-15
geometric sequence, 3-1
Gibbs phenomenon, 10-9
Gödel, Kurt, 1-12
greatest lower bound, 2-7
Heine-Borel Theorem, 5-7
Hilbert, David, 1-12
indexed collection of sets, 1-4
indexing set, 1-4
infinity ∞, 2-7
initial segment, 1-11
integers, 1-2
integral
    Cauchy criterion, 8-9
    change of variables, 8-18
integration by parts, 8-15
intervals, 2-4
irrational numbers, 2-9
isolated point, 5-2
Kummer’s test, 4-9
least upper bound, 2-7
left-hand limit, 6-5
limit
    left-hand, 6-5
    right-hand, 6-5
    unilateral, 6-5
limit comparison test, 4-6
limit point, 5-2
limit point compact, 5-8
Lindelöf property, 5-6
L’Hôpital’s Rules, 7-12
Maclaurin series, 9-21
meager, 5-12
Mean Value Theorem, 7-7
metric, 2-5
    discrete, 2-6
    space, 2-5
    standard, 2-6
n-tuple, 1-5
natural numbers, 1-2
Nested Interval Theorem, 3-10
nested sets, 3-10, 5-4
nowhere dense, 5-11
open cover, 5-6
    finite, 5-7
open set, 5-1
or ∨, 1-2
order isomorphism, 2-8
ordered field, 2-3
ordered pair, 1-5
ordered triple, 1-5
partition, 8-1
    common refinement, 8-1
    generic, 8-1
    norm, 8-1
    refinement, 8-1
    selection, 8-2
Peano axioms, 2-1
perfect set, 5-13
portion of a set, 5-10
power series, 9-18
    analytic function, 9-21
    center, 9-18
    domain, 9-18
    geometric, 9-18
    interval of convergence, 9-18
    Maclaurin, 9-21
    radius of convergence, 9-18
    Taylor, 9-21
power set, 1-2
Raabe’s test, 4-10
ratio test, 4-7
rational function, 6-8
rational numbers, 2-2, 2-9
real numbers, R, 2-8
relation, 1-6
    domain, 1-6
    equivalence, 1-6
    function, 1-6
    range, 1-6
    reflexive, 1-6
    symmetric, 1-6
    transitive, 1-6
relative maximum, 7-5
relative minimum, 7-5
relative topology, 5-4
relatively closed set, 5-4
relatively open set, 5-4
Riemann
    integral, 8-3
    Rearrangement Theorem, 4-15
    sum, 8-2
Riemann-Lebesgue Lemma, 10-3
right-hand limit, 6-5
Rolle’s Theorem, 7-6
root test, 4-7
salt and pepper function, 6-7, 8-17
Sandwich Theorem, 3-4
Schröder-Bernstein Theorem, 1-9
sequence, 3-1
    accumulation point, 3-8
    bounded, 3-2
    bounded above, 3-2
    bounded below, 3-2
    Cauchy, 3-11
    contractive, 3-12
    convergent, 3-2
    decreasing, 3-5
    divergent, 3-2
    Fibonacci, 3-1
    functions, 9-1
    geometric, 3-1
    hailstone, 3-2
    increasing, 3-5
    lim inf, 3-9
    limit, 3-2
    lim sup, 3-9
    monotone, 3-5
    recursive, 3-1
    subsequence, 3-7
sequentially compact, 5-8
series, 4-1
    Abel’s test, 4-11
    absolutely convergent, 4-11
    alternating, 4-14
    alternating harmonic, 4-11
    Alternating Series Test, 4-13
    Bertrand’s test, 4-10
    Cauchy Criterion, 4-3
    Cauchy’s condensation test, 4-5
    Cesàro summability, 4-19
    comparison test, 4-4
    conditionally convergent, 4-11
    convergent, 4-1
    divergent, 4-1
    Fourier, 10-2
    geometric, 4-1
    Gregory’s, 9-23
    harmonic, 4-2
    Kummer’s test, 4-9
    limit comparison test, 4-6
    p-series, 4-5
    partial sums, 4-1
    positive, 4-4
    Raabe’s test, 4-10
    ratio test, 4-7
    rearrangement, 4-15
    root test, 4-7
    subseries, 4-19
    summation by parts, 4-12
    telescoping, 4-3
    terms, 4-1
set, 1-1
    clopen, 5-2
    closed, 5-1
    compact, 5-7
    complement, 1-3
    complementation, 1-3
    dense, 5-10
    difference, 1-3
    element, 1-1
    empty set, 1-2
    equality, 1-1
    Fσ, 5-12
    Gδ, 5-12
    intersection, 1-3
    meager, 5-12
    nowhere dense, 5-11
    open, 5-1
    perfect, 5-13
    proper subset, 1-1
    subset, 1-1
    symmetric difference, 1-3
    union, 1-3
square wave, 10-9
subcover, 5-6
subspace topology, 5-4
summation
    Abel, 9-23
    by parts, 4-12
    Cesàro, 4-19
Taylor series, 9-21
Taylor’s Theorem, 7-10
    integral remainder, 8-14
topology, 5-2
    finite complement, 5-15
    relative, 5-14
    right ray, 5-2, 5-14
    standard, 5-2
totally disconnected set, 5-5
trigonometric polynomial, 10-1
unbounded, 2-7
uniform continuity, 6-12
uniform metric, 9-6
    Cauchy sequence, 9-6
    complete, 9-7
unilateral limit, 6-5
Weierstrass
    Approximation Theorem, 9-8, 10-18
    M-Test, 9-7