📚 Revision: sequences and limits#

โฑ | words

References and additional materials
_images/sundaram1996.png

[Sundaram, 1996]

Section 1.2.1, 1.2.2, 1.2.3, 1.4.1

Review of the basics of real analysis (mathematical analysis studies sequences, limits, continuity, differentiation, integration, etc.)

Note

Many textbooks use bold notation for vectors, but we do not. Typically it is explicitly stated that \(x \in \mathbb{R}^n\).

Norm and distance#

Recall that multidimensional number sets (with vectors as elements) are given by Cartesian products

\[\mathbb{R}^2 = \mathbb{R} \times \mathbb{R} = \{(x, y) : x \in \mathbb{R}, \; y \in \mathbb{R}\}\]
\[\mathbb{R}^n = \times_{i \in \{1,2,\ldots,n\}} \mathbb{R} = \{(x_1, x_2, \ldots, x_n) : x_i \in \mathbb{R} \; \forall i \in \{1, 2, \ldots, n\}\}\]

Definition

The (Euclidean) norm of \(x \in \mathbb{R}^n\) is defined as

\[\| x \| = \left( \sum_{i=1}^n x_i^2 \right)^{1/2}\]
  • \(\| x \|\) represents the length of \(x\)

_images/vec.png

Fig. 12 Length of red line \(= \sqrt{x_1^2 + x_2^2} =: \|x\|\)#

  • \(\| x - y \|\) represents distance between \(x\) and \(y\)

_images/vec_minus.png

Fig. 13 Length of red line \(= \|x - y\|\)#

Fact

For any \(\alpha \in \mathbb{R}\) and any \(x, y \in \mathbb{R}^n\), the following statements are true:

  • \(\| x \| \geq 0\)

  • \(\| x \| = 0 \iff x = 0\)

  • \(\| \alpha x \| = |\alpha| \| x \|\)

Triangle inequality

  • \(\| x + y \| \leq \| x \| + \| y \|\)

In fact, any function satisfying the listed properties can be used as a norm
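As a quick sanity check, the listed properties can be verified numerically for particular vectors (illustration, not proof; the vectors, dimension and random seed below are our own arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1234)          # fixed seed for reproducibility
x, y = rng.normal(size=5), rng.normal(size=5)
alpha = -2.5

norm = lambda v: np.sqrt(np.sum(v**2))     # Euclidean norm, straight from the definition

assert norm(x) >= 0                                        # non-negativity
assert np.isclose(norm(np.zeros(5)), 0.0)                  # ||0|| = 0
assert np.isclose(norm(alpha * x), abs(alpha) * norm(x))   # absolute homogeneity
assert norm(x + y) <= norm(x) + norm(y) + 1e-12            # triangle inequality
print('all norm properties hold for this draw')
```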

Example

More general distance functions in \(\mathbb{R}^2\).

_images/metric.png

Fig. 14 Circle drawn with different norms#

Example

Naturally, in \(\mathbb{R}\) the Euclidean norm simplifies to

\[\| x \| = \sqrt{x^2} = |x|\]
Hide code cell source
import numpy as np
import matplotlib.pyplot as plt

def subplots():
    "Custom subplots with axes through the origin"
    fig, ax = plt.subplots()

    # Set the axes through the origin
    for spine in ['left', 'bottom']:
        ax.spines[spine].set_position('zero')
    for spine in ['right', 'top']:
        ax.spines[spine].set_color('none')
    
    ax.grid()
    return fig, ax

fig, ax = subplots()

ax.set_ylim(-3, 3)
ax.set_yticks((-3, -2, -1, 1, 2, 3))
x = np.linspace(-3, 3, 100)
ax.plot(x, np.abs(x), 'g-', lw=2, alpha=0.7, label=r'$f(x) = |x|$')
ax.plot(x, x, 'k--', lw=2, alpha=0.7, label=r'$f(x) = x$')
ax.legend(loc='lower right')

plt.show()

Therefore we can think of the norm as a generalization of the absolute value to \(\mathbb{R}^n\)

\(\epsilon\)-balls#

Definition

For \(\epsilon > 0\), the \(\epsilon\)-ball \(B_{\epsilon}(a)\) around \(a \in \mathbb{R}^n\) is the set of all \(x \in \mathbb{R}^n\) such that \(\|a - x\| < \epsilon\)

_images/eps_ball2D.png

Correspondingly, in one dimension \(\mathbb{R}\)

\[B_\epsilon(a) = \{ x \in \mathbb{R} : |a - x| < \epsilon \} = \{ x \in \mathbb{R} : a - \epsilon < x < a + \epsilon \}\]
_images/eps_ball1D.png
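The definition translates directly into code; a minimal sketch (the helper name `in_ball` is our own):

```python
import numpy as np

def in_ball(x, a, eps):
    """Check whether x lies in the epsilon-ball around a."""
    return np.linalg.norm(np.asarray(a, dtype=float) - np.asarray(x, dtype=float)) < eps

print(in_ball(0.9, 1.0, 0.2))        # |1.0 - 0.9| = 0.1 < 0.2, so True
print(in_ball([1, 1], [0, 0], 1.0))  # distance sqrt(2) > 1, so False
```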

The following fact is an alternative way to refer to the completeness of \(\mathbb{R}\); see the previous lecture on the density of real numbers.

Fact

If \(x\) is in every \(\epsilon\)-ball around \(a\) then \(x=a\)

Fact

If \(a \ne b\), then \(\exists \; \epsilon > 0\) such that \(B_{\epsilon}(a)\) and \(B_{\epsilon}(b)\) are disjoint.

_images/disjnt_balls0.png

Sequences#

Definition

A sequence \(\{x_k\}\) in \(\mathbb{R}^n\) is a function from \(\mathbb{N}\) to \(\mathbb{R}^n\)

To each \(k \in \mathbb{N}\) we associate one \(x_k \in \mathbb{R}^n\)

Typically written as \(\{x_k\}_{k=1}^{\infty}\) or \(\{x_k\}\) or \(\{x_1, x_2, x_3, \ldots\}\)

Example

In \(\mathbb{R}\)

  • \(\{x_k\} = \{2, 4, 6, \ldots \}\)

  • \(\{x_k\} = \{1, 1/2, 1/4, \ldots \}\)

  • \(\{x_k\} = \{1, -1, 1, -1, \ldots \}\)

  • \(\{x_k\} = \{0, 0, 0, \ldots \}\)

In \(\mathbb{R}^n\)

  • \(\{x_k\} = \big\{(2,..,2), (4,..,4), (6,..,6), \ldots \big\}\)

  • \(\{x_k\} = \big\{(1, 1/2), (1/2,1/4), (1/4,1/8), \ldots \big\}\)

Definition

Sequence \(\{x_k\}\) is called bounded if \(\exists M\in \mathbb{R}\) such that \(\|x_k\| < M\) for all \(k\).

Example

\[x_k = 1/k \quad x_k = (-1)^k \quad x_k = 2k\]
\[\text{(bounded)} \quad \text{(bounded)} \quad \text{(unbounded)}\]
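Boundedness can be illustrated (though not proved) by tabulating \(|x_k|\) over a finite range of \(k\); the cutoff of 10,000 terms below is an arbitrary choice:

```python
import numpy as np

k = np.arange(1, 10_001)

seqs = {
    '1/k':    1 / k,
    '(-1)^k': (-1.0) ** k,
    '2k':     2.0 * k,
}

for name, x in seqs.items():
    print(f'{name:8s} max |x_k| over first 10000 terms: {np.max(np.abs(x)):g}')
# 1/k and (-1)^k never exceed M = 2; 2k grows without bound as k increases
```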

Convergence and limit#

\(\mathbb{R}^1\)#

Let \(a \in \mathbb{R}\) and let \(\{x_k\}\) be a sequence

Suppose, for any \(\epsilon > 0\), we can find an \(N \in \mathbb{N}\) such that

\[x_k \in B_\epsilon(a) \text{ for all } k \geq N\]

alternatively for \(\mathbb{R}\)

\[| x_k -a | <\epsilon \text{ for all } k \geq N\]

Then \(\{x_k\}\) is said to converge to \(a\)

Definition

Sequence \(\{x_k\}\) converges to \(a \in \mathbb{R}\) if

\[\forall \, \epsilon > 0, \; \exists \, N \in \mathbb{N} \; \text{ such that } k \geq N \implies | x_k -a | <\epsilon\]

We can say

the sequence \(\{x_k\}\) is eventually in the \(\epsilon\)-ball around \(a\)

Hide code cell source
import matplotlib.pyplot as plt
import numpy as np


def fx(n):
    return 1 + 1/(n**(0.7))
def subplots(fs):
    "Custom subplots with axes through the origin"
    fig, ax = plt.subplots(figsize=fs)
    # Set the axes through the origin
    for spine in ['left', 'bottom']:
        ax.spines[spine].set_position('zero')
    for spine in ['right', 'top']:
        ax.spines[spine].set_color('none')
    return fig, ax
def plot_seq(N,epsilon,a,fn):
    fig, ax = subplots((9, 5))  
    xmin, xmax = 0.5, N+1
    ax.set_xlim(xmin, xmax)
    ax.set_ylim(0, 2.1)
    n = np.arange(1, N+1)
    ax.set_xticks([])
    ax.plot(n, fn(n), 'ko', label=r'$x_k$', alpha=0.8)
    ax.hlines(a, xmin, xmax, color='k', lw=0.5, label='$a$')
    ax.hlines([a - epsilon, a + epsilon], xmin, xmax, color='k', lw=0.5, linestyles='dashed')
    ax.fill_between((xmin, xmax), a - epsilon, a + epsilon, facecolor='blue', alpha=0.1)
    ax.set_yticks((a - epsilon, a, a + epsilon))
    ax.set_yticklabels((r'$a - \epsilon$', r'$a$', r'$a + \epsilon$'))
    ax.legend(loc='upper right', frameon=False, fontsize=14)
    plt.show()

N = 50
a = 1
plot_seq(N,0.30,a,fx)
plot_seq(N,0.20,a,fx)
plot_seq(N,0.10,a,fx)

Definition

The point \(a\) is called the limit of the sequence, denoted

\[x_k \to a \text{ as } k \to \infty \quad \text{ or } \quad \lim_{k \to \infty} x_k = a,\]

if

\[\forall \, \epsilon > 0, \; \exists \, N \in \mathbb{N} \; \text{ such that } k \geq N \implies |x_k - a|< \epsilon\]

Example

\(\{x_k\}\) defined by \(x_k = 1 + 1/k\) converges to \(1\):

\[x_k \to 1 \; \mathrm{as} \; k \to \infty \quad\text{or}\quad \lim_{k \to \infty} x_k = 1\]

To prove this we must show that \(\forall \, \epsilon > 0\), there is an \(N \in \mathbb{N}\) such that

\[k \geq N \implies |x_k - 1| < \epsilon\]

To show this formally we need to come up with an “algorithm”

  1. You give me any \(\epsilon > 0\)

  2. I respond with an \(N\) such that equation above holds

In general, as \(\epsilon\) shrinks, \(N\) will have to grow

Proof:

Pick an arbitrary \(\epsilon > 0\)

Now we have to come up with an \(N\) such that

\[k \geq N \implies |1 + 1/k - 1| < \epsilon\]

Let \(N\) be the first integer greater than \( 1/\epsilon\)

Then

\[k \geq N \implies k > 1/\epsilon \iff 1/k < \epsilon \iff |1 + 1/k - 1| < \epsilon \]

\(\blacksquare\)

Remark: Any \(N' > N\) would also work
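The \(\epsilon \to N\) “algorithm” from this proof can be checked numerically; a sketch (testing finite ranges of \(k\) gives evidence, not a proof):

```python
import math

def N_for(eps):
    """First integer greater than 1/eps, as in the proof."""
    return math.floor(1 / eps) + 1

for eps in (0.5, 0.1, 0.01):
    N = N_for(eps)
    # check the implication k >= N  =>  |x_k - 1| < eps on a finite range of k
    ok = all(abs((1 + 1/k) - 1) < eps for k in range(N, N + 1000))
    print(f'eps = {eps}: N = {N}, implication holds on tested range: {ok}')
```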

Example

The sequence \(x_k = 2^{-k}\) converges to \(0\) as \(k \to \infty\)

Proof:

Must show that, \(\forall \, \epsilon > 0\), \(\exists \, N \in \mathbb{N}\) such that

\[k \geq N \implies |2^{-k} - 0| < \epsilon\]

So pick any \(\epsilon > 0\), and observe that

\[|2^{-k} - 0| < \epsilon \; \iff \; 2^{-k} < \epsilon \; \iff \; k > - \frac{\ln \epsilon}{\ln 2}\]

Hence we take \(N\) to be the first integer greater than \(- \ln \epsilon / \ln 2\)

Then

\[k \geq N \implies k > -\frac{\ln \epsilon}{\ln 2} \iff |2^{-k} - 0| < \epsilon\]

What if we want to show that \(x_k \to a\) fails?

To show convergence fails we need to show the negation of

\[\forall \,\; \epsilon > 0, \;\; \exists \,\; N \in \mathbb{N} \;\mathrm{such\;that}\; k \geq N \implies x_k \in B_{\epsilon}(a)\]

In words, there is an \(\epsilon > 0\) for which we can’t find any such \(N\)

That is, for any choice of \(N\) there will be some \(k > N\) such that \(x_k\) jumps outside \(B_{\epsilon}(a)\)

In other words, there exists a \(B_\epsilon(a)\) such that \(x_k \notin B_\epsilon(a)\) again and again as \(k \to \infty\).

This is the kind of picture we’re thinking of

Hide code cell source
def fx2(n):
    return 1 + 1/(n**(0.7)) - 0.3 * (n % 8 == 0)

N = 80
a = 1
plot_seq(N,0.15,a,fx2)

Example

The sequence \(x_k = (-1)^k\) does not converge to any \(a \in \mathbb{R}\)

Proof:

This is what we want to show

\[\exists \,\; \epsilon > 0 \;\text{ such that } x_k \notin B_{\epsilon}(a) \text{ infinitely many times as } k \to \infty\]

Since it’s a “there exists” statement, we need to come up with such an \(\epsilon\)

Let’s try \(\epsilon = 0.5\), so that

\[B_\epsilon(a) = \{ x \in \mathbb{R} : |x - a| < 0.5 \} = (a-0.5, a+0.5 )\]

We have:

  • If \(k\) is odd then \(x_k = -1\) when \(k > N\) for any \(N\).

  • If \(k\) is even then \(x_k = 1\) when \(k > N\) for any \(N\).

Therefore, whether \(a=1\), \(a=-1\), or any other value in \(\mathbb{R}\), we have \(x_k \notin B_\epsilon(a)\) infinitely many times as \(k \to \infty\).

\(\blacksquare\)

\(\mathbb{R}^n\)#

Definition

Sequence \(\{x_k\}\) is said to converge to \(a \in \mathbb{R}^n\) if

\[\forall \epsilon > 0, \; \exists \, N \in \mathbb{N} \; \text{ such that } \; k \geq N \implies x_k \in B_{\epsilon}(a)\]

We can say again that

\(\{x_k\}\) is eventually in any \(\epsilon\)-neighborhood of \(a\)

In this case \(a\) is called the limit of the sequence, and as in one-dimensional case, we write

\[x_k \to a \; \text{ as } \; k \to \infty \quad \text{or} \quad \lim_{k \to \infty} x_k = a\]
_images/convergence.png
_images/convergence2.png
_images/convergence3.png

Definition

We call \(\{ x_k \}\) convergent if it converges to some limit in \(\mathbb{R}^n\)

Vector vs Componentwise Convergence#

Fact

A sequence \(\{x_k\}\) in \(\mathbb{R}^n\) converges to \(a \in \mathbb{R}^n\) if and only if each component sequence converges in \(\mathbb{R}\)

That is,

\[\begin{split}\begin{pmatrix} x^1_k \\ \vdots \\ x^n_k \end{pmatrix} \to \begin{pmatrix} a^1 \\ \vdots \\ a^n \end{pmatrix} \quad \text{in } \mathbb{R}^n \quad \iff \quad \begin{array}{cc} x^1_k \to a^1 & \quad \text{in } \mathbb{R} \\ \vdots & \\ x^n_k \to a^n & \quad \text{in } \mathbb{R} \end{array}\end{split}\]
_images/norm_and_pointwise.png
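A numerical illustration of the equivalence, using the sequence \(x_k = (1/k, \, 2^{-k}) \to (0,0)\) (the specific sequence is our own choice): the vector distance \(\|x_k - a\|\) and the largest componentwise distance shrink together, since each bounds the other up to a constant.

```python
import numpy as np

k = np.arange(1, 60)
xk = np.column_stack((1 / k, 2.0 ** (-k)))   # k-th row is the vector x_k

vector_dist = np.linalg.norm(xk, axis=1)     # ||x_k - (0, 0)||
print(vector_dist[[0, 9, 49]])               # shrinks toward 0

comp_max = np.max(np.abs(xk), axis=1)        # largest componentwise distance
assert np.all(comp_max <= vector_dist + 1e-15)                 # components <= norm
assert np.all(vector_dist <= np.sqrt(2) * comp_max + 1e-15)    # norm <= sqrt(n) * components
```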

Properties of limit#

The following facts prove to be very useful in applications and problem sets

Fact

  1. \(x_k \to a\) in \(\mathbb{R}^n\) if and only if \(\|x_k - a\| \to 0\) in \(\mathbb{R}\)

  2. If \(x_k \to x\) and \(y_k \to y\) then \(x_k + y_k \to x + y\)

  3. If \(x_k \to x\) and \(\alpha \in \mathbb{R}\) then \(\alpha x_k \to \alpha x\)

  4. If \(x_k \to x\) and \(y_k \to y\) then \(x_k y_k \to xy\)

  5. If \(x_k \to x\) and \(y_k \to y\) then \(x_k / y_k \to x/y\), provided \(y_k \ne 0\) for all \(k\) and \(y \ne 0\)

  6. If \(x_k \to x\) then \(x_k^p \to x^p\)
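These limit rules can be illustrated numerically with \(x_k = 1/k \to 0\) and \(y_k = 1 + 1/k \to 1\) (a sketch giving numeric evidence only; the tail index is chosen by hand):

```python
import numpy as np

k = np.arange(1, 100_001)
x = 1 / k          # x_k -> 0
y = 1 + 1 / k      # y_k -> 1

# evaluate each combination at the largest tested k and compare with the claimed limit
print(abs((x + y)[-1] - (0 + 1)))    # rule 2: sum of the limits
print(abs((2.5 * x)[-1] - 2.5 * 0))  # rule 3: scalar multiple
print(abs((x * y)[-1] - 0 * 1))      # rule 4: product of the limits
print(abs((x / y)[-1] - 0 / 1))      # rule 5: ratio of the limits (y_k, y nonzero)
```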

Fact

Each sequence in \(\mathbb{R}^n\) has at most one limit

Fact

Every convergent sequence is bounded

Cauchy sequences#

Informal definition: Cauchy sequences are those where \(|x_k - x_{k+1}|\) gets smaller and smaller

_images/cauchy.png

Example

Sequences generated by iterative methods for solving nonlinear equations often have this property

Hide code cell source
f = lambda x: -4*x**3+5*x+1
g = lambda x: -12*x**2+5

def newton(fun,grad,x0,tol=1e-6,maxiter=100,callback=None):
    '''Newton method for solving equation f(x)=0
    with given tolerance and number of iterations.
    Callback function is invoked at each iteration if given.
    '''
    for i in range(maxiter):
        x1 = x0 - fun(x0)/grad(x0)
        err = abs(x1-x0)
        if callback is not None: callback(err=err,x0=x0,x1=x1,iter=i)
        if err<tol: break
        x0 = x1
    else:
        raise RuntimeError('Failed to converge in %d iterations'%maxiter)
    return (x0+x1)/2

def print_err(iter,err,**kwargs):
    x = kwargs['x'] if 'x' in kwargs.keys() else kwargs['x0']
    print('{:4d}:  x = {:14.8f}    diff = {:14.10f}'.format(iter,x,err))

print('Newton method')
res = newton(f,g,x0=123.45,callback=print_err,tol=1e-10)
Newton method
   0:  x =   123.45000000    diff =  41.1477443465
   1:  x =    82.30225565    diff =  27.4306976138
   2:  x =    54.87155804    diff =  18.2854286376
   3:  x =    36.58612940    diff =  12.1877193931
   4:  x =    24.39841001    diff =   8.1212701971
   5:  x =    16.27713981    diff =   5.4083058492
   6:  x =    10.86883396    diff =   3.5965889909
   7:  x =     7.27224497    diff =   2.3839931063
   8:  x =     4.88825187    diff =   1.5680338561
   9:  x =     3.32021801    diff =   1.0119341175
  10:  x =     2.30828389    diff =   0.6219125117
  11:  x =     1.68637138    diff =   0.3347943714
  12:  x =     1.35157701    diff =   0.1251775194
  13:  x =     1.22639949    diff =   0.0188751183
  14:  x =     1.20752437    diff =   0.0004173878
  15:  x =     1.20710698    diff =   0.0000002022
  16:  x =     1.20710678    diff =   0.0000000000

Definition

A sequence \(\{x_k\}\) is called Cauchy if

\[\forall \; \epsilon > 0, \;\; \exists \; N \in \mathbb{N} \; \mathrm{such\;that}\; k, m \geqslant N \implies \| x_k - x_m \| < \epsilon\]

Alternatively

\[\forall \; \epsilon > 0, \;\; \exists \; N \in \mathbb{N} \; \mathrm{such\;that}\; k \geqslant N, \; j \geqslant 1 \implies \| x_k - x_{k+j} \| < \epsilon\]

Cauchy sequences allow us to establish convergence without finding the limit itself!

Fact

Every convergent sequence is Cauchy, and every Cauchy sequence is convergent.

Example

\(\{x_k\}\) defined by \(x_k = \alpha^k\) where \(\alpha \in (0, 1)\) is Cauchy

Proof:

For any \(k , j\) we have

\[|x_k - x_{k+j}| = |\alpha^k - \alpha^{k+j}| = \alpha^k |1 - \alpha^j| \leq \alpha^k\]

Fix \(\epsilon > 0\)

We can show that \(k > \log(\epsilon) / \log(\alpha) \implies \alpha^k < \epsilon\)

Hence, taking any integer \(N > \log(\epsilon) / \log(\alpha)\), we have \(k \geq N \implies |x_k - x_{k+j}| \leq \alpha^k < \epsilon\) for all \(j \geq 1\), so the sequence is Cauchy by definition.

\(\blacksquare\)
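The choice of \(N\) in this proof can be checked numerically for a concrete \(\alpha\) and \(\epsilon\) (our own parameter choices; a finite grid of indices gives evidence, not a proof):

```python
import math

alpha, eps = 0.9, 1e-3
# first integer above log(eps) / log(alpha), as in the proof
N = math.floor(math.log(eps) / math.log(alpha)) + 1
print('N =', N)

ks = range(N, N + 200)
worst = max(abs(alpha**k - alpha**m) for k in ks for m in ks)
print('largest |x_k - x_m| on tested grid:', worst)
assert worst < eps
```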

Subsequences#

Definition

A sequence \(\{x_{k_j} \}\) is called a subsequence of \(\{x_k\}\) if

  1. \(\{x_{k_j} \}\) is a subset of \(\{x_k\}\)

  2. \(\{k_j\}\) is a strictly increasing sequence of natural numbers

Example

\[\{x_k\} = \{x_1, x_2, x_3, x_4, x_5, \ldots\} \]
\[\{x_{k_j}\} = \{x_2, x_4, x_6, x_8 \ldots\} \]

In this case

\[\{k_j\} = \{k_1, k_2, k_3, \ldots\} = \{2, 4, 6, \ldots\}\]

Example

\(\{\frac{1}{1}, \frac{1}{3}, \frac{1}{5},\ldots\}\) is a subsequence of \(\{\frac{1}{1}, \frac{1}{2}, \frac{1}{3}, \ldots\}\)

\(\{\frac{1}{1}, \frac{1}{2}, \frac{1}{3},\ldots\}\) is a subsequence of \(\{\frac{1}{1}, \frac{1}{2}, \frac{1}{3}, \ldots\}\)

\(\{\frac{1}{2}, \frac{1}{2}, \frac{1}{2},\ldots\}\) is not a subsequence of \(\{\frac{1}{1}, \frac{1}{2}, \frac{1}{3}, \ldots\}\)

Fact

If \(\{ x_k \}\) converges to \(x\) in \(\mathbb{R}^n\), then every subsequence of \(\{x_k\}\) also converges to \(x\)

_images/subseqconverg.png

Fig. 15 Convergence of subsequences#

Bolzano-Weierstrass theorem#

This leads us to a famous theorem which will form part of the proof of the central Weierstrass extreme value theorem, which provides conditions for the existence of a maximum and minimum of a function.

Fact: Bolzano-Weierstrass theorem

Every bounded sequence in \(\mathbb{R}^n\) has a convergent subsequence
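The theorem only asserts existence, but for a concrete bounded sequence a convergent subsequence is easy to exhibit: take \(x_k = (-1)^k(1 + 1/k)\), which is bounded but not convergent, and select the even-indexed terms (the example is our own illustration, not part of the theorem's proof):

```python
import numpy as np

k = np.arange(1, 2001)
x = (-1.0)**k * (1 + 1/k)    # bounded: |x_k| <= 2, but the sequence does not converge

even = x[k % 2 == 0]         # subsequence x_2, x_4, x_6, ...
print(even[:3], '... ->', even[-1])      # approaches 1
assert np.all(np.abs(x) <= 2)            # boundedness
assert abs(even[-1] - 1) < 1e-3          # even-indexed subsequence converges to 1
```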

Limits for functions#

Consider a univariate function \(f: \mathbb{R} \rightarrow \mathbb{R}\) written as \(f(x)\)

The domain of \(f\) is \(\mathbb{R}\), the range of \(f\) is a (possibly improper) subset of \(\mathbb{R}\)

[Spivak, 2006] (p. 90) provides the following informal definition of a limit: โ€œThe function \(f\) approaches the limit \(l\) near [the point \(x =\)] \(a\), if we can make \(f(x)\) as close as we like to \(l\) by requiring that \(x\) be sufficiently close to, but unequal to, \(a\).โ€

Definition

Given the function \(f: A \subset \mathbb{R}^n \to \mathbb{R}\), \(a \in \mathbb{R}\) is a limit of \(f(x)\) as \(x \to x_0 \in A\) if

\[\forall \, \epsilon > 0, \; \exists \, \delta >0 \; \text{ such that } 0 < \|x - x_0\| < \delta \implies |f(x) - a| < \epsilon\]

In this case we write

\[ \lim_{x \rightarrow x_0} f(x) = a \quad \text{ or } \quad f(x) \to a \text{ as } x \to x_0 \]
_images/func_lim_epsilondelta.png

Fig. 16 \(\lim_{x \to x_0} f(x) = L\) under the Cauchy definition#

The absolute value \(|\cdot|\) in this definition plays the role of a general distance function \(\|\cdot\|\), which would be appropriate for functions from \(\mathbb{R}^n\) to \(\mathbb{R}^m\).

Note

Observe that the definition of the limit for functions is very similar to the definition of the limit for sequences, except

  • \(N \in \mathbb{N}\) is replaced by \(\delta > 0\),

  • \(k \geq N\) is replaced by \(\forall x, \, 0 < \|x - x_0\| < \delta\), and

  • \(|x_k - a| < \epsilon\) is replaced by \(|f(x) - a| < \epsilon\)

  • note that in the \(\delta\)-ball around \(x_0\) the point \(x_0\) itself is excluded

Heine and Cauchy definitions of the limit for functions#

As a side note, let’s mention that there are two ways to define the limit of a function \(f\).

The definition of the limit for functions above, also known as the \(\epsilon\)-\(\delta\) definition, is due to Augustin-Louis Cauchy.

The alternative definition, based on convergent sequences in the domain of the function, is due to Eduard Heine

Definition (Limit of a function due to Heine)

Given the function \(f: A \subset \mathbb{R}^n \to \mathbb{R}\), \(a \in \mathbb{R}\) is a limit of \(f(x)\) as \(x \to x_0 \in A\) if

\[\forall \{x_k\} \subset A \colon \lim_{k \to \infty} x_k = x_0 \quad \implies \quad f(x_k) \to a \]

Fact

Cauchy and Heine definitions of the function limit are equivalent

Therefore we can use the same notation for the limit of a function under either definition

\[\lim_{x \to x_0} f(x) = a \quad \text{or} \quad f(x) \to a \; \text{as} \; x \to x_0\]

The structure of \(\epsilon\)-\(\delta\) arguments#

Suppose that we want to show that \(\lim _{x \rightarrow a} f(x)=b\).

In order to do this, we need to show that, for any choice \(\epsilon>0\), there exists some \(\delta_{\epsilon}>0\) such that, whenever \(0<|x-a|<\delta_{\epsilon}\), it is the case that \(|f(x)-b|<\epsilon\).

  • We write \(\delta_{\epsilon}\) to indicate that the choice of \(\delta\) is allowed to vary with the choice of \(\epsilon\).

An often fruitful approach to the construction of a formal \(\epsilon\)-\(\delta\) limit argument is to proceed as follows:

  1. Start with the end-point that we need to establish: \(|f(x)-b|<\epsilon\).

  2. Use appropriate algebraic rules to rearrange this โ€œfinalโ€ inequality into something of the form \(|g(x)(x-a)|<\epsilon\).

  3. This new version of the required inequality can be rewritten as \(|g(x)||(x-a)|<\epsilon\).

  4. If \(g(x)=g\), a constant that does not vary with \(x\), then this inequality becomes \(|g||x-a|<\epsilon\). In such cases, we must have \(|x-a|<\frac{\epsilon}{|g|}\), so that an appropriate choice of \(\delta_{\epsilon}\) is \(\delta_{\epsilon}=\frac{\epsilon}{|g|}\).

  5. If \(g(x)\) does vary with \(x\), then we have to work a little bit harder.

Suppose that \(g(x)\) does vary with \(x\). How might we proceed in that case? One possibility is to see if we can find a restriction on the range of values for \(\delta\) that we consider that will allow us to place an upper bound on the value taken by \(|g(x)|\).

In other words, we try to find some restriction on \(\delta\) that will ensure that \(|g(x)|<G\) for some finite \(G>0\). The type of restriction on the values of \(\delta\) that you choose would ideally look something like \(\delta<D\), for some fixed real number \(D>0\). (The reason for this is that it is typically small deviations of \(x\) from \(a\) that will cause us problems, rather than large deviations of \(x\) from \(a\).)

If \(0<|g(x)|<G\) whenever \(|x-a|<\delta\) and \(0<\delta<D\), then we have

\[ |g(x)||x-a|<\epsilon \Longleftarrow G|x-a|<\epsilon \Longleftrightarrow|x-a|<\frac{\epsilon}{G} . \]

In such cases, an appropriate choice of \(\delta_{\epsilon}\) is \(\delta_{\epsilon}=\min \left\{\frac{\epsilon}{G}, D\right\}\).

Properties of the limits#

  • In practice, we would like to be able to find at least some limits without having to resort to the formal “epsilon-delta” arguments that define them. The following rules can sometimes assist us with this.

Fact

Let \(c \in \mathbb{R}\) be a fixed constant, \(a \in \mathbb{R}, \alpha \in \mathbb{R}, \beta \in \mathbb{R}, n \in \mathbb{N}\), \(f: \mathbb{R} \longrightarrow \mathbb{R}\) be a function for which \(\lim _{x \rightarrow a} f(x)=\alpha\), and \(g: \mathbb{R} \longrightarrow \mathbb{R}\) be a function for which \(\lim _{x \rightarrow a} g(x)=\beta\). The following rules apply for limits:

  • \(\lim _{x \rightarrow a} c=c\) for any \(a \in \mathbb{R}\).

  • \(\lim _{x \rightarrow a}(c f(x))=c\left(\lim _{x \rightarrow a} f(x)\right)=c \alpha\).

  • \(\lim _{x \rightarrow a}(f(x)+g(x))=\left(\lim _{x \rightarrow a} f(x)\right)+\left(\lim _{x \rightarrow a} g(x)\right)=\alpha+\beta\).

  • \(\lim _{x \rightarrow a}(f(x) g(x))=\left(\lim _{x \rightarrow a} f(x)\right)\left(\lim _{x \rightarrow a} g(x)\right)=\alpha \beta\).

  • \(\lim _{x \rightarrow a}\frac{f(x)}{g(x)}=\frac{\lim _{x \rightarrow a} f(x)}{\lim _{x \rightarrow a} g(x)}=\frac{\alpha}{\beta}\) whenever \(\beta \neq 0\).

  • \(\lim _{x \rightarrow a} \sqrt[n]{f(x)}=\sqrt[n]{\lim _{x \rightarrow a} f(x)}=\sqrt[n]{\alpha}\) whenever \(\sqrt[n]{\alpha}\) is defined.

  • \(\lim _{x \rightarrow a} \ln f(x)=\ln \lim _{x \rightarrow a} f(x)=\ln \alpha\) whenever \(\ln \alpha\) is defined.

  • \(\lim _{x \rightarrow a} e^{f(x)}=\exp\big(\lim _{x \rightarrow a} f(x)\big)=e^{\alpha}\)

  • L’Hôpital’s rule: in the indeterminate cases \(\frac{0}{0}\) and \(\frac{\pm\infty}{\pm\infty}\), \(\lim _{x \rightarrow a}\frac{f(x)}{g(x)}=\lim _{x \rightarrow a}\frac{f'(x)}{g'(x)}\).

Example

  • This example is drawn from Willis Lauritz Peterson of the University of Utah.

  • Consider the mapping \(f: \mathbb{R} \longrightarrow \mathbb{R}\) defined by \(f(x)=7 x-4\). We want to show that \(\lim _{x \rightarrow 2} f(x)=10\).

  • Note that \(|f(x)-10|=|7 x-4-10|=|7 x-14|=|7(x-2)|=\) \(|7||x-2|=7|x-2|\).

  • We require \(|f(x)-10|<\epsilon\). Note that

\[ |f(x)-10|<\epsilon \Longleftrightarrow 7|x-2|<\epsilon \Longleftrightarrow|x-2|<\frac{\epsilon}{7} \]
  • Thus, for any \(\epsilon>0\), if \(\delta_{\epsilon}=\frac{\epsilon}{7}\), then \(|f(x)-10|<\epsilon\) whenever \(|x-2|<\delta_{\epsilon}\).
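This \(\delta_{\epsilon}=\frac{\epsilon}{7}\) recipe can be stress-tested numerically by sampling points strictly inside the \(\delta\)-interval (illustration only; the grid of test points is our own choice):

```python
import numpy as np

f = lambda x: 7 * x - 4

for eps in (1.0, 0.1, 0.001):
    delta = eps / 7                    # the delta from the argument above
    # sample points strictly inside (2 - delta, 2 + delta)
    x = np.linspace(2 - delta, 2 + delta, 1001)[1:-1]
    assert np.all(np.abs(f(x) - 10) < eps)
    print(f'eps = {eps}: |f(x) - 10| < eps holds on the sampled interval')
```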

Example

  • Consider the mapping \(f: \mathbb{R} \longrightarrow \mathbb{R}\) defined by \(f(x)=x^{2}\). We want to show that \(\lim _{x \rightarrow 2} f(x)=4\).

  • Note that \(|f(x)-4|=\left|x^{2}-4\right|=|(x+2)(x-2)|=|x+2||x-2|\).

  • Suppose that \(|x-2|<\delta\), which in turn means that \((2-\delta)<x<(2+\delta)\). Thus we have \((4-\delta)<(x+2)<(4+\delta)\).

  • Let us restrict attention to \(\delta \in(0,1)\). This gives us \(3<(x+2)<5\), so that \(|x+2|<5\).

  • Thus, when \(|x-2|<\delta\) and \(\delta \in(0,1)\), we have \(|f(x)-4|=|x+2||x-2|<5 \delta\).

  • We require \(|f(x)-4|<\epsilon\). One way to ensure this is to set \(\delta_{\epsilon}=\min \left(1, \frac{\epsilon}{5}\right)\).

Example

This example is drawn from Willis Lauritz Peterson of the University of Utah.

Consider the mapping \(f: \mathbb{R} \rightarrow \mathbb{R}\) defined by \(f(x) = x^2 - 3x + 1\).

We want to show that \(\lim_{x \rightarrow 2} f(x) = -1\).

  • Note that \(|f(x) - (-1)| = |x^2 - 3x + 1 + 1| = |x^2 - 3x + 2| = |(x - 1)(x - 2)| = |x - 1||x - 2|\).

  • Suppose that \(|x - 2| < \delta\), which in turn means that \((2 - \delta) < x < (2 + \delta)\). Thus we have \((1 - \delta) < (x - 1) < (1 + \delta)\).

  • Let us restrict attention to \(\delta \in (0, 1)\). This gives us \(0 < (x - 1) < 2\), so that \(|x - 1| < 2\).

  • Thus, when \(|x - 2| < \delta\) and \(\delta \in (0, 1)\), we have \(|f(x) - (-1)| = |x - 1||x - 2| < 2\delta\).

  • We require \(|f(x) - (-1)| < \epsilon\). One way to ensure this is to set \(\delta_\epsilon = \min(1, \frac{\epsilon}{2})\).

Example

Limits can sometimes exist even when the function being considered is not so well behaved. One such example is provided by [Spivak, 2006] (pp. 91โ€“92). It involves the use of a trigonometric function.

The example involves the function \(f: \mathbb{R} \setminus \{0\} \rightarrow \mathbb{R}\) that is defined by \(f(x) = x \sin(\frac{1}{x})\).

Clearly this function is not defined when \(x = 0\). Furthermore, it can be shown that \(\lim_{x \rightarrow 0} \sin(\frac{1}{x})\) does not exist. However, it can also be shown that \(\lim_{x \rightarrow 0} f(x) = 0\).

The reason for this is that \(\sin(\theta) \in [-1, 1]\) for all \(\theta \in \mathbb{R}\). Thus \(\sin(\frac{1}{x})\) is bounded above and below by finite numbers as \(x \rightarrow 0\). This allows the \(x\) component of \(x \sin(\frac{1}{x})\) to dominate as \(x \rightarrow 0\).
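The squeeze reasoning can be checked numerically: since \(|\sin(1/x)| \le 1\), we have \(|x \sin(1/x)| \le |x|\), which forces the limit \(0\) (a sketch; the grid of points is our own arbitrary choice):

```python
import numpy as np

x = np.linspace(-0.5, 0.5, 100_001)
x = x[x != 0]                       # f is not defined at 0
f = x * np.sin(1 / x)

assert np.all(np.abs(f) <= np.abs(x) + 1e-15)   # the squeeze bound |f(x)| <= |x|
near0 = np.abs(x) < 1e-3
print('max |f(x)| for |x| < 1e-3:', np.max(np.abs(f[near0])))   # tiny, consistent with limit 0
```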

Example

Limits do not always exist. In this example, we consider a case in which the limit of a function as \(x\) approaches a particular point does not exist.

Consider the mapping \(f: \mathbb{R} \rightarrow \mathbb{R}\) defined by

\[\begin{split} f(x) = \begin{cases} 1 \quad \text{ if } x \geq 5; \\ 0 \quad \text{ if } x < 5 \end{cases} \end{split}\]

We want to show that \(\lim_{x \rightarrow 5} f(x)\) does not exist.

Suppose that the limit does exist. Denote the limit by \(l\). Recall that \(d(x, y) = \{(y - x)^2\}^{\frac{1}{2}} = |y - x|\). Let \(\delta > 0\).

If \(|5 - x| < \delta\), then \(5 - \delta < x < 5 + \delta\), so that \(x \in (5 - \delta, 5 + \delta)\).

Note that \(x \in (5 - \delta, 5 + \delta) = (5 - \delta, 5) \cup [5, 5 + \delta)\), where \((5 - \delta, 5) \ne \varnothing\) and \([5, 5 + \delta) \ne \varnothing\).

Thus we know the following:

  1. There exists some \(x \in (5 - \delta, 5) \subseteq (5 - \delta, 5 + \delta)\), so that \(f(x) = 0\) for some \(x \in (5 - \delta, 5 + \delta)\).

  2. There exists some \(x \in [5, 5 + \delta) \subseteq (5 - \delta, 5 + \delta)\), so that \(f(x) = 1\) for some \(x \in (5 - \delta, 5 + \delta)\).

  3. The image set under \(f\) of \((5 - \delta, 5 + \delta)\) is \(f((5 - \delta, 5 + \delta)) = \{ f(x) : x \in (5 - \delta, 5 + \delta) \} = \{ 0, 1 \} \subseteq [0, 1]\).

  4. Note that for any choice of \(\delta > 0\), \([0, 1]\) is the smallest connected interval that contains the image set \(f((5 - \delta, 5 + \delta)) = \{ 0, 1 \}\).

Hence, in order for the limit to exist, we need \([0, 1] \subseteq (l - \epsilon, l + \epsilon)\) for all \(\epsilon > 0\). But for \(\epsilon \in (0, \frac{1}{2})\), there is no \(l \in \mathbb{R}\) for which this is the case. Thus we can conclude that \(\lim_{x \rightarrow 5} f(x)\) does not exist.
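The argument can be mirrored numerically via the Heine characterization: two sequences approaching \(5\) from different sides produce different limits of \(f(x_k)\) (a sketch; the sequences \(5 \pm 1/k\) are our own choice):

```python
f = lambda x: 1.0 if x >= 5 else 0.0

left  = [f(5 - 1/k) for k in range(1, 8)]   # a sequence approaching 5 from below
right = [f(5 + 1/k) for k in range(1, 8)]   # ... and one from above

print('from the left: ', left)    # all 0.0
print('from the right:', right)   # all 1.0
# f(x_k) converges to different values for different sequences x_k -> 5,
# so by the Heine definition the limit at 5 cannot exist
```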

Continuity of functions#

A fundamental property of functions, required not only to establish the existence of optima and optimizers, but also of roots, fixed points, etc.

Definition

Let \(f \colon A \subset \mathbb{R}^n \to \mathbb{R}\)

\(f\) is called continuous at \(x \in A\) if

\[\forall \{x_k\} \subset A \colon \lim_{k \to \infty} x_k = x \quad \implies \quad f(x_k) \to f(x) \]

Note that the definition requires that

  • \(f(x_k)\) converges for each choice of \(x_k \to x\),

  • the limit is always the same, and that limit is \(f(x)\)

Definition

\(f: A \to \mathbb{R}\) is called continuous if it is continuous at every \(x \in A\)

_images/cont_func.png

Fig. 17 Continuous function#

Example

Function \(f(x) = \exp(x)\) is continuous at \(x=0\)

Proof:

Consider any sequence \(\{x_k\}\) which converges to \(0\)

We want to show that for any \(\epsilon>0\) there exists \(N\) such that \(k \geq N \implies |f(x_k) - f(0)| < \epsilon\). We have

\[\begin{split}\begin{array}{l} |f(x_k) - f(0)| = |\exp(x_k) - 1| < \epsilon \quad \iff \\ \exp(x_k) - 1 < \epsilon \; \text{ and } \; \exp(x_k) - 1 > -\epsilon \quad \iff \\ x_k < \ln(1+\epsilon) \; \text{ and } \; x_k > \ln(1-\epsilon) \quad \Longleftarrow \\ | x_k - 0 | < \min \big(\ln(1+\epsilon), -\ln(1-\epsilon) \big) = \ln(1+\epsilon) \end{array}\end{split}\]

Here we assume \(\epsilon < 1\) without loss of generality, so that \(\ln(1-\epsilon)\) is defined; the minimum equals \(\ln(1+\epsilon)\) because \(1+\epsilon < 1/(1-\epsilon)\).

Because \(x_k \to 0\), for \(\epsilon' = \ln(1+\epsilon)\) there exists \(N\) such that \(k \geq N \implies |x_k - 0| < \epsilon'\). Hence \(f(x_k) \to f(0)\) by definition, and \(f\) is continuous at \(x=0\).

\(\blacksquare\)

Fact

Some functions known to be continuous on their domains:

  • \(f: x \mapsto x^\alpha\)

  • \(f: x \mapsto |x|\)

  • \(f: x \mapsto \log(x)\)

  • \(f: x \mapsto \exp(x)\)

  • \(f: x \mapsto \sin(x)\)

  • \(f: x \mapsto \cos(x)\)

Types of discontinuities#

_images/4-types-of-discontinuity.png

Fig. 18 4 common types of discontinuity#

Example

The indicator function \(x \mapsto \mathbb{1}\{x > 0\}\) has a jump discontinuity at \(0\).

Fact

Let \(f\) and \(g\) be functions and let \(\alpha \in \mathbb{R}\)

  1. If \(f\) and \(g\) are continuous at \(x\) then so is \(f + g\), where

\[(f + g)(x) := f(x) + g(x)\]
  2. If \(f\) is continuous at \(x\) then so is \(\alpha f\), where

\[(\alpha f)(x) := \alpha f(x)\]
  3. If \(f\) and \(g\) are continuous at \(x\) and real valued then so is \(f \cdot g\), where

\[(f \cdot g)(x) := f(x) \cdot g(x)\]

In the latter case, if in addition \(g(x) \ne 0\), then \(f/g\) is also continuous.

As a result, the set of continuous functions is “closed” under elementary arithmetic operations

Example

The function \(f \colon \mathbb{R} \to \mathbb{R}\) defined by

\[f(x) = \frac{\exp(x) + \sin(x)}{2 + \cos(x)} + \frac{x^4}{2} - \frac{\cos^3(x)}{8!}\]

is continuous (we just have to be careful to ensure that denominator is not zero โ€“ which it is not for all \(x\in\mathbb{R}\))
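That denominator claim is easy to confirm: \(\cos(x) \ge -1\), so \(2 + \cos(x) \ge 1\) for all \(x \in \mathbb{R}\). A quick grid check (illustration only; the exact bound comes from the math, not the grid):

```python
import numpy as np

def f(x):
    x = np.asarray(x, dtype=float)
    return (np.exp(x) + np.sin(x)) / (2 + np.cos(x)) + x**4 / 2 - np.cos(x)**3 / 40320  # 8! = 40320

x = np.linspace(-50, 50, 100_001)
print('min of 2 + cos(x) on grid:', np.min(2 + np.cos(x)))   # stays at or above 1
assert np.all(2 + np.cos(x) >= 1.0 - 1e-12)
assert np.all(np.isfinite(f(x)))   # f evaluates to a finite number everywhere on the grid
```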

Example

An example of an oscillating discontinuity is the function \(f(x) = \sin(1/x)\), which is discontinuous at \(x=0\).
