πŸ“– Fundamentals of optimization#

⏱ | words

Screencast of the end of the Monday lecture that I didn’t cover

References and additional materials
_images/simon1994.png

[Simon and Blume, 1994]

Chapters 12.2, 12.3, 12.4, 12.5, 13.4, 29.2, 30.1

_images/sundaram1996.png

[Sundaram, 1996]

Chapters 1.2.6, 1.2.7, 1.2.8, 1.2.9, 1.2.10, 1.4.1, Section 3

Basic concepts#

Definition

Consider a function \(f: A \to \mathbb{R}\) where \(A \subset \mathbb{R}^n\). A point \(x^* \in A\) is called a

  • maximizer of \(f\) on \(A\) if \(f(x^*) \geq f(x)\) for all \(x \in A\)

  • minimizer of \(f\) on \(A\) if \(f(x^*) \leq f(x)\) for all \(x \in A\)

  • Function \(f\) achieves its maximum value (maximum) at maximizers

  • Function \(f\) achieves its minimal value (minimum) at minimizers

We’ll define these concepts more formally later. For now, note the complexity of the simple idea of finding the maximum or minimum of a function and where it is achieved. The issues include:

  • existence of maximizers/minimizers

  • uniqueness of maximizers/minimizers

  • characterization of maximizers/minimizers (how to find, how to check if a real one is found)

Hide code cell content
from myst_nb import glue
import matplotlib.pyplot as plt
import numpy as np

def subplots():
    "Custom subplots with axes through the origin"
    fig, ax = plt.subplots()
    # Set the axes through the origin
    for spine in ['left', 'bottom']:
        ax.spines[spine].set_position('zero')
    for spine in ['right', 'top']:
        ax.spines[spine].set_color('none')
    return fig, ax

xmin, xmax = 2, 8
xgrid = np.linspace(xmin, xmax, 200)
f = lambda x: -(x - 4)**2 + 10

xstar = 4.0
fig, ax = subplots()
ax.plot([xstar], [0],  'ro', alpha=0.6)
ax.set_ylim(-12, 15)
ax.set_xlim(-1, 10)
ax.set_xticks([2, xstar, 6, 8, 10])
ax.set_xticklabels([2, r'$x^*=4$', 6, 8, 10], fontsize=14)
ax.plot(xgrid, f(xgrid), 'b-', lw=2, alpha=0.8, label=r'$f(x) = -(x-4)^2+10$')
ax.plot((xstar, xstar), (0, f(xstar)), 'k--', lw=1, alpha=0.8)
#ax.legend(frameon=False, loc='upper right', fontsize=16)
glue("fig_maximizer", fig, display=False)

xstar = xmax
fig, ax = subplots()
ax.plot([xstar], [0],  'ro', alpha=0.6)
ax.text(xstar, 1, r'$x^{**}=8$', fontsize=16)
ax.set_ylim(-12, 15)
ax.set_xlim(-1, 10)
ax.set_xticks([2, 4, 6, 10])
ax.set_xticklabels([2, 4, 6, 10], fontsize=14)
ax.plot(xgrid, f(xgrid), 'b-', lw=2, alpha=0.8, label=r'$f(x) = -(x-4)^2+10$')
ax.plot((xstar, xstar), (0, f(xstar)), 'k--', lw=1, alpha=0.8)
#ax.legend(frameon=False, loc='upper right', fontsize=16)
glue("fig_minimizer", fig, display=False)

xmin, xmax = 0, 1
xgrid1 = np.linspace(xmin, xmax, 100)
xgrid2 = np.linspace(xmax, 2, 10)

fig, ax = subplots()
ax.set_ylim(0, 1.1)
ax.set_xlim(-0.0, 2)
func_string = r'$f(x) = x^2$ if $x < 1$ else $f(x) = 0.5$'
ax.plot(xgrid1, xgrid1**2, 'b-', lw=3, label=func_string)
ax.plot(xgrid2, -0.25 * xgrid2 + 0.5, 'b-', lw=3)
ax.plot(xgrid1[-1], xgrid1[-1]**2, marker='o', markerfacecolor='white', markeredgewidth=2, markersize=6, color='b')
ax.plot(xgrid2[0], -0.25 * xgrid2[0] + 0.5, marker='.', markerfacecolor='b', markeredgewidth=2, markersize=10, color='b')
glue("fig_none", fig, display=False)

fig, ax = subplots()
ax.set_ylim(-2.1, 20.1)
ax.set_xlim(-0.0, 2)
func_string = r'$f(x) = 1/1-x$ if $x < 1$ else $f(x) = -3x$'
ax.plot(xgrid1, 1/(1-xgrid1), 'b-', lw=3, label=func_string)
ax.plot(xgrid2, -3 * xgrid2 + 6, 'b-', lw=3)
ax.plot(xgrid2[0], -3 * xgrid2[0] + 6, marker='.', markerfacecolor='b', markeredgewidth=2, markersize=10, color='b')
glue("fig_none2", fig, display=False)
/tmp/ipykernel_1868/4109180995.py:62: RuntimeWarning: divide by zero encountered in divide
  ax.plot(xgrid1, 1/(1-xgrid1), 'b-', lw=3, label=func_string)
_images/7a0f2f51c8ed9349abd828d5b8ac95cf16c87f7c4cfb571e872fe73a7e035a8f.png _images/d98a8c84e008bddd3bde3670f2416aa4f56b07eb046ac7cf5dbd20c8e9121bc2.png _images/7387af011713a5e4ead3beadeab2970dcd1192a8752f758a5055efed932cd970.png _images/f5ae797ef294ee0a23c231f5cf6163c6bce8c674ce9e100e584efe23b78f8328.png

Example

Let

  • \(f(x) = -(x-4)^2 + 10\)

  • \(a = 2\) and \(b=8\)

Then

  • \(x^* = 4\) is a maximizer of \(f\) on \([2, 8]\)

  • \(x^{**} = 8\) is a minimizer of \(f\) on \([2, 8]\)

_images/7a0f2f51c8ed9349abd828d5b8ac95cf16c87f7c4cfb571e872fe73a7e035a8f.png

Fig. 19 Maximizer on \([a, b] = [2, 8]\) is \(x^* = 4\)#

_images/d98a8c84e008bddd3bde3670f2416aa4f56b07eb046ac7cf5dbd20c8e9121bc2.png

Fig. 20 Minimizer on \([a, b] = [2, 8]\) is \(x^{**} = 8\)#

The set of maximizers/minimizers can be

  • empty

  • a singleton (contains one element)

  • finite (contains a number of elements)

  • infinite (contains infinitely many elements)

Example: infinite maximizers

\(f \colon [0, 1] \to \mathbb{R}\) defined by \(f(x) =1\)
has infinitely many maximizers and minimizers on \([0, 1]\)

Example: no maximizers

The following function has no maximizers on \([0, 2]\)

\[\begin{split} f(x) = \begin{cases} 1/(x-1) & \text{ if } x < 1 \\ -3x +6& \text{ otherwise} \end{cases} \end{split}\]
_images/f5ae797ef294ee0a23c231f5cf6163c6bce8c674ce9e100e584efe23b78f8328.png

Fig. 21 No maximizer on \([0, 2]\)#

Example: no maximizers

The following function has no maximizers on \([0, 2]\)

\[\begin{split} f(x) = \begin{cases} x^2 & \text{ if } x < 1 \\ -x/4 +1/2& \text{ otherwise} \end{cases} \end{split}\]
_images/7387af011713a5e4ead3beadeab2970dcd1192a8752f758a5055efed932cd970.png

Fig. 22 No maximizer on \([0, 2]\)#

\(\mathbb{R}^1\) univariate case#

Let \(f \colon [a, b] \to \mathbb{R}\) be a differentiable (smooth) function

  • Domain now is \([a, b]\) is all \(x\) with \(a \leq x \leq b\) in \(\mathbb{R}\)

  • \(f\) takes \(x \in [a, b]\) and returns number \(f(x)\)

  • derivative \(f'(x)\) exists for all \(x\) with \(a < x < b\)

Definition

Point \(x \in \mathbb{R}\) is called interior to \([a, b]\) if \(a < x < b\)

The set of all interior points is written \((a, b)\)

We refer to \(x^* \in [a, b]\) as

  • interior maximizer if both a maximizer and interior

  • interior minimizer if both a minimizer and interior

Definition

A stationary point of \(f\) on \([a, b]\) is an interior point \(x\) with \(f'(x) = 0\)

Fact

If \(f\) is differentiable and \(x^*\) is either an interior minimizer or an interior maximizer of \(f\) on \([a, b]\), then \(x^*\) is stationary

\[ x^* \text{ is interior maximizer/minimizer } \implies x^* \text{ is stationary} \]

Converse is not true: not all stationary points are maximizers/minimizers!

_images/stationary.png

Fig. 23 Interior maximizers/minimizers are stationary points, but not all stationary points are maximizers!#

Fact

Previous fact \(\implies\)

\(\implies\) any interior maximizer is stationary
\(\implies\) set of interior maximizers \(\subset\) set of stationary points
\(\implies\) maximizers \(\subset\) stationary points \(\cup \{a,b\}\)

Algorithm for univariate problems

  1. Locate stationary points

  2. Evaluate \(y = f(x)\) for each stationary \(x\) and for \(a\), \(b\)

  3. Pick the point giving largest \(y\) value

Minimization: same idea

Example

Let’s solve

\[ \max_{-2 \leq x \leq 5} f(x) \quad \text{where} \quad f(x) = x^3 - 6x^2 + 4x + 8 \]

Steps

  • Differentiate to get \(f'(x) = 3x^2 - 12x + 4\)

  • Solve \(3x^2 - 12x + 4 = 0\) to get stationary \(x\)

  • Discard any stationary points outside \([-2, 5]\)

  • Eval \(f\) at remaining points plus end points \(-2\) and \(5\)

  • Pick point giving largest value

from sympy import *
x = Symbol('x')
points = [-2, 5]
f = x**3 - 6*x**2 + 4*x + 8
fp = diff(f, x)
spoints = solve(fp, x)
points.extend(spoints)
v = [f.subs(x, c).evalf() for c in points]
maximizer = points[v.index(max(v))]
print("Maximizer =", str(maximizer),'=',maximizer.evalf())
Maximizer = 2 - 2*sqrt(6)/3 = 0.367006838144548
Hide code cell source
import matplotlib.pyplot as plt
import numpy as np
xgrid = np.linspace(-2, 5, 200)
f = lambda x: x**3 - 6*x**2 + 4*x + 8
fig, ax = plt.subplots()
ax.plot(xgrid, f(xgrid), 'g-', lw=2, alpha=0.8)
ax.plot((maximizer, maximizer), (f(xgrid[0]), f(maximizer)), 'g--', lw=1, alpha=0.8)
ax.grid()

Function shape and sufficiency of stationary point condition#

Preview into what we will study in detail for multivariate functions

Q: When is condition \(f'(x^*) = 0\) sufficient for \(x^*\) to be a maximizer?

A: When \(f\) is concave!

Hide code cell source
xgrid = np.linspace(-2, 2, 200)
f = lambda x: - 8*x**2 + 8
fig, ax = subplots()
ax.set_ylim(-15, 10)
ax.set_yticks([-15, -10, -5, 5, 10])
ax.set_xticks([-1, 0, 1])
ax.plot(xgrid, f(xgrid), 'b-', lw=2, alpha=0.8, label='concave $f$')
ax.legend(frameon=False, loc='lower right')
plt.show()

We will come back to shape conditions and full definitions of concavity/convexity later in the course.

Sufficient conditions for concavity in one dimension

Let \(f \colon [a, b] \to \mathbb{R}\)

  • If \(f''(x) \leq 0\) for all \(x \in (a, b)\) then \(f\) is concave on \((a, b)\)

  • If \(f''(x) < 0\) for all \(x \in (a, b)\) then \(f\) is strictly concave on \((a, b)\)

Example

  • \(f(x) = a x + b\) is concave on \(\mathbb{R}\) but not strictly

  • \(f(x) = \log(x)\) is strictly concave on \((0, \infty)\)

Fact

For maximizers:

  • If \(f \colon [a,b] \to \mathbb{R}\) is concave and \(x^* \in (a, b)\) is stationary then \(x^*\) is a maximizer

  • If, in addition, \(f\) is strictly concave, then \(x^*\) is the unique maximizer

Hide code cell content
p = 2.0
w = 1.0
alpha = 0.6
xstar = (alpha * p / w)**(1/(1 - alpha))
xgrid = np.linspace(0, 4, 200)
f = lambda x: x**alpha
pi = lambda x: p * f(x) - w * x
fig, ax = subplots()
ax.set_xticks([1,xstar,2,3])
ax.set_xticklabels(['1',r'$\ell^*$','2','3'], fontsize=14)
ax.plot(xgrid, pi(xgrid), 'b-', lw=2, alpha=0.8, label=r'$\pi(\ell) = p\ell^{\alpha} - w\ell$')
ax.plot((xstar, xstar), (0, pi(xstar)), 'g--', lw=1, alpha=0.8)
#ax.legend(frameon=False, loc='upper right', fontsize=16)
glue("fig_price_taker", fig, display=False)
glue("ellstar", round(xstar,4))

Example

A price taking firm faces output price \(p > 0\), input price \(w >0\)

Maximize profits with respect to input \(\ell\)

\[ \max_{\ell \ge 0} \pi(\ell) = p f(\ell) - w \ell, \]

where the production technology is given by

\[ f(\ell) = \ell^{\alpha}, 0 < \alpha < 1. \]

Evidently

\[ \pi'(\ell) = \alpha p \ell^{\alpha - 1} - w, \]

so unique stationary point is

\[ \ell^* = (\alpha p/w)^{1/(1 - \alpha)} \]

Moreover,

\[ \pi''(\ell) = \alpha (\alpha - 1) p \ell^{\alpha - 2} < 0 \]

for all \(\ell \ge 0\) so \(\ell^*\) is unique maximizer.

_images/d01c19d8581b5053ad3691417af8bfb5b8c54d65c5e929587889c64e94841e8a.png

Fig. 24 Profit maximization with \(p=2\), \(w=1\), \(\alpha=0.6\), \(\ell^*=\)1.5774#

Local vs global minimizers and maximizers#

So far we have been talking about global maximizers and minimizers:

  • definition in the very beginning

  • algorithm above for univariate function on closed interval

It is important to keep in mind the desctinction between the global and local optima.

Definition

Consider a function \(f: A \to \mathbb{R}\) where \(A \subset \mathbb{R}^n\). A point \(x^* \in A\) is called a

  • local maximizer of \(f\) if \(\exists \delta>0\) such that \(f(x^*) \geq f(x)\) for all \(x \in B_\delta(x^*) \cap A\)

  • local minimizer of \(f\) if \(\exists \delta>0\) such taht \(f(x^*) \leq f(x)\) for all \(x \in B_\delta(x^*) \cap A\)

In other words, a local optimizer is a point that is a maximizer/minimizer in some neighborhood of it and not necesserily in the whole function domain.

Example

_images/stationary1.png
_images/stationary2.png

So far we have seen:

  • (global) maximizers and minimizers may or may not exist

  • stationary points are important but are neither necessary nor sufficient for finding (global) maximizers/minimizers

  • convexity and concavity makes stationary points sufficient for maximizers/minimizers

  • we have conditions for local maximizers/minimizers only, but in the univariate case can check the end points of th interval explicitly

Need rigorous theory of existance of optima!

Bounded sets#

Note

From here on transition to the multidimensional case, everything is on \(\mathbb{R}^n\) (Cartesian product of real lines \(\mathbb{R}\))

Definition

A set \(A \subset \mathbb{R}^N\) called bounded if

\[\exists \, M \in \mathbb{R} \; \mathrm{such\;that} \; \|x\| \leq M, \quad \forall \; x \in A\]
_images/bounded-set.png

Fig. 25 Bounded set in \(\mathbb{R}^2\)#

Example

Every finite subset \(A\) of \(\mathbb{R}\) is bounded

Indeed, set \(M := \max \{ |a| : a \in A \}\). Then \(A\) is bounded by definition

Example

The set \(\{(x,y)\in\mathbb{R}^2\colon xy \leqslant 1 \}\) is unbounded

Proof:

For any \(M \in \mathbb{R}\) consider the point with coordinates \(x=1/M\) and \(y=M\). This point belongs to the set because it satisfies \(xy=1\), yet

\[\| (x,y) \| = \sqrt{x^2 + y^2} = \sqrt{\frac{1}{M^2} + M^2} > \sqrt{ M^2} = M\]

Therefore, for any candidate bound \(M\) we can find points in the set that are further away from the origin than \(M\).

Example

Interval \((a, b)\) is bounded for any \(a, b \in \mathbb{R}\)

Proof:

Let \(M := \max\{ |a|, |b| \}\). We have to show that each \(x \in (a, b)\) satisfies \(|x| \leq M\)

\[x \in (a, b) \iff a < x < b\]

Cases:

  1. \(0 \le a \le b \implies x > 0, x = |x| < |b| = b = \max\{|a|,|b|\}\)

  2. \(a \le b \le 0 \implies a < x < 0, |x|= -x < -a = |a| = \max\{|a|,|b|\}\)

  3. \(a \le 0 \le b \implies\)

\[\begin{split}\begin{cases} |x|<|b|, x \ge 0\\ |x|<|a|, x < 0 \end{cases} \implies |x|< \max\{|a|,|b|\} \text{ from 1. and 2.}\end{split}\]

Fact

If \(A\) and \(B\) are bounded sets then so is \(A \cup B\)

Open and closed sets#

Definition

Let \(G \subset \mathbb{R}^N\). We call \(x \in G\) interior to \(G\) if \(\exists \; \epsilon > 0\) with \(B_\epsilon(x) \subset G\)

_images/interior.png

Fig. 26 Loosely speaking, interior means β€œnot on the boundary”#

Example

Set of real numbers \(\mathbb{R}\) is open because every point is obviously interior.

Example

_images/peeled_tomato.png

Example

If \(G = (a, b)\) for some \(a < b\), then any \(x \in (a, b)\) is interior

_images/x_interior.png

Proof:

Fix any \(a < b\) and any \(x \in (a, b)\)

Let \(\epsilon := \min\{x - a, b - x\}\)

If \(y \in B_\epsilon(x)\) then \(y < b\) because

\[y = y + x - x \leq |y - x| + x < \epsilon + x \leq b - x + x = b\]

Exercise: Show \(y \in B_\epsilon(x) \implies y > a\)

Example

If \(G = [-1, 1]\), then \(1\) is not interior

_images/not_interior.png

Proof:

Intuitively, any \(\epsilon\)-ball centered on \(1\) will contain points \(> 1\)

More formally, pick any \(\epsilon > 0\) and consider \(B_\epsilon(1)\)

There exists a \(y \in B_\epsilon(1)\) such that \(y \notin [-1, 1]\)

For example, consider the point \(y := 1 + \epsilon/2\)

Exercise: Check this point: lies in \(B_\epsilon(1)\) but not in \([-1, 1]\)

Definition

A set \(G\subset \mathbb{R}^N\) is called open if all of its points are interior

Example

Open sets:

  • any open interval \((a,b) \subset \mathbb{R}\), since we showed all points are interior

  • any open ball \(B_\epsilon(a) = x \in \mathbb{R}^N : \|x - a \| < \epsilon\)

  • \(\mathbb{R}^N\) itself satisfies the defintion of open set

Sets that are not open

  • \((a,b]\) because \(b\) is not interior

  • \([a,b)\) because \(a\) is not interior

Closed Sets#

Definition

A set \(F \subset \mathbb{R}^N\) is called closed if every convergent sequence in \(F\) converges to a point in \(F\)

Rephrased: If \(\{x_n\} \subset F\) and \(x_n \to x\) for some \(x \in \mathbb{R}^N\), then \(x \in F\)

Example

All of \(\mathbb{R}^N\) is closed \(\Leftarrow\) every sequence converging to a point in \(\mathbb{R}^N\) converges to a point in \(\mathbb{R}^N\)!

Note that \(\mathbb{R}\) is both open and closed 😲 The empty set \(\varnothing\) is the only other example

Example

If \((-1, 1) \subset \mathbb{R}\) is not closed

Proof:

True because

  1. \(x_n := 1-1/n\) is a sequence in \((-1, 1)\) converging to \(1\),

  2. and yet \(1 \notin (-1, 1)\)

Example

If \(F = [a, b] \subset \mathbb{R}\) then \(F\) is closed in \(\mathbb{R}\)

Proof:

Take any sequence \(\{x_n\}\) such that

  • \(x_n \in F\) for all \(n\)

  • \(x_n \to x\) for some \(x \in \mathbb{R}\)

We claim that \(x \in F\)

Recall that (weak) inequalities are preserved under limits:

  • \(x_n \leq b\) for all \(n\) and \(x_n \to x\), so \(x \leq b\)

  • \(x_n \geq a\) for all \(n\) and \(x_n \to x\), so \(x \geq a\)

therefore \(x \in [a, b] =: F\)

Properties of Open and Closed Sets#

Fact

\(G \subset \mathbb{R}^N\) is open \(\iff \; G^c\) is closed

Fact

Any singleton \(\{ x \} \subset \mathbb{R}^N\) is closed

Fact

  1. Any finite union of open sets is open

  2. Any finite intersection of closed sets is closed

But be careful:

  • An infinite intersection of open sets is not necessarily open

  • An infinite union of closed sets is not necessarily closed

For example, if \(G_n := (-1/n, 1/n)\), then \(\cap_{n \in \mathbb{N}} G_n = \{0\} \)

Now a very important definition:

Definition

Set \(X\) is called compact if it is both closed and bounded.

Example

  • Any closed interval \([a, b]\) is a compact

  • Any hypercube \([a_1, b_1] \times \ldots \times [a_N, b_N]\) is compact

Example

Closed but not compact set is for example

\[ \{ (x, y) \in \mathbb{R}^2 \colon x \geqslant 0, y \geqslant 0 \; \text{and} \; xy \leqslant 1 \} \]

Suprema and Infima (\(\mathrm{sup}\) + \(\mathrm{inf}\))#

  • Always well defined

  • Agree with max and min when the latter exist

Definition

Let \(A \subset \mathbb{R}\) (we restrict attention to subsets of real line for now)

A number \(u \in \mathbb{R}\) is called an upper bound of \(A\) if

\[a \leqslant u \quad \text{for all} \quad a \in A\]

Example

If \(A = (0, 1)\) then 10 is an upper bound of \(A\) β€” indeed, every element of \((0, 1)\) is \(\leqslant 10\)

If \(A = (0, 1)\) then 1 is an upper bound of \(A\) β€” indeed, every element of \((0, 1)\) is \(\leqslant 1\)

If \(A = (0, 1)\) then \(0.75\) is not an upper bound of \(A\)

Definition

Let \(U(A)\) denote set of all upper bounds of \(A\).

A set \(A \subset \mathbb{R}\) is called bounded above if \(U(A)\) is not empty

Examples

  • If \(A = [0, 1]\), then \(U(A) = [1, \infty)\)

  • If \(A = (0, 1)\), then \(U(A) = [1, \infty)\)

  • If \(A = (0, 1) \cup (2, 3)\), then \(U(A) = [3, \infty)\)

  • If \(A = \mathbb{N}\), then \(U(A) = \varnothing\)

Definition

The least upper bound \(s\) of \(A\) is called supremum of \(A\), denoted \(s=\sup A\), i.e.

\[s \in U(A) \quad \text{and} \quad s \leqslant u, \; \forall \, u \in U(A)\]

Example

If \(A = (0, 1]\), then \(U(A) = [1, \infty)\), so \(\sup A = 1\)

If \(A = (0, 1)\), then \(U(A) = [1, \infty)\), so \(\sup A = 1\)

_images/sup_inf_set.png

Fig. 27 Upper and lower bounds, supremum and infimum of an interval on \(\mathbb{R}\)#

Definition

A lower bound of \(A \subset \mathbb{R}\) is any \(\ell \in \mathbb{R}\) such that \(\ell \leqslant a\) for all \(a \in A\).

Let \(L(A)\) denote set of all lower bounds of \(A\).

A set \(A \subset \mathbb{R}\) is called bounded below if \(L(A)\) is not empty.

The highest of the lower bounds \(p\) is called the infimum of \(A\), denoted \(p=\inf A\), i.e.

\[p \in L(A) \quad \text{and} \quad p \geqslant \ell, \; \forall \, \ell \in L(A)\]

Example

  • If \(A = [0, 1]\), then \(\inf A = 0\)

  • If \(A = (0, 1)\), then \(\inf A = 0\)

Some useful facts#

Fact

Boundedness of a subset of the set of real numbers is equivalent to it being bounded above and below.

\[A \subset \mathbb{R}, \; \text{bounded} \quad \iff \quad A \; \text{bounded above and below}\]

Fact

Every nonempty subset of \(\mathbb{R}\) bounded above has a supremum in \(\mathbb{R}\)

Every nonempty subset of \(\mathbb{R}\) bounded below has an infimum in \(\mathbb{R}\)

Some textbooks allow all sets to have a supremum and infimum, even if they are not bounded. This is achieved by extending the set of real numbers with \(\{-\infty,\infty\}\)

Then, if \(A\) is unbounded above then \(\sup A = \infty\), and if \(A\) is unbounded below then \(\inf A = -\infty\).

Note aside

Conventions for dealing with symbols β€œ\(\infty\)’’ and β€œ\(-\infty\)” If \(a \in \mathbb{R}\), then

  • \(a + \infty = \infty\)

  • \(a - \infty = -\infty\)

  • \(a \times \infty = \infty\) if \(a \ne 0\), \(0 \times \infty = 0\)

  • \(-\infty < a < \infty\)

  • \(\infty + \infty = \infty\)

  • \(-\infty - \infty = -\infty\) But \(\infty - \infty\) is not defined

Fact

Let \(A\) be any set bounded above and let \(s = \sup A\). There exists a sequence \(\{x_n\}\) in \(A\) with \(x_n \to s\).

Analogously, if \(A\) is bounded below and \(p = \inf A\), then there exists a sequence \(\{x_n\}\) in \(A\) with \(x_n \to p\).

Maxima and Minima (\(\max\) + \(\min\))#

Definition

We call \(a^*\) the maximum of \(A \subset \mathbb{R}\), denoted \(a^* = \max A\), if

\[a^* \in A \quad \text{and} \quad a \leqslant a^* \text{ for all } a \in A \]

We call \(a^*\) the minimum of \(A \subset \mathbb{R}\) and write \(a^* = \min A\) if

\[a^* \in A \quad \text{and} \quad a^* \leqslant a \text{ for all } a \in A \]

Example

For \(A = [0, 1]\) \(\max A = 1\) and \(\min A = 0\)

Relationship between max/min and sup/inf#

Fact

Let \(A\) be any subset of \(\mathbb{R}\)

  1. If \(\sup A \in A\), then \(\max A\) exists and \(\max A = \sup A\)

  2. If \(\inf A \in A\), then \(\min A\) exists and \(\min A = \inf A\)

In other words, when max and min exist they agree with sup and inf

Fact

If \(F \subset \mathbb{R}\) is a closed and bounded, then \(\max F\) and \(\min F\) both exist

Segway to functions#

Of course, in optimization we are mainly interested in maximizing and minimizing functions

Will apply the notions of supremum and infimum, minimum and maximum to the range of a function

\[\text{From} \quad A \subset \mathbb{R} \quad \text{go to} \quad \mathrm{rng}(f) \subset \mathbb{R}, \; f \colon X \subset \mathbb{R}^N \to \mathbb{R}\]

Equivalently, we may consider the image of a set \(X \subset \mathbb{R}^N\) under a function \(f\)

\[f(X) = \{ f(x) \colon x \in X\}\]

Definition

Let \(f \colon A \to \mathbb{R}\), where \(A\) is any set

The supremum of \(f\) on \(A\) is defined as

\[\sup_{x \in A} f(x) = \sup f(A)\]

The infimum of \(f\) on \(A\) is defined as

\[\inf_{x \in A} f(x) = \inf f(A)\]
_images/func_sup.png

Fig. 28 The supremum of \(f\) on \(A\)#

_images/func_inf.png

Fig. 29 The infimum of \(f\) on \(A\)#

Definition

Let \(f \colon A \to \mathbb{R}\) where \(A\) is any set

The maximum of \(f\) on \(A\) is defined as

\[\max_{x \in A} f(x) = \max f(A)\]

The minimum of \(f\) on \(A\) is defined as

\[\min_{x \in A} f(x) = \min f(A)\]

Definition

A maximizer of \(f\) on \(A\) is a point \(a^* \in A\) such that

\[a^* \in A \text{ and } f(a^*) \geqslant f(x) \text{ for all } x \in A, \quad \text{alternatively}\]
\[f(a^*) = \max_{x \in A} f(x)\]

The set of all maximizers is typically denoted by

\[\mathrm{argmax}_{x \in A}f(x)\]

Definition

A minimizer of \(f\) on \(A\) is a point \(a^* \in A\) such that

\[a^* \in A \text{ and } f(a^*) \leqslant f(x) \text{ for all } x \in A, \quad \text{alternatively}\]
\[f(a^*) = \min_{x \in A} f(x)\]

The set of all minimizers is typically denoted by

\[\mathrm{argmin}_{x \in A}f(x)\]

Weierstrass boundedness theorem#

Putting together all the above material to formulate a fundamental result which is essential for establishing the existence of maxima and minima of functions in the next section.

Definition

A function \(f\) is called bounded if its range is a bounded set.

Example

\([0,1] \ni x \mapsto f(x) = x^5 \in \mathbb{R}\) is bounded

\(\sin(x)\) is bounded on \(\mathbb{R}\), i.e. \(f: \mathbb{R} \to \mathbb{R}\) where \(f(x) = \sin(x)\) is bounded

\(f(x)=1/x\) is not bounded (on \(\mathbb{R}\))

\(f(x) = \log(x)\) is not bounded (on \(\mathbb{R}\))

Fact

Consider a continuous function \(f: X \subset \mathbb{R}^N \to \mathbb{R}\).

If \(X\) is compact, then \(f\) is bounded on \(X\).

Weierstrass extreme value theorem#

  • Fundamental result on existence of maxima and minima of functions!

  • Classic example of an existance theorem: it states that something exists without providing a method to find it

Fact (Weierstrass extreme value theorem)

If \(f: A \subset \mathbb{R}^N \to \mathbb{R}\) is continuous and \(A\) is closed and bounded (compact),
then \(f\) has both a maximizer and a minimizer on \(A\).

_images/extreme_value_theorem.png

Example

Consider the problem

\[ % \max_{c_1, c_2} \; U(c_1, c_2) = \sqrt{c_1} + \beta \sqrt{c_2} % \]
\[ % \text{ such that } \; c_2 \leqslant (1 + r)(w - c_1), \quad c_i \geqslant 0 \text{ for } i = 1,2 % \]

where

  • \(r \in (0,1)\) is interest rate

  • \(w\) is wealth

  • \(\beta \in (0,1)\) is a discount factor

Let \(B\) be all \((c_1, c_2)\) satisfying the constraint

Exercise: Show that the budget set \(B\) is a closed, bounded subset of \(\mathbb{R}^2\)

Exercise: Show that \(U\) is continuous on \(B\)

We conclude that a maximizer exists

Example

Examples of failure of Weierstrass theorem conditoins

_images/extreme_value_theorem_fail2.png

Why? Function is not continuous on \([a,b]\)

_images/extreme_value_theorem_fail3.png

Why? Domain of the function is not closed

_images/extreme_value_theorem_fail1.png

Why? The function is not continuous or not defined at the right endpoint of the interval: either continuity or closedness is violated