Wednesday, February 18, 2015

Convexity and You: Unpacking the Definition

Real Analysis is notorious for taking easy-to-understand concepts and repackaging them in a thick theoretical barrier. Take the epsilon-delta definition of continuity---it's impossible to prove anything with the information "the function, uh, doesn't have any holes," but it's impossible to develop a mental picture given only the theoretical perspective. For this reason, one of the biggest barriers to learning any type of analysis is properly connecting the intuitive idea and the theoretical representation.

We'll focus here on one of the less transparent definitions: convex functions. Convex functions can be understood intuitively as "the area above the function is a shape that doesn't go inwards on itself"... and theoretically as
Given convex set $X$, a function $f:X\to\mathbb{R}$ is convex if for all $x_1,\ x_2\in X$ and $t \in [0,1]$, $f(tx_1+(1-t)x_2)\leq tf(x_1)+(1-t)f(x_2)$.

This is the part where, during an analysis course, you are expected to nod your head at the alphabet vomit (at least this time it's the Roman alphabet, not the Greek, that tossed its cookies). Let's make some sense out of what information is being conveyed.

First of all, to understand the definition of convex functions, you must know what convex sets are. A set is convex if any two points (call them $x_1$ and $x_2$) can be connected by a straight line that is contained in the set. If the set is not convex (i.e. "goes inwards" visually), then there will be at least two points whose connecting line goes outside the set.

Now the domain of $f$ is a convex set $X$, which should explain what the points $x_1$ and $x_2$ are doing in the definition: they correspond to the two arbitrary points that we want to try and connect with a line. This brings us to the purpose of defining $t \in [0,1]$. Consider the function $y(t)=tx_1+(1-t)x_2$. Since $y(0)=x_2$, $y(1)=x_1$ and $y$ itself is a linear functional, this function represents a straight line segment starting at $x_2$ and ending at $x_1$. Thus the purpose of $t$ is to create the parametrized line segment joining points $x_1$ and $x_2$.

We are given that $X$ is a convex set, so it is certainly true that the line $tx_1+(1-t)x_2$ is completely contained in $X$, the domain of $f$. This makes it completely legit to consider $f(tx_1+(1-t)x_2)$ as the image of this line. The image of a straight line in the domain won't necessarily be a straight line itself, but will instead be a path along the function starting at $f(x_2)$ and ending at $f(x_1)$. Hence the expression $f(tx_1+(1-t)x_2)$ is asking us to consider the section of $f(x)$ that connects* $f(x_1)$ and $f(x_2)$.

This brings us to the last part of the inequality
$$f(tx_1+(1-t)x_2)\leq tf(x_1)+(1-t)f(x_2).$$
Just as before, the second expression $tf(x_1)+(1-t)f(x_2)$ is representing a parametrized line segment, joining the points $f(x_2)$ and $f(x_1)$. We are now comparing two paths between $f(x_1)$ and $f(x_2)$: one is a straight line, and the other a path on the function. The inequality places a lower bound on where the straight line can be. If the straight line is above the path on $f$ everywhere---that is, if it satisfies the above inequality---it is contained in the area above $f(x)$ (the epigraph of $f$).

That's exactly the definition of a convex set, but applied to the space above $f$... cool.

Here's a picture for $X = \mathbb{R}$:

That's what the definition is communicating. I hope that was insightful for someone!

*(does not refer to connectedness in the mathematical sense)

No comments:

Post a Comment