Hyperplane Separation Theorem

A proof made under a “divide and conquer” approach.
Author

Mauricio “Pachá” Vargas S.

Published

February 21, 2021

I’m reading Real Analysis books again as a part of my studies. I used to visit Kim C. Border site from time to time to read his excellent materials, and now I read that he passed away. I never audited one of his courses nor studied at Caltech, but we exchanged several emails from 2012 to 2019, mostly about Linear Algebra, and lately also about Arrow’s Impossibility Theorem and social debates in a moment when Chile entered a political crisis that we still haven’t solved.

The proof of this theorem, heavily inspired from his style, is a way to tribute him as a very positive influence during my economics studies. This theorem, interesting by itself, provides a powerful result used in the different versions and proofs of the finite dimensional supporting hyperplane theorem, the Farkas’ lemma, and the Karush-Kuhn-Tucker theorem. These result are also the basis for the support vector machines algorithm, duality in linear optimization, and many portfolio-related results in finance.

Theorem (Hyperplane Separation Theorem). Let A and B two convex, disjoint, non-empty subsets of Rn. If A is closed and B is compact, exists pRn{0} such that pa<pb(a,b)A×B.

A consequence of this result, is that any closed and convex set C can be separated for any vector vC.

Proof. Shall be made under a “divide and conquer” approach.

If A is closed, define the function f:BRbminaAba. This function returns the distance from bB to aA and is a continuous function.

If B is compact, b0B exists and is such that f(b0)f(b)bB because of the Weierstrass extreme value theorem.

Let yA such that f(b0)=b0y. The vector p=b0yb0y is well defined and is such that p=1.

From pp>0, it follows that pb0yb0y>0, from which we conclude that (*)py<pb0.

From (*) we need to prove that pb0pb and pa<py.

Fix aA and define the function g:[0,1]Rλb0+yλ(ay)2. Provided that A is convex, g reaches a minimum at λ=0, then g(λ)=(b0yλ(ay),b0yλ(ay))2=(b0y,b0y2λb0y,ay+λ2ay,ay)2gλ=2b0y,ay+2λay,ay. Hence, gλ|λ=0=(b0y)(ya)0. Dividing by b0y we conclude that (**)papy.

Fix bB and define the function h:[0,1]Rλb0yλ(by)2. By the same argument that leads to (**), we conclude that (***)pb0pb.

Finally, (*), (**) and (***) lead to conclude that pa<pb.

Corollary. Let A and B be two convex, disjoint and non-empty sets. Then C=AB is convex and non-empty such that 0C. There are two cases for this result:

  1. 0adh(C), in which case exists pRn{0} such that papb(a,b)A×B.
  2. 0adh(C), in which case exists pRn{0} such that pa<pb(a,b)A×B.

Proof. Again, dividing by parts.

0adh(C): Applying the hyperplane separation theorem exists 0int(C). Therefore, exists a sequence {xn}nN such that {xn}adh(C) and {xn}0 as n. Then for every nN exists a vector pn such that pn=1 and pna<pnb(a,b)A×B. Provided that {pn}nN is defined in a compact set, exists at least one convergent subsequence, this means that exists pRn{0} such that p=1 and papb(a,b)A×B.

0adh(C): Applying the hyperplane separation theorem, we conclude that exists pRn{0} such that pc<0cadh(C), and in particular if we define c=ab with aA and bB we obtain pa<pb(a,b)A×B.

Note. The latter proof assumes that the interior of the adherence of a set is contained within the set. The property is intuitively clear, but only applies for convex sets. As a counter example: Let C=(0,1), then adh(C)=[0,1] and int(adh(C))=(0,1). But if C=(0,12)(12,1) then adh(C)=[0,1] and int(adh(C))=(0,1), therefore int(adh(C))C.