Books.ConvexAnalysis_Rockafellar_1970.Chapters.Chap03.section16

source

theorem section16_supportFunctionEReal_preimage_eq_sInf_of_exists_mem_ri {n m : ℕ} (A : EuclideanSpace ℝ (Fin n) →ₗ[ℝ ] EuclideanSpace ℝ (Fin m)) (D : Set (Fin m → ℝ)) (hD : Convex ℝ D) (hri : ∃ (x : EuclideanSpace ℝ (Fin n)), A x ∈ euclideanRelativeInterior m ((fun (z : EuclideanSpace ℝ (Fin m)) => z.ofLp) ⁻¹' D)) :

have Aadj := LinearMap.adjoint A; (supportFunctionEReal ((fun (x : Fin n → ℝ) => (A (WithLp.toLp 2 x)).ofLp) ⁻¹' D) = fun (xStar : Fin n → ℝ) => sInf ((fun (yStar : EuclideanSpace ℝ (Fin m)) => supportFunctionEReal D yStar.ofLp) '' {yStar : EuclideanSpace ℝ (Fin m) | Aadj yStar = WithLp.toLp 2 xStar})) ∧ ∀ (xStar : Fin n → ℝ), sInf ((fun (yStar : EuclideanSpace ℝ (Fin m)) => supportFunctionEReal D yStar.ofLp) '' {yStar : EuclideanSpace ℝ (Fin m) | Aadj yStar = WithLp.toLp 2 xStar}) = ⊤ ∨ ∃ (yStar : EuclideanSpace ℝ (Fin m)), Aadj yStar = WithLp.toLp 2 xStar ∧ supportFunctionEReal D yStar.ofLp = sInf ((fun (yStar : EuclideanSpace ℝ (Fin m)) => supportFunctionEReal D yStar.ofLp) '' {yStar : EuclideanSpace ℝ (Fin m) | Aadj yStar = WithLp.toLp 2 xStar})

Theorem 16.4.3: Let f₁, …, fₘ be proper convex functions on ℝⁿ. For any convex set D ⊆ ℝᵐ, if there exists x such that A x ∈ ri D, then the closure operation can be omitted in Theorem 16.4.2, and

δ^*(xStar | A⁻¹ D) = inf { δ^*(yStar | D) | A^* yStar = xStar },

where the infimum is attained (or is +∞ vacuously).

In this development, δ^* is represented by supportFunctionEReal, ri is euclideanRelativeInterior, and the right-hand side is modeled by an sInf over the affine fiber {yStar | A^* yStar = xStar}.

source

theorem section16_infDist_as_infimalConvolution_norm_indicator {n : ℕ} {C : Set (Fin n → ℝ)} (hCne : C.Nonempty) :

(fun (x : Fin n → ℝ) => ↑(Metric.infDist x C)) = infimalConvolution (fun (y : Fin n → ℝ) => ↑‖y‖) (indicatorFunction C)

Rewriting the distance function as an infimal convolution of the norm and indicator.

source

theorem section16_fenchelConjugate_norm_eq_indicator_unitBall {n : ℕ} :

(fenchelConjugate n fun (x : Fin n → ℝ) => ↑‖x‖) = fun (xStar : Fin n → ℝ) => if l1Norm xStar ≤ 1 then 0 else ⊤

The Fenchel conjugate of the norm is the indicator of the ℓ¹ unit ball.

source

theorem section16_fenchelConjugate_infDist_eq_indicatorBall_add_supportFunctionEReal {n : ℕ} (C : Set (Fin n → ℝ)) (hC : Convex ℝ C) (hCne : C.Nonempty) :

(fenchelConjugate n fun (x : Fin n → ℝ) => ↑(Metric.infDist x C)) = fun (xStar : Fin n → ℝ) => (if l1Norm xStar ≤ 1 then 0 else ⊤) + supportFunctionEReal C xStar

The Fenchel conjugate of the distance function splits into the norm conjugate and the support function.

source

theorem section16_fenchelConjugate_infDist {n : ℕ} (C : Set (Fin n → ℝ)) (hC : Convex ℝ C) (hCne : C.Nonempty) :

(fenchelConjugate n fun (x : Fin n → ℝ) => ↑(Metric.infDist x C)) = fun (xStar : Fin n → ℝ) => if l1Norm xStar ≤ 1 then supportFunctionEReal C xStar else ⊤

Text 16.4.1: Let f(x) = d(x, C) = inf {‖x - y‖ | y ∈ C} be the distance function, where C is a given nonempty convex set. Then

f^*(x^*) = δ^*(x^* | C) if ‖x^*‖₁ ≤ 1, and f^*(x^*) = +∞ otherwise.

In this development, f^* is represented by fenchelConjugate n f, d(x, C) is Metric.infDist, and δ^*(· | C) is the support function supportFunctionEReal C.

source

theorem section16_span_range_eq_linearMap_range_sum_smul {n m : ℕ} (a : Fin m → Fin n → ℝ) :

have A := { toFun := fun (ξ : Fin m → ℝ) => ∑ t : Fin m, ξ t • a t, map_add' := ⋯, map_smul' := ⋯ }; Submodule.span ℝ (Set.range a) = LinearMap.range A

The span of a finite family agrees with the range of its linear-combination map.

source

theorem section16_sInf_range_norm_sub_eq_infDist_range {n m : ℕ} (A : (Fin m → ℝ) →ₗ[ℝ ] Fin n → ℝ) (x : Fin n → ℝ) :

sInf (Set.range fun (ξ : Fin m → ℝ) => ‖x - A ξ‖) = Metric.infDist x (Set.range ⇑A)

The infimum over all coefficient vectors is the distance to the range.

source

theorem section16_closedProperConvex_coe_infDist_submodule {n : ℕ} (L : Submodule ℝ (Fin n → ℝ)) :

(ClosedConvexFunction fun (x : Fin n → ℝ) => ↑(Metric.infDist x ↑L)) ∧ ProperConvexFunctionOn Set.univ fun (x : Fin n → ℝ) => ↑(Metric.infDist x ↑L)

The distance to a submodule is closed, proper, and convex.

source

theorem section16_supportFunctionEReal_submodule {n : ℕ} (L : Submodule ℝ (Fin n → ℝ)) :

supportFunctionEReal ↑L = fun (b : Fin n → ℝ) => if b ∈ orthogonalComplement n L then 0 else ⊤

The support function of a submodule is the indicator of its orthogonal complement.

source

theorem section16_fenchelConjugate_coe_infDist_submodule_eq_indicator_D_inter_orth {n : ℕ} (L : Submodule ℝ (Fin n → ℝ)) :

(fenchelConjugate n fun (x : Fin n → ℝ) => ↑(Metric.infDist x ↑L)) = indicatorFunction ({b : Fin n → ℝ | l1Norm b ≤ 1} ∩ ↑(orthogonalComplement n L))

The conjugate of the distance to a submodule is an indicator on D ∩ Lᗮ.

source

theorem section16_bddAbove_dotProduct_D_inter_orth {n : ℕ} (L : Submodule ℝ (Fin n → ℝ)) (x : Fin n → ℝ) :

BddAbove ((fun (y : Fin n → ℝ) => y ⬝ᵥ x) '' ({y : Fin n → ℝ | l1Norm y ≤ 1} ∩ ↑(orthogonalComplement n L)))

Dot products over D ∩ Lᗮ are bounded above.

source

theorem section16_infNorm_sub_lincomb_eq_deltaStar_inter_orthogonalComplement {n m : ℕ} (a : Fin m → Fin n → ℝ) :

have L := Submodule.span ℝ (Set.range a); have D := {xStar : Fin n → ℝ | ∑ j : Fin n, |xStar j| ≤ 1}; have f := fun (x : Fin n → ℝ) => sInf (Set.range fun (ξ : Fin m → ℝ) => ‖x - ∑ t : Fin m, ξ t • a t‖); f = deltaStar (D ∩ ↑(orthogonalComplement n L))

Text 16.4.2: Consider the function

f(x) = inf { ‖x - ξ₁ a₁ - ⋯ - ξₘ aₘ‖∞ | ξₜ ∈ ℝ },

where a₁, …, aₘ ∈ ℝⁿ are given and ‖x‖∞ = max_j |x j| for x ∈ ℝⁿ. Then f is the support function of the (polyhedral) convex set D ∩ L^⊥, where L is the subspace spanned by a₁, …, aₘ and

D = {xStar | |xStar 1| + ⋯ + |xStar n| ≤ 1}.

source

theorem section16_piecewise_nonneg_eq_add_indicator_nonnegOrthant {n : ℕ} (h : (Fin n → ℝ) → EReal) (hnotbot : ∀ (x : Fin n → ℝ), h x ≠ ⊥) :

(fun (x : Fin n → ℝ) => if 0 ≤ x then h x else ⊤) = fun (x : Fin n → ℝ) => h x + indicatorFunction {x : Fin n → ℝ | 0 ≤ x} x

Rewriting the piecewise nonnegative extension as a sum with the indicator.

source

theorem section16_polar_nonnegOrthant_eq_nonpos {n : ℕ} :

{xStar : Fin n → ℝ | ∀ x ∈ {x : Fin n → ℝ | 0 ≤ x}, x ⬝ᵥ xStar ≤ 0} = {xStar : Fin n → ℝ | xStar ≤ 0}

The dot-product polar of the nonnegative orthant is the nonpositive orthant.

source

theorem section16_fenchelConjugate_indicator_nonnegOrthant_eq_indicator_nonposOrthant {n : ℕ} :

fenchelConjugate n (indicatorFunction {x : Fin n → ℝ | 0 ≤ x}) = indicatorFunction {xStar : Fin n → ℝ | xStar ≤ 0}

The Fenchel conjugate of the nonnegative-orthant indicator is the nonpositive-orthant indicator.

source

theorem section16_infimalConvolution_with_indicator_nonpos_eq_sInf_image_ge {n : ℕ} (p : (Fin n → ℝ) → EReal) (hp : ∀ (z : Fin n → ℝ), p z ≠ ⊥) :

infimalConvolution p (indicatorFunction {u : Fin n → ℝ | u ≤ 0}) = fun (xStar : Fin n → ℝ) => sInf (p '' {zStar : Fin n → ℝ | xStar ≤ zStar})

Infimal convolution with the nonpositive orthant indicator is an infimum over upper bounds.

source

theorem section16_fenchelConjugate_piecewise_nonneg_eq_convexFunctionClosure_sInf_ge {n : ℕ} (h : (Fin n → ℝ) → EReal) (hclosed : ClosedConvexFunction h) (hproper : ProperConvexFunctionOn Set.univ h) :

have f := fun (x : Fin n → ℝ) => if 0 ≤ x then h x else ⊤; have g := fun (xStar : Fin n → ℝ) => sInf ((fun (zStar : Fin n → ℝ) => fenchelConjugate n h zStar) '' {zStar : Fin n → ℝ | xStar ≤ zStar}); fenchelConjugate n f = convexFunctionClosure g

Text 16.4.3: Let h be a closed proper convex function on ℝⁿ. Define f by

f x = h x if x ≥ 0, and f x = +∞ otherwise,

where x ≥ 0 is the coordinatewise order (the non-negative orthant). Then the Fenchel conjugate f^* is the closure of the convex function

g xStar = inf { h^* zStar | zStar ≥ xStar },

again with respect to the coordinatewise order. Here h^* is fenchelConjugate n h and the closure is convexFunctionClosure.

source

theorem section16_softmax_mem_simplex {n : ℕ} (hn : 0 < n) (xStar : Fin n → ℝ) :

have Z := ∑ j : Fin n, Real.exp (xStar j); have ξ0 := fun (j : Fin n) => Real.exp (xStar j) / Z; 0 ≤ ξ0 ∧ ∑ j : Fin n, ξ0 j = 1 ∧ 0 < Z

Softmax lies in the simplex and has a positive normalizer.

source

theorem section16_mul_exp_sub_log_le_exp {ξ a : ℝ} (hξ : 0 ≤ ξ) :

ξ * Real.exp (a - Real.log ξ) ≤ Real.exp a

A weighted exponential term is bounded by the unweighted exponential.

source

theorem section16_entropy_rangeTerm_le_log_sum_exp {n : ℕ} (hn : 0 < n) (ξ : Fin n → ℝ) (hξ : 0 ≤ ξ) (hsum : ∑ j : Fin n, ξ j = 1) (xStar : Fin n → ℝ) :

ξ ⬝ᵥ xStar - ∑ j : Fin n, ξ j * Real.log (ξ j) ≤ Real.log (∑ j : Fin n, Real.exp (xStar j))

Gibbs inequality for entropy on the simplex.

source

theorem section16_entropy_rangeTerm_eq_log_sum_exp_at_softmax {n : ℕ} (hn : 0 < n) (xStar : Fin n → ℝ) :

have Z := ∑ j : Fin n, Real.exp (xStar j); have ξ0 := fun (j : Fin n) => Real.exp (xStar j) / Z; ξ0 ⬝ᵥ xStar - ∑ j : Fin n, ξ0 j * Real.log (ξ0 j) = Real.log Z

The entropy range term attains log (∑ exp) at the softmax point.

Documentation

Books.ConvexAnalysis_Rockafellar_1970.Chapters.Chap03.section16_part11