Documentation

Papers.SmoothMinimization_Nesterov_2004.Sections.section04_part4

theorem exists_nat_cast_add_one_ge (t : ℝ) :

∃ (N : ℕ), t ≤ ↑N + 1

For any real lower bound t, there exists a natural number N with t ≤ (N : ℝ) + 1.

theorem optimal_parameters_M_zero {E1 : Type u_1} {E2 : Type u_2} [NormedAddCommGroup E1] [NormedSpace ℝ E1] [FiniteDimensional ℝ E1] [NormedAddCommGroup E2] [NormedSpace ℝ E2] [FiniteDimensional ℝ E2] (Q2 : Set E2) (A : E1 →L[ℝ ] E2 →L[ℝ ] ℝ) (d2 : E2 → ℝ) (σ1 σ2 D1 ε : ℝ) (hε : 0 < ε) :

have A' := { toFun := fun (x : E1) => ↑(A x), map_add' := ⋯, map_smul' := ⋯ }; have D2 := sSup (d2 '' Q2); ∃ (μ : ℝ) (Lμ : ℝ) (N : ℕ), μ = ε / (2 * D2) ∧ Lμ = D2 / (2 * σ2) * (OperatorNormDef A' ^ 2 / ε) ∧ 4 * OperatorNormDef A' / ε * √(D1 * D2 / (σ1 * σ2)) ≤ ↑N + 1

Proposition 1.4.2. In the M = 0 case of Theorem 1.4.1, an optimal (order-wise) choice of the smoothing parameter μ, the corresponding Lipschitz constant L_μ, and the iteration count N as a function of accuracy ε > 0 is given by μ = ε / (2 D2), L_μ = (D2 / (2 σ2)) * (‖A‖_{1,2}^2 / ε), and an integer N with (4 ‖A‖_{1,2} / ε) * sqrt(D1 D2 / (σ1 σ2)) ≤ N + 1 (equation (4.8), up to rounding).

def standardSimplex (n : ℕ) :

Set (Fin n → ℝ)

Definition 1.4.1.1. The standard simplex Δ_n in ℝ^n is the set of vectors with nonnegative coordinates summing to 1 (eq:Delta_n).

Equations

standardSimplex n = {x : Fin n → ℝ | (∀ (i : Fin n), 0 ≤ x i) ∧ ∑ i : Fin n, x i = 1}

Instances For

theorem standardSimplex_eq_stdSimplex (n : ℕ) :

standardSimplex n = stdSimplex ℝ (Fin n)

Definition 1.4.1.1. The book's standard simplex agrees with mathlib's stdSimplex on Fin n → ℝ.

noncomputable def matrixGameSaddlePointValue (n m : ℕ) (A : (Fin n → ℝ) →ₗ[ℝ ] Fin m → ℝ) (c : Fin n → ℝ) (b : Fin m → ℝ) :

Definition 1.4.1.1. The matrix-game saddle-point value min_{x ∈ Δ_n} max_{u ∈ Δ_m} {⟪A x, u⟫_2 + ⟪c, x⟫_1 + ⟪b, u⟫_2} (4.9).

Equations

One or more equations did not get rendered due to their size.

Instances For

noncomputable def matrixGamePrimalObjective (n m : ℕ) (A : (Fin n → ℝ) →ₗ[ℝ ] Fin m → ℝ) (c : Fin n → ℝ) (b : Fin m → ℝ) :

(Fin n → ℝ) → ℝ

Definition 1.4.1.2. The primal objective in (4.10): f(x) = ⟪c,x⟫_1 + max_{1≤j≤m} (⟪a_j,x⟫_1 + b^{(j)}).

Equations

matrixGamePrimalObjective n m A c b x = ∑ i : Fin n, c i * x i + sSup ((fun (j : Fin m) => A x j + b j) '' Set.univ)

Instances For

noncomputable def matrixGameDualObjective (n m : ℕ) (A : (Fin n → ℝ) →ₗ[ℝ ] Fin m → ℝ) (c : Fin n → ℝ) (b : Fin m → ℝ) :

(Fin m → ℝ) → ℝ

Definition 1.4.1.2. The dual objective in (4.10_dual): φ(u) = ⟪b,u⟫_2 + min_{1≤i≤n} (⟪\hat a_i,u⟫_2 + c^{(i)}).

Equations

matrixGameDualObjective n m A c b u = ∑ j : Fin m, b j * u j + sInf ((fun (i : Fin n) => ∑ j : Fin m, A (fun (k : Fin n) => if k = i then 1 else 0) j * u j + c i) '' Set.univ)

Instances For

theorem standardSimplex_basis_mem (n : ℕ) (i : Fin n) :

(fun (k : Fin n) => if k = i then 1 else 0) ∈ standardSimplex n

The standard basis vector belongs to the standard simplex.

theorem standardSimplex_nonempty_of_pos {n : ℕ} (hn : 0 < n) :

(standardSimplex n).Nonempty

The standard simplex is nonempty when n > 0.

theorem linear_standardSimplex_le_sSup_coeff {m : ℕ} (r u : Fin m → ℝ) :

u ∈ standardSimplex m → ∑ i : Fin m, r i * u i ≤ sSup ((fun (i : Fin m) => r i) '' Set.univ)

Any linear functional value on the standard simplex is bounded above by the supremum of its coefficients.

theorem sSup_linear_standardSimplex_eq {m : ℕ} (r : Fin m → ℝ) :

sSup ((fun (u : Fin m → ℝ) => ∑ i : Fin m, r i * u i) '' standardSimplex m) = sSup ((fun (i : Fin m) => r i) '' Set.univ)

The supremum of a linear functional over the standard simplex equals the supremum of its coefficients.

theorem sSup_image_add_const {s : Set ℝ} (a : ℝ) (hs : BddAbove s) (hne : s.Nonempty) :

sSup ((fun (x : ℝ) => a + x) '' s) = a + sSup s

Adding a constant commutes with sSup on a bounded nonempty set of reals.

theorem sInf_image_add_const {s : Set ℝ} (a : ℝ) (hs : BddBelow s) (hne : s.Nonempty) :

sInf ((fun (x : ℝ) => a + x) '' s) = a + sInf s

Adding a constant commutes with sInf on a bounded-below nonempty set of reals.

theorem sInf_coeff_le_linear_standardSimplex {m : ℕ} (r u : Fin m → ℝ) :

u ∈ standardSimplex m → sInf ((fun (i : Fin m) => r i) '' Set.univ) ≤ ∑ i : Fin m, r i * u i

Any linear functional value on the standard simplex is bounded below by the infimum of its coefficients.

theorem sInf_linear_standardSimplex_eq {m : ℕ} (r : Fin m → ℝ) :

sInf ((fun (u : Fin m → ℝ) => ∑ i : Fin m, r i * u i) '' standardSimplex m) = sInf ((fun (i : Fin m) => r i) '' Set.univ)

The infimum of a linear functional over the standard simplex equals the infimum of its coefficients.

theorem sum_Ax_mul_eq_sum_basis {n m : ℕ} (A : (Fin n → ℝ) →ₗ[ℝ ] Fin m → ℝ) (u : Fin m → ℝ) (x : Fin n → ℝ) :

∑ j : Fin m, A x j * u j = ∑ i : Fin n, (∑ j : Fin m, A (fun (k : Fin n) => if k = i then 1 else 0) j * u j) * x i

Expanding the bilinear form x ↦ ∑ j, (A x) j * u j on the standard basis.

theorem matrixGameDualObjective_eq_sInf_over_standardSimplex (n m : ℕ) (A : (Fin n → ℝ) →ₗ[ℝ ] Fin m → ℝ) (c : Fin n → ℝ) (b u : Fin m → ℝ) (hne : (standardSimplex n).Nonempty) :

matrixGameDualObjective n m A c b u = sInf ((fun (x : Fin n → ℝ) => ∑ j : Fin m, A x j * u j + ∑ i : Fin n, c i * x i + ∑ j : Fin m, b j * u j) '' standardSimplex n)

Rewriting the dual objective as an infimum over the standard simplex.

theorem matrixGameDualObjective_le_saddlePointValue (n m : ℕ) (A : (Fin n → ℝ) →ₗ[ℝ ] Fin m → ℝ) (c : Fin n → ℝ) (b u : Fin m → ℝ) (hu : u ∈ standardSimplex m) (hne : (standardSimplex n).Nonempty) :

matrixGameDualObjective n m A c b u ≤ matrixGameSaddlePointValue n m A c b

Pointwise weak duality for a fixed dual vector in the simplex.

theorem matrixGameDual_le_saddlePointValue (n m : ℕ) (A : (Fin n → ℝ) →ₗ[ℝ ] Fin m → ℝ) (c : Fin n → ℝ) (b : Fin m → ℝ) (hm : 0 < m) (hne : (standardSimplex n).Nonempty) :

sSup (matrixGameDualObjective n m A c b '' standardSimplex m) ≤ matrixGameSaddlePointValue n m A c b

Weak duality: the dual value is bounded above by the saddle-point value.

theorem matrixGameSaddlePointValue_eq_primal_dual (n m : ℕ) (A : (Fin n → ℝ) →ₗ[ℝ ] Fin m → ℝ) (c : Fin n → ℝ) (b : Fin m → ℝ) (hn : 0 < n) (hm : 0 < m) :

matrixGameSaddlePointValue n m A c b = sInf (matrixGamePrimalObjective n m A c b '' standardSimplex n) ∧ sSup (matrixGameDualObjective n m A c b '' standardSimplex m) ≤ matrixGameSaddlePointValue n m A c b

Definition 1.4.1.2. The saddle-point value (4.9) equals the primal minimization in (4.10) and upper-bounds the dual maximization in (4.10_dual) over the standard simplices Δ_n and Δ_m.

theorem standardSimplex_coord_mem_Icc {n : ℕ} {x : Fin n → ℝ} (hx : x ∈ standardSimplex n) (i : Fin n) :

x i ∈ Set.Icc 0 1

A coordinate of a point in the standard simplex lies in [0, 1].

theorem norm_sq_le_sum_sq_pi (n : ℕ) (z : Fin n → ℝ) :

‖z‖ ^ 2 ≤ ∑ i : Fin n, z i ^ 2

The squared Pi/sup norm on Fin n → ℝ is bounded by the sum of squared coordinates.

theorem deriv_mul_log_sub_half_sq {x : ℝ} (hx : x ≠ 0) :

deriv (fun (t : ℝ) => t * Real.log t - 1 / 2 * t ^ 2) x = Real.log x + 1 - x

Derivative formula for x ↦ x * log x - (1/2) * x^2 away from 0.

theorem deriv2_mul_log_sub_half_sq {x : ℝ} (hx : x ≠ 0) :

deriv ^[2] (fun (t : ℝ) => t * Real.log t - 1 / 2 * t ^ 2) x = x⁻¹ - 1

Second derivative formula for x ↦ x * log x - (1/2) * x^2 away from 0.

theorem mul_log_strongConvexOn_Icc :

StrongConvexOn (Set.Icc 0 1) 1 fun (t : ℝ) => t * Real.log t

On [0,1], x ↦ x * log x is 1-strongly convex.

theorem entropy_sum_strongConvexOn_standardSimplex (n : ℕ) :

StrongConvexOn (standardSimplex n) 1 fun (x : Fin n → ℝ) => ∑ i : Fin n, x i * Real.log (x i)

The entropy sum x ↦ ∑ i, x i * log (x i) is 1-strongly convex on the standard simplex.

theorem StrongConvexOn.const_add {E : Type u_1} [NormedAddCommGroup E] [NormedSpace ℝ E] {s : Set E} {m : ℝ} {f : E → ℝ} (c : ℝ) (hf : StrongConvexOn s m f) :

StrongConvexOn s m fun (x : E) => c + f x

Adding a constant to a strongly convex function preserves strong convexity.

theorem entropy_proxDiameterBound_standardSimplex (n : ℕ) :

IsProxDiameterBound (standardSimplex n) (fun (x : Fin n → ℝ) => Real.log ↑n + ∑ i : Fin n, x i * Real.log (x i)) (Real.log ↑n)

The entropy prox-function has prox-diameter bounded by log n on the standard simplex.

theorem matrixGame_entropy_prox_parameters (n m : ℕ) :

have d1 := fun (x : Fin n → ℝ) => Real.log ↑n + ∑ i : Fin n, x i * Real.log (x i); have d2 := fun (u : Fin m → ℝ) => Real.log ↑m + ∑ j : Fin m, u j * Real.log (u j); StrongConvexOn (standardSimplex n) 1 d1 ∧ StrongConvexOn (standardSimplex m) 1 d2 ∧ IsProxDiameterBound (standardSimplex n) d1 (Real.log ↑n) ∧ IsProxDiameterBound (standardSimplex m) d2 (Real.log ↑m)

Lemma 1.4.1.1. In the setting of Definition 1.4.1.2, with ℓ1 norms ‖x‖₁ = ∑ i |x i| and ‖u‖₁ = ∑ j |u j|, and entropy prox-functions on Δ_n and Δ_m, d1(x) = ln n + ∑ i x^{(i)} ln x^{(i)} and d2(u) = ln m + ∑ j u^{(j)} ln u^{(j)}, the strong convexity parameters satisfy σ1 = σ2 = 1 and the prox-diameters are D1 = ln n and D2 = ln m.

theorem entropySimplex_softmax_den_pos (m : ℕ) (μ : ℝ) (s : Fin m → ℝ) (hm : 0 < m) :

0 < ∑ j : Fin m, Real.exp (s j / μ)

The softmax normalizer ∑ exp(s j / μ) is positive when m > 0.