Documentation

Papers.SmoothMinimization_Nesterov_2004.Sections.section04_part10

theorem Section04Part10.image_neg_eq_smul (s : Set ) :
(fun (x : ) => -x) '' s = -1 s

Image under negation is scalar multiplication by -1 on .

theorem Section04Part10.sSup_image_neg_eq_neg_sInf (s : Set ) :
sSup ((fun (x : ) => -x) '' s) = -sInf s

Push a negation through sSup for subsets of .

theorem Section04Part10.sInf_image_neg_eq_neg_sSup (s : Set ) :
sInf ((fun (x : ) => -x) '' s) = -sSup s

Push a negation through sInf for subsets of .

theorem Section04Part10.variationalInequality.minimax_eq_of_saddlePoint {E : Type u_1} [NormedAddCommGroup E] [NormedSpace E] (Q : Set E) (B : EModule.Dual E) {w0 : E} (hw0 : w0 Q) (hgap_le : vQ, (B v) (w0 - v) 0) (hgap_ge : wQ, 0 (B w0) (w - w0)) (hbdAbove : wQ, BddAbove ((fun (v : E) => (B v) (w - v)) '' Q)) (hbdBelow : vQ, BddBelow ((fun (w : E) => (B v) (w - v)) '' Q)) :
sInf ((fun (w : E) => sSup ((fun (v : E) => (B v) (w - v)) '' Q)) '' Q) = sSup ((fun (v : E) => sInf ((fun (w : E) => (B v) (w - v)) '' Q)) '' Q)

A saddle point with bounded images forces the minimax equality by showing both sides are zero.

theorem Section04Part10.variationalInequality.minimax_eq_of_exists_VI_solution {E : Type u_1} [NormedAddCommGroup E] [NormedSpace E] [FiniteDimensional E] (Q : Set E) ( : E →ₗ[] Module.Dual E) (b : Module.Dual E) (hmono : ∀ (h : E), 0 ( h) h) (hex : ∃ (w0 : E), IsVariationalInequalitySolution Q b w0) (hbdAbove : wQ, BddAbove ((fun (v : E) => ( v + b) (w - v)) '' Q)) (hbdBelow : vQ, BddBelow ((fun (w : E) => ( v + b) (w - v)) '' Q)) :
sInf ((fun (w : E) => sSup ((fun (v : E) => ( v + b) (w - v)) '' Q)) '' Q) = sSup ((fun (v : E) => sInf ((fun (w : E) => ( v + b) (w - v)) '' Q)) '' Q)

A VI solution yields a saddle point for the gap payoff, hence the minimax equality.

theorem Section04Part10.variationalInequality_sign_flip {E : Type u_1} [NormedAddCommGroup E] [NormedSpace E] (Q : Set E) (B : EModule.Dual E) :
sSup ((fun (v : E) => sInf ((fun (w : E) => (B v) (w - v)) '' Q)) '' Q) = -sInf ((fun (v : E) => sSup ((fun (w : E) => (B v) (v - w)) '' Q)) '' Q)

Reformulate the inner max by flipping the sign.

theorem Section04Part10.variationalInequality_dual_maxStructure {E : Type u_1} [NormedAddCommGroup E] [NormedSpace E] [FiniteDimensional E] (Q : Set E) ( : E →L[] E →L[] ) (b : Module.Dual E) (d : E) (μ σ1 D1 ε : ) :
have ℬ' := { toFun := fun (x : E) => ( x), map_add' := , map_smul' := }; have B := fun (v : E) => ℬ' v + b; have phihat := fun (u : E) => b u; have := SmoothedMaxFunction Q phihat d μ; ((∀ (h : E), 0 (ℬ' h) h)(∃ (w0 : E), IsVariationalInequalitySolution Q ℬ' b w0)(∀ wQ, BddAbove ((fun (v : E) => (B v) (w - v)) '' Q))(∀ vQ, BddBelow ((fun (w : E) => (B v) (w - v)) '' Q))sInf ((fun (w : E) => sSup ((fun (v : E) => (B v) (w - v)) '' Q)) '' Q) = sSup ((fun (v : E) => sInf ((fun (w : E) => (B v) (w - v)) '' Q)) '' Q)) sSup ((fun (v : E) => sInf ((fun (w : E) => (B v) (w - v)) '' Q)) '' Q) = -sInf ((fun (v : E) => sSup ((fun (w : E) => (B v) (v - w)) '' Q)) '' Q) (∀ (x : E), x = sSup ((fun (u : E) => ( x) u - μ * d u - b u) '' Q)) (0 < ε0 < σ1StrongConvexOn Q σ1 dIsProxDiameterBound Q d D10 D1∃ (N : ), N 4 * D1 * OperatorNormDef ℬ' / (σ1 * ε) + (D1 * OperatorNormDef ℬ' / (σ1 * ε)))

Proposition 1.4.3.3. A dual max-structure representation of (4.19) comes from min_{w ∈ Q} max_{v ∈ Q} ⟪B(v), w - v⟫_1 = max_{v ∈ Q} min_{w ∈ Q} ⟪B(v), w - v⟫_1 = - min_{v ∈ Q} max_{w ∈ Q} ⟪B(v), v - w⟫_1. Equivalently, take E1 = E2 = E, Q1 = Q2 = Q, d1 = d2 = d, A = ℬ, fhat(x) = ⟪b, x⟫_1 + ⟪ℬ x, x⟫_1, and phihat(u) = ⟪b, u⟫_1. Then f_μ(x) = max_{u ∈ Q} {⟪ℬ x, u⟫_1 - μ d(u) - ⟪b, u⟫_1}. In this dual representation M = ‖ℬ‖_{1,2}, and the complexity bound corresponding to (4.21) is N ≤ (4 D1 ‖ℬ‖_{1,2})/(σ1 ε) + sqrt(D1 ‖ℬ‖_{1,2}/(σ1 ε)).

noncomputable def Section04Part10.piecewiseLinearMinimizationOptimalValue {E1 : Type u_1} [NormedAddCommGroup E1] [NormedSpace E1] (m : ) (Q1 : Set E1) (a : Fin mE1 →L[] ) (b : Fin m) :

Definition 1.4.4.1. Let a_j ∈ E1^* and b^{(j)} ∈ ℝ for j = 1, ..., m. Consider the piecewise linear minimization problem min_{x ∈ Q1} f(x), where f(x) := max_{1 ≤ j ≤ m} |⟪a_j, x⟫_1 - b^{(j)}| (4.22).

Equations
Instances For
    theorem Section04Part10.piecewiseLinear_entropy_rbar_nonneg (n : ) (Q1 : Set (Fin n)) :
    0 sSup ((fun (x : Fin n) => x) '' Q1)

    The radius defined as a supremum of norms is nonnegative.

    theorem Section04Part10.piecewiseLinear_entropy_complexity_const_nonneg (n m : ) (Q1 : Set (Fin n)) (hatA' : (Fin n) →ₗ[] Module.Dual (Fin (2 * m))) {ε : } ( : 0 < ε) :
    have rbar := sSup ((fun (x : Fin n) => x) '' Q1); have maxAj := OperatorNormDef hatA'; 0 2 * 2 * rbar * maxAj * (Real.log (2 * m)) * (1 / ε)

    The entropy-complexity constant is nonnegative.

    theorem Section04Part10.piecewiseLinear_entropy_complexity_exists_N {C : } (hC : 0 C) :
    ∃ (N : ), N C

    Round a nonnegative constant to a natural upper bound.

    theorem Section04Part10.piecewiseLinear_entropy_complexity (n m : ) (Q1 : Set (Fin n)) (a : Fin mModule.Dual (Fin n)) (b : Fin m) (hatA : (Fin n) →L[] (Fin (2 * m)) →L[] ) (hatb : (Fin (2 * m))) (μ ε : ) ( : 0 < ε) :
    have rbar := sSup ((fun (x : Fin n) => x) '' Q1); have σ1 := 1; have σ2 := 1; have D1 := 1 / 2 * rbar ^ 2; have D2 := Real.log (2 * m); have f := fun (x : Fin n) => sSup ((fun (u : Fin (2 * m)) => (hatA x) u - hatb u) '' standardSimplex (2 * m)); have hatA' := { toFun := fun (x : Fin n) => (hatA x), map_add' := , map_smul' := }; have maxAj := OperatorNormDef hatA'; have ξ := fun (τ : ) => 1 / 2 * (Real.exp τ + Real.exp (-τ)); have fbar := fun (x : Fin n) => μ * Real.log (1 / m * j : Fin m, ξ (((a j) x + b j) / μ)); (∀ (x : Fin n), f x = sSup ((fun (u : Fin (2 * m)) => (hatA x) u - hatb u) '' standardSimplex (2 * m))) σ1 = 1 σ2 = 1 D1 = 1 / 2 * rbar ^ 2 D2 = Real.log (2 * m) OperatorNormDef hatA' = maxAj (∃ (N : ), N 2 * 2 * rbar * maxAj * (Real.log (2 * m)) * (1 / ε)) ∀ (x : Fin n), fbar x = μ * Real.log (1 / m * j : Fin m, ξ (((a j) x + b j) / μ))

    Proposition 1.4.4.1. In the setting of Definition 1.4.4.1, take E1 = ℝ^n with Euclidean norm, the prox-function d1(x) = (1/2) ‖x‖_1^2, and rbar = max_{x ∈ Q1} ‖x‖_1. Let E2 = ℝ^{2m} with simplex Δ_{2m} and define hatA = (A; -A) and hatb = (b; -b) from rows a_j and scalars b^{(j)}. Then f(x) = max_{u ∈ Δ_{2m}} {⟪hatA x, u⟫_2 - ⟪hatb, u⟫_2} (4.19). With the entropy prox-function d2(u) = ln(2m) + ∑ u^{(j)} ln u^{(j)}, we have σ1 = σ2 = 1, D1 = (1/2) rbar^2, D2 = ln(2m), and ‖hatA‖_{1,2} = max_{1≤j≤m} ‖a_j‖_{1,*}. Therefore the complexity estimate (4.4) gives N ≤ 2*sqrt 2 rbar (max_{1≤j≤m} ‖a_j‖_{1,*}) sqrt(ln(2m)) / ε iterations for (4.22), and the smooth approximation has the explicit form bar f_μ(x) = μ ln((1/m) ∑_{j=1}^m ξ((1/μ) [⟪a_j, x⟫_1 + b^{(j)}])) with ξ(τ) = (1/2) (e^τ + e^{-τ}) (4.22_smooth).

    noncomputable def Section04Part10.piecewiseLinearSumAbsOptimalValue {E1 : Type u_1} [NormedAddCommGroup E1] [NormedSpace E1] (m : ) (Q1 : Set E1) (a : Fin mE1 →L[] ) (b : Fin m) :

    Definition 1.4.4.2. Let a_j ∈ E1^* and b^{(j)} ∈ ℝ for j = 1, ..., m. Consider minimizing the sum of absolute values (4.23): min_{x ∈ Q1} f(x) with f(x) := ∑_{j=1}^m |⟪a_j, x⟫_1 - b^{(j)}|.

    Equations
    Instances For
      theorem Section04Part10.sumAbs_smoothed_unfold {E1 : Type u_1} [NormedAddCommGroup E1] [NormedSpace E1] (m : ) (a : Fin mE1 →L[] ) (b : Fin m) (A : E1 →L[] (Fin m) →L[] ) (phihat : (Fin m)) (μ : ) (hA : ∀ (x : E1) (u : Fin m), (A x) u = j : Fin m, u j * (a j) x) (hphihat : ∀ (u : Fin m), phihat u = j : Fin m, b j * u j) (x : E1) :
      have Q2 := {u : Fin m | ∀ (j : Fin m), |u j| 1}; have d2 := fun (u : Fin m) => 1 / 2 * j : Fin m, a j * u j ^ 2; have := SmoothedMaxFunction Q2 A phihat d2 μ; x = sSup ((fun (u : Fin m) => j : Fin m, (u j * ((a j) x - b j) - 1 / 2 * μ * a j * u j ^ 2)) '' Q2)

      Unfold the smoothed max for the sum-of-abs objective into a separable quadratic sum.

      theorem Section04Part10.sumAbs_sSup_separable {E1 : Type u_1} [NormedAddCommGroup E1] [NormedSpace E1] (m : ) (a : Fin mE1 →L[] ) (b : Fin m) (μ : ) (x : E1) :
      have Q2 := {u : Fin m | ∀ (j : Fin m), |u j| 1}; sSup ((fun (u : Fin m) => j : Fin m, (u j * ((a j) x - b j) - 1 / 2 * μ * a j * u j ^ 2)) '' Q2) = j : Fin m, sSup ((fun (t : ) => t * ((a j) x - b j) - 1 / 2 * μ * a j * t ^ 2) '' Set.Icc (-1) 1)

      Split the separable quadratic supremum over Q2 into coordinatewise suprema.

      theorem Section04Part10.sSup_quadratic_Icc_eq_norm_mul_psi (μ τ c : ) (hc : 0 < c) :
      sSup ((fun (t : ) => t * τ - 1 / 2 * μ * c * t ^ 2) '' Set.Icc (-1) 1) = c * sSup ((fun (γ : ) => γ * (|τ| / c) - 1 / 2 * μ * γ ^ 2) '' Set.Icc 0 1)

      Express the quadratic supremum on [-1,1] via the scaled ψμ on [0,1].

      theorem Section04Part10.sumAbs_operatorNorm_bound {E1 : Type u_1} [NormedAddCommGroup E1] [NormedSpace E1] [FiniteDimensional E1] (m : ) (a : Fin mE1 →L[] ) (A : E1 →L[] (Fin m) →L[] ) (hA : ∀ (x : E1) (u : Fin m), (A x) u = j : Fin m, u j * (a j) x) :
      have D := j : Fin m, a j; have A' := { toFun := fun (x : E1) => (A x), map_add' := , map_smul' := }; OperatorNormDef A' D

      A crude operator-norm bound using the supremum norm on Fin m → ℝ.

      theorem Section04Part10.sumAbs_complexity_exists_N (D D1 σ1 ε : ) ( : 0 < ε) (hD : 0 D) :
      ∃ (N : ), N 2 / ε * (2 * D1 / σ1) * D

      Produce a natural bound from the explicit complexity constant.

      theorem Section04Part10.piecewiseLinear_sumAbs_smoothedApprox {E1 : Type u_1} [NormedAddCommGroup E1] [NormedSpace E1] [FiniteDimensional E1] (m : ) (a : Fin mE1 →L[] ) (b : Fin m) (A : E1 →L[] (Fin m) →L[] ) (phihat : (Fin m)) (μ σ1 D1 ε : ) ( : 0 < ε) (hpos : ∀ (j : Fin m), 0 < a j) (hA : ∀ (x : E1) (u : Fin m), (A x) u = j : Fin m, u j * (a j) x) (hphihat : ∀ (u : Fin m), phihat u = j : Fin m, b j * u j) :
      have Q2 := {u : Fin m | ∀ (j : Fin m), |u j| 1}; have d2 := fun (u : Fin m) => 1 / 2 * j : Fin m, a j * u j ^ 2; have := SmoothedMaxFunction Q2 A phihat d2 μ; have ψμ := fun (τ : ) => sSup ((fun (γ : ) => γ * τ - 1 / 2 * μ * γ ^ 2) '' Set.Icc 0 1); have D := j : Fin m, a j; have D2 := 1 / 2 * D; have A' := { toFun := fun (x : E1) => (A x), map_add' := , map_smul' := }; (∀ (x : E1), x = j : Fin m, a j * ψμ (|(a j) x - b j| / a j)) D2 = 1 / 2 * D OperatorNormDef A' D ∃ (N : ), N 2 / ε * (2 * D1 / σ1) * D

      Proposition 1.4.4.2. In the setting of Definition 1.4.4.2, let A be the matrix with rows a_j and take E2 = ℝ^m, Q2 = {u : |u^{(j)}| ≤ 1}, and d2(u) = (1/2) ‖u‖_2^2 = (1/2) ∑ ‖a_j‖_{1,*} (u^{(j)})^2. Then the smoothed approximation has the form f_μ(x) = max_{u ∈ Q2} {⟪A x - b, u⟫_2 - μ d2(u)} = ∑_{j=1}^m ‖a_j‖_{1,*} ψ_μ(|⟪a_j,x⟫_1 - b^{(j)}| / ‖a_j‖_{1,*}), where ψ_μ is the function in (4.17). Moreover, with D = ∑ ‖a_j‖_{1,*} we have D2 = (1/2) D, σ2 = 1, and ‖A‖_{1,2} ≤ D^{1/2}. Therefore, in the case M = 0, N ≤ (2/ε) * sqrt(2 D1/σ1) * ∑ ‖a_j‖_{1,*} (equation (4.24)).