Documentation

Papers.SmoothMinimization_Nesterov_2004.Sections.section04_part2

theorem weighted_sum_bound_hatu {E1 : Type u_1} {E2 : Type u_2} [NormedAddCommGroup E1] [NormedSpace E1] [FiniteDimensional E1] [NormedAddCommGroup E2] [NormedSpace E2] [FiniteDimensional E2] (Q1 : Set E1) (Q2 : Set E2) (A : E1 →L[] E2 →L[] ) (fhat : E1) (phihat d2 : E2) (μ : ) ( : E1E2) (A' : E1 →ₗ[] Module.Dual E2) (xSeq : Q1) (N : ) (hatu : Q2) (hhatudef : hatu = iFinset.range (N + 1), (2 * (i + 1) / ((N + 1) * (N + 2))) (xSeq i)) (hQ1_open : IsOpen Q1) (hQ1_convex : Convex Q1) (hfhat_convex : ConvexOn Q1 fhat) (hfhat_diff : DifferentiableOn fhat Q1) (hfbar_diff : DifferentiableOn (SmoothedObjective Q2 A phihat d2 μ fhat) Q1) (hphihat_convex : ConvexOn Q2 phihat) (hμ_nonneg : 0 μ) (hd2_nonneg : uQ2, 0 d2 u) (hmax : ∀ (x : E1), IsSmoothedMaximizer Q2 A phihat d2 μ x ( x)) (hderivμ : ∀ (x : E1), fderiv (SmoothedMaxFunction Q2 A phihat d2 μ) x = LinearMap.toContinuousLinearMap ((AdjointOperator A') ( x))) (hpair : ∀ (x : E1) (u : E2), DualPairing ((AdjointOperator A') u) x = (A x) u) (x : E1) :
x Q1iFinset.range (N + 1), (i + 1) * (SmoothedObjective Q2 A phihat d2 μ fhat (xSeq i) + DualPairing (↑(fderiv (SmoothedObjective Q2 A phihat d2 μ fhat) (xSeq i))) (x - (xSeq i))) (N + 1) * (N + 2) / 2 * (fhat x + (A x) hatu - phihat hatu)

Weighted linearization sum bound using the averaged dual point.

theorem mu_sq_identity (D1 D2 σ1 σ2 μ Aop : ) (N : ) (hμ_def : μ = 2 * Aop / (N + 1) * (D1 / (σ1 * σ2 * D2))) (hσ1 : 0 < σ1) (hσ2 : 0 < σ2) (hD1 : 0 D1) (hD2 : 0 D2) (hD2ne : D2 0) :
μ ^ 2 * (σ1 * σ2 * D2) * (N + 1) ^ 2 = 4 * Aop ^ 2 * D1

If μ is chosen as in the smoothing rule, its square satisfies the scaling identity.

theorem mu_mul_D2_sqrt (D1 D2 σ1 σ2 μ Aop : ) (N : ) (hμ_def : μ = 2 * Aop / (N + 1) * (D1 / (σ1 * σ2 * D2))) (hD2 : 0 D2) (hσ1 : 0 < σ1) (hσ2 : 0 < σ2) :
μ * D2 = 2 * Aop / (N + 1) * (D1 * D2 / (σ1 * σ2))

If μ is chosen as in the smoothing rule, then μ * D2 simplifies to the sqrt expression.

theorem mu_inv_term_eq (D1 D2 σ1 σ2 μ Aop : ) (N : ) (hμ_def : μ = 2 * Aop / (N + 1) * (D1 / (σ1 * σ2 * D2))) (hσ1 : 0 < σ1) (hσ2 : 0 < σ2) (hD1 : 0 D1) (hD2 : 0 D2) ( : μ 0) :
4 * (1 / (μ * σ2)) * Aop ^ 2 * D1 / (σ1 * (N + 1) * (N + 2)) = μ * D2 * (N + 1) / (N + 2)

The inverse-μ term equals μ * D2 scaled by (N+1)/(N+2).

theorem mu_inv_term_le (D1 D2 σ1 σ2 μ Aop : ) (N : ) (hμ_def : μ = 2 * Aop / (N + 1) * (D1 / (σ1 * σ2 * D2))) (hσ1 : 0 < σ1) (hσ2 : 0 < σ2) (hD1 : 0 D1) (hD2 : 0 D2) (hAop : 0 Aop) :
4 * (1 / (μ * σ2)) * Aop ^ 2 * D1 / (σ1 * (N + 1) * (N + 2)) 2 * Aop / (N + 1) * (D1 * D2 / (σ1 * σ2))

The inverse-μ term is bounded by the sqrt expression.

theorem mu_simplify_bound (D1 D2 σ1 σ2 μ : ) (N : ) (Aop : ) (hμ_def : μ = 2 * Aop / (N + 1) * (D1 / (σ1 * σ2 * D2))) (hD1 : 0 D1) (hD2 : 0 D2) (hσ1 : 0 < σ1) (hσ2 : 0 < σ2) (hAop : 0 Aop) :
μ * D2 + 4 * (1 / (μ * σ2)) * Aop ^ 2 * D1 / (σ1 * (N + 1) * (N + 2)) 4 * Aop / (N + 1) * (D1 * D2 / (σ1 * σ2))

Simplify the μ-dependent terms in the gap bound.

theorem epsilon_bound_from_gap {a b ε : } {N : } ( : 0 < ε) (ha : 0 a) (hb : 0 b) (hN : N + 1 a / ε + (b / ε)) :
a / (N + 1) + b / (N + 1) ^ 2 ε

Convert an iteration lower bound to an ε-accuracy bound.