Documentation

Papers.SmoothMinimization_Nesterov_2004.Sections.section05_part5

theorem Rk_succ_update_modified {E : Type u_1} [NormedAddCommGroup E] [NormedSpace E] [FiniteDimensional E] {Q : Set E} {f d : E} {L σ : } (α : ) (hQ_convex : Convex Q) (hf_convex : ConvexOn Q f) (hf_diff : xQ, DifferentiableAt f x) (hd_diff : xQ, DifferentiableAt d x) (hconv : StrongConvexOn Q σ d) ( : 0 < σ) (hL : 0 < L) (xSeq ySeq : Q) (k : ) (hα_pos : ∀ (k : ), 0 < α k) (hα_sq : ∀ (k : ), α (k + 1) ^ 2 A_k α (k + 1)) (hRk : R_k Q f d L σ α xSeq ySeq k) (hzk_bregman : xQ, L / σ * bregmanDistance d (↑(z_k Q f d L σ α xSeq k)) x + psi_k Q f d L σ α xSeq k L / σ * d x + iFinset.range (k + 1), α i * (f (xSeq i) + DualPairing (↑(fderiv f (xSeq i))) (x - (xSeq i)))) (hVQ_min : IsMinOn (fun (x : E) => L / σ * bregmanDistance d (↑(z_k Q f d L σ α xSeq k)) x + α (k + 1) * DualPairing (↑(fderiv f (xSeq (k + 1)))) (x - (z_k Q f d L σ α xSeq k))) Q (V_Q Q d (z_k Q f d L σ α xSeq k) ((σ / L * α (k + 1)) (fderiv f (xSeq (k + 1)))))) (hfy_bound : have xk1 := (xSeq (k + 1)); have g := (fderiv f (xSeq (k + 1))); have γk := σ / L * α (k + 1); have xhat := (V_Q Q d (z_k Q f d L σ α xSeq k) (γk (fderiv f (xSeq (k + 1))))); A_k α (k + 1) * f (ySeq (k + 1)) A_k α (k + 1) * f xk1 + α (k + 1) * DualPairing g (xhat - (z_k Q f d L σ α xSeq k))) (hx_update : (xSeq (k + 1)) = (α (k + 1) / A_k α (k + 1)) (z_k Q f d L σ α xSeq k) + (1 - α (k + 1) / A_k α (k + 1)) (ySeq k)) :
R_k Q f d L σ α xSeq ySeq (k + 1)

Lemma 1.5.3.1. Let the sequence {α_k} satisfy condition (3.6) and suppose that relation (R_k) holds for some k ≥ 0. Choose γ_k = (σ / L) α_{k+1} and define x_{k+1} = τ_k z_k + (1-τ_k) y_k, \hat x_{k+1} = V_Q(z_k, γ_k ∇ f(x_{k+1})), y_{k+1} = τ_k \hat x_{k+1} + (1-τ_k) y_k (equation (5.5)). Then (R_{k+1}) holds.

def ModifiedOptimalSchemeAlgorithm {E : Type u_1} [NormedAddCommGroup E] [NormedSpace E] [FiniteDimensional E] (Q : Set E) (f d : E) (L σ : ) (xSeq ySeq zSeq xhatSeq : Q) :

Algorithm 1.5.3.1. Choose an initial point y_0 = z_0 = argmin_{x ∈ Q} { (L/σ) d(x) + (1/2) [ f(x_0) + ⟪∇ f(x_0), x - x_0⟫ ] } (equation (5.6_init)). For k ≥ 0, iterate: (a) z_k = argmin_{x ∈ Q} { (L/σ) d(x) + ∑_{i=0}^k (i+1)/2 [ f(x_i) + ⟪∇ f(x_i), x - x_i⟫ ] }; (b) τ_k = 2/(k+3), x_{k+1} = τ_k z_k + (1-τ_k) y_k; (c) \hat x_{k+1} = V_Q(z_k, (σ/L) τ_k ∇ f(x_{k+1})); (d) y_{k+1} = τ_k \hat x_{k+1} + (1-τ_k) y_k (equation (5.6)).

Equations
  • One or more equations did not get rendered due to their size.
Instances For