Documentation

Papers.SmoothMinimization_Nesterov_2004.Sections.section04_part1

noncomputable def SmoothedObjective {E1 : Type u_1} {E2 : Type u_2} [NormedAddCommGroup E1] [NormedSpace ℝ E1] [NormedAddCommGroup E2] [NormedSpace ℝ E2] (Q2 : Set E2) (A : E1 →L[ℝ] E2 →L[ℝ] ℝ) (phihat d2 : E2 → ℝ) (μ : ℝ) (fhat : E1 → ℝ) :
E1 → ℝ

Definition 1.4.1. Assume fhat : E1 → ℝ is convex and continuously differentiable on Q1, and its gradient is Lipschitz on Q1 with constant M ≥ 0 in the dual norm: ‖∇ fhat x - ∇ fhat y‖_{1,*} ≤ M ‖x - y‖_1 for all x, y ∈ Q1. Let f_μ be the smoothed function from (2.5). Define the smoothed objective \bar f_μ(x) = fhat x + f_μ x and consider min_{x ∈ Q1} \bar f_μ(x) (equation (4.1)).
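Written out in the paper's notation (a transcription of (2.5) and (4.1), with ⟨·,·⟩ the duality pairing):

```latex
f_\mu(x) \;=\; \max_{u \in Q_2}\bigl\{\langle A x, u\rangle - \hat\phi(u) - \mu\, d_2(u)\bigr\},
\qquad
\min_{x \in Q_1}\; \bar f_\mu(x) \;=\; \min_{x \in Q_1}\; \bigl[\hat f(x) + f_\mu(x)\bigr]. \tag{4.1}
```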

Equations
Instances For
    theorem lipschitzOnWith_add {α : Type u_1} {β : Type u_2} [PseudoMetricSpace α] [NormedAddCommGroup β] {K₁ K₂ : NNReal} {f g : α → β} {s : Set α} (hf : LipschitzOnWith K₁ f s) (hg : LipschitzOnWith K₂ g s) :
    LipschitzOnWith (K₁ + K₂) (fun (x : α) => f x + g x) s

    LipschitzOnWith is preserved under pointwise addition.
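    The constant K₁ + K₂ is forced by the triangle inequality; for x, y ∈ s:

    ```latex
    \|(f+g)(x) - (f+g)(y)\| \;\le\; \|f(x) - f(y)\| + \|g(x) - g(y)\| \;\le\; (K_1 + K_2)\, d(x, y).
    ```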

    theorem smoothedMaxFunction_fderiv_lipschitzOn {E1 : Type u_1} {E2 : Type u_2} [NormedAddCommGroup E1] [NormedSpace ℝ E1] [FiniteDimensional ℝ E1] [NormedAddCommGroup E2] [NormedSpace ℝ E2] [FiniteDimensional ℝ E2] (Q1 : Set E1) (Q2 : Set E2) (A : E1 →L[ℝ] E2 →L[ℝ] ℝ) (phihat d2 : E2 → ℝ) (μ σ2 : ℝ) (hμ : 0 < μ) (hσ2 : 0 < σ2) (hconv : StrongConvexOn Q2 σ2 d2) (uμ : E1 → E2) (hmax : ∀ (x : E1), IsSmoothedMaximizer Q2 A phihat d2 μ x (uμ x)) :
    have fμ := SmoothedMaxFunction Q2 A phihat d2 μ; have A' := { toFun := fun (x : E1) => (A x), map_add' := ⋯, map_smul' := ⋯ }; (∀ (x : E1), HasFDerivAt (fun (y : E1) => (A y) (uμ y) - phihat (uμ y) - μ * d2 (uμ y)) (LinearMap.toContinuousLinearMap ((AdjointOperator A') (uμ x))) x) → (∀ (x1 x2 : E1), μ * σ2 * ‖uμ x1 - uμ x2‖ ^ 2 ≤ DualPairing ((AdjointOperator A') (uμ x1 - uμ x2)) (x1 - x2)) → LipschitzOnWith (1 / (μ * σ2) * OperatorNormDef A' ^ 2).toNNReal (fun (x : E1) => fderiv ℝ fμ x) Q1

    Lipschitz bound for the derivative of the smoothed max-function on a set.
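    A sketch of where the constant 1/(μσ₂)·‖A‖² comes from, following Theorem 1 of the paper. Writing uᵢ = u_μ(xᵢ) for the maximizers and using ∇f_μ(x) = A* u_μ(x), strong convexity of d₂ and the definition of the operator norm give

    ```latex
    \mu\sigma_2\,\|u_1 - u_2\|_2^2
    \;\le\; \langle A^{*}(u_1 - u_2),\, x_1 - x_2\rangle
    \;\le\; \|A\|_{1,2}\, \|u_1 - u_2\|_2\, \|x_1 - x_2\|_1,
    ```

    hence ‖u₁ − u₂‖₂ ≤ (‖A‖_{1,2}/(μσ₂))‖x₁ − x₂‖₁, and therefore ‖∇f_μ(x₁) − ∇f_μ(x₂)‖_{1,*} ≤ ‖A‖_{1,2}‖u₁ − u₂‖₂ ≤ (1/(μσ₂))‖A‖_{1,2}²‖x₁ − x₂‖₁.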

    theorem fderiv_add_on {E : Type u_1} [NormedAddCommGroup E] [NormedSpace ℝ E] {f g : E → ℝ} {s : Set E} (hf : ∀ x ∈ s, DifferentiableAt ℝ f x) (hg : ∀ x ∈ s, DifferentiableAt ℝ g x) (x : E) :
    x ∈ s → fderiv ℝ (fun (y : E) => f y + g y) x = fderiv ℝ f x + fderiv ℝ g x

    On a set, the derivative of a sum is the sum of derivatives.

    theorem toNNReal_add_nonneg {a b : ℝ} (ha : 0 ≤ a) (hb : 0 ≤ b) :
    (a + b).toNNReal = a.toNNReal + b.toNNReal

    Real.toNNReal preserves addition on nonnegative inputs.

    theorem smoothedObjective_lipschitz_gradient {E1 : Type u_1} {E2 : Type u_2} [NormedAddCommGroup E1] [NormedSpace ℝ E1] [FiniteDimensional ℝ E1] [NormedAddCommGroup E2] [NormedSpace ℝ E2] [FiniteDimensional ℝ E2] (Q1 : Set E1) (Q2 : Set E2) (A : E1 →L[ℝ] E2 →L[ℝ] ℝ) (phihat d2 : E2 → ℝ) (μ σ2 M : ℝ) (fhat : E1 → ℝ) (uμ : E1 → E2) (hμ : 0 < μ) (hσ2 : 0 < σ2) (hM : 0 ≤ M) (hconv : StrongConvexOn Q2 σ2 d2) (hFhatDiff : ∀ x ∈ Q1, DifferentiableAt ℝ fhat x) (hLipschitz : LipschitzOnWith M.toNNReal (fun (x : E1) => fderiv ℝ fhat x) Q1) :
    have fbar := SmoothedObjective Q2 A phihat d2 μ fhat; have A' := { toFun := fun (x : E1) => (A x), map_add' := ⋯, map_smul' := ⋯ }; (∀ (x : E1), IsSmoothedMaximizer Q2 A phihat d2 μ x (uμ x)) → (∀ (x : E1), HasFDerivAt (fun (y : E1) => (A y) (uμ y) - phihat (uμ y) - μ * d2 (uμ y)) (LinearMap.toContinuousLinearMap ((AdjointOperator A') (uμ x))) x) → (∀ (x1 x2 : E1), μ * σ2 * ‖uμ x1 - uμ x2‖ ^ 2 ≤ DualPairing ((AdjointOperator A') (uμ x1 - uμ x2)) (x1 - x2)) → ∃ (Lμ : ℝ), Lμ = M + 1 / (μ * σ2) * OperatorNormDef A' ^ 2 ∧ LipschitzOnWith Lμ.toNNReal (fun (x : E1) => fderiv ℝ fbar x) Q1

    Proposition 1.4.1. Under the assumptions of Definition 1.4.1, the function \bar f_μ has Lipschitz continuous gradient on Q1 with Lipschitz constant L_μ := M + (1/(μ σ2)) ‖A‖_{1,2}^2 (equation (4.2)), where σ2 is the strong convexity parameter of the prox-function d2 on Q2.
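    The constant is simply the sum of the two gradient Lipschitz constants: ∇f̄_μ = ∇f̂ + ∇f_μ by fderiv_add_on, and lipschitzOnWith_add adds the constants:

    ```latex
    \|\nabla \bar f_\mu(x) - \nabla \bar f_\mu(y)\|_{1,*}
    \;\le\; \Bigl(M + \tfrac{1}{\mu\sigma_2}\,\|A\|_{1,2}^2\Bigr)\, \|x - y\|_1
    \;=\; L_\mu\,\|x - y\|_1. \tag{4.2}
    ```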

    def IsProxDiameterBound {E1 : Type u_1} [NormedAddCommGroup E1] [NormedSpace ℝ E1] (Q1 : Set E1) (d1 : E1 → ℝ) (D1 : ℝ) :
    Prop

    Definition 1.4.2. Let Q1 ⊆ E1 be bounded, closed, and convex, and let d1 be a prox-function on Q1, meaning it is continuous and σ1-strongly convex on Q1 for some σ1 > 0 with respect to ‖·‖_1. Assume the (finite) prox-diameter bound D1 satisfies max_{x ∈ Q1} d1 x ≤ D1 < +∞ (equation (4.3)).
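    As an illustration (not part of the formalization): if ‖·‖₁ is Euclidean and x̄ ∈ Q1 is any center, a standard prox-function is

    ```latex
    d_1(x) \;=\; \tfrac12\,\|x - \bar x\|_1^2,
    ```

    which is 1-strongly convex (σ₁ = 1), and D₁ may be taken as max_{x ∈ Q1} ½‖x − x̄‖₁², finite because Q1 is bounded.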

    Equations
    Instances For
      theorem smoothedMaxFunction_linearization {E1 : Type u_1} {E2 : Type u_2} [NormedAddCommGroup E1] [NormedSpace ℝ E1] [FiniteDimensional ℝ E1] [NormedAddCommGroup E2] [NormedSpace ℝ E2] [FiniteDimensional ℝ E2] (Q2 : Set E2) (A : E1 →L[ℝ] E2 →L[ℝ] ℝ) (phihat d2 : E2 → ℝ) (μ : ℝ) (uμ : E1 → E2) (A' : E1 →ₗ[ℝ] Module.Dual ℝ E2) (hmax : ∀ (x : E1), IsSmoothedMaximizer Q2 A phihat d2 μ x (uμ x)) (hderiv : ∀ (x : E1), fderiv ℝ (SmoothedMaxFunction Q2 A phihat d2 μ) x = LinearMap.toContinuousLinearMap ((AdjointOperator A') (uμ x))) (hpair : ∀ (x : E1) (u : E2), DualPairing ((AdjointOperator A') u) x = (A x) u) (x : E1) :
      SmoothedMaxFunction Q2 A phihat d2 μ x - DualPairing (↑(fderiv ℝ (SmoothedMaxFunction Q2 A phihat d2 μ) x)) x = -phihat (uμ x) - μ * d2 (uμ x)

      Linearization identity for the smoothed max-function under an adjoint derivative formula.
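      The identity is just evaluation at the maximizer: with u = u_μ(x),

      ```latex
      f_\mu(x) \;=\; \langle A x, u\rangle - \hat\phi(u) - \mu\, d_2(u),
      \qquad
      \langle \nabla f_\mu(x),\, x\rangle \;=\; \langle A^{*} u,\, x\rangle \;=\; \langle A x,\, u\rangle,
      ```

      so subtracting the second expression from the first cancels ⟨Ax, u⟩ and leaves −φ̂(u) − μ d₂(u).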

      theorem convex_of_strongConvexOn {E : Type u_1} [NormedAddCommGroup E] [NormedSpace ℝ E] {s : Set E} {σ : ℝ} {f : E → ℝ} (h : StrongConvexOn s σ f) :
      Convex ℝ s

      Strong convexity on a set implies the set is convex.

      theorem adjointForm_duality_gap_nonneg {E1 : Type u_1} {E2 : Type u_2} [NormedAddCommGroup E1] [NormedSpace ℝ E1] [FiniteDimensional ℝ E1] [NormedAddCommGroup E2] [NormedSpace ℝ E2] [FiniteDimensional ℝ E2] (Q1 : Set E1) (Q2 : Set E2) (A : E1 →L[ℝ] E2 →L[ℝ] ℝ) (fhat : E1 → ℝ) (phihat : E2 → ℝ) (hbd0 : ∀ (x : E1), BddAbove ((fun (u : E2) => (A x) u - phihat u) '' Q2)) (hbdBelow : ∀ (u : E2), BddBelow ((fun (x : E1) => (A x) u + fhat x) '' Q1)) (x : E1) :
      x ∈ Q1 → ∀ u ∈ Q2, 0 ≤ fhat x + sSup ((fun (u : E2) => (A x) u - phihat u) '' Q2) - AdjointFormPotential Q1 Q2 A fhat phihat u

      Weak duality: f(x) ≥ φ(u) for any x ∈ Q1 and u ∈ Q2.
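      The two-step argument, in the paper's notation (φ is AdjointFormPotential):

      ```latex
      f(x) \;=\; \hat f(x) + \max_{u' \in Q_2}\bigl\{\langle A x, u'\rangle - \hat\phi(u')\bigr\}
      \;\ge\; \hat f(x) + \langle A x, u\rangle - \hat\phi(u)
      \;\ge\; -\hat\phi(u) + \min_{x' \in Q_1}\bigl\{\langle A x', u\rangle + \hat f(x')\bigr\}
      \;=\; \phi(u).
      ```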

      theorem smoothedObjective_bound {E1 : Type u_1} {E2 : Type u_2} [NormedAddCommGroup E1] [NormedSpace ℝ E1] [NormedAddCommGroup E2] [NormedSpace ℝ E2] (Q2 : Set E2) (A : E1 →L[ℝ] E2 →L[ℝ] ℝ) (phihat d2 : E2 → ℝ) (μ : ℝ) (hμ : 0 ≤ μ) (hd2_nonneg : ∀ u ∈ Q2, 0 ≤ d2 u) (hbdd0 : ∀ (x : E1), BddAbove ((fun (u : E2) => (A x) u - phihat u) '' Q2)) (hbdd_d2 : BddAbove (d2 '' Q2)) (fhat : E1 → ℝ) (x : E1) :
      fhat x + sSup ((fun (u : E2) => (A x) u - phihat u) '' Q2) ≤ SmoothedObjective Q2 A phihat d2 μ fhat x + μ * sSup (d2 '' Q2)

      The smoothed objective upper-bounds the original objective up to μ D2.
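      With D₂ = sup_{u ∈ Q2} d₂(u), the bound follows by splitting each term under the supremum: for every u ∈ Q2,

      ```latex
      \langle A x, u\rangle - \hat\phi(u)
      \;=\; \bigl(\langle A x, u\rangle - \hat\phi(u) - \mu\, d_2(u)\bigr) + \mu\, d_2(u)
      \;\le\; f_\mu(x) + \mu D_2,
      ```

      and taking the supremum over u ∈ Q2, then adding f̂(x) to both sides, gives f(x) ≤ f̄_μ(x) + μD₂.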

      theorem sSup_image_nonneg {α : Type u_1} (s : Set α) (f : α → ℝ) (hbd : BddAbove (f '' s)) (hne : s.Nonempty) (hnonneg : ∀ x ∈ s, 0 ≤ f x) :
      0 ≤ sSup (f '' s)

      The supremum of a nonnegative function on a nonempty set is nonnegative.

      axiom z_k_isMinOn {E : Type u_1} [NormedAddCommGroup E] [NormedSpace ℝ E] [FiniteDimensional ℝ E] (Q : Set E) (f d : E → ℝ) (L σ : ℝ) (α : ℕ → ℝ) (xSeq : ℕ → ↥Q) (k : ℕ) :
      IsMinOn (fun (z : E) => L / σ * d z + ∑ i ∈ Finset.range (k + 1), let xi := ↑(xSeq i); α i * (f xi + DualPairing (↑(fderiv ℝ f xi)) (z - xi))) Q (z_k Q f d L σ α xSeq k)

      z_k is chosen as a minimizer of the auxiliary function defining ψ_k (Definition 1.3.5).
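      In the paper's notation, z_k attains the minimum defining ψ_k:

      ```latex
      \psi_k \;=\; \min_{z \in Q}\Bigl\{ \tfrac{L}{\sigma}\, d(z) + \sum_{i=0}^{k} \alpha_i \bigl[ f(x_i) + \langle \nabla f(x_i),\, z - x_i\rangle \bigr] \Bigr\};
      ```

      this axiom records that the scheme's z_k realizes that minimum over Q.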

      theorem fbar_rate_bound {E : Type u_1} [NormedAddCommGroup E] [NormedSpace ℝ E] [FiniteDimensional ℝ E] {Q : Set E} {f d : E → ℝ} {L σ D : ℝ} (xSeq ySeq : ℕ → ↥Q) (N : ℕ) (hL : 0 < L) (hσ : 0 < σ) (hAlg : OptimalSchemeAlgorithm Q f d L σ xSeq ySeq) (hD : IsProxDiameterBound Q d D) {x : E} (hx : x ∈ Q) :
      f ↑(ySeq N) ≤ 4 * L * D / (σ * ((↑N + 1) * (↑N + 2))) + 2 / ((↑N + 1) * (↑N + 2)) * ∑ i ∈ Finset.range (N + 1), (↑i + 1) * (f ↑(xSeq i) + DualPairing (↑(fderiv ℝ f ↑(xSeq i))) (x - ↑(xSeq i)))

      Rate bound for the optimal scheme after bounding the prox term by D.
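      Combining this with the supporting-hyperplane inequality f(xᵢ) + ⟨∇f(xᵢ), x − xᵢ⟩ ≤ f(x) (convex_support_grad) and the closed form ∑_{i=0}^N (i+1) = (N+1)(N+2)/2 turns the weighted sum into f(x), giving the familiar O(1/N²) rate:

      ```latex
      f(y_N) - f(x) \;\le\; \frac{4\, L\, D}{\sigma\,(N+1)(N+2)}.
      ```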

      theorem sum_range_add_one_cast (N : ℕ) :
      ∑ i ∈ Finset.range (N + 1), ((i : ℝ) + 1) = ((N : ℝ) + 1) * ((N : ℝ) + 2) / 2

      Closed form: ∑_{i=0}^N ((i:ℝ)+1) = ((N:ℝ)+1)((N:ℝ)+2)/2.

      theorem convex_support_grad {E : Type u_1} [NormedAddCommGroup E] [NormedSpace ℝ E] [FiniteDimensional ℝ E] {Q : Set E} {f : E → ℝ} (hQ_open : IsOpen Q) (hQ_convex : Convex ℝ Q) (hf_convex : ConvexOn ℝ Q f) (hf_diff : DifferentiableOn ℝ f Q) {u v : E} (hu : u ∈ Q) (hv : v ∈ Q) :
      f u ≤ f v + DualPairing (↑(fderiv ℝ f v)) (u - v)

      Supporting hyperplane inequality for a differentiable convex function on an open convex set.