Documentation

Papers.SmoothMinimization_Nesterov_2004.Sections.section04_part1

noncomputable def SmoothedObjective {E1 : Type u_1} {E2 : Type u_2} [NormedAddCommGroup E1] [NormedSpace ℝ E1] [NormedAddCommGroup E2] [NormedSpace ℝ E2] (Q2 : Set E2) (A : E1 →L[ℝ] E2 →L[ℝ] ℝ) (phihat d2 : E2 → ℝ) (μ : ℝ) (fhat : E1 → ℝ) :
E1 → ℝ

Definition 1.4.1. Assume fhat : E1 → ℝ is convex and continuously differentiable on Q1, and its gradient is Lipschitz on Q1 with constant M ≥ 0 in the dual norm: ‖∇ fhat x - ∇ fhat y‖_{1,*} ≤ M ‖x - y‖_1 for all x, y ∈ Q1. Let f_μ be the smoothed function from (2.5). Define the smoothed objective \bar f_μ(x) = fhat x + f_μ x and consider min_{x ∈ Q1} \bar f_μ(x) (equation (4.1)).
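Written out in the paper's notation (a transcription of (2.5) and (4.1), with ⟨·,·⟩ the duality pairing):

```latex
f_\mu(x) \;=\; \max_{u \in Q_2}\bigl\{\langle A x, u\rangle - \hat\phi(u) - \mu\, d_2(u)\bigr\},
\qquad
\min_{x \in Q_1}\; \bar f_\mu(x) \;=\; \min_{x \in Q_1}\; \bigl[\hat f(x) + f_\mu(x)\bigr]. \tag{4.1}
```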

Equations
Instances For
    theorem lipschitzOnWith_add {α : Type u_1} {β : Type u_2} [PseudoMetricSpace α] [NormedAddCommGroup β] {K₁ K₂ : NNReal} {f g : α → β} {s : Set α} (hf : LipschitzOnWith K₁ f s) (hg : LipschitzOnWith K₂ g s) :
    LipschitzOnWith (K₁ + K₂) (fun (x : α) => f x + g x) s

    LipschitzOnWith is preserved under pointwise addition.
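    The constant K₁ + K₂ is forced by the triangle inequality; for x, y ∈ s:

    ```latex
    \|(f+g)(x) - (f+g)(y)\| \;\le\; \|f(x) - f(y)\| + \|g(x) - g(y)\| \;\le\; (K_1 + K_2)\, d(x, y).
    ```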

    theorem smoothedMaxFunction_fderiv_lipschitzOn {E1 : Type u_1} {E2 : Type u_2} [NormedAddCommGroup E1] [NormedSpace ℝ E1] [FiniteDimensional ℝ E1] [NormedAddCommGroup E2] [NormedSpace ℝ E2] [FiniteDimensional ℝ E2] (Q1 : Set E1) (Q2 : Set E2) (A : E1 →L[ℝ] E2 →L[ℝ] ℝ) (phihat d2 : E2 → ℝ) (μ σ2 : ℝ) (hμ : 0 < μ) (hσ2 : 0 < σ2) (hconv : StrongConvexOn Q2 σ2 d2) (uμ : E1 → E2) (hmax : ∀ (x : E1), IsSmoothedMaximizer Q2 A phihat d2 μ x (uμ x)) :
    have fμ := SmoothedMaxFunction Q2 A phihat d2 μ; have A' := { toFun := fun (x : E1) => (A x), map_add' := ⋯, map_smul' := ⋯ }; (∀ (x : E1), HasFDerivAt (fun (y : E1) => (A y) (uμ y) - phihat (uμ y) - μ * d2 (uμ y)) (LinearMap.toContinuousLinearMap ((AdjointOperator A') (uμ x))) x) → (∀ (x1 x2 : E1), μ * σ2 * ‖uμ x1 - uμ x2‖ ^ 2 ≤ DualPairing ((AdjointOperator A') (uμ x1 - uμ x2)) (x1 - x2)) → LipschitzOnWith (1 / (μ * σ2) * OperatorNormDef A' ^ 2).toNNReal (fun (x : E1) => fderiv ℝ fμ x) Q1

    Lipschitz bound for the derivative of the smoothed max-function on a set.
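    A sketch of where the constant 1/(μσ₂)·‖A‖² comes from, following Theorem 1 of the paper. Writing uᵢ = u_μ(xᵢ) for the maximizers and using ∇f_μ(x) = A* u_μ(x), strong convexity of d₂ and the definition of the operator norm give

    ```latex
    \mu\sigma_2\,\|u_1 - u_2\|_2^2
    \;\le\; \langle A^{*}(u_1 - u_2),\, x_1 - x_2\rangle
    \;\le\; \|A\|_{1,2}\, \|u_1 - u_2\|_2\, \|x_1 - x_2\|_1,
    ```

    hence ‖u₁ − u₂‖₂ ≤ (‖A‖_{1,2}/(μσ₂))‖x₁ − x₂‖₁, and therefore ‖∇f_μ(x₁) − ∇f_μ(x₂)‖_{1,*} ≤ ‖A‖_{1,2}‖u₁ − u₂‖₂ ≤ (1/(μσ₂))‖A‖_{1,2}²‖x₁ − x₂‖₁.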

    theorem fderiv_add_on {E : Type u_1} [NormedAddCommGroup E] [NormedSpace ℝ E] {f g : E → ℝ} {s : Set E} (hf : ∀ x ∈ s, DifferentiableAt ℝ f x) (hg : ∀ x ∈ s, DifferentiableAt ℝ g x) (x : E) :
    x ∈ s → fderiv ℝ (fun (y : E) => f y + g y) x = fderiv ℝ f x + fderiv ℝ g x

    On a set, the derivative of a sum is the sum of derivatives.

    theorem toNNReal_add_nonneg {a b : ℝ} (ha : 0 ≤ a) (hb : 0 ≤ b) :
    (a + b).toNNReal = a.toNNReal + b.toNNReal

    Real.toNNReal preserves addition on nonnegative inputs.

    theorem smoothedObjective_lipschitz_gradient {E1 : Type u_1} {E2 : Type u_2} [NormedAddCommGroup E1] [NormedSpace ℝ E1] [FiniteDimensional ℝ E1] [NormedAddCommGroup E2] [NormedSpace ℝ E2] [FiniteDimensional ℝ E2] (Q1 : Set E1) (Q2 : Set E2) (A : E1 →L[ℝ] E2 →L[ℝ] ℝ) (phihat d2 : E2 → ℝ) (μ σ2 M : ℝ) (fhat : E1 → ℝ) (uμ : E1 → E2) (hμ : 0 < μ) (hσ2 : 0 < σ2) (hM : 0 ≤ M) (hconv : StrongConvexOn Q2 σ2 d2) (hFhatDiff : ∀ x ∈ Q1, DifferentiableAt ℝ fhat x) (hLipschitz : LipschitzOnWith M.toNNReal (fun (x : E1) => fderiv ℝ fhat x) Q1) :
    have fbar := SmoothedObjective Q2 A phihat d2 μ fhat; have A' := { toFun := fun (x : E1) => (A x), map_add' := ⋯, map_smul' := ⋯ }; (∀ (x : E1), IsSmoothedMaximizer Q2 A phihat d2 μ x (uμ x)) → (∀ (x : E1), HasFDerivAt (fun (y : E1) => (A y) (uμ y) - phihat (uμ y) - μ * d2 (uμ y)) (LinearMap.toContinuousLinearMap ((AdjointOperator A') (uμ x))) x) → (∀ (x1 x2 : E1), μ * σ2 * ‖uμ x1 - uμ x2‖ ^ 2 ≤ DualPairing ((AdjointOperator A') (uμ x1 - uμ x2)) (x1 - x2)) → ∃ (Lμ : ℝ), Lμ = M + 1 / (μ * σ2) * OperatorNormDef A' ^ 2 ∧ LipschitzOnWith Lμ.toNNReal (fun (x : E1) => fderiv ℝ fbar x) Q1

    Proposition 1.4.1. Under the assumptions of Definition 1.4.1, the function \bar f_μ has Lipschitz continuous gradient on Q1 with Lipschitz constant L_μ := M + (1/(μ σ2)) ‖A‖_{1,2}^2 (equation (4.2)), where σ2 is the strong convexity parameter of the prox-function d2 on Q2.
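    The constant is simply the sum of the two gradient Lipschitz constants: ∇f̄_μ = ∇f̂ + ∇f_μ by fderiv_add_on, and lipschitzOnWith_add adds the constants:

    ```latex
    \|\nabla \bar f_\mu(x) - \nabla \bar f_\mu(y)\|_{1,*}
    \;\le\; \Bigl(M + \tfrac{1}{\mu\sigma_2}\,\|A\|_{1,2}^2\Bigr)\, \|x - y\|_1
    \;=\; L_\mu\,\|x - y\|_1. \tag{4.2}
    ```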

    def IsProxDiameterBound {E1 : Type u_1} [NormedAddCommGroup E1] [NormedSpace ℝ E1] (Q1 : Set E1) (d1 : E1 → ℝ) (D1 : ℝ) :
    Prop

    Definition 1.4.2. Let Q1 ⊆ E1 be bounded, closed, and convex, and let d1 be a prox-function on Q1, meaning it is continuous and σ1-strongly convex on Q1 for some σ1 > 0 with respect to ‖·‖_1. Assume the (finite) prox-diameter bound D1 satisfies max_{x ∈ Q1} d1 x ≤ D1 < +∞ (equation (4.3)).
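    As an illustration (not part of the formalization): if ‖·‖₁ is Euclidean and x̄ ∈ Q1 is any center, a standard prox-function is

    ```latex
    d_1(x) \;=\; \tfrac12\,\|x - \bar x\|_1^2,
    ```

    which is 1-strongly convex (σ₁ = 1), and D₁ may be taken as max_{x ∈ Q1} ½‖x − x̄‖₁², finite because Q1 is bounded.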

    Equations
    Instances For
      theorem smoothedMaxFunction_linearization {E1 : Type u_1} {E2 : Type u_2} [NormedAddCommGroup E1] [NormedSpace ℝ E1] [FiniteDimensional ℝ E1] [NormedAddCommGroup E2] [NormedSpace ℝ E2] [FiniteDimensional ℝ E2] (Q2 : Set E2) (A : E1 →L[ℝ] E2 →L[ℝ] ℝ) (phihat d2 : E2 → ℝ) (μ : ℝ) (uμ : E1 → E2) (A' : E1 →ₗ[ℝ] Module.Dual ℝ E2) (hmax : ∀ (x : E1), IsSmoothedMaximizer Q2 A phihat d2 μ x (uμ x)) (hderiv : ∀ (x : E1), fderiv ℝ (SmoothedMaxFunction Q2 A phihat d2 μ) x = LinearMap.toContinuousLinearMap ((AdjointOperator A') (uμ x))) (hpair : ∀ (x : E1) (u : E2), DualPairing ((AdjointOperator A') u) x = (A x) u) (x : E1) :
      SmoothedMaxFunction Q2 A phihat d2 μ x - DualPairing (↑(fderiv ℝ (SmoothedMaxFunction Q2 A phihat d2 μ) x)) x = -phihat (uμ x) - μ * d2 (uμ x)

      Linearization identity for the smoothed max-function under an adjoint derivative formula.
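      The identity is just evaluation at the maximizer: with u = u_μ(x),

      ```latex
      f_\mu(x) \;=\; \langle A x, u\rangle - \hat\phi(u) - \mu\, d_2(u),
      \qquad
      \langle \nabla f_\mu(x),\, x\rangle \;=\; \langle A^{*} u,\, x\rangle \;=\; \langle A x,\, u\rangle,
      ```

      so subtracting the second expression from the first cancels ⟨Ax, u⟩ and leaves −φ̂(u) − μ d₂(u).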

      theorem convex_of_strongConvexOn {E : Type u_1} [NormedAddCommGroup E] [NormedSpace ℝ E] {s : Set E} {σ : ℝ} {f : E → ℝ} (h : StrongConvexOn s σ f) :
      Convex ℝ s

      Strong convexity on a set implies the set is convex.

      theorem adjointForm_duality_gap_nonneg {E1 : Type u_1} {E2 : Type u_2} [NormedAddCommGroup E1] [NormedSpace ℝ E1] [FiniteDimensional ℝ E1] [NormedAddCommGroup E2] [NormedSpace ℝ E2] [FiniteDimensional ℝ E2] (Q1 : Set E1) (Q2 : Set E2) (A : E1 →L[ℝ] E2 →L[ℝ] ℝ) (fhat : E1 → ℝ) (phihat : E2 → ℝ) (hbd0 : ∀ (x : E1), BddAbove ((fun (u : E2) => (A x) u - phihat u) '' Q2)) (hbdBelow : ∀ (u : E2), BddBelow ((fun (x : E1) => (A x) u + fhat x) '' Q1)) (x : E1) :
      x ∈ Q1 → ∀ u ∈ Q2, 0 ≤ fhat x + sSup ((fun (u : E2) => (A x) u - phihat u) '' Q2) - AdjointFormPotential Q1 Q2 A fhat phihat u

      Weak duality: f(x) ≥ φ(u) for any x ∈ Q1 and u ∈ Q2.
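      The two-step argument, in the paper's notation (φ is AdjointFormPotential):

      ```latex
      f(x) \;=\; \hat f(x) + \max_{u' \in Q_2}\bigl\{\langle A x, u'\rangle - \hat\phi(u')\bigr\}
      \;\ge\; \hat f(x) + \langle A x, u\rangle - \hat\phi(u)
      \;\ge\; -\hat\phi(u) + \min_{x' \in Q_1}\bigl\{\langle A x', u\rangle + \hat f(x')\bigr\}
      \;=\; \phi(u).
      ```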

      theorem smoothedObjective_bound {E1 : Type u_1} {E2 : Type u_2} [NormedAddCommGroup E1] [NormedSpace ℝ E1] [NormedAddCommGroup E2] [NormedSpace ℝ E2] (Q2 : Set E2) (A : E1 →L[ℝ] E2 →L[ℝ] ℝ) (phihat d2 : E2 → ℝ) (μ : ℝ) (hμ : 0 ≤ μ) (hd2_nonneg : ∀ u ∈ Q2, 0 ≤ d2 u) (hbdd0 : ∀ (x : E1), BddAbove ((fun (u : E2) => (A x) u - phihat u) '' Q2)) (hbdd_d2 : BddAbove (d2 '' Q2)) (fhat : E1 → ℝ) (x : E1) :
      fhat x + sSup ((fun (u : E2) => (A x) u - phihat u) '' Q2) ≤ SmoothedObjective Q2 A phihat d2 μ fhat x + μ * sSup (d2 '' Q2)

      The smoothed objective upper-bounds the original objective up to μ D2.
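      With D₂ = sup_{u ∈ Q2} d₂(u), the bound follows by splitting each term under the supremum: for every u ∈ Q2,

      ```latex
      \langle A x, u\rangle - \hat\phi(u)
      \;=\; \bigl(\langle A x, u\rangle - \hat\phi(u) - \mu\, d_2(u)\bigr) + \mu\, d_2(u)
      \;\le\; f_\mu(x) + \mu D_2,
      ```

      and taking the supremum over u ∈ Q2, then adding f̂(x) to both sides, gives f(x) ≤ f̄_μ(x) + μD₂.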

      theorem sSup_image_nonneg {α : Type u_1} (s : Set α) (f : α → ℝ) (hbd : BddAbove (f '' s)) (hne : s.Nonempty) (hnonneg : ∀ x ∈ s, 0 ≤ f x) :
      0 ≤ sSup (f '' s)

      The supremum of a nonnegative function on a nonempty set is nonnegative.

      axiom z_k_isMinOn {E : Type u_1} [NormedAddCommGroup E] [NormedSpace ℝ E] [FiniteDimensional ℝ E] (Q : Set E) (f d : E → ℝ) (L σ : ℝ) (α : ℕ → ℝ) (xSeq : ℕ → ↥Q) (k : ℕ) :
      IsMinOn (fun (z : E) => L / σ * d z + ∑ i ∈ Finset.range (k + 1), let xi := ↑(xSeq i); α i * (f xi + DualPairing (↑(fderiv ℝ f xi)) (z - xi))) Q (z_k Q f d L σ α xSeq k)

      z_k is chosen as a minimizer of the auxiliary function defining ψ_k (Definition 1.3.5).
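      In the paper's notation, z_k attains the minimum defining ψ_k:

      ```latex
      \psi_k \;=\; \min_{z \in Q}\Bigl\{ \tfrac{L}{\sigma}\, d(z) + \sum_{i=0}^{k} \alpha_i \bigl[ f(x_i) + \langle \nabla f(x_i),\, z - x_i\rangle \bigr] \Bigr\};
      ```

      this axiom records that the scheme's z_k realizes that minimum over Q.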

      theorem fbar_rate_bound {E : Type u_1} [NormedAddCommGroup E] [NormedSpace ℝ E] [FiniteDimensional ℝ E] {Q : Set E} {f d : E → ℝ} {L σ D : ℝ} (xSeq ySeq : ℕ → ↥Q) (N : ℕ) (hL : 0 < L) (hσ : 0 < σ) (hAlg : OptimalSchemeAlgorithm Q f d L σ xSeq ySeq) (hD : IsProxDiameterBound Q d D) {x : E} (hx : x ∈ Q) :
      f ↑(ySeq N) ≤ 4 * L * D / (σ * ((↑N + 1) * (↑N + 2))) + 2 / ((↑N + 1) * (↑N + 2)) * ∑ i ∈ Finset.range (N + 1), (↑i + 1) * (f ↑(xSeq i) + DualPairing (↑(fderiv ℝ f ↑(xSeq i))) (x - ↑(xSeq i)))

      Rate bound for the optimal scheme after bounding the prox term by D.
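      Combining this with the supporting-hyperplane inequality f(xᵢ) + ⟨∇f(xᵢ), x − xᵢ⟩ ≤ f(x) (convex_support_grad) and the closed form ∑_{i=0}^N (i+1) = (N+1)(N+2)/2 turns the weighted sum into f(x), giving the familiar O(1/N²) rate:

      ```latex
      f(y_N) - f(x) \;\le\; \frac{4\, L\, D}{\sigma\,(N+1)(N+2)}.
      ```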

      theorem sum_range_add_one_cast (N : ℕ) :
      ∑ i ∈ Finset.range (N + 1), ((i : ℝ) + 1) = ((N : ℝ) + 1) * ((N : ℝ) + 2) / 2

      Closed form: ∑_{i=0}^N ((i:ℝ)+1) = ((N:ℝ)+1)((N:ℝ)+2)/2.

      theorem convex_support_grad {E : Type u_1} [NormedAddCommGroup E] [NormedSpace ℝ E] [FiniteDimensional ℝ E] {Q : Set E} {f : E → ℝ} (hQ_open : IsOpen Q) (hQ_convex : Convex ℝ Q) (hf_convex : ConvexOn ℝ Q f) (hf_diff : DifferentiableOn ℝ f Q) {u v : E} (hu : u ∈ Q) (hv : v ∈ Q) :
      f u ≤ f v + DualPairing (↑(fderiv ℝ f v)) (u - v)

      Supporting hyperplane inequality for a differentiable convex function on an open convex set.