Overview
This work investigates lower complexity bounds for nonconvex-strongly-convex bilevel optimization, measured by the number of oracle calls that first-order algorithms require. The study constructs hard instances to derive these bounds under both deterministic and stochastic first-order oracle models.
Research Context
Upper bound guarantees for bilevel optimization have been extensively studied. However, progress in establishing corresponding lower bounds has been limited, a challenge attributed to the inherent complexity of the bilevel structure. This research addresses this gap within the smooth nonconvex-strongly-convex setting.
Approach
The researchers developed new hard instances tailored to the smooth nonconvex-strongly-convex bilevel optimization problem. These instances were then used to analyze the minimum number of oracle calls required by first-order algorithms. The analysis considered two distinct first-order oracle models: deterministic and stochastic.
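For orientation, the setting can be written in the form commonly used in the bilevel literature. This is a sketch under stated assumptions rather than the paper's exact notation: the symbols $f$, $g$, $\Phi$, $p$, and $q$ are introduced here for illustration. Here $f$ is the smooth, possibly nonconvex upper-level objective, and $g$ is smooth and strongly convex in $y$, so the lower-level solution $y^{*}(x)$ is unique.

```latex
\min_{x \in \mathbb{R}^{p}} \; \Phi(x) := f\bigl(x, y^{*}(x)\bigr)
\quad \text{s.t.} \quad
y^{*}(x) = \arg\min_{y \in \mathbb{R}^{q}} g(x, y),
\qquad
\text{goal: find } \hat{x} \text{ such that } \|\nabla \Phi(\hat{x})\| \le \epsilon .
```

Under this reading, an $\epsilon$-accurate stationary point is a point $\hat{x}$ at which the gradient of the overall objective $\Phi$ has norm at most $\epsilon$.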
Findings
Deterministic First-Order Oracle Model
- For the deterministic case, any first-order zero-respecting algorithm requires at least $\Omega(\kappa^{3/2}\epsilon^{-2})$ oracle calls to find an $\epsilon$-accurate stationary point, where $\kappa$ denotes the condition number of the lower-level problem.
- This result improves upon optimal lower bounds previously known for single-level nonconvex optimization problems.
- It also strengthens the optimal lower bounds that were known for nonconvex-strongly-convex min-max problems.
Stochastic First-Order Oracle Model
- In the stochastic case, the study shows that at least $\Omega(\kappa^{5/2}\epsilon^{-4})$ stochastic oracle calls are necessary.
- This result strengthens the best-known lower bounds in related optimization settings.
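To get a feel for how the two rates scale, the short Python sketch below evaluates $\kappa^{3/2}\epsilon^{-2}$ and $\kappa^{5/2}\epsilon^{-4}$ with all constants dropped. The function names are hypothetical and the numbers are only orders of growth, not actual oracle-call counts, since $\Omega(\cdot)$ hides constant factors.

```python
def deterministic_lower_bound(kappa: float, eps: float) -> float:
    """Scaling of the deterministic bound, kappa^(3/2) * eps^(-2), constants dropped."""
    return kappa ** 1.5 / eps ** 2


def stochastic_lower_bound(kappa: float, eps: float) -> float:
    """Scaling of the stochastic bound, kappa^(5/2) * eps^(-4), constants dropped."""
    return kappa ** 2.5 / eps ** 4


# Illustrative (kappa, eps) pairs; the values are arbitrary assumptions.
for kappa, eps in [(10.0, 1e-1), (10.0, 1e-2), (100.0, 1e-2)]:
    det = deterministic_lower_bound(kappa, eps)
    sto = stochastic_lower_bound(kappa, eps)
    print(f"kappa={kappa:g}, eps={eps:g}: deterministic ~ {det:.3g}, stochastic ~ {sto:.3g}")
```

Halving $\epsilon$ multiplies the deterministic rate by 4 and the stochastic rate by 16, and the ratio between the two rates is $\kappa\epsilon^{-2}$.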
Why This Matters
These results reveal substantial gaps between the current upper and lower bounds for bilevel optimization. The gaps suggest that even simplified regimes, such as those with quadratic lower-level objectives, warrant further investigation to determine the optimal complexity of bilevel optimization under standard first-order oracles.
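As an illustration of what a quadratic lower-level regime can look like, the following worked equation uses a hypothetical instance: the symmetric positive definite matrix $A$ and the coupling matrix $B$ are assumptions made here, not taken from the source. In this case the lower-level solution and the gradient of the overall objective $\Phi(x) = f(x, y^{*}(x))$ have closed forms.

```latex
g(x, y) = \tfrac{1}{2}\, y^{\top} A y - y^{\top} B x, \quad A \succ 0 \text{ symmetric}
\;\Longrightarrow\;
y^{*}(x) = A^{-1} B x,
\qquad
\nabla \Phi(x) = \nabla_x f\bigl(x, y^{*}(x)\bigr) + B^{\top} A^{-1} \nabla_y f\bigl(x, y^{*}(x)\bigr).
```

The first identity follows from setting $\nabla_y g(x, y) = A y - B x$ to zero, and the second from the chain rule with $\partial y^{*}/\partial x = A^{-1} B$. Even in this simple form, the gradient involves $A^{-1}$, which is where dependence on the conditioning of the lower-level problem enters.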