Overview
A diagnostic software suite audits learned partial differential equation (PDE) simulators, which are increasingly used as low-cost alternatives to numerical solvers. Rather than reporting only the standard relative $L^2$ error, the suite evaluates whether a learned model behaves as a coherent numerical time propagator.
Research Context
Learned PDE simulators serve as low-cost replacements for expensive numerical solvers. However, the standard relative $L^2$ error alone may not reveal whether a learned model behaves as a coherent numerical time propagator. The research addresses this gap with a broader set of diagnostics.
Approach
The suite provides architecture-independent, post-hoc diagnostics for learned PDE simulators. It is built around a minimal contract with four parts: reference trajectories, either a learned propagator or saved predictions, metadata about the equation being simulated, and a diagnostic configuration that specifies which structures are meaningful for a given problem. The suite assesses multiple aspects of a simulator's behavior, including:
- Relative state error
- Semigroup consistency
- Finite-difference generator discrepancy
- Energy behavior
- Integral balance
- Admissibility constraints
- Perturbation response
- Scaling-law consistency
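Three of the diagnostics listed above can be sketched in a few lines. The following is a minimal illustration, not the suite's actual API: the function names, the `step(u, dt)` propagator signature, and the 1D heat-equation toy problem are all assumptions made for this example. The semigroup check compares one step of size $2\,\Delta t$ with two composed steps of size $\Delta t$, and the generator check compares the finite difference $(\Phi_{\Delta t}(u) - u)/\Delta t$ with the PDE right-hand side.

```python
import numpy as np

def relative_l2_error(pred, ref):
    """Relative state error: ||pred - ref||_2 / ||ref||_2 over all entries."""
    return np.linalg.norm(pred - ref) / np.linalg.norm(ref)

def semigroup_defect(step, u0, dt):
    """Mismatch between one step of size 2*dt and two composed steps of size dt."""
    one_big = step(u0, 2.0 * dt)
    two_small = step(step(u0, dt), dt)
    return relative_l2_error(two_small, one_big)

def generator_discrepancy(step, rhs, u0, dt):
    """Mismatch between the finite difference (step(u0, dt) - u0) / dt and rhs(u0)."""
    finite_diff = (step(u0, dt) - u0) / dt
    return relative_l2_error(finite_diff, rhs(u0))

# Toy setup (assumption, not from the source): 1D periodic heat equation
# u_t = nu * u_xx, solved exactly in Fourier space.
n, nu = 64, 0.1
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
k = np.fft.fftfreq(n, d=1.0 / n)  # integer wavenumbers on [0, 2*pi)

def rhs(u):
    """Right-hand side nu * u_xx, evaluated spectrally."""
    return np.fft.ifft(-nu * k**2 * np.fft.fft(u)).real

def exact_step(u, dt):
    """Exact heat-equation propagator over a time interval dt."""
    return np.fft.ifft(np.exp(-nu * k**2 * dt) * np.fft.fft(u)).real

def oversmoothed_step(u, dt):
    """Hypothetical stand-in for an oversmoothed surrogate:
    adds a fixed amount of smoothing per call, independent of dt."""
    return np.fft.ifft(np.exp(-nu * k**2 * dt - 0.05 * k**2) * np.fft.fft(u)).real

u0 = np.sin(x) + 0.5 * np.sin(3.0 * x)
for name, step in [("exact", exact_step), ("oversmoothed", oversmoothed_step)]:
    print(name,
          semigroup_defect(step, u0, 0.1),
          generator_discrepancy(step, rhs, u0, 1e-3))
```

In this toy setting the exact propagator passes both checks up to rounding and first-order truncation error, while the stand-in surrogate fails them because its extra smoothing is applied once per call rather than in proportion to the elapsed time.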
The suite was validated on five benchmark PDE tasks:
- Two-dimensional incompressible Navier-Stokes
- Shallow-water dynamics
- Active matter
- Three-dimensional compressible Navier-Stokes
- Three-dimensional magnetohydrodynamics
The surrogate models for these tasks were FNO, DeepONet, U-Net, and ResNet-style architectures. The validation study also included controlled underfit and oversmoothed variants of these models.
Findings
The validation study found that relative $L^2$ error can remain moderate, and in some cases even improve, while structural diagnostics deteriorate substantially. A low relative $L^2$ error therefore does not guarantee that a learned PDE simulator is structurally coherent as a time propagator. The software package provides an interpretable diagnostic panel rather than collapsing model behavior into a single state-error score, which supports software-level auditing of learned PDE simulators.
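A small synthetic example shows why a single state-error score can hide structural differences. The sketch below is purely illustrative and does not reproduce any result from the validation study: the `panel` helper, the quadratic energy $\tfrac{1}{2}\,\mathrm{mean}(u^2)$, and the two hand-built "predictions" are assumptions chosen for this example. Both predictions have the same relative $L^2$ error against the reference $\sin(x)$, yet one preserves the energy exactly and the other loses about a third of it.

```python
import numpy as np

def relative_l2_error(pred, ref):
    """Relative state error: ||pred - ref||_2 / ||ref||_2."""
    return np.linalg.norm(pred - ref) / np.linalg.norm(ref)

def energy(u):
    """Discrete quadratic energy 0.5 * mean(u**2) (an assumed choice)."""
    return 0.5 * np.mean(u**2)

def panel(pred, ref):
    """Hypothetical two-entry diagnostic panel: state error and energy error."""
    return {
        "relative_l2": relative_l2_error(pred, ref),
        "energy_rel_error": abs(energy(pred) - energy(ref)) / energy(ref),
    }

n = 128
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
ref = np.sin(x)

target = 0.2  # relative L2 error shared by both synthetic predictions

# Phase error only: ||sin(x + d) - sin(x)|| / ||sin(x)|| = 2 * sin(d / 2).
delta = 2.0 * np.arcsin(target / 2.0)
phase_shifted = np.sin(x + delta)

# Amplitude damping only: ||(1 - e) * sin(x) - sin(x)|| / ||sin(x)|| = e.
damped = (1.0 - target) * np.sin(x)

panel_shifted = panel(phase_shifted, ref)
panel_damped = panel(damped, ref)
print("phase-shifted:", panel_shifted)
print("damped:       ", panel_damped)
```

Reporting the two entries side by side keeps the distinction visible: ranking the two predictions by relative $L^2$ error alone would treat them as equivalent.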