Physics-Informed Neural Networks: Fundamental Limitations and Conditional Usefulness

Physics-Informed Neural Networks (PINNs) aim to approximate the solution \(u\) of a differential equation defined over spatial coordinates \(x \in \mathbb{R}^d\), and possibly time \(t\), by representing \(u\) with a neural network \(u_\theta\), where \(\theta\) denotes the trainable parameters. Training proceeds by minimizing a composite loss
\[ \mathcal{L}(\theta)=\mathcal{L}_{\mathrm{PDE}} +\mathcal{L}_{\mathrm{BC}} + \mathcal{L}_{\mathrm{IC}}+\mathcal{L}_{\mathrm{data}},\] which penalizes the PDE residual together with mismatches in the boundary conditions, the initial conditions, and any available observational data.
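
In code, this objective is usually assembled as a weighted sum of mean-squared residuals evaluated at sampled points. Below is a minimal sketch in PyTorch; the weight names and the assumption that each residual arrives as a tensor of pointwise values are illustrative, not part of any particular PINN library.

```python
import torch

def pinn_loss(res_pde, res_bc, res_ic, res_data,
              w_pde=1.0, w_bc=1.0, w_ic=1.0, w_data=1.0):
    # Each argument is a tensor of pointwise residuals (PDE residual at
    # collocation points, boundary/initial-condition mismatch, data misfit).
    # The composite loss is a weighted sum of their mean squares.
    mse = lambda r: (r ** 2).mean()
    return (w_pde * mse(res_pde) + w_bc * mse(res_bc)
            + w_ic * mse(res_ic) + w_data * mse(res_data))
```

In practice the relative weights strongly influence which term the optimizer favors, which already hints at the balancing difficulties discussed next.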

PINNs reduce the PDE residual only indirectly: the residual decreases through adjustments of neural-network parameters rather than through explicit manipulation of discretized solution components or derivative operators. All required derivatives arise from automatic differentiation of the network; their behavior is therefore dictated by the functional form of \(u_\theta\), without the stabilizing structure that classical schemes obtain from meshes, trial spaces, or discrete stencils. Maintaining balance among derivative terms becomes difficult, especially in stiff or high-order PDEs, where multiscale interactions often lead to gradient imbalance, instability, or slow convergence.
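
To make this concrete, here is one way the residual of the Poisson problem discussed below could be evaluated with automatic differentiation in PyTorch. The function name and the network interface (a map from \((x,y)\) points to scalar values) are assumptions made for illustration.

```python
import torch

def poisson_residual(u_net, xy, f):
    # Residual of -Δu = f at collocation points xy of shape (N, 2);
    # f is a callable returning the source term at those points.
    # Every derivative is obtained by differentiating the network itself,
    # and create_graph=True keeps the residual differentiable w.r.t. θ.
    xy = xy.clone().requires_grad_(True)
    u = u_net(xy).squeeze(-1)
    grad_u = torch.autograd.grad(u.sum(), xy, create_graph=True)[0]
    u_xx = torch.autograd.grad(grad_u[:, 0].sum(), xy, create_graph=True)[0][:, 0]
    u_yy = torch.autograd.grad(grad_u[:, 1].sum(), xy, create_graph=True)[0][:, 1]
    return -(u_xx + u_yy) - f(xy)
```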

Classical numerical methods derived from a variational formulation avoid these difficulties by encoding PDE structure into the discretization. For the Poisson problem \[-\Delta u = f \quad \text{in }\Omega, \qquad u=g \quad \text{on }\partial\Omega, \] the true solution is characterized as the minimizer of the energy functional \[J(u)=\frac12 \int_\Omega |\nabla u|^2\,dx - \int_\Omega f u\,dx, \] subject to the boundary condition \(u|_{\partial\Omega} = g\). Equivalently, the solution satisfies the weak formulation \[\int_\Omega \nabla u\cdot \nabla \varphi\,dx = \int_\Omega f \varphi\,dx, \qquad \forall \varphi \in H_0^1(\Omega). \]
The variational structure provides coercivity, stability, and a globally coupled formulation; geometry appears exactly through integration over \(\Omega\), and boundary conditions are incorporated directly into the trial and test spaces. Mesh refinement allows accurate resolution near singularities, and convergence properties are well understood. 
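
For contrast, a one-dimensional analogue shows how little machinery the weak form needs once it is discretized. The sketch below assembles piecewise-linear finite elements for \(-u'' = f\) on \((0,1)\) with homogeneous Dirichlet data; the mesh size and right-hand side are illustrative choices, not taken from the discussion above.

```python
import numpy as np

# Piecewise-linear finite elements for -u'' = f on (0,1), u(0) = u(1) = 0.
n = 64                                   # number of elements (illustrative)
h = 1.0 / n
nodes = np.linspace(0.0, 1.0, n + 1)
f = lambda x: np.pi ** 2 * np.sin(np.pi * x)   # manufactured source; exact u = sin(pi x)

A = np.zeros((n + 1, n + 1))             # stiffness matrix  A_ij = ∫ φ_i' φ_j' dx
b = np.zeros(n + 1)                      # load vector       b_i  = ∫ f φ_i dx
for e in range(n):                       # element-by-element assembly
    i, j = e, e + 1
    A[np.ix_([i, j], [i, j])] += (1.0 / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])
    b[[i, j]] += 0.5 * h * f(0.5 * (nodes[i] + nodes[j]))   # midpoint quadrature

# Dirichlet conditions enter through the trial space: solve on interior nodes only.
u = np.zeros(n + 1)
u[1:n] = np.linalg.solve(A[1:n, 1:n], b[1:n])
print("max error:", np.abs(u - np.sin(np.pi * nodes)).max())
```

The boundary condition is built into the trial space by solving only for the interior nodes, and refining the mesh systematically improves the accuracy; this is exactly the structure a PINN must recover through optimization alone.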

These differences are especially visible on non-convex or irregular domains. For the classical L-shaped domain \[ \Omega = (-1,1)^2 \setminus ([0,1]\times[-1,0]), \] the solution exhibits reduced regularity near the reentrant corner. In polar coordinates \((r,\theta)\), \[ u(r,\theta)  \sim r^{2/3} \sin\!\left(\tfrac23\theta\right),\] revealing a singular derivative structure. A finite element method handles this effectively using local mesh refinement and the stability of the weak formulation.  (Here, the angular variable \(\theta\) is unrelated to the neural-network parameter \(\theta\); the distinction is clear from context.) 
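
The singular derivative is easy to see numerically: for \(u = r^{2/3}\sin(\tfrac23\theta)\) the gradient magnitude is \(\tfrac23 r^{-1/3}\), which blows up as the corner is approached (radii below chosen arbitrarily).

```python
# |∇u| = (2/3) r^(-1/3) for u = r^(2/3) sin(2θ/3): unbounded as r -> 0.
for r in [1e-1, 1e-2, 1e-3, 1e-4]:
    print(f"r = {r:.0e}   |grad u| = {(2.0 / 3.0) * r ** (-1.0 / 3.0):.2f}")
```
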
A PINN, however, must discover this singularity solely through loss minimization. Since geometry enters only through collocation points, there is no intrinsic mechanism to concentrate resolution near the corner, no stiffness matrix to enforce global structure, and no variational principle to guarantee stability. As a result, the network often converges to an overly smooth approximation that fails to capture the true singular behavior, even when the pointwise PDE residual appears small. Some recent PINN approaches attempt to mitigate this limitation through adaptive or residual-based sampling, which increases the density of collocation points in regions where the residual is large. Although such strategies can partially improve local resolution, they remain heuristic and do not replace the systematic mesh refinement and stability guarantees provided by variational and finite element methods.
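
One common heuristic of this kind resamples collocation points in proportion to the observed residual. A minimal sketch follows; the function and parameter names are hypothetical, and real adaptive-sampling schemes vary.

```python
import numpy as np

def resample_by_residual(candidates, residuals, k, rng=None):
    # Keep half of the new batch from the largest-residual candidate points
    # and draw the rest uniformly, so global coverage is not lost entirely.
    if rng is None:
        rng = np.random.default_rng(0)
    worst = candidates[np.argsort(-np.abs(residuals))[: k // 2]]
    rest = candidates[rng.choice(len(candidates), size=k - len(worst), replace=False)]
    return np.concatenate([worst, rest], axis=0)
```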

These structural weaknesses become more severe for PDE systems with intrinsic constraints, such as the incompressible Navier–Stokes equations, where the divergence-free condition \(\nabla\cdot u=0\) is crucial and is enforced in classical solvers through compatible finite element spaces or staggered grids. A PINN, by contrast, imposes incompressibility only through a penalty term. Since pressure enters only via its gradient, the network must learn a coupled velocity–pressure pair while simultaneously minimizing the divergence, which produces competing gradients without the stabilizing structure provided by mesh-based methods. The resulting optimization problem is ill-conditioned: even small deviations in \(\nabla\cdot u_\theta\) can cause large pressure errors, and solutions may appear locally plausible while violating incompressibility globally.
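
In a PINN this constraint typically appears as one more penalty term computed by automatic differentiation. A sketch of such a term is below; the interface (a network mapping \((x,y)\) to \((u,v,p)\)) is an assumption for illustration.

```python
import torch

def divergence_penalty(model, xy):
    # model maps (N, 2) points to (N, 3) outputs: velocity (u, v) and pressure p.
    xy = xy.clone().requires_grad_(True)
    u, v, _p = model(xy).unbind(dim=-1)
    du = torch.autograd.grad(u.sum(), xy, create_graph=True)[0]
    dv = torch.autograd.grad(v.sum(), xy, create_graph=True)[0]
    div = du[:, 0] + dv[:, 1]          # u_x + v_y, zero for incompressible flow
    return (div ** 2).mean()           # penalized in the mean, never enforced exactly
```

The optimizer trades this penalty off against the momentum residual, which is precisely the source of the competing gradients described above.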

This contrast reflects a broader principle: variational formulations embed geometric information and operator structure directly into the numerical method, while PINNs attempt to infer such structure indirectly from sampled residuals. The loss function minimized by a PINN is therefore only a surrogate for the true energy or weak formulation of the PDE. As a consequence, a PINN may successfully reduce local pointwise residuals yet still produce inaccurate global solutions, especially on complex or irregular domains where geometric fidelity and operator coupling are essential.

Even so, PINNs remain useful when the governing PDE is only an approximate model of the underlying phenomenon. Observational data may contain unmodeled dynamics or historical influences absent from the idealized PDE. A standard example is option pricing in finance: although the Black–Scholes equation assumes constant volatility, real markets exhibit volatility patterns shaped by past behavior. Because PINNs incorporate data directly into the training objective, they can learn corrections to the nominal PDE, yielding more realistic models.
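
A hedged sketch of how such a correction enters the objective: the nominal Black–Scholes residual is combined with a misfit to observed option prices, so the data can pull the fit away from the constant-volatility idealization. The interest rate, volatility, weighting, and function names below are illustrative assumptions.

```python
import torch

def bs_pinn_loss(model, St_colloc, St_obs, price_obs, r=0.05, sigma=0.2, w_data=1.0):
    # model maps (S, t) pairs to option values V_theta(S, t).
    X = St_colloc.clone().requires_grad_(True)
    V = model(X).squeeze(-1)
    g = torch.autograd.grad(V.sum(), X, create_graph=True)[0]
    V_S, V_t = g[:, 0], g[:, 1]
    V_SS = torch.autograd.grad(V_S.sum(), X, create_graph=True)[0][:, 0]
    S = X[:, 0]
    # Nominal Black-Scholes residual: V_t + 0.5 sigma^2 S^2 V_SS + r S V_S - r V = 0.
    res_pde = V_t + 0.5 * sigma ** 2 * S ** 2 * V_SS + r * S * V_S - r * V
    # Data term: mismatch with observed market prices at quoted (S, t) points.
    res_data = model(St_obs).squeeze(-1) - price_obs
    return (res_pde ** 2).mean() + w_data * (res_data ** 2).mean()
```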

PINNs can appear to work reasonably well when the underlying problem is extremely simple: the solution is smooth, the geometry is trivial (such as a one-dimensional domain), the PDE is low-order, the scales are uniform, and the optimization landscape is benign. In these settings the network avoids stiff gradients, strong variable coupling, geometric singularities, and multiscale behavior, so the training objective is far easier to minimize. A small network can then approximate the solution acceptably, and the PINN formulation can fold internal measurements, uncertain forcing, or unknown parameters into a single objective. This apparent success is highly conditional, however, and offers no substantive advantage: classical numerical and inverse-problem methods solve such problems more efficiently, more robustly, and with stronger theoretical guarantees. PINNs look effective only because the problems are simple enough that the usual difficulties, such as unstable gradients, multiscale interactions, or sensitivity to geometry, never arise. In more realistic settings, where the solution or domain has nontrivial structure, these weaknesses re-emerge and the method often performs poorly.
