How We Build a PINN for the Inviscid Burgers’ Equation with a Shock Formulation
PINN on Shock Waves
Physics-informed neural networks (PINNs) are a special type of neural network. They approximate solutions to partial differential equations by incorporating the physical laws that govern a given dataset into the learning process.
An example of such an equation is the inviscid Burgers’ equation, a prototype for conservation laws that can develop shock waves.
Existing approaches struggle with such solutions. Because shock waves are discontinuous, they satisfy the equation only in a weak sense. Continuous Time Models that rely solely on pointwise derivatives at training samples, obtained by algorithmic differentiation, cannot capture shock waves. These methods apply only when the solution is sufficiently regular.
One could attempt to use Discrete Time Models where neural networks and time discretization work together to help the model formulate shocks. However, this method somewhat diminishes the advantages of Physics-informed Neural Networks (PINNs) and reverts to traditional numerical methods. This can be challenging for someone who understands equations but is not familiar with numerical solutions.
In this article, I will address the limitations of existing Continuous Time Models of PINN methods for the Burgers equation. I will introduce calculations for discontinuity and weak solutions based on algorithmic differentiation, enabling the equation to capture shocks. This article might inspire those who are interested in the intersection of neural networks and physics-based modeling, especially in domains related to conservation laws.
However, it should be noted that this method has only shown promising results for one of the simplest one-dimensional hyperbolic equations. Whether it can be extended to higher dimensions or more complex equations is something I have not explored, and I invite readers to contribute their own ideas and resources on this topic.
PINN: Continuous Time Models for Burgers
According to the original paper [1]: “Physics Informed Neural Networks (PINNs) are trained to solve supervised learning tasks whilst respecting any given laws of physics, described by general nonlinear partial differential equations (PDEs).”
These PDEs take the following form in general [1]:
u_t + N[u] = 0, x ∈ Ω, t ∈ [0, T],
where u(t, x) represents the solution, N [·] is a nonlinear differential operator, and Ω is a subset of the d-dimensional space.
Let’s denote by
L(u) = u_t + N[u].
It follows immediately that L(u) = 0 if u is a solution of the equation. We will construct the solution u as a neural network
u = neural_net(t,x;weights)
where the inputs are the time and space variables. We determine the weights by minimizing the mean square error of L(u) (as noted above, L(u) should be close to 0 if u is a solution of the equation) together with the error in the initial and boundary conditions. For more details, please refer to the original paper.
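Written out, the objective just described takes roughly the following form, following [1] (which calls the residual f rather than L(u)). The symbols N_u, N_L and the sample points are labels introduced here for illustration: (t_u^i, x_u^i, u^i) are the initial and boundary data, and (t_L^j, x_L^j) are the collocation points where the equation is enforced.

```latex
\mathrm{MSE} = \mathrm{MSE}_u + \mathrm{MSE}_L, \qquad
\mathrm{MSE}_u = \frac{1}{N_u} \sum_{i=1}^{N_u} \left| u(t_u^i, x_u^i) - u^i \right|^2, \qquad
\mathrm{MSE}_L = \frac{1}{N_L} \sum_{j=1}^{N_L} \left| L(u)(t_L^j, x_L^j) \right|^2
```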
Now, let’s consider the 1-dimensional inviscid Burgers’ equation with initial condition f:
u_t + u u_x = 0, u(0, x) = f(x).
The solution satisfying the initial condition can be constructed implicitly by the method of characteristics: u = f(x - u t), with characteristic curves x(t) = x0 + f(x0) t. The formula shows that the characteristics are straight lines whose slopes generally differ, so if there exist two points x1 and x2 such that x1 + f(x1) t = x2 + f(x2) t at some finite time t, the two characteristics intersect and the wave breaks [2].
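As a rough illustration of when characteristics cross, the sketch below estimates the first crossing time. Neighbouring characteristics x0 + f(x0) t meet when 1 + f'(x0) t = 0, so the earliest crossing is at t_b = -1 / min f'(x0) whenever f' is negative somewhere. The function name `breaking_time` and the Gaussian initial profile are assumptions chosen only for this example, not part of the method in this article.

```python
import math


def breaking_time(f_prime, x_min, x_max, n=200_001):
    """Estimate the first time two characteristics x0 + f(x0) * t cross.

    Uses t_b = -1 / min f'(x0); the minimum slope is found by a plain
    grid search over [x_min, x_max].
    """
    step = (x_max - x_min) / (n - 1)
    slope_min = min(f_prime(x_min + i * step) for i in range(n))
    if slope_min >= 0:
        return math.inf  # characteristics fan out and never cross
    return -1.0 / slope_min


# Hypothetical initial profile f(x0) = exp(-x0**2), for illustration only.
def gaussian_prime(x0):
    return -2.0 * x0 * math.exp(-x0 * x0)


t_b = breaking_time(gaussian_prime, -3.0, 3.0)
print(f"estimated breaking time: {t_b:.4f}")
```

For this profile the estimate can be checked by hand, since the steepest negative slope sits at x0 = 1/√2 and gives t_b = √(e/2).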
The following code is inspired by the git repository pinn-burgers. That repository considers the viscous Burgers’ equation with viscosity 𝜈 > 0, which is proven to have a globally defined smooth solution, given that the initial condition is a smooth function growing like o(|x|) at infinity [3].
We will express u(t, x) as neural_net(t, x; weights), with the aim of minimizing the mean square error of L(u) (in this case, u_t + u u_x) and of the initial and boundary conditions. If the solution to the equation is smooth, TensorFlow can be used directly to compute the required derivatives:
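import tensorflow as tf

# x holds the network inputs, one row per sample: column 0 is t, column 1 is x.
with tf.GradientTape() as g:
    g.watch(x)
    u = neural_net(x)
# Per-sample Jacobian of u with respect to (t, x).
du_dtx = g.batch_jacobian(u, x)
du_dt = du_dtx[..., 0]
du_dx = du_dtx[..., 1]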
L(u) (called u_eqn in the code) is then defined simply as:
u_eqn = du_dt + u * du_dx  # (1)
The issue is that, once a shock forms, the equation u_t + u u_x = 0 holds only in the weak sense. The pointwise values of u_t and u_x are then of little use, because they blow up at the shock; the equation is meaningful only in integrated form. Common Python packages like TensorFlow or PyTorch provide APIs for neural networks and differentiation algorithms but don’t offer weak solutions out of the box. Therefore, we need to reformulate L(u) to compel the neural network to form the shock wave.
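For reference, the standard meaning of "weak sense" for this equation is stated here on the whole real line, with a test function φ that is introduced only for this definition. A function u is a weak solution if, for every smooth, compactly supported φ(t, x),

```latex
\int_0^{\infty} \int_{-\infty}^{\infty}
  \left( u \, \varphi_t + \tfrac{1}{2} u^2 \, \varphi_x \right) dx \, dt
+ \int_{-\infty}^{\infty} u(0, x) \, \varphi(0, x) \, dx = 0 .
```

All derivatives fall on φ, so the identity still makes sense when u has a jump.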
Introduction of Shock
We now introduce the Rankine–Hugoniot conditions, also known as the Rankine–Hugoniot jump conditions or relations. These describe the relationship between the states on either side of a shock wave. For the Burgers equation, the Rankine–Hugoniot condition reads: 1/2[[𝑢²]]=𝑠[[𝑢]]. The brackets [[ ]] denote the jump across the shock, that is, the value on the right-hand side minus the value on the left-hand side, and 𝑠 is the shock propagation velocity.
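Solving this condition for the shock speed takes one line of algebra. Here u_l and u_r are labels for the states immediately to the left and right of the shock:

```latex
s = \frac{[[\tfrac{1}{2} u^2]]}{[[u]]}
  = \frac{\tfrac{1}{2} \left( u_r^2 - u_l^2 \right)}{u_r - u_l}
  = \frac{u_l + u_r}{2}
```

In other words, a Burgers shock travels at the average of the two states it separates.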
Considering a specific space variable ‘x’, we aim to closely examine the left or right limits, i.e., u(x±) in cases of discontinuity. Here’s the relevant code:
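delta = 1e-3
# Shift only the spatial coordinate (column 1); column 0 is time.
shift = tf.constant([[0.0, delta]], dtype=x.dtype)
xl = x - shift
xr = x + shift
with tf.GradientTape() as g:
    g.watch(x)
    u = neural_net(x)
ul = neural_net(xl)  # value just to the left of x
ur = neural_net(xr)  # value just to the right of x
du_dtx = g.batch_jacobian(u, x)
du_dt = du_dtx[..., 0]
du_dx = du_dtx[..., 1]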
We define a small delta and evaluate the solution at a distance delta to the left and to the right of the space variable x.
Following this, we redefine the function L(u) as:
term1 = du_dt + u * du_dx
term2 = (ul + ur) / 2
condition = tf.less(du_dt, 1e3)
u_eqn = tf.where(condition, term1, term2) # (2)
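We use the regular form of the equation where du_dt stays below a sufficiently large threshold (here 1e3), and we switch to the Rankine–Hugoniot condition where du_dt reaches or exceeds it, which we treat as numerically infinite. For the Burgers equation, that condition gives the shock speed s = (ul + ur) / 2, which is the quantity stored in term2. Driving this term to zero during training corresponds to a stationary shock.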
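To make the switching in formula (2) concrete, here is a small NumPy mirror of it on made-up numbers. The function name `switched_residual` and all input values are hypothetical, and `np.where` stands in for `tf.where`; this is a sketch of the selection logic only, not of the training loop.

```python
import numpy as np


def switched_residual(u, du_dt, du_dx, ul, ur, threshold=1e3):
    """Mirror of formula (2): the smooth residual where du_dt is below the
    threshold, the Rankine-Hugoniot shock speed elsewhere."""
    smooth = du_dt + u * du_dx      # pointwise residual u_t + u * u_x
    shock_speed = 0.5 * (ul + ur)   # s = (ul + ur) / 2 for Burgers
    return np.where(du_dt < threshold, smooth, shock_speed)


# Hypothetical samples: the first point is smooth, the second sits on a steep front.
u = np.array([0.5, 0.0])
du_dt = np.array([-0.2, 5.0e3])
du_dx = np.array([0.3, -1.0e4])
ul = np.array([0.51, 0.9])
ur = np.array([0.49, -0.7])

residual = switched_residual(u, du_dt, du_dx, ul, ur)
print(residual)
```

The first entry is the ordinary residual of the equation, while the second is replaced by the average of the left and right states because du_dt there exceeds the threshold.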
Experiment
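Let’s consider the Burgers equation with an initial condition of sin(πx) on the interval [-1, 1]. The solution can be expressed implicitly as u = sin(π(x - u t)). The characteristics x0 + sin(πx0) t first cross when 1 + π t cos(πx0) = 0 has a solution, that is, at t = 1/π ≈ 0.32, so a shock is present at t = 1. Using formula (1), we obtain the following solution: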
The model struggles to find the correct answer because it has not been told what a shock is. However, if we switch to formula (2), we obtain the following solution:
You can see that the model successfully captures the shock wave at t=1.
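For readers who want reference values to compare a trained model against at early times, the implicit formula u = sin(π(x - u t)) can be evaluated numerically. The sketch below uses plain fixed-point iteration; the function name `implicit_burgers` and the sample point are assumptions made for this example, and the approach is only valid while the characteristics have not yet crossed.

```python
import math


def implicit_burgers(t, x, tol=1e-12, max_iter=1000):
    """Solve u = sin(pi * (x - u * t)) for u by fixed-point iteration.

    The iteration is a contraction while pi * t < 1, i.e. before the
    characteristics cross; it is not meant for later times.
    """
    u = math.sin(math.pi * x)  # start from the initial condition
    for _ in range(max_iter):
        u_next = math.sin(math.pi * (x - u * t))
        if abs(u_next - u) < tol:
            return u_next
        u = u_next
    raise RuntimeError("fixed-point iteration did not converge")


# Reference value at an early time, for illustration.
u_ref = implicit_burgers(t=0.1, x=0.5)
print(u_ref)
```

After the shock has formed, the implicit formula becomes multi-valued, so a reference solution there would need a different construction.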
Conclusion
Physics-informed neural networks (PINNs) can estimate solutions to partial differential equations by incorporating physical laws into their learning process. However, they often have difficulties with discontinuous solutions such as shock waves. I propose calculations for weak solutions that allow the Burgers equation to capture shocks. It’s important to note that while the 1-D Burgers Equation is a simple use case, this method may not be applicable to more complex equations without deeper consideration.
References
[1] M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, Journal of Computational Physics, Volume 378, 2019, Pages 686–707.
[2] A. Salih, Inviscid Burgers’ Equation, lecture notes, Department of Aerospace Engineering, Indian Institute of Space Science and Technology.
[3] J. Unterberger, Global existence for strong solutions of viscous Burgers equation, Control and Cybernetics 46(2), March 2015.