Ostrogradsky Instability and Higher-Order Derivatives
Several lines of reasoning indicate that the equation of motion (EoM) is at most a second-order differential equation. For example, in Newtonian mechanics, we only need to know one of the following sets of conditions:
Initial position and time \((q_i, t_i)\) and final position and time \((q_f, t_f)\)
Initial position and time \((q_i, t_i)\) and initial velocity \( \dot{q}_i \)
to uniquely determine the motion of an object. In other words, the physical path only depends on one of the aforementioned conditions. This corresponds to the EoM being a second-order differential equation, such as \(F = m \ddot{x}\), where the initial conditions only require the initial position and velocity. In the Lagrangian framework, the EoM is given by the Euler-Lagrange equation:
\[
\frac{d}{dt} \frac{\partial L}{\partial \dot{q}} - \frac{\partial L}{\partial q} = 0
\]
Note that \(\frac{d}{dt}\) raises the order of differentiation by one. For the EoM to remain a second-order differential equation, the Lagrangian must depend only on \(\dot{q}\), \(q\), and \(t\), i.e., \(L = L(\dot{q}, q, t)\).
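As a quick illustration, take the standard form \(L = \frac{m}{2}\dot{q}^2 - V(q)\), where the potential \(V(q)\) is an assumed placeholder rather than something fixed by the discussion above. Substituting into the Euler-Lagrange equation gives:
\[
\frac{d}{dt}\left(m\dot{q}\right) + \frac{dV}{dq} = 0 \implies m\ddot{q} = -\frac{dV}{dq}
\]
The single \(\frac{d}{dt}\) acting on \(\frac{\partial L}{\partial \dot{q}}\) is what raises the order from one to two, and the result matches \(F = m\ddot{x}\) with \(F = -\frac{dV}{dq}\).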
Beyond empirical and experimental observations suggesting that the EoM is second-order and \(L\) is limited to \(L(\dot{q}, q, t)\), there is a more severe issue: if the Lagrangian \(L\) includes higher-order derivatives, the system's Hamiltonian \(H\) (i.e., energy \(E\)) has no lower bound. This is referred to as the Ostrogradsky instability. A lack of a lower bound for \(E\) is a severe issue in physics. In quantum mechanics, it implies the absence of a ground state, allowing particles to occupy infinitely negative energy states. In statistical mechanics, the partition function
\[
Z = \sum_{E_i} e^{-\frac{E_i}{k_B T}}
\]
diverges (\(Z \to +\infty \because \exists E_i \to -\infty \)), making it impossible to determine the probability of different configurations
\[
\rho_i = \frac{1}{Z} e^{-\frac{E_i}{k_B T}}
\]
and rendering all physical quantities unpredictable.
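The divergence can be illustrated numerically. The sketch below assumes a hypothetical, evenly spaced spectrum \(E_n = n\) in units where \(k_B T = 1\) (both choices are for illustration only) and shows how \(Z\) grows as the lowest allowed level is pushed toward \(-\infty\).

```python
import math


def partition_function(levels, kT=1.0):
    """Sum the Boltzmann weights exp(-E / kT) over a finite list of levels."""
    return sum(math.exp(-E / kT) for E in levels)


def spectrum(lowest, highest=50, spacing=1.0):
    """Hypothetical evenly spaced levels E_n = n * spacing, n = lowest..highest."""
    return [n * spacing for n in range(lowest, highest + 1)]


# Spectrum bounded below at E = 0: a convergent geometric series.
z_bounded = partition_function(spectrum(lowest=0))

# Lower the cutoff step by step: Z grows roughly like exp(|lowest|).
z_values = {
    lowest: partition_function(spectrum(lowest))
    for lowest in (0, -5, -10, -20)
}

for lowest, z in z_values.items():
    print(f"lowest level {lowest:4d}: Z = {z:.6g}")
```

With the spectrum bounded below, \(Z\) settles near \(1/(1 - e^{-1})\). Each step that lowers the cutoff multiplies the dominant term by a further exponential factor, so no finite normalization for \(\rho_i\) exists in the limit.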
The Hamiltonian of a Traditional Lagrangian
Returning to the traditional Lagrangian \(L(\dot{q}, q, t)\), the Hamiltonian is defined as:
\[
H = \frac{\partial L}{\partial \dot{q}} \dot{q} - L
\]
According to Landau's arguments, the homogeneity and isotropy of space and time, together with frame independence (i.e., the principle of relativity, ensuring physics is unaffected by the choice of coordinate system or sign convention), imply that the Lagrangian \(L\) only includes terms like \(\dot{q}^{2n}\) and \(q^{2n}\), which are even powers. Additionally, \(L\) must be a scalar. More specifically, since \(\overrightarrow{\dot{q}}\) and \(\overrightarrow{q}\) are vectors, they can enter \(L\) only through the scalar products \(\overrightarrow{\dot{q}} \cdot \overrightarrow{\dot{q}} = |\dot{q}|^2\) and \(\overrightarrow{q} \cdot \overrightarrow{q} = |q|^2\), which depend only on magnitudes and not directions, ensuring even powers. The simplest example is the harmonic oscillator:
\[
L = \frac{m}{2} \dot{q}^2 - \frac{k}{2} q^2 \implies H = \frac{\partial L}{\partial \dot{q}} \dot{q} - L = \frac{m}{2} \dot{q}^2 + \frac{k}{2} q^2
\]
Notably, \(\frac{\partial L}{\partial \dot{q}} \dot{q} = m \dot{q}^2 \) remains an even power, ensuring that the energy \(E\) has a lower bound.
When \(L\) Includes Higher-Order Derivatives
If we allow the Lagrangian \(L\) to include second derivatives, i.e., \(L = L(\ddot{q}, \dot{q}, q, t)\), we can derive the EoM using the variational principle:
\[
\delta L = \frac{\partial L}{\partial \ddot{q}} \delta \ddot{q} + \frac{\partial L}{\partial \dot{q}} \delta \dot{q} + \frac{\partial L}{\partial q} \delta q
\]
\[
= \frac{\partial L}{\partial \ddot{q}} \frac{d}{dt} \delta \dot{q} + \frac{\partial L}{\partial \dot{q}} \frac{d}{dt} \delta q + \frac{\partial L}{\partial q} \delta q
\]
\[
= \frac{d}{dt} \left(\frac{\partial L}{\partial \ddot{q}} \delta \dot{q}\right) - \frac{d}{dt} \left(\frac{\partial L}{\partial \ddot{q}}\right) \delta \dot{q} + \frac{d}{dt} \left(\frac{\partial L}{\partial \dot{q}} \delta q\right) - \frac{d}{dt} \left(\frac{\partial L}{\partial \dot{q}}\right) \delta q + \frac{\partial L}{\partial q} \delta q
\]
\[
= \frac{d}{dt} \left(\frac{\partial L}{\partial \ddot{q}} \delta \dot{q} + \frac{\partial L}{\partial \dot{q}} \delta q\right) - \frac{d}{dt} \left[\frac{d}{dt} \left(\frac{\partial L}{\partial \ddot{q}}\right) \delta q\right] + \frac{d^2}{dt^2} \left(\frac{\partial L}{\partial \ddot{q}}\right) \delta q + \left(\frac{\partial L}{\partial q} - \frac{d}{dt} \frac{\partial L}{\partial \dot{q}}\right) \delta q
\]
\[
= \frac{d}{dt} \left[\frac{\partial L}{\partial \ddot{q}} \delta \dot{q} + \left(\frac{\partial L}{\partial \dot{q}} - \frac{d}{dt} \frac{\partial L}{\partial \ddot{q}}\right) \delta q\right] + \left(\frac{\partial L}{\partial q} - \frac{d}{dt} \frac{\partial L}{\partial \dot{q}} + \frac{d^2}{dt^2} \frac{\partial L}{\partial \ddot{q}}\right) \delta q
\]
The first term is a total time derivative and contributes only at the endpoints (the boundary term), while the coefficient of \(\delta q\) in the second term must vanish, which gives the EoM:
\[
\frac{d^2}{dt^2} \frac{\partial L}{\partial \ddot{q}} - \frac{d}{dt} \frac{\partial L}{\partial \dot{q}} + \frac{\partial L}{\partial q} = 0
\]
The EoM involves \(\frac{d^2}{dt^2} \frac{\partial L}{\partial \ddot{q}}\), leading to a fourth-order differential equation.
Being fourth order, such an equation requires four initial conditions: not only the initial position \(q(0)\) and initial velocity \(\dot{q}(0)\), but also the initial acceleration \(\ddot{q}(0)\) and the initial jerk \(\dddot{q}(0)\). Even though \(L\) includes only second derivatives, solving the EoM demands data up to the third derivative, which seems unreasonable.
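To make the order counting concrete, here is a minimal worked example. The Lagrangian below, a harmonic oscillator plus a term quadratic in the acceleration with an assumed constant coefficient \(\alpha\), is chosen purely for illustration:
\[
L = \frac{m}{2}\dot{q}^2 - \frac{k}{2}q^2 + \frac{\alpha}{2}\ddot{q}^2
\]
\[
\frac{\partial L}{\partial \ddot{q}} = \alpha\ddot{q}, \quad \frac{\partial L}{\partial \dot{q}} = m\dot{q}, \quad \frac{\partial L}{\partial q} = -kq
\]
\[
\frac{d^2}{dt^2}\left(\alpha\ddot{q}\right) - \frac{d}{dt}\left(m\dot{q}\right) - kq = 0 \implies \alpha\frac{d^4 q}{dt^4} - m\ddot{q} - kq = 0
\]
For any \(\alpha \neq 0\) this is a fourth-order equation, while setting \(\alpha = 0\) recovers the familiar second-order \(m\ddot{q} = -kq\).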
The Effect on the Hamiltonian
Starting from the Lagrangian \(L\), consider its total time derivative:
\[
\frac{dL}{dt} = \frac{\partial L}{\partial \ddot{q}} \frac{d\ddot{q}}{dt} + \frac{\partial L}{\partial \dot{q}} \frac{d\dot{q}}{dt} + \frac{\partial L}{\partial q} \frac{dq}{dt} + \frac{\partial L}{\partial t}
\]
Rearranging and simplifying, we get:
\[
\frac{\partial L}{\partial t} = \frac{dL}{dt} - \frac{\partial L}{\partial \ddot{q}} \frac{d\ddot{q}}{dt} - \frac{\partial L}{\partial \dot{q}} \frac{d\dot{q}}{dt} - \frac{\partial L}{\partial q} \dot{q}
\]
\[
= \frac{dL}{dt} - \frac{d}{dt} \left( \frac{\partial L}{\partial \ddot{q}} \ddot{q} \right) + \frac{d}{dt} \left( \frac{\partial L}{\partial \ddot{q}} \right) \ddot{q} - \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{q}} \dot{q} \right) + \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{q}} \right) \dot{q} - \frac{\partial L}{\partial q} \dot{q}
\]
\[
= \frac{d}{dt} \left[ L - \frac{\partial L}{\partial \ddot{q}} \ddot{q} + \left( \frac{d}{dt} \frac{\partial L}{\partial \ddot{q}} - \frac{\partial L}{\partial \dot{q}} \right) \dot{q} \right] - \left( \frac{d^2}{dt^2} \frac{\partial L}{\partial \ddot{q}} - \frac{d}{dt} \frac{\partial L}{\partial \dot{q}} + \frac{\partial L}{\partial q} \right) \dot{q}
\]
The second term vanishes by the EoM, while the first term is \(-\frac{dH}{dt}\), which identifies the Hamiltonian:
\[
H = \frac{\partial L}{\partial \ddot{q}} \ddot{q} + \left( \frac{\partial L}{\partial \dot{q}} - \frac{d}{dt} \frac{\partial L}{\partial \ddot{q}} \right) \dot{q} - L
\]
From the earlier discussion of isotropy and the principle of relativity, we can infer that \(\ddot{q}\), \(\dot{q}\), and \(q\) appear in the Lagrangian \(L\) only in even powers. Therefore:
\[
\frac{\partial L}{\partial \ddot{q}} \ddot{q}, \quad \frac{\partial L}{\partial \dot{q}} \dot{q}
\]
contain only even powers and, for positive coefficients, are never negative.
However:
\[
\left( \frac{d}{dt} \frac{\partial L}{\partial \ddot{q}} \right) \dot{q}
\]
introduces terms that are odd in \(\dot{q}\) and linear in the jerk \(\dddot{q}\), which the time derivative brings in. Such terms can take either sign and grow without limit, so the Hamiltonian \(H\) has no lower bound. This is the essence of Ostrogradsky instability.
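Both properties of \(H\) can be checked numerically. The sketch below assumes the illustrative Lagrangian \(L = \frac{m}{2}\dot{q}^2 - \frac{k}{2}q^2 + \frac{\alpha}{2}\ddot{q}^2\) with \(m = k = \alpha = 1\) (an arbitrary choice, not taken from the text). Its EoM is \(\alpha \frac{d^4 q}{dt^4} = m\ddot{q} + kq\), and it is integrated here with a hand-written fourth-order Runge-Kutta step. The names `hamiltonian`, `derivatives`, and `rk4_step` are hypothetical helpers.

```python
def hamiltonian(q, v, a, j, m=1.0, k=1.0, alpha=1.0):
    """H = (dL/da) a + (dL/dv - d/dt dL/da) v - L for the assumed Lagrangian
    L = m/2 v^2 - k/2 q^2 + alpha/2 a^2 (v = velocity, a = acceleration, j = jerk)."""
    lagrangian = 0.5 * m * v**2 - 0.5 * k * q**2 + 0.5 * alpha * a**2
    return (alpha * a) * a + (m * v - alpha * j) * v - lagrangian


def derivatives(state, m=1.0, k=1.0, alpha=1.0):
    """Fourth-order EoM alpha q'''' = m q'' + k q written as a first-order system."""
    q, v, a, j = state
    return (v, a, j, (m * a + k * q) / alpha)


def rk4_step(state, dt):
    """Advance the state (q, v, a, j) by one classical Runge-Kutta step."""
    def shifted(s, d, h):
        return tuple(x + h * dx for x, dx in zip(s, d))

    k1 = derivatives(state)
    k2 = derivatives(shifted(state, k1, dt / 2))
    k3 = derivatives(shifted(state, k2, dt / 2))
    k4 = derivatives(shifted(state, k3, dt))
    return tuple(
        x + dt / 6 * (d1 + 2 * d2 + 2 * d3 + d4)
        for x, d1, d2, d3, d4 in zip(state, k1, k2, k3, k4)
    )


# 1) H is linear in the jerk j, so larger j drives it arbitrarily negative.
energies = [hamiltonian(q=0.0, v=1.0, a=0.0, j=j) for j in (0.0, 10.0, 1e3, 1e6)]
print("H for growing jerk:", energies)

# 2) H stays constant along a solution of the EoM, so it still plays the role of energy.
state = (1.0, 0.5, -0.3, 0.2)  # arbitrary initial (q, v, a, j)
h_start = hamiltonian(*state)
for _ in range(1000):
    state = rk4_step(state, 1e-3)
h_end = hamiltonian(*state)
print("H drift over t = 1:", abs(h_end - h_start))
```

For this choice of initial data the energy reduces to \(H = \frac{1}{2} - \dddot{q}\), so there is no configuration of lowest energy, even though \(H\) is conserved along every trajectory.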
More General Case
For readers interested in a more general case, consider \(L = L\left(\frac{d^n q}{dt^n}, \dots, \frac{dq}{dt}, q, t\right)\). The corresponding EoM and Hamiltonian are:
\[
\sum_{i=0}^n (-1)^i \frac{d^i}{dt^i} \left[ \frac{\partial L}{\partial \left(\frac{d^i q}{dt^i}\right)} \right] = 0
\]
\[
H = \sum_{i=1}^n \sum_{j=1}^i (-1)^{i-j} \left[ \frac{d^{i-j}}{dt^{i-j}} \frac{\partial L}{\partial \left(\frac{d^i q}{dt^i}\right)} \right] \frac{d^j q}{dt^j} - L
\]
Key observations:
These arguments suggest that restricting \(L\) to first-order derivatives is more reasonable.
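As a consistency check, setting \(n = 2\) in the general formulas should reproduce the second-derivative results derived earlier. Writing out the sums term by term, with the index pairs \((i, j)\) labeled for the Hamiltonian:
\[
\frac{\partial L}{\partial q} - \frac{d}{dt} \frac{\partial L}{\partial \dot{q}} + \frac{d^2}{dt^2} \frac{\partial L}{\partial \ddot{q}} = 0
\]
\[
H = \underbrace{\frac{\partial L}{\partial \dot{q}} \dot{q}}_{(1,1)} \; \underbrace{- \left( \frac{d}{dt} \frac{\partial L}{\partial \ddot{q}} \right) \dot{q}}_{(2,1)} + \underbrace{\frac{\partial L}{\partial \ddot{q}} \ddot{q}}_{(2,2)} - L
\]
Both agree with the expressions obtained for \(L = L(\ddot{q}, \dot{q}, q, t)\). The sign-alternating \((2,1)\) term is the one responsible for the missing lower bound.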
Originally written in Chinese by the author, these articles are translated into English to invite cross-language resonance.