Aug 24, 2012

Why is the Lorentz transformation linear?

Today, in our General Relativity class, a student asked Dr. Anderson why the Lorentz transformation must be linear. Dr. Anderson seemed unprepared for the question and simply argued that it does not have to be, but that linearity is assumed for simplicity. In particular, he said, linearity holds for infinitesimal (first-order) transformations. His answer isn't satisfactory to me.

The Lorentz transformation is defined between inertial frames. (Coordinate transformations between general frames are the subject of general relativity.) The linearity of the Lorentz transformation follows from the defining property of inertial frames and the equivalence principle (see, for example, Weinberg, Gravitation and Cosmology: Principles and Applications of the General Theory of Relativity). Newton's first law defines an inertial reference frame:
"In an inertial frame, every body persists in its state of being at rest or of moving uniformly straight forward, except insofar as it is compelled to change its state by force impressed." 
Therefore, if an object is not accelerating in one inertial frame, it mustn't be accelerating in another. Now let's apply this to the Lorentz transformation.

Let $K$ and $K'$ be two inertial frames moving relative to each other with constant velocity $\mathbf{V}$. We denote the coordinates in the two frames by $x^\mu$ and $y^\mu$. Let $x^\mu=X^\mu(t)$ and $y^\mu=Y^\mu(t')$ be the trajectories of a particle in $K$ and $K'$ respectively. Note that $t = x^0 = X^0(t), t' = y^0 = Y^0(t')$, so $X^0$ and $Y^0$ are identity functions. The velocities are $\frac{d}{dt}X^\mu \equiv \dot X^\mu, \frac{d}{dt'}Y^\mu \equiv \dot{Y}^\mu$. Note that all derivatives are ultimately taken with respect to $t$ (or $t'$). The relation between $t$ and $t'$ is $\frac{d t'}{d t} = \dot{X}^\mu \frac{\partial y^0}{\partial x^\mu} $, and hence the velocity in $K'$ is $\frac{\mathrm{d} Y^\kappa}{\mathrm{d} t'} = \frac{\partial y^\kappa}{\partial x^\mu}\dot{X}^\mu \Big/ \frac{\partial y^0}{\partial x^\rho}\dot{X}^\rho$. Differentiating once more with respect to $t'$ gives
[ equation 1 ] \[
\frac{\mathrm{d}^2 Y^\kappa }{\mathrm{d}t'^2} =
\frac{
\frac{\partial y^0}{\partial x^\rho}\dot{X}^\rho \frac{\partial^2 y^\kappa}{\partial x^\mu \partial x^\nu} \dot{X}^\mu\dot{X}^\nu
+ \frac{\partial y^0}{\partial x^\rho}\dot{X}^\rho \frac{\partial y^\kappa}{\partial x^\mu}  \ddot{X}^\mu
-\frac{\partial y^\kappa}{\partial x^\rho} \dot{X}^\rho \frac{\partial^2 y^0}{\partial x^\mu \partial x^\nu} \dot{X}^\mu \dot{X}^\nu
- \frac{\partial y^\kappa}{\partial x^\rho} \dot{X}^\rho \frac{\partial y^0}{\partial x^\mu} \ddot{X}^\mu
}{
 \left( \frac{\partial y^0}{\partial x^\lambda} \dot{X}^\lambda \right)^3  }
\]
Suppose the particle has uniform velocity in frame $K$: $\ddot{\mathbf{X}} = \mathbf{a} = 0, \ddot{X}^0(t) = \frac{\mathrm{d}^2 t}{\mathrm{d}t^2} = 0$ (i.e. $\ddot{X}^\mu = 0$). By the defining property of inertial frames, it has zero acceleration in $K'$ too: $\ddot{\mathbf{Y}} = \mathbf{a}' = 0, \ddot{Y}^0(t') = \frac{\mathrm{d}^2 t'}{\mathrm{d}t'^2} = 0$ (i.e. $\ddot{Y}^\mu = 0$). Hence, for all constant velocities $\dot{X}^\mu$ (subject to the constraint $\dot{X}^0 \equiv 1$, which does not affect the conclusion),
[ equation 2 ] \[
\ddot{Y}^\kappa = 0 \\
\implies \left( \frac{\partial y^0}{\partial x^\rho}\frac{\partial^2 y^\kappa}{\partial x^\mu \partial x^\nu} -\frac{\partial y^\kappa}{\partial x^\rho}  \frac{\partial^2 y^0}{\partial x^\mu \partial x^\nu} \right) \dot{X}^\mu \dot{X}^\nu \dot{X}^\rho = 0 \\
\implies \frac{\partial y^0}{\partial x^\rho} \frac{\partial^2 y^\kappa}{\partial x^\mu \partial x^\nu} = \frac{\partial y^\kappa}{\partial x^\rho} \frac{\partial^2 y^0}{\partial x^\mu \partial x^\nu}.
\] Obviously, the expression holds trivially when $\kappa = 0$.
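
Equation 1 is easy to get wrong by a sign or an index, so a numerical cross-check may be useful. The sketch below works in 1+1 dimensions and compares the formula with a direct finite-difference estimate of $\mathrm{d}^2 Y^1 / \mathrm{d}t'^2$. The nonlinear map `y`, the trajectory constants `X0`, `V`, `A` and the step `h` are arbitrary choices made only for illustration; none of them come from the derivation above.

```python
# Hypothetical nonlinear coordinate map (t, x) -> (y0, y1), for illustration only.
def y(t, x):
    return (t + 0.1 * x + 0.05 * t * x, x + 0.2 * t * t)

def jac(t, x):
    # J[k][m] = d y^k / d x^m, with x^0 = t and x^1 = x
    return [[1.0 + 0.05 * x, 0.1 + 0.05 * t],
            [0.4 * t, 1.0]]

def hess(t, x):
    # H[k][m][n] = d^2 y^k / (d x^m d x^n); constant for this particular map
    return [[[0.0, 0.05], [0.05, 0.0]],
            [[0.4, 0.0], [0.0, 0.0]]]

# Assumed trajectory in K: x(t) = X0 + V t + A t^2 / 2
X0, V, A = 0.3, 0.5, 0.7

def traj(t):
    return X0 + V * t + 0.5 * A * t * t

def accel_formula(t, k=1):
    """d^2 Y^k / dt'^2 evaluated from equation 1."""
    x = traj(t)
    xd = [1.0, V + A * t]    # \dot X^mu, with \dot X^0 = 1
    xdd = [0.0, A]           # \ddot X^mu
    J, H = jac(t, x), hess(t, x)

    def first(c):            # (d y^c / d x^mu) \dot X^mu
        return sum(J[c][m] * xd[m] for m in range(2))

    def second(c):           # Hessian term plus \ddot X term for component c
        quad = sum(H[c][m][n] * xd[m] * xd[n]
                   for m in range(2) for n in range(2))
        return quad + sum(J[c][m] * xdd[m] for m in range(2))

    d = first(0)             # dt'/dt
    return (d * second(k) - first(k) * second(0)) / d ** 3

def accel_numeric(t, h=1e-4):
    """The same quantity from nested central differences along the trajectory."""
    def vel(s):              # dY^1/dt' estimated at parameter value s
        y0p, y1p = y(s + h, traj(s + h))
        y0m, y1m = y(s - h, traj(s - h))
        return (y1p - y1m) / (y0p - y0m)

    y0p = y(t + h, traj(t + h))[0]
    y0m = y(t - h, traj(t - h))[0]
    return (vel(t + h) - vel(t - h)) / (y0p - y0m)

if __name__ == "__main__":
    print(accel_formula(1.0), accel_numeric(1.0))
```

The two printed numbers should agree to within the finite-difference error. Setting `A = 0` while keeping the nonlinear map still gives a nonzero result, which is exactly what equation 2 forbids for a transformation between inertial frames.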

Now, according to Special Relativity (SR), the coordinates in the new and old frames satisfy [ equation 3, the equivalence principle ] \[
g_{\mu\nu} \mathrm d x^\mu \mathrm d x^\nu = g_{\mu\nu} \mathrm d y^\mu \mathrm d y^\nu
\implies g_{\mu\nu} \frac{\partial y^\mu}{\partial x^\rho} \frac{ \partial y^\nu}{\partial x^\sigma} = g_{\rho\sigma}, \quad
g^{\rho\sigma} \frac{\partial y^\mu}{\partial x^\rho} \frac{ \partial y^\nu}{\partial x^\sigma} = g^{\mu\nu}
\] where $g_{\mu\nu} = g^{\mu\nu} = \mathrm{diag}\{-1, +1, +1, +1\}$ is the metric tensor. Contracting equation 2 with $g^{\rho\sigma} \frac{\partial y^0}{\partial x^\sigma}$ and using this relation, we get
[ equation 4 ] \[
g^{\rho\sigma} \frac{\partial y^0}{\partial x^\sigma}\frac{\partial y^0}{\partial x^\rho} \frac{\partial^2 y^\kappa}{\partial x^\mu \partial x^\nu} = g^{\rho\sigma} \frac{\partial y^0}{\partial x^\sigma}\frac{\partial y^\kappa}{\partial x^\rho} \frac{\partial^2 y^0}{\partial x^\mu \partial x^\nu} \\
\implies \frac{\partial^2 y^\kappa}{\partial x^\mu \partial x^\nu} = -g^{0\kappa} \frac{\partial^2 y^0}{\partial x^\mu \partial x^\nu}
\] Now, $-g^{0\kappa} = \delta^{0\kappa}$. That means that if $\kappa = 1,2,3$, then $\frac{\partial^2 y^\kappa}{\partial x^\mu \partial x^\nu} = 0 $. Substituting this result back into equation 2 with $\kappa = 1,2,3$, the left-hand side vanishes, leaving $\frac{\partial y^\kappa}{\partial x^\rho} \frac{\partial^2 y^0}{\partial x^\mu \partial x^\nu} = 0$. Since the transformation is invertible, $\frac{\partial y^\kappa}{\partial x^\rho}$ cannot vanish for every spatial $\kappa$ and every $\rho$, so we can see that $\frac{\partial^2 y^0}{\partial x^\mu \partial x^\nu} = 0$. Therefore, we conclude
[ equation 5 ] \[
\frac{\partial^2 y^\kappa}{\partial x^\mu \partial x^\nu} = 0,
\] namely the Lorentz transformation is linear with respect to coordinates $x^\mu$.
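
The link between linearity and the inertial-frame property can be illustrated numerically. In the sketch below (1+1 dimensions, units with $c = 1$), a particle moving uniformly in $K$ is mapped to $K'$ by a linear boost and, for contrast, by a made-up nonlinear map; the acceleration in $K'$ is then estimated with central differences. The function names, the boost speed `beta`, the nonlinearity `eps` and the step `h` are assumptions for illustration only.

```python
import math

def boost(t, x, beta=0.6):
    """Standard Lorentz boost along x (linear in t and x), with c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return gamma * (t - beta * x), gamma * (x - beta * t)

def nonlinear(t, x, eps=0.1):
    """Hypothetical nonlinear map, used only as a counterexample."""
    return t, x + eps * t * t

def accel_in_new_frame(transform, v, t=1.0, h=1e-3):
    """Estimate d^2 x'/dt'^2 for the uniform trajectory x = v t in K."""
    def vel(s):
        # dx'/dt' at parameter value s, from a central difference
        t_p, x_p = transform(s + h, v * (s + h))
        t_m, x_m = transform(s - h, v * (s - h))
        return (x_p - x_m) / (t_p - t_m)

    t_p, _ = transform(t + h, v * (t + h))
    t_m, _ = transform(t - h, v * (t - h))
    return (vel(t + h) - vel(t - h)) / (t_p - t_m)

if __name__ == "__main__":
    print("boost:    ", accel_in_new_frame(boost, 0.3))
    print("nonlinear:", accel_in_new_frame(nonlinear, 0.3))
```

The boost should return an acceleration that is zero up to rounding error, while the nonlinear map turns the same uniform motion into accelerated motion, so it cannot connect two inertial frames.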

Due to the linearity, we can write the Lorentz transformation as \[
y^\mu = \Lambda^\mu_{\;\nu} x^\nu + a^\mu,
\] where $\Lambda^\mu_{\;\nu}$ and $a^\mu$ do not depend on the coordinates $x^\mu$. Substituting this form into equation 3 shows that $\Lambda^\mu_{\;\nu}$ satisfies $g_{\mu\nu} \Lambda^\mu_{\;\rho} \Lambda^\nu_{\;\sigma} = g_{\rho\sigma}$. Because this constraint is symmetric in $\rho$ and $\sigma$, it amounts to 10 independent equations, which reduces the number of free parameters of $\Lambda^\mu_{\;\nu}$ from $4\times4 = 16$ to 6 (3 angles and 3 rapidities). These transformations are also the solutions of equation 2 and equation 3. As we know, they form a group, the Poincaré group.
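
As a quick sanity check of the constraint $g_{\mu\nu} \Lambda^\mu_{\;\rho} \Lambda^\nu_{\;\sigma} = g_{\rho\sigma}$, the following sketch builds a boost along $x$ and a rotation about $z$ as plain 4x4 lists and tests whether $\Lambda^{T} g \Lambda = g$ numerically. The helper names (`boost_x`, `rot_z`, `preserves_metric`), the parameter values and the tolerance are illustrative assumptions.

```python
import math

# Metric with signature (-, +, +, +), as in the text.
G = [[-1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def boost_x(rapidity):
    """Boost along x, parametrized by rapidity."""
    c, s = math.cosh(rapidity), math.sinh(rapidity)
    return [[c, -s, 0, 0],
            [-s, c, 0, 0],
            [0, 0, 1, 0],
            [0, 0, 0, 1]]

def rot_z(angle):
    """Spatial rotation about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1, 0, 0, 0],
            [0, c, -s, 0],
            [0, s, c, 0],
            [0, 0, 0, 1]]

def preserves_metric(lam, tol=1e-12):
    """True if Lambda^T g Lambda equals g entry by entry, within tol."""
    m = matmul(transpose(lam), matmul(G, lam))
    return all(abs(m[i][j] - G[i][j]) < tol
               for i in range(4) for j in range(4))

if __name__ == "__main__":
    lam = matmul(boost_x(0.8), rot_z(0.3))
    print(preserves_metric(lam))
```

Because products of such matrices also pass the check, this gives a small numerical hint of the group property mentioned above. A generic linear map, such as a shear, does not satisfy the constraint.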

edit the wording and notations.

supplement the derivation with the $\frac{\mathrm d^2 t}{\mathrm d t'^2}$ part (red text). It was omitted from the previous derivation because $\frac{\mathrm d^2 t}{\mathrm d t'^2}= 0 $ (or I forgot).

rederive the linearity from the equivalence principle


  1. Thanks!
    But what happened with the (d^2 t / dt'^2) - terms?
    I couldn't find out how to type formula here, so here's a screenshot


  2. you forgot the dt/dt' term in your first red term.

  3. If y = square(x1) + square (x2), it is non-linear

    but its second-order derivatives, y'' = zero.

    so, y''=0 does not imply linearity, right or wrong?

    1. In your example, $\frac{\partial^2 y}{\partial x_1 \partial x_1} = -\frac{1}{4}\frac{1}{\sqrt{x_1^3}} \ne 0$.