Today, in our General Relativity class, a student asked Dr. Anderson why the Lorentz transformation must be linear. Dr. Anderson seemed unprepared for the question and simply argued that it does not have to be, but that linearity is assumed for simplicity. In particular, he said, linearity holds for infinitesimal transformations (to first order). His answer isn't satisfactory to me.
The Lorentz transformation is defined between inertial frames. (Coordinate transformations between general frames are the subject of general relativity.) The linearity of the Lorentz transformation follows from the properties of inertial frames and the equivalence principle (see, for example, Weinberg, Gravitation and Cosmology: Principles and Applications of the General Theory of Relativity). Newton's first law defines an inertial reference frame:
"In an inertial frame, every body persists in its state of being at rest or of moving uniformly straight forward, except insofar as it is compelled to change its state by force impressed."
Therefore, if an object is not accelerating in one inertial frame, it must not be accelerating in any other. Now let's apply this to the Lorentz transformations.
Let $K$ and $K'$ be two inertial frames in relative motion with constant velocity $\mathbf{V}$. We denote the coordinates in the two frames by $x^\mu$ and $y^\mu$. Let $x^\mu=X^\mu(t)$ and $y^\mu=Y^\mu(t')$ be the trajectories of a particle in $K$ and $K'$ respectively. Note that $t = x^0 = X^0(t)$ and $t' = y^0 = Y^0(t')$, so $X^0$ and $Y^0$ are identity functions. The velocities are $\frac{d}{dt}X^\mu \equiv \dot X^\mu$ and $\frac{d}{dt'}Y^\mu \equiv \dot{Y}^\mu$; a dot always denotes a derivative with respect to the frame's own time, $t$ in $K$ and $t'$ in $K'$. The two times are related by $\frac{d t'}{d t} = \dot{X}^\mu \frac{\partial y^0}{\partial x^\mu}$. So,
\[
\frac{\mathrm{d}^2 Y^\kappa }{\mathrm{d}t'^2} =
\frac{
\frac{\partial y^0}{\partial x^\rho}\dot{X}^\rho \frac{\partial^2 y^\kappa}{\partial x^\mu \partial x^\nu} \dot{X}^\mu\dot{X}^\nu
+ \frac{\partial y^0}{\partial x^\rho}\dot{X}^\rho \frac{\partial y^\kappa}{\partial x^\mu} \ddot{X}^\mu
-\frac{\partial y^\kappa}{\partial x^\rho} \dot{X}^\rho \frac{\partial^2 y^0}{\partial x^\mu \partial x^\nu} \dot{X}^\mu \dot{X}^\nu
- \frac{\partial y^\kappa}{\partial x^\rho} \dot{X}^\rho \frac{\partial y^0}{\partial x^\mu} \ddot{X}^\mu
}{
\left( \frac{\partial y^0}{\partial x^\lambda} \dot{X}^\lambda \right)^3 }
\tag{1}
\]
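As a sketch of the intermediate step behind equation 1 (same notation as above; $A^\kappa$ and $B$ are shorthand introduced only for this sketch), the chain rule gives the first derivative, and one more application of the quotient rule gives the second:
\[
\begin{aligned}
\frac{\mathrm{d} Y^\kappa}{\mathrm{d} t'} &= \frac{\mathrm{d} t}{\mathrm{d} t'} \, \frac{\mathrm{d}}{\mathrm{d} t}\, y^\kappa\big(X(t)\big) = \frac{A^\kappa}{B},
\qquad A^\kappa \equiv \frac{\partial y^\kappa}{\partial x^\mu} \dot{X}^\mu,
\quad B \equiv \frac{\partial y^0}{\partial x^\lambda} \dot{X}^\lambda = A^0, \\
\frac{\mathrm{d}^2 Y^\kappa}{\mathrm{d} t'^2} &= \frac{1}{B} \frac{\mathrm{d}}{\mathrm{d} t} \left( \frac{A^\kappa}{B} \right) = \frac{B \, \dot{A}^\kappa - A^\kappa \, \dot{B}}{B^3}, \\
\dot{A}^\kappa &= \frac{\partial^2 y^\kappa}{\partial x^\mu \partial x^\nu} \dot{X}^\mu \dot{X}^\nu + \frac{\partial y^\kappa}{\partial x^\mu} \ddot{X}^\mu,
\qquad \dot{B} = \dot{A}^0 .
\end{aligned}
\]
Expanding $B \, \dot{A}^\kappa - A^\kappa \, \dot{B}$ reproduces the four terms in the numerator of equation 1.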
Suppose the particle moves with uniform velocity in frame $K$: $\ddot{\mathbf{X}} = \mathbf{a} = 0$ and $\ddot{X}^0(t) = \frac{\mathrm{d}^2 t}{\mathrm{d}t^2} = 0$ (i.e. $\ddot{X}^\mu = 0$). By the defining property of inertial frames, it has zero acceleration in $K'$ too: $\ddot{\mathbf{Y}} = \mathbf{a}' = 0$ and $\ddot{Y}^0(t') = \frac{\mathrm{d}^2 t'}{\mathrm{d}t'^2} = 0$ (i.e. $\ddot{Y}^\mu = 0$). Hence, for all constant velocities $\dot{X}^\mu$ (subject to the constraint $\dot{X}^0 \equiv 1$, which does not affect the conclusion),
\[
\ddot{Y}^\kappa = 0
\implies
\left( \frac{\partial y^0}{\partial x^\rho}\frac{\partial^2 y^\kappa}{\partial x^\mu \partial x^\nu} -\frac{\partial y^\kappa}{\partial x^\rho} \frac{\partial^2 y^0}{\partial x^\mu \partial x^\nu} \right) \dot{X}^\mu \dot{X}^\nu \dot{X}^\rho = 0
\implies
\frac{\partial y^0}{\partial x^\rho} \frac{\partial^2 y^\kappa}{\partial x^\mu \partial x^\nu} = \frac{\partial y^\kappa}{\partial x^\rho} \frac{\partial^2 y^0}{\partial x^\mu \partial x^\nu}.
\tag{2}
\]
Obviously, the last equality holds trivially when $\kappa = 0$.
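As a sketch of how equation 2 follows from equation 1 (same notation, nothing new assumed): setting $\ddot{X}^\mu = 0$ removes the two terms that contain $\ddot{X}^\mu$, leaving
\[
\frac{\mathrm{d}^2 Y^\kappa}{\mathrm{d} t'^2} =
\frac{
\left( \frac{\partial y^0}{\partial x^\rho} \frac{\partial^2 y^\kappa}{\partial x^\mu \partial x^\nu} - \frac{\partial y^\kappa}{\partial x^\rho} \frac{\partial^2 y^0}{\partial x^\mu \partial x^\nu} \right) \dot{X}^\rho \dot{X}^\mu \dot{X}^\nu
}{
\left( \frac{\partial y^0}{\partial x^\lambda} \dot{X}^\lambda \right)^3 } .
\]
The denominator is $(\mathrm{d} t' / \mathrm{d} t)^3 \neq 0$, so the numerator must vanish for every admissible velocity. One caveat worth keeping in mind: a cubic form that vanishes for all $\dot{X}^\mu$ directly fixes only the part of its coefficient that is totally symmetric in $\rho, \mu, \nu$; the unsymmetrized identity at the end of equation 2 is the stronger form that the rest of the derivation uses.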
Now, according to Special Relativity (SR), the coordinates in the new and old frames satisfy equation 3, the equivalence principle:
\[
g_{\mu\nu} \mathrm d x^\mu \mathrm d x^\nu = g_{\mu\nu} \mathrm d y^\mu \mathrm d y^\nu
\implies g_{\mu\nu} \frac{\partial y^\mu}{\partial x^\rho} \frac{ \partial y^\nu}{\partial x^\sigma} = g_{\rho\sigma}, \quad
g^{\rho\sigma} \frac{\partial y^\mu}{\partial x^\rho} \frac{ \partial y^\nu}{\partial x^\sigma} = g^{\mu\nu}
\tag{3}
\]
where $g_{\mu\nu} = g^{\mu\nu} = \mathrm{diag}\{-1, +1, +1, +1\}$ is the metric tensor. Using this relation and equation 2, we get
\[
\begin{aligned}
& g^{\rho\sigma} \frac{\partial y^0}{\partial x^\sigma}\frac{\partial y^0}{\partial x^\rho} \frac{\partial^2 y^\kappa}{\partial x^\mu \partial x^\nu} = g^{\rho\sigma} \frac{\partial y^0}{\partial x^\sigma}\frac{\partial y^\kappa}{\partial x^\rho} \frac{\partial^2 y^0}{\partial x^\mu \partial x^\nu} \\
\implies\; &
\frac{\partial^2 y^\kappa}{\partial x^\mu \partial x^\nu} = -g^{0\kappa} \frac{\partial^2 y^0}{\partial x^\mu \partial x^\nu}
\end{aligned}
\tag{4}
\]
Now, $-g^{0\kappa} = \delta^{0\kappa}$. That means, if $\kappa = 1,2,3$, then $\frac{\partial^2 y^\kappa}{\partial x^\mu \partial x^\nu} = 0 $. Substituting this result back into equation 2 with $\kappa = 1,2,3$ leaves $\frac{\partial y^\kappa}{\partial x^\rho} \frac{\partial^2 y^0}{\partial x^\mu \partial x^\nu} = 0$; because the transformation is invertible, $\frac{\partial y^\kappa}{\partial x^\rho}$ cannot vanish for every $\rho$, so $\frac{\partial^2 y^0}{\partial x^\mu \partial x^\nu} = 0$. Therefore, we conclude
\[
\frac{\partial^2 y^\kappa}{\partial x^\mu \partial x^\nu} = 0,
\tag{5}
\]
namely, the Lorentz transformation is linear in the coordinates $x^\mu$.
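For reference, the contraction that takes equation 2 to equation 4 can be sketched as follows (same notation, nothing new assumed). Multiply equation 2 by $g^{\rho\sigma} \frac{\partial y^0}{\partial x^\sigma}$, sum over $\rho$ and $\sigma$, and evaluate the two contractions with the second form of equation 3:
\[
\begin{aligned}
g^{\rho\sigma} \frac{\partial y^0}{\partial x^\rho} \frac{\partial y^0}{\partial x^\sigma} &= g^{00} = -1, \\
g^{\rho\sigma} \frac{\partial y^\kappa}{\partial x^\rho} \frac{\partial y^0}{\partial x^\sigma} &= g^{\kappa 0}, \\
\text{so} \quad - \frac{\partial^2 y^\kappa}{\partial x^\mu \partial x^\nu} &= g^{0\kappa} \frac{\partial^2 y^0}{\partial x^\mu \partial x^\nu} .
\end{aligned}
\]
Since $g^{0\kappa}$ vanishes for $\kappa = 1,2,3$, the spatial components drop out immediately, which is the case used in the text.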
Due to the linearity, we can write the Lorentz transformation as
\[
y^\mu = \Lambda^\mu_{\;\nu} x^\nu + a^\mu,
\]
where $\Lambda^\mu_{\;\nu}$ and $a^\mu$ do not depend on the coordinates $x^\mu$. Evidently, by equation 3, $\Lambda^\mu_{\;\nu}$ satisfies $g_{\mu\nu} \Lambda^\mu_{\;\rho} \Lambda^\nu_{\;\sigma} = g_{\rho\sigma}$. This constraint reduces the number of free parameters of $\Lambda^\mu_{\;\nu}$ from $4\times4 = 16$ to 6 (3 angles and 3 rapidities). These transformations are also the solutions of equations 2 and 3. As we know, they form a group, the Poincaré group.
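The statements above can be checked numerically. The following is a minimal sketch in plain Python, not part of the derivation: it assumes units with $c = 1$, uses a boost along $x$ and a rotation about $z$ as sample transformations, and all function names are hypothetical helpers chosen for illustration. It verifies the metric condition $g_{\mu\nu} \Lambda^\mu_{\;\rho} \Lambda^\nu_{\;\sigma} = g_{\rho\sigma}$, counts the free parameters (the condition is symmetric in $\rho, \sigma$, so it contains $4 \cdot 5 / 2 = 10$ independent equations, leaving $16 - 10 = 6$), and confirms that an affine map $y = \Lambda x + a$ sends uniform motion to uniform motion.

```python
import math

# Metric tensor g = diag(-1, +1, +1, +1), as in the text (units with c = 1).
G = [[-1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]


def boost_x(rapidity):
    """Boost along the x axis, parametrized by rapidity."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return [[ch, -sh, 0.0, 0.0],
            [-sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def rotation_z(angle):
    """Spatial rotation about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]


def preserves_metric(lam, tol=1e-9):
    """Check g_{mu nu} Lam^mu_rho Lam^nu_sigma = g_{rho sigma}, i.e. Lam^T G Lam = G."""
    lhs = matmul(transpose(lam), matmul(G, lam))
    return all(abs(lhs[i][j] - G[i][j]) < tol
               for i in range(4) for j in range(4))


def free_parameters(n=4):
    """n*n matrix entries minus n(n+1)/2 independent symmetric constraints."""
    return n * n - n * (n + 1) // 2


def transform(lam, a, x):
    """Apply y^mu = Lam^mu_nu x^nu + a^mu to a single event x."""
    return [sum(lam[m][n] * x[n] for n in range(4)) + a[m] for m in range(4)]


def maps_uniform_motion_to_uniform_motion(lam, a, x0, u, tol=1e-9):
    """Transform three equally spaced events on the worldline x0 + u*t.

    Equal increments between the transformed events mean the image is again
    a straight worldline traversed at constant velocity.
    """
    events = [[x0[m] + u[m] * t for m in range(4)] for t in (0.0, 1.0, 2.0)]
    y0, y1, y2 = (transform(lam, a, e) for e in events)
    return all(abs(y2[m] - 2.0 * y1[m] + y0[m]) < tol for m in range(4))


if __name__ == "__main__":
    lam = matmul(boost_x(0.7), rotation_z(1.1))
    print(preserves_metric(lam))
    print(free_parameters())
    print(maps_uniform_motion_to_uniform_motion(
        lam, [1.0, 2.0, 3.0, 4.0], [0.0, 0.5, -1.0, 2.0], [1.0, 0.3, 0.2, -0.1]))
```

The uniform-motion check passes for any constant matrix, not only for Lorentz matrices; it is the metric condition that singles out the six-parameter family.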
Update:
Edited the wording and notation.
Supplemented the derivation with the $\frac{\mathrm d^2 t}{\mathrm d t'^2}$ part (red text). It was omitted in the previous derivation because $\frac{\mathrm d^2 t}{\mathrm d t'^2}= 0 $ (or I forgot).
Rederived the linearity from the equivalence principle.