If we measure the speed of light in a vacuum, we always get the same value, independent of the velocity of the light source. The speed of light is a universal constant. This fact has several interesting qualitative consequences. Firstly, choose units so that the speed of light is \(1\). Then the time taken for light to travel from \(A\) to \(B\) is equal to the distance between \(A\) and \(B\). Secondly, since we learn about the objects around us from the light which bounces off them, we expect our perception of these objects to be distorted if they are moving close to the speed of light. We are going to develop a quantitative understanding of this distortion.

Before we can derive any formulas, we need to understand how spatial measurements get distorted perpendicular to velocity. Suppose that you have a square sheet of metal and you cut out a complicated shape from the middle:

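The cut-out piece fits perfectly through the hole in the sheet. When objects are moving really fast, their size doesn’t change, just our perception of their size does. If the cut-out piece is moving fast, parallel to the normal vector of the sheet (which is at rest in our frame of reference), it still fits through the hole. If perpendicular lengths appeared contracted or stretched, an observer at rest with the sheet and an observer riding on the piece would disagree about whether the piece passes through the hole, yet whether it passes through is a physical fact on which both must agree. Therefore there cannot be any distortion of distances perpendicular to the velocity.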

Since the time taken for light to travel from \(A\) to \(B\) is equal to the distance between \(A\) and \(B\) (in our units), we can use light beams to construct a clock. Consider the following apparatus at rest:

It takes time \(d\) for the light to travel from the bottom plate to the top plate. Now suppose that the clock is moving horizontally in your reference frame:

It takes time \(l\) for the light to travel from the bottom plate to the top plate in your reference frame, and during this time the clock moves a horizontal distance \(x\). In the clock’s reference frame, it takes time \(d\) for the light to travel from the bottom plate to the top plate. Since distances don’t get distorted perpendicular to the velocity, the plates are still a distance \(d\) apart in your frame, so the light path of length \(l\) is the hypotenuse of a right triangle with legs \(x\) and \(d\). Pythagoras’s theorem tells us \[l^2 - x^2 = d^2.\] If we are watching an object traveling at a constant velocity, the time measured by a clock strapped to the object is \[\sqrt{t^2 - x^2}\] where \(t\) is the time measured by our clock and \(x\) is the distance which the object travels in our reference frame. This quantity is called a **spacetime interval** and its value is independent of the frame in which it is measured.
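As a quick numerical illustration of the formula \(\sqrt{t^2 - x^2}\), here is a minimal Python sketch in units where the speed of light is \(1\). The function name `proper_time` and the sample numbers are hypothetical choices made for this example. For an object moving at constant speed \(v\) we have \(x = vt\), so the moving clock should read \(t\sqrt{1 - v^2}\).

```python
import math


def proper_time(t, x):
    """Time shown by a clock that moves a distance x during our time t (c = 1)."""
    if abs(x) > t:
        raise ValueError("the clock would have to move faster than light")
    return math.sqrt(t * t - x * x)


# A clock moving at speed v = 0.6 covers x = 3 during t = 5 of our time.
elapsed = proper_time(5.0, 3.0)
print(elapsed)  # 4.0
```

The moving clock shows only \(4\) units of elapsed time while ours shows \(5\), in agreement with \(5\sqrt{1 - 0.6^2} = 4\).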

Postulating that the speed of light is a universal constant forces us to unify space and time. To deal with this, we treat the time coordinate \(t\) and the spatial coordinates \(x,y,z\) as coordinates on spacetime or Minkowski space \(M \cong \mathbb{R}^4\). Minkowski space is not just a bare \(4\)-manifold. We also want to capture the spacetime interval which is independent of reference frame. If we see an object traveling with tangent vector \[s \frac{\partial}{\partial t} + a \frac{\partial}{\partial x} + b \frac{\partial}{\partial y} + c \frac{\partial}{\partial z}\] then the square of the spacetime interval \[s^2 - a^2 - b^2 - c^2\] is independent of reference frame. Therefore, we should equip \(M\) with the symmetric \(2\)-cotensor \[dt^2 - dx^2 - dy^2 - dz^2\] called the Minkowski metric. It is important to understand the subgroup of \({\rm GL}(M)\) preserving the Minkowski metric. We call this subgroup the **Lorentz group**. The Lie algebra of the Lorentz group has basis \[
\begin{pmatrix}
0 & 1 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}, \;
\begin{pmatrix}
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}, \;
\begin{pmatrix}
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0
\end{pmatrix}, \;
\begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}, \;
\begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 \\
0 & -1 & 0 & 0
\end{pmatrix}, \;
\begin{pmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & -1 & 0
\end{pmatrix}.
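\] The last three basis vectors span the Lie algebra of rotations of the spatial coordinates \(x,y,z\). The interesting vectors are the first three, which mix time with a spatial coordinate. The \(1\)-parameter subgroup generated by the first vector, with group parameter \(t\) (not to be confused with the time coordinate), is \[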
\begin{pmatrix}
{\rm cosh}(t) & {\rm sinh}(t) & 0 & 0 \\
{\rm sinh}(t) & {\rm cosh}(t) & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}.
\] If we make the substitution \[v = {\rm tanh}(t) = \frac{{\rm sinh}(t)}{{\rm cosh}(t)}\] then the \(1\)-parameter subgroup becomes \[
\begin{pmatrix}
\frac{1}{\sqrt{1 - v^2}} & \frac{v}{\sqrt{1-v^2}} & 0 & 0 \\
\frac{v}{\sqrt{1-v^2}} & \frac{1}{\sqrt{1 - v^2}} & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}.
\] This is the standard form of the Lorentz transformation relating two reference frames with relative velocity \(v\) along the \(x\)-axis.
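The claims above can be checked numerically. The following is a minimal sketch in plain Python, not a definitive implementation; all names (`ETA`, `generator`, `boost_x`, and so on) are hypothetical helpers introduced for this example. It assumes the coordinate order \((t, x, y, z)\) and uses the standard fact that a matrix \(A\) lies in the Lie algebra of the group preserving the metric \(\eta\) exactly when \(A^T \eta + \eta A = 0\).

```python
import math

# Minkowski metric dt^2 - dx^2 - dy^2 - dz^2 in coordinates (t, x, y, z).
ETA = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def matmul(a, b):
    """Multiply two 4x4 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))


def generator(i, j, sign):
    """Matrix with 1 at (i, j) and sign at (j, i), zeros elsewhere."""
    m = [[0] * 4 for _ in range(4)]
    m[i][j] = 1
    m[j][i] = sign
    return m


# The six basis matrices listed above: three boosts, then three rotations.
GENERATORS = [generator(0, 1, 1), generator(0, 2, 1), generator(0, 3, 1),
              generator(1, 2, -1), generator(1, 3, -1), generator(2, 3, -1)]


def lie_algebra_defect(a):
    """Largest entry of A^T eta + eta A; zero for Lorentz Lie algebra elements."""
    lhs = matmul(transpose(a), ETA)
    rhs = matmul(ETA, a)
    return max(abs(lhs[i][j] + rhs[i][j]) for i in range(4) for j in range(4))


def boost_x(v):
    """The Lorentz transformation above, written in terms of v with |v| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return [[gamma, gamma * v, 0, 0], [gamma * v, gamma, 0, 0],
            [0, 0, 1, 0], [0, 0, 0, 1]]


def interval_squared(vec):
    s, a, b, c = vec
    return s * s - a * a - b * b - c * c


v = 0.6
boost = boost_x(v)

# 1. Every basis matrix satisfies the Lie algebra condition.
defects = [lie_algebra_defect(g) for g in GENERATORS]

# 2. The boost preserves the metric: L^T eta L = eta.
metric_error = max_abs_diff(matmul(transpose(boost), matmul(ETA, boost)), ETA)

# 3. With group parameter atanh(v), the cosh/sinh form agrees with the v form.
p = math.atanh(v)
hyperbolic = [[math.cosh(p), math.sinh(p), 0, 0],
              [math.sinh(p), math.cosh(p), 0, 0],
              [0, 0, 1, 0], [0, 0, 0, 1]]
form_error = max_abs_diff(boost, hyperbolic)

# 4. The squared spacetime interval of a sample tangent vector is unchanged.
vec = [2.0, 1.0, 0.5, -0.3]
moved = [sum(boost[i][k] * vec[k] for k in range(4)) for i in range(4)]
interval_change = abs(interval_squared(vec) - interval_squared(moved))

print(defects, metric_error, form_error, interval_change)
```

Up to floating-point rounding, all four quantities should vanish. Plain nested lists are used here only to keep the sketch free of dependencies; a numerical library would make the matrix algebra shorter.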