Thinking Relativistically

In my last article on special relativity, I explained how and why our intuitive understanding of space and time needed to be modified to account for developments in electromagnetic theory that took place in the 19th century, in particular the results of the Michelson-Morley experiment and the non-covariance of the electromagnetic wave equation under the Galilean transform. I explained how length contraction, time dilation, the invariance of the spacetime interval, and ultimately the Lorentz transformation follow from the two postulates of special relativity:

  • First postulate: The laws of physics take the same form in all inertial reference frames.
  • Second postulate: The speed of light has the same numerical value in all inertial reference frames.

Having seen that we live in a relativistic universe, we can now look at the ways in which our thinking must be adjusted in order to properly understand physics in such a universe.

Reference frames

Special relativity is ultimately a theory about the ways in which the same physical phenomena will appear to observers in different reference frames. All of the important and famous ideas that are associated with special relativity, such as mass-energy equivalence, the impossibility of time travel, the speed of light as the universal speed limit, and the phenomenon of redshift, follow from the two postulates above.

To mathematically express the law of motion for a physical system and to measure the parameters of that system, we need a coordinate system. However, coordinate systems are mathematical abstractions and Nature does not a priori require that coordinate systems must exist or that any particular coordinate system must be used for describing a physical system. To use a coordinate system, we will need to declare one. To do so, we will first declare a reference, which consists of a point at some location in physical space and a system of perpendicular lines intersecting at that point. We can then use the reference to declare a Cartesian coordinate system by saying that the reference lines are the axes and the origin is their point of intersection. The combination of the coordinate system and the reference is called the reference frame.

The choice of the location of the origin and the orientation of the axes will always be in reference to a physical object or collection of objects, such as the position of a person standing on a train with the x-axis pointing towards the front of the train or the center of a cube with the axes perpendicular to the faces of the cube. Nothing prevents us from defining multiple reference frames for the same system. For example, we might also define a reference frame attached to the position of a person standing by the tracks watching the train go by, or we might define a reference frame fixed at the position of somebody who’s watching the cube rotate. In this case, the reference frames are in motion with respect to each other.

This is why it’s important to not confuse the coordinate system with the entire reference frame. A coordinate system is ultimately a function that maps points in space into ordered tuples of numbers, and it is nonsensical to say that a function, which is an abstract mathematical object, is rotating or moving in physical space. For our purposes, this function will be defined in terms of the reference. For example, the two-dimensional Cartesian coordinate system will send a point P to the ordered pair F(P)=(x(P),y(P)) where x(P) is the perpendicular distance from point P to the y-axis and y(P) is the perpendicular distance from point P to the x-axis:

In this case, F(P)=(2,3), so (2,3) is the coordinate pair of point P.
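The idea that a coordinate system is a function defined in terms of a reference can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `make_coordinate_system`, `origin`, and `x_axis_angle` are hypothetical, and points are stored as pairs of numbers in an arbitrary "bookkeeping" system purely because a computer needs numbers to work with. The sketch shows that the same point P is sent to different coordinate pairs when the reference changes.

```python
import math

def make_coordinate_system(origin, x_axis_angle=0.0):
    """Build the function F for a reference whose axes meet at `origin` and whose
    x-axis is tilted by `x_axis_angle` radians (hypothetical parameters)."""
    ox, oy = origin
    cos_a, sin_a = math.cos(x_axis_angle), math.sin(x_axis_angle)

    def F(point):
        # Displacement of the point from the reference's origin.
        dx, dy = point[0] - ox, point[1] - oy
        # Project the displacement onto the reference's x- and y-axes.
        x = cos_a * dx + sin_a * dy
        y = -sin_a * dx + cos_a * dy
        return (x, y)

    return F

P = (2.0, 3.0)                                     # one fixed point
F = make_coordinate_system(origin=(0.0, 0.0))      # first reference
G = make_coordinate_system(origin=(1.0, 1.0))      # second reference, shifted origin

print(F(P))  # (2.0, 3.0)
print(G(P))  # (1.0, 2.0)
```

The point P itself never changes; only the function used to label it does, which is why it is the reference, and not the coordinate system, that can be said to move.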

We will only consider inertial reference frames, meaning frames that move at constant velocity with respect to each other and are neither accelerating nor rotating. Special relativity can account for non-inertial frames as long as gravity isn’t what’s driving the acceleration (for that you need general relativity), but we will hold off on this until much later in this series. The reason velocity and acceleration are treated differently is that if it were possible to detect absolute velocity, then it would be possible to define an absolute reference frame, and this would contradict the fact that Nature does not a priori come equipped with a coordinate system. Absolute accelerations, however, are detectable because accelerations imply forces, and a force either acts or does not. If this were not the case, then it would be possible for a physical process to occur in one frame but not in another, which violates the first postulate.

We will usually be interested in situations where an experimenter is observing the motion of some physical object, and neither the observer nor the object is subject to any acceleration. The unique frame where the object’s velocity is zero is called the rest frame (or proper frame) S′ and the unique frame where the observer’s velocity is zero is called the observer frame (or lab frame) S. It will usually be obvious which is which. Finally, we will always assume that the observer and rest frame are in standard configuration, meaning that S′ has constant velocity in the x-direction with respect to S, the origins of their coordinate systems coincide at t=t′=0, and the coordinate axes in both frames are parallel:

Source: Wikimedia Commons. Public domain.

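The coordinates in the rest frame are labelled with primes, (t′,x′,y′,z′), and the observer frame coordinates are labelled simply (t,x,y,z). With v denoting the velocity of S′ with respect to S along the x-direction, these coordinates are related by the Lorentz transform:

t′ = γ(t − vx/c²),  x′ = γ(x − vt),  y′ = y,  z′ = z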

We will almost always ignore the y and z coordinates.

The symbol γ denotes the Lorentz factor:

γ = 1/√(1 − v²/c²)
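As a rough numerical sketch, the Lorentz transform in standard configuration can be written as a small Python function. The function names here are hypothetical, the formulas are the standard textbook ones for a frame S′ moving with velocity v along the x-axis of S, and the sign convention for the interval, (cΔt)² − Δx², is an assumption that may differ from the one used elsewhere in this series. The example works in units where c = 1 to keep the numbers readable.

```python
import math

C = 299_792_458.0  # speed of light in m/s (default; the example below uses c = 1)

def lorentz_factor(v, c=C):
    """Lorentz factor for relative speed v; only defined for |v| < c."""
    if abs(v) >= c:
        raise ValueError("relative speed must satisfy |v| < c")
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def lorentz_transform(t, x, v, c=C):
    """Map an event (t, x) in S to (t', x') in S', ignoring y and z."""
    gamma = lorentz_factor(v, c)
    t_prime = gamma * (t - v * x / c ** 2)
    x_prime = gamma * (x - v * t)
    return t_prime, x_prime

def interval_squared(t, x, c=C):
    """Spacetime interval from the origin, with the (ct)^2 - x^2 sign convention."""
    return (c * t) ** 2 - x ** 2

# An event at t = 5, x = 3 in S, viewed from a frame moving at v = 0.6 (c = 1).
t_p, x_p = lorentz_transform(5.0, 3.0, v=0.6, c=1.0)
print(t_p, x_p)                            # approximately 4.0 and 0.0
print(interval_squared(5.0, 3.0, c=1.0))   # 16.0
print(interval_squared(t_p, x_p, c=1.0))   # approximately 16.0
```

The coordinates of the event differ between the two frames, but the interval computed from them agrees up to floating-point rounding.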

One last point before we move on. If an object is determined to have length L in frame S and length L′ in frame S′, if the time interval between two events is Δt in frame S and Δt′ in frame S′, or if the speed of a particle is U in frame S and U′ in frame S′, then I do not say that L, Δt, and U “appear” to have different values in different frames. The word “appear” implies that the difference in these values between frames is somehow an error or an illusion that causes them to deviate from a single “true” value or that the frame dependence of these quantities represents a limitation of our knowledge. This is not the case. There is no such thing as the true length of an object, the true time interval between events, or the true speed of a particle because measuring these things requires a coordinate system and therefore requires a reference frame. There is no single “correct” reference frame, so there is no single “correct” value of these quantities.
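To make the frame dependence of L, Δt, and U concrete, here is a small Python sketch using the standard length contraction, time dilation, and velocity addition formulas for frames in standard configuration. The function names are hypothetical and the formulas are stated as assumptions rather than derived here: L′ is taken to be the length of an object at rest in S′, Δt′ the interval between two events at the same place in S′, and U′ a velocity along the x-direction measured in S′. Units are chosen so that c = 1.

```python
import math

def lorentz_factor(v, c=1.0):
    """Lorentz factor for relative speed v, with |v| < c."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

def length_in_observer_frame(L_rest, v, c=1.0):
    """Length contraction: L = L' / gamma for an object at rest in S'."""
    return L_rest / lorentz_factor(v, c)

def time_in_observer_frame(dt_rest, v, c=1.0):
    """Time dilation: dt = gamma * dt' for events at the same place in S'."""
    return lorentz_factor(v, c) * dt_rest

def velocity_in_observer_frame(U_rest, v, c=1.0):
    """Velocity addition along x: U = (U' + v) / (1 + U' v / c^2)."""
    return (U_rest + v) / (1.0 + U_rest * v / c ** 2)

v = 0.6  # speed of S' relative to S, as a fraction of c
print(length_in_observer_frame(10.0, v))    # approximately 8.0
print(time_in_observer_frame(4.0, v))       # approximately 5.0
print(velocity_in_observer_frame(0.5, v))   # approximately 0.846
print(velocity_in_observer_frame(1.0, v))   # approximately 1.0: light moves at c in both frames
```

Each pair of values, (L, L′), (Δt, Δt′), and (U, U′), is equally valid; the functions simply translate a measurement made in one frame into the corresponding measurement in the other.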

Now let’s get to the heart of the matter and use what special relativity tells us about reference frames to analyze some physical problems.

Invariance of causality and the relativistic speed limit

Observing a physical system from a different reference frame doesn’t add anything to the physics that underlie the behavior of that system. This means that anything that is true in one frame must be true in every other frame, although the explanation for why that thing is true may change. Put another way, changing your frame of reference does not change the facts of Nature, and one of the most important facts of Nature is the causal relationships between physical events. If event A causes event B in one frame then there is no frame in which event B causes event A. This is called invariance of causality. We can use invariance of causality to understand what it means when we say that c is the “universal speed limit”.

Suppose that there is a frame S in which event A causes event B via propagation of a faster-than-light signal, for example, by firing a bullet with speed U>c. In this frame, let Δx be the distance between the two events and let Δt be the time separation between them so that U=Δx/Δt. Since event A precedes event B in frame S, Δt must be positive. Let S′ be a frame moving with speed v