Einstein's Question

Sometime in 1895, when he was sixteen years old, Albert Einstein (1879 Mar 14 - 1955 Apr 18) asked himself a question whose pursuit would lead him, ten years later, to present to the world his theory of Special Relativity. He asked himself what he would see if he could pace a ray of light, running alongside it as it propagated itself through space. That strange question raises a question of its own: How did a contemplation of light lead Einstein to the discovery that space and time differ between different observers if those observers have some motion between them?

In that same year Hendrik Antoon Lorentz (1853 Jul 18 - 1928 Feb 04) sought an explanation of the results of the Michelson-Morley experiment and devised what we now call the Lorentz-FitzGerald contraction. Over the next four years he parlayed that work into the four equations that we call the Lorentz Transformation, the mathematical centerpiece of the theory of Special Relativity. At first, though, it had nothing to do with Relativity. Indeed, Lorentz obtained his transformation equations by pursuing the antithesis of Einstein's theory, trying to preserve a physics that takes place in an absolute Newtonian space and time.

It all began with Thomas Young (1773 Jun 13 - 1829 May 10). In the course of giving a lecture in 1803 he revealed that some time in 1801 he had passed light through a pair of parallel thin slits and had observed that the light formed an interference pattern (an array of light and dark stripes) on a sheet of paper held behind the slits. From that observation he inferred that light has the nature of a wave, an oscillatory phenomenon that extends across space and can interfere with itself and other waves. Accepting Young's results, physicists soon postulated the existence of a luminiferous æther as the medium in which the waves were manifested and propagated. After all, they reasoned, sound requires air and ocean waves require water in order to exist, so if light is also a wave, then something must exist to do the waving.

James Clerk Maxwell (1831 Jun 13 - 1879 Nov 05) started out in his studies of electricity and magnetism by representing electric and magnetic fields as stresses and strains in the æther. Later he shifted to using a purely mathematical representation of the fields and ignored the æther, preferring, presumably like Newton, not to frame hypotheses. He then deduced the existence and properties of electromagnetic waves, again without making any reference to the æther. He found that he could calculate from the electric permittivity and the magnetic permeability of vacuum the speed at which those waves propagate through space: using values measured by Wilhelm Eduard Weber (1804 Oct 24 - 1891 Jun 23) and Rudolf Kohlrausch (1809 - 1858), he calculated in 1862 a speed of 310,740 kilometers per second (193,088 miles per second), which came close enough to the 314,858 kilometers per second (195,647 miles per second) that Armand Fizeau had measured for the speed of light that he asserted that light and electromagnetic waves were merely different names for the same phenomenon.
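We can sketch Maxwell's calculation in a few lines of Python. The sketch below assumes modern values for the two vacuum constants (not the values that Weber and Kohlrausch measured), and the names EPSILON_0, MU_0, and wave_speed are merely illustrative.

```python
import math

# Modern values, assumed for illustration; Maxwell worked from
# Weber and Kohlrausch's measurements instead.
EPSILON_0 = 8.8541878128e-12   # electric permittivity of vacuum, farads per meter
MU_0 = 1.25663706212e-6        # magnetic permeability of vacuum, henries per meter


def wave_speed(epsilon, mu):
    """Propagation speed of electromagnetic waves, 1/sqrt(epsilon * mu)."""
    return 1.0 / math.sqrt(epsilon * mu)


c = wave_speed(EPSILON_0, MU_0)
print(f"{c / 1000.0:.0f} kilometers per second")
```

Nothing in the calculation refers to the æther or to any observer: only the two constants enter it.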

In 1887 Albert Abraham Michelson (1852 Dec 19 - 1931 May 09) and Edward Morley (1838 Jan 29 - 1923 Feb 24) conducted an experiment meant to demonstrate the existence of the æther by way of its effect on the speed of light. Because Earth changes its velocity through space by about sixty kilometers per second every six months, Michelson and Morley figured that somewhere on its orbit Earth must plow through the æther at a speed that would have a measurable effect on the speed of light. On Earth that motion would appear as an æther wind blowing through Michelson and Morley's laboratory and making light fly at slightly different speeds in different directions. Michelson and Morley determined that they could detect the differences by passing a ray of light through an interferometer.

To carry out their experiment Michelson and Morley built a simple L-shaped interferometer on a circular slab of granite, set that slab into a circular pit carved into another block of granite, and then filled the pit with mercury to make the circular slab float. They used in their interferometer a 50/50 beam splitter, a glass plate that divided a ray of light into two equal parts, one passing through the beam splitter and the other reflecting off it. Traveling down paths that spread apart from each other at a right angle, the half rays bounced off paired mirrors, each half ray traversing its leg of the interferometer multiple times to increase the instrument's sensitivity. After a certain number of bounces each half ray would miss the mirror closer to the beam splitter and fly past the beam splitter. One half ray then bounced off one more mirror and onto a path that crossed the path of the other half ray. Where the half rays crossed each other Michelson and Morley put a white screen, on which the rays would produce an interference pattern.

According to the æther theory the half ray moving across the æther wind (which moves through the laboratory at a presumed speed of v) travels at a speed $\sqrt{c^2 - v^2}$ and the other half ray, moving parallel to the æther wind, travels at the speeds c+v and c-v, depending on whether it moves downwind or upwind. Each leg of the interferometer has a length L, so the times the half rays needed to traverse their legs both ways once were

$$t_{\perp} = \frac{2L}{\sqrt{c^2 - v^2}}$$ (Eq'n 1)

and

$$t_{\parallel} = \frac{L}{c+v} + \frac{L}{c-v} = \frac{2Lc}{c^2 - v^2}$$ (Eq'n 2)

Thus, the half ray traveling parallel to the æther wind would take more time on its round trip than would the half ray traveling across the æther wind. Turning the interferometer ninety degrees would reverse that relationship, so as the apparatus turned, the fringes in the interference pattern would drift across the ruler that Michelson and Morley had attached to the screen on which the pattern displayed itself, thereby demonstrating the existence of the æther wind and, therefore, of the æther itself.

Michelson and Morley's apparatus turned smoothly enough on its mercury bearing that the two men could observe the interference pattern without any jitter spoiling their observations. At any time of day, at any time of year, Michelson and Morley got the same result: the interference fringes did not shift at all as the interferometer turned. That result implied that light flew through the laboratory at the same speed in all directions at all times, implying in turn the absence of an æther wind.

Many physicists accepted the inference and hypothesized that as Earth travels through space it drags a large volume of æther with it. But H. A. Lorentz and the Irish mathematician George Francis FitzGerald (1851 Aug 03 - 1901 Feb 21) saw a different explanation of the result. Both men noticed that if the length L in Equation 2 were shortened by the factor $\sqrt{1 - v^2/c^2}$, then the time intervals in Equations 1 and 2 would come out equal and thus the interference fringes would not shift as the interferometer turned. The cause of the shrinkage would be an effect of the æther wind blowing through the matter making up the interferometer. Thus we have the Lorentz-FitzGerald contraction. Though FitzGerald didn't take the idea any further, Lorentz went on to work out a description of what we now call the Lorentz Transformation.
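A short numerical sketch can illustrate the contraction argument. It uses the usual textbook forms of the æther-theory round-trip times, 2L/sqrt(c² - v²) across the wind and L/(c+v) + L/(c-v) along it. The arm length and wind speed are assumed round figures chosen only for illustration, and the function names are hypothetical.

```python
import math

C = 299_792_458.0   # speed of light, meters per second
V = 30_000.0        # assumed æther wind speed (about Earth's orbital speed), m/s
L = 11.0            # assumed effective arm length, meters (illustrative only)


def time_across(length, v, c=C):
    """Round-trip time for the half ray moving across the æther wind."""
    return 2.0 * length / math.sqrt(c**2 - v**2)


def time_parallel(length, v, c=C):
    """Round-trip time for the half ray moving downwind and then upwind."""
    return length / (c + v) + length / (c - v)


def contraction_factor(v, c=C):
    """The Lorentz-FitzGerald factor sqrt(1 - v^2/c^2)."""
    return math.sqrt(1.0 - (v / c) ** 2)


t_across = time_across(L, V)
t_parallel = time_parallel(L, V)
# Shorten only the arm that lies parallel to the wind.
t_parallel_contracted = time_parallel(L * contraction_factor(V), V)

print(t_parallel - t_across)             # small positive difference
print(t_parallel_contracted - t_across)  # vanishes to rounding error
```

Without the contraction the parallel leg takes slightly longer; with it the two times agree, so turning the apparatus would shift no fringes.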

Had Lorentz confronted Einstein's question, how might he have answered? If we conceive electric and magnetic fields as stresses and strains in the æther, then we must conceive electromagnetic waves as consisting of the same kinds of stresses and strains propagating in the æther just as mechanical stresses and strains propagate through a struck metal bar as sound. If an observer could move with the ray of light, the wave would appear to that observer as motionless, like a wave rolling up a flowing stream: the wave making up the ray propagates through the æther at the speed of light, but in the observer's view the æther wind blows it backward as fast as it advances. That æther wind complicates the analysis by contracting the observer to perfect flatness in the direction of motion and slowing the observer's clocks to perfect stasis. Observation requires an elapse of time to occur, so the moving observer might not actually observe anything. Nonetheless, the Lorentzian æther theory does give an answer to Einstein's question. It gives us the wrong answer, but it gives us an answer.

In 1905 Einstein didn't know about that answer because he didn't know about the Michelson-Morley experiment or about Lorentz's work. So he had to answer the question by himself (possibly with some help from Mileva Marić (1875 Dec 19 - 1948 Aug 04), his girlfriend and then wife, and Michele Angelo Besso (1873 May 25 - 1955 Mar 15), his best friend). But how could he have done it and come up with an answer different from what Lorentz would have obtained? Consider what he knew prior to 1905.

He already knew about the principle of relativity. That principle had originated with Galileo Galilei, who laid out a description of it in his "Dialogue Concerning the Two Chief World Systems". Galileo described a sailor, closed up in a cabin on a ship, performing experiments (such as releasing flies or playing a kind of hopscotch on a pattern on the deck) and he claimed that the sailor would be unable to determine through those experiments whether the ship was under sail on a smooth sea or tied up at a dock. In presenting that simple version of relativity Galileo responded to critics of the Copernican world system, who based their objection on the observation that they could not sense any effects of Earth's putative motion around the sun.

Galileo's principle of relativity tells us that the sailor cannot, by means of any experiment, discover an absolute state of motion. Isaac Newton used that proposition as the foundation of the physics that he revealed in his magnum opus, Philosophiae Naturalis Principia Mathematica (The Mathematical Principles of Natural Philosophy). His first law of motion stands as a purely relativistic repudiation of the Aristotelian notion that every body has a natural state of rest that it actively strives to occupy. But Newton took Galileo's principle a step further by bringing Galileo's sailor out of his cabin and up on deck.

Imagine that the sailor stands on a ship proceeding down the River Thames to the sea and that Newton stands on a dock watching it go by. As the ship passes Newton's position a musket ball falls from the crow's nest and hits the deck at the base of the mast. As the sailor sees it, the ball traces a straight line parallel to the mast, but as Newton sees it, the ball traces a parabolic arc, a combination of the ship's uniform forward motion and the ball's accelerating vertical motion. At about the same time a small cask slips out of a net suspended from a crane and falls to the dock next to the one that Newton is on. As Newton sees it, the cask traces a straight line as it falls, but as the sailor sees it, the cask traces a parabolic arc, a combination of the dock's uniform aftward motion relative to the ship and the cask's accelerating vertical motion. So to Galileo's proposition that no observer can use his own experiments to determine whether he is in absolute motion or at absolute rest, Newton added the proposition that no observer can use another observer's experiments to make the same determination. The sailor sees Newton's experiment deformed in the same way that Newton sees the sailor's identical experiment deformed. The only motion that the two observers can detect is the relative motion between them.
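The musket ball's two trajectories can be sketched numerically. The sketch below assumes a Galilean change of coordinates between ship and dock; the ship's speed, the mast's height, and the function names are illustrative assumptions, not figures from any historical account.

```python
G = 9.81            # vertical acceleration of the falling ball, m/s^2
SHIP_SPEED = 3.0    # assumed speed of the ship relative to the dock, m/s
MAST_HEIGHT = 20.0  # assumed height of the crow's nest above the deck, m


def ball_in_ship_frame(t):
    """The sailor's view: the ball stays beside the mast as it falls."""
    return 0.0, MAST_HEIGHT - 0.5 * G * t**2


def to_dock_frame(x_ship, y, t, u=SHIP_SPEED):
    """Galilean transformation: add the ship's uniform motion to x."""
    return x_ship + u * t, y


times = [0.0, 0.5, 1.0, 1.5, 2.0]
ship_path = [ball_in_ship_frame(t) for t in times]
dock_path = [to_dock_frame(*ball_in_ship_frame(t), t) for t in times]

for (xs, ys), (xd, yd) in zip(ship_path, dock_path):
    print(f"ship frame: ({xs:.2f}, {ys:.2f})   dock frame: ({xd:.2f}, {yd:.2f})")
```

The heights agree in both frames at every instant; only the horizontal coordinate differs, and it differs by exactly the relative motion between the two observers.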

What happens when two observers look at the same experiment? Consider an imaginary experiment that Hans Christian Ørsted could have used to parlay his discovery that an electric current exerts a magnetic force into the law of electromagnetic induction that Michael Faraday discovered more or less by accident. Imagine that we have set up a large powerful magnet and that we have suspended an electrically charged thread next to one of the magnet's pole faces so that the thread passes over the center of the pole face. We then accelerate the thread to a very high speed in the direction parallel to the direction in which we have it stretched. In the laboratory of our imaginations we also have two observers: Markus Torvaldsen stands next to the magnet and Torvald Markussen moves with the thread.

Markus interprets the moving thread as an electric current and he knows, from Ørsted's experiment, that it exerts a force upon the magnet. In accordance with Newton's third law of motion, the magnet must also exert a force upon the thread. Instead of passing directly over the center of the magnet's pole face, then, the thread will pass over an array of points comprising a curve that passes to one side of the center.

Torvald must also see the thread curve past the center of the magnet's pole face, so he also infers the existence of a force acting on the thread. But he doesn't see the thread as an electric current, so he must infer that the moving magnet generates an electric field that will exert the appropriate force upon the static charge on the thread. From that inference he could deduce the law of electromagnetic induction, but we don't want to follow that path here.

Both observers see the effect of a force pushing on the thread, but they offer different explanations for what generates the force. If our observers conduct this experiment many times, with many variations in its parameters, they will accumulate enough data to enable them to infer the rules describing the relationship between the force exerted between the magnet and the thread and the parameters of the system. In this case our observers obtain a rule for calculating the force on the thread, which rule includes, in addition to the multiplications and divisions by various constants and geometric factors, the product of the magnet's field strength, the electric charge on a unit length of the thread, and the velocity between the magnet and the thread. In Markus' case, the velocity multiplies the electric charge to yield a description of the electric current interacting with the magnet's field and in Torvald's case, the velocity multiplies the magnet's field strength to yield a description of the electric field acting on the charged thread. The difference between Markus' and Torvald's descriptions of the cause of the force on the thread thus boils down to the difference between A(BC) and (AB)C. Thus, when our observers break their descriptions of the force into its most fundamental components, they discover that the law governing the force is the same for both of them.
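The difference between A(BC) and (AB)C can be made concrete with a small sketch. It assumes the simplest geometry (field, thread, and motion mutually perpendicular, so all geometric factors equal one) and speeds low enough that we can ignore any relativistic corrections. The numerical values and function names are hypothetical.

```python
import math

B_FIELD = 1.5          # assumed magnetic field strength, teslas
LINE_CHARGE = 2.0e-6   # assumed charge per unit length of thread, coulombs per meter
SPEED = 4.0e3          # assumed relative speed of thread and magnet, m/s


def force_markus(b, v, lam):
    """Markus: the moving charge is a current, which the field pushes on."""
    current = v * lam          # amperes
    return b * current         # force per unit length, A(BC)


def force_torvald(b, v, lam):
    """Torvald: the moving magnet makes an electric field, which pushes the charge."""
    electric_field = b * v         # volts per meter
    return electric_field * lam    # force per unit length, (AB)C


f_markus = force_markus(B_FIELD, SPEED, LINE_CHARGE)
f_torvald = force_torvald(B_FIELD, SPEED, LINE_CHARGE)
forces_agree = math.isclose(f_markus, f_torvald, rel_tol=1e-12)
print(f_markus, f_torvald, forces_agree)
```

The two functions group the same three factors differently and so tell different stories about the cause, but they return the same force per unit length.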

Thus we see Einstein's version of the principle of relativity: the laws of physics have the same mathematical form for all observers, regardless of the observers' motions relative to each other. I will just note here in this regard that Einstein began his paper "On the Electrodynamics of Moving Bodies" with a description of an imaginary experiment similar to the one I described above. Einstein's analysis requires a greater subtlety of interpretation, but it leads to the same postulate.

Einstein also knew Maxwell's Equations and the electromagnetic theory that physicists had organized around them. Our modern theory of light as an electromagnetic wave comes from those equations, so the Maxwellian theory of electricity and magnetism necessarily played a role in Einstein's approach to his question.

Imagine that Einstein's friend Besso operates a machine that generates and projects the ray of light that Einstein wants to pace. He arranges to project the ray along his and Einstein's common x-axis in such a way that the ray's electric field points only in the y-direction and the ray's magnetic field points only in the z-direction. As the ray emerges from the projector Einstein takes off after it and paces it.

Besso measures the strength of the ray's electric and magnetic fields at several points on the x-axis at different times. Those measurements tell him that at any given point the strengths of both fields change in a sinusoidal way with the elapse of time and that at any given time the strengths of both fields change in a sinusoidal way with a change in location. From those data Besso could derive Faraday's law of electromagnetic induction and the form of Ampere's law that we might call Maxwell's law of magnetoelectric induction.

Einstein makes the same set of measurements. Those measurements tell him that at any given instant the strengths of both fields change with a change in location, but at any given point that is stationary relative to him the strengths of both fields remain unchanged with the elapse of time (assuming, of course, that he experiences time as a Newtonian absolute). Like Besso, Einstein can extract from those measurements an abstraction describing each field as a sinusoidal pattern in space. But unlike Besso, he cannot derive Faraday's law or Maxwell's version of Ampere's law from his data because those data do not include temporal changes in the fields. Apparently Maxwell's Equations do not exist for Einstein as they do for Besso.
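A numerical sketch can show the contrast between the two sets of measurements. It assumes a sinusoidal field, units in which the wave speed and wavelength both equal one (chosen only for numerical convenience), and a Galilean change of coordinates for the observer pacing the ray. The function names are illustrative.

```python
import math

C = 1.0              # wave speed, in units where c = 1 (an assumption of this sketch)
K = 2.0 * math.pi    # wavenumber for a wavelength of one unit
E0 = 1.0             # field amplitude, arbitrary units


def field_besso(x, t):
    """Electric field strength at Besso's coordinate x and time t."""
    return E0 * math.sin(K * (x - C * t))


def field_einstein(x_prime, t):
    """The same field at a point fixed relative to Einstein (x = x' + C t)."""
    return field_besso(x_prime + C * t, t)


def time_derivative(field, x, t, dt=1e-6):
    """Central-difference estimate of the rate of change at a fixed point."""
    return (field(x, t + dt) - field(x, t - dt)) / (2.0 * dt)


rate_besso = time_derivative(field_besso, 0.1, 0.3)
rate_einstein = time_derivative(field_einstein, 0.1, 0.3)
print(rate_besso)      # clearly nonzero
print(rate_einstein)   # zero to within rounding error
```

With absolute Newtonian time, the observer pacing the ray finds a pattern frozen in space, with no temporal change from which to infer an induction law.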

But, Lorentz and other physicists of the time would object, the æther wind blows through Einstein and his apparatus at the speed of light. In the æther theory, when the æther wind blows over one of the fields (either electric or magnetic) it generates the other field to some extent. If the æther blows across, say, the electric field at the speed of light, then it generates a magnetic field with enough strength that the æther blowing across that magnetic field perfectly regenerates the original electric field.

On first impression, though, that interpretation seems to violate Einstein's version of the principle of relativity. When Besso derives Faraday's law from his data, for example, he gets

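$$\nabla \times \mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}$$ (Eq'n 3)

Using the æther theory, Einstein would derive Faraday's law from his data as

$$\nabla \times \mathbf{E} = -(\mathbf{v} \cdot \nabla)\mathbf{B}$$ (Eq'n 4)

But æther theory is a kind of fluid dynamics, so Lorentz and his team would recognize the right sides of those equations as expressing separate parts of the convective derivative,

$$\frac{D\mathbf{B}}{Dt} = \frac{\partial \mathbf{B}}{\partial t} + (\mathbf{v} \cdot \nabla)\mathbf{B}$$ (Eq'n 5)

If we actually had Faraday's law as

$$\nabla \times \mathbf{E} = -\frac{D\mathbf{B}}{Dt}$$ (Eq'n 6)

in which **v** represents the velocity of the æther wind, then the æther theory would seem to give a good account of Reality.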

In Reality Einstein didn't follow that path. He approached the problem instead by proceeding as if the time derivatives in Maxwell's Equations are not convective derivatives, but are perfect, unaltered partial derivatives. That assumption necessitates that everyone observing the ray from Besso's projector must see the fields at any given point changing with the elapse of time.

At this point I must note that the motion of light is an illusion. Light does not move: it propagates. That is, when, at any given point, the strength of one of the fields rises or falls it induces the strength of the field at neighboring points to rise or fall in such a way that the curl of the field remains consistent with the associated rate at which the strength of the other field changes with the elapse of time. In order for light to exist for an observer, then, the fields must change inherently with the elapse of time; they cannot appear static.

That proposition being true to Reality necessitates that Besso observe the situation in such a way that he can only infer that the ray propagates past Einstein. That necessitates that Einstein observe the ray always propagating past him in the direction away from Besso. But Einstein remains free to accelerate in that direction, so how can Reality prevent him from matching speeds with the ray? We can only have a perfect guarantee that Reality will meet that criterion if we assert that no acceleration by Einstein will alter his kinematic relationship with the ray; that is, however long he accelerates, he will always see the ray pass him at the same speed. That inference leads directly to Einstein's second postulate of Special Relativity.

With his two postulates -- his version of the principle of relativity and the constancy of the speed of light -- and a series of imaginary experiments, Einstein deduced the Lorentz Transformation. In so doing he gave the transformation a geometric interpretation; that is, he presented the equations as representing differences in distance and duration measured between the same two events by different observers. Lorentz, on the other hand, gave his transformation equations a dynamic interpretation: he attributed the various effects -- the contraction of bodies, the slowing of clocks, etc. -- to the alteration of the forces within bodies by the æther wind blowing through those bodies. We could even say, with some justification, that, in spite of Lorentz's explicit use of the æther, Einstein's version of the Lorentz Transformation is the more ætherial of the two.
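We can check numerically that the Lorentz Transformation satisfies the second postulate. The sketch below uses the standard one-dimensional form of the transformation, x' = γ(x - vt) and t' = γ(t - vx/c²), and sets a Galilean transformation beside it for contrast. The chosen velocity and the function names are illustrative assumptions.

```python
import math

C = 299_792_458.0   # speed of light, meters per second


def lorentz_transform(x, t, v, c=C):
    """Coordinates of the event (x, t) for an observer moving at velocity v."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    return gamma * (x - v * t), gamma * (t - v * x / c**2)


def galilean_transform(x, t, v):
    """The Newtonian alternative: absolute time, shifted position."""
    return x - v * t, t


# An event on a ray of light that left the origin at t = 0.
t = 1.0
x = C * t
v = 0.9 * C   # assumed speed of the observer chasing the ray

x_l, t_l = lorentz_transform(x, t, v)
x_g, t_g = galilean_transform(x, t, v)

speed_lorentz = x_l / t_l    # speed of the ray for the chasing observer
speed_galilean = x_g / t_g   # what Newtonian kinematics would predict
print(speed_lorentz / C, speed_galilean / C)
```

Under the Galilean transformation the chasing observer gains on the ray; under the Lorentz Transformation the ray still recedes at the full speed of light.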

"Imagination is more important than knowledge," Einstein once commented. Without imagination, he said, knowledge as such does not exist: we have merely collections of facts. The use of imagination, the creation of theories, transforms science from mere stamp collecting into proper storytelling. But the same facts can be put into different stories and therein lies a risk that the story containing them will not give us a true representation of Reality.

Lorentz set his story in the Absolute Space and Absolute Time that Isaac Newton described in his Principia. For Lorentz space consists of only one inertial frame of reference, in which the æther remains more or less motionless and the speed of light is referred to that æther. The relationship between observers and what they observe is dynamic, mediated by forces arising from the æther wind.

Einstein set his story in a space consisting of an infinity of inertial frames of reference, a space in which the speed of light is the same for all observers. The relationship between observers and what they observe is geometric, mediated by the metric relationship between the inertial frames the observers occupy and mark. Nicely enough, Einstein never got an answer to the question that led him to that story: he found out that the question has no answer because the situation he tried to imagine, pacing a ray of light, can never be realized, even in the realm of the imagination.


A Philosophical Commentary

The Classical Greek philosopher Plato argued that we experience Reality as a crude expression of underlying Forms (or Ideals) that serve as absolutely perfect templates of the things thus expressed. As absolutes the Platonic Forms shape Reality without being affected by it. Today we don't accept Plato's idea that Forms underlie common objects, but rather that they underlie the components of such objects and the laws that govern them: we no longer believe in a Platonic Form for Horse, but we do believe in one for Electron. We haven't changed our belief so much as we have applied the Democritean cutting procedure to the objects of our perception and inferred Forms underlying those things that we cannot cut.

We also see Forms underlying behavior: we call those Forms laws of nature. Indeed, we could not have a consistent Reality without such Forms. That fact necessitates that all observers receive the same percepts emanating from any given event. Thus, if on a certain date, at a certain time, I see a northbound red Honda slam into an eastbound green Ford in the intersection of Venice and Sepulveda Boulevards in West Los Angeles, then all other observers around that intersection at that time must see the same collision. No one will see a motorcycle hit a truck nor will they see the Ford hit the Honda. The differences that the witnesses see in the percepts emanating from the event will come from differences in the witnesses' locations relative to the event and those merely alter the perspective on the event.

All observers must see the same phenomenon in any given event. From that we infer the existence of an underlying sameness in the Reality of the event. That sameness corresponds to a set of Platonic Forms that we identify as the laws of nature. At their most irreducible we express those laws in mathematical form. Cold, austere, unyielding mathematics gives us the perfect exemplar of the Platonic Forms, so we use it to express the most fundamental of the Forms of modern science.

Thus we seek the absolute. From the abstraction of general laws from particular instances we gain our theories. We remove what makes our examples different from one another and keep what makes them in some sense the same. If we analyze the trajectories of thrown objects we find a bewildering array of parabolic arcs. But if we remove the horizontal component of motion from them, we find an underlying sameness: the objects all possess the same vertical acceleration. In that fact we can conceive what we take as the Form of Earth's gravity (although Earth's gravity merely gives us an example of the true Form, the law of gravitation).

To find the absolute (the necessary) we must eliminate the contingent (the accidental). Anything we control is contingent and therefore cannot be part of the absolute. The location of an experiment, its orientation, its timing, and the velocity at which it moves are all under our control, so they cannot affect the laws of physics. Whatever mathematical expression we give those laws cannot acknowledge those things, except as contingencies. Thus we have Einstein's first postulate of Relativity.

Since the time of Galileo physicists have been sidling up to the Pythagorean notion that "all is number". Modern physics does more than sidle: the fundamental equations are the purest expression of the Platonic Forms. Maxwell's Equations are the Forms of Electricity and Magnetism, purely and simply. Physicists inferred them from laboratory experiments and one imaginary experiment and found them irreducible: nothing less than those four equations suffices to describe accurately the electromagnetic field and nothing more is needed. But if Maxwell's Equations are Forms, then the constants that determine the relative strengths of the fields represent Forms as well: they must be the same in all inertial frames. Their product is invariant then, so the speed of electromagnetic waves is an absolute. And that is just Einstein's second postulate of Relativity.

When he was informed of moral relativity, the misapplication of his theory to social situations through the Pathetic Fallacy, Einstein offered the wish that he had never allowed the word relativity to be associated with the theory. He said that he had actually preferred to think of his theory as the Theory of Invariants. Given that invariant gives us a rather mild description of a Platonic Form, we can say that he would have been well justified in insisting on the alternate name.
