Einstein's Question

    Sometime in 1895, when he was sixteen years old, Albert Einstein (1879 Mar 14 - 1955 Apr 18) asked himself a question whose pursuit would lead him, ten years later, to present to the world his theory of Special Relativity. He asked himself what he would see if he could pace a ray of light, running alongside it as it propagated itself through space. That strange question raises a question of its own: How did a contemplation of light lead Einstein to the discovery that measurements of the space and time intervals between two events differ between different observers if those observers have some motion between them?

    In that same year Hendrik Antoon Lorentz (1853 Jul 18 - 1928 Feb 04) sought an explanation of the results of the Michelson-Morley experiment and devised what we now call the Lorentz-FitzGerald contraction. Over the next four years he parlayed that work into the four algebraic equations that we call the Lorentz Transformation, the mathematical centerpiece of the theory of Special Relativity. At first, though, it had nothing to do with Relativity. Indeed, Lorentz obtained his transformation equations by pursuing the antithesis of Einstein's theory, trying to preserve a physics that takes place in an absolute Newtonian space and time, which has a single inertial frame of absolute rest, to which all observers must refer the motions that they measure.

    It all began with Thomas Young (1773 Jun 13 - 1829 May 10). In the course of giving a lecture in 1803 he revealed that some time in 1801 he had passed light through a pair of parallel thin slits and had observed that the light formed an interference pattern (an array of light and dark stripes) on a sheet of paper held behind the slits. From that observation he inferred that light has the nature of a wave, an oscillatory phenomenon that extends across space and can interfere with itself and other waves. Accepting Young's results, physicists soon postulated the existence of a luminiferous Æther as the medium through which the waves of light propagate themselves. After all, they reasoned, sound requires air and ocean waves require water in order to exist, so if light also has the nature of a wave, then something must exist to do the waving.

    James Clerk Maxwell (1831 Jun 13 - 1879 Nov 05) started out in his studies of electricity and magnetism by representing electric and magnetic fields as stresses and strains in the Æther. Later he shifted to using a purely mathematical representation of the fields and ignored the Æther, preferring, presumably like Newton, not to frame hypotheses. Then he went and deduced the existence and properties of electromagnetic waves, again without making any reference to the Æther. He found that he could calculate from the electric permittivity and the magnetic permeability of vacuum the speed at which those waves propagate through space: using values measured by Wilhelm Eduard Weber (1804 Oct 24 - 1891 Jun 23) and Rudolf Kohlrausch (1809 - 1858), he calculated in 1862 a speed of 310,740 kilometers per second (193,088 miles per second), which came close enough to the 314,858 kilometers per second (195,647 miles per second) measured by Armand Fizeau for the speed of light that he asserted that "light" and "electromagnetic wave" merely gave us different names for the same phenomenon.
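
    As a rough numerical check of Maxwell's relation, the sketch below computes the wave speed from the permittivity and permeability of vacuum. It assumes modern SI values for the two constants rather than the Weber-Kohlrausch measurements, and the names EPSILON_0, MU_0, and wave_speed are my own illustrative choices.

```python
import math

# Modern SI values (an assumption here; not the 1850s Weber-Kohlrausch data).
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, farads per meter
MU_0 = 1.25663706212e-6       # vacuum permeability, henries per meter


def wave_speed(epsilon, mu):
    """Return the propagation speed 1/sqrt(epsilon * mu) in meters per second."""
    return 1.0 / math.sqrt(epsilon * mu)


c = wave_speed(EPSILON_0, MU_0)
print(f"{c / 1000:.0f} km/s")
```

    With those modern inputs the result lands very near the accepted speed of light, which illustrates why the agreement struck Maxwell as more than a coincidence.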

    In 1887 Albert Abraham Michelson (1852 Dec 19 - 1931 May 09) and Edward Morley (1838 Jan 29 - 1923 Feb 24) conducted an experiment meant to demonstrate the existence of the Æther by way of its effect on the speed of light. Because Earth changes its velocity through space by about sixty kilometers per second every six months as it goes around the sun, Michelson and Morley figured that somewhere on its orbit Earth must plow through the Æther at a speed that would have a measurable effect on the speed of light. On Earth that motion would appear as an Æther wind blowing through Michelson and Morley's laboratory and making light fly at slightly different speeds in different directions. Michelson and Morley determined that they could detect the differences by passing a ray of light through an interferometer.

    To carry out their experiment Michelson and Morley built a simple L-shaped interferometer on a circular slab of granite, set that slab into a circular pit carved into another block of granite, and then filled the pit with the liquid metal mercury to make the circular slab float. They used a ray of light of a single color in their interferometer and a 50/50 beam splitter, a glass plate that divided the ray of light into two equal parts, one passing through the beam splitter and the other reflecting off it. Traveling down paths that spread apart from each other at a right angle, the half rays bounced off paired mirrors, each half ray traversing its leg of the interferometer multiple times to increase the instrument's sensitivity. After a certain number of bounces each half ray would miss the mirror closer to the beam splitter and fly past the beam splitter. One half ray then bounced off one more mirror and onto a path that crossed the path of the other half ray. Where the half rays crossed each other Michelson and Morley put a white screen, on which the rays produced an interference pattern.

    According to the Æther theory the half ray moving across the Æther wind (which moves through the laboratory at some presumed speed) travels at a speed that we would calculate from the Pythagorean theorem with the hypotenuse of a right triangle representing the measured speed of light and one side representing the speed of the Æther wind: the other side of the triangle would then represent the actual speed of the ray in the laboratory. The other half ray, moving parallel to the Æther wind, travels faster than the measured speed of light when it moves downwind and slower than the measured speed of light when it travels upwind. Each leg of the interferometer had the same length as did the other, so the time each half ray needed to traverse its leg both ways depended only on the ray's speed. Thus, according to the theory, the half ray traveling parallel to the Æther wind would take more time on its round trip than would the half ray traveling across the Æther wind. Turning the interferometer ninety degrees would reverse that relationship, so, as the apparatus turned, the fringes in the interference pattern would thus drift across the ruler that Michelson and Morley had attached to the screen on which the pattern displayed itself, thereby demonstrating the existence of the Æther wind and, therefore, of the Æther itself.
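
    The size of the expected effect can be estimated with a short calculation. The sketch below follows the reasoning of the preceding paragraph; the wind speed (Earth's orbital speed of about thirty kilometers per second), the effective leg length, and the wavelength are illustrative assumptions of mine, as are all of the function names.

```python
import math

C = 299_792_458.0      # speed of light relative to the aether, m/s
V = 30_000.0           # assumed aether wind speed, m/s
ARM = 11.0             # assumed effective (folded) leg length, m
WAVELENGTH = 5.0e-7    # assumed wavelength of the light, m


def time_parallel(length, v, c=C):
    """Round trip along the wind: downwind at c + v, upwind at c - v."""
    return length / (c + v) + length / (c - v)


def time_across(length, v, c=C):
    """Round trip across the wind at the Pythagorean speed sqrt(c^2 - v^2)."""
    return 2.0 * length / math.sqrt(c * c - v * v)


def fringe_shift(length, v, wavelength, c=C):
    """Fringes crossing the ruler when the apparatus turns ninety degrees."""
    delta_t = time_parallel(length, v, c) - time_across(length, v, c)
    # Turning swaps the roles of the legs, doubling the path difference.
    return 2.0 * c * delta_t / wavelength


shift = fringe_shift(ARM, V, WAVELENGTH)
print(f"predicted shift: {shift:.2f} fringes")
```

    Under these assumptions the theory predicts a drift of a substantial fraction of one fringe, an amount that an interferometer of this kind could readily resolve.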

    Michelson and Morley's apparatus turned smoothly enough on its mercury bearing that the two men could observe the interference pattern without any jitter spoiling their observations. At any time of day, at any time of year, Michelson and Morley got the same result: the interference fringes did not shift at all as the interferometer turned. That result implied that light flew through the laboratory at the same speed in all directions at all times, implying in turn the absence of an Æther wind.

    Many physicists accepted the inference, but, unwilling to abandon the hypothesis of the Æther, they hypothesized that, as it travels through space, Earth drags a large volume of Æther with it, thereby enveloping itself in a kind of pool of motionless Æther. But H. A. Lorentz and the Irish mathematician George Francis FitzGerald (1851 Aug 03 - 1901 Feb 21) saw a different explanation of the result. Both men noticed that if the Æther wind blowing through the interferometer had the effect of shortening the length of the leg parallel to the motion of the Æther wind by the factor equal to the inverse of what we now call the Lorentz factor, then the time intervals required by the rays to traverse their respective legs of the interferometer would come out equal and thus the interference fringes would not shift as the interferometer turned. Thus they described what we call the Lorentz-FitzGerald contraction. Though FitzGerald didn't take the idea any further, Lorentz went on to work out a full mathematical description of what we now call the Lorentz Transformation.
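
    We can check the equality of the two travel times in a few lines. In the sketch below the symbols are my own notation rather than anything Lorentz or FitzGerald wrote: L stands for the common rest length of the legs, v for the speed of the Æther wind, c for the speed of light relative to the Æther, and gamma for the Lorentz factor. The primed time belongs to the parallel leg after its contraction from L to L/gamma.

```latex
\begin{align*}
t_{\perp} &= \frac{2L}{\sqrt{c^{2}-v^{2}}} = \frac{2L}{c}\,\gamma,
  \qquad \gamma = \frac{1}{\sqrt{1-v^{2}/c^{2}}}, \\
t_{\parallel} &= \frac{L}{c-v}+\frac{L}{c+v} = \frac{2Lc}{c^{2}-v^{2}}
  = \frac{2L}{c}\,\gamma^{2}, \\
t_{\parallel}' &= \frac{2(L/\gamma)}{c}\,\gamma^{2} = \frac{2L}{c}\,\gamma = t_{\perp}.
\end{align*}
```

    The uncontracted parallel trip exceeds the crosswise trip by one factor of gamma, so shortening the parallel leg by exactly that factor cancels the difference at every orientation of the apparatus.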

    Had Lorentz confronted Einstein's question, how might he have answered? If we conceive electric and magnetic fields as stresses and strains in the Æther, then we must conceive electromagnetic waves as consisting of the same kinds of stresses and strains propagating in the Æther just as mechanical stresses and strains propagate through a struck metal bar as sound. If an observer could move with the ray of light, the wave would appear motionless to that observer, just as a wave rolling up a flowing stream appears to stand still for an observer on the riverbank: the wave making up the ray propagates through the Æther at the speed of light, but in the observer's view the Æther wind blows it backward as fast as it advances. That Æther wind complicates the analysis by contracting the observer to perfect flatness in the direction of motion (the Lorentz-FitzGerald contraction) and slowing their clocks to perfect stasis (the Lorentzian effect of time dilation). Observation requires an elapse of time to occur, so the moving observer might not actually observe anything. Nonetheless, the Lorentzian Æther theory does give an answer to Einstein's question. It gives us the wrong answer, but it gives us an answer nonetheless.

    In 1905 Einstein didn't know about that answer because he didn't know about the Michelson-Morley experiment or of Lorentz's work. So he had to answer the question by himself (possibly with some help from Mileva Marić (1875 Dec 19 - 1948 Aug 04), his girlfriend and then wife, and Michele Angelo Besso (1873 May 25 - 1955 Mar 15), his best friend). But how could he have done it and come up with an answer different from what Lorentz would have obtained? Consider what he knew prior to 1905.

    He already knew about the principle of relativity. That principle had originated with Galileo Galilei, who laid out a description of it in his "Dialogue on the Two Chief World Systems". Galileo described a sailor, closed up in a cabin on a ship, performing experiments (such as releasing flies or playing a kind of hopscotch on a pattern on the deck) and he claimed that the sailor could not determine through those experiments whether the ship moved under sail on a smooth sea or sat tied up at a dock. In presenting that simple version of relativity Galileo responded to criticism of the Copernican world system, which criticism people based on their observations that they could not sense any effects of Earth's putative motion around the sun.

    Galileo's principle of relativity tells us that the sailor cannot, by means of any experiment, discover an absolute state of motion. Isaac Newton used that proposition as the foundation of the physics that he revealed in his magnum opus, "Philosophiae Naturalis Principia Mathematica" (The Mathematical Principles of Natural Philosophy). His first law of motion stands as a purely relativistic repudiation of the Aristotelian notion that every body has a natural state of rest that it actively strives to occupy. But Newton took Galileo's principle a step further by bringing Galileo's sailor out of his cabin and up on deck.

    Imagine that the sailor stands on a ship proceeding down the River Thames to the sea and that Newton stands on a dock watching it go by. As the ship passes Newton's position a musket ball falls from the crow's nest and hits the deck at the base of the mast. As the sailor sees it, the ball traces a straight line parallel to the mast, but as Newton sees it, the ball traces a parabolic arc, a combination of the ship's uniform forward motion and the ball's accelerating vertical motion. At about the same time a small cask slips out of a net suspended from a crane and falls to the dock next to the one that Newton stands on. As Newton sees it, the cask traces a straight line as it falls, but as the sailor sees it, the cask traces a parabolic arc, a combination of the dock's uniform aftward motion relative to the ship and the cask's accelerating vertical motion. So to Galileo's proposition that no observer can use his own experiments to determine whether he glides in absolute motion or stands at absolute rest, Newton added the proposition that no observer can use another observer's experiments to make the same determination. The sailor sees Newton's experiment deformed in the same way that Newton sees an identical experiment performed by the sailor deformed. The only motions that any observer can detect must refer to the inertial frame that the observer occupies as defining the state of rest. All motion thus exists only as relative motion between an observer and the objects that they observe.

    What happens when two observers look at the same experiment? Consider an imaginary experiment that Hans Christian Ørsted could have used to parlay his discovery that an electric current exerts a magnetic force into the law of electromagnetic induction that Michael Faraday discovered more or less by accident. Imagine that we have set up a large powerful magnet and that we have so suspended an electrically charged thread next to one of the magnet's pole faces that the thread passes over the center of the pole face. We then accelerate the thread to a very high speed in the direction parallel to the direction in which we have it stretched. In the laboratory of our imaginations we also have two observers: Markus Torvaldsen stands next to the magnet and Torvald Markussen moves with the thread.

    Markus interprets the moving thread as an electric current and he knows, from Ørsted's experiment, that it exerts a force upon the magnet. In accordance with Newton's third law of motion, the magnet must also exert a force upon the thread. Instead of passing directly over the center of the magnet's pole face, then, the thread will pass over an array of points comprising a curve that passes to one side of the center.

    Torvald must also see the thread curve past the center of the magnet's pole face, so he also infers the existence of a force acting on the thread. But he doesn't see the thread as an electric current, so he must infer that the moving magnet generates an electric field that will exert the appropriate force upon the static charge on the thread. From that inference he could deduce the law of electromagnetic induction, but we don't want to follow that path here.

    Both observers see the effect of a force pushing on the thread, but they offer different explanations for what generates the force. If our observers conduct this experiment many times, with many variations in its parameters, they will accumulate enough data to enable them to infer the rules describing the relationship between the force exerted between the magnet and the thread and the parameters of the system. In this case our observers obtain a rule for calculating the force on the thread, which rule includes, in addition to the multiplications and divisions by various constants and geometric factors, the product of the magnet's field strength, the electric charge on a unit length of the thread, and the velocity between the magnet and the thread. In Markus' case, the velocity multiplies the electric charge to yield a description of the electric current interacting with the magnet's field and in Torvald's case, the velocity multiplies the magnet's field strength to yield a description of the electric field acting on the charged thread. The difference between Markus' and Torvald's descriptions of the cause of the force on the thread thus boils down to the difference between the calculation of multiplying B by C and then multiplying the result by A and the calculation of multiplying A by B and then multiplying the result by C. Thus, when our observers break their descriptions of the force into its most fundamental components, they discover that the law governing the force has the same form for both of them.
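
    To make the regrouping concrete, we can write the rule in its simplest form, leaving out the constants and geometric factors and any relativistic corrections. The symbols are assumptions of mine: f for the magnitude of the force on a unit length of the thread, lambda for the electric charge on a unit length, v for the relative speed, B for the magnet's field strength, I for the current that Markus infers, and E for the electric field that Torvald infers.

```latex
\begin{align*}
\text{Markus:}\quad  f &= (\lambda v)\,B = I\,B, & I &= \lambda v, \\
\text{Torvald:}\quad f &= \lambda\,(v B) = \lambda\,E, & E &= v B.
\end{align*}
```

    Both lines evaluate the same triple product; only the grouping, and with it the physical story attached to the intermediate quantity, differs between the two observers.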

    Thus we see Einstein's version of the principle of relativity: the laws of physics have the same mathematical form for all observers, regardless of the observers' motions relative to each other. I will just note here in this regard that Einstein began his paper "On the Electrodynamics of Moving Bodies" with a description of an imaginary experiment similar to the one I described above. Einstein's analysis requires a greater subtlety of interpretation, but it leads to the same postulate.

    Einstein also knew Maxwell's Equations and the electromagnetic theory that physicists had organized around them. Our modern theory of light as an electromagnetic wave comes from those equations, so the Maxwellian theory of electricity and magnetism necessarily played a role in Einstein's approach to his question.

    Imagine that Einstein's friend Besso operates a machine that generates and projects the ray of light that Einstein wants to pace. He arranges to project the ray along his and Einstein's common x-axis (which we can imagine as lying parallel to the east-west direction with the positive x-direction pointing due east) in such a way that the ray's electric field points only in the y-direction (the north-south direction) and the ray's magnetic field points only in the z-direction (the up-down dimension). As the ray emerges from the projector Einstein takes off after it on his atomic-powered motorcycle and paces it.

    Besso measures the strength of the ray's electric and magnetic fields at several points on the x-axis at different times. Those measurements tell him that at any given point the strengths of both fields change in a sinusoidal way with the elapse of time and that at any given time the strengths of both fields change in a sinusoidal way with a change in location. From those data Besso could derive Faraday's law of electromagnetic induction and the form of Ampere's law that we might call Maxwell's law of magnetoelectric induction.

    Einstein makes the same set of measurements. Those measurements tell him that at any given instant the strengths of both fields change with a change in location, but at any given point that stands stationary relative to him the strengths of both fields remain unchanged with the elapse of time (assuming, of course, that he experiences time as a Newtonian absolute). Like Besso, Einstein can extract from those measurements an abstraction describing each field as a sinusoidal pattern in space. But unlike Besso, he cannot derive Faraday's law or Maxwell's version of Ampere's law from his data because those data do not include temporal changes in the fields. Apparently Maxwell's Equations do not exist for Einstein as they do for Besso.
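
    A small numerical sketch can illustrate the contrast between the two sets of readings. It assumes a sinusoidal field of unit amplitude, an arbitrary wavelength, and Newtonian absolute time, as the paragraph above does; the function names besso_reading and einstein_reading are hypothetical labels of mine.

```python
import math

C = 299_792_458.0           # propagation speed of the ray, m/s
WAVELENGTH = 5.0e-7         # assumed wavelength, m
K = 2.0 * math.pi / WAVELENGTH


def electric_field(x, t, amplitude=1.0):
    """Sinusoidal field strength at lab position x (m) and time t (s)."""
    return amplitude * math.sin(K * (x - C * t))


def besso_reading(x0, t):
    """Besso's probe stays at the fixed lab position x0."""
    return electric_field(x0, t)


def einstein_reading(x0, t):
    """Einstein's probe starts at x0 and keeps pace with the ray (Newtonian time)."""
    return electric_field(x0 + C * t, t)


times = [n * 2.0e-16 for n in range(5)]
besso = [besso_reading(1.0e-7, t) for t in times]
einstein = [einstein_reading(1.0e-7, t) for t in times]
print("Besso:   ", [f"{value:+.3f}" for value in besso])
print("Einstein:", [f"{value:+.3f}" for value in einstein])
```

    Besso's list rises and falls from one instant to the next, while Einstein's list repeats a single value, which is just the static pattern that leaves him without any temporal changes from which to infer the induction laws.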

    But, Lorentz and other physicists of the time would object, the Æther wind blows through Einstein and his apparatus at the speed of light. In the Æther theory, when the Æther wind blows over one of the fields (either electric or magnetic) it generates the other field to some extent. If the Æther blows across, say, the electric field at the speed of light, then it generates a magnetic field with enough strength that the Æther blowing across that magnetic field perfectly regenerates the original electric field.

    On first impression, though, that interpretation seems to violate Einstein's version of the principle of relativity. When Besso derives Faraday's law from his data, for example, he gets a mathematical statement to the effect that the curl of the electric field at a given point equals the negative of the rate at which the magnetic field at that point changes with the elapse of time. Oversimplifying somewhat, we can describe the curl of a forcefield as the rate at which the field changes as we move our measuring apparatus in a direction perpendicular to the direction in which the field points. Using the Æther theory, Einstein would derive Faraday's law from his data as a mathematical statement that the curl of the electric field that he measures equals the negative of the product of multiplying the speed of light by the curl of the magnetic field flowing over his apparatus.

    But Æther theory stands as a kind of fluid dynamics, so Lorentz and his team would recognize the two descriptions of the changing magnetic field as expressing separate parts of what mathematicians call the convective derivative, a special mathematical operation that corrects calculations made pertaining to phenomena occurring in moving fluids (such as the Æther) to take proper account of the fluid motions. If we actually had Faraday's law as a statement relating the curl of the electric field to the convective derivative of the magnetic field instead of the simple rate at which the magnetic field changes with the elapse of time, then the Æther theory would seem to give a good account of Reality.
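
    In symbols the argument runs roughly as follows. I assume the geometry of Besso's projector (electric field along y, magnetic field along z, propagation along x), write u for the velocity of the Æther wind through the observer's apparatus, and use D/Dt for the convective derivative; none of this notation comes from Lorentz's own papers.

```latex
\begin{align*}
\text{Besso } (\mathbf{u}=0):\quad
  & \frac{\partial E_{y}}{\partial x} = -\frac{\partial B_{z}}{\partial t}, \\
\text{convective derivative:}\quad
  & \frac{D B_{z}}{D t} = \frac{\partial B_{z}}{\partial t}
    + (\mathbf{u}\cdot\nabla)\,B_{z}, \\
\text{Einstein } (\mathbf{u}=-c\,\hat{\mathbf{x}},\ \partial B_{z}/\partial t = 0):\quad
  & \frac{\partial E_{y}}{\partial x} = -\frac{D B_{z}}{D t}
    = c\,\frac{\partial B_{z}}{\partial x}.
\end{align*}
```

    In this geometry the y-component of the curl of the magnetic field equals the negative of the x-derivative of B_z, so the last line restates the claim that the curl of the electric field that Einstein measures equals the negative of the speed of light multiplied by the curl of the magnetic field.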

    In Reality Einstein didn't follow that path. He approached the problem instead by proceeding as if the time derivatives in Maxwell's Equations do not correspond to convective derivatives, but give us perfect, unaltered partial derivatives. In other words, whenever one of the fields appears in the equations as changing with the elapse of time, it must do so solely by coming into and out of existence at a given point and not at all as a spatially varying field moving past the point. That assumption necessitates that everyone observing the ray from Besso's projector must see the fields at any given point changing inherently with the elapse of time.

    At this point I must note that the motion of light presents us with an illusion. Light does not move: it propagates. In other words we can say that, when, at any given point, the strength of one of the fields rises or falls with the elapse of time it induces the strength of the field at neighboring points to rise or fall in such a way that the curl of the field remains consistent with the associated rate at which the strength of the other field changes with the elapse of time. In order for light to exist for an observer, then, the fields must change inherently with the elapse of time; they cannot appear static.

    The fact of that proposition standing true to Reality necessitates that Besso observe the situation in such a way that he can only infer that the ray propagates past Einstein. That fact necessitates that Einstein observe the ray always propagating past him in the direction away from Besso. But Einstein remains free to accelerate in that direction, so how can Reality prevent him from matching speeds with the ray? We can only have a perfect guarantee that Reality will meet that criterion if we assert that no acceleration by Einstein will alter his kinematic relationship with the ray; in other words, however long he accelerates, he will always see the ray pass him at the same speed. That inference leads directly to Einstein's second postulate of Special Relativity.

    With his two postulates (his version of the principle of relativity and the constancy of the speed of light) and a series of imaginary experiments, Einstein deduced the Lorentz Transformation. In so doing he gave the transformation a geometric interpretation; he presented the equations as representing differences in distance and duration measured between the same two events by different observers. Lorentz, on the other hand, gave his transformation equations a dynamic interpretation: he attributed the various effects (the contraction of bodies, the slowing of clocks, etc.) to the alteration of the forces within bodies by the Æther wind blowing through those bodies. We could even say, with some justification, that, in spite of Lorentz's explicit use of the Æther, Einstein's version of the Lorentz Transformation stands before us as the more Ætherial of the two.
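
    As an illustration of how the transformation embodies the second postulate, the sketch below applies the two nontrivial equations of the Lorentz Transformation (the y and z coordinates pass through unchanged) to an event on a ray of light. The function names and the sample speeds are assumptions of mine, chosen only for the demonstration.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def lorentz_transform(x, t, v, c=C):
    """Map an event (x, t) into a frame moving at velocity v along +x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    x_prime = gamma * (x - v * t)
    t_prime = gamma * (t - v * x / (c * c))
    return x_prime, t_prime


def ray_speed_seen_by(v, t=1.0e-6):
    """Speed of a ray emitted from the origin at t = 0, as measured in the moving frame."""
    x_prime, t_prime = lorentz_transform(C * t, t, v)
    return x_prime / t_prime


for fraction in (0.0, 0.5, 0.9, 0.99):
    speed = ray_speed_seen_by(fraction * C)
    print(f"v = {fraction:.2f} c -> ray speed = {speed:.0f} m/s")
```

    However closely the observer's speed approaches that of light, the ratio of distance to duration for the ray comes out the same to within rounding error, so the pursuer never gains on the ray.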

    "Imagination is more important than knowledge," Einstein once commented. Without imagination, he said, knowledge as such does not exist: we have merely collections of facts. The use of imagination, the creation of theories, transforms science from mere stamp collecting into proper storytelling. But we remain free to put the same facts into different stories and therein lies a risk that the story containing them will not give us a true representation of Reality.

    Lorentz set his story in the Absolute Space and Absolute Time that Isaac Newton described in his Principia. For Lorentz space consists of only one inertial frame of reference, in which the Æther remains more or less motionless and we must refer the speed of light to that Æther. Observers have a dynamic relationship with what they observe, one mediated by forces arising from the Æther wind.

    Einstein set his story in a space consisting of an infinite set of inertial frames of reference, a space in which the speed of light has the same magnitude for all observers. Observers have a geometric relationship with what they observe, one mediated by the metric relationship between the inertial frames the observers and the objects of their observations occupy and mark. Nicely enough, Einstein never got an answer to the question that led him to that story: he found out that the question has no answer because the situation he tried to imagine, pacing a ray of light, lies beyond all possibility of realization, even in the realm of the imagination.


A Philosophical Commentary

    The Classical Greek philosopher Plato argued that we experience Reality as a crude expression of underlying Forms (or Ideals) that serve as absolutely perfect templates of the things thus expressed. As absolutes the Platonic Forms shape Reality without reacting to it in any way. Today we don't accept Plato's idea that Forms underlie common objects, but rather we tacitly accept the idea that they underlie the components of such objects and the laws that govern them: we no longer believe in a Platonic Form for Horse, but we do believe in one for Electron; we no longer believe in a Platonic Form for Rainstorm, but we do believe in one for Gravity. We haven't changed our belief so much as we have applied the Democritean cutting procedure to the objects of our perception and inferred Forms underlying those things that we cannot cut.

    We also see something like Forms underlying behavior: we call those Form-like things laws of nature. Indeed, we could not have a consistent Reality without such pseudo-Forms. That fact necessitates that all observers receive the same percepts emanating from any given event. Thus, if on a certain date, at a certain time, I see a northbound red Honda slam into an eastbound green Ford in the intersection of Venice and Sepulveda Boulevards in West Los Angeles, then all other observers around that intersection at that time must see the same collision. No one will see a motorcycle hit a truck nor will they see the Ford hit the Honda. The differences that the witnesses see in the percepts emanating from the event will come from differences in the witnesses' locations relative to the event and those differences merely alter the perspective on the event.

    All observers must see the same phenomenon in any given event. From that we infer the existence of an underlying sameness in the Reality of the event. That sameness corresponds to a set of Platonic Forms that we identify as the laws of nature. At their most irreducible we express those laws in mathematical form. Cold, austere, unyielding mathematics gives us the perfect exemplar of the Platonic Forms, so we use it to express the most fundamental of the Forms of modern science.

    Thus we seek the absolute. From the abstraction of general laws from particular instances we gain our theories. We remove what makes our examples different from one another and keep what makes them in some sense the same. If we analyze the trajectories of thrown objects we find a bewildering array of parabolic arcs. But if we remove the horizontal component of motion from them, we find an underlying sameness: the objects all possess the same vertical acceleration. In that fact we can conceive what we take as the Form of Earth's gravity (although Earth's gravity merely gives us an example of the true Form, the law of gravitation).

    To find the absolute (the necessary) we must eliminate the contingent (the accidental). Anything we control we must identify with the contingent and therefore we must exclude it from the absolute. The location of an experiment, its orientation, its timing, the velocity at which it moves; all come under our control and, so, cannot affect the laws of physics. Whatever mathematical expression that we give those laws cannot acknowledge those things, except as contingencies. Thus we have Einstein's first postulate of Relativity.

    Since the time of Galileo physicists have slyly gone sidling up to the Pythagorean notion that "all is number". Modern physics does more than sidle: the fundamental equations stand as the purest expression of the Platonic Forms. Maxwell's Equations show us the Forms of Electricity and Magnetism, purely and simply. Physicists inferred them from laboratory experiments and one imaginary experiment and found them irreducible: nothing less than those four equations suffices to describe accurately the electromagnetic field and nothing more adds any useful information. But if Maxwell's Equations show us Forms, then the constants that determine the relative strengths of the fields must represent Forms as well: they cannot vary in different inertial frames. They have an invariant product then, so the speed of electromagnetic waves represents an absolute. And that just gives us Einstein's second postulate of Relativity.

    When he found out about moral relativity, the misapplication of his theory to social situations through the Pathetic Fallacy, Einstein offered the wish that he had never allowed anyone to associate the word relativity with the theory. He said that he had actually preferred to think of his theory as the Theory of Invariants. Given that invariant gives us a rather mild description of a Platonic Form, we can say that he would have done us good to insist on the alternate name.

