Of the accidental properties of matter that must obey conservation laws, one has the character of an absolute. If we have only two bodies in the Universe, we cannot tell which one moves and which one stands still, but we can tell which one rotates.
For simplicity assume that we have two bodies between which the distance does not change. If I suspect that the first of those bodies rotates, I need only change my motion until that body appears motionless to me. If the second body then appears to move around me in a circle, I have two possible explanations: either the second body moves around me or I rotate. If the second body moves around me, something must exert a force upon it to provide the centripetal acceleration necessary to make it move in a circle. If nothing exerts a force upon the second body, then I must infer that the second body’s apparent motion comes from my rotation with the first body. For all observers, then, a body either has a spin angular momentum or it doesn’t.
Because it obeys a conservation law, angular momentum must also conform to the finite-value theorem. Angular momentum also participates in the principle of least action, so it conforms to Heisenberg’s indeterminacy principle in the form
$$\delta L\,\delta\theta \ge h \tag{1}$$
in which δ represents the indeterminacy in the variable to which I have prefixed it, h represents Planck’s constant, and L represents the angular momentum of a system turning through an angle θ. We can use those facts to work out a description of the fundamental unit of angular momentum.
That angular momentum has a fixed non-zero realizable value comes from the fact that we measure angular displacement on a bounded domain. Although we sometimes calculate with bigger numbers, this fact remains true to Reality: only angles between zero and 2π (360 degrees) have physical meaning. As far as the laws of physics are concerned, ϕ+2π differs not at all from ϕ. We thus get the minimum realizable and measurable angular momentum when δθ=2π and we use the equality in Heisenberg’s formula; that is, we get
$$2\pi\,\delta L = h \tag{2}$$
which gives us
$$\delta L = \frac{h}{2\pi} = \hbar \tag{3}$$
Physicists call that number aitch-bar or, rarely, Dirac’s constant or the reduced Planck constant.
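As a quick numerical illustration of that result, the short sketch below evaluates the equality case of the indeterminacy relation for a full turn. It assumes the relation is taken in the form δL·δθ = h at the minimum, and the function name `minimum_angular_momentum` is purely illustrative.

```python
import math

# Planck's constant in joule-seconds (exact by the 2019 SI definition).
PLANCK_H = 6.62607015e-34


def minimum_angular_momentum(delta_theta: float = 2 * math.pi) -> float:
    """Return h / delta_theta, the equality case of dL * dtheta >= h."""
    return PLANCK_H / delta_theta


# A full turn (2*pi radians) is the largest physically distinct angle,
# so it yields the smallest realizable unit of angular momentum.
hbar = minimum_angular_momentum()
print(f"hbar = {hbar:.6e} J s")
```

The value printed is the Dirac unit of angular momentum in SI units, roughly 1.05 × 10⁻³⁴ joule-seconds.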
Thus we infer that any realizable and measurable process that changes a system’s angular momentum must do so by that amount or some integer multiple of it. If it didn’t make the change by an integer multiple of aitch-bar, then we could contrive a way to use the fractional amount to manifest a realizable and measurable change less than that minimum. That would violate Heisenberg’s principle, so it can’t happen. We must thus have in all cases
$$\Delta L = n\hbar \tag{4}$$
in which n represents an integer.
Imagine that we have as our system an electron revolving about an atomic nucleus. The electron’s orbit constitutes a realizable and measurable state, so the orbit must represent an integer multiple of the basic unit of angular momentum, the Dirac unit (aitch-bar). If we pursue that fact’s consequences, we get Niels Bohr’s model of the atom’s electronic structure. But consider instead what must happen when an electron changes orbits.
The electron’s orbital angular momentum must change by at least one Dirac unit of angular momentum. Further, we must have at least three particles (usually the electron, the nucleus, and a photon) involved in the event to ensure that the event properly conserves both linear momentum and energy. And satisfying conservation of angular momentum necessitates that some part of the system manifest a change of angular momentum to negate the change that the electron brings into the system. We cannot rely on the nucleus to play that role, so we must infer that the photon does the job, which necessitates that each and every photon carry one Dirac unit of angular momentum as an inherent property that we call spin.
Next we want to transcend the old quantum theory to develop a proper quantum theory of angular momentum and thereby show that spin emerges from the theory as a relativistic effect. To reach that goal we need to devise a relativistic version of the master equation of quantum mechanics. That fact means that we can’t use Schrödinger’s Equation, because that equation lacks the property of Lorentz invariance.
The basic master equation of quantum mechanics simply equates the total energy possessed by a particular system to the Hamiltonian function describing that particle or system. In order to develop a relativistic version of that master equation we need to devise a proper relativistic version of the Hamiltonian. To that end we start with the fundamental equation of relativistic dynamics,
$$E^2 = p^2c^2 + m_0^2c^4 \tag{5}$$
That equation represents a Lorentz invariant, which means that whatever function we square to produce it conforms to the requirements of the Lorentz Transformation. Given that p²=p•p, we rewrite Equation 5 as an explicit square,
$$E^2 = \left(c\,\boldsymbol{\alpha}\cdot\mathbf{p} + \beta m_0c^2\right)^2 \tag{6}$$
Because α must have three components, one for each component of the linear momentum p, and because the products of those components in the vector dot product obey the associative rule, we must have
$$\alpha_x^2 = \alpha_y^2 = \alpha_z^2 = \beta^2 = 1 \tag{7}$$
Also the cross terms that come from multiplying α by itself and by β must form pairs that zero out, so we have
$$\alpha_i\alpha_j + \alpha_j\alpha_i = 0 \quad (i \ne j) \tag{8}$$
$$\alpha_i\beta + \beta\alpha_i = 0 \tag{9}$$
Those latter two equations tell us that multiplication of non-identical components follows an anti-commutative rule; that means that reversing the order of the factors in each product reverses the algebraic sign of that product. That fact tells us, in turn, that the components cannot consist of scalars or vectors (the vector dot product obeys the commutative rule, as does the multiplication of scalars): the components must be matrices or higher-order arrays of numbers.
Because the state function, when multiplied by its complex conjugate, must yield a single number, the state function must be either a single number or a vector-like linear array of numbers, which means that the alphas and beta must be matrices and nothing of higher order. That fact stands true to mathematics because only multiplying a vector-like array of numbers by a matrix will yield another vector-like array of numbers, which we require for the quantum theory.
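To make those algebraic requirements concrete, the following sketch checks them numerically for one particular set of 4×4 matrices. The choice of matrices (the standard Dirac representation built from the Pauli matrices) is an assumption made for illustration; the argument above requires only that some such matrices exist. The helper names and the sample momentum are likewise hypothetical, and the units take c = 1.

```python
from itertools import combinations


def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]


ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
SIGMA = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}

# Standard (Dirac) representation: an assumed, conventional choice.
ALPHA = {k: block(ZERO2, s, s, ZERO2) for k, s in SIGMA.items()}
BETA = block(ONE2, ZERO2, ZERO2, scale(-1, ONE2))

ONE4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
ZERO4 = [[0] * 4 for _ in range(4)]

matrices = list(ALPHA.values()) + [BETA]

# Each matrix squares to the identity.
squares_ok = all(matmul(m, m) == ONE4 for m in matrices)

# Every pair of distinct matrices anticommutes.
anticommute_ok = all(
    add(matmul(a, b), matmul(b, a)) == ZERO4
    for a, b in combinations(matrices, 2)
)

# With c = 1, H = alpha.p + beta*m should square to (p^2 + m^2) times identity.
p, m = (3, 4, 12), 5
H = scale(m, BETA)
for component, axis in zip(p, "xyz"):
    H = add(H, scale(component, ALPHA[axis]))
energy_squared = sum(c * c for c in p) + m * m  # 9 + 16 + 144 + 25 = 194
square_ok = matmul(H, H) == scale(energy_squared, ONE4)

print(squares_ok, anticommute_ok, square_ok)
```

Ordinary numbers cannot pass the second test, which is why the components must be matrices.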
To see what we can learn from this let’s consider the case of an electrically charged particle moving in a spherically symmetric electrostatic field, described by the potential function V(r). In this case we have the Dirac Hamiltonian as
$$H = c\,\boldsymbol{\alpha}\cdot\mathbf{p} + \beta m_0c^2 + V(r) \tag{10}$$
In the light shed by that function we want to examine the angular momentum associated with the motion of the particle about some axis that passes through the electric field’s center of symmetry. Specifically we want to assume that the angular momentum vector aligns precisely with one of the axes (the x-axis for our example) of our coordinate system and then use Heisenberg’s equation of motion to calculate the rate at which the angular momentum changes with the elapse of time. We have, then,
$$\begin{aligned}\frac{dL_x}{dt} &= \frac{i}{\hbar}\left[H, L_x\right] \\ &= \frac{i}{\hbar}\left(\left[c\,\boldsymbol{\alpha}\cdot\mathbf{p}, L_x\right] + \left[\beta m_0c^2, L_x\right] + \left[V(r), L_x\right]\right)\end{aligned} \tag{11}$$
Because it has the mathematical nature of a constant the rest mass term in the Hamiltonian commutes with the angular-momentum function, so its part of the commutator in the first line of the above equation zeroes out. Likewise, the potential term commutes with the angular-momentum function and thus drops out of the equation of motion, though it’s harder to see why.
Put simply, the potential function commutes with the angular momentum because all spherically symmetric functions commute with the angular momentum. To understand why that fact stands true to mathematics we need only replace the angular momentum by the corresponding quantum differential operator. If, in our coordinate frame, we measure an angle θ around the x-axis, then we have
$$L_x = -i\hbar\frac{\partial}{\partial\theta} \tag{12}$$
Because of its spherical symmetry the potential function has no dependence on θ, so it commutes with the angular momentum operator and, thus, drops out of the Heisenberg equation of motion.
Thus Equation 11 becomes
$$\frac{dL_x}{dt} = c\left(\alpha_y p_z - \alpha_z p_y\right) \tag{13}$$
Because the right-hand side does not zero out, that description does not conform to conservation of angular momentum, but Reality most certainly does, so we need to find some entity S that we can add to the angular momentum such that
$$\frac{d}{dt}\left(\mathbf{L} + \mathbf{S}\right) = 0 \tag{14}$$
In order to make that equation suitable for addition to Equation 13 we need to find an alternate way in which to express α. Dirac, of course, already did that.
He started by defining a new vector,
in which σ represents the vector whose three components consist of the 2×2 Pauli spin matrices (see Appendix). Thus σ’ consists of a trio of 4×4 matrices. Next Dirac calculated the commutator of that vector with α, getting, for example,
And, of course, we also have
With that result Dirac could use Heisenberg’s equation of motion to describe the rate at which σ’ changes with the elapse of time. If we carry out that calculation, using σx’ for an example, we get
If we compare that result with Equation 14 for all of the components of σ’, we find that
which means that the quantity L+½σ’ represents the conserved angular momentum. That fact necessitates that the operator
extract a description of the electron’s spin from the appropriate state function.
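The commutators that drive this argument can be checked numerically. The sketch below assumes the standard Dirac representation and takes the 4×4 spin matrices to be block-diagonal copies of the Pauli matrices, with no factor of ħ; the article’s σ’ may carry a different normalization. All helper names are illustrative. Under the usual Heisenberg equation, the relations [Σx, αy] = 2iαz and [Σx, αz] = −2iαy give a rate of change for (ħ/2)Σx equal to −c(αy pz − αz py), which is exactly what is needed to cancel the rate of change of Lx.

```python
def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def commutator(a, b):
    """Return ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]


def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]


ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
SIGMA = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}

# Assumed conventions: off-diagonal blocks for alpha, diagonal blocks for spin.
ALPHA = {k: block(ZERO2, s, s, ZERO2) for k, s in SIGMA.items()}
BETA = block(ONE2, ZERO2, ZERO2, scale(-1, ONE2))
SPIN = {k: block(s, ZERO2, ZERO2, s) for k, s in SIGMA.items()}

ZERO4 = [[0] * 4 for _ in range(4)]

# The x-component of spin commutes with alpha_x and with beta ...
commutes_with_alpha_x = commutator(SPIN["x"], ALPHA["x"]) == ZERO4
commutes_with_beta = commutator(SPIN["x"], BETA) == ZERO4

# ... but not with the other two components of alpha.
with_alpha_y = commutator(SPIN["x"], ALPHA["y"]) == scale(2j, ALPHA["z"])
with_alpha_z = commutator(SPIN["x"], ALPHA["z"]) == scale(-2j, ALPHA["y"])

print(commutes_with_alpha_x, commutes_with_beta, with_alpha_y, with_alpha_z)
```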
The electron can have one half unit of spin because, unlike the photon, it can flip over, thereby enacting a change equal to one full Dirac unit of angular momentum as it goes from +½S to -½S or vice versa. When an electron in a hydrogen atom executes a spontaneous spin flip it emits a photon with an associated wavelength of 21 centimeters, the radiation that astronomers use to map the galaxy. And, like the electron, all of the fundamental particles of matter carry one half of a Dirac unit of angular momentum as spin.
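The 21-centimeter figure can be recovered from the measured frequency of the hydrogen spin-flip transition. The sketch below assumes the commonly quoted value of about 1420.405751768 MHz for that frequency; the other constants are exact by SI definition, and the variable names are illustrative.

```python
SPEED_OF_LIGHT = 299_792_458.0          # m/s, exact by definition
PLANCK_H = 6.62607015e-34               # J s, exact by definition
ELECTRON_VOLT = 1.602176634e-19         # J, exact by definition
HYPERFINE_FREQUENCY = 1.420405751768e9  # Hz, measured (assumed value)

# Wavelength of the emitted photon, converted from meters to centimeters.
wavelength_cm = 100 * SPEED_OF_LIGHT / HYPERFINE_FREQUENCY

# Energy released by the flip, in electron volts.
photon_energy_ev = PLANCK_H * HYPERFINE_FREQUENCY / ELECTRON_VOLT

print(f"wavelength = {wavelength_cm:.3f} cm")
print(f"photon energy = {photon_energy_ev:.3e} eV")
```

The energy comes out at a few millionths of an electron volt, far smaller than the electron volts involved when an electron changes orbits.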
Appendix: Pauli’s Theory of Electron Spin
Toward the end of the Nineteenth Century physicists began to suspect that spectroscopy would give them the magic key to unlock the secrets of the atom and its structure. Niels Bohr began confirming that suspicion in 1913 when he presented his model of the atom, of light electrons whirling about a minuscule, heavy nucleus, and used it to deduce a correct description of the Balmer series in the spectrum of hydrogen. Over the following decade physicists refined Bohr’s model and found that a proper description of an atom’s electronic structure requires three quantum numbers.
The first of those numbers designates the shell in which the electron revolves about the nucleus and, thus, specifies the energy that the electron possesses relative to some ground state. The second quantum number designates the angular momentum associated with the electron due to its revolving about the nucleus. And the third quantum number designates the projection of the orbital angular momentum vector on the z-axis of an arbitrarily oriented coordinate frame: that number takes values between the positive and negative values of the electron’s orbital angular momentum, each value differing from the next by one Dirac unit of angular momentum.
But then physicists had to confront the results of an experiment that Dutch physicist Pieter Zeeman (1865 May 25 – 1943 Oct 09) conducted in 1896. In spectroscopy the basic experiment consists of heating a sample of some substance to incandescence and then projecting the light emanating from the sample through a glass prism to spread it into a spectrum marked by bright lines. Zeeman discovered that if he immersed the sample in a magnetic field, the bright lines in the sample’s spectrum split into multiple lines separated by distances that corresponded to the intensity of the magnetic field. In the early 1920s Wolfgang Pauli proposed that the Zeeman effect required a proper description of an atom to include a fourth quantum number, although he could not say what phenomenon that new quantum number would represent.
In 1925 the Dutch physicists George Eugene Uhlenbeck (1900 Dec 06 – 1988 Oct 31) and Samuel Abraham Goudsmit (1902 July 11 – 1978 Dec 04) analyzed the Zeeman effect in light of their proposal that an electron carries a small amount of inherent angular momentum, which Pauli called spin, and that the fourth quantum number represents the projection of that spin’s vector onto the z-axis described above. The analysis convinced Pauli and led him to devise his own version of Schrödinger’s Equation,
In that equation the state function ψ consists of a two-component spinor, each component encoding a description of the system in one of the electron’s eigenstates (either spin up or spin down relative to the magnetic field). Further, in the three-component vector σ each component consists of a 2×2 matrix that reflects the fact that the electron has only two eigenstates. That latter fact comes from Uhlenbeck and Goudsmit’s analysis, in which they inferred that the electron carries a spin equal to one half of one quantum unit of angular momentum.
So the electron (and other fundamental particles) carries an inherent angular momentum
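its spin. Because the vector σ represents the orientation of an angular momentum, its components must conform to the quantum commutation rules for angular momentum,
$$\sigma_x\sigma_y - \sigma_y\sigma_x = 2i\sigma_z,\qquad \sigma_y\sigma_z - \sigma_z\sigma_y = 2i\sigma_x,\qquad \sigma_z\sigma_x - \sigma_x\sigma_z = 2i\sigma_y \tag{A-3}$$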
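Assuming that the spin vector is oriented parallel to our z-axis, we know that Sz has eigenvalues of +S/2 and −S/2, so we know that σz must have eigenvalues of +1 and −1. That fact tells us that σz²=1. Applying that reasoning to the other components of σ gives us
$$\sigma_x^2 = \sigma_y^2 = \sigma_z^2 = 1 \tag{A-4}$$
With those relations we can form a new commutator,
$$\sigma_y^2\sigma_z - \sigma_z\sigma_y^2 = 0 \tag{A-5}$$
If we add to that equation the statement that σyσzσy − σyσzσy = 0, then we get
$$\sigma_y\left(\sigma_y\sigma_z - \sigma_z\sigma_y\right) + \left(\sigma_y\sigma_z - \sigma_z\sigma_y\right)\sigma_y = 0 \tag{A-6}$$
Referring to the second equation in Equations A-3 to replace the commutators in Equation A-6 (and dividing out the constants), we see that
$$\sigma_y\sigma_x + \sigma_x\sigma_y = 0 \tag{A-7}$$
which means that the components of σ obey an anti-commutation rule. In the light of that rule and of Equations A-3 we readily infer that
$$\sigma_y\sigma_z = i\sigma_x,\qquad \sigma_z\sigma_x = i\sigma_y,\qquad \sigma_x\sigma_y = i\sigma_z \tag{A-8}$$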
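Because the components of σ do not commute under multiplication and because, like σz, they have only two eigenvalues, we know that we must represent them with 2×2 matrices. We already know the eigenvalues for σz, so we know that
$$\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$
in which the positive one represents the spin-up state and the negative one represents the spin-down state of a spin-½ particle. In order to determine the form of the other components we assume that
$$\sigma_x = \begin{pmatrix} a_1 & a_2 \\ a_3 & a_4 \end{pmatrix},\qquad \sigma_y = \begin{pmatrix} b_1 & b_2 \\ b_3 & b_4 \end{pmatrix}$$
Next we multiply Equations A-8 by −i and get
$$\sigma_x = -i\sigma_y\sigma_z,\qquad \sigma_y = -i\sigma_z\sigma_x,\qquad \sigma_z = -i\sigma_x\sigma_y$$
Those equations give us twelve equations in eight unknowns,
$$\begin{aligned}
a_1 &= -ib_1, & a_2 &= ib_2, & a_3 &= -ib_3, & a_4 &= ib_4, \\
b_1 &= -ia_1, & b_2 &= -ia_2, & b_3 &= ia_3, & b_4 &= ia_4, \\
a_1b_1 + a_2b_3 &= i, & a_1b_2 + a_2b_4 &= 0, & a_3b_1 + a_4b_3 &= 0, & a_3b_2 + a_4b_4 &= -i
\end{aligned}$$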
Combining the first and fifth of those equations and then the fourth and eighth gives us a1=-a1=0 and a4=-a4=0. Those facts transform the ninth and twelfth equations into a2b3=+i and a3b2=-i.
Equation A-4 then necessitates that the pairs (a2,a3) and (b2,b3) take the values (1,1) or (-1,-1) and (i,-i) or (-i,i). We must now find a way to assign one pair of eigenvalues to one of the matrices and the other matrix’s eigenvalues will follow immediately.
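Because the magnetic field that defines the direction of spin up or spin down lies parallel to our z-axis, our x-axis and y-axis have no features to distinguish one from the other. It will make no difference in the physics how we assign the eigenvalues to the matrix elements, so following Pauli we set (a2,a3)=(+1,+1). And thus we obtain the Pauli spin matrices
$$\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},\qquad \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix},\qquad \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$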
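As a check on the derivation, the sketch below confirms that the Pauli matrices in their conventional form satisfy every relation used along the way: each squares to the identity, distinct components anticommute, and the cyclic products and commutators come out as stated. The helper names are illustrative only.

```python
def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def scale(s, a):
    return [[s * x for x in row] for row in a]


def combine(a, b, sign):
    """Return ab + sign * ba (sign = -1: commutator, +1: anticommutator)."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(ab, ba)]


IDENTITY = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

# Cyclic triples (a, b, c) for which a*b should equal i*c.
CYCLIC = [(SX, SY, SZ), (SY, SZ, SX), (SZ, SX, SY)]

squares_ok = all(matmul(s, s) == IDENTITY for s in (SX, SY, SZ))
commutators_ok = all(combine(a, b, -1) == scale(2j, c) for a, b, c in CYCLIC)
anticommutators_ok = all(combine(a, b, +1) == ZERO for a, b, _ in CYCLIC)
products_ok = all(matmul(a, b) == scale(1j, c) for a, b, c in CYCLIC)

print(squares_ok, commutators_ok, anticommutators_ok, products_ok)
```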
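If we now want to describe a spin-½ particle with its spin vector oriented in an arbitrary direction defined by the direction cosines l, m, n, then we have
$$l\sigma_x + m\sigma_y + n\sigma_z = \begin{pmatrix} n & l - im \\ l + im & -n \end{pmatrix}$$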
If we apply that matrix to the spinor-based state function describing the particle at a given instant, then we can calculate, by way of Born’s theorem, the probabilities of the particle’s spin vector pointing in any given direction at any given instant. Experiments confirm that result.
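A minimal sketch of that calculation follows, assuming the matrix for a direction with direction cosines (l, m, n) takes the form lσx + mσy + nσz and that the spinor is normalized. Because that matrix has only the eigenvalues +1 and −1, the probability of finding the spin pointing along the chosen direction equals one half of one plus the expectation value. The function names are hypothetical.

```python
import math


def spin_matrix(l, m, n):
    """Return l*sigma_x + m*sigma_y + n*sigma_z for direction cosines (l, m, n)."""
    return [[n, l - 1j * m], [l + 1j * m, -n]]


def expectation(matrix, state):
    """Return <psi| M |psi> for a normalized two-component spinor."""
    m_psi = [sum(matrix[i][j] * state[j] for j in range(2)) for i in range(2)]
    return sum(complex(state[i]).conjugate() * m_psi[i] for i in range(2)).real


def probability_along(l, m, n, state):
    """Born probability of finding the spin pointing along (l, m, n)."""
    return 0.5 * (1.0 + expectation(spin_matrix(l, m, n), state))


# Spin up along z, measured along an axis tilted 60 degrees toward x.
spin_up_z = [1.0, 0.0]
tilt = math.radians(60)
p_up = probability_along(math.sin(tilt), 0.0, math.cos(tilt), spin_up_z)
print(f"P(spin along tilted axis) = {p_up:.3f}")
```

For a tilt angle t the result equals cos²(t/2), so a 60-degree tilt gives three chances in four.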