Newton’s Zeroth Law

Back to Contents

    It seems almost as natural as my using English to compose these essays. If I want to describe some phenomenon – the way a planet moves across the sky, how hard a coil of wire adheres to an iron plate when someone connects the ends to the opposite poles of a Voltaic pile, or how much work an engine can extract from a mass of steam – I will describe the system with mathematics and I will describe its behavior with mathematical logic that changes my initial description into a new description in accordance with the change in some parameter of the system, usually time. Mathematics seems to give us something like the Platonic Form of Reality, a kind of ætherial music to which everything dances.

    We find the essence of physics in the act of devising an algebraic description of some feature of Reality and then using mathematical operations to transform that description into a new description that also matches some feature of Reality. Thus, we describe the world in its most fundamental aspect with mathematics; in particular, the mathematics of smooth, continuous functions. We go even further and say that the laws of physics conform to mathematical descriptions based on smooth, continuous functions of the relevant coordinates, usually those of space and time.

    Should we find that proposition astonishing? Do we find ourselves repeating Albert Einstein’s question, "How can it be that mathematics, being after all a product of human thought which is independent of experience, is so admirably appropriate to the objects of reality?" How, indeed? After all, we regard mathematics as a kind of language and we don’t question our ability to use language to describe the world or to use logic to tell us new things about the world. Language mediates between perception and conception and enables us to share concepts by describing things to other people. If Isaac Newton says, "I saw an apple fall from a tree", that statement evokes in our minds an image that, while not fully reproducing the image in Newton’s mind, does contain the elements that he mentioned.

    But if we can use language to help us envision what we have never actually seen, however inaccurately, we can also use language to describe what does not exist. As an example consider that in her 1954 novel "The Stars Are Ours" Alice Mary ("Andre") Norton (1912 Feb 17 – 2005 Mar 17) described a rather different kind of apple tree. Found on an Earth-like planet revolving about a nearby star, it had a conical shape and blue-green needles, making it resemble a pine, but, in lieu of cones, it bore a fruit that resembled an apple with a golden sheen. No one has ever seen such a tree or its fruit, of course, but I have in my mind an image of it every bit as clear as the image of any real apple tree that I have actually seen.

    Given that example as a caution, we state that mathematics exists as the language of a particular realm, one of pure logic. It exists as a language whose grammar seems to mimic the rules that govern the behavior of matter and radiation in space and time. Indeed, at the interface between mathematics and Reality, the science of physics, we find that statements that hold true in one of those realms hold true in the other. One of the conditions that our mathematical language must satisfy to achieve such close mimicry requires that its "words" present only pure denotation and no connotation, the humanizing fluff that attaches to most words. Satisfying that condition makes easier the task of conforming our statements to Cobbett’s rule: we put our propositions into such a form that others will understand them and also will not misunderstand them. In that rule we find an analogue of the mathematician’s criteria of necessary and sufficient for a derivation in mathematical logic.

    The property of uniqueness gives mathematics the ability to obey that rule. We exploit the fact that the equations representing the laws of physics have unique solutions; as Richard Feynman observed, equations of the same form have solutions of the same form. Among those equations the ones that encode the fundamental laws of physics constitute a set of first-order and second-order differential equations. The functions that go into those equations must then possess the properties of continuity and smoothness, which means that, at the very least, those functions must operate on elements of the set of the real numbers. The functions that we use to describe features of Reality require continuity because their first derivatives must exist everywhere in the realm of application. Those functions also require smoothness because their second derivatives must exist everywhere in the realm of application.

    At some point we want to ask why we must encode the fundamental laws of physics into the change in some function F or in the change in the change in F. Part of the answer will explain the fact that the solutions of the fundamental equations give us only a general description of the situation we want to describe and that we must, therefore, augment any application of the laws of physics with initial conditions and/or boundary conditions pertaining to a specific situation. In an effort to gain additional insight into this topic let’s look at the discoveries that necessarily led up to the appearance of Newton’s zeroth law.
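
    To make that last point concrete, here is a minimal numerical sketch (assuming Python with the NumPy and SciPy libraries, which the essay itself does not invoke): one second-order equation of motion, that of a mass on a spring, integrated from two different sets of initial conditions. The law is the same in both runs; only the initial data distinguish the solutions.

# A minimal sketch (assuming NumPy and SciPy are available): the same
# second-order law, x'' = -omega^2 x, integrated from two different
# sets of initial conditions.  The law alone does not pick out a
# solution; the initial data do.
import numpy as np
from scipy.integrate import solve_ivp

def oscillator(t, state, omega=2.0 * np.pi):
    x, v = state
    return [v, -omega**2 * x]      # x'' = -omega^2 x as a first-order system

t_eval = np.linspace(0.0, 1.0, 11)

# the same law, two different choices of initial position and velocity
for x0, v0 in [(1.0, 0.0), (0.0, 2.0 * np.pi)]:
    sol = solve_ivp(oscillator, (0.0, 1.0), [x0, v0], t_eval=t_eval,
                    rtol=1e-9, atol=1e-12)
    print(f"x(0)={x0}, v(0)={v0} ->", np.round(sol.y[0], 3))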

    In Ancient Greece philosophers knew that they could apply mathematics to astronomy and they believed that they could apply it to physics. Referring to the doctrines of Plato, they knew that they had to remove the texture of the real world, the world of our senses, to leave the ætherial Forms underlying Reality exposed. They sought the causes of things, the Forms of objects and events, but they lacked the mathematics necessary to abstract general laws from their observations; indeed, they lacked the mathematics necessary to describe their observations properly.

    Beginning near the end of the Tenth Century AD, Europeans began to get acquainted with the Ghobar (sand-table) numerals that the Arabs had obtained from Hindu mathematicians by way of Persia. By the middle of the Sixteenth Century the system of "Arabic" numerals, with its zero, place-value notation, and decimal fractions, had achieved common use throughout Europe, thereby giving European mathematicians the ability to use the set of the real numbers, the concept of which the process of differentiation requires.

    Algebra, the foundation on which differential calculus as we understand it rests, has had a fuzzier history. Some mathematicians trace it back to Diophantus and the Babylonians. Others point out that the word problems found in Ancient Egyptian texts correspond to the solving of equations. And yet others point to the Persian mathematician Muhammad ibn Musa al-Khwarizmi (from whose name we get the word algorithm), who published the first algebra book (algebra comes from al-Jabr, which means reunion) in AD 820. (Al-Khwarizmi also wrote the book that introduced the Hindu numeral system to the Arabs.) The key concept in algebra tells us that we can represent numbers as labeled blanks, manipulate those blanks through the processes of arithmetic, and then insert actual numbers into the blanks in order to carry out the actual calculations. Recipes for doing arithmetic with labeled blanks correspond to the concept of a function, a specific sequence of arithmetic operations that takes numbers from an input set and associates them with specific numbers from an output set.
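
    As a toy illustration of that idea (my own, in Python, with hypothetical names), a function is just such a recipe: a fixed sequence of arithmetic operations on labeled blanks, into which we insert actual numbers only when we want a calculation carried out.

# A toy illustration: a function as a recipe of arithmetic performed on
# labeled blanks.  The names here are hypothetical, chosen only for the
# example.
def distance_fallen(g, t):
    """The recipe (1/2)*g*t*t, with 'g' and 't' left as labeled blanks."""
    return 0.5 * g * t * t

# Inserting actual numbers into the blanks carries out the calculation:
# each input pair is associated with one definite output number.
print(distance_fallen(9.8, 1.0))   # 4.9
print(distance_fallen(9.8, 2.0))   # 19.6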

    In the realm of measurement, the invention of the mechanical clock in the Fourteenth Century began for time the process that Greek geometry had applied to space. By separating the measurement of time from the rhythms of Nature with a device that could count time continuously, the monks who built and used the first clocks initiated the abstraction of time to reflect the Platonic Form of duration.

    At the beginning of the Seventeenth Century Galileo Galilei discovered the right way to mathematize physics. While studying the motion of freely falling bodies, he conceived the idea of rolling balls down a shallow groove cut into an inclined plank to slow the balls’ descent enough that he could make measurements of distance versus elapsed time. He found that the distance a ball rolled stood in direct proportion to the square of the time elapsed since the ball began rolling. Thus began algebraic physics. In a downright Pythagorean move ("All is number"), Galileo initiated the idea of algebraic equations corresponding to the Platonic Forms of motion.
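
    In modern notation, which Galileo himself never used (he reasoned with ratios rather than with equations), that result for a ball released from rest reads

s = (1/2)·a·t^2,

in which a denotes the constant acceleration of the ball along the groove, so that the distance s stands in direct proportion to the square of the elapsed time t.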

    In that move Galileo took connotational reasoning out of physics, leaving only the denotational reasoning of pure mathematical logic. To understand the significance of that move, to understand the difference between connotational reasoning and denotational reasoning, consider the difference between astrology and astronomy: in the former the motions of the planets through the constellations possess meanings, based on the myths told of the gods and heroes associated with the planets and the constellations, that have application here on Earth and in the latter we simply calculate those motions from algebraic formulae. Up to the Seventeenth Century natural philosophy had more the character of astrology than of astronomy. The philosophers, right up through the Medieval Schoolmen, failed to remove the connotations from the denotations of the words used in their reasoning. Galileo corrected that error and laid the foundation of modern physics, just as, at the end of the Eighteenth Century, Antoine Lavoisier created the nomenclature that we use to name substances, thereby transforming alchemy into true chemistry.

    In 1637 Rene Descartes had "La Geometrie" published. In that book he laid the foundation of what we now call analytic geometry. He did not actually introduce the gridded plane that we now use to plot the curves that correspond to certain algebraic functions, but he showed the way toward that useful device. More importantly, he showed mathematicians how to convert algebraic equations into problems in geometry and vice versa, which enabled mathematicians to solve otherwise intractable problems.

    Half a century later Isaac Newton brought it all together in "The Mathematical Principles of Natural Philosophy" (the famous Principia). He combined Galileo’s algebraic physics with Descartes’ analytic geometry and added in the calculus (which he invented) and the element of time, thereby transforming the static geometry of Euclid and Apollonius into the dynamic geometry that underlies all of modern physics. Although he refused to frame any hypotheses (as he put it), refused to speculate on fundamental causes, his mathematical physics gives us something very close; indeed, we may well regard the laws of Newtonian physics as the Platonic Forms of events.

    Consider one example of what that statement means. In the years around 1610 Johannes Kepler, in a magnificent example of the empirical-inductive method, took the volumes of data on planetary motion that Tycho Brahe had accumulated, applied those data to the Copernican hypothesis, and extracted three laws governing the motions of the planets. He got the correct laws, certainly, but he gained no explanation of how or why they exist as they do. Newton, on the other hand, deduced Kepler’s laws from his fundamental laws of motion and his law of gravity. In that deduction he came as close as we can to comprehending the causes of planetary motion.
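
    As a hedged numerical sketch of that deduction (in Python with NumPy, integrating Newton’s inverse-square law rather than reproducing Newton’s own geometric argument), we can let a single planet move under the law of gravity and watch Kepler’s second law, the constancy of the rate at which the radius vector sweeps out area, emerge from the integration.

# A numerical sketch (assuming Python with NumPy; not Newton's own
# geometric argument): integrate the inverse-square law of gravity for
# a single planet and watch Kepler's second law emerge, i.e. the rate
# at which the radius vector sweeps out area stays constant.  Units are
# chosen so that GM = 1.
import numpy as np

GM = 1.0
dt = 1.0e-4                       # time step
steps = 200_000

r = np.array([1.0, 0.0])          # initial position
v = np.array([0.0, 1.2])          # initial velocity (a bound, eccentric orbit)

def accel(pos):
    """Acceleration from Newton's law of gravity, a = -GM r / |r|^3."""
    return -GM * pos / np.linalg.norm(pos)**3

areal_rates = []
a = accel(r)
for step in range(steps):
    # velocity-Verlet integration of the second-order equation of motion
    r = r + v * dt + 0.5 * a * dt**2
    a_new = accel(r)
    v = v + 0.5 * (a + a_new) * dt
    a = a_new
    if step % 20_000 == 0:
        # Kepler's second law: dA/dt = |x*vy - y*vx| / 2 should not change
        areal_rates.append(0.5 * abs(r[0] * v[1] - r[1] * v[0]))

print(np.round(areal_rates, 6))   # the same areal rate, sampled across the orbit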

    Or did he? Does mathematics truly give us a correct picture of causes? Does mathematics accurately represent something like a Platonic Form? To gain enough insight to answer those questions let’s ask whether humans invented mathematics or discovered it. If we invented it, then mathematics consists solely of the manipulation of symbols which exist in the world of our conceptions and to which culture, and only culture, assigns meaning. But if we discovered it, then mathematics shows us the shadow of a transcendent reality, the Platonic Forms underlying the world of our perceptions. In the latter case Newton certainly caught a glimpse of ultimate Reality. But if we invented mathematics, we can’t say for certain what we see through it. Thus, we cannot answer the existential questions right now, but we will return to them as we develop the Map of Physics.

    Finally, the smooth, continuous functions that we use in mathematical physics conform to a simple equation, the generalized Stokes theorem,

∮_∂S f = ∫_S df.

In a kind of nice symmetry we have the integral of some function f over the closed boundary ∂S of a geometric figure S equal to the integral of the derivative df of that function over the figure S itself. We have three basic possibilities for putting that statement into a more explicit form:

    1. The difference between the values of the function at the two endpoints of a line equals the integral of df along the line. This gives us the fundamental theorem of calculus for the standard Riemann integral (a numerical check appears after this list).

    2. The integral of f over the closed curve bounding a surface equals the integral of df on the surface. Stokes’ theorem and Ampere’s law provide examples of this two-dimensional case.

    3. The integral of f over the closed surface enclosing a volume equals the integral of df throughout the volume. Gauss’s law gives us the prime example of this version.
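
    Here is the promised numerical check of the one-dimensional case, a small sketch in plain Python with an arbitrarily chosen function: summing df over the interior of a segment reproduces the difference of f evaluated at the two endpoints.

# A sketch in plain Python of the one-dimensional case: summing df over
# the interior of a segment reproduces the difference of f evaluated at
# the two endpoints (the fundamental theorem of calculus).  The function
# chosen here is arbitrary.
def f(x):
    return x**3 - 2.0 * x

def df(x):
    return 3.0 * x**2 - 2.0        # the derivative of f

a, b, n = 0.5, 2.0, 100_000
h = (b - a) / n

# midpoint Riemann sum of df over the segment from a to b
integral_of_df = sum(df(a + (k + 0.5) * h) for k in range(n)) * h

print(integral_of_df)              # approximately 4.875
print(f(b) - f(a))                 # 4.875: f evaluated on the boundary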

Thus we have the fundamental concepts underlying the mathematical foundation upon which we build modern physics.

Appendix I: A Physicist’s Peano Arithmetic

    In 1889 Giuseppe Peano (1858 Aug 27 – 1932 Apr 20) had his book "The Principles of Arithmetic Presented by a New Method" published. In that book, building on work begun in the 1860s, Peano sought to replace the ad hoc arithmetic that had accumulated over several millennia with a rigorously deduced axiom-based arithmetic. He started with nine axioms, which he later reduced to five, from which he could deduce all of the properties of the natural numbers. In those five axioms we have:

    1. Zero, the number of elements in the empty set, is a natural number.

    2. Every number N has precisely one successor SN.

    3. Zero is not the successor of any natural number.

    4. Distinct numbers have distinct successors.

    5. If a set of natural numbers contains the number zero and contains, together with any number, also its successor, then it is the set of all natural numbers.

    Axiom #1 gives us the foundation upon which we base our construction of the set of the natural numbers. It gives us the first element in the set as that which counts the absence of anything to count.

    For Axiom #2 we can define the relation of successor by saying that if we have a set of discrete objects to which we have assigned the number N (such as by counting) and if we put another object into the set, then the number of the modified set corresponds to the successor of N.

    Axiom #3 tells us that given the set of the natural numbers we have a well-ordered set that begins with zero.

    Axiom #4 tells us that for all natural numbers M and N, if we have SM=SN, then we must also have M=N.

    This brings into play the first four of Peano’s original axioms, which we may take as constituting the set of common notions pertaining to the relation of equality. If we have natural numbers X, Y, and Z, then we know that X=X (equality is reflexive: every number equals itself); if X=Y, then Y=X (equality is symmetric: it works both ways); if X=Y and Y=Z, then X=Z (equality is transitive); and for all entities labeled A and B, if A represents a natural number and A=B, then B also represents a natural number (the set of natural numbers is closed under the relation of equality). Note that A and B can represent different entities, such as algebraic functions.

    We now have what we need to generate the set of the natural numbers. Beginning with zero, the natural number of the empty set, we generate the set of the natural numbers by repeated application of the successor function to that seed. For convenience (and to make the connection with the common, ad hoc arithmetic that we all know and love) we apply a name to each successor, fixing it in place to create the ordered sequence representing the natural numbers. The names come from the sequence {one, two, three, ... , infinity}, in which the word infinity denotes the fact that no last name exists in the sequence. Thus we have 0=zero, S0=one, SS0=two, SSS0=three, and so on.

    With that set we can carry out the process of counting the elements of another set. We begin with an empty set and call its number zero. Then we put elements into the set, accompanying the placement of each element with a calling out of the next name in the fixed sequence of number names. The last name that we call out when we put the last element into the set we call the number of the elements in the set.
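
    A small sketch in Python (my own toy representation, not Peano’s notation) shows the successor construction and the counting procedure at work: zero appears as an empty container and each application of the successor adds one layer of wrapping.

# A toy Python model of the successor construction: zero is an empty
# container and each application of the successor wraps the number in
# one more layer.
zero = ()

def succ(n):
    return (n,)

def to_name(n):
    """Count the layers of wrapping and report the familiar name."""
    names = ("zero", "one", "two", "three", "four", "five")
    count = 0
    while n != ():
        n = n[0]
        count += 1
    return names[count]

three = succ(succ(succ(zero)))               # SSS0
print(to_name(three))                        # three

# Counting a set: call out the next successor as each element goes in.
number = zero
for element in ("apple", "pear", "plum"):
    number = succ(number)
print(to_name(number))                       # three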

    Axiom #5, the axiom of induction, has two forms:

    a. if {K} is a set such that zero is an element of {K} and for every natural number N, if N is an element of {K}, then SN is an element of {K}, then {K} contains every natural number; or

    b. if F is a unary predicate such that F(0) is true and for every natural number N, if F(N) is true, then F(SN) is true, then F(N) is true for every natural number N.

    We accept the validity of those statements because the natural numbers provide nothing more than indices on mathematical objects and the action of those indices does not change for any of them; therefore, the process behind any function of the real numbers does not change for any natural number. Note that a mathematical function describes an arithmetic procedure that simply replaces one number with another number.

    In order to define the primary operations for combining numbers and to extend the set of things that we call numbers, we need to define a new function. We define the predecessor function as the inverse of the successor function: for any natural number N we have PSN=SPN=N.

    We start with addition. If we have natural numbers A, B, and C, then we define the function of addition by saying that we can calculate the sum of A and B (C=A+B) by applying the successor function to A every time we apply the predecessor function to B and by applying the predecessor function to B until it comes to zero. That procedure gives us the same result that we get from arraying the natural numbers on a line in their proper order, starting at A, and then counting farther up the line by B elements. We get the same result if we start at B and then count further up the line by A elements, which means that we can also calculate the sum by applying the successor function to B every time we apply the predecessor function to A and then applying the predecessor function to A until it comes to zero. Thus we know that addition obeys the commutative rule, which means that A+B=B+A. We can use addition to determine the number of elements in a set that we create by combining the elements of multiple mutually-exclusive sets.
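
    The following sketch (again my own toy Python, reusing the wrapped-tuple representation) carries out addition exactly as described: apply the successor to A each time the predecessor is applied to B, until B comes to zero.

# A sketch of addition in the toy representation: apply the successor
# to A each time the predecessor is applied to B, until B reaches zero.
zero = ()

def succ(n):
    return (n,)

def pred(n):
    return n[0]              # has no meaning at zero, as Axiom #3 demands

def add(a, b):
    while b != zero:
        a, b = succ(a), pred(b)
    return a

def to_int(n):               # convenience only: count the layers of wrapping
    count = 0
    while n != zero:
        n, count = pred(n), count + 1
    return count

two = succ(succ(zero))
three = succ(succ(succ(zero)))
print(to_int(add(two, three)), to_int(add(three, two)))   # 5 5, so A+B = B+A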

    Addition gives us a synthetic process, so its inverse gives us an analytic process that we call subtraction. We define the process of subtraction by saying that we can calculate the difference between A and B (C=A-B) by applying the predecessor function to A every time we apply the predecessor function to B and then applying the predecessor function to B until it comes to zero. That procedure gives the same result as what we would get if we started at the A-th element on the number line and counted down the line (toward zero) by B consecutive elements.

    With the sole exception of A=B, the interchange of A and B in a subtraction does not yield the same result: subtraction does not give us a commutative process. But that interchange of A and B will give us something else. We have tacitly assumed that A represents a number greater than B (which means that there exists a non-zero natural number E such that A=B+E), so what do we get if we try to subtract A from B? To answer that question we must determine the meaning of a predecessor of zero.

    The third Peano axiom ensures that no predecessor of zero can exist in the set of the natural numbers. So the process of subtraction requires the existence of some kind of unnatural number. We know that the predecessor of zero exists, because we can apply the successor function to it and get back to zero. We know that adding the M-th predecessor of zero to the N-th successor of zero gives us a smaller successor of zero (if M<N) or a smaller predecessor of zero (if M>N), so we interpret the predecessors of zero as representing a kind of debt that we can pay off by adding successors of zero, natural numbers, to it. In essence, the predecessors of zero, in the operation of addition, negate the natural numbers, so we call them negative numbers and, in contrast, call the natural numbers positive. So now we put the infinite set of the negative numbers together with the infinite set of the positive numbers to create the set of the integers (whole numbers).

    If we have integers A, B, and C, we define the operation of multiplication by saying that we can calculate the product of A and B (C=AB) by starting with zero, applying the process of addition to A every time we apply the predecessor function to B, and then applying the predecessor function to B until it comes to zero. If A represents a negative integer, we apply the process of subtraction to it every time we apply the predecessor function to B and we get a negative number as the product. We get the same result if we interchange A and B in those descriptions, so now we know that multiplication gives us a commutative function, which means that AB=BA.
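
    A corresponding sketch of multiplication (plain Python this time, with +1 and -1 standing in for the successor and predecessor functions) starts from zero and adds A each time the predecessor is applied to B.

# Multiplication as repeated addition, with +1 and -1 standing in for
# the successor and predecessor functions.
def add(a, b):
    while b > 0:
        a, b = a + 1, b - 1          # successor of a, predecessor of b
    return a

def multiply(a, b):
    product = 0
    while b > 0:
        product, b = add(product, a), b - 1
    return product

print(multiply(4, 6), multiply(6, 4))    # 24 24, so AB = BA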

    We can apply the processes of addition and multiplication sequentially to obtain a number G=(A+B)C. Because we can represent any number as a sum of two numbers and because multiplication consists of repeated additions, we can separate any number into two numbers, multiply those numbers by some other number and add the resulting products to get the same number that we get from adding the two numbers first and then carrying out the multiplication on their sum: AC+BC=(A+B)C, in which the parentheses tell us to carry out the operations enclosed within them before we carry out any other operations, thereby removing potential ambiguity in a calculation. Thus we know that addition and multiplication, taken together, conform to a distributive law. That fact gives us an important result.

    We know that the product of two positive numbers gives us a positive number and that the product of a positive number and a negative number gives us a negative number. Now we want to determine what the product of two negative numbers gives us. To make that determination we represent zero as the sum of a positive number and a negative number: 0 = posA + negA. If we multiply that equality by some other negative number, we still get zero: 0 = (posA + negA)·negB = posA·negB + negA·negB. We know that the partial product posA·negB gives us a negative number, so negA·negB, the product of two negative numbers, must give us a positive number.

    Before I define the analytic inverse of multiplication, I want to consider the next synthetic process – exponentiation. Just as we define the B-fold multiple of A as the addition of B terms all equal to A, so we define the B-th power of A as the multiplication of B factors all equal to A (which we represent by C=A^B). More technically we define exponentiation by saying that we start with the number one, apply the process of multiplication to A every time we apply the predecessor function to B, and then apply the predecessor function to B until it comes to zero.
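
    And a matching sketch of exponentiation (same conventions as the previous sketch) starts from one and multiplies by A each time the predecessor is applied to B.

# Exponentiation as repeated multiplication: start from one and multiply
# by A each time the predecessor is applied to B.
def power(a, b):
    result = 1
    while b > 0:
        result, b = result * a, b - 1
    return result

print(power(2, 10))    # 1024: ten factors of 2 multiplied together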

    The processes of addition, multiplication, and exponentiation give us the means to represent numbers in a convenient way that gives us the ability to carry out calculations easily. We select a number (that we represent with K) and call it the base of the system. We devise symbols D_i to represent the numbers from zero up to but not including the base (the set {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} if we make ten our base) and call them the digits of the number system. We can then represent any number N as a sum:

N = D_0 + D_1·K + D_2·K^2 + D_3·K^3 + ....

Because we know the pattern we can leave the powers of K implicit and write down only the digits in what we call place-value notation. And instead of the left-to-right pattern shown above, which reflects the way that Indo-European languages are written, we write the digits from right to left, reflecting the pattern’s origin in the way that Semitic languages get written down, so we have

N = ...D_3D_2D_1D_0.
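
    A short Python sketch (with a hypothetical function name) recovers the digits of a number in an arbitrary base, which amounts to reading off the D_i in the sum above.

# A sketch that recovers the digits D_0, D_1, D_2, ... of N in base K,
# so that N = D_0 + D_1*K + D_2*K^2 + ...
def digits(n, base=10):
    ds = []
    while n > 0:
        ds.append(n % base)          # D_0 first, then D_1, and so on
        n //= base
    return ds or [0]

print(digits(1984))                  # [4, 8, 9, 1], written ...D_3D_2D_1D_0 as 1984
print(digits(1984, base=2))          # the same number written in base two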

    Now, as we have multiplication as a synthetic arithmetic process, so we have its inverse, division, as the corresponding analytic process. We define the process of dividing a number A by a number B as follows: let C=0; every time we subtract B from A we apply the successor function to C; we subtract B from A until we reduce A to zero; and we call the final value of C the quotient.
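
    In Python the procedure looks like this (a sketch with hypothetical names, keeping the remainder as well as the quotient, since the remainder matters in the next step).

# Division by repeated subtraction: count how many times B can be taken
# out of A before what remains of A drops below B.
def divide(a, b):
    quotient = 0
    while a >= b:
        a, quotient = a - b, quotient + 1
    return quotient, a               # the quotient and the remainder R < B

print(divide(23, 7))                 # (3, 2), because 23 = 3*7 + 2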

    But suppose now that in carrying out the procedure our multiple subtractions of B from A reduce A to some number R less than B but greater than zero (or that we started with A less than B). We do not allow our multiple subtractions to take the reduced A into the realm of negative numbers, because that would give the division an infinite quotient, which we dismiss as absurd. Instead, as we did with subtraction, we must invent a new kind of number. We call those particular novelties broken numbers (i.e. fractions) because in order to describe them properly we must break the remainder R into B parts. We can represent that ordered pair as R/B, the classical fraction, but we also want to represent fractions in something resembling the Ghobar form of numbers.

    We define the process of generating a decimal fraction as follows: given integers R and B, with R<B, we apply a modified version of the process of division by repeated subtraction. We establish C’=0 and make a mark (a decimal point in our ten-based system) on the right side of the zero. We generate R’ by multiplying R by the base of our number system, apply repeated subtraction of B to R’, keeping count of the number of times we can subtract B from R’, and write the digit from the count into the space on the right side of the decimal point. If we have a remainder, we multiply it by the base to generate R’’, subtract B from R’’ while keeping count of the number of times we do so, and put the number of that count in the empty space on the right side of the first place to the right of the decimal point. We continue that process until the remainder goes to zero or, if it never does, until we have written as many digits as we need (in that case the digits eventually repeat). In that way we create the decimal fraction

C’ = 0.D_{-1}D_{-2}D_{-3}D_{-4}...,

a number that lies between zero and one.
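
    A sketch of that procedure in Python (hypothetical names again) produces the digits to the right of the decimal point; note that for a fraction like 1/3 the remainder never reaches zero and the digits repeat without end.

# Generating a decimal fraction from a remainder R and a divisor B:
# multiply by the base, subtract B repeatedly, and record the count as
# the next digit to the right of the decimal point.
def decimal_digits(r, b, base=10, places=8):
    out = []
    for _ in range(places):
        r *= base
        digit = 0
        while r >= b:                # repeated subtraction, counting as we go
            r, digit = r - b, digit + 1
        out.append(digit)
        if r == 0:
            break
    return out

print(decimal_digits(3, 8))          # [3, 7, 5], i.e. 0.375
print(decimal_digits(1, 3))          # [3, 3, 3, 3, 3, 3, 3, 3], a repeating decimal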

    The elements of the set of the decimal fractions all obey the rules of addition and subtraction and, by extension, multiplication, division, exponentiation, and the extraction of roots. When we combine that set with the set of the integers we create the set of the real numbers, the set that completely fills the number line, enabling us to calculate a precise value for anything. And with the decimal fractions we also extend the process of exponentiation into the realm of negative-number exponents.

    Finally we have the inverse of exponentiation, the analytic process of the extraction of roots. Given any real number A we define the n-th root of A as the number that, when raised to the n-th power, equals A. Unfortunately we have no easy, straightforward way in which we can calculate roots. If we recognize that the root of a number corresponds to the number raised to a fractional power, we can apply the binomial series that Isaac Newton discovered in 1676 to work out an approximation of the root’s value.
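
    As a hedged sketch of that approach (in Python, truncating the series after a fixed number of terms), we can approximate an n-th root by summing the binomial series for (1 + x) raised to the power 1/n.

# A sketch of the binomial-series approach: for |x| < 1 we have
# (1 + x)^p = 1 + p*x + p*(p - 1)/2! * x^2 + ..., so the fractional
# power p = 1/n gives an approximation to the n-th root of (1 + x).
def binomial_series(x, p, terms=20):
    total, coefficient = 0.0, 1.0
    for k in range(terms):
        total += coefficient * x**k
        coefficient *= (p - k) / (k + 1)   # the next generalized binomial coefficient
    return total

# the cube root of 1.3, treated as (1 + 0.3) raised to the power 1/3
print(binomial_series(0.3, 1.0 / 3.0))     # approximately 1.0914
print(1.3 ** (1.0 / 3.0))                  # the same value, for comparison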

    As with subtraction and division, the process of extracting roots necessitates the invention of a new kind of number. We know that we can extract any root of a positive real number and we know that we can determine any odd-numbered root (e.g. the third, the fifth, etc.) of a negative number, but none of the numbers in the set of the real numbers yields a negative number when raised to an even-numbered power, so we do not have the numbers necessary to express the even-numbered roots of negative numbers. Thus we must assert the existence of imaginary numbers, whose even-numbered powers do yield negative numbers. We can prove and verify easily enough that the imaginary numbers so defined participate in the six processes of arithmetic just as the real numbers do, so the imaginary numbers do, indeed, give us a kind of number.

    Imaginary numbers coordinate with the real numbers in the same way in which the negative integers coordinate with the positive integers, so we can combine the two sets to form the infinite set of the complex numbers. The fundamental theorem of algebra tells us that the six basic processes of arithmetic and the set of the complex numbers comprise a complete, closed system of calculation. Thus this system of arithmetic gives us the means to describe any phenomenon or to carry out any calculation that falls within the realm of physics. It gives us what Newton’s zeroth law requires.

Appendix II: A Classic Essay

    In 1960 Professor Eugene Wigner wrote an essay in which he expressed physicists’ continuing astonishment at the fact that mathematics and mathematical logic give an especially accurate accounting of the regularities that scientists find in the real world. In essence it represents a paean to Newton’s zeroth law.

The Unreasonable Effectiveness of Mathematics in the Natural Sciences

by

Eugene Wigner

    There is a story about two friends, who were classmates in high school, talking about their jobs. One of them became a statistician and was working on population trends. He showed a reprint to his former classmate. The reprint started, as usual, with the Gaussian distribution and the statistician explained to his former classmate the meaning of the symbols for the actual population, for the average population, and so on. His classmate was a bit incredulous and was not quite sure whether the statistician was pulling his leg. "How can you know that?" was his query. "And what is this symbol here?" "Oh," said the statistician, "this is pi." "What is that?" "The ratio of the circumference of the circle to its diameter." "Well, now you are pushing your joke too far," said the classmate, "surely the population has nothing to do with the circumference of the circle."

    Naturally, we are inclined to smile about the simplicity of the classmate’s approach. Nevertheless, when I heard this story, I had to admit to an eerie feeling because, surely, the reaction of the classmate betrayed only plain common sense. I was even more confused when, not many days later, someone came to me and expressed his bewilderment with the fact that we make a rather narrow selection when choosing the data on which we test our theories. "How do we know that, if we made a theory which focuses its attention on phenomena we disregard and disregards some of the phenomena now commanding our attention, that we could not build another theory which has little in common with the present one but which, nevertheless, explains just as many phenomena as the present theory?" It has to be admitted that we have no definite evidence that there is no such theory.

    The preceding two stories illustrate the two main points which are the subjects of the present discourse. The first point is that mathematical concepts turn up in entirely unexpected connections. Moreover, they often permit an unexpectedly close and accurate description of the phenomena in these connections. Secondly, just because of this circumstance, and because we do not understand the reasons of their usefulness, we cannot know whether a theory formulated in terms of mathematical concepts is uniquely appropriate. We are in a position similar to that of a man who was provided with a bunch of keys and who, having to open several doors in succession, always hit on the right key on the first or second trial. He became skeptical concerning the uniqueness of the coordination between the keys and doors. Most of what will be said on these questions will not be new; it has probably occurred to most scientists in one form or another. My principal aim is to illuminate it from several sides. The first point is that the enormous usefulness of mathematics in the natural sciences is something bordering on the mysterious and that there is no rational explanation for it. Second, it is just this uncanny usefulness of mathematical concepts that raises the question of the uniqueness of our physical theories. In order to establish the first point, that mathematics plays an unreasonably important role in physics, it will be useful to say a few words on the question, "What is mathematics?", then, "What is physics?", then, how mathematics enters physical theories, and last, why the success of mathematics in its role in physics appears so baffling. Much less will be said on the second point: the uniqueness of the theories of physics. A proper answer to this question would require elaborate experimental and theoretical work which has not been undertaken to date.

WHAT IS MATHEMATICS?

    Somebody once said that philosophy is the misuse of a terminology which was invented just for this purpose. In the same vein, I would say that mathematics is the science of skillful operations with concepts and rules invented just for this purpose. The principle emphasis is on the invention of concepts. Mathematics would soon run out of interesting theorems if these had to be formulated in terms of the concepts which already appear in the axioms. Furthermore, whereas it is unquestionably true that the concepts of elementary mathematics and particularly elementary geometry were formulated to describe entities which are directly suggested by the actual world, the same does not seem to be true of the more advanced concepts, in particular the concepts which play such an important role in physics. Thus, the rules for operations with pairs of numbers are obviously designed to give the same results as the operations with fractions which we first learned without reference to "pairs of numbers." The rules for the operations with sequences, that is, with irrational numbers, still belong to the category of rules which were determined so as to reproduce rules for the operations with quantities which were already known to us. Most more advanced mathematical concepts, such as complex numbers, algebras, linear operators, Borel sets – and this list could be continued almost indefinitely – were so devised that they are apt subjects on which the mathematician can demonstrate his ingenuity and sense of formal beauty. In fact, the definition of these concepts, with a realization that interesting and ingenious considerations could be applied to them, is the first demonstration of the ingeniousness of the mathematician who defines them. The depth of thought which goes into the formulation of the mathematical concepts is later justified by the skill with which these concepts are used. The great mathematician fully, almost ruthlessly, exploits the domain of permissible reasoning and skirts the impermissible. That his recklessness does not lead him into a morass of contradictions is a miracle in itself: certainly it is hard to believe that our reasoning power was brought, by Darwin’s process of natural selection, to the perfection which it seems to possess. However, this is not our present subject. The principal point which will have to be recalled later is that the mathematician could formulate only a handful of interesting theorems without defining concepts beyond those contained in the axioms and that the concepts outside those contained in the axioms are defined with a view of permitting ingenious logical operations which appeal to our aesthetic sense both as operations and also in their results of great generality and simplicity.

    The complex numbers provide a particularly striking example for the foregoing. Certainly, nothing in our experience suggests the introduction of these quantities. Indeed, if a mathematician is asked to justify his interest in complex numbers, he will point, with some indignation, to the many beautiful theorems in the theory of equations, of power series, and of analytic functions in general, which owe their origin to the introduction of complex numbers. The mathematician is not willing to give up his interest in these most beautiful accomplishments of his genius.

WHAT IS PHYSICS?

    The physicist is interested in discovering the laws of inanimate nature. In order to understand this statement, it is necessary to analyze the concept, "law of nature."

    The world around us is of baffling complexity and the most obvious fact about it is that we cannot predict the future. Although the joke attributes only to the optimist the view that the future is uncertain, the optimist is right in this case: the future is unpredictable. It is, as Schrödinger has remarked, a miracle that in spite of the baffling complexity of the world, certain regularities in the events could be discovered. One such regularity, discovered by Galileo, is that two rocks, dropped at the same time from the same height, reach the ground at the same time. The laws of nature are concerned with such regularities. Galileo’s regularity is a prototype of a large class of regularities. It is a surprising regularity for three reasons.

    The first reason that it is surprising is that it is true not only in Pisa, and in Galileo’s time, it is true everywhere on the Earth, was always true, and will always be true. This property of the regularity is a recognized invariance property and, as I had occasion to point out some time ago, without invariance principles similar to those implied in the preceding generalization of Galileo’s observation, physics would not be possible. The second surprising feature is that the regularity which we are discussing is independent of so many conditions which could have an effect on it. It is valid no matter whether it rains or not, whether the experiment is carried out in a room or from the Leaning Tower, no matter whether the person who drops the rocks is a man or a woman. It is valid even if the two rocks are dropped, simultaneously and from the same height, by two different people. There are, obviously, innumerable other conditions which are all immaterial from the point of view of the validity of Galileo’s regularity. The irrelevancy of so many circumstances which could play a role in the phenomenon observed has also been called an invariance. However, this invariance is of a different character from the preceding one since it cannot be formulated as a general principle. The exploration of the conditions which do, and which do not, influence a phenomenon is part of the early experimental exploration of a field. It is the skill and ingenuity of the experimenter which show him phenomena which depend on a relatively narrow set of relatively easily recognizable and reproducible conditions. In the present case, Galileo’s restriction of his observations to relatively heavy bodies was the most important step in this regard. Again, it is true that if there were no phenomena which are independent of all but a manageably small set of conditions, physics would be impossible.

    The preceding two points, though highly significant from the point of view of the philosopher, are not the ones which surprised Galileo most, nor do they contain a specific law of nature. The law of nature is contained in the statement that the length of time which it takes for a heavy object to fall from a given height is independent of the size, material, and shape of the body which drops. In the framework of Newton’s second "law," this amounts to the statement that the gravitational force which acts on the falling body is proportional to its mass but independent of the size, material, and shape of the body which falls.

    The preceding discussion is intended to remind us, first, that it is not at all natural that "laws of nature" exist, much less that man is able to discover them. The present writer had occasion, some time ago, to call attention to the succession of layers of "laws of nature," each layer containing more general and more encompassing laws than the previous one and its discovery constituting a deeper penetration into the structure of the universe than the layers recognized before. However, the point which is most significant in the present context is that all these laws of nature contain, in even their remotest consequences, only a small part of our knowledge of the inanimate world. All the laws of nature are conditional statements which permit a prediction of some future events on the basis of the knowledge of the present, except that some aspects of the present state of the world, in practice the overwhelming majority of the determinants of the present state of the world, are irrelevant from the point of view of the prediction. The irrelevancy is meant in the sense of the second point in the discussion of Galileo’s theorem.

    As regards the present state of the world, such as the existence of the earth on which we live and on which Galileo’s experiments were performed, the existence of the sun and of all our surroundings, the laws of nature are entirely silent. It is in consonance with this, first, that the laws of nature can be used to predict future events only under exceptional circumstances – when all the relevant determinants of the present state of the world are known. It is also in consonance with this that the construction of machines, the functioning of which he can foresee, constitutes the most spectacular accomplishment of the physicist. In these machines, the physicist creates a situation in which all the relevant coordinates are known so that the behavior of the machine can be predicted. Radars and nuclear reactors are examples of such machines.

    The principal purpose of the preceding discussion is to point out that the laws of nature are all conditional statements and they relate only to a very small part of our knowledge of the world. Thus, classical mechanics, which is the best known prototype of a physical theory, gives the second derivatives of the positional coordinates of all bodies, on the basis of the knowledge of the positions, etc., of these bodies. It gives no information on the existence, the present positions, or velocities of these bodies. It should be mentioned, for the sake of accuracy, that we discovered about thirty years ago that even the conditional statements cannot be entirely precise: that the conditional statements are probability laws which enable us only to place intelligent bets on future properties of the inanimate world, based on the knowledge of the present state. They do not allow us to make categorical statements, not even categorical statements conditional on the present state of the world. The probabilistic nature of the "laws of nature" manifests itself in the case of machines also, and can be verified, at least in the case of nuclear reactors, if one runs them at very low power. However, the additional limitation of the scope of the laws of nature which follows from their probabilistic nature will play no role in the rest of the discussion.

THE ROLE OF MATHEMATICS IN PHYSICAL THEORIES

    Having refreshed our minds as to the essence of mathematics and physics, we should be in a better position to review the role of mathematics in physical theories.

    Naturally, we do use mathematics in everyday physics to evaluate the results of the laws of nature, to apply the conditional statements to the particular conditions which happen to prevail or happen to interest us. In order that this be possible, the laws of nature must already be formulated in mathematical language. However, the role of evaluating the consequences of already established theories is not the most important role of mathematics in physics. Mathematics, or, rather, applied mathematics, is not so much the master of the situation in this function: it is merely serving as a tool. Mathematics does play, however, also a more sovereign role in physics. This was already implied in the statement, made when discussing the role of applied mathematics, that the laws of nature must have been formulated in the language of mathematics to be an object for the use of applied mathematics. The statement that the laws of nature are written in the language of mathematics was properly made three hundred years ago; it is now more true than ever before. In order to show the importance which mathematical concepts possess in the formulation of the laws of physics, let us recall, as an example, the axioms of quantum mechanics as formulated, explicitly, by the great physicist, Dirac. There are two basic concepts in quantum mechanics: states and observables. The states are vectors in Hilbert space, the observables self-adjoint operators on these vectors. The possible values of the observations are the characteristic values of the operators – but we had better stop here lest we engage in a listing of the mathematical concepts developed in the theory of linear operators.

    It is true, of course, that physics chooses certain mathematical concepts for the formulation of the laws of nature, and surely only a fraction of all mathematical concepts is used in physics. It is true also that the concepts which were chosen were not selected arbitrarily from a listing of mathematical terms but were developed, in many if not most cases, independently by the physicist and recognized then as having been conceived before by the mathematician. It is not true, however, as is so often stated, that this had to happen because mathematics uses the simplest possible concepts and these were bound to occur in any formalism. As we saw before, the concepts of mathematics are not chosen for their conceptual simplicity (even sequences of pairs of numbers are far from being the simplest concepts) but for their amenability to clever manipulations and to striking, brilliant arguments. Let us not forget that the Hilbert space of quantum mechanics is the complex Hilbert space, with a Hermitean scalar product. Surely to the unpreoccupied mind, complex numbers are far from natural or simple and they cannot be suggested by physical observations. Furthermore, the use of complex numbers is in this case not a calculational trick of applied mathematics but comes close to being a necessity in the formulation of the laws of quantum mechanics. Finally, it now begins to appear that not only complex numbers but so-called analytic functions are destined to play a decisive role in the formulation of quantum theory. I am referring to the rapidly developing theory of dispersion relations.

    It is difficult to avoid the impression that a miracle confronts us here, quite comparable in its striking nature to the miracle that the human mind can string a thousand arguments together without getting itself into contradictions, or to the two miracles of the existence of laws of nature and of the human mind’s capacity to divine them. The observation which comes closest to an explanation for the mathematical concepts’ cropping up in physics which I know is Einstein’s statement that the only physical theories which we are willing to accept are the beautiful ones. It stands to argue that the concepts of mathematics, which invite the exercise of so much wit, have the quality of beauty. However, Einstein’s observation can at best explain properties of theories which we are willing to believe and has no reference to the intrinsic accuracy of the theory. We shall, therefore, turn to this latter question.

IS THE SUCCESS OF PHYSICAL THEORIES TRULY SURPRISING?

    A possible explanation of the physicist’s use of mathematics to formulate his laws of nature is that he is a somewhat irresponsible person. As a result, when he finds a connection between two quantities which resembles a connection well-known from mathematics, he will jump at the conclusion that the connection is that discussed in mathematics simply because he does not know of any other similar connection. It is not the intention of the present discussion to refute the charge that the physicist is a somewhat irresponsible person. Perhaps he is. However, it is important to point out that the mathematical formulation of the physicist’s often crude experience leads in an uncanny number of cases to an amazingly accurate description of a large class of phenomena. This shows that the mathematical language has more to commend it than being the only language which we can speak; it shows that it is, in a very real sense, the correct language. Let us consider a few examples.

    The first example is the oft-quoted one of planetary motion. The laws of falling bodies became rather well established as a result of experiments carried out principally in Italy. These experiments could not be very accurate in the sense in which we understand accuracy today partly because of the effect of air resistance and partly because of the impossibility, at that time, to measure short time intervals. Nevertheless, it is not surprising that, as a result of their studies, the Italian natural scientists acquired a familiarity with the ways in which objects travel through the atmosphere. It was Newton who then brought the law of freely falling objects into relation with the motion of the moon, noted that the parabola of the thrown rock’s path on the earth and the circle of the moon’s path in the sky are particular cases of the same mathematical object of an ellipse, and postulated the universal law of gravitation on the basis of a single, and at that time very approximate, numerical coincidence. Philosophically, the law of gravitation as formulated by Newton was repugnant to his time and to himself. Empirically, it was based on very scanty observations. The mathematical language in which it was formulated contained the concept of a second derivative and those of us who have tried to draw an osculating circle to a curve know that the second derivative is not a very immediate concept. The law of gravity which Newton reluctantly established and which he could verify with an accuracy of about 4% has proved to be accurate to less than a ten thousandth of a per cent and became so closely associated with the idea of absolute accuracy that only recently did physicists become again bold enough to inquire into the limitations of its accuracy. Certainly, the example of Newton’s law, quoted over and over again, must be mentioned first as a monumental example of a law, formulated in terms which appear simple to the mathematician, which has proved accurate beyond all reasonable expectations. Let us just recapitulate our thesis on this example: first, the law, particularly since a second derivative appears in it, is simple only to the mathematician, not to common sense or to non-mathematically-minded freshmen; second, it is a conditional law of very limited scope. It explains nothing about the earth which attracts Galileo’s rocks, or about the circular form of the moon’s orbit, or about the planets of the sun. The explanation of these initial conditions is left to the geologist and the astronomer, and they have a hard time with them.

    The second example is that of ordinary, elementary quantum mechanics. This originated when Max Born noticed that some rules of computation, given by Heisenberg, were formally identical with the rules of computation with matrices, established a long time before by mathematicians. Born, Jordan, and Heisenberg then proposed to replace by matrices the position and momentum variables of the equations of classical mechanics. They applied the rules of matrix mechanics to a few highly idealized problems and the results were quite satisfactory. However, there was, at that time, no rational evidence that their matrix mechanics would prove correct under more realistic conditions. Indeed, they say "if the mechanics as here proposed should already be correct in its essential traits." As a matter of fact, the first application of their mechanics to a realistic problem, that of the hydrogen atom, was given several months later, by Pauli. This application gave results in agreement with experience. This was satisfactory but still understandable because Heisenberg’s rules of calculation were abstracted from problems which included the old theory of the hydrogen atom. The miracle occurred only when matrix mechanics, or a mathematically equivalent theory, was applied to problems for which Heisenberg’s calculating rules were meaningless. Heisenberg’s rules presupposed that the classical equations of motion had solutions with certain periodicity properties; and the equations of motion of the two electrons of the helium atom, or of the even greater number of electrons of heavier atoms, simply do not have these properties, so that Heisenberg’s rules cannot be applied to these cases. Nevertheless, the calculation of the lowest energy level of helium, as carried out a few months ago by Kinoshita at Cornell and by Bazley at the Bureau of Standards, agrees with the experimental data within the accuracy of the observations, which is one part in ten million. Surely in this case we "got something out" of the equations that we did not put in.

    The same is true of the qualitative characteristics of the "complex spectra," that is, the spectra of heavier atoms. I wish to recall a conversation with Jordan, who told me, when the qualitative features of the spectra were derived, that a disagreement of the rules derived from quantum mechanical theory and the rules established by empirical research would have provided the last opportunity to make a change in the framework of matrix mechanics. In other words, Jordan felt that we would have been, at least temporarily, helpless had an unexpected disagreement occurred in the theory of the helium atom. This was, at that time, developed by Kellner and by Hilleraas. The mathematical formalism was too dear and unchangeable so that, had the miracle of helium which was mentioned before not occurred, a true crisis would have arisen. Surely, physics would have overcome that crisis in one way or another. It is true, on the other hand, that physics as we know it today would not be possible without a constant recurrence of miracles similar to the one of the helium atom, which is perhaps the most striking miracle that has occurred in the course of the development of elementary quantum mechanics, but by far not the only one. In fact, the number of analogous miracles is limited, in our view, only by our willingness to go after more similar ones. Quantum mechanics had, nevertheless, many almost equally striking successes which gave us the firm conviction that it is, what we call, correct.

    The last example is that of quantum electrodynamics, or the theory of the Lamb shift. Whereas Newton’s theory of gravitation still had obvious connections with experience, experience entered the formulation of matrix mechanics only in the refined or sublimated form of Heisenberg’s prescriptions. The quantum theory of the Lamb shift, as conceived by Bethe and established by Schwinger, is a purely mathematical theory and the only direct contribution of experiment was to show the existence of a measurable effect. The agreement with calculation is better than one part in a thousand.

    The preceding three examples, which could be multiplied almost indefinitely, should illustrate the appropriateness and accuracy of the mathematical formulation of the laws of nature in terms of concepts chosen for their manipulability, the "laws of nature" being of almost fantastic accuracy but of strictly limited scope. I propose to refer to the observation which these examples illustrate as the empirical law of epistemology. Together with the laws of invariance of physical theories, it is an indispensable foundation of these theories. Without the laws of invariance the physical theories could have been given no foundation of fact; if the empirical law of epistemology were not correct, we would lack the encouragement and reassurance which are emotional necessities, without which the "laws of nature" could not have been successfully explored. Dr. R. G. Sachs, with whom I discussed the empirical law of epistemology, called it an article of faith of the theoretical physicist, and it is surely that. However, what he called our article of faith can be well supported by actual examples – many examples in addition to the three which have been mentioned.

THE UNIQUENESS OF THE THEORIES OF PHYSICS

    The empirical nature of the preceding observation seems to me to be self-evident. It surely is not a "necessity of thought" and it should not be necessary, in order to prove this, to point to the fact that it applies only to a very small part of our knowledge of the inanimate world. It is absurd to believe that the existence of mathematically simple expressions for the second derivative of the position is self-evident, when no similar expressions for the position itself or for the velocity exist. It is therefore surprising how readily the wonderful gift contained in the empirical law of epistemology was taken for granted. The ability of the human mind to form a string of 1000 conclusions and still remain "right," which was mentioned before, is a similar gift. Every empirical law has the disquieting quality that one does not know its limitations. We have seen that there are regularities in the events in the world around us which can be formulated in terms of mathematical concepts with an uncanny accuracy. There are, on the other hand, aspects of the world concerning which we do not believe in the existence of any accurate regularities. We call these initial conditions. The question which presents itself is whether the different regularities, that is, the various laws of nature which will be discovered, will fuse into a single consistent unit, or at least asymptotically approach such a fusion. Alternatively, it is possible that there always will be some laws of nature which have nothing in common with each other. At present, this is true, for instance, of the laws of heredity and of physics. It is even possible that some of the laws of nature will be in conflict with each other in their implications, but each convincing enough in its own domain so that we may not be willing to abandon any of them. We may resign ourselves to such a state of affairs or our interest in clearing up the conflict between the various theories may fade out. We may lose interest in the "ultimate truth," that is, in a picture which is a consistent fusion into a single unit of the little pictures, formed on the various aspects of nature.
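
    The remark about the second derivative is easy to make concrete. For a planet attracted by a mass M, the law of gravitation fixes only the acceleration,

        \frac{d^{2}\mathbf{r}}{dt^{2}} = -\,\frac{GM}{|\mathbf{r}|^{3}}\,\mathbf{r},

    where G is the gravitational constant; the position \mathbf{r}(t) itself, and likewise the velocity, depend on six arbitrary initial conditions and obey no comparably simple universal formula.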

    It may be useful to illustrate the alternatives by an example. We now have, in physics, two theories of great power and interest: the theory of quantum phenomena and the theory of relativity. These two theories have their roots in mutually exclusive groups of phenomena. Relativity theory applies to macroscopic bodies, such as stars. The event of coincidence, that is, in ultimate analysis, of collision, is the primitive event in the theory of relativity and defines a point in space-time, or at least would define a point if the colliding particles were infinitely small. Quantum theory has its roots in the microscopic world and, from its point of view, the event of coincidence, or of collision, even if it takes place between particles of no spatial extent, is not primitive and not at all sharply isolated in space-time. The two theories operate with different mathematical concepts – the four-dimensional Riemann space and the infinite-dimensional Hilbert space, respectively. So far, the two theories have not been united, that is, no mathematical formulation exists to which both of these theories are approximations. All physicists believe that a union of the two theories is inherently possible and that we shall find it. Nevertheless, it is possible also to imagine that no union of the two theories can be found. This example illustrates the two possibilities, of union and of conflict, mentioned before, both of which are conceivable.
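
    Schematically, the two mathematical settings can be set side by side. Relativity theory works with the line element of a four-dimensional Riemann space,

        ds^{2} = g_{\mu\nu}(x)\,dx^{\mu}\,dx^{\nu},

    where g_{\mu\nu} is the metric, while quantum theory works with a state vector in an infinite-dimensional Hilbert space, evolving according to

        i\hbar\,\frac{d}{dt}\,|\psi(t)\rangle = \hat{H}\,|\psi(t)\rangle,

    where \hat{H} is the Hamiltonian operator. No formulation is known from which both of these structures emerge as limiting cases.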

    In order to obtain an indication as to which alternative to expect ultimately, we can pretend to be a little more ignorant than we are and place ourselves at a lower level of knowledge than we actually possess. If we can find a fusion of our theories on this lower level of intelligence, we can confidently expect that we will find a fusion of our theories also at our real level of intelligence. On the other hand, if we were to arrive at mutually contradictory theories at a somewhat lower level of knowledge, the possibility of the permanence of conflicting theories cannot be excluded for ourselves either. The level of knowledge and ingenuity is a continuous variable and it is unlikely that a relatively small variation of this continuous variable changes the attainable picture of the world from inconsistent to consistent. Considered from this point of view, the fact that some of the theories which we know to be false give such amazingly accurate results is an adverse factor. Had we somewhat less knowledge, the group of phenomena which these "false" theories explain would appear to us to be large enough to "prove" these theories. However, these theories are considered to be "false" by us just for the reason that they are, in ultimate analysis, incompatible with more encompassing pictures and, if sufficiently many such false theories are discovered, they are bound to prove to be in conflict with each other as well. Similarly, it is possible that the theories which we consider to be "proved" by a number of numerical agreements which appears to be large enough for us are false, because they are in conflict with a possible, more encompassing theory which is beyond our means of discovery. If this were true, we would have to expect conflicts between our theories as soon as their number grows beyond a certain point and as soon as they cover a sufficiently large number of groups of phenomena. In contrast to the article of faith of the theoretical physicist mentioned before, this is the nightmare of the theorist.

    Let us consider a few examples of "false" theories which give, in view of their falseness, alarmingly accurate descriptions of groups of phenomena. With some goodwill, one can dismiss some of the evidence which these examples provide. The success of Bohr’s early and pioneering ideas on the atom was always a rather narrow one and the same applies to Ptolemy’s epicycles. Our present vantage point gives an accurate description of all phenomena which these more primitive theories can describe. The same is not true any longer of the so-called free-electron theory, which gives a marvelously accurate picture of many, if not most, properties of metals, semiconductors, and insulators. In particular, it explains the fact, never properly understood on the basis of the "real theory," that insulators show a specific resistance to electricity which may be 10^26 times greater than that of metals. In fact, there is no experimental evidence to show that the resistance is not infinite under the conditions under which the free-electron theory would lead us to expect an infinite resistance. Nevertheless, we are convinced that the free-electron theory is a crude approximation which should be replaced, in the description of all phenomena concerning solids, by a more accurate picture.
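
    The figure of 10^26 is no exaggeration. Taking typical handbook values, chosen here only for illustration, a metal such as copper has a resistivity of about

        \rho_{\text{Cu}} \approx 1.7 \times 10^{-8}\ \Omega\,\text{m},

    while a good insulator such as fused quartz reaches roughly 10^{16} to 10^{18}\ \Omega\,\text{m}, so that the ratio is of the order of 10^{24} to 10^{26}.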

    If viewed from our real vantage point, the situation presented by the free-electron theory is irritating but is not likely to forebode any inconsistencies which are insurmountable for us. The free-electron theory raises doubts as to how much we should trust numerical agreement between theory and experiment as evidence for the correctness of the theory. We are used to such doubts.

    A much more difficult and confusing situation would arise if we could, some day, establish a theory of consciousness, or of biology, which would be as coherent and convincing as our present theories of the inanimate world. Mendel’s laws of inheritance and the subsequent work on genes may well form the beginning of such a theory as far as biology is concerned. Furthermore, it is quite possible that an abstract argument can be found which shows that there is a conflict between such a theory and the accepted principles of physics. The argument could be of such abstract nature that it might not be possible to resolve the conflict, in favor of one or of the other theory, by an experiment. Such a situation would put a heavy strain on our faith in our theories and on our belief in the reality of the concepts which we form. It would give us a deep sense of frustration in our search for what I called "the ultimate truth." The reason that such a situation is conceivable is that, fundamentally, we do not know why our theories work so well. Hence, their accuracy may not prove their truth and consistency. Indeed, it is this writer’s belief that something rather akin to the situation which was described above exists if the present laws of heredity and of physics are confronted.

    Let me end on a more cheerful note. The miracle of the appropriateness of the language of mathematics for the formulation of the laws of physics is a wonderful gift which we neither understand nor deserve. We should be grateful for it and hope that it will remain valid in future research and that it will extend, for better or for worse, to our pleasure, even though perhaps also to our bafflement, to wide branches of learning.

"The Unreasonable Effectiveness of Mathematics in the Natural Sciences," in Communications in Pure and Applied Mathematics, vol. 13, No. 1 (February 1960). New York: John Wiley & Sons, Inc. Copyright 1960 by John Wiley & Sons, Inc.

Back to Contents