Consider the differential equation
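Judging from the derivation that follows, the equation in question presumably reads

```latex
\frac{dF}{dx} = \alpha F
```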
in which α represents a constant and F represents some function of the variable x. If we differentiate that equation and make the obvious substitution, we get
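That is, differentiating once and replacing dF/dx with αF on the right, presumably,

```latex
\frac{d^2F}{dx^2} = \alpha\,\frac{dF}{dx} = \alpha^2 F
```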
If we repeat that process multiple times, we get the family of differential equations
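Presumably:

```latex
\frac{d^nF}{dx^n} = \alpha^n F
```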
We now want to determine the explicit form of F that satisfies those equations. Because we set no limit on the value of n, we know that F must be infinitely differentiable.
Multiply Equation 1 by dx and integrate it. Because we always have an implicit zero added to any function, the integration yields a function G plus a constant β; that is,
If we multiply that equation by α and make the appropriate substitution from Equation 1, we get
If we multiply that equation by dx and integrate it, we get
And if we multiply that equation by α, we get
Again, we multiply that equation by dx and integrate it to get
As we continue that process, we see that F must consist of an infinite series whose n-th term has the form
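Presumably:

```latex
\beta\,\frac{(\alpha x)^n}{n!}
```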
So we have, then
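Presumably, with β already set to one:

```latex
F(x) = \sum_{n=0}^{\infty} \frac{(\alpha x)^n}{n!}
```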
which I obtained by factoring out β and, because it has no impact on the functionality of the series, setting it equal to one. That equation clearly satisfies Equations 1, 2, and 3.
Now I want to explore the arithmetic properties of that series. For simplicity I set α=1 (which is the same as redefining the scale of x) and multiply the series by itself. We get
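With α = 1, multiplying the series by itself term by term gives, presumably,

```latex
F(x)^2 = \sum_{n=0}^{\infty}\sum_{m=0}^{\infty} \frac{x^{n+m}}{n!\,m!}
```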
But we can define a new variable j=n+m and sum the series over all values of j after summing over all values of n that contribute to each value of j; in other formulae, we have
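Presumably:

```latex
F(x)^2 = \sum_{j=0}^{\infty} \frac{x^j}{j!} \sum_{n=0}^{j} \frac{j!}{n!\,(j-n)!}
```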
In that expression the sum over n equals the binomial theorem's representation of the j-th power of 1+1; to wit, 2^j. So we have
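Presumably:

```latex
F(x)^2 = \sum_{j=0}^{\infty} \frac{(2x)^j}{j!} = F(2x)
```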
Next calculate the cube of F(x) as we calculated the square;
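Multiplying F(x)² = F(2x) by F(x) and repeating the rearrangement gives, presumably,

```latex
F(x)^3 = F(2x)\,F(x) = \sum_{j=0}^{\infty} \frac{(3x)^j}{j!} = F(3x)
```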
If we continue that procedure, we will find, as we calculate ever higher powers of F(x), that we have
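Presumably Equation 15 of the original numbering:

```latex
F(x)^N = F(Nx)
```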
true to mathematics. Equation 15 tells us that multiplying the argument of the function by some number N has the same effect as does raising the function to the N-th power. That means that we can represent the function as some number with the argument as its exponent;
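Presumably:

```latex
F(Nx) = e^{Nx}
```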
in which e represents Euler's number, which we calculate by setting Nx=1, so we have
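Setting Nx = 1 in the series gives, presumably,

```latex
e = F(1) = \sum_{n=0}^{\infty} \frac{1}{n!} = 2.71828\ldots
```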
That proposition must remain true to mathematics, even if we make the argument a complex number. If z=x+iy, then we have
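Presumably:

```latex
e^z = e^{x+iy} = e^x\,e^{iy}
```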
Because that exponential function solves a large family of differential equations, we expect some of those solutions to involve imaginary exponents. If we take Equation 1 in the form
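Presumably, with λ (a symbol assumed here) representing the decay constant:

```latex
\frac{dm}{dt} = -\lambda m
```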
we have a description of the radioactive decay of a mass m of some unstable substance and the solution of that equation,
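Presumably, with λ the decay constant:

```latex
m(t) = m_0\,e^{-\lambda t}
```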
describes the exponential diminution with the elapse of time of an initial quantity m0 of that material. If we take Equation 2 in the form
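Presumably:

```latex
m\,\frac{d^2x}{dt^2} = -kx
```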
we have a description of the acceleration in the x-direction that a well-anchored spring of stiffness k imposes upon a body of mass m attached to its free end and the solution of that equation,
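One solution of that form, with the amplitude symbol x0 assumed here, is presumably

```latex
x(t) = x_0\,e^{i\sqrt{k/m}\,t}
```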
describes the location of the body relative to the origin of the x-axis at any given time t.
We know that the imaginary exponential of Equation 22 represents an oscillation, but let's feign ignorance of that fact and see whether we can work out the alternative form of the exponential, which makes the oscillation explicit. In series form we have
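Presumably:

```latex
e^{i\theta} = \sum_{n=0}^{\infty} \frac{(i\theta)^n}{n!}
```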
We can split that series into its real and imaginary parts in accordance with the definition
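Presumably Equations 24 through 26 of the original numbering:

```latex
e^{i\theta} = g(\theta) + i\,h(\theta), \quad
g(\theta) = 1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots, \quad
h(\theta) = \theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots
```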
We also have from Equation 24
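Replacing θ with -θ (g contains only even powers of θ, h only odd ones) gives, presumably, Equation 27:

```latex
e^{-i\theta} = g(\theta) - i\,h(\theta)
```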
If we multiply Equation 24 by Equation 27, we get
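Presumably Equation 28:

```latex
e^{i\theta}\,e^{-i\theta} = 1 = g(\theta)^2 + h(\theta)^2
```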
And if we differentiate Equations 25 and 26, we get
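Differentiating the real and imaginary series term by term gives, presumably, Equations 29 and 30:

```latex
\frac{dg}{d\theta} = -h(\theta), \qquad \frac{dh}{d\theta} = g(\theta)
```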
We have tacitly assumed that θ represents a real number. Thus, the functions g(θ) and h(θ) also represent real numbers and their squares are, therefore, non-negative real numbers. Equation 28 then tells us that g(θ) and h(θ) can only have values that lie at or between +1 and -1. Equations 29 and 30 put the numerical values of the derivatives of those functions into the same range.
To see what those facts mean imagine looking down the θ-axis in the positive θ-direction in a three-dimensional coordinate grid. Hold the grid in your imagination in such a way that the x-axis runs horizontally with the positive x-direction to your right and the y-axis runs vertically with the positive y-direction pointing up. On that grid make x=g(θ) and y=h(θ); for each value of θ, then, we plot e^iθ as a point in the x-y plane, with the x-direction representing the real part and the y-direction the imaginary part of our exponential function. Equation 28 tells us that if we plot enough points, we will trace out a circle of unit radius centered on the origin of the x-y plane. Suppose we carry out such a plotting by increasing θ in small increments.
When θ=0, then x=+1, y=0, dx/dθ=dg/dθ=0, and dy/dθ=dh/dθ=+1. Now increase the value of θ in increments. We must plot the corresponding points ever higher in the y-direction and, slowly at first and then faster, further to the left in the x-direction until we have a point for which y=+1, x=0, dx/dθ=-1, and dy/dθ=0. As we continue to increase the value of θ we must plot the corresponding points ever farther to the left (in the negative x-direction) and, slowly at first and then faster, less far from the x-axis until we have reached a point for which y=0, x=-1, dx/dθ=0, and dy/dθ=-1. If we continue onward, the derivatives oblige us to plot subsequent points in the negative y-direction and, slowly at first and then faster, toward the y-axis until we reach a point for which y=-1, x=0, dx/dθ=+1, and dy/dθ=0. Thence, as the value of θ continues to increase, our set of plotted points comes back to the point with which we began and if we were to increase θ further, we would simply repeat that process, proceeding around the unit circle in the counterclockwise sense.
We want to calculate the distance between two of our plotted points that lie near each other on that unit circle. If we choose points that lie close enough to each other that we can use the differentials of their coordinates to describe the differences between the values of those coordinates manifest in the points, then we can represent the arc of the circle between the points as differing only negligibly from a straight line. Between the points, then, we have dg and dh as the sides of a right triangle whose hypotenuse nearly coincides with the arc between the points and we calculate the distance between the points in the Pythagorean way as
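Substituting dg = -h dθ and dh = g dθ from Equations 29 and 30, and using g² + h² = 1, gives, presumably,

```latex
ds = \sqrt{dg^2 + dh^2} = \sqrt{h^2 + g^2}\;d\theta = d\theta
```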
That fact remains true to mathematics regardless of where on the circle we locate our points.
Now we know that, through the functions g(θ) and h(θ), any division of the unit circle on the x-y plane into identically long segments also divides the θ-axis into identical intervals and vice versa. So now we know that θ represents an angle between the positive x-axis and a straight line drawn from the center of the unit circle to any point on its perimeter, the positive angle being measured in the counterclockwise sense on the circle. In this picture we can represent that straight line as the hypotenuse of a right triangle whose sides correspond to lines of length g(θ) and h(θ). Trigonometry, the branch of mathematics that involves right triangles drawn inside circles, has special names for those sides: we must call h(θ) the sine of θ and call g(θ) the cosine of θ. So now we have Equation 24 as
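Presumably Equation 32 in the original numbering:

```latex
e^{i\theta} = \cos\theta + i\sin\theta
```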
which is Euler's formula.
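As a quick numerical sketch (my own, not part of the original argument; the helper name exp_series is assumed), the partial sums of the exponential series do converge to cos θ + i sin θ when the argument is imaginary:

```python
import math

def exp_series(z, terms=30):
    """Partial sum of the exponential series: sum of z^n / n! for n < terms."""
    total = 0 + 0j
    term = 1 + 0j  # z^0 / 0!
    for n in range(terms):
        total += term
        term *= z / (n + 1)  # advance from z^n/n! to z^(n+1)/(n+1)!
    return total

theta = 0.75
series_value = exp_series(1j * theta)                 # the series for e^(i*theta)
euler_value = complex(math.cos(theta), math.sin(theta))  # cos(theta) + i*sin(theta)
print(abs(series_value - euler_value))  # a difference near machine precision
```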
Thus we have at least the outline of a proper deduction of Euler's formula. This differs from the usual proof and verification of Euler's theorem (which the formula expresses) in that it does not oblige us to know or to suspect the theorem before we start: an exploration of the meaning of the imaginary exponential leads us unerringly to the theorem, even if we don't suspect the destination.
Abraham de Moivre (1667 May 26 - 1754 Nov 27) wrote in 1698 that Isaac Newton (1643 Jan 04 - 1727 Mar 31) knew the formula that we call de Moivre's theorem,
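De Moivre's theorem:

```latex
(\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta
```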
as early as 1676. We can derive that equation for ourselves from Equation 32 and the exponential law, (e^iθ)^n = e^(inθ). Newton likely derived that result by applying his binomial theorem to the addition rules of trigonometry.
In 1714 Roger Cotes (1682 Jul 10 - 1716 Jun 05) asserted that
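Presumably Equation 34, in modern notation:

```latex
i\theta = \ln(\cos\theta + i\sin\theta)
```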
This equation doesn't look like anything that anyone could devise by pure guesswork, so we may well ask how Cotes devised it.
Through his work for Newton, he knew de Moivre's theorem and knew, thus, that Cis θ = Cos θ + i Sin θ constituted an interesting function in its own right. Its emergence from Newton's binomial theorem may have given it a special cachet as an object of further mathematical investigation. And, given that Cotes worked for Newton (editing the second edition of the Philosophiae Naturalis Principia Mathematica), we can guess that an obvious process for him to apply to that function involved finding the fluxion (what we call the derivative): using the notation devised by Leibniz (Newton's was too clumsy to survive), we get
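Presumably:

```latex
\frac{d}{d\theta}\,\mathrm{Cis}\,\theta = -\sin\theta + i\cos\theta
```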
A minor algebraic manipulation transforms that equation into
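Since -sin θ + i cos θ = i(cos θ + i sin θ) = i Cis θ, we get, presumably, Equation 36:

```latex
\frac{d\,\mathrm{Cis}\,\theta}{\mathrm{Cis}\,\theta} = i\,d\theta
```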
Now comes the big step: Cotes had to recognize that
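Namely, that for any differentiable function F,

```latex
\frac{dF}{F} = d(\ln F)
```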
Fortunately, Newton had already done it for him, in 1675 (see appendix). That Cotes knew that mathematical fact less than fifty years after Newton invented the calculus we cannot doubt, as astounding as we may find it, because he clearly substituted that fact into Equation 36 and integrated the result to obtain Equation 34. He then left us to wonder why he didn't take the next obvious step and determine the anti-logarithm of Equation 34, thereby obtaining Euler's formula, Equation 32.
Leonhard Euler (1707 Apr 15 - 1783 Sep 18) published Equation 32 in 1748. He based his proof and verification of it on his recognition of the Taylor series representations of the sine and cosine in the Taylor series representation of the imaginary exponential. Such a proof required that Euler have the necessary familiarity with the relevant series and either suspect the theorem that he used them to prove or come upon a truly serendipitous juxtaposition of the series in his imagination.
In working out their results those men did not have recourse to a particular aid to the imagination that mathematicians devised at the end of the Eighteenth Century: the complex plane. The Norwegian-Danish mathematician Caspar Wessel (1745 Jun 08 - 1818 Mar 25) published an account of the graphical representation of the complex numbers in 1798, thereby providing mathematicians with a basis for describing Euler's theorem geometrically.
I used the complex plane in my deduction of Euler's theorem, thereby putting that theorem firmly into the realm of analytic geometry. Only in that realm does it become possible to obtain the theorem by deduction, that most perfect mathematical logic that well suits what Richard Feynman called "the most remarkable formula in mathematics". Of course, we don't disparage the partial deduction, the induction, or the educated guessing that led to the theorem in the first place: we must certainly use such techniques as we explore new realms of mathematics. Nonetheless, the Platonist in me insists that someday all of mathematics will lie described on an axiomatic-deductive flowchart, that we will have a perfectly Rationalist Map of Mathematics to accompany, and support, the perfectly Rationalist Map of Physics that I have begun to lay out in another part of this website.
Once Isaac Newton had the concept of fluxions fixed firmly in his mind and had then recalled to mind John Napier's definition of a logarithm, it might have taken him all of five minutes to figure out the equivalent of that equation. It certainly comes easily enough.
John Napier (1550 xxx yy - 1617 Apr 04) invented logarithms as an aid to calculations, such as those required of astronomers, and published his description of them in 1614. He devised the word logarithm, from the Greek words logos (in its sense of proportion) and arithmos (number), to denote the fact that an arithmetic series of logarithms corresponds to a geometric series of the numbers associated with them. Napier already knew that if he were to match an arithmetic sequence of numbers with a geometric series of numbers, such as
then adding two elements of the A-sequence corresponds to multiplying together the two corresponding elements of the B-sequence (e.g. 3+5=8 in the A-sequence corresponds to 8x32=256 in the B-sequence). But that fact applies to a limited set of numbers and yields a clumsy system even so. Napier wanted something that he could use to multiply and divide with any numbers he chose and a single table that he could use to translate between numbers and their logarithms.
To that end he sought some phenomenon that in one aspect would give him a continuum analogous to an arithmetic sequence and in another aspect would give him a continuum analogous to a geometric sequence. He found what he wanted in the mathematical description of a point moving along a straight line. If, on a line extending from minus infinity to plus infinity, a point moves with constant speed, we have the kinematic analogue of an arithmetic sequence. If, on a line extending from zero to plus infinity, a point moves with a speed proportional to the point's distance from the zero point on the line, we have the kinematic analogue of a geometric sequence. If we correlate those motions with each other by way of given instants of time, then we constitute numbers on the first line as logarithms of numbers on the second line. Instead of the proto-logarithms that we get from discrete sequences of integers, this set-up gives us a continuum of logarithms for a continuum of real numbers.
If we identify the first line above with the y-axis of a Cartesian coordinate grid and identify the second line with the positive x-axis of that grid, then Napier's descriptions of points moving on the lines become parametric equations describing the tracing of a curve in the x-y plane:
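Presumably Equations A-1 and A-2, with v_y the constant speed of the first point and b the constant of proportionality for the second:

```latex
\frac{dy}{dt} = v_y, \qquad \frac{dx}{dt} = b\,x
```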
If we add the proviso that when x=1 we have both y=0 and vy=vx (which tells us that b=vy), then the curve expresses the function
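Presumably Equation A-3, the natural logarithm:

```latex
y = \ln x
```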
Now solve Equations A-1 and A-2 for dt and equate the solutions: we get
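Presumably:

```latex
dt = \frac{dy}{v_y} = \frac{dx}{b\,x}
```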
Making the appropriate substitutions from Equation A-3 and from b=vy gives us
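Presumably:

```latex
d(\ln x) = \frac{dx}{x}
```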
ready for Roger Cotes to use. Of course, Newton used a less algebraic and more geometric approach, but he got the same result by the same logic.