Mathematical Induction

Back to Contents

    The process of mathematical induction gives us a means of discovering mathematical truth when we have no success in applying the process of deduction. Instead of giving us certainty, as deduction does, induction gives us the likelihood of a proposition standing true to mathematics. We prefer certainty to likelihood, of course, especially in mathematics, but in some cases a deductive plan eludes us and we must employ induction.

    We apply induction when we infer some rule pertaining to a mathematical entity on some domain of applicability, find that the rule stands true to mathematics for some other, randomly chosen location on the domain, and infer that the rule applies to all locations on the domain.

    As an example consider the following. Take the first few odd integers, square each one and subtract one from each square. That process maps the subset {1,3,5,7} onto the subset {0,8,24,48}. Each of the elements of the second subset is an integer multiple of eight. Does that statement remain true for the entire set of the odd integers? Pick an odd integer at random and apply the mapping to it; for example, applying the mapping to 31 yields 960, an integer multiple of eight. Applying the mapping to 427 yields 182,328, which equals 8 x 22,791. Whatever odd integer we pick, the mapping always gives us an integer multiple of eight.

    To deduce that result we represent our mapping algebraically as X2-1 for all odd integers X. But we can represent each and every odd integer as X=2N-1 for all natural numbers N. Thus our mapping becomes 4N2-4N, which we rewrite as 4N(N-1). That formula instructs us to multiply four by a natural number and its predecessor. Of the natural number and its predecessor, one of them must be an even number, an integer multiplied by two. So the formula instructs us to multiply four by two by an integer and then by another integer, which always gives us an integer multiple of eight for any natural number N and, thus, for any odd integer X.

    Thus, induction gives us what the law calls probable cause; deduction gives us the conviction. When we use induction to infer some rule pertaining to numbers on a certain domain, we have no guarantee that there does not exist some small area on that domain where the rule does not stand true to mathematics: we have obtained and tested the rule by sampling the domain and such sampling may miss those areas where the rule does not apply. On the other hand, deduction proceeds by invoking rules that we know a priori apply to the entire domain and manipulating those rules in a way that produces another rule that necessarily applies to the entire domain. Consider another example, the sum of a series;

(Eq地 1)

    If we did not know the formula on the right side of that equation, we could infer it from a process of trial summation. For N=1,2,3, we get, respectively,

(Eq地s 2)

We seem to have a pattern that we can encode in a rule, the one that we see in Equation 1. Now we want to test that rule. Note how this resembles the empirical-inductive Scientific Method devised by Francis Bacon in his unfinished novel, "The New Atlantis": we have made some observations and gathered data, we set that rule as our hypothesis, and now we want to conduct an experiment in order to contrive further observations that will test our hypothesis and either verify or falsify it. To that end let N=5 and assume that the formula of Equation 1 works for N=4; those assumptions give us

(Eq地 3)

Let痴 take our experiment further and let N=24. In that case we get

(Eq地 4)

Note that in that numerator, when we give our fractions a common denominator, we get (N-1)(N+1)+1=(N-1)2+2(N-1)+1=N2. For all the values of N that we try we will get a result that verifies Equation 1, but we have no guarantee that we will not find a number somewhere on the number line that will not give us the desired result. To obtain such a guarantee we need to use something a bit more potent.

    To determine the sum of the series deductively we need only notice that the basic term looks like what we get when we give two fractions a common denominator. If we work that arithmetic process backwards, we get

(Eq地 5)

If we substitute from that equation into the left side of Equation 1, we see that all of the terms in the sum cancel, except the first and the last; thus, we get the sum as

(Eq地 6)

In this case we have used the most general rules: the substitution of Equation 5 applies to all possible values of x in the set of the natural numbers (the domain over which we want to calculate our sum), so Equation 6 must apply equally to all of the natural numbers.

    In the process of induction as we apply it to number theory we use the natural numbers as indices to organize mathematical statements into a series. Let us have a statement Sn uniquely associated with each and every natural number n. Induction lets us infer that Sn stands true to mathematics for all possible values of n if the following conditions also stand true to mathematics:

    1) S1 stands true to mathematics.

    2) If Sm stands true to mathematics, then so does Sm+1.

To understand how that pattern works for us, we want to look at an example.

    We have a proposition stating that the N-th square number, the product of multiplying a natural number N by itself, equals the sum of the first N odd natural numbers. We claim that the proposition stands true to mathematics for all possible values of N and we now want to prove and verify that claim.

    Most simply we define a square number as the number of markers that form a uniform square with the same number of markers on each side: the square number thus equals the number of markers on any one side multiplied by itself. We can regularize that process of forming markers into squares by using a bleached, extended version of a checker board, one on which we have drawn numbers on two lines that cross each other in order to use them as coordinate axes.

    For an inductive proof we begin by placing one marker on the square with coordinates [1,1]. One gives us the first odd number but we call it a square number largely as a courtesy to the pattern we described above. So we go to the second square number: 2x2=4=1+3, which is the sum of the first two odd integers in the set of the natural numbers. That fact gives us step one, though we actually had to assert both S1 and S2 in this case.

    Next we consider the N-th square number, the number of markers in the N-marker by N-marker square array on the grid. Then we create the (N+1)-th square array by placing on the grid N markers along the right side of the square, N markers along the top of the square, and one marker at the corner where those two new lines meet. We thus turn the N-th square into the (N+1)-th square by adding 2N+1=2(N+1)-1 markers to it; more specifically and relevantly, we have converted the N-th square into the (N+1)-th square by adding a quantity of markers equal to the (N+1)-th odd integer to it. That statement, we assert, remains true to mathematics for all values of N, because we chose the value of N arbitrarily from all possible values.

    Now we see that the statement stands true to mathematics for all of the squares from the first to the N-th. That fact means that we can create the N-th square number by starting with the first odd integer, adding to the second odd integer, then the third odd integer, and so on until we finally add the N-th odd integer. So the N-th square number comes to us when we add up the first N odd integers in the set of the natural numbers.

    In this inductive proof we began by establishing a pattern with the first and second elements of the set of square numbers. In this case we had to use the second element because the proposition concerns the relationship between consecutive square numbers. Then we found that the same relationship applies to a randomly chosen element of the set and its successor. That latter fact implies that the relationship exists between each and every element of the set and its successor, that the pattern we established with the first two elements of the set persists throughout the entire set. The level of certainty that we assign to the result of this proof depends upon the level of certainty that we assign to the assumption that the pattern does not change for some value of N. There we see the weak point of induction, the feature that gives us probability rather than perfect certainty in our results.

    For a deductive proof we resort to algebra. We know that the N-th odd integer has the general form 2N-1. Our proposition claims that the sum of the first N numbers of the form 2n-1 equals N2, so we have algebraically

(Eq地 7)

Karl Friedrich Gauss showed us how to calculate the sum of the first N integers, both odd and even. He feigned writing the numbers from one to N in their proper sequence on a line and then write the numbers from N to one in reverse order just below that first line. Each adjacent pair of numbers in that array adds up to N+1 and the array contains N pairs, so the array adds up to N(N+1), half of which equals the sum of the numbers on one of the lines. Thus we have

(Eq地 8)

which verifies the proposition.

    In this case we expressed the N-th odd number in abstract form, as an algebraic formula that enables us to calculate the value of that N-th odd integer directly from the value of N. We then summed that formula for all consecutive values of the natural numbers up to and including N, using the standard rules of algebraic manipulation of symbols, which rules reflect the rules of arithmetic. We found that the sum does, indeed, come out equal to the N-th square number, which proves and verifies the proposition for all possible values of N. We assign a certainty to that verification that reflects the certainty that we assign to the rules of arithmetic and of algebraic logic; to the extent that those rules give us mathematical truth, to that same extent does our deductive proof also give us mathematical truth.

    If we can use deduction and obtain perfect certainty in our result, why would we use induction and obtain mere likelihood in our result? In many cases we can稚 yet see a pattern susceptible to deductive analysis, but we have enough information to use inductive analysis. We may think of induction as the scaffolding supporting the growing structure of mathematical knowledge until we can construct the appropriate deduction to solidify the structure properly. And sometimes that scaffolding can help us in correcting that structure when the builders begin to err in its construction. As an example consider an inductive analysis of Cantor痴 diagonalization proof of the existence of infinities larger than the infinity of the natural numbers.

    Georg Cantor used a proof based on a process of diagonalization to verify his proposition that the set of all decimal fractions contains more elements than does the infinite set of the natural numbers. But an inductive examination of the proof shows us that Cantor erred and that his proposition stands, in fact, false to mathematics.

    Cantor constructed his proof by stating that one could draw the natural numbers, in proper sequence, on the left side of a list and draw the decimal fractions, in arbitrary order, on the right side of the same list, associating one decimal fraction with one natural number. Guided by Cantor, we assume that we have a complete list, a perfect one-to-one match between the natural numbers and the decimal fractions. Then we falsify that assumption by demonstrating that the listing of decimal fractions is necessarily incomplete and we do that by creating new fractions that did not appear on our original list.

    To construct that new fraction we take the first digit from the first fraction on our list, change it, and make the first digit of our new fraction. We then take the second digit from the second fraction on our list, change it, and make it the second digit of our new fraction. Changing the third digit of the third fraction gives us the third digit of our new fraction. We continue in that manner, using the changed M-th digit of the M-th fraction on our list as the M-th digit of our new fraction. In that way we construct a fraction guaranteed to differ by at least one digit from each and every other fraction on our list; therefore, it couldn稚 have existed on the original list. Because we can repeat that process endlessly for decimal fractions with an infinite set of decimal places, we must have infinitely more fractions on the list than we originally assumed, so the set of the decimal fractions must have a magnitude greater than the infinity of the natural numbers, the magnitude that Cantor called Aleph-One (the infinity of the natural numbers bears the name Aleph-Null).

    That deductive proof of the existence of higher infinities fails when we subject it to a proper inductive analysis.

    As with the inductive proof pertaining to square numbers, we must properly begin with S2 with only a nod of acknowledgment to S1. In this case Sm represents the application of Cantor痴 diagonalization to the complete set of the m-place decimal fractions. If we list all of the two-digit decimal fractions, from 0.00 to 0.99, we have a set that contains one hundred elements, but we can diagonalize only two elements at a time. But that diagonalization merely generates a decimal fraction that we already have on the list, so the process proves but does not verify the proposition that our list is incomplete. Next we look at the full set of m-place decimal fractions, in which we have 10m elements and we can diagonalize m of them at one time, which means that the diagonalization will not produce a new decimal fraction that does not already exist in the set. For the entire set of the natural numbers the m-th power of ten exceeds the value of m, so that previous statement remains true to mathematics as the value of m increases and goes to infinity.

    Note that our use of infinity does not mean that infinity represents an actual number. We use it as a pseudo-number that denotes the fact that the natural numbers do not have a last member: the numbers simply keep on increasing without end. Thus we do not attribute any magical properties to infinity that would invalidate our induction.

    Thus we can infer that Cantor made an error in his proof. By not testing his diagonalization procedure against decimal fractions with a finite number of places, he left himself open to a misunderstanding of the nature of infinity. Thus we see the value of using induction when deduction eludes us.


Back to Contents