Boltzmann's H-Theorem

Part I


    Imagine, as a kind of mathematical black box, a system containing N particles that, except for an occasional collision, remain independent of each other. We also note that, of those N particles, n_i each possess an energy E_i. Conservation of matter and conservation of energy give us two constraints on that system:

$$N = \sum_i n_i$$ (Eq'n 1)

$$U = \sum_i n_i E_i$$ (Eq'n 2)

in which U represents the total energy contained within the system. Both N and U represent macroscopically measurable independent parameters that define a macrostate of the system.

    We now want to describe what we assume will be the most probable macrostate that the system will manifest. To that end we want to describe the set of occupation numbers n_i that yields the maximum number of microstates for the particles to occupy.

    In its simplest description the system consists of N boxes into each of which we put only one particle. We have N different ways in which we can put the first particle into the system, N-1 different ways in which we can put the second particle into the system, and so on, for a grand total of N! ways in which we can populate the system. Note that we have tacitly assumed that the particles are all distinguishable, that we can tell one from another.

    That distinguishability allows us to discern and count the number of different microstates that correspond to our given macrostate. Given one specific arrangement of the particles in the boxes, which defines a microstate, we label with the index i all of the boxes containing particles that carry energy E_i and do that for all values of the index. Let's call that particular labeling of the boxes a physically distinct substate.

    In Reality we don't deal with distinguishable particles. Atoms don't come with labels that mark them as individuals: every carbon atom has exactly the same properties as any other carbon atom. So instead of microstates, we really want to count the number of physically distinct substates that correspond to a given macrostate.

    In the specific physically distinct substate that I described above we have n_i! microstates for the specific arrangement of boxes containing the particles carrying energy E_i. That means that we have a total number of microstates equal to the product of all of the factorials of the complete set of numbers n_i in that one substate. That fact, in turn, leads us to infer that the number W of physically distinct substates that correspond to the given macrostate equals the total number of ways of putting the particles into the N boxes divided by that product; that is,

$$W = \frac{N!}{\prod_i n_i!}$$ (Eq'n 3)
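
    To see what that count looks like in practice, the following minimal sketch lists every set of occupation numbers compatible with fixed N and U and computes W (N! divided by the product of the n_i!) for each. The three energy levels and the values N = 6 and U = 6 are hypothetical choices made only for illustration; they do not come from the discussion above.

```python
from itertools import product
from math import factorial

# Hypothetical toy system: three energy levels (arbitrary units), N = 6, U = 6.
LEVELS = (0, 1, 2)
N = 6
U = 6

def substate_count(occupations):
    """Return W = N! / (n_1! n_2! ...) for a tuple of occupation numbers."""
    count = factorial(sum(occupations))
    for n in occupations:
        count //= factorial(n)
    return count

# Keep only the occupation sets that satisfy both constraints.
macrostates = [
    occ
    for occ in product(range(N + 1), repeat=len(LEVELS))
    if sum(occ) == N and sum(n * e for n, e in zip(occ, LEVELS)) == U
]

counts = {occ: substate_count(occ) for occ in macrostates}
best = max(counts, key=counts.get)

for occ in macrostates:
    print(occ, counts[occ])
print("largest W:", best, counts[best])
```

    In this toy case the evenly spread set (2, 2, 2) yields the largest W, which is the kind of maximum that the derivation seeks for realistic particle numbers.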

    Under the constraints encoded in Equations 1 and 2, we want to work out a description of the conditions that maximize the value of W. Using Stirling's formula,

$$n! \approx \sqrt{2\pi n}\, n^n e^{-n}$$ (Eq'n 4)

we rewrite Equation 3 as

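$$W = \frac{\sqrt{2\pi N}\, N^N e^{-N}}{\prod_i \sqrt{2\pi n_i}\, n_i^{n_i} e^{-n_i}} = \frac{\sqrt{2\pi N}\, N^N}{\prod_i \sqrt{2\pi n_i}\, n_i^{n_i}}$$ (Eq'n 5)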

In the transition to the second equality in that equation the exponentials canceled each other out because

$$\prod_i e^{-n_i} = e^{-\sum_i n_i} = e^{-N}$$ (Eq'n 6)

in accordance with Equation 1. Bearing in mind the fact that

$$d(\ln W) = \frac{dW}{W}$$ (Eq'n 7)

we differentiate Equation 5 and get

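$$\frac{dW}{W} = -\sum_i \left(\ln n_i + 1 + \frac{1}{2n_i}\right) dn_i = -\sum_i \left(\ln n_i + \frac{1}{2n_i}\right) dn_i$$ (Eq'n 8)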

In the last step of that differentiation I eliminated the minus one by virtue of the fact, based on Equation 1, that

$$\sum_i dn_i = 0$$ (Eq'n 9)

We know that dW=0 when W has reached its maximum value, so we can rewrite Equation 8 as

$$\sum_i \left(\ln n_i + \frac{1}{2n_i}\right) dn_i = 0$$ (Eq'n 10)

Interpreting Equation 10 in light of Equation 9 necessitates that

$$\ln n_i + \frac{1}{2n_i} = C$$ (Eq'n 11)

for all values of the index, with C representing a constant. That equation doesn't tell us much, but since we can add zero to any equation we can incorporate the constraints encoded in Equations 1 and 2 into it by the method of Lagrange multipliers. We have the constraints in the form of Equation 9 and

$$\sum_i E_i\, dn_i = 0$$ (Eq'n 12)

so we have Equation 11 in the form

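$$\ln n_i + \frac{1}{2n_i} + \alpha + \beta E_i = C$$ (Eq'n 13)

$$\ln n_i + \frac{1}{2n_i} = C - \alpha - \beta E_i$$ (Eq'n 14)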

Taking the anti-logarithm of that equation gives us

$$n_i \exp\!\left(\frac{1}{2n_i}\right) = A \exp(-\beta E_i)$$ (Eq'n 15)

in which the amplitude A=exp(C-α), which, like β, remains to be determined. That statement reflects the fact that the Lagrange multipliers are not predetermined, but must be found by applying the resulting equation to known circumstances and deriving their values from the results. We also note that for values of n_i greater than about one thousand the factor exp(1/(2n_i)) differs from unity by so small an amount that we may take it as equal to one except in the most minuscule of particle systems. So we end up with

$$n_i = A \exp(-\beta E_i)$$ (Eq'n 16)

which describes the Maxwell-Boltzmann distribution.
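
    As a numerical sanity check on that result, the sketch below builds occupation numbers of the form n_i = A exp(-βE_i) and then shifts particles among the levels in a way that leaves both N and U unchanged. If the distribution really does maximize W, then ln W should drop for a shift in either direction. The energy levels, the value of β, and the size of the shift are all hypothetical, and the log-gamma function stands in for the factorial so that the n_i need not be whole numbers.

```python
from math import exp, lgamma

# Hypothetical inputs: equally spaced levels, an arbitrary beta, and N particles.
LEVELS = [0.0, 1.0, 2.0, 3.0]
BETA = 0.5
N = 10_000.0

def ln_w(occupations):
    """ln W = ln N! - sum_i ln n_i!, with lgamma(n + 1) standing in for ln n!."""
    total = sum(occupations)
    return lgamma(total + 1.0) - sum(lgamma(n + 1.0) for n in occupations)

def total_energy(occupations):
    return sum(n * e for n, e in zip(occupations, LEVELS))

def shifted(occupations, delta):
    """Move particles among levels 0, 1, 2 so that N and U both stay fixed.

    With equally spaced levels, adding delta to levels 0 and 2 while removing
    2*delta from level 1 changes neither the particle count nor the energy.
    """
    n = list(occupations)
    n[0] += delta
    n[1] -= 2.0 * delta
    n[2] += delta
    return n

# Occupation numbers of the form n_i = A exp(-beta E_i) with A = N / Z.
z = sum(exp(-BETA * e) for e in LEVELS)
mb = [N * exp(-BETA * e) / z for e in LEVELS]

for delta in (-50.0, 50.0):
    trial = shifted(mb, delta)
    print(delta, ln_w(trial) - ln_w(mb))
```

    Both printed differences come out negative, consistent with the exponential distribution sitting at the maximum of W under the two constraints.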

    In accordance with Equation 1 and a simple division, summing Equation 16 gives us

$$A = \frac{N}{Z}$$ (Eq'n 17)

in which

$$Z = \sum_i \exp(-\beta E_i)$$ (Eq'n 18)

represents the partition function of the system. In accordance with Equation 2 we have

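$$U = \sum_i n_i E_i = \frac{N}{Z} \sum_i E_i \exp(-\beta E_i) = -\frac{N}{Z} \frac{\partial Z}{\partial \beta}$$ (Eq'n 19)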

Dividing that equation by N gives us the average energy per particle,

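$$\bar{E} = \frac{U}{N} = -\frac{1}{Z} \frac{\partial Z}{\partial \beta} = -\frac{\partial \ln Z}{\partial \beta}$$ (Eq'n 20)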
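
    The link between the average energy and the partition function is easy to test numerically. The sketch below computes the average energy directly as a weighted sum and compares it with the negative derivative of ln Z with respect to β, estimated by a central difference. The energy levels, the value of β, and the step size are hypothetical choices for illustration.

```python
from math import exp, log

# Hypothetical energy levels in arbitrary units.
LEVELS = [0.0, 1.0, 2.5, 4.0]

def partition_function(beta):
    """Z = sum_i exp(-beta * E_i)."""
    return sum(exp(-beta * e) for e in LEVELS)

def mean_energy_direct(beta):
    """Average energy as the Boltzmann-weighted sum of the levels."""
    z = partition_function(beta)
    return sum(e * exp(-beta * e) for e in LEVELS) / z

def mean_energy_from_ln_z(beta, step=1e-6):
    """Average energy as -d(ln Z)/d(beta), via a central difference."""
    upper = log(partition_function(beta + step))
    lower = log(partition_function(beta - step))
    return -(upper - lower) / (2.0 * step)

beta = 0.7
print(mean_energy_direct(beta), mean_energy_from_ln_z(beta))
```

    The two printed values agree to within the accuracy of the finite-difference estimate.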

If we multiply that equation by dβ, add β dĒ (with Ē denoting the average energy per particle) to both sides of the equality sign to complete the differential on the left, and then integrate, we get

$$\beta \bar{E} = -\ln Z + \int \beta\, d\bar{E}$$ (Eq'n 21)

A little algebraic rearrangement of that equation gives us

$$\int \beta\, d\bar{E} = \ln Z + \beta \bar{E}$$ (Eq'n 22)

which we recognize as the equation that relates the system's partition function to the system's entropy. In light of that recognition we see that dĒ in the integral must represent heat entering or leaving the system and not mechanical work.

    Looked at a different way we have a Maxwellian system with a probability of a particle occupying the i-th state of

$$p_i = \frac{n_i}{N} = \frac{\exp(-\beta E_i)}{Z}$$ (Eq'n 23)

In that system we have the mean energy as

$$\bar{E} = \sum_i p_i E_i$$ (Eq'n 24)

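In a general quasi-static process that mean energy changes by the minuscule amount

$$d\bar{E} = \sum_i E_i\, dp_i + \sum_i p_i\, dE_i = \text{đ}Q + \text{đ}W$$ (Eq'n 25)

In that equation đQ represents the heat absorbed by the system in accordance with

$$\text{đ}Q = \sum_i E_i\, dp_i$$ (Eq'n 26)

and đW represents mechanical work done on the system in accordance with

$$\text{đ}W = \sum_i p_i\, dE_i$$ (Eq'n 27)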
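
    The bookkeeping in that split can be checked with numbers. The sketch below makes a small change in both β and the energy levels, then compares the actual change in the mean energy with the sum of a heat-like term (the levels times the changes in probability) and a work-like term (the probabilities times the changes in the levels). All of the numerical values are hypothetical, and the check only confirms the first-order identity; it does not model any particular physical process.

```python
from math import exp

def probabilities(beta, levels):
    """Occupation probabilities p_i = exp(-beta * E_i) / Z."""
    weights = [exp(-beta * e) for e in levels]
    z = sum(weights)
    return [w / z for w in weights]

def mean_energy(probs, levels):
    return sum(p * e for p, e in zip(probs, levels))

# Hypothetical initial state and a small change of both beta and the levels.
levels_before = [0.0, 1.0, 2.0]
levels_after = [e * 1.0001 for e in levels_before]
beta_before = 1.0
beta_after = 1.0002

p_before = probabilities(beta_before, levels_before)
p_after = probabilities(beta_after, levels_after)

d_p = [a - b for a, b in zip(p_after, p_before)]
d_levels = [a - b for a, b in zip(levels_after, levels_before)]

heat = sum(e * dp for e, dp in zip(levels_before, d_p))      # sum_i E_i dp_i
work = sum(p * de for p, de in zip(p_before, d_levels))      # sum_i p_i dE_i
d_mean = mean_energy(p_after, levels_after) - mean_energy(p_before, levels_before)

# The leftover is the second-order cross term sum_i dp_i dE_i.
residual = d_mean - (heat + work)
print(d_mean, heat + work, residual)
```

    The residual is of second order in the small changes, so it shrinks much faster than the heat and work terms as the changes are made smaller.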

Note that absorbing heat does not change the energy of a state, but changes the probability of a state being occupied, while doing mechanical work on the system does change the energies in its states. Now consider the basic description of the entropy of the system:

(Eq'n 28)

And that simply expresses Boltzmann's H-theorem. In that equation I have exploited the fact that, in accordance with Equation 23,

$$\ln p_i = -\beta E_i - \ln Z$$ (Eq'n 29)

and, in accordance with the definition of probability,

$$\sum_i p_i = 1$$ (Eq'n 30)
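
    Those two facts can be combined in a quick numerical check. The sketch below assumes that the entropy per particle, in units of Boltzmann's constant, takes the form of minus the sum of p_i ln p_i; that form is an assumption adopted here for illustration. Under that assumption, and with exponential probabilities, the sum should equal ln Z plus β times the mean energy. The energy levels and the value of β are hypothetical.

```python
from math import exp, log

# Hypothetical energy levels and multiplier.
LEVELS = [0.0, 0.5, 1.5, 3.0]
BETA = 0.8

weights = [exp(-BETA * e) for e in LEVELS]
z = sum(weights)
probs = [w / z for w in weights]          # p_i = exp(-beta E_i) / Z
mean_e = sum(p * e for p, e in zip(probs, LEVELS))

# Assumed entropy form (per particle, in units of k): -sum_i p_i ln p_i.
entropy_from_probs = -sum(p * log(p) for p in probs)

# Same quantity rewritten with ln p_i = -beta E_i - ln Z and sum_i p_i = 1.
entropy_from_z = log(z) + BETA * mean_e

print(entropy_from_probs, entropy_from_z)
```

    The two numbers agree to rounding error, which shows how the logarithm of the probabilities and the normalization condition together connect the probabilities to the partition function.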


But we have an alternate route that we can follow to that conclusion.

    When we put the particles into the system they don't necessarily conform to the description in Equation 16. In most cases we will then have dW/W ≠ 0. In light of that statement we integrate Equation 10 and get

(Eq'n 31)

Because the Maxwell-Boltzmann distribution represents the maximum value of W, we know that the system must evolve by W (and therefore lnW) increasing and never decreasing spontaneously, so we have as necessarily true to Reality

(Eq'n 32)

That simple little equation expresses Ludwig Boltzmann's H-theorem, which is equivalent to the law of entropy.


Appendix A:

Counting States

    We want to use an abstract representation of a thermodynamic system as a means of analyzing a dynamics based on probabilities involving very large numbers of particles. To that end we imagine that our system consists of N boxes, each of which can hold only one particle, with n1 particles of type-1 and n2 particles of type-2 such that n1+n2=N. When we have put all of the particles into all of the boxes we have what we call the macrostate of the system. We also have a microstate, which consists of the specific arrangement of the particles in the boxes (e.g. a type-1 in Box #1, a type-1 in Box #2, a type-2 in Box #3, and so on).

    How many different microstates correspond to a given macrostate? Answering that question actually helps us to determine a description of the macrostate that a system freed from certain constraints will evolve into.

    If we have N particles that are all distinguishable from each other, we have N different ways to put the first particle into one of N boxes, N-1 ways to put the second particle into a box, and so on. We thus count N! different ways to put the particles into the boxes under the restriction that only one particle goes into each box. If we remove the distinguishing characteristics from the particles, making them indistinguishable from each other, then we have only one way to put them into the boxes, so if we assume that the particles are distinguishable when they are not, we overcount the number of different ways of putting them into the boxes by a factor of N!. That fact means that for a given macrostate the difference between distinguishable and indistinguishable particles of type-1 and type-2 is a factor of n1!n2!, the difference in the number of different ways we can make the macrostate with distinguishable and indistinguishable particles. If we multiply that factor by the total number of different microstates W that correspond to the given macrostate we should obtain a number equal to the factor by which we overcount the ways of creating that macrostate; that is, we have

$$n_1!\, n_2!\, W = N!$$ (Eq'n A-1)

which gives us

$$W = \frac{N!}{n_1!\, n_2!}$$ (Eq'n A-2)

We extend that result readily to systems consisting of more than two types of particles to get

$$W = \frac{N!}{\prod_i n_i!}$$ (Eq'n A-3)

    But Reality doesn't make distinguishable particles. If you've seen one electron or oxygen atom, you have as good as seen them all. So referring to distinguishable particles seems like a not terribly legitimate way to deduce Equation A-2. Can we deduce that equation without referring to distinguishable particles?

    How many different ways can we put n1 particles into N boxes? We have N ways to place the first particle, N-1 ways to place the second particle, and so on to N-(n1-1) ways to place the last particle. Thus we count N!/(N-n1)! different ways to place the particles. But again we have overcounted the microstates. For example, we put the first particle in Box #1 and the second particle in Box #2 and so on and call that one microstate and then we put the first particle into Box #2 and the second particle into Box #1 and so on and count it as a separate microstate, even though it's identical to the first one when we finish placing the particles.

    Distinguishability amounts to keeping track of the order in which the particles are put into their boxes. For the type-1 particles in a given microstate we have n1 boxes that can take the first particle. For each of those we have n1-1 boxes that can take the second particle. For each of those n1(n1-1) possibilities we have n1-2 boxes that can take the third particle. Thus we have n1! different ways to put the type-1 particles into their boxes for any given set of boxes into which only the type-1 particles go. But since the end result does not depend upon the order in which the particles went into the boxes, we have overcounted the ways to create the given microstate by a factor of n1!. Again, the total number of microstates that comprise a given macrostate equals the factorial of N divided by the factorials of all of the n_i in the system.
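
    A brute-force count offers a check on that argument for a small system. The sketch below lists every ordering of a row of boxes holding n1 type-1 and n2 type-2 particles, discards the duplicates that differ only by swapping identical particles, and compares the number left over with N!/(n1! n2!). The function names are hypothetical, and the approach only suits small N because the number of orderings grows as N!.

```python
from itertools import permutations
from math import factorial

def count_by_enumeration(n1, n2):
    """Count distinct arrangements of n1 type-1 and n2 type-2 particles in n1 + n2 boxes."""
    labels = "1" * n1 + "2" * n2
    # permutations() treats every particle as distinguishable;
    # the set removes arrangements that look identical.
    return len(set(permutations(labels)))

def count_by_formula(n1, n2):
    """W = N! / (n1! n2!) with N = n1 + n2."""
    return factorial(n1 + n2) // (factorial(n1) * factorial(n2))

for n1, n2 in [(3, 2), (4, 1), (3, 3)]:
    print(n1, n2, count_by_enumeration(n1, n2), count_by_formula(n1, n2))
```

    For three type-1 and two type-2 particles, both counts give 10, compared with the 120 orderings that we would count if every particle carried its own label.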


Appendix B:

The Method of Lagrange Multipliers

    Sometimes when we work out the laws of physics or their consequences, especially when they involve probabilities, we need to describe a minimum or maximum value of some function f(q_i) of a set of spatial or space-like coordinates q_i. We know from elementary calculus that the function has an extremum at the point q_i when

$$df = \sum_i \frac{\partial f}{\partial q_i}\, dq_i = 0$$ (Eq'n B-1)

If the coordinates are independent of each other, then we can solve that equation, term by term, in a straightforward way. But if the independence of the coordinates is negated by some constraint or side condition on the system, then we must take a more complicated route to the solution.

    We usually have the constraint manifested in a function of the coordinates (φ=φ(q_i)) that makes a valid value of one coordinate dependent on the values of the other coordinates. Some constraints have the form of conservation laws; that is, statements that some properties of the system (e.g. number of particles, total energy, etc.) remain unchanged by any change in the system (φ=constant). Other constraints appear to augment the original equation. We can imagine f(q_i) describing a surface in the abstract q-space and φ(q_i) describing a second surface that intersects the first along some curve. We then seek to find the point where that curve reaches an extremum. In either case we have

$$d\varphi = \sum_i \frac{\partial \varphi}{\partial q_i}\, dq_i = 0$$ (Eq'n B-2)

    Look again at Equation B-1. If the variables are all independent of each other, then we can let dq_i=0 for all the differentials except one, dq_j, so that we have

$$\frac{\partial f}{\partial q_j}\, dq_j = 0$$ (Eq'n B-3)

But our choice of j was arbitrary, so that equation stands true to mathematics for all of the terms in Equation B-1. But if we have a constraint φ on the function f, then the variables are not independent; they are inter-related through φ. To render Equation B-1 soluble within the restrictive condition we introduce an undetermined parameter λ (the Lagrange multiplier), multiply it onto Equation B-2, and add the result to Equation B-1, so that we have

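$$\sum_i \left(\frac{\partial f}{\partial q_i} + \lambda \frac{\partial \varphi}{\partial q_i}\right) dq_i = 0$$ (Eq'n B-4)

Now we can treat that expression as if all of the differentials dq_i are mutually independent and we have

$$\frac{\partial f}{\partial q_i} + \lambda \frac{\partial \varphi}{\partial q_i} = 0$$ (Eq'n B-5)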

for all values of the index.
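
    A short worked case may help to fix the procedure. The function f(x, y) = xy and the constraint x + y = c used below are hypothetical choices made only for illustration; they are not part of the derivation above.

```latex
% Hypothetical example: extremize f(x, y) = x y subject to x + y = c.
\begin{align*}
  f(x, y) &= x y, \qquad \varphi(x, y) = x + y - c = 0 \\
  df + \lambda\, d\varphi &= (y + \lambda)\, dx + (x + \lambda)\, dy = 0 \\
  % Treat dx and dy as independent, so each coefficient vanishes:
  y + \lambda &= 0, \qquad x + \lambda = 0 \\
  % Hence x = y, and the constraint fixes the common value:
  x = y &= \frac{c}{2}, \qquad \lambda = -\frac{c}{2}, \qquad f = \frac{c^2}{4}
\end{align*}
```

    The multiplier remains undetermined until we apply the constraint itself, which is the same pattern that we follow in fixing the multipliers of the Maxwell-Boltzmann distribution.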

    As an example let's determine the description of the Maxwell distribution of the particles in a gas over all available velocities. To avoid complications due to internal molecular motions we feign using a monatomic gas, such as helium or neon, in our imaginary experiment. We have N particles inside a container held at a certain temperature and we want to know what proportion of those particles move at the speed V within the range dV; that is, we want to determine the form of some function f(V) that describes the probability of a particle existing in that speed range in the gas. As Maxwell noted, the components of the particles' velocity vectors conform to their own distribution functions, which must have the same form as f(V), so we have f(u), f(v), and f(w) for the three mutually orthogonal dimensions of our coordinate grid. Further, the values of u, v, and w applied to a particle are independent of each other, so their joint probability equals the product of the individual probabilities, which gives us

$$f(V) = f(u)\, f(v)\, f(w)$$ (Eq'n B-6)

In working out the form of that function in more detail we have the constraint

$$V^2 = u^2 + v^2 + w^2$$ (Eq'n B-7)

because we want to work out f(V) as a function of the velocity components for some given value of V. The isotropy of f(V) in velocity space also makes the function a constant with respect to the partial velocity derivatives, so that we have

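$$df(V) = \frac{\partial f}{\partial u}\, du + \frac{\partial f}{\partial v}\, dv + \frac{\partial f}{\partial w}\, dw = 0$$ (Eq'n B-8)

    Equation B-6 lets us rewrite that equation as

$$\frac{df(u)}{du}\, f(v)\, f(w)\, du + f(u)\, \frac{df(v)}{dv}\, f(w)\, dv + f(u)\, f(v)\, \frac{df(w)}{dw}\, dw = 0$$ (Eq'n B-9)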

Differentiating Equation B-7 and dividing the result by two gives us our side condition

$$u\, du + v\, dv + w\, dw = 0$$ (Eq'n B-10)

We multiply that equation by the Lagrange multiplier λ and by f(V) and add the result to Equation B-9, thereby obtaining a sum of three terms, each of which has the form of the first,

$$\left[\frac{df(u)}{du}\, f(v)\, f(w) + \lambda u\, f(u)\, f(v)\, f(w)\right] du$$

and all of which add together to equal zero. Because the values of the velocity-component differentials are arbitrary, each of those terms must equal zero separately, so we have

$$\frac{1}{f(u)} \frac{df(u)}{du} = -\lambda u$$ (Eq'n B-11)

with similar equations for the v and w parts of the function. Integrating that equation gives us the basic solution

$$\ln f(u) = -\frac{\lambda u^2}{2} + \ln A$$ (Eq'n B-12)

in which lnA represents the constant of integration. Because the solution must be isotropic, that constant of integration must be the same for all three component solutions. Thus we have

$$f(V) = A^3 \exp\!\left(-\frac{\lambda V^2}{2}\right)$$ (Eq'n B-13)

    Applying that equation to a full description of the gas lets us determine the values of A and λ, so we get at last

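$$f(V) = \left(\frac{m}{2\pi k T}\right)^{3/2} \exp\!\left(-\frac{m V^2}{2 k T}\right)$$ (Eq'n B-14)

in which m represents the mass of a particle in the gas, k represents Boltzmann's constant, and T represents the absolute temperature of the gas. With that equation we can work out the full Maxwellian description of a simple monatomic gas.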
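
    As a check on the values of A and λ, the sketch below takes the distribution of a single velocity component in the standard Gaussian form, with λ = m/kT and A = (m/2πkT)^(1/2), and integrates it numerically. If those values are right, the total probability should come out as one and the mean square of the component should equal kT/m. The helium mass, the temperature of 300 K, and the simple midpoint integration are assumptions chosen for illustration.

```python
from math import exp, pi, sqrt

K_B = 1.380649e-23    # Boltzmann's constant in J/K
MASS = 6.6465e-27     # approximate mass of a helium-4 atom in kg (assumed gas)
TEMP = 300.0          # assumed temperature in K

def component_density(u):
    """f(u) = sqrt(m / (2 pi k T)) * exp(-m u^2 / (2 k T)) for one velocity component."""
    return sqrt(MASS / (2.0 * pi * K_B * TEMP)) * exp(-MASS * u * u / (2.0 * K_B * TEMP))

def integrate(func, lower, upper, steps=20_000):
    """Midpoint-rule estimate of the integral of func from lower to upper."""
    width = (upper - lower) / steps
    return sum(func(lower + (i + 0.5) * width) for i in range(steps)) * width

sigma = sqrt(K_B * TEMP / MASS)          # expected root-mean-square component, m/s
limit = 10.0 * sigma                     # the tails beyond this are negligible

norm = integrate(component_density, -limit, limit)
mean_square = integrate(lambda u: u * u * component_density(u), -limit, limit)

print("normalization:", norm)
print("mean square / (kT/m):", mean_square / sigma**2)
```

    Both printed ratios come out equal to one to within the accuracy of the integration, which corresponds to an average kinetic energy of kT/2 for each velocity component.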

