Boltzmann's H-Theorem

Part I

Imagine, as a kind of mathematical black box, a system containing N particles that, except for an occasional collision, remain independent of each other. We also note that n_{i} of those particles each possess an energy E_{i}.
Conservation of matter and conservation of energy give us two constraints on
that system:

    Σ_i n_{i} = N    (Eq'n 1)

and

    Σ_i n_{i}E_{i} = U    (Eq'n 2)

in which U represents the total energy contained within the system. Both N and U represent macroscopically measurable independent parameters that define a macrostate of the system.

We now want to describe what we assume will
be the most probable macrostate that the system will manifest. To that end we
want to describe the set of occupation numbers n_{i} that yields the
maximum number of microstates for the particles to occupy.

In its simplest description the system consists of N boxes into each of which we put only one particle. We have N different ways in which we can put the first particle into the system, N-1 different ways in which we can put the second particle into the system, and so on, for a grand total of N! ways in which we can populate the system. Note that we have tacitly assumed that the particles are all distinguishable, that we can tell one from another.

That distinguishability allows us to discern
and count the number of different microstates that correspond to our given
macrostate. Given one specific arrangement of the particles in the boxes, which
defines a microstate, we label with the index i all of the boxes containing
particles that carry energy E_{i}
and do that for all values of the index. Let's
call that particular labeling of the boxes a physically distinct substate.

In Reality we don't deal with distinguishable particles. Atoms don't come with labels that mark them as individuals: every carbon atom has exactly the same properties as any other carbon atom. So instead of microstates, we really want to count the number of physically distinct substates that correspond to a given macrostate.

In the specific physically distinct substate
that I described above we have n_{i}! microstates for the specific
arrangement of boxes containing the particles carrying energy
E_{i}.
That means that we have a total number of microstates equal to the product of
all of the factorials of the complete set of numbers n_{i} in that one
substate. That fact, in turn, leads us to infer that the number W of physically
distinct substates that correspond to the given macrostate equals the total
number of ways of putting the particles into the N boxes divided by that
product; that is,

    W = N!/Π_i n_{i}!    (Eq'n 3)
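Equation 3 can be checked by brute force for a tiny system. The numbers below (N = 6 particles with occupation numbers 3, 2, 1) are my own illustrative assumption, not from the text:

```python
from math import factorial
from itertools import permutations

# Assumed tiny system: N = 6 particles with occupation numbers n_1 = 3, n_2 = 2, n_3 = 1.
n = [3, 2, 1]
N = sum(n)

# Equation 3: W = N! / (n_1! n_2! n_3!)
W = factorial(N)
for n_i in n:
    W //= factorial(n_i)

# Brute force: label each particle by the index of its energy level and
# count the distinct orderings of those labels over the N boxes.
labels = []
for i, n_i in enumerate(n):
    labels += [i] * n_i
W_brute = len(set(permutations(labels)))

print(W, W_brute)  # 60 60
```

Both counts agree: 6!/(3!·2!·1!) = 60 physically distinct substates.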

Under the constraints encoded in Equations 1 and 2, we want to work out a description of the conditions that maximize the value of W. Using Stirling's formula,

    n! ≅ n^n e^{−n} √(2πn)    (Eq'n 4)

we rewrite Equation 3 as

    W = N^N e^{−N} √(2πN) / Π_i [n_{i}^{n_{i}} e^{−n_{i}} √(2πn_{i})] = N^N √(2πN) / Π_i [n_{i}^{n_{i}} √(2πn_{i})]    (Eq'n 5)
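Stirling's formula in the form of Equation 4 is easy to sanity-check numerically; the choice n = 50 below is an arbitrary assumption for illustration:

```python
from math import factorial, sqrt, pi, exp

# Equation 4, Stirling's formula: n! ≅ n^n e^(−n) √(2πn), checked at n = 50.
n = 50
approx = n**n * exp(-n) * sqrt(2 * pi * n)
ratio = factorial(n) / approx
print(ratio)  # close to 1; the residual error is roughly 1/(12n)
```

Even at n = 50 the approximation is already good to about a sixth of a percent, which is why the derivation can lean on it so freely for thermodynamic particle numbers.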

In the transition to the second equality in Equation 5 the exponentials canceled each other out because

    Π_i e^{−n_{i}} = e^{−N}    (Eq'n 6)

in accordance with Equation 1. Bearing in mind the fact that

    d(ln W) = dW/W    (Eq'n 7)

we differentiate Equation 5 and get

    dW/W = Σ_i [−ln n_{i} − 1 − 1/(2n_{i})] dn_{i} = −Σ_i [ln n_{i} + 1/(2n_{i})] dn_{i}    (Eq'n 8)

In the last step of that differentiation I eliminated the minus one by virtue of the fact, based on Equation 1, that

    Σ_i dn_{i} = 0    (Eq'n 9)

We know that dW=0 when W has reached its maximum value, so we can rewrite Equation 8 as

    Σ_i [ln n_{i} + 1/(2n_{i})] dn_{i} = 0    (Eq'n 10)

Interpreting Equation 10 in light of Equation 9 necessitates that

    ln n_{i} + 1/(2n_{i}) = C    (Eq'n 11)

for all values of the index, with C representing a constant. That equation doesn't tell us much, but since we can add zero to any equation we can incorporate the constraints encoded in Equations 1 and 2 into it by the method of Lagrange multipliers. We have the constraints in the form of Equation 9 and

    Σ_i E_{i} dn_{i} = 0    (Eq'n 12)

so we have Equation 11 in the form

    ln n_{i} + 1/(2n_{i}) + α + βE_{i} = C    (Eq'n 13)

or

    ln n_{i} + 1/(2n_{i}) = (C − α) − βE_{i}    (Eq'n 14)

Taking the anti-logarithm of that equation gives us

    n_{i} e^{1/(2n_{i})} = A e^{−βE_{i}}    (Eq'n 15)

in which the amplitude A=exp(C-α),
which, like β,
remains to be determined. That statement reflects the fact that the Lagrange
multipliers are not predetermined, but must be found by applying the resulting
equation to known circumstances and deriving their values from the results. We
also note that for values of n_{i} greater than about one thousand the factor exp(1/(2n_{i})) differs from unity by so small an amount that we may take it as equal to one in all but the most minuscule of particle systems. So
we end up with

    n_{i} = A e^{−βE_{i}}    (Eq'n 16)

which describes the Maxwell-Boltzmann distribution.
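A brute-force illustration makes the result concrete. The numbers below are assumed purely for illustration (N = 20 particles on four equally spaced levels E_i = 0, 1, 2, 3 with total energy U = 20): among all occupation sets obeying Equations 1 and 2, the one that maximizes W of Equation 3 already falls off with energy the way Equation 16 predicts.

```python
from math import factorial
from itertools import product

# Assumed toy macrostate: N = 20, U = 20, levels E_i = 0, 1, 2, 3.
N, U = 20, 20
levels = [0, 1, 2, 3]

def W_of(ns):
    # Equation 3: W = N!/Π n_i!
    w = factorial(N)
    for n_i in ns:
        w //= factorial(n_i)
    return w

# Enumerate every occupation set satisfying Equations 1 and 2, keep the one
# with the largest W.
candidates = (ns for ns in product(range(N + 1), repeat=len(levels))
              if sum(ns) == N
              and sum(n_i * E_i for n_i, E_i in zip(ns, levels)) == U)
best = max(candidates, key=W_of)
print(best)  # occupation numbers decrease monotonically with energy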

Summing Equation 16 in accordance with Equation 1 and then performing a simple division gives us

    n_{i} = (N/Z) e^{−βE_{i}}    (Eq'n 17)

in which

    Z = Σ_i e^{−βE_{i}}    (Eq'n 18)

represents the partition function of the system. In accordance with Equation 2 we have

    U = Σ_i n_{i}E_{i} = (N/Z) Σ_i E_{i} e^{−βE_{i}} = −N d(ln Z)/dβ    (Eq'n 19)

Dividing that equation by N gives us the average energy per particle,

    Ē = U/N = −d(ln Z)/dβ    (Eq'n 20)

If we multiply that equation by dβ, add β dĒ to both sides of the equality sign to complete the differential on the left, and then integrate, we get

    βĒ + ln Z = ∫ β dĒ    (Eq'n 21)

A little algebraic rearrangement of that equation gives us

    S = Nk(ln Z + βĒ)    (Eq'n 22)

which we recognize as the equation that relates the system's partition function to the system's entropy. In light of that recognition we see that dĒ must represent heat entering or leaving the system and not mechanical work.
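The derivative identity in Equation 20 can be verified numerically. The four-level spectrum and the value of β below are assumed only for the check:

```python
from math import exp, log

# Assumed toy spectrum and inverse temperature (arbitrary units).
E = [0.0, 1.0, 2.0, 3.0]
beta = 1.2

def ln_Z(b):
    # Equation 18: Z = Σ exp(−βE_i), taken as a function of β.
    return log(sum(exp(-b * E_i) for E_i in E))

# Mean energy computed directly as a Boltzmann-weighted average.
Z = sum(exp(-beta * E_i) for E_i in E)
mean_E = sum(E_i * exp(-beta * E_i) for E_i in E) / Z

# Equation 20: Ē = −d(ln Z)/dβ, here via a central finite difference.
h = 1e-6
mean_E_from_Z = -(ln_Z(beta + h) - ln_Z(beta - h)) / (2 * h)

print(mean_E, mean_E_from_Z)  # the two values agree closely
```

The direct Boltzmann-weighted average and the logarithmic derivative of the partition function give the same mean energy, as Equation 20 asserts.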

Looked at a different way we have a Maxwellian system with a probability of a particle occupying the i-th state of

    P_{i} = n_{i}/N = e^{−βE_{i}}/Z    (Eq'n 23)

In that system we have the mean energy as

    Ē = Σ_i P_{i}E_{i}    (Eq'n 24)

In a general quasi-static process the mean energy changes by the minuscule amount

    dĒ = Σ_i E_{i} dP_{i} + Σ_i P_{i} dE_{i} = đQ + đW    (Eq'n 25)

In that equation đQ represents the heat absorbed by the system in accordance with

    đQ = Σ_i E_{i} dP_{i}    (Eq'n 26)

and đW represents the mechanical work done on the system in accordance with

    đW = Σ_i P_{i} dE_{i}    (Eq'n 27)

Note that absorbing heat does not change the energy of a state, but changes the probability of a state being occupied, while doing mechanical work on the system does change the energies in its states. Now consider the basic description of the entropy of the system:

    S = Nk(βĒ + ln Z) = −Nk Σ_i P_{i} ln P_{i}    (Eq'n 28)

And that simply expresses Boltzmann's H-theorem. In that equation I have exploited the fact that, in accordance with Equation 23,

    ln P_{i} = −βE_{i} − ln Z    (Eq'n 29)

and, in accordance with the definition of probability,

    Σ_i P_{i} = 1    (Eq'n 30)
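The two forms of the entropy joined in Equation 28 can be compared numerically. The spectrum, β, and k = 1 below are assumed only for the check:

```python
from math import exp, log

# Assumed toy spectrum and inverse temperature, with k = 1 for simplicity.
E = [0.0, 0.5, 1.3, 2.0]
beta = 0.8

# Equation 23: P_i = exp(−βE_i)/Z.
Z = sum(exp(-beta * E_i) for E_i in E)
P = [exp(-beta * E_i) / Z for E_i in E]

S_sum = -sum(p * log(p) for p in P)            # −Σ P_i ln P_i
mean_E = sum(p * E_i for p, E_i in zip(P, E))  # Ē = Σ P_i E_i (Equation 24)
S_Z = log(Z) + beta * mean_E                   # ln Z + βĒ

print(S_sum, S_Z)  # equal to rounding error
```

The sum −Σ P_i ln P_i and the partition-function form ln Z + βĒ agree exactly, which is just Equations 29 and 30 doing their work.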

But we have an alternate route that we can follow to that conclusion.

When we put the particles into the system they don't necessarily conform to the description in Equation 16. In most cases we will then have dW/W ≠ 0. In light of that statement we integrate Equation 10 and get

    ln W = −Σ_i n_{i} ln n_{i} + constant    (Eq'n 31)

Because the Maxwell-Boltzmann distribution represents the maximum value of W, we know that the system must evolve by W (and therefore lnW) increasing and never decreasing spontaneously, so we have as necessarily true to Reality

    d(ln W) ≥ 0    (Eq'n 32)

That simple little equation expresses Ludwig Boltzmann's H-theorem, which is equivalent to the law of entropy.
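The content of that inequality can be illustrated with assumed toy numbers: take the Maxwell-Boltzmann distribution over levels E_i = 0, 1, 2, 3 and perturb it along the direction (1, −1, −1, 1), which preserves both Σ P_i and Σ P_i E_i. Every such perturbed state has a larger H = Σ P_i ln P_i, so relaxing back toward the Maxwell-Boltzmann distribution can only lower H (and raise ln W):

```python
from math import exp, log

# Assumed toy spectrum and inverse temperature.
E = [0.0, 1.0, 2.0, 3.0]
beta = 0.9

Z = sum(exp(-beta * E_i) for E_i in E)
P_eq = [exp(-beta * E_i) / Z for E_i in E]  # Maxwell-Boltzmann probabilities

def H(P):
    # Boltzmann's H function: Σ P_i ln P_i.
    return sum(p * log(p) for p in P)

# A perturbation that leaves Σ P_i and Σ P_i E_i unchanged.
d = [1.0, -1.0, -1.0, 1.0]
for eps in (0.01, 0.05, 0.1):
    P = [p + eps * d_i for p, d_i in zip(P_eq, d)]
    print(eps, H(P) - H(P_eq))  # positive in every case
```

Every constraint-preserving displacement away from equilibrium raises H, so the equilibrium distribution sits at the constrained minimum of H, exactly as the maximization of W requires.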


Appendix A:

Counting States

We want to use an abstract representation of
a thermodynamic system as a means of analyzing a dynamics based on probabilities
involving very large numbers of particles. To that end we imagine that our
system consists of N boxes, each of which can hold only one particle, with n_{1}
particles of type-1 and n_{2} particles of type-2 such that n_{1}+n_{2}=N.
When we have put all of the particles into all of the boxes we have what we call
the macrostate of the system. We also have a microstate, which consists of the
specific arrangement of the particles in the boxes (e.g. a type-1 in Box #1, a
type-1 in Box #2, a type-2 in Box #3, and so on).

How many different microstates correspond to a given macrostate? Answering that question actually helps us to determine a description of the macrostate that a system freed from certain constraints will evolve into.

If we have N particles that are all
distinguishable from each other, we have N different ways to put the first
particle into one of N boxes, N-1 ways to put the second particle into a box,
and so on. We thus count N! different ways to put the particles into the boxes
under the restriction that only one particle goes into each box. If we remove
the distinguishing characteristics from the particles, making them
indistinguishable from each other, then we have only one way to put them into
the boxes, so if we assume that the particles are distinguishable when they are
not, we overcount the number of different ways of putting them into the boxes by
a factor of N!. That fact means that for a given macrostate the difference
between distinguishable and indistinguishable particles of type-1 and type-2 is
a factor of n_{1}!n_{2}!, the difference in the number of
different ways we can make the macrostate with distinguishable and
indistinguishable particles. If we multiply that factor by the total number of
different microstates W that correspond to the given macrostate we should obtain
a number equal to the factor by which we overcount the ways of creating that
macrostate; that is, we have

    W n_{1}! n_{2}! = N!    (Eq'n A-1)

which gives us

    W = N!/(n_{1}! n_{2}!)    (Eq'n A-2)

We extend that result readily to systems consisting of more than two types of particles to get

    W = N!/Π_i n_{i}!    (Eq'n A-3)

But Reality doesn't make distinguishable particles. If you've seen one electron or oxygen atom, you have as good as seen them all. So referring to distinguishable particles seems like a not-terribly-legitimate way to deduce Equation A-2. Can we deduce that equation without referring to distinguishable particles?

How many different ways can we put n_{1}
particles into N boxes? We have N ways to place the first particle, N-1 ways to
place the second particle, and so on to N-(n_{1}-1) ways to place the
last particle. Thus we count N!/(N-n_{1})! different ways to place the
particles. But again we have overcounted the microstates. For example, we put
the first particle in Box #1 and the second particle in Box #2 and so on and
call that one microstate and then we put the first particle into Box #2 and the
second particle into Box #1 and so on and count it as a separate microstate,
even though it's identical to the
first one when we finish placing the particles.

Distinguishability amounts to the order in which the particles go into their boxes. For the type-1 particles in a
given microstate we have n_{1} boxes that can take the first particle.
For each of those we have n_{1}-1 boxes that can take the second
particle. For each of those n_{1}(n_{1}-1) possibilities we have
n_{1}-2 boxes that can take the third particle. Thus we have n_{1}!
different ways to put the type-1 particles into their boxes for any given set of
boxes into which only the type-1 particles go. But since the end result does not
depend upon the order in which the particles went into the boxes, we have
overcounted the ways to create the given microstate. Again, the total number of
microstates that comprise a given macrostate equals the factorial of N divided
by the factorials of all of the n_{i} in the system.
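The overcounting argument above can be checked by brute force for an assumed small case (N = 6 boxes and n_1 = 3 type-1 particles, my own illustrative numbers): counting ordered placements gives N!/(N − n_1)!, counting distinct sets of occupied boxes gives the binomial coefficient, and the two differ by exactly n_1!.

```python
from math import factorial
from itertools import combinations

# Assumed small case: N = 6 boxes, n_1 = 3 type-1 particles.
N, n1 = 6, 3

# Ordered placements: N ways for the first particle, N-1 for the second, ...
ordered = factorial(N) // factorial(N - n1)

# Unordered: the distinct sets of boxes the type-1 particles can occupy.
unordered = len(list(combinations(range(N), n1)))

print(ordered, unordered, ordered // unordered)  # 120 20 6, and 6 = 3!
```

The ratio of the two counts is 3! = 6, the overcounting factor the paragraph describes.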


Appendix B:

The Method of Lagrange Multipliers

Sometimes when we work out the laws of
physics or their consequences, especially when they involve probabilities, we
need to describe a minimum or maximum value of some function f(q_{i}) of
a set of spatial or space-like coordinates q_{i}. We know from
elementary calculus that the function has an extremum at the point q_{i}
when

    df = Σ_i (∂f/∂q_{i}) dq_{i} = 0    (Eq'n B-1)

If the coordinates are independent of each other, then we can solve that equation, term by term, in a straightforward way. But if the independence of the coordinates is negated by some constraint or side condition on the system, then we must take a more complicated route to the solution.

We usually have the constraint manifested in
a function of the coordinates (φ=φ(q_{i}))
that makes a valid value of one coordinate dependent on the values of the other
coordinates. Some constraints have the form of conservation laws; that is,
statements that some properties of the system (e.g. number of particles, total
energy, etc.) remain unchanged by any change in the system (φ=constant).
Other constraints appear to augment the original equation. We can imagine f(q_{i})
describing a surface in the abstract q-space and
φ(q_{i})
describing a second surface that intersects the first along some curve. We then
seek to find the point where that curve reaches an extremum. In either case we
have

    dφ = Σ_i (∂φ/∂q_{i}) dq_{i} = 0    (Eq'n B-2)

Look again at Equation B-1. If the variables
are all independent of each other, then we can let dq_{i}=0 for all the
differentials except one, dq_{j}, so that we have

    ∂f/∂q_{j} = 0    (Eq'n B-3)

But our choice of j was arbitrary, so that equation stands true to mathematics for all of the terms in Equation B-1. But if we have a constraint φ on the function f, then the variables are not independent; they are inter-related through φ. To render Equation B-1 soluble within the restrictive condition we introduce an undetermined parameter λ (the Lagrange multiplier), multiply it onto Equation B-2, and add the result to Equation B-1, so that we have

    Σ_i [∂f/∂q_{i} + λ(∂φ/∂q_{i})] dq_{i} = 0    (Eq'n B-4)

Now we can treat that expression as if all of the differentials
dq_{i} are mutually independent and we have

    ∂f/∂q_{i} + λ(∂φ/∂q_{i}) = 0    (Eq'n B-5)

for all values of the index.
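A minimal worked instance of Equations B-1 through B-5 (my own illustration, not from the text): maximize f(x, y) = x + y subject to φ(x, y) = x² + y² = 1. Equation B-5 demands 1 + 2λx = 0 and likewise for y, so x = y, and the constraint then fixes x = y = 1/√2 at the maximum.

```python
from math import sqrt, cos, sin, pi

# Lagrange solution of: maximize f = x + y subject to x² + y² = 1.
x = y = 1 / sqrt(2)
lam = -1 / (2 * x)

residual_x = 1 + 2 * lam * x  # Equation B-5, x component: should vanish
residual_y = 1 + 2 * lam * y  # Equation B-5, y component: should vanish

# Direct check: walk the constraint circle and find the maximum of f.
f_max = max(cos(t) + sin(t) for t in (2 * pi * i / 100000 for i in range(100000)))
print(x + y, f_max)  # both close to √2
```

The Lagrange conditions vanish at the solution, and a direct scan of the constraint circle returns the same maximum value √2.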

As an example let's determine the description of the Maxwell distribution of the particles in a gas over all available velocities. To avoid complications due to internal molecular motions we feign using a monatomic gas, such as helium or neon, in our imaginary experiment. We have N particles inside a container held at a certain temperature and we want to know what proportion of those particles move at the speed V within the range dV; that is, we want to determine the form of some function f(V) that describes the probability of a particle existing in that speed range in the gas.

As Maxwell noted, the components of the particles' velocity vectors conform to their own distribution functions, which must have the same form as f(V), so we have f(u), f(v), and f(w) for the three mutually orthogonal dimensions of our coordinate grid. Further, the values of u, v, and w applied to a particle are independent of each other, so their joint probability equals the product of the individual probabilities, which gives us

    f(V) = f(u) f(v) f(w)    (Eq'n B-6)

In working out the form of that function in more detail we have the constraint

    u² + v² + w² = V²    (Eq'n B-7)

because we want to work out f(V) as a function of the velocity components for some given value of V. The isotropy of f(V) in velocity space also makes the function a constant with respect to the partial velocity derivatives, so that we have

    df(V) = (∂f(V)/∂u) du + (∂f(V)/∂v) dv + (∂f(V)/∂w) dw = 0    (Eq'n B-8)

Equation B-6 lets us rewrite that equation as

    f′(u) f(v) f(w) du + f(u) f′(v) f(w) dv + f(u) f(v) f′(w) dw = 0    (Eq'n B-9)

Differentiating Equation B-7 and dividing the result by two gives us our side condition

    u du + v dv + w dw = 0    (Eq'n B-10)

We multiply that equation by the Lagrange multiplier λ and by f(V) and add the result to Equation B-9, thereby obtaining a sum of three terms, each of which has the form of the first,

    [f′(u) + λu f(u)] f(v) f(w) du

and all of which add together to equal zero. Because the values of the velocity-component differentials are arbitrary, each of those terms must equal zero separately, so we have

    f′(u) + λu f(u) = 0    (Eq'n B-11)

with similar equations for the v and w parts of the function. Integrating that equation gives us the basic solution

    ln f(u) = −λu²/2 + ln A    (Eq'n B-12)

in which lnA represents the constant of integration. Because the solution must be isotropic, that constant of integration must be the same for all three component solutions. Thus we have

    f(V) = A³ e^{−λV²/2}    (Eq'n B-13)

Applying that equation to a full description of the gas lets us determine the values of A and λ, so we get at last

    f(V) = (m/(2πkT))^{3/2} e^{−mV²/(2kT)}    (Eq'n B-14)

in which m represents the mass of a particle in the gas and T represents the absolute temperature of the gas. With that equation we can work out the full Maxwellian description of a simple monatomic gas.
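Two standard properties of that distribution can be checked by direct numerical integration. In the assumed reduced units m = kT = 1 (my choice for the check): integrating f(V) over all of velocity space, with the weight 4πV² dV that converts the density to a speed distribution, returns 1, and the mean kinetic energy ½m⟨V²⟩ comes out to (3/2)kT.

```python
from math import exp, pi

# Assumed reduced units for the check: m = kT = 1.
m = kT = 1.0

def f(V):
    # Equation B-14: the Maxwell velocity-space density.
    return (m / (2 * pi * kT)) ** 1.5 * exp(-m * V**2 / (2 * kT))

# Simple rectangle-rule integration over speeds, out to V = 20.
dV = 1e-4
Vs = [i * dV for i in range(1, 200000)]
norm = sum(f(V) * 4 * pi * V**2 * dV for V in Vs)
mean_KE = sum(0.5 * m * V**2 * f(V) * 4 * pi * V**2 * dV for V in Vs)

print(norm, mean_KE)  # close to 1 and 1.5
```

Both checks land where they should: the distribution is normalized, and the average kinetic energy per particle equals (3/2)kT, the familiar equipartition result for a monatomic gas.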
