Statistical Thermodynamics and Rate Theories/Boltzmann Distribution

Developing the Canonical Ensemble

The first step is to develop a model of how the energy is distributed. The model used here is the canonical ensemble, in which each system has a constant number of particles, volume and temperature. A canonical ensemble can be created by first taking a microcanonical ensemble of a large number of systems, ${\mathcal {A}}$ , all at the same number of particles, volume and energy. The entire microcanonical ensemble is then immersed in a heat bath, and the systems are allowed to exchange energy with the heat bath until they all come to thermal equilibrium. The ensemble is then removed from the heat bath, so that each system can only exchange energy with the other systems around it. The systems will now have a distribution over all possible total energies. These energy states can be numbered in order of increasing energy (i.e., the lowest energy is $E_{1}$ ). Since any number of systems can occupy each of these energy states, the number of systems with total energy $E_{i}$ is defined as the occupancy of state $i$, denoted $a_{i}$ . The entire ensemble can then be described by the occupancies of the energy states, where the total number of systems, ${\mathcal {A}}$ , is:

\Sigma _{i}a_{i}={\mathcal {A}}

and the total energy of the entire ensemble, $\varepsilon$ , is:

\Sigma _{i}a_{i}E_{i}=\varepsilon
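As a small illustration of these two sums, the sketch below tallies ${\mathcal {A}}$ and $\varepsilon$ for a made-up set of occupancies. The energies and occupancies are arbitrary illustrative numbers and the function name is hypothetical; none of them come from the text.

```python
# Hypothetical state energies E_i (arbitrary units) and occupancies a_i,
# chosen only to illustrate the two sums.
energies = [0.0, 1.0, 2.0, 3.0]   # E_1 ... E_4, in increasing order
occupancies = [5, 3, 1, 1]        # a_1 ... a_4


def ensemble_totals(occupancies, energies):
    """Return (A, epsilon): the number of systems and the total ensemble energy."""
    total_systems = sum(occupancies)
    total_energy = sum(a * e for a, e in zip(occupancies, energies))
    return total_systems, total_energy


A, eps = ensemble_totals(occupancies, energies)
print(A, eps)
```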

The Weight of a Configuration

The occupancy of each energy state depends on the number of ways the total energy of the ensemble can be distributed amongst the systems. Each distribution is described by a set of occupancies of the energy states, called a configuration. Some configurations are mathematically more likely than others.

The weight of a configuration is the number of ways the systems can be arranged to produce it. By combinatorial mathematics, the weight of a configuration with occupancies $a_{1},a_{2},a_{3},...$ in an ensemble containing ${\mathcal {A}}$ systems is:

W(a_{1},a_{2},a_{3},...)={\frac {{\mathcal {A}}!}{a_{1}!a_{2}!a_{3}!...}}={\frac {{\mathcal {A}}!}{\Pi _{i}a_{i}!}}
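The weight formula can be evaluated directly for small ensembles. The following is a minimal sketch using exact integer arithmetic; the function name and the example occupancies are illustrative assumptions.

```python
from math import factorial


def weight(occupancies):
    """W = A! / (a_1! a_2! ...), computed with exact integer arithmetic."""
    total = sum(occupancies)
    w = factorial(total)
    for a in occupancies:
        w //= factorial(a)  # each division is exact for a multinomial coefficient
    return w


print(weight([4, 0, 0]))  # all four systems in one state
print(weight([2, 1, 1]))  # four systems spread over three states
```

Spreading the systems over more states increases the weight, which is why concentrated configurations are comparatively unlikely.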

System Weighting

Consider an ensemble of ${\mathcal {A}}$ systems divided into two states. The occupancy of the first state is $N$ and therefore the occupancy of the second state is ${\mathcal {A}}-N$ . The weight of this configuration as a function of $N$ is:

W(N)={\frac {{\mathcal {A}}!}{N!({\mathcal {A}}-N)!}}

It can be shown that as ${\mathcal {A}}$ increases, the most probable configuration ( $N={\mathcal {A}}/2$ ) becomes much more heavily weighted than any other configuration.
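This sharpening can be checked numerically. The sketch below compares the weight of a 40/60 split with the weight of the even split for increasing ensemble sizes; the 40% fraction and the function names are arbitrary choices made for illustration. Logarithms are used to avoid overflowing factorials.

```python
from math import exp, lgamma


def ln_weight(total, n):
    """ln W(N) = ln A! - ln N! - ln (A - N)!, using lgamma(x + 1) = ln x!."""
    return lgamma(total + 1) - lgamma(n + 1) - lgamma(total - n + 1)


def relative_weight(total, fraction):
    """W(fraction * A) / W(A / 2): weight relative to the even split."""
    n = round(fraction * total)
    return exp(ln_weight(total, n) - ln_weight(total, total // 2))


for total in (10, 100, 1000, 10000):
    print(total, relative_weight(total, 0.4))
```

The ratio falls rapidly toward zero as the ensemble grows, so configurations even modestly away from the even split carry a negligible share of the total weight.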

The Most Probable Distribution

To describe a canonical ensemble, only the most probable set of occupancies, denoted a*, needs to be determined. For a very large ${\mathcal {A}}$ , the configuration with the largest weight is weighted far more heavily than all other configurations. This set of occupancies is therefore the most probable, and any configuration with a significant weight will be close to a*.

However, for any given ensemble, not all occupancies are possible. This is due to the imposition of constraints upon the occupancies by the specific ensemble. In a canonical ensemble there are two constraints which were previously used to define a canonical system:

The occupancy-weighted sum of the state energies must equal the total energy of the ensemble, denoted as $\varepsilon$ : $\Sigma _{i}a_{i}E_{i}=\varepsilon$
The sum of all of the occupancies must be equal to ${\mathcal {A}}$ : $\Sigma _{i}a_{i}={\mathcal {A}}$

Therefore, we must find the a* that satisfies both constraints of the canonical ensemble.

Calculating the Most Probable Occupancy

To calculate the most probable occupancy, we must find the occupancies that give the maximum weight. Lagrange multipliers are used to maximize the weight subject to the two constraints.

Mathematical Manipulation of W

It was previously given that $W(a_{1},a_{2},a_{3},...)={\frac {{\mathcal {A}}!}{a_{1}!a_{2}!a_{3}!...}}={\frac {{\mathcal {A}}!}{\Pi _{i}a_{i}!}}$ . However, this expression is difficult to manipulate. For a positive function, the maximum occurs at the same location as the maximum of its natural logarithm, because the logarithm is monotonically increasing. Since only the location of the maximum is of interest, we can maximize $\ln {W}$ instead, which makes the mathematics easier. The equation becomes:

\ln {W}(a_{1},a_{2},a_{3},...)=\ln {\left({\frac {{\mathcal {A}}!}{\Pi _{i}a_{i}!}}\right)}

=\ln {{\mathcal {A}}!}-\ln {\Pi _{i}a_{i}!}

=\ln {{\mathcal {A}}!}-\Sigma _{i}\ln {a_{i}!}
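The practical benefit of working with $\ln {W}$ can be seen numerically. The sketch below evaluates the last line above with the log-gamma function, using the identity $\ln n!=\ln \Gamma (n+1)$. The function name and the occupancies are illustrative assumptions.

```python
from math import factorial, lgamma, log


def ln_weight(occupancies):
    """ln W = ln A! - sum_i ln a_i!, using lgamma(n + 1) = ln n!."""
    total = sum(occupancies)
    return lgamma(total + 1) - sum(lgamma(a + 1) for a in occupancies)


# Small case: agrees with the logarithm of the directly computed weight.
small = [2, 1, 1]
direct = factorial(4) // (factorial(2) * factorial(1) * factorial(1))
print(ln_weight(small), log(direct))

# Large case: W itself would be astronomically large, but ln W stays manageable.
print(ln_weight([600_000, 300_000, 100_000]))
```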

Applying the Math to a Canonical Ensemble

Using the Lagrange multipliers as previously described, we can determine the most probable occupancy of the canonical system. When Lagrange multipliers are applied to the $\ln {W}$ equation, following the aforementioned constraints, the new equation becomes:

{\frac {\partial }{\partial a_{j}}}\left[\ln W(a_{1},a_{2},a_{3}...)-{\alpha }[({\Sigma _{i}a_{i}})-{\mathcal {A}}]-{\beta }[({\Sigma _{i}a_{i}E_{i}})-{\varepsilon }]\right]=0

Next, we simplify each of the three terms in the equation and solve for $a_{j}$ . The unknown multipliers $\alpha$ and $\beta$ are determined afterwards.

Term 1

{\frac {\partial }{\partial a_{j}}}\ln W(a_{1},a_{2},a_{3}...)={\frac {\partial }{\partial a_{j}}}\left[\ln {{\mathcal {A}}!}-\Sigma _{i}\ln {a_{i}!}\right]=-{\frac {\partial }{\partial a_{j}}}\ln {a_{j}!}

Using Stirling’s approximation, $\ln {a_{j}!}\approx a_{j}\ln {a_{j}}-a_{j}$ :

-{\frac {\partial }{\partial a_{j}}}\ln {(a_{j}!)}=-{\frac {\partial }{\partial a_{j}}}[{a_{j}}\ln {(a_{j})}-{a_{j}}]={-a_{j}}{\frac {1}{a_{j}}}-\ln {(a_{j})}+{1}=-\ln {(a_{j})}
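A quick numerical check shows how good this approximation is. The sketch below compares the exact $\ln a!$ (from the log-gamma function) with $a\ln a-a$, and compares a finite-difference slope of the exact $\ln a!$ with $\ln a$. The function names and sample values of $a$ are assumptions made for illustration.

```python
from math import lgamma, log


def ln_factorial(a):
    """Exact ln a! via the log-gamma function."""
    return lgamma(a + 1)


def stirling(a):
    """Stirling's approximation: ln a! ~ a ln a - a."""
    return a * log(a) - a


def slope_of_ln_factorial(a):
    """Central finite-difference estimate of d(ln a!)/da at integer a."""
    return (ln_factorial(a + 1) - ln_factorial(a - 1)) / 2


for a in (10, 1_000, 100_000):
    rel_err = (ln_factorial(a) - stirling(a)) / ln_factorial(a)
    print(a, rel_err, slope_of_ln_factorial(a), log(a))
```

Both the relative error of the approximation and the gap between the slope and $\ln a$ shrink as $a$ grows, which is the regime of interest for a large ensemble.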

Term 2

{\frac {\partial }{\partial a_{j}}}\alpha [(\Sigma _{i}a_{i})-{\mathcal {A}}]

In ${\Sigma _{i}a_{i}}$ , only the term with i = j depends on $a_{j}$ ; the derivatives of all other terms, and of the constant ${\mathcal {A}}$ , are zero. Therefore, this term becomes:

{\frac {\partial }{\partial a_{j}}}{\alpha }{a_{j}}={\alpha }

Term 3

{\frac {\partial }{\partial a_{j}}}{\beta }[({\Sigma _{i}a_{i}E_{i}})-{\varepsilon }]

Like term 2, in ${\Sigma _{i}a_{i}E_{i}}$ only the term with i = j depends on $a_{j}$ ; the derivatives of all other terms, and of the constant ${\varepsilon }$ , are zero. Therefore, this term becomes:

{\frac {\partial }{\partial a_{j}}}\beta {E_{j}a_{j}}=\beta {E_{j}}

Now that the three terms have been simplified, we can combine them into one equation to solve. This gives us:

-\ln {(a_{j})}-{\alpha }-{\beta }{E_{j}}=0

Solving for ${a_{j}}$ gives:

{a_{j}}={e^{-\alpha }}{e^{-{\beta }{E_{j}}}}
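The exponential form can be checked by brute force on a small model. The sketch below assumes a hypothetical ensemble with three equally spaced levels (energies 0, 1 and 2 in arbitrary units, labelled 0 to 2 for convenience), enumerates every configuration that satisfies both constraints, and picks the one with the largest weight. The ensemble size and total energy are arbitrary illustrative choices.

```python
from math import lgamma


def ln_weight(occupancies):
    """ln W = ln A! - sum_i ln a_i!, using lgamma(n + 1) = ln n!."""
    total = sum(occupancies)
    return lgamma(total + 1) - sum(lgamma(a + 1) for a in occupancies)


def most_probable(total_systems, total_energy):
    """Search all (a_0, a_1, a_2) for levels E = 0, 1, 2 that satisfy both
    constraints and return the configuration with the largest weight."""
    best, best_lnw = None, float("-inf")
    for a2 in range(total_energy // 2 + 1):
        a1 = total_energy - 2 * a2       # energy constraint
        a0 = total_systems - a1 - a2     # number constraint
        if a0 < 0:
            continue
        lnw = ln_weight((a0, a1, a2))
        if lnw > best_lnw:
            best, best_lnw = (a0, a1, a2), lnw
    return best


a0, a1, a2 = most_probable(3000, 2000)
# For equally spaced levels, a_j = e^{-alpha} e^{-beta E_j} implies
# a_1 / a_0 = a_2 / a_1, so this ratio of ratios should be close to 1.
print((a0, a1, a2), (a1 / a0) / (a2 / a1))
```

The agreement is approximate because the occupancies are integers and Stirling's approximation is only exact in the limit of large occupancies.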

Determining $\alpha$ and $\beta$

The next step of the derivation is to determine the constants $\alpha$ and $\beta$ .

We can use one of the constraints to determine $\alpha$ .

a_{i}={e^{-\alpha }}{e^{-\beta {E_{i}}}}

\Sigma _{i}a_{i}={\mathcal {A}}

{\Sigma _{i}}{e^{-\alpha }}{e^{{-\beta }{E_{i}}}}={\mathcal {A}}

{e^{-\alpha }}{\Sigma _{i}}{e^{{-\beta }{E_{i}}}}={\mathcal {A}}

{e^{\alpha }}={\frac {1}{\mathcal {A}}}{\Sigma _{i}}{e^{{-\beta }{E_{i}}}}

Using this equation and the constraint, we can define the probability of finding a system in state $i$ as the occupancy of state $i$ divided by the total number of systems in the ensemble, which can be represented by the following equation.

{P_{i}}={\frac {a_{i}}{\mathcal {A}}}={\frac {e^{{-\beta }{E_{i}}}}{{\Sigma _{j}}{e^{{-\beta }{E_{j}}}}}}

Determining $\beta$ requires connecting it to classical thermodynamics. The average of a variable, $\langle M\rangle$ , over the states of a system is the sum of its value in each state weighted by the probability of that state:

\langle M\rangle =\Sigma _{i}{M_{i}}{P_{i}}

The average energy is

\langle E\rangle ={\frac {{\Sigma _{j}}{E_{j}}{e^{{-\beta }{E_{j}}}}}{{\Sigma _{j}}{e^{{-\beta }{E_{j}}}}}}
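The average energy formula translates directly into code. The following is a minimal sketch for a hypothetical two-level system with energies 0 and 1 in arbitrary units, so $\beta$ is in the corresponding inverse units; the function name and the sample values of $\beta$ are assumptions.

```python
from math import exp


def average_energy(energies, beta):
    """<E> = sum_j E_j e^{-beta E_j} / sum_j e^{-beta E_j}."""
    weights = [exp(-beta * e) for e in energies]
    return sum(e * w for e, w in zip(energies, weights)) / sum(weights)


levels = [0.0, 1.0]  # hypothetical two-level system, arbitrary energy units
for beta in (10.0, 1.0, 0.01):
    print(beta, average_energy(levels, beta))
```

For large $\beta$ the average approaches the ground-state energy, and for small $\beta$ it approaches the plain mean of the two levels.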

The average pressure is

\langle p\rangle ={\frac {{\Sigma _{j}}\left[-\left({\frac {\partial E_{j}}{\partial V}}\right)_{N}\right]{e^{{-\beta }{E_{j}}}}}{{\Sigma _{j}}{e^{{-\beta }{E_{j}}}}}}

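Differentiating $\langle E\rangle$ with respect to $V$ at constant $\beta$ , differentiating $\langle p\rangle$ with respect to $\beta$ at constant $V$ , and combining the two results gives the first equation below, which can be compared with the classical thermodynamic equation that follows it.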

\left({\frac {\partial \langle E\rangle }{\partial V}}\right)_{N,{\beta }}+{\beta }\left({\frac {\partial \langle p\rangle }{\partial {\beta }}}\right)_{N,V}=-\langle p\rangle
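The statistical relation above holds for any set of volume-dependent levels, and it can be verified numerically. The sketch below assumes a hypothetical model in which $E_{j}(V)=j^{2}V^{-2/3}$ (arbitrary units, loosely modelled on a particle in a box) and evaluates both derivatives by central finite differences. The level formula, the number of levels and the step size are all illustrative assumptions.

```python
from math import exp

LEVELS = range(1, 40)  # hypothetical quantum numbers j


def energy(j, volume):
    """Model level E_j(V) = j^2 V^(-2/3), arbitrary units (an assumption)."""
    return j * j * volume ** (-2.0 / 3.0)


def pressure_of_level(j, volume):
    """p_j = -(dE_j/dV)_N for the model level above."""
    return (2.0 / 3.0) * j * j * volume ** (-5.0 / 3.0)


def average(quantity, beta, volume):
    """Boltzmann-weighted average of quantity(j, V) over the levels."""
    weights = [exp(-beta * energy(j, volume)) for j in LEVELS]
    values = [quantity(j, volume) for j in LEVELS]
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def identity_residual(beta, volume, h=1e-5):
    """Left side minus right side of the relation, by central differences."""
    dE_dV = (average(energy, beta, volume + h)
             - average(energy, beta, volume - h)) / (2 * h)
    dp_dbeta = (average(pressure_of_level, beta + h, volume)
                - average(pressure_of_level, beta - h, volume)) / (2 * h)
    return dE_dV + beta * dp_dbeta + average(pressure_of_level, beta, volume)


print(identity_residual(beta=1.0, volume=1.0))
```

A residual close to zero, limited only by the finite-difference error, indicates that the relation is satisfied for this model.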

\left({\frac {\partial U}{\partial V}}\right)_{T,N}-{T}\left({\frac {\partial p}{\partial T}}\right)_{N,V}=-{p}

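To compare the equations more directly, we can rewrite the temperature derivative in terms of ${\frac {1}{T}}$ . Since $\partial /\partial T=-{\frac {1}{T^{2}}}\,\partial /\partial ({\frac {1}{T}})$ , the term $-T\left(\partial p/\partial T\right)_{N,V}$ becomes $+{\frac {1}{T}}\left(\partial p/\partial {\frac {1}{T}}\right)_{N,V}$ , so that the signs in front of ${\beta }$ and ${\frac {1}{T}}$ are the same: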

\left({\frac {\partial \langle E\rangle }{\partial V}}\right)_{N,{\beta }}+{\beta }\left({\frac {\partial \langle p\rangle }{\partial {\beta }}}\right)_{N,V}=-{\langle p\rangle }

\left({\frac {\partial U}{\partial V}}\right)_{T,N}+{\frac {1}{T}}\left({\frac {\partial p}{\partial {\frac {1}{T}}}}\right)_{N,V}=-p

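The two equations have the same form if ${\beta }$ is replaced by ${\frac {1}{T}}$ . Since multiplying ${\beta }$ by a constant leaves the first equation unchanged, the comparison determines ${\beta }$ only up to a constant factor, so ${\beta }$ is proportional to ${\frac {1}{T}}$ .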

\beta \propto {\frac {1}{T}}

We can introduce ${k_{B}}$ as a proportionality constant to convert the proportionality to an equation as follows:

\beta ={\frac {1}{k_{B}T}}

The Boltzmann Distribution

Earlier we determined the probability of finding a system in state $i$ . Now that ${\beta }$ is known and ${\alpha }$ has been eliminated through normalization, we can complete the equation to obtain the Boltzmann distribution.

{P_{i}}={\frac {a_{i}}{\mathcal {A}}}={\frac {e^{-\beta E_{i}}}{\sum _{j}e^{-\beta E_{j}}}}={\frac {e^{-E_{i}/{k_{B}T}}}{\sum _{j}e^{-E_{j}/{k_{B}T}}}}
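The distribution can be evaluated for any list of state energies. The following is a minimal sketch, assuming energies in joules and temperature in kelvin; the function name and the three-level example are hypothetical. Shifting all energies by the lowest one is a numerical convenience that cancels in the ratio and is not part of the derivation.

```python
from math import exp

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)


def boltzmann_probabilities(energies, temperature):
    """P_i = exp(-E_i / k_B T) / sum_j exp(-E_j / k_B T).

    Energies are shifted by the lowest level before exponentiating; the
    shift cancels between numerator and denominator and avoids underflow.
    """
    beta = 1.0 / (K_B * temperature)
    e_min = min(energies)
    weights = [exp(-beta * (e - e_min)) for e in energies]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-level system, energies in joules.
probs = boltzmann_probabilities([0.0, 2.0e-21, 4.0e-21], 300.0)
print(probs, sum(probs))
```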

Example

Consider a system with two singly-degenerate energy levels separated by $10^{-21}$ J. Derive an equation for the probability of the system being in the ground state. Plot the probability of the system being in the ground state between temperatures 1 K and 1000 K.

${P_{i}}={\frac {a_{i}}{\mathcal {A}}}={\frac {e^{-\beta E_{i}}}{\sum _{j}e^{-\beta E_{j}}}}={\frac {e^{-E_{i} \over {k_{B}T}}}{\sum _{j}e^{-E_{j} \over {k_{B}T}}}}$

$k_{B}=1.380\times 10^{-23}\ \mathrm {J/K}$ is the Boltzmann constant in joules per kelvin. The ground state is taken as the zero of energy ( ${E_{0}=0}$ ). Since the ground-state energy is known and the gap between the ground state and the excited level is given ( $\Delta E=10^{-21}$ J), the energy of the excited level can be determined using the following equation:

${\Delta E=E_{1}-E_{0}}$

Substituting the values gives:

${10^{-21}\ \mathrm {J} =E_{1}-0}$

${E_{1}=10^{-21}\ \mathrm {J} }$

With only two levels, the sum in the denominator has just two terms, so the probability equation simplifies to the following:

${P_{i}}={\frac {e^{-E_{i} \over {k_{B}T}}}{e^{-E_{0} \over {k_{B}T}}+{e^{-E_{1} \over {k_{B}T}}}}}$

${P_{0}}={\frac {e^{-E_{0} \over {k_{B}T}}}{e^{-E_{0} \over {k_{B}T}}+{e^{-E_{1} \over {k_{B}T}}}}}$

After substituting the values determined into the equation, the probability equation becomes:

${P_{0}}={\frac {e^{0}}{e^{0}+e^{-{\frac {10^{-21}\ \mathrm {J} }{(1.380\times 10^{-23}\ \mathrm {J/K} )\times T}}}}}$

Since $e^{0}=1$ , the equation can be simplified to

$P_{0}={\frac {1}{1+e^{-{\frac {10^{-21}\ \mathrm {J} }{(1.380\times 10^{-23}\ \mathrm {J/K} )\times T}}}}}$

Note: The joules cancel in the exponent, so with $T$ in kelvin the probability has no units.

This equation gives the probability of finding the system in the ground state as a function of the temperature in kelvin.
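The values needed for the plot can be generated with a few lines of code. The sketch below evaluates $P_{0}$ at a handful of temperatures between 1 K and 1000 K, using the same constants as the worked example; the function name and the choice of sample temperatures are illustrative.

```python
from math import exp

K_B = 1.380e-23     # Boltzmann constant in J/K, as used in the worked example
DELTA_E = 1.0e-21   # energy gap E_1 - E_0 in J


def ground_state_probability(temperature):
    """P_0 = 1 / (1 + exp(-DELTA_E / (K_B * T))), with T in kelvin."""
    return 1.0 / (1.0 + exp(-DELTA_E / (K_B * temperature)))


for t in (1, 10, 50, 100, 300, 1000):
    print(f"{t:5d} K  P0 = {ground_state_probability(t):.3f}")
```

Evaluating the function on a finer temperature grid and passing the results to any plotting tool produces the requested curve.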

Figure: Zoomed-in graph of the ground-state probability as a function of temperature in kelvin.

The probability of finding the system in the ground state starts near one at low temperature and decreases as the temperature increases. The reason can be seen from the equation: the temperature appears in the denominator of the exponent, so as $T$ increases the exponent approaches zero, the exponential term approaches one, and $P_{0}$ approaches 0.5. The largest part of the decrease occurs below about 100 K; by 300 K the probability is about 0.56, and by 1000 K it is about 0.52. In other words, at high temperature the two states become almost equally populated, but the ground state never becomes less populated than the excited state.