# Précis of epistemology/Why is entropy real?

## The reality of thermodynamic entropy

Entropy is one of the most fundamental and important concepts for understanding matter in all its states. With it we can explain almost everything, without it almost nothing. Entropy can be assigned to the various fragments of matter as soon as very general conditions are met (being in thermal equilibrium or close to it), and it can generally be measured. From the point of view of empirical science and thermodynamic theory, entropy is a real magnitude: it describes real properties of matter. Thermodynamics courses call it a state function, meaning that it is determined by the actual state of the system. Entropy really exists, not just in the imagination of theorists.

From the point of view of statistical physics, the reality of entropy is nevertheless a problem.

## The three definitions of statistical entropy

Statistical physics asks us to distinguish two notions of state for a physical system, the microscopic state, or microstate, and the macroscopic state, or macrostate.

• The macrostate is the state as defined by thermodynamics. It depends on macroscopic parameters: volume, internal energy, number of moles, pressure, temperature, surface tension, chemical potential, magnetization, applied external field or any other measurable macroscopic parameter which serves to determine the equilibrium state of the studied system.
• The microstate is the instantaneous state of the system. It depends on all the states of its microscopic constituents. In classical physics, it is determined by the positions and velocities of all constituent particles. In quantum physics, the microstate is the quantum state of the system, whose evolution is determined by the Schrödinger equation.

The macrostate either does not evolve or evolves slowly and usually deterministically, whereas the microstate usually changes all the time, very quickly and in a random way.

It seems that the real state of a system is always its microstate. The macrostate is only a rough description that ignores all the microscopic details.

Except in rare cases, one never knows exactly the microstate of a macroscopic system, because one would have to know the quantum states of all its microscopic constituents, which are far too numerous to be enumerated, and the way they are entangled.

Since the microstate is generally unknown, statistical physics reasons about the probability distribution of possible microstates. Entropy is always defined from this probability distribution. It is calculated with the Gibbs formula:

$S=-k_{B}\sum _{n}p_{n}\ln p_{n}$

where the $p_{n}$  are the probabilities of all possible microstates $n$ . $k_{B}$  is the Boltzmann constant.

For a quasi-isolated system, it can be shown that in equilibrium all the possible microstates are equally probable (see complement). For somewhat mysterious reasons, this probability distribution is called microcanonical. If $\Omega$ is the number of possible microstates, the $p_{n}$ are all equal to ${\frac {1}{\Omega }}$. Gibbs' formula then leads to Boltzmann's formula:

$S=k_{B}\ln \Omega$
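The step from Gibbs' formula to Boltzmann's formula can be checked numerically. The following is only a minimal sketch: the function names and the choice $\Omega =1000$ are illustrative assumptions, not part of the original text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def gibbs_entropy(probabilities, k=K_B):
    """Return S = -k * sum(p ln p), skipping zero-probability microstates."""
    total = sum(probabilities)
    if not math.isclose(total, 1.0, rel_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return -k * sum(p * math.log(p) for p in probabilities if p > 0.0)

def boltzmann_entropy(omega, k=K_B):
    """Return S = k ln(omega) for omega equally probable microstates."""
    return k * math.log(omega)

omega = 1000  # hypothetical number of microstates
uniform = [1.0 / omega] * omega
# Microcanonical case: both formulas agree.
print(math.isclose(gibbs_entropy(uniform), boltzmann_entropy(omega)))

# Any non-uniform distribution on the same microstates has a smaller entropy.
peaked = [0.5] + [0.5 / (omega - 1)] * (omega - 1)
print(gibbs_entropy(peaked) < boltzmann_entropy(omega))
```

Both lines print `True`: the uniform distribution reproduces $k_{B}\ln \Omega$, and it maximizes the Gibbs entropy on a fixed set of microstates.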

Boltzmann was the first to define statistical entropy (1872-1875). Gibbs later generalized Boltzmann's formula to any probability distribution (1902).

In quantum mechanics, we count the number of microstates of a basis. This number is the dimension of the space of possible microstates. When entropy is defined with a probability distribution, we can assign probabilities to the basis microstates, but it is best to reason with the density operator.

Entropy measures the lack of information on the microstate of a system, hence the ignorance of the observer. But then it seems that it is not a real magnitude since it depends on how the observer is informed. Must we conclude that thermodynamics is wrong to postulate that entropy is a function of state?

To answer, we must distinguish three ways of interpreting the mathematical definition of entropy from the set of possible microstates, because we can give three definitions of a possible microstate:

• A microstate accessible to the studied system. All accessible microstates can be visited by the system during its random walk.
• A microstate compatible with the macroscopic constraints that define the macrostate.
• A microstate compatible with the information the observer has.

This leads to three forms of entropy which will be called on this page entropy of accessibility, entropy of constraints and information entropy.

In general, but not necessarily, these three entropies are equal because all microstates compatible with macroscopic constraints are accessible, and because these constraints are precisely the information the observer has.

Entropy of accessibility may be smaller than the entropy of constraints, because the system may be prevented from accessing some of the microstates that are compatible with the macroscopic constraints. In particular, if $T=0$  the system is stuck in one of its lowest energy microstates, so the number of accessible microstates $\Omega _{A}=1$  but the microstates of lowest energy can be very numerous. The number of microstates compatible with the macroscopic constraints can therefore be much larger than 1, $\Omega _{C}\gg 1$ . With $S=k_{B}\ln \Omega _{A}$ , the entropy of a system at zero temperature is always zero, but with $S=k_{B}\ln \Omega _{C}$  it can be very different. Since zero temperature entropy, also called residual entropy, is a measurable quantity that is sometimes non-zero, we must not ignore the difference between entropy of accessibility and entropy of constraints.

Information entropy may be smaller than the entropy of constraints or the entropy of accessibility as soon as the observer is informed about microscopic details. It may be larger than the entropy of constraints when the observer does not know all the macroscopic constraints that the observed system must respect.

Clearly information entropy generally depends on the observer and is not determined by the real state of the studied system. On the other hand, statistical physics invites us to consider that the entropy of constraints is a real magnitude. But this is not at all obvious a priori, since it is like information entropy a measure of the lack of information on the microstate of the system. The entropy of accessibility seems much more real, because it depends on the space of microstates that can actually be visited.

Entropy of constraints is a kind of information entropy. It is the information entropy of an observer who is aware of the macroscopic constraints that determine the equilibrium of a thermodynamic system.

To justify the reality of statistical entropy, it is generally assumed that the entropy of constraints is equal to the entropy of accessibility (Diu, Guthmann, Lederer, Roulet 1989). It is further assumed that the measurable thermodynamic quantities can be defined from averages calculated over all accessible microstates. When the microstate changes all the time it is natural to consider a time average that takes into account these variations. But these justifications for the reality of statistical entropy face many difficulties.

The following sections show that the thermodynamic entropy is not the entropy of accessibility but the entropy of constraints when they are different, and that we must reason on information entropy to understand the impossibility of the perpetual motion of the second kind.

## The reality of accessibility entropy

### Statistical ensembles, ergodicity and time averages

Statistical physics is developed in a mathematically rigorous way by reasoning on statistical ensembles of physical systems (Gibbs 1902). The probability of a micro-state is interpreted as the probability that a randomly selected system in the statistical ensemble is in this microstate.

In principle, ergodic theory makes it possible to link the quantities defined with statistical ensembles to quantities defined with a single physical system. Ensemble averages are identified with the time averages of the system. But we usually reason about very long-time averages, because it takes an immense time compared with the age of the Universe for a macroscopic system to explore a significant fraction of its space of accessible microstates. Yet thermodynamic measurements are generally quite fast. As long as the systems are not too far from thermal equilibrium, a fraction of a second may suffice. We can even measure them continuously. We never wait billions of years.

If one correctly calculates an equilibrium quantity, with a suitable statistical ensemble, the result is confirmed by observation. But the observation can be quite short, just the time it takes the system to reach equilibrium. Even when we wait hours, or rarely weeks, for thermodynamic equilibrium to be reached, this is not enough to explore the whole space of accessible micro-states. Why then is the result calculated with a probability distribution on this space identical to the observed result?

### The principle of polls and the Monte Carlo method

The principle of polls may explain the equality between thermodynamic quantities actually measured and quantities calculated with statistical ensembles that have no physical reality. An average calculated on a representative sample can be an excellent approximation of the average calculated over the whole set, provided that the sample is sufficiently large and truly representative. During a thermodynamic measurement, the system only explores a small part of its space of accessible micro-states, but this part may be sufficiently large and representative for the measured quantity to be identical to that calculated with a statistical ensemble.

A thermodynamic measurement is similar to the Monte Carlo method. To evaluate an average, we calculate from a sample chosen at random. Theorists use the pseudo-random generators of computers to choose their samples. Experimenters trust Nature. It is like a random generator that chooses at each observation a representative sample that confirms our theoretical predictions. The Monte Carlo method is closer to physical reality than the statistical ensembles it serves to study. When one makes a thermodynamic measurement Nature itself applies the Monte Carlo method before providing us with the result.
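The principle of polls can be illustrated with a toy model. The sketch below assumes a hypothetical system of 10,000 independent two-level constituents, whose $2^{10000}$ microstates could never be enumerated, and assumes that the sampled microstates are drawn independently, which is precisely the hypothesis about Nature that the text goes on to question. All names and numerical values are illustrative.

```python
import math
import random

def excitation_probability(epsilon_over_kt):
    """Boltzmann probability that a two-level constituent is excited."""
    w = math.exp(-epsilon_over_kt)
    return w / (1.0 + w)

def sample_mean_excitation(n_constituents, n_samples, p, rng):
    """Average fraction of excited constituents over a few sampled microstates."""
    total = 0
    for _ in range(n_samples):
        # One sampled microstate: each constituent is excited with probability p.
        total += sum(1 for _ in range(n_constituents) if rng.random() < p)
    return total / (n_samples * n_constituents)

rng = random.Random(0)  # fixed seed so that the run is reproducible
p = excitation_probability(1.0)  # ensemble average of the excited fraction
estimate = sample_mean_excitation(n_constituents=10_000, n_samples=20, p=p, rng=rng)
print(f"ensemble average {p:.4f}, sampled average {estimate:.4f}")
```

Twenty microstates out of $2^{10000}$ are a negligible fraction of the space, yet the sampled average agrees with the ensemble average to within a fraction of a percent, because each microstate already averages over many constituents.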

That a brief temporal average is representative of the space of all accessible micro-states is a priori not at all obvious, and even rather excluded, because we observe only a small part of the trajectory of the system and it can be very different from other parts. How is it that Nature is a reliable random generator that gives us truly representative samples of very long-time averages?

### Quantum decoherence

Thermodynamic laws must be justified from quantum physics, like all other physical laws, because quantum theory is the most fundamental physics. One can then wonder if the probabilities of the statistical ensembles can be interpreted as quantum probabilities. The theory of decoherence suggests it. If we observe a system that interacts with an environment that we do not observe, we must describe it with a density operator that defines a distribution of probabilities on the states of the system. Even if the initial state is precisely determined, the subsequent evolution is described by a probability distribution. This decoherence effect can be very fast. But the distributions of probabilities obtained by decoherence are not the distributions of the statistical ensembles of thermodynamics. Decoherence alone is not enough to solve the problem of short-time averages, but it can help to solve it, because it is a very fast, very powerful and very general effect that introduces a lot of randomness into the evolution of physical systems.

### Microscopic entropy

Probability distributions computed by statistical physics (Maxwell-Boltzmann, Fermi-Dirac, Bose-Einstein) determine the probabilities of the states of the microscopic constituents of a thermodynamic system. These probabilities, which determine the velocities of a molecule in a gas, or quantum state occupation numbers, make it possible to define a microscopic entropy, i.e. an entropy per molecule, or per particle quantum state. The entropy of the whole system is the sum of the microscopic entropies of its constituents, provided that they are statistically independent (see complement). To take into account the indiscernibility of particles one must reason on quantum state occupation numbers.
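The additivity of entropy for independent constituents, and its failure for correlated ones, can be checked on a small example. This is a sketch with hypothetical probability values; entropies are expressed in units of $k_{B}$.

```python
import math
from itertools import product

def entropy(probabilities):
    """Dimensionless entropy -sum(p ln p), in units of k_B."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)

p_a = [0.7, 0.2, 0.1]  # hypothetical distribution of constituent A
p_b = [0.6, 0.4]       # hypothetical distribution of constituent B

# Independent constituents: the joint probability is the product p_a * p_b.
joint_independent = [pa * pb for pa, pb in product(p_a, p_b)]
print(math.isclose(entropy(joint_independent), entropy(p_a) + entropy(p_b)))

# Two perfectly correlated two-state constituents: only (0,0) and (1,1) occur.
joint_correlated = [0.5, 0.0, 0.0, 0.5]
marginal = [0.5, 0.5]
print(entropy(joint_correlated) < 2 * entropy(marginal))
```

Both lines print `True`: the entropy of the whole is the sum of the parts only when the constituents are statistically independent, and it is smaller when they are correlated.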

The reality of microscopic entropy is not incompatible with the brevity of observations because the space of accessible states of a microscopic constituent is small. This is enough to justify the reality of microscopic entropy and from there the macroscopic entropy too. But for that we need to justify the statistical independence of the microscopic constituents.

### The independence of microscopic constituents

The microscopic constituents of a system in thermodynamic equilibrium can not be perfectly independent. In order for them to be, they should not interact with each other at all. But if they do not interact, they can not exchange energy and thermal equilibrium is excluded.

For a thermodynamic equilibrium to be established, it is sufficient to assume that the constituents interact weakly with each other in a very diversified manner, that each constituent weakly interacts with a large number of other constituents. For example a molecule of a gas is weakly coupled to all the others, because they can collide, but it is a weak coupling because the probability of a particular collision is very small.

A microscopic constituent can have only a very small effect on its environment because it is very small compared to it. If, furthermore, this effect is diluted over many other parts, the possibility of a reaction of the environment to the effect of this constituent is negligible. Everything happens as if the environment remained statistically almost always the same, whatever the state of the microscopic constituent. The state of a constituent is therefore statistically almost independent of the state of its environment. Since this is true for all microscopic constituents, they are all almost independent of each other. It can be concluded that the macroscopic entropy is the sum of the microscopic entropies.

To justify the reality of statistical entropy, ergodic theory is not enough, it is necessary above all to prove the quasi-statistical independence of the microscopic constituents.

### Lack of information, laisser-faire and equilibrium

The lack of information on the actual state of a thermodynamic system does not come from the laziness or incompetence of the observer but from the nature of the observed phenomena. Thermodynamic experiments let the observed systems reach or approach an equilibrium. We control only a small number of macroscopic quantities and we let the equilibrium settle by ignoring the micro-states. If we tried to know them more precisely, we could prevent the system from approaching equilibrium and we could not observe precisely what we want to observe, the equilibrium or the proximity of equilibrium. Letting the system roam randomly in its space of micro-states is a necessary condition for a thermodynamic equilibrium. Paradoxically, ignorance of micro-states, which is a subjective property of the observer, is a necessary condition for a thermodynamic equilibrium to occur, a real, objective event. This is why entropy, which measures a lack of information, is an objective material property. It is the lack of information that makes possible the thermodynamic equilibrium actually observed.

## The difference between thermodynamic entropy and accessibility entropy

A glass is a frozen liquid. More precisely, it is a liquid whose viscosity is so high that we can not observe its flow, except over very long periods, days, centuries or more. It can therefore be considered a solid but its microscopic structure is as disordered as that of a liquid.

During the liquid-glass transition, the variation of thermodynamic entropy is continuous, so the thermodynamic entropy of a glass is equal to that of the liquid at the same temperature. But the accessibility entropy is much smaller, because the glass is stuck in a particular configuration while the liquid can explore all the configurations compatible with the macroscopic constraints.

The existence of glasses thus proves that thermodynamic entropy is the entropy of constraints and not the entropy of accessibility if they are different. Thermodynamic entropy is thus a kind of information entropy. It is the information entropy of an observer who is aware of the constraints.

Zero temperature entropy is the difference between the entropy of constraints and the entropy of accessibility at $T=0$ . More exactly, it is the limit of this difference when $T$  tends to zero. Materials that have zero temperature entropy are disordered solids like glasses. Zero temperature entropy is equal to $k_{B}\ln \Omega _{D}$  where $\Omega _{D}$  is the number of microstates compatible with the macroscopic constraints.

That it is necessary to count microstates which are not visited by a thermodynamic system is a priori surprising. The lack of information on the microscopic configuration of a disordered solid depends on the observer. If we observe the microscopic details of a configuration, we reduce this lack of information. We are therefore tempted to affirm that thermodynamic entropy should be the entropy of accessibility and not the entropy of constraints if it is to be a real magnitude. But it would then be necessary to give up the law of non-decrease of entropy, since the entropy of accessibility is spontaneously reduced during the liquid-glass transition.

Why should micro-states that are not visited by a system be taken into account to correctly calculate its thermodynamic entropy?

Since entropy measures the lack of information on the microstates of a system, it is tempting to conclude that observing the microscopic details of a disordered solid should reduce its entropy. But then thermodynamic entropy could not be a real measurable magnitude, the same for all observers, it would be nothing more than an arbitrary information entropy. Why then is thermodynamic entropy truly a real magnitude?

The two problems above are closely related. To solve them we must understand that information can be used as a fuel and that thermodynamics asks us to reason about information entropy.

## Information as fuel

### Maxwell's demon

A Maxwell's demon shows that information can be transformed into work:

Consider a gas in a container. A partition is placed in the middle. It is equipped with a small door controlled by a device that detects the speed of the incident molecules. It opens the door only if a molecule that comes from the left goes faster than average or if a molecule that comes from the right goes slower than average. In this way the right compartment is warmed while the left one is cooled (Maxwell 1871). This difference in temperature can be used to operate a heat engine.

The door opener is a Maxwell's demon. It acquires information that can be transformed into work. Information is therefore a kind of fuel.

Maxwell invented his "demon" to show that the law of non-decrease of entropy is only a statistical truth that could be transgressed if one were able to modify the statistical equilibrium of the microscopic constituents. In his time, the existence of atoms and molecules was still very hypothetical. To consider the possibility of manipulating them was therefore out of the question. But as soon as the microscopic constituents of matter were better known, the possibility of a mechanical device that functions like a Maxwell's demon could be taken seriously.

To date, our ability to observe and manipulate microscopic constituents does not allow the device imagined by Maxwell to be realized, but scanning tunneling microscopy makes it possible to observe and manipulate atoms. One can then imagine a device that recovers work after reducing the entropy of the observed system, a sort of theoretically feasible Maxwell's demon:

Consider a crystal that can accommodate atoms on its surface. It is assumed that initially $N_{A}$  atoms are randomly distributed on $N_{S}$  sites and that the temperature is low enough that they stay there. It is therefore a frozen disorder. We begin by observing the exact configuration of the surface atoms, which can be done with a scanning tunneling microscope, then we move them and collect them with the same microscope on a fraction ${\frac {N_{A}}{N_{S}}}$  of the surface. The activity of the microscope resembles an isothermal compression work on a gas, except that it is not a gas but a frozen disorder on the surface.

Initially the number of possible configurations is equal to the number ${\binom {N_{S}}{N_{A}}}$ of ways to place $N_{A}$ atoms on $N_{S}$ sites. The frozen disorder of the atoms on the surface thus brings a contribution $S_{A}^{i}$ to the thermodynamic entropy of the crystal:

${\frac {S_{A}^{i}}{k_{B}}}=\ln {\binom {N_{S}}{N_{A}}}=\ln {\frac {N_{S}!}{(N_{S}-N_{A})!N_{A}!}}\approx N_{S}\ln {\frac {N_{S}}{N_{S}-N_{A}}}+N_{A}\ln {\frac {N_{S}-N_{A}}{N_{A}}}$

where we used Stirling's approximation: $\ln N!\approx N\ln N-N$
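The quality of Stirling's approximation in this formula can be checked numerically. The sketch below compares the exact logarithm of the binomial coefficient, computed with the log-gamma function, to the Stirling form; the site and atom numbers are arbitrary illustrative values.

```python
import math

def ln_binomial_exact(n_sites, n_atoms):
    """ln C(n_sites, n_atoms) via log-gamma, accurate for large arguments."""
    return (math.lgamma(n_sites + 1)
            - math.lgamma(n_atoms + 1)
            - math.lgamma(n_sites - n_atoms + 1))

def ln_binomial_stirling(n_sites, n_atoms):
    """Stirling form: N_S ln(N_S/(N_S-N_A)) + N_A ln((N_S-N_A)/N_A)."""
    return (n_sites * math.log(n_sites / (n_sites - n_atoms))
            + n_atoms * math.log((n_sites - n_atoms) / n_atoms))

for n_sites, n_atoms in [(100, 10), (10_000, 1_000), (1_000_000, 100_000)]:
    exact = ln_binomial_exact(n_sites, n_atoms)
    approx = ln_binomial_stirling(n_sites, n_atoms)
    print(n_sites, n_atoms, f"relative error {(approx - exact) / exact:.2e}")
```

The relative error shrinks as the numbers grow, which is why the approximation is harmless for a macroscopic surface.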

After ordering all the atoms $S_{A}^{f}=0$ .

The law of non-decrease of entropy thus seems to be transgressed, as Maxwell had predicted, because we can manipulate atoms.

In principle, the displacement of atoms does not require any work because the work of tearing an atom can be recovered during redeposition. But since a scanning tunneling microscope consumes energy and dissipates heat, it does not diminish the total thermodynamic entropy. This objection is discussed below.

### Amount of information and work

To convert the reduction of the thermodynamic entropy of the crystal into work, we put its surface in contact with an empty container of volume $V$ whose other walls cannot accommodate the atoms. This container is divided by a movable wall into a left part and a right part, whose volumes are respectively $V_{G}={\frac {N_{A}}{N_{S}}}V$ and $V_{D}={\frac {N_{S}-N_{A}}{N_{S}}}V$. The crystal is heated to vaporize the atoms in the volume $V_{G}$. The resulting gas is then allowed to expand isothermally throughout the container, providing a work $W=N_{A}k_{B}T\ln {\frac {N_{S}}{N_{A}}}$. The crystal is then cooled to allow the atoms to redeposit on the surface of the crystal. If one proceeds reversibly, with a succession of thermal baths, the heat supplied during the heating by each thermal bath used is exactly equal to the heat it recovers during the cooling, because the specific heat at constant volume of a gas does not depend on its volume. The crystal and the thermal baths that were used to warm it have returned to their original state.
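The value of the work can be recovered with a short derivation, sketched here under the assumption (not stated explicitly in the text) that the vapor behaves as an ideal gas of $N_{A}$ atoms, so that its pressure is $p={\frac {N_{A}k_{B}T}{V'}}$ when it occupies a volume $V'$ between $V_{G}$ and $V$:

$W=\int _{V_{G}}^{V}p\,dV'=N_{A}k_{B}T\int _{V_{G}}^{V}{\frac {dV'}{V'}}=N_{A}k_{B}T\ln {\frac {V}{V_{G}}}=N_{A}k_{B}T\ln {\frac {N_{S}}{N_{A}}}$

The last step uses $V_{G}={\frac {N_{A}}{N_{S}}}V$.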

It has been supposed that an absorbing wall can make a perfect vacuum in an arbitrarily large volume. Such a wall cannot exist, otherwise one could make a perpetual motion of the second kind: the wall charged with atoms is placed in contact with an empty container and heated to a temperature high enough that all atoms are vaporized. The gas is then allowed to expand isothermally to provide work. The gas is then cooled to a temperature low enough that all atoms redeposit on the absorbing wall. If one proceeds reversibly, the heat supplied during heating by each thermal bath is exactly equal to the heat it recovers during cooling. We could therefore return to the initial state after providing work by extracting heat from a single heat bath.

In order to make an exact calculation that is compatible with the laws of thermodynamics, it is necessary to take into account the equilibrium density of a gas in contact with an absorbing wall. This density can not be zero, but it can be very small, a priori as small as one wants if the wall is sufficiently absorbent. This is enough to justify the calculation above where this density is neglected.

Suppose ${\frac {N_{S}}{N_{A}}}\gg 1$. Then

${\frac {S_{A}^{i}}{k_{B}}}\approx N_{A}(\ln {\frac {N_{S}}{N_{A}}}+1)$

If in addition $\ln {\frac {N_{S}}{N_{A}}}\gg 1$ , we obtain

$S_{A}^{i}\approx N_{A}k_{B}\ln {\frac {N_{S}}{N_{A}}}$

Now $S_{A}^{i}$  is equal to the reduction of information entropy obtained when one observes the positions of all surface atoms. An example of the following theorem is thus obtained:

If information entropy is smaller than thermodynamic entropy, then the difference multiplied by the temperature $T$  measures the maximum of the amount of work that the system can provide when it can receive heat only from a thermal bath at the temperature $T$ .

This theorem was first established by Szilard in 1929. But his model is very unrealistic because it postulates that a single molecule can push a piston as if it were an ordinary gas.
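For the crystal-surface example above, the theorem can be checked numerically by comparing the work of the isothermal expansion, $W=N_{A}k_{B}T\ln {\frac {N_{S}}{N_{A}}}$, with $T$ times the exact frozen-disorder entropy $k_{B}\ln {\binom {N_{S}}{N_{A}}}$. This is a sketch with an arbitrary number of atoms; the function names are illustrative.

```python
import math

def entropy_over_kb(n_sites, n_atoms):
    """Frozen-disorder entropy S/k_B = ln C(N_S, N_A), computed exactly."""
    return (math.lgamma(n_sites + 1) - math.lgamma(n_atoms + 1)
            - math.lgamma(n_sites - n_atoms + 1))

def work_over_kbt(n_sites, n_atoms):
    """Work of the isothermal expansion, W/(k_B T) = N_A ln(N_S/N_A)."""
    return n_atoms * math.log(n_sites / n_atoms)

n_atoms = 1_000  # hypothetical number of surface atoms
for ratio in (10, 10**3, 10**6, 10**9):
    n_sites = n_atoms * ratio
    s = entropy_over_kb(n_sites, n_atoms)
    w = work_over_kbt(n_sites, n_atoms)
    print(f"N_S/N_A = {ratio:>10}: W/(T S) = {w / s:.3f}")
```

The ratio $W/(TS_{A}^{i})$ stays below 1 and slowly approaches it as $\ln {\frac {N_{S}}{N_{A}}}$ grows, so the work obtained never exceeds the temperature times the information gained, and the bound is saturated only in the dilute limit.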

## Why can not Maxwell's demon reduce total entropy?

For the existence of a Maxwell's demon to be compatible with the laws of thermodynamics, at least one of the following conditions must be satisfied:

1. The operation of the device does not reduce the entropy of the observed system because it increases it before decreasing it.
2. The operation of the device increases the entropy of the environment.
3. The operation of the device increases its own entropy.

Maxwell supposed that his demon had to see the molecules. But to see them one has to light the gas and thus warm it up. Such warming increases the entropy of the gas and one can expect this increase to be greater than the decrease caused by the establishment of a difference in temperature. In this case it is condition 1 that prevents the reduction of the total entropy.

Tunneling microscopy does not require light and really reduces the entropy of the observed system. But it consumes energy and gives heat to the environment. One is tempted to conclude that the acquisition of microscopic information prevents a Maxwell's demon from reducing total entropy because it has an energy cost. But the acquisition of information does not necessarily have an energy cost. If the complete system consisting of a detector and an observed system is perfectly isolated from its environment, this does not prevent the detector from acquiring information. During an ideal quantum measurement, for example, the complete system is isolated. If the observed system is in an eigenstate of the measured observable, and if there are $n$ such possible initial states, there are also $n$ possible final states of the detector, and the observed system is not disturbed. Initially, there are $n$ possible states of the complete system, because the detector is in a single microstate. At the end, there are also $n$ possible states of the complete system, because the observed system and the detector are perfectly correlated. The information is therefore acquired without increasing the entropy of the complete system.

Physics does not prohibit the existence of a system capable of detecting atoms without energy costs.

Must we conclude that a Maxwell's demon can reduce total entropy?

A device that acquires information must store it in order to use it. If a frozen disorder is observed, the information recording device reproduces in its own memory the observed frozen disorder. For example, atoms on the surface of a crystal can be used to store information. An atom can store a bit of information if it has two possible positions. 0 is stored by the atom on the left for example and 1 by the atom on the right. If initially all the atoms of the memory are ordered, all on the left or all on the right, they are not any more after the observation of a frozen disorder. The entropy of memory has therefore increased.

To say that the entropy of a memory increases when it records information is very paradoxical. For an observer who records the information, the entropy of her memory does not increase, because she does not ignore in what state her memory is. It is only for an outside observer, who does not know what information has been recorded, that the information entropy of the non-observed memory increases.

To solve the problem we must understand that the reset of a memory generally requires energy. To prove it, we first need the law of conservation of the information entropy of an isolated system.

## The conservation of the information entropy of an isolated system

If a system is isolated, its spontaneous evolution conserves the information entropy of an outside observer, that is, the number of microstates compatible with the information of an outside observer does not change.

This law is a consequence of the Hamiltonian determinism of the evolution of an isolated system: distinct microstates at an initial moment evolve towards distinct microstates at a later moment.

If one calculates the information entropy not with the number of microstates but with a probability distribution on the microstates, the law of conservation of the information entropy of an isolated system remains valid, because each final microstate inherits the probability of the initial microstate from which it evolves.
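This conservation law can be illustrated on a toy system. The sketch below models a deterministic, one-to-one evolution as a permutation of eight hypothetical microstates (a crude stand-in for Hamiltonian dynamics, chosen here only for illustration) and checks that the entropy of the distribution is unchanged.

```python
import math
import random

def entropy(probabilities):
    """Dimensionless entropy -sum(p ln p), in units of k_B."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)

def evolve(probabilities, permutation):
    """Carry each probability from microstate n to microstate permutation[n]."""
    final = [0.0] * len(probabilities)
    for n, p in enumerate(probabilities):
        final[permutation[n]] += p
    return final

rng = random.Random(1)  # fixed seed so that the run is reproducible
initial = [0.5, 0.25, 0.125, 0.125, 0.0, 0.0, 0.0, 0.0]
permutation = list(range(len(initial)))
rng.shuffle(permutation)  # hypothetical deterministic, one-to-one dynamics

final = evolve(initial, permutation)
print(math.isclose(entropy(initial), entropy(final)))
```

The probabilities are merely relabelled, so the entropy is the same before and after, whatever permutation is drawn.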

When an observer acquires information by observing a system, her information entropy may decrease but the observed system is not isolated.

Since thermodynamic entropy is a kind of information entropy, the law of conservation of the information entropy of an isolated system is also valid for thermodynamic entropy. How then can the thermodynamic entropy of an isolated system increase?

## Why can entropy increase?

The thermodynamic entropy of a perfectly isolated system, and more generally its information entropy, cannot increase because its dynamics is Hamiltonian. Therefore a system whose entropy increases cannot be perfectly isolated. When we reason in thermodynamics about an isolated system whose entropy increases, we only mean that it is quasi-isolated, that is to say that the energy exchanged with its environment can be neglected compared to its internal energy. The perturbations of the environment introduce chance into the evolution of the system. For the entropy to increase, it suffices that the dynamics be stochastic rather than Hamiltonian (Diu, Guthmann, Lederer, Roulet 1989).
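The contrast with deterministic dynamics can be seen on a toy model. The sketch below assumes a particular stochastic dynamics chosen for illustration, a symmetric random walk on a ring of eight microstates; its transition matrix is doubly stochastic, which is what guarantees that the entropy cannot decrease. The hop probability and the number of steps are arbitrary.

```python
import math

def entropy(probabilities):
    """Dimensionless entropy -sum(p ln p), in units of k_B."""
    return -sum(p * math.log(p) for p in probabilities if p > 0.0)

def stochastic_step(probabilities, hop=0.1):
    """One step of a symmetric random walk on a ring of microstates.

    Each microstate keeps a weight 1 - 2*hop of its probability and passes
    a fraction hop to each of its two neighbours.
    """
    n = len(probabilities)
    return [(1 - 2 * hop) * probabilities[i]
            + hop * probabilities[(i - 1) % n]
            + hop * probabilities[(i + 1) % n]
            for i in range(n)]

distribution = [1.0] + [0.0] * 7  # microstate known exactly: zero entropy
history = [entropy(distribution)]
for _ in range(200):
    distribution = stochastic_step(distribution)
    history.append(entropy(distribution))

# The entropy never decreases and tends to ln(8), the microcanonical value.
print(all(b >= a - 1e-12 for a, b in zip(history, history[1:])))
print(math.isclose(history[-1], math.log(8), rel_tol=1e-6))
```

Starting from a perfectly known microstate, the random perturbations spread the probability over all eight microstates, and the entropy grows from zero to its equilibrium value.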

## Do irreversible computations always reduce the thermodynamic entropy of a computer?

The following reasoning suggests that irreversible computations always reduce the thermodynamic entropy of a computer:

When a computation is reversible, the number of possible final states is equal to the number of possible initial states. If the computation is irreversible the number of possible final states is smaller than the number of possible initial states. An irreversible computation thus makes it possible to reduce the number of states compatible with the information available, and thus to reduce the information entropy of a computer from the point of view of an outside observer who is not informed about the state of the computer memory.

The information entropy of an isolated system can not decrease. If the observer does not acquire information about the system during its evolution, the information entropy can only decrease if the thermodynamic entropy also decreases. Thermodynamics requires that such a reduction is accompanied by an increase in thermodynamic entropy of the environment (Landauer 1961). If the entropy of the environment increases because it absorbs heat, it is necessary to supply energy to the computer. We are thus led to a paradoxical formula: we must work to forget. More precisely, we have to spend energy to lose information.
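The energy cost implied by this reasoning can be made quantitative. The sketch below takes the reasoning at face value and uses the standard form of Landauer's bound, a heat of at least $k_{B}T\ln 2$ given to the environment per erased bit; the function names and the temperature of 300 K are illustrative assumptions.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (exact SI value)

def entropy_change_over_kb(n_initial_states, n_final_states):
    """Change of ln(number of possible memory states) during a computation."""
    return math.log(n_final_states) - math.log(n_initial_states)

def minimum_heat_joules(n_initial_states, n_final_states, temperature_kelvin):
    """Least heat the environment must absorb: -T * dS, or 0 if dS >= 0."""
    d_s = K_B * entropy_change_over_kb(n_initial_states, n_final_states)
    return max(0.0, -temperature_kelvin * d_s)

# Erasing one bit: two possible initial states, one final state.
print(minimum_heat_joules(2, 1, 300.0))  # about 2.87e-21 J
# A reversible computation keeps the number of states: no heat is required.
print(minimum_heat_joules(4, 4, 300.0))
```

At room temperature the bound is tiny compared with the energy actually dissipated by ordinary electronics, but it is not zero.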

But there is a flaw in the previous reasoning. A bit can be represented by the state of a gas in a container having a removable inner wall. 0 is represented by the gas enclosed in the left part, 1 by the gas equally distributed on the left and on the right. Removing the wall and then putting it back in place makes it possible to make an irreversible computation:

$f(0)=f(1)=1$

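Whether the gas starts on the left (0) or is already spread over both halves (1), removing the wall never lowers its thermodynamic entropy: in the first case the gas expands freely and its entropy increases, in the second nothing changes. Therefore an irreversible computation does not necessarily reduce the thermodynamic entropy of a computer.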

It is enough to give an additional condition to prove that an irreversible computation always reduces the thermodynamic entropy of a computer: the spaces of microstates that make the various states of the computer memory physically exist must be disjoint.

## Maxwell's demon and the impossibility of a perpetual motion of the second kind

For an outside observer who is not aware of the observations made by the demon, the decrease in the thermodynamic entropy of the system on which the demon acts is offset by the increase in the information entropy of the demon. That is why the demon does not reduce total entropy.

For a Maxwell's demon to produce perpetual motion, it must reset its memory at the beginning of each cycle to make room to record the information on the system it observes. But resetting a memory is an irreversible computation that reduces its thermodynamic entropy. This reduction must be offset by an increase in the thermodynamic entropy of the environment. If the demon reduces its own entropy by heating the environment, it must be provided with energy to compensate for this loss of heat. It is therefore the resetting of the demon's memory which is the ultimate source of the impossibility of perpetual motion of the second kind (Bennett 1982).

The demon's memory can be reset without heat dissipation by an observer manipulating it from the outside, but then the observer's memory must be taken into account, and it must also be reset to start a new cycle.

If the demon's memory can be reset without reducing its own thermodynamic entropy, because the spaces of microstates that realize the memory states are not disjoint, the recording of the observations requires a reduction of thermodynamic entropy, which must be offset by an increase elsewhere.

In any case, a Maxwell's demon cannot produce perpetual motion of the second kind.

## Thermodynamics is a physics of observation

To understand thermodynamics, we must reason about the three forms of statistical entropy: entropy of accessibility, because it explains why a quasi-isolated system evolves towards equilibrium by increasing its entropy; entropy of constraints, because it is the thermodynamic entropy measured by experimenters; and information entropy, because it explains why a Maxwell's demon cannot produce perpetual motion of the second kind.

Thermodynamic entropy is a kind of information entropy, because it is the observer who is aware of the macroscopic constraints that determine a thermodynamic equilibrium, but that does not prevent it from being a real magnitude, because the macroscopic constraints imposed by an observer actually exist.

Thermodynamics does not only invite us to reason about the properties of matter, it also invites us to reason about the role of observations. But it is still physics, because observers too are real.

## Complements

### The perpetual motion of the second kind

A machine that could lift a weight or move a car without being supplied with energy would produce perpetual motion of the first kind. The law of conservation of energy, the first law of thermodynamics, forbids the existence of such a machine. It is one of the most fundamental laws of physics. All physicists would be wrong if we could invent such a machine, but no one has ever invented it.

Perpetual motion of the second kind does not contradict the law of conservation of energy. Any body can yield energy if it is cooled, unless it is already at absolute zero (0 K = -273.15 °C = -459.67 °F). So we can imagine a car, a boat or a plane that could advance without consuming fuel. It would only have to absorb air or water at room temperature and reject it at a lower temperature. The difference in energy would be used to run the engine.

For such an engine to work, it would be necessary to separate a system whose temperature is uniform into two parts, one warmer, the other colder. But this is forbidden by the second law of thermodynamics because it would reduce total entropy. The impossibility of perpetual motion of the second kind thus results from the law of non-decrease of the total entropy, the second law of thermodynamics.
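The claim that such a separation would reduce total entropy can be checked on a simple model, given here only as a sketch. It assumes two identical bodies with the same constant heat capacity $C$, both initially at $T_{0}$, one ending at $T_{0}+\delta$ and the other at $T_{0}-\delta$ so that energy is conserved. Each body then contributes $C\ln(T_{final}/T_{0})$ to the entropy change. The function name and the numerical values are hypothetical.

```python
import math


def entropy_change(c, t0, delta):
    """Total entropy change (J/K) when two identical bodies at t0 end at t0+delta and t0-delta.

    c is the heat capacity of each body, assumed constant. Energy is conserved
    because one body gains c*delta while the other loses c*delta.
    """
    if not 0 <= delta < t0:
        raise ValueError("need 0 <= delta < t0")
    return c * math.log((t0 + delta) / t0) + c * math.log((t0 - delta) / t0)


# Hypothetical numbers: two bodies of 1000 J/K each, initially at 300 K.
for delta in (0.0, 1.0, 10.0, 100.0):
    print(delta, entropy_change(1000.0, 300.0, delta))
```

The sum equals $C\ln(1-\delta ^{2}/T_{0}^{2})$, which is negative for every non-zero $\delta$: in this model, any spontaneous separation into a warmer and a colder part lowers the total entropy.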

### Accessibility entropy increase and the microcanonical distribution

The $n$ are all the accessible states of an almost isolated system, i.e. disturbances by the environment do not change the energy of the system. $T_{np}$ is the transition probability per unit of time from the state $p$ to the state $n$. It is assumed that all the microscopic evolutions are reversible:

$T_{np}=T_{pn}$

for all $n$  and all $p$ .

$P_{n}(t)$  is the probability of the state $n$  at the time $t$ . By definition of $T_{np}$ , we have:

${\frac {d}{dt}}P_{n}=\sum _{p\neq n}(P_{p}T_{np}-P_{n}T_{pn})=\sum _{p\neq n}T_{np}(P_{p}-P_{n})$

${\frac {d}{dt}}S=-k_{B}{\frac {d}{dt}}\sum _{n}P_{n}\ln P_{n}=-k_{B}\sum _{n}[(\ln P_{n}){\frac {d}{dt}}P_{n}+{\frac {d}{dt}}P_{n}]=-k_{B}\sum _{n}(\ln P_{n}){\frac {d}{dt}}P_{n}$

since

$\sum _{n}{\frac {d}{dt}}P_{n}={\frac {d}{dt}}\sum _{n}P_{n}={\frac {d}{dt}}1=0$

Then

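${\frac {d}{dt}}S=k_{B}\sum _{n,p\neq n}(\ln P_{n})T_{np}(P_{n}-P_{p})=k_{B}[\sum _{n,p<n}(\ln P_{n})T_{np}(P_{n}-P_{p})+\sum _{n,p>n}(\ln P_{n})T_{np}(P_{n}-P_{p})]$

But

$\sum _{n,p>n}(\ln P_{n})T_{np}(P_{n}-P_{p})=\sum _{p,n<p}(\ln P_{n})T_{np}(P_{n}-P_{p})=\sum _{n,p<n}(\ln P_{p})T_{pn}(P_{p}-P_{n})=-\sum _{n,p<n}(\ln P_{p})T_{np}(P_{n}-P_{p})$

So

${\frac {d}{dt}}S=k_{B}\sum _{n,p<n}T_{np}(\ln P_{n}-\ln P_{p})(P_{n}-P_{p})$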

$(\ln P_{n}-\ln P_{p})$ and $(P_{n}-P_{p})$ are of the same sign and the $T_{np}$ are never negative, therefore:

${\frac {d}{dt}}S\geq 0$

Paradoxically, the hypothesis of microscopic reversibility, $T_{np}=T_{pn}$, leads to a law of irreversibility of macroscopic processes, since the entropy of a quasi-isolated system can never decrease.
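The theorem can be observed numerically. The following sketch integrates the equation for ${\frac {d}{dt}}P_{n}$ with an explicit Euler scheme for a small system whose symmetric rates $T_{np}$ are drawn at random. The number of states, the rates, the time step and the initial condition are all arbitrary assumptions, and the entropy is expressed in units of $k_{B}$.

```python
import math
import random


def entropy(p):
    """Entropy -sum_n P_n ln P_n, in units of k_B."""
    return -sum(x * math.log(x) for x in p if x > 0)


def step(p, rates, dt):
    """One explicit Euler step of dP_n/dt = sum_{p != n} T_np (P_p - P_n)."""
    n = len(p)
    return [
        p[i] + dt * sum(rates[i][j] * (p[j] - p[i]) for j in range(n) if j != i)
        for i in range(n)
    ]


random.seed(0)
N = 6  # number of accessible states (arbitrary)

# Symmetric transition rates, T_np = T_pn, drawn uniformly in [0, 1).
rates = [[0.0] * N for _ in range(N)]
for i in range(N):
    for j in range(i + 1, N):
        rates[i][j] = rates[j][i] = random.random()

p = [1.0] + [0.0] * (N - 1)  # the system starts in a single known state
dt = 0.01                    # small enough to keep all probabilities non-negative
history = [entropy(p)]
for _ in range(5000):
    p = step(p, rates, dt)
    history.append(entropy(p))

print(history[0], history[-1], math.log(N))
```

With these choices the entropy starts at zero, never decreases from one step to the next, and approaches $\ln N$, the entropy of the uniform distribution over the $N$ states.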

The microcanonical distribution is an equilibrium distribution:

${\frac {d}{dt}}P_{n}=\sum _{p\neq n}T_{np}(P_{p}-P_{n})=0$

since $P_{n}=P_{p}$  for all $n$  and all $p$ .

This is the only equilibrium distribution, because if the $P_{n}$ are not all equal, there is at least one $P_{m}$ smaller than the others. Of all the states $m'$ such that $P_{m'}=P_{m}$, there is at least one for which $T_{m'p}\neq 0$ for a $p$ such that $P_{m'}<P_{p}$, otherwise they would not be accessible. Then

${\frac {d}{dt}}P_{m'}=\sum _{p\neq m'}T_{m'p}(P_{p}-P_{m'})>0$

and the distribution is not in equilibrium.

### The reality of entropy increase

The entropy non-decrease theorem is proved in a mathematically rigorous way for a statistical ensemble that has no real existence. $P_{n}(t)$  defines the occupation probability of the state $n$  at the moment $t$  for all the systems of a huge set imagined by theorists. It describes the evolution of this huge set, not the evolution of a real physical system. But under very general conditions, one can interpret the $P_{n}(t)$  as really measurable probabilities, and thus compare their evolution with the observed quantities.

It is assumed that the macroscopic evolution of the system is slow compared to microscopic fluctuations. The environment of each microscopic constituent is then almost constant over a sufficient period of time for it to explore its space of states and thus define probabilities of occupation of these states. For each microscopic constituent $i$  one can thus define probabilities $P_{im}(t)$  of occupation of its states $m$  and in principle measure them. Assuming that all microscopic constituents are statistically independent, these $P_{im}(t)$  are sufficient to define the probabilities $P_{n}(t)$  of all states $n$  of the macroscopic system.

### Frozen disorder and spontaneous decrease of accessibility entropy

During the liquid-glass transition, the total entropy of accessibility of the liquid and the thermal bath which cools it decreases. The decrease of accessibility entropy of the glass is not compensated by the increase of accessibility entropy of the thermal bath. It therefore seems that the theorem of increase of accessibility entropy is transgressed, while it is rigorously proven under very general conditions. But when the glass has frozen in a disordered configuration, the other configurations remain theoretically accessible. There is a negligible but non-zero probability that the thermal bath briefly gives up some of its heat which would allow the glass to be liquid again and then become glass again in another disordered configuration. All configurations are therefore in principle always accessible, but in practice the glass remains frozen in one. The theorem of increase of accessibility entropy considers that the space of accessible states does not change. It ignores the possibility that microstates become inaccessible in practice.

### Entropy is an extensive quantity when the parts are statistically independent

The entropy of a sum is the sum of the entropies of the parts when they are independent:

Let $a$  and $b$  be two independent parts of the system $ab$ . The $a_{i}$  and the $b_{j}$  are the accessible states of $a$  and $b$  respectively.

The entropies of $a$  and $b$  are:

$S_{a}=-k_{B}\sum _{i}P_{i}^{a}\ln P_{i}^{a}$

$S_{b}=-k_{B}\sum _{j}P_{j}^{b}\ln P_{j}^{b}$

Accessible states of $ab$  are all states $a_{i}b_{j}$  for all $i$  and all $j$ . If $a$  and $b$  are independent the probability of $a_{i}b_{j}$  is $P_{i}^{a}P_{j}^{b}$  and the entropy of $ab$  is:

$S_{ab}=-k_{B}\sum _{ij}P_{i}^{a}P_{j}^{b}\ln(P_{i}^{a}P_{j}^{b})=-k_{B}\sum _{ij}P_{i}^{a}P_{j}^{b}(\ln P_{i}^{a}+\ln P_{j}^{b})$

$=-k_{B}(\sum _{ij}P_{i}^{a}P_{j}^{b}\ln P_{i}^{a}+\sum _{ij}P_{i}^{a}P_{j}^{b}\ln P_{j}^{b})$

$=-k_{B}(\sum _{i}P_{i}^{a}\ln P_{i}^{a}+\sum _{j}P_{j}^{b}\ln P_{j}^{b})$

$=S_{a}+S_{b}$
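The identity can be checked numerically on a small example. In this sketch the two probability distributions are arbitrary assumptions and the entropies are expressed in units of $k_{B}$.

```python
import math


def entropy(probs):
    """Entropy -sum P ln P, in units of k_B."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Arbitrary distributions over the accessible states of a and b.
p_a = [0.5, 0.3, 0.2]
p_b = [0.7, 0.3]

# Independence: the probability of the state a_i b_j is the product P_i^a P_j^b.
p_ab = [pa * pb for pa in p_a for pb in p_b]

s_a, s_b, s_ab = entropy(p_a), entropy(p_b), entropy(p_ab)
print(s_a + s_b, s_ab)
```

The two printed values agree up to rounding errors.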

### Szilard's engine

To better understand the transformation of information into work, Szilard (1929) invited us to reason about an engine that works with a "one molecule gas":

A molecule is enclosed in a container that can be separated by a removable wall. When the wall is put in place in the middle of the container, the presence of the molecule in one or the other of the separate compartments is detected. So a bit of information is acquired. The wall is then used as a piston on which the molecule can do work. It is necessary to know where the molecule is in order to know in which direction of piston displacement work can be recovered. In this way it is calculated that under optimal conditions a bit of information is used to recover an amount of work equal to $k_{B}T\ln 2$. This is the work done by a molecule on a piston during an isothermal expansion at the temperature $T$ that doubles the accessible volume.
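The value $k_{B}T\ln 2$ can be recovered by integrating the pressure over the expansion. The sketch below assumes, as is usual in the analysis of Szilard's engine, that the one-molecule gas exerts on the piston the average pressure $p=k_{B}T/V$ of an ideal gas. The temperature of 300 K, the volumes and the function name are arbitrary choices for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def isothermal_work(temperature, v_initial, v_final, steps=100_000):
    """Midpoint-rule integral of p dV, with p = k_B T / V for a single molecule."""
    dv = (v_final - v_initial) / steps
    return sum(
        K_B * temperature / (v_initial + (k + 0.5) * dv) * dv
        for k in range(steps)
    )


T = 300.0                                # assumed temperature in kelvin
numeric = isothermal_work(T, 0.5, 1.0)   # the accessible volume doubles
closed_form = K_B * T * math.log(2)

print(numeric, closed_form)
```

At room temperature one bit is worth about $2.87\times 10^{-21}$ joules, which is why the effect is invisible at the macroscopic scale.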

Szilard's engine seems to contradict thermodynamics because it suggests that we could produce perpetual motion of the second kind. If the piston can move in only one direction, it is not necessary to know the position of the molecule to recover work. Half of the time the piston remains motionless because the molecule is on the wrong side, and we recover no work, but the other half of the time it moves and we recover an amount of work equal to $k_{B}T\ln 2$. By repeating the experiment many times we could thus obtain an arbitrary amount of work without spending any to know the position of the molecule.

But such a process requires a device that removes the piston and puts it back in place. Now there are two possible positions of the piston at the end of a cycle, either in the middle if it has not moved, or at one end, if it has moved. The device must therefore acquire a bit of information at the end of each cycle. To return to its initial state, it must erase this information. The cost of erasing information is thus here too at the origin of the impossibility of a perpetual motion of the second kind (Leff & Rex 1990).

Szilard's engine confirms that the entropy of accessibility can differ from the thermodynamic entropy: when we introduce a wall in the middle of the one-molecule gas, the entropy of accessibility is reduced by $k_{B}\ln 2$. The thermodynamic entropy, on the other hand, is not reduced, since the gas does not yield heat. For Szilard's engine, the entropy of accessibility can in principle be reduced as much as we want: we just have to place several walls inside the container.

### Why does the entropy of the Universe increase?

The law of entropy increase suggests that the entropy of the Universe can only increase, that it started in a macrostate of low entropy, hot but very condensed, and that its expansion allowed it to cool down while increasing its entropy. But this statement faces two difficulties:

• As the Universe is perfectly isolated, its entropy, if we could define it, should be conserved.
• It is not really meaningful to talk about the macrostate of the Universe, because there is no sense in talking about the space of its possible microstates. It is in a single microstate, sometimes called the wave function of the Universe.

But it nevertheless makes sense to say that the entropy of the Universe increases. We can assign entropy to its various parts (stars, planets, black holes, interstellar medium ...) and find that the sum of all these entropies increases. But when we do this, we suppose that the parts are statistically independent: we ignore their correlations. Ignoring correlations leads us to overestimate the total entropy. If we knew the microstate of the Universe, we would also know all the correlations between its parts and we would find that the total entropy does not increase. It would always remain equal to zero. But knowing the microstate of the Universe is obviously impossible.
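A toy model gives a feeling for this overestimation. It is only an analogy, not a model of the Universe: the whole is a pair of two-state quantum systems in a single known microstate with real amplitudes, so its total entropy is zero. The sketch computes the entropy an observer would assign to each part taken separately, in units of $k_{B}$. All names and the two example states are assumptions.

```python
import math


def reduced_density_matrices(psi):
    """Partial traces of |psi><psi| for psi = [c00, c01, c10, c11] with real amplitudes."""
    c = [[psi[0], psi[1]], [psi[2], psi[3]]]  # c[i][j]: amplitude of part a in i, part b in j
    rho_a = [[sum(c[i][j] * c[k][j] for j in range(2)) for k in range(2)] for i in range(2)]
    rho_b = [[sum(c[i][j] * c[i][l] for i in range(2)) for l in range(2)] for j in range(2)]
    return rho_a, rho_b


def eigenvalues_sym2(m):
    """Eigenvalues of a real symmetric 2x2 matrix, from its trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return [(tr + disc) / 2, (tr - disc) / 2]


def entropy(rho):
    """Entropy -sum lambda ln lambda over the eigenvalues of rho, in units of k_B."""
    return -sum(x * math.log(x) for x in eigenvalues_sym2(rho) if x > 1e-15)


s = 1 / math.sqrt(2)
correlated = [s, 0.0, 0.0, s]      # the two parts are perfectly correlated
uncorrelated = [1.0, 0.0, 0.0, 0.0]  # each part is in a definite state of its own

sum_correlated = sum(entropy(r) for r in reduced_density_matrices(correlated))
sum_uncorrelated = sum(entropy(r) for r in reduced_density_matrices(uncorrelated))

print(sum_correlated, 2 * math.log(2), sum_uncorrelated)
```

In both cases the whole is in one microstate, but when the parts are correlated the sum of their separate entropies is $2\ln 2$ instead of zero. The excess measures exactly the correlations that have been ignored.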