Abstract

A thermodynamic expression for the analog of the canonical ensemble for nonequilibrium systems is described, based on a purely information-theoretical interpretation of entropy. It is shown that this nonequilibrium canonical distribution implies some important results from nonequilibrium thermodynamics, specifically, the fluctuation theorem and the Jarzynski equality. These results are therefore expected to be more widely applicable, for example, to macroscopic systems.

1. Introduction

The derivations of the fluctuation theorem [1, 2] and the Jarzynski equality [3] appear to depend on the underlying microscopic Hamiltonian dynamics. From this it would follow that these theorems are only relevant to microscopic systems, with their associated definitions of entropy and temperature. In contrast, a statistical mechanical description of macroscopic systems often depends on more general forms of entropy, primarily information entropy [4–6]. Two notable examples from fluid dynamics are the statistical mechanics of point vortices [7] and the statistical mechanics of two-dimensional incompressible flows [8]. In such cases, temperature is defined in terms of the change of entropy with the energy of the system [9] or, equivalently, in terms of the Lagrange multiplier for the energy under the maximization of entropy at a given expectation value of the energy [10].

The question is whether for such macroscopic systems we can derive a fluctuation theorem or Jarzynski equality. This is of particular importance for climate science as there are strong indications that the global state of the climate system and, more generally, other components of the Earth system may be governed by thermodynamic constraints on entropy production [11–15]. The theoretical underpinning of those thermodynamic constraints is still lacking. The presence of a fluctuation theorem for such systems would be of great importance.

Here we demonstrate that the information-theoretical definition of entropy implies the fluctuation theorem and the Jarzynski equality. It is shown that these results are due to the counting properties of entropy rather than the dynamics of the underlying system. As such, both these results are applicable to a much wider class of problems, specifically, macroscopic systems for which we can define an entropy and which are thermostated in some general sense.

The central tenet is that for two states 𝐴 and 𝐵 of a system, defined by two sets of macroscopic parameters, the ratio of the probabilities 𝑝_𝐵/𝑝_𝐴 for the system to be in either state is
\[
\frac{p_B}{p_A} = \exp\!\left(\frac{\Delta_{AB}S}{k}\right),
\tag{1}
\]
with Δ_{𝐴𝐵}𝑆 being the difference in entropy between the states 𝐵 and 𝐴. This is essentially the Boltzmann definition of entropy: entropy is a counting property of the system. The theoretical background can be found in [10], where it is shown that this information-theoretical interpretation reproduces the statistical mechanics based on Gibbs entropy and, furthermore, justifies the Gibbs formulation as a statistical inference problem under limited knowledge of the system. Of note is that the entropy only has meaning in relation to the macroscopic constraints on the system (indicated by the subscripts 𝐴 and 𝐵), constraints which can be arbitrarily complex and prescriptive, as may be needed for systems far from equilibrium. In an information-theoretical setting this definition of entropy is equivalent to the principle of indifference: the absence of any distinguishing information between microscopic states within any of the macroscopic states 𝐴 or 𝐵 is equivalent to equal prior (prior to obtaining additional macroscopic constraints) probabilities for the microscopic states [16]. Note also that we do not need to specify precisely at this point how the states are counted, or how an invariant measure can be defined on the phase space confined by 𝐴 or 𝐵. The principle of indifference does not imply that all states are assumed equally probable; it is a statement that we cannot a priori assume a certain structure in phase space (such as a precisely defined invariant measure) in the absence of further information. It is not a statement about the structure of phase space; it is a principle of statistical inference, and it is the only admissible starting point from an information-theoretical point of view.
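As a simple numerical illustration (added here for concreteness; not tied to any particular system): if state 𝐵 comprises twice as many microstates as state 𝐴, then
\[
\Delta_{AB}S = k\ln 2
\quad\Longrightarrow\quad
\frac{p_B}{p_A} = e^{\ln 2} = 2,
\]
so the system is simply twice as likely to be found in 𝐵 as in 𝐴.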

2. A General Form for the Canonical Ensemble

Following Boltzmann, we define the entropy 𝑆_𝐴 as the logarithm of the number of states accessible to a system under given macroscopic constraints 𝐴. For an isolated system, the entropy is related to the size Φ_𝐴 of the accessible phase space:
\[
S_A = k \ln \Phi_A.
\tag{2}
\]
For a classical gas system, 𝐴 is defined by the energy 𝑈, volume 𝑉, and molecule number 𝑁; the phase space size Φ_𝐴 is the hyperarea of the energy shell, and it defines the usual microcanonical ensemble. For more complicated systems, where 𝐴 may include several macroscopic order parameters, the energy shell becomes more confined; in the following we will still refer to the accessible phase space under constraints 𝐴 as the energy shell. The hyperarea Φ_𝐴 is nondimensionalised such that Φ_𝐴(𝑈) d𝑈 is proportional to the number of states between energies 𝑈 and 𝑈 + d𝑈. We will not consider other multiplicative factors which make the argument of the logarithm nondimensional; these contribute an additive entropy constant which will not be of interest to us here. Note also that the microcanonical ensemble does not include a notion of equilibrium: the system is assumed to be insulated, so it cannot equilibrate with an external system. It just moves around on the energy shell (defined by 𝐴), and the principle of indifference implies that all states, however improbable from a macroscopic point of view, are members of the ensemble. Of course, for macroscopic systems the number of unusual states (say, with nonuniform macroscopic properties not defined by 𝐴) is much lower than the number of regular states (say, with uniform macroscopic density). Only for small systems does the distinction become important, but even there it does not invalidate the formal definition of entropy above. This definition also ensures that entropy is an extensive property, such that for two independent systems considered together the total entropy is the sum of the individual entropies, 𝑆 = 𝑆_1 + 𝑆_2. The Boltzmann constant 𝑘 ensures dimensional compatibility with the classical thermodynamic entropy when the usual equilibrium assumptions are made [10, 17].
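A standard textbook example (added here for concreteness, not part of the argument) makes these counting properties explicit. For a classical ideal gas of 𝑁 particles in a volume 𝑉, the energy-shell hyperarea scales as
\[
\Phi(U) \propto V^N U^{3N/2 - 1},
\]
so 𝑆 = 𝑘 ln Φ grows linearly with system size (up to the additive constants ignored above), and for two independent systems the accessible states multiply, Φ = Φ_1 Φ_2, which is precisely why the entropies add.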

The hyperarea of the energy shell, and thus the entropy, can be a function of several variables which are set as external constraints, such as the total energy 𝑈, system volume 𝑉, or particle number 𝑁 for a simple gas system. For the canonical ensemble we consider a system that can exchange energy with some reservoir. We consider here only an idealised canonical ensemble, in which the coupling between the two systems is so weak that the interaction energy vanishes compared to the relevant energy fluctuations in the system.

First, we need to define what a reservoir is. Following equilibrium thermodynamics, we formally define an inverse temperature 𝛽 = (𝑘𝑇)^{-1} as
\[
\beta = \frac{1}{k}\frac{\partial S}{\partial U} = \frac{1}{\Phi}\frac{\partial \Phi}{\partial U}.
\tag{3}
\]
We make no claim about the equality of 𝛽 and the classical equilibrium inverse temperature; 𝛽 is the expansivity of phase space with energy and as such can be defined for any system, whether it is in thermodynamic equilibrium or not. When an isolated system is prepared far from equilibrium (e.g., when it has a local equilibrium temperature which varies over the system), then 𝛽 is still uniquely defined for the system as a nonlocal property of the energy shell that the system resides on. Because both energy and entropy in the weak coupling limit are extensive quantities, 𝛽 must be an intensive quantity.
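For the ideal-gas example quoted earlier (again a standard equilibrium check, added here for illustration), this definition reproduces the familiar temperature:
\[
\beta = \frac{1}{\Phi}\frac{\partial\Phi}{\partial U} = \frac{3N/2 - 1}{U} \approx \frac{3N}{2U}
\quad\Longrightarrow\quad
U \approx \tfrac{3}{2}NkT,
\]
although nothing in (3) requires such an equilibrium interpretation.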

Now consider a large isolated system 𝑅 with total (internal) energy 𝑈_𝑅. Let this system receive energy 𝑈 from the environment. By expanding its entropy 𝑆_𝑅 in powers of 𝑈, we can then write the entropy of this large system as
\[
S_R(U_R + U) = S_R(U_R) + k\left(\beta U + \frac{U^2}{2}\frac{\partial\beta}{\partial U} + \mathcal{O}(U^3)\right).
\tag{4}
\]
We see that for finite 𝑈, (∂𝛽/∂𝑈)^{-1} has to be an extensive quantity. But that means that for a very large system ∂𝛽/∂𝑈 = 𝒪(𝑁^{-1}), where 𝑁 is a measure of the size of the system (such as particle number). For a classical thermodynamic system ∂𝛽/∂𝑈 = −𝑘𝛽²/𝐶_𝑉, with 𝐶_𝑉 the heat capacity at constant volume. We conclude that for a very large system (𝑁 → ∞) the entropy equals
\[
S_R(U_R + U) = S_R(U_R) + k\beta U
\tag{5}
\]
for all relevant, finite energy exchanges 𝑈. This expression for the entropy defines a reservoir. The size of the energy shell accessible to the reservoir is, for all relevant energy exchanges 𝑈, exactly proportional to exp(𝛽𝑈), with 𝛽 an intensive and constant property of the reservoir. We do not require the reservoir to be in thermodynamic equilibrium. A change of energy in the reservoir pushes the reservoir to a different energy shell; the functional dependence of the size of the energy shell on energy defines the inverse temperature 𝛽, as in (3). However, it is not assured that a small and fast thermometer would measure an inverse temperature equal to 𝛽 at some point in the reservoir; only if the reservoir is allowed to equilibrate will its inverse temperature be everywhere equal to 𝛽. Of course, this is how the temperature of a classical reservoir is determined in practice.
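The classical expression quoted above follows in one line from 𝛽 = 1/(𝑘𝑇) and 𝐶_𝑉 = ∂𝑈/∂𝑇:
\[
\frac{\partial\beta}{\partial U} = \frac{\mathrm{d}\beta}{\mathrm{d}T}\,\frac{\partial T}{\partial U}
= -\frac{1}{kT^2}\,\frac{1}{C_V} = -\frac{k\beta^2}{C_V},
\]
which is indeed 𝒪(𝑁^{-1}) because 𝐶_𝑉 is extensive.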

Now suppose that a system of interest has energy 𝑈_0. We then allow it to exchange heat 𝑈 with a reservoir. If the system has energy 𝑈_0 + 𝑈, the reservoir must have given up energy 𝑈. We can write the hyperarea of the energy shell of the system Φ_0 as a function of 𝑈. The total entropy of the system plus reservoir 𝑅 can then be written as a function of the exchange energy 𝑈 as
\[
S = S_0(U) + S_R(U_R) - k\beta U,
\tag{6}
\]
with 𝑆_0 = 𝑘 ln Φ_0. The number of states at each level of exchange energy therefore is proportional to
\[
\Phi(U) \propto \Phi_0(U)\exp(-\beta U),
\tag{7}
\]
where we omitted proportionality constants related to the additive entropy constants. Nowhere do we assume that the system is in equilibrium with the reservoir. This means that Φ(𝑈) is the relevant measure to construct an ensemble average for the system, even for far-from-equilibrium systems. Even the reservoir can be locally out of equilibrium, as discussed previously. We have also made no reference to the size of the system of interest, as long as it is much smaller than the reservoir. However, in contrast to systems in thermodynamic equilibrium, there is no guarantee that the extensive macroscopic variables, such as 𝑈, 𝑉, or 𝑁, define the state of the system in any reproducible sense. To fully define an out-of-equilibrium system we need to introduce order parameters that can describe the nonequilibrium aspects of the system.
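A minimal numerical sketch of the density (7) is given below, assuming a hypothetical toy density of states Φ_0(𝑈) ∝ 𝑈^𝛼; the exponent 𝛼 and the value of 𝛽 are illustrative choices, not taken from the text.

```python
# Sketch of the generalised canonical density (7) for a toy density of
# states Phi0(U) ~ U**alpha; alpha and beta are illustrative values only.
import numpy as np

alpha, beta = 3.0, 2.0
U = np.linspace(0.0, 30.0, 30001)      # grid of exchange energies U
dU = U[1] - U[0]
phi = U**alpha * np.exp(-beta * U)     # Phi0(U) * exp(-beta * U), eq. (7)
p = phi / (phi.sum() * dU)             # normalise to a probability density

mean_U = (U * p).sum() * dU
print(mean_U, (alpha + 1) / beta)      # numerical vs exact mean: both ~2.0
```

For this toy choice the density is a Gamma distribution with mean (𝛼 + 1)/𝛽; the rapidly growing Φ_0 of a macroscopic system would make the density far more sharply peaked.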

The previous density is an integrated version of the usual canonical distribution. The size of the energy shell of the system of interest, Φ_0, can be written as an integral over states Γ such that
\[
\Phi_0(U) = \int_{H_0(\Gamma) = U} \mathrm{d}\Gamma,
\tag{8}
\]
with 𝐻_0 being the Hamiltonian of the system of interest. With this definition, the density in (7) reduces to the usual canonical distribution exp(−𝛽𝐻_0(Γ)) for states Γ. We will not make further use of this microscopic version of the density.

3. Fluctuation Theorems

The canonical density in (7) can be expanded by parametrizing each energy shell with some continuous coordinate 𝜐 so that every part of phase space has coordinates (𝑈, 𝜐). The coordinate 𝜐 is again a macroscopic coordinate, so that any combination (𝑈, 𝜐) can correspond to many microscopic states. At each value of 𝜐 the differential 𝜙(𝑈, 𝜐) d𝑈 d𝜐 is proportional to the number of states between coordinate values 𝑈 and 𝑈 + d𝑈, and 𝜐 and 𝜐 + d𝜐, and it is normalised such that
\[
\int \phi(U, \upsilon)\,\mathrm{d}\upsilon = \Phi_0(U).
\tag{9}
\]
The parametrisation is arbitrary at this point and can be chosen so as to divide the phase space into as fine a structure as desired for a given application. We can define an entropy 𝑆_0(𝑈, 𝜐), again as the logarithm of the number of available states for the system of interest, corresponding to the subset of phase space defined by (𝑈, 𝜐):
\[
S_0(U, \upsilon) = k \ln \phi(U, \upsilon).
\tag{10}
\]
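One concrete choice of parametrisation (an illustrative example added here, not from the original argument): for a system made of two weakly coupled subsystems with energy-shell hyperareas Φ_1 and Φ_2, let 𝜐 ∈ [0, 1] be the fraction of the total energy residing in subsystem 1, so that
\[
\phi(U, \upsilon) \propto \Phi_1(\upsilon U)\,\Phi_2\bigl((1 - \upsilon)U\bigr),
\]
and 𝑆_0(𝑈, 𝜐) counts the microstates compatible with that particular internal partition of the energy.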

Now consider a process that occurs on the energy shell 𝑈 where some variable changes from 𝐴 to 𝐵. On the parametrized energy shell this corresponds to a coordinate shift from 𝜐(𝐴) to 𝜐(𝐵). The number of corresponding states changes from 𝜙(𝑈, 𝜐(𝐴)) to 𝜙(𝑈, 𝜐(𝐵)). We can use detailed balance to express the ratio of the probability of making this transition to the probability of making the reverse transition as the ratio of the number of states at (𝑈, 𝜐(𝐵)) to the number of states at (𝑈, 𝜐(𝐴)):
\[
\frac{p_{A\to B}}{p_{B\to A}} = \frac{\phi(U, \upsilon(B))}{\phi(U, \upsilon(A))} = \exp\!\left(\frac{\Delta_{AB}S}{k}\right),
\tag{11}
\]
where Δ_{𝐴𝐵}𝑆 = 𝑆_0(𝑈, 𝜐(𝐵)) − 𝑆_0(𝑈, 𝜐(𝐴)). If, in addition, during the process 𝐴 → 𝐵 the energy of the system of interest changes from 𝑈_𝐴 to 𝑈_𝐵 through exchange with the reservoir, then the previous ratio of probabilities can still be expressed as exp(Δ_{𝐴𝐵}𝑆/𝑘), but now with
\[
\Delta_{AB}S = S_0(U_B, \upsilon(B)) - S_0(U_A, \upsilon(A)) - k\beta(U_B - U_A).
\tag{12}
\]
We can always write the entropy change of the system of interest as the sum of the entropy change due to heat exchange with the reservoir and an irreversible entropy change associated with uncompensated heat [14, 18], namely, 𝑆_0(𝑈_𝐵, 𝜐(𝐵)) − 𝑆_0(𝑈_𝐴, 𝜐(𝐴)) = 𝑘𝛽(𝑈_𝐵 − 𝑈_𝐴) + Δ_𝑖𝑆_0. We thus conclude that Δ_{𝐴𝐵}𝑆 = Δ_𝑖𝑆_0; that is, the relevant entropy change in (11) equals the irreversible entropy change of the system of interest. So for processes that occur either on or across energy shells, we have
\[
\frac{p_{A\to B}}{p_{B\to A}} = \exp\!\left(\frac{\Delta_i S_0}{k}\right),
\tag{13}
\]
with Δ_𝑖𝑆_0 being the irreversible entropy change of the system in a process 𝐴 → 𝐵. The right-hand side of this equation depends only on the irreversible entropy change Δ_𝑖𝑆_0 between the two states of the system of interest. So this equation must be true for any pair of states (𝐴, 𝐵) that are related by the same irreversible entropy change. We thus arrive at the fluctuation theorem [1, 2]:
\[
\frac{p(\Delta_i S)}{p(-\Delta_i S)} = \exp\!\left(\frac{\Delta_i S}{k}\right),
\tag{14}
\]
with 𝑝(Δ_𝑖𝑆) being the probability that the system of interest makes a transition with irreversible entropy change Δ_𝑖𝑆 and 𝑝(−Δ_𝑖𝑆) being the probability for the opposite change.
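A minimal Monte Carlo sketch of the counting argument behind (11) is given below. It assumes a hypothetical dynamics that is indifferent between microstates on one energy shell (each step jumps to a uniformly random microstate); the macrostate multiplicities are illustrative values, not from the text.

```python
# Toy check of the transition-ratio relation (11): microstates on one
# "energy shell" are grouped into macrostates with multiplicities g, and
# the (hypothetical) dynamics is indifferent between microstates.
import numpy as np

rng = np.random.default_rng(1)
g = np.array([100, 400, 1500])          # microstates per macrostate A, B, C
macro = np.repeat(np.arange(3), g)      # macrostate label of each microstate
steps = rng.integers(0, g.sum(), 2_000_000)
path = macro[steps]                     # macrostate visited at each step

A, B = 0, 2
n_AB = np.sum((path[:-1] == A) & (path[1:] == B))
n_BA = np.sum((path[:-1] == B) & (path[1:] == A))
p_AB = n_AB / np.sum(path[:-1] == A)    # conditional probability p(A -> B)
p_BA = n_BA / np.sum(path[:-1] == B)    # conditional probability p(B -> A)
print(p_AB / p_BA, g[B] / g[A])         # both ~15 = exp(Delta_AB S / k)
```

The measured ratio of conditional transition probabilities converges to the ratio of multiplicities, exp(Δ_{𝐴𝐵}𝑆/𝑘), with no reference to any particular microscopic dynamics beyond indifference.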

The fluctuation theorem applies to spontaneous processes that occur in thermostated but otherwise isolated systems. We next consider processes that occur when we modify the system of interest by changing some external macroscopic parameter. The entropy of the energy shell 𝑈 is then also a function of some parameter 𝜆, namely, 𝑆 = 𝑆_𝜆(𝑈, 𝜐). Without loss of generality we set 𝜆 = 0 at 𝐴 and 𝜆 = 1 at 𝐵. In this case the irreversible entropy change in (13) is
\[
\Delta_i S = S_1(U_B, \upsilon(B)) - S_0(U_A, \upsilon(A)) - k\beta(U_B - U_A).
\tag{15}
\]
Apart from this, there is no change in the considerations leading to the fluctuation theorem. By definition, thermostated systems that receive work 𝑊_{𝐴𝐵} from their environment have an irreversible entropy change equal to
\[
\frac{\Delta_i S}{k} = \beta\left(W_{AB} - \Delta_{AB}F\right),
\tag{16}
\]
with Δ_{𝐴𝐵}𝐹 being the change in free energy going from 𝐴 to 𝐵. Recognising that the right-hand side is again only a function of the difference between the two states, we arrive at the Crooks fluctuation theorem [19]:
\[
\frac{p_{0\to 1}(W)}{p_{1\to 0}(-W)} = \exp\bigl(\beta(W - \Delta_{01}F)\bigr),
\tag{17}
\]
with 𝑝_{0→1}(𝑊) being the probability that the system absorbs work 𝑊 when 𝜆 changes from 0 to 1, and 𝑝_{1→0}(−𝑊) being the probability that the system performs work 𝑊 when 𝜆 changes in reverse from 1 to 0. Because the transition probabilities can be normalised with respect to the exchanged work, it is straightforward to use this equation to show that the expectation value of exp(−𝛽(𝑊 − Δ_{01}𝐹)) equals unity or, equivalently,
\[
\bigl\langle \exp(-\beta W) \bigr\rangle = \exp\bigl(-\beta\,\Delta_{01}F\bigr).
\tag{18}
\]
This is the Jarzynski equality [3].
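A short numerical sketch of (18) follows, for a hypothetical toy protocol (all parameters illustrative, working in units where the Boltzmann constant is unity): a particle equilibrated in a harmonic potential 𝑘_0𝑥²/2 whose stiffness is switched instantaneously to 𝑘_1, so that the work is 𝑊 = (𝑘_1 − 𝑘_0)𝑥²/2 and the exact free-energy change is Δ_{01}𝐹 = (2𝛽)^{-1} ln(𝑘_1/𝑘_0).

```python
# Jarzynski check, eq. (18), for an instantaneous stiffness quench of a
# harmonic trap; k0, k1 are trap stiffnesses (not Boltzmann's constant).
import numpy as np

rng = np.random.default_rng(0)
beta, k0, k1 = 1.0, 1.0, 4.0
x = rng.normal(0.0, 1.0 / np.sqrt(beta * k0), 1_000_000)  # equilibrium at k0
W = 0.5 * (k1 - k0) * x**2           # work done on the system by the quench

lhs = np.exp(-beta * W).mean()       # <exp(-beta W)> over realisations
dF = 0.5 / beta * np.log(k1 / k0)    # exact free-energy change of this model
print(lhs, np.exp(-beta * dF))       # both ~0.5
```

The quench is instantaneous, so no heat is exchanged during the switch and the energy jump is pure work; the sample average of exp(−𝛽𝑊) nevertheless recovers the equilibrium free-energy difference.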

The consistency of the previous argument is strengthened by the following independent route to calculating free energy changes. The phase space measure Φ(𝑈) can be normalised with the partition function 𝑍_𝜆:
\[
Z_\lambda = \int \Phi_\lambda(U) \exp(-\beta U)\,\mathrm{d}U,
\tag{19}
\]
where Φ_𝜆(𝑈) is proportional to the number of accessible states of the isolated system of interest when the external parameter is set to 𝜆. The equilibrium free energy for the thermostated system is
\[
F_\lambda = -\beta^{-1} \ln Z_\lambda.
\tag{20}
\]
Next we consider what happens to the equilibrium free energy of the system when we vary 𝜆 from 0 to 1. The partition function at 𝜆 = 1 satisfies
\[
Z_1 = \int \Phi_1(U) \exp(-\beta U)\,\mathrm{d}U
= \int \Phi_0(U) \exp\!\left(\frac{\Delta S}{k}\right) \exp(-\beta U)\,\mathrm{d}U
= Z_0 \left\langle \exp\!\left(\frac{\Delta S}{k}\right) \right\rangle,
\tag{21}
\]
where ⟨·⟩ denotes an ensemble average over the initial ensemble, and Δ𝑆 = 𝑘 ln(Φ_1(𝑈)/Φ_0(𝑈)). As before, the entropy change can be written as the sum of the entropy change due to heat exchange with the reservoir and the irreversible entropy change due to uncompensated heat. Because the system plus the reservoir is thermally insulated, any heat given to the reservoir must be compensated by work performed by the external parameter change. The entropy change can therefore be written as exp(Δ𝑆/𝑘) = exp(Δ_𝑖𝑆/𝑘 − 𝛽𝑊), so that we find
\[
\frac{Z_1}{Z_0} = \left\langle \exp\!\left(\frac{\Delta_i S}{k} - \beta W\right) \right\rangle.
\tag{22}
\]
Because (16) is true for any microscopic realisation of the process, the right-hand side of this equation is the same for every realisation and is equal to exp(−𝛽Δ𝐹). This is consistent with the equilibrium expression for the free energy, (20), from which it follows that exp(−𝛽Δ𝐹) = 𝑍_1/𝑍_0. This equation is only apparently in contradiction with the Jarzynski equality, (18). To arrive at the Jarzynski equality we recognise that (16) implies that ⟨exp(𝛽(Δ𝐹 − 𝑊))⟩ = ⟨exp(−Δ_𝑖𝑆/𝑘)⟩ = 1, where the last equality follows from integrating the fluctuation theorem over all values of Δ_𝑖𝑆.
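For the hypothetical harmonic quench used in the numerical sketch above, this route can be checked in closed form (written here as a configurational integral, equivalent to (19) with the corresponding Φ_𝜆(𝑈)):
\[
Z_\lambda = \int_{-\infty}^{\infty} e^{-\beta k_\lambda x^2/2}\,\mathrm{d}x = \sqrt{\frac{2\pi}{\beta k_\lambda}},
\qquad
\Delta_{01}F = -\beta^{-1}\ln\frac{Z_1}{Z_0} = \frac{1}{2\beta}\ln\frac{k_1}{k_0},
\]
which is exactly the free-energy change against which the sampled ⟨exp(−𝛽𝑊)⟩ was compared.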

4. Discussion

We have shown that the fluctuation theorem (14) and Jarzynski equality (18) follow from general counting properties of entropy and not from the underlying dynamics. As such we expect both results to be widely applicable to systems that are in some sense thermostated, that is, systems that are able to settle on a given expectation value for the total energy by interaction with a reservoir.

The climate system is potentially a nontrivial example of such a system: the incoming short-wave radiation from the Sun is balanced by long-wave (thermal infrared) radiation from the Earth to space. The corresponding equilibrium temperature is the bolometric temperature of the planet (about 255 K in the case of the Earth [14]). (The bolometric radiation temperature of the Earth is substantially lower than the observed average surface temperature of about 288 K because of the greenhouse effect of the atmosphere.) It is not obvious how to apply the fluctuation theorems to the climate system or how the entropy production in the climate system is related to the actual climate on Earth. For example, most of the entropy production in the climate system is due to degradation of radiation (e.g., [20]); namely, short-wavelength visible sunlight is thermalized by molecular absorption into molecular thermal energy corresponding to long-wavelength infrared radiation. This degradation of radiative energy is the main source of entropy production in the climate system, but as this entropy production resides only in the photon field, its relation to, for example, kinetic energy dissipation in the atmosphere is not clear. So from this example it appears that we need to select the relevant forms of entropy production before we can use them to make inferences about the climate system.

It remains to be seen whether the fluctuation theorems can be usefully applied to complex systems such as the climate, but we believe that the derivation presented here can pave the way for attempts in that direction.