Classical Mechanics/Lagrange Theory


This section contains several theoretical developments of the Lagrangian formalism that are not directly necessary for solving problems. However, these considerations help understand the theory more deeply and answer certain important questions.

Why does the extremum of a functional determine motion?

In the Lagrangian formulation of mechanics, the trajectory $q(t)$ is determined from the condition that the action functional $S[q(t)]$ should have an extremum. (The trajectory is not always a minimum of the action; in some cases it is merely a stationary point, i.e. a point where the functional derivative $\delta S/\delta q(t)$ vanishes.) This condition is known as the action principle. By now, you should be familiar with the mathematical procedures used to derive the equations of motion from the action principle.

So, at this point, you should be accustomed to the fact that the correct equations of motion for each mechanical system indeed follow from the action principle, if the Lagrangian is chosen appropriately. However, it might still feel like a mystery to you that Newton's laws are equivalent to the condition for the extremum of some functional. You might be asking yourself: why is this possible at all?

Here is one explanation that may help. Let us consider a simple mechanical system: a point mass $m$ moving in one dimension, with coordinate $x$, in a potential $V(x)$. (The same considerations can be easily generalized to more than one dimension and more than one coordinate.) Suppose that $x_0(t)$ is the correct trajectory according to Newton's law,

$$m\ddot x_0 = -V'(x_0).$$

How can we use a functional $S[x(t)]$ to express the condition that the trajectory $x(t)$ is the correct one? One way is to demand that the deviation of $x(t)$ from $x_0(t)$ is everywhere zero. This can be expressed using the functional

$$S_1[x] = \int_{t_1}^{t_2} \left(x(t) - x_0(t)\right)^2 dt.$$

It is clear that the functional $S_1[x]$ has the minimum value (obviously the minimum is 0) if and only if $x(t) = x_0(t)$ for all $t$. This is an example of how to use a functional to express some condition on functions: the functional $S_1[x]$ measures the deviation of $x(t)$ from $x_0(t)$ all along the way. The smallest possible deviation is no deviation at all; thus, the minimum of the functional $S_1$ is at the trajectory $x(t)$ that does not deviate at all from $x_0(t)$.

Another similar way to specify the trajectory is to use the functional

$$S_2[x] = \int_{t_1}^{t_2} \left(\dot x(t) - \dot x_0(t)\right)^2 dt.$$

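This functional, together with the boundary conditions $x(t_{1,2}) = x_0(t_{1,2})$, has the minimum value if and only if $x(t) = x_0(t)$ for all $t$. Indeed, the minimum value 0 requires $\dot x = \dot x_0$ everywhere, so $x(t)$ and $x_0(t)$ can differ only by a constant, and the boundary conditions force that constant to vanish.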

Admittedly, the functionals $S_1$ and $S_2$ do not help us to formulate the laws of mechanics, because they already contain the correct trajectory $x_0(t)$ explicitly. We shall now construct another functional, $S_3$, starting from $S_2$ and trying to eliminate the explicit dependence on $x_0(t)$.

Let us rewrite $S_2$ as

$$S_2[x] = \int_{t_1}^{t_2} \left(\dot x^2 - 2\dot x\dot x_0 + \dot x_0^2\right) dt.$$

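The third term, $\dot x_0^2$, is a fixed function and does not vary when we vary $x(t)$. Therefore we may omit that term from $S_2$. Furthermore, we would like to have $\ddot x_0$ rather than $\dot x_0$, since we could then use Newton's law for the correct trajectory. So let us integrate the second term by parts:

$$\int_{t_1}^{t_2} \left(\dot x^2 - 2\dot x\dot x_0\right) dt = \int_{t_1}^{t_2} \left(\dot x^2 + 2x\ddot x_0\right) dt - 2x\dot x_0\Big|_{t_1}^{t_2}.$$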

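The boundary term $-2x\dot x_0\big|_{t_1}^{t_2}$ does not vary with $x(t)$ since the boundary values of $x$ are fixed. Therefore we may omit that term. Finally, we use Newton's law to replace $\ddot x_0$ by $-V'(x_0)/m$:

$$\int_{t_1}^{t_2} \left(\dot x^2 - \frac{2}{m}\,x\,V'(x_0)\right) dt.$$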

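If we now assume that the trajectory $x(t)$ deviates very little from the correct trajectory $x_0(t)$, then we may approximately write

$$V(x) = V(x_0) + (x - x_0)V'(x_0) + O\left((x - x_0)^2\right),$$

so that

$$x\,V'(x_0) = V(x) - V(x_0) + x_0 V'(x_0) + O\left((x - x_0)^2\right).$$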

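The term quadratic in $(x - x_0)$ can be omitted under the above assumption. The terms $V(x_0)$ and $x_0 V'(x_0)$ can be omitted since they are independent of $x(t)$. Thus we find that the functional $S_2$ is equivalent, up to inessential terms that do not vary with $x(t)$, to the following functional:

$$S_3[x] = \int_{t_1}^{t_2} \left(\dot x^2 - \frac{2}{m}V(x)\right) dt.$$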

It is clear that $S_3$ coincides with the usual action, $\int \left(\tfrac{1}{2}m\dot x^2 - V(x)\right) dt$, up to the constant coefficient $2/m$.

In this way, we obtained a functional $S_3$ which has a minimum when $x(t)$ is very close to $x_0(t)$; i.e. it is a local minimum. The new functional does not depend explicitly on $x_0(t)$, just as we wanted. The price to pay is that this functional works only for small deviations from the correct trajectory. Indeed, the functional $S_3$ may have other minima or maxima which the original functional $S_2$ does not have. The only real justification for the correctness of $S_3$ is that the equations of motion coincide with Newton's law.
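
As a rough numerical illustration of this argument, the following Python sketch evaluates a functional of the form $\int (\dot x^2 - \tfrac{2}{m}V(x))\,dt$ on a family of trial trajectories. Everything specific in it is an assumption made for the example and does not come from the text: a harmonic potential $V(x) = kx^2/2$ with $m = k = 1$, the time interval $[0, 1]$, the Newtonian solution $x_0(t) = \sin t$, and the perturbation shape $\sin(\pi t)$, which vanishes at both endpoints. If the functional is stationary at $x_0(t)$, its change should scale as the square of the perturbation amplitude.

```python
import math

M, K, T = 1.0, 1.0, 1.0  # assumed mass, spring constant, and final time


def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def s3(eps):
    """Value of the functional on the trial path x0(t) + eps * sin(pi t / T)."""
    def x(t):
        return math.sin(t) + eps * math.sin(math.pi * t / T)

    def v(t):
        return math.cos(t) + eps * (math.pi / T) * math.cos(math.pi * t / T)

    def integrand(t):
        potential = 0.5 * K * x(t) ** 2
        return v(t) ** 2 - (2.0 / M) * potential

    return simpson(integrand, 0.0, T)


base = s3(0.0)
for eps in (1e-1, 1e-2, 1e-3):
    change = s3(eps) - base
    # A stationary point means the ratio change / eps**2 stays finite.
    print(eps, change, change / eps ** 2)
```

For this particular setup the ratio equals $(\pi^2 - 1)/2$ up to numerical error, so the change is purely quadratic and positive: $x_0(t)$ is a local minimum along this direction. Over longer time intervals (here, $T > \pi$) the same stationary point is no longer a minimum, which is why the action principle is stated in terms of an extremum.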

Why can we use arbitrary coordinates to write the Lagrangian?

In simple cases, the Lagrangian is equal to the difference of the kinetic and the potential energy terms. However, one needs to select some coordinates to describe these terms. It is then completely unimportant which variables are chosen as coordinates; these variables could be lengths, angles, or any functions of lengths and angles (but not velocities!). In other words, one can use any coordinate system, or even just parts of some coordinate systems, as long as the possible positions of every mass point are adequately described by the coordinates and the appropriate constraints. For this reason, the coordinates entering the Lagrangian are called generalized coordinates. Usually, one chooses generalized coordinates for convenience, to minimize the required computational work, or to decrease the number of necessary constraints.

However, you may be asking yourself: why is it that one is allowed to use arbitrary coordinates in the Lagrangian formalism? Certainly, as we know, Newton's laws are not the same in different coordinates: for instance, the mass times the acceleration is equal to the force only if the acceleration is computed as $\ddot{\mathbf{x}}$, where $\mathbf{x}$ is the vector of Cartesian coordinates $(x, y, z)$. This formula would be incorrect if the vector $\mathbf{x}$ were to consist of, say, the radius $\rho$, the azimuthal angle $\phi$ in the $x$-$y$ plane, and the coordinate $z$. However, the Lagrangian formalism will work just fine if we express the kinetic and the potential energy through the variables $(\rho, \phi, z)$. The equations of motion will be given by the Euler-Lagrange equation,

$$\frac{d}{dt}\frac{\partial L}{\partial \dot q_i} - \frac{\partial L}{\partial q_i} = 0,$$

as before. One says that the Lagrangian formalism is covariant with respect to coordinate transformations.
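
A hedged numerical sketch may make this concrete. Assume a free particle of mass $m$ moving in a plane (the $z$ coordinate is dropped for brevity), so that in polar coordinates $L = \tfrac{m}{2}(\dot\rho^2 + \rho^2\dot\phi^2)$. The Euler-Lagrange equations for $\rho$ and $\phi$ then read $\ddot\rho = \rho\dot\phi^2$ and $\ddot\phi = -2\dot\rho\dot\phi/\rho$. The code below integrates these two equations with a hand-written fourth-order Runge-Kutta step and converts the result back to Cartesian coordinates, where Newton's law predicts uniform straight-line motion. The function names, initial data, and step count are choices made for this illustration.

```python
import math


def polar_rhs(state):
    """Right-hand side of the Euler-Lagrange equations of a free particle
    in polar coordinates; state = (rho, phi, rho_dot, phi_dot)."""
    rho, phi, rho_dot, phi_dot = state
    return (rho_dot,
            phi_dot,
            rho * phi_dot ** 2,
            -2.0 * rho_dot * phi_dot / rho)


def shifted(state, slope, factor):
    """Return state + factor * slope, component by component."""
    return tuple(s + factor * k for s, k in zip(state, slope))


def rk4_step(state, dt):
    """One classical Runge-Kutta step of size dt."""
    k1 = polar_rhs(state)
    k2 = polar_rhs(shifted(state, k1, dt / 2))
    k3 = polar_rhs(shifted(state, k2, dt / 2))
    k4 = polar_rhs(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate_polar(x0, y0, vx, vy, t_end, steps=2000):
    """Evolve Cartesian initial data using the polar equations of motion
    and return the final position in Cartesian coordinates."""
    rho = math.hypot(x0, y0)
    phi = math.atan2(y0, x0)
    rho_dot = (x0 * vx + y0 * vy) / rho
    phi_dot = (x0 * vy - y0 * vx) / rho ** 2
    state = (rho, phi, rho_dot, phi_dot)
    dt = t_end / steps
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state[0] * math.cos(state[1]), state[0] * math.sin(state[1])


x_end, y_end = integrate_polar(1.0, 0.0, 0.3, 0.5, 2.0)
# Newton's law in Cartesian coordinates gives (1 + 0.3 * 2, 0 + 0.5 * 2).
print(x_end, y_end)
```

The polar equations look nothing like $m\ddot{\mathbf{x}} = 0$, yet they describe the same straight line. The sketch assumes the path stays away from the origin, where the polar coordinates are singular.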

The reason for this can be explained in two ways: either more formally, by showing that the Euler-Lagrange equations remain the same under an arbitrary change of coordinates; or more visually, by approaching the situation from the geometric point of view.

Formal derivation

For simplicity, we shall only consider a one-dimensional problem with a Lagrangian $L(q, \dot q, t)$, where $q$ is a generalized coordinate. The same consideration is very easily generalized to the case of multiple coordinates.

Suppose that a new coordinate $Q$ is chosen instead of $q$. The new coordinate can be a function of the old coordinate. Let us consider an even more general case where the change of coordinates depends on time (i.e. we may choose slightly different coordinates at different times). Then the new coordinate is related to the old one by a formula such as

$$q = f(Q, t),$$

where $f(Q, t)$ is a known function.

Now we need to express the old Lagrangian $L(q, \dot q, t)$ through the new variable $Q$ and its derivative $\dot Q$. We have

$$\dot q = \frac{d}{dt}f(Q, t) = f_{,Q}\,\dot Q + f_{,t},$$

where we denote partial derivatives by subscripts with commas, e.g. $f_{,Q} \equiv \partial f/\partial Q$. This is a condensed notation frequently used in physics.

The Lagrangian expressed through the new variable $Q$ is therefore

$$\tilde L(Q, \dot Q, t) = L\left(f(Q, t),\; f_{,Q}\,\dot Q + f_{,t},\; t\right).$$

The new variable $Q$ is a good variable if it is a nontrivial function of the old one, i.e. if $f_{,Q} \neq 0$. Then the new Lagrangian will be a nontrivial function that depends on $\dot Q$ as well as on $Q$. So we shall assume that $f_{,Q} \neq 0$ at least within some interval of $Q$.

Now let us compare the equations of motion (EOM) that we would derive in the old coordinates and in the new coordinates.

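The old EOM can be written as

$$\frac{d}{dt}L_{,\dot q} - L_{,q} = 0.$$

The new EOM is

$$\frac{d}{dt}\tilde L_{,\dot Q} - \tilde L_{,Q} = 0.$$

Let us express this equation through $L$ instead of $\tilde L$:

$$\tilde L_{,\dot Q} = L_{,\dot q}\,f_{,Q},$$

$$\tilde L_{,Q} = L_{,q}\,f_{,Q} + L_{,\dot q}\left(f_{,QQ}\,\dot Q + f_{,Qt}\right).$$

Therefore, the new EOM is

$$\frac{d}{dt}\left(L_{,\dot q}\,f_{,Q}\right) - L_{,q}\,f_{,Q} - L_{,\dot q}\left(f_{,QQ}\,\dot Q + f_{,Qt}\right) = 0.$$

Simplifying this expression with the help of $\frac{d}{dt}f_{,Q} = f_{,QQ}\,\dot Q + f_{,Qt}$, we find

$$f_{,Q}\left(\frac{d}{dt}L_{,\dot q} - L_{,q}\right) = 0.$$

We find that the new EOM is indeed equivalent to the old one, under the assumption that $f_{,Q} \neq 0$.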
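
The result can be spot-checked numerically on one hypothetical example, which is not taken from the text. Write the old coordinate as $q$ and the new one as $Q$, take the oscillator Lagrangian $L = \tfrac{1}{2}\dot q^2 - \tfrac{1}{2}q^2$ (unit mass and frequency), and choose the time-independent change of variable $q = e^Q$, whose derivative with respect to $Q$ never vanishes. The transformed Lagrangian is $\tilde L = \tfrac{1}{2}e^{2Q}(\dot Q^2 - 1)$, and its Euler-Lagrange equation reduces to $\ddot Q + \dot Q^2 + 1 = 0$. The old solution $q(t) = \cos t$ maps to $Q(t) = \ln\cos t$ for $0 \le t < \pi/2$. The sketch below estimates the derivatives of $Q(t)$ by central finite differences and evaluates the left-hand side of the new equation.

```python
import math


def new_coordinate(t):
    """Q(t) = ln q(t) for the old solution q(t) = cos t (needs cos t > 0)."""
    return math.log(math.cos(t))


def new_eom_residual(t, h=1e-4):
    """Left-hand side of the new equation of motion, Q'' + Q'**2 + 1,
    with derivatives approximated by central finite differences."""
    q_minus = new_coordinate(t - h)
    q_mid = new_coordinate(t)
    q_plus = new_coordinate(t + h)
    first = (q_plus - q_minus) / (2 * h)
    second = (q_plus - 2 * q_mid + q_minus) / h ** 2
    return second + first ** 2 + 1.0


for t in (0.1, 0.5, 1.0):
    # Each residual should vanish up to finite-difference error.
    print(t, new_eom_residual(t))
```

The residuals are small compared with the individual terms, which are of order one, so the image of the old solution does satisfy the equation derived from the transformed Lagrangian.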

Geometric picture

The computation presented above is straightforward and explicit, but may leave you wondering why it works. Here is a more visual explanation.

The Euler-Lagrange equations express the condition that the functional $S[q(t)]$ has an extremum at the trajectory $q(t)$. Let us imagine a space of all trajectories, i.e. some huge space where each "point" represents one entire trajectory $q(t)$. The functional $S$ has an extremum at some "point" $q_0(t)$ which is the actual trajectory of the mechanical system. When we change coordinates, $q \to Q$, we merely change our description of this space of trajectories. We cannot change the fact that the functional $S$ has an extremum somewhere, at some "point" $q_0(t)$. We may only change our description of this "point". Therefore, after a change of variables the new functional $\tilde S[Q(t)]$ will again have an extremum at some "point" $Q_0(t)$, and this "point" $Q_0(t)$ will have to correspond to the "point" $q_0(t)$ after the change of variables. The existence of the extremum is a geometric characteristic of the shape of the functional $S$; that is why it is independent of the way we choose to describe it with coordinates.

Let us consider a simple example where we use functions instead of functionals. Suppose the function $f(x)$ has a minimum at $x = x_0$. We may change coordinates and use $y$ instead of $x$, where $x = x(y)$ is some given function. This is a well-defined change of variables on any interval of $y$ where $dx/dy \neq 0$. In the new coordinates, the function $f$ looks like $\tilde f(y) \equiv f(x(y))$. This function has a minimum at $y = y_0$ where $x(y_0) = x_0$. But geometrically speaking, this is exactly the same function as before, except viewed in different coordinates. Therefore, it is no surprise that the minimum $y_0$ is the old minimum $x_0$ after the change of coordinates.

This equivalence can be seen more formally. The condition for the minimum of the function $\tilde f(y)$ is

$$\frac{d\tilde f}{dy} = \frac{df}{dx}\,\frac{dx}{dy} = 0.$$

This condition is equivalent to the condition for the minimum of the function $f(x)$, namely $df/dx = 0$, as long as $dx/dy \neq 0$. This is why the position of the minimum in the old coordinates, $x_0$, exactly corresponds to the position of the minimum in the new coordinates, $y_0$.
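
For a concrete instance, the sketch below uses an assumed function $f(x) = (x - 1)^2$ and an assumed change of variable $x = y^2$ on $y > 0$, where $dx/dy = 2y \neq 0$. Neither choice comes from the text. It locates the minimum in each coordinate independently with a golden-section search and then checks that the two minima correspond to each other.

```python
import math


def f(x):
    """Function in the old coordinate (an assumed example)."""
    return (x - 1.0) ** 2


def g(y):
    """Assumed change of variable x = g(y), valid for y > 0."""
    return y * y


def f_new(y):
    """The same function viewed in the new coordinate."""
    return f(g(y))


def golden_min(func, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a function that has a
    single minimum on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if func(c) < func(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2


x0 = golden_min(f, -2.0, 4.0)      # minimum in the old coordinate
y0 = golden_min(f_new, 0.2, 3.0)   # minimum in the new coordinate
print(x0, y0, g(y0))               # g(y0) should agree with x0
```

The search in $y$ never uses the location of the minimum in $x$, yet $g(y_0)$ lands on $x_0$. The search interval for $y$ is kept away from $y = 0$, where $dx/dy$ vanishes and the change of variable stops being well-defined.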

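Similarly, when we consider functionals, we may write the condition for the minimum of $\tilde S[Q(t)]$ in new coordinates as

$$\frac{\delta \tilde S}{\delta Q(t)} = \frac{\delta S}{\delta q(t)}\,\frac{\partial q}{\partial Q} = 0.$$

It is clear that the condition for the minimum remains the same under the change of variables, as long as the new variables are well-defined, i.e. $\partial q/\partial Q \neq 0$.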

Is the Lagrangian unique?

Another important question is whether there is only one Lagrangian that yields the correct equations of motion for a given system. The answer is that there are infinitely many different Lagrangians that can be used for any given system.

First of all, one may always multiply the Lagrangian by a nonzero constant $c$ and also add an arbitrary fixed function of time, $g(t)$, to the Lagrangian. The modified Lagrangian is then $\tilde L = cL + g(t)$. The term $g(t)$ is "fixed" in the sense that it does not depend on $q(t)$. Then we can integrate this term explicitly and express the modified action as

$$\tilde S[q] = \int_{t_1}^{t_2} \left(cL + g(t)\right) dt = c\,S[q] + \int_{t_1}^{t_2} g(t)\,dt.$$

The last term above is simply a number. Clearly, this modification of the action is irrelevant: if $q(t)$ is an extremum of $S$, then it is also an extremum of $\tilde S$. Multiplying a function by a nonzero constant or adding a constant to it does not change the position of the extrema.

More generally, we may add an arbitrary total time derivative to the Lagrangian:

$$\tilde L = L + \frac{d}{dt}F(q, t).$$

The resulting modification of the action is

$$\tilde S[q] = S[q] + F(q_2, t_2) - F(q_1, t_1),$$

where $q_{1,2} \equiv q(t_{1,2})$ are the boundary values of $q(t)$. Since these values are fixed and do not vary when we vary $q(t)$, the extra term in the action is again a constant. Therefore, this modification of the action does not change the equations of motion. One says that two Lagrangians differing by a total derivative are equivalent.
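
A small numerical check of this statement is sketched below, under assumptions chosen only for illustration: the oscillator Lagrangian $\tfrac{1}{2}\dot q^2 - \tfrac{1}{2}q^2$, the added function $F(q, t) = t\,q^2$, the time interval $[0, 1]$, and two trial paths that both run from $q = 0$ to $q = 1$. The difference between the modified and the original action should be the same number for both paths, namely the difference of the boundary values of $F$.

```python
def simpson(func, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3


def lagrangian(q, v, t):
    """Assumed original Lagrangian: unit-mass, unit-frequency oscillator."""
    return 0.5 * v * v - 0.5 * q * q


def boundary_function(q, t):
    """Assumed function F(q, t) whose total time derivative is added."""
    return t * q * q


def lagrangian_mod(q, v, t):
    """L + dF/dt, where dF/dt = (partial F / partial t) + (partial F / partial q) * v."""
    return lagrangian(q, v, t) + q * q + 2.0 * t * q * v


def action(lag, path, velocity, t1=0.0, t2=1.0):
    """Action of a path q(t) with velocity v(t) for the given Lagrangian."""
    return simpson(lambda t: lag(path(t), velocity(t), t), t1, t2)


# Two trial paths with the same endpoints q(0) = 0 and q(1) = 1.
paths = {
    "straight": (lambda t: t, lambda t: 1.0),
    "parabola": (lambda t: t * t, lambda t: 2.0 * t),
}

expected = boundary_function(1.0, 1.0) - boundary_function(0.0, 0.0)
for name, (path, velocity) in paths.items():
    shift = action(lagrangian_mod, path, velocity) - action(lagrangian, path, velocity)
    print(name, shift, expected)
```

Because the shift does not depend on which path is taken between the fixed endpoints, it drops out when the action is varied, and both Lagrangians select the same trajectory.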

One may even allow functions $F$ that depend on derivatives of $q$ as well as on $q$. However, in this case one would need to keep fixed also the values of the corresponding derivatives of $q$ at the boundary points $t_{1,2}$.

So, as we see, the Lagrangian for a given physical system is not unique. The recipe "kinetic energy minus potential energy" is merely a simple rule that yields a good Lagrangian.

The variety of equivalent Lagrangians is not limited to those that differ by a total derivative or by a constant coefficient. For example, the Lagrangians

$$L_1 = \dot q^2, \qquad L_2 = e^{\dot q}$$

lead to the same equation of motion,

$$\ddot q = 0,$$

even though one obviously cannot find a function $F(q, t)$ and a constant $c$ such that $L_2 = cL_1 + dF/dt$. (Such a function would produce at most an extra term linear in $\dot q$ in the Lagrangian, but not terms that are nonlinear in derivatives.)