Classical Mechanics/Lagrangian

Mechanics considered using forces

In Newtonian mechanics, a mechanical system is always made up of point masses or rigid bodies, and these are subject to known forces. One must therefore specify the composition of the system and the nature of forces that act on the various bodies. Then one writes the equations of motion for the system. Here are some examples of how one describes mechanical systems in Newtonian mechanics (these examples are surely known to you from school-level physics).

  • Example: a free mass point.

This is the most trivial of all mechanical systems: a mass point that does not interact with any other bodies and is subject to no forces. Introduce the coordinates $x, y, z$ to describe the position of the mass point. Since the force is always equal to zero, the equations of motion are $\ddot{x} = 0$, $\ddot{y} = 0$, $\ddot{z} = 0$. The general solution of these equations describes a linear motion with constant velocity: $x(t) = x_0 + v_x t$, etc.

  • Example: two point masses with springs attached to a motionless wall.

Two masses can move along a line (the $x$ axis) without friction. The mass $m_1$ is attached to the wall by a spring, and the mass $m_2$ is attached to the mass $m_1$ by a spring. Both springs have spring constant $k$ and the unstretched length $\ell$.

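To write the equations of motion, we first introduce the two coordinates $x_1, x_2$ of the masses (measured from the wall) and then consider the forces acting on the two masses. The force on the mass $m_1$ is the sum of the leftward-pointing force $F_1$ from the left spring and the rightward-pointing force $F_2$ from the right spring. The force on $m_2$ is a leftward-pointing $F_2$. By definition of a "spring" (with spring constant $k$ and unstretched length $\ell$) we have $F_1 = k(x_1 - \ell)$ and $F_2 = k(x_2 - x_1 - \ell)$. Therefore we write the equations for the accelerations $a_1 = \ddot{x}_1$, $a_2 = \ddot{x}_2$ of the two masses:

$$\ddot{x}_1 = \frac{-F_1 + F_2}{m_1} = \frac{k}{m_1}\left(x_2 - 2x_1\right), \qquad \ddot{x}_2 = -\frac{F_2}{m_2} = -\frac{k}{m_2}\left(x_2 - x_1 - \ell\right).$$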

At this point we are finished describing the system; we now need to solve these equations for particular initial conditions and determine the actual motion of this system.
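As a rough illustration of what "solving these equations" involves, here is a minimal Python sketch that integrates the two-mass system numerically. It assumes the wall sits at $x = 0$, that $m_1$ lies between the wall and $m_2$, and that the spring forces are $k(x_1 - \ell)$ and $k(x_2 - x_1 - \ell)$. The function names, the parameter values, and the velocity-Verlet scheme are illustrative choices, not something the text prescribes.

```python
def accelerations(x1, x2, m1=1.0, m2=1.0, k=1.0, ell=1.0):
    """Newtonian accelerations of the two masses (wall assumed at x = 0)."""
    f1 = k * (x1 - ell)        # left spring: pulls m1 leftward when stretched
    f2 = k * (x2 - x1 - ell)   # right spring: pulls m1 rightward, m2 leftward
    return (-f1 + f2) / m1, -f2 / m2


def simulate(x1, x2, v1, v2, dt=1e-3, steps=10_000, **params):
    """Velocity-Verlet integration; returns the final (x1, x2, v1, v2)."""
    a1, a2 = accelerations(x1, x2, **params)
    for _ in range(steps):
        x1 += v1 * dt + 0.5 * a1 * dt * dt
        x2 += v2 * dt + 0.5 * a2 * dt * dt
        b1, b2 = accelerations(x1, x2, **params)
        v1 += 0.5 * (a1 + b1) * dt
        v2 += 0.5 * (a2 + b2) * dt
        a1, a2 = b1, b2
    return x1, x2, v1, v2


def energy(x1, x2, v1, v2, m1=1.0, m2=1.0, k=1.0, ell=1.0):
    """Total energy: kinetic plus the elastic energy of both springs."""
    kinetic = 0.5 * m1 * v1 ** 2 + 0.5 * m2 * v2 ** 2
    potential = 0.5 * k * (x1 - ell) ** 2 + 0.5 * k * (x2 - x1 - ell) ** 2
    return kinetic + potential


if __name__ == "__main__":
    start = (1.2, 2.0, 0.0, 0.0)   # m1 displaced from equilibrium, both at rest
    end = simulate(*start)
    print("final state:", end)
    print("energy drift:", energy(*end) - energy(*start))
```

The total energy should stay nearly constant along the computed motion, which is a quick sanity check on the integration.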

Introducing the action principle

The Lagrangian description of a mechanical system is rather different: first, we do not ask for the evolution of the system given some initial conditions, but instead assume that the position of the system at two different time moments $t_1$ and $t_2$ is known and fixed. For convenience, let us collect all coordinates (such as $x, y, z$ or $x_1, x_2$ above) into one array of "generalized coordinates" and denote them by $q_i$. So the "boundary conditions" that we impose on the system are $q_i(t_1) = A_i$ and $q_i(t_2) = B_i$, where $A_i, B_i$ are fixed numbers. We now ask: how does the system move between the time moments $t_1$ and $t_2$? The Lagrangian description answers: during that time, the system must move in such a way as to give the minimum value to the integral $\int_{t_1}^{t_2} L(q_i, \dot{q}_i)\,dt$, where $L(q_i, \dot{q}_i)$ is a known function called the Lagrange function or Lagrangian. For example, the Lagrangian for a free mass point is

$$L = \frac{m}{2}\left(\dot{x}^2 + \dot{y}^2 + \dot{z}^2\right).$$

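The Lagrangian for the above example with two masses attached to the wall (taking the wall at $x = 0$, with spring constant $k$ and unstretched length $\ell$) is

$$L = \frac{m_1}{2}\dot{x}_1^2 + \frac{m_2}{2}\dot{x}_2^2 - \frac{k}{2}\left(x_1 - \ell\right)^2 - \frac{k}{2}\left(x_2 - x_1 - \ell\right)^2.$$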

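For instance, according to the Lagrangian description, the free point mass moves in such a way that the functions $x(t), y(t), z(t)$ give the minimum value to the integral $\int_{t_1}^{t_2} \frac{m}{2}\left(\dot{x}^2 + \dot{y}^2 + \dot{z}^2\right) dt$, where the values of $x, y, z$ at times $t_1, t_2$ are fixed.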

In principle, to find the minimum value of the integral $\int_{t_1}^{t_2} L\,dt$ one would have to evaluate that integral for each possible trajectory $q_i(t)$ and then choose the "optimal" trajectory for which this integral has the smallest value. (Of course, we shall learn and use a much more efficient mathematical approach to determine this "optimal" trajectory instead of trying every possible set of functions $q_i(t)$.) The value of the mentioned integral is called the action corresponding to a particular trajectory $q_i(t)$. Therefore the requirement that the integral should have the smallest value is often called "the principle of least action" or just action principle.
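The brute-force idea of "evaluate the integral for each trajectory" can be tried on a handful of candidates. The following Python sketch does this for a free point mass in one dimension, assuming the Lagrangian $\frac{m}{2}\dot{x}^2$ and the hypothetical boundary values $x(0) = 0$, $x(1) = 2$; the three trial paths and all names are made up for illustration.

```python
import math


def action(path, t1, t2, m=1.0, n=1000):
    """Approximate the integral of (m/2) * xdot**2 over [t1, t2]
    using finite-difference velocities on n equal time steps."""
    dt = (t2 - t1) / n
    total = 0.0
    for j in range(n):
        ta = t1 + j * dt
        v = (path(ta + dt) - path(ta)) / dt
        total += 0.5 * m * v * v * dt
    return total


# Three trial paths, all with x(0) = 0 and x(1) = 2.
def straight(t):
    return 2.0 * t


def wiggly(t):
    return 2.0 * t + 0.3 * math.sin(math.pi * t)


def curved(t):
    return 2.0 * t * t


if __name__ == "__main__":
    for path in (straight, wiggly, curved):
        print(path.__name__, action(path, 0.0, 1.0))
```

Among these candidates the uniform straight-line motion gives the smallest action, in line with the Newtonian result that a free mass moves with constant velocity.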

At this point, we need to answer the pressing question:

  • How can it be that the correct trajectory $q_i(t)$ is found not by considering the forces but by requiring that some integral should have the minimum value? How does each point mass "know" that it needs to minimize some integral when it moves around?

The short answer is that the least action requirement is mathematically equivalent to the consideration of forces if the Lagrangian $L$ is chosen correctly. The condition that some integral has the minimum value (when the integral is correctly chosen) is mathematically the same as the Newtonian equations for the acceleration. The point masses perhaps "know" nothing about this integral. It is simply mathematically convenient to formulate the mechanical laws in one sentence rather than in many sentences. (We shall see another, more intuitive explanation below.)

Suppose that we understand how the requirement that an integral has the minimum value can be translated into equations for the acceleration. Obviously the form of the integral needs to be different for each mechanical system since the equations of motion are different. Then the second question presents itself:

  • How can we find the Lagrange function $L$ corresponding to each mechanical system?

This is a more complicated problem and one needs to study many examples to gain a command of this approach. (In brief: the Lagrange function is the kinetic energy minus the potential energy.)

Before considering Lagrange functions, we shall look at how the mathematical requirement of "least action" can be equivalent to equations of motion such as given in the examples above.

Variation of a functional

A function is a map from numbers into numbers; a functional is a map from functions into numbers. An application of a functional to a function is usually denoted by square brackets, e.g. $S[f]$.

Random examples of functionals, just to illustrate the concept:


In principle, a functional can be anything that assigns a number to any function. In practice, only some functionals are interesting and have applications in physics.
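In programming terms, a functional is simply a routine that takes a function as its argument and returns a number. The Python sketch below shows three such routines; they are illustrative inventions (not the examples originally listed in this section), and the helper `integrate` is a plain midpoint rule written here for self-containment.

```python
import math


def integrate(g, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def value_at_zero(f):
    """A[f] = f(0): a functional that just samples the function."""
    return f(0.0)


def integral_of_square(f):
    """B[f] = integral of f(x)**2 over [0, 1]."""
    return integrate(lambda x: f(x) ** 2, 0.0, 1.0)


def mean_slope(f):
    """C[f] = (f(1) - f(0)) / 1: average slope of f over [0, 1]."""
    return f(1.0) - f(0.0)


if __name__ == "__main__":
    for functional in (value_at_zero, integral_of_square, mean_slope):
        print(functional.__name__, functional(math.cos))
```

Each routine accepts a whole function (here `math.cos`) and returns one number, which is all that the definition of a functional requires.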

Since the action integral maps trajectories into numbers, we can call it the action functional. The action principle is formulated as follows: the trajectory $q_i(t)$ must be such that the action functional evaluated on this trajectory has the minimum value among all trajectories.

This may appear to be similar to the familiar condition for the mechanical equilibrium: the coordinates $x, y, z$ are such that the potential energy has the minimum value. However, there is a crucial difference: when we minimize the potential energy, we vary the three numbers $x, y, z$ until we find the minimum value; but when we minimize a functional, we have to vary the whole function $q_i(t)$ until we find the minimum value of the functional.

The branch of mathematics known as calculus of variations studies the problem of minimizing (maximizing, extremizing) functionals. One needs to learn a little bit of variational calculus at this point. Let us begin by solving some easy minimization problems involving functions of many variables; this will prepare us for dealing with functionals which can be thought of as functions of infinitely many variables. You should try the examples yourself before looking at the solutions.

Example 1: Minimize the function   with respect to  .

Solution: Compute the partial derivatives of   with respect to  . These derivatives must both be equal to zero. This can only happen if  .

Example 2: Minimize the function   with respect to all  .

Solution: Compute the partial derivatives of   with respect to all  , where  . These derivatives must all be equal to zero. This can only happen if all  .

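Example 3: Minimize the function $f(x_1, \dots, x_{n-1}) = (x_1 - x_0)^2 + (x_2 - x_1)^2 + \dots + (x_n - x_{n-1})^2$ with respect to all $x_j$ subject to the restrictions $x_0 = 0$, $x_n = A$.

Solution: Compute the partial derivatives of $f$ with respect to $x_j$, where $j = 1, \dots, n-1$. These derivatives must all be equal to zero. This can only happen if $x_{j+1} - x_j = x_j - x_{j-1}$ for $j = 1, \dots, n-1$. The values $x_0, x_n$ are known, therefore we find $x_j = jA/n$.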

Intuitive calculation

Let us now consider the problem of minimizing the functional $S[x] = \int_0^1 \dot{x}^2\,dt$ with respect to all functions $x(t)$ subject to the restrictions $x(0) = 0$, $x(1) = A$. We shall first perform the minimization in a more intuitive but approximate way, and then we shall see how the same task is handled more elegantly by the variational calculus.

Let us imagine that we are trying to minimize the integral $S[x]$ with respect to all functions $x(t)$ using a digital computer. The first problem is that we cannot represent "all functions" $x(t)$ on a computer because we can only store finitely many values $x_j$ in an array within the computer memory. So we split the time interval $[0, 1]$ into a large number $N$ of discrete steps $t_0 = 0, t_1, \dots, t_N = 1$, where the step size $\Delta t = t_{j+1} - t_j$ is small; in other words, $t_j = j\,\Delta t$ with $\Delta t = 1/N$. We can describe the function $x(t)$ by its values $x_j = x(t_j)$ at the points $t_j$, assuming that the function $x(t)$ is a straight line between these points. The time moments $t_j$ will be kept fixed, and then the various values $x_j$ will correspond to various possible functions $x(t)$. (In this way we definitely will not describe all possible functions $x(t)$, but the class of functions we do describe is broad enough so that we get the correct results in the limit $N \to \infty$. Basically, any function $x(t)$ can be sufficiently well approximated by one of these "piecewise-linear" functions when the step size $\Delta t$ is small enough.)

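Since we have discretized the time and reduced our attention to piecewise-linear functions, we have

$$\dot{x}(t) = \frac{x_{j+1} - x_j}{\Delta t}$$

within each interval $t_j < t < t_{j+1}$. So we can express the integral $S[x]$ as the finite sum,

$$S[x] = \sum_{j=0}^{N-1} \frac{\left(x_{j+1} - x_j\right)^2}{\Delta t},$$

where we have defined for convenience $x_0 = 0$, $x_N = A$.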

At this point we can perform the minimization of $S[x]$ quite easily. The functional $S[x]$ is now a function of $N-1$ variables $x_1, \dots, x_{N-1}$, i.e. $S = S(x_1, \dots, x_{N-1})$, so the minimum is achieved at the values $x_j$ where the derivatives of $S$ with respect to each $x_j$ are zero. This problem is now quite similar to Example 3 above, so the solution is $x_j = jA/N$. Now we recall that $x_j$ is the value of the unknown function $x(t)$ at the point $t_j = j/N$. Therefore the minimum of the functional $S[x]$ is found at the values $x_j$ that correspond to the function $x(t) = At$. As we increase the number $N$ of intervals, we still obtain the same function $x(t) = At$, therefore the same function is obtained in the limit $N \to \infty$. We conclude that the function $x(t) = At$ minimizes the functional $S[x]$ with the restrictions $x(0) = 0$, $x(1) = A$.
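The computer experiment described above is easy to carry out. The Python sketch below assumes the functional is the integral of $\dot{x}^2$ over $[0, 1]$ with fixed endpoint values (the particular values 0 and 3 are arbitrary), so that the discretized action is a sum of squared differences divided by the step size. Setting each derivative to zero gives $x_j = (x_{j-1} + x_{j+1})/2$, and the code simply applies this update repeatedly (a Gauss-Seidel relaxation, chosen here for brevity rather than efficiency).

```python
def discretized_action(x, dt):
    """Sum over j of (x[j+1] - x[j])**2 / dt for a piecewise-linear path."""
    return sum((x[j + 1] - x[j]) ** 2 for j in range(len(x) - 1)) / dt


def minimize_path(start, end, n, sweeps=5000):
    """Minimize the discretized action over the interior values x[1..n-1],
    keeping x[0] = start and x[n] = end fixed.

    dS/dx_j = 0 is equivalent to x_j = (x[j-1] + x[j+1]) / 2, so we sweep
    through the interior points applying that update until it settles.
    """
    x = [start] + [0.0] * (n - 1) + [end]
    for _ in range(sweeps):
        for j in range(1, n):
            x[j] = 0.5 * (x[j - 1] + x[j + 1])
    return x


if __name__ == "__main__":
    n = 20
    path = minimize_path(0.0, 3.0, n)
    print([round(v, 6) for v in path])          # equally spaced values
    print(discretized_action(path, 1.0 / n))    # close to 3**2 = 9
```

The resulting values are equally spaced between the two endpoints, i.e. they lie on a straight line, whatever number of steps is used.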

Variational calculation

The above calculation has the advantage of being more intuitive and visual: it makes clear that minimization of a functional $S[x]$ with respect to a function $x(t)$ is quite similar to the minimization of a function $S(x_1, \dots, x_{N-1})$ with respect to a large number of variables $x_j$ in the limit of infinitely many such variables. However, the formalism of variational calculus provides a much more efficient computational procedure. Here is how one calculates the function $x(t)$ that minimizes $S[x]$.

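Let us consider a very small change $\epsilon(t)$ in the function $x(t)$ and see how the functional $S[x]$ changes:

$$\delta S[x; \epsilon] = S[x + \epsilon] - S[x].$$

(In many textbooks, the change in $x(t)$ is denoted by $\delta x(t)$, and generally the change of any quantity $Q$ is denoted by $\delta Q$. We chose to write $\epsilon(t)$ instead of $\delta x(t)$ for clarity.)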

The functional $\delta S[x; \epsilon]$ is called the variation of the functional $S[x]$ with respect to the change $\epsilon(t)$ in the function $x(t)$. The variation is itself a functional depending on two functions, $x(t)$ and $\epsilon(t)$. When $\epsilon(t)$ is very small, we expect that the variation will be linear in $\epsilon(t)$, just like the variation in the value of a normal function is linear in the amount of change in the argument, e.g. $f(x + \epsilon) - f(x) \approx f'(x)\,\epsilon$ for small $\epsilon$. So we expect that the variation $\delta S[x; \epsilon]$ of the functional $S[x]$ will be a linear functional of $\epsilon(t)$. To understand what a linear functional looks like, consider a linear function $f(\epsilon_1, \dots, \epsilon_n)$ depending on several variables $\epsilon_j$, $j = 1, \dots, n$. This function can always be written as

$$f(\epsilon_1, \dots, \epsilon_n) = \sum_{j=1}^{n} A_j \epsilon_j,$$

where $A_j$ are suitable constants. Since a functional is like a function of infinitely many variables, the index $j$ becomes a continuous variable $t$, the variables $\epsilon_j$ and the constants $A_j$ become functions $\epsilon(t)$, $A(t)$, while the sum over $j$ becomes an integral over $t$. Thus, a linear functional of $\epsilon(t)$ can be written as an integral,

$$\delta S[x; \epsilon] = \int A(t)\,\epsilon(t)\,dt,$$

where $A(t)$ is a suitable function. In the case of the usual function $f(x)$, the "suitable constant $A$" is the derivative $A = df/dx$. By analogy we call $A(t)$ above the variational derivative of the functional and denote it by $\delta S / \delta x(t)$.

A function has a minimum (or maximum, or extremum) at a point where its derivative vanishes. So a functional $S[x]$ has a minimum (or maximum, or extremum) at the function $x(t)$ where the functional derivative vanishes. We shall justify this statement below, and for now let us compute the functional derivative of the functional $S[x] = \int_0^1 \dot{x}^2\,dt$.

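Substituting $x(t) + \epsilon(t)$ instead of $x(t)$ into the functional, we get

$$\delta S[x; \epsilon] = \int_0^1 \left[\left(\dot{x} + \dot{\epsilon}\right)^2 - \dot{x}^2\right] dt = 2 \int_0^1 \dot{x}(t)\,\dot{\epsilon}(t)\,dt,$$

where we are going to neglect terms quadratic in $\epsilon$ and so we didn't write them out. We now need to rewrite this integral so that no derivatives of $\epsilon$ appear there; so we integrate by parts and find

$$\delta S[x; \epsilon] = 2\,\dot{x}(t)\,\epsilon(t)\Big|_0^1 - 2 \int_0^1 \ddot{x}(t)\,\epsilon(t)\,dt.$$

Since in our case the values $x(0), x(1)$ are fixed, the function $\epsilon(t)$ must be such that $\epsilon(0) = \epsilon(1) = 0$, so the boundary terms vanish. The variational derivative is therefore

$$\frac{\delta S}{\delta x(t)} = -2\,\ddot{x}(t).$$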

The functional $S[x]$ has an extremum when its variation under an arbitrary change $\epsilon(t)$ is second-order in $\epsilon$. However, above we have obtained the variation as a first-order quantity, linear in $\epsilon$; so this first-order quantity must vanish for $x(t)$ where the functional has an extremum. An integral such as $\int A(t)\,\epsilon(t)\,dt$ can vanish for arbitrary $\epsilon(t)$ only if the function $A(t)$ vanishes for all $t$. In our case, the "function $A(t)$," i.e. the variational derivative $\delta S / \delta x(t)$, is equal to $-2\ddot{x}(t)$. Therefore the function $x(t)$ on which the functional $S[x]$ has an extremum must satisfy $-2\ddot{x} = 0$ or more simply $\ddot{x} = 0$. This differential equation has the general solution $x(t) = a + bt$, and with the additional restrictions $x(0) = 0$, $x(1) = A$ we immediately get the solution $x(t) = At$.
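The claim that the variation is second-order at an extremum and first-order elsewhere can be checked numerically. The Python sketch below assumes the functional is the integral of $\dot{x}^2$ over $[0, 1]$ and uses a hypothetical bump $\epsilon(t) = h \sin(\pi t)$, which vanishes at both endpoints; the two test functions are arbitrary choices with $\ddot{x} = 0$ and $\ddot{x} = 4$ respectively.

```python
import math


def S(x, n=2000):
    """Integral over [0, 1] of xdot(t)**2, using finite-difference velocities."""
    dt = 1.0 / n
    return sum(((x((j + 1) * dt) - x(j * dt)) / dt) ** 2 * dt for j in range(n))


def bump(t):
    """Shape of the change; zero at t = 0 and t = 1."""
    return math.sin(math.pi * t)


def variation(x, h):
    """S[x + h * bump] - S[x]."""
    return S(lambda t: x(t) + h * bump(t)) - S(x)


def line(t):
    return 2.0 * t        # xddot = 0: satisfies the extremum condition


def parabola(t):
    return 2.0 * t * t    # xddot = 4: does not


if __name__ == "__main__":
    for h in (1e-2, 1e-3):
        print(h, variation(line, h), variation(parabola, h))
```

For the straight line the variation shrinks like $h^2$, while for the parabola it is proportional to $h$ with a coefficient close to $\int_0^1 (-2\ddot{x})\sin(\pi t)\,dt = -16/\pi$, as the variational derivative $-2\ddot{x}$ predicts.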

General formulation

To summarize: the requirement that the functional $S[x]$ must have an extremum at the function $x(t)$ leads to a differential equation on the unknown function $x(t)$. This differential equation is found as

$$\frac{\delta S}{\delta x(t)} = 0.$$

The procedure is quite similar to finding an extremum of a function $f(x)$, where the point $x_0$ of the extremum is found from the equation $f'(x_0) = 0$.

Suppose that we are now asked to minimize the functional   subject to the restrictions  ; in mechanics we shall mostly be dealing with functionals of this kind. We might try to discretize the function  , as we did above, but this is difficult. Moreover, for a different functional   everything will have to be computed anew. Rather than go through the above procedure again and again, let us now derive the formula for the functional derivative for all functionals of this form, namely


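$$S[q_i] = \int_{t_1}^{t_2} L(q_i, \dot{q}_i)\,dt,$$

where $L(q_i, \dot{q}_i)$ is a given function of the coordinates $q_i$ and velocities $\dot{q}_i$ (assuming that there are $n$ coordinates, so $i = 1, \dots, n$). This function $L$ is called the Lagrange function or simply the Lagrangian.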

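We introduce the infinitesimal changes $\epsilon_i(t)$ into the functions $q_i(t)$ and express the variation of the functional first through $\epsilon_i(t)$ and $\dot{\epsilon}_i(t)$,

$$\delta S = \int_{t_1}^{t_2} \sum_{i} \left[\frac{\partial L}{\partial q_i}\,\epsilon_i(t) + \frac{\partial L}{\partial \dot{q}_i}\,\dot{\epsilon}_i(t)\right] dt.$$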

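Then we integrate by parts, discard the boundary terms and obtain

$$\delta S = \int_{t_1}^{t_2} \sum_{i} \left[\frac{\partial L}{\partial q_i} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i}\right] \epsilon_i(t)\,dt.$$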

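Thus the variational derivatives can be written as

$$\frac{\delta S}{\delta q_i(t)} = \frac{\partial L}{\partial q_i} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i}.$$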

Euler-Lagrange equations

Consider again the condition for a functional to have an extremum at $q_i(t)$: the first-order variation must vanish. We have derived the above formula for the variation $\delta S$. Since all $\epsilon_i(t)$ are completely arbitrary (subject only to the boundary conditions $\epsilon_i(t_1) = \epsilon_i(t_2) = 0$), the first-order variation vanishes only if the functions in square brackets all vanish at all $t$. Therefore we obtain the Euler-Lagrange equations

$$\frac{\partial L}{\partial q_i} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} = 0.$$

These are the differential equations that express the mathematical requirement that the functional $S[q_i]$ has an extremum at the set of functions $q_i(t)$. There are as many equations as unknown functions $q_i(t)$, one equation for each $i$.
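To see the Euler-Lagrange expression at work on a concrete trajectory, one can evaluate $\partial L/\partial q - \frac{d}{dt}\,\partial L/\partial \dot{q}$ numerically. The Python sketch below does this with central differences for a single coordinate. The harmonic-oscillator Lagrangian $\frac{m}{2}\dot{q}^2 - \frac{k}{2}q^2$ and both trial trajectories are assumptions chosen for illustration, and the helper name `euler_lagrange_residual` is made up.

```python
import math


def euler_lagrange_residual(L, q, t, h=1e-3):
    """Return dL/dq - d/dt(dL/dqdot) at time t along the trajectory q(t),
    with every derivative approximated by central differences of step h."""
    def qdot(s):
        return (q(s + h) - q(s - h)) / (2 * h)

    def dL_dq(s):
        return (L(q(s) + h, qdot(s)) - L(q(s) - h, qdot(s))) / (2 * h)

    def dL_dqdot(s):
        return (L(q(s), qdot(s) + h) - L(q(s), qdot(s) - h)) / (2 * h)

    # Total time derivative: dL/dqdot is first evaluated along the
    # trajectory, and only then differentiated with respect to time.
    total_ddt = (dL_dqdot(t + h) - dL_dqdot(t - h)) / (2 * h)
    return dL_dq(t) - total_ddt


def oscillator(q, qdot, m=1.0, k=4.0):
    """Assumed example Lagrangian: kinetic minus elastic potential energy."""
    return 0.5 * m * qdot ** 2 - 0.5 * k * q ** 2


def true_motion(t):
    return math.cos(2.0 * t)   # solves m*qddot = -k*q for m = 1, k = 4


def wrong_motion(t):
    return t * t               # not a solution


if __name__ == "__main__":
    print(euler_lagrange_residual(oscillator, true_motion, 0.7))
    print(euler_lagrange_residual(oscillator, wrong_motion, 1.0))
```

The residual is close to zero along the actual oscillation and clearly nonzero along the trial path that does not obey the equation of motion.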

Note that the Euler-Lagrange equations involve partial derivatives of the Lagrangian with respect to coordinates and velocities. The derivatives with respect to velocities $\dot{q}_i$ are sometimes written as $\partial L / \partial \dot{q}_i$ which might at first sight appear confusing. However, all that is meant by this notation is the derivative of the function $L(q_i, \dot{q}_i)$ with respect to its second argument.

The Euler-Lagrange equations also involve the derivative $d/dt$ with respect to the time. This is not a partial derivative with respect to $t$ but a total derivative. In other words, to compute $\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i}$, we need to substitute the functions $q_i(t)$ and $\dot{q}_i(t)$ into the expression $\partial L / \partial \dot{q}_i$, thus obtain a function of time only, and then take the derivative of this function with respect to time.

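Remark: If the Lagrangian contains higher derivatives (e.g. the second derivative), the Euler-Lagrange formula is different. For example, if the Lagrangian is $L(q, \dot{q}, \ddot{q})$, then the Euler-Lagrange equation is

$$\frac{\partial L}{\partial q} - \frac{d}{dt}\frac{\partial L}{\partial \dot{q}} + \frac{d^2}{dt^2}\frac{\partial L}{\partial \ddot{q}} = 0.$$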

Note that this equation may be up to fourth-order in time derivatives! Usually, one does not encounter such Lagrangians in studies of classical mechanics because ordinary systems are described by Lagrangians containing only first-order derivatives.

Summary: In mechanics, one specifies a system by writing a Lagrangian and pointing out the unknown functions in it. From that, one derives the equations of motion using the Euler-Lagrange formula. You need to know that formula really well and to understand how to apply it. This comes only with practice.

How to choose the Lagrangian

The basic rule is that the Lagrangian is equal to the kinetic energy minus the potential energy. (Both should be measured in an inertial system of reference! In a non-inertial system, this rule may fail.)

It can be shown that this rule works for an arbitrary mechanical system made up of point masses, springs, ropes, frictionless rails, etc., regardless of how one introduces the generalized coordinates. We shall not study the proof of this statement, but instead go directly to the examples of Lagrangians for various systems.

Examples of Lagrangians

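  • The Lagrangian for a free point mass moving along a straight line with coordinate $x$: $L = \frac{m}{2}\dot{x}^2$.
  • A point mass moving along a straight line with coordinate $x$, in a force field with potential energy $V(x)$: $L = \frac{m}{2}\dot{x}^2 - V(x)$.
  • A point mass moving in three-dimensional space with coordinates $x, y, z$, in a force field with potential energy $V(x, y, z)$: $L = \frac{m}{2}\left(\dot{x}^2 + \dot{y}^2 + \dot{z}^2\right) - V(x, y, z)$.
  • A point mass constrained to move along the circle $x^2 + z^2 = R^2$ in the gravitational field near the Earth (the $z$ axis is vertical). It is convenient to introduce the angle $\theta$ as the coordinate, with $x = R\cos\theta$, $z = R\sin\theta$. Then the potential energy is $V = mgz = mgR\sin\theta$, while the kinetic energy is $\frac{m}{2}R^2\dot{\theta}^2$. So the Lagrangian is $L = \frac{m}{2}R^2\dot{\theta}^2 - mgR\sin\theta$.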

Note that we have written the Lagrangian (and therefore we can derive the equations of motion) without knowing the force needed to keep the mass moving along the circle. This shows the great conceptual advantage of the Lagrangian approach; in the traditional Newtonian approach, the first step would be to determine this force, which is initially unknown, from a system of equations involving an unknown acceleration of the point mass.

  • Two (equal) point masses connected by a spring with length  :
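  • A mathematical pendulum, i.e. a massless rigid stick of length $\ell$ with a point mass attached at the end, that can move only in the $x$-$z$ plane in the gravitational field near the Earth (vertical $z$ axis). As the coordinate, we choose the angle $\theta$ between the stick and the downward direction of the $z$ axis, so that $\theta = 0$ when the mass hangs straight down. The Lagrangian is $L = \frac{m}{2}\ell^2\dot{\theta}^2 + mg\ell\cos\theta$.
  • A point mass $m$ sliding without friction along an inclined plane that makes an angle $\alpha$ with the horizontal, in the gravitational field of the Earth. As the coordinate, we choose $s$, where $s$ is measured parallel to the incline. The height $z$ is then $z = s\sin\alpha$, so the potential energy is $V = mgs\sin\alpha$. The kinetic energy is computed as

$$\frac{m}{2}\left(\dot{x}^2 + \dot{z}^2\right) = \frac{m}{2}\left(\dot{s}^2\cos^2\alpha + \dot{s}^2\sin^2\alpha\right) = \frac{m}{2}\dot{s}^2.$$

Hence, the Lagrangian is

$$L = \frac{m}{2}\dot{s}^2 - mgs\sin\alpha.$$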

Further work

Exercise: You should now determine the Euler-Lagrange equations that follow from each of the above Lagrangians and verify that these equations are the same as would be obtained from school-level Newtonian considerations for the respective physical systems. This should occupy you for at most an hour or two. Only then will you begin to appreciate the power of the Lagrangian approach.
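One way to check your hand-derived answers is to let a computer solve the Euler-Lagrange equation for the acceleration. For a single coordinate, expanding the total time derivative gives $\ddot{q} = \left(L_q - L_{\dot{q}q}\,\dot{q}\right) / L_{\dot{q}\dot{q}}$, where the subscripts denote partial derivatives. The Python sketch below estimates these partial derivatives by finite differences. The two Lagrangians are assumed forms for a pendulum (angle measured from the downward vertical) and for a mass on an incline (coordinate increasing uphill), and all names and parameter values are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def acceleration(L, q, v, h=1e-3):
    """Solve the Euler-Lagrange equation for qddot at the state (q, v).

    d/dt(dL/dv) = L_vq * v + L_vv * a, so a = (L_q - L_vq * v) / L_vv.
    All partial derivatives are central finite differences of step h.
    """
    L_q = (L(q + h, v) - L(q - h, v)) / (2 * h)
    L_vv = (L(q, v + h) - 2 * L(q, v) + L(q, v - h)) / h ** 2
    L_vq = (L(q + h, v + h) - L(q + h, v - h)
            - L(q - h, v + h) + L(q - h, v - h)) / (4 * h ** 2)
    return (L_q - L_vq * v) / L_vv


def pendulum(theta, omega, m=1.0, ell=2.0):
    """Assumed pendulum Lagrangian, theta measured from the hanging position."""
    return 0.5 * m * ell ** 2 * omega ** 2 + m * G * ell * math.cos(theta)


def incline(s, v, m=1.0, alpha=math.radians(30)):
    """Assumed incline Lagrangian, s measured uphill along the slope."""
    return 0.5 * m * v ** 2 - m * G * s * math.sin(alpha)


if __name__ == "__main__":
    # Compare with the Newtonian results -(g/ell)*sin(theta) and -g*sin(alpha).
    print(acceleration(pendulum, 0.5, 1.3), -(G / 2.0) * math.sin(0.5))
    print(acceleration(incline, 0.7, -0.4), -G * math.sin(math.radians(30)))
```

Agreement between the two printed numbers in each line suggests that the Lagrangian and the Newtonian descriptions give the same equation of motion; a mismatch usually points to a sign error in the potential energy.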

For more examples of setting up Lagrangians for mechanical systems and for deriving the Euler-Lagrange equations, ask your physics teacher or look up in any theoretical mechanics problem book. Much of the time, the Euler-Lagrange equations for some complicated system (say, a pendulum attached to the endpoint of another pendulum) would be too difficult to solve, but the point is to gain experience deriving them. Their derivation would be much less straightforward in the old Newtonian approach using forces.

If this is your first time looking at Lagrangians, you might be still asking yourself: how could the motion of a system be described by saying that some integral has the minimal value? Is it a purely formal mathematical trick, and if not, how can one get a more visually intuitive understanding? A partial answer is here.