Classical Mechanics/Constrained



What are constrained systems?

In many cases in mechanics, the motion of bodies is constrained in some way: for example, a massive bead may be constrained to move along a bent wire of certain shape; a massive cylinder may be rolling along a surface (but not sliding or flying around); or two masses may be connected by a rigid stick of fixed length.

In each of these cases, there are forces acting on the constrained bodies. In the above examples, the wire produces a force on the bead, the plane acts by the force of friction on the cylinder, and the stick pulls or pushes on the two masses. These forces may vary in time and we do not know the magnitude of these forces in advance. We know, however, that these forces are at every time exactly such as to guarantee that the constraints hold. The bead would fly away if there were no forces acting on it, but the wire provides a force that keeps the bead in place. The two masses connected by a rigid stick experience a force from the stick that is exactly necessary to keep them at a constant distance from each other. (This is what it means that the stick is "rigid".)

In the Newtonian approach to mechanics, these systems are treated by introducing variables representing the unknown forces and solving the resulting system of equations for both the unknown forces and the accelerations. This procedure might be complicated; moreover, we are not always interested in the magnitudes of these unknown forces.

In the Lagrangian approach, there are two straightforward ways to treat constrained systems:

  • The method of solving the constraints. In this method, we introduce the generalized coordinates in such a way that the constraints are automatically satisfied. For example, suppose a point mass is constrained to move along a circle of radius $R$. We might describe this situation by saying that the Cartesian coordinates $(x, y)$ are constrained to satisfy the equation $x^2 + y^2 = R^2$. Now we can introduce the angle $\phi$ as the "generalized coordinate" and express the Cartesian coordinates of the point mass as $x = R\cos\phi$, $y = R\sin\phi$. These coordinates solve the constraint for all $\phi$. The power of the Lagrangian approach is that any generalized coordinates are good enough; so we can now directly write the Lagrangian in terms of the function $\phi(t)$ and forget about the fact that the system is constrained. We shall automatically obtain the correct equations of motion.
  • The method of Lagrange multipliers. In this method, we do not try to introduce new generalized coordinates that solve the constraints. (This may be difficult; not all algebraic equations can be solved!) Instead, we formulate the variational problem in the presence of constraints: the correct trajectory $q(t)$ is such that the action functional has an extremum while the constraint equations are satisfied. For example, if a point mass is constrained to move along a circle of radius $R$, but is otherwise unforced, then the Lagrangian is $L = \frac{m}{2}(\dot{x}^2 + \dot{y}^2)$ and the constraint is $x^2 + y^2 = R^2$. The correct trajectory $(x(t), y(t))$ will be such that the integral $\int L\,dt$ has the minimum value while the constraint holds at every $t$. Thus we need to solve the problem of conditional minimization.
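As a short sketch of the first method for the circle example (writing $R$ for the radius, $\phi$ for the angle, $m$ for the mass, and assuming no potential energy, as in the unforced case), substituting $x = R\cos\phi$, $y = R\sin\phi$ into the kinetic energy gives

$$\dot{x}^2 + \dot{y}^2 = R^2\dot{\phi}^2, \qquad L(\phi, \dot{\phi}) = \frac{m}{2} R^2 \dot{\phi}^2 .$$

The Euler-Lagrange equation for the single coordinate $\phi$ is then

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{\phi}} - \frac{\partial L}{\partial \phi} = m R^2 \ddot{\phi} = 0, \qquad \phi(t) = \phi_0 + \omega t,$$

where $\phi_0$ and $\omega$ are integration constants fixed by the initial conditions. The constraint never appears explicitly: it was used up when the coordinate $\phi$ was chosen.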

Conditional minimization and Lagrange multipliers

A conditional minimization problem can be solved by the method of Lagrange multipliers. One takes a different, specially modified Lagrangian that describes the fact that the system is constrained. The modified Lagrangian is equal to the normal Lagrangian plus special terms containing Lagrange multipliers. Let us now explain this method.

For simplicity, consider the minimization of a function $f(x, y)$ with respect to variables $x, y$, subject to the constraint that $g(x, y) = 0$.

First recall how the problem would be solved without the constraint: the minimum (or, more generally, an extremum) of $f$ would be the point $(x_0, y_0)$ where the partial derivatives of $f$ vanish:

$$\frac{\partial f}{\partial x} = 0, \qquad \frac{\partial f}{\partial y} = 0.$$

This would give a system of two equations that determines the two unknowns $x_0, y_0$.

With the constraint, the above system of equations will not give the correct answer because the solution $(x_0, y_0)$ most probably will not satisfy the constraint: $g(x_0, y_0) \neq 0$. Let us look at the problem geometrically. The constraint $g(x, y) = 0$ determines a curve or several curves in the $(x, y)$ plane; we are looking for the point on that curve where the function $f$ has an extremum. Let us imagine the level lines of the function $f$, i.e. the lines $f(x, y) = C$ for various values of the constant $C$. The constraint curve $g(x, y) = 0$ may go across the level lines of $f$; this means that the value of $f$ changes along the curve. It is clear from this geometric consideration that the extremum of $f$ along the constraint curve will be the point where the constraint curve is tangent to some level line of $f$. A condition for two curves to be tangent is that their normal vectors are parallel. The normal vector to the curve $g(x, y) = 0$ at a point $(x_0, y_0)$ has components $\left(\frac{\partial g}{\partial x}, \frac{\partial g}{\partial y}\right)$. The normal vector to the level line $f(x, y) = C$ has components $\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right)$. These two vectors are parallel if there exists a number $\lambda$ such that

$$\frac{\partial f}{\partial x} = \lambda \frac{\partial g}{\partial x}, \qquad \frac{\partial f}{\partial y} = \lambda \frac{\partial g}{\partial y}.$$

This is, together with the constraint $g(x, y) = 0$, a system of three equations that determines the three unknowns $x_0, y_0, \lambda$. In this way we can solve the conditional minimization problem.

Note that the equations are the same as for minimization of the function $h(x, y, \lambda) \equiv f(x, y) - \lambda g(x, y)$ with respect to the three variables $x, y, \lambda$ without any constraints. Therefore, the conditional minimization problem is equivalent to a normal minimization problem for the different function, $h(x, y, \lambda)$. This new function is built from the original function $f$ by subtracting the constraint function $g$ multiplied by an extra variable $\lambda$. This variable is called the Lagrange multiplier.
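The tangency argument can be checked numerically. The sketch below is only an illustration: the functions f(x, y) = x^2 + y^2 and g(x, y) = x + y - 1 are assumptions chosen for simplicity, not taken from the text. It finds the minimum of f along the constraint line by brute force and then verifies that the two normal vectors are parallel at that point.

```python
# Illustrative choice (an assumption, not from the text):
#   f(x, y) = x^2 + y^2,   constraint g(x, y) = x + y - 1 = 0.

def f(x, y):
    return x * x + y * y

def grad_f(x, y):
    return (2.0 * x, 2.0 * y)

def grad_g(x, y):
    return (1.0, 1.0)

def scan_constraint(steps=2001):
    """Walk along the constraint line y = 1 - x and keep the point with the smallest f."""
    best = None
    for i in range(steps):
        x = -1.0 + 3.0 * i / (steps - 1)  # x runs over the interval [-1, 2]
        y = 1.0 - x
        if best is None or f(x, y) < best[2]:
            best = (x, y, f(x, y))
    return best

x0, y0, f0 = scan_constraint()
fx, fy = grad_f(x0, y0)
gx, gy = grad_g(x0, y0)

cross = fx * gy - fy * gx  # vanishes when the two normal vectors are parallel
lam = fx / gx              # the multiplier in grad f = lam * grad g

print("constrained minimum near", (x0, y0))
print("cross product of normals:", cross)
print("Lagrange multiplier:", lam)
```

The brute-force scan knows nothing about Lagrange multipliers; that the cross product of the normals vanishes at the point it finds is the tangency condition showing up in the numbers.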

Example of using Lagrange multipliers

Here is a worked example. Suppose we need to maximize the function   under the constraint  .

First, we write the constraint in the form $g(x, y) = 0$. For instance, we may take  . (It does not matter how we choose the function $g$, as long as the constraint is equivalent to the equation $g(x, y) = 0$.) Then we make a new function

 

Then we need to find an extremum of this function with respect to the three variables $x, y, \lambda$. We obtain the system of equations:

 

It is easy to solve these equations:

 

These are the required values of $x$ and $y$. The value of the Lagrange multiplier $\lambda$ is useless for us now (but it will be useful when we apply this method to problems in mechanics!).
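The same steps can be traced on a second, self-contained instance. The function and constraint here are assumptions chosen purely for illustration: maximize $f(x, y) = xy$ under the constraint $x + y = 2$, with the multiplier entering with a minus sign. Taking $g(x, y) \equiv x + y - 2$, the new function is

$$h(x, y, \lambda) = xy - \lambda (x + y - 2),$$

and setting its three partial derivatives to zero gives

$$\frac{\partial h}{\partial x} = y - \lambda = 0, \qquad \frac{\partial h}{\partial y} = x - \lambda = 0, \qquad \frac{\partial h}{\partial \lambda} = -(x + y - 2) = 0 .$$

The first two equations give $x = y = \lambda$, and the third then gives $x = y = 1$, $\lambda = 1$, with $f = 1$. This agrees with the direct substitution $y = 2 - x$, for which $f = x(2 - x)$ has its maximum at $x = 1$.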

General case

The general form of the constrained optimization problem is the following. We need to find an extremum (or all extrema) of a given function $f(x)$, where $x \equiv (x_1, \ldots, x_n)$ is an array of variables satisfying $m$ different constraints $g_j(x) = 0$, $j = 1, \ldots, m$.

The geometric consideration that I showed you for the simple case (the example with functions $f(x, y)$ and $g(x, y)$ above) can be generalized to many dimensions and many constraints: one considers level surfaces of $f$ and surfaces given by the constraints. A constrained extremum will be at a point $x^{(0)}$ if the level surface of $f$ is tangent to the constraint surface at that point. The constraint surface is an intersection of $m$ different surfaces $g_j(x) = 0$, each having its own normal vector $\nabla g_j$. It can be shown using elementary vector algebra (I omit the proof) that the normal vector $\nabla f$ to the level surface of $f$ must be a linear combination of the $m$ normal vectors $\nabla g_j$. Therefore, the conditions for the constrained extremum to be located at a point $x^{(0)}$ are that (1) $x^{(0)}$ must satisfy all the constraints and (2) there should exist $m$ numbers $\lambda_j$ such that

$$\frac{\partial f}{\partial x_i} = \sum_{j=1}^{m} \lambda_j \frac{\partial g_j}{\partial x_i}, \qquad i = 1, \ldots, n.$$

It is easy to see that these conditions are equivalent to the conditions for an extremum of a new function

$$h(x, \lambda) \equiv f(x) - \sum_{j=1}^{m} \lambda_j g_j(x)$$

with respect to the $n + m$ variables $x_i, \lambda_j$, without any constraints.

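Let us then formulate the recipe to solve the problem of constrained optimization. We introduce an array $\lambda$ of $m$ different Lagrange multipliers $\lambda_j$ and build a new function

$$h(x, \lambda) \equiv f(x) - \sum_{j=1}^{m} \lambda_j g_j(x).$$

We then find an extremum of this function with respect to the total set of $n + m$ variables $x_i, \lambda_j$. In order to do that, we need to solve a system of $n + m$ equations:

$$\frac{\partial h}{\partial x_i} = 0 \quad (i = 1, \ldots, n), \qquad \frac{\partial h}{\partial \lambda_j} = 0 \quad (j = 1, \ldots, m).$$

By solving these equations, we will obtain a set of values $x_i$ which we are interested in. The values of the auxiliary variables $\lambda_j$ can be discarded.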
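When the function is quadratic and the constraints are linear, the recipe reduces to a single linear system. The following minimal sketch assumes a hypothetical instance (three variables, two constraints, multipliers entering with a minus sign); the helper name `solve_linear` and the specific numbers are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination in exact rational arithmetic.

    Assumes the matrix a is square and nonsingular.
    """
    n = len(b)
    # Augmented matrix [a | b] with exact Fraction entries
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick a row at or below the diagonal with a nonzero entry in this column
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        # Eliminate this column from every other row
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]

# Hypothetical problem: extremize f = x1^2 + x2^2 + x3^2 subject to
#   g1 = x1 + x2 + x3 - 1 = 0   and   g2 = x1 - x2 - 1 = 0.
# With h = f - lam1*g1 - lam2*g2, the n + m = 5 equations are linear in
# the unknowns (x1, x2, x3, lam1, lam2):
a = [
    [2, 0, 0, -1, -1],  # dh/dx1 = 2*x1 - lam1 - lam2 = 0
    [0, 2, 0, -1,  1],  # dh/dx2 = 2*x2 - lam1 + lam2 = 0
    [0, 0, 2, -1,  0],  # dh/dx3 = 2*x3 - lam1        = 0
    [1, 1, 1,  0,  0],  # g1 = 0  (equivalent to dh/dlam1 = 0)
    [1, -1, 0, 0,  0],  # g2 = 0  (equivalent to dh/dlam2 = 0)
]
b = [0, 0, 0, 1, 1]

solution = solve_linear(a, b)
x_values, multipliers = solution[:3], solution[3:]
print("x =", x_values)
print("lambda =", multipliers)
```

For nonlinear functions or constraints the same n + m equations still hold, but they are no longer linear and generally need a numerical root finder.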

Motion constrained to a surface

Let us now consider the Constrained Mechanical Problem Number One: A point mass is moving in a potential $V(\mathbf{x})$ and, additionally, is constrained to move along a surface given by an equation $g(\mathbf{x}) = 0$. (This can be physically realized by, e.g., a point mass sliding without friction on top of a curved surface.)

According to the Lagrangian approach, we must find an extremum of the action, $S[\mathbf{x}(t)] = \int \left[ \frac{m}{2}\dot{\mathbf{x}}^2 - V(\mathbf{x}) \right] dt$, under the condition that $g(\mathbf{x}(t)) = 0$ for all times $t$. We can apply the method of Lagrange multipliers. Note that we have, in effect, infinitely many constraints: one constraint for each moment of time $t$. Therefore, we need to introduce a set of infinitely many Lagrange multipliers, one Lagrange multiplier for each $t$. It is convenient to arrange this set of Lagrange multipliers into a function $\lambda(t)$.

According to the method of Lagrange multipliers, we need to build a "modified action" which is equal to the old action minus the sum of all the constraints multiplied by their respective Lagrange multipliers. Since there is one constraint for each moment of time, the sum becomes an integral over $t$. Therefore, the new action is

$$S'[\mathbf{x}(t), \lambda(t)] = \int \left[ \frac{m}{2}\dot{\mathbf{x}}^2 - V(\mathbf{x}) - \lambda(t)\, g(\mathbf{x}) \right] dt.$$

Solving the constrained optimization problem is equivalent to finding an extremum of the functional $S'$ with respect to arbitrary $\mathbf{x}(t)$ and $\lambda(t)$.

It should now be clear how to approach the "Constrained Problem Number One" in principle. What remains is some technical work:

  • Deriving the Euler-Lagrange equations from the modified action $S'$ and solving them. This will be a system of equations for the unknown functions $\mathbf{x}(t)$ and $\lambda(t)$.
  • Interpreting the function $\lambda(t)$. It will turn out that $\lambda(t)$ is related to the force that is needed to keep the point mass moving only along the surface $g(\mathbf{x}) = 0$. So the Lagrange multipliers have a direct physical interpretation in this case. Namely, we shall show that the time-dependent force $\mathbf{F}(t)$ exerted by the surface is equal to $-\lambda(t) \nabla g$.

Example

A massive bead is set on a wire curved in the vertical plane (coordinates  ) as a plot of the function  , where   is a given constant. The only external force is the gravitational field of the Earth. We would like to determine the equation of motion for the position of the bead.

Choose the constraint function as  . Then the modified Lagrangian is

 

The Euler-Lagrange equations are derived in the standard way:

  • Variation w.r.t.   gives:  
  • Variation w.r.t.   gives:  
  • Variation w.r.t.   gives:  

It is not easy to solve these equations by hand, but deriving them is straightforward and requires "no thinking", as physicists say. (That is, we simply follow general rules, and we do not need to make any special decisions or find special tricks for each particular situation.)
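The "no thinking" character of the derivation can be seen in a generic version of the bead problem. This is a sketch under stated assumptions: the wire shape is written as $z = w(x)$, where $w$ is a hypothetical placeholder for whatever profile is given; $g_0$ denotes the gravitational acceleration (to avoid a clash with the constraint function $g$); and the multiplier enters with a minus sign. Taking the constraint function $g(x, z) \equiv z - w(x)$, the modified Lagrangian is

$$L' = \frac{m}{2}\left(\dot{x}^2 + \dot{z}^2\right) - m g_0 z - \lambda \left(z - w(x)\right),$$

and the three variations give

$$m\ddot{x} = \lambda\, w'(x), \qquad m\ddot{z} = -m g_0 - \lambda, \qquad z = w(x).$$

Eliminating $\lambda$ and $z$ from these three equations would leave a single second-order equation for $x(t)$, which is in general nonlinear.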

Exercise: There is an arbitrary choice in selecting the constraint function  . For example, we could have selected   or even  , and the constraint line   remains the same. Show that the equations of motion also remain the same, up to a change in the definition of  .

Lagrange multipliers and constraining forces

We see from the equation for   in the above example that   looks like the component of the normal force in the   direction. So it is clear that the Lagrange multiplier   is somehow related to the unknown constraining force. We shall now derive this relationship in a more general case.

Consider again the Constrained Problem Number One. We solved it by using the modified Lagrangian $L' = L - \lambda(t)\, g(\mathbf{x})$. The term $\lambda(t)\, g(\mathbf{x})$ looks like an extra piece of potential energy (although it depends on time through the factor $\lambda(t)$). So this is a somewhat weird kind of potential energy, but let us examine it in more detail. As long as the point mass remains within the constraint surface, we have $g(\mathbf{x}) = 0$ and this "extra potential energy" is equal to zero. But if the point mass could move a little bit off the constraint surface, say in a direction given by a small vector $\delta\mathbf{x}$, then this "extra potential energy" would change by

$$\delta U = \lambda(t)\, \nabla g \cdot \delta\mathbf{x}.$$

This looks like work done against a force. The force is directed orthogonally to the constraint surface and is equal to $\mathbf{F} = -\lambda(t) \nabla g$. But we expect precisely this kind of force to act on the point mass by the constraining device.

Let us verify more formally that $\mathbf{F} = -\lambda(t) \nabla g$ is in fact the constraining force we were looking for. The Euler-Lagrange equations that follow from the Lagrangian $L'$ are of the form

"mass $\times$ acceleration" $= -\nabla V - \lambda \nabla g$

The term $-\nabla V$ describes the usual "free" forces due to potential energy in the original Lagrangian $L$. Now it is clear that the term $-\lambda \nabla g$ describes the additional forces due to constraints.
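As a numerical illustration, consider an unforced point mass moving uniformly around a circle. The sketch below assumes the constraint function g(x, y) = x^2 + y^2 - R^2, the sign convention in which the constraining force is minus the multiplier times the gradient of g, and arbitrary sample values for the mass, radius and angular velocity. It recovers the multiplier from the known motion and compares the resulting force with the familiar centripetal force.

```python
import math

# Assumed sample values (not from the text): mass, radius, angular velocity
m, R, omega = 2.0, 0.5, 3.0

def position(t):
    """Uniform circular motion of radius R."""
    return (R * math.cos(omega * t), R * math.sin(omega * t))

def acceleration(t):
    """Second time derivative of position(t)."""
    x, y = position(t)
    return (-omega**2 * x, -omega**2 * y)

def grad_g(x, y):
    """Gradient of the assumed constraint function g = x^2 + y^2 - R^2."""
    return (2.0 * x, 2.0 * y)

def multiplier(t):
    """Solve m * a = -lam * grad g for lam by projecting onto grad g (V = 0 here)."""
    x, y = position(t)
    ax, ay = acceleration(t)
    gx, gy = grad_g(x, y)
    return -m * (ax * gx + ay * gy) / (gx * gx + gy * gy)

def constraint_force(t):
    """Constraining force F = -lam * grad g at time t."""
    x, y = position(t)
    gx, gy = grad_g(x, y)
    lam = multiplier(t)
    return (-lam * gx, -lam * gy)

t = 0.3
lam = multiplier(t)
fx, fy = constraint_force(t)
force_magnitude = math.hypot(fx, fy)

print("lambda:", lam, "expected m*omega^2/2:", m * omega**2 / 2)
print("|F|:", force_magnitude, "expected m*omega^2*R:", m * omega**2 * R)
```

With this choice of g the multiplier works out to m ω²/2, which is not itself a force; only the product with the gradient of g has a physical meaning.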

Exercise: There is, of course, an arbitrary choice in defining the constraint function  . For instance, we may choose the constraint function as   or   or some other function   instead of  . Show that the constraining force does not depend on the choice of the function   (because   would change appropriately for every different choice of  !).

Exercise: Figure out how to compute the constraining force $\mathbf{F}$ if there are several independent constraints $g_1(\mathbf{x}) = 0$, ..., $g_m(\mathbf{x}) = 0$.

Constraints involving velocities

So far we considered only constraints that are expressed by functions of coordinates, such as $g(\mathbf{x}) = 0$. This form of the constraints covers a wide range of applications. However, there exist important cases where physical constraints cannot be expressed in this way. For example, the motion of a massive ball that is rolling on a surface without sliding, or the motion of a skater who is sliding on ice, can be described only using complicated constraints that involve velocities and coordinates at the same time. Such constraints, which are not equivalent to a simple function of coordinates, are called nonintegrable or nonholonomic constraints, whereas the constraints of the type we considered are called integrable or holonomic.

One would think that nonholonomic constraints could be simply added to the Lagrangian with Lagrange multipliers. It turns out, however, that the result is not the correct equations of motion for the problem! The main reason is that velocities $\dot{\mathbf{x}}$ are not varied independently from the coordinates $\mathbf{x}$, so the standard procedure involving the Lagrange multipliers is not the correct way to implement nonholonomic constraints. A special theory (based on the so-called Appell equation) was developed to derive the equations of motion for systems with nonholonomic constraints. However, this theory is beyond the scope of the minimal standard course.