
The second differential versus the differential of a differential form

 2 years ago
source link: https://math.stackexchange.com/questions/2843918/the-second-differential-versus-the-differential-of-a-differential-form/3561534#3561534

3 Answers

I will answer both questions. As @md2perpe said in the other answer, one use of the second-order differential is in quadratic approximation. More generally, all of the higher-order differentials together make up a Taylor series, which (for analytic functions, at least locally) is not just an approximation but exact. Yet another use of differentials is to take care of the Chain Rule when performing a change of variables, although you have to be careful here, because the second differential that you have doesn't do that (because it's missing a term). So that's two purposes. A third purpose, in theory, is local optimization, although I don't think that it's used in practice.

Before I can explain these applications, I'll need the total second differential of $y$ when $y = f(x)$. The first differential is $dy = f'(x)\,dx$, which depends on both $x$ and $dx$, so its differential has two terms, which we can find with the help of the Product Rule:
$$d^2y = d(dy) = d\big(f'(x)\,dx\big) = d\big(f'(x)\big)\,dx + f'(x)\,d(dx) = \big(f''(x)\,dx\big)\,dx + f'(x)\,d^2x = f''(x)\,dx^2 + f'(x)\,d^2x.$$
This is the correct rule if you want to do change of variables by substitution; that is, if $x = g(t)$, so that $y = (f \circ g)(t)$, then using $dx = g'(t)\,dt$ and $d^2x = g''(t)\,dt^2 + g'(t)\,d^2t$, we get
$$dy = f'(x)\,dx = f'(g(t))\big(g'(t)\,dt\big) = f'(g(t))\,g'(t)\,dt,$$
so $(f \circ g)'(t) = f'(g(t))\,g'(t)$, which you know is correct; and also
$$d^2y = f''(x)\,dx^2 + f'(x)\,d^2x = f''(g(t))\big(g'(t)\,dt\big)^2 + f'(g(t))\big(g''(t)\,dt^2 + g'(t)\,d^2t\big) = \big(f''(g(t))\,g'(t)^2 + f'(g(t))\,g''(t)\big)\,dt^2 + f'(g(t))\,g'(t)\,d^2t,$$
so $(f \circ g)''(t) = f''(g(t))\,g'(t)^2 + f'(g(t))\,g''(t)$, which is less famous but also correct. This is probably the most common application.
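The less famous second-derivative formula is easy to check numerically. Here is a sketch (with arbitrarily chosen $f = \sin$ and $g(t) = t^3$, not anything from the answer itself) comparing $(f \circ g)''(t) = f''(g(t))\,g'(t)^2 + f'(g(t))\,g''(t)$ against a central second difference:

```python
import math

# Numerical check of the second-derivative chain rule
# (f∘g)''(t) = f''(g(t)) g'(t)^2 + f'(g(t)) g''(t),
# using the arbitrary smooth choices f = sin and g(t) = t^3.

f, f1, f2 = math.sin, math.cos, lambda x: -math.sin(x)
g = lambda t: t**3
g1 = lambda t: 3 * t**2
g2 = lambda t: 6 * t

def second_difference(h, t, step=1e-4):
    """Central second difference approximating h''(t)."""
    return (h(t + step) - 2 * h(t) + h(t - step)) / step**2

t = 0.7
formula = f2(g(t)) * g1(t)**2 + f1(g(t)) * g2(t)
numeric = second_difference(lambda t: f(g(t)), t)
print(abs(formula - numeric) < 1e-4)  # True: the two agree
```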

Now if you want to, you can partially evaluate the second differential $d^2y$ when $d^2x = 0$, getting a partial second differential showing only the dependence on $x$ and not on $dx$:
$$(\partial^2 y)_{dx} = d^2y\big|_{d^2x=0} = f''(x)\,dx^2.$$
Then if you divide by $dx$, you could call that the partial derivative of $dy$ with respect to $x$; but since $dy$ is itself a differential, we usually divide by $dx$ again to get the second derivative of $y$ with respect to $x$. Since $y$ depends only on $x$, this is really a total second derivative, which is why people usually write it as $d^2y/dx^2$, even though it's not literally the quotient of $d^2y$ and $dx^2$, in contrast to the first derivative. (You could fairly write it as $\partial^2 y/\partial x^2$, or even $(\partial^2 y/\partial x^2)_{dx}$ to indicate what is held fixed, but this is unlikely to catch on; or if you want to be both pedantic and understood, you can still write $(d/dx)^2 y$.) Partial though it is, this second differential does have its uses, as in the quadratic approximation to $f$ at $a$:
$$Q(x) = f(a) + f'(a)(x-a) + \tfrac12 f''(a)(x-a)^2 = \big(y + dy + \tfrac12 d^2y\big)\big|_{x=a,\ dx=x-a,\ d^2x=0}.$$
More generally, we have the Taylor series of $f$ at $a$:
$$T(x) = \sum_{n=0}^{\infty} \frac{1}{n!}\,f^{(n)}(a)(x-a)^n = \sum_{n=0}^{\infty} \frac{1}{n!}\,d^n y\Big|_{x=a,\ dx=x-a,\ d^nx=0 \text{ for } n \ge 2}.$$
And if $f$ is analytic at $a$, then $T(x)$ converges to $f(x)$, at least on some neighbourhood of $a$. Many authors (including George B. Thomas, apparently) will treat $dx$ as a constant, so that $d^nx = 0$ for $n \ge 2$ automatically, which makes the development of this application a bit simpler. However, this isn't appropriate for the other applications.
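As a quick sanity check on the quadratic approximation, here is a sketch with the arbitrary choice $f = \exp$ at $a = 0$ (so $Q(x) = 1 + x + x^2/2$); since the error is of order $(x-a)^3$, halving the step should divide it by roughly $8$:

```python
import math

# The quadratic approximation Q(x) = f(a) + f'(a)(x-a) + (1/2) f''(a)(x-a)^2,
# illustrated with f = exp at a = 0, so Q(x) = 1 + x + x^2/2.
# Halving x - a should shrink the error by about 2^3 = 8 (cubic-order error).

f = math.exp
a = 0.0
fa, f1a, f2a = f(a), f(a), f(a)  # exp is its own derivative

def Q(x):
    return fa + f1a * (x - a) + 0.5 * f2a * (x - a)**2

err1 = abs(f(0.2) - Q(0.2))
err2 = abs(f(0.1) - Q(0.1))
print(err1 / err2)  # roughly 8, confirming the cubic-order error
```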

Another potential application is local optimization; so assume that $f$ is twice-differentiable at $a$ and defined on a neighbourhood of $a$. Normally we say that $f$ has a (local) minimum at $a$ only if $f'(a) = 0$ and $f''(a) \ge 0$, and that $f$ has a minimum at $a$ if $f'(a) = 0$ and $f''(a) > 0$. In higher dimensions, the parts about $f''$ are generalized to saying that the Hessian matrix is positive (semi)definite, but the parts about $f'$ are also still there (referring to the gradient vector). But you can combine each of these into a single statement: $y$ has a minimum at $x = a$ only if $d^2y\,|_{x=a} \ge 0$ for all nonzero values (hence all values) of $dx$ and $d^2x$ (a kind of positive semidefiniteness); and $y$ has a minimum at $x = a$ if $d^2y\,|_{x=a} > 0$ for all nonzero values of $dx$ and $d^2x$ (a kind of positive definiteness). This works unchanged in higher dimensions (taking $x$ and $a$ from $\mathbb{R}^n$ instead of just $\mathbb{R}$), and it can even handle points on the boundary of the domain if you're careful. I'm not sure how useful this is, because you still have to pull out the gradient vector and the Hessian matrix to analyse it with the tools of linear algebra, so this is only a potential application rather than anything that I've seen people use; but it's a nice way to think about it in my opinion.
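To see concretely how the combined condition encodes both the gradient and the Hessian, here is a small illustrative sketch in $\mathbb{R}^2$ (the function, points, and vectors are arbitrary choices, not anything from the answer): at a non-critical point, an adversarial choice of $d^2x$ makes $d^2y$ negative.

```python
# A sketch of the combined test d^2y = dx·H·dx + ∇f·d^2x in two dimensions,
# for f(x, y) = x^2 + y^2 at its minimum (0, 0) and at a non-critical point.
# At a non-critical point, choosing d^2x against the gradient makes d^2y
# negative, so the combined condition automatically encodes ∇f = 0 as well.

def d2y(grad, hess, dx, d2x):
    quad = sum(dx[i] * hess[i][j] * dx[j] for i in range(2) for j in range(2))
    lin = sum(grad[i] * d2x[i] for i in range(2))
    return quad + lin

# f(x, y) = x^2 + y^2: gradient (2x, 2y), Hessian [[2, 0], [0, 2]].
hess = [[2.0, 0.0], [0.0, 2.0]]

at_min = d2y(grad=[0.0, 0.0], hess=hess, dx=[1.0, -1.0], d2x=[-5.0, 3.0])
elsewhere = d2y(grad=[2.0, 0.0], hess=hess, dx=[1.0, 0.0], d2x=[-10.0, 0.0])
print(at_min > 0)     # True: positive regardless of the acceleration d^2x
print(elsewhere < 0)  # True: a suitable d^2x exposes the nonzero gradient
```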


Now to show the connection to differential forms, I want to say something about what d2x, dx2, and so forth really mean. As you probably know, one way to think of an exterior differential form is as a multilinear alternating (or antisymmetric) operation on tangent vectors. While d2x and dx2 are not exterior differential forms, we can still think of them as generalized differential forms, giving operations on tangent vectors that are not necessarily multilinear, alternating, or antisymmetric.

So if you're working in $\mathbb{R}^2$ (where a tangent vector at a given point is essentially just another point in $\mathbb{R}^2$), with $x$ and $y$ as the standard coordinate functions, then the differential form $2x\,dx + 3y^2\,dy$ at a point $(x_0, y_0)$ takes a vector $(v_x, v_y)$ and returns $2x_0 v_x + 3y_0^2 v_y$. And the differential form $x^2\,dx \wedge dy$ at a point $(x_0, y_0)$ takes two vectors, $(v_x, v_y)$ and $(w_x, w_y)$, and returns $x_0^2 (v_x w_y - v_y w_x)$ (or half that, depending on your convention). Similarly, the generalized differential form $\sqrt{dx^2 + dy^2}$ at a point $(x_0, y_0)$ takes a vector $(v_x, v_y)$ and returns $\sqrt{v_x^2 + v_y^2}$. This is not linear, but it still makes sense. And you can even define what it means to integrate this form along a curve and prove that the value of the integral is the arclength of the curve. So there is no reason that you cannot perform arbitrary operations on differentials.
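Here is a sketch of that arclength claim: treating $\sqrt{dx^2 + dy^2}$ as an operation on tangent vectors and integrating it along the unit circle (via a Riemann sum, with an arbitrarily chosen parametrization) recovers the circumference $2\pi$:

```python
import math

# Integrating the generalized form sqrt(dx^2 + dy^2) along a curve:
# feed it the curve's velocity vector at each point and sum. For the
# unit circle traversed once, the result should be the arclength 2*pi.

def form(point, v):
    """The generalized form sqrt(dx^2 + dy^2): nonlinear in the tangent vector v."""
    vx, vy = v
    return math.sqrt(vx**2 + vy**2)

def integrate(curve, velocity, t0, t1, n=100000):
    """Midpoint Riemann sum of the form along the parametrized curve."""
    dt = (t1 - t0) / n
    total = 0.0
    for i in range(n):
        t = t0 + (i + 0.5) * dt
        total += form(curve(t), velocity(t)) * dt
    return total

curve = lambda t: (math.cos(t), math.sin(t))
velocity = lambda t: (-math.sin(t), math.cos(t))
length = integrate(curve, velocity, 0.0, 2 * math.pi)
print(abs(length - 2 * math.pi) < 1e-6)  # True: the integral is the arclength
```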

As for $d^2x$ and $d^2y$, these also simply return the $x$- or $y$-component of a vector; only the interpretation of this vector is different. That is, while $x$ and $y$ return the $x$- and $y$-coordinates of a point thought of as representing position, and $dx$ and $dy$ return the $x$- and $y$-components of a vector thought of as representing velocity, $d^2x$ and $d^2y$ return the $x$- and $y$-components of a vector thought of as representing acceleration, and so on. This is a little more subtle on a more general manifold, but if you work in local coordinates, then you don't really have to pay attention to the subtleties as long as your higher differentials respect the Chain Rule. So if $y = f(x)$, then the second differential $d^2y = f''(x)\,dx^2 + f'(x)\,d^2x$ at a point $x = x_0$ takes a velocity $v$ and an acceleration $a$ and returns $f''(x_0)\,v^2 + f'(x_0)\,a$, and similarly in more dimensions.

Now, there is another possible version of the second differential, which to avoid ambiguity I will write as $d{\otimes}dx$, or $d^{\otimes2}x$ for short. But first I should say what $dx \otimes dx$ or $dx \otimes dy$ means. This is, like the exterior form $dx \wedge dy$, an operation that acts on two tangent vectors (at a given point); $dx \otimes dx$ multiplies their $x$-components together, and $dx \otimes dy$ multiplies the $x$-component of the first vector by the $y$-component of the second vector. (Then $dx \wedge dy$ itself is $dx \otimes dy - dy \otimes dx$, or half that, depending on your convention.) This is multilinear, but it's not antisymmetric, so it's not an exterior differential form, but it's still a generalized differential form. Note that now both vectors represent a velocity, but they represent velocities along two different curves, or along two edges of a parallelogram (or triangle). Then $d^{\otimes2}x$ is another vector, still a kind of acceleration, but it indicates how the first velocity vector changes when moving in the direction of the second velocity vector (or how the second changes when moving in the direction of the first, which on an infinitesimal level is the same, essentially because of Schwarz's Theorem). Now if $y = f(x)$, we have $d{\otimes}dy = f''(x)\,dx \otimes dx + f'(x)\,d{\otimes}dx$, which at a point $x = x_0$ takes two velocities $v_1$ and $v_2$ and an acceleration $a$ and returns $f''(x_0)\,v_1 v_2 + f'(x_0)\,a$.
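These tensor-square forms are easy to model directly as functions of two tangent vectors; the following sketch on $\mathbb{R}^2$ (with arbitrarily chosen vectors) also exhibits $dx \wedge dy$ as the antisymmetrization $dx \otimes dy - dy \otimes dx$:

```python
# A small model of tensor products of differentials on R^2, acting on pairs of
# tangent vectors: dx⊗dx multiplies the x-components, dx⊗dy takes the
# x-component of the first vector times the y-component of the second,
# and dx∧dy is their antisymmetrization dx⊗dy − dy⊗dx.

def dx_dx(v, w):  # dx ⊗ dx
    return v[0] * w[0]

def dx_dy(v, w):  # dx ⊗ dy
    return v[0] * w[1]

def dy_dx(v, w):  # dy ⊗ dx
    return v[1] * w[0]

def dx_wedge_dy(v, w):  # dx ∧ dy = dx⊗dy − dy⊗dx
    return dx_dy(v, w) - dy_dx(v, w)

v, w = (2.0, 5.0), (3.0, -1.0)
print(dx_dx(v, w))        # 6.0: the product of the x-components
print(dx_wedge_dy(v, w))  # -17.0: the signed parallelogram area
print(dx_wedge_dy(v, v))  # 0.0: the wedge of a vector with itself vanishes
```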

Now antisymmetrize $\otimes$ to $\wedge$: $d{\wedge}dy = f''(x)\,dx \wedge dx + f'(x)\,d{\wedge}dx$, which features the exterior product (aka wedge product) and exterior differential (aka exterior derivative) that you know from exterior differential forms, and so this all comes to zero. In more detail, if $y = f(x)$, then $d{\wedge}dy$ at a point $x = x_0$ takes two velocities $v_1$ and $v_2$ and an acceleration $a$ and returns
$$\big(f''(x_0)\,v_1 v_2 + f'(x_0)\,a\big) - \big(f''(x_0)\,v_2 v_1 + f'(x_0)\,a\big) = 0$$
(or half that, which is still $0$). Of course, in more dimensions, there are more interesting exterior forms, but $d \wedge d$ will still be zero. When working exclusively with exterior forms, one may leave out all of the wedges; this is often done with the exterior product and essentially always done with the exterior differential. But I have included all of the wedges here to contrast with the kind of multiplication and differentiation that appears in the second differential.
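The cancellation can be checked mechanically: antisymmetrizing the operator $(v_1, v_2, a) \mapsto f''(x_0)\,v_1 v_2 + f'(x_0)\,a$ gives zero for every input. A sketch, with the arbitrary choices $f = \sin$ and $x_0 = 1.3$:

```python
import math

# Checking d∧dy = 0: the antisymmetrization of
# d⊗dy(v1, v2, a) = f''(x0) v1 v2 + f'(x0) a cancels exactly,
# since v1*v2 = v2*v1 and the f'(x0)*a terms are identical.

f1, f2 = math.cos, lambda x: -math.sin(x)  # f = sin, so f' = cos, f'' = -sin
x0 = 1.3

def d_tensor_dy(v1, v2, a):
    return f2(x0) * v1 * v2 + f1(x0) * a

def d_wedge_dy(v1, v2, a):
    return d_tensor_dy(v1, v2, a) - d_tensor_dy(v2, v1, a)

print(d_wedge_dy(2.0, -7.0, 4.5))  # 0.0, and likewise for any other inputs
```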

