
General Chain Rule?

  • 07-01-2011 3:12pm
    #1
    Registered Users, Registered Users 2 Posts: 3,038 ✭✭✭


    I'm looking at the proof of the multivariable chain rule & I'm just a little bit curious about something.
    The way I know the single variable chain rule proof, you take the derivative:

    [latex] f'(x) \ = \ \lim_{ \Delta x \to 0} \frac{ \Delta y}{ \Delta x} [/latex]

    and manipulate it as follows:

    [latex] \lim_{ \Delta x \to 0} \frac{ \Delta y}{ \Delta x} \ - \ f'(x) \ = \ 0 [/latex]

    [latex] \frac{ \Delta y}{ \Delta x} \ - \ f'(x) \ = \ \epsilon (x) [/latex]

    [latex] \Delta y \ = \ f'(x) \Delta x \ + \ \epsilon (x) \Delta x [/latex]

    where ε(x) → 0 as Δx → 0, and you work off that function to prove the single variable version.
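
    For reference, here is a rough sketch of how that last step usually goes. The names u and g are assumptions introduced for illustration (y = f(u) with u = g(x)), and ε is taken to be 0 whenever Δu = 0 so that the formula still holds there:

    [latex] \Delta y \ = \ f'(u) \Delta u \ + \ \epsilon \, \Delta u, \qquad \epsilon \to 0 \ \text{as} \ \Delta u \to 0 [/latex]

    [latex] \frac{ \Delta y}{ \Delta x} \ = \ f'(u) \frac{ \Delta u}{ \Delta x} \ + \ \epsilon \, \frac{ \Delta u}{ \Delta x} [/latex]

    [latex] \frac{dy}{dx} \ = \ \lim_{ \Delta x \to 0} \frac{ \Delta y}{ \Delta x} \ = \ f'(u) \frac{du}{dx} \ + \ 0 \cdot \frac{du}{dx} \ = \ f'(g(x)) \, g'(x) [/latex]

    The ε term drops out because g is continuous, so Δu → 0 (and hence ε → 0) as Δx → 0.
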
    The multivariable version uses a function:

    [latex] \Delta z \ = \ f_x(x,y) \Delta x \ + \ f_y(x,y) \Delta y \ + \ \epsilon_1 (x) \Delta x \ + \ \epsilon_2 (x) \Delta y [/latex]

    which I can see is analogous to the single variable version, but I'm
    having trouble deriving it to be honest. But assuming that I'm okay with
    this function, I wonder about the proof.

    The special case is just to divide by Δt & take the limit:

    [latex] \frac{dz}{dt} \ = \ \lim_{ \Delta t \to 0} \frac{ \Delta z}{ \Delta t} \ = \ \lim_{ \Delta t \to 0} \ [ \ f_x(x,y) \frac{ \Delta x}{ \Delta t} \ + \ f_y(x,y) \frac{ \Delta y}{ \Delta t} \ + \ \epsilon_1 (x) \frac{ \Delta x}{ \Delta t} \ + \ \epsilon_2 (x) \frac{ \Delta y}{ \Delta t} \ ] \ = \ \ f_x(x,y) \frac{ dx}{ dt} \ + \ f_y(x,y) \frac{ d y}{ dt} [/latex]
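
    A possible way to spell out why the two error terms vanish in that limit, assuming x(t) and y(t) are differentiable (so Δx, Δy → 0 as Δt → 0) and that ε₁, ε₂ → 0 as (Δx, Δy) → (0,0):

    [latex] \lim_{ \Delta t \to 0} \epsilon_1 \frac{ \Delta x}{ \Delta t} \ = \ \left( \lim_{ \Delta t \to 0} \epsilon_1 \right) \left( \lim_{ \Delta t \to 0} \frac{ \Delta x}{ \Delta t} \right) \ = \ 0 \cdot \frac{dx}{dt} \ = \ 0 [/latex]

    [latex] \lim_{ \Delta t \to 0} \epsilon_2 \frac{ \Delta y}{ \Delta t} \ = \ 0 \cdot \frac{dy}{dt} \ = \ 0 [/latex]

    Splitting the limit of the product is allowed here because both factors have finite limits.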

    and if x & y are both functions of two variables,
    z = f(x,y) = f [ x(s,t),y(s,t) ]
    then you follow the exact same idea when taking the partial w.r.t.
    s or t.

    The general chain rule would just be a natural extension of this right? i.e.

    z = f(x₁,x₂,...,xᵢ) = f [ x₁(t₁,t₂,...,tᵢ),x₂(t₁,t₂,...,tᵢ),...,xᵢ(t₁,t₂,...,tᵢ) ]

    and the partial w.r.t. tᵥ is the exact same idea:


    [latex] \frac{\partial z}{\partial t_v} \ = \ f_{x_1}[x_1(t_1,t_2,...,t_v,...,t_i),x_2(t_1,t_2,...,t_v,...,t_i),...] \ \frac{\partial x_1}{\partial t_v} \ + \ f_{x_2}[x_1(t_1,t_2,...,t_v,...,t_i),x_2(t_1,t_2,...,t_v,...,t_i),...] \ \frac{\partial x_2}{\partial t_v} \ + \ ... \ + \ f_{x_i}[x_1(t_1,t_2,...,t_v,...,t_i),x_2(t_1,t_2,...,t_v,...,t_i),...] \ \frac{\partial x_i}{\partial t_v}[/latex]

    obviously the notation can be shortened :o but that's it right?
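
    As a sketch of the shortened notation, with the summation index j introduced here for illustration and each partial of f evaluated at (x₁(t),...,xᵢ(t)):

    [latex] \frac{\partial z}{\partial t_v} \ = \ \sum_{j=1}^{i} \frac{\partial f}{\partial x_j} \, \frac{\partial x_j}{\partial t_v} [/latex]

    A small made-up example to check it against direct differentiation: take z = x₁x₂ with x₁ = t₁ + t₂ and x₂ = t₁t₂.

    [latex] \frac{\partial z}{\partial t_1} \ = \ x_2 \cdot 1 \ + \ x_1 \cdot t_2 \ = \ t_1 t_2 \ + \ (t_1 + t_2) t_2 \ = \ 2 t_1 t_2 \ + \ t_2^2 [/latex]

    [latex] z \ = \ (t_1 + t_2) \, t_1 t_2 \ = \ t_1^2 t_2 \ + \ t_1 t_2^2 \quad \Rightarrow \quad \frac{\partial z}{\partial t_1} \ = \ 2 t_1 t_2 \ + \ t_2^2 [/latex]

    Both routes give the same answer, as the formula predicts.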


    Assuming that proof to be correct I'm wondering about the function


    [latex] \Delta z \ = \ f_x(x,y) \Delta x \ + \ f_y(x,y) \Delta y \ + \ \epsilon_1 (x) \Delta x \ + \ \epsilon_2 (x) \Delta y [/latex]

    I mean rather than just saying it's analogous in different dimensions
    shouldn't there be a way to derive it from the very similar arguments
    involving tangent planes?

    Start with the vector equation N • (X - X₀) = 0 to derive the plane.
    N•(X - X₀) = 0
    (A,B,C)•[(x - x₀),(y - y₀),(z - z₀)] = 0
    A(x - x₀) + B(y - y₀) + C(z - z₀) = 0
    z - z₀ = (-A/C)(x - x₀) + (-B/C)(y - y₀)
    f(x,y) = f(x₀,y₀) + (-A/C)(x - x₀) + (-B/C)(y - y₀)
    f(x,y) = f(x₀,y₀) + (∂f/∂x)(x - x₀) + (∂f/∂y)(y - y₀)
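
    One way to justify the last substitution, sketched under the assumption that the surface is written as a level set F(x,y,z) = f(x,y) - z = 0 (F is a name introduced here, not part of the derivation above). The gradient of F is normal to the surface, so at the point of tangency:

    [latex] N \ = \ (A,B,C) \ = \ \nabla F \ = \ ( \ f_x(x_0,y_0), \ f_y(x_0,y_0), \ -1 \ ) [/latex]

    [latex] -\frac{A}{C} \ = \ f_x(x_0,y_0), \qquad -\frac{B}{C} \ = \ f_y(x_0,y_0) [/latex]

    Any nonzero multiple of N gives the same two ratios, so the choice C = -1 loses nothing.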

    Now, I understand that this is the description of the tangent plane that
    passes through the point (x₀, y₀, f(x₀,y₀)) & can be used to approximate the
    function for all (x,y) close to (x₀,y₀):
    f(x,y) ≈ f(x₀,y₀) + (∂f/∂x)(x - x₀) + (∂f/∂y)(y - y₀)
    I say this to make sure I have the correct understanding: when I derived
    f(x,y) = f(x₀,y₀) + (∂f/∂x)(x - x₀) + (∂f/∂y)(y - y₀)
    above I was deriving the equation of a linear tangent plane, but for any
    differentiable function we can use this equation to find the tangent plane
    through the point (x₀, y₀, f(x₀,y₀)), and we can also linearly approximate the
    function for all (x,y) close to (x₀,y₀), just like the single variable tangent line.

    It is the extra terms of Taylor's formula that turn
    f(x,y) ≈ f(x₀,y₀) + ... into f(x,y) = f(x₀,y₀) + ...
    That's been confusing me & I'd really appreciate confirmation that I've
    got the logic right now.

    But how do we turn f(x,y) - f(x₀,y₀) = (∂f/∂x)(x - x₀) + (∂f/∂y)(y - y₀) into
    [latex] \Delta z \ = \ f_x(x,y) \Delta x \ + \ f_y(x,y) \Delta y \ + \ \epsilon_1 (x) \Delta x \ + \ \epsilon_2 (x) \Delta y [/latex]

    in a more linear fashion than just saying it should work :confused:
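
    One standard route to the increment formula goes through the mean value theorem rather than the tangent plane. This is only a sketch, and it assumes f_x and f_y exist near (x₀,y₀) and are continuous at (x₀,y₀). The intermediate points c and d are names introduced here. First split Δz into a step in x followed by a step in y:

    [latex] \Delta z \ = \ [ \ f(x_0 + \Delta x, y_0 + \Delta y) \ - \ f(x_0, y_0 + \Delta y) \ ] \ + \ [ \ f(x_0, y_0 + \Delta y) \ - \ f(x_0, y_0) \ ] [/latex]

    Apply the single variable mean value theorem to each bracket, with c between x₀ and x₀ + Δx and d between y₀ and y₀ + Δy:

    [latex] \Delta z \ = \ f_x(c, y_0 + \Delta y) \Delta x \ + \ f_y(x_0, d) \Delta y [/latex]

    Now define the error terms as the differences from the partials at the base point:

    [latex] \epsilon_1 \ = \ f_x(c, y_0 + \Delta y) \ - \ f_x(x_0, y_0), \qquad \epsilon_2 \ = \ f_y(x_0, d) \ - \ f_y(x_0, y_0) [/latex]

    [latex] \Delta z \ = \ f_x(x_0,y_0) \Delta x \ + \ f_y(x_0,y_0) \Delta y \ + \ \epsilon_1 \Delta x \ + \ \epsilon_2 \Delta y [/latex]

    As (Δx, Δy) → (0,0) the points (c, y₀ + Δy) and (x₀, d) tend to (x₀,y₀), so continuity of the partials gives ε₁, ε₂ → 0. Note that ε₁ and ε₂ depend on Δx and Δy, not only on x.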


Comments

  • Registered Users, Registered Users 2 Posts: 1,082 ✭✭✭Fringe


    You can generalise the derivative as the linear operator f'(a) of f at a such that
    f(a + h) = f(a) + f'(a)h + R(h)
    where ||R(h)||/||h|| goes to zero as h goes to zero. This definition can then be applied to any normed space.

    For the chain rule, take functions g:U -> V and f:V -> W. Then the composition is f o g. Now if you assume that g and f are differentiable, then the chain rule comes from applying the definition.
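
    A rough sketch of what "applying the definition" looks like, assuming g is differentiable at a, f is differentiable at g(a), and both derivatives are bounded linear operators. The names R_g, R_f and k are introduced here for illustration:

    [latex] g(a + h) \ = \ g(a) \ + \ g'(a)h \ + \ R_g(h), \qquad k \ = \ g'(a)h \ + \ R_g(h) [/latex]

    [latex] f(g(a + h)) \ = \ f(g(a) + k) \ = \ f(g(a)) \ + \ f'(g(a))k \ + \ R_f(k) [/latex]

    [latex] f(g(a + h)) \ = \ f(g(a)) \ + \ f'(g(a)) \, g'(a)h \ + \ [ \ f'(g(a))R_g(h) \ + \ R_f(k) \ ] [/latex]

    The bracket is the new remainder. Its first part is at most ||f'(g(a))|| ||R_g(h)||, which is small compared with ||h||. For the second part, ||k||/||h|| stays bounded and k → 0 as h → 0, so ||R_f(k)||/||h|| → 0 as well. That leaves

    [latex] (f \circ g)'(a) \ = \ f'(g(a)) \circ g'(a) [/latex]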


  • Registered Users, Registered Users 2 Posts: 3,038 ✭✭✭sponsoredwalk


    Fringe wrote: »
    You can generalise the derivative as the linear operator f'(a) of f at a such that
    f(a + h) = f(a) + f'(a)h + R(h)
    where ||R(h)||/||h|| goes to zero as h goes to zero. This definition can then be applied to any normed space.

    For the chain rule, take functions g:U -> V and f:V -> W. Then the composition is f o g. Now if you assume that g and f are differentiable, then the chain rule comes from applying the definition.

    The book I'm reading doesn't have this form of the proof. I have this one in
    a more advanced book & will come to it soon; I just want to understand the
    more elementary one here first. It seems alright to me, I think, but there's
    also the question of turning the tangent plane equation

    f(x,y) - f(x₀,y₀) = (∂f/∂x)(x - x₀) + (∂f/∂y)(y - y₀)
    f(x₀ + Δx,y₀ + Δy) - f(x₀,y₀) = (∂f/∂x)(x - x₀) + (∂f/∂y)(y - y₀)
    Δz= (∂f/∂x)(x - x₀) + (∂f/∂y)(y - y₀)

    into

    Δz= (∂f/∂x)(x - x₀) + (∂f/∂y)(y - y₀) + ε₁(x)Δx + ε₂(y)Δy
    Δz= (∂f/∂x)Δx + (∂f/∂y)Δy + ε₁(x)Δx + ε₂(y)Δy


  • Registered Users, Registered Users 2 Posts: 3,038 ✭✭✭sponsoredwalk


    You can't derive from a linear tangent plane a function with error terms,
    the whole thing is conceptually null & void from the beginning!!! :(

    I'm okay, should have recognized this from the start :( Remove this
    abortion of a thread :(

