After seeing how we think about ordinary differential equations in Chapter 1, we turn now to an example of a partial differential equation, the heat equation. To set things up, imagine you have some object, like a piece of metal, and you know how the heat is distributed across it at any one moment, that is, what's the temperature of every individual point along this plate? The question is, how will this distribution change over time, as the heat flows from warmer spots to cooler ones? The image on the left shows the temperature of an example plate using color, with the graph of that temperature being shown on the right.
To take a concrete one-dimensional example, let's say you have two different rods at different temperatures, where that temperature is uniform along each one. You know that when you bring them into contact, heat will flow from the hot one to the cool one, tending to make the whole thing equal in temperature over time. But how, exactly? What will the temperature distribution be at each point in time? As is typical with differential equations, the idea is that it's easier to describe how this setup changes from moment to moment than it is to jump straight to a description of the full evolution.
We write this rule of change in the language of derivatives, though as you'll see, we'll need to expand our vocabulary a bit beyond ordinary derivatives. And don't worry, we'll learn how to read the equations you're seeing now in just a minute. Variations of the heat equation show up in many other parts of math and physics, like Brownian motion, the Black-Scholes equations from finance, and all sorts of diffusion, so there are many dividends to be had from a deep understanding of this one setup. In the last video, we looked at ways of building understanding while acknowledging the truth that most differential equations are simply too difficult to actually solve.
And indeed, PDEs tend to be even harder than ODEs, largely because they involve modeling infinitely many values changing in concert. But our main character for today is an equation that we actually can solve. In fact, if you've ever heard of Fourier series, you may be interested to know that this is the physical problem which babyface Fourier was trying to solve when he stumbled across the corner of math that is now so replete with his name. We'll dig into Fourier series much more deeply in the next chapter, but I would like to give you at least a little hint of the beautiful connection which is to come.
This animation you're seeing right now shows how lots of little rotating vectors, each rotating at some constant integer frequency, can trace out an arbitrary shape. To be clear, what's happening is that these vectors are being added together, tip to tail, at each moment, and you might imagine that the last one has some sort of pencil at its tip tracing a path as it goes. For finitely many vectors, this tracing usually won't be a perfect replica of the target shape, which in this animation is a lowercase f, but the more circles you include, the closer it gets. What you're seeing now uses only 100 circles, and I think you'd agree that the deviations from the real shape are negligible.
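For readers following along without the animation, here is a minimal Python sketch of the same idea. It leans on the standard Fourier-series coefficient formula, which this chapter does not derive, and every name in it (`square_path`, `fourier_coefficients`, `trace_point`, `max_deviation`) is hypothetical. The target shape is a simple square rather than the lowercase f from the animation, since the outline of that f is not available here.

```python
import cmath
import math

def square_path(t):
    """Hypothetical target shape: the boundary of a square, traced once as t goes 0 -> 1."""
    corners = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
    s = (4 * t) % 4
    side = int(s)
    u = s - side
    start, end = corners[side], corners[(side + 1) % 4]
    return start + u * (end - start)

def fourier_coefficients(samples, max_freq):
    """Riemann-sum estimate of c_n = integral over [0, 1] of f(t) * exp(-2*pi*i*n*t) dt.

    Each c_n is a complex number: its size and angle are the initial size
    and angle of the vector that rotates at integer frequency n.
    """
    count = len(samples)
    return {
        n: sum(z * cmath.exp(-2j * math.pi * n * k / count)
               for k, z in enumerate(samples)) / count
        for n in range(-max_freq, max_freq + 1)
    }

def trace_point(coeffs, t):
    """Add all the rotating vectors tip to tail at time t."""
    return sum(c * cmath.exp(2j * math.pi * n * t) for n, c in coeffs.items())

def max_deviation(coeffs, samples):
    """Largest distance between the traced point and the target over the sample times."""
    count = len(samples)
    return max(abs(trace_point(coeffs, k / count) - z) for k, z in enumerate(samples))

samples = [square_path(k / 400) for k in range(400)]
few = fourier_coefficients(samples, 5)     # 11 rotating vectors
many = fourier_coefficients(samples, 50)   # 101 rotating vectors
print(max_deviation(few, samples), max_deviation(many, samples))
```

The second printed number should come out smaller than the first, which is the "more circles, closer fit" behavior described above.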
What's mind-blowing is that just by tweaking the initial size and angle of each vector, that gives you enough control to approximate any curve you want. At first, this might seem like an idle curiosity, a neat art project but little more. In fact, the math that makes this possible is the same as the math describing the physics of heat flow. But we're getting ahead of ourselves. Step one is simply to build up the heat equation, and for that, let's start by being clear about what the function we're analyzing is exactly.
We have a rod in one dimension, and we're thinking of it as sitting on an x-axis, so each point of that rod is labeled with a unique number, x. The temperature is some function of that position, T of x, shown here as a graph above it. But really, since the value changes over time, we should think of this function as having one more input, t, for time. You could, if you wanted, think of this input space as really being two-dimensional, representing space and time together, with the temperature being graphed as a surface above it, each slice across time showing you what that distribution looks like at any given moment.
Or you could simply think of this graph of temperature changing with time. Both are equivalent. This surface is not to be confused with what I was showing earlier, the temperature graph of a two-dimensional body. Be mindful when you're studying equations like these of whether time is being represented with its own axis, or if it's being represented with literal changes over time, say in an animation. Last chapter, we looked at some systems where just a handful of numbers changed over time, like the angle and angular velocity of a pendulum, describing that change in the language of derivatives. But when we have an entire function changing with time, the mathematical tools become slightly more intricate.
Because we're thinking of this temperature function with multiple dimensions to its input space, in this case one for space and one for time, there are multiple different rates of change at play. There's the derivative with respect to x, how rapidly the temperature changes as you move along the rod. You might think of this as the slope of our surface when you slice it parallel to the x-axis, or given a tiny step in the x-direction and the tiny change to temperature caused by it, this gives a ratio between the two.
But there's also the rate at which a single point on the rod changes with time, what you might think of as the slope of the surface when you slice it in the other direction, parallel to the time axis. Each one of these derivatives tells only part of the story for how this temperature function changes, so we call them partial derivatives. To emphasize this point, the notation changes a little, replacing the letter d with a special curly d, sometimes called del. Personally, I think it's a little silly to change the notation for this, since it's essentially the same operation. I would rather see notation that emphasizes that the del T terms up in the numerators refer to different changes.
One is a small change to temperature after a small change in time, the other is a small change to temperature after a small step in space. To reiterate a point I made in the calculus series, I do think it's healthy to initially read derivatives like this as a literal ratio between a small change to the function's output and the small change to the input that caused it. Just keep in mind that what this notation is meant to encode is the limit of that ratio for smaller and smaller nudges to the input, rather than a specific value of the ratio for a finitely small nudge. This goes for partial derivatives just as much as it does for ordinary derivatives.
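To make the "ratio of small changes" reading concrete, here is a small Python sketch. The temperature function in it is made up purely for illustration (it is not the rod from the video), and the names `temperature`, `ratio_in_x`, and `ratio_in_t` are hypothetical. It nudges one input at a time and prints the resulting ratio for smaller and smaller nudges.

```python
import math

def temperature(x, t):
    """Hypothetical temperature function, chosen only because its derivatives are easy to write down."""
    return math.exp(-t) * math.sin(x)

def ratio_in_x(x, t, dx):
    """Change in temperature per small step in space, holding time fixed."""
    return (temperature(x + dx, t) - temperature(x, t)) / dx

def ratio_in_t(x, t, dt):
    """Change in temperature per small step in time, holding position fixed."""
    return (temperature(x, t + dt) - temperature(x, t)) / dt

x0, t0 = 1.0, 0.5
exact_dx = math.exp(-t0) * math.cos(x0)    # partial derivative with respect to x
exact_dt = -math.exp(-t0) * math.sin(x0)   # partial derivative with respect to t

for nudge in (0.1, 0.01, 0.001):
    print(nudge, ratio_in_x(x0, t0, nudge), ratio_in_t(x0, t0, nudge))
print("limits:", exact_dx, exact_dt)
```

Each ratio is a specific number for a specific nudge; the partial derivative is the value those numbers approach as the nudge shrinks.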
The heat equation is written in terms of these partial derivatives. It tells us that the way this function changes with respect to time depends on how it changes with respect to space. More specifically, it's proportional to the second partial derivative with respect to x. At a high level, the intuition is that at points where the temperature distribution curves, it tends to change more quickly, in the direction of that curvature. Since a rule like this is written using partial derivatives, we call it a partial differential equation.
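Since the on-screen equation does not survive in a transcript, here it is written out for the one-dimensional rod. The symbol \alpha is an assumed name for the positive proportionality constant, which the narration does not name.

```latex
\frac{\partial T}{\partial t}(x, t) = \alpha \, \frac{\partial^2 T}{\partial x^2}(x, t)
```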
This has the funny result that to an outsider, the name sounds like a tamer version of ordinary differential equations, when quite to the contrary, partial differential equations tend to tell a much richer story than ODEs and are much harder to solve in general. The general heat equation applies to bodies in any number of dimensions, which would mean more inputs to our temperature function, but it'll be easiest for us to stay focused on the one-dimensional case of a rod. As it is, graphing this in a way which gives time its own axis already pushes our visuals into the third dimension.
So I threw out this equation, but where does this come from? How could you think up something like this yourself? Well, for that, let's simplify things by describing a discrete version of the setup, where you have only finitely many points x in a row. This is sort of like working in a pixelated universe where instead of having a continuum of temperatures, we have a finite set of separate values. The intuition here is simple. For a particular point, if its two neighbors on either side are, on average, hotter than it is, it will heat up. If they're cooler on average, it'll cool down.
Here, specifically focus on these three neighboring points, x1, x2, and x3, with corresponding temperatures T1, T2, and T3. What we want to compare is the average of T1 and T3 with the value of T2. When this difference is greater than zero, T2 will tend to heat up, and the bigger the difference, the faster it heats up. Likewise, if it's negative, T2 will tend to cool down, at a rate proportional to that difference. More formally, we write that the derivative of T2, with respect to time, is proportional to the difference between the average value of its neighbors and its own value.
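Written out, the rule just described would look something like this, where \alpha is an assumed name for the positive proportionality constant:

```latex
\frac{dT_2}{dt} = \alpha \left( \frac{T_1 + T_3}{2} - T_2 \right)
```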
To write this in a way which will ultimately explain the second derivative in the heat equation, let me rearrange this right-hand side a bit in terms of the difference between T1 and T2, and the difference between T2 and T3. You can quickly check that these two are the same. The top has half of T1, and in the bottom there are two minus signs in front of T1, so it's positive, and the half has been factored out. Likewise, both have half of T3. Then on the bottom, we have a negative T2 that's effectively written twice, so when you take half of that, it's the same as the single negative T2 written up top.
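The "top" and "bottom" here refer to two expressions shown on screen. A reconstruction consistent with the description (a best guess at the layout, not a copy of the original frame) is:

```latex
\frac{T_1 + T_3}{2} - T_2
  \;=\; \frac{1}{2} \Big( (T_3 - T_2) - (T_2 - T_1) \Big)
```

Expanding the right-hand side gives \tfrac{1}{2}(T_3 - 2T_2 + T_1), which matches the left-hand side term by term.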
Like I said, the reason to rewrite it is that it takes us a step closer to the language of derivatives. In fact, let's go ahead and write these guys as delta T1 and delta T2. It's the same value on the right hand side, but we're adding a new perspective to how to think about it. Instead of comparing the average of the neighbors to T2, we're thinking about the difference of the differences. Here, take a moment to gut check that this makes sense. If those two differences are the same, then the average of T1 and T3 is the same as T2, so T2 will not tend to change.
If delta T2 is bigger than delta T1, meaning the difference of the differences is positive, notice how the average of T1 and T3 is bigger than T2, so T2 tends to increase. And on the flip side, if the difference of the differences is negative, which means delta T2 is smaller than delta T1, it corresponds to an average of these neighbors being less than T2. We could be especially compact with our notation and write this whole term, the difference between the differences, as delta delta T1. This is known in the lingo as a second difference.
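In symbols, taking the two differences to be measured left to right (which is what the sign checks above imply), and with \alpha as an assumed name for the proportionality constant:

```latex
\Delta T_1 = T_2 - T_1, \qquad \Delta T_2 = T_3 - T_2, \qquad
\Delta\Delta T_1 = \Delta T_2 - \Delta T_1 = T_3 - 2T_2 + T_1,
\qquad
\frac{dT_2}{dt} = \frac{\alpha}{2} \, \Delta\Delta T_1
```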
If it feels a little weird to think about, keep in mind it's essentially a compact way of writing the idea of how much T2 differs from the average of its neighbors. It just has this extra factor of one half, is all. And that factor doesn't really matter, because either way we're writing this equation in terms of some proportionality constant. The upshot is that the rate of change for the temperature of a point is proportional to the second difference around it. As we go from this finite context to the infinite continuous case, the analog of a second difference is the second derivative.
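The discrete rule is simple enough to simulate directly. Below is a minimal Python sketch for the two-rods example, under a few assumptions that are not from the video: time is advanced in small explicit steps, the constants `alpha` and `dt` are arbitrary, the function name `step` is hypothetical, and the two end points treat their missing neighbor as having their own temperature (so no heat leaks out the ends).

```python
def step(temps, alpha, dt):
    """Advance every point by one small time step using the second-difference rule."""
    new = temps[:]
    last = len(temps) - 1
    for i in range(len(temps)):
        # Assumption: an end point's missing neighbor mirrors its own temperature.
        left = temps[i - 1] if i > 0 else temps[i]
        right = temps[i + 1] if i < last else temps[i]
        second_difference = (right - temps[i]) - (temps[i] - left)
        # Rate of change is proportional to half the second difference,
        # i.e. to (average of neighbors) - (own value).
        new[i] = temps[i] + alpha * dt * 0.5 * second_difference
    return new

# Two rods brought into contact: a hot one (1.0) and a cool one (0.0).
temps = [1.0] * 10 + [0.0] * 10
for _ in range(2000):
    temps = step(temps, alpha=1.0, dt=0.5)
print(temps)
```

With these particular constants the values settle toward a uniform 0.5, the average of the starting temperatures. Larger values of `alpha * dt` can make this simple stepping scheme unstable, so the numbers here are a choice, not a recommendation.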
Instead of looking at the difference between the temperature values at points some fixed distance apart, you instead consider what happens as you shrink the size of that step toward zero. And in calculus, instead of talking about absolute differences, which would also approach zero, you think in terms of the rate of change. In this case, what's the rate of change in temperature per unit distance? And remember, there are two separate rates of change at play.
How does that temperature change as time progresses, and how does the temperature change as you move along the rod? The core intuition remains the same as what we had in the discrete case, to know how a point differs from its neighbors, look not just at how the function changes from one point to the next, but at how the rate of change itself changes. Now in calculus land, we write this as del squared T over del x squared, the second partial derivative of our function with respect to x. Notice how this slope increases at points where the graph curves upwards, meaning the rate of change of the rate of change is positive.
Similarly, that slope decreases at points where the graph curves downwards, where the rate of change of this rate of change is negative. Tuck that away as a meaningful intuition for problems well beyond the heat equation. Second derivatives give a measure of how a value compares to the average of its neighbors. Hopefully, that gives some satisfying added color to the equation. It's already pretty intuitive when you read it as saying that curved points tend to flatten out, but I think there's something even more satisfying about seeing a partial differential equation like this arise almost mechanistically from thinking about each point as simply tending towards the average of its neighbors.
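The link between second derivatives and neighbor averages can be checked numerically. This Python sketch uses the sine function as a stand-in profile (an arbitrary choice, not from the video) and a hypothetical helper `neighbor_average_gap`. For a small step h, the gap between a point and the average of its two neighbors is roughly h squared over 2 times the second derivative, so rescaling the gap by 2 over h squared should recover the second derivative.

```python
import math

def neighbor_average_gap(f, x, h):
    """Average of the two neighbors at distance h, minus the value at x itself."""
    return (f(x - h) + f(x + h)) / 2 - f(x)

x0 = 1.0
exact_second_derivative = -math.sin(x0)   # second derivative of sin is -sin

for h in (0.1, 0.01, 0.001):
    gap = neighbor_average_gap(math.sin, x0, h)
    estimate = 2 * gap / h**2
    print(h, estimate, exact_second_derivative)
```

At x0 = 1.0 the sine curve bends downward, so the neighbors average out below the point itself and the gap is negative, matching the negative second derivative.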
Take a moment to compare what this feels like to the case of ordinary differential equations. For example, if we have multiple bodies in space tugging at each other with gravity, what we're analyzing is a handful of changing numbers, in this case the coordinates of each object. The rate of change for any one of these values depends on the values of the other numbers, and we often write this down as a system of equations. On the left, we have the derivative of each value with respect to time, and on the right there's some combination of all the other values.
In our partial differential equation, the difference is that we have infinitely many values from a continuum, all changing, and again, the way that any one of these values changes depends on the other values, but quite helpfully, each one only depends on its immediate neighbors in some limiting sense of the word neighbor. So here, the relation on the right-hand side is not a sum or product of the other numbers, it's instead a kind of derivative, just a derivative with respect to space instead of with respect to time. In a sense, when you think about it, this one partial differential equation is like a system of infinitely many equations, one for each point on the rod.
You might wonder about objects which are spread out in more than one dimension, like a plate or something three dimensional. In that case the equation looks quite similar, but you include the second derivative with respect to the other spatial directions as well. And adding up all of these second spatial derivatives like this is common enough as an operation that it has its own special name, the Laplacian, often written as this upside down triangle squared. It's essentially a multivariable version of the second derivative, and the intuition for this equation is no different from the one dimensional case.
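For a three-dimensional body with coordinates x, y, z, the equation described here can be written as follows, again with \alpha as an assumed name for the proportionality constant. For a flat plate, the z term would simply be dropped.

```latex
\frac{\partial T}{\partial t}
  = \alpha \left( \frac{\partial^2 T}{\partial x^2}
                + \frac{\partial^2 T}{\partial y^2}
                + \frac{\partial^2 T}{\partial z^2} \right)
  = \alpha \, \nabla^2 T
```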
This Laplacian can still be thought of as measuring how different a point is from the average of its neighbors, but now these neighbors aren't just left and right of it, they're all around. For the curious among you, I did a couple of videos during my time at Khan Academy on this operator if you want to go check them out. For those of you with some multivariable calculus under your belt, it's nice to think about it as the divergence of the gradient. But you don't have to worry about that, for our purposes let's stay focused on the one-dimensional case. If you feel like you understand all of this, pat yourself on the back.
Being able to read a PDE is no joke, and it's a powerful addition to have to your vocabulary for describing the world around you. But after all this time spent interpreting the equations, I say it's high time we start solving them, don't you? And trust me, there are few pieces of math quite as satisfying as what poodle-haired Fourier over here developed to solve this problem. All this and more in the next chapter. I was originally motivated to cover this particular topic when I got an early view of Steve Strogatz's new book Infinite Powers. This isn't a sponsored message or anything like that, but all cards on the table I do have two selfish ulterior motives for mentioning it.
The first is that Steve has been a really strong, maybe even pivotal advocate for the channel since the very beginnings, and I've had an itch to repay that kindness for quite a while. And the second is to make more people love math, and calculus in particular. That might not sound selfish, but think about it. When more people love math, the potential audience base for these videos gets bigger. And frankly, there are few better ways to get people loving the subject than to expose them to Strogatz's writing.
So if you have friends who you think would enjoy the ideas of calculus but maybe have been a bit intimidated by math in the past, this book does a really outstanding job communicating the heart of the subject, both substantively and accessibly. Its main theme is the idea of constructing solutions to complex real-world problems from simple idealized building blocks, which as you'll see is exactly what Fourier did. And for those of you who already know and love the subject, you will find no shortage of fresh insights and enlightening stories. I certainly enjoyed it. Again, I kind of know that sounds like an ad, but it's not. I just actually think you'll enjoy the book.