In 2023, this is still the most powerful explanation I have ever come across regarding Ito and SDEs. Thanks a lot!
Thanks a lot for the positive response 😀
One of the simplest and most excellent expositions I've seen. Bravo!
Thanks a lot!
This is the first time I got SDEs and how to use Ito's Lemma. Thank you!
Excellent!!!-A CQF alumni
This is the best explanation of Ito I found so far!
Thanks a lot!
Enlightening. Thanks
I would love to see more financial mathematics videos covered in English!!! This was really helpful. Thank you :)
Thanks a lot for the nice words!
Good job, sir. I always try to watch intuitive math videos and then solve the equations, understanding why each step is used.
I’m taking a financial mathematics course this semester. Thanks for this
I wish all your videos were in English, because your explanations are just excellent. I was familiar with Ito, but you just gave me a new intuition. Thank you so much!
Thanks a lot for your nice comment. I will do a math for economists channel next term, but that will be fairly elementary.
Brilliant explanation.
Thank you for the positive feedback!
Great explanation with Excel. Good job, thank you!
Just stumbled upon this explanation. Really nice, thanks a lot! I have a question though. At 15:00 you show the formula for the correct solution, and there I can see a term -A2, which is "t". But in this case we basically end up with a function similar to exp(-t), because "t" is increasing and Wt is not. Then the solution can't look like what we need. So did you correct the formula afterwards? How does "t" actually contribute to the formula exp(sigma*Wt - t*sigma^2/2)?
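One way to look at the question above is to compare the sizes of the two terms in the exponent. The sketch below is only an illustration: the value of `sigma` is a hypothetical choice, not taken from the video. It uses the fact that sigma*Wt has standard deviation sigma*sqrt(t), while the correction term is sigma^2*t/2.

```python
import math

sigma = 0.01    # hypothetical volatility per step, chosen only for illustration

for t in (1, 100, 10_000):
    typical_noise = sigma * math.sqrt(t)   # standard deviation of sigma * W_t
    correction = 0.5 * sigma**2 * t        # deterministic term sigma^2 * t / 2
    print(t, typical_noise, correction)
```

For a small sigma the correction stays well below the typical size of the random term for a long time, so the exponent does not behave like a plain -t. The two sizes only become equal around t = 4/sigma^2.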
Thanks for the clear explanation! Greetings from Malaysia👍👍👍
Thanks for the reply! Great to see where all the viewers come from! Greetings from Bonn, Germany...
Very good, although I do think it's useful to include a note on the quadratic variation of the Wiener process being equal to dt, i.e. (dW_t)^2 = dt, for the application of Ito's lemma.
Great video!
Very instructive video, thanks sir. May I know where the intuition comes from for adding the variance term in order to correct the solution of the SDE for the Wiener process?
Thanks for the question, sure! You go down 1 percent, then up again 1 percent, but you do not get the initial value. So the "arithmetic" version (one more, one less) has no drift, but the "geometric" version (one percent more, one percent less) does have a drift, and that needs to be taken into account. Does that help (a little)? Best, H^2
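The "one percent down, one percent up" argument can be checked with a few lines of arithmetic. This is a minimal sketch; the starting value of 100 is an arbitrary assumption.

```python
# Hypothetical starting value; any positive number gives the same ratios.
start = 100.0

# Arithmetic version: one unit down, then one unit up returns to the start.
arithmetic = start - 1.0 + 1.0

# Geometric version: 1 percent down, then 1 percent up ends slightly below the start.
geometric = start * (1 - 0.01) * (1 + 0.01)

# The shortfall per down/up pair is start * 0.01**2, because (1 - s)(1 + s) = 1 - s^2.
shortfall = start - geometric

print(arithmetic, geometric, shortfall)
```

The loss of s^2 per pair of steps, that is roughly s^2/2 per step, is in line with the sigma^2/2 correction in the exponent of the solution.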
If you treat it like a regular ODE, then you get the first function you were describing, but it doesn't take into account the stochastic part, like you said. My question is: how did you figure out that the stochastic part was -variance/2*t? Is it because you equated the 2nd and 3rd parts in Ito's lemma so that they would cancel out and you would be left with the correct solution?
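The cancellation asked about here can be written out as a short worked derivation. It is a sketch under an assumption: the SDE is taken to be dS_t = sigma*S_t*dW_t (no drift), with the trial solution S_t = S_0*exp(sigma*W_t - c*t), where the constant c is a placeholder to be determined.

```latex
% Assumed setup: dS_t = \sigma S_t \, dW_t,
% trial solution S_t = f(t, W_t) = S_0 \exp(\sigma W_t - c\,t), with c to be determined.
\begin{align*}
df &= \frac{\partial f}{\partial t}\,dt
      + \frac{\partial f}{\partial W}\,dW_t
      + \frac{1}{2}\frac{\partial^2 f}{\partial W^2}\,(dW_t)^2 \\
   &= -c\,f\,dt + \sigma f\,dW_t + \frac{1}{2}\sigma^2 f\,dt
      \qquad \text{using } (dW_t)^2 = dt \\
   &= \Bigl(\tfrac{1}{2}\sigma^2 - c\Bigr) f\,dt + \sigma f\,dW_t .
\end{align*}
% The dt term vanishes exactly when c = \sigma^2/2, which leaves dS_t = \sigma S_t\,dW_t.
```

So the first-derivative-in-time term and the second-derivative term cancel precisely when c = sigma^2/2, which is where the -sigma^2*t/2 in the exponent comes from.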
Would love you to go through Ito integration in similar detail.
Thanks for the great video! One question, if I may: at 8:29, if your delta t is not 1, is your dWt still drawn from a standard normal? I just want to clarify the relation between dWt, the standard normal, and dt. Is dWt always ~N(0,1) for any dt? Many thanks in advance if anyone can advise.
The variance of the normal distribution is the length of the time interval. The interval happened to be 1 here, so we use N(0,1), but if the intervals were 2, you’d use N(0,2).
@@eugenefrancisco8279 Thank you so much for clarifying, Sir! So here we just choose dt=1, so that dWt~N(0,1), for mathematical convenience, and all the following derivation is based on dt=1, right? And if we choose dt=2, then his Excel spreadsheet example will yield a different random path: as the variance becomes 2, each dWt will be more volatile and the path will move further from the mean per step. Is that right?
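The relation between dt and the distribution of dWt discussed in this exchange can be checked numerically. The sketch below uses only the Python standard library; the helper name `wiener_increments`, the seed, and the sample size are assumptions made for illustration.

```python
import math
import random
import statistics

def wiener_increments(dt, n, seed=0):
    """Draw n increments dW ~ N(0, dt), i.e. with standard deviation sqrt(dt)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n)]

# The sample variance of the increments should be close to the step size dt.
for dt in (0.25, 1.0, 2.0):
    sample = wiener_increments(dt, 100_000)
    print(dt, round(statistics.pvariance(sample), 3))
```

Each single increment is more volatile for a larger dt, but a fixed horizon T then needs only T/dt steps, so the variance of W_T stays equal to T whatever step size is chosen.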
amazing video thank u so much
Excellent video.
Thanks for this comment :-)
Thanks for this video
Brilliant!
At minute 15:53 you said that "this means that this (formula) is the solution to our differential equation". Does this mean that the formula is capable of predicting all future values of the function? Or does it mean that it is a solution for all the points that you already have, and you can't predict anything with the solution to the differential equation? Thanks for your answer.
Thanks for the question! It is really not trivial what "solving" an SDE means. Can you make predictions? Yes and no. The stochastic process always "wiggles" up and down. So you can never know where it will be at some future time T. But you can calculate where you expect it to be (expected value), the standard deviation, and so on, even the complete probability distribution of potential values. Because if you "solve" the SDE, you know that the probability distribution is a distorted version of the Gaussian distribution, that is, the distribution of the underlying Wiener process. I hope that helps (a little). Compare with the solution of an ODE. Without the solution: you have to simulate in order to get the value at some date. With the solution: you can get the value with an equation, no simulation needed. In the case of an SDE, without the solution: you can use a Monte Carlo simulation to get a distribution of potential values. With the solution: you can calculate the distribution right away, no simulation needed.
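The contrast between "with the solution" and "without the solution" can be sketched in code. The example assumes the driftless SDE dS_t = sigma*S_t*dW_t with solution S_T = S0*exp(sigma*W_T - sigma^2*T/2); the parameter values, the seed, and the helper name `euler_terminal` are hypothetical choices for illustration.

```python
import math
import random
import statistics

S0, SIGMA, T = 100.0, 0.2, 1.0   # illustrative values, not from the video

def euler_terminal(n_steps, rng):
    """Without the solution: simulate dS = sigma * S * dW step by step up to time T."""
    dt = T / n_steps
    s = S0
    for _ in range(n_steps):
        s += SIGMA * s * rng.gauss(0.0, math.sqrt(dt))
    return s

rng = random.Random(1)
simulated = [euler_terminal(50, rng) for _ in range(20_000)]

# With the solution: these follow from the formula, no simulation needed.
exact_mean = S0                                      # E[S_T] = S0, since there is no drift
exact_median = S0 * math.exp(-0.5 * SIGMA**2 * T)    # W_T has median 0

print(statistics.mean(simulated), exact_mean)
print(statistics.median(simulated), exact_median)
```

The Monte Carlo estimates should land close to the closed-form values, but they need a million random draws here, whereas the formula gives the same quantities in one line each.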
Dear Hendrik, Thanks very much for an awesome video. Could you please share the Excel sheet which you produced in the video?
I just did, see docs.google.com/spreadsheets/d/1R8XnkAcfAmASlk2sn7bnlJxzceOosxKYUAlSg61_7ro/edit?usp=sharing
Thank you for the brilliant content, professor. I know that a standard Brownian motion has the property W(t) - W(s) ~ N(0, t-s). May I ask if the dW used here also follows N(0, step size), assuming the step size used here is 1?
Hello, thanks for your questions, maybe the same answer applies to both questions: In order to make everything as simple as possible for a start, I used a step size of 1. To make up for that, I used a tiny beta. For a larger beta, to get a good approximation, one would have to reduce the step size.
Your second intuition is correct: W(t)-W(s) follows N(0,t-s), and W(t+1)-W(t) follows N(0, 1).
If this is not helpful, let me know, I'll write more :-) Best, H^2
The video is very interesting, thank you! However, I didn't quite understand how Ito's lemma allows one to take into account continuous variations of the interest rate.
Dear Marine, continuous variations of the interest rate are more complex. The easiest model is Vasicek, see for example th-cam.com/video/bHr1bBO61FY/w-d-xo.htmlsi=9caCaOHSEqVKAncS
Thanks, Professor, for this excellent explanatory video. I have a doubt: do we take dw(t) as the realization of a standard normal variable at time t, or as the difference between that realization and the realization at time t-1?
Dear Surendra, thanks for the positive feedback and for the question! I think that your answer 1 is correct. dw is the realization of a standard normal variable, and we add it to w at date t to get w at date t+dt. And of course, like always when we speak about stochastic differential equations, this is just the intuitive explanation. dt is really infinitesimally small, so also dw is infinitesimally small. But adding up an infinite number of infinitesimals gives us something non-zero. Best! H^2
P.S. If this answer did not help, let me know 🙂
Professor, I have tested the equations on my computer, and I found that the notation "t" used is actually the step size, instead of the actual t in {0,1,2,3,4,5,6,7,8,...}. May I ask why that is?
Thanks
How do you calculate Wt and dWt in Excel at 6:30?
Thanks for the question! The step size is 1, the volatility of a Wiener process is 1, and its drift (mean) is zero, so the increments dW_t are standard normally distributed. The formula for dW_t in Excel is thus simply =NORMSINV(RAND()). Then you get W_t by accumulating dW_t: B3 = B2 + C3, and so on. Does that work? Best, H^2
@@heha1390 No, it didn’t work. Could you please add the Excel file link to the description?
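For readers who prefer code to a spreadsheet, the two columns described in the reply above can be mirrored as follows. This is a sketch under assumptions: the lists `w` and `dw` stand in for columns B and C, `rng.gauss(0.0, 1.0)` plays the role of =NORMSINV(RAND()), and the seed and number of steps are arbitrary.

```python
import random

rng = random.Random(42)   # fixed seed so the path is reproducible (illustrative choice)
n_steps = 10

w = [0.0]        # column B: W_t, starting at W_0 = 0
dw = [None]      # column C: dW_t, no increment in the first row
for _ in range(n_steps):
    step = rng.gauss(0.0, 1.0)   # plays the role of =NORMSINV(RAND())
    dw.append(step)
    w.append(w[-1] + step)       # B3 = B2 + C3, and so on down the column

for t, (w_t, dw_t) in enumerate(zip(w, dw)):
    print(t, w_t, dw_t)
```

Each row adds the current increment to the previous value of W, which is the same running sum the spreadsheet formula builds.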
Sir, please more videos in english on Financial Mathematics
Hi, very basic question: how do you get dWt from Wt? And is Wt just ~N(0,1)?
Dear Camile, thanks for the question. Actually, it is the other way round. dW(t) is the increment, it is (in this example) ~N(0,1) because the time increment is always 1. If you choose a smaller increment dt, then the increment is ~N(0, dt), i.e. its standard deviation is sqrt(dt). So the increments are normally distributed, such that the path W(t) itself is wiggly but continuous (if you'd let dt converge to 0). I chose dt = 1 in order not to get too much notation at the start. Best, H^2
Hi,
I tried to reproduce this in Excel. I noticed that if I use a volatility of 10% or 0.1, the Ito and actual value separate at some points but re-converge or are at least close. More interestingly, the naïve exp(Wt) implementation diverges significantly. I suspect that this is due to numerical error. Could you comment on that?
I am not sure, it is difficult to diagnose from far away. My guess would be that a high volatility has a similar effect as a large step size. And a large step size, also for ordinary differential equations, leads to an imprecise approximation. In each step, you follow the tangent instead of the actual solution. So my suggestion would be: let the step size decrease, and see whether the tracking error remains... Let me know if you were successful. Best, H^2
@@heha1390 I reproduced the question and used your solution and it appears to work. Thank you both!
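The step-size suggestion in this exchange can be tried out with a small experiment. The sketch below assumes the driftless SDE dS_t = sigma*S_t*dW_t and its solution S0*exp(sigma*W_t - sigma^2*t/2); the function name `tracking_error`, the horizon, the seed, and the grid sizes are hypothetical. It builds one fine Brownian path and reuses it for coarser step sizes, so that the comparison is made on the same path.

```python
import math
import random

SIGMA, T, S0 = 0.1, 100.0, 1.0   # illustrative values; 0.1 mirrors the 10% in the comment
N_FINE = 10_000                  # number of fine steps, so the finest dt is 0.01

def tracking_error(coarsen, seed=0):
    """Largest relative gap between the step-by-step path and the exact solution."""
    rng = random.Random(seed)
    dt_fine = T / N_FINE
    fine = [rng.gauss(0.0, math.sqrt(dt_fine)) for _ in range(N_FINE)]
    dt = dt_fine * coarsen
    s, w, t, worst = S0, 0.0, 0.0, 0.0
    for i in range(0, N_FINE, coarsen):
        dw = sum(fine[i:i + coarsen])     # coarse increment built from the same fine path
        s += SIGMA * s * dw               # one step along the "tangent"
        w += dw
        t += dt
        exact = S0 * math.exp(SIGMA * w - 0.5 * SIGMA**2 * t)
        worst = max(worst, abs(s - exact) / exact)
    return worst

for coarsen in (100, 10, 1):
    print(T / N_FINE * coarsen, tracking_error(coarsen))
```

On a typical path the reported gap should shrink as the step size goes from 1 to 0.1 to 0.01, which is consistent with the guess that the discrepancy comes from the step size rather than from numerical error.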
Can we find this Excel file?
I'll have to create one and upload it some day. But actually, I'd recommend to "program" it yourself, because I think "doing" is more insightful than "looking". But anyway, I hope I find the time to create a nice uploadable Excel file soon...
It took you 25 mins to explain what my teacher tried to explain in 6 months
Thank you for these nice words, made my day!
This is slightly confusing and potentially teaches the wrong intuition, in that you appear to show a volatility impact on terminal wealth. But there should be no expectation of volatility drag on cumulative wealth. Your comparison is not quite "(static) apple to (stochastic) apple". In the ODE (non-stochastic) case, you assumed a constant and positive compounding rate for the stock. But in the SDE (stochastic) case, the stock's compounding rate is drawn from a normal distribution with mean 0 and standard deviation sigma. Therefore, in your SDE the expected compounding rate is 0, while the compounding rate in your ODE is nonzero, which you labeled beta! You wouldn't compare the static apple to a fuzzy stochastic apple with a mean diameter of 0, would you? The more interesting question is whether a stochastic process with the same mean drift as a non-stochastic process suffers a drag on cumulative wealth due to the volatility. To answer that question, we need to solve an SDE with the same mean drift as the ODE, plus a stochastic term driven by a Wiener process, i.e. geometric Brownian motion:
dS_t = a x S_t x dt + b x S_t x dW_t,
where a is the drift (same as your beta), or average compounding rate, and b is the standard deviation of the compounding rate for one time period. dW_t is the Brownian shock, ~N(0, dt), which is ~N(0,1) for a step of one time period (white noise).
You can integrate this by first rewriting the Ito process in Stratonovich form:
dS_t = (a - 1/2*b^2) x S_t x dt + b x S_t * dW_t
where "x" denotes the ordinary (Ito) product and the "*" in front of dW_t denotes the Stratonovich product. We can then integrate by separation of variables: divide both sides of the Stratonovich form of the SDE by S_t:
dS_t/S_t = (a - 1/2*b^2) x dt + b * dW_t
Now integrate both sides from 0 to t, and write W_t as sqrt(t)*epsilon, to get
S_t = S_0 * exp[(a - 1/2*b^2)*t + b*sqrt(t)*epsilon]
where epsilon ~ N(0, 1). People often erroneously assume 1/2*b^2 is a volatility drag on performance. Let's see whether or not it is a drag on expected terminal wealth. Taking the expectation of S_t, and noting that only epsilon is random, you get:
E[S_t] = S_0 * exp[(a - 1/2*b^2)*t] * E[exp(b*sqrt(t)*epsilon)]
Recall that E[exp(X)] = exp(sigma^2/2) if X ~ N(0, sigma^2). I am omitting the derivation, which essentially involves the integral INTEGRAL[exp(X)*pdf(X) dX], where pdf(X) is the Gaussian density of X. Here X = b*sqrt(t)*epsilon has variance b^2*t, so E[exp(X)] = exp(1/2*b^2*t). This means
E[S_t] = S_0 * exp(a*t): the "volatility drag" 1/2*b^2*t is cancelled by the expectation of the cumulative random term. So there is NO drag. And therefore the fuzzy stochastic apple has the same average shape as the static apple.
In your case, S_t = S_0*exp[sigma*W_t - 1/2*sigma^2*t], so your sigma corresponds to my b, and a = 0 in my equation. Therefore your discrete dS_t has mean 0, which is not comparable to your ODE case, where there is a nonzero drift.
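The expectation computed in the comment above can be checked by sampling. This is a minimal sketch using only the standard library; the parameter values, the seed, and the sample size are assumptions chosen for illustration and do not come from the thread.

```python
import math
import random
import statistics

S0, A, B, T = 1.0, 0.05, 0.2, 1.0   # illustrative values, not taken from the thread

rng = random.Random(7)

# Sample S_T = S0 * exp((a - b^2/2) * T + b * sqrt(T) * eps), with eps ~ N(0, 1).
terminal = [
    S0 * math.exp((A - 0.5 * B**2) * T + B * math.sqrt(T) * rng.gauss(0.0, 1.0))
    for _ in range(200_000)
]

sample_mean = statistics.mean(terminal)
claimed_mean = S0 * math.exp(A * T)   # the comment's result: E[S_T] = S0 * exp(a * T)

print(sample_mean, claimed_mean)
```

The sample mean should come out close to S0*exp(a*T), which is what the cancellation of the 1/2*b^2*t term predicts.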
It is embarrassing to explain a mathematical concept with Excel. Probably your students are from kindergarten.
Thanks for the comment, but I disagree, for two reasons. First, I think that mathematical concepts, when used in econ, are much more valuable if one has a strong intuition. Second, real price movements are discrete anyway, so I think there is nothing wrong in having a discrete example, even if it is only programmed in Excel.
But also, the video is not for our university students. The topic came up at a Xmas party, where someone (a pretty smart someone, actually) said that he never got an intuition for what Ito is about. So I felt challenged and tried to do a video with a lot of intuition in it. This video is not for the 5% for whom Ito is kindergarten, but for the 15% for whom it is within reach but still a little complicated.
My favorite TH-cam shows are MinutePhysics, Numberphile, Veritasium and the like, so I tried to do something in that direction (but without the budget).
If you are good with MATLAB or Python, you can quickly run a "simulation" and explain complex concepts. The benefit of Excel is that you can tweak a value in real time and see its immediate impact without having to rerun a program. You won't use Excel to price options or to evaluate a strategy. You should use Excel to teach a fundamental mathematical concept.
This video made Steven Shreve's first 4 chapters come to life. Amazing. Thank you for taking the time to explain a complex concept visually and practically. Thank you.
@@mehdiAbderezai I agree. Also, it should be a program that everyone knows, and that applies to Excel more than to Python or MATLAB, I believe. I personally use Mathematica... Thanks!!
Being able to explain a complex mathematical concept with the simplest tools is a sign of expertise and shows that one truly understands the inner workings of the concept. I don’t see a point of stating such a comment but to make yourself look “good”. This video is extremely well executed and provides more insight than most other videos. Good job to the Professor!
Let's see your explanation @FernandoAMarroquin