I use these concepts when working on performance. The project stakeholders are always amazed how much better our systems respond when I integrate those changes. These lectures are literal career makers :)
Great to hear this
Can you tell us about your real-world example?
@@sucmanh I can't get into too much detail, but basically the application I worked on involved a weighted graph that was computationally expensive to generate. Initially, caching results gave some improvement, but that quickly ate up whatever memory the client had left in their systems (we observed this when the hard disk started paging). So I went with a dynamic approach using some of the techniques outlined in this talk: I had to completely rethink the problem statement and model it in a way that would benefit from a dynamic algorithm (mostly by defining the main problem as recursive sub-problems). The end result let me skip entire sets of sub-problems and discard the cached results when moving to the next major subgraph. It doesn't exactly follow what's outlined in this lecture, as the implementation was really specific to the problem.
The moment he defined the bowling recurrence with those three choices in such a simple and clever way, a huge aha moment happened! Demaine is insane! o/
Great to see a new version of 6.006. I love it. The recording is much smoother and the quality is better.
It's great to see both lectures a decade apart. They both have interesting nuances.
They just re-uploaded it
@@cutiepets664 no. this lecture is from when he taught the class in 2020
Agreed, I have watched both a few times. I'd watch pretty much any course he lectured.
@@at0mly are there any differences? is one better than the other?
I love when Erik writes on the board and explains.
I really appreciate institutions like this for uploading such good lectures for free.
Pure gold. In MIT we trust.
For my own reference:
0:00 intro, DP = SRTBOT + memoization
3:05 SRTBOT - recursive algo design paradigm to solve diverse range of problems
5:58 merge-sort algo as SRTBOT
8:41 solving fibonacci with SRTBOT, reusing our calculation
13:43 memoization (top down and bottom up approach)
26:57 solving DAG shortest path with DP (DP is like DFS in that it builds from small problems to large; in other words, large problems need the results of smaller problems)
37:00 bowling problem (a toy problem in which we use the power of DP)
52:05 bottom-up DP, intuition of DP as local brute force
Pure gold from a giant, withstanding the test of time for about 10 years. Really appreciate the sharing of this invaluable resource with humankind.
Man I love hearing him talk! So brilliant and clear
1:15 In general, this class-- all of algorithms-- is about RECURSIVE algorithm design at some level, cuz we want to write constant-sized pieces of code that solve problems of arbitrary size.
Love this guy's lectures. Interesting parts in this especially the tricks to design the DP sub-problem. Really helpful
Wow! A new version of 6.006. Missing Victor a little bit. Erik is always brilliant! I get a deeper understanding of DP every time I watch this lecture!
Which one is better, the old one or the new one?
@@utkarsh_108 I prefer this one; it's more concise and clearer. And it's better if you read the lecture notes as well.
13:05 Erik: "Fibonacci growth is the golden ratio to the n... This is exponential growth. As we know especially in this time, exponential growth is bad." :D :"(
I always rewatch the OCW dynamic programming videos for my interview prep, and every time I re-remember how the recurrence relation works and how to find it, I get the same a-ha feeling and all the good brain chemicals that come with it :-)
1:33 after so many lectures, the camera cut blew my mind
There is a wonderful use of best-case time complexity in divide and conquer: n log n time to divide the whole array and generate the resulting sorted array, since we don't have to order the previous divide-and-conquer result with respect to a pivot element.
Thus the additive, summed logarithm comes out a little less than n log n in the calculation, but approximately that.
Hi Erik, you look great; you haven't aged since your last DP video 8 years ago.
Erik da Man!
Mind blown by the bottom-up version of the solution to the bowling problem!
You can use the topological order to write an iterative version
Today I found a resemblance between BST creation and the divide step of merge sort (the division of the array), with the same recursive ordering in the resulting arrays.
Thus we find a strong relation to the bipartite scenario created when quicksort or merge sort generates its arrays.
No wonder Americans drive the technology, when you have guys like Prof. Erik. Thanks MIT🔥
Grateful for this content!
Thank you MIT! Thank You Erik.
42:00 Could anybody please try this idea with reinforcement learning?
52:32 Add one more base case: V(n) = 0, or B(n-1) = max(0, v(n-1)).
25:00 The triangular number isn't right, I think. It's 1+2+4+..., not 1+2+3+...
Why does O(ceil(n/w)*n) = O(n + n^2/w)? Where does the additive n come into play? Shouldn't it just be O(n^2/w)? I guess it doesn't matter, since n^2/w dominates anyway (assuming w is a constant or grows slower than n).
Ya, same question.
Approximately: ceil(n/w) is at most n/w + 1 (the worst case is when n/w is just above an integer). In general you always add at most a constant, and that +1, multiplied by n, is where the additive n term comes from in the big-O sense.
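For illustration, spelling that out with the standard bound ceil(x) <= x + 1 (my own derivation, not from the lecture):

$\left\lceil \frac{n}{w} \right\rceil \cdot n \;\le\; \left(\frac{n}{w} + 1\right) \cdot n \;=\; \frac{n^2}{w} + n \;=\; O\!\left(n + \frac{n^2}{w}\right)$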
very good
thank you so much... Love from INDIA
Fantastic lecture, thank you. I'm not sure, but I think there's a mistake in the recitation code for bowling: if you index v[i+1] when you're at the very last item, you get an out-of-bounds error, and the base case doesn't check for this.
Basically, you can add a virtual pin with value 0 at the nth place for the boundary check.
@@haomintian6815 What would the dummy value be? 1 would screw up the addition, 0 would screw up the multiplication.
The out-of-bounds error can be resolved by providing a comprehensive base case: B = {}, B[len(v)] = 0, B[len(v) + 1] = 0
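For anyone who wants to see it end to end, here's a minimal top-down sketch with that comprehensive base case folded into one check (in C++ rather than the recitation's Python; bowl and memo are just names I made up):

#include <algorithm>
#include <climits>
#include <vector>

// B(i) = best score achievable from pins i..n-1; memo[i] caches it (INT_MIN means "not computed yet")
int bowl(int i, const std::vector<int>& v, std::vector<int>& memo) {
    int n = (int)v.size();
    if (i >= n) return 0;                       // comprehensive base case: covers i == n and i == n + 1
    if (memo[i] != INT_MIN) return memo[i];     // reuse the cached sub-problem
    int best = bowl(i + 1, v, memo);                        // skip pin i
    best = std::max(best, bowl(i + 1, v, memo) + v[i]);     // hit pin i alone
    if (i + 1 < n)                                          // guard, so v[i + 1] is never out of bounds
        best = std::max(best, bowl(i + 2, v, memo) + v[i] * v[i + 1]);  // hit pins i and i + 1 together
    return memo[i] = best;
}
// usage: std::vector<int> memo(v.size(), INT_MIN); int answer = bowl(0, v, memo);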
Does this course have any PDFs with notes and exercises?
Yes! You can view the complete course on OCW: ocw.mit.edu/6-006S20 (note that we always include the link to the course site in the video description). This course includes lecture notes as well as both assignments and exams with solutions.
@@mitocw thanks 😊
Does anyone know the link to the previous lecture where he talked about DAGs?
Erik Demaine the sexiest computer scientist alive?
What is n-bit addition? I didn't understand that particular thing. Does it refer to the total number of digits in the representation?
n-bit addition comes from how numbers are represented in a computer: a value with n bits doesn't fit in one machine word, so adding two such numbers takes about ceil(n/w) word operations on a machine with w-bit words (and the nth Fibonacci number has about n bits, which is why it shows up here). You'll see more detail once you take a computer architecture course.
Both Fibonacci and single-source shortest paths in a DAG can be solved by following the topological order, without recursion or memoization. It makes me wonder if every problem solved with DP could be solved without recursion or memoization, by exploiting the topological order of its sub-problems.
It's a bit misleading to say you can do it without memoization; for Fibonacci, at least, you're still keeping track of the previous two iterations, i.e. "whatever the last two elements of your memoized array would be". You can reduce it to two variables only because you can discard the rest of the "virtual" array you're tracking, since you never need anything before the last two elements.
@@KingBobXVI You are right, you still need to keep track of the results of some of the sub-problems, but not all of them.
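To make the two-variable idea concrete, a tiny sketch (my own illustration, not code from the lecture):

#include <cstdint>

// Fibonacci in topological order (smallest sub-problem first), keeping only
// the last two entries of the "virtual" memo array
std::uint64_t fib_iter(int n) {
    std::uint64_t prev = 0, curr = 1;       // F(0) and F(1)
    for (int i = 2; i <= n; ++i) {
        std::uint64_t next = prev + curr;   // F(i) = F(i-1) + F(i-2)
        prev = curr;                        // everything older than these two is discarded
        curr = next;
    }
    return n == 0 ? 0 : curr;
}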
Thanks. Does parallelism of recursion hurt the idea of memoization? Imagine in your recursion tree there is F(n-1) in two branches; since they are running in parallel, neither can use the value of the other, because when the program arrives at F(n-1) in the two branches, neither of them is done! Is there an explanation for that, please?
There is no parallelism in recursion; the call stack executes recursive calls in DFS order. In other words, just because the recursion tree has two sibling nodes that are both F(n-1) doesn't mean they will both be traversed (or called, since in this context nodes are functions) simultaneously. For example, in a binary tree, even though two nodes n1 and n2 might be siblings, they aren't traversed in parallel, i.e. at the same time: n1 is visited (say), then its descendants are visited, the control backtracks, and only then are its sibling n2 and n2's descendants visited. Therefore, by the time we visit n2, we already have n1 memoized, so we can reuse n1's value when we visit n2. (In your example, the two sibling F(n-1) call stacks are equivalent to my sibling nodes n1 and n2.)
@@apoorv219 Thank you a lot. So the main thing I didn't know is that recursion doesn't use parallelism. Is there a way to make it parallel? Because I think it would speed up computations.
@@anas.2k866 Yeah, maybe computer engineers can introduce parallelism into recursion to make computation faster using some mechanism with constraints (e.g. if n1 and n2 are different, visit them in parallel, but if they are the same, visit only one: set a flag for the node currently being visited, and once the flag clears, i.e. it's completed, visit n2. But since n can be arbitrary, how many flags would suffice for all/most cases? Things like that need to be considered, along with other concerns like the cost of such a CPU, etc.)
But yeah, in principle you are right: parallelism can speed up recursion for sure.
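To illustrate the DFS-order point above, here's a minimal memoized Fibonacci sketch (my own hypothetical code, not from the lecture): the second sibling call finds the answer already cached.

#include <cstdint>
#include <unordered_map>

std::unordered_map<int, std::uint64_t> table;    // the memo pad

std::uint64_t fib_memo(int n) {
    if (n < 2) return n;                         // base cases: F(0) = 0, F(1) = 1
    auto it = table.find(n);
    if (it != table.end()) return it->second;    // a sibling call that got here first already stored the answer
    std::uint64_t a = fib_memo(n - 1);           // runs to completion (DFS order) before the next call starts
    std::uint64_t b = fib_memo(n - 2);           // by now this is just a table lookup
    table[n] = a + b;
    return table[n];
}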
Question: is the proposed algorithm optimal when the pins are -1, -1, -9, 1? In the first step it will choose to hit the first and second pins, -1 and -1. In the second step the algorithm will choose to skip -9. In the third step the algorithm will hit 1. The overall score will be 2 if we use the algorithm. But the optimal solution is to hit the second and third pins, -1 and -9, and then hit the last, so the score would be 10. Or did I misunderstand something?
Why would it choose to skip -9? It will take the maximum between three possible games:
[[-1,-9]] points: 9
[[-1,-1]] points: 1
[[-1,-1],[-9]] points: -8
So the algorithm picks the first option.
@@JorgeLuis-ts6qp Thanks, I got it!
To walk through it myself, because the perceived "boundaries" are something that trips me up often (making a choice on a sub-problem before optimal cases on the boundary are considered):
Since the algorithm is essentially just `B = (i) => { max( B(i+1) , B(i + 1) + v[i] , B(i + 2) + v[i] * v[i + 1]) }` it resolves in reverse order, so the iterations would look like:
i = 3: max(0, 1, 0) = 1 ; B(4) and B(5) are empty so it's the base case of 0, the middle case (B(4) + v[i] = 0 + 1) is the largest so B(3) is 1.
i = 2: max(1, -8, -9) = 1 ; First case is always the previous max (B(i + 1), here B(3) = 1). Middle case is 1 + the current value (-9), so -8; third case is B(4) + v[2] * v[3] = 0 + (-9 * 1) = -9. So B(2) is 1.
i = 1: max(1, 0, 10) = 10 ; Middle value is previous plus current (1 + -1 = 0), third value is B(3) + v[1] * v[2], or 1 + (-1 * -9) = 10. It skips B(2) in the third case to avoid a double count.
i = 0: max(10, 9, 2) = 10 ; Middle is B(1) + v[0], or 10 + -1 = 9. Right is B(2) + v[0] * v[1], or 1 + (-1 * -1) = 2.
So the result is 10 because it ignored the first -1, hit the -1/-9 pair, and the solitary 1 at the end (and did it backwards). Written as-is, the plain recursion would take exponential time; what lets it be O(n) is the assumption that every time we calculate a value of B(i) we're caching it somewhere and reusing it. To do it in a loop without recursion would look something like:
// assumes #include <vector> and <algorithm>, and int n = v.size()
std::vector<int> B(n + 1, 0); // array with one extra element past the end to cover the base case, initialized to 0
for (int i = n - 1; i >= 0; --i) {
    auto values = { B[i + 1], B[i + 1] + v[i], (i == n - 1) ? 0 : (B[i + 2] + v[i] * v[i + 1]) }; // third case is guarded so we never access v[i + 1] (or B[i + 2]) in the (n-1)th case
    B[i] = *std::max_element(values.begin(), values.end()); // max_element returns an iterator, so dereference it
}
return B[0];
You could optimize for space complexity by removing the B vector, since each iteration only ever uses 2 memoized values, B[i + 1] and B[i + 2]; call them "Bi1" and "Bi2", to be uncreative I guess. Initialize both to 0 before the loop; in each iteration, compute the new value first (into a temp, using Bi1 and Bi2), and only then shift: `Bi2 = Bi1;` followed by `Bi1 = newValue;` (if you shifted first, the pair case would read the wrong entry). Then on the next iteration, Bi1 is what would have been B[i] at the end of the previous loop, and Bi2 has "shifted" one space over in the virtual array. The return value is then Bi1. This reduces the necessary memory to just 2 words, since you're only keeping the two entries of B that trail i instead of the whole, mostly unused array.
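Concretely, the rolling version would be something like (same assumptions as the loop above; std::max over a braced list needs <algorithm>):

int Bi1 = 0, Bi2 = 0;                   // stand-ins for B[i + 1] and B[i + 2]
for (int i = n - 1; i >= 0; --i) {
    int curr = std::max({ Bi1, Bi1 + v[i], (i == n - 1) ? 0 : (Bi2 + v[i] * v[i + 1]) });
    Bi2 = Bi1;                          // shift the two-word window only after computing curr
    Bi1 = curr;
}
return Bi1;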
Anyway, thanks for joining me for this impromptu episode of "if you don't quite understand it, explain it to someone else", lol.
great lecture!
thank you~😃
Perfect
Now I know where the Burst Balloons problem came from 😂
Miss Prof. Srini
Srini was deported back to his village by Trump.
correct me if I am wrong folks, but isn't this just like an expert writer saying to a kid "write exactly like I do, and you'll be rich and famous"
Sr. T-bot
How are these examples used in the real world?
This is a theoretical computer science course on algorithm analysis, design, etc. It doesn't have to have examples that are used in the real world. It just so happens that dynamic programming is a super important topic to be used "in the real world" as a problem-solving paradigm.
remember when you thought shit like this mattered... 😂
bad
So my discrete maths course wasn’t useless. DP is just strong induction 🥲 I feel free
Which topics of discrete math are used in this course? POV: I'm just overviewing the course rn
Slightly correct. Recursion is reverse induction, and this includes strong induction.
Dynamic programming is actually not related to programming; the word "programming" is a misnomer. It really means "optimization", much in the same way "linear programming" is all about optimizing linear systems and not actual programming. DP was developed for applications in control theory. The relation to recursion is an implementation detail, arising when you rewrite the corresponding local optimization as a recurrence relation. In fact, DP was originally derived bottom-up rather than top-down.