Hi, I am able to understand the content except this part. Based on the example, when a single line of the 11 MB file is exploded we get multiple rows of 11 MB, so the size grows huge. But we still have around 60 MB of execution memory (around 90 MB minus the 30 MB of cached memory). So even if the data size gets bigger, it can spill to disk, right? Why are we receiving an OOM? Can you please explain this part?
Hello! To keep it simple: runtime computations are stored in memory. Since the runtime object gets multiplied because of explode, it will not fit in memory, running into an OOM. In the other case, when we try to read a bigger partition, Spark knows that it will not fit in memory, so it spills it to disk before running computations on top of it. But once data is brought into memory and then blows up because of a computation like explode, that can cause an issue. I tried to keep this simple, as it is not easy to understand the first time. Don't forget to Like and share with your network over LinkedIn 💓
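To make the size blow-up concrete, here is a minimal back-of-the-envelope sketch (plain Python, with hypothetical numbers: an ~11 MB payload per row and an array of 6 elements). When Spark's `explode` runs, each output row repeats the non-array columns, so a single row that fit comfortably in memory can multiply past the executor's execution-memory budget *after* it has already been loaded, which is why spilling no longer saves us:

```python
# Hypothetical sizes, for illustration only.
payload_mb = 11        # size of the non-array columns in one input row (MB)
array_elements = 6     # length of the array column being exploded
execution_memory_mb = 60  # rough execution memory left (90 MB - 30 MB cached)

# Before explode: a single ~11 MB row. Spark can plan around this size,
# spilling the partition to disk before computing if it won't fit.
size_before = payload_mb

# After explode: one output row per array element, each copying the payload.
# This multiplication happens *during* the computation, once the row is
# already in memory, so there is no chance to spill it first.
size_after = payload_mb * array_elements

print(size_before)                         # 11 MB going in
print(size_after)                          # 66 MB coming out
print(size_after > execution_memory_mb)    # exceeds the remaining budget
```

Of course the real numbers depend on serialization format and overhead; the point is only that the multiplication happens mid-computation, after the spill-before-compute decision has already been made.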
Amazing. You deserve to get subscribed... Keep it coming! 😀
Thank you! Don't forget to share this with your network over LinkedIn ♻️
@@easewithdata Thanks for your reply.
Brother, awesome video ❤
Subscribed🎉
Don't forget to repost this with your friends as well on LinkedIn ♻️
Brother, is the Databricks series complete?
No, that is in progress. Both the Spark and Databricks series will run in parallel.
@easewithdata Still, how many videos are left?
@@moyeenshaikh4378 For Databricks? Around 10.
❤