Hi @dataengineering, as you said the number of tasks created is equal to the number of blocks. In this example the data size is small, and the default HDFS block size is 128 MB, right? Then why is it split into two blocks?
One more doubt: if the data is in two blocks and during spark-submit we provide one executor with 5 cores per executor, will only two tasks get created at that time, with the other three cores sitting idle? Or how does it work? Please explain.
Amazing explanation . Thanks
Excellent Explanation sir
Please upload videos on broadcast variables and accumulators in Spark
Nice explanation
Bro.. you explained it superbly, bro!
🙂❤️
Simple and detailed explanation, bro!!! Hash partitioning: partition = hash(key) % numPartitions
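The formula in the comment above can be sketched in a few lines of plain Python (a minimal illustration of hash partitioning, not Spark's internal implementation; the function name `hash_partition` is made up for this example):

```python
# Hash partitioning sketch: partition index = hash(key) % numPartitions.
# Python's hash(n) == n for small ints, so the grouping is deterministic here.

def hash_partition(key, num_partitions):
    """Return the partition index a key is routed to."""
    return hash(key) % num_partitions

keys = [1, 2, 3, 4, 5, 6]
partitions = {p: [] for p in range(2)}
for k in keys:
    partitions[hash_partition(k, 2)].append(k)

print(partitions)  # {0: [2, 4, 6], 1: [1, 3, 5]} -- evens to 0, odds to 1
```

All records with the same key hash to the same partition, which is exactly why a groupBy/reduceByKey can process each key on a single task.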
Excellent :)
Sir, you should put this video in the Tamil playlist. :) Thanks for the very good videos with very crisp and clear explanations.
Boss, you explained clearly what parallelism and grouping are with the state exam result example.
Do a video on joins: broadcast, shuffle hash, sort merge
What if 'hi' also hashes to an even number (since that is decided by Spark internally)? In that case, will part-00001 be empty?
Data analytics vs data engineering: which one has more jobs, bro?