Spark Architecture in 3 minutes | Spark components | How Spark works
- Published on Apr 10, 2021
- Spark is one of the most prominent and widely used processing frameworks in the Big Data world. This video explains the core components and architecture of Spark with a real-world example in just 3 minutes.
- Science & Technology
Finally it got clear to me after reading here and there. Thank you.
Ma'am you taught amazingly 😍😍 a very time-efficient lecture, but perfect... Keep it up
I haven't seen such a lucid way of explaining this complex a concept. Great work!
Thanks Bitthal
Excellent way of explaining things in a most simplified manner.
Looking forward to more videos on Spark.
Thanks Satish
very simple and nice explanation. Thank you for posting this video
Example was very good for beginners
This is the only video on TH-cam which clarified my doubts. Thanks!!
Thanks showbhik
Thanks showbhik
Great video, I have learned a lot.
Thank you
finally, a video that simplifies spark, amazing, keep the videos coming please!!
Thanks
Simple example and easy way of explaining an important concept.. thanks!
Thanks seemanthini
This video is better than 1 hour course on spark . Thanks
Thanks puneet
This made it far easier to understand the Spark architecture. Thank you, ma'am, you are great
Thanks Ashish
Thank you ma'am..👍
Lot of information presented in simple way for everyone to understand. 👍
Thanks sriganesh
Very nice video and it covers everything related to the Spark architecture in just 5 minutes. Keep sharing new videos.
Thanks sandeep
Great video
Very Informative, with full of clarity, Thank you.
thanks
Vividly explained. Thanks mam
Thanks saurabh
Nice explanation.. Pls keep videos 🎥 like this
Very clearly explained, really appreciate all your efforts
Thanks Dhruv
way of explanation is .....just amazing
Thanks
Thank you so.................. much for this
Just woow, very simple explanation of a complex cluster overview..
Thanks.
Thanks
Amazing!
Thank you mam for excellent way of teaching Spark.
Thanks
Simple and easy to understand. Thanks. I like doodle way of explaining concepts :)
Thanks sheik
Great, nice explanation
just love the way u explained it ;) Really appreciated mam.
Thanks minakshee
I wish I could give 1000 likes. You’re an excellent teacher!
Thanks
the best explanation ever great work !
Thanks
one of the bests
Thanks vikas
Excellent example 👏
Thanks
Beautifully explained short video 👏👏
thanks lakshmikanth
Amazing content ! Really appreciate the understanding and approach to explain...looking fwd to more
Thanks Aditya
Extraordinary mam
Thanks harika
It's the best video I've seen so far on spark architecture..awesome..keep going..
Thanks
Your videos really helped me clear my interviews and get the job; now I need some support for my new job. Could you post videos on how PySpark class objects work in the backend, how parquet/CSV reading and writing works in a distributed environment (e.g., which data is read by each executor), how to do pagination of some order data, etc.?
If possible, share your LinkedIn profile URL or suggest some way to connect.
VERY HELPFUL BEST EXPLANATION
Thanks
Thank you for this video.
Thanks
Superb delivery
Thanks Hemanth
Extremely good explanation
Thanks mou
Very helpful; it would be great if you could take an example and illustrate how the data chunking happens
Thank you for such a goooood explanation :D
thanks sheereen
Nice explanation. For 1 GB of input data in a batch processing job, how can we decide the cluster size, the number of nodes, or the number of executors? Could you please explain? Thanks, ma'am
Saw this video.. content looks promising... great job
Thanks
Very Nice Explanation
Thanks Naresh
Great work ma'am
Thanks Shivratan
Thanks a lot mam
Thanks Sandy
Subscribed ✅
Very nice explanation,
Thanks Vidya
Amazing explanation mam 😊😊👍
Thanks ravi
Great explanation Ma'am, please add more videos and arrange them in sequence under a playlist
Thanks Chetan. yes there are more videos coming up. stay tuned
good explanation...
Thanks
Excelent video :)!
Thanks
This was marked to know that I'm here on 3/10/2023
How did you make such a good visual explanation? Which tool did you use to draw the sketches? Pls guide 🙏
Very nice 👍
Thanks
@@BigDataThoughts Have you also explained BigQuery in this style? One possible improvement for this video: the summary could be delivered more slowly. Please don't take this comment the wrong way; you did really well in the video. Excellent explanation.
Great video!
I have one question though. Is my understanding correct that each student who got a coin bag corresponds to how data is partitioned,
i.e. 1 student = 1 data partition?
And 1 data partition is operated on by 1 slot/task?
One executor has one core and 2 partitions are assigned, so they will execute one by one. My question is: if there are 10 tasks, will those tasks execute in parallel or sequentially at the partition level?
A task operates on a partition of data. Tasks do run in parallel. If you have multiple cores, you can specify how many cores an executor will use. The number of concurrent tasks an executor can run equals the cores assigned to it.
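To make the arithmetic in that answer concrete, here is a small pure-Python sketch (not Spark API, just the scheduling math) of how many sequential "waves" a set of tasks needs, given that each core runs one task at a time:

```python
import math

def task_waves(num_tasks: int, num_executors: int, cores_per_executor: int) -> int:
    """Return how many sequential 'waves' are needed to finish all tasks,
    given that each core runs exactly one task at a time."""
    concurrent_slots = num_executors * cores_per_executor
    return math.ceil(num_tasks / concurrent_slots)

# One executor with one core and 2 partitions: the 2 tasks run one by one.
print(task_waves(num_tasks=2, num_executors=1, cores_per_executor=1))   # 2

# 10 tasks on 2 executors with 5 cores each: all 10 run in parallel.
print(task_waves(num_tasks=10, num_executors=2, cores_per_executor=5))  # 1
```

So whether 10 tasks run fully in parallel or partly sequentially depends only on how many total cores (slots) the executors provide.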
Really informative, Shreya. One quick question: stages will run sequentially depending on the use case, and tasks will run in parallel?
Thanks Madhu. Stages may run sequentially or in parallel depending on whether they have dependencies. Typically a stage will have multiple tasks running in parallel, each on a different set of data but performing the same set of operations that the stage contains.
Nice video, really liked it. So you said one node can act as the driver. I want to know what the best practice is here. I usually submit jobs by SSHing into the master node (at least in GCP Dataproc) and then submitting the job. Should I consider my master node the driver? Is it right to do it that way?
To give an example with YARN: when you submit a Spark job in cluster mode, the container where the Application Master runs acts as the master node (driver), and the containers where the executor processes run the tasks are the worker nodes. When the job is submitted, spark-submit first calls the Resource Manager, which in turn starts the Application Master, and from there the driver takes over.
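A minimal spark-submit invocation for the cluster-mode scenario described above might look like this (the script name and resource sizes are hypothetical placeholders, not from the video):

```shell
# In cluster mode on YARN the driver runs inside the Application Master
# container on the cluster, not on the machine you run spark-submit from.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 4 \
  --executor-cores 2 \
  --executor-memory 4g \
  my_job.py
```

With `--deploy-mode client` instead, the driver would run on the submitting machine (e.g., the master node you SSH into), which is why the question above about Dataproc comes up.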
Ma'am, I have one question.
If Spark has to write data to a SQL database, and our data is spread across multiple worker nodes, is it the driver that establishes a single connection with the SQL DB, or is it the worker nodes that establish multiple parallel connections?
When Spark writes data to a SQL database, the driver coordinates the write, but the data does not flow through a single driver connection. Each task (one per partition) running on the worker nodes opens its own JDBC connection and writes its portion of the data, so multiple parallel connections are established from the executors.
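The per-partition connection pattern can be sketched in plain Python with sqlite3 standing in for the SQL database (this is a conceptual illustration, not Spark code; the partitions and table are made up):

```python
import sqlite3
import tempfile
import os

# Pretend these three lists are partitions living on different worker nodes.
partitions = [[(1, "a"), (2, "b")], [(3, "c")], [(4, "d"), (5, "e")]]

db_path = os.path.join(tempfile.mkdtemp(), "orders.db")

# Create the target table once (setup step, done before the parallel write).
with sqlite3.connect(db_path) as conn:
    conn.execute("CREATE TABLE orders (id INTEGER, val TEXT)")

# Each partition writer opens its OWN connection, the way each Spark
# task on an executor opens its own JDBC connection.
for partition in partitions:
    conn = sqlite3.connect(db_path)  # one connection per partition/task
    conn.executemany("INSERT INTO orders VALUES (?, ?)", partition)
    conn.commit()
    conn.close()

with sqlite3.connect(db_path) as conn:
    count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(count)  # 5 rows, written over three separate connections
```

In real Spark the writes also happen concurrently, which is why the number of partitions directly controls how many simultaneous connections the database sees.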
@@BigDataThoughts Thank You Mam for the elaborated explanation.
❤
Could you please explain more about partitions?
The dataset is divided into partitions, and each partition is the unit on which a task works. That is the input to a task.
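A tiny pure-Python sketch of that idea, with a made-up dataset and task (no Spark API involved): the data is split into partitions, the same task function runs once per partition, and the partial results are combined.

```python
# A toy dataset and a simple "task": count the even numbers.
data = list(range(1, 11))          # 1..10
num_partitions = 3

# Split the data into partitions, roughly as Spark would.
partitions = [data[i::num_partitions] for i in range(num_partitions)]

def task(partition):
    """The same operation runs once per partition; the partition is its input."""
    return sum(1 for x in partition if x % 2 == 0)

# Each task produces a partial result; the driver combines them.
partial_results = [task(p) for p in partitions]
total = sum(partial_results)
print(total)  # 5 even numbers in 1..10
```

In Spark the tasks would run in parallel on executors instead of in a loop, but the partition-as-task-input relationship is the same.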
Can you help me understand RDDs?
RDDs are Resilient Distributed Datasets, the lowest-level abstraction in Spark. Check this video - th-cam.com/video/DuWGBMF7ARc/w-d-xo.html
At the start of the video I was so happy seeing all the diagrams..
Later I got fully confused, felt it was complicated, and didn't understand it well 😢
Can you please turn on the subtitles? thank you
Can you make it slower to follow? I felt this was too fast to get to know the terms.
Good playlist for Spark
th-cam.com/play/PL1RS9FR9qIPEAtSWX3rKLVcRWoaBDqVBV.html
Thanks
Great Video....!! Appreciate your efforts!🎉
One question: where does a cluster manager fit into this architecture? What role does it play compared with the driver?
The cluster manager's job is to provide resources for job execution, e.g. YARN, Mesos, etc. The driver is the one controlling the overall job execution and deciding which executors take part in the job.
@@BigDataThoughts ohh okay!! Thank you!!
18/april/2024