Spark Architecture | Lec-5
- Published on Apr 4, 2023
- In this video I have talked about Spark architecture in great detail. Please watch the video entirely and ask doubts in the comment section below.
Directly connect with me on:- topmate.io/manish_kumar25
For more queries reach out to me on my below social media handle.
Follow me on LinkedIn:- / manish-kumar-373b86176
Follow Me On Instagram:- / competitive_gyan1
Follow me on Facebook:- / manish12340
My Second Channel -- / @competitivegyan1
Interview series Playlist:- • Interview Questions an...
My Gear:-
Rode Mic:-- amzn.to/3RekC7a
Boya M1 Mic-- amzn.to/3uW0nnn
Wireless Mic:-- amzn.to/3TqLRhE
Tripod1 -- amzn.to/4avjyF4
Tripod2:-- amzn.to/46Y3QPu
camera1:-- amzn.to/3GIQlsE
camera2:-- amzn.to/46X190P
Pentab (Medium size):-- amzn.to/3RgMszQ (Recommended)
Pentab (Small size):-- amzn.to/3RpmIS0
Mobile:-- amzn.to/47Y8oa4 (You should definitely not buy this one)
Laptop -- amzn.to/3Ns5Okj
Mouse+keyboard combo -- amzn.to/3Ro6GYl
21 inch Monitor-- amzn.to/3TvCE7E
27 inch Monitor-- amzn.to/47QzXlA
iPad Pencil:-- amzn.to/4aiJxiG
iPad 9th Generation:-- amzn.to/470I11X
Boom Arm/Swing Arm:-- amzn.to/48eH2we
My PC Components:-
intel i7 Processor:-- amzn.to/47Svdfe
G.Skill RAM:-- amzn.to/47VFffI
Samsung SSD:-- amzn.to/3uVSE8W
WD blue HDD:-- amzn.to/47Y91QY
RTX 3060Ti Graphic card:- amzn.to/3tdLDjn
Gigabyte Motherboard:-- amzn.to/3RFUTGl
O11 Dynamic Cabinet:-- amzn.to/4avkgSK
Liquid cooler:-- amzn.to/472S8mS
Antec Prizm FAN:-- amzn.to/48ey4Pj
There is a saying, "If you can't explain it simply, you don't understand it well enough," and it fits so accurately here. You have understood it so well that you made it even easier for others. Thank you for all the hard work.
Sir, you have put your heart and soul into making these videos... they are very genuine videos... I pray that God grants you great success.
Please be consistent, don't leave it midway. I have 5 years of SQL development experience and will switch to the big data / Spark domain within 3 months. Please don't stop midway; you are making wonderful videos.
I won't
Thanks for this series
Did you switch? @engineerbaaniya4846
What fantastic teaching, Manish bhai! In future, if anyone comes to me for guidance on where to study from, I think I will refer them to your channel without any doubt.
Very detailed and layman explanation which no one else gives, keep it up
I think there is slight confusion between the AM (Application Master) and the driver program: 8:28
The AM launches the driver program within a container on a worker node.
The driver program communicates with the AM for resource allocation and task scheduling.
The AM acts as a bridge between the driver program and the cluster manager(YARN).
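Where the driver actually runs depends on the deploy mode chosen at submission time. As a minimal sketch (the script name `app.py` is a placeholder, not from the video):

```shell
# In YARN cluster mode the driver runs inside the Application Master's
# container on a worker node; in client mode the driver runs on the
# machine that invoked spark-submit, and the AM only negotiates resources.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  app.py
```

Switching `--deploy-mode cluster` to `--deploy-mode client` is what moves the driver out of the cluster, which is why the two descriptions above can both be correct.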
Literally mind-blown by your teaching! Awesome content
you are one of THE BEST TEACHERS I have ever known
Bro, you can be the CodeWithHarry of the data engineering world; keep this going, and thanks for sharing this knowledge.
Beautifully explained. Concepts are so much easier to understand with the help of diagram.
I have watched many tutorials on Spark but you are the best. The way you teach is amazing. Sir, please don't stop uploading tutorials like this. You are great, sir. Thank you. From Bangladesh
You are building my confidence in the subject. Thank you bhaiya.
You probably won’t see this. But I watched your videos 2 days before my DE interview and I cracked it with confidence. Like you said, the fundamentals make all the difference. My understanding was so clear that they offered me the position on the spot
You are a wonderful teacher. You have a gift. Please start a DE bootcamp. You’ll see great success with it I’m sure
The video summary at the end is very useful to recall everything from the video! Good thought, Manish...
explained wonderfully.
Thank you so much for this explanation, Please continue the good work
the flow of explanation and engagement were on point 💯
brilliantly explained. Loads of Thanks.
Please continue making videos like this with complete information... I appreciate your hard work. Even if it takes time, let it take time... the concepts should be clear... 😅
Right said.... Very detailed👏👏👍👍
Thank you, Manish bhai, for this wonderful video
So Helpful ! Really a Great Explanation !
Superb! Explanation
Salute for your hard work but hope in the next video you will come up with the practical too..
Crystal clear. Thanks a lot. 👏
Thank you Manish Bhai.... You're really doing a great work🙏🏻🙏🏻.... In this series please upload the videos a bit faster... 😊
Thank you,This is perfect.
Hi Manish, I watched this completely and understood it. But most of the time in interviews people ask about SparkContext and the other view of the architecture, which you did not cover. Any view on this?
Hi Manish, great explanation, I have one doubt-
Is it possible to add more than one executor on a worker node?
Asking because you demonstrated only one executor going to each worker node.
Killer stuff, Manish bhai. That was great fun.
Thank you, Manish. It was an absolutely crystal clear explanation. Hoping to get more in-depth videos like this.
Glad you liked it
thanks for the video manish
Bhai, you explained the concept in depth.
But I am still very confused about containers...
Even after repeating the video, it is not clear.
Wonderful explanation
That was great fun, bhai. Reminded me of Khan Sir 🙂
Thanks
very nice series
Stunning explanation bro 👍
God level explanation!
Thanks for the explanation, Manish. One quick question: here you created 5 executors on 5 different worker nodes. Is it possible that we can have more than 1 executor available on the same worker node / same machine?
Thanks in advance
Hi Manish, thank you very much for sharing great knowledge. I currently have 10.5 years of experience in IT, including SQL/PLSQL (7 years), SQL Server T-SQL (1.5 years), and Snowflake query optimization (6 months). Two years ago I joined an MNC as a Data Engineer (Spark with Scala), but they gave me a project on T-SQL. I only took trainings, searched interview questions, and cleared the interview. Now I am on the bench — what decision should I take? Please suggest.
The Spark code can be written in Scala itself, right? Will we need the application driver even if the code is written in Scala?
Great explanation bro👌👍.. It would be nice if you add subtitles.
Spark Architecture:
Whenever a job is initiated, the SparkContext is started within the SparkSession. It connects with the cluster manager to work out how many worker nodes (slaves) are required, and once the resources are allocated, the driver program (master) starts assigning tasks to the worker nodes. The executors are responsible for doing all the tasks, and intermediate results are stored in cache. All the worker nodes are connected with each other so that they can share data and logic with each other.
Very well explained 🤩
Great
PERFECT BEST ONE EVER
explained very well
I have a question: in the video, we wanted 5 executors with 25 GB RAM and 5 cores each, and for those 5 executors you used w2, w3, w4, w7, and w8. Now, all of them have 100 GB RAM and 20 cores.
Why can't we put 4 executors on a single machine? 4 x 25 = 100 GB, and 4 x 5 = 20 cores.
That way, our resources (executors, driver) would be spread across fewer machines. I don't know what benefits/drawbacks that might have; just curious why we can't do this.
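The packing arithmetic in that question can be checked mechanically — the number of executors that fit on a node is the smaller of the RAM ratio and the core ratio. A quick sketch using the numbers from the comment:

```shell
# How many 25 GB / 5-core executors fit on one 100 GB / 20-core worker node?
node_ram_gb=100; node_cores=20
exec_ram_gb=25;  exec_cores=5

by_ram=$(( node_ram_gb / exec_ram_gb ))     # limited by memory: 4
by_cores=$(( node_cores / exec_cores ))     # limited by cores:  4
fit=$(( by_ram < by_cores ? by_ram : by_cores ))

echo "$fit executors fit on one node"       # prints: 4 executors fit on one node
```

So 4 executors per machine is arithmetically possible; whether the cluster manager actually packs them that way depends on its scheduling policy and on what other jobs are already holding resources on that node.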
Hi @manish, I have two questions:
1) What is the difference between the cluster manager and the resource manager?
2) How does a developer specify requirements like RAM and cores?
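On the second question, resource requirements are typically passed as flags at submission time. A minimal sketch (the values and the script name `app.py` are illustrative, not from the video):

```shell
# The developer states RAM/core requirements via spark-submit flags;
# the resource manager then tries to satisfy them from the cluster.
spark-submit \
  --master yarn \
  --num-executors 5 \
  --executor-cores 5 \
  --executor-memory 25g \
  --driver-memory 4g \
  app.py
```

The same settings can also be supplied with `--conf spark.executor.memory=25g` style properties or in `spark-defaults.conf`.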
Great explanation! But I have a doubt regarding the driver. Will there be an extra worker node for the driver, or can it be on any of the nodes whose executors process the data? What I mean is: for instance, if we want to process 10 GB and after calculation we want 16 executors, then along with the driver will it be 17 containers, or am I missing something here?
Hi Manish. If code from the PySpark driver is getting converted into equivalent Java code, won't the UDFs also get converted?
If this is true, why do we need a Python worker again in the executor?
Hi Manish sir, if an interviewer asks about cluster size, how should I answer?
Hello Manish Kumar,
Hope you're doing well. Very well explained concept and a very good Spark series. Can you provide a PDF or link to the notes?
Hi Manish, I have a question: why can't the UDFs in PySpark be converted to Java code in the Application Master?
Thanks!
Hi Manish
I am learning Spark from your videos, but in this video I am a bit confused, because you are saying the driver is present on a worker node, yet the standard architecture diagram shows the driver present on the master.
Could you please clarify or elaborate on this?
good
Bhaiya, please cover the total syllabus; I am following your Spark series.
👍👍👍👍
Awesome video. Also, please share a playlist or course for SQL; would really appreciate it.
You can follow the kudvenkat YouTube channel for SQL
Hi Manish, very informative video.
I have one question: what exactly is an executor?
As per my understanding, it is responsible for executing tasks and has cores in it for processing.
Since each worker node has 20 cores, can I create an executor with any number of cores and any amount of memory?
You get some of the worker node's memory in the form of a container for your Spark job, and your executor runs inside that container with the memory you asked for. Say a worker node has 64 GB RAM and 16 CPU cores, and you only ask for 10 GB with 3 cores — then that is all you get. The remaining memory goes to other jobs.
When will the application driver stop working? Could you please explain again?
I have one doubt, please can anyone resolve it:
You said the PySpark driver is created in the Application Master only if we use a UDF (user-defined function). But we write our code in PySpark, and it is processed in a distributed way on the worker nodes. So even if I don't use any UDF, my code is still in PySpark — how do the worker nodes process the PySpark code when they have only a JVM and no Python worker?
Sir, if a node fails, what do we do? This was asked in an interview; please give me the answer — interviewers grill me a lot on this.
Hi Manish, I have one question to ask. I have seen some job descriptions mentioning Databricks. What does it mean when they say a candidate must know how to work on Databricks? What exactly do they mean by that, and what are the things one should know about Databricks?
Looking forward to your reply.
You should know how to work with Databricks. It's just a tool that you can learn very easily once you start using it.
Alright, thanks for the reply, Manish. Really appreciate your response.
What if I try to provision more executors than are available on my cluster?
Or what if I try to provision more RAM or CPU cores than the capacity of my executors?
Can you explain what would happen on a cluster? I think it is difficult to replicate locally.
You can try it locally too: ask for more than the available RAM in your system, and you will only get the memory that is available. If you ask for more memory than the hardware limit, you will not get it; you will be allocated the memory available in your cluster. If multiple jobs are already running, your job will wait in a queue until memory becomes available for the run. It runs in a FIFO manner.
Hello Manish, if we ask for more RAM or more cores than are available on a machine, what happens?
There will be resource wastage. And they won't give you extra resources, because RAM is very costly.
And one more question: how are the files brought in? I mean, the files would be lying in distributed fashion in the same cluster, so will the executor be created where the file is, or randomly?
Suppose the file abc.csv is on machines 4 and 5.
When we ask YARN for resources, will it create the executor containers only on 4 and 5? Or will they be created anywhere in the cluster, randomly?
done
Hello Brother,
I have a question: Spark is a distributed processing framework and is fault-tolerant. However, if the driver node fails, what happens?
You will have to re-run the job
Bro, please share the notes of this lecture in PDF format.
Is studying the Spark ecosystem necessary only for cracking interviews, or does it have any use in practical work too?
You should know it to understand the overall picture.
Great explanation
I have doubts: 1) What happens if we don't have 5 free workers in the cluster?
2) What if we have 5 free workers but they don't have enough CPU cores or memory that we requested?
Thank you, and waiting for your reply.
You will have to wait in queue. FIFO is applied by the resource manager
Bhai, the driver you said shuts down at the end — that would be the application driver, right?
And is there just the one application driver here, or is there some other driver on the master too?
One job has only one driver. And after the driver shuts down, the executors also shut down.
I'm following in 2024.
Can't 5 containers of 20 GB each be created on one node?
They can. That's exactly what I said in the video — containers are created based on the workload.
I did not understand the JVM main(). Since Spark supports Python, why is a JVM needed to submit a Spark application? Please explain elaborately.
Thanks for the wonderful session.
What exactly is the use of the JVM, since Spark supports Python for coding?
Spark is written in Java/Scala, so Spark by default does not understand Python. Think of it as having a language translator that changes the Python code into Java byte code, which Spark understands. Thus the Python code is converted to Java code first, and then the code is run.
Spark supports Python thanks to this translator.
Can anyone who understood this well explain it to me?
Bhai, are the theory and practical parts of the playlist finished, or is something left?
It is finished.
Hi, I'm following your videos and I need the PDF file; could you provide it?
I think you haven't watched the first video. I don't provide PDFs; you have to note things down yourself. That way, you will only benefit.
❤💌💯💢
So the driver is our Application Master?
No. The application driver that is created inside the Application Master's container — that is the driver.
@manish_kumar_1 thanks
I'll have to watch this again lol
Why, what happened?
Hi Manish, I need your LinkedIn profile link to connect with you. I need guidance.
Check the description. You can find all of my social media handle links there.