rdd in spark | Lec-9
- Published on 3 Oct 2024
- In this video I have talked about RDD in Spark in great detail. Please follow the video entirely and ask your doubts in the comment section below.
Directly connect with me on:- topmate.io/man...
For more queries, reach out to me on the social media handles below.
Follow me on LinkedIn:- / manish-kumar-373b86176
Follow Me On Instagram:- / competitive_gyan1
Follow me on Facebook:- / manish12340
My Second Channel -- / @competitivegyan1
Interview series Playlist:- • Interview Questions an...
My Gear:-
Rode Mic:-- amzn.to/3RekC7a
Boya M1 Mic-- amzn.to/3uW0nnn
Wireless Mic:-- amzn.to/3TqLRhE
Tripod1 -- amzn.to/4avjyF4
Tripod2:-- amzn.to/46Y3QPu
camera1:-- amzn.to/3GIQlsE
camera2:-- amzn.to/46X190P
Pentab (Medium size):-- amzn.to/3RgMszQ (Recommended)
Pentab (Small size):-- amzn.to/3RpmIS0
Mobile:-- amzn.to/47Y8oa4 (You should definitely not buy this one)
Laptop -- amzn.to/3Ns5Okj
Mouse+keyboard combo -- amzn.to/3Ro6GYl
21 inch Monitor-- amzn.to/3TvCE7E
27 inch Monitor-- amzn.to/47QzXlA
iPad Pencil:-- amzn.to/4aiJxiG
iPad 9th Generation:-- amzn.to/470I11X
Boom Arm/Swing Arm:-- amzn.to/48eH2we
My PC Components:-
intel i7 Processor:-- amzn.to/47Svdfe
G.Skill RAM:-- amzn.to/47VFffI
Samsung SSD:-- amzn.to/3uVSE8W
WD blue HDD:-- amzn.to/47Y91QY
RTX 3060Ti Graphic card:- amzn.to/3tdLDjn
Gigabyte Motherboard:-- amzn.to/3RFUTGl
O11 Dynamic Cabinet:-- amzn.to/4avkgSK
Liquid cooler:-- amzn.to/472S8mS
Antec Prizm FAN:-- amzn.to/48ey4Pj
This guy is underrated in the field of Data Engineering!! Crisp & clear explanation of RDD
Man, "How to and what to" point you mentioned in the video was magical, cleared the whole picture
Best explanation. You are sent by God.
Best explanation of RDD I have found on YouTube, and no doubt one of the best teachers!
The way you teach is excellent and easy to follow.
very nice explanation !!
best one sir , you showed us the real process behind the scenes. Thanks
Please be consistent and keep posting videos. I am depending on you to switch my career to data engineering within 2 months.
Hi Manish,
Your lectures are amazing and full of knowledge. Thanks a lot for uploading such beautiful conceptual videos.
very well explained
Everywhere else RDD had the same definition but no proper explanation. Thank you, sir.
Brilliant..!! Thank you so much for your efforts in teaching
Love you sir thank you so much......❤❤❤ You teach very well
Brilliantly explained, brother, I really enjoyed it!! Thank you
Best YouTube channel....for learning 😊
hard concept but you made a very easy explanation.👍
Thank you sir for best tutorial ❤
Best explained; your way of teaching is very clear. Thanks for providing such content on YouTube. I have one request: could you please create a video on the difference between RDDs, DataFrames and Datasets?
Earlier, under the features of RDD, you listed optimization, and under the disadvantages you listed no optimization.
What does that mean?
Also, please clarify the point that we should use RDD when we want full control over our data.
at 12:15, I think it should be age < 18 for RDD1 and then again age < 10 for RDD2 🙂
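The chained filters in the correction above can be sketched in plain Python. This is only an illustration: the ages are made-up sample data, and list comprehensions stand in for RDD `filter` calls. The names `rdd1` and `rdd2` follow the comment; the thresholds are the ones the commenter suggests.

```python
# Hypothetical sample data; list comprehensions stand in for RDD filter calls.
ages = [5, 9, 12, 17, 25, 40]

rdd1 = [age for age in ages if age < 18]  # first filter: age < 18
rdd2 = [age for age in rdd1 if age < 10]  # second filter, applied to rdd1: age < 10

print(rdd1)  # [5, 9, 12, 17]
print(rdd2)  # [5, 9]
print(ages)  # the source list is left unchanged
```

Each step produces a new collection instead of modifying the previous one, which mirrors the fact that RDDs are immutable.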
excellent content #Love you Manish Bhai
really amazing explanation, thanks :)
Hi Manish, your explanation is very good. Keep teaching and inspiring us always!!! Do we deal with unstructured data on a frequent basis? If yes, is it always through RDD? If possible, can you give some examples or make a video on it?
Nice explanation sir
well explained
nice explanation.
Another good one, @MANISH KUMAR!! 🙌 By any chance, do you still work with them at Jio? 😅😅 Also, can you share what type of architecture are you working with along with data volume?
Keep continuing this series! I'm preparing for #spark for my next switch with your series!! :) Thanks a lot again on behalf of all DEs!!! 👍👏👏
Hi @sankuM, it has been 8 months. Can I ask about your progress and how your switch has been going?
Hey @nitilpoddar, it is going well! I switched in May last year.
Knowing the fundamentals and their optimizations really helps in day-to-day work, I feel now!!!
Hi Manish, when is the RDD originally created? Is it created just after JVM code is generated by spark engine? Thanks for all the work you do.
Waiting for the next video !!
Q. When you say 500MB/128MB = 4 partitions, does this mean that each partition will be stored on a separate node?
Yes, this partitioned data is distributed over the network.
@manish_kumar_1 Each node may have a different capacity, right? It is not necessary that each node should accommodate only one partition. Say each node has a capacity of 300 MB; in that case it can accommodate 2 partitions of 128 MB each. Please confirm?
@adityaverma4770 Yes, a single node can execute two partitions as well, but the executor should have multiple cores. If it has only 1 core, then that executor will execute only one partition at a time. Hope this helps.
A node and an executor are different things. A single node can host multiple executors, depending on the node's capacity.
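The arithmetic in this thread can be sketched in plain Python. This is a simplified illustration, not Spark's API: the function names `partition_count` and `waves_needed` are hypothetical, the 128 MB split size is the figure quoted in the question, and the model assumes one partition per core at a time, as the reply above describes.

```python
import math

def partition_count(file_size_mb: float, split_size_mb: float = 128) -> int:
    """Number of partitions when a file is cut into fixed-size chunks (rounded up)."""
    return math.ceil(file_size_mb / split_size_mb)

def waves_needed(partitions: int, executors: int, cores_per_executor: int) -> int:
    """Rounds of parallel work needed, assuming one partition per core at a time."""
    slots = executors * cores_per_executor
    return math.ceil(partitions / slots)

parts = partition_count(500)  # 500 / 128 is about 3.9, which rounds up to 4
print(parts)
print(waves_needed(parts, executors=2, cores_per_executor=1))  # 2 slots for 4 partitions
print(waves_needed(parts, executors=2, cores_per_executor=2))  # 4 slots for 4 partitions
```

With single-core executors the four partitions need two rounds; with two cores per executor they can all run at once. Real partition sizing also depends on Spark configuration and the file format, which this sketch ignores.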
I started watching the lectures a week ago, and starting today I am going to complete the series within 10 days. The target starts today, so please support me in this. I know I am a slow learner, but I can do it 🙂 Counting starts from today's date, 20/04/24.
There is no bravery in finishing this quickly. Take your time and try to understand it in great depth.
@manish_kumar_1 OK, and thank you
Hello Manish.. could I get the link to the practical videos?!
Thanks for sharing valuable information,👍
Whenever a new RDD is created from an existing RDD, will the new RDD be created on the same cluster or on a different one?
On same cluster
That means every single DataFrame will eventually be transformed into an RDD, right?
Yes
In previous videos, you mentioned that a DAG is created only when Spark encounters an action (Job --> DAG). In this video you mentioned that a DAG is created after every transformation. Is the DAG built as Spark scans each line of code, or is it created only when an action is encountered?
DAG is created when an action is encountered.
Love you guru ji
Here, DataFrame code means PySpark code, right???
@manish_kumar_1 brother, can you explain it?
Q1. DataFrames in Spark are built on top of RDDs, but they use a more efficient execution engine called Tungsten.
Q2. RDDs don't have compile-time checking. Conversely, DataFrames have compile-time and runtime checking.
Thanks for the explanation, Manish.
One doubt: when we apply multiple transformations and generate new RDDs, are all the previous RDDs stored in memory until an action is called?
Thanks in advance
All of Spark works on the principle of lazy evaluation. Essentially, that means no data is processed until an output activity is called on it. So if you call 1000 transformations but never call an action, which is an output activity of sorts, nothing will be executed. That means Spark won't have to create or store any RDD anywhere.
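The idea in this reply can be sketched with a toy model in plain Python. This is not Spark's implementation or API: `LazyDataset` is a hypothetical class that only records transformations and applies them when an action (`collect` here) is called. The `calls` list is there only to show when work actually happens.

```python
from typing import Any, Callable, Iterable, List, Tuple

class LazyDataset:
    """Toy stand-in for an RDD: records transformations, runs them only on an action."""

    def __init__(self, source: Iterable[Any], steps: Tuple = ()):
        self._source = source
        self._steps = steps  # the recorded plan; nothing has been executed yet

    def map(self, fn: Callable[[Any], Any]) -> "LazyDataset":
        # Transformation: returns a new object with one more recorded step.
        return LazyDataset(self._source, self._steps + (("map", fn),))

    def filter(self, pred: Callable[[Any], bool]) -> "LazyDataset":
        # Transformation: also only recorded, not executed.
        return LazyDataset(self._source, self._steps + (("filter", pred),))

    def collect(self) -> List[Any]:
        # Action: only now is the recorded plan applied to the data.
        data = iter(self._source)
        for kind, fn in self._steps:
            data = map(fn, data) if kind == "map" else filter(fn, data)
        return list(data)

calls = []  # records every element the map function actually touches

def double(x):
    calls.append(x)
    return x * 2

ds = LazyDataset(range(5)).map(double).filter(lambda x: x > 4)
print(len(calls))  # 0: two transformations recorded, no data processed
result = ds.collect()
print(result)      # [6, 8]
print(len(calls))  # 5: the work happened only when the action ran
```

No intermediate dataset is stored between the `map` and the `filter`; only the plan is kept until the action runs.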
done👍
🎉👍
Nice explanation sir.
The red color is not clearly visible; it would be better to use another color.
Ok. Thanks
Guru ji good morning
👍👍👍
What's your source of learning Spark? Btw explanation is good 👍
The book Spark: The Definitive Guide,
Blogs,
YouTube videos,
Company real time projects
done
Directly connect with me on:- topmate.io/manish_kumar25
Which app are you using for making notes?
OneNote
Guru ji. Everyone, please like and subscribe generously
Brother, please hit like after watching
very well explained