Top Big Data Interview Questions asked in 2024 | Cloud Data Engineer | Azure | Spark | SQL
- Published on 5 Feb 2025
- To enhance your career as a Cloud Data Engineer, check trendytech.in/... for curated courses developed by me.
I have trained more than 20,000 professionals in the field of Data Engineering in the last 5 years.
Want to master SQL? Learn SQL the right way through the most sought-after course - SQL Champions Program!
"An 8-week program designed to help you crack the interviews of top product-based companies by developing a thought process and an approach to solve an unseen problem."
Here is how you can register for the program -
Registration Link (Course Access from India) : rzp.io/l/SQLINR
Registration Link (Course Access from outside India) : rzp.io/l/SQLUSD
BIG DATA INTERVIEW SERIES
This mock interview series is a community initiative launched under the Data Engineers Club to aid the community's growth and development.
Our highly experienced guest interviewer, Ganesh Ramdas Kudale, / ganesh-kudale-50bb14ab shares invaluable insights and practical guidance drawn from his extensive expertise in the Big Data Domain.
Our expert guest interviewee, Prithvi Salve, / prithvi-salve-45545a1ba has an interesting approach to answering the interview questions on Apache Spark, SQL and Azure Cloud Services.
Links to the free SQL & Python series developed by me are given below -
SQL Playlist - • SQL tutorial for every...
Python Playlist - • Complete Python By Sum...
Don't miss out - Subscribe to the channel for more such informative interviews and unlock the secrets to success in this thriving field!
Social Media Links :
LinkedIn - / bigdatabysumit
Twitter - / bigdatasumit
Instagram - / bigdatabysumit
Student Testimonials - trendytech.in/...
TIMESTAMPS : Questions Discussed
01:00 Introduction
01:47 What is Hadoop and how does it work?
03:09 Why move from MapReduce to Spark?
05:07 Does Spark provide storage?
05:47 Give a high-level explanation of Spark.
06:50 Why switch from RDDs to DataFrames in Spark?
07:53 Which languages does Spark support?
08:27 What are RDDs and their importance?
09:47 What happens during actions/transformations in Spark?
11:15 Explain Spark architecture.
13:06 What are deployment modes and their use cases?
14:30 Describe the plans created when executing a Spark job.
16:00 What is a predicate push down?
18:10 Explain jobs, stages, and tasks in Spark.
19:10 What are the types of transformations in Spark?
20:38 Difference between repartition and coalesce?
23:30 Should you infer schema or specify it when creating a DataFrame?
24:19 What are the ways to enforce schema? Provide an example.
24:54 SQL coding questions
41:09 Which Azure cloud services have you used?
41:35 Explain Databricks architecture at a high level.
42:40 How do you run SQL queries in Databricks?
44:10 How can one notebook run another in Databricks?
45:35 Can you use parameters when running Databricks notebooks?
46:07 Difference between Data Lake and Delta Lake? Pros and cons of each.
48:11 What activities are available in ADF?
49:09 Scenario-Based question
Music track: Retro by Chill Pulse
Source: freetouse.com/...
Background Music for Video (Free)
Tags
#mockinterview #bigdata #career #dataengineering #data #datascience #dataanalysis #productbasedcompanies #interviewquestions #apachespark #google #interview #faang #companies #amazon #walmart #flipkart #microsoft #azure #databricks #jobs
The guy answered very well! Got a good idea of what to say and what to avoid during an interview.
This is awesome. Literally, every concept from Spark is covered. A must watch interview.
At 40:46 he applied lead() on the result column, which is wrong. It should be on the date column.
Though it is a mock interview, I appreciate his calm and pleasant responses to all the questions!
Whenever a transformation is applied, it does not create a DAG; rather, it creates a lineage between RDDs, and the action creates the DAG.
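The distinction drawn in the comment above, that transformations only record lineage and nothing runs until an action is called, can be illustrated with a toy model in plain Python. This is a minimal sketch and not Spark's API or implementation: the class `LazyDataset` and its methods are hypothetical names made up for illustration, and real Spark additionally builds stages and tasks from the lineage when the action fires.

```python
class LazyDataset:
    """Toy stand-in for an RDD (hypothetical, not a Spark class)."""

    def __init__(self, source, lineage=()):
        self._source = source
        # Recorded steps as (name, function) pairs; nothing is executed yet.
        self._lineage = lineage

    def map(self, fn):
        # Transformation: returns a new dataset with one more recorded step.
        return LazyDataset(self._source, self._lineage + (("map", fn),))

    def filter(self, fn):
        # Transformation: again only extends the recorded lineage.
        return LazyDataset(self._source, self._lineage + (("filter", fn),))

    def lineage(self):
        return [name for name, _ in self._lineage]

    def collect(self):
        # Action: only now are the recorded steps applied to the data.
        rows = iter(self._source)
        for name, fn in self._lineage:
            rows = map(fn, rows) if name == "map" else filter(fn, rows)
        return list(rows)


calls = []  # records every element the map function actually touches


def double(x):
    calls.append(x)
    return x * 2


ds = LazyDataset([1, 2, 3, 4]).map(double).filter(lambda x: x > 4)
executed_before_action = len(calls)  # still 0: only lineage exists so far
result = ds.collect()                # the action triggers the real work
```

After `collect()` runs, `calls` holds all four input elements, while before the action it was empty even though two transformations had already been chained.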
Great! This is very useful for anyone who wants to become a data engineer
Thanks for the videos.
It's very helpful!
16:53
The broadcast join is decided on the go, at run time, by Adaptive Query Execution, not by the Spark SQL engine or the Catalyst optimizer as was said.
He is always looking at his left side. xD
He must have kept the answers written down there.
keep up the good work !
Great Initiative Sumit...Kudos to both the interviewer and the candidate conducting such an outstanding session.
He answered most of the questions to the point. Very good.
Great answers!
Java is used in Hadoop
Bound to work on MapReduce
MapReduce can only do batch processing, not real-time
Continue this series
Well scored.
Sir pls provide the questions in description
The million dollar question is...."Is he selected"..??? and how did he do in the 2nd round..??..2nd round questions please..
This is a demo Q&A, just for our understanding of what questions are asked in a DE interview.
btw he got selected in Deloitte with 120% hike
cheers 🎉
If he doesn't get selected after knowing this much..feeling sad for the recruiter
basically, well interview
Good explanation men😅
Bro has a PhD in spark..❤
👏👏👏👏
Spark Core - RDD (flexible)
High-level APIs -
DataFrame and Spark SQL (easy to write queries)
Transformations and actions
Spark submit process
Deployment modes
Types of transformations
Repartition and coalesce
Methods for schema enforcement - DDL, struct
Consecutive wins in SQL
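The exact "consecutive wins" problem from the video is not reproduced on this page, so the following is only a sketch of one common way to approach that kind of question. It assumes a hypothetical table `matches(player, match_date, result)` with made-up rows, and uses Python's built-in `sqlite3` module (with a SQLite build that supports window functions) instead of Spark SQL. The idea is the usual row-number difference trick: within one player, ordered by date, the difference between the overall row number and the row number within the same result stays constant across an unbroken streak.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and sample data, invented for illustration only.
conn.execute("CREATE TABLE matches (player TEXT, match_date TEXT, result TEXT)")
conn.executemany(
    "INSERT INTO matches VALUES (?, ?, ?)",
    [
        ("A", "2024-01-01", "W"),
        ("A", "2024-01-02", "W"),
        ("A", "2024-01-03", "L"),
        ("A", "2024-01-04", "W"),
        ("A", "2024-01-05", "W"),
        ("A", "2024-01-06", "W"),
        ("B", "2024-01-01", "L"),
        ("B", "2024-01-02", "W"),
    ],
)

QUERY = """
WITH numbered AS (
    SELECT player, match_date, result,
           -- Both windows are ordered by the date column; the difference
           -- is constant for rows that belong to the same unbroken streak.
           ROW_NUMBER() OVER (PARTITION BY player ORDER BY match_date)
         - ROW_NUMBER() OVER (PARTITION BY player, result ORDER BY match_date)
           AS streak_id
    FROM matches
),
streaks AS (
    SELECT player, streak_id, COUNT(*) AS streak_len
    FROM numbered
    WHERE result = 'W'
    GROUP BY player, streak_id
)
SELECT player, MAX(streak_len) AS longest_win_streak
FROM streaks
GROUP BY player
ORDER BY player
"""

longest_streaks = conn.execute(QUERY).fetchall()
conn.close()
```

With the sample rows above, player A's wins split into a streak of two and a streak of three, so the query reports 3 for A and 1 for B. The same query shape should carry over to Spark SQL, though that is not verified here.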