Capgemini Data Engineer Interview Question - Round 1 | Save Multiple Columns in the DataFrame |

GeekCoders

Views: 13,969


Published on 5 Sep 2024
Input:
data = [
    (1, "Sagar", 23, "Male", 68.0),
    (2, "Kim", 35, "Female", 90.2),
    (3, "Alex", 40, "Male", 79.1),
]
schema = "Id int,Name string,Age int,Gender string,Marks float"
df = spark.createDataFrame(data, schema)
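Solution:
from pyspark.sql.functions import col

# Distinct data types present in the DataFrame
set_of_dtypes = set(i[1] for i in df.dtypes)
for i in set_of_dtypes:
    # Collect the names of all columns that have this data type
    cols = []
    for j in df.dtypes:
        if i == j[1]:
            cols.append(j[0])
    # Save those columns to a folder named after the data type
    df.select(cols).write.mode('overwrite').save(f'/FileStore/tables/output_capegmini/{i}')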
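The nested loops above rescan df.dtypes once for every data type. A single-pass variant is sketched below, assuming only that df.dtypes is a list of (column name, type string) pairs, which is how PySpark reports it. The names group_columns_by_dtype, write_by_dtype and base_path are hypothetical and introduced here for illustration; they do not appear in the video.

```python
from collections import defaultdict


def group_columns_by_dtype(dtypes):
    """Map each type string to the column names that have it, in one pass."""
    groups = defaultdict(list)
    for name, dtype in dtypes:
        groups[dtype].append(name)
    return dict(groups)


def write_by_dtype(df, base_path):
    """Write one output folder per data type (hypothetical helper)."""
    for dtype, cols in group_columns_by_dtype(df.dtypes).items():
        df.select(cols).write.mode("overwrite").save(f"{base_path}/{dtype}")


# The (name, type) pairs that df.dtypes reports for the input schema above
sample_dtypes = [
    ("Id", "int"),
    ("Name", "string"),
    ("Age", "int"),
    ("Gender", "string"),
    ("Marks", "float"),
]
grouped = group_columns_by_dtype(sample_dtypes)
print(grouped)
```

With a live DataFrame, calling write_by_dtype(df, '/FileStore/tables/output_capegmini') should produce the same folders as the solution above.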
Combo course package: www.geekcoders...
I have prepared many courses on Azure Data Engineering:
1. Build Azure End-to-End Project
www.geekcoders...
2. Build Delta Lake Project
www.geekcoders...
3. Master in Azure Data Factory with ETL Project and Power BI
www.geekcoders...
4. Master in Python
www.geekcoders...
Check out my courses on Azure Data Engineering
www.geekcoders...
#dataengineer #interviewquestions #spark

Comments • 27

@kunuturuaravindreddy5879 several months ago ⁺¹
Very good that you are posting real interview questions; many others simply explain concept definitions.
@GeekCoders several months ago
@@kunuturuaravindreddy5879 thanks
@sourav_sarkar_2000 7 months ago ⁺⁴
# creating a dict of columns so as to avoid checking multiple datatypes
d = {}
for col in df.dtypes:
    if col[1] not in d:
        d[col[1]] = [col[0]]
    else:
        d[col[1]].append(col[0])
for key, val in d.items():
    df.select(val).show()
    # write df to the location
@Offical_PicturePerfect 18 days ago
int_cols = [col for col, dtype in df.dtypes if dtype == 'int']
string_cols = [col for col, dtype in df.dtypes if dtype == 'string']
float_cols = [col for col, dtype in df.dtypes if dtype == 'float']
# Creating DataFrames for each data type
int_df = df.select(int_cols)
string_df = df.select(string_cols)
float_df = df.select(float_cols)
@sourav_sarkar_2000 7 months ago ⁺¹
# creating a dict of columns to avoid checking multiple datatypes
d = {}
for col in df.dtypes:
    if col[1] not in d:
        d[col[1]] = [col[0]]
    else:
        d[col[1]].append(col[0])
print(d)
for key, val in d.items():
    df.select(val).show()
    # write the selected columns to the location
    # df.select(val).write.mode('overwrite').save(f'temp_loc/{key}')
@myl1566 8 months ago ⁺¹
Good problem to solve. Thanks for posting, Sagar!
@GeekCoders 8 months ago
Thank you
@aamirmansuri69 8 months ago ⁺²
Thank you for posting this video. But can you please post PySpark interview questions for freshers? Thank you!
@rawat7203 7 months ago ⁺¹
My Way Sir
intType = []
stringType = []
floatType = []
for i in df.dtypes:
    if i[1] == 'int':
        intType.append(i[0])
    elif i[1] == 'string':
        stringType.append(i[0])
    elif i[1] == 'float':
        floatType.append(i[0])
dfInt = df.select(*intType)
dfString = df.select(*stringType)
dfFloat = df.select(*floatType)
@GeekCoders 7 months ago
Nice
@Dataengineeringlearninghub 8 months ago ⁺¹
Great problem, Sagar
@rawat7203 7 months ago ⁺¹
Thanks a lot Sir
@GeekCoders 7 months ago
Thank you
@vutv5742 8 months ago ⁺¹
Completed 👏
@pradishpranam6175 6 months ago
cool question
@2412_Sujoy_Das 7 months ago
My solution is as follows:
string = df
integer = df
float = df
for i in df.dtypes:
    if i[1]!='string' and i[1]=='int':
        string = string.drop(i[0])
        float = float.drop(i[0])
    elif i[1]!='string' and i[1]=='float':
        string = string.drop(i[0])
        integer = integer.drop(i[0])
    elif i[1]!='int' and i[1]=='string':
        integer = integer.drop(i[0])
        float = float.drop(i[0])
    elif i[1]!='int' and i[1]=='float':
        integer = integer.drop(i[0])
        string = string.drop(i[0])
    elif i[1]!='float' and i[1]=='string':
        float = float.drop(i[0])
        integer = integer.drop(i[0])
    else:
        float = float.drop(i[0])
        string = string.drop(i[0])
print(string)
print(integer)
print(float)
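The branches in the comment above reduce to one rule: each per-type DataFrame keeps the columns of its own type and drops every other column. A minimal sketch of that rule follows, assuming df.dtypes is a list of (name, type) pairs. columns_to_drop is a hypothetical helper written for this illustration, not part of PySpark or of the commenter's code.

```python
def columns_to_drop(dtypes, keep_dtype):
    """Names of the columns whose type differs from keep_dtype."""
    return [name for name, dtype in dtypes if dtype != keep_dtype]


# The (name, type) pairs that df.dtypes reports for the sample input
sample_dtypes = [
    ("Id", "int"),
    ("Name", "string"),
    ("Age", "int"),
    ("Gender", "string"),
    ("Marks", "float"),
]

drops = {t: columns_to_drop(sample_dtypes, t) for t in ("int", "string", "float")}
print(drops)

# With a live DataFrame, each list could be passed to drop, for example:
# string_df = df.drop(*drops["string"])
```

Written this way, no branch is needed per type, and the names string, integer and float no longer have to be reused as variables (float shadows a Python builtin).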
@Nextgentrick 6 months ago
Shouldn't you use append instead of overwrite?
@pratyushkumar8567 8 months ago ⁺¹
Hi Sagar,
for this Capgemini Data Engineer Interview Question - Round 1 | Save Multiple Columns in the DataFrame,
how much experience did the candidate have?
@GeekCoders 8 months ago
4 years
@bhumikalalchandani321 8 months ago
Okay, is the conversion to Parquet format internal functionality?
@rawat7203 7 months ago
yes
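For context on the exchange above: when save() is called without a format, Spark uses the data source named by the spark.sql.sources.default setting, which is parquet in stock Spark. Some managed platforms configure a different default, so the format actually written depends on the environment. The sketch below names the format explicitly to remove that ambiguity; save_as_parquet is a hypothetical helper, and df is assumed to be an existing DataFrame.

```python
def save_as_parquet(df, path):
    """Write df to path as Parquet, naming the format explicitly."""
    df.write.format("parquet").mode("overwrite").save(path)
```

For example, save_as_parquet(df.select(cols), f'/FileStore/tables/output_capegmini/{i}') could replace the write call inside the loop of the solution.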
@souvikchattopadhyay4760 3 months ago
my solution:
dict = {}
for i in df.dtypes:
    if i[1] in dict.keys():
        l = dict.get(i[1])
        l.append(i[0])
        dict.update({i[1]: l})
    else:
        l = []
        l.append(i[0])
        dict.update({i[1]: l})

for i in dict.keys():
    df_s = df.select(dict.get(i))
    df_s.show()
## did show instead of writing
@ug1880 7 months ago ⁺¹
Were you asked to take any iMocha test?
@GeekCoders 7 months ago ⁺¹
No
@ug1880 7 months ago
@@GeekCoders okk...
