Data validation between source and target table | PySpark Interview Question |
- Published on Oct 14, 2024
- Hello Everyone,
source_data = [(1,'A'),(2,'B'),(3,'C'),(4,'D'),(5,'E')]
source_schema = ['id','name']
source_df = spark.createDataFrame(source_data,source_schema)
source_df.show()
target_data = [(1,'A'),(2,'B'),(3,'X'),(4,'F'),(6,'G')]
target_schema = ['id','name']
target_df = spark.createDataFrame(target_data,target_schema)
target_df.show()
This series is for beginners and intermediate-level candidates who want to crack PySpark interviews.
Here is the link to the course : www.geekcoders...
#pyspark #interviewquestions #interview #pysparkinterview #dataengineer #aws #databricks #python
At 6:04, instead of copying the same statement, you can use .otherwise("not matching")
I do the below steps to compare a source vs. target table:
1) Counts should match between the source and target tables.
2) Schemas should match between the source and target tables.
3) Use except to check whether any records are present in source but not in target, or vice versa.
4) Use a left anti join to find the records that do not match.
5) Debug why there is a record mismatch.
Nice
exceptAll can be useful too, or an anti join
exceptAll may sometimes miss null values
I request you to please create a playlist for PySpark unit testing.
The main problem I found in learning PySpark is the brackets; every time they give me some error.
Yes
Won't the join be a costly operation?
What is the most challenging thing that you faced in your project, and how did you overcome it?
Please make a video on PySpark unit testing.