40. Databricks | Spark | PySpark Functions | arrays_zip

  • Published Feb 10, 2025
  • #Databricks #PySpark #Spark #AzureDatabricks #ArraysZip #DatabricksTutorial
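
A minimal sketch of what arrays_zip does, using toy data (the DataFrame and column names below are illustrative, not taken from the video):

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import arrays_zip, explode

    spark = SparkSession.builder.getOrCreate()

    # Two parallel array columns (toy data).
    df = spark.createDataFrame(
        [("Sales", ["Ram", "Sita"], [1000, 2000])],
        ["Department", "emp_name", "salary"],
    )

    # arrays_zip merges the arrays element-wise into array<struct<emp_name, salary>>.
    zipped = df.select("Department", arrays_zip("emp_name", "salary").alias("zipped"))

    # explode then yields one struct per row, whose fields flatten into columns.
    zipped.select("Department", explode("zipped").alias("z")) \
          .select("Department", "z.emp_name", "z.salary") \
          .show()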

Comments • 41

  • @shreyashchoudhary6827 • 2 years ago +5

    It's very hard to practise without the datasets. Please provide a Git repo of all the datasets you used in the Databricks series.

  • @sravankumar1767 • 3 years ago

    Nice explanation 👌 👍

  • @oiwelder • 2 years ago +1

    Excellent... 👏👏👏

  • @RakeshGandu-wb7eu • 1 year ago

    Very nice explanation

  • @kamalbhallachd • 3 years ago

    Knowledge session

  • @balajibalu335 • 2 years ago +1

    Thank you 😊

  • @riyazalimohammad633 • 3 years ago +2

    Hello Raja sir! Great content, and thank you for the amazing explanation! One request, sir: could you please provide us with the notebooks you've used, along with the datasets, so that we can replicate what we are watching in the videos? That would be very helpful for us. Please consider, thanks again!!

    • @rajasdataengineering7585 • 3 years ago +2

      Sure, will do

    • @riyazalimohammad633 • 3 years ago +1

      @rajasdataengineering7585 Awaiting further correspondence, sir. Thank you!!

    • @shreyashchoudhary6827 • 2 years ago +1

      @rajasdataengineering7585 when???

    • @varun8952 • 2 years ago

      @rajasdataengineering7585 Please share these notebooks.

    • @sraj7136 • 2 years ago

      Hi Raja, I came across your channel and videos; they are quite crisp and to the point. If you could add the notebook file along with each video, it would be more beneficial to all viewers.

  • @saswatad1 • 1 year ago

    Very good video. Can you please share the notebook for this video? It's difficult to grasp the concept without doing hands-on practice.

  • @gurumoorthysivakolunthu9878 • 2 years ago

    The example you chose is excellent, Sir...
    I need a small clarification on the last explode: how does the key-value pair get converted so that each key becomes a column and each value becomes that column's values? In the explode video you said that all the keys go under one key column and all the values under one value column, yet here explode works differently, like a pivot (see the sketch below).
    Please help, Sir... Thank you....
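
    (On this question, the behaviour depends on the column type. A small sketch with toy data, under the assumption that the Employee column in the video is an array of structs rather than a true map:)

        from pyspark.sql.functions import explode

        # Exploding a true map column gives generic "key" and "value" columns.
        map_df = spark.createDataFrame([({"emp_name": "Ram", "salary": "1000"},)], ["emp"])
        map_df.select(explode("emp")).show()  # columns: key, value

        # Exploding an array of structs gives one struct per row instead,
        # and its named fields can be selected out as columns, which looks like a pivot.
        struct_df = spark.createDataFrame(
            [([("Ram", 1000)],)],
            "emp array<struct<emp_name:string, salary:int>>",
        )
        struct_df.select(explode("emp").alias("e")).select("e.emp_name", "e.salary").show()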

  • @suvratrai8873 • 1 year ago

    Hi Sir, one doubt here. If we apply the arrays_zip function to a column containing map values (key: value pairs), does the column name also get added to each element? That's what happened when you applied arrays_zip to the Employee column, as we can see "Employee" in every row. But this did not happen when arrays_zip was applied in your first example, where the elements are just strings. (See the sketch below.)
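
    (A small sketch on this point, with toy data: arrays_zip names the struct fields after its input columns, whether there is one input array or several. That is why "Employee" appears as a field name when only the Employee column is zipped.)

        from pyspark.sql.functions import arrays_zip

        df2 = spark.createDataFrame([(["a", "b"], [1, 2])], ["letters", "nums"])

        # Two inputs: the struct fields are named after both columns.
        df2.select(arrays_zip("letters", "nums").alias("z")).printSchema()
        # z: array<struct<letters:string, nums:bigint>>

        # One input: each element is wrapped in a one-field struct named "letters".
        df2.select(arrays_zip("letters").alias("z")).printSchema()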

  • @sravankumar1767 • 3 years ago +2

    Can you please make a video on loading a CSV file with a nested JSON hierarchy using PySpark in ADB? For example, the CSV file has cust id, cust name, item name, quantity; but in the JSON there are cust id and cust name at the top level, with item name and quantity under purchases. Can you please explain this scenario?

    • @rajasdataengineering7585 • 3 years ago

      Hi Sravan, I have already posted a video on flattening nested JSON files.
      Please check this video and let me know if it fulfils your requirement:
      th-cam.com/video/jD8JIw1FVVg/w-d-xo.html
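
      (In the meantime, a rough sketch of that scenario, with a hypothetical file path and schema: cust_id and cust_name at the top level, item_name and quantity nested under purchases.)

          from pyspark.sql.functions import explode

          # Hypothetical input, e.g.:
          # {"cust_id": 1, "cust_name": "A", "purchases": [{"item_name": "pen", "quantity": 2}]}
          json_df = spark.read.json("/path/to/customers.json")  # path is illustrative

          flat = json_df.select("cust_id", "cust_name", explode("purchases").alias("p")) \
                        .select("cust_id", "cust_name", "p.item_name", "p.quantity")
          flat.show()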

  • @souravbarik8470 • 1 year ago

    Great explanation. But arrays_zip is not required in the employee explode scenario; even without zipping, we can simply explode to get the same result.

    • @Eliteritz • 4 months ago

      +1

    • @vikashchandra6262 • 4 months ago

      +1
      We can directly explode the given dataset and fetch its values by creating new columns, as sketched below.
      Then why use arrays_zip?
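
      (A sketch of that direct route, assuming Employee is an array of structs as in the video; df here stands for the video's DataFrame:)

          from pyspark.sql.functions import explode

          # Explode the array of structs, then expand every struct field with "e.*".
          df.select("Department", explode("Employee").alias("e")) \
            .select("Department", "e.*") \
            .show()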

  • @Lakshmikanth-l3p • 24 days ago

    Hi, kindly provide the notebooks and the datasets so that we can practice side by side.

  • @gulsahtanay2341 • 11 months ago +1

    Thank you

  • @sravanthireddy3849 • 2 years ago

    Hi Raja,
    please provide notebooks for practice purposes in the description.

  • @sumitchandwani9970 • 1 year ago

    Please provide the DataFrames created in the videos; it's difficult to manually type each DataFrame used.

  • @a2zhi976 • 1 year ago +1

    Sir, two questions: 1. Can I create an RDD in Databricks? 2. Is a DataFrame in Databricks different from a DataFrame in the Hadoop ecosystem? Can you please clear up my confusion?

    • @rajasdataengineering7585 • 1 year ago

      Yes, we can create all three APIs in Databricks: RDD, DataFrame and Dataset (a quick sketch follows below).
      There is no concept of a DataFrame in Hadoop; the programming engine in Hadoop is MapReduce.
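
      (A quick sketch in a notebook; names are illustrative. Note that the typed Dataset API is available in Scala/Java but not in PySpark.)

          # Create an RDD directly, then convert between RDD and DataFrame.
          rdd = spark.sparkContext.parallelize([("Ram", 1000), ("Sita", 2000)])
          df = rdd.toDF(["emp_name", "salary"])  # RDD -> DataFrame
          back_to_rdd = df.rdd                   # DataFrame -> RDD of Rows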

  • @AshrithaPasupuleti • 1 year ago

    Hi Sir, can you please enable transcripts for your videos? That would help us take notes easily.

    • @babydiyarena7052 • 1 year ago

      Yes, transcripts are not available for the videos that were created early on.

  • @bashask2121 • 3 years ago

    How do I print {4,1}, {4,3}, {4,7}, etc.? I mean pairing an element of array 1 with all the values of array 2. (See the sketch below.)
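
    (One way to get those pairs, sketched with toy data matching the comment: explode each array in turn, so every element of array1 is paired with every value of array2.)

        from pyspark.sql.functions import explode

        pairs_df = spark.createDataFrame([([4], [1, 3, 7])], ["array1", "array2"])
        pairs_df.select(explode("array1").alias("x"), "array2") \
                .select("x", explode("array2").alias("y")) \
                .show()
        # -> (4,1), (4,3), (4,7)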

  • @bashask2121 • 3 years ago

    Please provide sample data in the description.

  • @techfunpak • 1 year ago +1

    Another way can be:

        from pyspark.sql.functions import col, explode

        df.select(df.Department, explode(df.Employee).alias("empMap")) \
          .select("Department",
                  col("empMap").getItem("emp_name").alias("emp_name"),
                  col("empMap").getItem("salary").alias("salary")) \
          .show(truncate=False)

  • @ShivamGupta-wn9mo • 2 months ago

    Simpler way:

        from pyspark.sql.functions import explode

        df_flattened = df.select("*", explode("Employee").alias("new_emp")) \
            .drop("Employee") \
            .select("Department", "new_emp.emp_name", "new_emp.salary",
                    "new_emp.yrs_of_service", "new_emp.Age")
        df_flattened.show()