40. Databricks | Spark | Pyspark Functions| Arrays_zip
- Published on Feb 10, 2025
- #PysparkArrayFunction, #SparkArray, #DatabricksArrayFunction, #ArraysZip, #Arrays_Zip
It's very hard to practise without a dataset. Please provide a git repo of all the datasets you used in the Databricks series.
Nice explanation 👌 👍
Excellent...👏👏👏
Thank you
Very nice explanation
Thanks for liking
Knowledge session
Thanks Kamal
@@rajasdataengineering7585 How do I get the JSON file?
Thank you 😊
Thank you 😊
Hello Raja sir! Great content, and thank you for the amazing explanation! One request, sir: could you please provide us the notebooks you've used, along with the datasets, so that we can replicate what we are watching in the videos? That would be very helpful for us. Please consider, thanks again!!
Sure, will do
@@rajasdataengineering7585 Awaiting further correspondence sir, thank you!!
@@rajasdataengineering7585 when???
@@rajasdataengineering7585 , Please share these notebooks
Hi Raja, I came across your channel and videos; they are quite crisp and to the point. If you could add the notebook file along with each video, it would be more beneficial to all viewers.
Very good video. Can you please share the notebook of this video with us? It is difficult to grasp the concept without doing hands-on.
The example you have taken is excellent, Sir...
Need a small clarification on the last explode: how is the key-value pair getting converted, with each key becoming a column and each value becoming that column's values? In the explode video you said that all the keys go under one key column and all the values under one value column. Here explode works differently, like a pivot.
Please help, Sir... Thank you...
Hi Sir, one doubt here. If we apply the arrays_zip function to a column containing map values (key : value), does the column name also get added to each element? That's what happened when you applied arrays_zip to the Employee column, as we can see "Employee" in every row. But this did not happen when arrays_zip was applied in your first example, where the elements are just strings.
Can you please make a video on loading a CSV file with a JSON nested hierarchy using PySpark in ADB? For example: cust_id, cust_name, item_name, quantity in the CSV file, but in JSON it is cust_id, cust_name, and under purchases we have item_name and quantity. Can you please explain this scenario?
Hi Sravan, I have already posted a video on flattening nested json files.
Please check this video and let me know if it fulfils your requirement
th-cam.com/video/jD8JIw1FVVg/w-d-xo.html
Great explanation. But arrays_zip in the employee explode scenario is not required; even without zipping we can simply explode to get the same result.
+1
+1
We can directly explode the given dataset and fetch its values by creating new columns.
Then why use arrays_zip?
Hi, kindly provide the notebooks and the datasets so that we can practice side by side.
Thank you
You're welcome
@@rajasdataengineering7585 Can you provide a dataset to practice with?
hi Raja
please provide notebooks for practice purposes in the description
Please provide the dataframes created in the videos; it's difficult to manually type each dataframe used.
Sir.. 1. Can I create an RDD in Databricks? 2. Is a DataFrame in Databricks different from a DataFrame in a Hadoop system? Can you please clear my confusion?
Yes, we can create all 3 APIs in Databricks: RDD, DataFrame and Dataset.
There is no concept of a DataFrame in Hadoop; the programming engine in Hadoop is MapReduce.
Hi Sir, Can you please enable transcript for your videos which helps us to take notes Easily
Unfortunately, transcripts are not available for the videos that were created initially.
How do I print {4,1}, {4,3}, {4,7}, etc.? I mean pairing each element of array 1 with all the values of array 2.
Pls provide sample data in the description
Another way can be:
from pyspark.sql.functions import col, explode
df.select(
    df.Department,
    explode(df.Employee).alias("empMap"),
).select(
    df.Department,
    col("empMap").getItem("emp_name").alias("emp_name"),
    col("empMap").getItem("salary").alias("salary"),
).show(truncate=False)
Thanks for sharing alternate approach
Simpler way:
from pyspark.sql.functions import explode
df_flattened = df.select("*", explode("Employee").alias("new_emp")) \
    .drop("Employee") \
    .select("Department", "new_emp.emp_name", "new_emp.salary",
            "new_emp.yrs_of_service", "new_emp.Age")
df_flattened.show()