40. Databricks | Spark | PySpark Functions | arrays_zip

  • Published Feb 10, 2025
  • #Databricks #PySpark #Spark #AzureDatabricks #ArraysZip #DatabricksTutorial
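
A minimal sketch of what arrays_zip does, using toy data (the DataFrame and column names below are illustrative, not taken from the video):

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import arrays_zip, explode

    spark = SparkSession.builder.getOrCreate()

    # Two parallel array columns (toy data).
    df = spark.createDataFrame(
        [("Sales", ["Ram", "Sita"], [1000, 2000])],
        ["Department", "emp_name", "salary"],
    )

    # arrays_zip merges the arrays element-wise into array<struct<emp_name, salary>>.
    zipped = df.select("Department", arrays_zip("emp_name", "salary").alias("zipped"))

    # explode then yields one struct per row, whose fields flatten into columns.
    zipped.select("Department", explode("zipped").alias("z")) \
          .select("Department", "z.emp_name", "z.salary") \
          .show()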

Comments • 41

  • @shreyashchoudhary6827 • 2 years ago +5

    It's very hard to practise without the datasets. Please provide a Git repo of all the datasets you used in the Databricks series.

  • @sravankumar1767 • 3 years ago

    Nice explanation 👌 👍

  • @oiwelder • 2 years ago +1

    Excellent... 👏👏👏

  • @RakeshGandu-wb7eu • 1 year ago

    Very nice explanation

  • @kamalbhallachd • 3 years ago

    Knowledge session

  • @balajibalu335 • 2 years ago +1

    Thank you 😊

  • @riyazalimohammad633 • 3 years ago +2

    Hello Raja sir! Great content, and thank you for the amazing explanation! One request, sir: could you please provide us with the notebooks you've used, along with the datasets, so that we can replicate what we are watching in the videos? That would be very helpful for us. Please consider, thanks again!!

    • @rajasdataengineering7585 • 3 years ago +2

      Sure, will do

    • @riyazalimohammad633 • 3 years ago +1

      @rajasdataengineering7585 Awaiting further correspondence, sir. Thank you!!

    • @shreyashchoudhary6827 • 2 years ago +1

      @rajasdataengineering7585 when???

    • @varun8952 • 2 years ago

      @rajasdataengineering7585 Please share these notebooks.

    • @sraj7136 • 2 years ago

      Hi Raja, I came across your channel and videos; they are quite crisp and to the point. If you could add the notebook file along with each video, it would be more beneficial to all viewers.

  • @saswatad1 • 1 year ago

    Very good video. Can you please share the notebook for this video? It's difficult to grasp the concept without doing hands-on practice.

  • @gurumoorthysivakolunthu9878 • 2 years ago

    The example you chose is excellent, Sir...
    I need a small clarification on the last explode: how does the key-value pair get converted so that each key becomes a column and each value becomes that column's values? In the explode video you said that all the keys go under one key column and all the values under one value column, yet here explode works differently, like a pivot (see the sketch below).
    Please help, Sir... Thank you....
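
    (On this question, the behaviour depends on the column type. A small sketch with toy data, under the assumption that the Employee column in the video is an array of structs rather than a true map:)

        from pyspark.sql.functions import explode

        # Exploding a true map column gives generic "key" and "value" columns.
        map_df = spark.createDataFrame([({"emp_name": "Ram", "salary": "1000"},)], ["emp"])
        map_df.select(explode("emp")).show()  # columns: key, value

        # Exploding an array of structs gives one struct per row instead,
        # and its named fields can be selected out as columns, which looks like a pivot.
        struct_df = spark.createDataFrame(
            [([("Ram", 1000)],)],
            "emp array<struct<emp_name:string, salary:int>>",
        )
        struct_df.select(explode("emp").alias("e")).select("e.emp_name", "e.salary").show()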

  • @suvratrai8873 • 1 year ago

    Hi Sir, one doubt here. If we apply the arrays_zip function to a column containing map values (key: value pairs), does the column name also get added to each element? That's what happened when you applied arrays_zip to the Employee column, as we can see "Employee" in every row. But this did not happen when arrays_zip was applied in your first example, where the elements are just strings. (See the sketch below.)
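
    (A small sketch on this point, with toy data: arrays_zip names the struct fields after its input columns, whether there is one input array or several. That is why "Employee" appears as a field name when only the Employee column is zipped.)

        from pyspark.sql.functions import arrays_zip

        df2 = spark.createDataFrame([(["a", "b"], [1, 2])], ["letters", "nums"])

        # Two inputs: the struct fields are named after both columns.
        df2.select(arrays_zip("letters", "nums").alias("z")).printSchema()
        # z: array<struct<letters:string, nums:bigint>>

        # One input: each element is wrapped in a one-field struct named "letters".
        df2.select(arrays_zip("letters").alias("z")).printSchema()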

  • @sravankumar1767 • 3 years ago +2

    Can you please make a video on loading a CSV file with a nested JSON hierarchy using PySpark in ADB? For example, the CSV file has cust id, cust name, item name, quantity; but in the JSON there are cust id and cust name at the top level, with item name and quantity under purchases. Can you please explain this scenario?

    • @rajasdataengineering7585 • 3 years ago

      Hi Sravan, I have already posted a video on flattening nested JSON files.
      Please check this video and let me know if it fulfils your requirement:
      th-cam.com/video/jD8JIw1FVVg/w-d-xo.html
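
      (In the meantime, a rough sketch of that scenario, with a hypothetical file path and schema: cust_id and cust_name at the top level, item_name and quantity nested under purchases.)

          from pyspark.sql.functions import explode

          # Hypothetical input, e.g.:
          # {"cust_id": 1, "cust_name": "A", "purchases": [{"item_name": "pen", "quantity": 2}]}
          json_df = spark.read.json("/path/to/customers.json")  # path is illustrative

          flat = json_df.select("cust_id", "cust_name", explode("purchases").alias("p")) \
                        .select("cust_id", "cust_name", "p.item_name", "p.quantity")
          flat.show()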

  • @souravbarik8470 • 1 year ago

    Great explanation. But arrays_zip is not required in the employee explode scenario; even without zipping, we can simply explode to get the same result.

    • @Eliteritz • 4 months ago

      +1

    • @vikashchandra6262 • 4 months ago

      +1
      We can directly explode the given dataset and fetch its values by creating new columns, as sketched below.
      Then why use arrays_zip?
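
      (A sketch of that direct route, assuming Employee is an array of structs as in the video; df here stands for the video's DataFrame:)

          from pyspark.sql.functions import explode

          # Explode the array of structs, then expand every struct field with "e.*".
          df.select("Department", explode("Employee").alias("e")) \
            .select("Department", "e.*") \
            .show()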

  • @Lakshmikanth-l3p • 24 days ago

    Hi, kindly provide the notebooks and the datasets so that we can practice side by side.

  • @gulsahtanay2341 • 11 months ago +1

    Thank you

  • @sravanthireddy3849 • 2 years ago

    Hi Raja,
    please provide notebooks for practice purposes in the description.

  • @sumitchandwani9970 • 1 year ago

    Please provide the DataFrames created in the videos; it's difficult to manually type each DataFrame used.

  • @a2zhi976 • 1 year ago +1

    Sir, two questions: 1. Can I create an RDD in Databricks? 2. Is a DataFrame in Databricks different from a DataFrame in the Hadoop ecosystem? Can you please clear up my confusion?

    • @rajasdataengineering7585 • 1 year ago

      Yes, we can create all three APIs in Databricks: RDD, DataFrame and Dataset (a quick sketch follows below).
      There is no concept of a DataFrame in Hadoop; the programming engine in Hadoop is MapReduce.
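
      (A quick sketch in a notebook; names are illustrative. Note that the typed Dataset API is available in Scala/Java but not in PySpark.)

          # Create an RDD directly, then convert between RDD and DataFrame.
          rdd = spark.sparkContext.parallelize([("Ram", 1000), ("Sita", 2000)])
          df = rdd.toDF(["emp_name", "salary"])  # RDD -> DataFrame
          back_to_rdd = df.rdd                   # DataFrame -> RDD of Rows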

  • @AshrithaPasupuleti • 1 year ago

    Hi Sir, can you please enable transcripts for your videos? That would help us take notes easily.

    • @babydiyarena7052 • 1 year ago

      Yes, transcripts are not available for the videos that were created early on.

  • @bashask2121 • 3 years ago

    How do I print {4,1}, {4,3}, {4,7}, etc.? I mean pairing an element of array 1 with all the values of array 2. (See the sketch below.)
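
    (One way to get those pairs, sketched with toy data matching the comment: explode each array in turn, so every element of array1 is paired with every value of array2.)

        from pyspark.sql.functions import explode

        pairs_df = spark.createDataFrame([([4], [1, 3, 7])], ["array1", "array2"])
        pairs_df.select(explode("array1").alias("x"), "array2") \
                .select("x", explode("array2").alias("y")) \
                .show()
        # -> (4,1), (4,3), (4,7)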

  • @bashask2121 • 3 years ago

    Please provide sample data in the description.

  • @techfunpak • 1 year ago +1

    Another way can be:

        from pyspark.sql.functions import col, explode

        df.select(df.Department, explode(df.Employee).alias("empMap")) \
          .select("Department",
                  col("empMap").getItem("emp_name").alias("emp_name"),
                  col("empMap").getItem("salary").alias("salary")) \
          .show(truncate=False)

  • @ShivamGupta-wn9mo • 2 months ago

    Simpler way:

        from pyspark.sql.functions import explode

        df_flattened = df.select("*", explode("Employee").alias("new_emp")) \
            .drop("Employee") \
            .select("Department", "new_emp.emp_name", "new_emp.salary",
                    "new_emp.yrs_of_service", "new_emp.Age")
        df_flattened.show()