Nice explanation sravana 👌 👍 👏
Thank you, Sravan.
Another approach to the above query is unionByName().
Correct me if I am wrong, ma'am.
If the requirement is to have the data from both dataframes, then we can use unionByName(). But here the requirement is to check whether the columns match the other dataframe's, and if any column is missing, to create a column with the same name and null values.
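A minimal PySpark sketch of the column check described in this reply, in case it helps. The helper names (`aligned_select_exprs`, `align_and_union`) are made up for illustration, and casting the filler nulls to STRING is an assumption; use the type of the matching column in the other dataframe. On Spark 3.1 and later, `df1.unionByName(df2, allowMissingColumns=True)` also fills missing columns with nulls, so this is only one way to do it.

```python
def aligned_select_exprs(own_cols, target_cols):
    """Build selectExpr strings: keep columns the dataframe already has,
    and emit a NULL placeholder for each target column it is missing."""
    own = set(own_cols)
    return [
        f"`{c}`" if c in own
        # Assumption: STRING is only a placeholder type for the null column.
        else f"CAST(NULL AS STRING) AS `{c}`"
        for c in target_cols
    ]


def align_and_union(df1, df2):
    """Give both dataframes the same columns, then union them by name."""
    # All column names: df1's order first, then any extras found only in df2.
    all_cols = list(df1.columns) + [c for c in df2.columns if c not in df1.columns]
    left = df1.selectExpr(*aligned_select_exprs(df1.columns, all_cols))
    right = df2.selectExpr(*aligned_select_exprs(df2.columns, all_cols))
    return left.unionByName(right)
```

If only the null-filled columns are needed, without the union, the `selectExpr` call on a single dataframe is enough.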
Could you do one video on a project's end-to-end pipeline?
Like how we use GitHub, Jenkins, etc. in a project, and what the process in a project is.
I can explain the overall procedure, but I may not be able to execute and show it because I don't have the required setup on my personal laptop.
@@sravanalakshmipisupati6533 Yes, please explain the overall procedure, like which tools are used (GitHub, Jenkins, Jira, etc.) in the project and how the flow works. There is actually no proper video that explains a project's end-to-end process, so it would be great if you did one.
@@sravanalakshmipisupati6533 Hi, please add this to your checklist.
Will you provide Scala code for this example, if you are familiar with it?
Can you please try with foldLeft? If you face any issues, please let me know.
Hi, please give me a solution. I have a table with name, id, and department columns, and the name column has 2 names. I want a new column, and in that new column I want all the names that are in the name column.
Use withColumn with the name column; it will copy the contents of name into the new column.
df.withColumn("new_col", f.col("name"))  # assumes: from pyspark.sql import functions as f
I have the same scenario. I have 2 dataframes with different numbers of columns. The second dataframe has updated values, so I want to update dataframe 1 with the values from the second dataframe, while keeping the values of the first dataframe where there is no change.
Could you help with this?
Hi Jefferson, please join the 2 dataframes and select the updated columns from the 2nd dataframe.
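A rough PySpark sketch of the join-and-select suggestion, under a few assumptions: both dataframes share a key column (its name is passed in as `key`), the aliases `b` and `u` and the helper names are made up for illustration, and a null in the second dataframe is treated as "no change". If nulls should overwrite existing values, `coalesce` is not the right tool.

```python
def update_exprs(key, base_cols, update_cols):
    """Build selectExpr strings that prefer the updated value (alias u)
    and fall back to the base value (alias b) when there is no update."""
    updates = set(update_cols)
    exprs = [f"`{key}`"]
    for c in base_cols:
        if c == key:
            continue
        if c in updates:
            # Column exists in both: take the update unless it is null.
            exprs.append(f"coalesce(u.`{c}`, b.`{c}`) AS `{c}`")
        else:
            # Column exists only in the base dataframe: keep it unchanged.
            exprs.append(f"b.`{c}`")
    return exprs


def apply_updates(base_df, updates_df, key):
    """Left-join the updates onto the base dataframe and pick the final values."""
    joined = base_df.alias("b").join(updates_df.alias("u"), on=key, how="left")
    return joined.selectExpr(*update_exprs(key, base_df.columns, updates_df.columns))
```

The left join keeps every row of the first dataframe, so rows with no match in the second dataframe come through unchanged.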
amazing 👏
Nicely explained. Could you please share the GitHub link for this notebook? It will help with practicing.
Thank you. Please check the description for GitHub links for code.
@@sravanalakshmipisupati6533 Thanks. Under the "sampledata" branch, it looks like this specific notebook is not checked in yet. Could you commit it? Or please help me locate the file if it's checked in under some other name.
@@Learn2Share786 github.com/sravanapisupati/SampleDataSet/blob/main/weatherHistory.csv please open this link and click on the code hyperlink.
Why can't you just do an inner join?
We need a join key for that, right?
I have 3 dataframes, one with 15 columns and 2 with one column each. I want to combine them into one dataframe.
You can use join.
Hi, can you help me?
Sure, please share your problem statement to my gmail sravana.pisupati@gmail.com
Scala code for the same, please.
Sure. I'll get back.