Solve using PySpark and Spark-SQL | Accenture Interview Question |
- Published on 13 Jan 2025
- Write a PySpark query to report the movies with an odd-numbered ID and a description that is not "boring". Return the result table in descending order by rating.
data=[(1, 'War','great 3D',8.9)
,(2, 'Science','fiction',8.5)
,(3, 'irish','boring',6.2)
,(4, 'Ice song','Fantacy',8.6)
,(5, 'House card','Interesting',9.1)]
schema="ID int,movie string,description string,rating double"
df=spark.createDataFrame(data,schema)
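The title also asks for a Spark SQL version. Below is a minimal sketch of how that might look, assuming the DataFrame is registered under a temp view named `movies` (the view name and the helper names `run_spark_sql` and `expected_rows` are illustrative, not from the video). The plain-Python helper is only a reference for checking the result on the sample data without a cluster.

```python
# Sample rows from the question: (ID, movie, description, rating).
data = [
    (1, "War", "great 3D", 8.9),
    (2, "Science", "fiction", 8.5),
    (3, "irish", "boring", 6.2),
    (4, "Ice song", "Fantacy", 8.6),
    (5, "House card", "Interesting", 9.1),
]

# Spark SQL form of the filter; the view name "movies" is an assumption.
QUERY = """
    SELECT *
    FROM movies
    WHERE ID % 2 <> 0
      AND description <> 'boring'
    ORDER BY rating DESC
"""


def run_spark_sql(spark, df):
    """Register df as a temp view and run QUERY (needs a live SparkSession)."""
    df.createOrReplaceTempView("movies")
    return spark.sql(QUERY)


def expected_rows(rows):
    """Plain-Python reference: odd ID, description not 'boring', rating descending."""
    kept = [r for r in rows if r[0] % 2 != 0 and r[2] != "boring"]
    return sorted(kept, key=lambda r: r[3], reverse=True)
```

With the `spark` session and `df` created above, `display(run_spark_sql(spark, df))` should show the rows with ID 5 and ID 1, in that order, which is what `expected_rows(data)` returns.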
Course Link:
www.geekcoders...
Bro, can you please tell me which software you use to record your screen?
How can I purchase your Delta Lake course?
www.geekcoders.co.in/courses/Build-Real-Time-DeltaLake-Project--using-PySpark-and-Spark-SQL-with-Databricks-64a0af30e4b0f71237fa1bf3-64a0af30e4b0f71237fa1bf3
Hey Sagar, please make a project on Delta Live Tables. Delta Live Tables is in high demand.
Sure, will do that soon. Thanks.
I saw this question in the LeetCode SQL question series.
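Cool. Strictly speaking, the second syntax is the DataFrame API rather than PySpark as a whole; PySpark code could also be written against RDDs (the typed Dataset API is only available in Scala and Java).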
The second query uses the DataFrame API only.
from pyspark.sql.functions import col, desc
result_df = df.filter((col("ID") % 2 != 0) & (col("description") != 'boring')).orderBy(desc("rating"))
display(result_df)
Thank you, sir.
from pyspark.sql.functions import col, upper
df_1 = df.where(df.id%2!=0).where(upper(df.description)!="BORING")
display(df_1)
My solution:
from pyspark.sql.functions import col
df.filter((col("ID") % 2 != 0) & (col("description") != "boring")).orderBy("rating", ascending=False)