PySpark Advanced Interview Questions Part 1

  • Published 5 Sep 2024
  • PySpark Advanced Interview Questions Part 1
    How to create Databricks Free Community Edition.
    • Databricks Tutorial 3 ...
    Complete Databricks Tutorial
    • Databricks Tutorial 2 ...
    Databricks Delta Lake Tutorials
    • Introduction to Delta ...
    Pyspark Tutorials
    • Pyspark Tutorial 3, fi...
    PySpark interview questions and answers for beginners and experts: a list of frequently asked PySpark interview questions with answers by Besant Technologies. We hope these questions and answers are useful and help you get the best job in the networking industry. They were prepared by PySpark professionals based on MNC companies' expectations. Stay tuned; we will update new PySpark interview questions with answers frequently.

Comments • 31

  • @abhilash0410
    @abhilash0410 3 years ago +8

    Bro, bring more real-time interview questions like these. Thank you so much!

  • @saachinileshpatil
    @saachinileshpatil 7 months ago +1

    Thanks for sharing 👍, very informative

  • @vedanthasm2659
    @vedanthasm2659 3 years ago +3

    One of the best explanations. Bro, please make more videos on PySpark.

  • @sjitghosh
    @sjitghosh 2 years ago +3

    You are doing excellent work. Helping a lot!!

  • @janardhanreddy3267
    @janardhanreddy3267 6 months ago

    Nice explanation. Please attach the CSV or JSON file in the description so we can practice.

  • @seshuseshu4106
    @seshuseshu4106 3 years ago +1

    Very good, detailed explanation. Thanks for your efforts; please keep it up.

  • @rocku4evr
    @rocku4evr 2 years ago +1

    Great... fortunate to be your subscriber.

  • @sanooosai
    @sanooosai 5 months ago

    Great, thank you.

  • @fratkalkan7850
    @fratkalkan7850 2 years ago

    Very clean explanation, thank you sir.

  • @janardhanreddy3267
    @janardhanreddy3267 6 months ago

    Please upload all the PySpark interview question videos.

  • @akashpb4044
    @akashpb4044 2 years ago +1

    Awesome video... Cleared my doubts 👍👍👍

  • @nsrchndshkh
    @nsrchndshkh 3 years ago +1

    Thanks, man. This was some detailed explanation. Kudos!

  • @achintamondal1494
    @achintamondal1494 1 year ago +1

    Awesome video.
    Could you please share the notebook? It would really help.

  • @varuns4472
    @varuns4472 2 years ago

    Nice one

  • @shreekrishnavani7868
    @shreekrishnavani7868 2 years ago

    Nice explanation 👌 thanks

  • @rajanib9057
    @rajanib9057 11 months ago

    Can you please explain how Spark filtered those 2 rows as bad data? I don't see any WHERE condition mentioned for the corrupt column.
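    For context: Spark is not applying a hidden WHERE clause here. In the default PERMISSIVE read mode, any row that fails to parse against the schema is kept, and its raw text is copied into the _corrupt_record column, so filtering on that column being non-null is what surfaces the bad rows. A minimal sketch, assuming a file and schema like the ones in the video (the column names and path come from the comment further down):

        from pyspark.sql import SparkSession
        from pyspark.sql.types import (StructType, StructField, IntegerType,
                                       StringType, LongType)

        spark = SparkSession.builder.getOrCreate()

        # The schema must declare _corrupt_record for Spark to populate it.
        schema = StructType([
            StructField("cust_id", IntegerType(), True),
            StructField("cust_name", StringType(), True),
            StructField("manager", StringType(), True),
            StructField("city", StringType(), True),
            StructField("phno", LongType(), True),
            StructField("_corrupt_record", StringType(), True),
        ])

        cust_df = (spark.read
                   .option("header", True)
                   .option("mode", "PERMISSIVE")  # default: keep bad rows, stash raw text
                   .schema(schema)
                   .csv("dbfs:/FileStore/tables/csv_with_bad_records.csv"))

        cust_df.cache()  # see the thread below; avoids the Spark 2.3+ restriction
        cust_df.filter("_corrupt_record is not null").show(truncate=False)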

  • @rahulyeole6411
    @rahulyeole6411 2 years ago

    Please share a basic big data video.

  • @naveendayyala1484
    @naveendayyala1484 11 months ago

    Please share the notebook in .dbc format.

  • @johnsonrajendran6194
    @johnsonrajendran6194 3 years ago

    Are any such mode options available while reading Parquet files?

  • @balajia8376
    @balajia8376 2 years ago

    Seems querying _corrupt_record is not working. I tried it today and it is not allowing me to query the column: cust_df.filter("_corrupt_record is not null") raises:

    AnalysisException: Since Spark 2.3, the queries from raw JSON/CSV files are disallowed when the
    referenced columns only include the internal corrupt record column
    (named _corrupt_record by default). For example:
    spark.read.schema(schema).csv(file).filter($"_corrupt_record".isNotNull).count()
    and spark.read.schema(schema).csv(file).select("_corrupt_record").show().
    Instead, you can cache or save the parsed results and then send the same query.
    For example, val df = spark.read.schema(schema).csv(file).cache() and then
    df.filter($"_corrupt_record".isNotNull).count().

    • @TRRaveendra
      @TRRaveendra 2 years ago

      cust_df.cache()
      Cache the dataframe and it won't raise the exception.

    • @balajia8376
      @balajia8376 2 years ago

      @@TRRaveendra Yes I did; even after that it still does not allow a query on _corrupt_record is null or is not null.

    • @balajia8376
      @balajia8376 2 years ago

      Seems badRecordsPath is the only solution.
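    For anyone hitting the same AnalysisException, the two fixes discussed in this thread look roughly like this. A sketch, not the video's exact notebook; the badRecordsPath option is Databricks-specific, and the output path used for it here is an assumption:

        # Fix 1: cache the parsed DataFrame first, then the corrupt-record
        # query is allowed again.
        cust_df = (spark.read
                   .option("header", True)
                   .schema(schema)  # includes _corrupt_record, as in the sketch above
                   .csv("dbfs:/FileStore/tables/csv_with_bad_records.csv"))
        cust_df.cache()
        cust_df.count()  # materialize the cache before filtering
        cust_df.filter("_corrupt_record is not null").show(truncate=False)

        # Fix 2 (Databricks only): divert unparseable rows to files instead of
        # a column; bad rows land under this path as JSON records.
        clean_df = (spark.read
                    .option("header", True)
                    .option("badRecordsPath", "dbfs:/tmp/bad_records")  # assumed path
                    .csv("dbfs:/FileStore/tables/csv_with_bad_records.csv"))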

  • @balajia8376
    @balajia8376 2 years ago

    cust_df.select("_corrupt_record").show() is working, but it is not allowing is null or is not null, e.g. cust_df.select("_corrupt_record is null").show(). Let me know if this works for you. Thank you.

  • @swagatikatripathy4917
    @swagatikatripathy4917 2 years ago +1

    Why do we write inferSchema = true?

    • @TRRaveendra
      @TRRaveendra 2 years ago +2

      inferSchema = True derives column datatypes from the data.
      header = True takes the column names from the file's first line.
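      In code, the two options together look like this (the file path is assumed for illustration). Without inferSchema, every CSV column is read as string; inferring costs an extra pass over the file:

          df = (spark.read
                .option("header", True)       # first line becomes column names
                .option("inferSchema", True)  # scan data to guess datatypes
                .csv("dbfs:/FileStore/tables/customers.csv"))
          df.printSchema()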

  • @sachintiwari6846
    @sachintiwari6846 1 year ago

    Woah, what an explanation!

  • @srikanthbachina7764
    @srikanthbachina7764 1 year ago

    Hi, please share your contact details. I am looking for Python, PySpark, and Databricks training.

  • @balajia8376
    @balajia8376 2 years ago

    root
     |-- cust_id: integer (nullable = true)
     |-- cust_name: string (nullable = true)
     |-- manager: string (nullable = true)
     |-- city: string (nullable = true)
     |-- phno: long (nullable = true)
     |-- _corrupt_record: string (nullable = true)

    display(cust_df.filter("_corrupt_record is not null")) gives:

    FileReadException: Error while reading file dbfs:/FileStore/tables/csv_with_bad_records.csv.
    Caused by: IllegalArgumentException: _corrupt_record does not exist. Available: cust_id, cust_name, manager, city, phno
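    A likely cause, for readers who hit this: _corrupt_record is not a physical column in the CSV file, so when a query references only that column, column pruning asks the file reader for a field it cannot find (hence "Available: cust_id, cust_name, manager, city, phno"). Caching the full DataFrame first, as in the sketches above, materializes all columns and is the workaround Spark's own error message suggests.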