End to End Pyspark Project | Pyspark Project

  • Published on 29 Dec 2024

Comments • 104

  • @kuladeepk1255
    @kuladeepk1255 a year ago +5

    Absolutely blown away by this YouTube video! In just one word: phenomenal. It's like diving into an encyclopedia dedicated to CI/CD pipelines. My quest for a basic explanation led me to countless sources, but this video turned out to be an absolute goldmine.

  • @naren06938
    @naren06938 months ago +1

    He is a perfect trainer: he never says things like "Likewise, do the remaining ones yourself" and then moves on by skipping some stuff. His commitment to teaching from scratch to advanced without skipping anything is why he is always great.

  • @geetak5285
    @geetak5285 11 months ago +2

    Great, nice sales KPI use case, simple explanation, very intuitive. Thanks for your content contribution.

  • @safkaify7875
    @safkaify7875 5 months ago +2

    Well spoken, well prepared, nicely presented. Thank you for helping others. One suggestion (IMHO): I would reduce the last 10 minutes to 2 to 3 minutes, for example: In dashboard, instead of showing the removal of each and every dataframe, I would just show the removal of one, and tell the audience "Likewise, you can remove all the other dataframes". Same thing for adding title (header) to each visualization and arranging visualizations: I would just do it for one and tell audience "Likewise you can add title to all other visualization and arrange them per your requirements". Then I would just fast forward (skip) to show the final view of the dashboard with a few seconds of my comments.

  • @hritikapal683
    @hritikapal683 a year ago +4

    Absolutely terrific content; done with my first PySpark project. Subscribed too for more project videos, keep them coming ✨

  • @Tasfiomi
    @Tasfiomi 6 months ago +1

    Well explained and detailed! Superb content. I appreciate your time and effort!

  • @IlseZubieta
    @IlseZubieta 10 months ago +2

    You're simply incredible! Thanks for uploading this for us :) God bless you!

    • @safkaify7875
      @safkaify7875 5 months ago

      God bless you indeed for doing a phenomenal job (and so helping others like me).

  • @1112electronics
    @1112electronics 5 months ago +1

    Thanks a lot for creating such a great video by doing and explaining. Great job. Awesome, keep it up.

  • @sanjeevpandey2753
    @sanjeevpandey2753 7 months ago +1

    Hats off to your effort, man! Keep rocking with awesome content.

  • @prasannakumar7097
    @prasannakumar7097 7 months ago +1

    Nice explanation. Please do more PySpark projects.

  • @AGenerationthatFearsGod
    @AGenerationthatFearsGod 3 months ago +1

    You have good content. You should upload more end-to-end projects. It will definitely give your channel the credit it deserves.

  • @RaviKumar-o4p2o
    @RaviKumar-o4p2o a year ago +2

    Nice explanation. Very easy to understand. Thank you very much.

  • @ETLMasters
    @ETLMasters a year ago +2

    Thanks.
    Would love to see more projects like these in future.

  • @sagardevara4174
    @sagardevara4174 3 months ago +1

    Awesome work

  • @SanjayKumar-qp8ss
    @SanjayKumar-qp8ss 3 months ago +1

    Amazing content, thanks a ton!

  • @prasannakumar7097
    @prasannakumar7097 7 months ago +1

    Nice video, please do more projects on PySpark.

  • @bhavyakanzariya7124
    @bhavyakanzariya7124 a year ago +1

    good examples. easy to understand.

  • @ranjansrivastava9256
    @ranjansrivastava9256 a year ago +1

    Very nice Video !!!! Great job !!!!

  • @vadderamu5422
    @vadderamu5422 a year ago +1

    Thanks a lot for the valuable information.

  • @fashionate6527
    @fashionate6527 a year ago +1

    great explanation

  • @dbarhate
    @dbarhate 5 months ago +1

    Nice stuff. Thanks.

  • @Laura11001
    @Laura11001 8 months ago +1

    This was very useful for me!

  • @sanskritisharma5447
    @sanskritisharma5447 a year ago +2

    Thanks for this project, really helpful.

  • @biramdevpawar9902
    @biramdevpawar9902 a year ago +2

    Instead of joining both DataFrames for each KPI, we can join them once and cache the result, which should improve performance.

  • @ayeshasyedKhan
    @ayeshasyedKhan 4 months ago +1

    Thanks brother..great content

    • @ayeshasyedKhan
      @ayeshasyedKhan 4 months ago

      Is this project data engineering or data analytics? Please reply.

  • @Lapookie
    @Lapookie a year ago +5

    So is Spark used only for aggregating and viewing data like this? So it's for data analysts? If not, could you show a real example with data coming from a source (for example, an API) and production code that submits a Spark job on batch data?

  • @ADESHKUMAR-yz2el
    @ADESHKUMAR-yz2el 6 months ago +1

    thanks man. good stuff

  • @naren06938
    @naren06938 months ago +1

    It's really awesome. I'm amazed by your teaching skills; how can you simplify such complex real-time projects so much? But as a beginner my doubt is: with Databricks PySpark we did almost all of this, so what is the use of Apache Beam, Airflow, AWS Glue, Azure ADF, GCP Dataflow, Dataproc, and the many other services, when you get the same results with one service?

    • @learnbydoingit
      @learnbydoingit months ago

      Will make a video on that.

  • @vikrammore-y4t
    @vikrammore-y4t a year ago +1

    Best content, thank you.

  • @Vinod-dd2vc
    @Vinod-dd2vc a year ago +1

    Super tutor🔥🙏

  • @rajutjr
    @rajutjr a year ago +2

    Price is in string format, so how do you get the aggregate sum?

  • @omkarm7865
    @omkarm7865 a year ago +1

    very nice

  • @sachinchavanmusic1412
    @sachinchavanmusic1412 a year ago +1

    very helpful video 🙏

  • @tedduharish7474
    @tedduharish7474 2 months ago

    Price is of string type, so can it be used in math formulas? I'm not getting the result, because I am using Spark SQL for the KPIs.

  • @nileshyadav7543
    @nileshyadav7543 a year ago +1

    great work

  • @SantoshKumar-yr2md
    @SantoshKumar-yr2md 10 months ago +1

    Well explained, though you should at least increase the zoom. Also, while deriving columns, we can derive all of them (year, month, quarter) in one go.

  • @Arif-rs2il
    @Arif-rs2il 6 months ago +1

    Thanks, bro

  • @tedduharish7474
    @tedduharish7474 2 months ago

    Bro, why does price show null when we define it as IntegerType, while it shows numbers when it is StringType?

  • @mmp9371
    @mmp9371 a year ago +2

    Hi Sir, one question on the query "frequency of customer who visited restaurant". In the Sales.csv file there are 27 records with restaurant entries, but your output gives 21 records. In your video you used ".agg(countDistinct("ordered_date"))". I changed that to "agg(count("customer_id"))" and got 27 records, matching the input file. Could you look into it and let me know if I have misunderstood anything?

    • @learnbydoingit
      @learnbydoingit a year ago +2

      Actually, I created the data with many duplicate records, so that may be the issue. It's good that you are debugging; that is what is expected.

  • @spicytuna08
    @spicytuna08 8 months ago +1

    Can you show an example where pandas failed due to memory and PySpark was able to overcome the memory problem?

  • @RasheedShaik-f1p
    @RasheedShaik-f1p 11 months ago +1

    Please upload more pyspark projects

  • @mnirani5230
    @mnirani5230 a year ago +1

    Thanks for this

  • @bhargavchaitanya3399
    @bhargavchaitanya3399 11 months ago +1

    Thank you

  • @giri41
    @giri41 8 months ago +1

    Sir, how can we get the system date and calculate the current month?

  • @AsadChoudhary-b3d
    @AsadChoudhary-b3d a year ago +1

    Hi, do you also support people in their data engineering jobs?

  • @shivamchandan50
    @shivamchandan50 10 months ago +1

    Please make a video on how to perform unit testing in Spark.

  • @KishoreReddy-c3v
    @KishoreReddy-c3v a year ago +1

    Hi bro, the content is very nice. Please make an end-to-end project on data engineering using AWS.

  • @techhelphub3
    @techhelphub3 11 months ago +1

    How can we save this dashboard as a PDF or share it with others? And can you please share the PPT that you presented in the video?

    • @learnbydoingit
      @learnbydoingit 11 months ago

      Share the link and grant access to the dashboard.

  • @kunal6782
    @kunal6782 a year ago +2

    Everything is very good... just try to not say "OK"

  • @VasiSultan
    @VasiSultan a year ago

    Hey, was there a need to use the inferSchema option when you are defining the schema manually? Can you please reply?
    Also, where can we download the dataset for practice?

    • @learnbydoingit
      @learnbydoingit a year ago

      If it's not in the description, you can get it on Telegram. And if a schema is defined, there is no need for inferSchema.

  • @RutujaBsk
    @RutujaBsk 8 months ago

    Thanks for the informative session. Can you please let me know if we can import all the functions together instead of importing them one by one ( eg: from pyspark.sql.functions import month,year,quarter ) like we import libraries pandas,matplotlib, etc in Python?

    • @learnbydoingit
      @learnbydoingit 8 months ago

      We can import all the functions in one go.

  • @stan8966
    @stan8966 a year ago +1

    Excellent content. Thank you

  • @vishnuannavarapu3888
    @vishnuannavarapu3888 a year ago

    All your videos are commendable. Could you please create a video on scheduling the execution of a Databricks notebook using Azure Data Factory (ADF) pipeline?

  • @gauravjoshi4035
    @gauravjoshi4035 a year ago +1

    thanks

  • @chitrarekhatiwari6629
    @chitrarekhatiwari6629 a year ago +1

    Thanks

  • @ibrahimhussain2442
    @ibrahimhussain2442 8 months ago

    Hi, I'm working on the pay-as-you-go service of Databricks. When I upload the file, it doesn't give me the path where the file is stored. It gets stored in the 'hive' of Databricks as a table, and sales.csv gets converted to Delta format. Can you tell me how to upload a CSV file and work on it? Thank you.

    • @learnbydoingit
      @learnbydoingit 8 months ago

      Are you able to upload to the Databricks metastore or not?

    • @ibrahimhussain2442
      @ibrahimhussain2442 8 months ago

      @learnbydoingit I was able to resolve it by going into Settings -> Advanced -> enabling DBFS File Browser.

  • @code4code-p5c
    @code4code-p5c 9 months ago +5

    Hi, that's a good explanation, I liked it. But my advice is: please don't say "OK" all the time, and don't go so fast. If you can improve these two things in your explanation, you can become a good tutor.

    • @learnbydoingit
      @learnbydoingit 9 months ago +1

      Yes, working on it. Thanks for your feedback.

    • @AGenerationthatFearsGod
      @AGenerationthatFearsGod 3 months ago

      @learnbydoingit Honestly, I don't think this is important. Krish Naik does this too, and his channel is very popular. You don't have to change.

  • @sobhareddymangoform
    @sobhareddymangoform 10 months ago +1

    This is a complete end-to-end project.

  • @zahidalam7831
    @zahidalam7831 5 days ago

    If the CSV file is in blob storage, then how does it work?

    • @learnbydoingit
      @learnbydoingit 5 days ago

      We mount the storage and then follow the same process.

    • @zahidalam7831
      @zahidalam7831 5 days ago

      I didn't get you, could you please elaborate?

  • @ishasingh1039
    @ishasingh1039 10 months ago

    Can I download the dashboard? If so, please tell me how.

  • @prachideokar7639
    @prachideokar7639 7 months ago +1

    Can we present this as a real-time project for 2 years of data engineering experience?

  • @savithweraniyagoda1297
    @savithweraniyagoda1297 a year ago +1

    How can we get the dataset?

    • @learnbydoingit
      @learnbydoingit a year ago

      Telegram link mentioned in the description

  • @saikalyangonuguntla594
    @saikalyangonuguntla594 a year ago +1

    Where can I execute my PySpark code? Is it free, or do I have to pay to use Databricks?

    • @learnbydoingit
      @learnbydoingit a year ago

      Databricks Community Edition, and it's free.

    • @saikalyangonuguntla594
      @saikalyangonuguntla594 a year ago +1

      Thank you

    • @saikalyangonuguntla594
      @saikalyangonuguntla594 a year ago +1

      I am preparing for interviews. I am watching and practicing your real-time PySpark projects, and they are very helpful for me.
      If possible, can you make a video on how to explain a real-time project in interviews, and what type of questions I can expect about real-time projects?

  • @giri41
    @giri41 8 months ago +1

    Can you help with my project, please? On a part-time basis, for payment.

    • @learnbydoingit
      @learnbydoingit 8 months ago +1

      Please join Telegram; we can discuss there.

    • @giri41
      @giri41 8 months ago

      Telegram channel name??

    • @giri41
      @giri41 8 months ago

      @learnbydoingit please contact me

  • @SantoshKumar-yr2md
    @SantoshKumar-yr2md 10 months ago +1

    proper column name

  • @CrceAC2
    @CrceAC2 a year ago

    Can you please provide the code of this video?

    • @learnbydoingit
      @learnbydoingit a year ago

      I would suggest coding along with the video; if you hit an issue, you can reach out.

  • @DanyDaniel-ky5rm
    @DanyDaniel-ky5rm 11 months ago

    You could explain it in Telugu, bro

  • @shubhamtandon9815
    @shubhamtandon9815 11 months ago

    Bro, please stop saying "ok". It's so frustrating.

  • @amanisdreaming3914
    @amanisdreaming3914 6 months ago

    It was running earlier, but now for this command:
    sales_df = sales_df.withColumn("order_month",month(sales_df.order_date))
    sales_df = sales_df.withColumn("order_quarter",quarter(sales_df.order_date))
    display(sales_df)
    this is the error I am getting:
    AnalysisException: [DATATYPE_MISMATCH.UNEXPECTED_INPUT_TYPE] Cannot resolve "month(order_date)" due to data type mismatch: parameter 1 requires "DATE" type, however, "order_date" is of "INT" type.;

    • @learnbydoingit
      @learnbydoingit 6 months ago

      Please convert it to the proper date format.

  • @prachideokar7639
    @prachideokar7639 7 months ago +1

    How can I connect on WhatsApp or Telegram?

    • @learnbydoingit
      @learnbydoingit 7 months ago

      Link in the description

  • @koushikthalanki1945
    @koushikthalanki1945 10 months ago +1

    great explanation