Unit testing with Databricks | Jonathan Neo | November 2021

  • Published on 5 Sep 2024
  • Just like eating vegetables, no one likes writing tests. However, writing unit tests is good for your programming diet. It helps ensure that data flows from one end of the pipeline to the other without any hitches.
    In this talk, Jonathan Neo, Senior Data Engineer at Cuusoo, will explain why and how you can write unit tests, and where unit testing fits in the bigger picture.
    Jonathan will demo how you can write your own unit tests in Databricks using Databricks Connect and pytest (a popular Python testing library), and how to automate the execution of unit tests using CI/CD pipelines.
    --
    This talk was given at the November 2021 event of the Melbourne Databricks User Group.
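
    As a rough illustration of the kind of unit test the description refers to, here is a minimal pytest-style sketch. It is not taken from the talk: the function `add_total_price` and the column names are hypothetical, and plain Python dictionaries stand in for DataFrame rows so that the example runs without Spark or a Databricks cluster.

```python
# Hypothetical transformation used only for illustration.
# Assumption: in a real pipeline this logic would operate on a Spark DataFrame;
# here each row is a plain dict so the example is self-contained.
def add_total_price(rows):
    """Return new rows with a total_price field (quantity * unit_price)."""
    return [
        {**row, "total_price": row["quantity"] * row["unit_price"]}
        for row in rows
    ]


# pytest collects functions whose names start with test_; plain asserts suffice.
def test_add_total_price_computes_product():
    input_rows = [
        {"order_id": 1, "quantity": 2, "unit_price": 5.0},
        {"order_id": 2, "quantity": 0, "unit_price": 9.5},
    ]
    expected = [
        {"order_id": 1, "quantity": 2, "unit_price": 5.0, "total_price": 10.0},
        {"order_id": 2, "quantity": 0, "unit_price": 9.5, "total_price": 0.0},
    ]
    assert add_total_price(input_rows) == expected


def test_add_total_price_does_not_mutate_input():
    input_rows = [{"order_id": 1, "quantity": 3, "unit_price": 1.5}]
    add_total_price(input_rows)
    # The transformation should return new rows, leaving the input untouched.
    assert "total_price" not in input_rows[0]
```

    Saved in a file whose name starts with `test_` (for example a hypothetical `test_transforms.py`), these tests can be run locally or in a CI/CD job with the `pytest` command. Keeping the transformation in a small function with explicit inputs and outputs is what makes it testable in isolation.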

Comments • 12

  • @harshaaleti5609
    2 years ago +12

    Unit testing info starts at 10:32

  • @jhonsen9842
    5 months ago

    Great session, thank you very much.

  • @MuzicForSoul
    2 months ago

    Hi Jonathan, thanks for this demo, this is fantastic. A few things have changed in these 2 years, but the basics are the same. When are you going to show us the remaining two parts, integration testing and data quality testing? Or, if you already have those videos, can you please upload them to your channel? Thanks.

  • @anoj4985
    2 years ago +3

    Good stuff! Liked and subscribed! :)

  • @adityaranjanmohanty5980
    2 years ago

    Thanks a ton. Loved it.

  • @BjarneThorsted
    2 years ago +1

    Great video! Since databricks-connect is now deprecated, how should we set up unit testing?

    • @yoshitjuh
      2 years ago

      Hi Bjarne, have you managed to get an answer on this somewhere else?

    • @BjarneThorsted
      2 years ago +6

      @yoshitjuh, actually there's an update to the databricks-connect package so that it now supports runtime 10.4 LTS, but Databricks apparently recommends not using it and instead using dbx by Databricks Labs to set up a project, run all unit tests locally, and get convenient command-line functions for deployment and running jobs. Still not sure about testing during pull requests and stuff like that.

  • @allieubisse316
    2 years ago

    informative

  • @peterko8871
    1 year ago

    Why are 45 minutes needed for a demo example?

  • @brijesh0808
    1 year ago

    Nothing is visible in your code screenshots. Very bad presentation.