Working With Notebooks in Azure Databricks

  • Published on Jan 31, 2025

Comments • 8

  • @datoalavista581 • 3 years ago

    Thank you for sharing

  • @prashanthxavierchinnappa9457 • 3 years ago

    Great video once again. Good info to get started with Databricks. I wonder if notebooks are also a standard way to deploy your workloads into production. Notebooks are usually meant for prototyping, so my question is: how do big companies write Spark code for production? Do you have any views on that?

    • @AdvancingAnalytics • 3 years ago +1

      Yep, notebooks used to be very much a scrappy/experimentation thing. However, we can now have notebooks deployed and locked down so people cannot edit them, which lets us treat them like any other disciplined, deployed piece of code. The benefit is that support teams can read the notebook, see the output of the various cells, and generally understand what's happening better than they would with plain code files.
      So yep, absolutely we use notebooks in production, deployed through DevOps and tested thoroughly!

  • @MoinKhan-cg8cu • 4 years ago

    Hi, it's a nice video and very informative too.
    Can you please share the notebook path where these are saved?

  • @edoardoroba3349 • 4 years ago

    Hi, can you please tell me how to get Databricks to render multiple displays? If I write two display(...) calls in the same cell, it only outputs the last one.

    • @AdvancingAnalytics • 4 years ago

      Afraid there isn't a clean way to have multiple display() calls in a single cell that I'm aware of! You can use print or .show() instead of display, but you lose the rich table explorer.
      Usually we just split the code over several cells and it's no problem (a rough sketch of both options follows below).
      Simon
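
      A minimal sketch of both workarounds, assuming a Databricks notebook where display() is available and df1/df2 are placeholder DataFrames (not anything from the video):

      # Inside one cell only the last display() renders, so fall back to .show(),
      # which prints a plain-text table for each DataFrame (no rich table explorer):
      df1.show(5)
      df2.show(5)

      # Or give each display() its own cell so both rich tables render:
      # Cell 1
      display(df1)
      # Cell 2
      display(df2)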

  • @arpitgupta5511 • 4 years ago

    Hi,
    Can you please tell me how I can save errors from PySpark running in Databricks to a table? I want to do that for log creation.
    Thanks,
    -Arpit

    • @AdvancingAnalytics • 4 years ago

      Hey Arpit - the cluster can be set up to automatically push logs out to DBFS or a mounted drive, so you can collect ALL logs from the Spark cluster, which will include any errors. But you would then need to dig through the logs.
      Instead, Python has "try", "except" and "finally", which work like a try/catch block. So you can do something like:

      try:
          df.count()
      except Exception as e:
          print(f'Dataframe failed to load with error "{e}"')

      Then pad it out with various different exception handlers. This is pure Python exception handling - you can grab more info from the Python docs: docs.python.org/3/tutorial/errors.html
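
      To land the caught error in a table, as the original question asks, a rough sketch could look like the below - the table name logs.error_log, the step name and the column names are placeholders, and spark/df are assumed to already exist in the notebook:

      from datetime import datetime

      try:
          df.count()
      except Exception as e:
          # Build a one-row DataFrame describing the failure and append it to a
          # (hypothetical) logging table so errors stay queryable later
          error_df = spark.createDataFrame(
              [(datetime.utcnow().isoformat(), "load_customer_df", str(e))],
              ["logged_at", "step_name", "error_message"],
          )
          error_df.write.mode("append").saveAsTable("logs.error_log")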