Astronomer
How Laurel Uses Airflow To Enhance Machine Learning Pipelines with Vincent La and Jim Howard
Timekeeping for knowledge workers is being transformed by AI and machine learning, and understanding how to leverage these technologies is crucial for improving efficiency and productivity.
In this episode, we’re joined by Vincent La, Principal Data Scientist at Laurel, and Jim Howard, Principal Machine Learning Engineer at Laurel, to explore the implementation of AI in automating timekeeping and its impact on legal and accounting firms.
Key Takeaways:
(01:54) Laurel's mission in time automation.
(03:39) Solving clustering, prediction and summarization with AI.
(06:30) Daily batch jobs for user time generation.
(08:22) Knowledge workers touch 300 items daily.
(09:01) Mapping 300 activities to seven billable items.
(11:38) Retraining models for better performance.
(14:00) Using Airflow for retraining and backfills.
(17:06) RAG-based summarization for user-specific tone.
(18:58) Testing Airflow DAGs for cost-effective summarization.
(22:00) Enhancing Airflow for long-running DAGs.
Resources Mentioned:
Vincent La -
www.linkedin.com/in/vincentla/
Jim Howard -
www.linkedin.com/in/jameswhowardml/
Laurel -
www.linkedin.com/company/laurel-ai/
Apache Airflow -
airflow.apache.org/
Ernst & Young -
www.ey.com/
Thanks for listening to The Data Flowcast: Mastering Airflow for Data Engineering & AI. If you enjoyed this episode, please leave a 5-star review to help get the word out about the show. And be sure to subscribe so you never miss any of the insightful conversations.
#AI #Automation #Airflow #MachineLearning
Views: 91

Videos

How Vibrant Planet's Self-Healing Pipelines Revolutionize Data Processing
Views: 77 • 21 days ago
Discover the cutting-edge methods Vibrant Planet uses to revolutionize geospatial data processing and resource management. In this episode, we delve into the intricacies of scaling geospatial data processing and resource allocation with experts from Vibrant Planet. Joining us are Cyrus Dukart, Engineering Lead, and David Sacerdote, Staff Software Engineer, who share their innovative approaches ...
How Vibrant Planet is Making the World’s Ecosystems More Resilient with Fully-Managed Apache Airflow
Views: 47 • 1 month ago
Vibrant Planet is an eco-tech startup that uses geospatial processing to collect massive amounts of data on the world’s forests, orchestrating all this data through Apache Airflow. They originally self-hosted their own Airflow environments, but soon faced scaling issues. In evaluating hosted solutions, they ultimately selected Astro, which helped free up valuable time for their engineering team...
Driving Next-Gen AI Applications with AWS and Astronomer
Views: 166 • 1 month ago
Artificial Intelligence is shaping how modern organizations make decisions, drive critical business outcomes, and support their customers and stakeholders. But behind the innovation is a continually changing set of requirements and best practices that are key to the success of getting AI into production. In this webinar, resident AI/ML experts from AWS and Astronomer explore: real world applica...
The Power of Airflow in Modern Data Environments at Wynn Las Vegas with Siva Krishna Yetukuri
Views: 579 • 1 month ago
Understanding the critical role of data integration and management is essential for driving business success, particularly in a dynamic environment like a luxury casino resort. In this episode, we sit down with Siva Krishna Yetukuri, Cloud Data Architect at Wynn Las Vegas, to explore how Airflow and other tools are transforming data workflows and customer experiences at Wynn Las Vegas. Key Take...
The Future of AI in Data Engineering With Astronomer’s Julian LaNeve and David Xue
Views: 229 • 1 month ago
The world of data orchestration and machine learning is rapidly evolving, and tools like Apache Airflow are at the forefront of these changes. Understanding how to effectively utilize these tools can significantly enhance data processing and AI model deployment. This episode features Julian LaNeve, CTO at Astronomer, and David Xue, Machine Learning Engineer at Astronomer. They delve into the in...
Powering the Texas Rangers World Series Win With AI on Airflow with Alexander Booth
Views: 184 • 1 month ago
The integration of data and AI in sports is transforming how teams strategize and perform. Understanding how to harness this technology is key to staying competitive in the rapidly evolving landscape of baseball. In this episode, we sit down with Alexander Booth, Assistant Director of Research and Development at Texas Rangers Baseball Club, to explore the intersection of big data, AI and baseba...
How Trellix Modernized their Data Stack and Eliminated Airflow Management Overhead with Astronomer
Views: 51 • 1 month ago
In modernizing their data stack, Trellix set out to move towards a cloud native scheduler, selecting Apache Airflow for its modern data orchestration capabilities. There were challenges and limitations initially with the managed offering from their cloud provider, so they turned to Astronomer to alleviate management overhead, ultimately giving the data engineering team time back to focus on del...
Apache Airflow: Where Data Engineers and ML Engineers Meet
Views: 167 • 2 months ago
This talk from GenAI World: Tools, Infra & Open Source Demo Days, covers why Apache Airflow is the leading workflow management and data orchestration framework, and THE platform on which data and machine learning engineers can unify their workflows, directly integrating data engineering with Generative AI pipelines using custom Python scripts or pre-built modules for many popular data tools.
Reliable Data Orchestration for AI Applications
Views: 93 • 2 months ago
As Apache Airflow has become the standard for both data processing and machine learning pipelines, it has been foundational in supporting AI applications by providing an easy-to-use platform to build advanced data workflows. The teams at Astronomer and Dosu know this first-hand, applying Generative AI to automate tedious tasks for developers. This webinar explores best practices for successfull...
Intro to Astro Hosted!
Views: 655 • 3 months ago
This video will take you through the basics of getting started with Astro Hosted, the best place to run Apache Airflow!
Airflow 2.9.0 Release: Meet The Contributors with Ryan Hatter and Ankit Chaurasia
Views: 164 • 3 months ago
Introducing Ryan Hatter and Ankit Chaurasia, dynamic contributors to the groundbreaking 2.9.0 Airflow release. Join us as they delve into the exhilarating array of new features they've crafted, such as the debut of Logical Operators for dataset conditional logic, and the unveiling of custom instance names for mapped tasks in the UI.
2024 Apache Airflow: Trends and Insights
Views: 366 • 4 months ago
Explore insights into how Apache Airflow is used across industries for data orchestration and workflow management. As organizations navigate the complexities of today’s data landscape, Apache Airflow emerges as a pivotal tool in streamlining processes and driving efficiency. In this webinar, we discuss strategies you can employ to maximize the full potential of Airflow in your modern data opera...
How Kiwi.com Achieves Shorter Time to Value with Astronomer
Views: 71 • 4 months ago
Kiwi.com was dealing with high infrastructure costs from cloud providers and was looking to reduce spend in this area. Astronomer helped bring these costs down, in addition to eliminating the need for Kiwi.com to hire a full team of engineers focused on running Apache Airflow, further increasing cost savings.
How Astronomer Helps Faire Deliver Critical Data on Time
Views: 59 • 4 months ago
Airflow is central to Faire’s data organization, powering their critical pipelines. Faire uses data for improving their products, as well as building better products for their customers. A stable infrastructure is paramount for Faire’s data needs and for their internal stakeholders - any downtime would be extremely detrimental. Astronomer has helped Faire ensure critical data is delivered on ti...
Astronomer Gives Custom Ink the Bandwidth to Focus on Key Business Initiatives
Views: 20 • 4 months ago
How Collibra Delivers Critical Analytics with Astronomer
Views: 44 • 4 months ago
Managing Airflow across multiple teams
Views: 278 • 4 months ago
DAG writing for data engineers and data scientists
Views: 619 • 5 months ago
Astronomer Democratizes Data Engineering with Microsoft Azure
Views: 137 • 5 months ago
The Astronomer Champions Program for Apache Airflow
Views: 192 • 5 months ago
How the World’s Best ML Teams Get AI into Production
Views: 121 • 5 months ago
How to manage connections in Airflow
Views: 668 • 6 months ago
What's new in Airflow 2.8
Views: 623 • 6 months ago
Supercharge your Airflows with Apache Airflow on Astro - An Azure Native ISV Service
Views: 177 • 6 months ago
Airflow 2.8 Release: Meet the Contributors
Views: 215 • 7 months ago
Unify Your Data to Power AI and Next-Gen Applications
Views: 80 • 7 months ago
Airflow - Where Data Engineers and Machine Learning Engineers Meet
Views: 303 • 8 months ago
Advanced Tips & Tricks for Improving Your Airflow DAGs!
Views: 1.5K • 8 months ago
How to Use the TaskFlow API and Traditional Operators Together to Create More Efficient DAGs!
Views: 1.4K • 8 months ago

Comments

  • @marcin2x4
    @marcin2x4 10 days ago

    Are the presented examples available in any code repo?

  • @shadabbigdel5017
    @shadabbigdel5017 13 days ago

    Thank you very much for the great presentation and hands-on session. We are going to use Airflow on EKS, and our development team needed a way to simulate their local environment to test their DAGs during development and become familiar with Airflow on Kubernetes. Your guide was extremely helpful.

  • @bryanpolito8576
    @bryanpolito8576 27 days ago

    Thanks

  • @AnchalGupta-ek3wr
    @AnchalGupta-ek3wr 1 month ago

    After adding the Python file and HTML file and restarting the web server, the plugin details are visible under Admin > Plugins. But the view is not populating in Cloud Composer. Is there anything else that needs to be done?

  • @AnchalGupta-ek3wr
    @AnchalGupta-ek3wr 1 month ago

    After adding the Python file and HTML file and restarting the web server and Postgres from Docker, the view is still not populating in my local Airflow. Is there anything else that needs to be done? I'm running Airflow from a Docker setup; my Airflow version is 1.10.15, which is pretty old, but I can't switch to a newer version right now.

  • @ap2394
    @ap2394 1 month ago

    Hi, is it possible to schedule a task using a dataset, or is that controlled at the DAG level? I mean, if I have two tasks in a downstream DAG, do I have the option to customize the schedule based on a task's upstream dataset?

  • @spikeydude114
    @spikeydude114 1 month ago

    Do you have LinkedIn?

  • @pgrvloik
    @pgrvloik 1 month ago

    Great!

  • @rkenne1391
    @rkenne1391 2 months ago

    Can you provide more context on the batch inference pipeline? Airflow is an orchestrator, so would you need a different framework to perform the batch inference itself?

  • @snehal4520
    @snehal4520 2 months ago

    Very informative, thank you!

  • @amirhosseinsharifinejad7752
    @amirhosseinsharifinejad7752 2 months ago

    Really helpful, thank you 😍

  • @PaulChung-rg6jv
    @PaulChung-rg6jv 2 months ago

    Tons of information. Any chance this can be thrown into a GitHub repo for us engineers who need more time to digest?

  • @munyaradzimagodo3983
    @munyaradzimagodo3983 2 months ago

    Thank you, well explained. I created an Express application to create DAGs programmatically, but the endpoints are not working.

  • @CarbonsHDTuts
    @CarbonsHDTuts 2 months ago

    This is really awesome. I love the entire video and always love content from you all, but could I please give some constructive feedback?

  • @mettuvamshidhar1389
    @mettuvamshidhar1389 2 months ago

    Is it possible to get the list of variables pushed through xcom_push in the first task (let's say extraction)? And can we pull that variables list with xcom_pull and have it as a group dynamically (instead of A, B, C)?

  • @bilalmsd07
    @bilalmsd07 3 months ago

    What if any of the subtasks fails? How do you surface the error but still let the remaining parallel tasks run?

  • @yevgenym9204
    @yevgenym9204 3 months ago

    @Astronomer Please share a direct link to the CLI library you mention (for proper file structure) th-cam.com/video/zVzBVpbgw1A/w-d-xo.htmlsi=HiJa9Afi-53yLZOG&t=873

    • @Astronomer
      @Astronomer 3 months ago

      You can find documentation on the Astro CLI, including download instructions, here: docs.astronomer.io/astro/cli/overview

  • @rohitnath5545
    @rohitnath5545 3 months ago

    Do we have a video on how to run Airflow using Docker on cloud containers? Running locally is fine for learning and testing, but the real work is seeing how it runs in the cloud. I'm a consultant, and for my clients an easy setup is the goal; with Airflow I don't see that.

    • @Astronomer
      @Astronomer 3 months ago

      Astronomer provides a managed service for running Airflow at scale and in the cloud. You can learn more at astronomer.io/try-astro

  • @marehmanmarehman9431
    @marehmanmarehman9431 3 months ago

    Great work, keep it up.

  • @ryank8463
    @ryank8463 3 months ago

    Hi, this video is really beneficial. I have a question about best practices for handling data transmission between tasks. I am building MLOps pipelines using Airflow. My model-training DAG contains data preprocessing -> model training, so there would be massive data transmission between these two tasks. I am using XCom to transmit data between them, but there's something like a 2 GB limitation in XCom. What's the best practice for dealing with this? Using S3 to send/pull data between tasks? Or should I simply combine the two tasks (data preprocessing -> model training)? Thank you.

    • @Astronomer
      @Astronomer 3 months ago

      Thank you! For passing larger amounts of data between tasks you have two main options: a custom XCom backend or writing to intermediary storage directly from within the tasks. In general we recommend a custom XCom backend as a best practice in these situations, because you can keep your DAG code the same; the change happens in how the data sent to and retrieved from XCom is processed. You can find a tutorial on how to set up a custom XCom backend here: docs.astronomer.io/learn/xcom-backend-tutorial. Merging the tasks is generally not recommended because it makes it harder to get observability and to rerun individual actions.

    • @ryank8463
      @ryank8463 3 months ago

      @@Astronomer Hi, thanks for your valuable reply. I'd also like to ask what level of granularity we should aim for when splitting out tasks. The more tasks there are, the more pushing and pulling of data from external storage happens, and when the data is large that brings some level of network overhead.
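
For illustration, a minimal version of the custom XCom backend recommended earlier in this thread could look like the sketch below. This is an assumption-laden sketch, not Astronomer's implementation: it assumes S3 plus the Amazon provider package, JSON-serializable payloads, and a placeholder bucket name.

    import json
    import uuid

    from airflow.models.xcom import BaseXCom
    from airflow.providers.amazon.aws.hooks.s3 import S3Hook


    class S3XComBackend(BaseXCom):
        # Offloads XCom payloads to S3, keeping only a short reference string
        # in the Airflow metadata database.
        PREFIX = "s3-xcom://"
        BUCKET = "my-xcom-bucket"  # placeholder bucket name

        @staticmethod
        def serialize_value(value, **kwargs):
            # Upload the (JSON-serializable) payload and store only a reference.
            hook = S3Hook()
            key = f"xcom/{uuid.uuid4()}.json"
            hook.load_string(json.dumps(value), key=key, bucket_name=S3XComBackend.BUCKET)
            return BaseXCom.serialize_value(S3XComBackend.PREFIX + key)

        @staticmethod
        def deserialize_value(result):
            # Resolve the reference back into the real payload on retrieval.
            value = BaseXCom.deserialize_value(result)
            if isinstance(value, str) and value.startswith(S3XComBackend.PREFIX):
                hook = S3Hook()
                key = value[len(S3XComBackend.PREFIX):]
                return json.loads(hook.read_key(key=key, bucket_name=S3XComBackend.BUCKET))
            return value

The backend would then be activated via the xcom_backend core config, e.g. AIRFLOW__CORE__XCOM_BACKEND=include.s3_xcom_backend.S3XComBackend (the module path here is hypothetical).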

  • @christianfernandez5717
    @christianfernandez5717 3 months ago

    Great video. I'd also be interested in a webinar on scaling the Airflow database, since I'm having some difficulties of my own with that.

    • @Astronomer
      @Astronomer 3 months ago

      Noted, thanks for the suggestion! If it's helpful, you can check out our guide on the metadata db docs.astronomer.io/learn/airflow-database. Using a managed service like Astro is also one way many companies avoid scaling issues with Airflow.

  • @dan-takacs
    @dan-takacs 4 months ago

    Great video. I'm trying to make this work with LivyOperator; do you know if it can be expanded or have partial arguments supplied to it?

    • @Astronomer
      @Astronomer 4 months ago

      It should work. Generally you can map over any type of operator, but note that some parameters can't be mapped over (e.g. BaseOperator params). More here: docs.astronomer.io/learn/dynamic-tasks
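
As a concrete sketch of that pattern (to be placed inside a DAG definition; the file paths and connection id below are placeholders, not values from the video):

    from airflow.providers.apache.livy.operators.livy import LivyOperator

    # Constant arguments go in .partial(); the mapped argument goes in .expand(),
    # creating one LivyOperator task instance per Spark application file.
    submit_jobs = LivyOperator.partial(
        task_id="submit_spark_job",
        livy_conn_id="livy_default",
        polling_interval=30,
    ).expand(
        file=[
            "local:///jobs/job_a.py",
            "local:///jobs/job_b.py",
        ],
    )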

  • @looklook6075
    @looklook6075 4 months ago

    32:29 Why is the 'Test' connection button disabled? So frustrating. Airflow makes it so hard to connect to anything; it's not intuitive at all. And your video just skipped over how to enable 'Test' and asked me to contact my deployment admin. Lol, I am the deployment admin. Can you show me how? I checked the website and the documentation is not helpful at all. I have been stuck for over a week on how to connect Airflow to an MSSQL Server.

    • @Astronomer
      @Astronomer 4 months ago

      The `test` connection button is disabled by default starting in Airflow 2.7 for security reasons. You can enable it by setting the test_connection core config to Enabled. docs.astronomer.io/learn/connections#test-a-connection. We also have some guidance on connecting to an MSSQL server, although the process can vary depending on your exact setup: docs.astronomer.io/learn/connections/ms-sqlserver

    • @quintonflorence6492
      @quintonflorence6492 2 months ago

      @@Astronomer Hi, where can I find the core config to make this update? I'm currently using Astro CLI. I'm not seeing this setting in the two .yaml files in the project. Thank you.
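
For reference, the setting mentioned above is a standard Airflow core config, so one way to supply it (in an Astro CLI project this would typically go in the project's .env file or Dockerfile rather than the two .yaml files) is the environment variable form:

    AIRFLOW__CORE__TEST_CONNECTION=Enabled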

  • @pichaibravo
    @pichaibravo 4 months ago

    Is it good to return a df many times in Airflow?

    • @Astronomer
      @Astronomer 4 months ago

      It's generally fine to pass dataframes in between your Airflow tasks, as long as you make sure your infrastructure can support the size of your data. If you use XCom, it's a good idea to consider a custom XCom backend for managing dataframes as Airflow's metadata db isn't set up for this specifically.
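
A minimal sketch of handing a dataframe between TaskFlow tasks, as discussed above (illustrative names; note that the XCom serializer must be able to handle the frame, e.g. with pickling enabled or a custom XCom backend):

    import pendulum
    from airflow.decorators import dag, task

    @dag(start_date=pendulum.datetime(2024, 1, 1), schedule=None, catchup=False)
    def df_handoff():
        @task
        def extract():
            import pandas as pd
            # Keep frames passed via default XCom small; they land in the metadata db.
            return pd.DataFrame({"a": [1, 2], "b": [3, 4]})

        @task
        def transform(df):
            # Receives the frame produced by the upstream task.
            return int(df["a"].sum() + df["b"].sum())

        transform(extract())

    df_handoff()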

  • @ziedsalhi4503
    @ziedsalhi4503 4 months ago

    Hi, I already have an existing Airflow project. How can I use the Astro CLI to run it?

  • @greatotool
    @greatotool 4 months ago

    Is the git repository public?

    • @Astronomer
      @Astronomer 4 months ago

      Yes! You can find it here: github.com/astronomer/webinar-demos/tree/best-practices-prod

    • @greatotool
      @greatotool 4 months ago

      Thanks!! 🙂 @@Astronomer

  • @user-by8um8bk1w
    @user-by8um8bk1w 5 months ago

    Please share the repository.

    • @Astronomer
      @Astronomer 4 months ago

      The repo is here: github.com/astronomer/webinar-demos/tree/best-practices-prod

  • @mcpiatkowski
    @mcpiatkowski 5 months ago

    That's a great intro and overview of Airflow for beginners! I very much like the datasets concept and the ability to see data lineage. However, I haven't found a way to make a triggered, dataset-aware pipeline execute with the parent DAG's execution date. Is that even possible at the moment?

    • @Astronomer
      @Astronomer 4 months ago

      Thanks! And that is a great question. It is not possible for the downstream dataset-triggered DAG to have the same logical_date (the new parameter equivalent to the old execution_date) as the DAG that caused the update to the dataset, but it is possible to pull that date in the downstream DAG by accessing context["triggering_dataset_events"]:

          @task
          def print_triggering_dataset_events(**context):
              triggering_dataset_events = context["triggering_dataset_events"]
              for dataset, dataset_list in triggering_dataset_events.items():
                  print(dataset, dataset_list)
                  print(dataset_list[0].source_dag_run.logical_date)

          print_triggering_dataset_events()

      If you use the above in your downstream DAG you can get that logical_date/execution_date to use in your Airflow tasks. For more info and an example with Jinja templating see: airflow.apache.org/docs/apache-airflow/stable/authoring-and-scheduling/datasets.html#fetching-information-from-a-triggering-dataset-event

    • @mcpiatkowski
      @mcpiatkowski 4 months ago

      @@Astronomer That is amazing! You are my hero for life! Thank you!

  • @veereshk6065
    @veereshk6065 5 months ago

    Hi, thank you for the detailed demo. I just started exploring dynamic task mapping, and I have a requirement where I need to get data from a metadata table and create a list of dictionaries:

        [
            {'colA': 'valueA', 'colB': 'valueB', 'colC': 'valueC', 'colD': 'valueD'},
            {'colA': 'valueA', 'colB': 'valueB', 'colC': 'valueC', 'colD': 'valueD'},
            {'colA': 'valueA', 'colB': 'valueB', 'colC': 'valueC', 'colD': 'valueD'},
        ]

    The above structure can be generated by fetch_metadata_task (a combination of BigQueryHook and PythonOperator). Now the question is: how do I generate the dynamic tasks from this list of dictionaries? For each dictionary I want to perform a set of tasks, e.g. GCSToBigQueryOperator, BigQueryValueCheckOperator, BigQueryToBigQueryCopyOperator, etc. The sample DAG dependency looks like this:

        start_task >> fetch_metadata_task
        fetch_metadata_task >> [GCSToBigQueryOperator_table1 >> BigQueryValueCheckOperator_table1 >> BigQueryToBigQueryCopyOperator_table1 >> connecting_dummy_task]
        fetch_metadata_task >> [GCSToBigQueryOperator_table2 >> BigQueryValueCheckOperator_table2 >> BigQueryToBigQueryCopyOperator_table2 >> connecting_dummy_task]
        fetch_metadata_task >> [GCSToBigQueryOperator_table3 >> BigQueryValueCheckOperator_table3 >> BigQueryToBigQueryCopyOperator_table3 >> connecting_dummy_task]
        connecting_dummy_task >> BigQueryExecuteTask >> end_task
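
The question above went unanswered in the thread; for illustration only, one possible pattern is mapping an entire task group over the metadata list (a sketch assuming Airflow 2.5+, with placeholder tasks standing in for the BigQuery operators):

    import pendulum
    from airflow.decorators import dag, task, task_group

    @dag(start_date=pendulum.datetime(2024, 1, 1), schedule=None, catchup=False)
    def per_table_pipeline():
        @task
        def fetch_metadata():
            # Placeholder for the BigQueryHook + PythonOperator lookup.
            return [{"colA": "valueA"}, {"colA": "valueB"}, {"colA": "valueC"}]

        @task_group
        def process_table(meta: dict):
            # Each mapped group instance receives one dict and runs its own chain,
            # standing in for GCSToBigQuery >> ValueCheck >> BigQueryCopy.
            @task
            def load(m: dict):
                return m

            @task
            def check(m: dict):
                return m

            check(load(meta))

        @task
        def finalize():
            # Runs once after all mapped groups, like the connecting dummy task.
            pass

        process_table.expand(meta=fetch_metadata()) >> finalize()

    per_table_pipeline()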

  • @ayushikhanna1094
    @ayushikhanna1094 5 months ago

    Is there any option available in the Airflow UI to auto-trigger?

  • @78salieri78
    @78salieri78 5 months ago

    Great video, with many examples, much appreciated!

  • @manCoder
    @manCoder 5 months ago

    Very nice introductory video. Thanks a lot for this.

  • @vladislavzadvornev4548
    @vladislavzadvornev4548 6 months ago

    Hi, thank you for a great video. I have one question: can I somehow start Astro locally inside my existing project, which already follows a different structure? I would very much like to benefit from the convenience of the Astro CLI, but there's no way I want to modify the structure of a project that has been in place for more than 1.5 years :)

  • @averychen4633
    @averychen4633 6 months ago

    You are the best

  • @user-ee6hz2zl9s
    @user-ee6hz2zl9s 6 months ago

    I am able to see DAGs in the sync container and in the scheduler, but not in the web UI. I am using the Kubernetes executor and the bitnami/airflow image.

  • @cloudlover9186
    @cloudlover9186 6 months ago

    I am running into the problem below; can it be achieved with the timetable concept? I have one DAG that should satisfy both of these schedule intervals: schedule_interval = '30 1,4,7,10,13,16,19,22 * * *' and '00 3,6,12,15,18,21,00 * * *'. Please help and guide.

    • @Astronomer
      @Astronomer 6 months ago

      Would you mind sharing info on what type of scheduling interval you'd like to achieve? Not sure what it is based on that string, unfortunately!

    • @cloudlover9186
      @cloudlover9186 6 months ago

      @@Astronomer Hi, we are in the process of changing a daily schedule to a 90-minute-frequency DAG, expected to run at 00:00, 01:30, 03:00 and so on, plus another new DAG with the same 90-minute frequency that should run at 00:20, 01:50, 03:20, etc. The point is that if I hard-code a future start date (for example, if today is 01/10, I hard-code 2024/01/11 00:00), future changes don't impact the start-date schedule; that said, we have been advised to research more and not hard-code the start date. FYI, we are using timedelta(minutes=90) as the schedule interval attribute. If we use current-date logic, then at deployment time (deployment time > start-date time) it executes immediately. How can we overcome this? Please help.
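
No answer followed in the thread; for illustration, a hedged sketch of the first of those cadences (a static past start_date anchors the 00:00 / 01:30 / 03:00 grid, and catchup=False limits the deploy-time backfill to at most the single most recent interval; the offset sibling DAG would use a start_date anchored at 00:20):

    from datetime import datetime, timedelta

    from airflow.decorators import dag, task

    @dag(
        start_date=datetime(2024, 1, 11),  # static anchor for the 90-minute grid
        schedule=timedelta(minutes=90),    # 00:00, 01:30, 03:00, ...
        catchup=False,                     # don't backfill every missed interval on deploy
    )
    def every_90_minutes():
        @task
        def run():
            print("running")

        run()

    every_90_minutes()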

  • @bananaboydan3642
    @bananaboydan3642 7 months ago

    Airflow wasn't able to locate my existing Python scripts. I receive this error: ImportError: cannot import name 'weeklyExtract' from 'dags' (unknown location)

    • @Astronomer
      @Astronomer 6 months ago

      Would you mind sharing how you're referencing the script in your code? And where are your Python scripts stored? Typically you'll need to create a sub-folder within the DAGs folder to store them, and then you can reference them from that path.
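
To make that concrete, a minimal sketch of such a layout (the weeklyExtract name comes from the error message above; the folder and file names are otherwise hypothetical):

    # Project layout (Airflow adds the DAGs folder to sys.path when parsing):
    #
    #   dags/
    #       my_dag.py
    #       scripts/
    #           __init__.py
    #           weekly_extract.py    # defines weeklyExtract()
    #
    # Inside dags/my_dag.py the script is then imported via its package path:
    from scripts.weekly_extract import weeklyExtract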

  • @richie.edwards
    @richie.edwards 7 months ago

    Thank you. Any link to a consumable resource discussing the topic? The video quality is very bad.

    • @Astronomer
      @Astronomer 6 months ago

      Yes, definitely; apologies for that! Check out this link, which points to a few different methods of managing your secrets: docs.astronomer.io/astro/secrets-management

  • @HarshGupta-wi1zn
    @HarshGupta-wi1zn 7 months ago

    Since I am using Azure Astro, there is no Astro CLI. What should I do in this case?

    • @Astronomer
      @Astronomer 7 months ago

      The Astro CLI works for Azure Astro as well!

    • @HarshGupta-wi1zn
      @HarshGupta-wi1zn 7 months ago

      @@Astronomer So where is the Astro CLI in Azure?

  • @pushpendudhara7764
    @pushpendudhara7764 7 months ago

    After adding the Python file and HTML file and restarting the web server, the plugin details are visible under Admin > Plugins. But the view is not populating in Cloud Composer. Is there anything else that needs to be done?

    • @pushpendudhara7764
      @pushpendudhara7764 7 months ago

      You have to be an Airflow admin to be able to view the new menu in the web server.

    • @Astronomer
      @Astronomer 7 months ago

      Ah, thank you for noting that! We've loved your comment so hopefully others can see it!

  • @maximilianrausch5193
    @maximilianrausch5193 7 months ago

    Do you have more resources on how to create plugins?

    • @Astronomer
      @Astronomer 7 months ago

      Definitely! Check out this guide for more: docs.astronomer.io/learn/using-airflow-plugins

  • @illiakaltovich
    @illiakaltovich 8 months ago

    Tamara Fingerlin, your approach of 'Live with an Astronomer' is really cool and organized. I gained some nice insights about the topic I am struggling with. Thank you so much! ❤

    • @Astronomer
      @Astronomer 7 months ago

      Thanks so much, Tamara is the best!

  • @RedShipsofSpainAgain
    @RedShipsofSpainAgain 8 months ago

    13:45 This is great. But one suggestion: show the assert `Today is {{ execution_date }}` where templating is not working (slide 12) next to where templating is working (slide 14), so the audience can compare the two easily side by side.

    • @Astronomer
      @Astronomer 7 months ago

      Thanks for the suggestion!

  • @Cam-xu1sq
    @Cam-xu1sq 8 months ago

    You skipped out a HUGE amount in the middle, lol; that seems to be a very common occurrence with these videos, tbh.

    • @Astronomer
      @Astronomer 8 months ago

      What do you mean by skipped out, Cam?

    • @Cam-xu1sq
      @Cam-xu1sq 8 months ago

      @@Astronomer You've missed a few steps. Also, this demo is a little outdated, as the latest version of Astronomer.Cosmos has had a complete rework so that the DbtTaskGroup function does everything. However, whenever I try to test it out I get a weird error with JaffleShop:

          Database Error in model orders (models/orders.sql)
          improper relation name (too many dotted names): raw_csvs.testing_dbt.raw_csvs.orders__dbt_backup

      This error only occurs for the dbt backups for staging tables (it doesn't impact views or seed tables). It's trying to query using Schema.Db_Name.Schema.Table, which obviously throws an error because Schema should come after Db_Name. I don't get this error if I use only the airflow-dbt Python package to do the seed, snapshot, test and run commands, so I've kept using those for now. If you can explain why I'm getting that error that'd be awesome, because I don't understand... I removed schema from my Airflow conn object called cams_db and passed in the schema and db_name with the profile_config; however, I still get the same error, which is frustrating.

  • @maximilianrausch5193
      @maximilianrausch5193 8 months ago

    Are the code examples available?

    • @Astronomer
      @Astronomer 7 months ago

      Yes, they are coming; apologies that the blog hasn't been published yet, but we will share them when it is!

  • @dani2500d
    @dani2500d 8 months ago

    Hey, awesome webinar! Thank you! I do have one question about best practices for structuring a DAG. So, it is better to put the task implementations (Python operators) into a separate file. If my tasks require a lot of imports, is it better to import inside every task (method), or is it fine to import it all at the top level of the tasks file?

    • @Astronomer
      @Astronomer 8 months ago

      I honestly usually just add them all to the top of the DAG file as TaskFlow tasks instead of pulling them in from files, but if you're using the same Python functions in multiple DAGs, the import method is probably best for you!
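
A small sketch of that trade-off (illustrative names): module-level imports run every time the scheduler parses the DAG file, so heavy libraries are often imported inside the task instead.

    import pendulum
    from airflow.decorators import dag, task

    @dag(start_date=pendulum.datetime(2024, 1, 1), schedule=None, catchup=False)
    def import_placement():
        @task
        def train():
            # Heavy import deferred to runtime, keeping DAG parsing fast.
            import pandas as pd
            return int(pd.DataFrame({"x": [1, 2, 3]})["x"].sum())

        train()

    import_placement()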

  • @user-ij5cf4vs5c
    @user-ij5cf4vs5c 9 months ago

    Hello, my Test button is disabled. Could you tell me how to fix that issue? I can't find anything on the web for troubleshooting that kind of problem.

    • @Astronomer
      @Astronomer 8 months ago

      The Test button was disabled in the latest release of Airflow, but you can re-enable it; check out the following docs link: docs.astronomer.io/learn/connections#:~:text=You%20can%20enable%20connection%20testing,Enabled%20in%20your%20Airflow%20environment.