Airflow vs Argo: a quick comparison and when to use which one: th-cam.com/video/FAktWEwlezs/w-d-xo.html
Really informative video. It is a great starting point!!
Thank you.
This is the best video on the internet for complete beginners trying their hand at Docker/Airflow on a Windows PC! After multiple videos, you are the only one who's addressed the issue of not having the required packages in the default Docker image! Please keep making more such videos!
Glad to hear that it helped 🙏
Great stuff, Ashutosh!! More content on complex pipelines please, with third-party connectors too.
Thanks, we need more advanced ML and Data Engineering videos.
You can suggest a topic.
Thank you so much for this video! 12:27 onwards helped me solve an issue I'd been facing for a long time with Python packages not being found.
Good to hear that it helped you.
Hi... this video is a godsend... thanks a million, sir... 🙏
Also, I have a doubt. What if the ML project is bigger and more extensive? In that case, can we still include the entire project source (incl. src, artifacts, etc.) inside the dags folder?
Or should we mount the entire project? If we should mount it, how do we mount the project, sir?
Thanks.
You can use a Kubernetes environment, mount a persistent volume (PV) inside it, and place the codebase there. This should work.
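Either way, the idea is the same: the project gets mounted at a path inside the container, and the DAG imports from there. With plain docker-compose that would be an extra entry under the services' `volumes:` section; with Kubernetes, a PV mount in the pod spec. A minimal sketch, assuming a hypothetical mount point /opt/airflow/my_project and a hypothetical src.train module inside the project:

```python
import sys
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# Assumed mount point: the whole project is mounted here
# (a docker-compose `volumes:` entry or a Kubernetes PV mount).
PROJECT_ROOT = "/opt/airflow/my_project"
sys.path.insert(0, PROJECT_ROOT)


def train_model():
    # Hypothetical module/function living inside the mounted project.
    from src.train import run
    run()


with DAG(
    dag_id="mounted_project_example",
    start_date=datetime(2024, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    PythonOperator(task_id="train", python_callable=train_model)
```

Importing inside the task function (rather than at the top of the file) keeps the DAG parseable even if the mounted project has heavy dependencies.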
Great content, Ashutosh! We need consistent Airflow content, please. Perhaps a paid course is in the works?
I agree. Ashu sir, please bring more Airflow content, possibly with a cloud platform like AWS.
Thanks
Welcome
Hi Ashutosh, thanks for providing the overview. It would be great if you could share the project's GitHub repo link.
Thanks for this informative tutorial. It would be really helpful if you could make a tutorial video on an ML example using MLflow with Airflow.
Ok, probably the next video. So stay tuned 😜
You are coding everything in your local VS Code, and then the dockerized Airflow runs the DAG files.
But how can you run this code from VS Code? I want to see exactly how it's executed.
I guess you need to configure it to run the Python inside Airflow. How do you force it to do that?
If I want to use the CSV file in a subsequent function, what kind of path can I specify to access it? Is it /opt/airflow/ or the local directory path?
I have 5 external files containing pandas code. I used to run them one after another in Spyder. How can I write a DAG that runs code from these 5 files, where each reads input files from my local machine and generates output CSV files locally again?
I tried a few things but wasn't able to run the DAG. Thanks in advance.
Are you running Airflow inside the Docker container? If yes, the DAG will not read files from your local machine. An alternative would be: have the first task read from some URL, maybe GitHub, and from the second task onwards store output files directly without any file path prefix, e.g. pd.to_csv("abc.csv"). It will save in the container's local storage, and then you can read it back the same way, e.g. pd.read_csv("abc.csv"), and it will work.
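To make that pattern concrete, here is a minimal sketch. The URL and filenames are hypothetical placeholders: the first task reads a CSV from a public raw URL, writes its output with a bare relative path (which lands in the container's working directory, /opt/airflow by default), and the second task reads it back the same way:

```python
from datetime import datetime

import pandas as pd
from airflow import DAG
from airflow.operators.python import PythonOperator

# Hypothetical source: any publicly reachable raw CSV URL works here.
CSV_URL = "https://raw.githubusercontent.com/user/repo/main/input.csv"


def step_one():
    df = pd.read_csv(CSV_URL)
    # No path prefix: the file lands in the container's working
    # directory (/opt/airflow by default), not on your host machine.
    df.to_csv("abc.csv", index=False)


def step_two():
    # Reads the file the previous task wrote inside the container.
    df = pd.read_csv("abc.csv")
    df.to_csv("abc_processed.csv", index=False)


with DAG(
    dag_id="container_local_csv_example",
    start_date=datetime(2024, 1, 1),
    schedule_interval=None,
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="step_one", python_callable=step_one)
    t2 = PythonOperator(task_id="step_two", python_callable=step_two)
    t1 >> t2
```

Note that this only works if both tasks run in the same container (e.g. with the LocalExecutor); with the CeleryExecutor and multiple workers, the second task may land on a different container that doesn't have the file, in which case writing to a shared mounted volume is safer.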
What is the best laptop config for an MLOps engineer?
Use Ubuntu OS with 18 GB of RAM; it will suffice for your requirements. I am not suggesting having a GPU and all in your laptop, as it will increase the cost. For that you can use Google Colab.
Windows is not good for ML development. It creates many issues while installing things.
Can you give github link for code?
Make videos in Hindi.
There is a mixed audience, and English is understood by everyone. Also, I speak very simple English.
Thanks
Welcome 🙂