Vertex AI Pipelines - The Easiest Way to Run ML Pipelines

  • Published on 14 Oct 2024

Comments • 70

  • @razalminhas6349 1 year ago +9

    This video should be first in search results when searching for Vertex AI pipelines. Thanks for making it!

  • @jobiquirobi123 2 years ago +4

    Thank you Sascha. Great video to start learning more about all the features in Vertex AI. Keep up the good work!

  • @Pirake123 5 months ago +4

    This is better than the GCP videos. Amazing, thank you!!

    • @ml-engineer 5 months ago

      Thank you Pirake

  • @zbynekba 9 months ago +3

    Sascha, you can significantly enhance the intelligibility of your presentation by improving the audio quality. The distracting sound reflections from your office walls make listening stressful. The easiest no-cost remedy is close-miking, such as using a headset microphone for recording. Alternatively, if you prefer speaking to a distant microphone during recording, you could consider some acoustic treatment for your office space.

    • @ml-engineer 8 months ago +1

      Great feedback. I'm currently testing different setups to improve it.

  • @whatisthis7510 1 month ago +1

    In the middle of the Google Cloud ML Engineer course. They need to archive their entire set of course material and replace it with your videos. I learned more from your videos than anything else! Thank god for Germans or I would still be struggling. :)

    • @ml-engineer 1 month ago +1

      Love it. Thanks for watching my videos. Every minute watched is much appreciated.
      Have a wonderful day.

  • @eliegakuba 2 years ago +4

    Thank you so much for the video. It is well explained and very helpful. One thing worth mentioning is that artifacts were introduced as parameters to make it easier to work with gcsfuse, since the artifact's path points to the mounted folder instead of the actual location in GCS. Also, if possible, could you make a video explaining the improvements that KFP v2 brings compared to KFP v1? Thanks.

    • @ml-engineer 2 years ago +1

      True, the switch to artifacts as a reference path also helped to introduce the concept of ML Metadata.
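
To illustrate the point about artifact paths: an artifact's URI is a gs:// location, while the path that component code reads and writes goes through the Cloud Storage FUSE mount under /gcs/. The following is only a minimal sketch of that mapping, assuming the /gcs/ mount prefix; the helper names and the bucket name are made up for illustration and are not part of any SDK.

```python
GCS_URI_PREFIX = "gs://"
FUSE_MOUNT_PREFIX = "/gcs/"  # assumption: buckets are mounted under /gcs/


def uri_to_fuse_path(uri: str) -> str:
    """Map a gs:// URI to the local path exposed by the FUSE mount."""
    if not uri.startswith(GCS_URI_PREFIX):
        raise ValueError(f"not a Cloud Storage URI: {uri!r}")
    return FUSE_MOUNT_PREFIX + uri[len(GCS_URI_PREFIX):]


def fuse_path_to_uri(path: str) -> str:
    """Inverse mapping: local mounted path back to a gs:// URI."""
    if not path.startswith(FUSE_MOUNT_PREFIX):
        raise ValueError(f"not a mounted Cloud Storage path: {path!r}")
    return GCS_URI_PREFIX + path[len(FUSE_MOUNT_PREFIX):]


# Hypothetical bucket and object names, for illustration only.
example_uri = "gs://my-bucket/pipeline_root/model/model.joblib"
print(uri_to_fuse_path(example_uri))
```

Inside a component this means ordinary file I/O (open, joblib.dump and so on) can target the mounted path, while the gs:// URI is what gets recorded as the artifact's location.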

  • @miguelalba2106 1 year ago +1

    Great video! I really like your channel, everything is super clear

    • @ml-engineer 1 year ago +1

      Thank you Miguel
      Are there any other ML-related topics that you are interested in?

    • @miguelalba2106 1 year ago +1

      @ml-engineer It would be very nice to have a video explaining how to run components that were already containerized and then run using dsl.ContainerSpec, or how to do CI/CD for a Vertex AI pipeline.

    • @ml-engineer 1 year ago

      I wrote an article quite some time ago about CI/CD for Vertex AI Pipelines: medium.com/google-cloud/how-to-implement-ci-cd-for-your-vertex-ai-pipeline-27963bead8bd

  • @irfandogic9579 2 years ago +1

    Hi Sascha! First of all, thank you for the great explanation and source code. I am using Vertex AI and want to automate our ML process using Pipelines. I've followed your "basic pipelines" code and it worked. My question is: I have seen everywhere that kfp, aiplatform and pipeline-components should be installed with --user, but your example works without it (and so does my Vertex project). Do I still need to install them with --user, or can I just go without it? Regards, Irfan

    • @ml-engineer 2 years ago +1

      Hi Irfan
      You only need --user if you don't have root access to install the packages. Whenever you get access errors, try adding --user.
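
As a rough sketch of the difference, the only change is one extra flag on the pip command line. The helper below is hypothetical and just builds the command so both variants can be compared; the package names are the PyPI names of the three libraries mentioned in the question.

```python
import sys

# PyPI names of the packages discussed above.
PACKAGES = [
    "kfp",
    "google-cloud-aiplatform",
    "google-cloud-pipeline-components",
]


def build_pip_install_command(packages, user_install=False):
    """Return the pip command as an argument list (nothing is executed)."""
    command = [sys.executable, "-m", "pip", "install"]
    if user_install:
        # Installs into the per-user site-packages directory instead of
        # the system-wide one, so no root access is required.
        command.append("--user")
    command.extend(packages)
    return command


print(" ".join(build_pip_install_command(PACKAGES)))
print(" ".join(build_pip_install_command(PACKAGES, user_install=True)))
```

To actually install, the returned list could be passed to subprocess.run. Note that pip refuses --user inside a virtual environment, where it is not needed anyway.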

  • @souravthakur6222 2 years ago +2

    Thank you! Please share more end-to-end ML projects using Vertex AI Pipelines.

    • @ml-engineer 2 years ago

      Hi, Google has a great list of examples in their GitHub repository:
      github.com/GoogleCloudPlatform/vertex-ai-samples/tree/main/notebooks/official/pipelines
      Check it out; almost all of them are end-to-end examples.

    • @ml-engineer 2 years ago

      Just recently I released a new video including an end-to-end pipeline to create recommendations.

  • @raharijaonazolalainayannic7851 2 years ago +1

    Great video, Sascha. Is it easy to manage autoscaling with Vertex AI?

    • @ml-engineer 2 years ago

      Vertex AI Pipelines do not support autoscaling. But if you want to autoscale your deployed models for serving, that is possible.

    • @ml-engineer 2 years ago

      Thanks

    • @raharijaonazolalainayannic7851 2 years ago

      When you write your pipeline on top of a Kubeflow cluster, does it support autoscaling?

    • @ml-engineer 2 years ago

      No autoscaling for self-managed KFP on GKE either. You define the CPU and memory needed.

  • @o_o610 1 year ago +1

    Thank you so much for the video! Do you know whether Vertex AI Pipelines handles pipeline versioning or keeps a history of how the pipeline evolves?

    • @ml-engineer 1 year ago

      I always recommend putting your pipeline code into Git. That way you have every version of the pipeline available over time.
      Is that what you meant by versioning?

  • @gokulramasamy8361 1 year ago +1

    Thank you Sascha for the video. Is it possible to unit test a Vertex AI pipeline? If yes, can you suggest how to do it?

    • @ml-engineer 1 year ago

      Not as straightforward as it could be. It's best to think of each component as a simple Python function. This way you can abstract away the parts that do not need to be tested. The part that should be tested is your Python code for each of the components.
      You can create a component from a simple Python function by using
      create_component_from_func
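
A minimal sketch of that idea: keep the component logic in a plain Python function and test it with the standard unittest module. The reverse function and the test class are hypothetical examples, and the step that wraps the function into a component (for example with create_component_from_func) is left out so the sketch runs without KFP installed.

```python
import unittest


def reverse(text: str) -> str:
    """Component logic as a plain function (hypothetical example)."""
    return text[::-1]


class ReverseTest(unittest.TestCase):
    def test_reverses_characters(self):
        self.assertEqual(reverse("stressed"), "desserts")

    def test_empty_string_is_unchanged(self):
        self.assertEqual(reverse(""), "")


def run_tests() -> unittest.TestResult:
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ReverseTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_tests()
print("all tests passed:", result.wasSuccessful())
```

The same function object can then be handed to the component factory in the pipeline module, so the tested code and the deployed code stay identical.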

  • @LucasGomide 1 year ago +1

    Hey dude, I have one more question. In my context, the search must be filtered by UserID to avoid returning results that belong to another user.
    What's the best approach for that: creating an index for each user, or using Matching Engine filters?

    • @ml-engineer 1 year ago

      Hi Lucas
      I guess you are referring to one of those videos?
      th-cam.com/video/inAY6M6UUkk/w-d-xo.html
      th-cam.com/video/KMTApM5ajAw/w-d-xo.html
      No need to build an index for each user. That would be way too expensive, and you would probably reach the number of allowed indexes pretty quickly.
      The best solution is the built-in filtering that Matching Engine provides. This feature is meant for exactly use cases like yours.
      cloud.google.com/vertex-ai/docs/matching-engine/filtering
      Have a good day
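
As a sketch of how per-user filtering could look on the data side: each datapoint in the index input carries a restricts entry whose namespace and allow tokens can later be matched by the query. The namespace name user_id, the IDs and the embeddings below are made up, and visible_to is only a simplified local stand-in for illustration, not the service's matching logic.

```python
import json


def make_datapoint(datapoint_id, embedding, user_id):
    """Build one index record that is restricted to a single user."""
    return {
        "id": datapoint_id,
        "embedding": embedding,
        # Hypothetical namespace name; the allow list holds the tokens
        # a query must present to see this datapoint.
        "restricts": [{"namespace": "user_id", "allow": [user_id]}],
    }


def visible_to(datapoint, user_id):
    """Simplified local check: is this datapoint allowed for user_id?"""
    return any(
        restrict["namespace"] == "user_id"
        and user_id in restrict.get("allow", [])
        for restrict in datapoint.get("restricts", [])
    )


datapoints = [
    make_datapoint("doc-1", [0.1, 0.2, 0.3], "user-a"),
    make_datapoint("doc-2", [0.4, 0.5, 0.6], "user-b"),
]

# One JSON object per line, the usual shape of an index input file.
for datapoint in datapoints:
    print(json.dumps(datapoint))

print([d["id"] for d in datapoints if visible_to(d, "user-a")])
```

At query time the request would carry the same namespace with the current user's ID as the allowed token, so a single shared index can serve all users.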

  • @Smart-ls6xi 11 months ago +1

    Hello, I have a question. If I am working with a team, is it one person who is supposed to have a Vertex AI account that will be charged? Or will each user, though sharing the same project, be charged in their own account?

    • @ml-engineer 10 months ago

      Hi
      One person needs to register on Google Cloud and create a project. This person can invite additional people to the project.
      All invited users share the same project, and the project gets charged. Each project has a billing account that uses a credit card for payments.

  • @TheRobertjoellewis 1 year ago +1

    Great intro! Hmmm I get an error when I run the basic pipeline. "Internal error encountered. Please try again" - moving over to the docs.

    • @ml-engineer 1 year ago +1

      Take the compiled pipeline JSON and upload it via the UI to see if you get a different error there.

    • @TheRobertjoellewis 1 year ago +1

      @ml-engineer It was indeed a permissions error. I gave my account the right permissions and it worked :)

    • @ml-engineer 1 year ago +1

      @TheRobertjoellewis Good, glad it is working now.

  • @kadapa-rl6jg 2 years ago +1

    Hi,
    I saw your Medium post where you refer to Cloud Composer in your personal note about using Cloud Run. Can you please clarify why you advise Cloud Composer for Cloud Run jobs?

    • @ml-engineer 2 years ago

      Hi
      Many companies use Cloud Composer for processing heavy workloads. From my experience this can lead to a lot of challenges. By design Composer is an orchestration tool and only meant for orchestration. That's why I recommend offloading processing-heavy workloads to Cloud Run or Cloud Dataflow.
      If you don't need an orchestration tool and simply want to run a few Cloud Run jobs, you don't need Composer.
      Let me know if that helps answer your question =)

    • @kadapa-rl6jg 2 years ago

      @ml-engineer Can you please let me know if there are any documents, Medium notes or blogs on how to work with Cloud Composer?

    • @ml-engineer 2 years ago

      @kadapa-rl6jg I am a big fan of the Google documentation: cloud.google.com/composer/docs/tutorials
      Neil also has a few very good articles around Cloud Composer: medium.com/@kolban1

    • @kadapa-rl6jg 2 years ago

      @ml-engineer Thanks for the information. I shall go through it to understand this.

  • @geeglu 1 year ago +1

    Error importing aiplatform
    Tried following the Vertex AI documentation and while running:
    from google_cloud_pipeline_components import aiplatform as gcc_aip
    I get an error: ImportError: cannot import name '_dynamic' from 'kfp.components' (/opt/conda/lib/python3.10/site-packages/kfp/components/__init__.py)
    Any suggestions to resolve this error?

    • @ml-engineer 1 year ago

      Hi Geeglu
      aiplatform is not part of the google_cloud_pipeline_components package. To import aiplatform you need the google-cloud-aiplatform package: pypi.org/project/google-cloud-aiplatform/
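
An ImportError like the one quoted above may also come from kfp and google-cloud-pipeline-components versions that do not fit together; that is an assumption, not something confirmed in this thread. A minimal sketch for listing what is actually installed before changing imports, using only the standard library; the helper name is made up.

```python
from importlib import metadata

# PyPI distribution names of the packages discussed in this thread.
PACKAGES = (
    "kfp",
    "google-cloud-aiplatform",
    "google-cloud-pipeline-components",
)


def installed_versions(packages=PACKAGES):
    """Return {package name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

With the versions in hand, the release notes of each package can be checked to see which import paths that version actually provides.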

  • @fredericmolina6890 2 years ago +1

    Awesome video! Thanks

  • @ronaldboodram6466 1 year ago +1

    Excellent video

  • @kadapa-rl6jg 2 years ago +1

    Can you also create a session on troubleshooting Vertex AI?

    • @ml-engineer 2 years ago

      Hi
      Sure, any specific service?
      Usually everything is logged, though batch predictions can get a bit more complicated to troubleshoot.

    • @mariannakovalova8849 1 year ago +1

      @ml-engineer It would be great! I get the error "The DAG failed because some tasks failed. The failed tasks are: [concat]" for this tutorial and have no idea why or how to fix it... and I can't move on.

    • @ml-engineer 1 year ago

      @mariannakovalova8849 Hi Marianna, I just ran the notebook to ensure everything is working as expected. I could not reproduce the error you get for the basic pipeline.
      Head to the logs and check the detailed error information. If you like, post them here and I might see why it is failing.

    • @mariannakovalova8849 1 year ago

      @ml-engineer com.google.cloud.ai.platform.common.errors.AiPlatformException: code=RESOURCE_EXHAUSTED, message=The following quota metrics exceed quota limits: aiplatform.googleapis.com/custom_model_training_cpus, cause=null; Failed to create custom job for the task. Task: Task name: concat, Task state: DRIVER_SUCCEEDED, Execution name: projects/2

    • @mariannakovalova8849 1 year ago

      @ml-engineer But when I follow the link, the resource usage is 0 or some small percentage.

  • @kanavdua4587 9 months ago

    Hi Sascha. I have been facing an error for the last 3 days. Please help me resolve it.

    • @ml-engineer 9 months ago

      Hi
      What kind of error?

    • @kanavdua4587 9 months ago

      I am not able to write it as a comment. I don't know why.

    • @kanavdua4587 9 months ago

      The DAG failed because some tasks failed.
      The failed tasks are: [concat].; Job (project_id = practice-training, job_id = 125471868915286016) is failed due to the above error.; Failed to handle the job: {project_number = 385236764312, job_id = 125471868915286016}

    • @ml-engineer 9 months ago

      @kanavdua4587 You can check what happened in the logs for each step/component in your pipeline.

    • @kanavdua4587 9 months ago

      @ml-engineer Please can you guide me a little 🙏🏻🙏🏻.
      @component()
      def concat(a: str, b: str) -> str:
          Logging.info(f"concatenating '{a}' and '{b}' results in '{a+b}' ")
          return a + b
      I am a beginner. I don't have any knowledge. Please help.
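
In the snippet above, Logging is capitalized and never imported, so the call would raise a NameError if the code ran exactly as posted; whether that is what made the concat task fail is not confirmed in this thread. Below is a minimal sketch of the function body using the standard-library logging module, imported inside the function because lightweight KFP components run in isolation and need their imports in the function body. The @component decorator is left out so the sketch runs without KFP installed.

```python
def concat(a: str, b: str) -> str:
    # Import inside the function: when the function is turned into a
    # lightweight component, only the function body is shipped and run.
    import logging

    result = a + b
    logging.info("concatenating %r and %r results in %r", a, b, result)
    return result


# Plain local call, useful as a quick check before building the pipeline.
print(concat("stres", "sed"))
```

Running the function locally like this is a cheap way to catch name and import errors before they surface as a failed task in the pipeline logs.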

  • @frederikbode9880 1 year ago

    so how's the stress? :D

    • @ml-engineer 1 year ago

      Which stress? 🙂🙂

  • @Juliodonadello 2 years ago +1

    0.96? Overfitted xd

    • @ml-engineer 2 years ago

      It was a very easy dataset; 0.96 is indeed correct.

    • @ml-engineer 2 years ago +1

      It's in the notebook; you can run it yourself. It uses the breast cancer dataset. Scores from around 95 upwards are achievable and the normal range for this dataset. You can get up to 98.

    • @Juliodonadello 2 years ago +1

      @ml-engineer 92 mine