Create Text Dataset in Vertex AI

  • Published on Oct 14, 2024

Comments • 12

  • @NuhilMehdy-mz6gi
    @NuhilMehdy-mz6gi 1 year ago

    Liked your Vertex AI series. Nicely presented with the exact level of detail. Hope you do some Feature Store, Metadata, and Matching Engine videos as well.

    • @cloud4datascience772
      @cloud4datascience772 1 year ago

      Thanks for the suggestions, I might take a look into that!

  • @JustinGreen-w4u
    @JustinGreen-w4u 1 year ago

    Hey, thanks for this tutorial. It's great! I uploaded the dataset, but I can't seem to use the data splits to train the model through the API. When I fill in the filter_splits in the TextDataset.run() method, I receive the error "Provided data item filter is not valid." I passed "training", "validation" and "test" to the three filter_splits parameters, matching the ml_use values recorded in my JSON input file. Any idea why? Thanks again!!

    • @cloud4datascience772
      @cloud4datascience772 1 year ago

      Hello, thank you for your message, I am glad that you enjoyed the tutorial! Could you point me to the documentation for this TextDataset.run() method?
      I can't find it on the official Google documentation page: cloud.google.com/python/docs/reference/aiplatform/1.22.0/google.cloud.aiplatform.TextDataset - but I am happy to think about potential solutions once you give me more details :)
      I have a feeling that to train your model you simply need to run the training job with AutoMLTextTrainingJob.run() and provide your dataset. If you defined the split into those 3 sets during dataset creation, it should be applied automatically.
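A possible explanation for the "Provided data item filter is not valid" error is that the filter arguments of AutoMLTextTrainingJob.run() expect a filter expression on the ml_use label rather than the bare split name. The sketch below is written under that assumption: the `labels.aiplatform.googleapis.com/ml_use=...` syntax is recalled from the Vertex AI documentation and has not been checked against this thread's dataset. The helper names, display names, and the project, location, and dataset arguments are hypothetical.

```python
# Assumed label key that Vertex AI attaches to data items imported with an ml_use value.
ML_USE_LABEL = "labels.aiplatform.googleapis.com/ml_use"


def split_filters():
    """Build one filter expression per split (assumed syntax, see lead-in)."""
    return {
        "training_filter_split": f"{ML_USE_LABEL}=training",
        "validation_filter_split": f"{ML_USE_LABEL}=validation",
        "test_filter_split": f"{ML_USE_LABEL}=test",
    }


def train_with_splits(project, location, dataset_name):
    """Run an AutoML text classification job using the filter-based splits."""
    # Imported lazily so split_filters() can be used without the SDK installed.
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=location)
    dataset = aiplatform.TextDataset(dataset_name)  # existing dataset ID or resource name
    job = aiplatform.AutoMLTextTrainingJob(
        display_name="text-classification-job",  # hypothetical name
        prediction_type="classification",
    )
    return job.run(
        dataset=dataset,
        model_display_name="text-classification-model",  # hypothetical name
        **split_filters(),
    )
```

If the splits were already assigned through ml_use at import time, calling run() with only the dataset may be enough, as the reply above suggests.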

  • @bullmilkers
    @bullmilkers 4 months ago

    I want to create a text dataset, but all of my text is in PDF form. How would I go about doing that?

    • @August-m8l
      @August-m8l several months ago

      same, did you find a resolution on this?

    • @bullmilkers
      @bullmilkers several months ago

      @August-m8l I didn't create a text dataset. I just used the PDFs as is: I created a GCS bucket with the PDFs and used Gemini multimodal to process the text within the files. Hope that helps!
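The approach described above could look roughly like the following sketch, which passes a PDF stored in GCS straight to a Gemini model through the Vertex AI SDK. The commenter did not share code, so everything here is an assumption: the model name, the prompt, the helper names, and the bucket layout are all hypothetical.

```python
def gcs_pdf_uris(bucket, object_names):
    """Turn object names into gs:// URIs, keeping only PDF files."""
    return [
        f"gs://{bucket}/{name}"
        for name in object_names
        if name.lower().endswith(".pdf")
    ]


def extract_text(project, location, pdf_uri, model_name="gemini-1.5-flash"):
    """Ask a Gemini model to return the text of one PDF stored in GCS."""
    # Imported lazily so gcs_pdf_uris() works without the SDK installed.
    import vertexai
    from vertexai.generative_models import GenerativeModel, Part

    vertexai.init(project=project, location=location)
    model = GenerativeModel(model_name)  # model choice is an assumption
    pdf_part = Part.from_uri(pdf_uri, mime_type="application/pdf")
    response = model.generate_content(
        [pdf_part, "Extract the plain text of this document."]
    )
    return response.text
```

The extracted text could then be written to a JSONL import file if a Vertex AI text dataset is still wanted.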

    • @August-m8l
      @August-m8l several months ago

      @bullmilkers thanks!

  • @sahil-if4cb
    @sahil-if4cb 1 year ago

    Could you please also attach the JSONL input file for sentiment analysis?

    • @cloud4datascience772
      @cloud4datascience772 1 year ago

      The file is in the linked GitHub repository: github.com/rafaello9472/c4ds/blob/main/Create%20text%20dataset%20in%20Vertex%20AI/input_file_sentiment_analysis.jsonl
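For readers who want to generate a similar file themselves, here is a minimal sketch that writes sentiment records as JSONL. It does not reproduce the linked file; the field names (`sentimentAnnotation`, `sentimentMax`, `textContent`, `dataItemResourceLabels`) follow the Vertex AI text sentiment import schema as recalled from the documentation, and the helper names and sample rows are invented for illustration.

```python
import json


def sentiment_line(text, sentiment, sentiment_max, ml_use=None):
    """Serialize one sentiment example as a single JSONL line."""
    record = {
        "sentimentAnnotation": {
            "sentiment": sentiment,
            "sentimentMax": sentiment_max,
        },
        "textContent": text,
    }
    if ml_use is not None:
        # Optional split assignment: "training", "validation" or "test".
        record["dataItemResourceLabels"] = {
            "aiplatform.googleapis.com/ml_use": ml_use
        }
    return json.dumps(record)


def to_jsonl(rows, sentiment_max):
    """Join (text, sentiment, ml_use) rows into JSONL content."""
    lines = [sentiment_line(t, s, sentiment_max, m) for t, s, m in rows]
    return "\n".join(lines) + "\n"


# Made-up sample rows, for illustration only.
sample = to_jsonl(
    [("Great tutorial", 2, "training"), ("Too short", 0, "test")],
    sentiment_max=2,
)
```

Writing `sample` to a `.jsonl` file and uploading it to a GCS bucket would give an input file of the kind used for the dataset import.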

  • @computadorhumano949
    @computadorhumano949 1 year ago

    How do I create my own text classification dataset?

    • @cloud4datascience772
      @cloud4datascience772 1 year ago

      Hey, I am not quite sure what you want to know. Could you explain your question a little more?