TensorFlow Tutorial 19 - Custom Dataset for Text with TextLineDataset

  • Published on Oct 18, 2024

Comments • 18

  • @AladdinPersson
    @AladdinPersson  4 years ago +3

    I was inspired and learned the basics of TensorFlow after I completed the TensorFlow specialization on Coursera. Personally, I think these videos I created give a similar understanding, but if you wanna check it out, you can. Below you'll find both affiliate and non-affiliate links. The pricing for you is the same, but a small commission goes back to the channel if you buy it through the affiliate link, which helps me create more videos in the future.
    affiliate: bit.ly/3JyvdVK
    non-affiliate: bit.ly/3qtrK39
    Here's the outline for the video:
    0:00 - Introduction and Dataset Overview
    1:39 - Load using TextLineDataset
    4:13 - Filtering Dataset
    8:12 - Creating Vocabulary
    13:43 - Numericalizing with TokenTextEncoder
    18:10 - Applying map on datasets
    20:35 - Simple Model
    22:30 - Dataset in Several Files
    25:50 - Sketch Load Translation Dataset
    29:22 - Ending
    This has shaped out to be a pretty long and thorough TensorFlow Playlist, hopefully you guys find these videos useful! :)

  • @wolfisraging
    @wolfisraging 4 years ago

    Yet another awesome video. 🔥🔥 Never done NLP in TensorFlow before... though now I can't wait to get my hands dirty :)

    • @AladdinPersson
      @AladdinPersson  4 years ago

      Thank you! I appreciate your support 🙏

  • @lendixful7932
    @lendixful7932 3 years ago +3

    Hi Aladdin. On line 66, why isn't it "vocabulary.update(word)"? We want to include the word that wasn't there before because it hadn't passed the threshold, right? Quite possibly it's a stupid question, but I don't get at all why "tokenized_text" is used instead of "word".
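
A note on the question above, in plain Python terms: `set.update` iterates over whatever it is given, so the choice of argument matters. The sketch below reuses the names from the comment (`tokenized_text`, `word`) with made-up values. It is not the code from the video, only an illustration of how the three calls differ.

```python
# Hypothetical stand-ins for the names mentioned in the comment;
# this is not the code from the video.
tokenized_text = ["the", "movie", "was", "great"]
word = "great"

# set.update() iterates over its argument, so a string is split into characters.
vocab_update_word = set()
vocab_update_word.update(word)

# set.add() inserts the word itself as a single element.
vocab_add_word = set()
vocab_add_word.add(word)

# Passing the whole token list adds every token in the line,
# including ones that may not have reached a frequency threshold.
vocab_update_line = set()
vocab_update_line.update(tokenized_text)

print(sorted(vocab_update_word))  # ['a', 'e', 'g', 'r', 't']
print(sorted(vocab_add_word))     # ['great']
print(sorted(vocab_update_line))  # ['great', 'movie', 'the', 'was']
```

So if the intent is to add only the word that just reached the threshold, `add(word)` or `update([word])` would do that, while `update(word)` would add single characters.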

  • @olajideosho9186
    @olajideosho9186 4 years ago +1

    Good day sir. If I complete this playlist, will I be in a good position to take the TensorFlow Developer Certification Exam by Google and pass? Thanks for the great content.

    • @AladdinPersson
      @AladdinPersson  3 years ago +1

      I haven't done the developer exam, so I don't know. But I took the TensorFlow specialization, and if I've succeeded, these videos should be more in-depth and concise than that specialization, because I thought it could be improved in some aspects.

    • @olajideosho9186
      @olajideosho9186 3 years ago

      @AladdinPersson Thank you sir.

  • @sagihaviv5675
    @sagihaviv5675 3 years ago +1

    Sir, how can I use this together with OCR? It's like text classification, but from an image.

  • @codedungeon7158
    @codedungeon7158 4 years ago

    After finishing this, where do you think we should go to continue learning?

    • @AladdinPersson
      @AladdinPersson  4 years ago +1

      Continue with TensorFlow official tutorials (but more advanced ones), implement research papers, do courses, and try doing projects :)

  • @doodidam3363
    @doodidam3363 3 years ago

    What is the difference between using Tokenizer and TextVectorization?

  • @nickpgr10
    @nickpgr10 2 years ago

    Hi people, this may be a silly question, but why did Aladdin use the same encode_map_fn for the test dataset?
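
One way to think about the question above: token IDs only mean something relative to the vocabulary the model was trained on, so test text is usually encoded with that same mapping. The toy sketch below is plain Python with made-up data and an assumed convention (0 for padding, 1 for unknown words). It is not the `encode_map_fn` from the video.

```python
# Hypothetical toy data; the vocabulary is built from the training split only.
train_lines = ["good movie", "bad movie"]
test_lines = ["good film"]

# Assumed convention: 0 is reserved for padding, 1 for out-of-vocabulary words.
PAD_ID, OOV_ID = 0, 1
vocab = {}
for line in train_lines:
    for token in line.split():
        # Assign the next free ID the first time a token is seen.
        vocab.setdefault(token, len(vocab) + 2)

def encode(line):
    """Map each token to its training-set ID, or OOV_ID if it was never seen."""
    return [vocab.get(token, OOV_ID) for token in line.split()]

# The same function is applied to both splits.
train_encoded = [encode(line) for line in train_lines]
test_encoded = [encode(line) for line in test_lines]

print(train_encoded)  # [[2, 3], [4, 3]]
print(test_encoded)   # [[2, 1]]
```

Here "good" gets the same ID in both splits, and the unseen word "film" falls back to the unknown ID. A separate encoder for the test set would assign IDs the model has never associated with those words.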

  • @guitalexmg6356
    @guitalexmg6356 1 year ago

    Hi,
    I have this error with this line of code :
    -> tokenizer = tfds.features.text.Tokenizer()
    -> module 'tensorflow_datasets.core.features' has no attribute 'text'
    Is it possible to know the exact tensorflow-datasets version of your environment?
    THX
    Alex
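
Regarding the error above: as far as I know, `tfds.features.text` was moved to `tfds.deprecated.text` in tensorflow-datasets 4.x, which would explain the missing attribute. The sketch below is one hedged way to handle both layouts. `find_text_module` is a hypothetical helper (not part of the library), and the fake module objects are only there so the example runs without tensorflow-datasets installed.

```python
from types import SimpleNamespace

def find_text_module(tfds_module):
    """Return the submodule expected to hold Tokenizer, trying the newer location first."""
    deprecated = getattr(tfds_module, "deprecated", None)
    if deprecated is not None and hasattr(deprecated, "text"):
        return deprecated.text
    features = getattr(tfds_module, "features", None)
    if features is not None and hasattr(features, "text"):
        return features.text
    raise AttributeError("no text module found under 'deprecated' or 'features'")

# Stand-in objects that mimic the two module layouts (assumptions for illustration).
fake_new = SimpleNamespace(
    deprecated=SimpleNamespace(text="deprecated.text"),
    features=SimpleNamespace(),
)
fake_old = SimpleNamespace(features=SimpleNamespace(text="features.text"))

print(find_text_module(fake_new))  # deprecated.text
print(find_text_module(fake_old))  # features.text
```

With the real library this would be used as `import tensorflow_datasets as tfds` followed by `tokenizer = find_text_module(tfds).Tokenizer()`, assuming the installed version still ships the class in one of those two places.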

  • @omerfarukbaspinar1
    @omerfarukbaspinar1 3 years ago

    Hello, can you do that for a machine translation dataset? I want to replicate the OPUS machine translation dataset with my own data.

  • @Agrover112
    @Agrover112 3 years ago

    Dude, couldn't you have used RaggedTensor instead of padding?

  • @jiageng1997
    @jiageng1997 3 years ago

    The default threshold of 200 is too high: it only picks up the punctuation marks (except commas). Try threshold=10 instead.
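
To illustrate the point about the threshold, here is a minimal sketch of a frequency-based vocabulary on a tiny made-up corpus. `build_vocabulary` here is a hypothetical plain-Python function, not the one from the video, and the numbers are scaled down so the effect is visible on three lines of text.

```python
from collections import Counter

def build_vocabulary(token_lists, threshold):
    """Keep only the tokens that occur at least `threshold` times in the corpus."""
    counts = Counter(token for tokens in token_lists for token in tokens)
    return {token for token, count in counts.items() if count >= threshold}

# Hypothetical tokenized corpus; "." appears in every line, words are rarer.
corpus = [
    ["great", "movie", "."],
    ["bad", "movie", "."],
    ["great", "acting", "."],
]

print(sorted(build_vocabulary(corpus, threshold=3)))  # ['.']
print(sorted(build_vocabulary(corpus, threshold=2)))  # ['.', 'great', 'movie']
```

A threshold that is high relative to the corpus size leaves mostly punctuation and other very frequent tokens, while a lower one starts to let content words in. Which value works best depends on the dataset size.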