Thank you so much, Jeff. Please never stop making these videos.
Full speed ahead! Thanks
I had to sign in just to say that this video is awesome! I have watched many of your videos several times and have obviously learned tonnes. When you mentioned 'Hickory' in here, I just broke out in a smile. I thought, 'Where else can you learn a tonne of cool stuff and get entertained with awesome humour at the same time?' Awesome video, professor, though of course you know that. Thanks again.
26:25 "Man in swim trunks is holding drink in his hand. [...] This thing really wants me to get fired." 😁
Hey Jeff, good to see you doing ML and DL stuff. I watched your Java tutorials when I was studying in college. Great to see you again, pal. Good life, bless up.
Thanks.
Can you make a tutorial on drowsy-driver detection using an LSTM and a CNN?
I will add it to my list of potential topics. Like the comment if you too are interested in that... or comment if you have another computer vision topic you're interested in.
Amazing video tutorial !! Thanks a lot :D
Thanks! Glad you liked it.
Jeff, thank you for such a useful tutorial that helped me indescribably!
In the data_generator method, we loop over each word in the description (including words that aren't in the vocabulary). By setting num_classes=vocab_size, does that mean that two different words can end up with the same binary class matrix? Are there any disadvantages to this?
Thanks a lot in advance!
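For what it's worth, here is a tiny NumPy sketch (with a made-up two-word vocabulary) of when two different words would share the same one-hot row: only when they are assigned the same integer index, for example if every out-of-vocabulary word is mapped to index 0.

```python
import numpy as np

# One-hot encoding by integer index (what keras' to_categorical does per word)
def one_hot(idx, num_classes):
    v = np.zeros(num_classes)
    v[idx] = 1.0
    return v

# Hypothetical tiny vocabulary; unknown words fall back to index 0
word_to_idx = {'dog': 1, 'cat': 2}
idx_a = word_to_idx.get('xylophone', 0)  # out of vocabulary -> 0
idx_b = word_to_idx.get('zeppelin', 0)   # out of vocabulary -> 0

# Two different OOV words collapse to an identical class vector,
# while distinct in-vocabulary words never collide:
same = np.array_equal(one_hot(idx_a, 3), one_hot(idx_b, 3))
```

So the collision only affects words outside the vocabulary, which all become indistinguishable to the model.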
Thanks for a concise, well-explained video! Is the encoder-decoder pattern used here?
In a way, I suppose. The CNN encodes the images in such a way that the LSTM can generate the caption. But it is not trained like an autoencoder, where the input and output are the same.
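The arrangement described above can be sketched in tf.keras roughly like this. The layer sizes here are illustrative, not necessarily the ones from the video: the CNN features come in pre-computed on one branch, the partial caption feeds an LSTM on the other, and the two are merged to predict the next word.

```python
from tensorflow.keras.layers import Input, Dense, Embedding, LSTM, Dropout, add
from tensorflow.keras.models import Model

# Illustrative sizes only
vocab_size, max_length, feat_dim = 100, 10, 2048

# Image branch: pre-computed CNN features act as the "encoding" of the picture
inputs1 = Input(shape=(feat_dim,))
fe1 = Dropout(0.5)(inputs1)
fe2 = Dense(256, activation='relu')(fe1)

# Text branch: the LSTM consumes the caption generated so far
inputs2 = Input(shape=(max_length,))
se1 = Embedding(vocab_size, 200, mask_zero=True)(inputs2)
se2 = Dropout(0.5)(se1)
se3 = LSTM(256)(se2)

# Merge the two branches and predict the next word of the caption
decoder1 = add([fe2, se3])
decoder2 = Dense(256, activation='relu')(decoder1)
outputs = Dense(vocab_size, activation='softmax')(decoder2)
caption_model = Model(inputs=[inputs1, inputs2], outputs=outputs)
```

So the "encoder" is a frozen CNN rather than something trained jointly, which is why it is encoder-decoder only in a loose sense.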
Another amazing video.
Although, I would like to make a small nit-picky suggestion, sir.
How about you make two videos?
1. One that also shows the data preparation stage, and
2. This video, basically: one that explains the entire flow.
Love from Nepal.
Sir, can you keep explaining new and interesting research papers in the ML field? It would be a great help, as your explanations are amazing.
FileNotFoundError: [Errno 2] No such file or directory: '/content/drive/My Drive/projects/captions/data/test2048.pkl' — how do I solve this error? We have also saved the files to our drive.
How can the embedding be improved so that the caption formation is more accurate?
Where can I get the dataset?
Hi Jeff, thank you for the video explanation. Can you please guide me on what to do next to improve image captioning?
This is not an area that I have gone really deep on, so my advice is fairly general. But more data and better embeddings seem to me to be the next direction.
It's better to use GRUs.
Hi, I'm finding a discrepancy in accuracy on the same test set between a model and the same model reloaded in a different Python kernel. Does anyone know why?
In the 'train neural network' step, what was the path for? Should I download something and use it here?
Hey Jeff - thanks for this video, it's super helpful! One question - I'm getting an error when I try to run caption_model.fit_generator(generator, epochs=1, steps_per_epoch=steps, verbose=1) while training the neural network:
ValueError: could not broadcast input array from shape (377,2048) into shape (377)
Any idea what's going on? It's the same code as yours.
That is strange. I tried rerunning with my environment, which is beta 1 of TF. It might be a version incompatibility. I am going to rerun/recheck everything in a few weeks when the class ends, now that TF 2.0 is officially out.
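As a side note, this exact kind of broadcast error can also come from NumPy being handed a ragged list of arrays: if one feature vector has a different shape than the rest, np.array cannot stack them into a clean 2-D array. A minimal reproduction with made-up shapes:

```python
import numpy as np

# Uniform shapes stack cleanly into a 2-D array:
ok = np.array([np.zeros(2048), np.zeros(2048)])
assert ok.shape == (2, 2048)

# A single mismatched element breaks the stacking. Depending on the
# NumPy version this raises either the "could not broadcast" ValueError
# quoted in the comment above or an "inhomogeneous shape" ValueError:
ragged = [np.zeros(2048), np.zeros((2048, 1))]
try:
    np.array(ragged)
    stacked = True
except ValueError:
    stacked = False
```

So it can be worth printing the shape of each feature array going into the batch before suspecting the TF version.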
Did you fix it?
Did you get a solution?
The GitHub link is not functional. Can you please upload the image captioning notebook to GitHub?
Hi Jeff! How can I train this neural net using FP16? I have an RTX 2070 and want to use the Tensor Cores for faster training.
Can we use word2vec instead of Glove for this project? How can we choose when to use word2vec or Glove? Is there any criteria? Look forward to your reply. Thank you very much.
It looks like the data generator is not using all the images, because we get 3 images per batch and 20 epochs: 20*3 = 60 images to train the model.
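A hedged sketch (names and sizes made up) of how a Keras-style generator can still cover every image: each epoch draws steps_per_epoch batches, and because the generator loops over the whole dataset indefinitely, all images are visited — it is steps per epoch, not epochs, that multiplies the batch size.

```python
def data_generator(items, batch_size):
    """Yield batches forever, cycling over the full dataset."""
    while True:  # Keras pulls from this generator indefinitely
        batch = []
        for item in items:
            batch.append(item)
            if len(batch) == batch_size:
                yield batch
                batch = []

# 9 "images", batch size 3 -> 3 steps per epoch cover everything
gen = data_generator(list(range(9)), batch_size=3)
epoch1 = [next(gen) for _ in range(3)]  # one epoch: 3 steps * 3 items
```

With steps_per_epoch set to len(images) // batch_size, one epoch already touches every image once.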
How can I make voice-based image captioning? Can you point me to a source, please?
This is not TF 2..., is it?
Why is the code mixing import keras... with import tensorflow.keras...?
Where do I get this glove.txt?
Could you please add a requirements.txt, or make a note of which versions of the modules you are using?
github.com/jeffheaton/t81_558_deep_learning/blob/master/tensorflow.yml
What is ImageNet?
Hi Jeff Heaton,
Love your work from Pakistan!
While running this code, I'm facing a 'KeyError' in the data_generator function:

for key, desc_list in descriptions.items():
    n += 1
    photo = photos[key+'.jpeg']   # <-- KeyError raised here
    # Each photo has 5 descriptions
    for desc in desc_list:

KeyError: '1000268201_693b08cb0e.jpg'

I'm new in this field and can't fix this. Can you please check it out and guide me? I'll be thankful.
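For anyone hitting the same thing: a KeyError like this usually means the keys of descriptions and photos are formatted differently, for instance one side already includes the file extension, so appending '.jpeg' produces a key that does not exist. A tiny sketch with made-up stand-in dictionaries showing the idea:

```python
# Made-up stand-ins for the real descriptions/photos dictionaries
descriptions = {'1000268201_693b08cb0e': ['a child climbs stairs']}
photos = {'1000268201_693b08cb0e.jpg': [0.1, 0.2]}

for key in descriptions:
    # photos[key + '.jpeg'] would raise KeyError here: the stored keys
    # end in '.jpg', not '.jpeg'. Printing a few keys from each dict,
    # e.g. list(photos)[:3], reveals the mismatch immediately.
    photo = photos[key + '.jpg']
```

So the first debugging step is to print a handful of keys from both dictionaries side by side and make the suffixes agree.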
I'm glad your wife is not a man. One of the benefits of being on the good side of Dr. Who.