21:22 - Please use transforms.Resize((IMAGE_SIZE, IMAGE_SIZE)) instead of transforms.Resize(IMAGE_SIZE),
because transforms.Resize(IMAGE_SIZE) resizes images PROPORTIONALLY.
For example, a celebrity image (3, 178, 218) is resized to (3, 72, 64) by transforms.Resize(IMAGE_SIZE).
This doesn't affect MNIST, because its images are square. For the CelebA dataset, however, DCGAN shows no errors but doesn't converge.
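A minimal sketch of the fix (assuming the IMAGE_SIZE and CHANNELS_IMG constants from the video's train.py):

import torchvision.transforms as T

IMAGE_SIZE = 64
CHANNELS_IMG = 3
transform = T.Compose([
    T.Resize((IMAGE_SIZE, IMAGE_SIZE)),  # a tuple forces a square output instead of a proportional resize
    T.ToTensor(),
    T.Normalize([0.5] * CHANNELS_IMG, [0.5] * CHANNELS_IMG),
])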
True
Thanks a lot. After using your method, my code finally works out!
Thanks for sharing!
It's resized from (3, 218, 178) to (3, 78, 64); the smaller edge is matched to IMAGE_SIZE.
After I set it according to your method, the network still does not converge on the CelebA dataset. How should I solve this problem? Looking forward to your answer.
Note: I had previously done a video on DCGAN but with this GAN playlist I had several things I wanted to make better in that tutorial. This video should be better and also fit better to this playlist :) Also if you have recommendations on GANs you think would make this into an even better resource for people wanting to learn about GANs let me know in the comments below and I'll try to do it!
I learned a lot and was inspired to make these GAN videos by the GAN specialization on Coursera, which I recommend. Below you'll find both affiliate and non-affiliate links; the pricing for you is the same, but a small commission goes back to the channel if you buy it through the affiliate link.
affiliate: bit.ly/2OECviQ
non-affiliate: bit.ly/3bvr9qy
Timestamps:
0:00 - Introduction
0:26 - Quick Paper Recap
4:31 - Implementation of Discriminator
9:38 - Implementation of Generator
15:27 - Weight initialization and test model
19:09 - Setup of training
31:36 - Training on MNIST
32:20 - Modifications to CelebA dataset
33:52 - Training on CelebA and ending
PLEASE PLEASE SHOW HOW TO MAKE THE PROGRAM RUN WITH A BIGGER IMAGE SIZE
26:01 - Training GANs is not unsupervised. When you use the BCE criterion to calculate the loss for the Generator and Discriminator, you pass torch.zeros_like() and torch.ones_like() as the labels for the disc_real and disc_fake predictions. We could use the labels from the DataLoader instead, but either way we do need labels to calculate the loss and optimize the weights, i.e. supervised learning.
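For illustration, a minimal runnable sketch of what this comment is pointing at, with stand-in tensors for the discriminator outputs (the variable names follow the video; the shapes are just an example):

import torch
import torch.nn as nn

criterion = nn.BCELoss()
disc_real = torch.rand(8, 1)  # stand-in for discriminator outputs on a real batch
disc_fake = torch.rand(8, 1)  # stand-in for discriminator outputs on a fake batch
# the "labels" are created on the fly rather than read from the dataset
loss_disc_real = criterion(disc_real, torch.ones_like(disc_real))   # real -> 1
loss_disc_fake = criterion(disc_fake, torch.zeros_like(disc_fake))  # fake -> 0
loss_disc = (loss_disc_real + loss_disc_fake) / 2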
I think you have a good point, maybe a better term is self-supervised. I'm honestly not sure if the distinction between unsupervised and self-supervised is perfect either but you're right that it's not unsupervised by definition. With GANs you do not need a labelled dataset, you just need a dataset and the labels are inferred by themselves. Do you find it better to use self-supervised in this context?
@@AladdinPersson You are right, GANs don't specifically need a labelled dataset, just a set of labels you provide to both criteria. By that definition I think self-supervised can be an appropriate term.
I'd also want to thank you here for all your tutorials. They have really helped me not just learn pytorch but get the proper intuition behind Machine Learning approaches. Keep it up!
"Unsupervised learning is a type of machine learning in which the algorithm is not provided with any pre-assigned labels or scores for the training data."
- from Wikipedia
It's not that you can't have labels in unsupervised learning, you just can't have them ahead of time.
@@AladdinPersson There is more than one definition of self-supervised learning... but GANs (in general) don't qualify against any definition I'm aware of.
Some people restrict self-supervised learning to be only robotics tasks.
Yann LeCun uses the phrase to talk about autoencoders as well.
In either case, we are looking for situations where labels are obtained by hiding a bit of the data (or making use of naturally hidden data).
So in NLP, we could hide one word from a sentence, and then discovering that hidden word from context is the task...
In robotics, often the hidden information is contained in the physics of the world... So based on a picture, how soft is the object in the picture? Then check softness with a robotic hand to see how to update the weights.
In an autoencoder (to be fair, you can train a pix2pix to do this as well), automatic colorization is a great example of self-supervised learning.
Thanks for the details and not copying code in chunks. Please keep it up!
Also, please note that the target dataset at the beginning is MNIST, not CelebA; I was confused by the 64x64 dimensions of the generator/discriminator.
Also, I think it would be better to use one writer for the real and fake images; I'm not sure what the reasons are behind using two separate writers.
BROTHER FROM ANOTHER MOTHER... THANKS!
Thank you for your explanation and implementation of DCGAN, have a nice day
Great video. Can't wait for the next video in the series.
Can you please also create videos on the metrics for evaluating GANs, like Inception Score?
Yeah for sure, was thinking of doing a video on Frechet Inception Distance (FID) score
I think there is a problem with the interpretation of the figure at 13:46. If the noise is a 100-dimensional vector, it should be drawn as a 1x1 square with a depth of 100 channels. As drawn in the figure, I would read it as a 1 (x), 100 (y), 1 (z) vector (ONE CHANNEL WITH 100 FEATURES). So at 11:59, where you reject the figure: you should reshape the (1, 1, 1, 100) to (1, 100, 1, 1) to work as the figure says!
Another question: why does the discriminator not have Dense layers?
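A quick sketch of the shape point (assuming the z_dim=100 latent size from the paper):

import torch

batch_size, z_dim = 1, 100
flat = torch.randn(batch_size, z_dim)          # a plain 100-dimensional vector
noise = flat.reshape(batch_size, z_dim, 1, 1)  # (N, C, H, W): 100 channels of a 1x1 map
# this is the layout nn.ConvTranspose2d expects as input to the generator

As for the Dense question: DCGAN deliberately removes fully connected layers; the discriminator's final conv collapses the 4x4 feature map to a single 1x1 output, which plays the role of a dense layer.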
Thank you so much sir for these tutorials, really enjoying them!
Thanks for the awesome video. I'm waiting for one on how to load a video (image frames) dataset for unsupervised learning.
Yeah will look into it more when I start exploring more video datasets, let me know if you've found a solution :)
Another great GAN video, thanks!
VERY EPIC DUDE!!!! LOVE YOUR VIDEOS!!
Thanks for the video, you explained very well.
I actually used the DCGAN you implemented in your previous video as a foundation for some work I was doing. I haven't watched this video yet (I plan to) but I wanted to ask how big is the improvement between this one and the older one? If it is pretty significant I would look into refactoring my current implementation.
Not much difference, so if you followed the previous tutorial it's mostly the same, and there's not much gain in watching this one. I felt I wanted to clarify some explanations and improve the video quality mostly and adapt it better to the GAN playlist.
For the actual changes/differences (none are major):
1. I didn't use soft labels in the new video (didn't have to)
2. Cleaned up some code particularly for the model architectures
3. Used weight initialization similar to what they did in the paper, i.e. mean 0 and std 0.02 for all weights (skipped this previously; see the sketch after this list)
4. Used bias=False for conv layers, since with BatchNorm they are spurious parameters
5. Trained both on MNIST and the CelebA dataset
Next video I will implement WGAN and WGAN-GP, which is probably going to be much more useful to watch. My goal in future videos for the playlist is to implement more advanced architectures so that we can get closer to SOTA performance
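A sketch of point 3, assuming the initialize_weights helper name used in the video (the mean/std values are from the DCGAN paper):

import torch.nn as nn

def initialize_weights(model):
    # DCGAN paper: initialize all weights from N(0, 0.02^2)
    for m in model.modules():
        if isinstance(m, (nn.Conv2d, nn.ConvTranspose2d, nn.BatchNorm2d)):
            nn.init.normal_(m.weight.data, 0.0, 0.02)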
Not working as per the GitHub code you uploaded for the MNIST dataset. The discriminator gets too strong (loss remains 0 after Epoch [0/5] Batch 100/469). How to fix it?
Thanks
Edited: It works after uncommenting the BN. But I wonder why some of the steps show blank black images? That's weird. It did work while training was going on and I opened TensorBoard to observe simultaneously.
This fixed it for me: uncommenting 'nn.BatchNorm2d(out_channels),' in both the Discriminator and Generator block methods (model.py). Also fixing the 'transforms.Resize((IMAGE_SIZE, IMAGE_SIZE))' line, as mentioned below (train.py).
Thanks!
Thanks for the video, may I ask something?
When I trained the generator on the CelebA dataset and monitored the fake images on TensorBoard, I got fake images that were just color noise, unlike the realistic fake faces you got. Do you know why?
Thanks in advance
I cropped the images and it worked. I hope this solves your problem:
transforms = transforms.Compose(
    [
        transforms.Resize(IMAGE_SIZE),
        transforms.CenterCrop(IMAGE_SIZE),  # THIS LINE IS ADDED
        transforms.ToTensor(),
        transforms.Normalize(
            [0.5 for _ in range(CHANNELS_IMG)], [0.5 for _ in range(CHANNELS_IMG)]
        ),
    ]
)
Another version of this model: pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html
@@ahmetburakyldrm8518 You have to crop the image to be square, so add transforms.CenterCrop(IMAGE_SIZE).
Or use transforms.Resize((IMAGE_SIZE, IMAGE_SIZE)) instead of transforms.Resize(IMAGE_SIZE), because transforms.Resize(IMAGE_SIZE) resizes images PROPORTIONALLY.
Newbie here. Cropping the image to square really helps. Could you explain why it has so much impact?
great video man! helped me lots
I had a doubt: in the discriminator we take features_d as 64, and in the first layer we map channels_img (=3) to features_d (=64). But in the actual diagram of the generator, we map channels_img (=3) to 128, and the discriminator is just the opposite of the generator. So in the first layer of the discriminator, shouldn't it be features_d*2 instead of features_d?
Thank you for this awesome video! If I want to save every generated fake image (each image of the grid) individually, how can I do it?
Did you solve this problem please?
THANK YOU SO MUCH
Great video!
How can you save the generated fake images to a separate folder during training, rather than displaying them in a grid, so I can access each one individually?
Did you solve this problem please?
Can anyone suggest how to implement ResNet with GANs for MNIST?
Hi! I followed the steps but when I run my code my lossD is 0 and my loss G is huge, the images in tensorboard are almost completely black. Anyone could help me? Thank you in advance
Please explain how you are using tensorboard to display the fake and real images
thannnnnnks!! it helps a lot!
Can you explain what features_gen and features_disc do? I couldn't understand it from the paper either.
How can I calculate the generator accuracy for this model?
Can we change the layers of D or G to get better results?
Awesome video bro, one more suggestion... you can use the labels that come with the CelebA dataset and implement a cGAN; training a cGAN is much faster and produces better results... And after sufficient training, since the generator is conditional, you could generate the girl of your dreams 😁
Yeah will do conditional GANs in a separate video :)
Doesn't line 68, loss_disc.backward(), cause the generator to train at the same time unless you freeze the weights? I was under the impression the generator is not supposed to train at the same time as the discriminator, because then you are making the gen worse to improve the disc.
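For anyone with the same doubt: backward() only computes gradients; weights change only when an optimizer steps, and opt_disc holds only the discriminator's parameters. A minimal sketch with stand-in models (not the video's exact code):

import torch
import torch.nn as nn

gen, disc = nn.Linear(10, 10), nn.Linear(10, 1)  # stand-ins for the real models
opt_disc = torch.optim.Adam(disc.parameters())

fake = gen(torch.randn(4, 10))
loss_disc = disc(fake.detach()).mean()  # detach() also stops grads flowing back into gen
disc.zero_grad()
loss_disc.backward()
opt_disc.step()  # only disc's weights move; gen is untouched either way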
Once I have my model trained, how would I have it create let's say 10,000 unique synthetic images and save them to a folder?
Thanks much for the video... I am trying to adapt this implementation to text classification using SGANs... I would appreciate your help in this regard, or a lead to a similar implementation for text classification.
Why, in the test function, did you write features_g = 8? Shouldn't it be 64, as in training?
You are the man!!!
Thanks man🙏
In the video you seem to say that you have seen the original code. Could you supply a reference to it, as I have not been able to find it? Thanks
How would you add your own custom dataset to this?
how we can apply it on Text data?
Hi Aladdin, thanks for the videos. Can you please have one for conditional GANs too?
Thank you very much
Thank you so much for the help! I need to save my generated images for a project and was wondering if you had any code or recommendations for saving the images individually?
Not sure if this is still needed for you: you'll need to import save_image, and then in the if statement where we create the grid of images you'll add a couple of lines using the save_image function to write the image to a PNG or JPEG. It should look something like this:
# import statement
from torchvision.utils import save_image

# at the end of the if statement generating the grid of images, place this (or similar code)
save_gen_img = fake[0]
save_image(save_gen_img, "images/%d.png" % batch_idx, normalize=True)
# note that you will need to create a new folder in the project called "images" for this to write properly
@@dannycarroll123 Thank you! Exactly what I was looking for
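If you want every image of the batch saved individually rather than just fake[0], a small variation on the snippet above (same assumptions: fake and batch_idx come from the training loop):

from torchvision.utils import save_image

for i in range(fake.shape[0]):  # one file per generated image in the batch
    save_image(fake[i], "images/%d_%d.png" % (batch_idx, i), normalize=True)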
love your videos
Great video
Thank you for this. How can we evaluate GANs except based on generated samples?
hey aladdin can you tell me how you connected to tensorboard
i need the code for the tensorboard part to google colab
Trying to copy and paste the code here but I'm getting the error that there is 'No Module named model'. I have PyTorch installed and to my knowledge this error should not be happening? Is the model being used here from PyTorch?
some quality content !
What ide are u using?
When you print the results to TensorBoard for every step, does it update automatically? I mean, when I start training and open TensorBoard, I need to click the refresh button on the browser tab to see the next step's progress.
How come you don't have to squeeze the image into 1 row when you pass it into the network?
The real images input to the discriminator (from the dataloader) are still in the range [0, 1], right? We did not change them to [-1, 1] before feeding them to the discriminator, even though the generator outputs values in [-1, 1].
When we apply the transform to our dataset using transform = transforms (where transforms is our custom transform), we are normalizing our input data between -1 and 1, because the transform includes a transforms.Normalize() step. Hope this helps!
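Concretely, Normalize applies (x - mean) / std per channel, so with mean=0.5 and std=0.5 the [0, 1] output of ToTensor() lands in [-1, 1]:

import torch
import torchvision.transforms as T

norm = T.Normalize([0.5], [0.5])
x = torch.tensor([[[0.0, 0.5, 1.0]]])  # shape (C=1, H=1, W=3), like ToTensor() produces
print(norm(x))  # tensor([[[-1., 0., 1.]]])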
Hi, I want to produce 1024x1024 images. But how do I set the hyperparameters? Could you help me?
That's gigantic, you'll need to use something like ProGAN, not DCGAN, and you'll need about 100,000 dollars worth of gpus
Really like your channel. which gpu do you use?
Is it safe to use LayerNorm instead of BatchNorm... since BatchNorm has some issues?
I haven't tried it but try it and let me know how it goes :)
Hello Aladdin! As usual, amazing videos. I have been reading a few blogs on the same topic. The statement "train the Discriminator first and then the Generator" makes sense, since you don't want the Discriminator too strong. What I fail to get is where we find that in the code. Per epoch, both the Generator and Discriminator get trained in the same loop. I hope I am clear. Do people mean that within an epoch you train the Discriminator first and then the Generator?
Hey Sai thanks for the kind words! I have not seen people train the Discriminator and Generator separately in the way you describe. What I think they could be referring to in the article you read is that for architectures like WGAN they want to train the Discriminator more (5 update steps vs 1 update step). This has the intuition that the Discriminator is leading the Generator, because it needs to have some understanding of what a real image actually looks like in order to provide valuable feedback. In more state of the art architectures like ProGAN, StyleGAN they kind of move away from these "tricks" and provide a new way of training GANs. I don't think exactly how many update steps you train the Generator vs Discriminator matters too much. Will make videos on those in the future:)
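A rough sketch of that schedule (CRITIC_ITERATIONS = 5 is the value from the WGAN paper; train_critic_step and train_gen_step are hypothetical helpers, not functions from the video):

CRITIC_ITERATIONS = 5
for batch_idx, (real, _) in enumerate(loader):
    for _ in range(CRITIC_ITERATIONS):
        train_critic_step(real)  # hypothetical helper: one discriminator/critic update
    train_gen_step()             # hypothetical helper: one generator update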
Hey Aladdin, great video. I had a quick question, in the fixed noise for the generator, what is the 32 for? Initially I presumed that each image in the batch would receive a 100x1x1 noise tensor for generation, a.k.a. fixed_noise = (batch_size, noise_dim, 1, 1). If the question is a little confusing, I can definitely elaborate more. Thanks.
I think I may have found the answer to the question, but I believe we are showing 32 images on Tensorboard for each epoch, so 32 is the image count for that (testing). Hopefully this is correct.
@@rathikmurtinty5902 Your explanation makes perfect sense, since the fixed noise is used for visualizing results.
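For reference, a sketch of that line (assuming the Z_DIM constant from the video):

import torch

Z_DIM = 100
# 32 = number of sample images logged to TensorBoard; it's independent of BATCH_SIZE
fixed_noise = torch.randn(32, Z_DIM, 1, 1)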
Hi, if I am implementing this code in Google Colab, how do I see the output in TensorBoard?
%load_ext tensorboard
%tensorboard --logdir logs
just search it on google :)
Thanks for the video, I have two questions after watching it. How can I generate multiple fake images at once and save them? For datasets of different sizes, it seems that you start training without changing other parts of the network; is this correct? Looking forward to your answer!
Thanks for the good video, but I have a question: why are the generated digits not identical to the real digits, even though they can definitely be identified as digits?
In a GAN, the generated images are supposed to match the distribution of the real images.
In an autoencoder, the output looks just like the input, but in a GAN we are just making something that looks like it could have been part of the dataset.
Does that help at all?
@@AZTECMAN Thank you for your explanation, I get it!
nice video, will there be a video about conditional GAN ?
Yea it's up now :)
How can we adjust this code to denoise images?
Very good, thanks for your useful videos. I want to train a CycleGAN, but my Colab session crashed at the very first batch of data. Do you have any idea why?
No sorry, I pretty much never use Colab
I would appreciate it if you record a video of implementation of cycle gan too.
Cycle GAN is hard to train on Colab. I have had only very limited success.
Awesome Video, which IDE are you using btw ?
It looks like pycharm
I can't see where the NOISE_DIM parameter is initialized?
Should be Z_DIM, I think I corrected this error later on in the video
Thanks for this video, I tried running the code but I got this error:
File "C:\Python39\lib\site-packages\torch\nn\modules\module.py", line 1130, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'Generator' object has no attribute 'parameter'
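A likely cause (a guess, since the full code isn't shown): the optimizer setup calls gen.parameter instead of the nn.Module method parameters(). Assuming import torch.optim as optim and the hyperparameters from the video, the line should read something like:

opt_gen = optim.Adam(gen.parameters(), lr=LEARNING_RATE, betas=(0.5, 0.999))  # parameters(), not parameter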
I didn't understand what super(Generator, self).__init__() is doing. super().__init__() is used to initialize the parent class, but why have we given it these arguments?
They are the same
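In Python 3 the two spellings are equivalent; the two-argument form is just the older, explicit style:

import torch.nn as nn

class Generator(nn.Module):
    def __init__(self):
        super(Generator, self).__init__()  # explicit (class, instance) form, required in Python 2
        # identical effect in Python 3:
        # super().__init__()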
Can you make a video on generating human faces using a GAN?
Why do you average loss_disc_real and loss_disc_fake? Is it okay to simply sum the two losses and use that as the criterion for the discriminator?
I am running the code in PyCharm, how can I show the output in TensorBoard?
That's a late answer, but if you write the following line in the terminal you can display the output on TensorBoard:
tensorboard --logdir=logs
Great video! Can I request you to make a similar scratch coding video on cycleGANs using PyTorch? I don't find a tutorial anywhere!
I'm currently working on the GAN series again, doing ProGAN atm but will do CycleGAN in the future for sure!
@@AladdinPersson awesome!
Hello Aladdin! First of all, thank you for the amazing content! This is the best resource on ML that I have found on the internet.
I cloned your repo and ran train.py without changing anything, and I am having a problem: the D loss goes quickly to zero and the G loss grows at the same rate (already in the first epoch). What is the problem here, and what could I try to solve it? Thank you for this content again!
The repo has batch normalization commented out in both the discriminator and the generator. Uncomment those lines and I think it will work.
@@alanjohnstone8766 Your comment got this working for me, thanks!
Cool video. Have you considered implementing BiGAN? It seems not that hard; I tried, but the discriminator was constantly collapsing.
Will look into it!
Thank you very much. I just had the CELEBA version running. It is awesome!
I work on Google Colab and faced a bunch of problems with the Dataset, which I eventually solved by generating 4 zip files with a size of approx. 50K pics in them and read them with an IterativeDataloader inspired by the web page medium.com/speechmatics/how-to-build-a-streaming-dataloader-with-pytorch-a66dd891d9dd of David McLeod.
Why 4 zip files? I found that if they are bigger than 2^16, they span several disks and cannot be handled by the Python zipfile library. Additionally, I took over some code lines from the PyTorch DCGAN tutorial (e.g. model initialization, transforms.CenterCrop) to get convergence. I would not wonder if this need came up because I had first introduced some mistakes into the code. Maybe this information helps others.
I have learnt a huge amount!!!
It is not the size of the zip files, it is the number of files embedded!
Bookmark
hello brother, can you help me fix my problem and give me a solution?
Traceback (most recent call last):
  File "d:\semester 8\SYSTEM GAN\DCGAN_EXE\Train.py", line 40, in <module>
    gen = Generator(Z_DIM, CHANNELS_IMG, FEATURES_GEN).to(device)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\ASUS TUF\AppData\Local\Programs\Python\Python312\Lib\site-packages\torch\nn\modules\module.py", line 485, in __init__
    raise TypeError(
TypeError: Generator.__init__() takes 1 positional argument but 4 were given
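A likely cause, guessing from the traceback: the Generator class in your model file doesn't define an __init__ that accepts those three arguments (or it's mis-indented), so the call falls through to nn.Module.__init__, which takes no positional arguments. The signature should look something like:

import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, z_dim, channels_img, features_g):  # must accept the three args passed in Train.py
        super().__init__()
        # ... network layers go here ...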