Nogunumo
Japan
Joined Oct 4, 2022
Building a new intelligence.
AI Devlog #9: Finally Started Training LeNet in Batches — Say Goodbye to Batch Gradient Descent!
Happy New Year! 🎆
In this devlog, I dive into training AI models in batches, a straightforward approach I hadn't explored until now; a minimal sketch of the idea follows the chapter list.
Welcome to the channel! I'm exploring AI and machine learning technologies, sharing insights, updates, and behind-the-scenes looks at my development process. If you’re into AI, machine learning, or tech innovation, this is the place for you.
0:00 Intro
0:12 Gradient Calculation: dl_dkernel2 Completed
5:10 Batch Training the Model
8:12 Debugging NaN Errors: Here's Why
10:48 Code Testing Begins
11:35 Organizing Examples & Documentation Folders
12:11 Outro
#machinelearning #deeplearning #forwardpropagation #backpropagation #devlog
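For readers who want the gist before watching: mini-batch training averages the gradient over a small slice of the data and updates the parameters once per slice, instead of once per full pass over the dataset. Below is a minimal C++ sketch of that loop structure on a toy linear model; every name and number is illustrative, not the project's code.

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

// Minimal sketch of mini-batch gradient descent on y = w*x + b.
// Everything here is illustrative; it is not the project's code.
int main() {
    std::vector<float> xs, ys;
    for (int i = 0; i < 64; ++i) {            // toy dataset: y = 3x + 1
        xs.push_back(i / 64.0f);
        ys.push_back(3.0f * xs.back() + 1.0f);
    }

    float w = 0.0f, b = 0.0f;
    const size_t batch = 8;
    const float lr = 0.5f;

    for (int epoch = 0; epoch < 300; ++epoch) {
        for (size_t start = 0; start < xs.size(); start += batch) {
            size_t end = std::min(start + batch, xs.size());
            float dw = 0.0f, db = 0.0f;
            for (size_t i = start; i < end; ++i) {
                float err = (w * xs[i] + b) - ys[i];  // d(MSE)/d(prediction)
                dw += err * xs[i];
                db += err;
            }
            float n = float(end - start);
            w -= lr * dw / n;   // average the gradient over the batch,
            b -= lr * db / n;   // then take one update step per batch
        }
    }
    std::printf("w = %.3f, b = %.3f (target: 3 and 1)\n", w, b);
}
```

Batch gradient descent, which the title says goodbye to, would instead accumulate over all 64 samples before every single update; the mini-batch version here updates eight times per epoch.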
116 views
Videos
AI Devlog #8: Backpropagation in LeNet + Key Improvements!
132 views · 14 days ago
Devlog video about how I compute gradients in LeNet. In my opinion, the gradients of max pooling and convolution can be tricky. Welcome to the channel! I'm exploring AI and machine learning technologies, sharing insights, updates, and behind-the-scenes looks at my development process. If you’re into AI, machine learning, or tech innovation, this is the place for you. 0:00 Intro 1:04 Computing G...
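On the "tricky" max-pooling gradient: in the backward pass, each pooling window routes its upstream gradient entirely to the input position that won the max in the forward pass, and every other position gets zero. A small self-contained C++ sketch of that routing (illustrative shapes and values, not the video's code):

```cpp
#include <cstdio>
#include <vector>

// Sketch: backward pass of 2x2 max pooling on a small 2D map.
// The upstream gradient flows only to the position that won the max
// in the forward pass; every other input position gets zero.
int main() {
    const int H = 4, W = 4, P = 2;               // input 4x4, pool 2x2
    std::vector<float> in = {
        1, 5, 2, 0,
        3, 4, 7, 1,
        0, 2, 9, 6,
        8, 1, 3, 4};
    std::vector<float> dOut = {0.1f, 0.2f, 0.3f, 0.4f};  // grad w.r.t. 2x2 output
    std::vector<float> dIn(H * W, 0.0f);

    for (int oy = 0; oy < H / P; ++oy)
        for (int ox = 0; ox < W / P; ++ox) {
            int besty = oy * P, bestx = ox * P;  // find argmax in the window
            for (int dy = 0; dy < P; ++dy)
                for (int dx = 0; dx < P; ++dx)
                    if (in[(oy * P + dy) * W + ox * P + dx] > in[besty * W + bestx]) {
                        besty = oy * P + dy;
                        bestx = ox * P + dx;
                    }
            dIn[besty * W + bestx] += dOut[oy * (W / P) + ox];  // route the gradient
        }

    for (int y = 0; y < H; ++y) {
        for (int x = 0; x < W; ++x) std::printf("%4.1f ", dIn[y * W + x]);
        std::printf("\n");
    }
}
```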
AI Devlog #7: Working on LeNet, and some improvements.
229 views · 1 month ago
In this video, I share an update on my recent work. I think 3-dimensional data in CNNs is quite cumbersome to handle. Welcome to the channel! I'm exploring AI and machine learning technologies, sharing insights, updates, and behind-the-scenes looks at my development process. If you’re into AI, machine learning, or tech innovation, this is the place for you. 0:00 Intro 0:15 Removed class-based m...
Recurrent Neural Network: Gated Recurrent Unit (GRU) Built from Scratch in C++!
109 views · 1 month ago
This time, I learned so much about sequential models, especially while explaining the concepts in detail! It's definitely faster than LSTMs. Enjoy the video! 0:00 Intro 1:13 Vanishing and exploding gradients 2:58 Preprocessing 3:23 Initialize the parameters in the constructor 4:46 forward() function 8:15 BPTT 9:06 Demo 11:02 Outro #machinelearning #deeplearning #forwardpropagation #backpropagation
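For anyone new to GRUs: the cell keeps a single hidden state and uses two gates, which is why a step is cheaper than an LSTM's (no separate cell state, three weight sets instead of four). A scalar toy version of one forward step in C++, with made-up weights (not the video's code):

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Sketch: GRU forward steps with a scalar state, to show the gating math.
// Weights are made-up constants; a real model learns them as matrices.
float sigmoid(float x) { return 1.0f / (1.0f + std::exp(-x)); }

int main() {
    float h = 0.0f;                                   // hidden state
    std::vector<float> inputs = {0.5f, -1.0f, 0.25f};
    // one input weight and one recurrent weight per gate (toy model)
    float Wz = 0.8f, Uz = 0.5f, Wr = 0.6f, Ur = 0.4f, Wh = 1.0f, Uh = 0.7f;

    for (float x : inputs) {
        float z = sigmoid(Wz * x + Uz * h);           // update gate
        float r = sigmoid(Wr * x + Ur * h);           // reset gate
        float hc = std::tanh(Wh * x + Uh * (r * h));  // candidate state
        h = (1.0f - z) * h + z * hc;                  // blend old and candidate
        std::printf("x=%5.2f  z=%.3f  r=%.3f  h=%.3f\n", x, z, r, h);
    }
}
```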
Long Short-Term Memory (LSTM): Built from Scratch in C++!
192 views · 2 months ago
I thought it would take more time to implement, but it didn't. I guess that's because it's essentially an RNN, just a more advanced version. One thing to note is that I still had to use the Adam optimizer, which I think was due to the short sequence length I chose. Additionally, I had to slice the weights to focus only on the portions that contributed to the hidden states and to ensure...
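For reference on why the LSTM felt like "an RNN, just more advanced": each step computes three sigmoid gates plus a tanh candidate, and keeps a separate cell state alongside the hidden state. A scalar toy sketch in C++ with made-up weights (real implementations use weight matrices; the weight slicing mentioned above is specific to this project's layout):

```cpp
#include <cmath>
#include <cstdio>

// Sketch: LSTM forward steps with scalar state, showing the four gates.
// Weights are illustrative constants, not learned values.
float sigmoid(float x) { return 1.0f / (1.0f + std::exp(-x)); }

int main() {
    float h = 0.0f, c = 0.0f;                 // hidden state and cell state
    const float xs[] = {0.5f, -1.0f, 0.25f};
    // one input weight and one recurrent weight per gate (toy model)
    float Wf = 0.7f, Uf = 0.4f, Wi = 0.8f, Ui = 0.5f,
          Wo = 0.6f, Uo = 0.3f, Wg = 1.0f, Ug = 0.7f;

    for (float x : xs) {
        float f = sigmoid(Wf * x + Uf * h);   // forget gate
        float i = sigmoid(Wi * x + Ui * h);   // input gate
        float o = sigmoid(Wo * x + Uo * h);   // output gate
        float g = std::tanh(Wg * x + Ug * h); // candidate cell update
        c = f * c + i * g;                    // cell state carries long-term info
        h = o * std::tanh(c);                 // hidden state exposed to the next step
        std::printf("x=%5.2f  c=%.3f  h=%.3f\n", x, c, h);
    }
}
```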
Recurrent Neural Network: Built from Scratch in C++!
757 views · 2 months ago
Finally, I implemented it! Everything was challenging: data preparation, forward and backward propagation, and especially the optimizer. It just wouldn't work without Adam! I guess that's why LSTMs were invented. Enjoy the video! 0:00 Intro 0:34 Preprocessing 1:09 Prepare x and y 3:11 Forward propagation 6:59 BPTT 11:11 Demo 19:40 Outro #recurrentneuralnetwork #machinelearning #deeplearning #forwardp...
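Since the video credits Adam with making training work at all, here is the update it applies to each parameter: running averages of the gradient (first moment) and squared gradient (second moment), bias correction, then a step scaled by the square root of the second moment. A one-parameter C++ sketch with the usual default hyperparameters (illustrative, not the project's code):

```cpp
#include <cmath>
#include <cstdio>

// Sketch: Adam update for a single parameter, minimizing f(w) = (w - 3)^2.
// Hyperparameters are the common defaults; names are generic.
int main() {
    float w = 0.0f, m = 0.0f, v = 0.0f;
    const float lr = 0.1f, b1 = 0.9f, b2 = 0.999f, eps = 1e-8f;

    for (int t = 1; t <= 200; ++t) {
        float g = 2.0f * (w - 3.0f);                // gradient of (w - 3)^2
        m = b1 * m + (1.0f - b1) * g;               // 1st-moment (mean) estimate
        v = b2 * v + (1.0f - b2) * g * g;           // 2nd-moment estimate
        float mhat = m / (1.0f - std::pow(b1, t));  // bias correction
        float vhat = v / (1.0f - std::pow(b2, t));
        w -= lr * mhat / (std::sqrt(vhat) + eps);   // scale-invariant step
    }
    std::printf("w = %.4f (target 3)\n", w);
}
```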
AI Devlog #6: Fixing Loss Calculation
104 views · 2 months ago
Before the fix, the loss fluctuated a lot, but when I incorporated batch sizes into the loss calculation, the losses began to decrease more smoothly. While it still fluctuates, this is not due to miscalculations; rather, it's because the model isn't perfectly generalized to the dataset, which I need to improve in the future. Enjoy the video!
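The fix described above sounds like loss normalization: divide the summed per-sample loss by the batch size so the reported number is a mean and doesn't scale with how many samples a batch happens to contain. A generic C++ sketch of that idea (illustrative data; the video's actual code may differ):

```cpp
#include <cstdio>
#include <vector>

// Sketch: report the *mean* loss per batch instead of the sum, so
// batches of different sizes produce comparable loss values.
float mse(const std::vector<float>& pred, const std::vector<float>& target) {
    float sum = 0.0f;
    for (size_t i = 0; i < pred.size(); ++i) {
        float d = pred[i] - target[i];
        sum += d * d;
    }
    return sum / pred.size();   // divide by batch size: mean, not sum
}

int main() {
    std::vector<float> pred   = {0.9f, 0.1f, 0.8f, 0.4f};
    std::vector<float> target = {1.0f, 0.0f, 1.0f, 0.0f};
    std::printf("mean batch loss = %.4f\n", mse(pred, target));
}
```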
Seq2seq: TextVectorization!
113 views · 6 months ago
Hey everyone! I'm building AGI. Feel free to drop your questions in the comments, and I'd love to hear your thoughts on the process! 0:00 Intro 1:52 How it works 5:33 Demo 9:06 Outro
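The core of text vectorization is splitting text into tokens and mapping each token to an integer id from a vocabulary. A minimal C++ sketch of that idea (a whitespace tokenizer with a growing vocabulary; illustrative only, not the video's implementation):

```cpp
#include <cstdio>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Sketch: map each whitespace-separated token to an integer id,
// assigning a fresh id the first time a token is seen.
int main() {
    std::map<std::string, int> vocab;
    auto vectorize = [&vocab](const std::string& text) {
        std::vector<int> ids;
        std::istringstream ss(text);
        std::string tok;
        while (ss >> tok) {                       // whitespace tokenizer
            auto it = vocab.find(tok);
            if (it == vocab.end())                // unseen token: new id
                it = vocab.emplace(tok, (int)vocab.size()).first;
            ids.push_back(it->second);
        }
        return ids;
    };

    for (int id : vectorize("the cat sat on the mat"))
        std::printf("%d ", id);                   // prints: 0 1 2 3 0 4
    std::printf("\n");
}
```

Repeated tokens reuse their id, which is why "the" maps to 0 both times in the output.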
AI Devlog #5: Refactoring the Training Code
155 views · 1 year ago
I am implementing machine learning models in C and CUDA. In today's video, I will explain how I refactored the training code. Please share any questions or suggestions below. Hope you enjoy! 0:00 Intro 1:10 Refactoring forward propagation 3:02 Refactoring metric logging 5:48 Refactoring parameter initialization 7:53 Backpropagation 16:00 Using only one hidden lay...
AI Devlog #4: L1L2 Regularization!
231 views · 1 year ago
I am implementing machine learning models in C and CUDA. In today's video, I will explain the regularizations I added to the model. Please share any questions or suggestions below. Hope you enjoy! 0:00 Introduction 1:39 Realization of immaturity 2:48 Benefits of no longer writing code in NumPy 7:45 Explanation of L2 regularization 18:28 Demonstrating the consequences of introducing regularization 20:...
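As background for this devlog: L1 regularization adds lambda1 * |w| to the loss, contributing lambda1 * sign(w) to each weight's gradient, while L2 adds lambda2 * w^2, contributing 2 * lambda2 * w. Both pull weights toward zero; L1 tends to push them to exactly zero. A small C++ sketch of one regularized update step, with made-up values (not the project's code):

```cpp
#include <cstdio>
#include <vector>

// Sketch: one gradient step with L1 and L2 penalty terms added to
// each weight's data gradient. All values are illustrative.
int main() {
    std::vector<float> w    = {0.8f, -1.5f, 0.3f};
    std::vector<float> grad = {0.05f, -0.02f, 0.10f};     // gradient from the data loss
    const float l1 = 0.01f, l2 = 0.01f, lr = 0.1f;

    for (size_t i = 0; i < w.size(); ++i) {
        float sign = (w[i] > 0.0f) - (w[i] < 0.0f);       // sign(w), 0 at w == 0
        float g = grad[i] + l1 * sign + 2.0f * l2 * w[i]; // add both penalty gradients
        w[i] -= lr * g;
        std::printf("w[%zu] = %.4f\n", i, w[i]);
    }
}
```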
AI Devlog #3: Gradient Clipping!
245 views · 1 year ago
I am implementing machine learning models in C and CUDA. In today's video, I will explain the gradient clipping I added to the model. Please share any questions or suggestions below. Hope you enjoy! 0:00 Introduction 1:08 Gradient clipping explanation 2:35 Assessing the model's performance 3:24 I need to decrease the number of hidden layers 8:43 Removed hyperparameters.h 11:02 Future updates
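For context, the common form of gradient clipping rescales the whole gradient vector whenever its norm exceeds a threshold, preserving its direction. A minimal C++ sketch with illustrative numbers (the video's version may clip per layer or per element instead):

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Sketch: clipping by global norm. If the gradient vector is longer
// than the threshold, rescale it so its norm equals the threshold.
int main() {
    std::vector<float> grad = {3.0f, -4.0f, 12.0f};   // norm = 13
    const float maxNorm = 5.0f;

    float sq = 0.0f;
    for (float g : grad) sq += g * g;
    float norm = std::sqrt(sq);

    if (norm > maxNorm)
        for (float& g : grad) g *= maxNorm / norm;    // shrink, keep direction

    for (float g : grad) std::printf("%.4f ", g);     // norm is now 5
    std::printf("\n");
}
```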
AI Devlog #2: Adding momentum to the model
162 views · 1 year ago
I am implementing machine learning models in C and CUDA. In today's video, I will explain the momentum I added to the model. Please share any questions or suggestions below. Hope you enjoy! 0:00 Introduction 0:42 Deleting NumPy code 1:33 C code is slower than NumPy code 3:04 Explanation of momentum 8:27 Result after adding momentum 14:34 Code maintenance
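Classic momentum keeps an exponentially decaying running sum of past gradients and steps along that velocity instead of the raw gradient, so consistent directions accumulate speed while oscillating components cancel. A one-parameter C++ sketch with made-up hyperparameters (not the project's code):

```cpp
#include <cstdio>

// Sketch: gradient descent with momentum, minimizing f(w) = (w - 3)^2.
// The velocity v accumulates past gradients with decay factor beta.
int main() {
    float w = 0.0f, v = 0.0f;
    const float lr = 0.05f, beta = 0.9f;

    for (int t = 0; t < 200; ++t) {
        float g = 2.0f * (w - 3.0f);   // gradient of (w - 3)^2
        v = beta * v + g;              // accumulate velocity
        w -= lr * v;                   // step along the velocity
    }
    std::printf("w = %.4f (target 3)\n", w);
}
```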
AI Devlog #1: New activation and loss functions!
90 views · 1 year ago
I am implementing machine learning models in C and CUDA. In today's video, I will explain the activation and loss functions I added, and other new updates. Please share any questions or suggestions below. Hope you enjoy! 0:00 Introduction 0:23 Learning rate scheduler 4:39 Changing variable names for partial derivatives 6:06 Softmax activation 9:25 Fix partial derivative calculation 13:10 Catego...
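Since this devlog adds softmax and categorical cross-entropy, here is the numerically stable way they are usually computed: subtract the max logit before exponentiating (to avoid overflow), normalize, then take the negative log-probability of the true class. A C++ sketch with illustrative logits:

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

// Sketch: numerically stable softmax followed by categorical cross-entropy.
int main() {
    std::vector<float> logits = {2.0f, 1.0f, 0.1f};
    int label = 0;                                   // true class index

    float maxv = logits[0];
    for (float z : logits)
        if (z > maxv) maxv = z;                      // max logit for stability

    float sum = 0.0f;
    std::vector<float> prob(logits.size());
    for (size_t i = 0; i < logits.size(); ++i) {
        prob[i] = std::exp(logits[i] - maxv);        // shifted exponentials
        sum += prob[i];
    }
    for (float& p : prob) p /= sum;                  // normalize to probabilities

    float loss = -std::log(prob[label]);             // cross-entropy for the label
    std::printf("p = [%.3f %.3f %.3f], loss = %.4f\n",
                prob[0], prob[1], prob[2], loss);
}
```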
Neural Network: Overview 2
1.3K views · 1 year ago
I am implementing machine learning models in C and CUDA. In today's video, I will explain the conceptual overview of a simple neural network written in C and CUDA from scratch. Please share any questions or suggestions below. Hope you enjoy! 0:00 Introduction 0:23 Load the dataset 1:27 Preprocessing 2:38 Initializing parameters 4:53 Mini-batch gradient descent 5:49 Forward propagation 6:23 Back...
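To make the overview's forward-propagation step concrete: each dense layer computes y = activation(W·x + b). A single-layer C++ sketch with ReLU and illustrative weights (the real network stacks several such layers and runs them on CUDA):

```cpp
#include <cstdio>
#include <vector>

// Sketch: forward propagation through one dense layer, y = relu(W*x + b).
// Shapes and values are illustrative.
int main() {
    const int in = 3, out = 2;
    std::vector<float> W = {0.2f, -0.5f, 0.1f,      // row 0: weights for y[0]
                            0.7f,  0.3f, -0.2f};    // row 1: weights for y[1]
    std::vector<float> b = {0.1f, -0.1f};
    std::vector<float> x = {1.0f, 2.0f, 3.0f};
    std::vector<float> y(out);

    for (int o = 0; o < out; ++o) {
        float z = b[o];
        for (int i = 0; i < in; ++i)
            z += W[o * in + i] * x[i];              // weighted sum of inputs
        y[o] = z > 0.0f ? z : 0.0f;                 // ReLU activation
    }
    std::printf("y = [%.3f, %.3f]\n", y[0], y[1]);
}
```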