Skowster the Geek
Effective Storytelling with Data
This tutorial series has taught you the technical skills to transform data into innovative solutions. Now learn the communication skills to persuade people to take action and adopt your solutions!
7 Storytelling Techniques blog post:
visme.co/blog/7-storytelling-techniques-used-by-the-most-inspiring-ted-presenters/
Playlist of my favorite powerful speeches:
th-cam.com/video/lEOOZDbMrgE/w-d-xo.html
Views: 717

Videos

Federated Learning Tutorial
Views: 2.4K · 5 years ago
Learn the secrets to taking your deep learning algorithms to massive Facebook, Google, and YouTube scales through distributed learning. Can you learn from sensitive user data without invading privacy? Yes you can, with the Federated Averaging Algorithm! Google's blog post on Federated Learning (with links to papers): ai.googleblog.com/2017/04/federated-learning-collaborative.html PyTorch Multi-GPU tu...
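The aggregation step at the heart of Federated Averaging is just a weighted average of client model weights, weighted by how much data each client holds. A minimal NumPy sketch of that step (function and variable names are illustrative, not from the tutorial):

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """FedAvg aggregation: average client parameters, weighted by local dataset size."""
    total = sum(client_sizes)
    averaged = []
    for layer_idx in range(len(client_weights[0])):
        layer = sum(w[layer_idx] * (n / total)
                    for w, n in zip(client_weights, client_sizes))
        averaged.append(layer)
    return averaged

# Toy example: two clients with one weight matrix each, holding 75 and 25 examples
w_a = [np.ones((2, 2))]
w_b = [np.zeros((2, 2))]
print(federated_average([w_a, w_b], client_sizes=[75, 25]))  # every entry is 0.75
```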
Transformer Networks - How to Roll Your Own Google Translate
Views: 2.7K · 5 years ago
Discover the latest innovation in sequence to sequence deep learning, the transformer network. Based on Google Brain's 2017 paper "Attention is All You Need." Learn how this innovation improves on state of the art translation results and trains an order of magnitude faster. Original Paper: arxiv.org/abs/1706.03762 The Illustrated Transformer: jalammar.github.io/illustrated-transformer/ Annotate...
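The core operation introduced in that paper is scaled dot-product attention: softmax(QKᵀ/√d_k)V. A minimal NumPy sketch (shapes and names are illustrative, not the tutorial's code):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V, the attention formula from the paper."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # query/key similarity
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over the keys
    return weights @ V                               # weighted sum of values

Q = np.random.rand(3, 8)   # 3 query positions, dimension 8
K = np.random.rand(4, 8)   # 4 key positions
V = np.random.rand(4, 8)   # 4 value vectors
print(scaled_dot_product_attention(Q, K, V).shape)   # (3, 8)
```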
Fun With Autoencoders (Deep Fakes, Recommendation Engines & more)
Views: 4.7K · 5 years ago
Learn how autoencoders work, what they can do, and dive into a Python tutorial to create your very own movie recommendation engine. Topics discussed: - Data denoising - Image reconstruction - Recommendation engines Deep dive into Deep Fakes: www.alanzucconi.com/2018/03/14/create-perfect-deepfakes/ Kaggle Workbook: www.kaggle.com/colinskow/movie-recommendation-autoencoder
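An autoencoder is just an encoder that squeezes the input through a bottleneck and a decoder that reconstructs it. A minimal PyTorch sketch (layer sizes are illustrative and not taken from the Kaggle workbook):

```python
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, n_inputs, n_hidden=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_inputs, n_hidden), nn.ReLU())
        self.decoder = nn.Linear(n_hidden, n_inputs)   # reconstruct the original input

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = AutoEncoder(n_inputs=100)
ratings = torch.rand(16, 100)                  # e.g. a batch of user rating vectors
loss = nn.MSELoss()(model(ratings), ratings)   # reconstruction error drives training
loss.backward()
```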
Topic Modeling Tutorial (Latent Dirichlet Allocation) in Python
Views: 10K · 5 years ago
Learn how to automatically detect topics in large bodies of text using an unsupervised learning technique called Latent Dirichlet Allocation (LDA). Take your Natural Language Processing (NLP) skills to the next level in Python. Part of the free Data Lit Course at the School of AI! theschool.ai/courses/data-lit Kaggle Notebook used in the tutorial: www.kaggle.com/colinskow/topic-modeling-latent-...
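For a feel of the API, here is a minimal LDA example using scikit-learn on a four-document toy corpus (the Kaggle notebook may use a different library and dataset; everything below is illustrative):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "the cat sat on the mat with the dog",
    "dogs and cats are popular pets",
    "stock markets fell sharply today",
    "investors worry about market volatility",
]

vectorizer = CountVectorizer(stop_words="english")
X = vectorizer.fit_transform(docs)                          # word-count matrix
lda = LatentDirichletAllocation(n_components=2, random_state=0).fit(X)

words = vectorizer.get_feature_names_out()
for topic_idx, topic in enumerate(lda.components_):
    top_words = [words[i] for i in topic.argsort()[-3:]]    # 3 most likely words per topic
    print(f"topic {topic_idx}: {top_words}")
```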
Linear Regression in Python Tutorial (simple and multivariate)
Views: 1.1K · 5 years ago
Learn the basics of how to do linear regression with Python. Kaggle Notebook: www.kaggle.com/colinskow/linear-regression-automobile-tutorial-data-lit Automobile Dataset: www.kaggle.com/toramky/automobile-dataset Correlation Coefficient: www.spss-tutorials.com/pearson-correlation-coefficient/ R Squared: blog.minitab.com/blog/adventures-in-statistics-2/regression-analysis-how-do-i-interpret-r-squ...
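As a quick reference for the workflow, a minimal scikit-learn sketch of fitting a line and reading off the R² score (the data here is synthetic, not the automobile dataset used in the notebook):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Synthetic data: price roughly linear in engine size, plus noise
rng = np.random.default_rng(0)
engine_size = rng.uniform(1.0, 5.0, size=(100, 1))
price = 4000 * engine_size[:, 0] + 2000 + rng.normal(0, 1500, size=100)

model = LinearRegression().fit(engine_size, price)
predictions = model.predict(engine_size)
print("slope:", model.coef_[0], "intercept:", model.intercept_)
print("R^2:", r2_score(price, predictions))   # fraction of variance explained
```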
Data Visualization in Python Tutorial
Views: 1.5K · 5 years ago
Learn how to create amazing interactive data visualizations with Bokeh in Python! Demonstration: demo.bokehplots.com/apps/gapminder Source Code: github.com/bokeh/bokeh/blob/master/examples/app/gapminder This is part of the Data Lit course at School of AI theschool.ai/courses/data-lit
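A minimal Bokeh sketch of an interactive plot with pan/zoom/hover tools (a toy scatter, not the gapminder app linked above):

```python
import numpy as np
from bokeh.plotting import figure, output_file, show

x = np.random.rand(50)
y = np.random.rand(50)

p = figure(title="Toy interactive scatter",
           tools="pan,wheel_zoom,box_zoom,reset,hover")
p.scatter(x, y, size=10, alpha=0.6)
p.xaxis.axis_label = "x"
p.yaxis.axis_label = "y"

output_file("scatter.html")   # writes a standalone HTML page
show(p)                       # opens it in the browser
```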
Probability, Distributions, and the Central Limit Theorem (updated)
Views: 1.7K · 5 years ago
A crash course in probability, distributions, and the Central Limit Theorem. Explanation of standard deviation: www.mathsisfun.com/data/standard-deviation.html Types of probability distributions: blog.cloudera.com/blog/2015/12/common-probability-distributions-the-data-scientists-crib-sheet/ Tutorial for homework assignment: towardsdatascience.com/histograms-and-density-plots-in-python-f6bda88f5...
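The Central Limit Theorem is easy to see by simulation: averages of samples drawn from even a heavily skewed distribution pile up into an approximately normal bell curve. A short NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(42)

# 10,000 samples of size 50 from a skewed (exponential) distribution, then average each
sample_means = rng.exponential(scale=2.0, size=(10_000, 50)).mean(axis=1)

print("mean of sample means:", sample_means.mean())   # close to the true mean, 2.0
print("std of sample means :", sample_means.std())    # close to 2.0 / sqrt(50) ~= 0.28
# A histogram of sample_means looks approximately normal, even though the
# underlying exponential distribution is not.
```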
OpenRefine Data Cleaning Tutorial
Views: 17K · 5 years ago
Discover how to prep data for your next machine learning project using OpenRefine. Software download and documentation: openrefine.org/ Tutorial files: (phm-collection.zip) data.freeyourmetadata.org/powerhouse-museum/ This tutorial is part of The School of AI's Data Lit Course: www.theschool.ai/courses/data-lit/
AlphaGo Zero Tutorial Part 3 - Neural Network Architecture
Views: 12K · 6 years ago
Dive deep into the neural network used by DeepMind's AlphaZero, the most powerful intelligence in the world for the games of go, chess, and shogi. Infographic Download: medium.com/applied-data-science/alphago-zero-explained-in-one-diagram-365f5abf67e0 Batch Normalization information: towardsdatascience.com/batch-normalization-in-neural-networks-1ac91516821c Residual Networks information: blog....
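The two ideas linked above, batch normalization and residual (skip) connections, combine into the residual block that AlphaZero's network stacks many times. A minimal PyTorch sketch of one block (the channel count and board size are illustrative, not the exact published architecture):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ResidualBlock(nn.Module):
    def __init__(self, channels=256):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)          # skip connection: add the block's input back in

block = ResidualBlock(channels=256)
board_features = torch.rand(1, 256, 19, 19)   # e.g. a 19x19 go board with 256 feature planes
print(block(board_features).shape)            # torch.Size([1, 256, 19, 19])
```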
AlphaGo Zero Tutorial Part 2 - Monte Carlo Tree Search
Views: 16K · 6 years ago
Discover how DeepMind's AlphaZero applies the Monte Carlo Tree Search algorithm to play board games like go, chess, and shogi at superhuman levels! Infographic download: medium.com/applied-data-science/alphago-zero-explained-in-one-diagram-365f5abf67e0
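The selection step of the search picks, at each node, the child that maximizes an exploitation term (average value so far) plus an exploration bonus weighted by the network's prior and the visit counts. A small sketch of that PUCT-style score (the constant and names are illustrative):

```python
import math

def puct_score(value_sum, visits, prior, parent_visits, c_puct=1.5):
    """Score used during MCTS selection to decide which child to descend into."""
    q = value_sum / visits if visits > 0 else 0.0                   # exploitation
    u = c_puct * prior * math.sqrt(parent_visits) / (1 + visits)    # exploration
    return q + u

# A well-visited, decent move vs. an unvisited move the network likes a lot
print(puct_score(value_sum=3.0, visits=10, prior=0.2, parent_visits=50))
print(puct_score(value_sum=0.0, visits=0,  prior=0.6, parent_visits=50))
```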
AlphaGo Zero Tutorial Part 1 - Overview
Views: 9K · 6 years ago
AlphaGo surpassed thousands of years of human thinking after training on go for just a few weeks. Learn how to program your own AI that can play board games at a superhuman level using deep neural networks and Monte Carlo Tree Search. Infographic download: medium.com/applied-data-science/alphago-zero-explained-in-one-diagram-365f5abf67e0
Proximal Policy Optimization (PPO) Tutorial - Master Roboschool!!!
Views: 18K · 6 years ago
Master OpenAI's Roboschool with Proximal Policy Optimization. Learn reinforcement learning techniques to get bleeding-edge results on a variety of environments. Source code for this tutorial: github.com/colinskow/move37/tree/master/ppo Dive even deeper with Arxiv Insights' PPO video... th-cam.com/video/5P7I-xPq8u8/w-d-xo.html
Continuous Action Space Actor Critic Tutorial
Views: 23K · 6 years ago
Learn how to adapt the Actor Critic architecture to handle reinforcement learning tasks with continuous action spaces such as robotics and self-driving cars! Source code for this tutorial: github.com/colinskow/move37/tree/master/actor_critic
Actor Critic (A3C) Tutorial
Views: 20K · 6 years ago
Take your reinforcement learning skills to the next level with the Asynchronous Advantage Actor Critic architecture! Code for this tutorial: github.com/colinskow/move37/tree/master/actor_critic
Policy Gradient Methods Tutorial
Views: 9K · 6 years ago
Deep Q Learning Pong Tutorial
Views: 12K · 6 years ago
Augmented Random Search Tutorial - How to Train Robots to Walk!
Views: 6K · 6 years ago
Q Learning Tutorial for Ride Sharing (Open AI Taxi)
Views: 6K · 6 years ago
Monte Carlo Reinforcement Learning Tutorial
Views: 17K · 6 years ago
Dynamic Programming Tutorial for Reinforcement Learning
Views: 29K · 6 years ago
Bellman Equation Basics for Reinforcement Learning
Views: 148K · 6 years ago
Bellman Equation Advanced for Reinforcement Learning
Views: 43K · 6 years ago

Comments

  • @robotronix-co-il · 2 months ago

    In my calculation s3 should be 0.9, since Q(s3, a) = 0 + 0.9 × max(0, 1, 0); from here s2 = 0.81, s1 = 0.729, ...
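
The calculation above is just value iteration with a discount factor of 0.9 propagating backward from a single terminal reward; a minimal sketch (the chain layout and reward placement are illustrative):

```python
# Chain of states s1 -> s2 -> s3 -> goal; reaching the goal is worth 1, gamma = 0.9.
gamma = 0.9
states = ["s1", "s2", "s3", "goal"]
V = {s: 0.0 for s in states}
V["goal"] = 1.0

for _ in range(10):                           # sweep until the values stop changing
    for i, s in enumerate(states[:-1]):
        V[s] = 0 + gamma * V[states[i + 1]]   # reward 0 for non-goal states

print(V)   # s3 ~= 0.9, s2 ~= 0.81, s1 ~= 0.729, matching the comment
```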

  • @kmishy · 4 months ago

    Do you mean that we can use recursive approach (dynamic programming) to find value of all states. Or We can find value of all states by iteration

  • @myelinsheathxd · 5 months ago

    thank you for explanation

  • @mikeyu533 · 7 months ago

    dude thank you your videos are awesome

  • @teddychen0606 · 7 months ago

    The past doesn’t matter baby!!!!!

  • @mohmmedshafeer2820 · 7 months ago

    Thanks You Skowster

  • @atharvagundawar6101 · 8 months ago

    One of the most underated video on MDP I have seen

  • @ejbock5b179 · 8 months ago

    What happens if instead of mario being drunk we had that Mario can move how instructed but the identity of the two rooms trap and success are unknown, but information is known of the probability that either is trap or success. i.e. mario knows that the success room may be trap with 50 % confidence

  • @howardnolan3198 · 10 months ago

    Thank you for making this, very helpful.

  • @ScottTaylorMCPD · 11 months ago

    The link to the "free Move 37 Reinforcement Learning course" mentioned in the description appears to be dead.

  • @yadugna · 1 year ago

    Great presentation- thank you

  • @afmjoaa · 1 year ago

    Awesome explanation.

  • @ivanof90 · 1 year ago

    This explanation helps me a lot! Thank you!

  • @adam67bree · 1 year ago

    Hi mate, thanks for 3 amazing videos on AlphaGo Zero. Where can I find part 4?

  • @rikiriki43 · 1 year ago

    Thank you for this

  • @somecsmajor · 1 year ago

    Thanks Skowster, you're a real one!

  • @chiedozieonyearugbulem9363 · 1 year ago

    Thank you for this concise video. The texts provided by my lecturer wasn't this easy to understand

  • @wishIKnewHowToLove · 1 year ago

    This guy took this complicated formula and made it easy

  • @newan0000 · 1 year ago

    So clear, thank you!

  • @anantadutta7506 · 1 year ago

    I got quickly hooked by the first video on bellman equation as it was simple, intelligent and lucid and immediately watched this one. Alas, quickly lost interest on this one as it starts with some lame gimmicks and I am no longer sure what to expect. You could have followed the simple style of the first one and got another million views..Will look for other material. Thanks

  • @santhoshjayaraman1112 · 1 year ago

    sir, can you share any book name or any resources link or something to get more knowledge about continuous action space RL concepts. Please,please

  • @Moprationsz · 1 year ago

    Thanks a lot!!! This video is a true masterpiece. Got a lot of learning "clicks" out of it.

  • @rodolfojoseleopoldofarinar7317 · 1 year ago

    thumbs up for the quake reference :)

  • @spoke747 · 1 year ago

    Fascinating

  • @niklasdamm6900 · 2 years ago

    21.11 20:55

  • @kabbasoldji3816 · 2 years ago

    thank you very much sir 😍

  • @tanujajoshi1901 · 2 years ago

    The tutorial is for a stochastic policy or continuous action spaces?

  • @sunaryaseo · 2 years ago

    Hi! It's a nice video. I wonder if I have continuous action values ranging from 50 to 150, which activation function should I use in the output Actor-network, and how to sample between those values from its probability?

    • @cuongnguyenuc1776 · 10 months ago

      I think you still use the tanh activation multiply with range of your action values and add bias to it. The bias should move tanh function to mean of the action values!
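
In code, the rescaling described in the reply amounts to squashing the actor's raw output with tanh into [-1, 1] and then shifting and scaling it into the environment's bounds; a minimal sketch using the [50, 150] range from the question (names are illustrative):

```python
import numpy as np

def scale_action(squashed, low=50.0, high=150.0):
    """Map a tanh-squashed value in [-1, 1] onto the action range [low, high]."""
    center = (high + low) / 2.0       # 100.0 -- the "bias" toward the mean of the range
    half_range = (high - low) / 2.0   # 50.0  -- the multiplier on the tanh output
    return center + half_range * squashed

# With a Gaussian policy head, sample in the unbounded space, squash, then rescale
mu, sigma = 0.2, 0.1                  # hypothetical actor outputs
raw_sample = np.random.normal(mu, sigma)
print(scale_action(np.tanh(raw_sample)))   # always lands inside [50, 150]
```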

  • @michaelmuller136 · 2 years ago

    Nice RL Playlist, thank you!

  • @bea59kaiwalyakhairnar37 · 2 years ago

    That means each action of the agent will give him a reward and if he finds the path or get to the goal then he will get maximum reward. But does an agent trys to find different ways to reach goals. Also i subscribed.

  • @Arik1989 · 2 years ago

    Great explanation, thank you! For me, the most confusing thing with PPO was what is the new policy vs the old policy (for the calculation of the ratio in the loss function) and how are gradients calculated when there are "two" policies. For some reason this is very often quickly glossed over, but it's pretty clear once you see the implementation. For anyone who was confused like I was, here's how I understand it.
    The old policy is the policy you use to choose actions - you calculate all the old_policy values during the environment sampling phase. These are numerical values which don't need to contribute directly to the gradients, so consider them scalars. Then you do several epochs of updates to the model:
    On epoch 0, new_policy = old_policy and therefore the ratio is always 1 and the loss is actually just the advantage function (and no clipping, obviously).
    On the next epoch, you made an update to the model, which is now considered the new policy; but the old policy is still the same (already calculated). Therefore, the ratio may now differ from 1 and you make an update according to the loss formula (clipping and all).
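
The old-versus-new-policy bookkeeping described above boils down to a few lines once the log-probabilities are collected; a minimal PyTorch-style sketch of the clipped surrogate loss (tensor names are illustrative, not the tutorial's code):

```python
import torch

def ppo_clip_loss(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """PPO clipped surrogate objective, returned as a loss to minimize.

    old_log_probs were recorded during the sampling phase and are treated as
    constants (detached), so gradients flow only through new_log_probs.
    """
    ratio = torch.exp(new_log_probs - old_log_probs.detach())            # pi_new / pi_old
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    return -torch.min(unclipped, clipped).mean()                         # negate to minimize

advantages = torch.tensor([1.0, -0.5])
old_log_probs = torch.tensor([-1.0, -2.0])
new_log_probs = torch.tensor([-0.9, -2.2], requires_grad=True)
print(ppo_clip_loss(new_log_probs, old_log_probs, advantages))
# On epoch 0, new_log_probs == old_log_probs, the ratio is 1, and the loss
# reduces to -mean(advantages); later epochs can deviate, but only within clip_eps.
```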

  • @curumo_curunir · 2 years ago

    Thank you for the nice explanations and video. It is useful. I hope your videos about ML&Data Science will continue.

  • @tianyuzhang7404 · 2 years ago

    The first time I a have a feeling of what Bellman’s equation is for. Awesome video.

  • @saikiranvarma6450 · 2 years ago

    A very positive start of the video thanks you, keep going and we keep supporting

  • @AmanSharma-cv3rn · 2 years ago

    Simple and clear❤️

  • @vizart2045 · 2 years ago

    I am working on machine learning and this was new to me. Thanks for bringing it to my attention.

  • @akashpb4044 · 2 years ago

    Brilliantly explained 👍🏼👍🏼

  • @pggg5001 · 2 years ago

    10:45. I think you might have some error here. The equation is V(s) = max(R(s,a) + r*V(s')) i.e. V(s) max of the sum of the reward of the CURRENT cell (which is s) plus the V-value of its best NEIGHBOUR (which is s'). so the V-value of the cell left of the princess should be V = 0 + 0.9 * 1 = 0.9, not 1, right?

    • @virgenalosveinte5915 · 1 year ago

      I thought so too at first, but then realized that R(s,a) is a function of s and a, the current state and action pair, not s' which is the next state. So the reward is given by taking the action, not being in the next state. Its a detail anyway but ill leave it for future people
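
Worked out under the convention the reply describes, where R(s, a) is paid for the move itself and the terminal square has no future value, the numbers come out as in the video; a minimal sketch with a hypothetical three-square corridor leading to the princess:

```python
# Corridor s0 -> s1 -> s2 -> princess. Moving into the princess square pays R = 1;
# every other move pays R = 0. gamma = 0.9 and V(terminal) = 0.
gamma = 0.9

V_princess = 0.0                     # terminal state: no future value
V_s2 = 1 + gamma * V_princess        # reward earned on the move itself -> 1.0
V_s1 = 0 + gamma * V_s2              # -> 0.9
V_s0 = 0 + gamma * V_s1              # -> 0.81

print(V_s2, V_s1, V_s0)              # 1.0 0.9 0.81
```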

  • @phattaraphatchaiamornvate8827 · 2 years ago

    you make me god ty.

  • @dariuszkrynicki9184 · 2 years ago

    There are too many videos focusing on explaining theory and equations which are easy to research as there is plenty of theoretical knowledge helping to work out all the math beyond the equations. Although, there are too few good quality videos focused on practice and coding. Your video is an outstanding one!

  • @dariuszkrynicki9184 · 2 years ago

    good one, ty!

  • @superdahoho · 2 years ago

    11:42 you lost me there You said the state of the next square is 0 that's why it's 0 + but then you go ahead and say the value of the square is 1. are the state and value of the square different? I thought the state was the square?

  • @TheTenorChannel · 2 years ago

    you're a crack, i love it! cool video!!!

  • @vadimavkhimenia5806 · 2 years ago

    Is AC still the leading algorithm for tasks such as self-driving?

  • @whyzzy2683 · 2 years ago

    This is greate tutorial. Thanks for the talk.

  • @pavaniddalagi · 2 years ago

    subscribed :)

  • @sujathaontheweb3740 · 2 years ago

    You're so funny! 🤓

  • @marcusrose8239 · 3 years ago

    What is a prime state, is the max V(s) at state, or ?

  • @faisalamir1656 · 3 years ago

    thanks alot mannnnnnn

  • @ORagnar · 3 years ago

    3:44 It's fascinating to note that in 1954 there were no digital computers. I wonder if they were using analog computers.