ajegorovs
Joined 16 Apr 2024
Derivation of control for a cartpole. Basics: stability, equations of motion, Linearization (sympy)
I've begun studying control of dynamical systems and decided to explore the cartpole: an inverted pendulum attached to a moving cart.
This topic has a few cool new ideas that I wanted to share, so the video is quite long.
Chapters:
0:00 Intro/Lecture Plan
3:15 Solving system of ODEs via eigenstuff
8:08 Representation of a dynamical system
14:00 Linearization/ Taylor series
21:00 Example with pendulum
25:00 Stability condition for continuous system
30:00 Stability condition for discrete system
33:33 Control/Controllability/Closed loop system
39:00 Pole placement example
43:50 Cartpole Lagrangian
51:00 Cartpole Equations of Motion
58:00 Side note about deriving B
1:04:50 Cartpole Linearization
1:09:00 Cartpole pole placement
1:12:30 Visualization
1:13:05 Comments on nonlinearity
1:18:37 Recap
1:21:45 Old man yelling on cloud
1:23:00 GL!
References:
Aleksandar Haber's lecture on pendulum linearization.
th-cam.com/video/SfbRknFoSZ0/w-d-xo.html
Brunton's "control bootcamp"
th-cam.com/play/PLMrJAkhIeNNR20Mz-VpzgfQs5zrYi085m.html
Matlab's video on pole placement
se.mathworks.com/videos/state-space-part-2-pole-placement-1547198830727.html
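The linearization, stability, and pole placement steps named in the chapters can be sketched with sympy on a plain pendulum. This is a minimal sketch, not the code from the video: the frictionless point-mass model, the input `u` entering as an angular acceleration, the angle measured from the upright position, and the target poles at -1 and -2 are all assumptions chosen for illustration.

```python
import sympy as sp

# State x = (theta, omega); theta is measured from the upright position (assumption).
theta, omega, u = sp.symbols("theta omega u", real=True)
g, l = sp.symbols("g l", positive=True)

# Nonlinear dynamics x_dot = f(x, u) for a frictionless point-mass pendulum.
# The input u enters as an angular acceleration (illustrative choice).
x = sp.Matrix([theta, omega])
f = sp.Matrix([omega, (g / l) * sp.sin(theta) + u])

# First-order Taylor expansion about the equilibrium (theta, omega, u) = (0, 0, 0).
equilibrium = {theta: 0, omega: 0, u: 0}
A = f.jacobian(x).subs(equilibrium)
B = f.jacobian(sp.Matrix([u])).subs(equilibrium)

# Continuous-time stability: any eigenvalue with positive real part means unstable.
eigenvalues = A.eigenvals()  # {eigenvalue: multiplicity}
unstable = any(ev.is_positive for ev in eigenvalues)

# Pole placement: choose K in u = -K x so that A - B K has poles at -1 and -2
# (hypothetical targets) by matching characteristic polynomial coefficients.
k1, k2, s = sp.symbols("k1 k2 s")
K = sp.Matrix([[k1, k2]])
char_poly = (s * sp.eye(2) - (A - B * K)).det()
desired = sp.expand((s + 1) * (s + 2))
gains = sp.solve(sp.Poly(char_poly - desired, s).all_coeffs(), [k1, k2])

print(A, B, eigenvalues, gains)
```

The open-loop eigenvalues come out as plus and minus sqrt(g/l), so the upright equilibrium is unstable, and the solved gains move both closed-loop poles into the left half-plane.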
Views: 25
Videos
Matrix-vector multiplication column perspective, its relation to affine transformations
Views: 65 · 21 days ago
This video is more of a slop that does not introduce anything phenomenal; rather, it is a refinement of "basic" knowledge, a showcase/exploration of methods. In this video we: - discuss the popular interpretation of the matrix-vector product as a linear combination of columns. - extend it to matrix-matrix multiplication via 'stacking'/'partitioning' - derive affine tra...
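The column perspective and the 'stacking' extension can be checked numerically. A minimal numpy sketch, with arbitrary example matrices that are not taken from the video:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # 3x2 example matrix
x = np.array([10.0, -1.0])

# Column view: A @ x is a weighted sum of the columns of A, with weights from x.
column_combo = sum(x[j] * A[:, j] for j in range(A.shape[1]))

# Matrix-matrix via stacking: column k of A @ B is A times column k of B.
B = np.array([[1.0, 0.0, 2.0], [0.0, 1.0, -1.0]])  # 2x3 example matrix
stacked = np.column_stack([A @ B[:, k] for k in range(B.shape[1])])

print(np.allclose(column_combo, A @ x), np.allclose(stacked, A @ B))
```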
Morton (Z-order) curves and bottom-up Binary Radix tree construction
Views: 30 · months ago
In this video we will discuss: - why we need data structure accelerators: binary trees, quadtrees - how to use the Morton (Z-order) curve to impose order and structure on our data - top-down binary radix tree construction based on Morton encoding - bottom-up (parallel) version - practical demonstration 0:00 Motivation 2:26 Binary Tree 3:40 Quadtree 7:00 Space-filling curves: toy example 10:20 Space...
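Morton encoding itself is short enough to sketch. The version below is a common bit-interleaving approach for 2D points with 16-bit coordinates; the helper names `part1by1` and `morton2d`, and the choice of putting x in the even bits, are assumptions for illustration rather than the video's code.

```python
def part1by1(n: int) -> int:
    """Spread the lower 16 bits of n so that a zero bit sits between each pair."""
    n &= 0x0000FFFF
    n = (n | (n << 8)) & 0x00FF00FF
    n = (n | (n << 4)) & 0x0F0F0F0F
    n = (n | (n << 2)) & 0x33333333
    n = (n | (n << 1)) & 0x55555555
    return n


def morton2d(x: int, y: int) -> int:
    """Interleave bits of x (even positions) and y (odd positions)."""
    return part1by1(x) | (part1by1(y) << 1)


# Sorting by Morton code visits the cells of a 2x2 grid in a "Z" pattern.
points = [(1, 1), (0, 0), (0, 1), (1, 0)]
z_order = sorted(points, key=lambda p: morton2d(*p))
print(z_order)
```

Once points are sorted this way, neighbours in the sorted array tend to be neighbours in space, which is what radix tree construction on the codes relies on.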
2024 11 09 17 04 31
Views: 61 · months ago
Custom many-body gravitational interaction sim in Vulkan. I use a GPU Linear Bounding Volume Hierarchy (LBVH) tree for faster pairwise force calculation (research.nvidia.com/sites/default/files/pubs/2012-06_Maximizing-Parallelism-in/karras2012hpg_paper.pdf). Implementation taken from github.com/MircoWerner/VkLBVH. I've added a shader for tree-traversal force calculation. *Noted that radix sort shader...
2024 11 05 19 11 53
Views: 276 · months ago
Implemented an octree-like "accelerator" (Linear Bounding Volume Hierarchy, LBVH) from github.com/MircoWerner/VkLBVH. It was long and painful, mainly because I could not compile their example, so I could not troubleshoot. Setting the correct number of (compute) work groups was essential, since you don't want overlap when sorting. The OG author used the spirv-reflect library to grab information from ...
(3/3 bonus)RL Journey to Trust Region Policy Optimization. Bonus. Training quadruped Ant agent.
Views: 17 · 4 months ago
In this video I will share a few thoughts about this environment. NOTE: I forgot to mention that training this dude takes a lot of time. On my potato laptop, 25 batches * 10 episodes per batch * 2_000 steps per episode = 500_000 steps takes ~50-60 mins. I did ~3_000-4_000 episodes :C, and for the first 2000 eps the quadruped did not walk forward. Env repos: Jiminy- github.com/duburcqa/jiminy/tree/dev?tab=readme...
(3/3)RL Journey to Trust Region Policy Optimization. TRPO implementation using pytorch
Views: 79 · 4 months ago
This is the third and final part of the series on TRPO. In this video we discuss how to implement the algorithm in Python with PyTorch. We don't analyze every line of code. Instead I try to convey the main ideas (at least how I understand them xd) and methods, and cover places that may have pitfalls. NOTE 01: initialize MLP weights to zero (or any constant) with nn.init.zeros_(LAYER.weight). Th...
(2/3)RL Journey to Trust Region Policy Optimization. Conjugate Gradient, Hessian-vector trick, TRPO.
Views: 63 · 4 months ago
In this second part we explore how TRPO's mathematical definition differs from NPG's, find where the KL divergence constraint is employed, and discuss the idea behind the Conjugate Gradient (CG) method and how to combine it with the Hessian-vector product 'trick'. Then we discuss how to implement policy parametrizations for discrete and continuous cases and compare with the implementation of TRPO in OpenAI's 'mod...
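The reason CG pairs well with a Hessian-vector product is that it only ever needs the product H v, never H itself. A minimal numpy sketch of that idea follows; the function name `conjugate_gradient`, the small explicit matrix `H`, and the vector `g` are stand-ins for illustration (in TRPO the `matvec` callback would be computed by automatic differentiation instead of an explicit matrix).

```python
import numpy as np


def conjugate_gradient(matvec, b, iters=10, tol=1e-10):
    """Solve H x = b for symmetric positive-definite H using only products H v."""
    x = np.zeros_like(b)
    r = b.copy()  # residual b - H x, with x = 0 initially
    p = r.copy()  # first search direction
    rs_old = r @ r
    for _ in range(iters):
        Hp = matvec(p)
        alpha = rs_old / (p @ Hp)  # exact step length along p
        x += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        if rs_new < tol:
            break
        p = r + (rs_new / rs_old) * p  # next direction, H-conjugate to the previous ones
        rs_old = rs_new
    return x


# Toy stand-ins: H plays the role of the Hessian, g the role of the gradient.
H = np.array([[4.0, 1.0], [1.0, 3.0]])
g = np.array([1.0, 2.0])
step = conjugate_gradient(lambda v: H @ v, g)
print(step)
```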
(1/3)RL Journey to Trust Region Policy Optimization. Vanilla Policy Gradient/Natural Policy Gradient
Views: 120 · 4 months ago
This is the first video in a series of 3 in which I explore the methods that lead up to and are used in Trust Region Policy Optimization (TRPO). The content is borderline stock, except for a few derivation examples. The main benefit of this video, with regard to TRPO, is that we find out how to implement KL divergence and how to pose and solve the optimization problem. 0:00 Why is this series made? 6:36 Goals 8:10 VP...
Stochastic MDP for Reinforcement Learning. Explanation of environment expected parameters.
Views: 42 · 6 months ago
Got really tired at the end haha. zZZzz. 0:00 Intro/Motivation 2:22 Outline 4:25 Probability of A 6:55 Probability of A|B 10:00 Probability chain rule 11:15 Expected value 12:45 Marginal probability 14:51 Markov Process (MP) 20:15 Deterministic MDP 25:05 Stochastic MDP 31:45 joint distribution p(s,a,s',r) 35:55 decouple p(s,a,s',r) for agent and env 41:31 Transition probability p(s'|s,a) 43:51 ...
Graph Attention Network (GAT) from scratch. Forward pass using pytorch. Part 02. Multi-head version.
Views: 305 · 7 months ago
This is part two in a series of videos on Graph Attention Networks. (This is a re-recorded video. I've realized that some things can be presented in an easier way.) In this video we: 1) Repeat the steps taken to create a 1-head GAT, but with the addition of multiple attention heads. This adds an additional dimension to our storage matrices, which requires an alternative approach to calculating matrix multiplication...
Graph Attention Network (GAT) from scratch. Forward pass using pytorch. Part 01. GCN to 1-head-GAT.
Views: 1.5K · 8 months ago
This is part 1 of a two-part series on Graph Attention Networks. (Suggestion: I talk slowly. View at x1.5 speed.) In this video we will discuss: 1) General ideas about graphs and the difficulty of applying them in neural networks. 2) The meaning of the convolution operation for images and graphs. 3) I introduce small linear algebra 'hacks' to help our interpretation of mathematical operations in GNNs. 4...
! NOTE. Initialize MLP weights to zero (or any constant) with nn.init.zeros_(LAYER.weight). This makes all logits equal, which gives uniform probabilities, which is good for exploration.
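The effect described in this note can be illustrated without PyTorch. The sketch below uses plain numpy as a stand-in for the final linear layer of a policy network; the shapes, the hand-written `softmax`, and the zero bias are assumptions for illustration (if the bias is left at a nonzero random value, the logits equal the bias and are not uniform).

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


rng = np.random.default_rng(0)
obs = rng.normal(size=(5, 4))  # 5 observations with 4 features each

# Stand-in for the output layer: zero weights and (assumed) zero bias, 3 actions.
W = np.zeros((4, 3))
b = np.zeros(3)

logits = obs @ W + b     # every logit is 0, whatever the observation is
probs = softmax(logits)  # so every action gets probability 1/3
print(probs)
```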
I am so impressed by this tutorial, hope more helpful and interesting things coming out 👏 !!!!