An even better explanation, one that will lead among all sources.
Fascinating review. I glanced at the paper, particularly at GRPO and GAE. GRPO looks a lot like fuzzy logic, with nodes or attention heads adapted to experience (for example, the "relative" part works like K-means group clustering).
Looking more deeply at GAE (Generalized Advantage Estimation), it looks like it is meant for an adaptive control system.
I would not be surprised if deep learning's use of theta originated as the angle of a pendulum.
The overlapping membership functions used in fuzzy logic are very similar to KL divergence.
Don't have much experience with fuzzy logic. But I like your perspective 🙂
Great explanation
Thanks 👍
Thank you for your explanation!
My pleasure 😊
great video
Thanks!
thanks a lot.
Most welcome!