Pim de Haan
Joined 8 Apr 2008
Causal Confusion in Imitation Learning
Published at NeurIPS 2019
Authors: Pim de Haan, Dinesh Jayaraman, Sergey Levine
Paper at: arxiv.org/abs/1905.11979
Slides at: sites.google.com/view/causal-confusion
Views: 1,251
Excellent lecture!
I think there's a far more fundamental reason why Category Theory is important to AI, beyond a simple belief that it will be useful and everyone should want it. Firstly, we should understand that information is derived from knowledge, not the other way around, because information is data with a meaning, and meaning is defined in terms of knowledge. So, if Information is created from knowledge, then what is knowledge? I think the distinction is that while information is the subject of Set Theory, knowledge is the subject of Category Theory. They're like the inverse of each other. While Set Theory deals with what is in predefined sets, Category Theory is about defining the sets in terms of the relationships between sets. When we consider our existential circumstances as embedded observers in the universe, we are not afforded any absolute or privileged frame of reference. All we get to do is to observe and model the relationships between the things we observe (all measurement is comparison), and the nature of knowledge becomes obviously aligned with Yoneda's Lemma, in which an object in a category is completely determined by its relationships to all other objects... and then our biological representation of that looks like a 100 billion neurons with a trillion dynamic connections for relationships ... and then our AI representation of that looks like either neural networks or the equivalent associations by proximity in very high dimensional vector spaces (aka, embeddings in vector stores). The formalisms of Category Theory are then required to have a way to reason about knowledge, rather than the Set Theory we use to reason about information. This should provide for a comprehensive set of knowledge representations, paving the way for explainable AI, as well as improvements to model representations and structured reasoning . This distinction is also at the centre of most arguments about whether computers can ever be considered to "know" anything. 
They can't as compositions of information constructs where the meaning is imposed by the viewer, but they can as compositions of knowledge constructs, where everything is known in terms of everything else.
thank you very much for this amazing content
very illuminating!!
Neat.
About the diagrams @ 37:30: This is pedagogical suicide! How can you just drop on us diagrams in a graphical language we've never seen before and expect us to reason about them? We don't even know what the boxes represent, what the lines connecting the boxes represent, etc. We don't know how to read these diagrams!! It was immediately clear to me what the students were asking about: apparently reshuffling parts of the diagram produces a different diagram (hence why they were asking about the "box position encoding information") For someone who understands category theory, it sure as hell shouldn't be a problem to understand that people can't know what you haven't told them yet!!!
I totally agree with you! That diagram made me feel like a dumbass, because it suddenly came out of nowhere. I was still on track with the previous slide about the tensor operator, but then the diagram was introduced as a question, which made the previous slide seem entirely irrelevant. Worse, I had no clue about the context of the diagram or any basic information for interpreting it. What were the boxes? What did f, g, h, i mean? What did the lines connecting the boxes mean? And that is even though I already had fundamental knowledge of Category Theory. (For anyone who wants to pick up the fundamentals of Category Theory and its core philosophy, I highly recommend the book The Joy of Abstraction by Eugenia Cheng.) Later, thanks to a question from the audience, which was needed to clear things up, I could catch up a bit after the speaker's explanation. I think the lecture should find a better way to present this to an audience that may come from different backgrounds.
Keep going, guys, that's a brilliant and much-needed project ;) The funding scheme is also nice; hope you get it working.
At 26:44, it was mentioned that there is only one morphism from any set A to (), i.e., the function returning () for every element of A. However, could it be that some elements of A have no return value, i.e., no arrow from those elements to ()? In that case the morphism (function) would not be unique, and there could be many different ones.
If some elements of A have no return value, then it is not a function to begin with. To be a function, it must map every element of its domain A.
At 15:10 we are talking about commutative diagrams, but I was not convinced by the name "commutative". Does "commutative" here have anything to do with the commutative property in abstract algebra? If not, what exactly does "commutative" mean in the context of the diagram?
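The point in the reply above can be checked by brute force on a small example. The following is a minimal sketch, not something from the lecture: the helper `total_functions` and the three-element set `A` are made up for illustration. It enumerates every total function from `A` to a one-element set and, for contrast, every "partial" assignment in which an element may be left undefined.

```python
from itertools import product

def total_functions(domain, codomain):
    """Enumerate every total function domain -> codomain as a dict."""
    domain = list(domain)
    codomain = list(codomain)
    # A total function picks exactly one image for each element of the domain.
    return [dict(zip(domain, images))
            for images in product(codomain, repeat=len(domain))]

A = ["a", "b", "c"]   # hypothetical example set
UNIT = [()]           # a one-element set whose only element is ()

# Total functions A -> {()}: every element must be sent somewhere,
# and () is the only place to send it.
functions_to_unit = total_functions(A, UNIT)

# If "no return" were allowed (modelled here as None), each element
# would have two options, giving 2 ** len(A) different partial maps.
partial_maps = total_functions(A, [(), None])

print(len(functions_to_unit), len(partial_maps))
```

Only the total functions count as morphisms in the category of sets, which is why the arrow into the one-element set is unique there; the partial maps are morphisms of a different category.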
Thanks for the great intro! Could you share the Zulip link?
This enriched my understanding of Category Theory and Causality by a lot. Thank you!
Great talk. It addresses questions that I have been wondering about. You've made a lot of interesting progress.
Best ever monads lecture!
The question at 51:09 is really good, asking if compositionality is the nature of things, or simply about the limit of our ability to understand. I think it's analogous to types in programming. Are static types the true nature of computations, or just about our ability to understand them? While sound type systems are necessarily incomplete (i.e. they fail to accommodate some perfectly valid programs), the idea of static typing has proven valuable for reliably constructing and reasoning about programs. Historically, untypable but perfectly working programs have been an important source of inspiration for improving the expressiveness of static types (e.g. various forms of polymorphism, path-sensitive reasoning, delimited continuations, etc.). If one day we come across a perfectly clear and correct program that's untypable, that needs massaging just to appease the type-checker, such program will not be a sign that we should abandon the idea of static typing, nor should we dismiss such program as invalid just because a particular static discipline fails to explain it. Each instance of incompleteness can serve as an opportunity to improve the type system. Similarly, if one day we come across an effective machine learning model that cannot be explained by the current building blocks, that won't mean that the idea of searching for compositional building blocks is worthless, and we obviously shouldn't dismiss that model's existence. That will simply mean we should revise the building blocks. After all, mathematics can't derive all truths, but it's the best tool we have for understanding truths.
Yes, your reply is directly in line with the incompleteness theorem in mathematical logic. Any logic-based system we come up with to reason about a domain (a domain that is not 100% knowable) will inherently be incomplete. We will have to keep learning and improving our understanding until the domain of knowledge is completely covered (all questions are answered), and whenever we find something we can't explain, we have to figure out how to revise our models to explain it.
You love to smack and click with your mouth a tad too much.
Structuralism, a 100 year old linguistic theory, finally mating with Math and Computer Science, making cute babies. Maybe we can name the first two babies Ferdinand and Claude.
Thanks Petar for the amazing intro! At 49:10, how can we tell from the commutative diagram that f is both monomorphic and epimorphic? Is f so by stipulation, or can we read it off the diagram directly?
This is my attempt to show that f is epimorphic: from the axioms of categories, we have id_B o f = f, which is also f = id_B o f. So, id_B o f = id_B o f. Since id_B = id_B, f satisfies the condition for being epimorphic. But I believe this argument can be applied to any morphism in any category, which would make every morphism epimorphic.
@alexanderlim4586 In that specific category, the only morphisms that can be composed on either side of f are the identities, so the mono and epi properties are trivially verified (it's a bit absurd, in a way). f o g = f o h always implies g = h because g = h = id; you have no other choice, since there are no other morphisms.
(At 15:21) It took me a while to understand the meaning of "linear G-equivariant functions are convolutions". This is basically associativity in a group: multiplying first from the left and then from the right is the same as multiplying first from the right and then from the left. I think this is much simpler to understand than just saying "convolution" with a formula.
Instead of naming them 'Chapter X', can you name them so that they convey more information about the content each segment covers?
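The reply above can be made concrete with a brute-force check. This is a minimal sketch under an assumption: it takes "that specific category" to be the category with two objects A and B, their identities, and a single non-identity arrow f: A -> B, which is how the reply reads but is not confirmed by the talk. The names `MORPHISMS`, `compose`, `is_mono` and `is_epi` are hypothetical.

```python
# Each morphism is recorded as name -> (source, target).
# Assumed category: objects A and B, identities, and one arrow f: A -> B.
MORPHISMS = {"id_A": ("A", "A"), "id_B": ("B", "B"), "f": ("A", "B")}

def compose(g, h):
    """Return g o h (h first, then g), or None if they are not composable."""
    _, h_tgt = MORPHISMS[h]
    g_src, _ = MORPHISMS[g]
    if h_tgt != g_src:
        return None
    if h.startswith("id_"):
        return g
    if g.startswith("id_"):
        return h
    return None  # never reached here: f cannot be composed with itself

def parallel_pairs(names):
    """All pairs (g, h) that share both source and target."""
    return [(g, h) for g in names for h in names
            if MORPHISMS[g] == MORPHISMS[h]]

def is_mono(m):
    """m o g == m o h implies g == h, for all parallel g, h into m's source."""
    src, _ = MORPHISMS[m]
    into_src = [g for g, (_, tgt) in MORPHISMS.items() if tgt == src]
    return all(g == h for g, h in parallel_pairs(into_src)
               if compose(m, g) == compose(m, h))

def is_epi(m):
    """g o m == h o m implies g == h, for all parallel g, h out of m's target."""
    _, tgt = MORPHISMS[m]
    out_of_tgt = [g for g, (s, _) in MORPHISMS.items() if s == tgt]
    return all(g == h for g, h in parallel_pairs(out_of_tgt)
               if compose(g, m) == compose(h, m))

print(is_mono("f"), is_epi("f"))
```

Because every hom-set in this category has at most one element, the only parallel pairs are of the form (g, g), so both conditions hold vacuously. In a category with more arrows the same check can fail, which is why the argument does not show that every morphism is epimorphic.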
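One direction of that statement is easy to test numerically. The sketch below assumes the simplest setting, the cyclic group Z_n acting on length-n signals by circular shifts; the lecture may use a more general group, and the function names `shift`, `circ_conv` and `pointwise_scale` are made up for illustration. It checks that circular convolution commutes with shifting, and that a linear map with position-dependent weights does not.

```python
def shift(x, s):
    """Act on a signal by the group element s of Z_n (circular shift)."""
    n = len(x)
    return [x[(i - s) % n] for i in range(n)]

def circ_conv(w, x):
    """Circular convolution: out[i] = sum_j w[j] * x[(i - j) mod n]."""
    n = len(x)
    return [sum(w[j] * x[(i - j) % n] for j in range(n)) for i in range(n)]

def pointwise_scale(m, x):
    """A linear map that is NOT equivariant: position-dependent scaling."""
    return [mi * xi for mi, xi in zip(m, x)]

x = [1, 0, 0, 0]   # example signal
w = [2, -1, 0, 3]  # example filter
m = [1, 2, 3, 4]   # position-dependent weights

# Equivariance: convolving a shifted signal equals shifting the convolved signal.
conv_then_shift = shift(circ_conv(w, x), 1)
shift_then_conv = circ_conv(w, shift(x, 1))

# The position-dependent map breaks this symmetry.
scale_then_shift = shift(pointwise_scale(m, x), 1)
shift_then_scale = pointwise_scale(m, shift(x, 1))

print(conv_then_shift == shift_then_conv, scale_then_shift == shift_then_scale)
```

Integer inputs are used so the comparison is exact. The converse direction, that every linear shift-equivariant map is a convolution, is the actual content of the claim in the lecture and is not demonstrated by this check.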
Really appreciate Jules' candour about the applications of this research programme! Still very supportive of it :)
At 40 mins: a group as a one object category isn't the same thing as the Lawvere theory of groups (or even of any Lawvere theory, which in general have a countably infinite number of objects!)
35 mins on parametric functions, the citation should be for "Theorems for Free" by Wadler.
How did Grav. do the animated gifs in his presentation? Did he draw them frame by frame?
Nothing new! This kind of information is already written down in articles and books since 2013 about the soundhelix (Lauthelix, klankhelix). The Semantics of Derivational Morphology: Theory, Methods, Evidence (Linguistische Arbeiten, 586) by Sven Kotowski (editor), Ingo Plag (Editor) Since Radboud University confirmed that Nomen est Omen (our names say who we are), we know that a name like Ingo PLAG helixes in PLAGIARISM, because words lengthen from the back (adjectio) and dissolve from the front (detractio) like in Dutch WRAAK (revenge) that helixes in German (w)RACHE. In the announcement, Plag talks about “innovative methodologies” by which he means the “universal sound helix”. Unfortunately, Plag copied these “new and very interesting insights” from the books about the Universal soundhelix (klankhelix, Lauthelix) that are in the Deutsche Nationalbibliothek (German National Libraries) in Leipzig and Frankfurt. Plag worked at the same time with a Dutch female teacher at the Philipps-Universität-Marburg. She was falsely accused of sexual harassment, which was followed by immediate dismissal, while the actual background came to light by a notorious German whistleblower. He accused the president, several deans and the ombudsman of covering up her research results with which Ingo Plag plagiarizes in all his publications. In the announcement he talks about “oft-neglected fields” of certain directions in linguistics. It is indeed true that Plag and all his colleagues (including Noam Chomsky) never came up with the idea of sound rules expressed by Goropius Becanus. In addition to adjectio and detractio mentioned, Becanus pointed out the permutatio in which words must be read from back to front. Instead of 'neglect', one should speak of ignorance. In addition to the well-known rows P, T and K, Becanus also pointed out a fourth row W that was never understood by Plag and Co. 
It is indeed painful that linguists themselves never came up with these ideas, but what falls under the heading of criminality is the fact that they completely ignore the research results of Becanus and the soundhelix (klankhelix, Lauthelix). The question is therefore not whether we should judge this book here, but rather the authors, professors and employees who are given carte blanche through publishers affiliated with their university when it comes to offenses such as plagiarism and making false accusations. For example, the Horizon 2020 project, with subsidies amounting to millions of euros, is solely due to the rediscovery of the soundhelix. With the soundhelix one can not only reconstruct the past, and then correctly; one can also use it to spell and therefore predict the future. It is therefore not just about 'predicting semantic properties' but about a whole system that, like the Oracle of Delphi, can indicate future discoveries. If we look at religious books such as the Torah and the Bible, for example, we can predict the words to which these books refer using the soundhelix. From STAR helixes STER because vowels helix alphabetically (a > e > i > o > u). Via detractio helixes out of STER > (s)TORA while the adjectio is the cause of the extension of the word. From TORA helix (t)ORAK > ORACLE, but through the permutatio we now also know that from ORAK / KARO is helixing, that helixes in KORAN, because there is a second octave available, which gives the vocals a higher pitch, just like in Canaan. In addition to TORA, however, TUAS also is helixing, which is TUAS GLOS, the BIBLE, which literally means 'second writing'. The Oracle of Delphi is never taken seriously by male scientists, because this knowledge was controlled by women. And here is the point, because it now appears that women during pre-Jewish matriarchy already had more knowledge of linguistics than men do today. So that's where the shoe pinches. So the KORAN is not the third book, but the fourth! 
KORAN helixes in QUA'RAN, but than in QUINTESSENCE, which deals with FIVE. This fifth book has a pointe, a clou, because that is the discovery of the soundhelix self! The soundhelix forms words automatically over which humans themselves have no influence. This phenomenon is therefore 'divine' (i.c. female, but also related with ‘devil’), which leads one to conclude that beliefs in a God (or Allah) is based on a kind of superstition. It has nothing to do with a male God, but only with a Mother Nature! The soundhelix has a MA-the-MA-tic pattern, which is indeed related to the mama’s who were the first human beings who could count, but which is indeed related to the algorithms referred to by Plag in his announcement. The concept of GOD can be translated via permutatio into GOD/DOG, which in Dutch means HOND, which via adjectio helixes in HONDERD, which helixes in HUNDRED in English. But in Dutch, HONDERD is helixing in HUN DRIETJES (their three) alias 1-0-0 in which the ones and zeros refer to algorithms. In Plag’s books, you will never find a reference to the Dutch language, while it is now indisputable that not only German helixes out of the Dutch language, but also English ánd French. And this finding turns the history of Europe upside down. But what is even much more interesting is the discussion going on in astrophysics. Jonathan Oppenheim wants to have discovered something new by combining certain theories when it comes to quantum mechanics, gravity and string theory. In English, a ‘string’ refers to 'underpants'. During the pre-Jewish matriarchy, women were in charge, which is indicated by the saying 'wearing the pants'. Now the image of the molecule that refers to the SAMARIUM is shaped like a pair of pants, which deals with the number five: the pentagram! SAMARIUM contains the name of the Holy Mary who had a baby as a virgin. A synonym for baby in Dutch is a 'broekie' (pants)! 
The soundhelix is able to give us insight into knowledge about cosmology without us having to study the universe. GRAVITY helixes out of Dutch KRAP which helixes via permutatio in PARK, but P > F and K > C which is pronounced as S, resulting in GRAVITY > FORCE, i.c. FIFTH FORCE! Particles split in two, which is associated with SCHAAR (scissors). Astrophysicists still don't know why Schrödinger's CAT can be dead and alive at the same time. Women are called FOTZE in German, which means helixes out of Dutch VOD which means RAGS in English. Out of SCHAAR(s) > RAGS is about reduction. The VOD / DOV helixes in DUIF (pigeon, but also ‘peace”) ánd VOODOO, which gives the connection between the soundhelix alias THE WORD (Jesus) and religion a different dimension. For centuries, people have been led to believe that bad behavior will get them into hell. This threat came from outside. People now hardly believe in an all-powerful God. Knowledge about the soundhelix should lead to the conviction that there is indeed something like an omniscient mechanism through which people are stimulated from within to behave well. We call this behavior charity. Books like this by Ingo Plag are therefore extremely relevant. But it has now also been proven that Ingo Plag himself is an idiot, because through knowledge of the soundhelix he should know that his plagiarism will come true! Pale(s)tin(t) means "white faces", since the Jewish people are not coming from Israel but out of the North of France: IJZERHOEK! Ingo Plag knows this. Nice regards, Ans (Johanna) Schapendonk
In the notation X ⊗ Y, are X and Y objects or morphisms? So far, lowercase letters f, g, h were used for morphisms and capital letters A, B, X, Y for objects in a category. That is why I am confused about X and Y.
In a monoidal category, what are the objects, morphisms and composition?
~anything; these are not included in the definition of a monoidal category e.g., there are monoidal categories where objects are sets, functors, lenses, etc
Good video. At 10:35 a new term, "model of the system", is suddenly introduced to explain compositionality. So to understand compositionality, you now need to understand what a "model of the system" is. Could you please first explain what is meant by "model of the system"?
42:06 - a nice easy explanation here, sounds pretty straightforward to do in Python
What is the Zulip server URL?
Fascinating talk. First time I've really started to see why it is useful to work with enriched categories. And now I've got a whole bunch of references to look up for further info!
The only reason this works is that language has a topological/categorical structure on it that effectively mimics the reals. The mapping into R works because we know how to deal with R and because the mapped structure does have structure in it. It's not just an arbitrary set (if it were, the mapping would be arbitrary and all derived structures based on R would be meaningless). The idea is really just representation theory: represent a structure in terms of another, more familiar one to try to understand the former.
The largest use case I've seen for chatGPT is that of undermining society by having bots use it to make people think they are real profiles.
36:03 Perfect, creatio ex nihilo now^^
28:00 As a theologian (as well as deep learning enthusiast), the terminal set seems to me to be analogous to the concept of (monotheistic) "God". And it is fascinating that the terminal set can be identified simply in virtue of all arrows pointing to it, without knowing the "nature" of the objects/sets - i.e. in particular, without knowing the "nature of the One". This is a line of thought that definitely bears fruit theologically, when you consider the claim that all manifestation in the world points to God himself (and indeed, that God is exactly He who all manifestation points to). It is also interesting that the different pagan cultures of the past, once they converted to (monotheistic) Christianity, they provided connections to elements of or figures in scripture, e.g. connecting the ancestry of their traditional gods to some specific, lesser known descendant of Noah. This way it seems they provided arrows into the framework of references (in language) that themselves were understood to point to "God", i.e. to biblical scriptures or associated traditions. When you think about it, if compositionality holds in the realm of language/thought, then in virtue of connecting one's culture to an aspect of the story that explicitly contains (or rather points to) "God", one is simultanously providing the composition, i.e. an arrow from one's culture to "God". The desire to do this underlines the notion that (at least for these ancient people, but definitely still today with regard to people who take the concept of "God" serious in either a philosophically or religious way) this concept "God" was seen to serve a role similar to the notion of "terminal set"/"top" or whatever the more general category-theoretical term is (limit?). Also wondering whether the claim that Christ is "alpha and omega" and that the "first shall be last" point to some sort of mathematical structure in which "top" and "bottom" (or "limit" and "colimit"?) 
are in some sense indistinguishable from one another. Just recording my thoughts here for future reference. But maybe I am not the only one interested in rather arcane topics, considering the topic of this lecture series :D
We need more content from you; bring more theory.
Great content! :)
I have just found recently that the way I think and model problems always matched the way category theory does. It's a very exciting area to explore :)
Well, it is not the "way you think". It is the way WE think, in general.
@@samueldeandrade8535 I would definitely not generalize my statement. I can spot, for example, some modern / contemporary artists wouldn't think in a compositional way (although classic artists would tend to).
@@0Tsutsumi0 ok. But no. And now I understand a little better what you tried to say.
Great analysis
fantastic presentation overall. One of the simplest descriptions of syntax vs semantics and Lawvere theories
This sounds like a fascinating area of research! I have two questions: 1.) Is there a list of open problems that the community could check out? Personally I would love to get 'nerd sniped' lol; and 2.) Categorical Cybernetics seems very AI foundational, so I was wondering if there is much work categorically modelling aspects of agent foundations research?
blessed talk
PMs also seem to have similarities to the recent DeepMind work "Tree of Thoughts"
Would it be accurate to say that prediction markets are similar to System 2 and ANNs are similar to System 1 (perception)?
Amazing talk. fundamentally important to understanding all machine learning. language model pretraining is like a first pass at learning the structure of language by understanding word contexts. knowledge graph learning and link prediction using contrastive learning is eerily similar to the use of the unit interval enriched category between sets. so many interesting insights and connections here.
Thanks for publishing these lectures!
Are you guys ready for a corny-ass joke: functors are funtors :D
Great lecture
Absolutely brilliant introduction to Category Theory!
This is really inspiring work. On "blue", and "red" functor, a "color" functor might be instructive. What are all the relationships that have to do with a color, very Yoneda? maybe?
Absolutely amazing playlist!