Roland Speicher
Germany
Joined 30 Dec 2011
What is so amazing about Wigner's semicircle law?
This is an appetizer for my lecture series on random matrices. Histograms of the eigenvalues of large random matrices should convince you that randomness can still result in many precise and deterministic statements.
The lecture notes for the course can be found at rolandspeicher.com/wp-content/uploads/2020/05/rm-notes-speicher-v2.pdf
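(Not from the video itself, just a hedged illustration: a minimal Python/NumPy sketch of the kind of eigenvalue histogram described above; the matrix size N = 2000 and the GOE-type normalization are my own choices.)

import numpy as np
import matplotlib.pyplot as plt

# Symmetric random matrix, normalized so that its eigenvalue distribution
# approaches Wigner's semicircle on [-2, 2] as N grows.
N = 2000
A = np.random.randn(N, N)
H = (A + A.T) / np.sqrt(2 * N)

eigenvalues = np.linalg.eigvalsh(H)

# Compare the histogram with the semicircle density sqrt(4 - x^2) / (2*pi).
x = np.linspace(-2, 2, 400)
plt.hist(eigenvalues, bins=60, density=True, alpha=0.6, label="eigenvalues")
plt.plot(x, np.sqrt(4 - x**2) / (2 * np.pi), "r", label="semicircle density")
plt.legend()
plt.show()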
Views: 496
Videos
The curse and the blessing of high dimensions
277 views · 9 months ago
This video is an appetizer for my lecture series on HDA: Random Matrices and Machine Learning. It mainly shows the blessings of high dimensions, a.k.a. concentration phenomena, via histograms of various quantities for random matrices and random vectors. The lecture notes for the course can be found at rolandspeicher.com/wp-content/uploads/2023/08/hda_rmml.pdf 0:00 Welcome 0:12 Curse and blessing 1:3...
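(A hedged illustration of one such concentration phenomenon, not taken from the lecture notes: the norm of a standard Gaussian vector in dimension d concentrates around sqrt(d); the dimensions and sample size below are arbitrary choices.)

import numpy as np
import matplotlib.pyplot as plt

# Histograms of ||x|| / sqrt(d) for standard Gaussian vectors x in R^d:
# the distribution concentrates around 1 as the dimension d grows.
rng = np.random.default_rng(0)
samples = 5000

for d in (2, 100, 10000):
    norms = np.linalg.norm(rng.standard_normal((samples, d)), axis=1)
    plt.hist(norms / np.sqrt(d), bins=60, density=True, alpha=0.5, label=f"d = {d}")

plt.xlabel("norm / sqrt(d)")
plt.legend()
plt.show()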
RM+ML: 29. Free Probability Theory and Linearization of Non-Linear Problems
500 views · 1 year ago
The lecture notes for the course can be found at rolandspeicher.com/wp-content/uploads/2023/08/hda_rmml.pdf neural network, random matrices, free probability, free cumulants, linearization trick 0:00 Relevance of RM and free probability for ML 4:15 Motivating free probability via random matrices 38:39 Free cumulants and moment-cumulant formula 43:00 Vanishing of mixed free cumulants and free ind...
RM+ML: 28. Properties of the Neural Tangent Kernel
313 views · 1 year ago
The lecture notes for the course can be found at rolandspeicher.com/wp-content/uploads/2023/08/hda_rmml.pdf neural network, neural tangent kernel, concentration, time evolution, large width limit 0:00 Recap on neural tangent kernel 12:00 Convergence of NTK to a deterministic kernel 17:38 Calculation of the limit NTK for the ReLU function 40:13 Why does the NTK not change in time? The goal of th...
RM+ML: 27. Time Evolution of Learning and Neural Tangent Kernel
260 views · 1 year ago
The lecture notes for the course can be found at rolandspeicher.com/wp-content/uploads/2023/08/hda_rmml.pdf neural network, neural tangent kernel, random feature model, learning algorithm, large width limit 0:00 Recap 0:39 Learning for one hidden layer net 10:16 Evolution of parameters via gradient descent 24:50 Neural tangent kernel 32:18 Time evolution of neural net function 45:38 Time evolut...
RM+ML: 26. Gradient Descent for Linear Regression
196 views · 1 year ago
The lecture notes for the course can be found at rolandspeicher.com/wp-content/uploads/2023/08/hda_rmml.pdf neural network, gradient descent, linear regression, random feature model, learning algorithm 0:00 Learning and gradient descent 6:35 Feature learning 8:58 Gradient descent for linear regression 36:29 Gradient descent algorithm 42:02 Example 56:22 Ridge regression The goal of this lectur...
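(A minimal sketch, my own and not the code used in the lecture, of plain gradient descent for the least-squares problem min_w ||Xw - y||^2; the step size and iteration count are arbitrary choices.)

import numpy as np

def gradient_descent_ls(X, y, lr=0.01, steps=1000):
    """Plain gradient descent for the averaged squared error (1/2n) * ||Xw - y||^2."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / n  # gradient of the averaged squared error
        w -= lr * grad
    return w

# Toy example: recover a planted weight vector from noisy linear data.
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 5))
w_true = rng.standard_normal(5)
y = X @ w_true + 0.1 * rng.standard_normal(200)
print(gradient_descent_ls(X, y), w_true)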
RM+ML: 25. Gaussian Equivalence Principle for Non-Linear Random Features
218 views · 1 year ago
The lecture notes for the course can be found at rolandspeicher.com/wp-content/uploads/2023/08/hda_rmml.pdf Gaussian equivalence principle, cumulants, non-linear random feature model 0:00 Recap 5:46 Equivalence principle for random feature matrix 15:33 General remarks on equivalence principle The goal of this lecture series is to cover mathematically interesting aspects of neural networks, in par...
RM+ML: 24. Calculation of the Random Feature Eigenvalue Distribution
279 views · 1 year ago
The lecture notes for the course can be found at rolandspeicher.com/wp-content/uploads/2023/08/hda_rmml.pdf cumulants, random feature model, resolvent expansion 0:00 Recap 3:44 Cumulants of data matrix 14:47 Cumulants of product of matrices 46:58 Calculation of Stieltjes transform for product 1:03:25 Cumulants of non-linear feature matrix 1:34:30 Stieltjes transform for feature matrix The goal ...
RM+ML: 23. Cumulants and Their Properties and Uses
273 views · 1 year ago
The lecture notes for the course can be found at rolandspeicher.com/wp-content/uploads/2023/08/hda_rmml.pdf cumulant, set partition, moment-cumulant formula, vanishing of mixed cumulants 0:00 Recap 1:55 Partitions of sets 12:38 Moments and cumulants indexed by partitions 20:17 Definition of cumulants 24:18 Examples 36:32 Cumulant-moment formula 40:13 Vanishing of mixed cumulants and independenc...
RM+ML: 22. Resolvent Method and Cumulant Expansion
227 views · 1 year ago
The lecture notes for the course can be found at rolandspeicher.com/wp-content/uploads/2023/08/hda_rmml.pdf resolvent method, characteristic function, cumulant, set partition 0:00 How to generalize Stein's identity 12:27 Motivation for cumulant expansion 23:55 One-dimensional cumulants and cumulant expansion 32:35 Multivariate cumulants 39:07 Multivariate cumulant expansion 46:21 Relation betwe...
RM+ML: 21. Another Proof of Marchenko-Pastur -- Via Stein's Identity
188 views · 1 year ago
The lecture notes for the course can be found at rolandspeicher.com/wp-content/uploads/2023/08/hda_rmml.pdf Stein's identity, resolvent, proof of Marchenko-Pastur 0:00 Motivation 2:13 Idea of proof of MP 12:38 Stein's identity 24:28 Finishing the proof of MP The goal of this lecture series is to cover mathematically interesting aspects of neural networks, in particular, those related to random ma...
RM+ML: 20. Asymptotic Eigenvalue Distribution in the Random Feature Model
188 views · 1 year ago
The lecture notes for the course can be found at rolandspeicher.com/wp-content/uploads/2023/08/hda_rmml.pdf neural network, random features, non-linear random matrix theory 0:00 Recap of random feature model 7:50 Theorem on asymptotic eigenvalue distribution 23:22 Special cases of theorem 34:15 General form of the result The goal of this lecture series is to cover mathematically interesting aspec...
RM+ML: 19. General Remarks on Random Feature Model
161 views · 1 year ago
The lecture notes for the course can be found at rolandspeicher.com/wp-content/uploads/2023/08/hda_rmml.pdf neural network, random features, non-linear random matrix theory 0:00 Introducing non-linearities 4:18 Random features 6:50 Non-linear random matrix theory 7:50 Product of two Wishart matrices 12:29 Dealing with non-linearity The goal of this lecture series is to cover mathematically intere...
RM+ML: 18. Double Descent and Linear Regression: Under-Determined Case
350 views · 1 year ago
The lecture notes for the course can be found at rolandspeicher.com/wp-content/uploads/2023/08/hda_rmml.pdf neural network, double descent, linear regression, under-determined 0:00 Recap of over-determined case 6:10 Under-determined case 10:19 Lemma on best solution 23:25 Calculation of error 31:02 Variance term of error 40:12 Bias term of error 50:03 Double descent for linear regression The go...
RM+ML: 17. Double Descent and Linear Regression: Over-Determined Case
607 views · 1 year ago
The lecture notes for the course can be found at rolandspeicher.com/wp-content/uploads/2023/08/hda_rmml.pdf neural network, double descent, linear regression, over-determined 0:00 Recall neural networks 9:48 Questions about neural networks 16:28 Learning and overparameterization 23:38 Double descent 27:05 Linear regression 35:49 Over-determined case 42:40 Calculation of the error 58:30 Expected...
RM+ML: 16. Proof of the Signal-Plus-Noise Theorem
174 views · 1 year ago
RM+ML: 15. Spiked Signal-Plus-Noise Model
337 views · 1 year ago
RM+ML: 14. Proof of Marchenko-Pastur: Stieltjes Inversion Formula
345 views · 1 year ago
RM+ML: 13. Proof of Marchenko-Pastur: Equation for Stieltjes Transform
426 views · 1 year ago
RM+ML: 12. Preparations for Proof of Marchenko-Pastur Law
380 views · 1 year ago
RM+ML: 11. The Marchenko-Pastur Law for Wishart Matrices
681 views · 1 year ago
RM+ML: 10. Proof of Concentration of Largest Eigenvalue
237 views · 1 year ago
RM+ML: 9. Wishart Random Matrices and Concentration of Largest Eigenvalue
439 views · 1 year ago
RM+ML: 8. General Remarks on Linear and Non-Linear Concentration Inequalities
237 views · 1 year ago
RM+ML: 7. Proof of Non-Linear Concentration for Gaussian Random Vectors
196 views · 1 year ago
RM+ML: 6. Non-Linear Concentration of Gaussian Random Vectors for Lipschitz Functions.
394 views · 1 year ago
RM+ML: 5. Exponential Concentration of Norm of Gaussian Random Vectors
378 views · 1 year ago
RM+ML: 4. Gaussian Random Vectors and Concentration of Their Norm
577 views · 1 year ago
An exquisite explanation, great performance
I would like to study electrical engineering / information technology. I have 8-9 months to refresh my math. Do you think I can build a good foundation by then, so that I can keep up better in the mathematics lectures? I am a trained information electronics technician and have already learned some of the basics of electrical engineering, but we never went that deep into the mathematics. Thank you very much.
This was the best lecture and crystal-clear proof of the Stone-Weierstrass theorem I've seen. Thank you for sharing!
Thanks for the videos. I completed my apprenticeship in the chemical industry and would like to study process engineering. Back in upper secondary school I dropped out of the math lessons at some point, and I want to refresh this before starting my degree. Thanks!
Thank you so much!
thank you
From me as well: thank you very much for your great lecture series!
wow, the first 60 seconds alone already explain it perfectly 😜
I'm glad to hear that!
Super
Thank you very much, you really explained it in a very accessible way. 👍👍👍
Excellently explained, keep it up 👍
Thanks for all the lectures! Great lectures!
Thank you prof. Speicher for this lecture series! You clarified many aspects of mathematical quantum mechanics that I previously found confusing or was ignorant about. In terms of content, I believe this series is one of a kind on this platform. Additionally, the availability of typed-out (!!) lecture notes and exercise sheets has enabled me to learn the material like a regular university course in my free time, which I'm very grateful for. I hope more students will find these lectures, since I regard them as highly underrated given their current number of views. In any case, best wishes!
Thanks a lot!
7:20 Isn't the Fourier transform of $g(x)^{*}f(x+s)$ usually defined the same but with $\exp(it(x+s))$ instead of just $\exp(itx)$? If you calculate the inner-product of $g$ with $U_{s}V_{t}f$ however, the proof as described would follow with the usual definition of the Fourier transform. (Edit: this is wrong, in fact even the question wasn't put correctly, the derivation in the lecture is correct.)
I am not sure whether I understand what you mean. If you talk about the Fourier transform of $g(x)^{*}f(x+s)$, shouldn't you then say what your variable is in which you take the Fourier transform; in my proof the variable is $x$, and $s$ is a fixed parameter. If you take $x+s$ as variable, what happens then with $x$?
@@SpeicherRoland Right. I made a mistake considering the input variable for the function $h_{s}$. Thank you.
So well explained, thanks!
Really wonderful lecture (as far as I've watched, but I'm sure the rest will be equally good). But at 1:26:25 the condition _"and the number of edges in the tree is m/2"_ is missing. This is also the condition that guarantees that each edge is used exactly twice in the walk, and it matches the list of examples later, where for m=2 we get only one tree (even though the partition with one element also gives a tree, just with fewer edges). Furthermore, partitions with fewer edges in their graph have fewer vertices and are therefore of lower order than the trees with m/2 edges. And this condition is the reason why the number is zero for odd m.
Thanks for your message and for pointing this out. I suppose you are right that I'd better put the info on the number of edges into the condition. The example you mention with m=2 might still be okay, as it has a loop, hence is not a tree, but for m=4 the crossing pairing gives a tree, but with too few edges.
Is it true that the smaller the set of entries is, the smaller the N at which the semicircle structure appears clearly?
Thanks
Very nicely explained! Thank you very much!
Very nice lecture…
Be mine
This question may not be relevant to this topic, but I just wanted to ask: in a von Neumann algebra, suppose we have a dense subalgebra whose centre is trivial; can we say that its closure, which is the whole von Neumann algebra, is a factor?
If I made no mistake, it is in general not a factor. Let H be an infinite-dimensional Hilbert space. Let the von Neumann algebra be the direct sum B(H)+B(H). It is obviously not a factor. The dense subalgebra is given by the finite-rank operators in B(H)+B(H). The center of a dense subalgebra is a subset of the center (density argument). But every nonzero element of the center of B(H)+B(H) has infinite rank, so the only element of the center lying in the subalgebra is zero; therefore the center of the subalgebra is zero. If you want a unital subalgebra, you can choose the operators in B(H)+B(H) of the form r*Id+(finite rank). Again, an element of the center of this algebra must be an element of the center of B(H)+B(H). But the only elements in the center of B(H)+B(H) that are of this form are multiples of the identity. Therefore the center of this subalgebra is trivial (multiples of the unit).
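(My attempt to summarize the example above in formulas; an informal paraphrase, not checked in every detail.)

% M = B(H) \oplus B(H) is not a factor: its center is two-dimensional,
\[
  Z(M) = \mathbb{C}\,(1 \oplus 0) + \mathbb{C}\,(0 \oplus 1) \cong \mathbb{C}^2 .
\]
% The weakly dense unital subalgebra
\[
  A = \mathbb{C}\,1 + \bigl(\mathcal{F}(H) \oplus \mathcal{F}(H)\bigr),
  \qquad \mathcal{F}(H) = \text{finite-rank operators on } H,
\]
% has trivial center: Z(A) \subseteq Z(M) by the density argument, and the only
% elements of Z(M) of the form r\,1 + (\text{finite rank}) are the multiples of 1.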
I love this series 😀 Thank you professor!
very well explained
These videos help me so much. Thank you! Without these videos I would fail Math 3 right away.
Neville Longbottom
Dear Prof. Speicher! Thank you very much for making your instructive lectures publicly available!
Hi, just one question: I thought convergent sequences of continuous functions are equicontinuous provided the convergence is uniform, aren't they?
Or is it that you're working with the supremum norm, so the convergence is uniform?
Yes, this is correct and was shown in the video "Motivation for the notion of equicontinuity".
Yes, I am working in the Banach space setting; my norm is the supremum norm, and thus convergence is uniform convergence.
Hello, Prof. Dr. Speicher! I am actually a programmer, but for a project I have to catch up on some mathematical concepts, and these lectures really come in very handy. Thanks!
Thank you very much for the video - really well explained :)
Thanks a lot for making this publicly available!
Sir, the only problem is that your board isn't properly visible 😢
Hello professor, it's really amazing to see neural networks through a mathematician's glasses rather than through the eyes of computer science people 😅 A big fat thank you for sharing
Interesting lectures; unfortunately YouTube keeps interrupting with commercials every couple of minutes. I haven't seen this with other YouTube lectures, is there maybe a setting the author can change?
Thanks for the message; yes, the ads are annoying, but I do not see how I could change this ... but I will have another try at it ... and actually, you can also find the original videos (without ads) on our local server www.math.uni-sb.de/ag/speicher/web_video/zmws1920/zm_ws1920.html
Thanks a lot for the link, and the lectures 😊 much appreciated! From a quick Google search it looks like you have to do it one video at a time :| I also just discovered your high-dimensional analysis lectures from 2023; that's actually closer to what I was looking for to start with. Just in case you decide to do only some of the videos, that would be most appreciated
Hello, we are just covering this at university, and I'm wondering: what would be an example of a non-integrable function on a bounded interval [a,b]?
The usual example of a function that is not Riemann-integrable is the function which is 0 on the rational numbers and 1 on the irrational numbers.
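(Written out in formulas for reference; this is the standard Dirichlet-type example, added here as an illustration.)

\[
  f \colon [a,b] \to \mathbb{R}, \qquad
  f(x) =
  \begin{cases}
    0, & x \in \mathbb{Q},\\
    1, & x \notin \mathbb{Q}.
  \end{cases}
\]
% On every subinterval the infimum of f is 0 and the supremum is 1, so every
% lower Riemann sum is 0 and every upper Riemann sum is b - a; the two never
% agree, hence f is not Riemann-integrable on [a,b].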
Very good, thanks for the help!
I have a question about 23:40. The Taylor expansion of e^x is an infinite series ALL of whose terms are rational numbers. Partial sums of this series are therefore rational numbers, and these approximate the IRrational number e^x. But the conceptual sum of ALL these terms must then also remain a rational number, and this is then EQUAL (i.e., exact, NOT an approximation) to the irrational number e^x. Isn't that a contradiction? I have of course seen the series representation of the exponential function a thousand times, and this never struck me before. I know that this is not a mathematical riddle and the answer to my question must actually be simple. Can anyone help me?
The statement "But the conceptual sum of ALL these terms must then also remain a rational number" is not correct - the infinite series is defined as the limit of the partial sums, and a limit of a sequence of rational numbers can be an irrational number. If I write an irrational real number as a decimal with infinitely many non-periodic digits after the decimal point, it is not rational either, even though all approximations with finitely many digits are rational.
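(A concrete instance of this standard fact, added for illustration: at x = 1 the partial sums of the exponential series are rational, while their limit e is irrational.)

\[
  s_N = \sum_{n=0}^{N} \frac{1}{n!} \in \mathbb{Q} \quad\text{for every } N,
  \qquad\text{but}\qquad
  \lim_{N \to \infty} s_N = e \notin \mathbb{Q}.
\]
% Finite sums of rational numbers are rational, but the limit of a sequence of
% rational numbers need not be.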
@@SpeicherRoland Thank you very much for the answer! I find the dynamics of passing to the limit astonishing again and again. Summing infinitely many rational numbers can give an irrational number; summing infinitely many infinitely differentiable functions (e.g., sine or cosine) can give a function that is not everywhere differentiable (e.g., a step function). Thank you very much for the informative introductory lecture - I will follow your lecture series with interest. Best regards, Ralph.
Thank you very much for your instructive videos. I am in my third semester of aerospace engineering and would like to become a Dr.-Ing. at Airbus Helicopters.
Thanks, and good luck with your studies!
Thanks for uploading!
Thank you very much!
I spent days on this and never understood it. Finally I understand it. Excellently explained, thank you!
The approach to the proof of Stone's theorem in this lecture seems to be more direct than that in other references, which proceed in sequence by first showing symmetry, then establishing essential self-adjointness, then exponentiating the closure of the essentially self-adjoint operator and showing that it generates the strongly continuous unitary group. While the direct approach to showing that the generator is self-adjoint is more insightful, I find a couple of points unresolved. The other references require the fact that U_t acting on D(T) is D(T) itself for any t (which is easy to show). The corresponding requirement strangely does not seem to be needed here (either directly or indirectly, as far as I can tell). Additionally (and this is relatively minor), the other approaches also use the fact that exp(izt) will grow exponentially for positive or negative t whenever z has a non-zero imaginary component. Because alternative (minimal) proofs must somehow use the same underlying facts, these discrepancies are philosophically disconcerting. Any insights that help resolve these would be very welcome. Thanks in advance for any responses.
I suppose what you are referring to is that in other approaches one uses the differential equation U'_t y = i T U_t y, and for this one has to show that U_t x is, for x in the domain of T, also in the domain. I am talking about the integrated version of this, so I don't need the statement about the domains directly. But in the end I also do some manipulations of my integrated quantities where I multiply with U_t and use the semigroup property to get that X_t y is always in the domain. This looks to me like the analogue of the statement about U_t x.
Thanks very much for responding. Indeed, I was referring to the other approaches that use the differential equation. The other approaches establish and use the fact that U_t D(T) = D(T). The proof here establishes and uses the fact that X_t H is a subset of D(T). These two facts are indeed analogous, as you indicate, but they do not appear to be equivalent, nor does it seem like they can both be distilled down to a more fundamental common fact that underlies both proof approaches. Both approaches seem to be correct, so perhaps the problem is in the expectation that they should also both establish and use the same intermediate facts. Incidentally, while the overall integral-based approach adopted here seems more direct and insightful (as I already indicated in my original message), the fact that the integrated-averaged operator X_t maps the entire Hilbert space H into the dense subset D(T) seems counterintuitive compared to the fact that U_t D(T) = D(T). The differences between the alternative approaches are probably not worth worrying about further - thank you very much for your help!
1:21 - p is position and q is momentum; I think it should be p is momentum and q is position.
Sir, is the Hamiltonian operator specific to the hydrogen atom a bounded operator?
Since the Hamiltonian of the hydrogen atom contains the derivative operator it is an unbounded operator; even showing that it is selfadjoint (and not just symmetric) is a non-trivial task.
This video has some problems, I can't load it.
Would it be possible for you to recommend some further sources (textbooks) for reading about free probability theory and noncommutative distributions? I know about your lectures and the references in your lecture notes but are there others?
On my website rolandspeicher.com/literature/ I have collected some literature around free probability.
@@SpeicherRoland Thank you so much :D
Very well explained. Thanks
Thank you so much
You're most welcome
Awesome. I was wondering what free probability has to do with ML and AI. I hope a lot of good research will come out of this.
Thanks! I hope so, too.
In the version of this theorem that I know (from Royden), it is assumed that X is a compact *Hausdorff* space. Is this version more general?
No, it was not intended to be more general; I assume that my spaces are Hausdorff. Actually, the condition that the subalgebra separates the points implies that the space must be Hausdorff. You can find a discussion of this point at math.stackexchange.com/questions/612223/is-hausdorffness-necessary-condition-for-the-stone-weierstrass-theorem
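(The argument behind that remark, spelled out briefly; my own summary, compare also the linked discussion.)

% If A \subseteq C(K) separates points and x \neq y, pick f \in A with
% f(x) \neq f(y) and disjoint open sets U \ni f(x), V \ni f(y) in the scalars; then
\[
  x \in f^{-1}(U), \qquad y \in f^{-1}(V), \qquad f^{-1}(U) \cap f^{-1}(V) = \emptyset,
\]
% so x and y have disjoint open neighborhoods, i.e. K is Hausdorff.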
@@SpeicherRoland Rudin, in his book Functional Analysis, proposes a generalization of this theorem, I think. However, I never understood that generalization (I think it's called Bishop's theorem). Just mentioning it.
@@SpeicherRoland Hello, I could be wrong, but since C(K) is a metric space with the supremum norm, doesn't that automatically make it Hausdorff, and as a consequence every compact subtopology is also Hausdorff?