Chris Rycroft
United States
Joined Sep 1, 2020
Professor in Mathematics at the University of Wisconsin-Madison, publishing videos on scientific computing and more
Harvard AM205 video 5.10 - Conjugate gradient method
Harvard Applied Math 205 is a graduate-level course on scientific computing and numerical methods. This video introduces the conjugate gradient (CG) method, originally introduced by Hestenes and Stiefel in 1952. The conjugate gradient method can be used to solve linear systems for symmetric positive definite (SPD) matrices.
CG only requires matrix multiplication and not direct manipulation of matrix entries, so it is in the family of Krylov subspace methods, and is particularly well suited for sparse SPD matrices. The video introduces the theory behind the CG method, and demonstrates using it to solve the Poisson equation on a square grid.
For more information see the main course website at people.math.wisc.edu/~chr/am205
Views: 5,406
Videos
Harvard AM205 video 5.9 - Krylov methods: Arnoldi iteration and Lanczos iteration
10K views · 1 year ago
Harvard Applied Math 205 is a graduate-level course on scientific computing and numerical methods. This video introduces Krylov methods, a family of methods for computing eigenvalues of matrices and for solving linear systems. Krylov methods do not require direct access to or manipulation of the matrix elements; it is only necessary to perform matrix multiplication, making the methods well suited to...
Harvard AM205 video 5.6 - QR algorithm
3K views · 1 year ago
Harvard Applied Math 205 is a graduate-level course on scientific computing and numerical methods. This video introduces the QR algorithm, which can be used to find many eigenvalues of a matrix at once. The video explores the interesting mathematical structure behind the QR algorithm, and presents a Python example to demonstrate it in practice. For more information see the main course website a...
Harvard AM205 video 5.5 - Rayleigh quotient
2.7K views · 2 years ago
Harvard Applied Math 205 is a graduate-level course on scientific computing and numerical methods. This video introduces the Rayleigh quotient, which can be used to accurately estimate an eigenvalue from an approximate eigenvector. The video discusses some of the Rayleigh quotient properties, and shows how it can be incorporated into an inverse iteration to rapidly calculate eigenvalues. For mo...
Harvard AM205 video 5.4 - Power iteration and inverse iteration
1.2K views · 2 years ago
Harvard Applied Math 205 is a graduate-level course on scientific computing and numerical methods. This video introduces the power iteration, a method for finding the eigenvalue with largest modulus of a given matrix. The video also introduces the inverse iteration and shifts, which can improve the power iteration and allow any eigenvalue to be computed. For more information see the main course...
Harvard AM205 video 5.3 - Gershgorin circle theorem & eigenvalue sensitivity
1.7K views · 2 years ago
Harvard Applied Math 205 is a graduate-level course on scientific computing and numerical methods. This video continues Unit 5 of the course on eigenvalue problems. It introduces several useful results for bounding eigenvalues and looking at their sensitivity to matrix perturbations. The Gershgorin circle theorem is derived, which constrains eigenvalues to be within a collection of disks in the...
Harvard AM205 video 5.2 - Eigenvalue definitions
703 views · 3 years ago
Harvard Applied Math 205 is a graduate-level course on scientific computing and numerical methods. This video continues Unit 5 of the course on eigenvalue problems. It introduces some of the definitions and concepts in this unit, such as the characteristic polynomial, companion matrix, and algebraic/geometric multiplicity. A computer example shows how Python's "eig" function handles defective m...
Harvard AM205 video 5.1 - Introduction to eigenvalue problems
1.6K views · 3 years ago
Harvard Applied Math 205 is a graduate-level course on scientific computing and numerical methods. This video introduces Unit 5 of the course, on eigenvalues and iterative methods. The video describes several example problems where eigenvalues emerge, and looks at how to solve them using library functions in Python. For more information see the main course website at people.math.wisc.edu/~chr/a...
Harvard AM205 video 4.12 - PDE-constrained optimization
2.1K views · 3 years ago
Harvard Applied Math 205 is a graduate-level course on scientific computing and numerical methods. This video briefly introduces PDE-constrained optimization, where the goal is to optimize certain aspects of a solution to a partial differential equation. Since solving partial differential equations can be computationally expensive, this motivates the development of several approaches for rapidl...
Harvard AM205 video 4.11 - Penalty methods and linear programming
3.6K views · 3 years ago
Harvard Applied Math 205 is a graduate-level course on scientific computing and numerical methods. This video continues the survey of different optimization methods. The first half of the video discusses penalty methods, which allow constrained optimization problems to be converted into unconstrained ones. The second half shows an example of linear programming, an important subfield of optimiza...
Harvard AM205 video 4.10 - Sequential quadratic programming
10K views · 3 years ago
Harvard Applied Math 205 is a graduate-level course on scientific computing and numerical methods. Video 4.7 in this series showed how to incorporate equality constraints into an optimization problem using Lagrange multipliers. This previous video looked at this from a purely mathematical perspective, but here we see how to numerically perform equality-constrained optimization using the Newton ...
Harvard AM205 video 4.9 - Quasi-Newton methods
16K views · 3 years ago
Harvard Applied Math 205 is a graduate-level course on scientific computing and numerical methods. The previous video in this series discussed using the Newton method to find local minima of a function; while this method can be highly efficient, it requires the exact Hessian of the function in order to work, which can be expensive and laborious to compute in some cases. Here we discuss quasi-Ne...
Harvard AM205 video 4.8 - Steepest descent and Newton methods for optimization
4.3K views · 3 years ago
Harvard Applied Math 205 is a graduate-level course on scientific computing and numerical methods. This video introduces the steepest descent method and Newton method for optimization. These two methods have different characteristics, and some Python examples are presented to compare and contrast these methods. For more information see the main course website at people.math.wisc.edu/~chr/am205 Th...
Harvard AM205 video 4.7 - Equality-constrained optimization
1.3K views · 3 years ago
Harvard Applied Math 205 is a graduate-level course on scientific computing and numerical methods. This video looks at incorporating equality constraints into an optimization problem, and shows that this can be done using Lagrange multipliers. Two examples are presented: (1) maximizing the surface area of a cylinder given a constraint on its volume, and (2) deriving the underdetermined linear least-square...
Harvard AM205 video 4.6 - Optimality conditions and the Hessian
975 views · 3 years ago
Harvard Applied Math 205 is a graduate-level course on scientific computing and numerical methods. This video introduces the optimality conditions, which can be used to find solutions to a numerical optimization problem. The video first introduces the concept of a critical point for a function, and then shows how critical points can be classified into local minima, local maxima, and saddle poin...
Harvard AM205 video 4.5 - Conditions for a global minimum
854 views · 3 years ago
Harvard AM205 video 4.5 - Conditions for a global minimum
Harvard AM205 video 4.3 - Newton and secant methods
1.3K views · 3 years ago
Harvard AM205 video 4.3 - Newton and secant methods
Harvard AM205 video 4.4 - Multivariate root-finding methods
1.4K views · 3 years ago
Harvard AM205 video 4.4 - Multivariate root-finding methods
Harvard AM205 video 4.2 - Root finding: fixed-point iteration
1.4K views · 3 years ago
Harvard AM205 video 4.2 - Root finding: fixed-point iteration
Harvard AM205 video 4.1 - Introduction to nonlinear equations and optimization
2.9K views · 3 years ago
Harvard AM205 video 4.1 - Introduction to nonlinear equations and optimization
Harvard AM205 video 3.22 - Elliptic PDEs: the Poisson equation
1.8K views · 4 years ago
Harvard AM205 video 3.22 - Elliptic PDEs: the Poisson equation
Harvard AM205 video 3.21 - Parabolic PDEs: the heat equation
1.2K views · 4 years ago
Harvard AM205 video 3.21 - Parabolic PDEs: the heat equation
Harvard AM205 video 3.20 - Method of lines & wave equation
3.2K views · 4 years ago
Harvard AM205 video 3.20 - Method of lines & wave equation
Harvard AM205 video 3.19 - Accuracy and stability for finite-difference schemes
1.7K views · 4 years ago
Harvard AM205 video 3.19 - Accuracy and stability for finite-difference schemes
Harvard AM205 video 3.18 - CFL condition & upwinding
2.2K views · 4 years ago
Harvard AM205 video 3.18 - CFL condition & upwinding
Harvard AM205 video 3.17 - Advection equation & characteristics
3.1K views · 4 years ago
Harvard AM205 video 3.17 - Advection equation & characteristics
Harvard AM205 video 3.16 - Partial differential equations
769 views · 4 years ago
Harvard AM205 video 3.16 - Partial differential equations
Harvard AM205 video 3.15 - ODE boundary value problems
562 views · 4 years ago
Harvard AM205 video 3.15 - ODE boundary value problems
Harvard AM205 video 3.12 - ODE error estimation
551 views · 4 years ago
Harvard AM205 video 3.12 - ODE error estimation
Harvard AM205 video 3.14 - Multistep methods
1K views · 4 years ago
Harvard AM205 video 3.14 - Multistep methods
fank u
It is so helpful and so clearly presented with soothing slides. Thank you for the videos.
It was so interesting and fun to me to derive the Newton method from the fixed point iteration 😄 I always like to relate things together and found this pretty satisfying.
Thank you @chrisrycroft2010 for this great series of videos. Greatly appreciate the effort in the quality, production and organisation of the content. I am having trouble understanding why (at 3:10) the second derivative term in the truncated taylor series is evaluated at theta while the others are evaluated at x. Is there any material I could read to understand this?
This follows from Taylor's theorem with remainder. In many situations in numerical analysis, we just write f(x+h) = f(x) + h f'(x) + O(h^2), and the "big-O" term captures everything at higher order. But Taylor's theorem with remainder allows us to make a more precise statement, and characterize that additional term exactly. We can say that f(x+h) = f(x) + h f'(x) + h^2 f''(c)/2, where c is some number between x and x+h. In general, we don't know exactly what c is, but we know that it exists. This is helpful, because if we have a bound on f'', such as |f''(c)| ≤ M as above, then that allows us to precisely bound this term, even without knowing exactly what c is. By contrast, the "big-O" notation does not give us a way to make an exact bound like this. You can find a number of references online for Taylor's theorem with remainder. The Wikipedia page for Taylor's theorem has it, although the presentation is somewhat technical. See the section about "Mean-value forms of the remainder".
@@chrisrycroft2010 Thanks a lot for the detailed explanation, I will read up on Taylor’s theorem with remainder. Can’t wait to go through the rest of the series, thanks again!
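A minimal Python sketch of the remainder bound described above (not from the course materials; the choice f = sin, with |f''| ≤ M = 1, is assumed purely for illustration):

```python
# Check the Taylor remainder bound |f(x+h) - f(x) - h f'(x)| <= M h^2 / 2
# for f(x) = sin(x), whose second derivative satisfies |f''| <= 1, so M = 1.
import numpy as np

f, fp, M = np.sin, np.cos, 1.0   # function, exact derivative, bound on |f''|
x = 0.7

for h in [1e-1, 1e-2, 1e-3]:
    remainder = abs(f(x + h) - f(x) - h * fp(x))
    bound = M * h**2 / 2
    print(f"h = {h:.0e}: remainder = {remainder:.3e}, bound = {bound:.3e}")
```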
Just on a side note about pronunciation. The name "Krylov" is pronounced "Krilof" with the stress on the last syllable and the "y" being simply an "i" as in "sit". "Lanczos" is pronounced as "Lanchosh" with the stress on the first syllable.
goated🐐
Thanks for the great work. Please use less filler "and"s in upcoming videos, they're distracting.
Great! ThankYou!!
Perfect👌
i was not ready for the green shorts
Thank you for the high-quality lecture series!
Great lecture and a very elegant algorithm! It would be much easier to understand if the notation for the Rayleigh quotient matrices were NOT A_k, since they are NOT A, but much closer to the diagonal matrix D.
Thanks for the feedback and that is a good observation. I've always found this algorithm difficult to explain, given the similar definitions of the Q_k, R_k, and the underscored versions. You make a good point about A_k becoming close to diagonal. One counterpoint is that A_0 does exactly match A. But I think you're right, that some other notation might be more appropriate to indicate how it behaves in the limit.
really good
What software was used for coding and plotting?
The main course website is people.math.wisc.edu/~chr/am205/ and you can find all of the code examples on GitHub at github.com/chr1shr/am205_examples .
Thank you
🎯 Key Takeaways for quick navigation:
📊 Error analysis is fundamental in scientific computing, involving both modeling and numerical sources of error.
🔄 Truncation or discretization error arises from approximating continuous processes with discrete calculations, such as finite differences or series truncation.
💻 Rounding error occurs due to finite precision arithmetic in computers, where small errors accumulate during arithmetic operations.
📉 Discretization error often dominates in practice over rounding error, impacting the accuracy of numerical computations.
📈 Error behavior changes with the choice of discretization parameter h, with a minimum point indicating the transition from dominant discretization to dominant rounding error.
📉 Error plots, especially on logarithmic scales, help analyze error behavior and confirm convergence of numerical methods.
🔍 Relative error, though more informative in some cases, poses challenges when the true value is unknown or close to zero.
📐 Log-log plots are useful for deducing algebraic convergence, while semi-log plots are suitable for detecting exponential convergence in error decay.
Made with HARPA AI
Thank you Chris, this is very interesting content! I am currently taking course 23C at the Harvard Extension School, and I wanted to delve deeper into some of the ideas.
first comment, lets gooo!! also thanks for the helpful video
Thank you for putting this on the web Chris, a beautiful publicly available resource! Quick q: is the beacon example really a good example of the LM algorithm? We are feeding scipy.root with a gradient, whereas the preceding material points to using the residuals (r) and the Jacobian of the residuals (J) separately and using those to approximate it all via a simple normal equations problem, i.e. calculate (J'J + mu I)^{-1} J'r. If you feed the gradient directly (J'r) to scipy.root then that is obviously fine in that it is correct, but I can't see how the LM algorithm works for root-finding problems? I.e. how does it separate out J again to solve a set of normal equations? Perhaps a more appropriate example given the context is to feed r to scipy.least_squares instead of grad_r to scipy.root?
You're right: this is not an ideal computer demo, and I always found it awkward when I lectured on this. It solves the mathematical problem, but it glosses over the implementation details. This largely arose due to the structure of the lecture-based course: nonlinear least squares belongs mathematically in Unit 1, but I didn't want to get too much into these types of algorithms until Unit 4. If I end up creating online-only material in the future, where the timing of each topic is more flexible, then I will look at revising this.
Thanks for your quick reply and confirmation. I just hacked together the algorithm in javascript as an exercise, so knowing all the nitty-gritty implementation details under the hood became important. I wonder why scipy.root even has a "lm" option and what it does with it. @@chrisrycroft2010
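For what it's worth, a minimal sketch of that suggestion (entirely made-up beacon positions and distances, not the course's demo) would pass the residual vector and its Jacobian to scipy.optimize.least_squares, which can then run Levenberg-Marquardt on the normal-equations subproblem internally:

```python
# Hypothetical beacon-style nonlinear least squares: estimate a position p from
# distance measurements to known beacons, using residuals + Jacobian.
import numpy as np
from scipy.optimize import least_squares

beacons = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])  # assumed beacon positions
dists = np.array([0.7, 0.9, 0.8])                         # assumed measured distances

def residuals(p):
    # r_i(p) = ||p - beacon_i|| - measured distance to beacon i
    return np.linalg.norm(p - beacons, axis=1) - dists

def jacobian(p):
    diff = p - beacons
    return diff / np.linalg.norm(diff, axis=1)[:, None]   # d r_i / d p

sol = least_squares(residuals, np.array([0.5, 0.5]), jac=jacobian, method="lm")
print(sol.x)   # estimated position
```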
Hello @chrisrycroft2010, I really like the lecture series.. Could you please explain how the second term in the inequality gets eps*f(x)/h term? Why is not eps or any other expression? Thanks!
Glad you like the videos. The term eps*f(x)/h comes from considerations of computer hardware and how rounding errors occur. This is one of the first videos in the sequence and it's designed to provide a quick introduction, so I gloss over these details here. But they are covered extensively in the later video 0.4 (th-cam.com/video/VzKl3-D1CVc/w-d-xo.htmlsi=cmaXFMYZnO877_fJ). Modern computer hardware typically represents real numbers using floating point arithmetic, which is essentially like scientific notation and allows numbers with very different scales to be represented. Whenever you do an arithmetic operation like a+b, then the hardware guarantees that you will get the correct answer up to a relative error of machine precision (called epsilon). Typically epsilon is on the order of 10^{-16}. Here, when you calculate the derivative, you have to do f(x+h)-f(x) in the numerator. Since f(x) and f(x+h) roughly have the same scale, the rounding error you will get will be of size epsilon*|f(x)|. Dividing through by h from the denominator gives epsilon*|f(x)|/h. This division will also create an additional rounding error, although it turns out this will be small and not affect the approximate bound. The reason that the f(x+h)-f(x) operation gives the largest contribution to the rounding error is that you are trying to measure a small difference between two terms that are large. That is a case where rounding error becomes noticeable.
@@chrisrycroft2010 Thank you so much for the prompt reply and detailed explanation. It really helps, and I found the explanation in video 0.4 (@ around 18:00). Thanks again!
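To illustrate the two competing terms, here is a short Python sketch (my own example with f = exp, not from the course code) comparing the actual forward-difference error with the rough estimate h·|f''(x)|/2 + eps·|f(x)|/h; the error is smallest near h ≈ sqrt(eps):

```python
# Forward-difference error for f(x) = exp(x) at x = 1: truncation error shrinks
# with h while rounding error (~ eps*|f(x)|/h) grows as h shrinks.
import numpy as np

f = np.exp
x = 1.0
exact = np.exp(x)                  # f'(x) = exp(x)
eps = np.finfo(float).eps          # machine precision, about 2.2e-16

for h in [1e-4, 1e-6, 1e-8, 1e-10, 1e-12]:
    fd = (f(x + h) - f(x)) / h                      # forward difference
    err = abs(fd - exact)
    est = h * np.exp(x) / 2 + eps * abs(f(x)) / h   # truncation + rounding estimate
    print(f"h = {h:.0e}: error = {err:.2e}, estimate = {est:.2e}")
```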
I was looking through your channel and it wasn't until this video that I fell in love with your channel. It was because of your nail polish <3
Thank you, that's very helpful!
Please can you prove that for explicit Runge-Kutta of order 8?
In my opinion the video is missing an explanation of efficient multiplication by the H matrices. I don't want to use extra O(n^2) space for the H matrices and high-school matrix multiplication because it would be too costly. Moreover, I would like to use Householder reflections for the Hessenberg reduction and for generating orthogonal polynomials.
5:13 Are the loops constructed correctly? In my opinion the second loop should be "for j = k+1:m do", unless it is a backward iteration. Really it is G = G(j-1,j,theta). Correct me if I'm wrong.
It is indeed a backward iteration, and the algorithm is correctly written in the video. You need to start at the bottom of each column at j=m and work your way up to j=k+1. By considering in that order, you will zero out all of the entries in the column below the diagonal.
@@chrisrycroft2010 But is it possible for G(j-1,j,theta) to eliminate a_{jk}, rather than G(j,k,theta)?
@@holyshit922 Yes, the notation G(j-1,j,theta) is correct. This matrix is only acting on rows j-1 and j. The theta value is chosen so that G eliminates a_{jk} and sets it to zero. Essentially, it rotates all of the contribution of a_{jk} into a_{j-1,k}. See earlier in the video. The G matrix will do nothing to the columns q where q<k, since by the time you reach that point in the algorithm, all of those entries are zero. The G matrix will modify entries in columns q where q>k, but that doesn't matter since those columns haven't been considered yet.
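A small NumPy sketch of that G(j-1,j,theta) action (my own illustration, using 0-based indices rather than the 1-based indices in the discussion): the 2x2 rotation touches only rows j-1 and j and is chosen so that a_{jk} is rotated into a_{j-1,k}.

```python
# Apply a Givens rotation to rows j-1 and j of A so that A[j, k] becomes zero
# (assumes the two entries are not both zero).
import numpy as np

def apply_givens(A, j, k):
    a, b = A[j - 1, k], A[j, k]
    r = np.hypot(a, b)                     # sqrt(a^2 + b^2), overflow-safe
    c, s = a / r, b / r                    # cos(theta), sin(theta)
    G2 = np.array([[c, s], [-s, c]])       # the 2x2 block of G(j-1, j, theta)
    A[[j - 1, j], :] = G2 @ A[[j - 1, j], :]

A = np.array([[4.0, 1.0],
              [3.0, 2.0],
              [2.0, 5.0]])
apply_givens(A, j=2, k=0)                  # zero out A[2, 0] using rows 1 and 2
print(np.round(A, 6))
```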
Hi Prof Rycroft, Do you intend to upload more courses or content to your youtube channel? For example your AM225 course? you explain very clearly
Hi Eitan, I'm glad you like the videos. I am definitely planning to make more videos, and in particular I would like to make videos on more advanced topics like the AM225 material. But I have a large number of Ph.D. students graduating in summer 2024, and I am very busy ensuring they complete their thesis work on time, so it will likely be mid-2024 before I have sufficient time to do this.
Very nice!
Thank you for very nice lectures.
Why haven't I found this series before?
Sorry. I couldn't understand anything.
Wonderful content!
Really helpful!
Another excellent video!
Great explanation!
Very helpful!
Hi! Where can I find 5.7 and 5.8 lectures? Great content
The material for videos 5.7 and 5.8 was given as a stand-alone lecture that was separate from the main course content. Because of this, I did not record videos for that material along with the others. I am intending to record videos on this material, but I want to do a good job so it will take some time. I may complete them around Jan/Feb 2024, outside of my usual teaching responsibilities. In the meantime, you may find the associated notes useful: people.math.wisc.edu/~chr/am205/notes/iter_lecture.pdf
The interpretation of Arnoldi iteration is extremely useful, thank you
Great demonstration in python, very nice!
A simple Singularly Perturbed Convection Diffusion Boundary Value Problem can be defined as follows: \begin{equation} \epsilon y''+ q(x)y'+p(x)y = f(x), \quad 0 \leq x \leq 1 \end{equation} \begin{align*} y'(0)=y'(1) &= 0,\\ \end{align*} where $0<\epsilon \ll 1$, $q(x), p(x), f(x)$ are known functions. Then find exact solutions for the above problems by showing all necessary steps.
Why is it that if S^T S is positive definite, then A^T A + S^T S is invertible?
We need the following two properties: 1. A matrix M is positive definite if and only if x^T M x > 0 for all non-zero vectors x. 2. If a matrix M is invertible, then the only vector z that satisfies Mz=0 is the zero vector. Now suppose S^T S is positive definite, and that there exists a vector z such that (A^T A + S^T S) z = 0. Then 0 = z^T (A^T A + S^T S ) z = (Az)^T Az + z^T S^T S z = || Az ||^2 + z^T (S^T S) z. We know that ||Az||^2≥0, and hence the only way that this can be zero is if z^T (S^T S) z = 0. Therefore property 1 implies that z is the zero vector. Hence property 2 shows that (A^T A + S^T S) is invertible.
Thanks a lot 🙏 🌹
WOW! Finally I get it.
Hi Dr. Rycroft. Thank you for posting the wonderful videos for the world! Speaking of the estimation of pi, would it be nice to include the notable work of the Chinese mathematician Zu Chongzhi, who estimated pi as between 3.1415926 and 3.1415927, a record in accuracy which would not be surpassed for over 800 years? en.wikipedia.org/wiki/Zu_Chongzhi
Yes, I am aware of Zu Chongzhi's amazing achievement. The history of the calculation of pi is a fascinating subject that links many mathematicians together across millennia. While the video above could only cover a small fraction of the history, I have previously given a lecture on pi, where I feature Zu Chongzhi. See the slides here: people.math.wisc.edu/~chr/events/pi_day_talk.pdf At some point I may convert these lecture slides into a separate video.
I just finished my undergraduate degree and saw this course online. Let me go through it; I hope I can understand it.
👍Thank you for sharing this video
Great lecture :)
Thanks for the amazing content. I guess the Spectral graph analysis is used in the PageRank algorithm?
Glad you find the videos useful. Yes, I think that the core ideas behind PageRank are similar to the spectral graph analysis example I describe in the video. The example that I give is also called eigenvector centrality, and Wikipedia mentions that PageRank is a variant of eigenvector centrality. (see en.wikipedia.org/wiki/Eigenvector_centrality )
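For the curious, a tiny sketch of eigenvector centrality by power iteration on a toy adjacency matrix (my own example, not from the lecture); PageRank replaces the adjacency matrix with a damped, column-stochastic link matrix:

```python
# Eigenvector centrality: dominant eigenvector of the adjacency matrix,
# computed here by power iteration on a small undirected toy graph.
import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)

v = np.ones(A.shape[0])
for _ in range(100):
    v = A @ v
    v /= np.linalg.norm(v)

print(v)   # node 1, the most connected node, gets the largest score
```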
Thank you for uploading these lectures
Thanks for this video! Maybe I need some help. I want to use Lanczos to solve complex Hermitian matrix eigenproblems. Is there any difference between a complex Hermitian matrix and a symmetric real matrix when using the Lanczos method?
I think the only thing that you need to change is that when you compute the scalar products, such as alpha_m = q_m^T v, then they use the Hermitian transpose instead of the regular transpose. Therefore alpha_m = q_m^* v. That also affects the computation of the Euclidean norm. Instead of ||b||_2 = sqrt(b^T b), it becomes ||b||_2 = sqrt(b^* b). The Wikipedia page on the Lanczos algorithm presents the algorithm for the complex case: en.wikipedia.org/wiki/Lanczos_algorithm
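A short NumPy sketch of the complex Hermitian case (random test matrix, my own illustration): np.vdot conjugates its first argument, so it plays the role of the Hermitian inner product q^* v, and the resulting tridiagonal matrix T stays real.

```python
# Basic Lanczos iteration for a complex Hermitian matrix A. The only changes
# from the real symmetric case are conjugated inner products and complex vectors.
import numpy as np

rng = np.random.default_rng(1)
n, m = 8, 6
B = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (B + B.conj().T) / 2                       # Hermitian test matrix

Q = np.zeros((n, m + 1), dtype=complex)
alpha, beta = np.zeros(m), np.zeros(m)

b = rng.standard_normal(n) + 1j * rng.standard_normal(n)
Q[:, 0] = b / np.sqrt(np.vdot(b, b).real)      # normalize: ||b||_2 = sqrt(b^* b)

for k in range(m):
    v = A @ Q[:, k]
    alpha[k] = np.vdot(Q[:, k], v).real        # alpha_k = q_k^* A q_k (real for Hermitian A)
    v -= alpha[k] * Q[:, k]
    if k > 0:
        v -= beta[k - 1] * Q[:, k - 1]
    beta[k] = np.sqrt(np.vdot(v, v).real)      # ||v||_2 via the conjugated inner product
    Q[:, k + 1] = v / beta[k]

T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)
print(np.linalg.eigvalsh(T))                   # Ritz values approximate eigenvalues of A
print(np.linalg.eigvalsh(A))
```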
This is amazing and top notch content
Excellent video! I have a question: you discussed both the Broyden method and BFGS. The Broyden method requires no explicit knowledge of the gradient (deltaF), so it looks simpler when the exact gradient is not available and a numerical gradient is not cheap. My question is: does that mean the Broyden method converges more slowly than BFGS? And what are its pros and cons compared to BFGS?
Thanks for your comment. The Broyden method and BFGS are connected, but they solve two different problems. For the Broyden method, you are trying to solve a nonlinear system of equations F(x)=0, where x is an n-dimensional vector, and F is an n-dimensional function (i.e. F: R^n -> R^n). BFGS on the other hand solves the optimization problem of minimizing a function g(x), where x is an n-dimensional vector and g is a scalar function (i.e. g: R^n -> R). To minimize (or maximize) a function we can search for points where grad(g)=0. Thus if we define F(x)=grad(g(x)), then this optimization problem can be viewed as a root-finding problem, for finding F(x)=grad(g(x))=0. The key point is that F in the Broyden method is roughly equivalent to grad(g) in BFGS. Thus, it isn't the case that BFGS uses more information by using the gradient, because BFGS is trying to set the gradient to zero, whereas Broyden is trying to set F to zero. Overall, the two methods have similar information available, and converge similarly for their respective tasks. Some of the other videos in Unit 4 of the course talk a bit further about these connections between root-finding and optimization.
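To make that correspondence concrete, here is a small sketch (toy convex function, my own example, not from the course code): BFGS minimizes g directly, while a Broyden method finds a root of F = grad(g); both arrive at the same critical point.

```python
# BFGS on a scalar objective g versus Broyden root-finding on F = grad(g).
import numpy as np
from scipy.optimize import minimize, root

def g(x):
    return np.cosh(x[0] - 1.0) + (x[1] + 2.0)**2     # smooth, convex, minimum at (1, -2)

def grad_g(x):
    return np.array([np.sinh(x[0] - 1.0), 2.0 * (x[1] + 2.0)])

x0 = np.zeros(2)
res_bfgs = minimize(g, x0, jac=grad_g, method="BFGS")       # optimization view
res_broyden = root(grad_g, x0, method="broyden1")           # root-finding view
print(res_bfgs.x, res_broyden.x)                            # both approach (1, -2)
```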