- 156
- 409 986
Constantine Caramanis
เข้าร่วมเมื่อ 11 ก.พ. 2012
I am a Professor at The University of Texas at Austin.
I research in and teach optimization and machine learning related courses.
My current research interests focus on decision-making in large-scale complex systems, with a focus on learning and computation. Specifically, I am interested in robust and adaptable optimization, high dimensional statistics and machine learning, and applications to large-scale networks, including social networks, wireless networks, transportation networks, and energy networks. I also work on applications of machine learning and optimization to computer-aided design.
I use TH-cam to post educational material that is useful to students in my research group and students in my classes. I am happy if it helps anyone else who is trying to learn challenging material.
You can find more information about me here:
users.ece.utexas.edu/~cmcaram/
I research in and teach optimization and machine learning related courses.
My current research interests focus on decision-making in large-scale complex systems, with a focus on learning and computation. Specifically, I am interested in robust and adaptable optimization, high dimensional statistics and machine learning, and applications to large-scale networks, including social networks, wireless networks, transportation networks, and energy networks. I also work on applications of machine learning and optimization to computer-aided design.
I use TH-cam to post educational material that is useful to students in my research group and students in my classes. I am happy if it helps anyone else who is trying to learn challenging material.
You can find more information about me here:
users.ece.utexas.edu/~cmcaram/
Theorem of Alternatives from Separation
We show how to prove theorem of alternatives using the basic separating hyperplane theorem (which we prove in a different lecture).
มุมมอง: 1 096
วีดีโอ
Separating Hyperplanes
มุมมอง 6K3 ปีที่แล้ว
We prove the basic separating hyperplane theorem for closed convex sets: if X is closed and convex, and y not in X, then there exists a vector c such that c'y \leq c'x, for all x in X
11.7 Lower Bounding Extension Complexity, Part V
มุมมอง 3954 ปีที่แล้ว
This lecture concludes our main lower bound result that shows that any extended formulation for the correlation polytope requires at least 1.5^n facets. We have previously shown that the extension complexity is lower bounded by the size of any covering of D(n) by valid sets. We show that the largest valid subset of D(n) is of size no more than 2^n. Since |D(n)|=3^n, this immediately implies our...
11.1 Extended Formulations, Part I
มุมมอง 3344 ปีที่แล้ว
This is our first of seven lectures on Extended Formulations and Extension Complexity. We give a positive result: we define the permutahedron, and show that while it has exponentially many facets, it is the projection of a n^2 n dimensional polytope defined by just n^2 3n constraints.
11.6 Lower Bounding Extension Complexity, Part IV
มุมมอง 1754 ปีที่แล้ว
In this lecture we define the notion of a "valid" subset of D(n), the set of pairs of disjoint subsets. The idea of validity is critically linked to communication complexity and the relaxed (easier) version of the Face_Vertex game that Alice and Bob play. We show that if we can cover D(n) with r valid sets, then the non-deterministic communication complexity for the relaxed Face_Vertex game can...
11.5 Lower Bounding Extension Complexity, Part III
มุมมอง 2414 ปีที่แล้ว
This is the first of three lectures where we show that the extension complexity of the correlation polytope is at least 1.5^n. The proof follows the paper by Kaibel & Weltge, 2016.
11.4 Lower Bounding Extension Complexity, Part II
มุมมอง 1864 ปีที่แล้ว
In this lecture we give the key link between communication complexity and extended formulation complexity. The key is in the introduction of the FACE_VERTEX problem. We show that if a polytope has an extended formulation with r inequalities, then the non deterministic communication complexity of FACE_VERTEX(P) is at most log r.
11.2 Extended Formulations, Part II
มุมมอง 2154 ปีที่แล้ว
In this video we define precisely the extension complexity of a polytope. Then we prove the result of Yannakakis from 1991, that links nonnegative rank and extension complexity. Specifically, Yannakakis shows that the nonnegative rank of the slack matrix he defines, is exactly the extension complexity: given a nonnegative factorization of the slack matrix of size r, it can be used to obtain an ...
11.3 Lower Bounding Extension Complexity, Part I
มุมมอง 2584 ปีที่แล้ว
The next sequence of lectures will be developing tools, specifically communication complexity, that can be used to lower bound the extension complexity of a polytope. In this lecture we introduce the notion of non-deterministic communication complexity. I learned this material from the notes by Tim Roughgarden.
10.7 Continuous Greedy, Part II
มุมมอง 2304 ปีที่แล้ว
We finish our series on the Continuous Greedy Algorithm. In the last lecture, we showed that for x(1) the output of the Greedy algorithm, and F the multilinear extension, we have: F(x(1)) \geq (1-1/e) OPT = (1-1/e)F(x*). In this section, we show that we can find an independent set S so that f(S) = F(S) \geq F(x(1)). We do this by taking the decomposition of x(1) naturally produced by the contin...
10.6 Continuous Greedy, Part I
มุมมอง 4744 ปีที่แล้ว
The next two lectures revisit the problem of maximizing a monotone submodular function subject to a matroid constraint. Where as Greedy obtains a 1/2 approximation, we show that by using the multilinear extension we can improve this to (1-1/e), which is unimprovable unless P = NP (because, e.g., of the Max-k-Cover problem, as discussed in a previous lecture). We introduce the continuous greedy ...
10.5 Continuous Extensions, Part II
มุมมอง 3764 ปีที่แล้ว
We define the multilinear extension, and prove several important properties when it is derived from a monotone or submodular function. In particular: The gradient of F is nonnegative when f is monotone. The elements of the Hessian are nonnegative (with zeros on the diagonal). These imply that: F is non-decreasing along any direction d \geq 0 F is concave along any direction d \geq 0 F is convex...
10.4 Continuous Extensions, Part I
มุมมอง 6874 ปีที่แล้ว
We give the convex and concave closures for a set function f. We show that they merit their name: the convex closure is indeed a convex continuous extension of f, and similarly, the concave closure is concave. These two may be hard to evaluate. This motivates the definition of the Lovasz extension, which is easy to evaluate. We show that the Lovasz extension coincides with the convex closure if...
10.3 Submodular Functions, Part III
มุมมอง 5104 ปีที่แล้ว
In this lecture we consider the problem of maximizing a monotone submodular function over a general matroid constraint. We show that the Greedy algorithm provides a 1/2 approximation.
10.2 Submodular Functions, Part II
มุมมอง 5934 ปีที่แล้ว
In this lecture we give the basic greedy algorithm, and give the proof by Wolsey, Nemhauser and Fisher stating that if \mathcal{I} is the uniform matroid, and f is a monotone submodular function, then the Greedy algorithm is within a (1-1/e) factor of optimal.
5.1 Proximal and Projected Gradient Descent
มุมมอง 21K4 ปีที่แล้ว
5.1 Proximal and Projected Gradient Descent
3.1 Intro to Gradient and Subgradient Descent
มุมมอง 11K4 ปีที่แล้ว
3.1 Intro to Gradient and Subgradient Descent
1.1 Introduction to Optimization and to Me
มุมมอง 41K4 ปีที่แล้ว
1.1 Introduction to Optimization and to Me
1.2 What these Lectures do and do not cover
มุมมอง 10K4 ปีที่แล้ว
1.2 What these Lectures do and do not cover
7.2 Newton's Method and Affine Transformations
มุมมอง 2K4 ปีที่แล้ว
7.2 Newton's Method and Affine Transformations
3.3 Properties of Smooth and Strongly Convex Functions
มุมมอง 7K4 ปีที่แล้ว
3.3 Properties of Smooth and Strongly Convex Functions
Can you provide some book names or references for these lectures
In your example, the sum of lambdas is 1/2. Why can we claim that the sum of lambdas is always 1?
Your course is amazing! I have weak foundation in matrix computation and time comlexity( intuition as well ) but you make everything comprehensible by examples, graphs and actual matrix valued functions.
Thank you.
I watched the past videos so I could understand the first minute of this video. One of the best instructors ever.
Great, thanks!
Great insight, thanks!
Excellent lecture, thanks!
At 30:16, why were you able to replace x_t with x_{t+1}, \lambda_{t} with \lambda_{t+1}, and d_{t} with d_{t+1}?
hey any way u can reference a solution to the absolute value minimization at 11:43? subgradients would be ai at each iteration, how does it not become the mean again
Hi, thanks for this video, it's very clarifying. Correct me pls if I'm wrong, but in the 7:51 the result of the expectation should be gradient of f(x), or am I wrong?
Excuse me, but substituting the x to get f*(y) doesn't seem to get the formula as 1/2*(y-q)'Q_inv*(y-q), what happens to the q'x+c term after the substitution?
If I may ask, the graph you have of |10x1| + |x2| doesn't seem correct to me. For example, at (0,-3) we should have f(x1,x2) be (0,3) having no negative values or am I missing something?
The most easy-to-understand and clearly-explained course for optimization I've ever seen!
Was there a previous course? He began talking about probability distributions and I don't think I have heard that previously
Are lecture notes and assignments available? Thanks!
Hi Constantine, thanks for the great video! Could you also explain why 1-1/e is tight for maximizing submodular fucntion under uniform constraint?
Very nice. I also think that Dr. Ahmad Bazzis convex optimization lectures are very useful. Youll gain the knowledge and tools needed to tackle complex optimization problems with confidence. Whether youre a beginner or an experienced professional, this
@20:27 I guess the optimality condition w.r.t x^{\hat} should have the term A^T * z. What do you think, prof? @constantine.caramanis
It was said but initially missed syntactically an ^n on the gradient of f. Thank you for these excellent videos.
Can I ask a question here? Suppose there is a multivariable convex function f(x1,x2,...xi), and suppose we know f(x1',x2',...xi')=c. Based on this, can we get the boundaries for x1 to xi such that f(x1,x2,...xi)<c?
Thank you for the lecture
Thank you very much for these materials. They are awesome. I monitor all the list, but I cannot find topics that are very relative to the Optimization: "Conjugate Direction Methods". Do you plan to add this part as well? Thank you for your time to read this comment)
That is impossible to watch since you write really badly
Hi .if any one has the solution of problem (optimize the sum of absolute value if function is convex ) please share
all subsquare matrix have det = -1,0,1
Is it actually the indicator function on 15:30? It appears to be the characteristic function but I agree that they're very closely connected ideas. Thank you Caramani for all your lectures, you're awesome ❤
The handwriting is illegible.
optimization
19:28 "Since this holds for any alpha1 and alpha2 that are non-negative it also holds for a1 and a2 that sum to 1 and therefore a cone has to be convex" Geometrically this makes sense if we fill the interior of the cone but how can we guarantee convexity for all cones. For example, the graph of y=|x| is a cone that is not convex; however, the locus of points (x,y) with y≥|x| is a convex cone. There's a whole wiki on convex cones so I assume you made this statement solely based on the definition above and not as a general truth? I'm a bit confused.
I just found this example too drive.google.com/file/d/1lCPb48aW2kfd-yaOUmxKWILsCQ9UnvBh/view
Beautiful Lecture...Teaching without slides takes more efforts and lead to better understanding..Thanks
People who might get lost, as I did, at minute 2:20 when he draws the simplex: it is a drawing in 3D, not 2D :)
at 32:31, why not does it hold when \eta <= 2/\beta?
Hello professor thank you please Show that this function f(x) = e||x||^2 +1/2 <Ax, x> -<b, x> is quadratic e=epsilon, À matrix SDP b vecteur
Are there homework questions available for these lectures?
Mr Caramanis, I appreciate the clarity of the lectures, thank you so much! Complement it with my course on Optimization for Data Science. May i clarify the last example with the simplex, please? We have a 'triangle' domain, with the global minimum on one ot the vertices (31:36). Why then GD method will move in the other direction (to the left)? I thought that the step of GD descent is directed towards global minimum as well, and there would not be difference with direction of FW method. What causes GD to move in a wrong direction? Thank you very much!
❤️
Does the definition of the quadratic upper bound assume that f is convex? from what I understand that f is not required to be convex just smooth, but to derive the quadratic upper bound we define the function g and proved that g is convex. Another question, why did we define function g in that exact shape?
Sir, you are truly a genius as a teacher. I understood each and every concept of this video just at one go.
thank you for making these lectures open to everyone.
thank you so much this is really useful thanks for saving my finals and everything
prof are those notebooks available now?
17:55 Both cannot happen at the same time, only one of those cases can be true at a time. 19:36 Does |R_1| + |R_2| = |R| or |R_1| + |R_2| <= |R|?
23:28 We ignore the case where | A \intersect B | = 1 because we are only interested in the case where FV(A,B) = 1
16:48 The linear constraint we are referring to is the constraint expressed in terms of y, not the quadratic version expressed in terms of x.
14:10 I believe that's missing a square?
19:39 Note the overload of notation, in 18:47, G is the index of the facet (i.e. the index of the inequality), while in 19:39, G is the set defined by the corresponding inequality
21:18 So we've basically converted the permutahedron formulation using an exponential number of sigmas, into one that uses an nxn matrix Y.
16:46 Notice how the structure of x*, $f(e1), f(e1+e2) - f(e1)$, etc., mirrors the structure of the Lovasz extension $S_0 \subseteq S_1 \subseteq ... \subseteq S_n$ ( th-cam.com/video/HBCJ-5GTSDU/w-d-xo.html ). Intuitively, we can think of the Lavosz extension as having the "normalization" in lambda (since all the lambda must sum to one) while the permutahedron LP has the "normalization" in the e vectors
8:24 Note that ${n+1 \choose 2}$ is just the summation equation of 1 + ... + n. 9:20 This inequality comes from flipping the previous inequality. Instead of requiring that the elements in S are greater than the summation, we require that the "remaining" elements of [n] - S (denoted as T) is less than the sum of all values in n minus the sum of values in S ( |S| = n - |T|)