Solve any equation using gradient descent

  • Published on 28 Sep 2024
  • Gradient descent is an optimization algorithm designed to minimize a function. In other words, it estimates where a function outputs its lowest value.
    This video demonstrates how to use gradient descent to approximate a solution for the unsolvable equation x^5 + x = 3.
    We seek a cubic polynomial approximation (ax^3 + bx^2 + cx + d) to cosine on the interval [0, π].
    / edgardocpu
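
A minimal Python sketch (an illustration of the approach the description outlines, not the video's own code; the starting guess and step size are assumptions) of recasting "solve x^5 + x = 3" as "minimize g(x) = (x^5 + x - 3)^2" and following the negative gradient:

```python
def f(x):
    return x**5 + x - 3              # residual: we want f(x) = 0

def g_prime(x):
    # derivative of g(x) = f(x)^2 via the chain rule: 2 * f(x) * f'(x)
    return 2.0 * f(x) * (5.0 * x**4 + 1.0)

x = 1.0          # initial guess
alpha = 0.005    # learning rate (step size)

for _ in range(10_000):
    x -= alpha * g_prime(x)
    if abs(f(x)) < 1e-12:            # stop once the residual is numerically zero
        break

print(x, x**5 + x)                   # x ≈ 1.1330, and x^5 + x ≈ 3
```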

Comments • 127

  • @rumyhumy
    @rumyhumy 8 months ago +254

    ta hell is that intro 💀

    • @sidharthapaul7316
      @sidharthapaul7316 7 months ago +10

      💀

    • @Euler-e
      @Euler-e 7 months ago

      I was watching at 3 am

  • @kushaagr
    @kushaagr 7 months ago +77

    Your narration is like something from 90s American television. I liked it very much.

    • @mikeiavelli
      @mikeiavelli 6 months ago +1

      More from the 60s I'd say, perhaps the 70s. The 90s? Not really.

  • @kukikukic5539
    @kukikukic5539 7 months ago +43

    Bro i was home alone and wanted to watch some math shit, i instantly closed yt when i saw the intro😂

  • @zacvh
    @zacvh 7 months ago +2

    Bro this video is so fire. I get so annoyed by the voices in the actual school videos they make you watch, and this is a huge step up from that. It actually makes this seem like top-secret information, like you're debriefing the first nuclear tests or something

  • @akrammohamed8374
    @akrammohamed8374 7 months ago +6

    I love this explanation, voice, simplicity.
    I'm guessing the voice is a text-to-speech trained on old 60s videos?

  • @Biggietothecheese
    @Biggietothecheese 8 months ago +6

    Truly happy I found this gem of a channel before it blows up

  • @stardreamix786
    @stardreamix786 7 months ago +3

    Amazing to hear about a new algorithm to solve equations, even the non-real ones! Thanks for helping me understand!

  • @olerask2457
    @olerask2457 8 months ago +2

    Nice video.
    Next step is designing a suitable neural network (choosing the number of hidden layers and the nodes of each layer, as well as an activation function), and using gradient descent to “learn” the node values, such that the neural network, for example, produces a regression function for a set of points.

  • @ycty
    @ycty 7 months ago +3

    fantastic video ur gonna blow up soon (this is a threat)

  • @ananthakrishnank3208
    @ananthakrishnank3208 7 months ago

    Thank you for the video!! Took some time to grasp the second example.
    No surprise. This gradient descent optimization is at the heart of machine learning.

  • @CaarabaloneDZN
    @CaarabaloneDZN 7 months ago +2

    this video is bizarre but in a good way

  • @hannahnelson4569
    @hannahnelson4569 7 months ago

    I like it! This demonstrates a good method! It should be noted that these principles also apply to similar methods which may have more desirable convergence properties such as Newton-Raphson methods.

  • @newmoodclown
    @newmoodclown 7 months ago

    I thought my screen had dust on it, but it's a unique style. Nice!

  • @floreskyle1
    @floreskyle1 8 months ago +1

    Am I getting this right? On your first example, our objective from the problem is solving for the values of x in x^5 + x = 3, but then you changed the problem to finding the roots (zeroes) of x^5 + x - 3 = 0. I'm bad at the higher math stuff, but aren't these two things different? Apologies if I got something wrong, but the reason seems so arbitrary, or should I just not think about it? Moreover, that method you did of squaring the entire equation, should I always do that? That seems really arbitrary too, especially since we're looking at a 5th degree polynomial, so I thought this entire process would produce five solutions for us.

    • @sicko5821
      @sicko5821 8 months ago

      all you have to do is adjust the equation by adding 3 to both sides, then substitute the solution in for x, and you get the equation really equaling 3 (or in this case approximately)

    • @floreskyle1
      @floreskyle1 8 months ago

      @@sicko5821 Yeah, I got back to some reading and remembered some stuff about that. Thanks for this though.

    • @EdgarProgrammator
      @EdgarProgrammator  8 months ago

      In this case, finding the solution of x^5 + x = 3 is equivalent to finding the root of the equation x^5 + x - 3 = 0. The function f(x) = x^5 + x - 3 doesn't have a global minimum, as it extends to both positive and negative infinity. To ensure the existence of a minimum for optimization purposes, we often transform the function to make it non-negative, typically by squaring it (or taking its absolute value). This creates a "low point" that we can then minimize. We do this "squaring preprocessing" when the function doesn't have a minimum.
      Root vs. Minimum: Finding a root (where the function equals zero) is different from finding a minimum (the lowest value).
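
To make the reply above concrete, here is a tiny sketch (mine, with an assumed step size) contrasting the two objectives: descending f itself just slides downhill forever because f is unbounded below, while descending f^2 settles at the root.

```python
def f(x):  return x**5 + x - 3
def df(x): return 5 * x**4 + 1        # f'(x) > 0 everywhere, so f itself has no minimum
def dg(x): return 2 * f(x) * df(x)    # derivative of g(x) = f(x)^2

x_f = x_g = 1.0
alpha = 0.005
for _ in range(200):
    x_f -= alpha * df(x_f)            # descending f: keeps drifting downhill
    x_g -= alpha * dg(x_g)            # descending f^2: converges

print(x_f)   # still decreasing; there is no minimum to settle into
print(x_g)   # ≈ 1.1330, where f(x) = 0 and g(x) = 0
```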

  • @stokedfool
    @stokedfool 7 months ago

    Dig that initial distinction between formulas/expressions and algorithms. Made something click.

  • @ErikNij
    @ErikNij 7 months ago

    But how do you choose this "learning rate"? Like in your x^5 example, if you had chosen 0.025, then you would never get a solution, as your solver would spiral to infinity? If you know your solution has a 0, could you use the "residual" (value of the previous evaluation) to guess how far you need to step? Perhaps paired with a relaxation factor?

    • @nolanfaught6974
      @nolanfaught6974 7 months ago +1

      More advanced gradient descent algorithms use a decreasing sequence of numbers as the learning rate. This allows the algorithm to quickly converge in the first few iterations and more slowly converge in later iterations to avoid “overstepping” the solution. Another modification involves solving for the optimal learning rate at each step with another gradient descent method, called exact gradient descent. Conjugate gradient descent uses orthogonal step directions to guarantee convergence in exactly n iterations, but each iteration is more costly.
      It’s important to recognize that the learning rate shouldn’t matter too heavily unless your problem is ill-conditioned, in which case derivative-based methods don’t provide much of an advantage over just guessing and you would use simulated annealing or other stochastic (rng-based) methods.
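
A toy illustration (my own; the decay schedule and constants are arbitrary choices) of the decreasing learning rate mentioned in the reply above, applied to the video's first example:

```python
def f(x):  return x**5 + x - 3
def dg(x): return 2 * f(x) * (5 * x**4 + 1)   # gradient of f(x)^2

x = 1.0
alpha0 = 0.02                        # too large to settle near the root if held fixed
for k in range(1, 501):
    alpha = alpha0 / (1 + 0.1 * k)   # simple decay: large early steps, small late steps
    x -= alpha * dg(x)

print(x)                             # ≈ 1.1330
```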

  • @winstongraves8321
    @winstongraves8321 7 months ago

    Great explanation Edgar!

  • @KP-ty9yl
    @KP-ty9yl 7 months ago

    Excellent explanation, immediately subscribed 😁

  • @sirdevio6102
    @sirdevio6102 7 months ago

    I adore the vibe of the video

  • @DrMcCrady
    @DrMcCrady 7 months ago

    Great vid, really impressive editing.

  • @VEDANTA-we8yl
    @VEDANTA-we8yl 7 months ago

    I am the 1000th subscriber

  • @bonquaviusdingle5720
    @bonquaviusdingle5720 8 months ago

    excellently explained video

  • @facts-ec4yi
    @facts-ec4yi 7 months ago +1

    love the old aesthetic you're going for.

  • @aymanelhasbi5030
    @aymanelhasbi5030 7 months ago

    thanks sir !

  • @korigamik
    @korigamik 7 months ago

    Broo this is a cool video! Could you share the source code for the video animations

    • @EdgarProgrammator
      @EdgarProgrammator  7 months ago

      The source code for this is a mess. Dependencies include SVG.js and MathJax. In the future, I will make paper and pencil videos.

    • @korigamik
      @korigamik 7 months ago

      @@EdgarProgrammator bro, it's alright if the code is a mess. I'd still love to learn from your process and ideas

    • @EdgarProgrammator
      @EdgarProgrammator  7 months ago

      @@korigamik Thanks! Here's my GitHub profile: github.com/isedgar. I'll be uploading something about math animations in the next few days.

  • @LEGEND_SPRYZEN
    @LEGEND_SPRYZEN 6 months ago

    We are taught this in high school class 12.

  • @notjoemartinez4438
    @notjoemartinez4438 7 months ago

    Your psychosis demon explains gradient descent

  • @AOPSADIQ
    @AOPSADIQ 7 months ago

    Isn't this the Newton-Raphson method?

    • @facts-ec4yi
      @facts-ec4yi 7 months ago

      no, Newton-Raphson converges to the root using tangents; this converges to the minimum, unless the function is manipulated in a way in which the minimum is the root.

  • @nicolascamargo8339
    @nicolascamargo8339 7 months ago

    Wow, awesome

  • @Jiffy_Park
    @Jiffy_Park 7 months ago

    Ok now solve the Navier-Stokes equations

  • @JoshKings-tr2vc
    @JoshKings-tr2vc 7 months ago

    multivariable newton’s method

    • @facts-ec4yi
      @facts-ec4yi 7 months ago

      Wrong.

    • @JoshKings-tr2vc
      @JoshKings-tr2vc 7 months ago

      @@facts-ec4yi Thank you.

  • @spaaaghetti7106
    @spaaaghetti7106 8 months ago +272

    WHY THE SCARY ASS INTRO????

    • @semtex6412
      @semtex6412 7 months ago +33

      some sort of waker-upper. students tend to doze off. it should've been flashed from time to time across the entire length of the video lol

    • @williamstaude
      @williamstaude 7 months ago +3

      @@semtex6412 helllll nahh
      im in bed with the lights out watching some math video i dont wanna see that again

  • @DaMonster
    @DaMonster 7 months ago +103

    The fact that there’s no quintic formula was proved by Galois before dying in a duel at 20

    • @NamanNahata-zx1xz
      @NamanNahata-zx1xz 7 months ago +8

      Man, I wish he and Niels Henrik Abel hadn't died so young

    • @MyOneFiftiethOfADollar
      @MyOneFiftiethOfADollar 7 months ago

      over a woman

    • @othila9902
      @othila9902 7 months ago

      ​@@MyOneFiftiethOfADollarOver a woman

    • @skomants2997
      @skomants2997 7 months ago +4

      Based sigma gigachad grindset life is temporary math is forever

    • @w花b
      @w花b 7 months ago +2

      He came, saw and... died.

  • @pawncube2050
    @pawncube2050 8 months ago +50

    Bizarre style, I love it.

  • @sicko5821
    @sicko5821 8 months ago +47

    that voice kinda gives it vintage vibes, it's like you're watching a very old vid. good shit homie

  • @richardmarch3750
    @richardmarch3750 8 months ago +60

    Despite being just a machine learning tutorial, this has an unsettling vibe that I find very unique for an educational channel, and honestly much more captivating. Keep it up!

  • @Physics_HB
    @Physics_HB 8 months ago +66

    The intro, the speaker's voice, and everything was beautiful in this video

    • @EdgarProgrammator
      @EdgarProgrammator  8 months ago

      🙏

    • @flandrinelextensionniste6490
      @flandrinelextensionniste6490 7 months ago +2

      That just makes the whole video feel like the introduction to a conspiracy theory.

    • @Muhammed.Abd.
      @Muhammed.Abd. 6 months ago

      ​@@EdgarProgrammator is that your real voice?? Or like deep faked or audio mixed with Prof. Feynman's voice from the archives??

  • @cloverisfan818
    @cloverisfan818 8 months ago +224

    This is just newton’s method

    • @pcklop
      @pcklop 8 months ago +79

      This is similar to Newton's method but not the same. Newton's method updates the guess by finding where the tangent line at the current guess intersects the x-axis. Gradient descent also uses the tangent line, but doesn't go all the way to the x-axis; it just uses it to walk towards the minimum. On a function like x^2+1, Newton's method will not converge, since the function has no real root, but gradient descent will converge to x=0 since the function has a minimum there. They operate on a similar principle, but Newton's method is for finding zeros whereas gradient descent is for finding minima.

    • @pcklop
      @pcklop 8 months ago +19

      Interestingly, there is another extremum-finding algorithm also called Newton's method; however, it is different from gradient descent, and from Newton's method for finding zeros.

    • @quantumboss500yearsago2
      @quantumboss500yearsago2 8 months ago +25

      Too many likes, but this is wrong. Newton's method is a root-finding algorithm while gradient descent is an optimization algorithm; it never finds roots, only local minima/maxima. Newton's method can be used as an optimization algorithm if you use it to find the roots of the gradient (which requires knowing the second derivative).

    • @matthewsarsam8920
      @matthewsarsam8920 8 months ago +3

      @@pcklop If you take Newton's method and apply it to the first derivative, essentially you're taking a 2nd-order Taylor approximation and then just updating your guess to the minimum of that approximation

    • @wkgmathguy218
      @wkgmathguy218 8 months ago +3

      You can use Newton's method to find estimates of maximum/minimum candidates by trying to solve f'(x) = 0 in the simple case, or grad f = 0 for a multivariable problem. One way of thinking of the simple case is that alpha is estimating the quantity 1/f'(x). I seem to recall that this is the basis for quasi-Newton methods. @@pcklop
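
A small side-by-side sketch (my own, not from the video) of the distinction drawn in this thread, using h(x) = x^2 + 1, which has a minimum at x = 0 but no real root:

```python
def h(x):  return x**2 + 1
def dh(x): return 2 * x

# Newton's method hunts for a root of h: the iterates never settle, since no real root exists.
x_newton = 0.5
for _ in range(50):
    x_newton -= h(x_newton) / dh(x_newton)   # jump to where the tangent line crosses y = 0

# Gradient descent hunts for a minimum of h: it converges to x = 0, where h(0) = 1.
x_gd, alpha = 0.5, 0.1
for _ in range(50):
    x_gd -= alpha * dh(x_gd)                 # small step downhill

print(x_newton)   # bounces around, no convergence
print(x_gd)       # ≈ 0.0, the minimizer (not a root, since h has none)
```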

  • @forever_stay6793
    @forever_stay6793 8 months ago +7

    Great video! The visualizations were very helpful to my understanding. Will you be making more machine learning videos in the future?

    • @EdgarProgrammator
      @EdgarProgrammator  8 months ago +3

      Thanks! I'm glad you like it. Yes, I'm planning to do more videos on that topic: deep learning, super resolution, etc.

  • @beautyofmath6821
    @beautyofmath6821 6 months ago +2

    Beautiful and very well-made video. I personally loved the old TV vibe to this, not to disregard the instructive yet nicely explained method of gradient descent. Subscribed

  • @RISHABHKUMAR-zk1fu
    @RISHABHKUMAR-zk1fu 7 months ago +2

    bro i came to watch math at night but your freaking intro scared the shiiiiiii out of me 😭

  • @cblpu5575
    @cblpu5575 7 months ago +2

    Instead of squaring, you can raise to higher even powers (like 4, 6, etc.), giving quicker convergence if I remember correctly.

  • @elmoreglidingclub3030
    @elmoreglidingclub3030 7 months ago +3

    This is just awesome stuff. Makes me want to study math. To be patient and learn the fundamentals that gets me to being able to understand and solve these equations. Fascinating.

  • @darkseid856
    @darkseid856 7 months ago +1

    what is that intro bruh

  • @ktuluflux
    @ktuluflux 8 months ago +3

    What is the voiceover? Thanks,

  • @MyOneFiftiethOfADollar
    @MyOneFiftiethOfADollar 7 months ago

    Put another way:
    One can "solve any equation" by numerical methods, i.e. a computer with a properly coded algorithm.

  • @UKimpress
    @UKimpress 7 months ago +1

    Absolutely love the audio!!!! How do I get access to it?

  • @brickie9816
    @brickie9816 7 months ago +1

    A perfect video to watch at 2 am, especially the intro... now I will have a ton of time to think about this algorithm because there is no way I will sleep 😂 but seriously, very interesting. I will delve deeper into it 👍

  • @bernardoolisan1010
    @bernardoolisan1010 6 months ago

    Why square the function? Do we always need to square the function to solve it via gradient descent?

  • @toddkfisher
    @toddkfisher 6 months ago

    Would a sixth degree polynomial in x be referred to as "x hexed"?
    Really like the video.

  • @YannLeBihanFractals
    @YannLeBihanFractals 7 months ago

    Use Newton's method, it's quicker!

  • @greenstatic9286
    @greenstatic9286 7 months ago +1

    Is that the Kingdom Hearts menu item noise at 4:05?

  • @nguyenthanhvinh5942
    @nguyenthanhvinh5942 6 months ago

    Gradient descent finds a minimum of the function f(x), not a solution of f(x) = 0. However, a minimum of f(x) is exactly a solution of f'(x) = 0 (the derivative of f(x)). So if your function has only one variable and you want to solve f(x) = 0, you can treat f(x) itself as the derivative term in the update, and so on. If your function has more than one variable, you can't make that substitution, because only one function is given and you don't know which variable it would be the derivative with respect to (as mentioned above, with one variable, f(x) plays the role of the derivative with respect to x when you use gradient descent to find a solution).
    So the answer is to use the least-squares approach the video shows. The function f^2 always has a minimum. If the value at that minimum is 0, it is a solution; if not, gradient descent still finds the minimum, but it is not a solution.
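
A compact sketch (mine; the helper name, starting points, and step size are made up for illustration) of the check described in the comment above: minimize f^2, then inspect the value at the minimum to decide whether it is actually a solution.

```python
def minimize_squared(f, df, x, alpha=1e-3, steps=20_000):
    """Gradient descent on f(x)^2; returns the final x."""
    for _ in range(steps):
        x -= alpha * 2 * f(x) * df(x)
    return x

# f(x) = x^5 + x - 3 has a real root, so the minimum of f^2 is (numerically) 0.
x1 = minimize_squared(lambda x: x**5 + x - 3, lambda x: 5 * x**4 + 1, x=1.0)
print(x1, (x1**5 + x1 - 3)**2)   # ≈ 1.1330, ≈ 0  -> a genuine solution

# f(x) = x^2 + 1 has no real root; the minimum of f^2 is 1, reached at x = 0.
x2 = minimize_squared(lambda x: x**2 + 1, lambda x: 2 * x, x=0.5)
print(x2, (x2**2 + 1)**2)        # ≈ 0, ≈ 1       -> just the low point, not a solution
```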

  • @AhmAsaduzzaman
    @AhmAsaduzzaman 7 months ago

    Yes, solving the equation x^5 + x = y for x in terms of y is much more complex than solving quadratic equations because there is no general formula for polynomials of degree five or higher, due to the Abel-Ruffini theorem. This means that, in general, we can't express the solutions in terms of radicals as we can for quadratics, cubics, and quartics.
    However, we can still find solutions numerically or graphically. Numerical methods such as Newton's method can be used to approximate the roots of this equation for specific values of y. If we're interested in a symbolic approach, we would typically use a computer algebra system (CAS) to manipulate the equation and find solutions.
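
As a companion to the comment above, a minimal Newton iteration (my own sketch) for the specific case y = 3, i.e. solving x^5 + x - 3 = 0; no formula in radicals exists, but the root falls out numerically in a handful of steps:

```python
def f(x):  return x**5 + x - 3
def df(x): return 5 * x**4 + 1     # always >= 1, so the division below is safe

x = 1.0
for _ in range(20):
    x -= f(x) / df(x)              # follow the tangent line down to the x-axis

print(x, f(x))                     # x ≈ 1.1330, f(x) ≈ 0
```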

  • @timelessMotivationchannel
    @timelessMotivationchannel several months ago

    For a second I thought my iPad was possessed

  • @JamalAhmadMalik
    @JamalAhmadMalik 7 months ago

    The voice sounded like Carl Sagan. Are you using an AI generated voice?

  • @rosettaroberts8053
    @rosettaroberts8053 6 months ago

    The second example would have been solved better by linear regression.

  • @Arycke
    @Arycke 7 months ago

    I hear Kingdom Hearts Menu selection sound😮

  • @trumpgaming5998
    @trumpgaming5998 4 months ago

    Okay but why don't you explain why this method doesn't work sometimes for particular degrees depending on the function

    • @trumpgaming5998
      @trumpgaming5998 4 months ago

      For instance if you wanted to minimize cos(x) = c1 where c1 is a constant, using gradient descent one way or another yields you that c1 = 0, but the constant term in the taylor expansion of cos(x) is 1 since cos(x) = 1 - x^2/2 + ...
      This means that you have to include at least the 2nd term for this to work, or even a higher degree depending on the function other than cos(x) in the example.

  • @stuart_360
    @stuart_360 6 months ago

    oh it's good, but I thought I'd be able to apply it in my exams lol

  • @jadeblades
    @jadeblades 6 months ago

    genuinely curious why you put that in the intro

  • @mourensun7775
    @mourensun7775 7 months ago

    I'd love to know how you made this video animation.

  • @markzuckerbread1865
    @markzuckerbread1865 7 months ago

    Awesome vid, instant sub

  • @hallooww
    @hallooww 7 months ago

    what text-to-speech do you use?

  • @MissPiggyM976
    @MissPiggyM976 7 months ago

    Well done, thanks!

  • @oksolets
    @oksolets 7 months ago

    Excellent, more!

  • @Daniel_Larson_Records
    @Daniel_Larson_Records 7 months ago

    There's something about the way you talk and edit the video together that actually makes it interesting. I can't put my finger on it. Maybe it's how novel it is? I don't know, but PLEASE make more videos like this. It's amazing, and I actually understood it completely (rare for someone so bad at math lol)

  • @theosib
    @theosib 8 months ago

    I've found gradient descent to work very badly for polynomials.

  • @petit.croissant
    @petit.croissant 7 months ago

    there are a lot of limitations related to gradient descent though, such as determining appropriate initial conditions and hyperparameters, and convergence problems... but for simple well-behaved polynomials it's definitely fine, although Newton's method would achieve faster convergence

  • @MusicEngineeer
    @MusicEngineeer 7 months ago

    Yes - interesting. I've been considering the idea of minimizing f^2 by gradient descent for solving f = 0 but never actually implemented it. Is that a common technique in numerical computing? I think I haven't seen it in textbooks yet (but maybe my memory is wrong). How does it compare to e.g. Newton iteration convergence-wise? Maybe one could also try to minimize the indefinite integral F of f instead of f^2, if that is easily computable - which it is in the case of polynomials. Might be interesting to explore whether that leads to a good (i.e. fast) algorithm.
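
One way to sanity-check the last idea in the comment above: for f(x) = x^5 + x - 3, an antiderivative is F(x) = x^6/6 + x^2/2 - 3x, and F''(x) = 5x^4 + 1 > 0, so F is strictly convex and its unique minimum sits exactly at the root of f. A tiny sketch (mine; the constants are illustrative, and it only works this neatly because F happens to be convex here):

```python
def f(x):
    return x**5 + x - 3       # f = F' for F(x) = x^6/6 + x^2/2 - 3x

x = 1.0
alpha = 0.1
for _ in range(200):
    x -= alpha * f(x)         # descend F: its gradient is just the residual f itself

print(x)                      # ≈ 1.1330, the root of f
```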

  • @AhmAsaduzzaman
    @AhmAsaduzzaman 7 months ago

    AWESOME Video! Thanks! Trying to put some basic understanding on this: "We seek a cubic polynomial approximation (ax^3 + bx^2 + cx + d) to cosine on the interval [0, π]."
    Let's say you want to represent the cosine function, which is a bit wavy and complex, with a much simpler formula: a cubic polynomial. This polynomial is a smooth curve described by that equation, where a, b, c, and d are specific numbers (coefficients) that determine the shape of the curve.
    Now, why would we want to do this?
    Cosine is a trigonometric function that's fundamental in fields like physics and engineering, but it can be computationally intensive to calculate its values repeatedly.
    A cubic polynomial, on the other hand, is much simpler to work with and can be computed very quickly.
    So, we're on a mission to find the best possible cubic polynomial that behaves as much like the cosine function as possible on the interval from 0 to π (from the peak of the cosine wave down to its trough).
    To find the perfect a, b, c, and d that make our cubic polynomial a doppelgänger for cosine, we use a method that involves a bit of mathematical magic called "least squares approximation".
    This method finds the best fit by ensuring that, on average, the vertical distance between the cosine curve and our cubic polynomial is as small as possible. Imagine you could stretch out a bunch of tiny springs from the polynomial to the cosine curve; least squares finds the polynomial that would stretch those springs the least.
    Once we have our cleverly crafted polynomial, we can use it to estimate cosine values quickly and efficiently. The beauty of this approach is that our approximation will be incredibly close to the real deal, making it a nifty shortcut for complex calculations.
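
For anyone who wants to see the fit itself, a short sketch (mine; it uses NumPy's closed-form least-squares fit rather than the gradient-descent route the video takes, and the sample grid is an assumption):

```python
import numpy as np

xs = np.linspace(0.0, np.pi, 1000)             # sample the interval [0, pi]
coeffs = np.polyfit(xs, np.cos(xs), deg=3)     # least-squares cubic: [a, b, c, d]
cubic = np.poly1d(coeffs)

print(coeffs)
for x in (0.0, np.pi / 4, np.pi / 2, 3 * np.pi / 4, np.pi):
    print(f"x = {x:5.3f}   cos(x) = {np.cos(x):+.4f}   cubic(x) = {cubic(x):+.4f}")
```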

  • @MansMan42069
    @MansMan42069 7 months ago

    TIL "x quartered" is a thing you can say

  • @aminebouobida877
    @aminebouobida877 7 months ago

    simple, straightforward, and good vibes. I love it

  • @featureboxx
    @featureboxx 7 months ago

    this clip is from the '70s?

  • @RAHUDAS
    @RAHUDAS 7 months ago

    It is really nice

  • @manasandmohit
    @manasandmohit 7 months ago

    Best intro so far

  • @alirezaakhavi9943
    @alirezaakhavi9943 7 months ago

    really nice video Edgar! subbed! thank you

  • @misterdubity3073
    @misterdubity3073 8 months ago

    Very good presentation. Not sure what superscript "T" is, first appearing @8:05 in the update formula.

    • @nathanoher4865
      @nathanoher4865 8 months ago +1

      Transpose. Vectors are column matrices (n by 1). Transposition is when you reflect a matrix over its main diagonal (so the entry in row 3, column 4 becomes row 4, column 3). Transposition therefore turns column matrices into row matrices. This is needed because the update subtracts alpha times the gradient from a row vector. The gradient is a vector, specifically a column matrix. Addition and subtraction are only defined for matrices with the same dimensions. The transpose turns the gradient into a row vector so it can be subtracted from the vector before it, which is written as a row vector.

    • @misterdubity3073
      @misterdubity3073 8 months ago

      @@nathanoher4865 Ah, Transpose. I get it. Thanks. I had forgotten that detail of matrix terminology.
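
A tiny shape check (my own sketch with made-up numbers) of the point in the reply above: if the current point is written as a row vector, the gradient (a column vector) has to be transposed before it can be subtracted.

```python
import numpy as np

x = np.array([[1.0, 2.0, 3.0]])            # current point, a 1x3 row vector
grad = np.array([[0.2], [0.4], [0.6]])     # gradient, a 3x1 column vector
alpha = 0.1

x_next = x - alpha * grad.T                # transposing makes the shapes match: (1,3) - (1,3)
print(x_next)                              # [[0.98 1.96 2.94]]
```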

  • @gregorymorse8423
    @gregorymorse8423 7 months ago

    The clickbait here is so absurd. Obviously this cannot solve non-convex problems.

    • @vida91963
      @vida91963 7 months ago

      And yet the entire field of deep learning uses it successfully in very non-convex cases…

    • @gregorymorse8423
      @gregorymorse8423 7 months ago +1

      @vida91963 actually, you are completely wrong. Machine learning aims to make the cost function as convex as possible so that gradient-based algorithms can work successfully. It's such a terrible example that it's a joke. How about cryptography, where all functions are absurdly non-convex, like hash functions or symmetric crypto? Sorry, gradient methods won't work. If they did, then we wouldn't have cryptography.