This was a fantastic video. Thanks for posting
This was a very practical and real use case. You made some really good points about when to use Python and when to use Golang. Thank you man!
Thanks a lot for the support 😁
Damn that's one great video, thanks a ton.
Thanks a lot 🤩. Those words mean a lot to me 😁
Another reason for golang is that it's really picky about code style. For example, if you have an if/else block, you need to put the else right after the closing curly bracket of the if statement, or the compiler treats it as an else without an if.
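For example, this compiles (doThing and doOtherThing are just made-up helpers):

    if ok {
        doThing()
    } else {
        doOtherThing()
    }

but moving the else to its own line is a syntax error, because Go automatically inserts a semicolon after the closing bracket:

    if ok {
        doThing()
    } // a semicolon is inserted here, orphaning the else below
    else { // syntax error
        doOtherThing()
    }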
You can't have an empty line at the bottom of a block, can't have unused variables, etc.
Another nice thing is that golang's error reporting is way better and gives you more details about the error before you can even run it.
So-so argument.
It might be a good idea to start by instrumenting with APM tools or running benchmarks to identify the performance bottlenecks, as Rob Pike famously suggested in his 5 Rules of Programming.
In general, Go excels when your application is mainly IO bound.
In the context of machine learning, Python is really a wrapper language for calling scientific computing libraries written in C and Fortran.
These libraries have been developed and tuned for decades, and Go (lol) has literally zero chance of matching them.
Very well said.
Can't he just use those C libraries from Go instead of Python?
These libraries are built for Python only (you can find the reverse in other languages, but it's limited)! The main advantage of Python is that it's so fking popular that the library support is very vast. Also, you can't swap support libraries between different languages... @@bebobauomy1265
@@bebobauomy1265 would need to write and maintain a wrapper himself.
They're written in Cython, not Python or C... (at least many parts of them).
But yeah, just not by himself on this "tight" schedule.
If they were written in plain C you could just write a cgo wrapper and be done with the job, but they are not.
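For anyone curious, a cgo wrapper is only a few lines. Here's a minimal sketch calling libm's sqrt (a toy stand-in, not one of the scientific libraries in question):

    package main

    /*
    #cgo LDFLAGS: -lm
    #include <math.h>
    */
    import "C"

    import "fmt"

    func main() {
        // Functions from the included C headers appear under the C pseudo-package.
        x := C.sqrt(C.double(2.0))
        fmt.Println(float64(x)) // 1.4142135623730951
    }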
This was very informative. Thank you for sharing your observations and conclusions. Is there a public GitHub repo that we can check?
This is company code, so I am not allowed to share it, but I am creating a new video comparing Python and Rust on the same use case, and I am thinking of putting some code there.
Maybe you should also give Julia a try. It's a dynamic, high-level language like Python, but it can run at the speed of a low-level language thanks to JIT compilation at runtime. It was built specifically for scientific computing and data science, and it has a really fun programming pattern based on multiple dispatch, which you should definitely check out. You have access to all the data science tools, including tensor/deep learning libraries and trivial GPU computing.
The core idea of Julia was to bridge this gap between fast, easy-to-code languages like Python and low-level languages, so that you don't need to rewrite your finished project again for deployment. Have fun :)
I do use Julia for scientific programming, but I'm not sure about using it for creating servers and APIs (based on me never even trying to use it for these purposes).
“I wanted to dump my spaghetti code aside” lmao so relatable
Most developers do not know how to use Python efficiently.
That sounds like a Python problem. Shouldn't most developers know how to use it correctly?
@@fieldmojo5304 My thoughts exactly. The language itself should have a small gap between possible and efficient code; otherwise, the default is to start thinking about optimizing code from the start, which adds overhead to the effort of writing even a simple implementation.
Most developers don't know how to code properly.
@@envynoirFinally someone said it 😂😂😂
Ideally you shouldn't use non-compiled languages. At all!
Hey, nice video! Usually Python AI libs are not written in Python but in Rust/C/C++. People use Python just for the "easy" part. Try the Mojo language; it would be nice to see the diff between Mojo vs Python vs Go.
I'll give it a go in the upcoming days.
nice video.. Python ecosystem is very rich.. I hope Mojo delivers
Mojo has good intentions, but it will take years, IMO... Better to use Python with Cython, as they are mature, or one could try Nim, which is now quite stable at version 2.2.
Great speed increase. Did the AI inference use a GPU? Would be interested to hear about the multi-armed bandit recommendation engine.
It was only using CPU. I can make a video explaining multi-armed bandits if it will help you out 😃
would be really nice to explain implementation concepts, thank you.
Excellent video. As data volumes increase, so will the demand for more performant solutions.
Cool video, and I'm also a big fan of go. Did you consider rewriting the problem code as a C extension for Python, or was that not viable for some reason?
That's a cool option. At the time I didn't think of that, but I am interested to try it out.
@@codeinajiffy also, couldn't the slow Python version be implemented using numpy, pandas, or something like that?
I’m not a DS guy, and I don’t know what that function was doing. But things like deduplication should be doable with pandas (?).
Also, maybe the problem was a poorly optimized implementation.
Python is slow, but these numbers surprise me.
@@greyshopleskin2315 yes, general removal of duplicate rows is doable in pandas, but this use case was conditional on the previous rows, so I had to loop over the rows here.
Also, the loop did a couple more things that were not easily done with pandas operations.
I spent some time trying to optimize the code, because I didn't want to rewrite the whole code base again, but got no major speed increases, and thus I went with Go.
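To give a feel for the shape of the problem, here's a made-up sketch in Go (not the company code): each row is kept or dropped based on the rows already kept, so the loop is inherently sequential and can't be expressed as an independent per-row (vectorized) pandas operation.

    type Row struct {
        ItemID   string
        Category string
        Score    float64
    }

    // Keep a row only if its category differs from the last kept row's.
    // Each decision depends on the previous *kept* row, so the loop
    // must run in order.
    func diversify(rows []Row) []Row {
        kept := make([]Row, 0, len(rows))
        for _, r := range rows {
            if n := len(kept); n > 0 && kept[n-1].Category == r.Category {
                continue // skip a repeat of the previous kept category
            }
            kept = append(kept, r)
        }
        return kept
    }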
Cool video! Did you try the Mojo language? I'm just curious if it is faster than Python.
No not yet, but it's on my bucket list to try it out 😍
amazing video man!
Currently learning Python, thinking about learning Go next!
Go is fine, but it's better to learn a full-featured, higher-performance language like C, Rust, Zig, or Nim, etc.
Maybe Polars instead of Pandas could've helped. But Golang rocks too.
Most likely the bottleneck would be in the Python code itself, not in pandas or numpy (which are supposed to be well optimized).
Pandas is ~3x slower than numpy, which is ~2x slower than an ordinary loop in C. They're well optimized for what they are, but the flexibility of the different use cases they support definitely puts a limit on how fast they can run.
@@benjaminblack91 sure nothing beats pure C as a rule of thumb.
Nice comparison - thanks 🙂
Did you consider Julia (the programming language) for prototyping? It could be fast enough for inference and can access the ML libraries...
Not yet, but it's on my agenda of things to try out.
Yeah, Julia is definitely by far the superior option. I tried importing a file in Pandas, and it took 6 minutes. Julia's DataFrames did it in under 10 seconds. Julia for ML is the only real option. If you want a language like Go, you should use Odin. It's like Go but better in just about every way.
I have a question: how did you make predictions in real time without the libs being available for Go,
like there are for Python, such as scikit-learn etc.?
Great question. It's a multi-armed bandit approach, so all I needed to do here was sample a distribution using the Go alternative to numpy. There is an sklearn for Go if you want to use it, but note that it doesn't have as many models as the Python implementation. There are TensorFlow and PyTorch bindings for Go as well.
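If it helps, here's roughly what that sampling step can look like in Go, using Gonum's distuv package as the numpy stand-in. This is an illustrative sketch with made-up counts, not the production code; I'm assuming Beta posteriors for a Bernoulli-style bandit:

    package main

    import (
        "fmt"

        "golang.org/x/exp/rand"
        "gonum.org/v1/gonum/stat/distuv"
    )

    // Thompson sampling: draw one sample from each arm's Beta posterior
    // and play the arm with the highest draw.
    func pickArm(successes, failures []float64, src rand.Source) int {
        best, bestVal := 0, -1.0
        for i := range successes {
            d := distuv.Beta{Alpha: successes[i] + 1, Beta: failures[i] + 1, Src: src}
            if v := d.Rand(); v > bestVal {
                best, bestVal = i, v
            }
        }
        return best
    }

    func main() {
        src := rand.NewSource(42)
        fmt.Println("play arm", pickArm([]float64{12, 30, 7}, []float64{40, 25, 60}, src))
    }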
So, what was the cause? Was it Python, FastAPI, or pandas and numpy?
The main cause of slowness was the only function in the code base that did not use pandas and numpy and instead used a normal for loop. This was the slowest part of the code. Also, Python's concurrency was bad and hurt scalability a lot, requiring a large number of resources to scale to the whole user base.
On the other hand, pandas and numpy were actually efficient and fast.
Nice explanation. Thanks for the video.
I find this very fascinating indeed. Given how much C/C++ optimisation has been done in the popular libraries like numpy, pandas etc I'm very curious what was the key bottleneck that caused the massive (30x?) increase in number of pods required? Sounds like there were some inner loops that were running solely in python.
Awesome! More proof that model inference requires more efficient implementations. What do you think about Rust in this context? Given that there are libraries such as Polars and Arrow, would the implementation effort in Rust be worth the performance gains, if any at all?
Rust is on my bucket list of things to try out. I saw a couple of articles about Polars being faster than pandas. If that's true, that would be very exciting.
Very cool video, nice job and thank you for sharing.
How can pandas (written in C) be slower than your custom Go implementation?
How can he be sure that his custom code will be more reliable than the already tested and debugged Pandas library? How are we sure that his Python code could not be improved any more?
Did you consider Polars in Python? It seems like it would do the trick replacing pandas and numpy.
I will give that a go 👍. Thanks 🙏
So what actual _AI_ code was written in Go (or Python) in your project? Rearranging a list, summing it, and handling an endpoint looks like the non-AI part.
I only moved the deployment part to Go; preparing the multi-armed bandit recommender per user was done in Python. I then read the multi-armed bandits in Go and predicted the scores in Go with the numpy alternative.
The problem with Go is that the ML libraries are not as good as Python's (because Python is all C++ behind the scenes).
Isn't it C?
@@Cordic45 the code behind the ML libraries in Python? I think it's C++.
@@Luix if you mean TensorFlow and PyTorch, yes.
Python is slow. But it is not easy to beat a well written Python program.
There are options for making a Python program fast, e.g. Numpy, Pandas, Numba, etc. With time and experience, you learn where to optimize. Once, I rewrote an API from Java to Python for my customer. The LoC went from 10,000 to 1,000, and my Python code runs 100x faster due to algorithmic optimization. Of course, we could do the same thing in Java and it would run even faster, but we would be lost in the jungle of code before we could come up with such ideas.
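(For illustration, not the customer's code: this is the classic shape of such an algorithmic optimization, sketched here in Go, replacing an O(n^2) scan with an O(n) set lookup.)

    // Quadratic: for each x, scan all of ys.
    func intersectSlow(xs, ys []int) []int {
        var out []int
        for _, x := range xs {
            for _, y := range ys {
                if x == y {
                    out = append(out, x)
                    break
                }
            }
        }
        return out
    }

    // Linear: index ys once in a set; each lookup is then O(1).
    func intersectFast(xs, ys []int) []int {
        set := make(map[int]struct{}, len(ys))
        for _, y := range ys {
            set[y] = struct{}{}
        }
        var out []int
        for _, x := range xs {
            if _, ok := set[x]; ok {
                out = append(out, x)
            }
        }
        return out
    }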
That's not true. I have done something similar: since I couldn't recreate the entire model in Java, I trained my model in Python and then used the trained model for inference in Java. It was way faster than Python, with only a 200 ms average response time, while Python took seconds in a load test. You can't compare a single-threaded language to a multithreaded one. Python is able to handle large volumes thanks to horizontal scaling; try building a non-scaling application and then compare.
Did you try Rust? The performance is great.
Not yet, but it's on my agenda of things to try out.
Brother, can you show us a simple example of how you used structs, maps, and other techniques to make an alternative to pandas? Loved your perspective and insights; keep it up.
Sure, here is a small example in Go of joining two data frames:

    package main

    import "fmt"

    type Item struct {
        Name  string
        Tags  []string
        Score float64
    }

    func main() {
        df1 := map[string]*Item{
            "item_id_1": {Name: "first item"},
            "item_id_2": {Name: "second item"},
        }
        df2 := map[string]float64{ // prediction scores
            "item_id_1": 0.35,
            "item_id_2": 0.45,
        }
        // copy each prediction score into df1: effectively a join
        for itemID, item := range df1 {
            item.Score = df2[itemID]
        }
        fmt.Println(df1["item_id_1"].Score) // 0.35
    }

And now you have the scores of df2 in df1, effectively merging the two data sources. And because those are maps, the lookup in df2 is O(1).
Have you tried implementing the numpy and pandas parts with Numba before finishing the tests? You should see a much higher speedup, especially with parallel computation, math, loops, etc., just by using vectorization or JIT with nogil, caching, and static types. JAX is also a good alternative for getting a speedup in Python, but that mainly replaces Numba. I personally have used JAX ports of some AI libraries and seen huge training improvements.
Yeah, I actually did. There were two modes: one is called object mode, which improved speed by less than 15%. The other is called nopython mode. In this mode, Numba aims to generate machine code that relies solely on Numba-supported operations and data types, avoiding the use of Python objects and the Python C API. This one was way faster than normal Python indeed, but the main problem is that it required all data types to be supported in Numba. That meant no pandas data frames and other useful tools, which made writing code restrictive and more time-consuming than writing Go.
Neither Numba nor PyPy would ever match Go's speed.
I had a similar experience when I tried to switch from Python to Nim. I found out Nim does not have the same number of libraries as Python, and not everything is available in Nim.
That was a good story :D
This is what I've been looking for. I love using Golang but I also love ML
Does your Go program run on the GPU like the Python one does?
I was only using CPUs on this project, but it would be interesting to see a GPU comparison.
Picking the right tool for the job is what defines a good programmer. It's quite common practice to use a statically compiled language to optimize a bottleneck caused by a dynamic language. So saying "I fully switched to language Y from language X because of some performance issue" is like saying you declared full-scale war over some fart.
Hello, can you please do videos on how to deploy an AI model with Golang or any programming language, and some stuff like TensorRT, etc.?
Sir, can you please upload more object detection (medical domain) or NLP videos soon? I love your content so much.
lol, why are Go's data packages slower than structs, maps, and loops? I'm a Python dev mostly. Strongly considering Go after this video!
Go is much more fun to use than Python, even though Python is very good on its own.
Most of the time you won't need all of pandas, only a tiny part of it, which you can easily recreate in Go.
Nice job
Have you tried profiling the Python code to find what the bottleneck was?
Yup, it was the piece of code that didn't use pandas and numpy. This part was building the discovery screen row by row in a for loop, and it was the slowest part here.
@@codeinajiffy it sounds to me that you could find a way to optimise this rather than rewriting
@@breakablec well, loops in Python are known to be slow. You cannot optimize the runtime of a dynamically typed language that much.
What should we do to optimize things? There are so many things to consider 😢
This is false; golang just automagically schedules threads and goroutines with a cool scheduler.
So this guy probably blocked the event loop and then wondered why the response time didn't scale with the users.
Not saying it was wrong to rewrite in Go, because in the end it was easier to get the performance you need.
Python has a lot of gotchas....
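To illustrate the scheduling point with a toy sketch (not anyone's production setup): in Go, net/http runs every handler in its own goroutine, so one blocked handler doesn't stall the rest; there is no event loop to block.

    package main

    import (
        "fmt"
        "net/http"
        "time"
    )

    func main() {
        http.HandleFunc("/slow", func(w http.ResponseWriter, r *http.Request) {
            time.Sleep(5 * time.Second) // blocks only this goroutine
            fmt.Fprintln(w, "slow done")
        })
        http.HandleFunc("/fast", func(w http.ResponseWriter, r *http.Request) {
            fmt.Fprintln(w, "fast done") // still answers immediately
        })
        http.ListenAndServe(":8080", nil)
    }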
Why show sklearn, torch, and tf if only numpy and pandas were used? What are the torch and sklearn alternatives in Go?
I was just making the point that maybe not all data science libraries can be found in Go. There is TensorFlow for Go with the same package name, and sklearn can also be found for Go, but note that it has a very limited number of models and functions. It's not as comprehensive as Python's library.
@@codeinajiffy that makes sense thanks! Great video btw keep it up
Make a video on how to use Python efficiently.
Sir, in the future, Golang has the potential to replace Python.
Can you share your point of view?
What about training in Python, exporting the model to a file, and having Go read it for deployment?
Yup, that's what I did. Only the deployment was in Go; preparing the multi-armed bandits was in Python.
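As a rough sketch of that hand-off (the JSON format here is made up; use whatever your Python side actually dumps): the Python job writes each user's per-arm statistics to a file, and the Go service loads them at startup.

    package main

    import (
        "encoding/json"
        "fmt"
        "os"
    )

    // ArmStats is a hypothetical export format for one bandit arm.
    type ArmStats struct {
        Successes float64 `json:"successes"`
        Failures  float64 `json:"failures"`
    }

    // loadBandits reads per-user arm statistics dumped by the Python trainer.
    func loadBandits(path string) (map[string][]ArmStats, error) {
        data, err := os.ReadFile(path)
        if err != nil {
            return nil, err
        }
        var bandits map[string][]ArmStats
        err = json.Unmarshal(data, &bandits)
        return bandits, err
    }

    func main() {
        bandits, err := loadBandits("bandits.json")
        if err != nil {
            panic(err)
        }
        fmt.Println("loaded bandits for", len(bandits), "users")
    }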
@@codeinajiffy Nice!
Did you try numba? It is supposed to make normal python code faster
Yeah, I actually did. There were two modes: one is called object mode, which improved speed by less than 15%. The other is called nopython mode. In this mode, Numba aims to generate machine code that relies solely on Numba-supported operations and data types, avoiding the use of Python objects and the Python C API. This one was way faster than normal Python indeed, but the main problem is that it required all data types to be supported in Numba. That meant no pandas data frames and other useful tools, which made writing code restrictive and more time-consuming than writing Go.
how much work did you just do on your own
Yes, Go is a wonderful work in progress.
I think Go will be good for making high performance chains/agents instead of using LangChain
So how could you switch to Go for AI development if it has no TensorFlow or scikit???
but it does ahahah
You need to do it in Mojo. Its aim is to become the default language for all things AI and ML.
I'd use DuckDB instead of dataframes imo. It's faster than pandas when used from Python, and it's also available in Go, which makes it easy to port code between the two.
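A minimal sketch of what that looks like from Go, assuming the community marcboeker/go-duckdb driver (which plugs into database/sql) and a hypothetical items.csv:

    package main

    import (
        "database/sql"
        "fmt"

        _ "github.com/marcboeker/go-duckdb" // registers the "duckdb" driver
    )

    func main() {
        db, err := sql.Open("duckdb", "") // empty DSN = in-memory database
        if err != nil {
            panic(err)
        }
        defer db.Close()

        // The same SQL works unchanged from Python's duckdb module.
        var n int
        row := db.QueryRow(`SELECT count(*) FROM read_csv_auto('items.csv')`)
        if err := row.Scan(&n); err != nil {
            panic(err)
        }
        fmt.Println("rows:", n)
    }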
Sounds interesting, I'll give it a go. Thanks for the tip
Why I Switched from Go Lang to Mojo (Python) for AI Deployment... ^^
What about Mojo? It's not finished yet, but it's very promising.
yeah it looks promising. It's on my bucket list of things to try out 👍
I wish you would submit a request to the developers of that package.
I am planning to do so
lol, regardless of the result of the video, I felt that Austin Powers joke. I laughed and they laughed.
Why not just write that slow Python bit in C++ and then call it from your Python code using pybind?
I didn't think of that, but it sounds like a good idea to try out.
Did anyone get the answer right? This adds 5 credits towards your PhD in Python and Golang 😂
Only learn Go if you have to; otherwise Python is plenty.
Learning Go is not difficult though; an experienced programmer can become productive with it in a weekend, so the investment needed is not that big. And the ops team will generally like you a lot more if you give them a Go binary.
bro you totally could have optimized your Python to handle it. just sayin
but python is c
What?
What about a Node Express server with nginx?
I did not try that, but I might try it out in the near future.
Such a superficial explanation. How did you "improve" the lib by restructuring your code? 😅
Why not Rust?
I wanted to use a language already being used at the company, but I am considering trying it out soon.
Try Rust then 😂
mojo maybe
Wait till you see unexpected Go issues popping up 😂
Goatlang🗿
Spoiler alert... He didn't 😅
RUST!!!
This is going to be my next video, coming next week.
Talk is cheap, show me the code.
Pandas is the opposite of highly efficient
Switched from shit to another shit
Because python should be thrown in the dumpster.
Because you can't code in c? LOLOLOL